|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] RE: [PATCH v2 1/1] xen/arm: fix sparse cpu_possible_map calculation on SMP boot
Hello,
> > The issue here is that nr_cpu_ids is calculated in a way that
> > doesn't properly account for the maximum CPU ID when the map is
> > sparse, causing a mismatch. For example, if cpu_possible_map is
> > 0xff0f, nr_cpu_ids becomes 12, but the actual maximum CPU ID
> > is 15. Xen's common code is built on the assumption that
> > 'CPU ID < nr_cpu_ids', so this mismatch can break things.
> >
> > To fix this, modify dt_smp_init_cpus() so that if the
> > arch_cpu_init() call fails, we don't consume the CPU ID slot.
> >
> > Changes in v2:
> This should come after ---, not to be included in the final commit msg.
Okay.
> > Fix an issue where cpu_logical_map(0) is cleared when boot CPU
> > initialization fails.
> >
> > Signed-off-by: Hirokazu Takahashi <taka@xxxxxxxxxxxxx>
> This should have a Fixes tag (I traced to 4557c2292854).
Okay, I will add a tag in v3.
> > ---
> > xen/arch/arm/smpboot.c | 29 +++++++++++++++--------------
> > 1 file changed, 15 insertions(+), 14 deletions(-)
> >
> > diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
> > index 7f3cfa812e..0ab9619398 100644
> > --- a/xen/arch/arm/smpboot.c
> > +++ b/xen/arch/arm/smpboot.c
> > @@ -191,6 +191,14 @@ static void __init dt_smp_init_cpus(void)
> > continue;
> > }
> >
> > + if ( hwid != mpidr && cpuidx >= NR_CPUS )
> This should stay where it was with just i >= NR_CPUS. By moving it here,
> above a
> duplicate filter, you are causing a regression. When the CPU list is full and
> it
> hits a duplicate, instead of ignoring the duplicate and moving on, it stops
> reading the list entirely (the boot CPU can be past that). Also, it makes the
> diff smaller.
Okay, I will try to rewrite the code.
> > + {
> > + printk(XENLOG_WARNING
> > + "DT /cpu %u node exceeds the max cores %u, capping
> them\n",
> > + cpuidx, NR_CPUS);
> > + break;
> > + }
> > +
> > /*
> > * Duplicate MPIDRs are a recipe for disaster. Scan all initialized
> > * entries and check for duplicates. If any found just skip the
> > node.
> > @@ -224,24 +232,19 @@ static void __init dt_smp_init_cpus(void)
> > bootcpu_valid = true;
> > }
> > else
> > - i = cpuidx++;
> > -
> > - if ( cpuidx > NR_CPUS )
> > - {
> > - printk(XENLOG_WARNING
> > - "DT /cpu %u node greater than max cores %u, capping
> > them\n",
> > - cpuidx, NR_CPUS);
> > - cpuidx = NR_CPUS;
> > - break;
> > - }
> > + i = cpuidx;
> >
> > if ( (rc = arch_cpu_init(i, cpu)) < 0 )
> > {
> > printk("cpu%d init failed (hwid %"PRIregister"): %d\n", i,
> > hwid, rc);
> > - tmp_map[i] = MPIDR_INVALID;
> > }
> > else
> > + {
> > tmp_map[i] = hwid;
> > +
> > + if ( i != 0 )
> > + cpuidx++;
> > + }
> > }
> >
> > if ( !bootcpu_valid )
> > @@ -251,10 +254,8 @@ static void __init dt_smp_init_cpus(void)
> > return;
> > }
> >
> > - for ( i = 0; i < cpuidx; i++ )
> > + for ( i = 1; i < cpuidx; i++ )
> Starting at index 1 is correct, since smp_prepare_boot_cpu() already
> set cpu_possible_map bit 0 and cpu_logical_map(0). That reason lives in
> a different function though, so please add a short comment here to make
> clear index 0 is skipped on purpose.
Okay, I will.
> > {
> > - if ( tmp_map[i] == MPIDR_INVALID )
> > - continue;
> > cpumask_set_cpu(i, &cpu_possible_map);
> > cpu_logical_map(i) = tmp_map[i];
> > }
>
> Other than that, this is a good fix, thanks.
Thank you,
Hirokazu Takahashi.
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |