|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] RE: [PATCH v2 2/9] xen/x86: Use enumerations to indicate NUMA status
Hi Jan,
> -----Original Message-----
> From: Jan Beulich <jbeulich@xxxxxxxx>
> Sent: 2022年7月12日 20:49
> To: Wei Chen <Wei.Chen@xxxxxxx>
> Cc: nd <nd@xxxxxxx>; Andrew Cooper <andrew.cooper3@xxxxxxxxxx>; Roger Pau
> Monné <roger.pau@xxxxxxxxxx>; Wei Liu <wl@xxxxxxx>; xen-
> devel@xxxxxxxxxxxxxxxxxxxx
> Subject: Re: [PATCH v2 2/9] xen/x86: Use enumerations to indicate NUMA
> status
>
> On 08.07.2022 16:54, Wei Chen wrote:
> > In current code, x86 is using two variables, numa_off and acpi_numa,
> > to indicate the NUMA status. This is because NUMA is not coupled with
> > ACPI, and ACPI still can work without NUMA on x86. With these two
> > variables' combinations, x86 can have several NUMA status:
> > NUMA swith on,
> > NUMA swith off,
> > NUMA swith on with NUMA emulation,
> > NUMA swith on with no-ACPI,
> > NUMA swith on with ACPI.
>
> Hmm, with both this and the actual change I'm not able to convince
> myself that you've expressed the prior combinations correctly. May
> I suggest that you make table representing the 6 (I think)
> combinations of original states with their mapping to the new
> enumerators? (It doesn't need to be 6 different enumerators, but
> all 6 existing states need a [proper] representation in the new
> model.)
>
Sorry for replying later, I paid sometime to check the code again,
and drew a table like below, I ignore two columns when numa_off=true
and acpi_numa=1/-1. Because when numa_off = true, ACPI srat table
will not be parsed:
+-----------+---------+---------------+-----------+------------+----------+
| original | col1 | col2 | col3 | col4 | col5 |
+-----------+---------+---------------+-----------+------------+----------+
|numa_off |true |false |false |false |false |
|acpi_numa |0 |0 |1 |-1 |x |
|numa_fake |x |x |x |x |fake_nodes|
|enum state |numa_off |numa_fw_nodata |numa_fw_ok |numa_fw_bad |numa_emu |
+-----------+---------+---------------+-----------+------------+----------+
Notes:
The following scenarios will make acpi_numa=0:
1. Xen disable ACPI through acpi_disabled = 1.
2. ACPI table doesn't have SRAT, or SRAT doesn't contain CPU and memory
NUMA affinity information.
3. Emulate numa through "numa=fake" command line parameter. But I found
that when numa_fake is enabled, current code will not prevent ACPI
srat parsers to parse NUMA information. So acpi_numa still may be
changed later. Is there any further reasons that we still need to
parse NUMA info from SRAT table when numa=fake? Or can we set
acpi_numa = -1 in nume_setup when "numa=fake" to make srat_disabled
can prevent ACPI SRAT parsing.
If meet the following conditions, then acpi_numa = 1:
1. Xen enable ACPI through acpi_disabled = 0.
2. ACPI SRAT parser can parse CPU and Memory NUMA affinities successfully
from SRAT table.
3. Pass the memmory blocks' sanity check and hash computing in
acpi_scan_nodes.
The following scenarios will make acpi_numa=-1:
1. ACPI table is disabled by "numa=noacpi" command like parameter.
2. Xen enable ACPI through acpi_disabled = 0 but ACPI SRAT parser can not
parse CPU or Memory NUMA affinities successfully from SRAT table.
3. The memmory blocks' sanity check or hash computing in acpi_scan_nodes
is failed.
> As an aside - I think you mean "switched" in all five of these
> lines.
>
Yes, I will fix them in next version.
> > --- a/xen/arch/x86/include/asm/numa.h
> > +++ b/xen/arch/x86/include/asm/numa.h
> > @@ -28,12 +28,22 @@ extern nodeid_t pxm_to_node(unsigned int pxm);
> > #define ZONE_ALIGN (1UL << (MAX_ORDER+PAGE_SHIFT))
> > #define VIRTUAL_BUG_ON(x)
> >
> > +/* Enumerations for NUMA status. */
> > +enum numa_mode {
> > + numa_on = 0,
> > + numa_off,
>
> May I suggest to switch these two around, such that "off" becomes
> the meaning of 0, potentially allowing ! to be used in a boolean-
> like fashion here or there? And please omit the "= 0" part - it's
> only non-zero first values which actually need spelling out.
>
Ok.
> > + /* NUMA turns on, but ACPI table is bad or disabled. */
> > + numa_no_acpi,
> > + /* NUMA turns on, and ACPI table works well. */
> > + numa_acpi,
>
> As to the names of these: In the description you already say that
> you want to re-use the code for non-ACPI cases. Wouldn't you better
> avoid "acpi" in the names then (rather than perhaps renaming these
> another time later on)?
>
> I'd also like to understand what useful state "numa_no_acpi" is.
> I realize this was a state expressable by the two original
> variables, but does it make sense?
>
I have updated the new names in above table. And "numa_no_acpi"
is mapping to " numa_fw_bad" enum state.
> > @@ -528,7 +528,8 @@ int __init acpi_scan_nodes(paddr_t start, paddr_t
> end)
> > for (i = 0; i < MAX_NUMNODES; i++)
> > cutoff_node(i, start, end);
> >
> > - if (acpi_numa <= 0)
> > + /* Only when numa_on with good firmware, we can do numa scan nodes.
> */
> > + if (!numa_enabled_with_firmware())
> > return -1;
>
> Nit: Perhaps drop "do numa" from the comment?
>
Ok, I will do it in next version.
Cheers,
Wei Chen
> Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |