[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 2/2] xen/common: Add NUMA node id bounds check to page_alloc.c/node_to_scrub



On 26/09/2023 7:54 pm, Shawn Anastasio wrote:
> When building for Power with CONFIG_DEBUG unset, a compiler error gets
> raised inside page_alloc.c's node_to_scrub function:
>
> common/page_alloc.c: In function 'node_to_scrub.part.0':
> common/page_alloc.c:1217:29: error: array subscript 1 is above array
>             bounds of 'long unsigned int[1]' [-Werror=array-bounds]
>  1217 |         if ( node_need_scrub[node] )
>
> It appears that this is a false positive, given that in practice
> cycle_node should never return a node ID >= MAX_NUMNODES as long as the
> architecture's node_online_map is properly defined and initialized, so
> this additional bounds check is only to satisfy GCC.

If GCC thinks it's got an index of 1 here, then it thinks it's proved
that a 1 gets passed.  But the fact that cycle_node() really can return
1 if one variable gets tweaked in memory means that logic derived on
this property is broken.

But we've got even more basic problems to fix first.

xen$ git grep -B1 '__read_mostly node_online_map'
arch/arm/smpboot.c-45-/* Fake one node for now. See also asm/numa.h */
arch/arm/smpboot.c:46:nodemask_t __read_mostly node_online_map = { { [0]
= 1UL } };
arch/ppc/stubs.c-25-
arch/ppc/stubs.c:26:nodemask_t __read_mostly node_online_map = { { [0] =
1UL } };
common/numa.c-44-
common/numa.c:45:nodemask_t __read_mostly node_online_map = { { [0] =
1UL } };

There are 3 identical definitions of node_online_map, one for the
architecture which really supports NUMA, and two for the stubs which don't.

And the bug here is that code outside of CONFIG_NUMA assumes NUMA is
valid, including several places in page_alloc.c,
domain_set_node_affinity(), credit1 and sysctl.  XEN_SYSCTL_numainfo
even has a bigger sin of using a static MAX_NUMNODES array when it
doesn't need an array in the first place.

It's rude for xen/nodemask.h to short circuit some of the operations on
node_online_map based on MAX_NUMNODES but not others.

If nothing else, the fallback for "not really NUMA" needs providing by
the common code and not by the arch stubs.  When that is sorted, we
might have more consistent behaviour to investigate.

~Andrew



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.