Re: [PATCH v9 3/6] xen/x86: Use ASSERT instead of VIRTUAL_BUG_ON for phys_to_nid
On 18.11.2022 11:45, Wei Chen wrote:
> VIRTUAL_BUG_ON is an empty macro used in phys_to_nid. This
> leaves two lines of error-checking code in phys_to_nid that
> do not actually do anything, and enabling them exposes two
> compilation errors:
> 1. error: "MAX_NUMNODES" undeclared (first use in this function).
> This is because the common header file defines "MAX_NUMNODES"
> only after it includes the arch header file, where phys_to_nid
> attempts to use "MAX_NUMNODES". This error was resolved when we
> moved phys_to_nid from the x86 arch header file to the common
> header file.
> 2. error: wrong type argument to unary exclamation mark.
> This is because the error-checking code contains !node_data[nid],
> but node_data is an array of structures, not of pointers.
>
> So, in this patch, we use ASSERT instead of VIRTUAL_BUG_ON to
> enable the two lines of error-checking code, and fix the remaining
> compilation error by replacing !node_data[nid] with
> !node_data[nid].node_spanned_pages. Although NUMA allows a node
> to have only CPUs and no memory, and a node with 0 bytes of
> memory might theoretically have an entry in memnodemap[], that
> doesn't mean phys_to_nid can find any valid address in a node
> with 0 bytes of memory.
>
> Signed-off-by: Wei Chen <wei.chen@xxxxxxx>
> Tested-by: Jiamei Xie <jiamei.xie@xxxxxxx>
> Acked-by: Jan Beulich <jbeulich@xxxxxxxx>
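To illustrate the second error the commit message describes: the following stand-in (field names borrowed from the patch, everything else simplified and hypothetical, not the real Xen definition) shows why !node_data[nid] is ill-formed C while testing a member is:

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for Xen's struct node_data; only the member
 * named in the patch is modelled here. */
struct node_data {
    uint64_t node_start_pfn;
    uint64_t node_spanned_pages;
};

/* node_data is an array of structures, not of pointers, so the
 * expression !node_data[nid] applies ! to a struct and fails to
 * compile ("wrong type argument to unary exclamation mark"). */
static struct node_data node_data[2] = {
    { .node_start_pfn = 0,       .node_spanned_pages = 0x80000 },
    { .node_start_pfn = 0x80000, .node_spanned_pages = 0 }, /* CPU-only node */
};

/* Testing a scalar member, as the patch does, is well-formed. */
static int node_has_memory(unsigned int nid)
{
    return node_data[nid].node_spanned_pages != 0;
}
```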
This patch is what is causing the regression found by osstest. The
previously silent (dead) checks now trigger when paging_init()
encounters a hole in SRAT-described space, as is the case e.g. on
the himrods:
(XEN) NUMA: Node 0 PXM 0 [0000000000000000, 000000007fffffff]
(XEN) NUMA: Node 0 PXM 0 [0000000100000000, 000000087fffffff]
(XEN) NUMA: Node 1 PXM 1 [0000000880000000, 000000107fffffff]
(XEN) NUMA: Using 19 for the hash shift
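For illustration, the lookup on this box can be modelled as follows (assuming paddr_to_pdx() is a plain page-index conversion, i.e. no PDX compression, which is a simplification): with a hash shift of 19 and a 4k page size, each memnodemap[] slot covers 2^(12+19) bytes = 2 GiB, so slot 1 is exactly the [0x80000000, 0xffffffff] hole between the two Node 0 ranges above:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12

/* From the log above. */
static const unsigned int memnode_shift = 19;

/* Simplified model of the indexing done by phys_to_nid(): convert the
 * address to a page index, then shift by memnode_shift to pick a
 * memnodemap[] slot (2 GiB per slot with these numbers). */
static unsigned int slot_of(uint64_t paddr)
{
    return (unsigned int)((paddr >> PAGE_SHIFT) >> memnode_shift);
}
```

0x80000000 lands in slot 1, which SRAT describes for no node, hence the NUMA_NO_NODE entry that trips the new ASSERT.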
The node ID for 0x80000000 (slot 1 of memnodemap[]) is NUMA_NO_NODE,
which of course fails the left side of ...
> @@ -69,9 +67,9 @@ extern struct node_data node_data[];
> static inline nodeid_t __attribute_pure__ phys_to_nid(paddr_t addr)
> {
> nodeid_t nid;
> - VIRTUAL_BUG_ON((paddr_to_pdx(addr) >> memnode_shift) >= memnodemapsize);
> + ASSERT((paddr_to_pdx(addr) >> memnode_shift) < memnodemapsize);
> nid = memnodemap[paddr_to_pdx(addr) >> memnode_shift];
> - VIRTUAL_BUG_ON(nid >= MAX_NUMNODES || !node_data[nid]);
> + ASSERT(nid < MAX_NUMNODES && node_data[nid].node_spanned_pages);
... the && here. As I don't think the use of phys_to_nid() by
paging_init() is outright invalid, I would conclude that the
condition needs to be relaxed to permit NUMA_NO_NODE. Otoh it's
possible that the function was really intended to never return
NUMA_NO_NODE (and only ever be used on valid memory ranges), in
which case paging_init() would need to use something else (or
make the call conditional): According to my audit all uses except
the two in paging_init() are on (supposedly) valid addresses only
(commonly when a valid struct page_info is already in hand).
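A minimal sketch of what such a relaxation could look like (illustrative only, with a stand-in for node_data[]; this is one possible shape of the check, not a proposed patch):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_NUMNODES 64
#define NUMA_NO_NODE 0xffu

/* Stand-in for node_data[nid].node_spanned_pages. */
static const uint64_t spanned[MAX_NUMNODES] = {
    [0] = 0x80000, [1] = 0x800000,
};

/* Relaxed variant of the patch's ASSERT condition: accept
 * NUMA_NO_NODE as a legitimate "hole" answer, while still rejecting
 * out-of-range IDs and memory-less nodes. */
static bool nid_ok_relaxed(unsigned int nid)
{
    return nid == NUMA_NO_NODE ||
           (nid < MAX_NUMNODES && spanned[nid] != 0);
}
```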
Then again, our having phys_to_nid() in the first place is somewhat
bogus: No caller really starts from a physical address. It would
be quite a bit more sensible to have page_to_nid() and mfn_to_nid(),
especially since all phys_to_nid() does is pass the address to
paddr_to_pdx() (undoing every caller's to-addr conversion). At which
point the former could do strict checking (disallowing NUMA_NO_NODE
output) while the latter could be more relaxed. I guess I'll make a
patch along these lines.
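A rough sketch of that split (hypothetical signatures and a stand-in memnodemap[]; the eventual patch may well look different): the mfn variant stays relaxed and may report NUMA_NO_NODE for holes, while the page variant checks strictly, since a valid page must belong to some node:

```c
#include <assert.h>
#include <stdint.h>

typedef uint8_t nodeid_t;
#define NUMA_NO_NODE ((nodeid_t)0xff)

/* Stand-in lookup table with a hole in slot 1, as on the himrods. */
static const nodeid_t memnodemap[] = { 0, NUMA_NO_NODE, 1, 1 };
static const unsigned int memnode_shift = 19;

/* Relaxed: indexes directly by MFN (assuming pfn == pdx, i.e. no PDX
 * compression), so callers no longer round-trip through an address.
 * May legitimately return NUMA_NO_NODE for a hole. */
static nodeid_t mfn_to_nid(uint64_t mfn)
{
    return memnodemap[mfn >> memnode_shift];
}

/* Strict: a valid page always lives on a real node, so a hole here
 * is a bug.  A real struct page_info is stood in for by its MFN. */
static nodeid_t page_to_nid(uint64_t mfn)
{
    nodeid_t nid = mfn_to_nid(mfn);

    assert(nid != NUMA_NO_NODE);
    return nid;
}
```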
Jan