[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 16/17] xen/riscv: implement mfn_valid() and page reference, ownership handling helpers




On 7/2/25 12:09 PM, Jan Beulich wrote:
On 10.06.2025 15:05, Oleksii Kurochko wrote:
Implement the mfn_valid() macro to verify whether a given MFN is valid by
checking that it falls within the range [start_page, max_page).
These bounds are initialized based on the start and end addresses of RAM.

As part of this patch, start_page is introduced and initialized with the
PFN of the first RAM page.

Also, after providing a non-stub implementation of the mfn_valid() macro,
the following compilation errors started to occur:
  riscv64-linux-gnu-ld: prelink.o: in function `__next_node':
  /build/xen/./include/xen/nodemask.h:202: undefined reference to `page_is_ram_type'
  riscv64-linux-gnu-ld: prelink.o: in function `get_free_buddy':
  /build/xen/common/page_alloc.c:881: undefined reference to `page_is_ram_type'
  riscv64-linux-gnu-ld: prelink.o: in function `alloc_heap_pages':
  /build/xen/common/page_alloc.c:1043: undefined reference to `page_get_owner_and_reference'
  riscv64-linux-gnu-ld: /build/xen/common/page_alloc.c:1098: undefined reference to `page_is_ram_type'
  riscv64-linux-gnu-ld: prelink.o: in function `ns16550_interrupt':
  /build/xen/drivers/char/ns16550.c:205: undefined reference to `get_page'
  riscv64-linux-gnu-ld: ./.xen-syms.0: hidden symbol `page_get_owner_and_reference' isn't defined
  riscv64-linux-gnu-ld: final link failed: bad value
  make[2]: *** [arch/riscv/Makefile:35: xen-syms] Error 1
To resolve these errors, the following functions have also been introduced,
based on their Arm counterparts:
- page_get_owner_and_reference() and its variant to safely acquire a
  reference to a page and retrieve its owner.
- put_page() and put_page_nr() to release page references and free the page
  when the count drops to zero.
  For put_page_nr(), code related to static memory configuration is wrapped
  with CONFIG_STATIC_MEMORY, as this configuration has not yet been moved to
  common code. Therefore, PGC_static and free_domstatic_page() are not
  introduced for RISC-V. However, since this configuration could be useful
  in the future, the relevant code is retained and conditionally compiled.
- A stub for page_is_ram_type() that currently always returns 0 and asserts
  unreachable, as RAM type checking is not yet implemented.
How does this end up working when common code references the function?
Based on the following commit message:
    Callers are VT-d (so x86 specific) and various bits of page offlining
    support, which although it looks generic (and is in xen/common) does
    things like diving into page_info->count_info which is not generic.
    
    In any case on this is only reachable via XEN_SYSCTL_page_offline_op,
    which clearly shouldn't be called on ARM just yet.

There is no callers for page_is_ram_type() for Arm now, and I expect something similar
for RISC-V. As we don't even introduce hypercalls for RISC-V, we can just live
without it.


@@ -288,8 +289,12 @@ static inline bool arch_mfns_in_directmap(unsigned long mfn, unsigned long nr)
 #define page_get_owner(p)    (p)->v.inuse.domain
 #define page_set_owner(p, d) ((p)->v.inuse.domain = (d))
 
-/* TODO: implement */
-#define mfn_valid(mfn) ({ (void)(mfn); 0; })
+extern unsigned long start_page;
+
+#define mfn_valid(mfn) ({                                   \
+    unsigned long mfn__ = mfn_x(mfn);                       \
+    likely((mfn__ >= start_page) && (mfn__ < max_page));    \
+})
I don't think you should try to be clever and avoid using __mfn_valid() here,
at least not without an easily identifiable TODO. Surely you've seen that both
Arm and x86 use it.
Overlooked that pdx.c is compiled unconditionally, so I thought that __mfn_valid() common
implementation isn't avaiable as, at the moment, RISC-V doesn't support PDX_COMPRESSION...

Also, according to all I know, likely() doesn't work very well when used like
this, except for architectures supporting conditionally executed insns (like
Arm32 or IA-64, i.e. beyond conditional branches). I.e. if you want to use
likely() here, I think you need two of them.
... I will update mfn_valid() definition according to your recommendations.


@@ -525,6 +520,8 @@ static void __init setup_directmap_mappings(unsigned long base_mfn,
 #error setup_{directmap,frametable}_mapping() should be implemented for RV_32
 #endif
 
+unsigned long __read_mostly start_page;
Memory hotplug question again: __read_mostly or __ro_after_init?

@@ -613,3 +612,91 @@ void __iomem *ioremap(paddr_t pa, size_t len)
 {
     return ioremap_attr(pa, len, PAGE_HYPERVISOR_NOCACHE);
 }
+
+int page_is_ram_type(unsigned long mfn, unsigned long mem_type)
+{
+    ASSERT_UNREACHABLE();
+
+    return 0;
+}
+
+static struct domain *page_get_owner_and_nr_reference(struct page_info *page,
+                                                      unsigned long nr)
+{
+    unsigned long x, y = page->count_info;
+    struct domain *owner;
+
+    /* Restrict nr to avoid "double" overflow */
+    if ( nr >= PGC_count_mask )
+    {
+        ASSERT_UNREACHABLE();
+        return NULL;
+    }
I question the validity of this, already in the Arm original: I can't spot
how the caller guarantees to stay below that limit. Without such an
(attempted) guarantee, ASSERT_UNREACHABLE() is wrong to use. All I can see
is process_shm_node() incrementing shmem_extra[].nr_shm_borrowers, without
any limit check.
Agree, it could be really dropped.


+    do {
+        x = y;
+        /*
+         * Count ==  0: Page is not allocated, so we cannot take a reference.
+         * Count == -1: Reference count would wrap, which is invalid.
+         */
May I once again ask that you look carefully at comments (as much as at code)
you copy. Clearly this comment wasn't properly updated when the bumping by 1
was changed to bumping by nr.

+        if ( unlikely(((x + nr) & PGC_count_mask) <= nr) )
+            return NULL;
+    }
+    while ( (y = cmpxchg(&page->count_info, x, x + nr)) != x );
+
+    owner = page_get_owner(page);
+    ASSERT(owner);
+
+    return owner;
+}
+
+struct domain *page_get_owner_and_reference(struct page_info *page)
+{
+    return page_get_owner_and_nr_reference(page, 1);
+}
+
+void put_page_nr(struct page_info *page, unsigned long nr)
+{
+    unsigned long nx, x, y = page->count_info;
+
+    do {
+        ASSERT((y & PGC_count_mask) >= nr);
+        x  = y;
+        nx = x - nr;
+    }
+    while ( unlikely((y = cmpxchg(&page->count_info, x, nx)) != x) );
+
+    if ( unlikely((nx & PGC_count_mask) == 0) )
+    {
+#ifdef CONFIG_STATIC_MEMORY
+        if ( unlikely(nx & PGC_static) )
+            free_domstatic_page(page);
+        else
+#endif
Such #ifdef-ed-out code is liable to go stale. Minimally use IS_ENABLED().
Even better would imo be if you introduced a "stub" PGC_static, resolving
to 0 (i.e. for now unconditionally).
An introduction of PGC_static would be better.

Thanks.
~ Oleksii

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.