Re: [PATCH v3 16/20] xen/riscv: Implement superpage splitting for p2m mappings
On 31.07.2025 17:58, Oleksii Kurochko wrote:
> Add support for breaking down large memory mappings ("superpages") in the RISC-V
> p2m mapping so that smaller, more precise mappings ("finer-grained entries")
> can be inserted into lower levels of the page table hierarchy.
>
> To implement that the following is done:
> - Introduce p2m_split_superpage(): Recursively shatters a superpage into
> smaller page table entries down to the target level, preserving original
> permissions and attributes.
> - p2m_set_entry() updated to invoke superpage splitting when inserting
> entries at lower levels within a superpage-mapped region.
>
> This implementation is based on the ARM code, with modifications to the part
> that follows the BBM (break-before-make) approach. Some parts are simplified
> because, according to the RISC-V spec:
> It is permitted for multiple address-translation cache entries to co-exist
> for the same address. This represents the fact that in a conventional
> TLB hierarchy, it is possible for multiple entries to match a single
> address if, for example, a page is upgraded to a superpage without first
> clearing the original non-leaf PTE’s valid bit and executing an SFENCE.VMA
> with rs1=x0, or if multiple TLBs exist in parallel at a given level of the
> hierarchy. In this case, just as if an SFENCE.VMA is not executed between
> a write to the memory-management tables and subsequent implicit read of the
> same address: it is unpredictable whether the old non-leaf PTE or the new
> leaf PTE is used, but the behavior is otherwise well defined.
> In contrast to the Arm architecture, where BBM is mandatory and failing to
> use it in some cases can lead to CPU instability, RISC-V guarantees
> stability, and the behavior remains safe — though unpredictable in terms of
> which translation will be used.
>
> Additionally, the page table walk logic has been adjusted, as ARM uses the
> opposite number of levels compared to RISC-V.

As before, I think you mean "numbering".

> --- a/xen/arch/riscv/p2m.c
> +++ b/xen/arch/riscv/p2m.c
> @@ -539,6 +539,91 @@ static void p2m_free_subtree(struct p2m_domain *p2m,
>      p2m_free_page(p2m, pg);
>  }
>
> +static bool p2m_split_superpage(struct p2m_domain *p2m, pte_t *entry,
> +                                unsigned int level, unsigned int target,
> +                                const unsigned int *offsets)
> +{
> +    struct page_info *page;
> +    unsigned long i;
> +    pte_t pte, *table;
> +    bool rv = true;
> +
> +    /* Convenience aliases */
> +    mfn_t mfn = pte_get_mfn(*entry);
> +    unsigned int next_level = level - 1;
> +    unsigned int level_order = XEN_PT_LEVEL_ORDER(next_level);
> +
> +    /*
> +     * This should only be called with target != level and the entry is
> +     * a superpage.
> +     */
> +    ASSERT(level > target);
> +    ASSERT(pte_is_superpage(*entry, level));
> +
> +    page = p2m_alloc_page(p2m->domain);
> +    if ( !page )
> +    {
> +        /*
> +         * The caller is in charge to free the sub-tree.
> +         * As we didn't manage to allocate anything, just tell the
> +         * caller there is nothing to free by invalidating the PTE.
> +         */
> +        memset(entry, 0, sizeof(*entry));
> +        return false;
> +    }
> +
> +    table = __map_domain_page(page);
> +
> +    /*
> +     * We are either splitting a second level 1G page into 512 first level
> +     * 2M pages, or a first level 2M page into 512 zero level 4K pages.
> +     */

Such a comment is at risk of (silently) going stale when support for 512G
mappings is added. I wonder if it's really that informative to have here.

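To illustrate why the sizes need not be spelled out: the amount of memory one
entry maps follows directly from its level, so a comment (or the code) can stay
level-agnostic. A minimal standalone sketch, assuming 4K base pages, 9
translation bits per level and level 0 being the leaf level; the sketch_*
names are hypothetical and are not Xen's macros:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants for illustration only (not Xen's definitions). */
#define SKETCH_PAGE_SHIFT     12u  /* 4K base pages */
#define SKETCH_PT_LEVEL_BITS   9u  /* 512 entries per table */
#define SKETCH_MAX_LEVEL       4u  /* assumption: at most 5 levels (0..4) */

/* Bytes mapped by a single entry at 'level', where level 0 maps 4K. */
static uint64_t sketch_level_size(unsigned int level)
{
    assert(level <= SKETCH_MAX_LEVEL);

    return UINT64_C(1) << (SKETCH_PAGE_SHIFT + SKETCH_PT_LEVEL_BITS * level);
}
```

With these assumptions levels 0, 1, 2 and 3 come out as 4K, 2M, 1G and 512G
respectively, without any of them being named explicitly.
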
> + for ( i = 0; i < XEN_PT_ENTRIES; i++ )
> + {
> + pte_t *new_entry = table + i;
> +
> + /*
> + * Use the content of the superpage entry and override
> + * the necessary fields. So the correct permission are kept.
> + */

It's not just permissions though? The memory type field also needs
retaining (and is being retained this way). Maybe better say "attributes"?
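
To make the point concrete, here is a minimal standalone sketch of "copy the
superpage entry, override only the frame number". It assumes a simplified
Sv39-style layout with the PPN in bits 53:10; sketch_pte_t and all sketch_*
helpers are hypothetical and are not the pte_t or helpers used in the patch.
Everything outside the PPN field, i.e. permissions as well as any memory type
bits, carries over untouched:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified PTE model for illustration (not Xen's pte_t). */
#define SKETCH_PT_ENTRIES     512u
#define SKETCH_PT_LEVEL_BITS  9u
#define SKETCH_PPN_SHIFT      10u
#define SKETCH_PPN_MASK       (((UINT64_C(1) << 44) - 1) << SKETCH_PPN_SHIFT)

typedef struct { uint64_t bits; } sketch_pte_t;

static uint64_t sketch_pte_mfn(sketch_pte_t pte)
{
    return (pte.bits & SKETCH_PPN_MASK) >> SKETCH_PPN_SHIFT;
}

/* Everything that is not the frame number: permissions, memory type, ... */
static uint64_t sketch_pte_attrs(sketch_pte_t pte)
{
    return pte.bits & ~SKETCH_PPN_MASK;
}

/*
 * Fill 'table' with 512 entries of level - 1 which together cover the range
 * that 'superpage' covered at 'level' (level 0 being the 4K leaf level).
 */
static void sketch_split_one_level(sketch_pte_t superpage, unsigned int level,
                                   sketch_pte_t table[SKETCH_PT_ENTRIES])
{
    uint64_t base, step;
    unsigned int i;

    assert(level > 0);

    base = sketch_pte_mfn(superpage);
    /* Number of 4K frames covered by one entry at the next lower level. */
    step = UINT64_C(1) << (SKETCH_PT_LEVEL_BITS * (level - 1));

    for ( i = 0; i < SKETCH_PT_ENTRIES; i++ )
    {
        /* Start from the superpage entry, so all attributes are kept ... */
        sketch_pte_t pte = superpage;

        /* ... and override only the frame number. */
        pte.bits &= ~SKETCH_PPN_MASK;
        pte.bits |= ((base + i * step) << SKETCH_PPN_SHIFT) & SKETCH_PPN_MASK;

        table[i] = pte;
    }
}
```

Copying the whole entry rather than rebuilding it field by field is what makes
any bit outside the frame number survive the split in this sketch.
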
Jan