Re: [PATCH v3 20/20] xen/riscv: introduce metadata table to store P2M type
On 31.07.2025 17:58, Oleksii Kurochko wrote:
> RISC-V's PTE has only two available bits that can be used to store the P2M
> type. This is insufficient to represent all the current RISC-V P2M types.
> Therefore, some P2M types must be stored outside the PTE bits.
>
> To address this, a metadata table is introduced to store P2M types that
> cannot fit in the PTE itself. Not all P2M types are stored in the
> metadata table—only those that require it.
>
> The metadata table is linked to the intermediate page table via the
> `struct page_info`'s list field of the corresponding intermediate page.
>
> To simplify the allocation and linking of intermediate and metadata page
> tables, `p2m_{alloc,free}_table()` functions are implemented.
>
> These changes impact `p2m_split_superpage()`, since when a superpage is
> split, it is necessary to update the metadata table of the new
> intermediate page table — if the entry being split has its P2M type set
> to `p2m_ext_storage` in its `P2M_TYPES` bits.
Oh, this was an aspect I didn't realize when commenting on the name of
the enumerator. I think you want to keep the name for the purpose here,
but you'd better not apply relational operators to it (and hence
introduce a second value to serve that purpose).
> In addition to updating
> the metadata of the new intermediate page table, the corresponding entry
> in the metadata for the original superpage is invalidated.
>
> Also, update p2m_{get,set}_type to work with P2M types which don't fit
> into PTE bits.
>
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@xxxxxxxxx>
No Suggested-by: or anything?
> --- a/xen/arch/riscv/include/asm/mm.h
> +++ b/xen/arch/riscv/include/asm/mm.h
> @@ -150,6 +150,15 @@ struct page_info
> /* Order-size of the free chunk this page is the head of. */
> unsigned int order;
> } free;
> +
> + /* Page is used to store metadata: p2m type. */
That's not correct. The page thus described is what the pointer below
points to. Here it's more like "Page is used as an intermediate P2M
page table".
> + struct {
> + /*
> + * Pointer to a page which store metadata for an intermediate page
> + * table.
> + */
> + struct page_info *metadata;
> + } md;
In the description you say you would re-use the list field.
> --- a/xen/arch/riscv/p2m.c
> +++ b/xen/arch/riscv/p2m.c
> @@ -101,7 +101,16 @@ static int p2m_alloc_root_table(struct p2m_domain *p2m)
> {
> struct domain *d = p2m->domain;
> struct page_info *page;
> - const unsigned int nr_root_pages = P2M_ROOT_PAGES;
> + /*
> + * If the root page table starts at Level <= 2, and since only 1GB, 2MB,
> + * and 4KB mappings are supported (as enforced by the ASSERT() in
> + * p2m_set_entry()), it is necessary to allocate P2M_ROOT_PAGES for
> + * the root page table itself, plus an additional P2M_ROOT_PAGES for
> + * metadata storage. This is because only two free bits are available in
> + * the PTE, which are not sufficient to represent all possible P2M types.
> + */
> + const unsigned int nr_root_pages = P2M_ROOT_PAGES *
> + ((P2M_ROOT_LEVEL <= 2) ? 2 : 1);
>
> /*
> * Return back nr_root_pages to assure the root table memory is also
> @@ -114,6 +123,23 @@ static int p2m_alloc_root_table(struct p2m_domain *p2m)
> if ( !page )
> return -ENOMEM;
>
> + if ( P2M_ROOT_LEVEL <= 2 )
> + {
> + /*
> + * In the case where P2M_ROOT_LEVEL <= 2, it is necessary to allocate
> + * a page of the same size as that used for the root page table.
> + * Therefore, p2m_allocate_root() can be safely reused.
> + */
> + struct page_info *metadata = p2m_allocate_root(d);
> + if ( !metadata )
> + {
> + free_domheap_pages(page, P2M_ROOT_ORDER);
> + return -ENOMEM;
> + }
> +
> + page->v.md.metadata = metadata;
Don't you need to install such a link for every one of the 4 pages?
> @@ -198,24 +224,25 @@ static pte_t *p2m_get_root_pointer(struct p2m_domain *p2m, gfn_t gfn)
> return __map_domain_page(p2m->root + root_table_indx);
> }
>
> -static int p2m_set_type(pte_t *pte, p2m_type_t t)
> +static void p2m_set_type(pte_t *pte, const p2m_type_t t, const unsigned int i)
> {
> - int rc = 0;
> -
> if ( t > p2m_ext_storage )
> - panic("unimplemeted\n");
> + {
> + ASSERT(pte);
> +
> + pte[i].pte = t;
What does i identify here?
> + }
> else
> pte->pte |= MASK_INSR(t, P2M_TYPE_PTE_BITS_MASK);
> -
> - return rc;
> }
>
> -static p2m_type_t p2m_get_type(const pte_t pte)
> +static p2m_type_t p2m_get_type(const pte_t pte, const pte_t *metadata,
> + const unsigned int i)
> {
> p2m_type_t type = MASK_EXTR(pte.pte, P2M_TYPE_PTE_BITS_MASK);
>
> if ( type == p2m_ext_storage )
> - panic("unimplemented\n");
> + type = metadata[i].pte;
>
> return type;
> }
Overall this feels pretty fragile, as the caller has to pass several values
which all need to be in sync with one another. If you ...
> @@ -265,7 +292,10 @@ static void p2m_set_permission(pte_t *e, p2m_type_t t)
> }
> }
>
> -static pte_t p2m_pte_from_mfn(mfn_t mfn, p2m_type_t t, bool is_table)
> +static pte_t p2m_pte_from_mfn(mfn_t mfn, p2m_type_t t,
> + struct page_info *metadata_pg,
> + const unsigned int indx,
> + bool is_table)
> {
> pte_t e = (pte_t) { PTE_VALID };
>
> @@ -285,12 +315,21 @@ static pte_t p2m_pte_from_mfn(mfn_t mfn, p2m_type_t t, bool is_table)
>
> if ( !is_table )
> {
> + pte_t *metadata = __map_domain_page(metadata_pg);
... map the page anyway, no matter whether ...
> p2m_set_permission(&e, t);
>
> + metadata[indx].pte = p2m_invalid;
> +
> if ( t < p2m_ext_storage )
> - p2m_set_type(&e, t);
> + p2m_set_type(&e, t, indx);
> else
> - panic("unimplemeted\n");
> + {
> + e.pte |= MASK_INSR(p2m_ext_storage, P2M_TYPE_PTE_BITS_MASK);
> + p2m_set_type(metadata, t, indx);
> + }
... you'll actually use it, maybe best to map both pages at the same point?
And as said elsewhere, no, I don't think you want to use p2m_set_type() for
two entirely different purposes.
> @@ -323,22 +364,71 @@ static struct page_info *p2m_alloc_page(struct p2m_domain *p2m)
> return pg;
> }
>
> +static void p2m_free_page(struct p2m_domain *p2m, struct page_info *pg);
> +
> +/*
> + * Allocate a page table with an additional extra page to store
> + * metadata for each entry of the page table.
> + * Link this metadata page to page table page's list field.
> + */
> +static struct page_info * p2m_alloc_table(struct p2m_domain *p2m)
Nit: Stray blank after * again.
> +{
> + enum table_type
> + {
> + INTERMEDIATE_TABLE=0,
If you really think you need the "= 0", then please with blanks around '='.
> + /*
> + * At the moment, metadata is going to store P2M type
> + * for each PTE of page table.
> + */
> + METADATA_TABLE,
> + TABLE_MAX
> + };
> +
> + struct page_info *tables[TABLE_MAX];
> +
> + for ( unsigned int i = 0; i < TABLE_MAX; i++ )
> + {
> + tables[i] = p2m_alloc_page(p2m);
> +
> + if ( !tables[i] )
> + goto out;
> +
> + clear_and_clean_page(tables[i]);
> + }
> +
> + tables[INTERMEDIATE_TABLE]->v.md.metadata = tables[METADATA_TABLE];
> +
> + return tables[INTERMEDIATE_TABLE];
> +
> + out:
> + for ( unsigned int i = 0; i < TABLE_MAX; i++ )
> + if ( tables[i] )
You didn't clear all of tables[] first, though. This kind of cleanup is
often better done as
while ( i-- > 0 )
...
You don't even need an if() then, as you know allocations succeeded for all
earlier array slots.
> + p2m_free_page(p2m, tables[i]);
> +
> + return NULL;
> +}
I'm also surprised you allocate the metadata table no matter whether you'll
actually need it. That'll double your average paging pool usage, when in a
typical case only very few entries would actually require this extra
storage.
> @@ -453,10 +543,9 @@ static void p2m_put_2m_superpage(mfn_t mfn, p2m_type_t type)
> }
>
> /* Put any references on the page referenced by pte. */
> -static void p2m_put_page(const pte_t pte, unsigned int level)
> +static void p2m_put_page(const pte_t pte, unsigned int level, p2m_type_t p2mt)
> {
> mfn_t mfn = pte_get_mfn(pte);
> - p2m_type_t p2m_type = p2m_get_type(pte);
>
> ASSERT(pte_is_valid(pte));
>
> @@ -470,10 +559,10 @@ static void p2m_put_page(const pte_t pte, unsigned int level)
> switch ( level )
> {
> case 1:
> - return p2m_put_2m_superpage(mfn, p2m_type);
> + return p2m_put_2m_superpage(mfn, p2mt);
>
> case 0:
> - return p2m_put_4k_page(mfn, p2m_type);
> + return p2m_put_4k_page(mfn, p2mt);
> }
> }
Might it be better to introduce this function in this shape right away, in
the earlier patch?
> @@ -690,18 +791,23 @@ static int p2m_set_entry(struct p2m_domain *p2m,
> {
> /* We need to split the original page. */
> pte_t split_pte = *entry;
> + struct page_info *metadata = virt_to_page(table)->v.md.metadata;
This (or along these lines) is how I would have expected things to be done
elsewhere as well, limiting the amount of arguments you need to pass
around.
Jan