Re: [PATCH v2 06/17] xen/riscv: add root page table allocation
On 30.06.2025 18:18, Oleksii Kurochko wrote:
> On 6/30/25 5:22 PM, Jan Beulich wrote:
>> On 10.06.2025 15:05, Oleksii Kurochko wrote:
>>> --- a/xen/arch/riscv/include/asm/p2m.h
>>> +++ b/xen/arch/riscv/include/asm/p2m.h
>>> @@ -26,6 +26,12 @@ struct p2m_domain {
>>> /* Pages used to construct the p2m */
>>> struct page_list_head pages;
>>>
>>> + /* The root of the p2m tree. May be concatenated */
>>> + struct page_info *root;
>>> +
>>> + /* Address Translation Table for the p2m */
>>> + paddr_t hgatp;
>> Does this really need holding in a struct field? Can't it be re-created at
>> any time from "root" above?
>
> Yes, with the current implementation, I agree that keeping only root would
> be enough. But as you noticed below...
>
>> And such re-creation is apparently infrequent,
>> if happening at all after initial allocation. (But of course I don't know
>> what future patches of yours will bring.) This is even more so if ...
>>
>>> --- a/xen/arch/riscv/include/asm/riscv_encoding.h
>>> +++ b/xen/arch/riscv/include/asm/riscv_encoding.h
>>> @@ -133,11 +133,13 @@
>>> #define HGATP_MODE_SV48X4 _UL(9)
>>>
>>> #define HGATP32_MODE_SHIFT 31
>>> +#define HGATP32_MODE_MASK _UL(0x80000000)
>>> #define HGATP32_VMID_SHIFT 22
>>> #define HGATP32_VMID_MASK _UL(0x1FC00000)
>>> #define HGATP32_PPN _UL(0x003FFFFF)
>>>
>>> #define HGATP64_MODE_SHIFT 60
>>> +#define HGATP64_MODE_MASK _ULL(0xF000000000000000)
>>> #define HGATP64_VMID_SHIFT 44
>>> #define HGATP64_VMID_MASK _ULL(0x03FFF00000000000)
>> ... VMID management is going to change as previously discussed, at which
>> point the value to put in hgatp will need (partly) re-calculating at certain
>> points anyway.
>
> ... after VMID management is changed to a per-pCPU scheme, hgatp will need
> to be re-calculated each time the vCPU running on a pCPU changes.
> In this case I prefer to have a partially pre-calculated 'hgatp'.
But why, when you need to do some recalculation anyway?
>>> --- a/xen/arch/riscv/p2m.c
>>> +++ b/xen/arch/riscv/p2m.c
>>> @@ -41,6 +41,91 @@ void p2m_write_unlock(struct p2m_domain *p2m)
>>> write_unlock(&p2m->lock);
>>> }
>>>
>>> +static void clear_and_clean_page(struct page_info *page)
>>> +{
>>> + clean_dcache_va_range(page, PAGE_SIZE);
>>> + clear_domain_page(page_to_mfn(page));
>>> +}
>> A function of this name can, imo, only clear and then clean. Question is why
>> it's the other way around, and what the underlying requirement is for the
>> cleaning part to be there in the first place. Maybe that's obvious for a
>> RISC-V person, but it's entirely non-obvious to me (Arm being different in
>> this regard because of running with caches disabled at certain points in
>> time).
>
> You're right, the current name clear_and_clean_page() implies that clearing
> should come before cleaning, which contradicts the current implementation.
> The intent here is to ensure that the page contents are consistent in RAM
> (not just in cache) before use by other entities (guests or devices).
>
> The clean must follow the clear — so yes, the order needs to be reversed.
What you don't address though - why's the cleaning needed in the first place?
>>> +static struct page_info *p2m_allocate_root(struct domain *d)
>>> +{
>>> + struct page_info *page;
>>> + unsigned int order = get_order_from_bytes(KB(16));
>> While better than a hard-coded order of 2, this still is lacking. Is there
>> a reason there can't be a suitable manifest constant in the header?
>
> No specific reason; I just decided not to introduce a new definition, as it
> is only used inside this function.
>
> I think it will make sense to have in p2m.c:
> #define P2M_ROOT_PT_SIZE KB(16)
> If that isn't the best option, then what about moving this definition
> to config.h or asm/p2m.h?
It's defined by the hardware, so neither of the two headers looks to be a
good fit. Nor is the P2M_ prefix really in line with this being hardware-
defined. page.h has various paging-related hw definitions, and
riscv_encoding.h may also be a suitable place. There may be other candidates
that I'm presently overlooking.
>>> + unsigned int nr_pages = _AC(1,U) << order;
>> Nit (style): Missing blank after comma.
>
> I've changed that to BIT(order, U)
>
>>
>>> + /* Return back nr_pages necessary for p2m root table. */
>>> +
>>> + if ( ACCESS_ONCE(d->arch.paging.p2m_total_pages) < nr_pages )
>>> + panic("Specify more xen,domain-p2m-mem-mb\n");
>> You shouldn't panic() in anything involved in domain creation. You want to
>> return NULL in this case.
>
> It makes sense in this case just to return NULL.
>
>>
>> Further, to me the use of "more" looks misleading here. Do you perhaps mean
>> "larger" or "bigger"?
>>
>> This also looks to be happening without any lock held. If that's intentional,
>> I think the "why" wants clarifying in a code comment.
>
> Agreed, returning the pages needed for the p2m root table should be done
> under spin_lock(&d->arch.paging.lock).
Which should be acquired at the paging_*() layer then, not at the p2m_*() layer.
(As long as you mean to have that separation, that is. See the earlier
discussion on that matter.)
>>> + for ( unsigned int i = 0; i < nr_pages; i++ )
>>> + {
>>> + /* Return memory to domheap. */
>>> + page = page_list_remove_head(&d->arch.paging.p2m_freelist);
>>> + if( page )
>>> + {
>>> + ACCESS_ONCE(d->arch.paging.p2m_total_pages)--;
>>> + free_domheap_page(page);
>>> + }
>>> + else
>>> + {
>>> + printk(XENLOG_ERR
>>> + "Failed to free P2M pages, P2M freelist is empty.\n");
>>> + return NULL;
>>> + }
>>> + }
>> The reason for doing this may also want to be put in a comment.
>
> I thought the comment above would be enough: /* Return back nr_pages
> necessary for p2m root table. */
That describes what the code does, but not why.
>>> +{
>>> + struct p2m_domain *p2m = p2m_get_hostp2m(d);
>>> +
>>> + p2m->root = p2m_allocate_root(d);
>>> + if ( !p2m->root )
>>> + return -ENOMEM;
>>> +
>>> + p2m->hgatp = hgatp_from_page(p2m);
>>> +
>>> + return 0;
>>> +}
>>> +
>>> static spinlock_t vmid_alloc_lock = SPIN_LOCK_UNLOCKED;
>>>
>>> /*
>>> @@ -228,5 +313,14 @@ int p2m_set_allocation(struct domain *d, unsigned long pages, bool *preempted)
>>> }
>>> }
>>>
>>> + /*
>>> + * First, wait for the p2m pool to be initialized. Then allocate the root
>> Why "wait"? There's no waiting here.
>
> I don't really get your question.
>
> "wait" here is about the initialization of the pool which happens above this
> comment.
But there's no "waiting" involved. What you're talking about is one thing
needing to happen after the other.
Jan