
Re: [PATCH v1 1/3] xen/riscv: implement software page table walking


  • To: Oleksii Kurochko <oleksii.kurochko@xxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Wed, 29 Jan 2025 15:01:36 +0100
  • Cc: Alistair Francis <alistair.francis@xxxxxxx>, Bob Eshleman <bobbyeshleman@xxxxxxxxx>, Connor Davis <connojdavis@xxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Anthony PERARD <anthony.perard@xxxxxxxxxx>, Michal Orzel <michal.orzel@xxxxxxx>, Julien Grall <julien@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • Delivery-date: Wed, 29 Jan 2025 14:01:46 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 29.01.2025 14:12, Oleksii Kurochko wrote:
> 
> On 1/28/25 9:14 AM, Jan Beulich wrote:
>> On 27.01.2025 18:22, Oleksii Kurochko wrote:
>>> On 1/27/25 1:57 PM, Jan Beulich wrote:
>>>> On 27.01.2025 13:29, Oleksii Kurochko wrote:
>>>>> On 1/27/25 11:06 AM, Jan Beulich wrote:
>>>>>> On 20.01.2025 17:54, Oleksii Kurochko wrote:
>>>>>>> RISC-V doesn't have hardware feature to ask MMU to translate
>>>>>>> virtual address to physical address ( like Arm has, for example ),
>>>>>>> so software page table walking is implemented.
>>>>>>>
>>>>>>> Signed-off-by: Oleksii Kurochko<oleksii.kurochko@xxxxxxxxx>
>>>>>>> ---
>>>>>>>     xen/arch/riscv/include/asm/mm.h |  2 ++
>>>>>>>     xen/arch/riscv/pt.c             | 56 +++++++++++++++++++++++++++++++++
>>>>>>>     2 files changed, 58 insertions(+)
>>>>>>>
>>>>>>> diff --git a/xen/arch/riscv/include/asm/mm.h b/xen/arch/riscv/include/asm/mm.h
>>>>>>> index 292aa48fc1..d46018c132 100644
>>>>>>> --- a/xen/arch/riscv/include/asm/mm.h
>>>>>>> +++ b/xen/arch/riscv/include/asm/mm.h
>>>>>>> @@ -15,6 +15,8 @@
>>>>>>>     
>>>>>>>     extern vaddr_t directmap_virt_start;
>>>>>>>     
>>>>>>> +paddr_t pt_walk(vaddr_t va);
>>>>>> In the longer run, is returning just the PA really going to be sufficient?
>>>>>> If not, perhaps say a word on the limitation in the description.
>>>>> In the long run, this function's prototype looks like
>>>>> paddr_t pt_walk(vaddr_t root, vaddr_t va, bool is_xen) [1]. However, I'm not
>>>>> sure if it will stay that way, as I think is_xen could be skipped, since
>>>>> using map_table() should be sufficient (as it now considers system_state),
>>>>> and I'm not really sure if I need the root argument, as the initial goal was
>>>>> to use this function for debug-only purposes and I've never used it for
>>>>> guest page table (stage-1) walking.
>>>>> Anyway, yes, it is still returning a physical address, and that seems
>>>>> enough to me.
>>>>>
>>>>> Could you share your thoughts on what I should take into account for the
>>>>> return value? I am probably missing something really useful.
>>>> Often you care about the permissions as well. Sometimes it may even be
>>>> relevant to know the (super-)page size of the mapping.
>>> Perhaps it would be better to change the prototype to:
>>>     bool pt_walk(vaddr_t va, mfn_t *ret_pa);
>>> or even
>>>     void pt_walk(vaddr_t va, mfn_t *ret_pa);
>>> In this case, ret_pa = INVALID_MFN could serve as a signal that pt_walk()
>>> failed.
>>> If there's a need to return permissions or (super-)page size in the future,
>>> another argument could be added.
>>>
>>> What do you think? Would this approach be better?
>>>
>>> I am also considering returning a structure containing the mfn (or paddr_t)
>>> and adding other properties (such as permissions or page size) as needed in
>>> the future. Both solutions seem more or less equivalent.
>> Imo the most natural thing for a page walking function would be to return the
>> leaf PTE (or the leaf-most not-present [or otherwise "no-access"] one). That
>> would provide (almost) all possible information to the caller. "Almost"
>> because depending on how page walk works, permissions may combine across page
>> table levels. Yet then (see also the "no-access" above) this would also
>> require further input, to specify the context for which the translation is
>> being sought. For example, the intention to write may want to yield no valid
>> PTE when there are present ones down to the leaf, but effective permissions
>> say "read-only".
> 
> Perhaps returning the leaf PTE could be a really good option.
> 
> I'm not entirely sure I understand what you mean by "leaf-most not-present".
> Could you please explain this point once more?
> My expectation was that the function should return an existing leaf PTE (from
> which "access" rights could be determined) or NULL to indicate that no leaf
> PTE was found.

"no leaf PTE" may be for a variety of reasons. Hence why I think returning
the PTE at which the walk stopped (leaf or leaf-most not-present) is likely
best. Such a not-present PTE may, after all, still contain valuable
information; it's not like it has to be all zero.
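
To illustrate what I mean, here is a toy sketch. It is not the code from the
patch: it assumes an Sv39-style three-level layout, models page tables as a
small static array instead of real memory, and every name in it
(toy_pt_walk(), toy_tables, toy_table_from_pte(), ...) is made up for the
example. The walk stops at the first entry that is either not valid or a leaf,
and hands exactly that entry back, together with the level if the caller asks
for it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sv39-style PTE bits (assumed layout for this toy example). */
#define PTE_V          (1ULL << 0)
#define PTE_R          (1ULL << 1)
#define PTE_W          (1ULL << 2)
#define PTE_X          (1ULL << 3)
#define PTE_LEAF       (PTE_R | PTE_W | PTE_X)
#define PTE_PPN_SHIFT  10
#define PTE_PPN_MASK   ((1ULL << 44) - 1)

#define PAGE_SHIFT     12
#define VPN_BITS       9
#define ROOT_LEVEL     2    /* three levels: 2, 1, 0 */
#define NR_TOY_TABLES  4

typedef uint64_t pte_t;

/* Toy "physical memory": table number n stands in for PPN n. */
static pte_t toy_tables[NR_TOY_TABLES][1 << VPN_BITS];

/* Hypothetical helper: map a non-leaf PTE to the next-level table. */
static const pte_t *toy_table_from_pte(pte_t pte)
{
    uint64_t ppn = (pte >> PTE_PPN_SHIFT) & PTE_PPN_MASK;

    assert(ppn < NR_TOY_TABLES);
    return toy_tables[ppn];
}

static unsigned int vpn_index(uint64_t va, unsigned int level)
{
    return (va >> (PAGE_SHIFT + VPN_BITS * level)) & ((1u << VPN_BITS) - 1);
}

/*
 * Return the PTE at which the walk stopped: a leaf, or the leaf-most
 * not-present entry. If level is non-NULL, it receives that entry's level.
 */
static pte_t toy_pt_walk(const pte_t *root, uint64_t va, unsigned int *level)
{
    const pte_t *table = root;
    unsigned int l = ROOT_LEVEL;
    pte_t pte;

    for ( ; ; )
    {
        pte = table[vpn_index(va, l)];

        /* Stop on not-present, on a leaf, or when no level is left. */
        if ( !(pte & PTE_V) || (pte & PTE_LEAF) || l == 0 )
            break;

        table = toy_table_from_pte(pte);
        l--;
    }

    if ( level )
        *level = l;

    return pte;
}
```

A caller then checks PTE_V on the result itself. A not-present entry comes
back unmodified, so whatever software-defined bits it carries remain visible
to the caller.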

> Another thing I'm curious about is whether this would be sufficient for
> determining the level.
> It seems clear that, given a PTE and a virtual address, we could compute:
>     mask = VA | paddr_from_pte(pte)

What would this value represent? No, from holding a PTE in your hands you
can't determine the level it came from. So yes, ...

> Then, iterating through each level, we could apply the following and work out
> at which level it was mapped:
>     mask & (BIT(XEN_PT_LEVEL_ORDER(i), UL) - 1)
> 
> If I haven't overlooked any other way to calculate the page table level,
> would it be better to simply add another argument to pt_walk() to return the
> level?

... for callers who care, doing this might then be necessary (this would be
a pointer parameter, and since I expect many callers wouldn't care about
the level, it likely wants to be permissible to pass in NULL).

Question then is whether it's better to hand back the level or the page
order of the mapping. On x86 we return the latter from P2M lookups, for
example.
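
For what it's worth, converting between the two is trivial, so the choice is
mostly about what is convenient for callers. A rough sketch, assuming 4k base
pages and 9 index bits per level as in Sv39, with all names (toy_level_to_order(),
toy_pa_from_leaf(), the TOY_* constants) invented for the illustration:

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT     12
#define TOY_VPN_BITS       9
#define TOY_PTE_PPN_SHIFT  10
#define TOY_PTE_PPN_MASK   ((1ULL << 44) - 1)

/* Page order (log2 of the number of 4k pages) mapped by a leaf at 'level'. */
static unsigned int toy_level_to_order(unsigned int level)
{
    return level * TOY_VPN_BITS;
}

/*
 * Combine a leaf PTE with the VA it translates. The level (or order) is
 * needed to know how many low VA bits pass through as the in-page offset.
 */
static uint64_t toy_pa_from_leaf(uint64_t pte, uint64_t va, unsigned int level)
{
    unsigned int shift = TOY_PAGE_SHIFT + toy_level_to_order(level);
    uint64_t offset_mask = (1ULL << shift) - 1;
    uint64_t base = ((pte >> TOY_PTE_PPN_SHIFT) & TOY_PTE_PPN_MASK)
                    << TOY_PAGE_SHIFT;

    /* A superpage leaf has to be aligned to its own size. */
    assert(!(base & offset_mask));

    return base | (va & offset_mask);
}
```

This also shows why the PTE alone isn't enough: the same PTE value yields a
different PA depending on the level it was found at.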

Jan



 

