|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v5 6/7] xen/riscv: page table handling
On 29.08.2024 14:04, oleksii.kurochko@xxxxxxxxx wrote:
> On Thu, 2024-08-29 at 09:01 +0200, Jan Beulich wrote:
>> On 28.08.2024 18:11, oleksii.kurochko@xxxxxxxxx wrote:
>>> On Tue, 2024-08-27 at 17:00 +0200, Jan Beulich wrote:
>>>> On 21.08.2024 18:06, Oleksii Kurochko wrote:
>>>>> @@ -68,6 +111,20 @@ static inline bool pte_is_valid(pte_t p)
>>>>> return p.pte & PTE_VALID;
>>>>> }
>>>>>
>>>>> +inline bool pte_is_table(const pte_t p)
>>>>> +{
>>>>> + return ((p.pte & (PTE_VALID |
>>>>> + PTE_READABLE |
>>>>> + PTE_WRITABLE |
>>>>> + PTE_EXECUTABLE)) == PTE_VALID);
>>>>> +}
>>>>
>>>> In how far is the READABLE check valid here? You (imo correctly)
>>>> ...
>>
>> Oh, I wrongly picked on READABLE when it should have been the
>> WRITABLE
>> bit.
>>
>>>>> +static inline bool pte_is_mapping(const pte_t p)
>>>>> +{
>>>>> + return (p.pte & PTE_VALID) &&
>>>>> + (p.pte & (PTE_WRITABLE | PTE_EXECUTABLE));
>>>>> +}
>>>>
>>>> ... don't consider this bit here.
>>> pte_is_mapping() seems to me is correct as according to the RISC-V
>>> privileged spec:
>>> 4. Otherwise, the PTE is valid. If pte.r = 1 or pte.x = 1, go to
>>> step
>>> 5. Otherwise, this PTE is a pointer to the next level of the
>>> page
>>> table.
>>> 5. A leaf PTE has been found. ...
>>
>> Right. And then why do you check all three of r, x, and w, when the
>> doc
>> mentions only r and x? There may be reasons, but such reasons then
>> need
>> clearly stating in a code comment, for people to understand why the
>> code
>> is not following the spec.
> So I remembered why R, W, and X are checked. There is contradictory
> information about these bits
> (https://github.com/riscv/riscv-isa-manual/blob/main/src/supervisor.adoc?plain=1#L1317C64-L1321C10
> ):
> ```
> The permission bits, R, W, and X, indicate whether the page is
> readable, writable, and executable, respectively. When all three are
> zero, the PTE is a pointer to the next level of the page table;
> otherwise, it is a leaf PTE.
> ```
>
> However, it is also written here
> (https://github.com/riscv/riscv-isa-manual/blob/main/src/supervisor.adoc?plain=1#L1539
> ) that only pte.r and pte.x should be checked.
>
> I can assume that the interpretation that R=W=X=0 indicates a pointer
> to the next level of the page table could come from this statement
> (https://github.com/riscv/riscv-isa-manual/blob/main/src/supervisor.adoc?plain=1#L1538
> ):
> ```
> If _pte_._v_ = 0, or if _pte_._r_ = 0 and _pte_._w_ = 1, or if any bits
> or encodings that are reserved for future standard use are set within
> _pte_, stop and raise a page-fault exception corresponding to the
> original access type.
> ```
> From this, I can assume that when pte.r = 0, pte.w should also always
> be zero; otherwise, a page-fault exception will be raised. ( but it is
> no obviously connected to if the PTE is a pointer to the next page
> table or not... ).
I don't view the information provided as contradictory, especially when
further taking the "Encoding of PTE R/W/X fields" table into account: W
set but the other two clear is "Reserved for future use."
>>>>> + * If `mfn` is valid and flags has PTE_VALID bit set then it
>>>>> means
>>>>> that
>>>>> + * inserting will be done.
>>>>> + */
>>>>
>>>> What about mfn != INVALID_MFN and PTE_VALID clear?
>>> PTE_VALID=0 will be always considered as destroying and no matter
>>> what
>>> is mfn value as in this case the removing is done in the way where
>>> mfn
>>> isn't used:
>>
>> Right, yet elsewhere you're restrictive as to MFN values valid to
>> use.
>> Not requiring INVALID_MFN here looks inconsistent to me.
> but actually if we will leave ASSERT in pt_check_entry() we will be
> sure that we are here with mfn = INVALID_MFN:
> /* Sanity check when removing a mapping. */
> else if ( (flags & (PTE_VALID | PTE_POPULATE)) == 0 )
> {
> /* We should be here with an invalid MFN. */
> ASSERT(mfn_eq(mfn, INVALID_MFN));
Having such an assertion there is fine, but doesn't save you from getting
comments correct / complete.
>>> memset(&pte, 0x00, sizeof(pte));
>>
>> Just to mention it: I don't think memset() is a very good way of
>> clearing
>> a PTE, even if right here it's not a live one.
> Just direct assigning would be better?
Imo yes.
>>>> Also note that "`mfn` is
>>>> valid" isn't the same as "mfn != INVALID_MFN". You want to be
>>>> precise
>>>> here,
>>>> to avoid confusion later on. (I say that knowing that we're still
>>>> fighting
>>>> especially shadow paging code on x86 not having those properly
>>>> separated.)
>>> If it is needed to be precise and mfn is valid isn't the same as
>>> "mfn
>>> != INVALID_MFN" only for the case of shadow paging?
>>
>> No, I used shadow paging only as an example of where we have similar
>> issues. I'd like to avoid that a new port starts out with introducing
>> more instances of that. You want to properly separate INVALID_MFN
>> from
>> "invalid MFN", where the latter means any MFN where either nothing
>> exists at all, or (see mfn_valid()) where no struct page_info exists.
> Well, now I think I understand the difference between "INVALID_MFN" and
> "invalid MFN."
>
> Referring back to your original reply, I need to update the comment
> above pt_update():
> ```
> ...
> * If `mfn` is valid ( exist ) and flags has PTE_VALID bit set then it
> means that inserting will be done.
> ```
> Would this be correct and more precise?
That depends on whether it correctly describes what the code does. If
the code continues to check against INVALID_MFN, such a description
wouldn't be correct. Also, just to re-iterate, ...
> Based on the code for mfn_valid(), the separation is currently done
> using the max_page value, which can't be initialized at the moment as
> it requires reading the device tree file to obtain the RAM end.
... mfn_valid() may return false for MMIO pages, for which it may still
be legitimate to create mappings. IMO ...
> We could use a placeholder for the RAM end (for example, a very high
> value like -1UL) and then add __mfn_valid() within pt_update().
> However, I'm not sure if this approach aligns with what you consider by
> proper separation between INVALID_MFN and "invalid MFN."
... throughout the code here you mean INVALID_MFN and never "invalid MFN".
Populating page tables is lower a layer than where you want to be
concerned with that distinction; the callers of these low level functions
will need to make the distinction where necessary.
Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |