[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v5 6/7] xen/riscv: page table handling
On 29.08.2024 14:04, oleksii.kurochko@xxxxxxxxx wrote: > On Thu, 2024-08-29 at 09:01 +0200, Jan Beulich wrote: >> On 28.08.2024 18:11, oleksii.kurochko@xxxxxxxxx wrote: >>> On Tue, 2024-08-27 at 17:00 +0200, Jan Beulich wrote: >>>> On 21.08.2024 18:06, Oleksii Kurochko wrote: >>>>> @@ -68,6 +111,20 @@ static inline bool pte_is_valid(pte_t p) >>>>> return p.pte & PTE_VALID; >>>>> } >>>>> >>>>> +inline bool pte_is_table(const pte_t p) >>>>> +{ >>>>> + return ((p.pte & (PTE_VALID | >>>>> + PTE_READABLE | >>>>> + PTE_WRITABLE | >>>>> + PTE_EXECUTABLE)) == PTE_VALID); >>>>> +} >>>> >>>> In how far is the READABLE check valid here? You (imo correctly) >>>> ... >> >> Oh, I wrongly picked on READABLE when it should have been the >> WRITABLE >> bit. >> >>>>> +static inline bool pte_is_mapping(const pte_t p) >>>>> +{ >>>>> + return (p.pte & PTE_VALID) && >>>>> + (p.pte & (PTE_WRITABLE | PTE_EXECUTABLE)); >>>>> +} >>>> >>>> ... don't consider this bit here. >>> pte_is_mapping() seems to me is correct as according to the RISC-V >>> privileged spec: >>> 4. Otherwise, the PTE is valid. If pte.r = 1 or pte.x = 1, go to >>> step >>> 5. Otherwise, this PTE is a pointer to the next level of the >>> page >>> table. >>> 5. A leaf PTE has been found. ... >> >> Right. And then why do you check all three of r, x, and w, when the >> doc >> mentions only r and x? There may be reasons, but such reasons then >> need >> clearly stating in a code comment, for people to understand why the >> code >> is not following the spec. > So I remembered why R, W, and X are checked. There is contradictory > information about these bits > (https://github.com/riscv/riscv-isa-manual/blob/main/src/supervisor.adoc?plain=1#L1317C64-L1321C10 > ): > ``` > The permission bits, R, W, and X, indicate whether the page is > readable, writable, and executable, respectively. When all three are > zero, the PTE is a pointer to the next level of the page table; > otherwise, it is a leaf PTE. > ``` > > However, it is also written here > (https://github.com/riscv/riscv-isa-manual/blob/main/src/supervisor.adoc?plain=1#L1539 > ) that only pte.r and pte.x should be checked. > > I can assume that the interpretation that R=W=X=0 indicates a pointer > to the next level of the page table could come from this statement > (https://github.com/riscv/riscv-isa-manual/blob/main/src/supervisor.adoc?plain=1#L1538 > ): > ``` > If _pte_._v_ = 0, or if _pte_._r_ = 0 and _pte_._w_ = 1, or if any bits > or encodings that are reserved for future standard use are set within > _pte_, stop and raise a page-fault exception corresponding to the > original access type. > ``` > From this, I can assume that when pte.r = 0, pte.w should also always > be zero; otherwise, a page-fault exception will be raised. ( but it is > no obviously connected to if the PTE is a pointer to the next page > table or not... ). I don't view the information provided as contradictory, especially when further taking the "Encoding of PTE R/W/X fields" table into account: W set but the other two clear is "Reserved for future use." >>>>> + * If `mfn` is valid and flags has PTE_VALID bit set then it >>>>> means >>>>> that >>>>> + * inserting will be done. >>>>> + */ >>>> >>>> What about mfn != INVALID_MFN and PTE_VALID clear? >>> PTE_VALID=0 will be always considered as destroying and no matter >>> what >>> is mfn value as in this case the removing is done in the way where >>> mfn >>> isn't used: >> >> Right, yet elsewhere you're restrictive as to MFN values valid to >> use. >> Not requiring INVALID_MFN here looks inconsistent to me. > but actually if we will leave ASSERT in pt_check_entry() we will be > sure that we are here with mfn = INVALID_MFN: > /* Sanity check when removing a mapping. */ > else if ( (flags & (PTE_VALID | PTE_POPULATE)) == 0 ) > { > /* We should be here with an invalid MFN. */ > ASSERT(mfn_eq(mfn, INVALID_MFN)); Having such an assertion there is fine, but doesn't save you from getting comments correct / complete. >>> memset(&pte, 0x00, sizeof(pte)); >> >> Just to mention it: I don't think memset() is a very good way of >> clearing >> a PTE, even if right here it's not a live one. > Just direct assigning would be better? Imo yes. >>>> Also note that "`mfn` is >>>> valid" isn't the same as "mfn != INVALID_MFN". You want to be >>>> precise >>>> here, >>>> to avoid confusion later on. (I say that knowing that we're still >>>> fighting >>>> especially shadow paging code on x86 not having those properly >>>> separated.) >>> If it is needed to be precise and mfn is valid isn't the same as >>> "mfn >>> != INVALID_MFN" only for the case of shadow paging? >> >> No, I used shadow paging only as an example of where we have similar >> issues. I'd like to avoid that a new port starts out with introducing >> more instances of that. You want to properly separate INVALID_MFN >> from >> "invalid MFN", where the latter means any MFN where either nothing >> exists at all, or (see mfn_valid()) where no struct page_info exists. > Well, now I think I understand the difference between "INVALID_MFN" and > "invalid MFN." > > Referring back to your original reply, I need to update the comment > above pt_update(): > ``` > ... > * If `mfn` is valid ( exist ) and flags has PTE_VALID bit set then it > means that inserting will be done. > ``` > Would this be correct and more precise? That depends on whether it correctly describes what the code does. If the code continues to check against INVALID_MFN, such a description wouldn't be correct. Also, just to re-iterate, ... > Based on the code for mfn_valid(), the separation is currently done > using the max_page value, which can't be initialized at the moment as > it requires reading the device tree file to obtain the RAM end. ... mfn_valid() may return false for MMIO pages, for which it may still be legitimate to create mappings. IMO ... > We could use a placeholder for the RAM end (for example, a very high > value like -1UL) and then add __mfn_valid() within pt_update(). > However, I'm not sure if this approach aligns with what you consider by > proper separation between INVALID_MFN and "invalid MFN." ... throughout the code here you mean INVALID_MFN and never "invalid MFN". Populating page tables is lower a layer than where you want to be concerned with that distinction; the callers of these low level functions will need to make the distinction where necessary. Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |