Re: [PATCH v4 6/7] xen/riscv: page table handling
On 15.08.2024 13:21, oleksii.kurochko@xxxxxxxxx wrote:
> On Thu, 2024-08-15 at 10:09 +0200, Jan Beulich wrote:
>> On 14.08.2024 18:50, oleksii.kurochko@xxxxxxxxx wrote:
>>> On Tue, 2024-08-13 at 12:31 +0200, Jan Beulich wrote:
>>>> On 09.08.2024 18:19, Oleksii Kurochko wrote:
>>>>> RISC-V detects superpages using `pte.x` and `pte.r`, as there
>>>>> is no specific bit in the PTE for this purpose. From the RISC-V
>>>>> spec:
>>>>> ```
>>>>> ...
>>>>> 4. Otherwise, the PTE is valid. If pte.r = 1 or pte.x = 1, go to
>>>>>    step 5. Otherwise, this PTE is a pointer to the next level of
>>>>>    the page table.
>>>>> ...
>>>>> 5. A leaf PTE has been found.
>>>>> ...
>>>>> ```
>>>>>
>>>>> The code doesn't support superpage shattering, so 4KB pages are
>>>>> used by default.
>>>>
>>>> ... you have this. Yet still callers expecting re-mapping in the
>>>> (large) range they map can request small-page mappings right away.
>>> I am not sure that I fully understand what you mean by "expecting
>>> re-mapping".
>>
>> Right now you have callers pass PTE_BLOCK when they know that no
>> small-page re-mappings are going to occur for an area. What I'm
>> suggesting is that you invert this logic: have callers pass
>> PTE_SMALL when there is a possibility that re-mapping requests may
>> be issued later. Then, later, by simply grep-ing for PTE_SMALL
>> you'll be able to easily find all candidates that can possibly be
>> relaxed when superpage shattering starts being supported. That's
>> going to be easier than finding all instances where PTE_BLOCK is
>> _not_ used.
> So if I understand correctly, nothing will actually change in the
> algorithm of pt_update(), and only PTE_SMALL should be introduced
> instead of PTE_BLOCK. And if I know that some memory is better mapped
> as PTE_SMALL, to avoid shattering in case of unmap (for example), I
> can just map that memory as PTE_SMALL and that is it?

That is it.
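The spec steps quoted in the commit message can be sketched as a small predicate pair. This is a hedged illustration, not Xen's actual code: the macro and function names below are assumptions chosen for clarity, and only the V/R/X bit positions come from the RISC-V privileged spec.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions per the RISC-V privileged spec's PTE layout. */
#define PTE_VALID  (1UL << 0)  /* V */
#define PTE_READ   (1UL << 1)  /* R */
#define PTE_WRITE  (1UL << 2)  /* W */
#define PTE_EXEC   (1UL << 3)  /* X */

/*
 * A valid PTE with R or X set is a leaf (step 5 of the translation
 * algorithm); a valid PTE with both clear points to the next level of
 * the page table (step 4). There is no dedicated "superpage" bit: a
 * leaf found above the last level simply maps a larger region.
 */
static bool pte_is_leaf(uint64_t pte)
{
    return (pte & PTE_VALID) && (pte & (PTE_READ | PTE_EXEC));
}

static bool pte_is_table(uint64_t pte)
{
    return (pte & PTE_VALID) && !(pte & (PTE_READ | PTE_EXEC));
}
```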
>>>>> +    spin_unlock(&xen_pt_lock);
>>>>
>>>> Does this really need to come after fence and flush?
>>> I think yes, as the page table should be updated by only one CPU at
>>> a time. And before giving other CPUs the ability to update the page
>>> table, we have to finish the work on the current CPU.
>>
>> Can you then explain to me, perhaps by way of an example, what will
>> go wrong if the unlock is ahead of the flush? (I'm less certain
>> about the fence, and that's also less expensive.)
> pt_update() will be called for interleaved regions, for example, by
> different CPUs:
>
> pt_update():
>   CPU1:                            CPU2:
>   ...                              spin_lock(&xen_pt_lock);
>   RISCV_FENCE(rw, rw);             ...
>
>   /* After this, the following
>      can happen: -------------->   CPU2 starts to update page table
>    */                              entries which were partially
>   spin_unlock(&xen_pt_lock);       created by CPU1, but CPU2 doesn't
>   ...                              know about them yet because
>   ...                              flush_tlb() (sfence.vma) wasn't
>   ...                              done
>   flush_tlb_range_va();

Not exactly: CPU2 knows about them as far as the memory used / modified
goes, and that's all that matters for further page table modifications.
CPU2 only doesn't know about the new page table entries yet when it
comes to using them for a translation (by the hardware page walker).
Yet this aspect is irrelevant here, if I'm not mistaken.

Jan
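The ordering being debated (fence, then flush, then unlock) can be modeled in userspace. The following is a toy sketch only: every name here (toy_pt_update, toy_flush, the array standing in for the page table) is hypothetical, with a pthread mutex standing in for xen_pt_lock and a counter standing in for the sfence.vma-based TLB flush. It shows the shape of the code path, not its hardware semantics.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Toy stand-ins; none of these are Xen's actual objects. */
static pthread_mutex_t xen_pt_lock = PTHREAD_MUTEX_INITIALIZER;
static int page_table[16];       /* toy "page table" entries */
static atomic_int flushes_done;  /* counts completed "flushes" */

/* Stand-in for flush_tlb_range_va() / sfence.vma. */
static void toy_flush(void)
{
    atomic_fetch_add(&flushes_done, 1);
}

/*
 * Tail of a pt_update()-like function: entries are written under the
 * lock, and the flush completes before the lock is dropped, so the
 * next CPU taking the lock only ever sees fully published updates.
 */
static void *toy_pt_update(void *arg)
{
    int base = *(int *)arg;

    pthread_mutex_lock(&xen_pt_lock);
    for (int i = 0; i < 8; i++)
        page_table[base + i] = base + i;  /* "create" PTEs */
    toy_flush();                          /* flush before unlock */
    pthread_mutex_unlock(&xen_pt_lock);
    return NULL;
}
```

Jan's point is that moving toy_flush() after the unlock would still be safe for *software* readers of the table (the mutex orders the memory accesses); only the hardware walker's stale-translation window is affected, which the flush exists to close.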