Re: [PATCH v4 6/7] xen/riscv: page table handling
On 15.08.2024 15:34, oleksii.kurochko@xxxxxxxxx wrote:
> On Thu, 2024-08-15 at 14:16 +0200, Jan Beulich wrote:
>> On 15.08.2024 13:21, oleksii.kurochko@xxxxxxxxx wrote:
>>> On Thu, 2024-08-15 at 10:09 +0200, Jan Beulich wrote:
>>>> On 14.08.2024 18:50, oleksii.kurochko@xxxxxxxxx wrote:
>>>>> On Tue, 2024-08-13 at 12:31 +0200, Jan Beulich wrote:
>>>>>> On 09.08.2024 18:19, Oleksii Kurochko wrote:
>>>>>>> RISC-V detects superpages using `pte.x` and `pte.r`, as there
>>>>>>> is no specific bit in the PTE for this purpose. From the
>>>>>>> RISC-V spec:
>>>>>>> ```
>>>>>>> ...
>>>>>>> 4. Otherwise, the PTE is valid. If pte.r = 1 or pte.x = 1, go
>>>>>>>    to step 5. Otherwise, this PTE is a pointer to the next
>>>>>>>    level of the page table.
>>>>>>> ...
>>>>>>> 5. A leaf PTE has been found.
>>>>>>> ...
>>>>>>> ```
>>>>>>>
>>>>>>> The code doesn't support superpage shattering, so 4KB pages
>>>>>>> are used by default.
>>>>>>
>>>>>> ... you have this. Yet still callers expecting re-mapping in
>>>>>> the (large) range they map can request small-page mappings
>>>>>> right away.
>>>>> I am not sure that I fully understand what you mean by
>>>>> "expecting re-mapping".
>>>>
>>>> Right now you have callers pass PTE_BLOCK when they know that no
>>>> small-page re-mappings are going to occur for an area. What I'm
>>>> suggesting is that you invert this logic: have callers pass
>>>> PTE_SMALL when there is a possibility that re-mapping requests
>>>> may be issued later. Then, later, by simply grep-ing for
>>>> PTE_SMALL you'll be able to easily find all candidates that can
>>>> possibly be relaxed when super-page shattering starts being
>>>> supported. That's going to be easier than finding all instances
>>>> where PTE_BLOCK is _not_ used.
>>> So if I understand correctly.
>>> Actually nothing will change in the algorithm of pt_update(), and
>>> only PTE_SMALL should be introduced instead of PTE_BLOCK. And if I
>>> know that some memory would better be mapped as PTE_SMALL, to
>>> avoid shattering in case of an unmap (for example), I can just map
>>> that memory as PTE_SMALL and that is it?
>>
>> That is it.
>>
>>>>>>> +    spin_unlock(&xen_pt_lock);
>>>>>>
>>>>>> Does this really need to come after fence and flush?
>>>>> I think yes, as the page table should be updated by only one CPU
>>>>> at a time. And before giving other CPUs the ability to update
>>>>> the page table, we have to finish the work on the current CPU.
>>>>
>>>> Can you then explain to me, perhaps by way of an example, what
>>>> will go wrong if the unlock is ahead of the flush? (I'm less
>>>> certain about the fence, and that's also less expensive.)
>>> pt_update() will be called for interleaved regions, for example,
>>> by different CPUs:
>>>
>>> pt_update():
>>>   CPU1:                           CPU2:
>>>   ...                             spin_lock(&xen_pt_lock);
>>>   RISCV_FENCE(rw, rw);            ...
>>>
>>>   /* After this function is
>>>      executed, the following
>>>      can happen ------------------> starts to update page table
>>>    */                              entries which were partially
>>>   spin_unlock(&xen_pt_lock);       created by CPU1, but CPU2
>>>   ...                              doesn't know about them yet
>>>   ...                              because flush_tlb()
>>>   ...                              ( sfence.vma ) wasn't done
>>>   ...
>>>
>>>   flush_tlb_range_va();
>>
>> Not exactly: CPU2 knows about them as far as the memory used /
>> modified goes, and that's all that matters for further page table
>> modifications. CPU2 only doesn't know about the new page table
>> entries yet when it comes to using them for a translation (by the
>> hardware page walker). Yet this aspect is irrelevant here, if I'm
>> not mistaken.
> And isn't it an issue that CPU2 will add these new page table
> entries again during execution of CPU2's pt_update()?

Add these page table entries again? That's only going to happen due
to another bug somewhere, I suppose. And it would be as much (or as
little) of an issue if that happened right after dropping the lock.

Jan