
Re: [Xen-devel] [PATCH v3 07/10] xen/arm: Add handling write fault for dirty-page tracing

On Mon, 2013-08-05 at 12:11 +0100, Stefano Stabellini wrote:
> On Mon, 5 Aug 2013, Jaeyong Yoo wrote:
> > > -----Original Message-----
> > > From: Stefano Stabellini [mailto:stefano.stabellini@xxxxxxxxxxxxx]
> > > Sent: Monday, August 05, 2013 1:28 AM
> > > To: Jaeyong Yoo
> > > Cc: xen-devel@xxxxxxxxxxxxx
> > > Subject: Re: [Xen-devel] [PATCH v3 07/10] xen/arm: Add handling write
> > > fault for dirty-page tracing
> > > 
> > > On Thu, 1 Aug 2013, Jaeyong Yoo wrote:
> > > > Add handling of write faults in do_trap_data_abort_guest for dirty-page
> > > tracing.
> > > > Rather than maintaining a bitmap for dirty pages, we use the avail bit
> > > in the p2m entry.
> > > > For locating the write-fault PTE in the guest p2m, we use a virtual-linear
> > > > page table that slots the guest p2m into Xen's virtual memory.
> > > >
> > > > Signed-off-by: Jaeyong Yoo <jaeyong.yoo@xxxxxxxxxxx>
> > > 
> > > Looks good to me.
> > > I would appreciate some more comments in the code to explain the inner
> > > workings of the vlp2m.
> > I got it.
> > 
> > One question: If you see patch #6, it implements the allocation and free of
> > vlp2m memory (xen/arch/arm/vlpt.c), which is almost the same as the vmap
> > allocation (xen/arch/arm/vmap.c). To be honest, I copied vmap.c and changed
> > the virtual address start/end points and the name. While I was doing that,
> > I thought it would be better if we made a common interface, something like
> > a virtual address allocator. That is, if we create a virtual address allocator
> > giving the VA range from A to B, the allocator allocates VAs in between
> > A and B. And we would initialize the virtual allocator instance at boot stage.
> Good question. I think it might be best to improve the current vmap
> (it's actually xen/common/vmap.c) so that we can have multiple vmap
> instances for different virtual address ranges at the same time.

Before we go off and do that:

I don't think this patch implements a linear p2m mapping in the sense in
which I intended it when I suggested it. The patch implements a manual
lookup with a kind of cache of the resulting mapping, I think.

A linear mapping means inserting the current p2m base pointer into Xen's
own pagetables in such a way that you can access a leaf node of the p2m
by dereferencing a virtual address. Given this setup there should be no
need for on-demand mapping as part of the log-dirty stuff, all the
smarts happen at context switch time.
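To make "dereferencing a virtual address" concrete, here is a minimal sketch of the address arithmetic such a linear map gives you. The base address and names are illustrative (not Xen's actual symbols); the point is that once the p2m root is stitched into Xen's tables, finding a leaf PTE is a single array-index computation with no software walk:

```c
#include <stdint.h>

/* Assumed layout for illustration: the p2m linear map starts at
 * LINEAR_BASE in Xen's virtual address space, and LPAE leaf (third
 * level) entries are 8 bytes each. The hardware walk resolves the
 * upper levels; software just indexes by guest frame number. */
#define LINEAR_BASE 0x08000000UL
#define PTE_SIZE    8

static inline uintptr_t p2m_leaf_pte_va(uint64_t gfn)
{
    return LINEAR_BASE + (uintptr_t)(gfn * PTE_SIZE);
}
```

With 512 entries per 4K leaf table, consecutive leaf tables appear back-to-back in the linear window, so the whole p2m third level reads like one flat array of PTEs.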

Normally a linear memory map is done by creating a loop in the page
tables, i.e. HTTBR[N] would contain an entry which referenced HTTBR
again. In this case we actually have a separate p2m table which we want
to stitch into the normal tables, which makes it a bit different to the
classical case.
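The effect of the loop is easiest to see in a toy model: if a root slot points back at the root table itself, a fixed-depth walk through that slot stops one level short and hands back *table entries* as if they were data. This sketch models tables as small arrays of indices (nothing Xen-specific, purely illustrative):

```c
#include <stdint.h>

/* Toy two-level page tables: a pool of tables, each with 4 entries.
 * An entry holds the pool index of the next-level table; a full walk
 * returns the second-level entry as "data". */
enum { ENTRIES = 4, TABLES = 8 };
static uint64_t pool[TABLES][ENTRIES];

static uint64_t walk(unsigned root, unsigned i1, unsigned i2)
{
    unsigned next = (unsigned)pool[root][i1]; /* level-1 lookup */
    return pool[next][i2];                    /* level-2 lookup */
}
```

If root slot 3 is made to point back at the root, then walk(root, 3, n) returns root entry n itself, which is exactly how a loop lets software read PTEs through ordinary loads.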

Let's assume both Xen's page tables and the p2m are two-level, to
simplify the ASCII art.

So for the P2M you have:
`-------> P2M FIRST
          `----------> P2M SECOND
                       `-------------GUEST RAM

Now if we arrange that Xen's page tables contains the VTTBR in a top
level page table slot:
`-------> VTTBR
          `----------> P2M FIRST
                       `-------------P2M SECOND, ACCESSED AS XEN RAM

So now Xen can access the leaf PTEs of the P2M directly, just by using
the correct virtual address.

This can be slightly tricky if P2M FIRST can contain super page
mappings, since you then need to stop the walk a level sooner to get
the correct PT entry. This means we need a second virtual address
region which maps to that, created by a loop in the page tables, e.g.

`-------> HTTBR
          `----------> VTTBR
                       `-------------P2M FIRST, ACCESSED AS XEN RAM

Under Xen, which uses LPAE and 3-level tables, I think the P2M SECOND
would require 16 slots in the xen_second tables, which need to be
context switched; the regions needed to hit the super page mappings
would need slots too. If we use the gap between 128M and 256M in the
Xen memory map then that means we are using
xen_second[64..80]=p2m[0..16] for the linear map of the p2m leaf nodes.
We can then use xen_second[80..144] to point back to xen_second,
allowing xen_second[64..80] to be dereferenced and creating the loop
needed for mapping the superpage PTEs in the P2M.
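The slot numbers above fall out of simple arithmetic: each second-level (2MB) slot covers 2MB of virtual address space, so the 128M..256M gap starts at slot 128M/2M = 64 and 16 slots cover 32M. A tiny sketch of that conversion (the function name is illustrative, not Xen's):

```c
#include <stdint.h>

/* Each LPAE second-level entry maps a 2MB region, so the slot index
 * for a VA is just the VA shifted down by log2(2MB) = 21. */
#define SECOND_SHIFT 21

static inline unsigned va_to_second_slot(uintptr_t va)
{
    return (unsigned)(va >> SECOND_SHIFT);
}
```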

So, given this, we have in the Xen mappings xen_second[64..80] for the
linear map of the p2m leaf tables and xen_second[80..144] looping back
onto xen_second itself (*).

(*) here we only care about xen_second[64..80], but the loop maps
xen_second[0..512], a larger region which we can safely ignore.

So if my maths is correct this means Xen can access P2M THIRD entries at
virtual addresses 0x8000000..0xa000000 and P2M SECOND entries at
0x12000000..0x14000000, which means that the fault handler just needs to
look up the P2M SECOND entry to check it isn't a super page mapping and
then look up the P2M THIRD entry to mark it dirty etc.

If for some reason we also need to access P2M FIRST efficiently we could
add a third region, but I don't think we will be doing 1GB P2M mappings
for the time being.

It occurs to me now that with 16 slots changing on context switch, and a
further 16 aliasing them (and hence requiring maintenance too) for the
super pages, the TLB maintenance at context switch might get
prohibitively expensive. We could address this firstly by only doing it
when switching to/from domains which have log dirty mode enabled, and
secondly by seeing if we can make use of global or locked-down mappings
for the static Xen .text/.data/.xenheap mappings, which would allow us
to use a bigger global flush.

In hindsight, it might be that the cost of doing the domain_map_page
walk on each lookup is offset by avoiding all that TLB maintenance on
context switch. It may be that this is something we can only resolve by
measuring.

BTW, eventually we will have a direct map of all RAM for 64-bit only, so
we would likely end up with different schemes for p2m lookups for the
two sub-arches, since in the 64-bit direct-map case domain_map_page is
very cheap.

I hope my description of a linear map makes sense; it's hard to do
without a whiteboard ;-)

