Re: [Xen-devel] [PATCH v4 7/9] xen/arm: Implement virtual-linear page table for guest p2m mapping in live migration
> -----Original Message-----
> From: Ian Campbell [mailto:Ian.Campbell@xxxxxxxxxx]
> Sent: Wednesday, October 16, 2013 9:51 PM
> To: Jaeyong Yoo
> Cc: xen-devel@xxxxxxxxxxxxx
> Subject: Re: [Xen-devel] [PATCH v4 7/9] xen/arm: Implement virtual-linear
> page table for guest p2m mapping in live migration
>
> On Fri, 2013-10-04 at 13:44 +0900, Jaeyong Yoo wrote:
> > Implement vlpt (virtual-linear page table) for fast accessing of 3rd
> > PTE of guest p2m. When creating a mapping for vlpt, just copy the 1st
> > level PTE of guest p2m to the xen's 2nd level PTE. Then the mapping
> > becomes the following:
> >     xen's 1st PTE -->
> >     xen's 2nd PTE (which is the same as 1st PTE of guest p2m) -->
> >     guest p2m's 2nd PTE -->
> >     guest p2m's 3rd PTE (the memory contents where the vlpt points)
> > For more info about vlpt, see:
> > http://www.technovelty.org/linux/virtual-linear-page-table.html
> >
> > This function is used in dirty-page tracing: when domU write-fault is
> > trapped by xen, xen can immediately locate the 3rd PTE of guest p2m.
>
> ISTR you had some numbers which backed up the choice of this solution.
> Can you either include them here or link to the list archives?

I think a link to the list is OK. The full text is too long, I guess.
> > Signed-off-by: Jaeyong Yoo <jaeyong.yoo@xxxxxxxxxxx>
> > ---
> >  xen/arch/arm/mm.c            | 78 ++++++++++++++++++++++++++++++++++++++++++++
> >  xen/include/asm-arm/config.h |  6 ++++
> >  xen/include/asm-arm/domain.h |  7 ++++
> >  xen/include/asm-arm/mm.h     | 22 +++++++++++++
> >  4 files changed, 113 insertions(+)
> >
> > diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> > index 120eae8..98edbc9 100644
> > --- a/xen/arch/arm/mm.c
> > +++ b/xen/arch/arm/mm.c
> > @@ -1382,6 +1382,84 @@ int is_iomem_page(unsigned long mfn)
> >          return 1;
> >      return 0;
> >  }
> > +
> > +/* flush the vlpt area */
> > +static void flush_vlpt(struct domain *d)
> > +{
> > +    int flush_size;
> > +    flush_size = (d->arch.dirty.second_lvl_end -
> > +                  d->arch.dirty.second_lvl_start) << SECOND_SHIFT;
> > +    /* flushing the 3rd level mapping */
> > +    flush_xen_data_tlb_range_va(VIRT_LIN_P2M_START,
> > +                                flush_size);
>
> Shouldn't the base here be VIRT_LIN_P2M_START +
> (d->arch.dirty.second_lvl_start << SECOND_SHIFT) or something like that?

Yes, right. It will also decrease the flush overhead.

> flush_xen_data_tlb_range_va just turns into a loop over the addresses, so
> you might find you may as well do the flushes as you update the ptes in
> the below, perhaps with an optimisation to pull the barriers outside that
> loop.

You mean the barriers in write_pte? OK. It looks better.
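The agreed fix can be sketched standalone; this is only an illustration of the address arithmetic, with example constant values and the dirty indices treated as slot offsets relative to the vlpt base (not Xen's real layout or API):

```c
#include <assert.h>

/* Illustrative constants: 2MB second-level entries and an example
 * vlpt base; not Xen's real configuration values. */
#define SECOND_SHIFT       21
#define VIRT_LIN_P2M_START 0x08000000UL

/* Flush only the sub-range of the vlpt this domain actually uses,
 * rather than always starting at VIRT_LIN_P2M_START.  start/end are
 * second-level slot indices relative to the vlpt base. */
static unsigned long vlpt_flush_base(int start)
{
    return VIRT_LIN_P2M_START + ((unsigned long)start << SECOND_SHIFT);
}

static unsigned long vlpt_flush_size(int start, int end)
{
    return (unsigned long)(end - start) << SECOND_SHIFT;
}
```

This narrows the flushed range to exactly the second-level slots the domain occupies, which is where the reduced overhead comes from.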
> > +}
> > +
> > +/* restore the xen page table for vlpt mapping for domain d */
> > +void restore_vlpt(struct domain *d)
> > +{
> > +    int i, j = 0;
> > +    int need_flush = 0;
> > +
> > +    for ( i = d->arch.dirty.second_lvl_start;
> > +          i < d->arch.dirty.second_lvl_end;
> > +          ++i, ++j )
> > +    {
> > +        if ( xen_second[i].bits != d->arch.dirty.second_lvl[j].bits )
> > +        {
> > +            write_pte(&xen_second[i], d->arch.dirty.second_lvl[j]);
> > +            need_flush = 1;
> > +        }
> > +    }
> > +
> > +    if ( need_flush ) flush_vlpt(d);
> > +}
> > +
> > +/* setting up the xen page table for vlpt mapping for domain d */
> > +void prepare_vlpt(struct domain *d)
> > +{
> > +    int xen_second_linear_base;
> > +    int gp2m_start_index, gp2m_end_index;
> > +    struct p2m_domain *p2m = &d->arch.p2m;
> > +    struct page_info *second_lvl_page;
> > +    vaddr_t gma_start = 0;
> > +    vaddr_t gma_end = 0;
> > +    lpae_t *first;
> > +    int i, j;
> > +
> > +    xen_second_linear_base = second_linear_offset(VIRT_LIN_P2M_START);
> > +    get_gma_start_end(d, &gma_start, &gma_end);
> > +
> > +    gp2m_start_index = gma_start >> FIRST_SHIFT;
> > +    gp2m_end_index = (gma_end >> FIRST_SHIFT) + 1;
> > +
> > +    second_lvl_page = alloc_domheap_page(NULL, 0);
>
> The p2m first is two concatenated pages with a total of 1024 entries
> (which is needed to give the full 40 bit IPA space). I have a feeling this
> means you need two pages here?
>
> Or maybe we shouldn't be supporting the full 40 bit addresses on 32-bit?

For generality, I think it is better to support it. But, honestly, I'm not
sure how many people would use a 40-bit ARM guest.

> NB currently create_p2m_entries doesn't actually support 40 bits, but I
> need to fix that for an arm64 platform I'm looking at.
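The index computation in prepare_vlpt can be checked in isolation: each first-level (1GB) entry of the guest p2m occupies one of Xen's second-level slots, so a guest RAM range maps to the slot range below. This is a hedged sketch mirroring the patch's arithmetic (the function name is illustrative, and FIRST_SHIFT is the usual LPAE value):

```c
#include <assert.h>
#include <stdint.h>

#define FIRST_SHIFT 30  /* 1GB per first-level LPAE entry */

/* Compute the [start, end) range of guest-p2m first-level indices
 * covering [gma_start, gma_end), exactly as prepare_vlpt does. */
static void gp2m_index_range(uint64_t gma_start, uint64_t gma_end,
                             int *start_index, int *end_index)
{
    *start_index = (int)(gma_start >> FIRST_SHIFT);
    *end_index   = (int)(gma_end >> FIRST_SHIFT) + 1;  /* exclusive end */
}
```

For example, a guest with RAM at 2GB-3GB (0x80000000-0xC0000000) covers first-level indices 2 and 3, so two second-level slots are copied.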
> > +    d->arch.dirty.second_lvl = map_domain_page_global(
> > +        page_to_mfn(second_lvl_page) );
> > +
> > +    first = __map_domain_page(p2m->first_level);
> > +    for ( i = gp2m_start_index, j = 0; i < gp2m_end_index; ++i, ++j )
> > +    {
> > +        write_pte(&xen_second[xen_second_linear_base+i],
> > +                  first[i]);
> > +
> > +        /* we copy the mapping into domain's structure as a reference
> > +         * in case of the context switch (used in restore_vlpt) */
> > +        d->arch.dirty.second_lvl[j] = first[i];
> > +    }
> > +    unmap_domain_page(first);
> > +
> > +    /* storing the start and end index */
> > +    d->arch.dirty.second_lvl_start = xen_second_linear_base +
> > +        gp2m_start_index;
> > +    d->arch.dirty.second_lvl_end = xen_second_linear_base +
> > +        gp2m_end_index;
> > +
> > +    flush_vlpt(d);
> > +}
> > +
> > +void cleanup_vlpt(struct domain *d)
> > +{
> > +    unmap_domain_page_global(d->arch.dirty.second_lvl);
> > +}
> > +
> >  /*
> >   * Local variables:
> >   * mode: C
> > diff --git a/xen/include/asm-arm/config.h
> > b/xen/include/asm-arm/config.h
> > index 9e395c2..47fc26e 100644
> > --- a/xen/include/asm-arm/config.h
> > +++ b/xen/include/asm-arm/config.h
> > @@ -87,6 +87,7 @@
> >   *   0  -   8M   <COMMON>
> >   *
> >   *  32M - 128M   Frametable: 24 bytes per page for 16GB of RAM
> > + * 128M - 256M   Virtual-linear mapping to P2M table
> >   * 256M -   1G   VMAP: ioremap and early_ioremap use this virtual
> >   *               address space
> >   *
> > @@ -132,6 +133,9 @@
> >
> >  #define VMAP_VIRT_END    XENHEAP_VIRT_START
> >
> > +#define VIRT_LIN_P2M_START     _AT(vaddr_t,0x08000000)
>
> Can you put this up in the list of addresses in order please.
> > +#define VIRT_LIN_P2M_END       VMAP_VIRT_START
> > +
> >  #define DOMHEAP_ENTRIES        1024  /* 1024 2MB mapping slots */
> >
> >  /* Number of domheap pagetable pages required at the second level
> >   * (2MB mappings) */
> > @@ -158,6 +162,8 @@
> >
> >  #define HYPERVISOR_VIRT_END    DIRECTMAP_VIRT_END
> >
> > +/* TODO (ARM_64): define VIRT_LIN_P2M_START VIRT_LIN_P2M_END */
> > +
> >  #endif
> >
> >  /* Fixmap slots */
> > diff --git a/xen/include/asm-arm/domain.h
> > b/xen/include/asm-arm/domain.h
> > index 755c7af..7e7743a 100644
> > --- a/xen/include/asm-arm/domain.h
> > +++ b/xen/include/asm-arm/domain.h
> > @@ -112,6 +112,13 @@ struct arch_domain
> >          spinlock_t lock;
> >      } vuart;
> >
> > +    /* dirty-page tracing */
> > +    struct {
> > +        int second_lvl_start;   /* for context switch */
> > +        int second_lvl_end;
> > +        lpae_t *second_lvl;
> > +    } dirty;
> > +
> >      struct dt_mem_info map_domain;
> >      spinlock_t map_lock;
> >  } __cacheline_aligned;
> > diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h
> > index 021e8d6..b9cbcf2 100644
> > --- a/xen/include/asm-arm/mm.h
> > +++ b/xen/include/asm-arm/mm.h
> > @@ -344,6 +344,28 @@ static inline void put_page_and_type(struct
> > page_info *page)
> >
> >  void get_gma_start_end(struct domain *d, vaddr_t *start, vaddr_t *end);
> >
> > +/* vlpt (virtual linear page table) related functions */
> > +void prepare_vlpt(struct domain *d);
> > +void cleanup_vlpt(struct domain *d);
> > +void restore_vlpt(struct domain *d);
> > +
> > +/* calculate the xen's virtual address for accessing the leaf PTE of
> > + * a given address (GPA) */
> > +static inline lpae_t * get_vlpt_3lvl_pte(vaddr_t addr)
>
> Argument is an IPA and should therefore be a paddr_t I think?

Oh, right, and the type in traps.c is also paddr_t. It is my mistake.
> > +{
> > +    unsigned long ipa_third = third_table_offset(addr);
> > +    unsigned long ipa_second = second_table_offset(addr);
> > +    unsigned long ipa_first = first_table_offset(addr);
> > +
> > +    /* guest p2m's first level index is the index of xen's second
> > +     * level, and guest p2m's second level index is the index of
> > +     * xen's third level */
> > +    return (lpae_t *)(
> > +        ( (second_linear_offset(VIRT_LIN_P2M_START) + ipa_first)
> > +          << SECOND_SHIFT) +
> > +        (ipa_second << THIRD_SHIFT) +
> > +        (ipa_third << 3) );
> > +}
>
> Can we assign the table base to an "lpae_t *table" and then use
> &table[ipa_third] rather than hardcoding << 3?

OK.

> I can't quite shake the feeling that the "ipa_first << SECOND_SHIFT" and
> "ipa_second << THIRD_SHIFT" elements of this could be achieved with a
> single shift of the original address (and maybe a mask).

Right, just one shift with LPAE_SHIFT on the addr is sufficient. I'm
surprised you figured this out immediately.

> > +
> >  #endif /* __ARCH_ARM_MM__ */
> >  /*
> >   * Local variables:
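The single-shift observation can be checked with a small standalone sketch. Constants are the usual LPAE values for 4KB pages; the vlpt base value and function names are illustrative, not the real Xen code:

```c
#include <assert.h>
#include <stdint.h>

#define LPAE_SHIFT 9   /* 512 entries per LPAE level */

/* Each leaf PTE is 8 bytes, and the vlpt maps the p2m's third-level
 * tables linearly, so the PTE for an IPA lives at
 *   base + page_number(ipa) * sizeof(lpae_t)
 * i.e. one shift by LPAE_SHIFT with the low 3 bits masked off. */
static uint64_t vlpt_pte_addr(uint64_t vlpt_base, uint64_t ipa)
{
    return vlpt_base + ((ipa >> LPAE_SHIFT) & ~(uint64_t)0x7);
}

/* The long form from the patch, for comparison: per-level indices
 * recombined with the per-level shifts (FIRST/SECOND/THIRD = 30/21/12). */
static uint64_t vlpt_pte_addr_long(uint64_t vlpt_base, uint64_t ipa)
{
    uint64_t first  = ipa >> 30;
    uint64_t second = (ipa >> 21) & 0x1ff;
    uint64_t third  = (ipa >> 12) & 0x1ff;
    return vlpt_base + (first << 21) + (second << 12) + (third << 3);
}
```

The two forms agree because shifting the whole IPA right by LPAE_SHIFT lands each level's index field exactly where the per-level shifts put it; the mask drops the IPA bits below the page offset that would otherwise leak into the 8-byte PTE slot.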