Re: [Xen-devel] [RFC PATCH 4/4] arm/mem_access: Add software guest-page-table walk
Hi Sergej,

On 30/04/17 20:48, Sergej Proskurin wrote:
> The function p2m_mem_access_check_and_get_page in mem_access.c translates
> a gva to an ipa by means of the hardware functionality implemented in the
> function gva_to_ipa. If mem_access is active, hardware-based gva to ipa
> translation might fail, as gva_to_ipa uses the guest's translation
> tables, access to which might be restricted by the active VTTBR.
>
> To address this issue, we perform the gva to ipa translation in software.
>
> Signed-off-by: Sergej Proskurin <proskurin@xxxxxxxxxxxxx>
> ---
> Cc: Razvan Cojocaru <rcojocaru@xxxxxxxxxxxxxxx>
> Cc: Tamas K Lengyel <tamas@xxxxxxxxxxxxx>
> Cc: Stefano Stabellini <sstabellini@xxxxxxxxxx>
> Cc: Julien Grall <julien.grall@xxxxxxx>
> ---
>  xen/arch/arm/mem_access.c | 140 +++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 139 insertions(+), 1 deletion(-)
>
> diff --git a/xen/arch/arm/mem_access.c b/xen/arch/arm/mem_access.c
> index 04b1506b00..352eb6073f 100644
> --- a/xen/arch/arm/mem_access.c
> +++ b/xen/arch/arm/mem_access.c
> @@ -20,6 +20,7 @@
>  #include <xen/monitor.h>
>  #include <xen/sched.h>
>  #include <xen/vm_event.h>
> +#include <xen/domain_page.h>
>  #include <public/vm_event.h>
>  #include <asm/event.h>
> @@ -90,6 +91,129 @@ static int __p2m_get_mem_access(struct domain *d, gfn_t gfn,
>      return 0;
>  }
>
> +static int
> +p2m_gva_to_ipa(struct p2m_domain *p2m, vaddr_t gva,
> +               paddr_t *ipa, unsigned int flags)

I don't think this function belongs in mem_access.c. It would be better in
p2m.c. Also, the name is a bit confusing: a user would not know the
difference between p2m_gva_to_ipa and gva_to_ipa.

> +{
> +    int level=0, t0_sz, t1_sz;

Coding style: level = 0. Also please use unsigned int for level. t0_sz and
t1_sz should stay signed.

> +    unsigned long t0_max, t1_min;
> +    lpae_t pte, *table;
> +    mfn_t root_mfn;
> +    uint64_t ttbr;
> +    uint32_t sctlr = READ_SYSREG(SCTLR_EL1);
> +    register_t ttbcr = READ_SYSREG(TCR_EL1);

So you are assuming that the current vCPU is running, right?
If so, this should be clarified in the commit message and in a comment
above the function. Also s/ttbcr/tcr/.

> +    struct domain *d = p2m->domain;
> +
> +    const unsigned int offsets[4] = {
> +#ifdef CONFIG_ARM_64
> +        zeroeth_table_offset(gva),
> +#endif
> +        first_table_offset(gva),
> +        second_table_offset(gva),
> +        third_table_offset(gva)
> +    };

The offsets are based on the granularity of Xen (currently 4KB). However,
guests can use 4KB, 16KB, or 64KB granules. Please handle all of them
correctly.

> +
> +    const paddr_t masks[4] = {
> +#ifdef CONFIG_ARM_64
> +        ZEROETH_SIZE - 1,
> +#endif
> +        FIRST_SIZE - 1,
> +        SECOND_SIZE - 1,
> +        THIRD_SIZE - 1
> +    };
> +
> +    /* If the MMU is disabled, there is no need to translate the gva. */
> +    if ( !(sctlr & SCTLR_M) )
> +    {
> +        *ipa = gva;
> +
> +        return 0;
> +    }
> +
> +    if ( is_32bit_domain(d) )
> +    {
> +        /*
> +         * XXX: We do not support 32-bit domain translation table walks
> +         * for domains using the short-descriptor translation table
> +         * format, yet.
> +         */

Debian ARM 32-bit uses the short-descriptor translation table format. So
are you suggesting that mem_access will not work correctly with a Debian
guest?

> +        if ( !(ttbcr & TTBCR_EAE) )

See my comment on patch #2 about the naming convention.

> +            return -EFAULT;
> +
> +#ifdef CONFIG_ARM_64
> +        level = 1;

This sounds wrong to me. The first lookup level is determined by the
values of T0SZ and T1SZ (see B3-1352 in ARM DDI 0406C.c). For the AArch64
version, see D4.2.5 in ARM DDI 0487A.k_iss10775.

> +#endif
> +    }
> +
> +#ifdef CONFIG_ARM_64
> +    if ( is_64bit_domain(d) )
> +    {
> +        /* Get the max GVA that can be translated by TTBR0. */
> +        t0_sz = (ttbcr >> TCR_T0SZ_SHIFT) & TCR_SZ_MASK;
> +        t0_max = (1UL << (64 - t0_sz)) - 1;
> +
> +        /* Get the min GVA that can be translated by TTBR1. */
> +        t1_sz = (ttbcr >> TCR_T1SZ_SHIFT) & TCR_SZ_MASK;
> +        t1_min = ~0UL - (1UL << (64 - t1_sz)) + 1;
> +    }
> +    else
> +#endif
> +    {
> +        /* Get the max GVA that can be translated by TTBR0. */
> +        t0_sz = (ttbcr >> TCR_T0SZ_SHIFT) & TTBCR_SZ_MASK;

See my comment on patch #2 for the naming convention.
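As a side note on the TxSZ handling in the 32-bit path below: here is a
standalone sketch (not the patch code; the helper names ttbr0_max_va and
ttbr1_min_va are made up for illustration) of how the AArch32 LPAE TTBR0/
TTBR1 VA boundaries could be derived while guarding the TxSZ == 0 case,
where the unguarded shift 1U << (32 - 0) is undefined behaviour:

```c
#include <stdint.h>

/*
 * Illustrative sketch only: derive the highest VA translated via TTBR0
 * and the lowest VA translated via TTBR1 for an AArch32 LPAE guest.
 * When TxSZ == 0, the respective region spans the whole 32-bit VA
 * space, so the shift must not be evaluated at all.
 */
static uint32_t ttbr0_max_va(unsigned int t0_sz)
{
    if ( t0_sz == 0 )
        return UINT32_MAX;               /* TTBR0 covers the whole VA space. */
    return (1U << (32 - t0_sz)) - 1;     /* Top of the TTBR0 region. */
}

static uint32_t ttbr1_min_va(unsigned int t1_sz)
{
    if ( t1_sz == 0 )
        return 0;                        /* TTBR1 covers the whole VA space. */
    return UINT32_MAX - ((1U << (32 - t1_sz)) - 1); /* Base of TTBR1 region. */
}
```

For non-zero sizes this matches the patch's arithmetic (e.g. for
t1_sz = 7 both yield a TTBR1 base of 0xFE000000); only the TxSZ == 0
behaviour differs.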
> +        t0_max = (1U << (32 - t0_sz)) - 1;
> +
> +        /* Get the min GVA that can be translated by TTBR1. */
> +        t1_sz = (ttbcr >> TCR_T1SZ_SHIFT) & TTBCR_SZ_MASK;
> +        t1_min = ~0U - (1U << (32 - t1_sz)) + 1;

1U << (32 - t1_sz) will not fit in an unsigned long if t1_sz = 0. Also,
this code looks wrong to me: looking at B3.6.4 in DDI 0406C.c, you don't
handle t0_max and t1_min properly when T0SZ or T1SZ is 0.

I think it would be worth having a look at the pseudo-code in the ARM ARM
for TranslationTableWalk (see B3.19.3 in ARM DDI 0406C.c and J1-5295 in
ARM DDI 0487A.k_iss10775), which gives you all the details for doing a
translation table walk properly.

> +    }
> +
> +    if ( t0_max >= gva )
> +        /* Use TTBR0 for GVA to IPA translation. */
> +        ttbr = READ_SYSREG64(TTBR0_EL1);
> +    else if ( t1_min <= gva )
> +        /* Use TTBR1 for GVA to IPA translation. */
> +        ttbr = READ_SYSREG64(TTBR1_EL1);
> +    else
> +        /* GVA out of bounds of TTBR(0|1). */
> +        return -EFAULT;
> +
> +    /* Bits [63..48] might be used by an ASID. */
> +    root_mfn = p2m_lookup(d, _gfn(paddr_to_pfn(ttbr & ((1ULL<<48)-1))), NULL);

Please don't hardcode the mask.

> +
> +    /* Check, whether TTBR holds a valid address. */
> +    if ( mfn_eq(root_mfn, INVALID_MFN) )
> +        return -EFAULT;
> +
> +    table = map_domain_page(root_mfn);

Nothing prevents the guest from giving back the page table whilst you are
browsing it. You may want to have a look at the patch "ARM: introduce
vgic_access_guest_memory()" [1] to generalize that function and avoid
re-inventing the wheel.

> +
> +    for ( ; ; level++ )
> +    {
> +        pte = table[offsets[level]];

You likely want to use ACCESS_ONCE here to prevent the compiler from
reading the pte twice from memory.

> +
> +        if ( level == 3 || !pte.walk.valid || !pte.walk.table )
> +            break;
> +
> +        unmap_domain_page(table);
> +
> +        root_mfn = p2m_lookup(d, _gfn(pte.walk.base), NULL);

No check on root_mfn?

> +        table = map_domain_page(root_mfn);
> +    }
> +
> +    unmap_domain_page(table);
> +
> +    if ( !pte.walk.valid )
> +        return -EFAULT;
> +
> +    /* Make sure the entry holds the requested access attributes. */
> +    if ( ((flags & GV2M_WRITE) == GV2M_WRITE) && pte.pt.ro )
> +        return -EFAULT;

IMHO, this function should return the access attribute of the page and let
the caller decide what to do. This would make the function more generic.

> +
> +    *ipa = pfn_to_paddr(pte.walk.base) | (gva & masks[level]);
> +
> +    return 0;
> +}
> +
>  /*
>   * If mem_access is in use it might have been the reason why get_page_from_gva
>   * failed to fetch the page, as it uses the MMU for the permission checking.
> @@ -109,9 +233,23 @@ p2m_mem_access_check_and_get_page(vaddr_t gva, unsigned long flag,
>      struct page_info *page = NULL;
>      struct p2m_domain *p2m = &v->domain->arch.p2m;
>
> +    ASSERT(p2m->mem_access_enabled);
> +

Why has this ASSERT been added?

>      rc = gva_to_ipa(gva, &ipa, flag);
> +
> +    /*
> +     * In case mem_access is active, hardware-based gva_to_ipa translation
> +     * might fail. Since gva_to_ipa uses the guest's translation tables,
> +     * access to which might be restricted by the active VTTBR, we perform
> +     * a gva to ipa translation in software.
> +     */
>      if ( rc < 0 )
> -        goto err;
> +        if ( p2m_gva_to_ipa(p2m, gva, &ipa, flag) < 0 )
> +            /*
> +             * The software gva to ipa translation can still fail, if the
> +             * gva is not mapped or does not hold the requested access
> +             * rights.
> +             */
> +            goto err;

Rather than falling back, why don't we do the software page table walk
every time?

>      gfn = _gfn(paddr_to_pfn(ipa));

Cheers,

[1] https://patchwork.kernel.org/patch/9676291/

--
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel