Re: [PATCH] x86/hvm: allow for more fine grained assisted flush
On 30.04.2020 11:17, Roger Pau Monne wrote:
> Improve the assisted flush by expanding the interface and allowing for
> more fine grained TLB flushes to be issued using the HVMOP_flush_tlbs
> hypercall. Support for such advanced flushes is signaled in CPUID
> using the XEN_HVM_CPUID_ADVANCED_FLUSH flag.
>
> The new features make use of the parameter so far passed as NULL in the
> hypercall in order to convey extra data to perform more selective
> flushes: a virtual address, an order field, a flags field and finally a
> vCPU bitmap. Note that not all features are implemented as part of
> this patch, but are already added to the interface in order to avoid
> having to introduce yet another CPUID flag when the new features are
> added.
>
> The feature currently implemented is the use of a guest-provided
> vCPU bitmap in order to signal which vCPUs require a TLB flush,
> instead of assuming all vCPUs must be flushed. Note that not
> implementing the rest of the features just makes the flush less
> efficient, but it's still correct and safe.
A risk of not supporting these right away is that guest bugs may go
unnoticed for quite some time. Just as a remark, not as a request
to do the implementation right away.
> --- a/xen/arch/x86/guest/xen/xen.c
> +++ b/xen/arch/x86/guest/xen/xen.c
> @@ -326,7 +326,27 @@ static void __init e820_fixup(struct e820map *e820)
>
> static int flush_tlb(const cpumask_t *mask, const void *va, unsigned int flags)
> {
> - return xen_hypercall_hvm_op(HVMOP_flush_tlbs, NULL);
> + xen_hvm_flush_tlbs_t op = {
> + .va = (unsigned long)va,
> + .order = (flags - 1) & FLUSH_ORDER_MASK,
I consider such an expression as reasonable to use when there's a
single, central place (in flushtlb.c). For uses elsewhere (and
then better mirrored to the original place) I think we want to
have a macro inverting FLUSH_ORDER(), e.g. FLUSH_GET_ORDER().
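Something along the lines of (untested, and assuming FLUSH_ORDER(x)
stays defined as ((x) + 1) in asm-x86/flushtlb.h):

    /* Recover the order encoded into the flags by FLUSH_ORDER(). */
    #define FLUSH_GET_ORDER(f) (((f) & FLUSH_ORDER_MASK) - 1)

which would then allow the initializer above to simply read
.order = FLUSH_GET_ORDER(flags).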
> + .flags = ((flags & FLUSH_TLB_GLOBAL) ? HVMOP_flush_global : 0) |
> + ((flags & FLUSH_VA_VALID) ? HVMOP_flush_va_valid : 0),
> + .mask_size = BITS_TO_LONGS(nr_cpu_ids),
> + };
> + static int8_t __read_mostly advanced_flush = -1;
> +
> + if ( advanced_flush == -1 )
> + {
> + uint32_t eax, ebx, ecx, edx;
> +
> + ASSERT(xen_cpuid_base);
> + cpuid(xen_cpuid_base + 4, &eax, &ebx, &ecx, &edx);
> + advanced_flush = (eax & XEN_HVM_CPUID_ADVANCED_FLUSH) ? 1 : 0;
Use the more conventional (in our code base) !! here? Also use
cpuid_eax(), to avoid several dead locals?
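I.e. (untested sketch of both adjustments combined):

    if ( advanced_flush == -1 )
    {
        ASSERT(xen_cpuid_base);
        advanced_flush = !!(cpuid_eax(xen_cpuid_base + 4) &
                            XEN_HVM_CPUID_ADVANCED_FLUSH);
    }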
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -4011,17 +4011,42 @@ static void hvm_s3_resume(struct domain *d)
> }
> }
>
> -static bool always_flush(void *ctxt, struct vcpu *v)
> +static bool flush_check(void *mask, struct vcpu *v)
const twice?
> {
> - return true;
> + return mask ? test_bit(v->vcpu_id, (unsigned long *)mask) : true;
And a 3rd time?
> }
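I.e., assuming the paging_flush_tlb() callback type would be constified
to match, something along the lines of:

    static bool flush_check(const void *mask, const struct vcpu *v)
    {
        return mask ? test_bit(v->vcpu_id, (const unsigned long *)mask)
                    : true;
    }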
>
> -static int hvmop_flush_tlb_all(void)
> +static int hvmop_flush_tlb(XEN_GUEST_HANDLE_PARAM(xen_hvm_flush_tlbs_t) uop)
> {
> + xen_hvm_flush_tlbs_t op;
This could move into the narrower scope below.
> + DECLARE_BITMAP(mask, HVM_MAX_VCPUS) = { };
> + bool valid_mask = false;
> +
> if ( !is_hvm_domain(current->domain) )
> return -EINVAL;
>
> - return paging_flush_tlb(always_flush, NULL) ? 0 : -ERESTART;
> + if ( !guest_handle_is_null(uop) )
> + {
> + if ( copy_from_guest(&op, uop, 1) )
> + return -EFAULT;
> +
> + /*
> + * TODO: implement support for the other features present in
> + * xen_hvm_flush_tlbs_t: flushing a specific virtual address and not
> + * flushing global mappings.
> + */
> +
> + if ( op.mask_size > ARRAY_SIZE(mask) )
> + return -EINVAL;
While a proper safeguard for the implementation, this looks rather
arbitrary from the guest's pov: Bits beyond the guest's vCPU count
aren't meaningful anyway. They could be ignored by not copying here,
rather than by never inspecting them in flush_check(). And ignoring
some bits here would then call for this to be done consistently for
all of them, i.e. not returning -EINVAL.
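I.e. something like (untested, and min()'s type checking may want
min_t() instead):

    /* Silently ignore mask bits beyond what Xen can represent. */
    if ( copy_from_guest(mask, op.vcpu_mask,
                         min(op.mask_size, (uint32_t)ARRAY_SIZE(mask))) )
        return -EFAULT;

in place of the -EINVAL check and of the copy below.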
> + if ( copy_from_guest(mask, op.vcpu_mask, op.mask_size) )
> + return -EFAULT;
> +
> + valid_mask = true;
> + }
> +
> + return paging_flush_tlb(flush_check, valid_mask ? mask : NULL) ? 0
> + : -ERESTART;
Just as a suggestion, this might look a little less odd when wrapped
as
    return paging_flush_tlb(flush_check,
                            valid_mask ? mask : NULL) ? 0 : -ERESTART;
But it's really up to you.
> @@ -5017,7 +5042,7 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
> break;
>
> case HVMOP_flush_tlbs:
> - rc = guest_handle_is_null(arg) ? hvmop_flush_tlb_all() : -EINVAL;
> + rc = hvmop_flush_tlb(guest_handle_cast(arg, xen_hvm_flush_tlbs_t));
There's nothing to be passed back so maybe you could even cast to a
const handle here?
> --- a/xen/include/public/hvm/hvm_op.h
> +++ b/xen/include/public/hvm/hvm_op.h
> @@ -99,8 +99,29 @@ DEFINE_XEN_GUEST_HANDLE(xen_hvm_set_pci_link_route_t);
>
> #endif /* __XEN_INTERFACE_VERSION__ < 0x00040900 */
>
> -/* Flushes all VCPU TLBs: @arg must be NULL. */
> +/*
> + * Flushes all VCPU TLBs: @arg can be NULL or xen_hvm_flush_tlbs_t.
> + *
> + * Support for passing a xen_hvm_flush_tlbs_t parameter is signaled in CPUID,
> + * see XEN_HVM_CPUID_ADVANCED_FLUSH.
> + */
> #define HVMOP_flush_tlbs 5
> +struct xen_hvm_flush_tlbs {
> + /* Virtual address to be flushed. */
> + uint64_t va;
> + uint16_t order;
> + uint16_t flags;
> +/* Flush global mappings. */
> +#define HVMOP_flush_global (1u << 0)
> +/* VA for the flush has a valid mapping. */
> +#define HVMOP_flush_va_valid (1u << 1)
> + /* Number of uint64_t elements in vcpu_mask. */
> + uint32_t mask_size;
> + /* Bitmask of vcpus that should be flushed. */
> + XEN_GUEST_HANDLE(const_uint64) vcpu_mask;
This will make the structure have different size for 64- and 32-bit
callers. Apart from altp2m interfaces there's no precedent of a
handle in the hvmop structures, so I wonder whether this wouldn't
better be replaced, e.g. by having an effectively variable size
array at the end of the struct. Simply forcing suitable padding,
e.g. by using uint64_aligned_t for the first field, won't help, as
this would lead to 32-bit callers not having suitable control over
the upper 32 bits of what Xen will take as the handle. (And of
course right now both uint64_aligned_t and XEN_GUEST_HANDLE_64 are
Xen+tools constructs only.)
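E.g. (layout illustrative only, not meant as the final interface):

    struct xen_hvm_flush_tlbs {
        uint64_t va;
        uint16_t order;
        uint16_t flags;
        /* Number of uint64_t elements in vcpu_mask[]. */
        uint32_t mask_size;
        /* Variable-size bitmap of vCPUs to be flushed. */
        uint64_t vcpu_mask[1];
    };

This way the structure has the same size and layout for 32- and 64-bit
callers, at the price of Xen having to bound mask_size before copying
the trailing array.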
Jan