Re: [Xen-devel] [PATCH v9 11/20] x86/VPMU: Interface for setting PMU mode and flags
>>> On 12.08.14 at 17:12, <boris.ostrovsky@xxxxxxxxxx> wrote:
> On 08/12/2014 06:37 AM, Jan Beulich wrote:
>>>>> On 08.08.14 at 18:55, <boris.ostrovsky@xxxxxxxxxx> wrote:
>>> --- a/xen/arch/x86/domain.c
>>> +++ b/xen/arch/x86/domain.c
>>> @@ -1482,7 +1482,7 @@ void context_switch(struct vcpu *prev, struct vcpu *next)
>>>
>>> if ( is_hvm_vcpu(prev) )
>>> {
>>> - if (prev != next)
>>> + if ( (prev != next) && (vpmu_mode & XENPMU_MODE_SELF) )
>>> vpmu_save(prev);
>>>
>>> if ( !list_empty(&prev->arch.hvm_vcpu.tm_list) )
>>> @@ -1526,7 +1526,7 @@ void context_switch(struct vcpu *prev, struct vcpu *next)
>>> !is_hardware_domain(next->domain));
>>> }
>>>
>>> - if (is_hvm_vcpu(next) && (prev != next) )
>>> + if ( is_hvm_vcpu(next) && (prev != next) && (vpmu_mode &
>>> XENPMU_MODE_SELF) )
>>> /* Must be done with interrupts enabled */
>>> vpmu_load(next);
>> Wouldn't such vPMU internals be better hidden in the functions
>> themselves? I realize you can save the calls this way, but if the
>> condition changes again later, we'll again have to adjust this
>> core function rather than just the vPMU code. It's bad enough that
>> the vpmu_mode variable is visible to non-vPMU code.
>
> How about an inline function?
Yeah, that would perhaps do too. Albeit I'd still prefer all vPMU
logic to be handled in the called vPMU functions.
> I would prefer the whole series to get in in one go (with patch 2
> possibly being the only exception). All other patches (including patch
> 1) are not particularly useful on their own.
Okay. In that case in patch 1, please consider swapping struct
xenpf_symdata's address and name fields (I had put a respective
note on the patch for when committing it), shrinking the structure
for compat mode guests.
>>> + goto cont_wait;
>>> +
>>> + cpumask_andnot(&allbutself, &cpu_online_map,
>>> + cpumask_of(smp_processor_id()));
>>> +
>>> + sync_task = xmalloc_array(struct tasklet, allbutself_num);
>>> + if ( !sync_task )
>>> + {
>>> + printk("vpmu_force_context_switch: out of memory\n");
>>> + return -ENOMEM;
>>> + }
>>> +
>>> + for ( i = 0; i < allbutself_num; i++ )
>>> + tasklet_init(&sync_task[i], vpmu_sched_checkin, 0);
>>> +
>>> + atomic_set(&vpmu_sched_counter, 0);
>>> +
>>> + j = 0;
>>> + for_each_cpu ( i, &allbutself )
>> This looks to be the only use for the (on stack) allbutself variable,
>> but you could easily avoid this by using for_each_online_cpu() and
>> skipping the local one. I'd also recommend that you count
>> allbutself_num here rather than up front, since that will much more
>> obviously prove that you wait for exactly as many CPUs as you
>> scheduled. The array allocation above is bogus anyway, as on a
>> huge system this can easily be more than a page in size.
>
> I don't understand the last sentence. What does page-sized allocation
> have to do with this?
We dislike greater than page size allocations at runtime.
>>> +long do_xenpmu_op(int op, XEN_GUEST_HANDLE_PARAM(xen_pmu_params_t) arg)
>>> +{
>>> + int ret = -EINVAL;
>>> + xen_pmu_params_t pmu_params;
>>> +
>>> + switch ( op )
>>> + {
>>> + case XENPMU_mode_set:
>>> + {
>>> + static DEFINE_SPINLOCK(xenpmu_mode_lock);
>>> + uint32_t current_mode;
>>> +
>>> + if ( !is_control_domain(current->domain) )
>>> + return -EPERM;
>>> +
>>> + if ( copy_from_guest(&pmu_params, arg, 1) )
>>> + return -EFAULT;
>>> +
>>> + if ( pmu_params.val & ~XENPMU_MODE_SELF )
>>> + return -EINVAL;
>>> +
>>> + /*
>>> + * Return error if someone else is in the middle of changing mode ---
>>> + * this is most likely an indication of two system administrators
>>> + * working against each other.
>>> + */
>>> + if ( !spin_trylock(&xenpmu_mode_lock) )
>>> + return -EAGAIN;
>>> +
>>> + current_mode = vpmu_mode;
>>> + vpmu_mode = pmu_params.val;
>>> +
>>> + if ( vpmu_mode == XENPMU_MODE_OFF )
>>> + {
>>> + /*
>>> + * Make sure all (non-dom0) VCPUs have unloaded their VPMUs. This
>>> + * can be achieved by having all physical processors go through
>>> + * context_switch().
>>> + */
>>> + ret = vpmu_force_context_switch(arg);
>>> + if ( ret )
>>> + vpmu_mode = current_mode;
>>> + }
>>> + else
>>> + ret = 0;
>>> +
>>> + spin_unlock(&xenpmu_mode_lock);
>>> + break;
>> This still isn't safe: There's nothing preventing another vCPU to issue
>> another XENPMU_mode_set operation while the one turning the vPMU
>> off is still in the process of waiting, but having exited the lock protected
>> region in order to allow other processing to occur.
>
> I think that's OK: one of these two competing requests will timeout in
> the while loop of vpmu_force_context_switch(). It will, in fact, likely
> be the first caller because the second one will get right into the while
> loop, without setting up the waiting data. This is a little
> counter-intuitive but trying to set mode simultaneously from two
> different places is asking for trouble anyway.
Sure it is, but we nevertheless want the hypervisor to be safe. So I
consider what you currently have acceptable only if you can prove
that nothing bad can happen no matter in what order multiple
requests get issued - from looking over your code I wasn't able to
convince myself of this.
Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel