Re: [Xen-devel] [PATCH v9 11/20] x86/VPMU: Interface for setting PMU mode and flags
>>> On 12.08.14 at 17:12, <boris.ostrovsky@xxxxxxxxxx> wrote:
> On 08/12/2014 06:37 AM, Jan Beulich wrote:
>>>>> On 08.08.14 at 18:55, <boris.ostrovsky@xxxxxxxxxx> wrote:
>>> --- a/xen/arch/x86/domain.c
>>> +++ b/xen/arch/x86/domain.c
>>> @@ -1482,7 +1482,7 @@ void context_switch(struct vcpu *prev, struct vcpu *next)
>>>
>>> if ( is_hvm_vcpu(prev) )
>>> {
>>> - if (prev != next)
>>> + if ( (prev != next) && (vpmu_mode & XENPMU_MODE_SELF) )
>>> vpmu_save(prev);
>>>
>>> if ( !list_empty(&prev->arch.hvm_vcpu.tm_list) )
>>> @@ -1526,7 +1526,7 @@ void context_switch(struct vcpu *prev, struct vcpu *next)
>>> !is_hardware_domain(next->domain));
>>> }
>>>
>>> - if (is_hvm_vcpu(next) && (prev != next) )
>>> + if ( is_hvm_vcpu(next) && (prev != next) && (vpmu_mode &
>>> XENPMU_MODE_SELF) )
>>> /* Must be done with interrupts enabled */
>>> vpmu_load(next);
>> Wouldn't such vPMU internals be better hidden in the functions
>> themselves? I realize you can save the calls this way, but if the
>> condition changes again later, we'll again have to adjust this
>> core function rather than just the vPMU code. It's bad enough that
>> the vpmu_mode variable is visible to non-vPMU code.
>
> How about an inline function?
Yeah, that would perhaps do too. Albeit I'd still prefer all vPMU
logic to be handled in the called vPMU functions.
> I would prefer the whole series to get in in one go (with patch 2
> possibly being the only exception). All other patches (including patch
> 1) are not particularly useful on their own.
Okay. In that case in patch 1, please consider swapping struct
xenpf_symdata's address and name fields (I had put a respective
note on the patch for when committing it), shrinking the structure
for compat mode guests.
>>> + goto cont_wait;
>>> +
>>> + cpumask_andnot(&allbutself, &cpu_online_map,
>>> + cpumask_of(smp_processor_id()));
>>> +
>>> + sync_task = xmalloc_array(struct tasklet, allbutself_num);
>>> + if ( !sync_task )
>>> + {
>>> + printk("vpmu_force_context_switch: out of memory\n");
>>> + return -ENOMEM;
>>> + }
>>> +
>>> + for ( i = 0; i < allbutself_num; i++ )
>>> + tasklet_init(&sync_task[i], vpmu_sched_checkin, 0);
>>> +
>>> + atomic_set(&vpmu_sched_counter, 0);
>>> +
>>> + j = 0;
>>> + for_each_cpu ( i, &allbutself )
>> This looks to be the only use for the (on stack) allbutself variable,
>> but you could easily avoid this by using for_each_online_cpu() and
>> skipping the local one. I'd also recommend that you count
>> allbutself_num here rather than up front, since that will much more
>> obviously prove that you wait for exactly as many CPUs as you
>> scheduled. The array allocation above is bogus anyway, as on a
>> huge system this can easily be more than a page in size.
>
> I don't understand the last sentence. What does page-sized allocation
> have to do with this?
We dislike greater than page size allocations at runtime.
>>> +long do_xenpmu_op(int op, XEN_GUEST_HANDLE_PARAM(xen_pmu_params_t) arg)
>>> +{
>>> + int ret = -EINVAL;
>>> + xen_pmu_params_t pmu_params;
>>> +
>>> + switch ( op )
>>> + {
>>> + case XENPMU_mode_set:
>>> + {
>>> + static DEFINE_SPINLOCK(xenpmu_mode_lock);
>>> + uint32_t current_mode;
>>> +
>>> + if ( !is_control_domain(current->domain) )
>>> + return -EPERM;
>>> +
>>> + if ( copy_from_guest(&pmu_params, arg, 1) )
>>> + return -EFAULT;
>>> +
>>> + if ( pmu_params.val & ~XENPMU_MODE_SELF )
>>> + return -EINVAL;
>>> +
>>> + /*
>>> + * Return error if someone else is in the middle of changing mode ---
>>> + * this is most likely an indication of two system administrators
>>> + * working against each other.
>>> + */
>>> + if ( !spin_trylock(&xenpmu_mode_lock) )
>>> + return -EAGAIN;
>>> +
>>> + current_mode = vpmu_mode;
>>> + vpmu_mode = pmu_params.val;
>>> +
>>> + if ( vpmu_mode == XENPMU_MODE_OFF )
>>> + {
>>> + /*
>>> + * Make sure all (non-dom0) VCPUs have unloaded their VPMUs. This
>>> + * can be achieved by having all physical processors go through
>>> + * context_switch().
>>> + */
>>> + ret = vpmu_force_context_switch(arg);
>>> + if ( ret )
>>> + vpmu_mode = current_mode;
>>> + }
>>> + else
>>> + ret = 0;
>>> +
>>> + spin_unlock(&xenpmu_mode_lock);
>>> + break;
>> This still isn't safe: There's nothing preventing another vCPU to issue
>> another XENPMU_mode_set operation while the one turning the vPMU
>> off is still in the process of waiting, but having exited the lock protected
>> region in order to allow other processing to occur.
>
> I think that's OK: one of these two competing requests will timeout in
> the while loop of vpmu_force_context_switch(). It will, in fact, likely
> be the first caller because the second one will get right into the while
> loop, without setting up the waiting data. This is a little
> counter-intuitive but trying to set mode simultaneously from two
> different places is asking for trouble anyway.
Sure it is, but we nevertheless want the hypervisor to be safe. So I
consider what you currently have acceptable only if you can prove
that nothing bad can happen no matter in what order multiple
requests get issued - from looking over your code I wasn't able to
convince myself of this.
Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel