Re: [Xen-devel] [PATCH] xen: add hypercall option to temporarily pin a vcpu
On 26/02/16 11:39, Jan Beulich wrote:
>>>> On 25.02.16 at 17:50, <JGross@xxxxxxxx> wrote:
>> @@ -670,7 +676,13 @@ int cpu_disable_scheduler(unsigned int cpu)
>> if ( cpumask_empty(&online_affinity) &&
>> cpumask_test_cpu(cpu, v->cpu_hard_affinity) )
>> {
>> - printk(XENLOG_DEBUG "Breaking affinity for %pv\n", v);
>> + if ( v->affinity_broken )
>> + {
>> + /* The vcpu is temporarily pinned, can't move it. */
>> + vcpu_schedule_unlock_irqrestore(lock, flags, v);
>> + ret = -EBUSY;
>> + continue;
>> + }
>
> So far the function can only return 0 or -EAGAIN. By using "continue"
> here you will make it impossible for the caller to reliably determine
> whether possibly both things failed. Despite -EBUSY being a logical
> choice here, I think you'd better use -EAGAIN here too. And it needs
> to be determined whether continuing the loop in this as well as the
> pre-existing cases is actually the right thing to do.
EBUSY vs. EAGAIN: by returning EAGAIN I would signal to Xen tools that
the hypervisor is currently not able to do the desired operation
(especially removing a cpu from a cpupool), but the situation will
change automatically via scheduling. EBUSY will stop retries in Xen
tools and this is what I want here: I can't be sure the situation
will change soon.
Regarding continuation of the loop: I think you are right in the
EBUSY case: I should break out of the loop. I should not do so in the
EAGAIN case as I want to remove as many vcpus from the physical cpu as
possible without returning to the Xen tools in between.
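To illustrate the loop behaviour described above, here is a toy model. It is not the Xen code: struct toy_vcpu, its fields and toy_disable_scheduler() are hypothetical names made up for this sketch, and locking is left out entirely. It only shows the intended control flow: stop at the first temporarily pinned vcpu with -EBUSY, but keep going past vcpus that merely need a retry and report -EAGAIN at the end.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct vcpu; not the Xen definition. */
struct toy_vcpu {
    bool pinned_temp;  /* models v->affinity_broken set via the new sub-op */
    bool needs_retry;  /* models a vcpu that cannot be migrated right now */
    bool moved;        /* set once the vcpu has been taken off the pcpu */
};

/*
 * Toy model of the loop in cpu_disable_scheduler():
 * - a temporarily pinned vcpu ends the loop at once with -EBUSY;
 * - a vcpu needing a retry is skipped and -EAGAIN is remembered, so as
 *   many other vcpus as possible are still moved in the same call.
 */
static int toy_disable_scheduler(struct toy_vcpu *v, size_t n)
{
    int ret = 0;

    for (size_t i = 0; i < n; i++) {
        if (v[i].pinned_temp)
            return -EBUSY;      /* break out: retrying will not help soon */

        if (v[i].needs_retry) {
            ret = -EAGAIN;      /* continue: the tools may retry later */
            continue;
        }

        v[i].moved = true;
    }

    return ret;
}
```

With this shape the caller sees -EBUSY only when a pinned vcpu was hit, and -EAGAIN only when every remaining vcpu was at least attempted.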
>
>> @@ -679,6 +691,8 @@ int cpu_disable_scheduler(unsigned int cpu)
>> v->affinity_broken = 1;
>> }
>>
>> + printk(XENLOG_DEBUG "Breaking affinity for %pv\n", v);
>
> Wouldn't it be even better to make this the "else" to the
> preceding if(), since in the suspend case this is otherwise going
> to be printed for every vCPU not currently running on pCPU0?
Yes, I'll change it.
>
>> @@ -753,14 +767,22 @@ static int vcpu_set_affinity(
>> struct vcpu *v, const cpumask_t *affinity, cpumask_t *which)
>> {
>> spinlock_t *lock;
>> + int ret = 0;
>>
>> lock = vcpu_schedule_lock_irq(v);
>>
>> - cpumask_copy(which, affinity);
>> + if ( v->affinity_broken )
>> + {
>> + ret = -EBUSY;
>> + }
>
> Unnecessary braces.
Will remove.
>
>> @@ -979,6 +1001,53 @@ void watchdog_domain_destroy(struct domain *d)
>> kill_timer(&d->watchdog_timer[i]);
>> }
>>
>> +static long do_pin_temp(int cpu)
>> +{
>> + struct vcpu *v = current;
>> + spinlock_t *lock;
>> + long ret = -EINVAL;
>> +
>> + lock = vcpu_schedule_lock_irq(v);
>> +
>> + if ( cpu == -1 )
>> + {
>> + if ( v->affinity_broken )
>> + {
>> + cpumask_copy(v->cpu_hard_affinity, v->cpu_hard_affinity_saved);
>> + v->affinity_broken = 0;
>> + set_bit(_VPF_migrating, &v->pause_flags);
>> + ret = 0;
>> + }
>> + }
>> + else if ( cpu < nr_cpu_ids && cpu >= 0 )
>
> Perhaps easier to simply use "cpu < 0" in the first if()?
Okay.
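A minimal sketch of how the argument check could look with that change. The enum and the function name classify_pin_arg() are hypothetical and exist only for this illustration; nr_cpu_ids is passed in as a parameter here so the snippet stands alone.

```c
/* Hypothetical classification of the pcpu argument; not Xen code. */
enum pin_action { PIN_UNDO, PIN_SET, PIN_INVALID };

static enum pin_action classify_pin_arg(int cpu, unsigned int nr_cpu_ids)
{
    if (cpu < 0)
        return PIN_UNDO;        /* any negative value undoes the pinning */

    /* cpu is known to be non-negative here, so the cast is safe. */
    if ((unsigned int)cpu < nr_cpu_ids)
        return PIN_SET;

    return PIN_INVALID;         /* would map to -EINVAL */
}
```

Note the small behaviour change: with "cpu < 0" in the first if(), values below -1 undo the pinning instead of yielding -EINVAL as in the posted patch.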
>
>> + {
>> + if ( v->affinity_broken )
>> + {
>> + ret = -EBUSY;
>> + }
>> + else if ( cpumask_test_cpu(cpu, VCPU2ONLINE(v)) )
>> + {
>
> This is a rather ugly restriction: How would a caller fulfill its job
> when this is not the case?
He can't. We should document that at least on hardware requiring this
functionality it is a bad idea to remove cpu 0 from the cpupool with the
hardware domain.
>
>> @@ -1088,6 +1157,23 @@ ret_t do_sched_op(int cmd,
>> XEN_GUEST_HANDLE_PARAM(void) arg)
>> break;
>> }
>>
>> + case SCHEDOP_pin_temp:
>> + {
>> + struct sched_pin_temp sched_pin_temp;
>> +
>> + ret = -EFAULT;
>> + if ( copy_from_guest(&sched_pin_temp, arg, 1) )
>> + break;
>> +
>> + ret = xsm_schedop_pin_temp(XSM_PRIV);
>> + if ( ret )
>> + break;
>> +
>> + ret = do_pin_temp(sched_pin_temp.pcpu);
>> +
>> + break;
>> + }
>
> So having come here I still don't see why this is called "temp":
> Nothing enforces this to be a temporary state, and hence the
> sub-op name currently is actively misleading.
I've chosen this name as the old affinity is saved and can (and should)
be recovered later. So it is intended to be temporary.
>> --- a/xen/include/public/sched.h
>> +++ b/xen/include/public/sched.h
>> @@ -118,6 +118,15 @@
>> * With id != 0 and timeout != 0, poke watchdog timer and set new timeout.
>> */
>> #define SCHEDOP_watchdog 6
>> +
>> +/*
>> + * Temporarily pin the current vcpu to one physical cpu or undo that pinning.
>> + * @arg == pointer to sched_pin_temp_t structure.
>> + *
>> + * Setting pcpu to -1 will undo a previous temporary pinning.
>> + * This call is allowed for domains with domain control privilege only.
>> + */
>
> Why domain control privilege? I'd actually suggest limiting the
> ability to the hardware domain, at once eliminating the need
> for the XSM check.
Sure, I'd be happy to simplify the patch.
>
>> +struct sched_pin_temp {
>> + int pcpu;
>
> Fixed width types only please in the public interface. Also this needs
> an entry in xen/include/xlat.lst, and a consumer of the resulting
> check macro.
Aah, okay.
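A sketch of what the fixed-width variant of the structure might look like. The choice of int32_t is an assumption for illustration, not a decision taken in this thread, and the xlat.lst entry and the check macro consumer are not shown.

```c
#include <stdint.h>

/*
 * Sketch only: same layout for 32-bit and 64-bit guests because the
 * field has a fixed width. int32_t is an assumed choice.
 */
struct sched_pin_temp {
    int32_t pcpu;   /* physical cpu to pin to; negative undoes the pinning */
};

/* The size must not depend on the guest's notion of "int". */
_Static_assert(sizeof(struct sched_pin_temp) == 4,
               "sched_pin_temp must be exactly 4 bytes");
```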
Thanks for the review,
Juergen