
Re: [Xen-devel] [PATCH 1/2] sched: credit2: respect per-vcpu hard affinity



On Mon, Jan 19, 2015 at 9:21 PM, Justin Weaver <jtweaver@xxxxxxxxxx> wrote:
> On Mon, Jan 12, 2015 at 8:05 AM, Dario Faggioli
> <dario.faggioli@xxxxxxxxxx> wrote:

>>>          if ( __vcpu_on_runq(svc) )
>>> +            on_runq = 1;
>>> +
>>> +        /* If the runqs are different, move svc to trqd. */
>>> +        if ( svc->rqd != trqd )
>>>          {
>>> -            __runq_remove(svc);
>>> -            update_load(ops, svc->rqd, svc, -1, now);
>>> -            on_runq=1;
>>> +            if ( on_runq )
>>> +            {
>>> +                __runq_remove(svc);
>>> +                update_load(ops, svc->rqd, svc, -1, now);
>>> +            }
>>> +            __runq_deassign(svc);
>>> +            __runq_assign(svc, trqd);
>>> +            if ( on_runq )
>>> +            {
>>> +                update_load(ops, svc->rqd, svc, 1, now);
>>> +                runq_insert(ops, svc->vcpu->processor, svc);
>>> +            }
>>>          }
>>> -        __runq_deassign(svc);
>>> -        svc->vcpu->processor = cpumask_any(&trqd->active);
>>> -        __runq_assign(svc, trqd);
>>> +
>>>
>> Mmm.. I do not like the way the code looks after this is applied. Before
>> the patch, it was really straightforward and easy to understand. Now
>> it's way more involved. Can you explain why this rework is necessary?
>> For now do it here, then we'll see whether and how to put that into a
>> doc comment.
>
> When I was testing, if I changed a vcpu's hard affinity from its
> current pcpu to another pcpu in the same run queue, the VM would stop
> executing. I'll go back and look at this, because I see what you wrote
> below about wake being called by vcpu_migrate() in schedule.c; the vcpu
> shouldn't freeze on the old cpu, it should wake on the new cpu whether
> or not the run queue changed. I'll address this again after some
> testing.

>>> @@ -1399,8 +1531,12 @@ csched2_vcpu_migrate(
>>>
>>>      trqd = RQD(ops, new_cpu);
>>>
>>> -    if ( trqd != svc->rqd )
>>> -        migrate(ops, svc, trqd, NOW());
>>> +    /*
>>> +     * Call migrate even if svc->rqd == trqd; there may have been an
>>> +     * affinity change that requires a call to runq_tickle for a new
>>> +     * processor within the same run queue.
>>> +     */
>>> +    migrate(ops, svc, trqd, NOW());
>>>  }
>>>
>> As said above, I don't think I see the reason for this. Affinity
>> changes, e.g., due to calls to vcpu_set_affinity() in schedule.c, force
>> the vcpu through a sleep/wakeup cycle (it calls vcpu_sleep_nosync()
>> directly, while vcpu_wake() is called inside vcpu_migrate()).
>>
>> So it looks like what you are after (i.e., runq_tickle being called)
>> should already happen, shouldn't it? Are there other reasons you need
>> it?
>
> Like I said above, I will look at this again. My VMs were getting
> stuck after certain hard affinity changes. I'll roll back some of
> these changes and test it out again.

I discovered that SCHED_OP(VCPU2OP(v), wake, v); in vcpu_wake() in
schedule.c is not being called because v's pause_flags has _VPF_blocked
set.

For example:

- I start a guest with one vcpu with hard affinity 8-15, and xl
  vcpu-list says it's running on pcpu 15.
- I run 'xl vcpu-pin 1 0 8' to restrict its hard affinity to pcpu 8
  only.
- When it gets to vcpu_wake(), vcpu_runnable(v) is false because
  _VPF_blocked is set, so the call to SCHED_OP(VCPU2OP(v), wake, v); is
  skipped and credit2 never gets a runq_tickle.
- xl vcpu-list now shows --- for the state and I cannot console into
  the guest.
- What I don't understand is that if I then enter 'xl vcpu-pin 1 0 15',
  _VPF_blocked is NOT set, vcpu_wake() calls credit2's wake, it gets a
  runq_tickle, and everything is fine again.

Why did the value of the _VPF_blocked flag change after I entered xl
vcpu-pin the second time? I dove deep into the code and could not
figure it out.
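
For reference, the gate I'm hitting looks roughly like this (paraphrased
from schedule.c, so it may not match the tree line for line):

    void vcpu_wake(struct vcpu *v)
    {
        unsigned long flags;
        spinlock_t *lock = vcpu_schedule_lock_irqsave(v, &flags);

        if ( likely(vcpu_runnable(v)) )
        {
            if ( v->runstate.state >= RUNSTATE_blocked )
                vcpu_runstate_change(v, RUNSTATE_runnable, NOW());
            /* Only on this path does the per-scheduler wake hook (and
             * hence credit2's runq_tickle) get called. */
            SCHED_OP(VCPU2OP(v), wake, v);
        }
        /* (branches handling the still-blocked/offline cases omitted) */

        vcpu_schedule_unlock_irqrestore(lock, flags, v);
    }

and vcpu_runnable() is false whenever any pause flag is set, including
_VPF_blocked:

    static inline int vcpu_runnable(struct vcpu *v)
    {
        return !(v->pause_flags |
                 atomic_read(&v->pause_count) |
                 atomic_read(&v->domain->pause_count));
    }

So as long as the vcpu is still marked blocked when the pin operation
forces it through the sleep/wake cycle, credit2 never hears about the
wake.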

That is why v1 of my patch worked: it called migrate() on an affinity
change even when the current and destination run queues were the same,
so the processor assignment and runq_tickle happened regardless. I
think you'll have to tell me whether that's a hack or a good solution!
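
To make that concrete, the same-runqueue handling I have in mind looks
something like this (an illustrative sketch only, not the actual v1
diff; I'm assuming the runq_tickle(ops, cpu, svc, now) helper and the
vcpu's cpu_hard_affinity mask here):

    /* In migrate(), when source and target run queues are the same. */
    if ( svc->rqd == trqd )
    {
        cpumask_t mask; /* on-stack mask just for illustration */

        /* Processors in this run queue allowed by the new hard affinity. */
        cpumask_and(&mask, &trqd->active, svc->vcpu->cpu_hard_affinity);
        if ( !cpumask_empty(&mask) )
            svc->vcpu->processor = cpumask_any(&mask);

        /* Poke the scheduler so the new assignment actually takes effect. */
        runq_tickle(ops, svc->vcpu->processor, svc, now);
        return;
    }
    /* Otherwise fall through to the usual remove/deassign/assign/insert. */

The point being that the processor re-pick and the tickle still happen
even when no run queue change is needed.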

I greatly appreciate any feedback.

Thank you,
Justin

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

