
Re: [Xen-devel] [PATCH 3/7] xen: credit2: soft-affinity awareness in fallback_cpu()



On 07/25/2017 05:47 PM, Dario Faggioli wrote:
> On Tue, 2017-07-25 at 17:17 +0100, George Dunlap wrote:
>> On 07/25/2017 05:00 PM, Dario Faggioli wrote:
>>> On Tue, 2017-07-25 at 11:19 +0100, George Dunlap wrote:
>>>>
>>> Mmm.. I think you're right. In fact, in a properly configured
>>> system,
>>> we'll never go past step 3 (from the comment at the top).
>>>
>>> Which is not ideal, or at least not what I had in mind. In fact, I
>>> think it's better to check step 4 (svc->vcpu->processor in hard-
>>> affinity) and step 5 (a CPU from svc's runqueue in hard affinity),
>>> as
>>> that would mean avoiding a runqueue migration.
>>>
>>> How about I basically kill step 3, i.e., if we reach this point
>>> during the soft-affinity step, just continue on to the hard-affinity
>>> one?
>>
>> Hmm, well *normally* we would rather have a vcpu running within its
>> soft
>> affinity, even if that means moving it to another runqueue.  
>>
> Yes, but both *ideally* and *normally*, we just should not be here. :-)
> 
> If we did end up here, we're in guessing territory, and although what
> you say about a guest wanting to run within its soft-affinity is always
> true from the guest's own point of view, our job as the scheduler is to
> do what is best for the system as a whole. But we are in a situation
> where we could not gather the information needed to make such a
> decision.
> 
>> Is your
>> idea that the only reason we're in this particular code is that we
>> couldn't grab the lock we need to make a more informed decision; so,
>> if possible, defer to previous decisions, which (we might presume)
>> were more informed?
>>
> Kind of, yes. Basically I think we should "escape" from this situation
> as quickly as possible, causing as little trouble as possible to both
> ourselves and others, in the hope that things will go better next time.
> 
> Trying to stay in the same runqueue seems to me to fit this
> requirement, as:
> - as you say, we're here because a previous (presumably well informed)
>   decision brought us here, so, hopefully, staying here is not too bad,
>   either for us or overall;
> - staying here is quicker and means less overhead for svc;
> - staying here means less overhead overall. In fact, if we decide to
>   change runqueue, we will have to take the remote runqueue's lock at
>   some point... and I'd prefer that to be for a good reason.
> 
> All that being said, it probably would be good to add a performance
> counter, and try to get a sense of how frequently we actually end up in
> this function as a fallback.
> 
> But in the meantime, yes, in this case I'd try to keep svc in the
> runqueue where it already is, if possible.

Sounds good.  So are you going to respin the series then?
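
For reference, the ordering being discussed comes down to something like
the self-contained sketch below. This is plain C and not the actual
credit2 code: the cpumasks are uint64_t bitmaps and all of the names
(struct vcpu_model, cur_cpu, hard_aff, soft_aff, runq_cpus, first_cpu)
are invented for illustration; only the decision order mirrors the
thread.

/* Toy model of the fallback ordering discussed above.  This is NOT the
 * Xen credit2 code: cpumasks are plain uint64_t bitmaps and every name
 * here is made up for illustration. */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

struct vcpu_model {
    unsigned int cur_cpu;   /* stands in for v->processor */
    uint64_t hard_aff;      /* hard-affinity mask */
    uint64_t soft_aff;      /* soft-affinity mask */
    uint64_t runq_cpus;     /* CPUs of the runqueue the vcpu is in now */
};

static int first_cpu(uint64_t mask)
{
    for ( int cpu = 0; cpu < 64; cpu++ )
        if ( mask & (1ULL << cpu) )
            return cpu;
    return -1;
}

static unsigned int fallback_cpu(const struct vcpu_model *v)
{
    /* Soft-affinity step first, then the hard-affinity one.  (In the
     * real code, a performance counter bump here would show how often
     * this fallback actually runs, as suggested above.) */
    const uint64_t step_mask[2] = { v->soft_aff & v->hard_aff, v->hard_aff };

    for ( int step = 0; step < 2; step++ )
    {
        uint64_t mask = step_mask[step];

        if ( mask == 0 )
            continue;                   /* no (effective) soft affinity */

        /* Best case: the CPU we are on already is acceptable. */
        if ( mask & (1ULL << v->cur_cpu) )
            return v->cur_cpu;

        /* Next best: a CPU of the runqueue we are already in, which
         * avoids a runqueue migration. */
        if ( mask & v->runq_cpus )
            return first_cpu(mask & v->runq_cpus);

        /* The old "step 3" would pick any CPU of the soft-affinity mask
         * here, i.e. possibly one in a remote runqueue.  The proposal in
         * the thread is to skip that and fall through to the
         * hard-affinity step instead. */
    }

    /* In a properly configured system the hard-affinity step always
     * finds something, so this point is believed unreachable: be loud in
     * debug builds, return something harmless otherwise. */
    assert(!"fallback_cpu: no usable CPU found");
    return v->cur_cpu;
}

int main(void)
{
    /* cur_cpu and the current runqueue are outside the soft-affinity
     * mask but inside the hard one: the old step 3 would have picked
     * CPU 2 (soft affinity, different runqueue); with the fall-through
     * we stay on CPU 1. */
    struct vcpu_model v = {
        .cur_cpu   = 1,
        .hard_aff  = 0x0f,   /* CPUs 0-3 */
        .soft_aff  = 0x0c,   /* CPUs 2-3 */
        .runq_cpus = 0x03,   /* CPUs 0-1 */
    };

    printf("fallback CPU: %u\n", fallback_cpu(&v));
    return 0;
}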

> 
>>> ASSERT_UNREACHABLE() is indeed much better. What do you mean by
>>> "something random"? The value to be assigned to cpu?
>>
>> Er, yes, I meant the return value.  Returning 0 or v->processor would
>> be simple options.  *Really* defensive programming would attempt to
>> choose something somewhat sensible with minimal risk of triggering
>> some other hidden assumption (say, any cpu on our current runqueue).
>> But part of me says even thinking too long about it is a waste of time
>> for something we're 99.99% sure can never happen. :-)
>>
> Agreed. IAC, I'll go for ASSERT_UNREACHABLE() and then see about using
> either v->processor (with a comment) or cpumask_any(something). Of
> course the latter is expensive, but it should not be a big problem,
> considering we'll never get there (I'll have a look at the generated
> assembly, to confirm that).

OK, thanks.

 -George
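
For context, the defensive tail Dario describes would look roughly like
the fragment below. This is only a sketch, not the actual patch: the
enclosing fallback function and the exact mask one would hand to
cpumask_any() are not spelled out in the thread, so the fragment simply
returns v->processor, the cheap option mentioned above.

    /*
     * We should never get here: one of the affinity steps above must
     * have returned a CPU already.  Be loud in debug builds; in release
     * builds return something harmless.  v->processor is the cheap
     * choice; picking any CPU of the hard-affinity mask with
     * cpumask_any() would also do, and its cost does not matter on a
     * path we are 99.99% sure is never taken.
     */
    ASSERT_UNREACHABLE();
    return v->processor;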
