Re: [Xen-devel] [PATCH 3/9] xen: sched: make locking for {insert, remove}_vcpu consistent



On 08/10/15 18:23, Andrew Cooper wrote:
> On 08/10/15 17:46, George Dunlap wrote:
>> On 08/10/15 16:20, Andrew Cooper wrote:
>>> On 08/10/15 15:58, George Dunlap wrote:
>>>> On 29/09/15 18:31, Andrew Cooper wrote:
>>>>> On 29/09/15 17:55, Dario Faggioli wrote:
>>>>>> The insert_vcpu() scheduler hook is called with an
>>>>>> inconsistent locking strategy. In fact, it is sometimes
>>>>>> invoked while holding the runqueue lock and sometimes
>>>>>> when that is not the case.
>>>>>>
>>>>>> In other words, some call sites seem to imply that
>>>>>> locking should be handled in the callers, in schedule.c
>>>>>> --e.g., in schedule_cpu_switch(), which acquires the
>>>>>> runqueue lock before calling the hook; others that
>>>>>> specific schedulers should be responsible for locking
>>>>>> themselves --e.g., in sched_move_domain(), which does
>>>>>> not acquire any lock for calling the hook.
>>>>>>
>>>>>> The right thing to do seems to always defer locking to
>>>>>> the specific schedulers, as it's them that know what, how
>>>>>> and when it is best to lock (as in: runqueue locks, vs.
>>>>>> private scheduler locks, vs. both, etc.)
>>>>>>
>>>>>> This patch, therefore:
>>>>>>  - removes any locking around insert_vcpu() from
>>>>>>    generic code (schedule.c);
>>>>>>  - adds the _proper_ locking in the hook implementations,
>>>>>>    depending on the scheduler (for instance, credit2
>>>>>>    does that already, credit1 and RTDS need to grab
>>>>>>    the runqueue lock while manipulating runqueues).
>>>>>>
>>>>>> In case of credit1, remove_vcpu() handling needs some
>>>>>> fixing too, i.e.:
>>>>>>  - it manipulates runqueues, so the runqueue lock must
>>>>>>    be acquired;
>>>>>>  - *_lock_irq() is enough, there is no need to do
>>>>>>    _irqsave()
>>>>> Nothing in any of generic scheduling code should need interrupts
>>>>> disabled at all.
>>>>>
>>>>> One of the problem-areas identified by Jenny during the ticketlock
>>>>> performance work was that the SCHEDULE_SOFTIRQ was a large consumer of
>>>>> time with interrupts disabled.  (The other large one being the time
>>>>> calibration rendezvous, but that is a wildly different can of worms to 
>>>>> fix.)
>>>> Generic scheduling code is called from interrupt contexts -- namely,
>>>> vcpu_wake()
>>> There are a lot of codepaths, but I can't see one which is definitely
>>> called with interrupts disabled.  (OTOH, I can see several where
>>> interrupts are definitely enabled).
>> Oh, I think I misunderstood you.  You meant, "No codepaths *calling
>> into* generic scheduling code should need interrupts disabled at all".
>> I can certainly believe that to be true in most cases; there's no sense
>> in saving the flags if we don't need to.
> 
> My original statement came from the observation that schedule() runs
> with interrupts disabled, and takes between 2.2 and 4 microseconds to
> run (as measured during the ticketlock performance analysis).
> 
> It is the biggest consumer of time with interrupts disabled, next being
> the time calibration rendezvous.
> 
> I am going to go out on a limb and say that the majority of that time
> does not need to be spent with interrupts disabled.  I might easily be
> wrong, but I suspect I am not.

It's certainly worth taking a look at -- in particular, as (if I recall
correctly) we grab the schedule lock, then release it briefly, then grab
it again for the context switch.

Two things relate to irqs and the schedule / context-switch path.  The
first we've already covered: calling vcpu_wake from within an interrupt
context.  The second is what might be called the "idle race": we need
interrupts disabled from the time we last check for softirqs until we
actually return to user mode.  But that's only a few dozen instructions
in most cases.

It might be possible to break things down into two locks -- one for
general schedule data structures, which would not be allowed to be
taken from within an interrupt context, and one specifically to be used
for vcpu_wake (i.e., protecting manipulations to the actual runqueue)
which would have to be taken with interrupts off.  But the generic
scheduling framework might make that a bit more tricky to get right.
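
As a rough illustration of the two-lock split described above, here is a
minimal userspace sketch in C.  It is not Xen code: every name in it
(sched_sketch, sketch_wake, sketch_in_irq, and so on) is made up for the
example, pthread mutexes stand in for spinlocks, and a plain flag stands
in for "we are in interrupt context".  In a real hypervisor the runqueue
lock would be taken with interrupts disabled, which a userspace program
cannot model.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* All identifiers here are hypothetical; none of them exist in Xen. */
#define SKETCH_MAX_VCPUS 8

struct sched_sketch {
    pthread_mutex_t data_lock;      /* general scheduler data; never taken "in IRQ" */
    pthread_mutex_t runq_lock;      /* runqueue only; the wake path takes just this */
    int weight[SKETCH_MAX_VCPUS];   /* stand-in for general per-vcpu scheduler data */
    int runq[SKETCH_MAX_VCPUS];     /* stand-in runqueue holding vcpu ids, FIFO */
    int runq_len;
};

/* Stand-in for "currently running in interrupt context". */
static bool sketch_in_irq;

static void sketch_init(struct sched_sketch *s)
{
    pthread_mutex_init(&s->data_lock, NULL);
    pthread_mutex_init(&s->runq_lock, NULL);
    for (int i = 0; i < SKETCH_MAX_VCPUS; i++)
        s->weight[i] = 0;
    s->runq_len = 0;
}

/* Touches general data, so it refuses to run from "interrupt context". */
static int sketch_set_weight(struct sched_sketch *s, int vcpu, int w)
{
    if (sketch_in_irq || vcpu < 0 || vcpu >= SKETCH_MAX_VCPUS)
        return -1;
    pthread_mutex_lock(&s->data_lock);
    s->weight[vcpu] = w;
    pthread_mutex_unlock(&s->data_lock);
    return 0;
}

/* Wake path: allowed "in IRQ", and only ever takes the runqueue lock. */
static bool sketch_wake(struct sched_sketch *s, int vcpu)
{
    bool inserted = false;

    if (vcpu < 0 || vcpu >= SKETCH_MAX_VCPUS)
        return false;
    pthread_mutex_lock(&s->runq_lock);
    bool queued = false;
    for (int i = 0; i < s->runq_len; i++)
        if (s->runq[i] == vcpu)
            queued = true;
    if (!queued && s->runq_len < SKETCH_MAX_VCPUS) {
        s->runq[s->runq_len++] = vcpu;
        inserted = true;
    }
    pthread_mutex_unlock(&s->runq_lock);
    return inserted;
}

/* Pop the head of the runqueue; returns -1 if it is empty. */
static int sketch_pick_next(struct sched_sketch *s)
{
    int vcpu = -1;

    pthread_mutex_lock(&s->runq_lock);
    if (s->runq_len > 0) {
        vcpu = s->runq[0];
        for (int i = 1; i < s->runq_len; i++)
            s->runq[i - 1] = s->runq[i];
        s->runq_len--;
    }
    pthread_mutex_unlock(&s->runq_lock);
    return vcpu;
}
```

The idea being that only the short runqueue critical sections would need
interrupts off; anything done under the data lock could, in principle,
run with interrupts enabled.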

 -George


