
Re: [Xen-devel] [PATCH v4] xen: sched: convert RTDS from time to event driven model



On Fri, 2016-02-05 at 23:27 -0500, Tianyang Chen wrote:
> 
> On 2/5/2016 9:39 AM, Dario Faggioli wrote:
> > On Wed, 2016-02-03 at 21:23 -0500, Tianyang Chen wrote:
> > > 
> I see. So I'm just curious: what can cause a spurious wakeup? Does it
> only happen to a running vcpu (currently on a pcpu) or a vcpu that is
> on either the runq or the depletedq?
> 
Hehe, "that's virt, baby!" :-P

No, seriously: a vcpu being already running or already in a queue is
indeed what makes me call a wakeup a "spurious" wakeup. You can look at
the performance counters that we update in those cases, to see when
they happen most. In my experience, on Credit, I've seen them happening
mostly when a vcpu is running. For instance:

root@Zhaman:~# xenperf |grep vcpu_wake
sched: vcpu_wake_running            T=        39
sched: vcpu_wake_onrunq             T=         0
sched: vcpu_wake_runnable           T=     59875
sched: vcpu_wake_not_runnable       T=         0
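
(Those counters are bumped by the early checks in the various
vcpu_wake hooks. Sketching from memory, and modulo per-scheduler
details, the pattern is more or less the one below, which is also what
your rt_vcpu_wake() draft further down does:)

/* a wakeup is "spurious" if the vcpu is already running on a pcpu,
 * or is already queued as runnable: nothing to do but count it */
if ( unlikely(curr_on_cpu(vc->processor) == vc) )
{
    SCHED_STAT_CRANK(vcpu_wake_running);
    return;
}
if ( unlikely(__vcpu_on_q(svc)) )
{
    SCHED_STAT_CRANK(vcpu_wake_onrunq);
    return;
}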

> > So we realized that this is just a spurious wakeup, and get back to
> > what we were doing.
> > 
> > What's wrong with this I just said?
> > 
> 
> The reason why I wanted to add a vcpu back to the replq is because it
> could be taken off in the timer handler. Now, because of the spurious
> wakeup, I think the timer shouldn't take any vcpus off of the replq,
> in which case wake() should do nothing, just like the other schedulers
> do when it's a spurious wakeup. Also, only sleep() should remove
> events from the replq.
> 
Err... well, that depends on how the code ends up looking, and it's
starting to become difficult to reason about it like this.

Perhaps, considering that it looks to me that we are in agreement on
all the important aspects, you can draft a new version and we'll reason
and comment on top of that?

> > Mmm... no, I think we should know/figure out whether a
> > replenishment is
> > pending already and we are ok with it, or if we need a new one.
> > 
> 
> Just to make sure: spurious sleep/wakeup are for a vcpu that is on a
> pcpu or on any queue, right?
> 
Yes, "spurious wakeup" is how I'm calling wakeups that need no action
from the scheduler, because the vcpu is already awake (and either
running or runnable).

I don't think there is such a thing as spurious sleep... When sleep is
called, we go to sleep. There are cases where the sleep-->wakeup
turnaround is really really fast, but I would not call them "spurious
sleeps", and they should not need much special treatment, not in
sched_rt.c at least.

Oh, maybe you mean the cases where you wake up and you find out that
context_saved() has not run yet (this just occurred to me by looking at
the code below)? Well, yes... But I wouldn't call those "spurious
sleeps" either.
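
Just so we're talking about the same thing: the path I have in mind is
the one going through __RTDS_delayed_runq_add, i.e., roughly the below
(quoting from memory, so modulo details, and leaving aside whatever
replenishment/tickling handling your patch adds there):

static void
rt_context_saved(const struct scheduler *ops, struct vcpu *vc)
{
    struct rt_vcpu *svc = rt_vcpu(vc);
    spinlock_t *lock = vcpu_schedule_lock_irq(vc);

    clear_bit(__RTDS_scheduled, &svc->flags);
    /* idle vcpus never go on the runq */
    if ( is_idle_vcpu(vc) )
        goto out;

    /* the vcpu was woken up while we were still saving its context:
     * this is where the delayed queueing actually happens */
    if ( test_and_clear_bit(__RTDS_delayed_runq_add, &svc->flags) &&
         likely(vcpu_runnable(vc)) )
        __runq_insert(ops, svc);

 out:
    vcpu_schedule_unlock_irq(lock, vc);
}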

> If a real sleep happens, the vcpu should be taken off the replq.
> However, in a spurious wakeup (which I assume corresponds to a
> spurious sleep?), it shouldn't be taken off the replq. Adding back to
> the replq should only happen for those vcpus which were taken off
> because of a "real sleep".
> 
I don't really follow, but I have the feeling that you're making it
sound more complicated than it is in reality... :-)

> So here is a summary to make sure everyone's on the same page :)
> This is basically your code with a slight modification before
> replq_insert().
> 
> static void
> rt_vcpu_wake(const struct scheduler *ops, struct vcpu *vc)
> {
>     struct rt_vcpu * const svc = rt_vcpu(vc);
>     s_time_t now = NOW();
> 
>     BUG_ON( is_idle_vcpu(vc) );
> 
>     if ( unlikely(curr_on_cpu(vc->processor) == vc) )
>     {
>         /*
>          * spurious wakeup so do nothing as it is
>          * not removed from replq
>          */
>
Kill the comment (and below too), it's pretty useless.

>         SCHED_STAT_CRANK(vcpu_wake_running);
>         return;
>     }
> 
>     /* on RunQ/DepletedQ, just update info is ok */
>     if ( unlikely(__vcpu_on_q(svc)) )
>     {
>         /*
>          * spurious wakeup so do nothing as it is
>          * not removed from replq
>          */
>         SCHED_STAT_CRANK(vcpu_wake_onrunq);
>         return;
>     }
> 
>     if ( likely(vcpu_runnable(vc)) )
>         SCHED_STAT_CRANK(vcpu_wake_runnable);
>     else
>         SCHED_STAT_CRANK(vcpu_wake_not_runnable);
> 
> 
>     /*
>      * sleep() might have taken it off the replq before
>      * context_saved(). Three cases:
>      * 1) sleep on pcpu --> context_saved() --> wake()
>      * 2) sleep on pcpu --> wake() --> context_saved()
>      * 3) sleep before context_saved() --> wake().
>      * In the first two cases sleep() doesn't remove this vcpu,
>      * so add a check before inserting.
>      */
>
I'm not sure why you're saying that, in cases 1) and 2), sleep should
not remove the replenishment event from the queue...

>     if ( now >= svc->cur_deadline )
>     {
>         rt_update_deadline(now, svc);
>         __replq_remove(svc);
>     }
> 
>     if ( !__vcpu_on_replq(svc) )
>         __replq_insert(svc);
> 
And despite the comment above being a bit unclear to me, I *think* this
code is ok. :-)

>     /* If context hasn't been saved for this vcpu yet, we can't put
>      * it on the Runqueue/DepletedQ. Instead, we set a flag so that
>      * it will be put on the Runqueue/DepletedQ after the context
>      * has been saved.
>      */
>     if ( unlikely(svc->flags & RTDS_scheduled) )
>     {
>         set_bit(__RTDS_delayed_runq_add, &svc->flags);
>         return;
>     }
> 
>     /* insert svc to runq/depletedq because svc is not in queue now */
>     __runq_insert(ops, svc);
>     runq_tickle(ops, svc);
> 
>     return;
> }
> 
> 
> static void
> rt_vcpu_sleep(const struct scheduler *ops, struct vcpu *vc)
> {
>     struct rt_vcpu * const svc = rt_vcpu(vc);
> 
>     BUG_ON( is_idle_vcpu(vc) );
>     SCHED_STAT_CRANK(vcpu_sleep);
> 
>     /*
>      * If a vcpu is not sleeping on a pcpu or a queue,
>      * it should be taken off the replenishment queue.
>      * Case 1: sleep on a pcpu
>      * Case 2: sleep on a queue
>      * Case 3: sleep before context switch is finished
>      */
>     if ( curr_on_cpu(vc->processor) == vc )
>         cpu_raise_softirq(vc->processor, SCHEDULE_SOFTIRQ);
>     else if ( __vcpu_on_q(svc) )
>         __q_remove(svc);
>     else if ( svc->flags & RTDS_delayed_runq_add )
>     {
>         clear_bit(__RTDS_delayed_runq_add, &svc->flags);
> 
>         if ( __vcpu_on_replq(svc) )
>             __replq_remove(svc);
>     }
> }
> 
Yep, as I was saying above: why are you removing the vcpu from the
replenishment queue only in the last case?

For instance, if a vcpu is running on a pcpu, and it goes to sleep (the
first if), do you want the timer to fire while it is still asleep or
not?

My impression is that, no, you don't want that, and you compensate for
it in rt_vcpu_wake(). Well, if this is the case, then you should remove
the replenishment event (and, potentially, stop or reprogram the timer)
in any of these cases.

Don't you think?

Note that the if-s here are not to be matched with the ones in _wake().
As I said, when you sleep you sleep, and there's no such thing as a
spurious sleep. The reason why we have these if-s is that different
actions need to be undertaken depending on whether the vcpu that is
going to sleep was running, in a queue, or waiting to be re-queued.

For instance, you want the pcpu to go through schedule() again only if
the vcpu was running. Also, you want to be sure that the vcpu is not
queued back in context_saved(), if it was flagged to be. Etc.
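
IOW, what I'd expect is something along these lines (just a sketch,
untested, reusing the helpers from your draft; stopping or
reprogramming the replenishment timer can be reasoned about
separately):

static void
rt_vcpu_sleep(const struct scheduler *ops, struct vcpu *vc)
{
    struct rt_vcpu * const svc = rt_vcpu(vc);

    BUG_ON( is_idle_vcpu(vc) );
    SCHED_STAT_CRANK(vcpu_sleep);

    if ( curr_on_cpu(vc->processor) == vc )
        cpu_raise_softirq(vc->processor, SCHEDULE_SOFTIRQ);
    else if ( __vcpu_on_q(svc) )
        __q_remove(svc);
    else if ( svc->flags & RTDS_delayed_runq_add )
        clear_bit(__RTDS_delayed_runq_add, &svc->flags);

    /* in all three cases, a sleeping vcpu does not need replenishments:
     * take it off the replenishment queue, and let wake() put it back
     * when (if) it becomes runnable again */
    if ( __vcpu_on_replq(svc) )
        __replq_remove(svc);
}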

> > Before, this was done in __repl_update(), which is (and that's a
> > good
> > thing) no longer there.
> > 
> > But maybe it's me that missed something...
> > 
> 
> __repl_update() only scans the runq and depletedq for replenishments,
> right?
> And right after burn_budget(), if the current vcpu runs out of
> budget,
> snext will be picked here because of:
> 
> /* if scurr has higher priority and budget, still pick scurr */
> if ( !is_idle_vcpu(current) && vcpu_runnable(current) &&
>      scurr->cur_budget > 0 &&
>      ( is_idle_vcpu(snext->vcpu) ||
>        scurr->cur_deadline <= snext->cur_deadline ) )
>     snext = scurr;
> 
> if scurr->cur_budget == 0, snext will be picked immediately.
> Is that what you're looking for?
> 
Ok, but then scurr needs to go to the depleted queue. Double checking,
I see that this happens in __runq_insert(), so, yes, this code also
looks fine.
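
(For the records, the bit of __runq_insert() I was double checking is,
give or take the exact names, the one below: a vcpu with budget left
goes on the runq, in deadline order, while one with cur_budget == 0
ends up on the depletedq.)

/* rough sketch of the queue selection in __runq_insert(), from memory */
struct list_head *iter;
struct list_head *runq = rt_runq(ops);
struct rt_private *prv = rt_priv(ops);

if ( svc->cur_budget > 0 )
{
    /* the runq is kept sorted by increasing cur_deadline */
    list_for_each ( iter, runq )
    {
        struct rt_vcpu * iter_svc = __q_elem(iter);
        if ( svc->cur_deadline <= iter_svc->cur_deadline )
            break;
    }
    list_add_tail(&svc->q_elem, iter);
}
else
    list_add(&svc->q_elem, &prv->depletedq);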

Thanks for bearing with me. :-D

> > This all sounds fine to me.
> > 
> Great. I will put together v5 soon, if we all agree on rt_wake() and
> rt_sleep().
> 
We're probably quite close. :-P

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
