
Re: [Xen-devel] [PATCH v5][RFC]xen: sched: convert RTDS from time to event driven model



On Thu, 2016-02-25 at 01:15 -0500, Tianyang Chen wrote:
> 
> On 2/24/2016 9:02 PM, Dario Faggioli wrote:
> > Hey,
> > 
> > Here I am, sorry for the delay. :-(
> No problem, I think we are almost there.
>
We probably are, although a few more adjustments in the timer logic
are required, IMO.

I'll try to be quick in replying to next rounds. :-)

> > On Mon, 2016-02-08 at 23:33 -0500, Tianyang Chen wrote:
> > > 
> > > diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
> > > index 2e5430f..1f0bb7b 100644
> > > --- a/xen/common/sched_rt.c
> > > +++ b/xen/common/sched_rt.c
> > > @@ -388,6 +424,66 @@ __q_remove(struct rt_vcpu *svc)
> > >   }
> > >   
> > >   /*
> > > + * Removing a vcpu from the replenishment queue could
> > > + * re-program the timer for the next replenishment event
> > > + * if the timer is currently active
> > > + */
> > > +static inline void
> > > +__replq_remove(const struct scheduler *ops, struct rt_vcpu *svc)
> > > +{
> > > +    struct rt_private *prv = rt_priv(ops);
> > > +    struct list_head *replq = rt_replq(ops);
> > > +    struct timer* repl_timer = prv->repl_timer;
> > > +
> > > +    if ( __vcpu_on_replq(svc) )
> > > +    {
> > So, can this be false? If yes, when and why?
> > 
> > > +        /*
> > > +         * disarm the timer if removing the first replenishment event
> > > +         * which is going to happen next
> > > +         */
> > > +        if( active_timer(repl_timer) )
> > > +        {
> > > 
> > And here again, isn't it the case that, if there is a vcpu in the
> > replenishment queue, then the timer _must_ be active? (I.e., if the
> > above is true, isn't this also always true?)
> 
> So, the reason for this check is that the code inside the timer
> handler also calls this function when the timer is stopped. We don't
> want to start the timer inside the timer handler while it's
> manipulating the replenishment queue, right?
>
So, this function and the timer handler are serialized by the global
RTDS scheduler lock, so I don't see any chance of races due to
concurrent execution of them from different pcpus.

And for the case when the timer handler calls this, if that is the
*only* reason why you put this check here, I think it's overkill.
Perhaps this can be dealt with in the timer handler, by just doing the
list_del_init() itself (ugly, but maybe with a comment...). Or we can
make __replq_insert() able to deal with re-insertions, i.e., cases when
the vcpu was already on the replenishment queue, and we're basically
changing its position (slightly better, but we lose the ASSERT() in
__replq_insert() itself). Or we can even split this function in two,
one (e.g., _replq_remove()) only doing the actual dequeueing and
another one (e.g., replq_remove()) calling _replq_remove() for the
dequeueing and then dealing with the timer (and in this case, the timer
handler can just call _replq_remove(), as the reprogramming will happen
later anyway).

I can't be sure which solution would look and work best without
actually seeing them... I guess, pick one up and give it a try.
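
Just to give an idea of the third option, I mean something like this
(only a sketch, of course, and whether we want to stop and re-arm the
timer unconditionally here is part of what I say below about hints):

    /*
     * Dequeue only: this is all the timer handler needs, as it
     * takes care of reprogramming the timer itself, at the end.
     */
    static inline void
    _replq_remove(struct rt_vcpu *svc)
    {
        ASSERT( __vcpu_on_replq(svc) );
        list_del_init(&svc->replq_elem);
    }

    /* Dequeue, and deal with the replenishment timer as well. */
    static inline void
    replq_remove(const struct scheduler *ops, struct rt_vcpu *svc)
    {
        struct rt_private *prv = rt_priv(ops);
        struct list_head *replq = rt_replq(ops);
        struct timer *repl_timer = prv->repl_timer;

        _replq_remove(svc);

        stop_timer(repl_timer);
        if ( !list_empty(replq) )
            set_timer(repl_timer, __replq_elem(replq->next)->cur_deadline);
    }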

> Because it could be removing/adding back more than one vcpu, and the
> timer should only be set at the very end of the timer handler.
> 
Ok, so that's why there's a similar check (which I also did not like)
in __replq_insert() then, I see it now.

> In fact, I copied a static function from timer.c just to check if a
> timer is active or not. Not sure if this is a good practice or not.
> 
Yes, I wanted to comment on that too, but then forgot.

Indeed it is *not* a good practice. If you need anything like that, you
should expose the original function (by making it !static and adding it
to a header). However, whenever something like that happens, it means
that we're abusing the intended API of a particular subsystem, and it
should serve as a warning to think twice about whether there really is
no other way of doing things (of course, there are cases where there
isn't, but this should not be one of those).
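
I.e., if having it around really turned out to be necessary (which, as
I said, I doubt), the way to go would be something like this, rather
than copying the body over (just a sketch; double check the actual
signature in timer.c):

    --- a/xen/common/timer.c
    -static int active_timer(struct timer *timer)
    +int active_timer(struct timer *timer)

    --- a/xen/include/xen/timer.h
    +/* Check whether a timer is currently armed. */
    +int active_timer(struct timer *timer);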

> > > +            struct rt_vcpu *next_repl = __replq_elem(replq->next);
> > > +
> > > +            if( next_repl->cur_deadline == svc->cur_deadline )
> > > +                repl_timer->expires = 0;
> > > +
> > I think we need to call stop_timer() here, don't we? I know that
> > set_timer() does that itself, but I think it is indeed necessary.
> > E.g., right now, you're manipulating the timer without the timer
> > lock.
> 
> So when the timer handler is protected by a global lock, is it still
> necessary to stop the timer?
>
What do you mean with "a global lock"? Timers are maintained in a
per-cpu heap. timer->expires is what's used to keep the heap... well...
a heap, not to mention that stopping a timer may call for some action
on the timers of a cpu (e.g., look at deactivate_timer() and at the
fact that it may or may not actually raise TIMER_SOFTIRQ).

So, stopping and starting a timer needs to be done with the proper
timer API calls, because the timer infrastructure needs to know about
it, and the key of the heap can't just be changed behind the timer
infrastructure's back, and without the timer lock (check, in the whole
xen source code, how often that happens outside of timer.c: basically
never! :-D).
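
IOW, the pattern always needs to be as follows (a sketch, with
next_deadline being just a placeholder for whatever the new front
event's deadline turns out to be):

    /* Let timer.c take the timer lock and fix up the heap... */
    stop_timer(repl_timer);

    /* ...and same story when (and if) re-arming. */
    set_timer(repl_timer, next_deadline);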

> Also, the timer only runs on a single CPU for a scheduler too.
> 
That's how all timers work, and I don't see how that matters here.

>   So this function reduces to:
>     
>          /*
>           * disarm the timer if removing the first replenishment event
>           * which is going to happen next
>           */
>          if( active_timer(repl_timer) )
>
This needs to somehow go away.

>          {
>              stop_timer(repl_timer);
>              repl_timer->expires = 0;
>
And you shouldn't need to explicitly set this to 0. It's stopped, so it
won't fire, and that's it.

>              list_del_init(&svc->replq_elem);
> 
>              /* re-arm the timer for the next replenishment event */
>              if( !list_empty(replq) )
>              {
>                  struct rt_vcpu *svc_next = __replq_elem(replq->next);
>                  set_timer(repl_timer, svc_next->cur_deadline);
>              }
>
BTW, here, if we're coming from the timer handler, it is certainly the
case that we are removing one or more vcpus from the front of the
replenishment queue, including the one that actually made the timer
trigger. Therefore, it's ok to restart the timer, if there are
replenishment events left, and the one that has now become the front
of the queue is the one we want to use for reprogramming.

But, if we come from, say, sleep, and we've removed a vcpu whose
replenishment event was, say, in 5th place in the queue, there should
be no need to re-program the timer, should there?

So, it looks to me that this function could also use some hints on
whether the timer should really be re-programmed or not.
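
E.g., the hint we need is actually rather cheap to gather, if we do it
before the actual dequeueing (again, just a sketch):

    /*
     * The timer is armed on svc's deadline only if svc's event is
     * the one at the front of the replenishment queue.
     */
    bool_t reprogram = ( replq->next == &svc->replq_elem );

    list_del_init(&svc->replq_elem);

    if ( reprogram )
    {
        stop_timer(repl_timer);
        if ( !list_empty(replq) )
            set_timer(repl_timer, __replq_elem(replq->next)->cur_deadline);
    }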

> > > @@ -405,22 +500,37 @@ __runq_insert(const struct scheduler *ops,
> > > struct rt_vcpu *svc)
> > >   
> > >       /* add svc to runq if svc still has budget */
> > >       if ( svc->cur_budget > 0 )
> > > -    {
> > > -        list_for_each(iter, runq)
> > > -        {
> > > -            struct rt_vcpu * iter_svc = __q_elem(iter);
> > > -            if ( svc->cur_deadline <= iter_svc->cur_deadline )
> > > -                    break;
> > > -         }
> > > -        list_add_tail(&svc->q_elem, iter);
> > > -    }
> > > +        _deadline_queue_insert(&__q_elem, svc, &svc->q_elem,
> > > runq);
> > >       else
> > > -    {
> > >           list_add(&svc->q_elem, &prv->depletedq);
> > > -    }
> > Can we ASSERT() something about a replenishment event being queued,
> > in either case?
> Do you mean asserting that the svc being added is on queue after
> being inserted, in either case?
>
Not sure... I'm thinking whether it is an invariant that, if a vcpu is
inserted in either the runqueue or the depletedq, there should be a
replenishment event queued for it. In both cases, it would be the event
that will give the vcpu itself its full budget, no matter whether it
still has some available (in the former case) or has already run out of
it (in the latter case).

So, if that is the case, I think we should put down an
ASSERT(__vcpu_on_replq(svc)) (or something like that).
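
I.e., something like this, at the bottom of __runq_insert() (a sketch,
based on the hunk above, and assuming the invariant really holds):

    /* add svc to runq if svc still has budget */
    if ( svc->cur_budget > 0 )
        _deadline_queue_insert(&__q_elem, svc, &svc->q_elem, runq);
    else
        list_add(&svc->q_elem, &prv->depletedq);

    /*
     * In both cases, a replenishment event must already be queued,
     * i.e., the one that will give svc its budget back.
     */
    ASSERT( __vcpu_on_replq(svc) );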

> > > + * it could re-program the timer if it fires later than
> > > + * this vcpu's cur_deadline.
> > > 
> > Do you mean "The timer may be re-programmed if svc is inserted at
> > the front of the events list"?
> > 
> > > Also, this is used to program
> > > + * the timer for the first time.
> > > 
> > "Also, this is what programs the timer the first time, when called
> > from
> > xxx"
> I agree with the above, and for this one: if svc is going to be at
> the front of the list, it programs the timer.
>
Good. :-)

> Or, in the other case, it sees that repl_timer->expires == 0, which
> happens either during system boot or when the only vcpu on the
> replenishment list has been removed, in replq_remove(). So it's not
> called anywhere in such a sense.
>
Sorry, I don't get this last part... probably because I really don't
like (and find it rather hard to follow) this reasoning about expires
being (artificially) 0. I really think we should get rid of that part.

> Maybe "It programs the timer for the first timer, or when the only
> vcpu 
> on replenishment event list has
> been removed"?
>
Again, what does it mean "or when the only vcpu on the replenishment
event list has been removed"? If there is no one in the replenishment
queue, why do we need to reprogram? If it's for us, why are we not in
the replenishment queue? And if this is about some interdependency
between __replq_remove() and __replq_insert() at some special call
sites, such special cases need to be dealt with locally, in those call
sites, not by having obscure counter measures here, for which any user
of this function pays the price (in terms of both performance and code
being harder to understand).

> > > @@ -651,6 +785,10 @@ rt_vcpu_remove(const struct scheduler *ops,
> > > struct vcpu *vc)
> > >       lock = vcpu_schedule_lock_irq(vc);
> > >       if ( __vcpu_on_q(svc) )
> > >           __q_remove(svc);
> > > +
> > > +    if( __vcpu_on_replq(svc) )
> > > +        __replq_remove(ops,svc);
> > > +
> > As the code in this patch stands, you're checking
> > __vcpu_on_replq(svc) twice. In fact, you do it here and inside
> > __replq_remove(). I've already said that I'd like for the check
> > inside __replq_remove() to go away so, keep this one here (if it's
> > really necessary) and kill the other one inside the function.
> > 
> > > @@ -924,6 +1027,9 @@ rt_vcpu_sleep(const struct scheduler *ops,
> > > struct vcpu *vc)
> > >           __q_remove(svc);
> > >       else if ( svc->flags & RTDS_delayed_runq_add )
> > >           clear_bit(__RTDS_delayed_runq_add, &svc->flags);
> > > +
> > > +    if( __vcpu_on_replq(svc) )
> > > +        __replq_remove(ops, svc);
> > > 
> > Same here.
> So, what about __q_remove()? It looks like a couple of places are
> double checking as well.
>
Indeed they are. They're bugs. :-D

Not in the sense that they make the code explode, or misbehave, of
course, but I still dislike them rather passionately. If, after this
patch will be done and checked in, you guys would want to take a stab
at sanitizing that too, it would be awesome. :-)


> > > @@ -1051,6 +1153,18 @@ rt_vcpu_wake(const struct scheduler *ops,
> > > struct vcpu *vc)
> > >       else
> > >           SCHED_STAT_CRANK(vcpu_wake_not_runnable);
> > >   
> > > +    /* budget repl here is needed before inserting back to runq. If so,
> > > 
> > Comment style.
> > 
> > Also, what do you mean with that "If so"... "If so" what?
> I mean if the budget is replenished here, then re-insert back. I
> guess it's self-explanatory.
>
I still can't parse that. Maybe:

"If deadline passed while svc was asleep/blocked, we need new
scheduling parameters (a new deadline and full budget), and we also
need to schedule a new replenishment event, according to them."

But then again, I think we should try to enforce the fact that, if a
vcpu is not runnable, there's no pending replenishment for it, and make
it an invariant we can throw ASSERT() against.

We're already doing it for sleeps, after all...
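
In fact, if we manage to enforce that, the hunk below could become
something like this (a sketch, of course):

    if ( now >= svc->cur_deadline )
        rt_update_deadline(now, svc);

    /*
     * A vcpu that was not runnable can't have a replenishment event
     * queued: we removed it when the vcpu went to sleep (or blocked).
     */
    ASSERT( !__vcpu_on_replq(svc) );
    __replq_insert(ops, svc);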

> > > +     * it should be re-inserted back to the replenishment queue.
> > > +     */
> > > +    if ( now >= svc->cur_deadline)
> > > +    {
> > > +        rt_update_deadline(now, svc);
> > > +        __replq_remove(ops, svc);
> > > +    }
> > > +
> > > +    if( !__vcpu_on_replq(svc) )
> > > +        __replq_insert(ops, svc);
> > > +
> > And here I am again: is it really necessary to check whether svc is
> > not in the replenishment queue? It looks to me that it really
> > should not be there... but maybe it can, because we remove the
> > event from the queue when the vcpu sleeps, but *not* when the vcpu
> > blocks?
> Yeah. That is the case where I keep getting assertion failures if
> it's removed.
>
Which ASSERT() fails?

> I'm thinking when a vcpu unblocks, it could potentially fall through
> here.
>
Well, when unblocking, wake() is certainly called, yes.

> And like you said, mostly spurious sleep happens when a vcpu is
> running, and it could happen in other cases, although rare.
> 
I think I already said there's no such thing as a "spurious sleep". Or
at least, I can't think of anything that I would define as a spurious
sleep; if you do, please explain what situation you're referring to.

In any case, one way of dealing with vcpus blocking/offlining/etc.
could be to cancel their replenishment event with __replq_remove() in
context_saved(), in case we are not adding the vcpu back to the runq.

(This may make it possible to avoid doing it in rt_vcpu_sleep() too,
but you'll need to check and test.)
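
Something like this, I mean (a sketch, assuming rt_context_saved()
keeps its current overall structure):

    /* In rt_context_saved(), after clearing __RTDS_scheduled: */
    if ( test_and_clear_bit(__RTDS_delayed_runq_add, &svc->flags) &&
         likely(vcpu_runnable(vc)) )
        __runq_insert(ops, svc);
    else
        /* Not going back to the runq: cancel the replenishment event. */
        __replq_remove(ops, svc);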

Can you give this a try?

Thanks and Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

