
Re: [Xen-devel] [PATCH v5][RFC]xen: sched: convert RTDS from time to event driven model



On Thu, 2016-02-25 at 01:15 -0500, Tianyang Chen wrote:
> 
> On 2/24/2016 9:02 PM, Dario Faggioli wrote:
> > Hey,
> > 
> > Here I am, sorry for the delay. :-(
> No problem, I think we are almost there.
>
We probably are, although a few more adjustments in the timer logic
are required, IMO.

I'll try to be quick in replying to next rounds. :-)

> > On Mon, 2016-02-08 at 23:33 -0500, Tianyang Chen wrote:
> > > 
> > > diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
> > > index 2e5430f..1f0bb7b 100644
> > > --- a/xen/common/sched_rt.c
> > > +++ b/xen/common/sched_rt.c
> > > @@ -388,6 +424,66 @@ __q_remove(struct rt_vcpu *svc)
> > >   }
> > >   
> > >   /*
> > > + * Removing a vcpu from the replenishment queue could
> > > + * re-program the timer for the next replenishment event
> > > + * if the timer is currently active
> > > + */
> > > +static inline void
> > > +__replq_remove(const struct scheduler *ops, struct rt_vcpu *svc)
> > > +{
> > > +    struct rt_private *prv = rt_priv(ops);
> > > +    struct list_head *replq = rt_replq(ops);
> > > +    struct timer* repl_timer = prv->repl_timer;
> > > +
> > > +    if ( __vcpu_on_replq(svc) )
> > > +    {
> > So, can this be false? If yes, when and why?
> > 
> > > +        /*
> > > +         * disarm the timer if removing the first replenishment event
> > > +         * which is going to happen next
> > > +         */
> > > +        if( active_timer(repl_timer) )
> > > +        {
> > > 
> > And here again, isn't it the case that, if there is a vcpu in the
> > replenishment queue, then the timer _must_ be active? (I.e., if the
> > above is true, isn't this also always true?)
> 
> So, the reason for this check is that the code inside the timer
> handler also calls this function when the timer is stopped. We don't
> want to start the timer inside the timer handler while it's
> manipulating the replenishment queue, right?
>
So, this function and the timer handler are serialized by the global
RTDS scheduler lock, so I don't see any chance of races due to
concurrent execution of them from different pcpus.

And for the case when the timer handler calls this, if that is the
*only* reason why you put this check here, I think it's overkill.
Perhaps this can be dealt with in the timer handler, by just doing the
list_del_init() itself (ugly, but maybe with a comment...). Or we can
make __replq_insert() able to deal with re-insertions, i.e., cases when
the vcpu was already on the replenishment queue, and we're basically
changing its position (slightly better, but we lose the ASSERT() in
__replq_insert() itself). Or we can even split this function in two,
one (e.g., _replq_remove()) only doing the actual dequeueing and
another one (e.g., replq_remove()) calling _replq_remove() for the
dequeueing and then dealing with the timer (and in this case, the timer
handler can just call _replq_remove(), as the reprogramming will happen
later anyway).

I can't be sure which solution would look and work best without
actually seeing them... I guess, pick one up and give it a try.
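
Just to give an idea of the third option, I mean something like this
(only a sketch, of course, and whether we want to stop and re-arm the
timer unconditionally here is part of what I say below about hints):

    /*
     * Dequeue only: this is all the timer handler needs, as it
     * takes care of reprogramming the timer itself, at the end.
     */
    static inline void
    _replq_remove(struct rt_vcpu *svc)
    {
        ASSERT( __vcpu_on_replq(svc) );
        list_del_init(&svc->replq_elem);
    }

    /* Dequeue, and deal with the replenishment timer as well. */
    static inline void
    replq_remove(const struct scheduler *ops, struct rt_vcpu *svc)
    {
        struct rt_private *prv = rt_priv(ops);
        struct list_head *replq = rt_replq(ops);
        struct timer *repl_timer = prv->repl_timer;

        _replq_remove(svc);

        stop_timer(repl_timer);
        if ( !list_empty(replq) )
            set_timer(repl_timer, __replq_elem(replq->next)->cur_deadline);
    }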

> Because it could be removing/adding back more than one vcpu, and the
> timer should only be set at the very end of the timer handler.
> 
Ok, so that's why there's a similar check (which I also did not like)
in __replq_insert() then, I see it now.

> In fact, I copied a static function from timer.c just to check if a
> timer is active or not. Not sure if this is a good practice or not.
> 
Yes, I wanted to comment on that too, but then forgot.

Indeed it is *not* a good practice. If you need anything like that, you
should expose the original function (by making it !static and adding it
to a header). However, whenever something like that happens, it means
that we're abusing the intended API of a particular subsystem, and it
should serve as a warning to think twice about whether there really is
no other way of doing things (of course, there are cases where there
isn't, but this should not be one of those).
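
I.e., if having it around really turned out to be necessary (which, as
I said, I doubt), the way to go would be something like this, rather
than copying the body over (just a sketch; double check the actual
signature in timer.c):

    --- a/xen/common/timer.c
    -static int active_timer(struct timer *timer)
    +int active_timer(struct timer *timer)

    --- a/xen/include/xen/timer.h
    +/* Check whether a timer is currently armed. */
    +int active_timer(struct timer *timer);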

> > > +            struct rt_vcpu *next_repl = __replq_elem(replq->next);
> > > +
> > > +            if( next_repl->cur_deadline == svc->cur_deadline )
> > > +                repl_timer->expires = 0;
> > > +
> > I think we need to call stop_timer() here, don't we? I know that
> > set_timer() does that itself, but I think it is indeed necessary.
> > E.g., right now, you're manipulating the timer without the timer
> > lock.
> 
> So when the timer handler is protected by a global lock, is it still
> necessary to stop the timer?
>
What do you mean with "a global lock"? Timers are maintained in a
per-cpu heap. timer->expires is what's used to keep the heap... well...
a heap, not to mention that stopping a timer may call for some action
on the timers of a cpu (e.g., look at deactivate_timer() and at the
fact that it may or may not actually raise TIMER_SOFTIRQ).

So, stopping and starting a timer needs to be done with the proper
timer API calls, because the timer infrastructure needs to know about
it, and the key of the heap can't just be changed behind the timer
infrastructure's back, and without the timer lock (check, in the whole
xen source code, how often that happens outside of timer.c: basically
never! :-D).
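
IOW, the pattern always needs to be as follows (a sketch, with
next_deadline being just a placeholder for whatever the new front
event's deadline turns out to be):

    /* Let timer.c take the timer lock and fix up the heap... */
    stop_timer(repl_timer);

    /* ...and same story when (and if) re-arming. */
    set_timer(repl_timer, next_deadline);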

> Also, the timer only runs on a single CPU for a scheduler too.
> 
That's how all timers work, and I don't see how that matters here.

>   So this function reduces to:
>     
>          /*
>           * disarm the timer if removing the first replenishment event
>           * which is going to happen next
>           */
>          if( active_timer(repl_timer) )
>
This needs to somehow go away.

>          {
>              stop_timer(repl_timer);
>              repl_timer->expires = 0;
>
And you shouldn't need to explicitly set this to 0. It's stopped, so it
won't fire, and that's it.

>              list_del_init(&svc->replq_elem);
> 
>              /* re-arm the timer for the next replenishment event */
>              if( !list_empty(replq) )
>              {
>                  struct rt_vcpu *svc_next = __replq_elem(replq->next);
>                  set_timer(repl_timer, svc_next->cur_deadline);
>              }
>
BTW, here, if we're coming from the timer handler, it is certainly the
case that we are removing one or more vcpus from the front of the
replenishment queue, including the one that actually made the timer
trigger. Therefore, it's ok to restart the timer, if there are
replenishment events left, and the one that has now become the front
of the queue is the one we want to use for reprogramming.

But, if we come from, say, sleep, and we've removed a vcpu whose
replenishment event was, say, in 5th place in the queue, there should
be no need to re-program the timer, should there?

So, it looks to me that this function could also use some hints on
whether the timer should really be re-programmed or not.
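
E.g., the hint we need is actually rather cheap to gather, if we do it
before the actual dequeueing (again, just a sketch):

    /*
     * The timer is armed on svc's deadline only if svc's event is
     * the one at the front of the replenishment queue.
     */
    bool_t reprogram = ( replq->next == &svc->replq_elem );

    list_del_init(&svc->replq_elem);

    if ( reprogram )
    {
        stop_timer(repl_timer);
        if ( !list_empty(replq) )
            set_timer(repl_timer, __replq_elem(replq->next)->cur_deadline);
    }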

> > > @@ -405,22 +500,37 @@ __runq_insert(const struct scheduler *ops,
> > > struct rt_vcpu *svc)
> > >   
> > >       /* add svc to runq if svc still has budget */
> > >       if ( svc->cur_budget > 0 )
> > > -    {
> > > -        list_for_each(iter, runq)
> > > -        {
> > > -            struct rt_vcpu * iter_svc = __q_elem(iter);
> > > -            if ( svc->cur_deadline <= iter_svc->cur_deadline )
> > > -                    break;
> > > -         }
> > > -        list_add_tail(&svc->q_elem, iter);
> > > -    }
> > > +        _deadline_queue_insert(&__q_elem, svc, &svc->q_elem,
> > > runq);
> > >       else
> > > -    {
> > >           list_add(&svc->q_elem, &prv->depletedq);
> > > -    }
> > Can we ASSERT() something about a replenishment event being queued,
> > in either case?
> Do you mean asserting that the svc being added is on queue after
> being inserted, in either case?
>
Not sure... I'm thinking whether it is an invariant that, if a vcpu is
inserted in either the runqueue or the depletedq, there should be a
replenishment event queued for it. In both cases, it would be the event
that will give the vcpu itself its full budget, no matter whether it
still has some available (in the former case) or has already run out of
it (in the latter case).

So, if that is the case, I think we should put down an
ASSERT(__vcpu_on_replq(svc)) (or something like that).
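
I.e., something like this, at the bottom of __runq_insert() (a sketch,
based on the hunk above, and assuming the invariant really holds):

    /* add svc to runq if svc still has budget */
    if ( svc->cur_budget > 0 )
        _deadline_queue_insert(&__q_elem, svc, &svc->q_elem, runq);
    else
        list_add(&svc->q_elem, &prv->depletedq);

    /*
     * In both cases, a replenishment event must already be queued,
     * i.e., the one that will give svc its budget back.
     */
    ASSERT( __vcpu_on_replq(svc) );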

> > > + * it could re-program the timer if it fires later than
> > > + * this vcpu's cur_deadline.
> > > 
> > Do you mean "The timer may be re-programmed if svc is inserted at
> > the front of the events list"?
> > 
> > > Also, this is used to program
> > > + * the timer for the first time.
> > > 
> > "Also, this is what programs the timer the first time, when called
> > from
> > xxx"
> I agree with the above, and for this one: if svc is going to be at
> the front of the list, it programs the timer.
>
Good. :-)

> Or, in the other case, it sees that repl_timer->expires == 0, which
> happens either during system boot or when the only vcpu on the
> replenishment list has been removed, in replq_remove(). So it's not
> called anywhere in such a sense.
>
Sorry, I don't get this last part... probably because I really don't
like (and find it rather hard to follow) this reasoning about expires
being (artificially) 0. I really think we should get rid of that part.

> Maybe "It programs the timer for the first timer, or when the only
> vcpu 
> on replenishment event list has
> been removed"?
>
Again, what does it mean "or when the only vcpu on the replenishment
event list has been removed"? If there is no one in the replenishment
queue, why do we need to reprogram? If it's for us, why are we not in
the replenishment queue? And if this is about some interdependency
between __replq_remove() and __replq_insert() at some special call
sites, such special cases need to be dealt with locally, in those call
sites, not by having obscure counter measures here, for which any user
of this function pays the price (in terms of both performance and code
being harder to understand).

> > > @@ -651,6 +785,10 @@ rt_vcpu_remove(const struct scheduler *ops,
> > > struct vcpu *vc)
> > >       lock = vcpu_schedule_lock_irq(vc);
> > >       if ( __vcpu_on_q(svc) )
> > >           __q_remove(svc);
> > > +
> > > +    if( __vcpu_on_replq(svc) )
> > > +        __replq_remove(ops,svc);
> > > +
> > As the code in this patch stands, you're checking
> > __vcpu_on_replq(svc) twice. In fact, you do it here and inside
> > __replq_remove(). I've already said that I'd like for the check
> > inside __replq_remove() to go away so, keep this one here (if it's
> > really necessary) and kill the other one inside the function.
> > 
> > > @@ -924,6 +1027,9 @@ rt_vcpu_sleep(const struct scheduler *ops,
> > > struct vcpu *vc)
> > >           __q_remove(svc);
> > >       else if ( svc->flags & RTDS_delayed_runq_add )
> > >           clear_bit(__RTDS_delayed_runq_add, &svc->flags);
> > > +
> > > +    if( __vcpu_on_replq(svc) )
> > > +        __replq_remove(ops, svc);
> > > 
> > Same here.
> So, what about __q_remove()? It looks like a couple of places are
> double checking as well.
>
Indeed they are. They're bugs. :-D

Not in the sense that they make the code explode, or misbehave, of
course, but I still dislike them rather passionately. If, after this
patch will be done and checked in, you guys would want to take a stab
at sanitizing that too, it would be awesome. :-)


> > > @@ -1051,6 +1153,18 @@ rt_vcpu_wake(const struct scheduler *ops,
> > > struct vcpu *vc)
> > >       else
> > >           SCHED_STAT_CRANK(vcpu_wake_not_runnable);
> > >   
> > > +    /* budget repl here is needed before inserting back to runq. If so,
> > > 
> > Comment style.
> > 
> > Also, what do you mean with that "If so"... "If so" what?
> I mean if the budget is replenished here, then re-insert back. I
> guess it's self-explanatory.
>
I still can't parse that. Maybe:

"If deadline passed while svc was asleep/blocked, we need new
scheduling parameters (a new deadline and full budget), and we also
need to schedule a new replenishment event, according to them."

But then again, I think we should try to enforce the fact that, if a
vcpu is not runnable, there's no pending replenishment for it, and make
it an invariant we can throw ASSERT() against.

We're already doing it for sleeps, after all...
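
In fact, if we manage to enforce that, the hunk below could become
something like this (a sketch, of course):

    if ( now >= svc->cur_deadline )
        rt_update_deadline(now, svc);

    /*
     * A vcpu that was not runnable can't have a replenishment event
     * queued: we removed it when the vcpu went to sleep (or blocked).
     */
    ASSERT( !__vcpu_on_replq(svc) );
    __replq_insert(ops, svc);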

> > > +     * it should be re-inserted back to the replenishment queue.
> > > +     */
> > > +    if ( now >= svc->cur_deadline)
> > > +    {
> > > +        rt_update_deadline(now, svc);
> > > +        __replq_remove(ops, svc);
> > > +    }
> > > +
> > > +    if( !__vcpu_on_replq(svc) )
> > > +        __replq_insert(ops, svc);
> > > +
> > And here I am again: is it really necessary to check whether svc is
> > not in the replenishment queue? It looks to me that it really
> > should not be there... but maybe it can, because we remove the
> > event from the queue when the vcpu sleeps, but *not* when the vcpu
> > blocks?
> Yeah. That is the case where I keep getting assertion failures if
> it's removed.
>
Which ASSERT() fails?

> I'm thinking when a vcpu unblocks, it could potentially fall through
> here.
>
Well, when unblocking, wake() is certainly called, yes.

> And like you said, mostly spurious sleep happens when a vcpu is
> running, and it could happen in other cases, although rare.
> 
I think I already said there's no such thing as a "spurious sleep". Or
at least, I can't think of anything that I would define as a spurious
sleep; if you do, please explain what situation you're referring to.

In any case, one way of dealing with vcpus blocking/offlining/etc.
could be to cancel their replenishment event with __replq_remove() in
context_saved(), in case we are not adding the vcpu back to the runq.

(This may make it possible to avoid doing it in rt_vcpu_sleep() too,
but you'll need to check and test.)
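
Something like this, I mean (a sketch, assuming rt_context_saved()
keeps its current overall structure):

    /* In rt_context_saved(), after clearing __RTDS_scheduled: */
    if ( test_and_clear_bit(__RTDS_delayed_runq_add, &svc->flags) &&
         likely(vcpu_runnable(vc)) )
        __runq_insert(ops, svc);
    else
        /* Not going back to the runq: cancel the replenishment event. */
        __replq_remove(ops, svc);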

Can you give this a try?

Thanks and Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

