Re: [Xen-devel] [PATCH v2] Modified RTDS scheduler to use an event-driven model instead of polling.
On Thu, 2015-07-02 at 22:12 -0700, Meng Xu wrote:
> Hi Dario,

Hi,

> [Since I have commented on this thread in a previous email, I'm just
> top-posting this as a reminder.]
>
> Just in case this email is off your radar... :-)

It is on my radar. It's just that I really had to drain my queue of
bugfixes. Sorry for the delay, I'm looking at it today.

Regards,
Dario

> I had discussed this patch with Dagaen, as shown in
> http://www.gossamer-threads.com/lists/xen/devel/386651
>
> You don't need any detailed comments on this patch. We only need your
> confirmation that it is going in the correct direction and that there
> is no "serious" design issue in this implementation.
>
> Once that is confirmed, we can send the next version soon, with the
> minor things (such as coding style) fixed.
>
> 2015-06-27 12:46 GMT-07:00 Dagaen Golomb <dgolomb@xxxxxxxxxxxxxx>:
> > To do this, we create a new list that holds, for each vcpu, the
> > earliest future time at which it may need to be rescheduled. The
> > scheduler picks the lowest time off this list and waits until that
> > time, instead of running every 1 ms as it did before.
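[Editorial sketch: the idea quoted above, arming one timer for the earliest pending event rather than waking every 1 ms, can be illustrated outside of Xen roughly as follows. The names `next_sched_event`, `vcpu_t`, and `MAX_SCHEDULE_NS` are illustrative only and are not part of the patch.]

```c
#include <stdint.h>

/* Fallback quantum when no event is pending; illustrative value (10 ms). */
#define MAX_SCHEDULE_NS 10000000LL

/* Each vcpu records the next absolute time it may need rescheduling. */
typedef struct {
    int64_t cur_deadline;
} vcpu_t;

/* Return the absolute time to arm a one-shot timer for: the earliest
 * deadline among the given vcpus, capped at now + MAX_SCHEDULE_NS.
 * This is the event-driven replacement for a fixed polling tick. */
static int64_t next_sched_event(const vcpu_t *vcpus, int n, int64_t now)
{
    int64_t earliest = now + MAX_SCHEDULE_NS;
    for (int i = 0; i < n; i++)
        if (vcpus[i].cur_deadline < earliest)
            earliest = vcpus[i].cur_deadline;
    return earliest;
}
```

With three vcpus whose deadlines are 5, 3, and 8 ms (in ns), the timer would be armed for the 3 ms deadline; with no vcpus queued it falls back to the full quantum.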
> >
> > Signed-off-by: Dagaen Golomb <dgolomb@xxxxxxxxxxxxxx>
> > Signed-off-by: Meng Xu <mengxu@xxxxxxxxxxxxx>
> > ---
> >  xen/common/sched_rt.c | 100 +++++++++++++++++++++++++++++++++++++++++++++----
> >  1 file changed, 93 insertions(+), 7 deletions(-)
> >
> > diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
> > index 4372486..cce3446 100644
> > --- a/xen/common/sched_rt.c
> > +++ b/xen/common/sched_rt.c
> > @@ -4,6 +4,7 @@
> >   *
> >   * by Sisu Xi, 2013, Washington University in Saint Louis
> >   * and Meng Xu, 2014, University of Pennsylvania
> > + * and Dagaen Golomb, 2015, University of Pennsylvania
> >   *
> >   * based on the code of credit Scheduler
> >   */
> > @@ -152,6 +153,8 @@ struct rt_private {
> >      struct list_head sdom;      /* list of availalbe domains, used for dump */
> >      struct list_head runq;      /* ordered list of runnable vcpus */
> >      struct list_head depletedq; /* unordered list of depleted vcpus */
> > +    struct timer replenishment_timer;
> > +    unsigned NUM_CPUS;
> >      cpumask_t tickled;          /* cpus been tickled */
> >  };
> >
> > @@ -178,6 +181,8 @@ struct rt_vcpu {
> >      unsigned flags;             /* mark __RTDS_scheduled, etc.. */
> >  };
> >
> > +static void runq_tickle(const struct scheduler *, struct rt_vcpu *);
> > +
> >  /*
> >   * Domain
> >   */
> > @@ -391,13 +396,19 @@ __q_remove(struct rt_vcpu *svc)
> >   * Insert svc with budget in RunQ according to EDF:
> >   * vcpus with smaller deadlines go first.
> >   * Insert svc without budget in DepletedQ unsorted;
> > + *
> > + * Returns the position, 1-indexed, of where the item was inserted.
> > + * Returns a positive index if placed on the runq, and a negative index
> > + * (the index in the depletedq * -1) if placed on the depletedq.
> >   */
> > -static void
> > +static int
> >  __runq_insert(const struct scheduler *ops, struct rt_vcpu *svc)
> >  {
> >      struct rt_private *prv = rt_priv(ops);
> >      struct list_head *runq = rt_runq(ops);
> > +    struct list_head *depletedq = rt_depletedq(ops);
> >      struct list_head *iter;
> > +    unsigned inserted_index = 1;
> >
> >      ASSERT( spin_is_locked(&prv->lock) );
> >
> > @@ -411,15 +422,77 @@ __runq_insert(const struct scheduler *ops, struct rt_vcpu *svc)
> >              struct rt_vcpu * iter_svc = __q_elem(iter);
> >              if ( svc->cur_deadline <= iter_svc->cur_deadline )
> >                  break;
> > +            else
> > +                ++inserted_index;
> >          }
> >          list_add_tail(&svc->q_elem, iter);
> > +        return inserted_index;
> >      }
> >      else
> >      {
> > -        list_add(&svc->q_elem, &prv->depletedq);
> > +        // If we are inserting into a previously empty depletedq, rearm the
> > +        // replenishment timer. It will handle further scheduling until
> > +        // the depletedq is empty again (if ever)
> > +        if( list_empty(depletedq) )
> > +            set_timer( &prv->replenishment_timer, svc->cur_deadline );
> > +
> > +        list_for_each(iter, depletedq)
> > +        {
> > +            struct rt_vcpu * iter_svc = __q_elem(iter);
> > +            if ( svc->cur_deadline <= iter_svc->cur_deadline )
> > +                break;
> > +            else
> > +                ++inserted_index;
> > +        }
> > +        list_add_tail(&svc->q_elem, iter);
> > +        return -inserted_index;
> >      }
> >  }
> >
> > +static void replenishment_handler(void* data)
> > +{
> > +    const struct scheduler *ops = data;
> > +    struct rt_private *prv = rt_priv(ops);
> > +    struct list_head *depletedq = rt_depletedq(ops);
> > +    struct list_head *iter;
> > +    struct list_head *tmp;
> > +    struct rt_vcpu *svc = NULL;
> > +    s_time_t now = NOW();
> > +    int new_position = 0;
> > +    unsigned long flags;
> > +
> > +    spin_lock_irqsave(&prv->lock, flags);
> > +
> > +    // Replenish the vCPU that triggered this, and resort it into runq
> > +    list_for_each_safe(iter, tmp, depletedq)
> > +    {
> > +        svc = __q_elem(iter);
> > +        if ( now >= svc->cur_deadline )
> > +        {
> > +            rt_update_deadline(now, svc);
> > +            __q_remove(svc); /* remove from depleted queue */
> > +            new_position = __runq_insert(ops, svc); /* add to runq */
> > +        }
> > +        else break; // leave list_for_each_safe
> > +    }
> > +
> > +    // If we became one of the top [# CPUs] in the runq, tickle it
> > +    // TODO: make this work when multiple tickles are required
> > +    if ( new_position > 0 && new_position <= prv->NUM_CPUS )
> > +        runq_tickle(ops, svc);
> > +
> > +    // Use first item on depletedq to schedule next replenishment.
> > +    // If no replenishments are pending, disable timer for now
> > +    if( !list_empty(depletedq) )
> > +    {
> > +        struct rt_vcpu * soonest_to_replenish = __q_elem(depletedq->next);
> > +        set_timer( &prv->replenishment_timer,
> > +                   soonest_to_replenish->cur_deadline );
> > +    }
> > +
> > +    spin_unlock_irqrestore(&prv->lock, flags);
> > +}
> > +
> >  /*
> >   * Init/Free related code
> >   */
> > @@ -427,6 +500,7 @@ static int
> >  rt_init(struct scheduler *ops)
> >  {
> >      struct rt_private *prv = xzalloc(struct rt_private);
> > +    int cpu = 0;
> >
> >      printk("Initializing RTDS scheduler\n"
> >             "WARNING: This is experimental software in development.\n"
> > @@ -450,6 +524,15 @@ rt_init(struct scheduler *ops)
> >      INIT_LIST_HEAD(&prv->runq);
> >      INIT_LIST_HEAD(&prv->depletedq);
> >
> > +    // Replenishment timer, for now always on CPU 0
> > +    init_timer(&prv->replenishment_timer, replenishment_handler, ops, 0);
> > +    // Calculate number of pCPUs present.
> > +    // Note: assumes this does not change while online.
> > +    for_each_present_cpu(cpu)
> > +    {
> > +        ++(prv->NUM_CPUS);
> > +    }
> > +
> >      cpumask_clear(&prv->tickled);
> >
> >      ops->sched_data = prv;
> > @@ -466,6 +549,8 @@ rt_deinit(const struct scheduler *ops)
> >  {
> >      struct rt_private *prv = rt_priv(ops);
> >
> > +    kill_timer(&prv->replenishment_timer);
> > +
> >      ASSERT( _cpumask_scratch && nr_rt_ops > 0 );
> >
> >      if ( (--nr_rt_ops) == 0 )
> > @@ -653,8 +738,7 @@ rt_vcpu_remove(const struct scheduler *ops, struct vcpu *vc)
> >      BUG_ON( sdom == NULL );
> >
> >      lock = vcpu_schedule_lock_irq(vc);
> > -    if ( __vcpu_on_q(svc) )
> > -        __q_remove(svc);
> > +    __q_remove(svc);
> >      vcpu_schedule_unlock_irq(lock, vc);
> >
> >      if ( !is_idle_vcpu(vc) )
> > @@ -838,6 +922,7 @@ rt_schedule(const struct scheduler *ops, s_time_t now, bool_t tasklet_work_sched
> >  {
> >      const int cpu = smp_processor_id();
> >      struct rt_private *prv = rt_priv(ops);
> > +    struct list_head *runq = rt_runq(ops);
> >      struct rt_vcpu *const scurr = rt_vcpu(current);
> >      struct rt_vcpu *snext = NULL;
> >      struct task_slice ret = { .migrated = 0 };
> > @@ -848,8 +933,6 @@ rt_schedule(const struct scheduler *ops, s_time_t now, bool_t tasklet_work_sched
> >      /* burn_budget would return for IDLE VCPU */
> >      burn_budget(ops, scurr, now);
> >
> > -    __repl_update(ops, now);
> > -
> >      if ( tasklet_work_scheduled )
> >      {
> >          snext = rt_vcpu(idle_vcpu[cpu]);
> > @@ -889,7 +972,10 @@ rt_schedule(const struct scheduler *ops, s_time_t now, bool_t tasklet_work_sched
> >          }
> >      }
> >
> > -    ret.time = MIN(snext->budget, MAX_SCHEDULE); /* sched quantum */
> > +    /* Use minimum from runq */
> > +    ret.time = MAX_SCHEDULE;
> > +    if( !list_empty(runq) )
> > +        ret.time = __q_elem(runq->next)->cur_deadline;
> >      ret.task = snext->vcpu;
> >
> >      /* TRACE */
> > --
> > 1.7.9.5
> >
> >
> > Thanks,
> > Meng
>
> -----------
> Meng Xu
> PhD Student in Computer and Information Science
> University of Pennsylvania
> http://www.cis.upenn.edu/~mengxu/
>
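[Editorial sketch: the 1-indexed insert position that the patched `__runq_insert()` returns, which the replenishment handler compares against `NUM_CPUS` to decide whether to tickle a pCPU, behaves like this simplified array-based version. The function `edf_insert_pos` and the array representation are illustrative only; the real queue is a linked list.]

```c
#include <stdint.h>

/* Insert deadline d into the sorted array dl (EDF order: smallest
 * deadline first) and return the 1-indexed position where it landed,
 * mirroring inserted_index in the patched __runq_insert(). A result
 * <= NUM_CPUS would mean the vcpu is among the earliest-deadline
 * candidates and a tickle may be needed. */
static int edf_insert_pos(int64_t *dl, int *n, int64_t d)
{
    int pos = 0;
    /* Stop at the first entry with a deadline >= d, matching the
     * patch's "svc->cur_deadline <= iter_svc->cur_deadline" break. */
    while (pos < *n && dl[pos] < d)
        pos++;
    for (int i = *n; i > pos; i--) /* shift later entries right */
        dl[i] = dl[i - 1];
    dl[pos] = d;
    (*n)++;
    return pos + 1; /* 1-indexed, like the patch's inserted_index */
}
```

Inserting deadline 4 into {2, 5, 9} yields position 2 and the array {2, 4, 5, 9}; a new earliest deadline always yields position 1.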
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxx
> http://lists.xen.org/xen-devel

--
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

Attachment: signature.asc