[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [pli PATCH bla] Xen: credit2: avoid picking a spurious idle unit when caps are used
Err... of course, the "pli" and "bla" stuff between the [] are a leftover of some experiments that I had to do with `stg email`, due to mail handling changes, and should really not have been there in this email... Sorry. :-/ On Tue, 2021-08-03 at 19:36 +0200, Dario Faggioli wrote: > Commit 07b0eb5d0ef0 ("credit2: make sure we pick a runnable unit from > the > runq if there is one") did not fix completely the problem of > potentially > selecting a scheduling unit that will then not be able to run. > > In fact, in case caps are used and the unit we are currently looking > at, during the runqueue scan, does not have budget to be executed, we > should continue looking instead than giving up and picking the idle > unit. > > Signed-off-by: Dario Faggioli <dfaggioli@xxxxxxxx> > Suggested-by: George Dunlap <george.dunlap@xxxxxxxxxx> > Cc: George Dunlap <george.dunlap@xxxxxxxxxx> > Cc: Jan Beulich <jbeulich@xxxxxxxx> > --- > This is necessary to completely fix the bug that was described in and > addressed by 07b0eb5d0ef0 ("credit2: make sure we pick a runnable > unit > from the runq if there is one"). > > It should, therefore, be backported and applied to all the branches > to > which that commit has been. About backports, it should be > straigthforward to do that until 4.13. > > For 4.12 and earlier, it's trickier, but the fix is still necessary. > Actually, both 07b0eb5d0ef0 and this patch should be backported to > that > branch! > > I will provide the backports myself in a email that I'll send as a > reply to this one. > > Regards, > Dario > --- > xen/common/sched/credit2.c | 85 +++++++++++++++++++++++++--------- > ---------- > 1 file changed, 49 insertions(+), 36 deletions(-) > > diff --git a/xen/common/sched/credit2.c b/xen/common/sched/credit2.c > index ebb09ea43a..f9b95db313 100644 > --- a/xen/common/sched/credit2.c > +++ b/xen/common/sched/credit2.c > @@ -3463,48 +3463,61 @@ runq_candidate(struct csched2_runqueue_data > *rqd, > (unsigned char *)&d); > } > > - /* Skip non runnable units that we (temporarily) have in the > runq */ > - if ( unlikely(!unit_runnable_state(svc->unit)) ) > - continue; > - > - /* Only consider vcpus that are allowed to run on this > processor. */ > - if ( !cpumask_test_cpu(cpu, svc->unit->cpu_hard_affinity) ) > - continue; > - > /* > - * If an unit is meant to be picked up by another processor, > and such > - * processor has not scheduled yet, leave it in the runqueue > for him. > + * If the unit in the runqueue has more credit than current > (or than > + * idle, if current is not runnable) or if current is > yielding, we may > + * want to pick it up. > */ > - if ( svc->tickled_cpu != -1 && svc->tickled_cpu != cpu && > - cpumask_test_cpu(svc->tickled_cpu, &rqd->tickled) ) > + if ( (yield || svc->credit > snext->credit) ) > { > - SCHED_STAT_CRANK(deferred_to_tickled_cpu); > - continue; > - } > + /* Skip non runnable units that we (temporarily) have in > the runq */ > + if ( unlikely(!unit_runnable_state(svc->unit)) ) > + continue; > > - /* > - * If this is on a different processor, don't pull it unless > - * its credit is at least CSCHED2_MIGRATE_RESIST higher. > - */ > - if ( sched_unit_master(svc->unit) != cpu > - && snext->credit + CSCHED2_MIGRATE_RESIST > svc->credit > ) > - { > - SCHED_STAT_CRANK(migrate_resisted); > - continue; > - } > + /* Only consider vcpus that are allowed to run on this > processor. */ > + if ( !cpumask_test_cpu(cpu, svc->unit- > >cpu_hard_affinity) ) > + continue; > > - /* > - * If the one in the runqueue has more credit than current > (or idle, > - * if current is not runnable), or if current is yielding, > and also > - * if the one in runqueue either is not capped, or is capped > but has > - * some budget, then choose it. > - */ > - if ( (yield || svc->credit > snext->credit) && > - (!has_cap(svc) || unit_grab_budget(svc)) ) > - snext = svc; > + /* > + * If an unit is meant to be picked up by another > processor, and such > + * processor has not scheduled yet, leave it in the > runqueue for him. > + */ > + if ( svc->tickled_cpu != -1 && svc->tickled_cpu != cpu > && > + cpumask_test_cpu(svc->tickled_cpu, &rqd->tickled) ) > + { > + SCHED_STAT_CRANK(deferred_to_tickled_cpu); > + continue; > + } > > - /* In any case, if we got this far, break. */ > - break; > + /* > + * If this is on a different processor, don't pull it > unless > + * its credit is at least CSCHED2_MIGRATE_RESIST higher. > + */ > + if ( sched_unit_master(svc->unit) != cpu > + && snext->credit + CSCHED2_MIGRATE_RESIST > svc- > >credit ) > + { > + SCHED_STAT_CRANK(migrate_resisted); > + continue; > + } > + > + /* > + * If we are here, we are almost sure we want to pick > the unit in > + * the runqueue. Last thing we need to check is that it > either is > + * is not capped, or if it is, it has some budget. > + * > + * Note that cap & budget should really be the last > thing we do > + * check. In fact, unit_grab_budget() will reserve some > budget for > + * this unit, from the per-domain budget pool, and we > want to do > + * that only if we are sure that we'll then run the > unit, consume > + * some of it, and return the leftover (if any) in the > usual way. > + */ > + if ( has_cap(svc) && !unit_grab_budget(svc) ) > + continue; > + > + /* If we got this far, we are done. */ > + snext = svc; > + break; > + } > } > > if ( unlikely(tb_init_done) ) > > > -- Dario Faggioli, Ph.D http://about.me/dario.faggioli Virtualization Software Engineer SUSE Labs, SUSE https://www.suse.com/ ------------------------------------------------------------------- <<This happens because _I_ choose it to happen!>> (Raistlin Majere) Attachment:
signature.asc
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |