Re: [Xen-devel] [PATCH v3 1/4] sched: credit2: respect per-vcpu hard affinity
On 03/26/2015 09:48 AM, Justin T. Weaver wrote:
> by making sure that vcpus only run on the pcpu(s) they are allowed to
> run on based on their hard affinity cpu masks.
>
> Signed-off-by: Justin T. Weaver <jtweaver@xxxxxxxxxx>
Hey Justin! Getting close. A couple of comments:
> diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
> index 7581731..af716e4 100644
> --- a/xen/common/sched_credit2.c
> +++ b/xen/common/sched_credit2.c
> @@ -176,6 +176,7 @@ integer_param("sched_credit2_migrate_resist",
> opt_migrate_resist);
> #define c2r(_ops, _cpu) (CSCHED2_PRIV(_ops)->runq_map[(_cpu)])
> /* CPU to runqueue struct macro */
> #define RQD(_ops, _cpu) (&CSCHED2_PRIV(_ops)->rqd[c2r(_ops, _cpu)])
> +#define VCPU2ONLINE(_v) cpupool_online_cpumask((_v)->domain->cpupool)
>
> /*
> * Shifts for load average.
> @@ -194,6 +195,12 @@ int opt_overload_balance_tolerance=-3;
> integer_param("credit2_balance_over", opt_overload_balance_tolerance);
>
> /*
> + * Use this to avoid having too many cpumask_t structs on the stack
> + */
> +static cpumask_t **scratch_mask = NULL;
> +#define csched2_cpumask scratch_mask[smp_processor_id()]
Since I'm asking for changes below: when I'm scanning the code, it's not
clear that csched2_cpumask is a scratch variable. What about calling the
actual variable _scratch_cpumask and having the #define be
scratch_cpumask?
> /* Find the runqueue with the lowest instantaneous load */
> for_each_cpu(i, &prv->active_queues)
> {
> struct csched2_runqueue_data *rqd;
> - s_time_t rqd_avgload;
> + s_time_t rqd_avgload = MAX_LOAD;
>
> rqd = prv->rqd + i;
>
> /* If checking a different runqueue, grab the lock,
> - * read the avg, and then release the lock.
> + * check hard affinity, read the avg, and then release the lock.
> *
> * If on our own runqueue, don't grab or release the lock;
> * but subtract our own load from the runqueue load to simulate
> - * impartiality */
> + * impartiality.
> + *
> + * svc's hard affinity may have changed; this function is the
> + * credit 2 scheduler's first opportunity to react to the change,
> + * so it is possible here that svc does not have hard affinity
> + * with any of the pcpus of svc's currently assigned run queue.
> + */
> if ( rqd == svc->rqd )
> {
> - rqd_avgload = rqd->b_avgload - svc->avgload;
> + if ( cpumask_intersects(vc->cpu_hard_affinity, &rqd->active) )
> + rqd_avgload = rqd->b_avgload - svc->avgload;
> }
> else if ( spin_trylock(&rqd->lock) )
> {
> - rqd_avgload = rqd->b_avgload;
> + if ( cpumask_intersects(vc->cpu_hard_affinity, &rqd->active) )
> + rqd_avgload = rqd->b_avgload;
> +
> spin_unlock(&rqd->lock);
> }
> else
Since we're already potentially falling through and doing the comparison
below with an unchanged rqd_avgload (which has now been set to MAX_LOAD
above), I wonder if it makes more sense to remove this "else continue"
here, just to avoid confusing people.
> @@ -2024,6 +2096,13 @@ csched2_alloc_pdata(const struct scheduler *ops, int
> cpu)
> printk("%s: cpu %d not online yet, deferring initializatgion\n",
> __func__, cpu);
>
> + /*
> + * For each new pcpu, allocate a cpumask_t for use throughout the
> + * scheduler to avoid putting any cpumask_t structs on the stack.
> + */
> + if ( !zalloc_cpumask_var(&scratch_mask[cpu]) )
Any reason not to use "scratch_mask + cpu" here rather than
"&scratch_mask[cpu]"?
It might not be a bad idea to add ASSERT(scratch_mask[cpu] == NULL)
before this, just to be paranoid...
> + return NULL;
> +
> return (void *)1;
> }
>
> @@ -2072,6 +2151,8 @@ csched2_free_pdata(const struct scheduler *ops, void
> *pcpu, int cpu)
>
> spin_unlock_irqrestore(&prv->lock, flags);
>
> + free_cpumask_var(scratch_mask[cpu]);
We should definitely set this to NULL after freeing it (see below)
> +
> return;
> }
>
> @@ -2159,6 +2240,10 @@ csched2_init(struct scheduler *ops)
>
> prv->load_window_shift = opt_load_window_shift;
>
> + scratch_mask = xmalloc_array(cpumask_t *, nr_cpu_ids);
I realize Dario recommended using xmalloc_array() instead of
xzalloc_array(), but I don't understand why he thinks that's OK. His
mail says "(see below about why I actually don't
think we need)", but I don't actually see that addressed in that e-mail.
I think it's just dangerous to leave uninitialized pointers around. The
invariant should be that if the array entry is invalid it's NULL, and if
it's non-null then it's valid.
(* marc.info/?i=<1426266674.32500.25.camel@xxxxxxxxxx>)
Also -- I think this allocation wants to happen in global_init(), right?
Otherwise if you make a second cpupool with the credit2 scheduler this
will be clobbered. (I think nr_cpu_ids should be defined at that point.)
-George
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel