Re: [Xen-devel] [PATCH v3] xen: credit1: fix a race when picking initial pCPU for a vCPU
On 19/08/16 17:26, Dario Faggioli wrote:
> In the Credit1 hunk of 9f358ddd69463 ("xen: Have
> schedulers revise initial placement"), csched_cpu_pick()
> is called without taking the runqueue lock of the
> (temporary) pCPU that the vCPU has been assigned to
> (e.g., in XEN_DOMCTL_max_vcpus).
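
To spell out the path (my reconstruction, so the exact chain may differ
slightly between trees): the unlocked peek happens on the vCPU creation
path, where the runq being inspected belongs to a pCPU we are most likely
not even running on:

    do_domctl(XEN_DOMCTL_max_vcpus)
      alloc_vcpu()
        sched_init_vcpu()                /* vc->processor = temporary pCPU */
          insert_vcpu hook -> csched_vcpu_insert()
            csched_cpu_pick()
              _csched_cpu_pick()
                IS_RUNQ_IDLE(vc->processor)  /* peeks at that pCPU's runq,
                                                without its runq lock      */
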
>
> However, although 'hidden' in the IS_RUNQ_IDLE() macro,
> that function does access the runq (for doing load
> balancing calculations). Two scenarios are possible:
> 1) we are on cpu X, and IS_RUNQ_IDLE() peeks at cpu X's
> own runq;
> 2) we are on cpu X, but IS_RUNQ_IDLE() peeks at some
> other cpu's runq.
>
> Scenario 2) absolutely requires that the appropriate
> runq lock is taken. Scenario 1) works even without
> taking the cpu's own runq lock. That is actually what
> happens when _csched_cpu_pick() is called from
> csched_vcpu_acct() (in turn, called by csched_tick()).
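
In other words, for scenario 2) a caller would have to grab the remote
pCPU's runq lock around the peek. A minimal sketch of what that looks
like (the helper name is made up, and I'm assuming the
pcpu_schedule_lock_irq() / pcpu_schedule_unlock_irq() helpers from
xen/include/xen/sched-if.h, which take and release
per_cpu(schedule_data, cpu).schedule_lock):

    /* Sketch only, not part of the patch: safely peeking at another
     * pCPU's runq (scenario 2). */
    static bool_t probe_remote_runq_idle(unsigned int cpu)
    {
        spinlock_t *lock = pcpu_schedule_lock_irq(cpu);
        bool_t idle = is_runq_idle(cpu);   /* the new ASSERT() is satisfied */

        pcpu_schedule_unlock_irq(lock, cpu);

        return idle;
    }
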
>
> Races have been observed and reported (by both XenServer's
> own testing and OSSTest [1]), in the form of
> IS_RUNQ_IDLE() falling over LIST_POISON, because we're
> not currently holding the proper lock, in
> csched_vcpu_insert(), when scenario 2) occurs.
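
FWIW, the LIST_POISON symptom matches what list_del() does: IIRC, in
xen/include/xen/list.h (same scheme as Linux), removing an element
poisons its pointers, which is exactly what an unlocked reader can then
stumble over if it races with a dequeue done, under the lock, by another
CPU:

    /* Paraphrased from xen/include/xen/list.h, from memory: */
    static inline void list_del(struct list_head *entry)
    {
        __list_del(entry->prev, entry->next);
        entry->next = LIST_POISON1;   /* e.g. (void *)0x00100100 */
        entry->prev = LIST_POISON2;   /* e.g. (void *)0x00200200 */
    }
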
>
> However, for better robustness, from now on we always
> ask for the proper runq lock to be held when calling
> IS_RUNQ_IDLE() (which is also becoming a static inline
> function instead of a macro).
>
> In order to comply with that, we take the lock around
> the call to _csched_cpu_pick() in csched_vcpu_acct().
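
And, for completeness, why that satisfies the new ASSERT() (assuming, as
I believe is the case, that Credit1 keeps the default per-pCPU schedule
locks, i.e. no lock remapping): csched_vcpu_acct() runs on 'cpu' and acts
on the currently running vCPU, so the lock taken is precisely the one the
assertion checks:

    /* Sketch of the invariant relied upon in csched_vcpu_acct(). */
    spinlock_t *lock = vcpu_schedule_lock_irqsave(current, &flags);
    /* == per_cpu(schedule_data, current->processor).schedule_lock,
     * and current->processor == cpu here, so is_runq_idle(cpu)'s
     *   ASSERT(spin_is_locked(per_cpu(schedule_data, cpu).schedule_lock));
     * holds while we call _csched_cpu_pick(). */
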
>
> [1] https://lists.xen.org/archives/html/xen-devel/2016-08/msg02144.html
>
> Reported-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> Signed-off-by: Dario Faggioli <dario.faggioli@xxxxxxxxxx>
Reviewed-by: George Dunlap <george.dunlap@xxxxxxxxxx>
> ---
> Cc: George Dunlap <george.dunlap@xxxxxxxxxxxxx>
> Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> Cc: Jan Beulich <JBeulich@xxxxxxxx>
> ---
> Changes from v2:
> - actually, take the runq lock around the call to _csched_cpu_pick() in
> csched_vcpu_acct(), as considered better by George during review.
>
> Changes from v1:
> - macro IS_RUNQ_IDLE() to static inline is_runq_idle(), as suggested
> during review;
> - add an ASSERT() and a comment, as suggested during review;
> - take into account what's described in the changelog as "scenario 1)",
> which wasn't being considered in v1.
> ---
> xen/common/sched_credit.c | 53 +++++++++++++++++++++++++++++++++------------
> 1 file changed, 39 insertions(+), 14 deletions(-)
>
> diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
> index 220ff0d..c2b4b24 100644
> --- a/xen/common/sched_credit.c
> +++ b/xen/common/sched_credit.c
> @@ -84,9 +84,6 @@
> #define CSCHED_VCPU(_vcpu) ((struct csched_vcpu *) (_vcpu)->sched_priv)
> #define CSCHED_DOM(_dom) ((struct csched_dom *) (_dom)->sched_priv)
> #define RUNQ(_cpu) (&(CSCHED_PCPU(_cpu)->runq))
> -/* Is the first element of _cpu's runq its idle vcpu? */
> -#define IS_RUNQ_IDLE(_cpu) (list_empty(RUNQ(_cpu)) || \
> -                            is_idle_vcpu(__runq_elem(RUNQ(_cpu)->next)->vcpu))
>
>
> /*
> @@ -248,6 +245,18 @@ __runq_elem(struct list_head *elem)
> return list_entry(elem, struct csched_vcpu, runq_elem);
> }
>
> +/* Is the first element of cpu's runq (if any) cpu's idle vcpu? */
> +static inline bool_t is_runq_idle(unsigned int cpu)
> +{
> + /*
> + * We're peeking at cpu's runq, we must hold the proper lock.
> + */
> + ASSERT(spin_is_locked(per_cpu(schedule_data, cpu).schedule_lock));
> +
> + return list_empty(RUNQ(cpu)) ||
> + is_idle_vcpu(__runq_elem(RUNQ(cpu)->next)->vcpu);
> +}
> +
> static inline void
> __runq_insert(struct csched_vcpu *svc)
> {
> @@ -771,7 +780,7 @@ _csched_cpu_pick(const struct scheduler *ops, struct vcpu *vc, bool_t commit)
> * runnable vcpu on cpu, we add cpu to the idlers.
> */
> cpumask_and(&idlers, &cpu_online_map, CSCHED_PRIV(ops)->idlers);
> - if ( vc->processor == cpu && IS_RUNQ_IDLE(cpu) )
> + if ( vc->processor == cpu && is_runq_idle(cpu) )
> __cpumask_set_cpu(cpu, &idlers);
> cpumask_and(&cpus, &cpus, &idlers);
>
> @@ -951,21 +960,33 @@ csched_vcpu_acct(struct csched_private *prv, unsigned int cpu)
> /*
> * Put this VCPU and domain back on the active list if it was
> * idling.
> - *
> - * If it's been active a while, check if we'd be better off
> - * migrating it to run elsewhere (see multi-core and multi-thread
> - * support in csched_cpu_pick()).
> */
> if ( list_empty(&svc->active_vcpu_elem) )
> {
> __csched_vcpu_acct_start(prv, svc);
> }
> - else if ( _csched_cpu_pick(ops, current, 0) != cpu )
> + else
> {
> - SCHED_VCPU_STAT_CRANK(svc, migrate_r);
> - SCHED_STAT_CRANK(migrate_running);
> - set_bit(_VPF_migrating, &current->pause_flags);
> - cpu_raise_softirq(cpu, SCHEDULE_SOFTIRQ);
> + unsigned int new_cpu;
> + unsigned long flags;
> + spinlock_t *lock = vcpu_schedule_lock_irqsave(current, &flags);
> +
> + /*
> + * If it's been active a while, check if we'd be better off
> + * migrating it to run elsewhere (see multi-core and multi-thread
> + * support in csched_cpu_pick()).
> + */
> + new_cpu = _csched_cpu_pick(ops, current, 0);
> +
> + vcpu_schedule_unlock_irqrestore(lock, flags, current);
> +
> + if ( new_cpu != cpu )
> + {
> + SCHED_VCPU_STAT_CRANK(svc, migrate_r);
> + SCHED_STAT_CRANK(migrate_running);
> + set_bit(_VPF_migrating, &current->pause_flags);
> + cpu_raise_softirq(cpu, SCHEDULE_SOFTIRQ);
> + }
> }
> }
>
> @@ -998,9 +1019,13 @@ csched_vcpu_insert(const struct scheduler *ops, struct vcpu *vc)
>
> BUG_ON( is_idle_vcpu(vc) );
>
> - /* This is safe because vc isn't yet being scheduled */
> + /* csched_cpu_pick() looks in vc->processor's runq, so we need the lock. */
> + lock = vcpu_schedule_lock_irq(vc);
> +
> vc->processor = csched_cpu_pick(ops, vc);
>
> + spin_unlock_irq(lock);
> +
> lock = vcpu_schedule_lock_irq(vc);
>
> if ( !__vcpu_on_runq(svc) && vcpu_runnable(vc) && !vc->is_running )
>
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel