
Re: [Xen-devel] [PATCH v2 1/2] xen: fix a (latent) cpupool-related race during domain destroy



On 14/07/16 18:18, Dario Faggioli wrote:
> So, during domain destruction, we do:
>  cpupool_rm_domain()    [ in domain_destroy() ]
>  sched_destroy_domain() [ in complete_domain_destroy() ]
> 
> Therefore, there's a window during which, from the
> scheduler's point of view, a domain still sits outside
> of any cpupool.
> 
> In fact, cpupool_rm_domain() does d->cpupool=NULL,
> and we don't allow that to hold true for anything
> but the idle domain (and there are, in fact, ASSERT()s
> and BUG_ON()s to that effect).
> 
> Currently, we never really check d->cpupool during the
> window, but that does not mean the race is not there.
> For instance, Credit2 at some point (during load balancing)
> iterates over the list of domains, and if we add logic that
> needs to check d->cpupool, and any one of the domains has
> already had cpupool_rm_domain() called on it... Boom!
> 
> (In fact, calling __vcpu_has_soft_affinity() from inside
> balance_load() makes `xl shutdown <domid>' reliably
> crash, and this is how I discovered this.)
> 
> On the other hand, cpupool_rm_domain() "only" does
> cpupool related bookkeeping, and there's no harm
> postponing it a little bit.
> 
> Also, considering that, during domain initialization,
> we do:
>  cpupool_add_domain()
>  sched_init_domain()
> 
> It makes sense for the destruction path to look like
> the opposite of it, i.e.:
>  sched_destroy_domain()
>  cpupool_rm_domain()
> 
> And hence that's what this patch does.
> 
> Actually, for better robustness, what we really do is
> move both cpupool_add_domain() and cpupool_rm_domain()
> inside sched_init_domain() and sched_destroy_domain(),
> respectively (and also add a couple of ASSERT()s).
> 
> Signed-off-by: Dario Faggioli <dario.faggioli@xxxxxxxxxx>
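For reference, the reordering described in the quoted patch can be sketched roughly as below. This is a simplified, illustrative stand-in, not the actual Xen code: the real functions take different arguments and perform real scheduler work; only the ordering of the cpupool bookkeeping relative to scheduler init/teardown is the point.

```c
/*
 * Illustrative sketch of the ordering fix: the cpupool calls are moved
 * inside the scheduler init/destroy functions, so d->cpupool can never
 * be NULL while the scheduler still knows about a non-idle domain.
 * All bodies here are simplified stand-ins for the Xen originals.
 */
#include <assert.h>
#include <stddef.h>

struct cpupool { int id; };
struct domain  { struct cpupool *cpupool; int is_idle; };

static struct cpupool pool0 = { .id = 0 };

/* Bookkeeping only: attach/detach the domain to/from a pool. */
static void cpupool_add_domain(struct domain *d, struct cpupool *c)
{
    d->cpupool = c;
}

static void cpupool_rm_domain(struct domain *d)
{
    d->cpupool = NULL;
}

static void sched_init_domain(struct domain *d)
{
    cpupool_add_domain(d, &pool0);   /* now first, before scheduler setup */
    assert(d->cpupool != NULL || d->is_idle);
    /* ... scheduler initialization would happen here ... */
}

static void sched_destroy_domain(struct domain *d)
{
    assert(d->cpupool != NULL || d->is_idle);
    /* ... scheduler teardown would happen here ... */
    cpupool_rm_domain(d);            /* now last, mirroring the init path */
}
```

With this shape, any scheduler code running between init and destroy (e.g. load balancing walking the domain list) can rely on d->cpupool being non-NULL for non-idle domains, which is exactly the invariant the crash in balance_load() violated.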

Hmm, are you aware of commit bac6334b51d9bcfe57ecf4a4cb5288348fcf044a,
which explicitly moved cpupool_rm_domain() to the very place from which
you are now removing it again? Please verify that the scenario mentioned
in the description of that commit still works with your patch.


Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
