|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v2 1/2] xen: fix a (latent) cpupool-related race during domain destroy
On Fri, 2016-07-15 at 12:36 +0200, Juergen Gross wrote:
> On 15/07/16 12:14, Dario Faggioli wrote:
> > In particular, I'm probably not fully understanding, from that
> > commit
> > changelog, what is the set of operations/command that I should run
> > to
> > check whether or not I reintroduced the issue back.
> You need to create a domain in a cpupool and destroy it again while
> some dom0 process still is holding a reference to it (resulting in a
> zombie domain). Then try to destroy the cpupool.
>
Ah, I see. I wasn't get the fact that it needed to be a zombie domain
from anywhere.
> > What am I missing?
> The domain being a zombie domain might change the picture. Moving it
> to
> cpupool0 was failing before my patch and it might do so again with
> your
> patch applied.
>
Mmmm... I don't immediately see the reason why moving a zombie domain
fails either, but I guess I'll have to try.
But then, correct me if I'm wrong, the situation is like this:
- right now there's a (potential) race between domain's scheduling
data destruction and domain removal from a cpupool;
- with my patch, the race goes away, but we risk not being able to
destroy a cpupool with a zombie domain in it.
I don't think we want to be in either of these two situations. :-(
The race is never triggering so far, but I've already patches to
Credit2 (finishing implementing soft affinity for it) that make it a
real thing.
I think I can work around that specific case by doing something like
the below:
diff --git a/xen/include/xen/sched-if.h b/xen/include/xen/sched-if.h
index bc0e794..d91fd70 100644
--- a/xen/include/xen/sched-if.h
+++ b/xen/include/xen/sched-if.h
@@ -193,12 +193,9 @@ struct cpupool
static inline cpumask_t* cpupool_domain_cpumask(struct domain *d)
{
- /*
- * d->cpupool is NULL only for the idle domain, and no one should
- * be interested in calling this for the idle domain.
- */
- ASSERT(d->cpupool != NULL);
- return d->cpupool->cpu_valid;
+ /* No one should be interested in calling this for the idle domain! */
+ ASSERT(!is_idle_domain(d));
+ return d->cpupool ? d->cpupool->cpu_valid : cpupool0->cpu_valid;
}
#endif /* __XEN_SCHED_IF_H__ */
But even if that would be acceptable (what do you think?), I still
don't like to have the race there lurking. :-/
Therefore, I still think this patch is correct, but I'm up for
investigating further and finding a way to solve the "zombie in
cpupool" issue as well.
Regards,
Dario
--
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
Attachment:
signature.asc _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx https://lists.xen.org/xen-devel
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |