Re: [Xen-devel] [PATCH v2 1/2] xen: fix a (latent) cpupool-related race during domain destroy
On 15/07/16 12:14, Dario Faggioli wrote:
> On Fri, 2016-07-15 at 11:38 +0200, Juergen Gross wrote:
>> Hmm, are you aware of commit bac6334b51d9bcfe57ecf4a4cb5288348fcf044a
>> which explicitly moved cpupool_rm_domain() at the place where you are
>> removing it again? Please verify that the scenario mentioned in the
>> description of that commit is still working with your patch.
>>
> Sorry, but I only partly see the problem.
>
> In particular, I'm probably not fully understanding, from that commit
> changelog, what is the set of operations/commands that I should run to
> check whether or not I reintroduced the issue.

You need to create a domain in a cpupool and destroy it again while
some dom0 process is still holding a reference to it (resulting in a
zombie domain). Then try to destroy the cpupool.

>
> What I did so far is as follows:
>
> root@Zhaman:~# xl cpupool-list
> Name               CPUs   Sched     Active   Domain count
> Pool-0              12    credit       y          1
> Pool-credit          4    credit       y          1
> root@Zhaman:~# xl list -c
> Name          ID   Mem VCPUs  State   Time(s)  Cpupool
> Domain-0       0  1019    16  r-----     34.5  Pool-0
> vm1            1  4096     4  -b----      9.7  Pool-credit
> root@Zhaman:~# xl cpupool-cpu-remove Pool-credit all
> libxl: error: libxl.c:6998:libxl_cpupool_cpuremove: Error removing cpu 9 from cpupool: Device or resource busy
> Some cpus may have not or only partially been removed from 'Pool-credit'.
> If a cpu can't be added to another cpupool, add it to 'Pool-credit' again and retry.
> root@Zhaman:~# xl cpupool-list -c
> Name               CPU list
> Pool-0             0,1,2,3,4,5,10,11,12,13,14,15
> Pool-credit        9
> root@Zhaman:~# xl shutdown vm1
> Shutting down domain 1
> root@Zhaman:~# xl cpupool-cpu-remove Pool-credit all
> root@Zhaman:~# xl cpupool-list -c
> Name               CPU list
> Pool-0             0,1,2,3,4,5,10,11,12,13,14,15
> Pool-credit
>
> If (with vm1 still in Pool-credit), I do this, it indeed fails:
>
> root@Zhaman:~# xl shutdown vm1 & xl cpupool-cpu-remove Pool-credit all
> [1] 3275
> Shutting down domain 2
> libxl: error: libxl.c:6998:libxl_cpupool_cpuremove: Error removing cpu 9 from cpupool: Device or resource busy
> Some cpus may have not or only partially been removed from 'Pool-credit'.
> If a cpu can't be added to another cpupool, add it to 'Pool-credit' again and retry.
> [1]+  Done                    xl shutdown vm1
> root@Zhaman:~# xl cpupool-list -c
> Name               CPU list
> Pool-0             0,1,2,3,4,5,10,11,12,13,14,15
> Pool-credit        9
>
> But that does not look too strange to me, as it's entirely possible
> that the domain has not been moved yet, when we try to remove the last
> cpu. And in fact, after the domain has properly shut down:
>
> root@Zhaman:~# xl cpupool-cpu-remove Pool-credit all
> root@Zhaman:~# xl cpupool-list
> Name               CPUs   Sched     Active   Domain count
> Pool-0              12    credit       y          1
> Pool-credit          0    credit       y          0
>
> And in fact, looking at the code introduced by that commit, the
> important part, to me, seems to be the moving of the domain to
> cpupool0, which is indeed the right thing to do. OTOH, what I am seeing
> and fixing, happens (well, could happen) all the time, even when the
> domain being shut down is already in cpupool0, and (as you say yourself
> in your changelog) there is no such issue as removing the last cpu of
> cpupool0.
>
> What am I missing?

The domain being a zombie domain might change the picture. Moving it to
cpupool0 was failing before my patch and it might do so again with your
patch applied.

Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel