Re: [Xen-devel] xl shutdown --wait "racy"
On Wed, 2014-04-16 at 17:20 +0200, Sander Eikelenboom wrote:
> Wednesday, April 16, 2014, 5:02:50 PM, you wrote:
>
> > On Wed, 2014-04-16 at 16:55 +0200, Sander Eikelenboom wrote:
> >> Wednesday, April 16, 2014, 4:33:30 PM, you wrote:
> >>
> >> > On Wed, 2014-04-16 at 16:26 +0200, Sander Eikelenboom wrote:
> >> >> Wednesday, April 16, 2014, 4:13:59 PM, you wrote:
> >> >>
> >> >> > On Wed, 2014-04-16 at 16:08 +0200, Sander Eikelenboom wrote:
> >> >> >> Hi Ian (C|J) Konrad,
> >> >> >>
> >> >> >> I'm currently trying to work around the
> >> >> >> pci-(detach|assignable-remove) issues i reported earlier.
> >> >> >>
> >> >> >> The workaround i thought of was:
> >> >> >> - shutting down the guest
> >> >> >> - starting it without 1 of the original devices passed through
> >> >> >> - use xl pci-assignable-remove and bind the device to the dom0
> >> >> >>   driver.
> >> >> >>
> >> >> >> But during this i noticed that a "xl shutdown --wait" does wait ..
> >> >> >> but returns:
> >> >> >> - before the domain is removed from for instance "xl list", it is
> >> >> >>   still listed there in "--ps--" state.
> >> >> >> - before pciback has done its restore config space magic.
> >> >> >>
> >> >> >> So it seems the wait loop is exiting somewhat prematurely, is this
> >> >> >> expected ?
> >> >>
> >> >> > It is waiting for the domain to be shutdown (state 's') not for the
> >> >> > domain to be destroyed. So it's doing what it said it would (I
> >> >> > appreciate you might not find this distinction helpful under the
> >> >> > circumstances...)
> >> >>
> >> >> It's at least not entirely what i expected ;-)
> >> >>
> >> >> Is it because there can be different "follow-up actions" due to the
> >> >> "on_poweroff=" config option ?
> >>
> >> > Not really, those are somewhat unrelated.
> >>
> >> > shutdown and destroy are two distinct events.
> >> > Once a domain has shutdown
> >> > (called the shutdown hypercall etc) it goes into state "shutdown" and an
> >> > event is generated from the hypervisor to the toolstack. The toolstack's
> >> > response to this is to actually destroy the domain, that is to tear down
> >> > the resources it is using etc.
> >>
> >> > on_* only matter for the destroy phase since they tell the toolstack
> >> > what it should do (restart, preserve, really destroy etc).
> >>
> >> Hmm ok, it should be called "--wait_until_halfway" then ;-)
>
> > ;-)
>
> >> On the more serious side .. would patches be accepted that:
> >>
> >> a) differentiate when it returns from waiting based on the on_*
> >>
> >>    preserve: this could probably stay as is .. after the shutdown event
> >>    destroy:
> >>    restart:
> >>    rename-restart:
> >>    coredump-destroy:
> >>    coredump-restart:
> >>
> >>    for the other ones .. i don't know if there actually are events in
> >>    libxl that could be 'easily' coupled ?
>
> > Might be tricky, since on_* is processed by the daemonised xl which is
> > monitoring the domain, not the xl shutdown process.
>
> >> b) make it possible for the xl commandline to overrule the on_* from the
> >>    configfile
>
> > I guess you mean the xl shutdown command. This will also be tricky, for
> > the same reasons as a.
>
> >> c) also introduce a -w/--wait for xl destroy
>
> > Yes.
>
> > I'll add:
>
> > d) Make "xl shutdown --wait" actually wait for the domain to be
> >    destroyed.
>
> > Probably, assuming that is possible (I'm concerned about races in the
> > implementation of this...). Might also interact weirdly with on_* I
> > suppose.
>
> Well if we could pass down the events that "wait_for_domain_deaths" is
> allowed to return on ... now it seems to return on *any* event .. and only
> print something different on both shutdown and complete death ...

Any other event would be unexpected I think, since the corresponding
libxl_evenable_* would never have been called in the xl shutdown path.
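
The two-phase sequence described above (a shutdown event first, destruction by the toolstack afterwards) can be sketched as a toy model. This is a minimal Python illustration, not libxl code: the `DomainEvent` enum and the `wait_for` helper are hypothetical names invented for this sketch, and the event names only echo the shutdown/death distinction made in the thread.

```python
from enum import Enum
from typing import Iterable, Optional, Set


class DomainEvent(Enum):
    # Hypothetical names for illustration; they only mirror the two phases
    # discussed in the thread and are not libxl's API.
    SHUTDOWN = "shutdown"  # guest has shut down (state 's'), resources still held
    DEATH = "death"        # toolstack has torn the domain down


def wait_for(events: Iterable[DomainEvent],
             return_on: Set[DomainEvent]) -> Optional[DomainEvent]:
    """Consume events in order and return the first one found in `return_on`."""
    for event in events:
        if event in return_on:
            return event
    return None  # the stream ended without a matching event


# Order described in the thread: shutdown first, destruction afterwards.
timeline = [DomainEvent.SHUTDOWN, DomainEvent.DEATH]

# Behaviour reported for "xl shutdown --wait": either event ends the wait.
first = wait_for(timeline, {DomainEvent.SHUTDOWN, DomainEvent.DEATH})

# Proposal (d): only the death event ends the wait.
strict = wait_for(timeline, {DomainEvent.DEATH})

print(first.value, strict.value)
```

With a wait that accepts either event, the caller resumes while the domain is still in the shut-down state, which matches the "--ps--" entry still visible in "xl list". Restricting the accepted set to the death event is the behaviour proposal (d) asks for.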
> Is there a special event that's triggered on timeout (as defined in
> /etc/defaults/xendomains: XENDOMAINS_STOP_MAXWAIT=300) ?

I don't see a timeout in the libxl_event_wait prototype so I suppose
not.

> Then the solution seems to be to let the caller of "wait_for_domain_deaths"
> be able to specify the events it should return on.

They already can -- by only enabling those events.

> (always return on timeout ...
> return on any unless specified ... only return on specified when specific
> events are specified)
>
> Could you elaborate on how you think this would get "racy" ?

Oh, it looks like libxl already solved it and has an event, never mind.
(I was concerned that a new domain with the same domid might appear
right after the domain was gone)

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
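
Until "xl shutdown --wait" waits for destruction, a caller could poll for the domain to disappear before touching the passed-through device. The following is a minimal sketch under stated assumptions, not a tested recipe: it assumes that `xl list <domain>` exits non-zero once the domain no longer exists, the helper names `run_quiet` and `wait_until_gone` are hypothetical, and the default timeout of 300 seconds simply echoes the XENDOMAINS_STOP_MAXWAIT value quoted above.

```python
import subprocess
import time
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], int]


def run_quiet(cmd: Sequence[str]) -> int:
    """Run a command, discard its output, and return its exit status."""
    return subprocess.run(cmd, stdout=subprocess.DEVNULL,
                          stderr=subprocess.DEVNULL).returncode


def wait_until_gone(domain: str, timeout: float = 300.0, interval: float = 1.0,
                    run: Runner = run_quiet,
                    sleep: Callable[[float], None] = time.sleep,
                    clock: Callable[[], float] = time.monotonic) -> bool:
    """Poll `xl list <domain>` until it fails or `timeout` seconds elapse.

    Assumption: `xl list <domain>` exits non-zero once the domain is gone.
    Returns True if the domain disappeared, False on timeout.
    """
    deadline = clock() + timeout
    while True:
        if run(["xl", "list", domain]) != 0:
            return True   # domain no longer listed
        if clock() >= deadline:
            return False  # still listed when the deadline passed
        sleep(interval)
```

A caller would run `xl shutdown --wait` on the guest first, then call `wait_until_gone` with the guest's name and only proceed to `xl pci-assignable-remove` if it returns True. Polling by name has a race similar to the domid concern raised above: if the on_* setting restarts the guest, a domain with the same name may reappear and the poll will time out. The sketch also cannot tell whether pciback has finished restoring the device's config space.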