Re: [Xen-devel] xl shutdown --wait "racy"
Wednesday, April 16, 2014, 6:27:10 PM, you wrote:

> On Wed, 2014-04-16 at 17:20 +0200, Sander Eikelenboom wrote:
>> Wednesday, April 16, 2014, 5:02:50 PM, you wrote:
>>
>> > On Wed, 2014-04-16 at 16:55 +0200, Sander Eikelenboom wrote:
>> >> Wednesday, April 16, 2014, 4:33:30 PM, you wrote:
>> >>
>> >> > On Wed, 2014-04-16 at 16:26 +0200, Sander Eikelenboom wrote:
>> >> >> Wednesday, April 16, 2014, 4:13:59 PM, you wrote:
>> >> >>
>> >> >> > On Wed, 2014-04-16 at 16:08 +0200, Sander Eikelenboom wrote:
>> >> >> >> Hi Ian (C|J) Konrad,
>> >> >> >>
>> >> >> >> I'm currently trying to work around the
>> >> >> >> pci-(detach|assignable-remove) issues i reported earlier.
>> >> >> >>
>> >> >> >> The workaround i thought of was:
>> >> >> >> - shutting down the guest
>> >> >> >> - starting it without 1 of the original devices passed through
>> >> >> >> - use xl pci-assignable-remove and bind the device to the dom0
>> >> >> >>   driver.
>> >> >> >>
>> >> >> >> But during this i noticed that a "xl shutdown --wait" does wait ..
>> >> >> >> but returns:
>> >> >> >> - Before the domain is removed from for instance "xl list", it is
>> >> >> >>   still listed there in "--ps--" state.
>> >> >> >> - before pciback has done its restore config space magic.
>> >> >> >>
>> >> >> >> So it seems the wait loop is exiting somewhat prematurely, is this
>> >> >> >> expected ?
>> >> >> >>
>> >> >> > It is waiting for the domain to be shutdown (state 's') not for the
>> >> >> > domain to be destroyed. So it's doing what it said it would (I
>> >> >> > appreciate you might not find this distinction helpful under the
>> >> >> > circumstances...)
>> >> >>
>> >> >> It's at least not entirely what i expected ;-)
>> >> >>
>> >> >> Is it because there can be different "follow-up actions" due to the
>> >> >> "on_poweroff=" config option ?
>> >>
>> >> > Not really, those are somewhat unrelated.
>> >>
>> >> > shutdown and destroy are two distinct events. Once a domain has shutdown
>> >> > (called the shutdown hypercall etc) it goes into state "shutdown" and an
>> >> > event is generated from the hypervisor to the toolstack. The toolstack's
>> >> > response to this is to actually destroy the domain, that is to tear down
>> >> > the resources it is using etc.
>> >>
>> >> > on_* only matter for the destroy phase since they tell the toolstack
>> >> > what it should do (restart, preserve, really destroy etc).
>> >>
>> >> Hmm ok, it should be called "--wait_until_halfway" then ;-)
>>
>> > ;-)
>>
>> >> On the more serious side .. would patches be accepted that:
>> >>
>> >> a) differentiate when it returns from waiting based on the on_*
>> >>
>> >>      preserve: this could probably stay as is .. after the shutdown event
>> >>      destroy:
>> >>      restart:
>> >>      rename-restart:
>> >>      coredump-destroy:
>> >>      coredump-restart:
>> >>
>> >>      for the other ones .. i don't know if there actually are events in libxl
>> >>      that could be 'easily' coupled ?
>>
>> > Might be tricky, since on_* is processed by the daemonised xl which is
>> > monitoring the domain, not the xl shutdown process.
>>
>> >> b) make it possible for the xl commandline to overrule the on_* from the
>> >>    configfile
>>
>> > I guess you mean the xl shutdown command. This will also be tricky, for
>> > the same reasons as a.
>>
>> >> c) also introduce a -w/--wait for xl destroy
>>
>> > Yes.
>>
>> > I'll add:
>>
>> > d) Make "xl shutdown --wait" actually wait for the domain to be
>> >    destroyed.
>>
>> > Probably, assuming that is possible (I'm concerned about races in the
>> > implementation of this...). Might also interact weirdly with on_* I
>> > suppose.
>>
>> Well if we could pass down the events that "wait_for_domain_deaths" is allowed
>> to return on ... now it seems to return on *any* event .. and only print
>> something different on both shutdown and complete death ...
> Any other event would be unexpected I think, since the corresponding
> libxl_evenable_* would never have been called in the xl shutdown path.

>> Is there a special event that's triggered on timeout (as defined in
>> /etc/defaults/xendomains: XENDOMAINS_STOP_MAXWAIT=300) ?

> I don't see a timeout in the libxl_event_wait prototype so I suppose
> not.

>> The solution seems to be to let the caller of "wait_for_domain_deaths" be
>> able to specify the events it should return on.

> They already can -- by only enabling those events.

>> (always return on timeout ...
>> return on any unless specified ... only return on specified when specific
>> events are specified)
>>
>> Could you elaborate on how you think this would get "racy" ?

> Oh, it looks like libxl already solved it and has an event, never mind.
> (I was concerned that a new domain with the same domid might appear
> right after the domain was gone)

Hrmm yes well .. i only see a way to enable both from libxl_event.h:

    typedef struct libxl__evgen_domain_death libxl_evgen_domain_death;
    int libxl_evenable_domain_death(libxl_ctx *ctx, uint32_t domid,
                             libxl_ev_user, libxl_evgen_domain_death **evgen_out);
    void libxl_evdisable_domain_death(libxl_ctx *ctx, libxl_evgen_domain_death*);
      /* Arranges for the generation of DOMAIN_SHUTDOWN and DOMAIN_DESTROY
       * events.  A domain which is destroyed before it shuts down
       * may generate only a DESTROY event.
       */

Although from libxl_internal.h it looks like there once has been a thought
about differentiating:

    struct libxl__evgen_domain_death {
        uint32_t domid;
        unsigned shutdown_reported:1, death_reported:1;
        LIBXL_TAILQ_ENTRY(libxl_evgen_domain_death) entry;
            /* on list .death_reported ? CTX->death_list : CTX->death_reported */
        libxl_ev_user user;
    };
    _hidden void
    libxl__evdisable_domain_death(libxl__gc*, libxl_evgen_domain_death*);

> Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
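
To make the "let the caller specify the events it should return on" idea from
this thread concrete, here is a minimal sketch in plain C. It deliberately does
not call libxl: the names model_event_type, model_event and model_wait_for are
hypothetical and only model the DOMAIN_SHUTDOWN / DOMAIN_DESTROY distinction
quoted from libxl_event.h. It assumes the events arrive as an already recorded
array, which a real wait loop would of course not have.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event kinds, modelled on the DOMAIN_SHUTDOWN and
 * DOMAIN_DESTROY events described in libxl_event.h. These are NOT
 * libxl's own types; they exist only for this illustration. */
enum model_event_type {
    MODEL_EV_DOMAIN_SHUTDOWN = 1u << 0,
    MODEL_EV_DOMAIN_DESTROY  = 1u << 1
};

struct model_event {
    uint32_t domid;
    enum model_event_type type;
};

/*
 * Walk a recorded event stream and return the index of the first event
 * for `domid` whose type is included in the `return_on` bitmask, or -1
 * if no such event occurs. Events for other domains, and events outside
 * the mask, are skipped instead of ending the wait.
 */
static int model_wait_for(const struct model_event *events, size_t n,
                          uint32_t domid, unsigned return_on)
{
    assert(events != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (events[i].domid != domid)
            continue;               /* some other domain's event */
        if ((unsigned)events[i].type & return_on)
            return (int)i;          /* caller asked to return on this */
    }
    return -1;                      /* stream ended without a match */
}
```

With a stream of "domain 7 shuts down, then domain 7 is destroyed", a mask of
MODEL_EV_DOMAIN_SHUTDOWN returns at the first event (roughly what
"xl shutdown --wait" is described as doing above), while a mask of
MODEL_EV_DOMAIN_DESTROY keeps going until the second (what suggestion d) asks
for).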