Re: [Xen-devel] xl shutdown --wait "racy"
Wednesday, April 16, 2014, 6:27:10 PM, you wrote:
> On Wed, 2014-04-16 at 17:20 +0200, Sander Eikelenboom wrote:
>> Wednesday, April 16, 2014, 5:02:50 PM, you wrote:
>>
>> > On Wed, 2014-04-16 at 16:55 +0200, Sander Eikelenboom wrote:
>> >> Wednesday, April 16, 2014, 4:33:30 PM, you wrote:
>> >>
>> >> > On Wed, 2014-04-16 at 16:26 +0200, Sander Eikelenboom wrote:
>> >> >> Wednesday, April 16, 2014, 4:13:59 PM, you wrote:
>> >> >>
>> >> >> > On Wed, 2014-04-16 at 16:08 +0200, Sander Eikelenboom wrote:
>> >> >> >> Hi Ian (C|J) Konrad,
>> >> >> >>
>> >> >> >> I'm currently trying to work around the
>> >> >> >> pci-(detach|assignable-remove) issues I
>> >> >> >> reported earlier.
>> >> >> >>
>> >> >> >> The workaround i thought of was:
>> >> >> >> - shutting down the guest
>> >> >> >> - starting it without 1 of the original devices passed through
>> >> >> >> - use xl pci-assignable-remove and bind the device to the dom0
>> >> >> >> driver.
>> >> >> >>
>> >> >> >> But during this i noticed that a "xl shutdown --wait" does wait ..
>> >> >> >> but returns:
>> >> >> >> - before the domain is removed from, for instance, "xl list";
>> >> >> >> it is still listed there in "--ps--" state.
>> >> >> >> - before pciback has done its restore config space magic.
>> >> >> >>
>> >> >> >> So it seems the wait loop is exiting somewhat prematurely, is this
>> >> >> >> expected ?
>> >> >>
>> >> >> > It is waiting for the domain to be shutdown (state 's') not for the
>> >> >> > domain to be destroyed. So it's doing what it said it would (I
>> >> >> > appreciate you might not find this distinction helpful under the
>> >> >> > circumstances...)
>> >> >>
>> >> >> It's at least not entirely what i expected ;-)
>> >> >>
>> >> >> Is it because there can be different "follow-up actions" due to the
>> >> >> "on_poweroff=" config option ?
>> >>
>> >> > Not really, those are somewhat unrelated.
>> >>
>> >> > shutdown and destroy are two distinct events. Once a domain has shutdown
>> >> > (called the shutdown hypercall etc) it goes into state "shutdown" and an
>> >> > event is generated from the hypervisor to the toolstack. The toolstack's
>> >> > response to this is to actually destroy the domain, that is to tear down
>> >> > the resources it is using etc.
>> >>
>> >> > on_* only matter for the destroy phase since they tell the toolstack
>> >> > what it should do (restart, preserve, really destroy etc).
>> >>
>> >> Hmm ok, it should be called "--wait_until_halfway" then ;-)
>>
>> > ;-)
>>
>> >> On the more serious side .. would patches be accepted that:
>> >>
>> >> a) differentiate when it returns from waiting based on the on_*
>> >>
>> >> preserve: this could probably stay as is .. after the shutdown
>> >> event
>> >> destroy:
>> >> restart:
>> >> rename-restart:
>> >> coredump-destroy:
>> >> coredump-restart:
>> >>
>> >> for the other ones .. i don't know if there actually are events
>> >> in libxl
>> >> that could be 'easily' coupled ?
>>
>> > Might be tricky, since on_* is processed by the daemonised xl which is
>> > monitoring the domain, not the xl shutdown process.
>>
>> >> b) make it possible for the xl commandline to overrule the on_* from the
>> >> configfile
>>
>> > I guess you mean the xl shutdown command. This will also be tricky, for
>> > the same reasons as a.
>>
>> >> c) also introduce a -w/--wait for xl destroy
>>
>> > Yes.
>>
>> > I'll add:
>>
>> > d) Make "xl shutdown --wait" actually wait for the domain to be
>> > destroyed.
>>
>> > Probably, assuming that is possible (I'm concerned about races in the
>> > implementation of this...). Might also interact weirdly with on_* I
>> > suppose.
>>
>> Well if we could pass down the events that "wait_for_domain_deaths" is
>> allowed to return on ... now it seems to return on *any* event .. and only
>> print something different on both shutdown and complete death ...
> Any other event would be unexpected I think, since the corresponding
> libxl_evenable_* would never have been called in the xl shutdown path.
>> Is there a special event that's triggered on timeout (as defined in
>> /etc/defaults/xendomains: XENDOMAINS_STOP_MAXWAIT=300) ?
> I don't see a timeout in the libxl_event_wait prototype so I suppose
> not.
>> Then the solution seems to be to let the caller of "wait_for_domain_deaths"
>> be able to specify the events it should return on.
> They already can -- by only enabling those events.
>> (always return on timeout ...
>> return on any unless specified ... only return on specified when specific
>> events are specified)
>>
>> Could you elaborate on how you think this would get "racy" ?
> Oh, it looks like libxl already solved it and has an event, never mind.
> (I was concerned that a new domain with the same domid might appear
> right after the domain was gone)
Hrmm yes well .. I only see a way to enable both from libxl_event.h:

    typedef struct libxl__evgen_domain_death libxl_evgen_domain_death;
    int libxl_evenable_domain_death(libxl_ctx *ctx, uint32_t domid,
                 libxl_ev_user, libxl_evgen_domain_death **evgen_out);
    void libxl_evdisable_domain_death(libxl_ctx *ctx, libxl_evgen_domain_death*);
      /* Arranges for the generation of DOMAIN_SHUTDOWN and DOMAIN_DESTROY
       * events. A domain which is destroyed before it shuts down
       * may generate only a DESTROY event.
       */
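
Since that header only offers one call that enables both events, the filtering I proposed would have to live in the waiting code. A minimal, self-contained sketch of those three rules follows. It does not use libxl at all: the names ev_kind, struct ev and wait_for_events are hypothetical, and the event stream is a plain array standing in for whatever libxl_event_wait would deliver.

```c
#include <stddef.h>

/* Hypothetical event kinds, loosely modelled on DOMAIN_SHUTDOWN and
 * DOMAIN_DESTROY; EV_TIMEOUT is an invented pseudo-event for this sketch. */
enum ev_kind {
    EV_SHUTDOWN = 1u << 0,
    EV_DESTROY  = 1u << 1,
    EV_TIMEOUT  = 1u << 2
};

/* One simulated event: arrival time in seconds since the wait started. */
struct ev {
    unsigned at;
    enum ev_kind kind;
};

/*
 * Walk a time-ordered array of events and return the kind that ends the wait.
 *   wanted == 0 : return on the first event of any kind
 *   wanted != 0 : return only on an event whose kind is in the mask
 * EV_TIMEOUT is returned if nothing acceptable arrives within max_wait.
 */
enum ev_kind wait_for_events(const struct ev *evs, size_t n,
                             unsigned wanted, unsigned max_wait)
{
    for (size_t i = 0; i < n; i++) {
        if (evs[i].at > max_wait)
            break;                      /* everything later is past the deadline */
        if (wanted == 0 || (evs[i].kind & wanted))
            return evs[i].kind;
    }
    return EV_TIMEOUT;
}
```

With a stream of { 5s: shutdown, 9s: destroy }, a mask of 0 returns at the shutdown event (roughly what "xl shutdown --wait" does today), a mask of EV_DESTROY returns at the destroy event, and EV_DESTROY with a 7 second limit ends in a timeout.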
Although from libxl_internal.h it looks like there once has been a thought
about differentiating:

    struct libxl__evgen_domain_death {
        uint32_t domid;
        unsigned shutdown_reported:1, death_reported:1;
        LIBXL_TAILQ_ENTRY(libxl_evgen_domain_death) entry;
            /* on list .death_reported ? CTX->death_list : CTX->death_reported */
        libxl_ev_user user;
    };
    _hidden void
    libxl__evdisable_domain_death(libxl__gc*, libxl_evgen_domain_death*);

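
To illustrate what two separate "reported" bits would buy, here is a simplified model of how such flags could gate event generation so that each event fires at most once. This is an assumption about the intent, not libxl's actual code: struct death_watch, death_watch_update and the GEN_* constants are hypothetical names, and the domain state is passed in as two booleans instead of being read from the hypervisor.

```c
#include <stdbool.h>

/* Hypothetical bits for the events produced by one update. */
enum { GEN_SHUTDOWN = 1u << 0, GEN_DESTROY = 1u << 1 };

/* Cut-down stand-in for the two flag bits in the struct quoted above. */
struct death_watch {
    unsigned shutdown_reported:1, death_reported:1;
};

/*
 * Feed the currently observed domain state into the watch and return a
 * mask of newly generated events.
 *   exists    : the domain is still known to the hypervisor
 *   shut_down : the domain has entered the "shutdown" state
 * Each event is reported once; a domain that vanishes without ever being
 * seen in the shutdown state yields only GEN_DESTROY.
 */
unsigned death_watch_update(struct death_watch *w, bool exists, bool shut_down)
{
    if (w->death_reported)
        return 0;                       /* nothing more to say about this domain */

    if (!exists) {
        w->death_reported = 1;
        return GEN_DESTROY;
    }

    if (shut_down && !w->shutdown_reported) {
        w->shutdown_reported = 1;
        return GEN_SHUTDOWN;
    }

    return 0;
}
```

A caller that only cares about complete death could ignore GEN_SHUTDOWN and keep polling until GEN_DESTROY comes back.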
> Ian.
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel