
Re: [Xen-devel] xl shutdown --wait "racy"



Wednesday, April 16, 2014, 6:27:10 PM, you wrote:

> On Wed, 2014-04-16 at 17:20 +0200, Sander Eikelenboom wrote:
>> Wednesday, April 16, 2014, 5:02:50 PM, you wrote:
>> 
>> > On Wed, 2014-04-16 at 16:55 +0200, Sander Eikelenboom wrote:
>> >> Wednesday, April 16, 2014, 4:33:30 PM, you wrote:
>> >> 
>> >> > On Wed, 2014-04-16 at 16:26 +0200, Sander Eikelenboom wrote:
>> >> >> Wednesday, April 16, 2014, 4:13:59 PM, you wrote:
>> >> >> 
>> >> >> > On Wed, 2014-04-16 at 16:08 +0200, Sander Eikelenboom wrote:
>> >> >> >> Hi Ian (C|J) Konrad,
>> >> >> >> 
>> >> >> >> I'm currently trying to work around the
>> >> >> >> pci-(detach|assignable-remove) issues i reported earlier.
>> >> >> >> 
>> >> >> >> The workaround i thought of was:
>> >> >> >> - shutting down the guest
>> >> >> >> - starting it without 1 of the original devices passed through
>> >> >> >> - use xl pci-assignable-remove and bind the device to the dom0
>> >> >> >>   driver.
>> >> >> >> 
>> >> >> >> But during this i noticed that a "xl shutdown --wait" does wait ..
>> >> >> >> but returns:
>> >> >> >> - before the domain is removed from, for instance, "xl list"; it is
>> >> >> >>   still listed there in "--ps--" state.
>> >> >> >> - before pciback has done its restore config space magic.
>> >> >> >> 
>> >> >> >> So it seems the wait loop is exiting somewhat prematurely, is this 
>> >> >> >> expected ? 
>> >> >> 
>> >> >> > It is waiting for the domain to be shutdown (state 's') not for the
>> >> >> > domain to be destroyed. So it's doing what it said it would (I
>> >> >> > appreciate you might not find this distinction helpful under the
>> >> >> > circumstances...)
>> >> >> 
>> >> >> It's at least not entirely what i expected ;-)
>> >> >> 
>> >> >> Is it because there can be different "follow-up actions" due to the 
>> >> >> "on_poweroff=" config option ?
>> >> 
>> >> > Not really, those are somewhat unrelated.
>> >> 
>> >> > shutdown and destroy are two distinct events. Once a domain has shutdown
>> >> > (called the shutdown hypercall etc) it goes into state "shutdown" and an
>> >> > event is generated from the hypervisor to the toolstack. The toolstack's
>> >> > response to this is to actually destroy the domain, that is to tear down
>> >> > the resources it is using etc.
>> >> 
>> >> > on_* only matter for the destroy phase since they tell the toolstack
>> >> > what it should do (restart, preserve, really destroy etc).
>> >> 
>> >> Hmm ok, it should be called "--wait_until_halfway" then ;-)
>> 
>> > ;-)
>> 
>> >> On the more serious side .. would patches be accepted that:
>> >> 
>> >> a) differentiate when it returns from waiting based on the on_*
>> >> 
>> >>         preserve: this could probably stay as is .. after the shutdown event
>> >>         destroy:
>> >>         restart:
>> >>         rename-restart:
>> >>         coredump-destroy:
>> >>         coredump-restart:
>> >> 
>> >>         for the other ones .. i don't know if there actually are events in
>> >>         libxl that could be 'easily' coupled ?
>> 
>> > Might be tricky, since on_* is processed by the daemonised xl which is
>> > monitoring the domain, not the xl shutdown process.
>> 
>> >> b) make it possible for the xl commandline to overrule the on_* from the
>> >>    config file
>> 
>> > I guess you mean the xl shutdown command. This will also be tricky, for
>> > the same reasons as a.
>> 
>> >> c) also introduce a -w/--wait for xl destroy
>> 
>> > Yes.
>> 
>> > I'll add:
>> 
>> > d) Make "xl shutdown --wait" actually wait for the domain to be
>> > destroyed.
>> 
>> > Probably, assuming that is possible (I'm concerned about races in the
>> > implementation of this...). Might also interact weirdly with on_* I
>> > suppose.
>> 
>> Well if we could pass down the events that "wait_for_domain_deaths" is
>> allowed to return on ... now it seems to return on *any* event .. and only
>> print something different on both shutdown and complete death ...

> Any other event would be unexpected I think, since the corresponding
> libxl_evenable_* would never have been called in the xl shutdown path.

>> Is there a special event that's triggered on timeout (as defined in
>> /etc/defaults/xendomains: XENDOMAINS_STOP_MAXWAIT=300)?

> I don't see a timeout in the libxl_event_wait prototype so I suppose
> not.

>> Then the solution seems to be to let the caller of "wait_for_domain_deaths"
>> be able to specify the events it should return on.

> They already can -- by only enabling those events.

>>  (always return on timeout ... return on any unless specified ... only
>> return on specified when specific events are specified)
>> 
>> Could you elaborate on how you think this would get "racy" ?

> Oh, it looks like libxl already solved it and has an event, never mind.
> (I was concerned that a new domain with the same domid might appear
> right after the domain was gone)

Hrmm yes well .. i only see a way to enable both from libxl_event.h:

typedef struct libxl__evgen_domain_death libxl_evgen_domain_death;
int libxl_evenable_domain_death(libxl_ctx *ctx, uint32_t domid,
                         libxl_ev_user, libxl_evgen_domain_death **evgen_out);
void libxl_evdisable_domain_death(libxl_ctx *ctx, libxl_evgen_domain_death*);
  /* Arranges for the generation of DOMAIN_SHUTDOWN and DOMAIN_DESTROY
   * events.  A domain which is destroyed before it shuts down
   * may generate only a DESTROY event.
   */
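
A minimal sketch of what "let the caller specify the events it should
return on" could look like. It does not use libxl at all: the sketch_*
names, the event struct and the array standing in for the event queue are
all hypothetical, and a real version would presumably sit on top of
libxl_event_wait rather than scan an array.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the DOMAIN_SHUTDOWN and DOMAIN_DESTROY
 * events mentioned in libxl_event.h; NOT the real libxl definitions. */
enum sketch_event_type {
    SKETCH_EV_DOMAIN_SHUTDOWN = 1u << 0,
    SKETCH_EV_DOMAIN_DESTROY  = 1u << 1
};

struct sketch_event {
    unsigned type;   /* one of enum sketch_event_type */
    unsigned domid;  /* domain the event refers to */
};

/* Scan a queue of events (a plain array here, standing in for a blocking
 * wait loop) and return the index of the first event for `domid` whose
 * type is in the `wanted` mask. Events of other types or for other
 * domains are skipped. Returns -1 if no such event is queued. */
int sketch_wait_for(const struct sketch_event *queue, size_t n,
                    unsigned domid, unsigned wanted)
{
    for (size_t i = 0; i < n; i++) {
        if (queue[i].domid != domid)
            continue;               /* event for some other domain */
        if (queue[i].type & wanted)
            return (int)i;          /* caller asked for this one */
    }
    return -1;
}
```

With a mask of SKETCH_EV_DOMAIN_DESTROY only, the shutdown event for the
same domain is skipped and the loop returns on the destroy event, which is
the behaviour that point d) above asks of "xl shutdown --wait".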


From libxl_internal.h, though, it looks like there was once a thought about
differentiating the two:

struct libxl__evgen_domain_death {
    uint32_t domid;
    unsigned shutdown_reported:1, death_reported:1;
    LIBXL_TAILQ_ENTRY(libxl_evgen_domain_death) entry;
        /* on list .death_reported ? CTX->death_list : CTX->death_reported */
    libxl_ev_user user;
};
_hidden void
libxl__evdisable_domain_death(libxl__gc*, libxl_evgen_domain_death*);
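
One possible reading of the two flag bits, as a sketch only: it is an
assumption (not checked against the libxl implementation) that
shutdown_reported and death_reported simply record that each event has
already been emitted once for this generator. The sketch_* names and the
exists/is_shutdown inputs are hypothetical.

```c
#include <stdbool.h>

/* Cut-down, hypothetical copy of the generator state shown above. */
struct sketch_evgen {
    unsigned domid;
    unsigned shutdown_reported:1, death_reported:1;
};

enum sketch_report { SKETCH_NONE, SKETCH_SHUTDOWN, SKETCH_DESTROY };

/* Given the currently observed state of the domain, decide which event
 * (if any) should be reported now, and remember that it was reported so
 * that each event is emitted at most once. */
enum sketch_report sketch_update(struct sketch_evgen *g,
                                 bool exists, bool is_shutdown)
{
    if (!exists) {
        /* Domain is gone: report DESTROY once, even if no shutdown
         * was ever seen. */
        if (g->death_reported)
            return SKETCH_NONE;
        g->death_reported = 1;
        return SKETCH_DESTROY;
    }
    if (is_shutdown && !g->shutdown_reported) {
        g->shutdown_reported = 1;
        return SKETCH_SHUTDOWN;
    }
    return SKETCH_NONE;
}
```

Under this reading a domain that disappears without ever being seen in the
shutdown state yields only a destroy report, which matches the "may
generate only a DESTROY event" remark in the libxl_event.h comment.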



> Ian.






 

