Re: [Xen-devel] [PATCH 24/24] libxl: aborting bootloader invocation when domain dioes
Ian Campbell writes ("Re: [Xen-devel] [PATCH 24/24] libxl: aborting bootloader invocation when domain dioes"):
> > +        LIBXL__EVENT_DISASTER(egc,"failed to read xenstore"
> > +                              " for domain deatch check", errno, 0);
>
> detach

Fixed.

> > +int libxl__domaindeathcheck_start(libxl__gc *gc,
> > +                                  libxl__domaindeathcheck *dc)
> > +{
> > +    const char *path = GCSPRINTF("/local/domain/%"PRIu32, dc->domid);
> > +    return libxl__ev_xswatch_register(gc, &dc->watch,
> > +                                      domaindeathcheck_callback, path);
>
> So we are watching for the toolstack (possibly the same one as we are
> running in) to destroy the domain and therefore nuke the xs directory --
> does that actually work?

Yes.  And, yes.

> Specifically in the case of xl running the bootloader -- is xl in any
> position to handle a domain death during domain build? Isn't it doing
> the create synchronously?

The copy of xl which is trying to create the domain is indeed not
capable of initiating destruction.  But another copy of xl might
destroy the domain.

In my tests I found that if one runs, in one window, "xl create -c",
and then when that command is sat at the bootloader prompt (which
might be indefinitely), runs "xl destroy <name of guest>", before my
patch the xl create continues to sit and wait for the bootloader.  (If
the bootloader is then told to continue, the xl create will of course
bomb out since its domain has vanished.)

This is bad not just because it is poor UX, but also because after "xl
destroy" it ought to be possible to run "xl create" without any
difficulty - however the bootloader from the previous incarnation
still has the disk open.  It's also bad because it leaks the
bootloader until the bootloader times out (which might be
indefinitely).

After applying my patch, running "xl destroy" while the create is
waiting for the bootloader causes the creation to be immediately
aborted, as one would hope.

> Wouldn't using the @releaseDomain watch be more reliable?

It would be more complicated.  We don't have a general internal-facing
facility for "has this domain been destroyed", and turning
@releaseDomain watch events into information about specific domains is
complicated because the @releaseDomain event doesn't contain the
domid.

And it is not necessary because, if the domain is destroyed then
either (a) the entries in xenstore have already been deleted, in which
case the test here works or (b) they have not in which case something
has gone very badly wrong and we are going to leak those xenstore
entries, in which case trying to avoid leaking other stuff seems
futile.

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
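
For illustration only, here is a rough sketch of the check described
above: when the watch on /local/domain/<domid> fires, re-read the path
and treat "no such entry" as "the domain has been destroyed".  This is
not the libxl code.  The mock_store type and the mock_store_* and
domain_death_check names are hypothetical stand-ins invented for this
sketch, with an in-memory table playing the role of xenstore.

```c
#include <assert.h>
#include <errno.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical in-memory stand-in for xenstore: a small table of paths. */
#define MOCK_MAX_PATHS 8
#define MOCK_PATH_LEN  64

typedef struct {
    char paths[MOCK_MAX_PATHS][MOCK_PATH_LEN];
    bool used[MOCK_MAX_PATHS];
} mock_store;

/* Add a path to the mock store; returns 0, or -1 with errno set. */
static int mock_store_add(mock_store *s, const char *path)
{
    assert(s && path);
    for (size_t i = 0; i < MOCK_MAX_PATHS; i++) {
        if (!s->used[i]) {
            snprintf(s->paths[i], MOCK_PATH_LEN, "%s", path);
            s->used[i] = true;
            return 0;
        }
    }
    errno = ENOSPC;
    return -1;
}

/* Remove a path, as a concurrent "destroy" would (assumption of the mock). */
static void mock_store_remove(mock_store *s, const char *path)
{
    assert(s && path);
    for (size_t i = 0; i < MOCK_MAX_PATHS; i++)
        if (s->used[i] && strcmp(s->paths[i], path) == 0)
            s->used[i] = false;
}

/* Returns 0 if the path exists, otherwise -1 with errno = ENOENT. */
static int mock_store_read(const mock_store *s, const char *path)
{
    assert(s && path);
    for (size_t i = 0; i < MOCK_MAX_PATHS; i++)
        if (s->used[i] && strcmp(s->paths[i], path) == 0)
            return 0;
    errno = ENOENT;
    return -1;
}

typedef enum { DOMAIN_ALIVE, DOMAIN_DEAD, CHECK_FAILED } death_status;

/*
 * What a watch callback might do each time the watch fires: re-read the
 * domain's directory.  Present means alive; ENOENT means destroyed; any
 * other error is a failed check (never produced by this mock).
 */
static death_status domain_death_check(const mock_store *s, uint32_t domid)
{
    char path[MOCK_PATH_LEN];
    snprintf(path, sizeof(path), "/local/domain/%" PRIu32, domid);

    if (mock_store_read(s, path) == 0)
        return DOMAIN_ALIVE;
    if (errno == ENOENT)
        return DOMAIN_DEAD;
    return CHECK_FAILED;
}
```

The callback re-reads the path rather than trusting the event itself
because a watch firing does not by itself mean the directory is gone.
Only the ENOENT result is taken as evidence of destruction, which
matches case (a) above.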