[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 2/2] libxl: treat "dying" domains as destroyed

On Mon, 2012-01-30 at 14:01 +0000, Ian Jackson wrote:
> Rename the DOMAIN_DESTROY event to DOMAIN_DEATH and have it trigger
> when the domain goes into the state indicated by the domaininfo flag
> "dying".
> This fixes a race which could leak a daemonised xl process, which
> would have ignored the domain becoming "dying" and would then wait
> forever to be told the domain was destroyed.
> After the domain becomes "dying" we can't generate an event when it is
> actually destroyed because xenstored will eat the relevant
> VIRT_DOM_EXC virq and not generate an @releaseDomain, since xenstored
> discards its own record of the domain's existence as soon as it sees
> the domain "dying" and will not trigger @releaseDomain watches for
> domains it knows nothing about.  Arguably this is a bug in xenstored,
> and the whole @releaseDomain machinery is rather poor, but let us not
> fix that now.
> Anyway, xl does not really want to know when the domain is ultimately
> destroyed.  It is enough for xl to know that it is on the way out, in
> the "dying" state (which leads later to destruction by Xen).
> Also fix a bug where domain_death_xswatch_callback might read one
> domain beyond the valid data in its domaininfos array, by correctly
> ordering the checks for empty domain list, end of domain list, and our
> domain being missing.
> Signed-off-by: Ian Jackson <ian.jackson@xxxxxxxxxxxxx>

This seems to do what I would expect, I tried with "xl reboot", "xl
shutdown" & "xl destroy" as well as in-guest shutdown (which is no
different in reality from "xl shutdown"). No leaks and in each case it
rebooted, destroyed etc the domain as expected.

> ---
>  tools/libxl/libxl.c         |   57 ++++++++++++++++++++++++++++--------------
>  tools/libxl/libxl_types.idl |    4 +-
>  tools/libxl/xl_cmdimpl.c    |    4 +-
>  3 files changed, 42 insertions(+), 23 deletions(-)
> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> index 2758d4c..b74a4d9 100644
> --- a/tools/libxl/libxl.c
> +++ b/tools/libxl/libxl.c
> @@ -685,6 +685,29 @@ int libxl_domain_reboot(libxl_ctx *ctx, uint32_t domid)
>      return ret;
>  }
> +static void domain_death_occurred(libxl__egc *egc,
> +                                  libxl_evgen_domain_death **evg_upd,
> +                                  const char *why) {
> +    /* Removes **evg from the list and advances *evg_upd to the next
> +     * entry.  Call sites in _xswatch_callback must use "continue". */

There's no **evg in this context? ITYM *evg or just evg?

Also it's not clear which list this talks about since there is
death_list and death_reported as well as, presumably, a list of current
domain infos somewhere. Did you mean "moves evg from death_list to
death_reported list and advances *evg_upd ...."?

> +    EGC_GC;
> +    libxl_evgen_domain_death *const evg = *evg_upd;
> +
> +    LIBXL__LOG(CTX, LIBXL__LOG_DEBUG, " death %s", why);

LIBXL__LOG puts a space between the boilerplate and the message so you
end up with two spaces before "death" here.

Also the resultant messages are:
        " death destroyed"
        " death missing"
        " death dying"
in context that is:
        libxl: debug: libxl.c:696:domain_death_occurred:  death dying

Which could do with a noun or something. 

Or, maybe you meant s/death/domain/?

Other than those small nits:

Acked-/Tested-by: Ian Campbell <ian.campbell@xxxxxxxxxx>


Xen-devel mailing list



Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.