[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 05/19] libxl: domain save/restore: run in a separate process



On Fri, 2012-06-08 at 18:34 +0100, Ian Jackson wrote:
> libxenctrl expects to be able to simply run the save or restore
> operation synchronously.  This won't work well in a process which is
> trying to handle multiple domains.
> 
> The options are:
> 
>  - Block such a whole process (eg, the whole of libvirt) while
>    migration completes (or until it fails).
> 
>  - Create a thread to run xc_domain_save and xc_domain_restore on.
>    This is quite unpalatable.  Multithreaded programming is error
>    prone enough without generating threads in libraries, particularly
>    if the thread does some very complex operation.
> 
>  - Fork and run the operation in the child without execing.  This is
>    no good because we would need to negotiate with the caller about
>    fds we would inherit (and we might be a very large process).
> 
>  - Fork and exec a helper.
> 
> Of these options the latter is the most palatable.
> 
> Consequently:
> 
>  * A new helper program libxl-save-helper (which does both save and
>    restore).  It will be installed in /usr/lib/xen/bin.  It does not
>    link against libxl, only libxc, and its error handling does not
>    need to be very advanced.  It does contain a plumbing through of
>    the logging interface into the callback stream.
> 
>  * A small ad-hoc protocol between the helper and libxl which allows
>    log messages and the libxc callbacks to be passed up and down.
>    Protocol doc comment is in libxl_save_helper.c.
> 
>  * To avoid a lot of tedium the marshalling boilerplate (stubs for the
>    helper and the callback decoder for libxl) is generated with a
>    small perl script.
> 
>  * Implement new functionality to spawn the helper, monitor its
>    output, provide responses, and check on its exit status.
> 
>  * The functions libxl__xc_domain_restore_done and
>    libxl__xc_domain_save_done now turn out to want be called in the
>    same place.  So make their state argument a void* so that the two
>    functions are type compatible.
> 
> The domain save path still writes the qemu savefile synchronously.
> This will need to be fixed in a subsequent patch.
> 
> Signed-off-by: Ian Jackson <ian.jackson@xxxxxxxxxxxxx>
> 
> Changes in v3:
>  * Suppress errno value in debug message when helper reports successful
>    completion.
>  * Significant consequential changes to cope with changes to
>    earlier patches in the series.
> 
> Changes in v2:
>  * Helper path can be overridden by an environment variable for testing.
>  * Add a couple of debug logging messages re toolstack data.
>  * Fixes from testing.
>  * Helper protocol message lengths (and numbers) are 16-bit which
>    more clearly avoids piling lots of junk on the stack.
>  * Merged with remus changes.
>  * Callback implementations in libxl now called via pointers
>    so remus can have its own callbacks.
>  * Better namespace prefixes on autogenerated names etc.
>  * Autogenerator can generate debugging printfs too.
> ---
[...]
> diff --git a/.hgignore b/.hgignore
> index 27d8f79..05304ea 100644
> --- a/.hgignore
> +++ b/.hgignore
> @@ -180,9 +180,11 @@
>  ^tools/libxl/_.*\.c$
>  ^tools/libxl/libxlu_cfg_y\.output$
>  ^tools/libxl/xl$
> +^tools/libxl/libxl-save-helper$
>  ^tools/libxl/testidl$
>  ^tools/libxl/testidl\.c$
>  ^tools/libxl/tmp\..*$
> +^tools/libxl/.*\.new$

Our naming scheme for these sorts of temporary files is  rather in
consistent (including an existing use of .new), oh well...

[...]

> +SAVE_HELPER_OBJS = libxl_save_helper.o _libxl_save_msgs_helper.o
> +$(SAVE_HELPER_OBJS): CFLAGS += $(CFLAGS_libxenctrl) $(CFLAGS_libxenlight)
> +
>  testidl.o: CFLAGS += $(CFLAGS_libxenctrl) $(CFLAGS_libxenlight)
>  testidl.c: libxl_types.idl gentest.py libxl.h $(AUTOINCS)
>         $(PYTHON) gentest.py libxl_types.idl testidl.c.new
> @@ -117,6 +122,12 @@ _libxl_list.h: 
> $(XEN_INCLUDE)/xen-external/bsd-sys-queue-h-seddery $(XEN_INCLUDE
>         perl $^ --prefix=libxl >$@.new
>         $(call move-if-changed,$@.new,$@)
> 
> +_libxl_save_msgs_helper.c _libxl_save_msgs_callout.c \
> +_libxl_save_msgs_helper.h _libxl_save_msgs_callout.h: \
> +               libxl_save_msgs_gen.pl
> +       $(PERL) -w $< $@ >$@.new
> +       $(call move-if-changed,$@.new,$@)
> +
>  libxl.h: _libxl_types.h
>  libxl_json.h: _libxl_types_json.h
>  libxl_internal.h: _libxl_types_internal.h _paths.h
> @@ -159,6 +170,9 @@ libxlutil.a: $(LIBXLU_OBJS)
>  xl: $(XL_OBJS) libxlutil.so libxenlight.so
>         $(CC) $(LDFLAGS) -o $@ $(XL_OBJS) libxlutil.so $(LDLIBS_libxenlight) 
> $(LDLIBS_libxenctrl) -lyajl $(APPEND_LDFLAGS)
> 
> +libxl-save-helper: $(SAVE_HELPER_OBJS) libxenlight.so
> +       $(CC) $(LDFLAGS) -o $@ $(SAVE_HELPER_OBJS) $(LDLIBS_libxenctrl) 
> $(LDLIBS_libxenguest) $(APPEND_LDFLAGS)

The changelog says that libxl-save-helper doesn't link libxenlight but
it is declared as a dependency here and CFLAGS_libxenlight is included
in SAVE_HELPER_OBJS' CFLAGS above.

> +
>  testidl: testidl.o libxlutil.so libxenlight.so
>         $(CC) $(LDFLAGS) -o $@ testidl.o libxlutil.so $(LDLIBS_libxenlight) 
> $(LDLIBS_libxenctrl) $(APPEND_LDFLAGS)
[...]
> @@ -170,6 +184,7 @@ install: all
>         $(INSTALL_DIR) $(DESTDIR)$(BASH_COMPLETION_DIR)
>         $(INSTALL_DIR) $(DESTDIR)$(XEN_RUN_DIR)
>         $(INSTALL_PROG) xl $(DESTDIR)$(SBINDIR)
> +       $(INSTALL_PROG) libxl-save-helper $(DESTDIR)$(PRIVATE_BINDIR)

This needs an INSTALL_DIR for $(DESTDIR)$(PRIVATE_BINDIR) to support
"make -C tools/libxl DESTDIR=xxx install" else:
        install: cannot create regular file `/tmp/tmplKa0Y7/usr/lib/xen/bin': 
No such file or directory

>         $(INSTALL_PROG) libxenlight.so.$(MAJOR).$(MINOR) $(DESTDIR)$(LIBDIR)
>         ln -sf libxenlight.so.$(MAJOR).$(MINOR) 
> $(DESTDIR)$(LIBDIR)/libxenlight.so.$(MAJOR)
>         ln -sf libxenlight.so.$(MAJOR) $(DESTDIR)$(LIBDIR)/libxenlight.so
> 
> diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
> index b48452c..fea0c94 100644
> --- a/tools/libxl/libxl_dom.c
> +++ b/tools/libxl/libxl_dom.c
> @@ -465,16 +465,20 @@ static inline char *restore_helper(libxl__gc *gc, 
> uint32_t domid,
>  }
> 
>  int libxl__toolstack_restore(uint32_t domid, const uint8_t *buf,
> -        uint32_t size, void *data)
> +                             uint32_t size, void *user)
>  {
> -    libxl__gc *gc = data;
> -    libxl_ctx *ctx = gc->owner;
> +    libxl__save_helper_state *shs = user;
> +    libxl__domain_create_state *dcs = CONTAINER_OF(shs, *dcs, shs);
> +    STATE_AO_GC(dcs->ao);
> +    libxl_ctx *ctx = CTX;

Any reason you didn't s/ctx/CTX/ in one of your preparatory patches?

>      int i, ret;
>      const uint8_t *ptr = buf;
>      uint32_t count = 0, version = 0;
>      struct libxl__physmap_info* pi;
>      char *xs_path;

[...]
> diff --git a/tools/libxl/libxl_save_callout.c 
> b/tools/libxl/libxl_save_callout.c
> index 1b481ab..a39f434 100644
> --- a/tools/libxl/libxl_save_callout.c
> +++ b/tools/libxl/libxl_save_callout.c
> @@ -16,6 +16,19 @@
> 
>  #include "libxl_internal.h"
> 
> +static void run_helper(libxl__egc *egc, libxl__save_helper_state *shs,
> +                       const char *mode_arg, int data_fd,
> +                       const unsigned long *argnums, int num_argnums);
> +
> +static void helper_failed(libxl__egc*, libxl__save_helper_state *shs, int 
> rc);
> +static void helper_stdout_readable(libxl__egc *egc, libxl__ev_fd *ev,
> +                                   int fd, short events, short revents);
> +static void helper_exited(libxl__egc *egc, libxl__ev_child *ch,
> +                          pid_t pid, int status);
> +static void helper_done(libxl__egc *egc, libxl__save_helper_state *shs);
> +
> +/*----- entrypoints -----*/
> +
>  void libxl__xc_domain_restore(libxl__egc *egc, libxl__domain_create_state 
> *dcs,
>                                int hvm, int pae, int superpages,
>                                int no_incr_generationid)
> @@ -27,22 +40,319 @@ void libxl__xc_domain_restore(libxl__egc *egc, 
> libxl__domain_create_state *dcs,
>      const int restore_fd = dcs->restore_fd;
>      libxl__domain_build_state *const state = &dcs->build_state;
> 
> -    int r = xc_domain_restore(CTX->xch, restore_fd, domid,
> -                              state->store_port, &state->store_mfn,
> -                              state->store_domid, state->console_port,
> -                              &state->console_mfn, state->console_domid,
> -                              hvm, pae, superpages, no_incr_generationid,
> -                              &state->vm_generationid_addr, &dcs->callbacks);
> -    libxl__xc_domain_restore_done(egc, dcs, 0, r, errno);
> +    unsigned cbflags = libxl__srm_callout_enumcallbacks_restore
> +        (&dcs->shs.callbacks.restore.a);
> +
> +    const unsigned long argnums[] = {
> +        restore_fd, domid,
> +        state->store_port,
> +        state->store_domid, state->console_port,
> +        state->console_domid,
> +        hvm, pae, superpages, no_incr_generationid,
> +        cbflags,
> +    };
> +
> +    dcs->shs.ao = ao;
> +    dcs->shs.domid = domid;
> +    dcs->shs.recv_callback = libxl__srm_callout_received_restore;
> +    dcs->shs.completion_callback = libxl__xc_domain_restore_done;
> +    dcs->shs.caller_state = dcs;
> +    dcs->shs.need_results = 1;
> +    dcs->shs.toolstack_data_file = 0;
> +
> +    run_helper(egc, &dcs->shs, "--restore-domain", restore_fd,
> +               argnums, ARRAY_SIZE(argnums));
>  }
> 
>  void libxl__xc_domain_save(libxl__egc *egc, libxl__domain_suspend_state *dss,
>                             unsigned long vm_generationid_addr)
>  {
>      STATE_AO_GC(dss->ao);
> -    int r;
> +    int r, rc;
> +    uint32_t toolstack_data_len;
> +
> +    /* Resources we need to free */
> +    uint8_t *toolstack_data_buf = 0;
> +
> +    unsigned cbflags = libxl__srm_callout_enumcallbacks_save
> +        (&dss->shs.callbacks.save.a);
> +
> +    r = libxl__toolstack_save(dss->domid, &toolstack_data_buf,
> +                              &toolstack_data_len, dss);

This seems to be called twice? I'm looking at the source after the full
series is applied and I see
        dss->shs.callbacks.save.toolstack_save = libxl__toolstack_save;
in libxl__domain_suspend as well as this call here.

The callback one seems to be otherwise unused.

> +static void run_helper(libxl__egc *egc, libxl__save_helper_state *shs,
> +                       const char *mode_arg, int data_fd,
> +                       const unsigned long *argnums, int num_argnums)
> +{
> +    STATE_AO_GC(shs->ao);
> +    const char *args[3 + num_argnums];
> +    const char **arg = args;
> +    int i, rc;
> +
> +    /* Resources we must free */
> +    libxl__carefd *childs_pipes[2] = { 0,0 };
> +
> +    /* Convenience aliases */
> +    const uint32_t domid = shs->domid;
> +
> +    shs->rc = 0;
> +    shs->completed = 0;
> +    shs->pipes[0] = shs->pipes[1] = 0;
> +    libxl__ev_fd_init(&shs->readable);
> +    libxl__ev_child_init(&shs->child);
> +
> +    shs->stdin_what = GCSPRINTF("domain %"PRIu32" save/restore helper"
> +                                " stdin pipe", domid);
> +    shs->stdout_what = GCSPRINTF("domain %"PRIu32" save/restore helper"
> +                                 " stdout pipe", domid);
> +
> +    *arg++ = getenv("LIBXL_SAVE_HELPER") ?: LIBEXEC "/" "libxl-save-helper";
> +    *arg++ = mode_arg;
> +    for (i=0; i<num_argnums; i++)
> +        *arg++ = GCSPRINTF("%lu", argnums[i]);
> +    *arg++ = 0;
> +    assert(arg == args + ARRAY_SIZE(args));
> +
> +    libxl__carefd_begin();
> +    for (i=0; i<2; i++) {
> +        int fds[2];
> +        if (libxl_pipe(CTX,fds)) { rc = ERROR_FAIL; goto out; }
> +        childs_pipes[i] = libxl__carefd_record(CTX, fds[i]);
> +        shs->pipes[i] = libxl__carefd_record(CTX, fds[!i]);

Using !i here is clever, but I had to deploy a pen to be sure the
assignments were all correct.

Open coding this simple loop would also give you the opportunity the
have comments like "/* This is the helper processes's stdout */" etc in
the appropriate place to aid in comprehension when checking the correct
end of each pipe goes in the right place.

> +    }
> +    libxl__carefd_unlock();
> +
> +    pid_t pid = libxl__ev_child_fork(gc, &shs->child, helper_exited);
> +    if (!pid) {
> +        libxl_fd_set_cloexec(CTX, data_fd, 0);
> +        libxl__exec(gc,
> +                    libxl__carefd_fd(childs_pipes[0]),
> +                    libxl__carefd_fd(childs_pipes[1]),
> +                    -1,
> +                    args[0], (char**)args, 0);
> +    }
> +
> +    libxl__carefd_close(childs_pipes[0]);
> +    libxl__carefd_close(childs_pipes[1]);
> +
> +    rc = libxl__ev_fd_register(gc, &shs->readable, helper_stdout_readable,
> +                               libxl__carefd_fd(shs->pipes[1]), 
> POLLIN|POLLPRI);
> +    if (rc) goto out;
> +    return;
> +
> + out:
> +    libxl__carefd_close(childs_pipes[0]);
> +    libxl__carefd_close(childs_pipes[1]);
> +    helper_failed(egc, shs, rc);;
> +}
> +
> +static void helper_failed(libxl__egc *egc, libxl__save_helper_state *shs,
> +                          int rc)
> +{
> +    STATE_AO_GC(shs->ao);
> +
> +    if (!shs->rc)
> +        shs->rc = rc;
> +
> +    libxl__ev_fd_deregister(gc, &shs->readable);
> +
> +    if (!libxl__ev_child_inuse(&shs->child)) {
> +        helper_done(egc, shs);
> +        return;
> +    }
> +
> +    int r = kill(shs->child.pid, SIGKILL);
> +    if (r) LOGE(WARN, "failed to kill save/restore helper [%lu]",
> +                (unsigned long)shs->child.pid);
> +}
> +
> +static void helper_stdout_readable(libxl__egc *egc, libxl__ev_fd *ev,
> +                                   int fd, short events, short revents)
> +{
> +    libxl__save_helper_state *shs = CONTAINER_OF(ev, *shs, readable);
> +    STATE_AO_GC(shs->ao);
> +    int rc, errnoval;
> +
> +    if (revents & POLLPRI) {
> +        LOG(ERROR, "%s signaled POLLERR", shs->stdout_what);

                          signalled ?

(it's not impossible that my spell checker is wrong, google seem to
think both are ok)

The if says POLLPRI but the log says POLLERR?

> +        rc = ERROR_FAIL;
> + out:
> +        /* this is here because otherwise we bypass the decl of msg[] */

I don't get this comment, why can't "out:" be in the normal place after
the non-error return?

> +        helper_failed(egc, shs, rc);
> +        return;
> +    }
> +
> +    uint16_t msglen;
> +    errnoval = libxl_read_exactly(CTX, fd, &msglen, sizeof(msglen),
> +                                  shs->stdout_what, "ipc msg header");
> +    if (errnoval) { rc = ERROR_FAIL; goto out; }
> +
> +    unsigned char msg[msglen];
> +    errnoval = libxl_read_exactly(CTX, fd, msg, msglen,
> +                                  shs->stdout_what, "ipc msg body");
> +    if (errnoval) { rc = ERROR_FAIL; goto out; }
> +
> +    shs->egc = egc;
> +    shs->recv_callback(msg, msglen, shs);
> +    shs->egc = 0;
> +    return;
> +}
> +
> +static void helper_exited(libxl__egc *egc, libxl__ev_child *ch,
> +                          pid_t pid, int status)
> +{
> +    libxl__save_helper_state *shs = CONTAINER_OF(ch, *shs, child);
> +    STATE_AO_GC(shs->ao);
> +
> +    /* Convenience aliases */
> +    const uint32_t domid = shs->domid;
> +
> +    const char *what =
> +        GCSPRINTF("domain %"PRIu32" save/restore helper", domid);
> +
> +    if (status) {
> +        libxl_report_child_exitstatus(CTX, XTL_ERROR, what, pid, status);
> +        shs->rc = ERROR_FAIL;
> +    }
> +
> +    if (shs->need_results) {
> +        if (!shs->rc)
> +            LOG(ERROR,"%s exited without providing results",what);
> +        shs->rc = ERROR_FAIL;
> +    }
> +
> +    if (!shs->completed) {
> +        if (!shs->rc)
> +            LOG(ERROR,"%s exited without signaling completion",what);

                                            signalling again, same caveat


> +        shs->rc = ERROR_FAIL;
> +    }
> +
> +    helper_done(egc, shs);
> +    return;
> +}
[...]
> +/*----- generic helpers for the autogenerated code -----*/
[...]
> diff --git a/tools/libxl/libxl_save_helper.c b/tools/libxl/libxl_save_helper.c
> new file mode 100644
> index 0000000..d29f1f7
> --- /dev/null
> +++ b/tools/libxl/libxl_save_helper.c
> @@ -0,0 +1,279 @@
> +/*
> + * Copyright (C) 2012      Citrix Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU Lesser General Public License as published
> + * by the Free Software Foundation; version 2.1 only. with the special
> + * exception on linking described in file LICENSE.

LGPL, or perhaps GPL since this is a helper?

(I suppose the same argument could be made for xl itself)

[...]
> +/*----- helper functions called by autogenerated stubs -----*/
> +
> +unsigned char * helper_allocbuf(int len, void *user)
> +{
> +    return xmalloc(len);
> +}
> +
> +static void transmit(const unsigned char *msg, int len, void *user)
> +{
> +    while (len) {
> +        int r = write(1, msg, len);
> +        if (r<0) { perror("write"); exit(-1); }
> +        assert(r >= 0);
> +        assert(r <= len);
> +        len -= r;
> +        msg += r;
> +    }
> +}
> +
> +void helper_transmitmsg(unsigned char *msg_freed, int len_in, void *user)
> +{
> +    assert(len_in < 64*1024);
> +    uint16_t len = len_in;
> +    transmit((const void*)&len, sizeof(len), user);
> +    transmit(msg_freed, len, user);
> +    free(msg_freed);
> +}
> +
> +int helper_getreply(void *user)
> +{
> +    int v;
> +    int r = read_exactly(0, &v, sizeof(v));
> +    if (r<=0) exit(-2);

You use -1 a lot but here you use -2, are we supposed to be able to
infer something from the specific error code?

> +    return v;
> +}
> +
> +/*----- other callbacks -----*/
> +
> +static int toolstack_save_fd;
> +static uint32_t toolstack_save_len;
[...]
> diff --git a/tools/libxl/libxl_save_msgs_gen.pl 
> b/tools/libxl/libxl_save_msgs_gen.pl
> new file mode 100755
> index 0000000..cd0837e
> --- /dev/null
> +++ b/tools/libxl/libxl_save_msgs_gen.pl
> @@ -0,0 +1,393 @@
> +#!/usr/bin/perl -w
> +
> +use warnings;
> +use strict;
> +use POSIX;
> +
> +our $debug = 0; # produce copius debugging output at run-time?

                             copious

(which is probably the most useful feedback you are going to get from me
on a Perl script ;-))

> +
> +our @msgs = (
> +    # flags:
> +    #   s  - applicable to save
> +    #   r  - applicable to restore
> +    #   c  - function pointer in callbacks struct rather than fixed function
> +    #   x  - function pointer is in struct {save,restore}_callbacks
> +    #         and its null-ness needs to be passed through to the helper's xc
> +    #   W  - needs a return value; callback is synchronous
> +    [  1, 'sr',     "log",                   [qw(uint32_t level
> +                                                 uint32_t errnoval
> +                                                 STRING context
> +                                                 STRING formatted)] ],
> +    [  2, 'sr',     "progress",              [qw(STRING context
> +                                                 STRING doing_what),
> +                                                'unsigned long', 'done',
> +                                                'unsigned long', 'total'] ],
> +    [  3, 'scxW',   "suspend", [] ],
> +    [  4, 'scxW',   "postcopy", [] ],
> +    [  5, 'scxW',   "checkpoint", [] ],
> +    [  6, 'scxW',   "switch_qemu_logdirty",  [qw(int domid
> +                                              unsigned enable)] ],
> +    #                toolstack_save          done entirely `by hand'
> +    [  7, 'rcxW',   "toolstack_restore",     [qw(uint32_t domid
> +                                                BLOCK tsdata)] ],
> +    [  8, 'r',      "restore_results",       ['unsigned long', 'store_mfn',
> +                                              'unsigned long', 'console_mfn',
> +                                              'unsigned long', 'genidad'] ],
> +    [  9, 'srW',    "complete",              [qw(int retval
> +                                                 int errnoval)] ],
> +);
> +
> +#----------------------------------------
> +
> +our %cbs;
> +our %func;
> +our %func_ah;
> +our @outfuncs;
> +our %out_decls;
> +our %out_body;
> +our %msgnum_used;
> +
> +die unless @ARGV==1;
> +die if $ARGV[0] =~ m/^-/;
> +
> +our ($intendedout) = @ARGV;
> +
> +$intendedout =~ m/([a-z]+)\.([ch])$/ or die;
> +my ($want_ah, $ch) = ($1, $2);
> +
> +my $declprefix = '';
> +
> +foreach my $ah (qw(callout helper)) {
> +    $out_body{$ah} .= <<END.($ah eq 'callout' ? <<END : <<END);
[...]
> +END
[...]
> +END
[...]
> +END

I think I can infer what this does, but wow ;-)

> +}
[...]



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.