[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 05/19] libxl: domain save/restore: run in a separate process



Ian Campbell writes ("Re: [Xen-devel] [PATCH 05/19] libxl: domain save/restore: 
run in a separate process"):
> On Fri, 2012-06-08 at 18:34 +0100, Ian Jackson wrote:
> > libxenctrl expects to be able to simply run the save or restore
> > operation synchronously.  This won't work well in a process which is
> > trying to handle multiple domains.
...
> > diff --git a/.hgignore b/.hgignore
> > index 27d8f79..05304ea 100644

> >  ^tools/libxl/tmp\..*$
> > +^tools/libxl/.*\.new$
> 
> Our naming scheme for these sorts of temporary files is  rather in
> consistent (including an existing use of .new), oh well...

Hmm, yes.  I took my cue from libxl_list.h, which is immediately before
_libxl_save_msgs_helper.[ch] in the Makefile.

> > +libxl-save-helper: $(SAVE_HELPER_OBJS) libxenlight.so
> > +       $(CC) $(LDFLAGS) -o $@ $(SAVE_HELPER_OBJS) $(LDLIBS_libxenctrl) 
> > $(LDLIBS_libxenguest) $(APPEND_LDFLAGS)
> 
> The changelog says that libxl-save-helper doesn't link libxenlight but
> it is declared as a dependency here and CFLAGS_libxenlight is included
> in SAVE_HELPER_OBJS' CFLAGS above.

This is a mistake in the Makefile, which I have fixed.

> >  testidl: testidl.o libxlutil.so libxenlight.so
> >         $(CC) $(LDFLAGS) -o $@ testidl.o libxlutil.so $(LDLIBS_libxenlight) 
> > $(LDLIBS_libxenctrl) $(APPEND_LDFLAGS)
> [...]
> > @@ -170,6 +184,7 @@ install: all
> >         $(INSTALL_DIR) $(DESTDIR)$(BASH_COMPLETION_DIR)
> >         $(INSTALL_DIR) $(DESTDIR)$(XEN_RUN_DIR)
> >         $(INSTALL_PROG) xl $(DESTDIR)$(SBINDIR)
> > +       $(INSTALL_PROG) libxl-save-helper $(DESTDIR)$(PRIVATE_BINDIR)
> 
> This needs an INSTALL_DIR for $(DESTDIR)$(PRIVATE_BINDIR) to support
> "make -C tools/libxl DESTDIR=xxx install" else:
>         install: cannot create regular file `/tmp/tmplKa0Y7/usr/lib/xen/bin': 
> No such file or directory

OK.  I guess I never do that and it's a bit of a mystery why one
would without having done a more general install before, but it's
clearly a correct change.

> > diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
> > index b48452c..fea0c94 100644
...
> >  int libxl__toolstack_restore(uint32_t domid, const uint8_t *buf,
> > -        uint32_t size, void *data)
...
> > +    libxl_ctx *ctx = CTX;
> 
> Any reason you didn't s/ctx/CTX/ in one of your preparatory patches?

I didn't want to do that unless it was necessary.  There was already
enough else going on.  I don't think having a ctx rather than using
CTX is wrong, although it is slightly less in the modern idiom.

> >  void libxl__xc_domain_save(libxl__egc *egc, libxl__domain_suspend_state 
> > *dss,
> >                             unsigned long vm_generationid_addr)
> >  {
> >      STATE_AO_GC(dss->ao);
> > -    int r;
> > +    int r, rc;
> > +    uint32_t toolstack_data_len;
> > +
> > +    /* Resources we need to free */
> > +    uint8_t *toolstack_data_buf = 0;
> > +
> > +    unsigned cbflags = libxl__srm_callout_enumcallbacks_save
> > +        (&dss->shs.callbacks.save.a);
> > +
> > +    r = libxl__toolstack_save(dss->domid, &toolstack_data_buf,
> > +                              &toolstack_data_len, dss);
> 
> This seems to be called twice? I'm looking at the source after the full
> series is applied and I see
>         dss->shs.callbacks.save.toolstack_save = libxl__toolstack_save;
> in libxl__domain_suspend as well as this call here.

But in fact dss->shs.callbacks.save.toolstack_save is never used
anywhere in that version.  The bug is that libxl__xc_domain_save
should call the callback function (if it is non-null), not call
libxl__toolstack_save directly.

> The callback one seems to be otherwise unused.

Exactly.

I have fixed this:

  * libxl__xc_domain_save uses supplied callback function pointer,
    rather than calling libxl__toolstack_save directly;
    toolstack data save callback is only supplied to libxc if
    in-libxl caller supplied a callback.

> > +static void run_helper(libxl__egc *egc, libxl__save_helper_state *shs,
> > +                       const char *mode_arg, int data_fd,
> > +                       const unsigned long *argnums, int num_argnums)
> > +{
...
> > +    libxl__carefd_begin();
> > +    for (i=0; i<2; i++) {
> > +        int fds[2];
> > +        if (libxl_pipe(CTX,fds)) { rc = ERROR_FAIL; goto out; }
> > +        childs_pipes[i] = libxl__carefd_record(CTX, fds[i]);
> > +        shs->pipes[i] = libxl__carefd_record(CTX, fds[!i]);
> 
> Using !i here is clever, but I had to deploy a pen to be sure the
> assignments were all correct.
> 
> Open coding this simple loop would also give you the opportunity the
> have comments like "/* This is the helper processes's stdout */" etc in
> the appropriate place to aid in comprehension when checking the correct
> end of each pipe goes in the right place.

Perhaps the right answer is to use a better variable name than `i'.
I wrote it out longhand and it looked like this:

    libxl__carefd_begin();
    int fds[2];

    /* child's stdin */
    if (libxl_pipe(CTX,fds)) { rc = ERROR_FAIL; goto out; }
    childs_pipes[0] = libxl__carefd_record(CTX, fds[0]);
    shs->pipes[0] = libxl__carefd_record(CTX, fds[1]);

    /* child's stdout */
    if (libxl_pipe(CTX,fds)) { rc = ERROR_FAIL; goto out; }
    childs_pipes[1] = libxl__carefd_record(CTX, fds[1]);
    shs->pipes[1] = libxl__carefd_record(CTX, fds[0]);

    libxl__carefd_unlock();

which is a great way to obscure the subtle differences between the two
stanzas.  How about:

    libxl__carefd_begin();
    int childfd;
    for (childfd=0; childfd<2; childfd++) {
        /* Setting up the pipe for the child's fd childfd */
        int fds[2];
        if (libxl_pipe(CTX,fds)) { rc = ERROR_FAIL; goto out; }
        int childs_end = childfd==0 ? 0 /*read*/  : 1 /*write*/;
        int our_end    = childfd==0 ? 1 /*write*/ : 0 /*read*/;
        childs_pipes[childfd] = libxl__carefd_record(CTX, fds[childs_end]);
        shs->pipes[childfd] =   libxl__carefd_record(CTX, fds[our_end]);
    }
    libxl__carefd_unlock();

?

> > +static void helper_stdout_readable(libxl__egc *egc, libxl__ev_fd *ev,
> > +                                   int fd, short events, short revents)
> > +{
> > +    libxl__save_helper_state *shs = CONTAINER_OF(ev, *shs, readable);
> > +    STATE_AO_GC(shs->ao);
> > +    int rc, errnoval;
> > +
> > +    if (revents & POLLPRI) {
> > +        LOG(ERROR, "%s signaled POLLERR", shs->stdout_what);
> 
>                           signalled ?
> 
> (it's not impossible that my spell checker is wrong, google seem to
> think both are ok)

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/kill.html

uses "signaled".

> The if says POLLPRI but the log says POLLERR?

This should check and log both.

> > +        rc = ERROR_FAIL;
> > + out:
> > +        /* this is here because otherwise we bypass the decl of msg[] */
> 
> I don't get this comment, why can't "out:" be in the normal place after
> the non-error return?

Because it is not legal to "goto" within the scope of a variable whose
size is not a constant.  The alternative would be to introduce a block
scope starting before `unsigned char msg[msglen]' and ending after
`return'.

Arguably msg[msglen] is asking too much (up to 64K) of the caller's
stack.  Should I change it ?

> > +    if (!shs->completed) {
> > +        if (!shs->rc)
> > +            LOG(ERROR,"%s exited without signaling completion",what);
> 
>                                             signalling again, same caveat

I'll go with what SuS says.

> > --- /dev/null
> > +++ b/tools/libxl/libxl_save_helper.c
> > @@ -0,0 +1,279 @@
> > +/*
> > + * Copyright (C) 2012      Citrix Ltd.
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU Lesser General Public License as published
> > + * by the Free Software Foundation; version 2.1 only. with the special
> > + * exception on linking described in file LICENSE.
> 
> LGPL, or perhaps GPL since this is a helper?
> 
> (I suppose the same argument could be made for xl itself)

I can see no reason to deviate from the licence of the rest of libxl.
If nothing else, code may move between the helper and libxl proper.

> > +int helper_getreply(void *user)
> > +{
> > +    int v;
> > +    int r = read_exactly(0, &v, sizeof(v));
> > +    if (r<=0) exit(-2);
> 
> You use -1 a lot but here you use -2, are we supposed to be able to
> infer something from the specific error code?

Not really.  I used -1 for general system call failures.  This
particular situation might include a protocol error by the parent,
though, so I thought I'd distinguish it.  That way if you get a
message saying the helper exited with a nonzero exit status foobar you
get slightly more information.

I could make this more formal and be more consistent with these exit
statuses, or alternatively I could abolish it.  I don't think it's
critical - nothing turns on this exit status.

> > diff --git a/tools/libxl/libxl_save_msgs_gen.pl 
> > b/tools/libxl/libxl_save_msgs_gen.pl
> > new file mode 100755
> > index 0000000..cd0837e
> > --- /dev/null
> > +++ b/tools/libxl/libxl_save_msgs_gen.pl
> > @@ -0,0 +1,393 @@
> > +#!/usr/bin/perl -w
> > +
> > +use warnings;
> > +use strict;
> > +use POSIX;
> > +
> > +our $debug = 0; # produce copius debugging output at run-time?
> 
>                              copious
> 
> (which is probably the most useful feedback you are going to get from me
> on a Perl script ;-))

Fixed.

> > +foreach my $ah (qw(callout helper)) {
> > +    $out_body{$ah} .= <<END.($ah eq 'callout' ? <<END : <<END);
> [...]
> > +END
> [...]
> > +END
> [...]
> > +END
> 
> I think I can infer what this does, but wow ;-)

Following a suggestion from Mark Wooding I have changed this to:

          <<END_BOTH.($ah eq 'callout' ? <<END_CALLOUT : <<END_HELPER);
  #include "libxl_osdeps.h"

  #include <assert.h>
  #include <string.h>
  #include <stdint.h>
  #include <limits.h>
  END_BOTH

  #include "libxl_internal.h"

  END_CALLOUT

  #include "_libxl_save_msgs_${ah}.h"
  #include <xenctrl.h>
  #include <xenguest.h>

  END_HELPER

which I think is much clearer.

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.