[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 05/19] libxl: domain save/restore: run in a separate process



> > >  testidl: testidl.o libxlutil.so libxenlight.so
> > >         $(CC) $(LDFLAGS) -o $@ testidl.o libxlutil.so 
> > > $(LDLIBS_libxenlight) $(LDLIBS_libxenctrl) $(APPEND_LDFLAGS)
> > [...]
> > > @@ -170,6 +184,7 @@ install: all
> > >         $(INSTALL_DIR) $(DESTDIR)$(BASH_COMPLETION_DIR)
> > >         $(INSTALL_DIR) $(DESTDIR)$(XEN_RUN_DIR)
> > >         $(INSTALL_PROG) xl $(DESTDIR)$(SBINDIR)
> > > +       $(INSTALL_PROG) libxl-save-helper $(DESTDIR)$(PRIVATE_BINDIR)
> > 
> > This needs an INSTALL_DIR for $(DESTDIR)$(PRIVATE_BINDIR) to support
> > "make -C tools/libxl DESTDIR=xxx install" else:
> >         install: cannot create regular file 
> > `/tmp/tmplKa0Y7/usr/lib/xen/bin': No such file or directory
> 
> OK.  I guess I never do that and it's a bit of a mystery why one
> would without having done a more general install before, but it's
> clearly a correct change.

I build and install on a different machine, which means I install into
DESTDIR=`mkdtemp ...` and tar, copy etc. I do one full build to seed
everything and then incrementally just the dir I'm working in, so this
hits me because on the incremental ones the fresh DESTDIR doesn't have
the directory yet.

> 
> > > diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
> > > index b48452c..fea0c94 100644
> ...
> > >  int libxl__toolstack_restore(uint32_t domid, const uint8_t *buf,
> > > -        uint32_t size, void *data)
> ...
> > > +    libxl_ctx *ctx = CTX;
> > 
> > Any reason you didn't s/ctx/CTX/ in one of your preparatory patches?
> 
> I didn't want to do that unless it was necessary.  There was already
> enough else going on.  I don't think having a ctx rather than using
> CTX is wrong, although it is slightly less in the modern idiom.

Fair enough.

> 
> > >  void libxl__xc_domain_save(libxl__egc *egc, libxl__domain_suspend_state 
> > > *dss,
> > >                             unsigned long vm_generationid_addr)
> > >  {
> > >      STATE_AO_GC(dss->ao);
> > > -    int r;
> > > +    int r, rc;
> > > +    uint32_t toolstack_data_len;
> > > +
> > > +    /* Resources we need to free */
> > > +    uint8_t *toolstack_data_buf = 0;
> > > +
> > > +    unsigned cbflags = libxl__srm_callout_enumcallbacks_save
> > > +        (&dss->shs.callbacks.save.a);
> > > +
> > > +    r = libxl__toolstack_save(dss->domid, &toolstack_data_buf,
> > > +                              &toolstack_data_len, dss);
> > 
> > This seems to be called twice? I'm looking at the source after the full
> > series is applied and I see
> >         dss->shs.callbacks.save.toolstack_save = libxl__toolstack_save;
> > in libxl__domain_suspend as well as this call here.
> 
> But in fact dss->shs.callbacks.save.toolstack_save is never used
> anywhere in that version.  The bug is that libxl__xc_domain_save
> should call the callback function (if it is non-null), not call
> libxl__toolstack_save directly.
> 
> > The callback one seems to be otherwise unused.
> 
> Exactly.
> 
> I have fixed this:
> 
>   * libxl__xc_domain_save uses supplied callback function pointer,

                                ^ the ?
>     rather than calling libxl__toolstack_save directly;
>     toolstack data save callback is only supplied to libxc if
>     in-libxl caller supplied a callback.

     ^ "the" or "an"?

Otherwise looks good.

> > > +static void run_helper(libxl__egc *egc, libxl__save_helper_state *shs,
> > > +                       const char *mode_arg, int data_fd,
> > > +                       const unsigned long *argnums, int num_argnums)
> > > +{
> ...
> > > +    libxl__carefd_begin();
> > > +    for (i=0; i<2; i++) {
> > > +        int fds[2];
> > > +        if (libxl_pipe(CTX,fds)) { rc = ERROR_FAIL; goto out; }
> > > +        childs_pipes[i] = libxl__carefd_record(CTX, fds[i]);
> > > +        shs->pipes[i] = libxl__carefd_record(CTX, fds[!i]);
> > 
> > Using !i here is clever, but I had to deploy a pen to be sure the
> > assignments were all correct.
> > 
> > Open coding this simple loop would also give you the opportunity the
> > have comments like "/* This is the helper processes's stdout */" etc in
> > the appropriate place to aid in comprehension when checking the correct
> > end of each pipe goes in the right place.
> 
> Perhaps the right answer is to use a better variable name than `i'.
> I wrote it out longhand and it looked like this:

[...]
> which is a great way to obscure the subtle differences between the two
> stanzas.  How about:
> 
>     libxl__carefd_begin();
>     int childfd;
>     for (childfd=0; childfd<2; childfd++) {
>         /* Setting up the pipe for the child's fd childfd */
>         int fds[2];
>         if (libxl_pipe(CTX,fds)) { rc = ERROR_FAIL; goto out; }
>         int childs_end = childfd==0 ? 0 /*read*/  : 1 /*write*/;
>         int our_end    = childfd==0 ? 1 /*write*/ : 0 /*read*/;
>         childs_pipes[childfd] = libxl__carefd_record(CTX, fds[childs_end]);
>         shs->pipes[childfd] =   libxl__carefd_record(CTX, fds[our_end]);
>     }
>     libxl__carefd_unlock();
> 
> ?

Yes renaming that variable, adding those inline /*read*/ etc comments
and the local variable naming has really helped, thanks

> > > +        rc = ERROR_FAIL;
> > > + out:
> > > +        /* this is here because otherwise we bypass the decl of msg[] */
> > 
> > I don't get this comment, why can't "out:" be in the normal place after
> > the non-error return?
> 
> Because it is not legal to "goto" within the scope of a variable whose
> size is not a constant.

Ah OK, I didn't realise this, another interesting C quirk!

>   The alternative would be to introduce a block
> scope starting before `unsigned char msg[msglen]' and ending after
> `return'.
> 
> Arguably msg[msglen] is asking too much (up to 64K) of the caller's
> stack.  Should I change it ?

I think it's OK in a userspace process.

Although one thing to watch out for would be the stack guard page which
modern Linux has -- this can cause issues with large stack allocations
since you trigger the guard page instead of growing the stack like you'd
expect -- I suppose this is really a bug in your compiler and/or libc
but it does seem common enough to avoid for now...

> > > --- /dev/null
> > > +++ b/tools/libxl/libxl_save_helper.c
> > > @@ -0,0 +1,279 @@
> > > +/*
> > > + * Copyright (C) 2012      Citrix Ltd.
> > > + *
> > > + * This program is free software; you can redistribute it and/or modify
> > > + * it under the terms of the GNU Lesser General Public License as 
> > > published
> > > + * by the Free Software Foundation; version 2.1 only. with the special
> > > + * exception on linking described in file LICENSE.
> > 
> > LGPL, or perhaps GPL since this is a helper?
> > 
> > (I suppose the same argument could be made for xl itself)
> 
> I can see no reason to deviate from the licence of the rest of libxl.
> If nothing else, code may move between the helper and libxl proper.

That's a very good point.

> 
> > > +int helper_getreply(void *user)
> > > +{
> > > +    int v;
> > > +    int r = read_exactly(0, &v, sizeof(v));
> > > +    if (r<=0) exit(-2);
> > 
> > You use -1 a lot but here you use -2, are we supposed to be able to
> > infer something from the specific error code?
> 
> Not really.  I used -1 for general system call failures.  This
> particular situation might include a protocol error by the parent,
> though, so I thought I'd distinguish it.  That way if you get a
> message saying the helper exited with a nonzero exit status foobar you
> get slightly more information.
> 
> I could make this more formal and be more consistent with these exit
> statuses, or alternatively I could abolish it.  I don't think it's
> critical - nothing turns on this exit status.

Lets just leave it then.

> > > +foreach my $ah (qw(callout helper)) {
> > > +    $out_body{$ah} .= <<END.($ah eq 'callout' ? <<END : <<END);
> > [...]
> > > +END
> > [...]
> > > +END
> > [...]
> > > +END
> > 
> > I think I can infer what this does, but wow ;-)
> 
> Following a suggestion from Mark Wooding I have changed this to:
> 
>           <<END_BOTH.($ah eq 'callout' ? <<END_CALLOUT : <<END_HELPER);
>   #include "libxl_osdeps.h"
> 
>   #include <assert.h>
>   #include <string.h>
>   #include <stdint.h>
>   #include <limits.h>
>   END_BOTH
> 
>   #include "libxl_internal.h"
> 
>   END_CALLOUT
> 
>   #include "_libxl_save_msgs_${ah}.h"
>   #include <xenctrl.h>
>   #include <xenguest.h>
> 
>   END_HELPER
> 
> which I think is much clearer.

I agree with Mark, thanks!

> 
> Ian.



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.