
Re: [Xen-devel] [PATCH] libxl: trigger attach events for devices attached before xl devd startup



On Mon, Jul 11, 2016 at 10:31:17AM +0200, Roger Pau Monné wrote:
> On Sun, Jul 10, 2016 at 07:35:47PM +0200, Marek Marczykowski-Górecki wrote:
> > When this daemon is started after creating backend device, that device
> > will not be configured.
> > 
> > Racy situation:
> > 1. driver domain is started
> > 2. frontend domain is started (just after kicking driver domain off)
> > 3. device in frontend domain is connected to the backend (as specified
> >    in frontend domain configuration)
> > 4. xl devd is started in driver domain
> > 
> > End result is that backend device in driver domain is not configured
> > (like network interface is not enabled), so the device doesn't work.
> > 
> > Fix this by artificially triggering events for devices already present in
> > xenstore before xl devd is started. Do this only after xenstore watch is
> > already registered, and only for devices not already initialized (in
> > XenbusStateInitWait state).
> 
> Thanks!
> 
> > Cc: Ian Jackson <ian.jackson@xxxxxxxxxxxxx>
> > Cc: Wei Liu <wei.liu2@xxxxxxxxxx>
> > Signed-off-by: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>
> > ---
> >  tools/libxl/libxl.c | 40 ++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 40 insertions(+)
> > 
> > diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> > index 1c81239..99815a7 100644
> > --- a/tools/libxl/libxl.c
> > +++ b/tools/libxl/libxl.c
> > @@ -4743,8 +4743,16 @@ int libxl_device_events_handler(libxl_ctx *ctx,
> >      uint32_t domid;
> >      libxl__ddomain ddomain;
> >      char *be_path;
> > +    char **kinds = NULL, **domains = NULL, **devs = NULL;
> > +    const char *sstate;
> > +    char *state_path;
> > +    int state;
> > +    unsigned int nkinds, ndomains, ndevs;
> > +    int i, j, k;
> > +    xs_transaction_t t;
> >  
> >      ddomain.ao = ao;
> > +    FILLZERO(ddomain.watch);
> 
> Is this a different bugfix or stray change?

So that the watch can be cleanly deregistered on the error path, and so that
the deregistration does nothing if the watch was never registered at all.
Without the initialization, calling libxl__ev_xswatch_deregister on a watch
that was never registered isn't harmless.
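
To illustrate the point, here is a toy model of a handle whose deregister
call is a no-op when nothing was registered. It is only a sketch: toy_watch,
toy_watch_init, toy_watch_register and toy_watch_deregister are hypothetical
names, not libxl API, and the "zeroed flag means not registered" sentinel is
an assumption of the toy. Which value libxl__ev_xswatch really uses for "not
registered" should be checked against the libxl sources.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for a watch handle; NOT the libxl structure. */
struct toy_watch {
    bool registered;      /* toy sentinel: false means "never registered" */
    const char *path;
};

/* Counts real teardown work, so the effect of each call is observable. */
int toy_teardown_count = 0;

/* Put the handle into a known "not registered" state. */
void toy_watch_init(struct toy_watch *w)
{
    memset(w, 0, sizeof(*w));
}

void toy_watch_register(struct toy_watch *w, const char *path)
{
    w->path = path;
    w->registered = true;
}

/* Safe on any initialised handle, whether or not it was registered. */
void toy_watch_deregister(struct toy_watch *w)
{
    if (!w->registered)
        return;           /* nothing to undo */
    toy_teardown_count++;
    w->registered = false;
    w->path = NULL;
}
```

With the handle initialised up front, a shared error path can call the
deregister function unconditionally. Without the initialisation the flag
would hold stack garbage and the same call could tear down something that
was never set up.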

> >      LIBXL_SLIST_INIT(&ddomain.guests);
> >  
> >      rc = libxl__get_domid(gc, &domid);
> > @@ -4762,9 +4770,41 @@ int libxl_device_events_handler(libxl_ctx *ctx,
> >                                      be_path);
> >      if (rc) goto out;
> >  
> > +    rc = libxl__xs_transaction_start(gc, &t);
> > +    if (rc) goto out;
> 
> Why do you need to start a transaction here if you end up aborting it when 
> finished?

Mostly to ease error checking. Because the code below does a three-level
listing, I don't want to deal with races where some entry is removed
between those calls, at least not here. For example:

xs_directory('backend/vif') -> 3, 4, 5
xs_directory('backend/vif/3') -> 0, 1
xs_read('backend/vif/3/0/state') -> ...
xs_read('backend/vif/3/1/state') -> ...
toolstack removes backend/vif/4 here
xs_directory('backend/vif/4') ->  fail

Of course backend_watch_callback would fail anyway in such a case, which
is OK. But working on a snapshot of xenstore for the whole multi-level
listing avoids some corner cases in the listing itself.
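
The snapshot idea can be sketched with a toy in-memory store. This is not
xenstore or libxl code: toy_store, toy_add, toy_remove_prefix and
toy_count_prefix are hypothetical helpers, and copying the whole struct
merely stands in for the consistent view a transaction is assumed to give.

```c
#include <stdio.h>
#include <string.h>

#define MAX_PATHS 16
#define MAX_LEN   64

/* Toy stand-in for xenstore: a flat list of node paths. */
struct toy_store {
    char paths[MAX_PATHS][MAX_LEN];
    int count;
};

/* Add one path; silently ignored when the toy store is full. */
void toy_add(struct toy_store *s, const char *path)
{
    if (s->count < MAX_PATHS) {
        snprintf(s->paths[s->count], MAX_LEN, "%s", path);
        s->count++;
    }
}

/* Remove every path that starts with prefix (e.g. one domain's devices). */
void toy_remove_prefix(struct toy_store *s, const char *prefix)
{
    size_t n = strlen(prefix);
    int out = 0;

    for (int i = 0; i < s->count; i++) {
        if (strncmp(s->paths[i], prefix, n) != 0) {
            if (out != i)
                memcpy(s->paths[out], s->paths[i], MAX_LEN);
            out++;
        }
    }
    s->count = out;
}

/* Count the paths that start with prefix. */
int toy_count_prefix(const struct toy_store *s, const char *prefix)
{
    size_t n = strlen(prefix);
    int hits = 0;

    for (int i = 0; i < s->count; i++)
        if (strncmp(s->paths[i], prefix, n) == 0)
            hits++;
    return hits;
}
```

If a walker takes a copy (struct toy_store snap = live;) before listing and
the toolstack then calls toy_remove_prefix(&live, "backend/vif/4/"), the
copy still lists the entry under backend/vif/4/ while the live store no
longer does. So every level of the walk sees the same state, and the removal
only shows up later, when the callback looks at the live store.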

> > +    kinds = libxl__xs_directory(gc, t, be_path, &nkinds);
> > +    if (kinds) {
> > +        for (i = 0; i < nkinds; i++) {
> > +            domains = libxl__xs_directory(gc, t,
> > +                    GCSPRINTF("%s/%s", be_path, kinds[i]), &ndomains);
> > +            if (!domains)
> > +                continue;
> > +            for (j = 0; j < ndomains; j++) {
> > +                devs = libxl__xs_directory(gc, t,
> > +                        GCSPRINTF("%s/%s/%s", be_path, kinds[i], domains[j]), &ndevs);
> > +                if (!devs)
> > +                    continue;
> > +                for (k = 0; k < ndevs; k++) {
> > +                    state_path = GCSPRINTF("%s/%s/%s/%s/state",
> > +                            be_path, kinds[i], domains[j], devs[k]);
> > +                    rc = libxl__xs_read_checked(gc, t, state_path, &sstate);
> > +                    if (rc)
> > +                        continue;
> > +                    state = atoi(sstate);
> > +                    if (state == XenbusStateInitWait)
> > +                        backend_watch_callback(egc, &ddomain.watch,
> > +                                be_path, state_path);
> > +                }
> > +            }
> > +        }
> > +    }
> > +
> > +    libxl__xs_transaction_abort(gc, &t);
> > +
> >      return AO_INPROGRESS;
> >  
> >  out:
> > +    libxl__ev_xswatch_deregister(gc, &ddomain.watch);
> 
> This seems to be part of a different bugfix also.

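No. Previously the out: path couldn't be reached once the xswatch had been
registered successfully. With this patch it can be (if starting the
transaction fails), so the watch has to be deregistered there.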

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

