[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v1 12/12] libxl: add device backend listener in order to launch backends



On 30/10/13 18:00, Ian Jackson wrote:
> Roger Pau Monne writes ("[PATCH v1 12/12] libxl: add device backend listener 
> in order to launch backends"):
>> Add the necessary logic in libxl to allow it to act as a listener for
>> launching backends in a driver domain, replacing udev (like we already
>> do on Dom0). This new functionality is acomplished by watching the
>> domain backend path (/local/domain/<domid>/backend) and reacting to
>> device creation/destruction.
>>
>> The way to launch this listener daemon is from xl, using the newly
>> introduced "devd" command. The command will daemonize by default,
>> using "xldevd.log" as it's logfile. Optionally the user can force the
>> execution of the listener in the foreground by passing the "-F"
>> option to the devd command.
>>
>> Current backends handled by this daemon include Qdisk, vbd and vif
>> device types.
> ...
>> I'm quite sure memory management is not done correctly, after each
>> call to backend_watch_callback the garbage collector should free all
>> the references, but I think this is not happening, and I would
>> like someone with experience on libxl memory management (IanJ) to
>> comment on possible ways to deal with that.
> 
> We discussed this on IRC:
> 
> 11:57 <Diziet> You seem to be using the ao gc everywhere.  I think if you're 
>                going to make a never-ending ao you need to do something more 
>                complicated.  The ao gc's allocations aren't freed until the 
> ao 
>                completes, which of course will never happen here.
> 11:58 <Diziet> I would create a fresh gc out of whole cloth and free it 
>                explicitly when you have finished processing the event.
> 11:59 <Diziet> Well when I say whole cloth, I mean using LIBXL_INIT_GC or 
>                something.
> 11:59 <Diziet> I think you may need to make a kind of fake sub-ao.
> 12:00 <Diziet> For the benefit of libxl_blah_device_blah(ao) etc.
> 12:00 <Diziet> Does this make any kind of sense ?  If not I can sketch it out 
>                or something.
> 12:01 <royger> Diziet: can I actually have two different GC? I mean, ideally 
> I 
>                would like to allocate a GC at the start of each iteration, 
> and 
>                free it when we have finished processing the loop
> 12:12 <Diziet> Yes, you can do that.
> 12:12 <Diziet> Of course it'll get a bit more complex because there will be a 
>                function (your watch event function) where you have two gcs 
> and 
>                have to use the right one each time.
> 12:13 <Diziet> Thinking about it I think we should have something like    
>                libxl__nested_ao_create   and   libxl_nested_ao_free   which 
>                take an existing ao* and give you a new ao* with its own gc.
> 
> If you want me to write libxl__nested_ao_create for you I could do
> that.

So if I got it right, this new libxl__nested_ao_create will return a new
ao (with a new gc), that I could use in conjunction with the
long-running ao that I use in the main xs_watch loop, right?

That sounds like a good solution to my problem, I wouldn't mind if you
write that :)

I'm wondering if there are also other memory problems, even when using
this approach, for example I register a xswatch callback, and the
callback gets called with a watch_path and an event_path arguments, does
the internal libxl event handler machinery reuse those (or allocate and
free them after each loop)?

>> +    LIBXL_SLIST_FOREACH(ddev, &dguest->devices, next) {
>> +        if (memcmp(ddev->dev, dev, sizeof(*dev)) == 0)
>> +            return ddev;
> 
> I'm afraid that you can't memcmp a struct like this.  structs are
> allowed to have padding which may contain junk.

Yes, will replace this in next version.

>> +static void device_complete(libxl__egc *egc, libxl__ao_device *aodev)
>> +{
>> +    STATE_AO_GC(aodev->ao);
>> +
>> +    if (aodev->action == LIBXL__DEVICE_ACTION_REMOVE)
>> +        free(aodev->dev);
>> +
>> +    LOG(DEBUG, "device %s %s %s",
>> +               libxl__device_backend_path(gc, aodev->dev),
>> +               libxl__device_action_to_string(aodev->action),
>> +               aodev->rc ? "failed" : "succeed");
> 
> AFAICT there is nothing here reporting success or failure to the
> initiator in the toolstack domain.  So you'll say "xl block-attach"
> and it will appear to work but in fact there's an error in a logfile
> in the driver domain.
> 
> At the very least this deserves something in the documentation.

Ack.

> 
>> +    case LIBXL__DEVICE_KIND_VBD:
>> +    case LIBXL__DEVICE_KIND_VIF:
>> +        if (dev->backend_kind == LIBXL__DEVICE_KIND_VBD) dguest->num_vbds--;
>> +        if (dev->backend_kind == LIBXL__DEVICE_KIND_VIF) dguest->num_vifs--;
> 
> Is it really safe to decrement these already ?  What if something else
> comes along in the meantime and makes num_devs 0 (below) and removes
> everything while this operation is still running and liable to be
> reentered on completion ?

That's the point of decrementing it here, so that we get to 0 (if this
is the last device), and remove the libxl__ddomain_guest and
libxl__ddomain_device. Then, when the remove AO finishes, the AO
callback will take care of removing the associated libxl__device.

I thought backend_watch_callback could not be called concurrently, but
maybe that's not true? (and if that's the case ignore everything above
because it's completely wrong)

> 
>>  /*
>> + * Function that drives the main loop that checks for device actions and
>> + * launches the callbacks passed by the user.
>> + */
> 
> I think this comment is confusing.  I would say:
> 
>    /*
>     * Turns the current process into a backend device service daemon
>     * for a driver domain.
>     *
>     * From a libxl API point of view, this starts a long-running
>     * operation.  That operation consists of "being a driver domain"
>     * and never completes.
>     */
> 
> or something.  I might rename it to have "driver domain" or "backend"
> in it somewhere.

Ack, will try to come up with a more "representative" name.


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.