
Re: [Xen-devel] Driver domains and hotplug scripts, redux

On Wed, 2012-01-11 at 11:50 +0000, Roger Pau Monné wrote:
> Hello,
> This comes from my experience with Xen and hotplug scripts, and it
> might be wrong, since I wasn't able to find any document explaining
> exactly how hotplug execution works and who does what. I'm gonna try
> to list the sequence of events that happens when a device is added (I
> really don't want to keep on with the discussion if this is a protocol
> or not):
> 1. Toolstack writes: /local/domain/0/backend/<vbd or vif>/... with
> "state = 1".
> 2. Kernel acks xenstore backend device creation, creates the device
> and sets backend "state = 2".
> 3. xenbackendd notices backend device with "state == 2" and launches
> hotplug script.

In Linux, I think state == 2 corresponds to the generation of a
uevent which triggers udev to run the hotplug script. I'm not 100% sure
that all devices do this at the same point though.

> 4. Hotplug script executes necessary actions and sets backend
> "hotplug-status = connected".
> 5. Kernel notices "hotplug-status == connected", plugs the device, and
> sets xenstore backend device "state = 4".

I think 4+5 are correct for Linux netback, but blkback actually
waits for phys-dev (or whatever its real name is) to be written rather
than the hotplug-status node.
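
To make the add sequence concrete, here is a rough sketch of steps 1-5 as a
toy state machine. It is only an illustration under assumptions: a plain dict
stands in for xenstore, all function names and the example path are made up,
and the ready_key parameter models the netback/blkback difference described
above (the "phys-dev" spelling is the placeholder used in this mail, not
necessarily the real node name).

```python
from enum import IntEnum


class XenbusState(IntEnum):
    """Backend "state" values mentioned in this thread."""
    INITIALISING = 1
    INIT_WAIT = 2
    CONNECTED = 4
    CLOSING = 5
    CLOSED = 6


def toolstack_create(store, path):
    # Step 1: the toolstack creates the backend entry with state = 1.
    store[f"{path}/state"] = XenbusState.INITIALISING


def kernel_ack(store, path):
    # Step 2: the kernel acks the new backend and moves it to state = 2.
    if store.get(f"{path}/state") != XenbusState.INITIALISING:
        return False
    store[f"{path}/state"] = XenbusState.INIT_WAIT
    return True


def run_hotplug_script(store, path):
    # Steps 3-4: state == 2 triggers the script (via xenbackendd or a uevent),
    # which reports completion through hotplug-status.
    if store.get(f"{path}/state") != XenbusState.INIT_WAIT:
        return False
    store[f"{path}/hotplug-status"] = "connected"
    return True


def kernel_connect(store, path, ready_key="hotplug-status"):
    # Step 5: the kernel connects once the node it waits for is present.
    # For hotplug-status the value must be "connected"; for any other key
    # (e.g. the blkback case) being written at all is treated as enough.
    value = store.get(f"{path}/{ready_key}")
    if store.get(f"{path}/state") != XenbusState.INIT_WAIT or value is None:
        return False
    if ready_key == "hotplug-status" and value != "connected":
        return False
    store[f"{path}/state"] = XenbusState.CONNECTED
    return True


def add_device(store, path, ready_key="hotplug-status"):
    # Run the whole sequence in order; True means the backend reached state 4.
    toolstack_create(store, path)
    return (kernel_ack(store, path)
            and run_hotplug_script(store, path)
            and kernel_connect(store, path, ready_key))
```

With ready_key="phys-dev", add_device() stops short of state 4 unless
something else has written that node first, which is the blkback behaviour
described above.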

> This is true on NetBSD, because there aren't any userspace hotplug
> devices, someone should probably add the missing bits if the device is
> implemented in userspace (I'm not really sure of what happens inside
> the kernel in #2 and #5, especially when using blktap or qdisk).

Nothing happens in the kernel for qdisk. It is a separate backend path
which the kernel doesn't watch or have a driver for.

blktap1 behaves a lot like blkback, I think.

blktap2 doesn't use xenbus IIRC, rather it is created via userspace
tools/libraries. There _might_ be some hotplug script interaction which
causes the phys-dev node to get written to the associated blkback device
but I think this is not the case and the toolstack just writes the
phys-dev because it knows what it is from when it created it.

> Regarding device shutdown/destroy:

We need to consider 3 cases:
      * guest initiated graceful shutdown
      * toolstack initiated graceful shutdown
      * toolstack initiated forceful destroy.

> 1. Guest sets frontend state to 6 (closed)
> 2. Kernel unplugs the device and sets backend "state = 6".
> 3. xenbackendd notices device with "state == 6", and performs the
> necessary cleanup.
> 3. Toolstack notices device with "state == 6" and removes xenstore
> backend entries.

At least some backend/frontends make use of state 5 as part of this,
probably at #1 or #2.

The ordering of #1 and #2 probably depends on whether the frontend or
the backend initiates things.

The forceful destroy case is different, it is effectively:
1. rm backend dir in xenstore.

Somewhere in both of these a Linux backend will generate a hotplug event
which will cause a script to run, although in some cases the script
can't do much because the backend dir is already gone...

> Notice that I've used two #3, that's where the race condition happens,
> because there's no synchronization between toolstack and
> hotplug/xenbackendd to know when hotplug scripts have been executed
> (however we should be able to synchronize this watching
> "hotplug-status" instead of "state", and waiting for it to change to
> "disconnected").
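
A minimal sketch of that synchronization, assuming a dict in place of
xenstore and hypothetical function names: the toolstack only removes the
backend entries once the backend is closed and the hotplug script has
reported "disconnected", so the script never finds its directory gone.

```python
CLOSED = 6  # backend "state" value for closed, as used in this thread


def toolstack_may_remove(store, path):
    # Wait on hotplug-status as well as state, so cleanup cannot race
    # with a hotplug script that is still running.
    return (store.get(f"{path}/state") == CLOSED
            and store.get(f"{path}/hotplug-status") == "disconnected")


def toolstack_cleanup(store, path):
    # Remove every entry below the backend path, but only when it is safe.
    if not toolstack_may_remove(store, path):
        return False
    for key in [k for k in store if k.startswith(path + "/")]:
        del store[key]
    return True
```

A real toolstack would block on a xenstore watch (with a timeout) rather than
poll a predicate like this; the sketch only shows the condition being waited
for.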
> Now, we have to decide how to fix the shutdown/destroy race and how to
> implement this outside of the Dom0. I'm not really sure if it's a good
> idea to try so hard to keep this flow intact, I think it's best to try
> to define a flow that solves our current problems, regardless of how
> things are now, and then try to map both flows to see what should be
> changed and how.
> Since the device will be plugged from a Domain different than Dom0,
> the toolstack doesn't really (and probably shouldn't) know anything
> about which backend type will be used (phy, blktap, qdisk...). Having
> that in mind, I don't know how we can write
> /local/domain/<driverdom_id>/backend/... from Dom0, instead we should
> create something like:
> /hotplug/domain/<driverdom_id>/<vbd or vif>/<domu_id>/<device_id>/params
> /hotplug/domain/<driverdom_id>/<vbd or vif>/<domu_id>/<device_id>/script
> /hotplug/domain/<driverdom_id>/<vbd or vif>/<domu_id>/<device_id>/state
> [These seem like the minimum necessary parameters, but probably there
> are others, so add what you feel necessary]
> With that the driver domain should be able to create
> /local/domain/<driverdomain_id>/backend/... and the frontend also.
> I'm not sure if we should control the execution of hotplug scripts
> from Dom0, or instead let the driver domain decide when it's best to
> execute each script. This adds /hotplug to xenstore, but the
> plug/unplug sequence could be the same as the one we currently have,
> the only change is that each driver domain is in charge of writing
> its own xenstore backend/frontend entries to trigger the plug
> sequence.
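
The proposed /hotplug layout could look roughly like this from both sides.
This is only a sketch: the dict stands in for xenstore, the function names
are invented, and the initial "state" value is a pure assumption since the
proposal does not define what that node holds.

```python
def hotplug_base(driverdom_id, kind, domu_id, device_id):
    # Build /hotplug/domain/<driverdom_id>/<vbd or vif>/<domu_id>/<device_id>.
    if kind not in ("vbd", "vif"):
        raise ValueError(f"unsupported device kind: {kind!r}")
    return f"/hotplug/domain/{driverdom_id}/{kind}/{domu_id}/{device_id}"


def request_device(store, driverdom_id, kind, domu_id, device_id,
                   params, script, state="requested"):
    # Toolstack side: describe the device without naming a backend type.
    # The default state string is hypothetical.
    base = hotplug_base(driverdom_id, kind, domu_id, device_id)
    store[f"{base}/params"] = params
    store[f"{base}/script"] = script
    store[f"{base}/state"] = state
    return base


def pending_for(store, driverdom_id):
    # Driver domain side: list the device requests addressed to this domain.
    prefix = f"/hotplug/domain/{driverdom_id}/"
    suffix = "/state"
    return sorted(key[:-len(suffix)] for key in store
                  if key.startswith(prefix) and key.endswith(suffix))
```

The driver domain would then pick the backend type itself and write its own
backend/frontend entries for each path returned by pending_for().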
> Hope that helps, Roger.
> (xen-devel mailing list was removed at some point during the
> conversation, so I'm adding it again)
