Re: [Xen-devel] Second regression due to libxl: Remove linux udev rules (2ba368d13893402b2f1fb3c283ddcc714659dd9b)
On Wed, 2015-07-29 at 11:45 -0400, Konrad Rzeszutek Wilk wrote:
> relying on the (stale) 4.5 rules file having the UDEV_CALL=1 in them.
>
> I don't exactly understand how the hotplug scripts are invoked via 'xl'.

They are called when the backend gets to (or passes through)
XenbusStateInitWait (attach) or XenbusStateClosed (detach).

> With udev it was pretty clear and easy to me.

It was also, unfortunately, racy. In particular on tear down there was
no interlock between the scripts (executed asynchronously by udev) and
the toolstack. Some backends have an interlock between the backend and
the script, but that's not the same thing, nor is it sufficient.

This race means that the xenstore dir could be removed before the script
runs, and the script may need information from xenstore in order to do
the tear down.

This was a particular problem for detaching a vif on a vswitch system,
since vswitch (unlike Linux bridge) does not automatically remove a port
when the device disappears, so we need xenstore info (specifically the
bridge node) to clean up. I believe there were also similar issues with
block-iscsi (to log out of the target) and even regular block devices
where loopback was in use (to see the type and know whether to
losetup -d or not; this was the reason why we didn't do loopback for
file:// devices with libxl for quite a while).

It was also completely different for each backend platform (Linux, BSD,
etc.), which was problematic from a support PoV.

> Note that I see this problem regardless of me having 'xl devd' running or
> not.

You definitely do _not_ want to run xl devd in dom0 (or more precisely
in your toolstack domain). Having both xl and devd doing these
operations will not result in anything you want. Might be worth having
some interlock on that, if we don't already.

> > Another option would be to install an empty xen-backend.rules for the
> > 4.6 release, and then remove it for 4.7.
>
> Or trim down the udev rules ?

The udev scripts should have been unused since 4.5.0-rc1, where they
were by default gated from running in dom0 in favour of the libxl
version. In the default configuration the scripts detected when they
were called via udev and exited immediately without doing anything,
leaving them to do the real work when called directly from the
toolstack.

Have you been seeing this issue since then and "fixing" it by manually
reverting to the udev behaviour in /etc/xen/xl.conf (or elsewhere for
other libxl clients)?

If not then there is some unintentional change in 2ba368d138934 as well
as the unintentional removal of the udev scripts. There really should
have been no semantic change compared with the default behaviour from
4.5.0-rc1.

Putting back the udev rules (even a trimmed down version, whatever that
means) is just papering over the underlying issue, whatever that is.
Only once we have understood the underlying issue can we consider
whether the appropriate remedial action for 4.6 is to put udev back
(i.e. if the real fix is too intrusive etc).

I think the next thing to try should be to revert only the tools/libxl
portion of 2ba368d138934, i.e. return to the old toolstack code without
putting the udev scripts back (being careful to clear up any remnants
of the previous larger revert from the installed system).

That should also be a change with no functional difference. So it will,
I think, help rule in/out any unintentional change in behaviour in
(lib)xl as opposed to some weird interaction with the inactive udev
scripts.

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
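
For anyone following the tear-down race described above, here is a
minimal toy model of it in Python. It is only a sketch under stated
assumptions: a plain dict stands in for xenstore, and the backend path,
bridge name, vif name and every function name are hypothetical and not
taken from libxl or the hotplug scripts. The only thing it is meant to
show is the ordering: if the xenstore directory is removed before the
script runs, the script can no longer find the bridge node and the port
is left behind.

```python
# Toy model only: a dict stands in for xenstore. All paths, names and
# functions here are hypothetical illustrations, not libxl code.

def vif_teardown_script(xenstore, backend_path, bridges):
    """Stand-in for a vif hotplug script on detach; it needs the bridge node."""
    bridge = xenstore.get(backend_path + "/bridge")
    if bridge is None:
        # Without the bridge node we cannot tell which bridge holds the port.
        return "leaked"
    bridges[bridge].discard(xenstore[backend_path + "/vifname"])
    return "cleaned"


def remove_backend_dir(xenstore, backend_path):
    """Stand-in for the toolstack removing the backend's xenstore directory."""
    for key in [k for k in xenstore if k.startswith(backend_path + "/")]:
        del xenstore[key]


def make_state():
    """Build a fresh fake xenstore and a fake vswitch with one port attached."""
    path = "/backend/vif/1/0"  # hypothetical backend path
    xenstore = {path + "/bridge": "ovsbr0", path + "/vifname": "vif1.0"}
    bridges = {"ovsbr0": {"vif1.0"}}
    return path, xenstore, bridges


def udev_style_teardown():
    """No interlock: here the asynchronous script loses the race."""
    path, xenstore, bridges = make_state()
    remove_backend_dir(xenstore, path)
    result = vif_teardown_script(xenstore, path, bridges)
    return result, bridges


def toolstack_style_teardown():
    """Toolstack runs the script itself, then removes the directory."""
    path, xenstore, bridges = make_state()
    result = vif_teardown_script(xenstore, path, bridges)
    remove_backend_dir(xenstore, path)
    return result, bridges


if __name__ == "__main__":
    print(udev_style_teardown())
    print(toolstack_style_teardown())
```

In the udev-style ordering the stale port stays on the (fake) vswitch,
which is the kind of leak a Linux bridge would have hidden by dropping
the port on its own. In the toolstack-style ordering the script sees the
bridge node and cleans up before the directory goes away.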