Re: [Xen-devel] Bug in Xen 4.1.0: Xen leaks tapdisk2 processes
On Mon, 2011-05-09 at 04:23 -0400, Ian Campbell wrote:
> On Sat, 2011-05-07 at 00:25 +0100, Nathan March wrote:
> >
> > On 5/6/2011 11:27 AM, Jim Fehlig wrote:
> > >>
> > >>> I don't have a spare server to test the patch with at the moment, but I
> > >>> can try this out later this week.
> > >>>
> > >> If you are running xm/xend rather than xl then it won't help.
> > >>
> > >> But I'm not sure how one tells with libvirt which you are running, I'd
> > >> expect that if xend were running it would be used by default. Jim?
> > >>
> > > If xend is running, libvirt will use it. If not, it will attempt to use
> > > libxenlight. 'virsh version' will tell which xen backend you are using.
> > >
> > > E.g. if xend is running:
> > > xen33: # virsh version
> > > Compiled against library: libvir 0.9.0
> > > Using library: libvir 0.9.0
> > > Using API: Xen 3.0.1
> > >
> > > If xend is not running:
> > > xen33: # virsh version
> > > Compiled against library: libvir 0.9.0
> > > Using library: libvir 0.9.0
> > > Using API: xenlight 0.9.0
> > >
> > > Looks like I need to put libxenlight's version in there instead of
> > > libvirt's version, but 'Xen' vs. 'xenlight' will tell which libvirt
> > > backend is being used.
> > >
> > In that case, I can confirm that I'm using xend:
>
> Hrm, then my earlier patch is irrelevant and I've got no idea what is
> supposed to cause the tapdisk process to exit in either the xend or xl
> case but it seems like the issue is common to both -- Daniel, any
> ideas?

This stuff was originally written with toolstacks in mind which already
manage storage in more detail than just plug/unplug, so tap-ctl only
provides the minimum tool, not the framework. XCP will refcount the
node's usage, and shut it down once the count drops back to zero.

Does XL promote storage shared across VMs? Does XL have a big lock? If
no and yes, respectively, then shutting down after the VBD should have
worked. Otherwise it gets more complicated.

Trial and error, i.e. calling destroy and bailing out if the device node
is found busy, is not fully reliable; see my other mail regarding bdev
access noise. And plugs/unplugs by concurrent XLs interleaving will
obviously race.

I can offer a patch which adds a timeout to destroy (possibly 0), but
in theory the same issue obviously remains.

Usually it boils down to a refcount. It could go into xenstore,
something like /local/domain/<me>/blktap/<minor>/refs, plus
transactions. I think that way even XCP might use it at some point.

Daniel
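
For illustration only, here is a rough sketch of the refcount-plus-
transactions idea. It is not taken from any posted patch. FakeStore and
Transaction are a hypothetical in-memory stand-in for xenstore, not the
real xenstore API; plug(), unplug() and the destroy callback are made-up
names. The only thing borrowed from the mail is the path layout
/local/domain/<me>/blktap/<minor>/refs and the rule that the node is
torn down when the count drops back to zero.

```python
import threading


class ConflictError(Exception):
    """Commit lost a race; stands in for a failed xenstore transaction."""


class FakeStore:
    """Hypothetical in-memory stand-in for xenstore (not the real API)."""

    def __init__(self):
        self.data = {}
        self.version = 0
        self.lock = threading.Lock()


class Transaction:
    """Optimistic transaction: commit fails if the store changed meanwhile."""

    def __init__(self, store):
        self.store = store
        with store.lock:
            self.base = store.version
            self.view = dict(store.data)

    def read(self, path):
        return self.view.get(path)

    def write(self, path, value):
        self.view[path] = value

    def commit(self):
        with self.store.lock:
            if self.store.version != self.base:
                raise ConflictError("store changed since transaction start")
            self.store.data = self.view
            self.store.version += 1


def refs_path(domid, minor):
    # Path layout as suggested in the mail; domid plays the role of <me>.
    return "/local/domain/%d/blktap/%d/refs" % (domid, minor)


def adjust_refs(store, path, delta):
    """Add delta to the counter at path, retrying until the commit wins."""
    while True:
        txn = Transaction(store)
        count = int(txn.read(path) or 0) + delta
        if count < 0:
            raise ValueError("refcount underflow at " + path)
        txn.write(path, str(count))  # values are kept as strings
        try:
            txn.commit()
            return count
        except ConflictError:
            continue  # another toolstack instance got in first; retry


def plug(store, domid, minor):
    """Record one more user of the tapdisk node; returns the new count."""
    return adjust_refs(store, refs_path(domid, minor), +1)


def unplug(store, domid, minor, destroy):
    """Drop one user; call destroy(minor) once nobody is left."""
    remaining = adjust_refs(store, refs_path(domid, minor), -1)
    if remaining == 0:
        # Simplification: a concurrent plug() could still slip in between
        # the commit above and this call.
        destroy(minor)
    return remaining
```

Here destroy is whatever actually tears the tapdisk down. The retry loop
is what keeps interleaved plugs/unplugs from losing updates; the window
between reaching zero and destroying the node is deliberately left open
in this sketch and would still need handling.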