
Re: [Xen-devel] Basic blktap2 functionality issues.

On Apr 10, 10:04am, Ian Campbell wrote:
} Subject: Re: [Xen-devel] Basic blktap2 functionality issues.

> On Sat, 2012-04-07 at 16:59 +0100, Dr. Greg Wettstein wrote:
> > On Mar 30, 12:23pm, Ian Campbell wrote:
> > } Subject: Re: [Xen-devel] Basic blktap2 functionality issues.

> > > On Fri, 2012-03-30 at 09:17 +0100, Ian Campbell wrote:
> > > > I think an approach worth trying would be to have
> > > > tapdisk_control_detach_vbd respond to TAPDISK_MESSAGE_DETACH before
> > > > doing the actual detach. i.e. it would respond with "Yes, I will do
> > > > that" rather than "Yes, I have done that". My speculation is that this
> > > > will allow libxl to continue and hopefully avoid the deadlock.
> > 
> > > This seems to be the case as the following fixes things for
> > > me. Thanks very much for your analysis which led me to this
> > > solution...
> > 
> > I ported your fix into 4.1.2 but I think we still have a problem, at
> > least in this codebase.
> > 
> > I no longer see the select timeout delay when xl shuts down but upon
> > shutdown the minor number is not freed.  A 'tap-ctl list' shows a
> > steadily increasing set of orphaned minor numbers as VMs are started
> > up and shut down.
> > 
> > Are you seeing this in your development codebase?

> It turns out that I am, yes. e.g. after starting/stopping a guest 3
> times:
> # tap-ctl list
>        -  0    -          - -
>        -  1    -          - -
>        -  2    -          - -

Yes, that is the same thing I'm seeing, and it is what I would expect.

> > The culprit is a failed ioctl call for BLKTAP2_IOCTL_FREE_TAP in
> > tap_ctl_free().  The underlying reason for the ioctl failure is the
> > check in [linuxsrc]:drivers/block/blktap/ring.c:blktap_ring_destroy()

> drivers/*xen*/... right? Or do you have a different blktap to me?

I believe the 'drivers/*xen*' was an old location.

You may have seen the note I sent to Tim Wood yesterday.  We isolated
the blktap2 code from the last GIT tree which Dan Stodden had
available and are carrying it as a standalone patch.  The version we
are testing against is available from the following location:


Dan had the code in the following location:


> > for whether or not the task_struct pointer in the blktap_ring
> > structure has been NULLed.
> > 
> > Which certainly makes sense since there is a race between xl's call to
> > tap_ctl_free() and tapdisk2 getting to the point where it can close
> > its descriptor to the blktap instance and thus invoke the .release
> > method which translates into a call to blktap_ring_release() which
> > NULLs the task_struct pointer.

> Sounds right, but then you would expect both the ioctl and release
> path to cleanup, depending on who loses the race?

The problem is the ioctl path can't clean up properly since it is
blocked from doing so by the check for a valid task_struct pointer.
The tapdisk2 side cleans up properly, but with the 'acknowledge the
detach and then detach' patch the xl side is always guaranteed to win
the race.
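
To make the ordering concrete, here is a minimal userspace sketch of
the race as I understand it.  It is a model only, not the kernel code:
the struct, the model_* function names and the -EBUSY return value are
all assumptions made for illustration.  The only thing taken from the
discussion above is that the free path refuses to proceed while the
task pointer is still set, and that the release path clears it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's blktap_ring; only the state
 * discussed in this thread is modelled (hypothetical layout). */
struct model_ring {
    const void *task;   /* non-NULL while tapdisk2 holds the ring open */
    int minor_in_use;   /* 1 until the minor number has been freed */
};

/* Models the .release path: tapdisk2 closes its descriptor and the
 * task pointer is cleared. */
static void model_ring_release(struct model_ring *ring)
{
    assert(ring != NULL);
    ring->task = NULL;
}

/* Models the BLKTAP2_IOCTL_FREE_TAP path: it refuses to free the
 * minor while the task pointer is set.  The error code is assumed. */
static int model_free_tap(struct model_ring *ring)
{
    assert(ring != NULL);
    if (ring->task != NULL)
        return -EBUSY;
    ring->minor_in_use = 0;
    return 0;
}

/* xl wins the race: the free is attempted before the release.
 * Returns 1 if the minor is left orphaned. */
static int model_xl_wins(void)
{
    static const int owner = 1;
    struct model_ring ring = { &owner, 1 };
    int rc = model_free_tap(&ring);

    model_ring_release(&ring);  /* tapdisk2 cleans up, but too late */
    return rc == -EBUSY && ring.minor_in_use == 1;
}

/* tapdisk2 wins the race: the release happens before the free.
 * Returns 1 if the minor was freed. */
static int model_tapdisk_wins(void)
{
    static const int owner = 1;
    struct model_ring ring = { &owner, 1 };
    int rc;

    model_ring_release(&ring);
    rc = model_free_tap(&ring);
    return rc == 0 && ring.minor_in_use == 0;
}
```

In this model model_xl_wins() leaves the minor allocated and nothing
ever retries the free, which matches the orphaned entries in the
'tap-ctl list' output.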

So the patch moved the failure one ladder step downwards in:


But the effective result is the same, modulo the elimination of the
'hang' while the select call times out.

> There also seems to be a BLKTAP_SHUTDOWN_REQUESTED bit which looks like
> it is involved somehow...
> We've gotten way beyond my understanding of blktap internals though.

That bit is set by the kernel kobject infrastructure as part of the
teardown of the block device and I believe is implicated to the extent
that the xl<->tapdisk2 deadlock is caused by a reference still being
held to the block device when the detach message is sent to

I can now reproduce the deadlock with a tap-ctl recipe which should
help in chasing things down.  CAUTION: *** the following will livelock
your kernel, you will need to have a shell available to force an
unmount of the tapdev device. ***

        1.) Create aio based device with tap-ctl.

        2.) Mount tapdev device.

        3.) Close device with tap-ctl.

        4.) Send detach message with tap-ctl.

        * Livelock *

        5.) Kill tapdisk2 process.

        6.) Unmount tapdev - kernel issues WARNING from fs/buffer.c.

Having this recipe should at least help run down which lock we are
wedging on.

> > Let me know if you are seeing the issues I'm seeing, in the meantime I
> > will keep hunting to see if I can run down the ultimate cause of the
> > deadlock.  Given the above trace it has to be an issue with xl
> > orchestrating the release of resources which reference the tapdev
> > block device.

> I'm not convinced that the shutdown stuff on the kernel side isn't
> either horribly broken or at best fragile (perhaps xl just tickles
> it differently to xend).

I don't believe it's fragile as much as the classic problem of having
the kernel dependent on userspace.

I think the underlying problem is the device release order between
xend and xl.  I believe xend is releasing all references to the tapdev
device before invoking userspace cleanup.
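
A rough sketch of why I suspect the ordering matters.  Again this is
only a userspace model under assumptions: the struct, the model_*
names and the rule that cleanup cannot complete while a reference is
held are my reading of the behaviour, not code from xend, xl or the
kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a tapdev block device. */
struct model_tapdev {
    int refs;       /* outstanding references (mounts, opens) */
    int detached;   /* 1 once userspace cleanup has completed */
};

/* Drop one reference to the device. */
static void model_put(struct model_tapdev *dev)
{
    assert(dev != NULL && dev->refs > 0);
    dev->refs--;
}

/* Userspace cleanup.  In this model it can only complete when no
 * references remain; it returns -1 where the real system would stall. */
static int model_detach(struct model_tapdev *dev)
{
    assert(dev != NULL);
    if (dev->refs > 0)
        return -1;
    dev->detached = 1;
    return 0;
}

/* Order I believe xend uses: release references, then clean up. */
static int model_release_then_cleanup(void)
{
    struct model_tapdev dev = { 1, 0 };

    model_put(&dev);
    return model_detach(&dev);
}

/* Order I suspect xl uses: clean up while a reference is still held. */
static int model_cleanup_then_release(void)
{
    struct model_tapdev dev = { 1, 0 };
    int rc = model_detach(&dev);

    model_put(&dev);
    return rc;
}
```

With the first ordering the cleanup completes; with the second it is
attempted while a reference is outstanding, which is where the model
reports the stall.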

> Please do keep hunting and let us know what you find...

I will wade a bit deeper into xl to see if I can isolate the ordering

> Thanks,
> Ian

Thanks for the verification from your end on the problem.

Have a good day.


}-- End of excerpt from Ian Campbell

As always,
Dr. G.W. Wettstein, Ph.D.   Enjellic Systems Development, LLC.
4206 N. 19th Ave.           Specializing in information infra-structure
Fargo, ND  58102            development.
PH: 701-281-1686
FAX: 701-281-3949           EMAIL: greg@xxxxxxxxxxxx
"In spite of all evidence to the contrary, the entire universe is
 composed of only two basic substances: magic and bullshit."
                                -- Ian Macdonald
