
[Xen-devel] Livelock induced failure in blktap2.



We've been working on getting Xen 4.1.2 validated for internal use and
have run into what appears to be a livelock-induced failure to
properly free a blktap2 device.

We ported the blktap2 driver from Dan Stodden's git tree to a 3.2.x
kernel, which is a reasonably straightforward process.  We are also running
the toolchain with a patch which Ian Campbell posted in order to get
xl to properly terminate the tapdisk2 driver process which is spawned
for a block device.  With that patch applied we've isolated a problem
which prevents the tapdev minor from being released.

We tracked the problem down to what appears to be a livelock related
to the unmapping of the ring memory by the tapdisk2 process, most
specifically the following fragment in
tapdisk-vbd.c:tapdisk_vbd_detach():

        if (vbd->ring.mem > 0)
                munmap(vbd->ring.mem, psize * BLKTAP_MMAP_REGION_SIZE);

xl sends a DETACH request to the tapdisk2 process, which results in the
execution of the munmap() call and what appears to be a livelock.

xl times out its select() call waiting for the DETACH response from
tapdisk2 and errors out, resulting in tap_ctl_free not being called for
the minor.  The exit of xl does, however, release the livelock, which
causes tapdisk2 to generate an error and terminate since it can't
send the DETACH response to the xl process, which is now gone.

The livelock period can be widened by increasing the timeout on the
select call in tap-ctl-ipc.c:tap_ctl_read_message().  Doing so allows
the impact of the lock to be observed through, for example, the
inability to run a ps command in a separate shell session.  The ps, or
any other command, hangs until the select call times out in xl and the
guest instance terminates.
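
For anyone not familiar with the pattern, the following is a minimal
sketch of a select()-bounded read of the kind described above.  It is
not the tap-ctl code: the helper name read_with_timeout and its return
convention are made up for illustration.  The timeout_sec argument
plays the role of the timeout we increased to widen the window.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only, not taken from tap-ctl):
 * wait up to timeout_sec seconds for fd to become readable, then
 * read up to len bytes from it.
 *
 * Returns the number of bytes read, 0 on EOF, -ETIMEDOUT if nothing
 * arrived in time, or -errno on any other failure.
 */
ssize_t read_with_timeout(int fd, void *buf, size_t len, int timeout_sec)
{
        fd_set readfds;
        struct timeval tv;
        ssize_t n;
        int ret;

        for (;;) {
                /* select() may modify both arguments, so reset them. */
                FD_ZERO(&readfds);
                FD_SET(fd, &readfds);
                tv.tv_sec = timeout_sec;
                tv.tv_usec = 0;

                ret = select(fd + 1, &readfds, NULL, NULL, &tv);
                if (ret > 0)
                        break;                  /* fd is readable */
                if (ret == 0)
                        return -ETIMEDOUT;      /* peer never answered */
                if (errno != EINTR)
                        return -errno;
                /* Interrupted by a signal: retry with a fresh timeout. */
        }

        n = read(fd, buf, len);
        return n < 0 ? -errno : n;
}
```

A caller that gets -ETIMEDOUT back and simply exits, as xl appears to
do here, never reaches whatever cleanup would have followed a
successful response.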

The problem is definitely linked to xl since a block device can be
initialized and released with the tap-ctl utility with no problems.

Let me know if anyone has run into this or knows of a solution.  We
will keep digging and report back if we isolate the problem.

Have a good week.

As always,
Dr. G.W. Wettstein, Ph.D.   Enjellic Systems Development, LLC.
4206 N. 19th Ave.           Specializing in information infra-structure
Fargo, ND  58102            development.
PH: 701-281-1686
FAX: 701-281-3949           EMAIL: greg@xxxxxxxxxxxx
------------------------------------------------------------------------------
"Everything should be made as simple as possible, but not simpler."
                                -- Albert Einstein
