Re: [Xen-devel] Xen 4.1 + 3ware 9690SA = rejecting I/O to offline device
On 27/09/2011 19:13, Christopher S. Aker wrote:
> On 10/11/10 5:44 PM, Christopher S. Aker wrote:
>> In an effort to fix the problem described in my previous xen-devel post
>> ("New CPUS, now get: NETDEV WATCHDOG: eth0: transmit timed out"), we've
>> come across another problem. 3ware 9690SA cards do not behave under Xen
>> 4.1 (as of cs 22155).
>>
>> We have a simple Xen thrash test suite which fires up domUs that do
>> different workloads (some swap thrash, some kernel build, some spin
>> CPUs, some cycle rebooting, etc). Almost immediately after launching the
>> suite we can get the 3ware 9690SA card to fail with something like the
>> following:
>>
>> sd 0:0:0:0: WARNING: (0x06:0x002C): Command (0x28) timed out, resetting card.
>> sd 0:0:0:0: WARNING: (0x06:0x002C): Command (0x0) timed out, resetting card.
>> sd 0:0:0:0: rejecting I/O to offline device
>> sd 0:0:0:0: rejecting I/O to offline device
>>
>> Under a 2.6.32 dom0 it sometimes also triggers Xenwatch like so:
>>
>> http://theshore.net/~caker/xen/BUGS/9690SA/xenwatch.txt
>>
>> Results matrix:
>>
>> +---------------+---------------------+---------+--------+------+
>> | Xen           | Dom0                | 9550SXU | 9690SA | 9750 |
>> +---------------+---------------------+---------+--------+------+
>> | 3.4.1         | 2.6.18.8-931-2      | OK      | OK     | OK   |
>> | 3.4.4-rc1-pre | 2.6.18.8-931-2      | OK      | OK     | OK   |
>> | 3.4.4-rc1-pre | 2.6.32.23-g41a85de5 | OK      | OK     | OK   |
>> | 4.1 @ 22155   | 2.6.18.8-931-2      | OK      | FAIL   | OK   |
>> | 4.1 @ 22155   | 2.6.32.23-g41a85de5 | OK      | FAIL   | OK   |
>> +---------------+---------------------+---------+--------+------+
>>
>> The failures were verified on at least 2 machines of identical
>> specification.
>>
>> The same dom0 kernels that produce a stable 9690SA under Xen 3.4, bomb
>> under Xen 4.1.
>
> I'm back at this, and the problem still exists with a 4.1.1/3.0.4 stack.
>
> Konrad, in the "offline raid" thread you asked for the following debug
> information:
>
> http://www.theshore.net/~caker/xen/BUGS/offline-raid/
>
> The sysrq-t.txt and triple-a-star.txt outputs are after I got the raid
> card to hang up (but before it timed out and started spewing to the
> console).
>
> Oddly, lspci shows three devices assigned IRQ 16, however
> /proc/interrupts only lists two of them. Side effect of MSI?
>
> Also, the problem still happens even with MSI disabled (pci=nomsi).
>
> Thanks,
> -Chris

This is almost certainly the bug to do with not ack'ing a migrating line
level interrupt which I fixed in c/s 23145:1092a143ef9d.

Try applying that patch, or just running from the tip of
http://xenbits.xen.org/hg/xen-4.1-testing.hg/

~Andrew

>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel
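
For the lspci vs. /proc/interrupts mismatch on IRQ 16 mentioned above, a small script can make the comparison less error-prone. The following is only a rough sketch: the function name `devices_on_irq`, the sample text, and the device names in it are made up for illustration, and the parsing assumes handler names contain no spaces and that every listed IRQ has at least one handler. The column layout of /proc/interrupts varies between kernels, so treat it as a starting point.

```python
# Illustrative sample only; real /proc/interrupts output will differ.
SAMPLE = """\
           CPU0       CPU1
 16:        120         30   xen-pirq-ioapic-level  ehci_hcd:usb1, 3w-9xxx
 17:          5          0   xen-pirq-ioapic-level  ahci
"""


def devices_on_irq(interrupts_text, irq):
    """Return the handler names registered on `irq`.

    Heuristic: the header row gives the CPU count, the per-CPU counters
    are skipped, and the handler names are taken from the trailing
    comma-separated list (last word of each comma-separated piece).
    """
    lines = interrupts_text.splitlines()
    if not lines:
        return []
    n_cpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep or label.strip() != str(irq):
            continue
        tail = rest.split()[n_cpus:]  # drop the per-CPU counters
        if len(tail) < 2:
            return []  # controller column only, no handler listed
        pieces = " ".join(tail).split(",")
        return [p.split()[-1] for p in pieces if p.split()]
    return []


if __name__ == "__main__":
    print(devices_on_irq(SAMPLE, 16))
```

On a live dom0 the text would come from reading /proc/interrupts instead of `SAMPLE`; the resulting list can then be checked by hand against the devices lspci reports on the same IRQ.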