Re: [Xen-users] Network and SATA Instability on Xen 4.6/4.8
On Fri, Dec 8, 2017 at 9:17 PM, Kevin Stange <kevin@xxxxxxxxxxxxx> wrote:
> Hi,
>
> I've been running Xen 4.4 stably for some time under kernel 4.9 in dom0
> on CentOS 6 and have been trying to finally move my environment up to
> Xen 4.6 or 4.8 using CentOS 7. Since I've built out my test server with
> Xen 4.6, I've been having issues where the Intel NICs begin flapping
> repeatedly and the SATA disk interfaces go down and will not come back
> up until I reboot the server. Even sending the bus rescan command
> doesn't bring the drives back. The issue seems to trigger based on
> activity, so something like an mdraid resync is more likely to
> cause the issue, but it's not reproducible in a consistent amount of
> time, which makes it hard to tell if a particular change has definitely
> fixed it.
>
> This is reminiscent of a problem I had been experiencing while running
> kernel 3.18 and Xen 4.4 on CentOS 6, but the problem resolved itself
> upon upgrading to kernel 4.4 and later 4.9, so I chalked that up to
> something bad with PCIe management in kernel 3.18 and thought nothing
> more of it until now.
>
> The initial test environment where the issue occurred was kernel 4.9.58
> and Xen 4.6.6-7 (with security patches from CentOS). I then tried
> upgrading to kernel 4.9.63 and Xen 4.8.2-5, which didn't result in any
> improvements.
>
> I tried pcie_aspm=off on the kernel line, which has helped in the past
> with similar issues, but that didn't help here.
>
> I tried booting without Xen (just kernel 4.9.63) and it seems like that
> made the issue go away, which led me to believe the issue only happens
> with hardware accessed from dom0. I dug through Xen command line
> options and tried booting with msi=off and that now seems to have
> resulted in the problem going away, or at least, the system hasn't
> exhibited the issue since last week. Previously, the issue would tend
> to manifest after less than 24 hours.
>
> My hardware is Supermicro X8DT3-F with Dual Intel Xeon E5620 CPUs.
>
> Disk issues begin with a kernel message like this followed by continuous
> ATA command failures:
>
> ata2.00: exception emask 0x0 sact 0x7c01ffff serr 0x50000 action 0x6 frozen
>
> NIC issues begin with a message like:
>
> igb 0000:04:00.1: enp4s0f1: Reset adapter unexpectedly
>
> NICs do recover almost immediately but continue to flap periodically
> until reboot.
>
> I don't know if this is a bug in Xen or something else at play, but I
> could really use some help figuring out what's going on, why msi=off
> seems to fix it, and if there are any better ways to resolve this.

Jan / Andy,

Any idea why Kevin might be seeing stability issues under 4.6 / 4.8
that are solved by adding 'msi=off'?

 -George

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-users
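
Since the failure does not reproduce in a consistent amount of time, one way to compare boots with and without msi=off is to count the two failure signatures quoted above in a saved kernel log. The following is only a rough sketch: the function name and the regular expressions are illustrative assumptions modelled on the two example messages in this thread, not an exhaustive list of what the ata or igb drivers can print.

```python
import re

# Patterns modelled on the two example messages quoted in the report.
# They are assumptions for illustration; real logs may contain other
# related messages that these patterns do not catch.
SIGNATURES = {
    "ata_exception": re.compile(
        r"\bata\d+(?:\.\d+)?: exception emask", re.IGNORECASE
    ),
    "igb_reset": re.compile(
        r"\bigb [0-9a-f:.]+: \S+: Reset adapter unexpectedly", re.IGNORECASE
    ),
}


def count_signatures(lines):
    """Count how many log lines match each failure signature."""
    counts = {name: 0 for name in SIGNATURES}
    for line in lines:
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

Passing an open file object containing saved dmesg output to count_signatures() returns a per-signature count. A count of zero over a period comfortably longer than the usual 24 hours to failure would be weak evidence, not proof, that a given boot option helped.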