Re: [Xen-devel] [PATCH] x86/watchdog: Use real timestamps for watchdog timeout
On 24/05/13 10:37, Tim Deegan wrote:
> At 21:32 +0100 on 23 May (1369344726), Andrew Cooper wrote:
>> Do not assume that we will only receive interrupts at a rate of nmi_hz. On a
>> test system being debugged, I observed a PCI SERR being continuously asserted
>> without the SERR bit being set. The result was Xen "exceeding" a 300 second
>> timeout within 1 second.
> Sounds like the CPU is indeed stuck, and the watchdog has just optimized
> away the 5 minutes of back-to-back NMIs. :)
>
> Handling this case is nice, but I wonder whether this patch ought to
> detect and report ludicrous NMI rates rather than silently ignoring
> them. I guess that's hard to do in an NMI handler, other than by
> adjusting the printk when we crash.
>
> Tim.

Actually I suspect the system was livelocked with PCI SERRs being issued
from a PCIe switch. I only have second granularity on the serial
console, but can confirm that cpu0 was perfectly alive and well within
the same second as the watchdog supposedly expiring.

I was considering trying to work around a ludicrous rate of interrupts,
but decided to go for the easier patch first.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
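
For readers following the thread, the difference between counting NMIs and comparing real timestamps can be sketched roughly as below. This is only an illustration, not the code from the patch: the struct, its fields, and all function names are hypothetical, and the caller is assumed to supply a monotonic time in nanoseconds.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical watchdog state; none of these names come from the Xen source. */
struct watchdog {
    unsigned int nmi_hz;        /* rate at which watchdog NMIs are expected */
    unsigned int timeout_s;     /* configured timeout, in seconds */
    unsigned int stuck_ticks;   /* NMIs seen since the CPU last made progress */
    uint64_t last_progress_ns;  /* time at which progress was last observed */
};

/*
 * Count-based check: treats every NMI as if 1/nmi_hz seconds had passed.
 * Any other NMI source (e.g. a repeatedly asserted PCI SERR) inflates the
 * count and makes the timeout appear to expire early.
 */
bool expired_by_count(struct watchdog *w)
{
    w->stuck_ticks++;
    return w->stuck_ticks > w->timeout_s * w->nmi_hz;
}

/*
 * Timestamp-based check: compares elapsed time against the timeout, so the
 * number of NMIs delivered in the meantime does not matter.
 */
bool expired_by_time(const struct watchdog *w, uint64_t now_ns)
{
    uint64_t timeout_ns = (uint64_t)w->timeout_s * 1000000000ULL;
    return now_ns - w->last_progress_ns > timeout_ns;
}

/* Deliver `nmis` back-to-back NMIs; report whether the count-based check trips. */
bool nmi_flood_expires_count(struct watchdog *w, unsigned int nmis)
{
    bool expired = false;
    for (unsigned int i = 0; i < nmis; i++)
        expired = expired_by_count(w) || expired;
    return expired;
}
```

With nmi_hz = 1 and a 300 second timeout, a burst of 1000 NMIs inside one second trips the count-based check, while the timestamp-based check still sees only one second elapsed. That matches the symptom described above, where the watchdog "expired" while cpu0 was demonstrably alive.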