[Xen-devel] RE: How to generate a HW NMI
> BTW, "rmmod processor thermal" (should be equivalent to your Xen

I am not familiar with the thermal module, but my guess is that its states are not the same as the C3 state, which can be entered when the kernel becomes idle. I believe the thermal module plays with another type of state (P?), where it alters the voltage and frequency of the CPU to keep the CPU running, but at a particular % of the top speed. The C3 state causes the CPU clocks to shut down entirely, and the CPU is then awakened by an external event.

R.

-----Original Message-----
From: Jan Kiszka [mailto:jan.kiszka@xxxxxxxxxxx]
Sent: Monday, October 04, 2010 11:23 AM
To: Roger Cruz
Cc: Konrad Rzeszutek Wilk; xen-devel@xxxxxxxxxxxxxxxxxxx
Subject: Re: How to generate a HW NMI

On 04.10.2010 16:19, Roger Cruz wrote:
> Until Friday, all hard hangs that we and our customers had experienced
> were on Lenovo T500 and X200, even with their latest BIOSes.

Yeah, the T500 was reported as problematic here as well. My Fujitsu Celsius H700 also crashes. In contrast, we have positive results from a Dell server with an Asus P6T Deluxe V2 board and a Core i7 920.

> The Lenovo
> T400 has never hung for me and I don't have any reports on them from the
> field. On Friday, I had an HP i5 hard hang with similar footprint as

i5? Mmh, we only have reports from i7 so far. Which BIOS vendor?

> the Lenovos. When this hard hang happens, the Xen watchdog (which is
> driven by the NMI handler) will not do its job and cause a crash/stack
> trace.
> This is why we have started to suspect something with the BIOS
> and SMIs as they are the only thing that can block an NMI. I am pretty
> certain that this is somehow related to entering C3 power states and
> possibly at the same time an SMI comes in.
I tried various stuff under Linux as well: nmi_watchdog=1, tracing to VGA buffer right before/after guest-host switch (it always hangs after entry here), verified guest interruptibility before entry (though hypervisors usually do not play with the critical bits), read-out of host RAM (including kernel log buffer) via Firewire - it all points to a crash outside the scope of the host OS.

> The time it takes to hang
> varies from 30mins to 24 hrs.

We are a bit more lucky, maybe due to our special guest (an old RTOS in 16-bit mode): I can reproduce the hang after a few minutes.

BTW, "rmmod processor thermal" (should be equivalent to your Xen parameter) did not make a difference here.

Jan

--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel