Re: [Xen-devel] State of current Xen debugger
On 28/09/2010 16:21, "Roger Cruz" <roger.cruz@xxxxxxxxxxxxxxxxxxx> wrote:

> I am still chasing this hard hang in our system with a modified 3.4.2 xen. I
> have upgraded the BIOS and the problem still exists. The only thing that so
> far had appeared to work was adding max_cstate=0 but now I have a report where
> it still hung for one customer who had that flag enabled. The rest of them
> have been successfully running for more than a week with this "work-around".
> I have isolated the problem to Lenovo with the Centrino processors. These
> guys will stop the TSC when in C3.
>
> What I need to really understand is why the NMI/watchdog in Xen is not working
> and causing a crash when the CPU hangs. I was under the impression that NMIs
> couldn't be masked at all. Is there any way that Xen could be disabling or
> changing that behavior? I know the NMI is being driven by a timer set in the
> NMI handler. Could there be a case where this timer is disabled? Any ideas
> are welcome!

The NMI counter gets driven by the APIC timer. Perhaps it needs poking
somehow on wakeup from C3? My suggestion for debugging this would be to take
a look at what native Linux does. The NMI perfctr poking logic was all taken
from (rather old now) upstream Linux.

 -- Keir

> Thanks
> Roger R. Cruz
>
> From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
> [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Roger Cruz
> Sent: Tuesday, September 14, 2010 11:55 AM
> To: Dan Magenheimer; Tim Deegan
> Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
> Subject: RE: [Xen-devel] State of current Xen debugger
>
> Hi Dan,
>
> I am using 3.4.2 where we have made very minor modifications (some backports,
> for example).
>
> I have not tried your suggestions.. so I will do that next.. thanks!
>
> R.
>
> -----Original Message-----
> From: Dan Magenheimer [mailto:dan.magenheimer@xxxxxxxxxx]
> Sent: Tue 9/14/2010 11:19 AM
> To: Roger Cruz; Tim Deegan
> Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
> Subject: RE: [Xen-devel] State of current Xen debugger
>
> A couple of thoughts:
>
> Have you tried max_cstate=0 (as a Xen boot option)?
>
> Also, you didn't say what version of Xen you are using but playing around with
> hpet_broadcast (enabling it or force-disabling it as below) might be worth a
> try.
>
> http://lists.xensource.com/archives/html/xen-devel/2010-09/msg00556.html
>
> From: Roger Cruz [mailto:roger.cruz@xxxxxxxxxxxxxxxxxxx]
> Sent: Tuesday, September 14, 2010 8:56 AM
> To: Tim Deegan
> Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
> Subject: RE: [Xen-devel] State of current Xen debugger
>
> Hi Tim, good to hear from you again
>
> I had a pretty good inkling that one of you hardcore developers would say that
> :-) Yes, it is pretty well wedged. I can cause the problem more rapidly by
> dropping to a single CPU. When the hang happens, the Xen console is
> completely dead. None of the special keys work.
>
> I do have hopes a BIOS upgrade could fix this as a last resort but I want to
> see if at least I can understand the problem. We have a few different
> machines that are exhibiting similar symptoms so I have to see if I can find a
> work-around without requiring every user to upgrade their BIOS :-(
>
> Just in case, what debugger have you been using? Are there recent
> instructions on how to set it up that you can point me to?
>
> Thanks
> Roger
>
> -----Original Message-----
> From: Tim Deegan [mailto:Tim.Deegan@xxxxxxxxxx]
> Sent: Tue 9/14/2010 10:30 AM
> To: Roger Cruz
> Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
> Subject: Re: [Xen-devel] State of current Xen debugger
>
> Hi,
>
> At 15:22 +0100 on 14 Sep (1284477779), Roger Cruz wrote:
>> I am trying to debug a problem where the hypervisor is hanging hard.
>> Not even the NMI watchdog is triggering a reboot. So I wanted to hook
>> up a debugger.
>
> Sorry to bring a counsel of despair but if the NMI watchdog isn't
> working then your chances of getting a working debugger are slim. It's
> likely that at least one CPU is very very stuck. Does the 'd' debug key
> work on the serial line when the machine is wedged?
>
> On a more cheerful note, I've twice seen hard hangs like this that
> turned out to be hardware issues, fixable with BIOS upgrades.
>
> Cheers,
>
> Tim.
>
>> What is the state of the current debuggers out there?
>> Any input on how I should set it up (kdb, gdb, etc) and pointers to a
>> good wiki page are much appreciated. I did perform a Google search
>> and found some links but I want to hear from the current developers as
>> to what is most stable and useful for debugging this type of hard
>> hang. I only have a serial port PCI-express card to use as the laptop
>> has no built in port.
>
> --
> Tim Deegan <Tim.Deegan@xxxxxxxxxx>
> Principal Software Engineer, XenServer Engineering
> Citrix Systems UK Ltd. (Company #02937203, SL9 0BG)
>
> No virus found in this incoming message.
> Checked by AVG - www.avg.com
> Version: 9.0.851 / Virus Database: 271.1.1/3119 - Release Date: 09/14/10
> 02:35:00
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel
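
To make the watchdog question in this thread concrete, here is a minimal Python sketch of the general NMI-watchdog idea: an ordinary timer interrupt bumps a per-CPU counter, and a periodic NMI checks whether that counter has moved since the last NMI. This is a toy model only, not Xen's code. The class, field names, the 1 Hz NMI rate and the 5-second timeout are all assumptions chosen for illustration. The third scenario models the possibility raised above (and not confirmed in the thread) that the source driving the NMI itself stops, in which case nothing is left to notice the hang.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CpuWatchdog:
    """Toy per-CPU NMI watchdog. All names and defaults are hypothetical."""
    nmi_hz: int = 1            # assumed NMI rate (NMIs per second)
    timeout_s: int = 5         # assumed seconds without progress before a hang is reported
    timer_ticks: int = 0       # bumped by the ordinary (maskable) timer interrupt
    last_seen_ticks: int = 0   # value of timer_ticks observed at the previous NMI
    stalled_nmis: int = 0      # consecutive NMIs that saw no timer progress

    def timer_interrupt(self) -> None:
        # Runs only while the CPU is still servicing normal interrupts.
        self.timer_ticks += 1

    def nmi(self) -> bool:
        # Runs only if the NMI source is still firing. Returns True on a detected hang.
        if self.timer_ticks == self.last_seen_ticks:
            self.stalled_nmis += 1
        else:
            self.last_seen_ticks = self.timer_ticks
            self.stalled_nmis = 0
        return self.stalled_nmis >= self.timeout_s * self.nmi_hz


def simulate(seconds: int,
             timer_alive: Callable[[int], bool],
             nmi_alive: Callable[[int], bool]) -> Optional[int]:
    """Step one simulated second at a time; return the second a hang is reported, or None."""
    wd = CpuWatchdog()
    for t in range(seconds):
        if timer_alive(t):
            wd.timer_interrupt()
        if nmi_alive(t) and wd.nmi():
            return t
    return None


if __name__ == "__main__":
    # Healthy CPU: timer and NMI both keep running.
    print(simulate(20, lambda t: True, lambda t: True))    # None
    # CPU wedges at t=3 but NMIs keep arriving: the watchdog fires.
    print(simulate(20, lambda t: t < 3, lambda t: True))   # 7
    # CPU wedges at t=3 and the NMI source stops too: nothing ever fires.
    print(simulate(20, lambda t: t < 3, lambda t: t < 3))  # None
```

In this model the watchdog only helps in the second scenario. If the hang coincides with the NMI source going quiet, the symptom matches what is described in the thread: a hard hang with no watchdog-triggered crash.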