Re: [Xen-devel] IBM HS20 Xen 4.1 and 4.2 Critical Interrupt - Front panel NMI crash
Hello,
The BladeCenter web front end shows the following events:
27 | I | Blade_09 | 09/08/13 13:25:17 | 0x806f0013 | Chassis, (NMI State) diagnostic interrupt
28 | E | Blade_09 | 09/08/13 13:25:12 | 0x10000002 | SMI Hdlr: 00151743 HI Fatal Error, HI_FERR/NERR Value= 0020
29 | I | Blade_09 | 09/08/13 13:09:14 | 0x806f0013 | Recovery Chassis, (NMI State) diagnostic interrupt
30 | I | Blade_09 | 09/08/13 13:09:03 | 0x806f0013 | Chassis, (NMI State) diagnostic interrupt
31 | E | Blade_09 | 09/08/13 13:08:58 | 0x10000002 | SMI Hdlr: 00151743 HI Fatal Error, HI_FERR/NERR Value= 0020
32 | I | Blade_09 | 09/08/13 12:46:26 | 0x806f0013 | Recovery Chassis, (NMI State) diagnostic interrupt
33 | I | Blade_09 | 09/08/13 12:46:15 | 0x806f0013 | Chassis, (NMI State) diagnostic interrupt
34 | E | Blade_09 | 09/08/13 12:46:11 | 0x10000002 | SMI Hdlr: 00151743 HI Fatal Error, HI_FERR/NERR Value= 0020
35 | I | Blade_09 | 09/08/13 12:34:13 | 0x806f0013 | Recovery Chassis, (NMI State) diagnostic interrupt
36 | I | Blade_09 | 09/08/13 12:34:03 | 0x806f0013 | Chassis, (NMI State) diagnostic interrupt
37 | E | Blade_09 | 09/08/13 12:33:58 | 0x10000002 | SMI Hdlr: 00151743 HI Fatal Error, HI_FERR/NERR Value= 0020
38 | I | Blade_09 | 09/08/13 12:27:25 | 0x806f0013 | Recovery Chassis, (NMI State) diagnostic interrupt
39 | I | Blade_09 | 09/08/13 12:27:14 | 0x806f0013 | Chassis, (NMI State) diagnostic interrupt
40 | E | Blade_09 | 09/08/13 12:27:10 | 0x10000002 | SMI Hdlr: 00151743 HI Fatal Error, HI_FERR/NERR Value= 0020
41 | I | Blade_09 | 09/08/13 12:20:45 | 0x806f0013 | Recovery Chassis, (NMI State) diagnostic interrupt
42 | I | Blade_09 | 09/08/13 12:20:34 | 0x806f0013 | Chassis, (NMI State) diagnostic interrupt
43 | E | Blade_09 | 09/08/13 12:20:30 | 0x10000002 | SMI Hdlr: 00151743 HI Fatal Error, HI_FERR/NERR Value= 0020
44 | I | Blade_09 | 09/08/13 12:18:20 | 0x806f0013 | Recovery Chassis, (NMI State) diagnostic interrupt
45 | I | Blade_09 | 09/08/13 12:18:10 | 0x806f0013 | Chassis, (NMI State) diagnostic interrupt
46 | E | Blade_09 | 09/08/13 12:18:05 | 0x10000002 | SMI Hdlr: 00151743 HI Fatal Error, HI_FERR/NERR Value= 0020
47 | I | Blade_09 | 09/08/13 12:15:47 | 0x806f0013 | Recovery Chassis, (NMI State) diagnostic interrupt
48 | I | Blade_09 | 09/08/13 12:15:37 | 0x806f0013 | Chassis, (NMI State) diagnostic interrupt
49 | E | Blade_09 | 09/08/13 12:15:32 | 0x10000002 | SMI Hdlr: 00151743 HI Fatal Error, HI_FERR/NERR Value= 0020
Thanks
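
In case it helps anyone reading a similar log, here is a minimal Python sketch for pulling the serious entries out of a pasted event log and measuring how long after each fatal error the NMI was raised. It assumes the rows are written one per line with pipe separators (index, severity, source, timestamp, event ID, text) and that the timestamps are month/day/year; both are assumptions about the paste, not something the BladeCenter documents here. The names `parse_rows`, `serious` and `nmi_delays` are made up for illustration.

```python
from datetime import datetime

# Three rows copied from the event log in this mail, one row per line.
SAMPLE = """\
27 | I | Blade_09 | 09/08/13 13:25:17 | 0x806f0013 | Chassis, (NMI State) diagnostic interrupt
28 | E | Blade_09 | 09/08/13 13:25:12 | 0x10000002 | SMI Hdlr: 00151743 HI Fatal Error, HI_FERR/NERR Value= 0020
29 | I | Blade_09 | 09/08/13 13:09:14 | 0x806f0013 | Recovery Chassis, (NMI State) diagnostic interrupt
"""


def parse_rows(text):
    """Split pipe-separated log lines into dicts; skip lines that do not fit."""
    rows = []
    for line in text.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) != 6:
            continue
        index, severity, source, stamp, event_id, message = parts
        rows.append({
            "index": int(index),
            "severity": severity,
            "source": source,
            # Assumed month/day/year order.
            "time": datetime.strptime(stamp, "%m/%d/%y %H:%M:%S"),
            "event_id": event_id,
            "message": message,
        })
    return rows


def is_nmi_assert(row):
    """True for an NMI entry that is not the matching 'Recovery' entry."""
    msg = row["message"]
    return "NMI" in msg and not msg.startswith("Recovery")


def serious(rows):
    """Keep error-severity rows plus any NMI assertion or PCI SERR entry."""
    return [r for r in rows
            if r["severity"] == "E" or "SERR" in r["message"] or is_nmi_assert(r)]


def nmi_delays(rows):
    """Seconds between each error row and the next NMI assertion after it."""
    delays = []
    pending = None
    for row in sorted(rows, key=lambda r: r["time"]):
        if row["severity"] == "E":
            pending = row["time"]
        elif pending is not None and is_nmi_assert(row):
            delays.append((row["time"] - pending).total_seconds())
            pending = None
    return delays


if __name__ == "__main__":
    rows = parse_rows(SAMPLE)
    for r in serious(rows):
        print(r["index"], r["time"], r["message"])
    print(nmi_delays(rows))
```

On the three sample rows this keeps entries 27 and 28 and reports a 5 second gap between the HI Fatal Error and the NMI.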

2013/9/23 Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>

On Thu, Sep 12, 2013 at 02:47:39PM +0200, Trenta sis wrote:
> Hello,
>
> We need this server and we have made a downgrade to Debian Squeeze.
> I hope in a few day to have another HS20 to make some additional test, I'll
> try to get all information that you asked and send
> Sorry, one question what is PCI SERR ? Where?

If you log in to the BladeCenter web front end you should see logs for each blade. Some of them are routine, such as 'User XYZ logged in'. But in some cases there are more serious ones, such as an NMI or PCI SERR. If you could copy-n-paste
them, it could help in figuring out which PCI device is responsible for causing the NMI.

> Thanks for all
>
> 2013/9/9 Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
>
> > On Sun, Sep 08, 2013 at 04:41:02PM +0200, Trenta sis wrote:
> > > Hello,
> > >
> > > I have the same error: the server reboots automatically during every boot
> > > with the Xen kernel. The HS20 with Debian Wheezy and Xen hangs, and the AMM
> > > management module shows the same errors described in previous mails. Debian
> > > Wheezy with a non-Xen kernel boots correctly, so it seems that the problem
> > > is with the Xen kernel.
> > > The same HS20 server with Debian Lenny + Xen 3.2 or Debian Squeeze + Xen
> > > 4.0 works perfectly.
> > >
> > > Upgraded to Debian testing and unstable with the same results on Xen 4.1
> > > and 4.2.
> > >
> > > If you need more information, you can ask.
> > > How can this bug be solved?
> >
> > Did the workaround help?
> >
> > And in regards to finding out exactly what causes it - well, there are
> > logs in the BMC that can point to the PCI device. Did you check those?
> > Do they say if there is any device that has PCI SERR on them?
> >
> > Thanks.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel