Re: [Xen-devel] [BUG] xen-mceinj tool testing cause dom0 crash
> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> Sent: Wednesday, May 24, 2017 2:25 PM
> To: Hao, Xudong <xudong.hao@xxxxxxxxx>
> Cc: Julien Grall <julien.grall@xxxxxxx>; George Dunlap
> <George.Dunlap@xxxxxxxxxx>; Lars Kurth <lars.kurth@xxxxxxxxxx>; Zhang,
> Haozhong <haozhong.zhang@xxxxxxxxx>; xen-devel@xxxxxxxxxxxxx
> Subject: RE: [Xen-devel] [BUG] xen-mceinj tool testing cause dom0 crash
>
> >>> On 24.05.17 at 07:32, <xudong.hao@xxxxxxxxx> wrote:
> >> -----Original Message-----
> >> From: Xen-devel [mailto:xen-devel-bounces@xxxxxxxxxxxxx] On Behalf Of
> >> Hao, Xudong
> >> Sent: Tuesday, May 23, 2017 5:34 PM
> >> To: Jan Beulich <JBeulich@xxxxxxxx>
> >> Cc: Lars Kurth <lars.kurth@xxxxxxxxxx>; Julien Grall
> >> <julien.grall@xxxxxxx>; George Dunlap <George.Dunlap@xxxxxxxxxx>;
> >> Zhang, Haozhong <haozhong.zhang@xxxxxxxxx>; xen-devel@xxxxxxxxxxxxx
> >> Subject: Re: [Xen-devel] [BUG] xen-mceinj tool testing cause dom0
> >> crash
> >>
> >> > -----Original Message-----
> >> > From: Xen-devel [mailto:xen-devel-bounces@xxxxxxxxxxxxx] On Behalf
> >> > Of Jan Beulich
> >> > Sent: Tuesday, May 23, 2017 12:06 AM
> >> > To: Hao, Xudong <xudong.hao@xxxxxxxxx>
> >> > Cc: Lars Kurth <lars.kurth@xxxxxxxxxx>; Julien Grall
> >> > <julien.grall@xxxxxxx>; xen-devel@xxxxxxxxxxxxx; George Dunlap
> >> > <George.Dunlap@xxxxxxxxxx>; Zhang, Haozhong
> >> > <haozhong.zhang@xxxxxxxxx>
> >> > Subject: Re: [Xen-devel] [BUG] xen-mceinj tool testing cause dom0
> >> > crash
> >> >
> >> > >>> On 22.05.17 at 10:39, <xudong.hao@xxxxxxxxx> wrote:
> >> > > (XEN) Hardware Dom0 crashed: rebooting machine in 5 seconds.
> >> >
> >> > Not this - Xen is unavoidably going to go down in such a case, yet
> >> > your log has no hint at all what kind of problem Dom0 experienced
> >> > (e.g. whether one of the injected #MC-s caused this).
> >> >
> >>
> >> Jan,
> >> The first mail attached the complete log from Xen booting, hope there is some
> >> hint from the full log.
> >>
> >> > > (XEN) ----[ Xen-4.9-rc x86_64 debug=y Tainted: MCE ]----
> >> > > (XEN) CPU: 0
> >> > > (XEN) RIP: e008:[<0000000065eb1e13>] 0000000065eb1e13
> >> > > ...
> >> > > (XEN) Pagetable walk from 00000000682ab009:
> >> > > (XEN) L4[0x000] = 000000102c961063 ffffffffffffffff
> >> > > (XEN) L3[0x001] = 000000005f812063 ffffffffffffffff
> >> > > (XEN) L2[0x141] = 0000000000000000 ffffffffffffffff
> >> >
> >> > Here you're apparently hitting a firmware bug: While RIP points
> >> > into runtime services memory, CR2 doesn't:
> >> >
> >> > (XEN) 0000065eb8000-00000682acfff type=0 attr=000000000000000f
> >> >
> >> > You may try working around this via one of "reboot=acpi" or
> >> > "efi=no-rs" on the hypervisor command line.
> >> >
> >>
> >> Will try them.
> >>
> >
> > Neither "reboot=acpi" nor "efi=no-rs" can work around this issue.
>
> Apparently I didn't express myself clearly enough: These workarounds were
> supposed to help with the Xen crash, not the Dom0 one. And as your logs prove
> they did fulfill that purpose. Yet still there are no Dom0 log messages at all near
> the crash, which leaves open whether there is a completely silent path in its MCE
> handling, or whether some messages simply don't make it through. Right now I
> can't see any Xen side of the issue here though, so from a 4.9 perspective I think
> we're fine.
>

We figured out the problem: some corner-case scripts triggered the error
injection on the same page (pfn 0x180020) twice, i.e. "./xen-mceinj -t 0" was
run more than once, which resulted in the Dom0 crash.

Let's close this bug thread. Sorry for the invalid report, and thanks to Jan
for the analysis.

Thanks,
-Xudong

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
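
As a side note on the page-table walk quoted earlier in the thread: the L4/L3/L2 indices Xen printed for the faulting address 00000000682ab009 can be reproduced with standard x86-64 4-level paging arithmetic (9 index bits per level, 12-bit page offset). The following is only a minimal sketch for checking such logs by hand; the function name pagetable_indices is made up for illustration and is not part of Xen or its tools.

```python
def pagetable_indices(vaddr):
    """Split an x86-64 virtual address into 4-level paging indices.

    Assumes 4 KiB pages and 4-level paging: each level uses 9 bits,
    and the low 12 bits are the offset within the page.
    """
    mask = 0x1FF  # 9 bits per page-table level
    return {
        "L4": (vaddr >> 39) & mask,
        "L3": (vaddr >> 30) & mask,
        "L2": (vaddr >> 21) & mask,
        "L1": (vaddr >> 12) & mask,
        "offset": vaddr & 0xFFF,
    }


# Faulting address taken from the "Pagetable walk from ..." line in the log.
indices = pagetable_indices(0x682AB009)
for level, value in indices.items():
    print(f"{level} = {value:#05x}")
```

The L4, L3 and L2 values computed this way match the L4[0x000], L3[0x001] and L2[0x141] entries in the quoted log; the walk stops at L2 because that entry is zero (not present).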