[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [BUG] Emulation issues
> -----Original Message----- > From: Roger Pau Monnà [mailto:roger.pau@xxxxxxxxxx] > Sent: 29 July 2015 14:54 > To: Paul Durrant; xen-devel; Andrew Cooper > Subject: Re: [BUG] Emulation issues > > El 29/07/15 a les 14.41, Paul Durrant ha escrit: > >> -----Original Message----- > >> From: Roger Pau Monnà [mailto:roger.pau@xxxxxxxxxx] > >> Sent: 29 July 2015 11:37 > >> To: Paul Durrant; xen-devel; Andrew Cooper > >> Subject: Re: [BUG] Emulation issues > >> > >> El 29/07/15 a les 12.27, Paul Durrant ha escrit: > >>>> -----Original Message----- > >>>> From: Roger Pau Monnà [mailto:roger.pau@xxxxxxxxxx] > >>>> Sent: 29 July 2015 11:17 > >>>> To: xen-devel; Andrew Cooper; Paul Durrant > >>>> Subject: [BUG] Emulation issues > >>>> > >>>> Hello, > >>>> > >>>> While trying to debug a hotplug scripts issue, I came across what seems > >>>> to be an emulation bug inside of Xen. The result of this is a bunch of > >>>> repeated messages on the serial console: > >>>> > >>> > >>> Was there anything of interest before this? You got an 'unhandleable' > >> emulation which generally should not happen, but I guess there may be a > >> shutdown race in tearing down the ioreq server list and sending > emulation > >> requests which may cause hvm_send_ioreq() to return > >> X86EMUL_UNHANDLEABLE. It would be good to better understand the > >> sequence of events. > >> > >> I don't think there's anything relevant before the messages I've posted, > >> here is a more complete log: > >> > >> (XEN) irq.c:386: Dom91 callback via changed to Direct Vector 0x93 > >> (XEN) irq.c:386: Dom92 callback via changed to Direct Vector 0x93 > >> (XEN) irq.c:276: Dom91 PCI link 0 changed 5 -> 0 > >> (XEN) irq.c:276: Dom91 PCI link 1 changed 10 -> 0 > >> (XEN) irq.c:276: Dom91 PCI link 2 changed 11 -> 0 > >> (XEN) irq.c:276: Dom91 PCI link 3 changed 5 -> 0 > >> (XEN) irq.c:276: Dom92 PCI link 0 changed 5 -> 0 > >> (XEN) irq.c:276: Dom92 PCI link 1 changed 10 -> 0 > >> (XEN) irq.c:276: Dom92 PCI link 2 changed 11 -> 0 > >> (XEN) irq.c:276: Dom92 PCI link 3 changed 5 -> 0 > >> INIT: Id "T0" respawning too fast: disabled for 5 minutes > >> (XEN) io.c:165:d83v0 Weird HVM ioemulation status 1. > >> (XEN) domain_crash called from io.c:166 > >> (XEN) io.c:165:d83v0 Weird HVM ioemulation status 1. > >> (XEN) domain_crash called from io.c:166 > >> (XEN) io.c:165:d83v0 Weird HVM ioemulation status 1. > >> (XEN) domain_crash called from io.c:166 > >> (XEN) io.c:165:d83v0 Weird HVM ioemulation status 1. > >> (XEN) domain_crash called from io.c:166 > >> (XEN) io.c:165:d83v0 Weird HVM ioemulation status 1. > >> (XEN) domain_crash called from io.c:166 > >> (XEN) io.c:165:d83v0 Weird HVM ioemulation status 1. > >> (XEN) domain_crash called from io.c:166 > >> (XEN) io.c:165:d83v0 Weird HVM ioemulation status 1. > >> (XEN) domain_crash called from io.c:166 > >> (XEN) io.c:165:d83v0 Weird HVM ioemulation status 1. > >> (XEN) domain_crash called from io.c:166 > >> (XEN) io.c:165:d83v0 Weird HVM ioemulation status 1. > >> (XEN) domain_crash called from io.c:166 > >> (XEN) io.c:165:d83v0 Weird HVM ioemulation status 1. > >> (XEN) domain_crash called from io.c:166 > >> > >> If you can provide a debug/trace patch I can run the same workload with > >> it in order to trace the sequence of events. > >> > > > > Could you try this? > > > > diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c > > index 30acb78..1bc3cc9 100644 > > --- a/xen/arch/x86/hvm/emulate.c > > +++ b/xen/arch/x86/hvm/emulate.c > > @@ -145,6 +145,8 @@ static int hvmemul_do_io( > > return X86EMUL_UNHANDLEABLE; > > goto finish_access; > > default: > > + gprintk(XENLOG_ERR, "weird emulation state %u\n", > > + vio->io_req.state); > > return X86EMUL_UNHANDLEABLE; > > } > > > > diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c > > index ec1d797..38d6d99 100644 > > --- a/xen/arch/x86/hvm/hvm.c > > +++ b/xen/arch/x86/hvm/hvm.c > > @@ -2747,6 +2747,7 @@ int hvm_send_ioreq(struct hvm_ioreq_server *s, > ioreq_t *pr > > } > > } > > > > + gprintk(XENLOG_ERR, "unable to contact device model\n"); > > return X86EMUL_UNHANDLEABLE; > > } > > I've applied your patch and the one from Andrew, so my current diff is: > > diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c > index 30acb78..1bc3cc9 100644 > --- a/xen/arch/x86/hvm/emulate.c > +++ b/xen/arch/x86/hvm/emulate.c > @@ -145,6 +145,8 @@ static int hvmemul_do_io( > return X86EMUL_UNHANDLEABLE; > goto finish_access; > default: > + gprintk(XENLOG_ERR, "weird emulation state %u\n", > + vio->io_req.state); > return X86EMUL_UNHANDLEABLE; > } > > diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c > index ec1d797..38d6d99 100644 > --- a/xen/arch/x86/hvm/hvm.c > +++ b/xen/arch/x86/hvm/hvm.c > @@ -2747,6 +2747,7 @@ int hvm_send_ioreq(struct hvm_ioreq_server *s, > ioreq_t *proto_p, > } > } > > + gprintk(XENLOG_ERR, "unable to contact device model\n"); > return X86EMUL_UNHANDLEABLE; > } > > diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c > index d3b9cae..12d50c2 100644 > --- a/xen/arch/x86/hvm/io.c > +++ b/xen/arch/x86/hvm/io.c > @@ -163,7 +163,9 @@ int handle_pio(uint16_t port, unsigned int size, int dir) > break; > default: > gdprintk(XENLOG_ERR, "Weird HVM ioemulation status %d.\n", rc); > - domain_crash(curr->domain); > + show_execution_state(&curr->arch.user_regs); > + dump_execution_state(); > + domain_crash_synchronous(); > break; > } > > And got the following panic while doing a `xl shutdown -w -a` of 20 HVM > guests: > > (XEN) irq.c:386: Dom19 callback via changed to Direct Vector 0x93 > (XEN) irq.c:276: Dom19 PCI link 0 changed 5 -> 0 > (XEN) irq.c:276: Dom19 PCI link 1 changed 10 -> 0 > (XEN) irq.c:276: Dom19 PCI link 2 changed 11 -> 0 > (XEN) irq.c:276: Dom19 PCI link 3 changed 5 -> 0 > (XEN) d10v0 weird emulation state 1 > (XEN) io.c:165:d10v0 Weird HVM ioemulation status 1. > (XEN) Assertion 'diff < STACK_SIZE' failed at traps.c:91 > (XEN) ----[ Xen-4.6-unstable x86_64 debug=y Tainted: C ]---- > (XEN) CPU: 0 > (XEN) RIP: e008:[<ffff82d080234b83>] show_registers+0x60/0x32f > (XEN) RFLAGS: 0000000000010212 CONTEXT: hypervisor (d10v0) > (XEN) rax: 000000001348fc88 rbx: ffff8300cc668290 rcx: 0000000000000000 > (XEN) rdx: ffff8300dfaf0000 rsi: ffff8300cc668358 rdi: ffff8300dfaf7bb8 > (XEN) rbp: ffff8300dfaf7bd8 rsp: ffff8300dfaf7a98 r8: ffff83019d270000 > (XEN) r9: 0000000000000004 r10: 0000000000000004 r11: 0000000000000001 > (XEN) r12: ffff8300cc668000 r13: 0000000000000000 r14: ffff82c00026c000 > (XEN) r15: ffff830198bf9000 cr0: 000000008005003b cr4: 00000000000026e0 > (XEN) cr3: 00000000cc77b000 cr2: ffff880002762df8 > (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008 > (XEN) Xen stack trace from rsp=ffff8300dfaf7a98: > (XEN) ffff8300dfaf7ac8 ffff82d080144b11 0000000000000046 > ffff8300dfaf7ac8 > (XEN) 0000000000000046 0000000000000092 ffff8300dfaf7ae0 > ffff82d08012cfd3 > (XEN) ffff82d0802a1bc0 ffff8300dfaf7af8 0000000000000046 > 0000000000002001 > (XEN) 0000000000002001 fffff80002089e28 0000000000000001 > fffffe00003829c0 > (XEN) 000000000000b004 0000000000000000 0000000000000014 > 0000000000000002 > (XEN) 000000000000b004 0000000000002001 000000000000b005 > 000000000000b004 > (XEN) 0000000000002001 000000000000b004 > 0000beef0000beef<G><0>d15v0 weird emulation state 1 > (XEN) ffffffff8036fa45<G><0>io.c:165:d15v0 Weird HVM ioemulation status > 1. > (XEN) > (XEN) Assertion 'diff < STACK_SIZE' failed at traps.c:91 > (XEN) 000000bf0000beef----[ Xen-4.6-unstable x86_64 debug=y Tainted: > C ]---- > (XEN) 0000000000000046CPU: 6 > (XEN) fffffe00003829c0RIP: e008:[<ffff82d080234b83>] 000000000000beef > show_registers+0x60/0x32f > (XEN) > (XEN) RFLAGS: 0000000000010212 0000000000000000CONTEXT: hypervisor > 0000000000000000 (d15v0) 0000000000000000 > (XEN) rax: 0000000121dd3c88 rbx: ffff83007b4c4290 rcx: 0000000000000000 > (XEN) 0000000000000000rdx: ffff83019d290000 rsi: ffff83007b4c4358 rdi: > ffff83019d297bb8 > (XEN) > (XEN) rbp: ffff83019d297bd8 rsp: ffff83019d297a98 r8: ffff83019d270000 > (XEN) ffff8300cc668290r9: 0000000000000001 r10: 0000000000000001 r11: > 0000000000000001 > (XEN) ffff8300cc668000r12: ffff83007b4c4000 r13: 0000000000000000 r14: > ffff82c000299000 > (XEN) 0000000000000000r15: ffff830198bf9000 cr0: 000000008005003b cr4: > 00000000000026e0 > (XEN) ffff82c00026c000cr3: 000000007b5d7000 cr2: ffff8800026b14d8 > (XEN) > (XEN) ds: 002b es: 002b fs: 0000 gs: 0000 ss: e010 cs: e008 > (XEN) ffff8300dfaf7bf8Xen stack trace from rsp=ffff83019d297a98: > (XEN) ffff82d08018dd4d ffff82d0802685bf 0000000000000001 > ffff830198bf9000 0000000000000002 00007cfe62d68527 > (XEN) ffff82d08023b132 ffff8300dfaf7c38 > (XEN) ffff82d0801caff0 ffff830198bf9000 ffff8300dfaf7c38 ffff82d0802685bf > 0000000000002001 ffff83019d297b70 > (XEN) 0000000000000200 ffff8300cc7da000 > (XEN) ffff83019d29ecc0 ffff83019d297b98 ffff8300cc668000 > 0000000000000000 ffff8300cc7da250 0000000000000001 > (XEN) 0000000000002001 ffff8300dfaf7db8 > (XEN) ffff82d0801c5934 0000000000002001 8000000000000000 > fffff80002089e28 ffff8300cc7da000 0000000000000001 > (XEN) fffffe00003829c0 ffff8300dfaf0000 > (XEN) ffff8300cc7da250 000000000000b004 ffff8300dfaf7cf8 > 0000000000000000 00000000000cc277 0000000000000014 > (XEN) 0000000000000002 0000000000000000 > (XEN) 0000000000000001 000000000000b004 00000000000feff0 > 0000000000002001 ffff8300ccfec820 000000000000b005 > (XEN) 000000000000b004 ffff8300dfaf7d08 > (XEN) ffff82d0801f2009 0000000000002001 ffffffffffffffff 000000000000b004 > ffffffffffffffff 0000beef0000beef > (XEN) ffffffff8036fa45 00000000000001f0 > (XEN) 000000004003b000 000000bf0000beef ffff8300cc7da000 > 0000000000000046 0000000000000000 fffffe00003829c0 > (XEN) 000000000000beef ffff8300ccfec820 > (XEN) 00000000000cc278 0000000000000000 ffff8300ccfec820 > 0000000000000000 ffff8300cc7da000 0000000000000000 > (XEN) 0000000000000000 ffff8300dfaf7da8 > (XEN) ffff82d080122c5a ffff83007b4c4290 ffff8300dfaf7db8 > ffff83007b4c4000 ffff8300dfaf7d28 0000000000000000 > (XEN) ffff82c000299000Xen call trace: > (XEN) > (XEN) [<ffff82d080234b83>] show_registers+0x60/0x32f > (XEN) ffff83019d297bf8 [<ffff82d08018dd4d>] > show_execution_state+0x11/0x20 > (XEN) ffff82d08018dd4d [<ffff82d0801caff0>] handle_pio+0x129/0x158 > (XEN) 0000000000000001 [<ffff82d0801c5934>] > hvm_do_resume+0x258/0x33e > (XEN) 0000000000000002 [<ffff82d0801e3166>] > vmx_do_resume+0x12b/0x142 Ok, so this is a handle_pio() that's being called to pick up the results of an I/O that was sent to an external emulator. Did you manage to apply my patch too? I'd really like to know what part of the emulation handling is actually returning X86EMUL_UNHANDLEABLE. Paul > (XEN) > (XEN) [<ffff82d080164adc>] context_switch+0xf0c/0xf63 > (XEN) ffff83019d297c38 [<ffff82d0801299e0>] schedule+0x5b9/0x612 > (XEN) ffff82d0801caff0 [<ffff82d08012c765>] __do_softirq+0x82/0x8d > (XEN) ffff83019d297c38 [<ffff82d08012c7bd>] do_softirq+0x13/0x15 > (XEN) 0000000000002001 [<ffff82d08023ace1>] process_softirqs+0x21/0x30 > (XEN) > (XEN) > (XEN) ffff83007b637000 > (XEN) **************************************** > (XEN) ffff83007b60aed0Panic on CPU 0: > (XEN) ffff83007b4c4000Assertion 'diff < STACK_SIZE' failed at traps.c:91 > (XEN) ffff83007b637250**************************************** > (XEN) > (XEN) > (XEN) Reboot in five seconds... > (XEN) ffff83019d297db8 _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |