Re: [Xen-devel] [BUG] Emulation issues
On 29/07/15 at 14.41, Paul Durrant wrote:
>> -----Original Message-----
>> From: Roger Pau Monné [mailto:roger.pau@xxxxxxxxxx]
>> Sent: 29 July 2015 11:37
>> To: Paul Durrant; xen-devel; Andrew Cooper
>> Subject: Re: [BUG] Emulation issues
>>
>> On 29/07/15 at 12.27, Paul Durrant wrote:
>>>> -----Original Message-----
>>>> From: Roger Pau Monné [mailto:roger.pau@xxxxxxxxxx]
>>>> Sent: 29 July 2015 11:17
>>>> To: xen-devel; Andrew Cooper; Paul Durrant
>>>> Subject: [BUG] Emulation issues
>>>>
>>>> Hello,
>>>>
>>>> While trying to debug a hotplug scripts issue, I came across what seems
>>>> to be an emulation bug inside of Xen. The result of this is a bunch of
>>>> repeated messages on the serial console:
>>>>
>>>
>>> Was there anything of interest before this? You got an 'unhandleable'
>>> emulation which generally should not happen, but I guess there may be a
>>> shutdown race in tearing down the ioreq server list and sending emulation
>>> requests which may cause hvm_send_ioreq() to return
>>> X86EMUL_UNHANDLEABLE. It would be good to better understand the
>>> sequence of events.
>>
>> I don't think there's anything relevant before the messages I've posted,
>> here is a more complete log:
>>
>> (XEN) irq.c:386: Dom91 callback via changed to Direct Vector 0x93
>> (XEN) irq.c:386: Dom92 callback via changed to Direct Vector 0x93
>> (XEN) irq.c:276: Dom91 PCI link 0 changed 5 -> 0
>> (XEN) irq.c:276: Dom91 PCI link 1 changed 10 -> 0
>> (XEN) irq.c:276: Dom91 PCI link 2 changed 11 -> 0
>> (XEN) irq.c:276: Dom91 PCI link 3 changed 5 -> 0
>> (XEN) irq.c:276: Dom92 PCI link 0 changed 5 -> 0
>> (XEN) irq.c:276: Dom92 PCI link 1 changed 10 -> 0
>> (XEN) irq.c:276: Dom92 PCI link 2 changed 11 -> 0
>> (XEN) irq.c:276: Dom92 PCI link 3 changed 5 -> 0
>> INIT: Id "T0" respawning too fast: disabled for 5 minutes
>> (XEN) io.c:165:d83v0 Weird HVM ioemulation status 1.
>> (XEN) domain_crash called from io.c:166
>> (XEN) io.c:165:d83v0 Weird HVM ioemulation status 1.
>> (XEN) domain_crash called from io.c:166
>> (XEN) io.c:165:d83v0 Weird HVM ioemulation status 1.
>> (XEN) domain_crash called from io.c:166
>> (XEN) io.c:165:d83v0 Weird HVM ioemulation status 1.
>> (XEN) domain_crash called from io.c:166
>> (XEN) io.c:165:d83v0 Weird HVM ioemulation status 1.
>> (XEN) domain_crash called from io.c:166
>> (XEN) io.c:165:d83v0 Weird HVM ioemulation status 1.
>> (XEN) domain_crash called from io.c:166
>> (XEN) io.c:165:d83v0 Weird HVM ioemulation status 1.
>> (XEN) domain_crash called from io.c:166
>> (XEN) io.c:165:d83v0 Weird HVM ioemulation status 1.
>> (XEN) domain_crash called from io.c:166
>> (XEN) io.c:165:d83v0 Weird HVM ioemulation status 1.
>> (XEN) domain_crash called from io.c:166
>> (XEN) io.c:165:d83v0 Weird HVM ioemulation status 1.
>> (XEN) domain_crash called from io.c:166
>> (XEN) io.c:165:d83v0 Weird HVM ioemulation status 1.
>> (XEN) domain_crash called from io.c:166
>>
>> If you can provide a debug/trace patch I can run the same workload with
>> it in order to trace the sequence of events.
>>
>
> Could you try this?
>
> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
> index 30acb78..1bc3cc9 100644
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -145,6 +145,8 @@ static int hvmemul_do_io(
>              return X86EMUL_UNHANDLEABLE;
>          goto finish_access;
>      default:
> +        gprintk(XENLOG_ERR, "weird emulation state %u\n",
> +                vio->io_req.state);
>          return X86EMUL_UNHANDLEABLE;
>      }
>
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index ec1d797..38d6d99 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -2747,6 +2747,7 @@ int hvm_send_ioreq(struct hvm_ioreq_server *s, ioreq_t
> *pr
>          }
>      }
>
> +    gprintk(XENLOG_ERR, "unable to contact device model\n");
>      return X86EMUL_UNHANDLEABLE;
>  }

I've applied your patch and the one from Andrew, so my current diff is:

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 30acb78..1bc3cc9 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -145,6 +145,8 @@ static int hvmemul_do_io(
             return X86EMUL_UNHANDLEABLE;
         goto finish_access;
     default:
+        gprintk(XENLOG_ERR, "weird emulation state %u\n",
+                vio->io_req.state);
         return X86EMUL_UNHANDLEABLE;
     }

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index ec1d797..38d6d99 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2747,6 +2747,7 @@ int hvm_send_ioreq(struct hvm_ioreq_server *s, ioreq_t *proto_p,
         }
     }

+    gprintk(XENLOG_ERR, "unable to contact device model\n");
     return X86EMUL_UNHANDLEABLE;
 }

diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index d3b9cae..12d50c2 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -163,7 +163,9 @@ int handle_pio(uint16_t port, unsigned int size, int dir)
         break;
     default:
         gdprintk(XENLOG_ERR, "Weird HVM ioemulation status %d.\n", rc);
-        domain_crash(curr->domain);
+        show_execution_state(&curr->arch.user_regs);
+        dump_execution_state();
+        domain_crash_synchronous();
         break;
     }

And got the following panic while doing a `xl shutdown -w -a` of 20 HVM guests:

(XEN) irq.c:386: Dom19 callback via changed to Direct Vector 0x93
(XEN) irq.c:276: Dom19 PCI link 0 changed 5 -> 0
(XEN) irq.c:276: Dom19 PCI link 1 changed 10 -> 0
(XEN) irq.c:276: Dom19 PCI link 2 changed 11 -> 0
(XEN) irq.c:276: Dom19 PCI link 3 changed 5 -> 0
(XEN) d10v0 weird emulation state 1
(XEN) io.c:165:d10v0 Weird HVM ioemulation status 1.
(XEN) Assertion 'diff < STACK_SIZE' failed at traps.c:91
(XEN) ----[ Xen-4.6-unstable x86_64 debug=y Tainted: C ]----
(XEN) CPU: 0
(XEN) RIP: e008:[<ffff82d080234b83>] show_registers+0x60/0x32f
(XEN) RFLAGS: 0000000000010212 CONTEXT: hypervisor (d10v0)
(XEN) rax: 000000001348fc88 rbx: ffff8300cc668290 rcx: 0000000000000000
(XEN) rdx: ffff8300dfaf0000 rsi: ffff8300cc668358 rdi: ffff8300dfaf7bb8
(XEN) rbp: ffff8300dfaf7bd8 rsp: ffff8300dfaf7a98 r8: ffff83019d270000
(XEN) r9: 0000000000000004 r10: 0000000000000004 r11: 0000000000000001
(XEN) r12: ffff8300cc668000 r13: 0000000000000000 r14: ffff82c00026c000
(XEN) r15: ffff830198bf9000 cr0: 000000008005003b cr4: 00000000000026e0
(XEN) cr3: 00000000cc77b000 cr2: ffff880002762df8
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008
(XEN) Xen stack trace from rsp=ffff8300dfaf7a98:
(XEN) ffff8300dfaf7ac8 ffff82d080144b11 0000000000000046 ffff8300dfaf7ac8
(XEN) 0000000000000046 0000000000000092 ffff8300dfaf7ae0 ffff82d08012cfd3
(XEN) ffff82d0802a1bc0 ffff8300dfaf7af8 0000000000000046 0000000000002001
(XEN) 0000000000002001 fffff80002089e28 0000000000000001 fffffe00003829c0
(XEN) 000000000000b004 0000000000000000 0000000000000014 0000000000000002
(XEN) 000000000000b004 0000000000002001 000000000000b005 000000000000b004
(XEN) 0000000000002001 000000000000b004 0000beef0000beef<G><0>d15v0 weird emulation state 1
(XEN) ffffffff8036fa45<G><0>io.c:165:d15v0 Weird HVM ioemulation status 1.
(XEN)
(XEN) Assertion 'diff < STACK_SIZE' failed at traps.c:91
(XEN) 000000bf0000beef----[ Xen-4.6-unstable x86_64 debug=y Tainted: C ]----
(XEN) 0000000000000046CPU: 6
(XEN) fffffe00003829c0RIP: e008:[<ffff82d080234b83>] 000000000000beef show_registers+0x60/0x32f
(XEN)
(XEN) RFLAGS: 0000000000010212 0000000000000000CONTEXT: hypervisor 0000000000000000 (d15v0) 0000000000000000
(XEN) rax: 0000000121dd3c88 rbx: ffff83007b4c4290 rcx: 0000000000000000
(XEN) 0000000000000000rdx: ffff83019d290000 rsi: ffff83007b4c4358 rdi: ffff83019d297bb8
(XEN)
(XEN) rbp: ffff83019d297bd8 rsp: ffff83019d297a98 r8: ffff83019d270000
(XEN) ffff8300cc668290r9: 0000000000000001 r10: 0000000000000001 r11: 0000000000000001
(XEN) ffff8300cc668000r12: ffff83007b4c4000 r13: 0000000000000000 r14: ffff82c000299000
(XEN) 0000000000000000r15: ffff830198bf9000 cr0: 000000008005003b cr4: 00000000000026e0
(XEN) ffff82c00026c000cr3: 000000007b5d7000 cr2: ffff8800026b14d8
(XEN)
(XEN) ds: 002b es: 002b fs: 0000 gs: 0000 ss: e010 cs: e008
(XEN) ffff8300dfaf7bf8Xen stack trace from rsp=ffff83019d297a98:
(XEN) ffff82d08018dd4d ffff82d0802685bf 0000000000000001 ffff830198bf9000 0000000000000002 00007cfe62d68527
(XEN) ffff82d08023b132 ffff8300dfaf7c38
(XEN) ffff82d0801caff0 ffff830198bf9000 ffff8300dfaf7c38 ffff82d0802685bf 0000000000002001 ffff83019d297b70
(XEN) 0000000000000200 ffff8300cc7da000
(XEN) ffff83019d29ecc0 ffff83019d297b98 ffff8300cc668000 0000000000000000 ffff8300cc7da250 0000000000000001
(XEN) 0000000000002001 ffff8300dfaf7db8
(XEN) ffff82d0801c5934 0000000000002001 8000000000000000 fffff80002089e28 ffff8300cc7da000 0000000000000001
(XEN) fffffe00003829c0 ffff8300dfaf0000
(XEN) ffff8300cc7da250 000000000000b004 ffff8300dfaf7cf8 0000000000000000 00000000000cc277 0000000000000014
(XEN) 0000000000000002 0000000000000000
(XEN) 0000000000000001 000000000000b004 00000000000feff0 0000000000002001 ffff8300ccfec820 000000000000b005
(XEN) 000000000000b004 ffff8300dfaf7d08
(XEN) ffff82d0801f2009 0000000000002001 ffffffffffffffff 000000000000b004 ffffffffffffffff 0000beef0000beef
(XEN) ffffffff8036fa45 00000000000001f0
(XEN) 000000004003b000 000000bf0000beef ffff8300cc7da000 0000000000000046 0000000000000000 fffffe00003829c0
(XEN) 000000000000beef ffff8300ccfec820
(XEN) 00000000000cc278 0000000000000000 ffff8300ccfec820 0000000000000000 ffff8300cc7da000 0000000000000000
(XEN) 0000000000000000 ffff8300dfaf7da8
(XEN) ffff82d080122c5a ffff83007b4c4290 ffff8300dfaf7db8 ffff83007b4c4000 ffff8300dfaf7d28 0000000000000000
(XEN) ffff82c000299000Xen call trace:
(XEN)
(XEN) [<ffff82d080234b83>] show_registers+0x60/0x32f
(XEN) ffff83019d297bf8 [<ffff82d08018dd4d>] show_execution_state+0x11/0x20
(XEN) ffff82d08018dd4d [<ffff82d0801caff0>] handle_pio+0x129/0x158
(XEN) 0000000000000001 [<ffff82d0801c5934>] hvm_do_resume+0x258/0x33e
(XEN) 0000000000000002 [<ffff82d0801e3166>] vmx_do_resume+0x12b/0x142
(XEN)
(XEN) [<ffff82d080164adc>] context_switch+0xf0c/0xf63
(XEN) ffff83019d297c38 [<ffff82d0801299e0>] schedule+0x5b9/0x612
(XEN) ffff82d0801caff0 [<ffff82d08012c765>] __do_softirq+0x82/0x8d
(XEN) ffff83019d297c38 [<ffff82d08012c7bd>] do_softirq+0x13/0x15
(XEN) 0000000000002001 [<ffff82d08023ace1>] process_softirqs+0x21/0x30
(XEN)
(XEN)
(XEN) ffff83007b637000
(XEN) ****************************************
(XEN) ffff83007b60aed0Panic on CPU 0:
(XEN) ffff83007b4c4000Assertion 'diff < STACK_SIZE' failed at traps.c:91
(XEN) ffff83007b637250****************************************
(XEN)
(XEN)
(XEN) Reboot in five seconds...
(XEN) ffff83019d297db8

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel