[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [BUG] Emulation issues



El 29/07/15 a les 14.41, Paul Durrant ha escrit:
>> -----Original Message-----
>> From: Roger Pau Monnà [mailto:roger.pau@xxxxxxxxxx]
>> Sent: 29 July 2015 11:37
>> To: Paul Durrant; xen-devel; Andrew Cooper
>> Subject: Re: [BUG] Emulation issues
>>
>> El 29/07/15 a les 12.27, Paul Durrant ha escrit:
>>>> -----Original Message-----
>>>> From: Roger Pau Monnà [mailto:roger.pau@xxxxxxxxxx]
>>>> Sent: 29 July 2015 11:17
>>>> To: xen-devel; Andrew Cooper; Paul Durrant
>>>> Subject: [BUG] Emulation issues
>>>>
>>>> Hello,
>>>>
>>>> While trying to debug a hotplug scripts issue, I came across what seems
>>>> to be an emulation bug inside of Xen. The result of this is a bunch of
>>>> repeated messages on the serial console:
>>>>
>>>
>>> Was there anything of interest before this? You got an 'unhandleable'
>> emulation which generally should not happen, but I guess there may be a
>> shutdown race in tearing down the ioreq server list and sending emulation
>> requests which may cause hvm_send_ioreq() to return
>> X86EMUL_UNHANDLEABLE. It would be good to better understand the
>> sequence of events.
>>
>> I don't think there's anything relevant before the messages I've posted,
>> here is a more complete log:
>>
>> (XEN) irq.c:386: Dom91 callback via changed to Direct Vector 0x93
>> (XEN) irq.c:386: Dom92 callback via changed to Direct Vector 0x93
>> (XEN) irq.c:276: Dom91 PCI link 0 changed 5 -> 0
>> (XEN) irq.c:276: Dom91 PCI link 1 changed 10 -> 0
>> (XEN) irq.c:276: Dom91 PCI link 2 changed 11 -> 0
>> (XEN) irq.c:276: Dom91 PCI link 3 changed 5 -> 0
>> (XEN) irq.c:276: Dom92 PCI link 0 changed 5 -> 0
>> (XEN) irq.c:276: Dom92 PCI link 1 changed 10 -> 0
>> (XEN) irq.c:276: Dom92 PCI link 2 changed 11 -> 0
>> (XEN) irq.c:276: Dom92 PCI link 3 changed 5 -> 0
>> INIT: Id "T0" respawning too fast: disabled for 5 minutes
>> (XEN) io.c:165:d83v0 Weird HVM ioemulation status 1.
>> (XEN) domain_crash called from io.c:166
>> (XEN) io.c:165:d83v0 Weird HVM ioemulation status 1.
>> (XEN) domain_crash called from io.c:166
>> (XEN) io.c:165:d83v0 Weird HVM ioemulation status 1.
>> (XEN) domain_crash called from io.c:166
>> (XEN) io.c:165:d83v0 Weird HVM ioemulation status 1.
>> (XEN) domain_crash called from io.c:166
>> (XEN) io.c:165:d83v0 Weird HVM ioemulation status 1.
>> (XEN) domain_crash called from io.c:166
>> (XEN) io.c:165:d83v0 Weird HVM ioemulation status 1.
>> (XEN) domain_crash called from io.c:166
>> (XEN) io.c:165:d83v0 Weird HVM ioemulation status 1.
>> (XEN) domain_crash called from io.c:166
>> (XEN) io.c:165:d83v0 Weird HVM ioemulation status 1.
>> (XEN) domain_crash called from io.c:166
>> (XEN) io.c:165:d83v0 Weird HVM ioemulation status 1.
>> (XEN) domain_crash called from io.c:166
>> (XEN) io.c:165:d83v0 Weird HVM ioemulation status 1.
>> (XEN) domain_crash called from io.c:166
>>
>> If you can provide a debug/trace patch I can run the same workload with
>> it in order to trace the sequence of events.
>>
> 
> Could you try this?
> 
> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
> index 30acb78..1bc3cc9 100644
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -145,6 +145,8 @@ static int hvmemul_do_io(
>              return X86EMUL_UNHANDLEABLE;
>          goto finish_access;
>      default:
> +        gprintk(XENLOG_ERR, "weird emulation state %u\n",
> +                vio->io_req.state);
>          return X86EMUL_UNHANDLEABLE;
>      }
> 
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index ec1d797..38d6d99 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -2747,6 +2747,7 @@ int hvm_send_ioreq(struct hvm_ioreq_server *s, ioreq_t 
> *pr
>          }
>      }
> 
> +    gprintk(XENLOG_ERR, "unable to contact device model\n");
>      return X86EMUL_UNHANDLEABLE;
>  }

I've applied your patch and the one from Andrew, so my current diff is:

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 30acb78..1bc3cc9 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -145,6 +145,8 @@ static int hvmemul_do_io(
             return X86EMUL_UNHANDLEABLE;
         goto finish_access;
     default:
+        gprintk(XENLOG_ERR, "weird emulation state %u\n",
+                vio->io_req.state);
         return X86EMUL_UNHANDLEABLE;
     }
 
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index ec1d797..38d6d99 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2747,6 +2747,7 @@ int hvm_send_ioreq(struct hvm_ioreq_server *s, ioreq_t 
*proto_p,
         }
     }
 
+    gprintk(XENLOG_ERR, "unable to contact device model\n");
     return X86EMUL_UNHANDLEABLE;
 }
 
diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index d3b9cae..12d50c2 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -163,7 +163,9 @@ int handle_pio(uint16_t port, unsigned int size, int dir)
         break;
     default:
         gdprintk(XENLOG_ERR, "Weird HVM ioemulation status %d.\n", rc);
-        domain_crash(curr->domain);
+        show_execution_state(&curr->arch.user_regs);
+        dump_execution_state();
+        domain_crash_synchronous();
         break;
     }
 
And got the following panic while doing a `xl shutdown -w -a` of 20 HVM 
guests:

(XEN) irq.c:386: Dom19 callback via changed to Direct Vector 0x93
(XEN) irq.c:276: Dom19 PCI link 0 changed 5 -> 0
(XEN) irq.c:276: Dom19 PCI link 1 changed 10 -> 0
(XEN) irq.c:276: Dom19 PCI link 2 changed 11 -> 0
(XEN) irq.c:276: Dom19 PCI link 3 changed 5 -> 0
(XEN) d10v0 weird emulation state 1
(XEN) io.c:165:d10v0 Weird HVM ioemulation status 1.
(XEN) Assertion 'diff < STACK_SIZE' failed at traps.c:91
(XEN) ----[ Xen-4.6-unstable  x86_64  debug=y  Tainted:    C ]----
(XEN) CPU:    0
(XEN) RIP:    e008:[<ffff82d080234b83>] show_registers+0x60/0x32f
(XEN) RFLAGS: 0000000000010212   CONTEXT: hypervisor (d10v0)
(XEN) rax: 000000001348fc88   rbx: ffff8300cc668290   rcx: 0000000000000000
(XEN) rdx: ffff8300dfaf0000   rsi: ffff8300cc668358   rdi: ffff8300dfaf7bb8
(XEN) rbp: ffff8300dfaf7bd8   rsp: ffff8300dfaf7a98   r8:  ffff83019d270000
(XEN) r9:  0000000000000004   r10: 0000000000000004   r11: 0000000000000001
(XEN) r12: ffff8300cc668000   r13: 0000000000000000   r14: ffff82c00026c000
(XEN) r15: ffff830198bf9000   cr0: 000000008005003b   cr4: 00000000000026e0
(XEN) cr3: 00000000cc77b000   cr2: ffff880002762df8
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
(XEN) Xen stack trace from rsp=ffff8300dfaf7a98:
(XEN)    ffff8300dfaf7ac8 ffff82d080144b11 0000000000000046 ffff8300dfaf7ac8
(XEN)    0000000000000046 0000000000000092 ffff8300dfaf7ae0 ffff82d08012cfd3
(XEN)    ffff82d0802a1bc0 ffff8300dfaf7af8 0000000000000046 0000000000002001
(XEN)    0000000000002001 fffff80002089e28 0000000000000001 fffffe00003829c0
(XEN)    000000000000b004 0000000000000000 0000000000000014 0000000000000002
(XEN)    000000000000b004 0000000000002001 000000000000b005 000000000000b004
(XEN)    0000000000002001 000000000000b004 0000beef0000beef<G><0>d15v0 weird 
emulation state 1
(XEN)  ffffffff8036fa45<G><0>io.c:165:d15v0 Weird HVM ioemulation status 1.
(XEN)
(XEN)   Assertion 'diff < STACK_SIZE' failed at traps.c:91
(XEN)  000000bf0000beef----[ Xen-4.6-unstable  x86_64  debug=y  Tainted:    C 
]----
(XEN)  0000000000000046CPU:    6
(XEN)  fffffe00003829c0RIP:    e008:[<ffff82d080234b83>] 000000000000beef 
show_registers+0x60/0x32f
(XEN)
(XEN) RFLAGS: 0000000000010212    0000000000000000CONTEXT: hypervisor 
0000000000000000 (d15v0) 0000000000000000
(XEN) rax: 0000000121dd3c88   rbx: ffff83007b4c4290   rcx: 0000000000000000
(XEN)  0000000000000000rdx: ffff83019d290000   rsi: ffff83007b4c4358   rdi: 
ffff83019d297bb8
(XEN)
(XEN)   rbp: ffff83019d297bd8   rsp: ffff83019d297a98   r8:  ffff83019d270000
(XEN)  ffff8300cc668290r9:  0000000000000001   r10: 0000000000000001   r11: 
0000000000000001
(XEN)  ffff8300cc668000r12: ffff83007b4c4000   r13: 0000000000000000   r14: 
ffff82c000299000
(XEN)  0000000000000000r15: ffff830198bf9000   cr0: 000000008005003b   cr4: 
00000000000026e0
(XEN)  ffff82c00026c000cr3: 000000007b5d7000   cr2: ffff8800026b14d8
(XEN)
(XEN)   ds: 002b   es: 002b   fs: 0000   gs: 0000   ss: e010   cs: e008
(XEN)  ffff8300dfaf7bf8Xen stack trace from rsp=ffff83019d297a98:
(XEN)    ffff82d08018dd4d ffff82d0802685bf 0000000000000001 ffff830198bf9000 
0000000000000002 00007cfe62d68527
(XEN)    ffff82d08023b132 ffff8300dfaf7c38
(XEN)    ffff82d0801caff0 ffff830198bf9000 ffff8300dfaf7c38 ffff82d0802685bf 
0000000000002001 ffff83019d297b70
(XEN)    0000000000000200 ffff8300cc7da000
(XEN)    ffff83019d29ecc0 ffff83019d297b98 ffff8300cc668000 0000000000000000 
ffff8300cc7da250 0000000000000001
(XEN)    0000000000002001 ffff8300dfaf7db8
(XEN)    ffff82d0801c5934 0000000000002001 8000000000000000 fffff80002089e28 
ffff8300cc7da000 0000000000000001
(XEN)    fffffe00003829c0 ffff8300dfaf0000
(XEN)    ffff8300cc7da250 000000000000b004 ffff8300dfaf7cf8 0000000000000000 
00000000000cc277 0000000000000014
(XEN)    0000000000000002 0000000000000000
(XEN)    0000000000000001 000000000000b004 00000000000feff0 0000000000002001 
ffff8300ccfec820 000000000000b005
(XEN)    000000000000b004 ffff8300dfaf7d08
(XEN)    ffff82d0801f2009 0000000000002001 ffffffffffffffff 000000000000b004 
ffffffffffffffff 0000beef0000beef
(XEN)    ffffffff8036fa45 00000000000001f0
(XEN)    000000004003b000 000000bf0000beef ffff8300cc7da000 0000000000000046 
0000000000000000 fffffe00003829c0
(XEN)    000000000000beef ffff8300ccfec820
(XEN)    00000000000cc278 0000000000000000 ffff8300ccfec820 0000000000000000 
ffff8300cc7da000 0000000000000000
(XEN)    0000000000000000 ffff8300dfaf7da8
(XEN)    ffff82d080122c5a ffff83007b4c4290 ffff8300dfaf7db8 ffff83007b4c4000 
ffff8300dfaf7d28 0000000000000000
(XEN)  ffff82c000299000Xen call trace:
(XEN)
(XEN)      [<ffff82d080234b83>] show_registers+0x60/0x32f
(XEN)  ffff83019d297bf8   [<ffff82d08018dd4d>] show_execution_state+0x11/0x20
(XEN)  ffff82d08018dd4d   [<ffff82d0801caff0>] handle_pio+0x129/0x158
(XEN)  0000000000000001   [<ffff82d0801c5934>] hvm_do_resume+0x258/0x33e
(XEN)  0000000000000002   [<ffff82d0801e3166>] vmx_do_resume+0x12b/0x142
(XEN)
(XEN)      [<ffff82d080164adc>] context_switch+0xf0c/0xf63
(XEN)  ffff83019d297c38   [<ffff82d0801299e0>] schedule+0x5b9/0x612
(XEN)  ffff82d0801caff0   [<ffff82d08012c765>] __do_softirq+0x82/0x8d
(XEN)  ffff83019d297c38   [<ffff82d08012c7bd>] do_softirq+0x13/0x15
(XEN)  0000000000002001   [<ffff82d08023ace1>] process_softirqs+0x21/0x30
(XEN)
(XEN)
(XEN)  ffff83007b637000
(XEN) ****************************************
(XEN)  ffff83007b60aed0Panic on CPU 0:
(XEN)  ffff83007b4c4000Assertion 'diff < STACK_SIZE' failed at traps.c:91
(XEN)  ffff83007b637250****************************************
(XEN)
(XEN)
(XEN)   Reboot in five seconds...
(XEN)  ffff83019d297db8

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.