
Re: [Xen-devel] [BUG] Assertion '(sp == 0) || (peoi[sp-1].vector < vector)' failed at irq.c:1163



On 17 Jan 2016 17:30, Håkon Alstadheim wrote:
> On 17 Jan 2016 16:16, Andrew Cooper wrote:
>> On 17/01/16 14:50, Håkon Alstadheim wrote:
>>> On 15 Jan 2016 12:05, Andrew Cooper wrote:
>>>> On 15/01/16 10:58, Håkon Alstadheim wrote:
>>>>> CPUINFO:
>>>>> vendor_id    : GenuineIntel
>>>>> cpu family    : 6
>>>>> model        : 63
>>>>> model name    : Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz
>>>>>
>>>>> # smbios-sys-info
>>>>> Libsmbios version:      2.2.28
>>>>> Product Name:           Z10PE-D8 WS
>>>>> Vendor:                 ASUSTeK COMPUTER INC.
>>>>> BIOS Version:           3101
>>>>>
>>>>>
>>>>> I have been experiencing issues with domains that use passed-through
>>>>> PCIe devices since I first installed Xen. That was at version 4.5.x;
>>>>> I'm now at 4.6.0 with Gentoo patches. Crashes SEEM mostly related to
>>>>> this PCI passthrough and interrupts (USB cards, sound cards).
>>>>>
>>>>> Recently the system has been more stable; whether that is because I
>>>>> pass through as few things as possible, or because of improvements in
>>>>> Xen, I do not know. I have also taken to building with debug, which
>>>>> leads to more abrupt but less mysterious failures. Earlier (without
>>>>> debug and under Xen 4.5) things would just gradually stop working and
>>>>> end up in a total hang of everything. So, hey, things are improving :-b
>>>> This isn't the first time we have seen this on Haswell processors. Do
>>>> you have microcode loading set up?
>>>>
>>>> ~Andrew
>>>>
>>> Still happening with kernel-genkernel-x86_64-4.1.15-gentoo and updated
>>> cpu microcode, using microcode from 20151106.
>> Ok - I previously investigated this issue, but my repro evaporated from
>> under my feet with a firmware update, and I never got to the bottom of it.
>>
>> Please can you start with the following patch which will dump some more
>> information on crash.
>>
>> ---8<---
>> diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
>> index 1228568..588b562 100644
>> --- a/xen/arch/x86/irq.c
>> +++ b/xen/arch/x86/irq.c
>> @@ -1165,6 +1165,13 @@ static void __do_IRQ_guest(int irq)
>>      if ( action->ack_type == ACKTYPE_EOI )
>>      {
>>          sp = pending_eoi_sp(peoi);
>> +        if ( unlikely(!((sp == 0) || (peoi[sp-1].vector < vector))) )
>> +        {
>> +            int p;
>> +            for ( p = sp; p > 0; --p )
>> +                printk("**peoi[%d] = {%d, 0x%u, %d}\n",
>> +                       p-1, peoi[p-1].irq, peoi[p-1].vector,
>> peoi[p-1].ready);
>> +        }
>>          ASSERT((sp == 0) || (peoi[sp-1].vector < vector));
>>          ASSERT(sp < (NR_DYNAMIC_VECTORS-1));
>>          peoi[sp].irq = irq;
>>
>>
> Will do. Building now.
> It seems a line was accidentally folded: "peoi[p-1].ready);" belongs at
> the end of the preceding line, I presume?
>
There we go :-/ . The log is attached, from boot to the assertion failure,
with loglvl=all guest_loglvl=all. Some of the log output might be a bit
cryptic: those lines are notes to myself from local boot scripts, basically
firing up my router/name-server/DHCP server and waiting until services
are ready before continuing.

---
(XEN) [2016-01-17 22:46:38] **peoi[0] = {107, 0x40, 0}
(XEN) [2016-01-17 22:46:38] Assertion '(sp == 0) || (peoi[sp-1].vector < vector)' failed at irq.c:1170
(XEN) [2016-01-17 22:46:38] ----[ Xen-4.6.0  x86_64  debug=y  Tainted:    C ]----
(XEN) [2016-01-17 22:46:38] CPU:    21
(XEN) [2016-01-17 22:46:38] RIP:    e008:[<ffff82d0801701e0>] do_IRQ+0x42c/0x6c9
(XEN) [2016-01-17 22:46:38] RFLAGS: 0000000000010046   CONTEXT: hypervisor
(XEN) [2016-01-17 22:46:38] rax: 0000000000000028   rbx: 0000000000000028   rcx: 0000000000000000
(XEN) [2016-01-17 22:46:38] rdx: ffff83107e040000   rsi: 000000000000000a   rdi: ffff82d0802ab768
(XEN) [2016-01-17 22:46:38] rbp: ffff83107e047dc8   rsp: ffff83107e047d58   r8:  ffff83083ff00000
(XEN) [2016-01-17 22:46:38] r9:  0000000000000002   r10: 0000000000000026   r11: 0000000000000002
(XEN) [2016-01-17 22:46:38] r12: ffff830b12ea3ef0   r13: ffff830839bc8480   r14: 0000000000000001
(XEN) [2016-01-17 22:46:38] r15: 000000000000006b   cr0: 000000008005003b   cr4: 00000000001526e0
(XEN) [2016-01-17 22:46:38] cr3: 000000107e08c000   cr2: 00007f2d8d2a6000
(XEN) [2016-01-17 22:46:38] ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
(XEN) [2016-01-17 22:46:38] Xen stack trace from rsp=ffff83107e047d58:
(XEN) [2016-01-17 22:46:38]    000000000000006b ffff830839bc8480 ffff830000000000 ffff830839c06b24
(XEN) [2016-01-17 22:46:38]    0000000000000000 0000006b00000001 0000000000000000 ffff83107e047de0
(XEN) [2016-01-17 22:46:38]    ffff82d0801d502c 000027734f4a9a61 ffff83107e0568e0 0000000000000004
(XEN) [2016-01-17 22:46:38]    0000000000000008 ffff83107e0569a0 00007cef81fb8207 ffff82d08023b132
(XEN) [2016-01-17 22:46:38]    ffff83107e0569a0 0000000000000008 0000000000000004 ffff83107e0568e0
(XEN) [2016-01-17 22:46:38]    ffff83107e047ef0 000027734f4a9a61 0000277396d1352e ffff83107f9bc920
(XEN) [2016-01-17 22:46:38]    ffff8304d3cfb650 0000000000000585 ffff830839bc8020 20c49ba5e353f7cf
(XEN) [2016-01-17 22:46:38]    ffff83107e040000 000027734f4a7acd ffff83107e056910 0000002800000000
(XEN) [2016-01-17 22:46:38]    ffff82d0801af14a 000000000000e008 0000000000000202 ffff83107e047e80
(XEN) [2016-01-17 22:46:38]    0000000000000000 000000206fd4e000 000027734f43d00e ffff82d080321080
(XEN) [2016-01-17 22:46:38]    ffffffffffffffff ffff83107e040000 0000000000000000 0000000000000000
(XEN) [2016-01-17 22:46:38]    000003e4000003d3 ffff83107e040000 ffff83107e040000 ffff83007ddbb000
(XEN) [2016-01-17 22:46:38]    00000000ffffffff ffff83083ffe7000 ffff83054cec4000 ffff83107e047f10
(XEN) [2016-01-17 22:46:38]    ffff82d0801607bc ffff82d08012c574 ffff83006fd4e000 ffff83107e047dd8
(XEN) [2016-01-17 22:46:38]    ffff8803b120c000 ffff8803b120c000 ffff8803b120c000 0000000000000000
(XEN) [2016-01-17 22:46:38]    ffff8803b120beb8 0000000000000000 0000000000000246 ffff8801faf97bd0
(XEN) [2016-01-17 22:46:38]    0000000000000000 0000000000000000 0000000000000000 ffffffff81a2e700
(XEN) [2016-01-17 22:46:38]    ffff8803bf2ada70 0000000000000000 0000000000000000 0000beef0000beef
(XEN) [2016-01-17 22:46:38]    ffffffff81038162 000000bf0000beef 0000000000000286 ffff8803b120beb8
(XEN) [2016-01-17 22:46:38]    000000000000beef 000000000000beef 000000000000beef 000000000000beef
(XEN) [2016-01-17 22:46:38] Xen call trace:
(XEN) [2016-01-17 22:46:38]    [<ffff82d0801701e0>] do_IRQ+0x42c/0x6c9
(XEN) [2016-01-17 22:46:38]    [<ffff82d08023b132>] common_interrupt+0x62/0x70
(XEN) [2016-01-17 22:46:38]    [<ffff82d0801af14a>] mwait_idle+0x2cb/0x315
(XEN) [2016-01-17 22:46:38]    [<ffff82d0801607bc>] idle_loop+0x51/0x6b
(XEN) [2016-01-17 22:46:38]
(XEN) [2016-01-17 22:46:38]
(XEN) [2016-01-17 22:46:38] ****************************************
(XEN) [2016-01-17 22:46:38] Panic on CPU 21:
(XEN) [2016-01-17 22:46:38] Assertion '(sp == 0) || (peoi[sp-1].vector < vector)' failed at irq.c:1170
(XEN) [2016-01-17 22:46:38] ****************************************
(XEN) [2016-01-17 22:46:38]
(XEN) [2016-01-17 22:46:38] Reboot in five seconds...
---
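
A note on reading the first log line: the debug patch prints the vector
with "0x%u", so the "0x40" in "**peoi[0] = {107, 0x40, 0}" is decimal 40,
i.e. 0x28. Below is a minimal standalone sketch of the condition that
fired, written outside Xen. It assumes a simplified struct with the three
fields the patch prints; PEOI_DEPTH, peoi_order_ok, peoi_push and
peoi_format are hypothetical names for illustration, not Xen functions.
The incoming vector is not printed in the log, so any value passed for it
here is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define PEOI_DEPTH 8  /* illustrative depth, not Xen's NR_DYNAMIC_VECTORS */

/* Simplified stand-in for a pending-EOI stack entry (field widths assumed). */
struct pending_eoi {
    unsigned int irq;
    unsigned int vector;
    unsigned int ready;
};

/* The asserted condition: the stack is empty, or the entry on top has a
 * strictly lower vector than the one about to be pushed. */
bool peoi_order_ok(const struct pending_eoi *peoi, int sp, unsigned int vector)
{
    return (sp == 0) || (peoi[sp - 1].vector < vector);
}

/* Push an entry if the ordering holds. Returns the new stack pointer,
 * or -1 where the hypervisor would hit the assertion instead. */
int peoi_push(struct pending_eoi *peoi, int sp,
              unsigned int irq, unsigned int vector)
{
    assert(sp >= 0 && sp < PEOI_DEPTH);
    if (!peoi_order_ok(peoi, sp, vector))
        return -1;
    peoi[sp].irq = irq;
    peoi[sp].vector = vector;
    peoi[sp].ready = 0;
    return sp + 1;
}

/* Format one entry like the debug patch does, but with %x after "0x"
 * so that the printed vector really is hexadecimal. */
void peoi_format(char *buf, size_t len,
                 const struct pending_eoi *peoi, int idx)
{
    snprintf(buf, len, "**peoi[%d] = {%u, 0x%x, %u}",
             idx, peoi[idx].irq, peoi[idx].vector, peoi[idx].ready);
}
```

With one entry {irq 107, vector 40} already on the stack, pushing a second
entry whose vector is 40 or lower is rejected, while a higher vector is
accepted. Formatting entry 0 with peoi_format gives
"**peoi[0] = {107, 0x28, 0}" rather than the "0x40" seen in the log.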


Attachment: xen-serial.1.log
Description: Text Data

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 

