Re: [Xen-devel] High CPU temp, suspend problem - xen 4.1.5-pre, linux 3.7.x
On 26.03.2013 17:12, Andrew Cooper wrote:
> On 26/03/2013 15:47, Andrew Cooper wrote:
>> On 26/03/2013 13:50, Marek Marczykowski wrote:
>>> On 26.03.2013 14:11, Jan Beulich wrote:
>>>>>>> On 26.03.13 at 13:17, Marek Marczykowski
>>>>>>> <marmarek@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>>>>> Finally got serial console :)
>>>>> The debug=y problem is (actually at resume):
>>>>> (XEN) Assertion 'test_bit(vector, cfg->used_vectors)' failed at
>>>>> io_apic.c:542
>>>>> (XEN) ----[ Xen-4.1.5-rc1 x86_64 debug=y Tainted: C ]----
>>>>> (XEN) CPU: 0
>>>>> (XEN) RIP: e008:[<ffff82c48015e288>]
>>>>> smp_irq_move_cleanup_interrupt+0x1c3/0x23d
>>>>> (XEN) RFLAGS: 0000000000010046 CONTEXT: hypervisor
>>>>> (XEN) rax: 0000000000000000 rbx: 00000000000000e9 rcx:
>>>>> ffff82c48029ff18
>>>>> (XEN) rdx: 00000000000000e9 rsi: 000000000000002a rdi:
>>>>> ffff830421060538
>>>>> (XEN) rbp: ffff82c48029ff08 rsp: ffff82c48029feb8 r8:
>>>>> ffff88041820eb60
>>>>> (XEN) r9: 0000000000000000 r10: 0000000000007ff0 r11:
>>>>> 0000000000000000
>>>>> (XEN) r12: ffff830421080250 r13: ffff830421060534 r14:
>>>>> ffff82c48029ff18
>>>>> (XEN) r15: ffff82c4802dd9e0 cr0: 000000008005003b cr4:
>>>>> 00000000000026f0
>>>>> (XEN) cr3: 0000000300b81000 cr2: ffff880402070198
>>>>> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008
>>>>> (XEN) Xen stack trace from rsp=ffff82c48029feb8:
>>>>> (XEN) 0000000000000000 000000000000e030 ffff82c48029ff18
>>>>> ffff82c4802dd9e0
>>>>> (XEN) ffff8802cac3c7c0 00000000ffff3729 00000000ffff3729
>>>>> 000000013fff3728
>>>>> (XEN) ffffffff81b907c0 00000000ffff3729 00007d3b7fd600c7
>>>>> ffff82c48014de60
>>>>> (XEN) 00000000ffff3729 ffffffff81b907c0 000000013fff3728
>>>>> 00000000ffff3729
>>>>> (XEN) ffffffff81a01e18 00000000ffff3729 0000000000000000
>>>>> 0000000000007ff0
>>>>> (XEN) 0000000000000000 ffff88041820eb60 ffff8803fd1820a8
>>>>> ffffffff81b90a88
>>>>> (XEN) 000000000000002a 000000000000002a 00000000ffff372a
>>>>> 0000002000000000
>>>>> (XEN) ffffffff8105dd5a 000000000000e033 0000000000000246
>>>>> ffffffff81a01db8
>>>>> (XEN) 000000000000e02b 0000000000000000 0000000000000000
>>>>> 0000000000000000
>>>>> (XEN) 0000000000000000 0000000000000000 ffff8300ca9a0000
>>>>> 0000000000000000
>>>>> (XEN) 0000000000000000
>>>>> (XEN) Xen call trace:
>>>>> (XEN) [<ffff82c48015e288>] smp_irq_move_cleanup_interrupt+0x1c3/0x23d
>>>>> (XEN)
>>>>> (XEN)
>>>>> (XEN) ****************************************
>>>>> (XEN) Panic on CPU 0:
>>>>> (XEN) Assertion 'test_bit(vector, cfg->used_vectors)' failed at
>>>>> io_apic.c:542
>>>>> (XEN) ****************************************
>>>> To make sense of this, we need to know the register (and maybe
>>>> stack) allocation at this point, to know which vector it was that
>>>> triggered the assertion. You can either do this analysis for us, or
>>>> point us at the xen-syms binary matching the xen.gz you used.
>>> "info scope smp_irq_move_cleanup_interrupt" said vector is in %rbx, so 0xe9.
>>>
>>>> From the register values, the most likely candidates are vector 0xe9
>>>> and 0x2a. The former having two registers set to this value seems
>>>> more likely from that angle, but vectors in the 0xe? range should
>>>> never end up in smp_irq_move_cleanup_interrupt().
>>>>
>>>> And if it's the 0x2a one, then we'd need to know what IRQ it was
>>>> last used for. That can't be reconstructed from the data above, so
>>>> would require you being able to reproduce this and adding some
>>>> instrumentation to the code.
>>>>
>>>> Jan
>>>>
>> Could it be something to do with switching virtual wire mode, and having
>> PIC compatibility stuff left in the IO-APIC after leaving the BIOS but
>> before starting back up again?
>>
>> Looking at the stack dump, there is an extra exception frame under what
>> is printed by the assertion failure.
>>
>> 0000002000000000 TRAP_syscall
>
> Apologies - this is a vector 0x20 interrupt, not TRAP_syscall, which
> makes sense as 0x20 is FIRST_DYNAMIC_IRQ which is also the cleanup IPI
> vector.
>
> The other comments still stand, especially as we appear to be
> interrupting dom0 which is already running.
Indeed, dom0 is running at this stage (see log in my second email).
>
> ~Andrew
>
>> ffffffff81a01db8 guest kernel addr
>> 0000000000000246 FLAGS
>> 000000000000e033 FLAT_RING3_CS64
>> ffffffff8105dd5a guest kernel addr
>> 000000000000e02b FLAT_RING3_SS{64,32}
>>
>> So it appears that we are already executing a guest (presumably dom0) by the
>> time this assertion occurs. From the serial, is there any indication that
>> dom0 has started up again?
>>
>> I would have thought that we should have successfully reset the IO-APIC back
>> up properly before we would ever get back around to executing dom0.
>>
>> ~Andrew
>>
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@xxxxxxxxxxxxx
>> http://lists.xen.org/xen-devel
>
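For anyone following the frame decoding quoted above, here is a minimal sketch of how those six stack words can be split into fields. It assumes the words follow the order Andrew lists (a combined error_code/entry_vector word, then rip, cs, rflags, rsp, ss) and that the vector sits in the upper 32 bits of the first word, which matches his reading of 0x20. The function name decode_frame and the dictionary keys are made up for illustration and are not Xen identifiers.

```python
# Hypothetical helper: split the six 64-bit stack words of the exception
# frame into named fields. The field order is an assumption taken from the
# annotated frame in this thread, not read out of the Xen sources.

def decode_frame(words):
    """Decode [vec_err, rip, cs, rflags, rsp, ss] into a dict of fields."""
    if len(words) != 6:
        raise ValueError("expected exactly 6 stack words")
    vec_err, rip, cs, rflags, rsp, ss = words
    return {
        "error_code": vec_err & 0xFFFFFFFF,   # low 32 bits (assumed layout)
        "entry_vector": vec_err >> 32,        # high 32 bits (assumed layout)
        "rip": rip,
        "cs": cs & 0xFFFF,
        "rflags": rflags,
        "rsp": rsp,
        "ss": ss & 0xFFFF,
        "cpl": cs & 3,                        # privilege level from CS selector
        "interrupts_enabled": bool(rflags & 0x200),  # RFLAGS.IF is bit 9
    }


# The six words as they appear in the stack dump above.
DUMP = ("0000002000000000 ffffffff8105dd5a 000000000000e033 "
        "0000000000000246 ffffffff81a01db8 000000000000e02b")

frame = decode_frame([int(w, 16) for w in DUMP.split()])

if __name__ == "__main__":
    for name, value in frame.items():
        print(f"{name:>18}: {value:#x}" if isinstance(value, int)
              and not isinstance(value, bool) else f"{name:>18}: {value}")
```

Under those assumptions the first word splits into entry vector 0x20 with error code 0, and the CS selector 0xe033 has its low two bits set, i.e. ring 3, which is consistent with a 64-bit PV guest kernel having been interrupted.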
--
Best Regards / Pozdrawiam,
Marek Marczykowski
Invisible Things Lab