Re: [Xen-devel] High CPU temp, suspend problem - xen 4.1.5-pre, linux 3.7.x

On 25.03.2013 15:17, Konrad Rzeszutek Wilk wrote:
> On Mon, Mar 25, 2013 at 12:36:31PM +0100, Marek Marczykowski wrote:
>> On 22.03.2013 17:56, Konrad Rzeszutek Wilk wrote:
>>> On Fri, Mar 22, 2013 at 04:34:11PM +0100, Marek Marczykowski wrote:
>>> This reminds me of something. I recall seeing something like this a long,
>>> long time ago, and had completely forgotten about it until now. The
>>> difference was whether Xen's cpu_idle was running a) the acpi_idle (so
>>> using the different C-states), or b) the default one (so just using HLT).
>>> With b), resume would get only half-way through
>>> (http://darnok.org/xen/devel.acpi-s3.v1.serial.log), while with a) it would
>>> actually continue on: http://darnok.org/xen/devel.acpi-s3.v0.serial.log
>>> This was on some MSI MS-7680/H61M-P23 (MS-7680) motherboard.
>>> Oh look: http://lists.xen.org/archives/html/xen-devel/2011-06/msg02059.html
>>> And it looks like Kevin's recommendation was to use case a) with
>>> max_cstate=1 to narrow it down.
>> When default_idle is used, resume doesn't work at all (not even the first
>> one). Details:
>> (1) With max_cstate=1, without xen-acpi-processor module: default_idle used.
>> Suspend succeeds, but resume always hangs.
> AHA! So the bug persists.
>> (2) With max_cstate=1, with xen-acpi-processor module loaded: acpi_idle used.
>> Suspend succeeds, and so does resume, but after resume the above problem
>> appears (high temperature, C2-C3 states present only on CPU0, subsequent
>> suspends always end in a reboot).
>> (3) Without max_cstate=1, with xen-acpi-processor module loaded: same as (2).
>> (4) Without max_cstate=1, without xen-acpi-processor module loaded: same as
>> (1).
>> One more observation: when Xen is compiled with debug=y, cases (2) and (4)
>> behave the same as (1).
> Oh, that is something new.

Finally got a serial console :)
The debug=y problem (which actually occurs at resume) is:
(XEN) Assertion 'test_bit(vector, cfg->used_vectors)' failed at io_apic.c:542
(XEN) ----[ Xen-4.1.5-rc1  x86_64  debug=y  Tainted:    C ]----
(XEN) CPU:    0
(XEN) RIP:    e008:[<ffff82c48015e288>] 
(XEN) RFLAGS: 0000000000010046   CONTEXT: hypervisor
(XEN) rax: 0000000000000000   rbx: 00000000000000e9   rcx: ffff82c48029ff18
(XEN) rdx: 00000000000000e9   rsi: 000000000000002a   rdi: ffff830421060538
(XEN) rbp: ffff82c48029ff08   rsp: ffff82c48029feb8   r8:  ffff88041820eb60
(XEN) r9:  0000000000000000   r10: 0000000000007ff0   r11: 0000000000000000
(XEN) r12: ffff830421080250   r13: ffff830421060534   r14: ffff82c48029ff18
(XEN) r15: ffff82c4802dd9e0   cr0: 000000008005003b   cr4: 00000000000026f0
(XEN) cr3: 0000000300b81000   cr2: ffff880402070198
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
(XEN) Xen stack trace from rsp=ffff82c48029feb8:
(XEN)    0000000000000000 000000000000e030 ffff82c48029ff18 ffff82c4802dd9e0
(XEN)    ffff8802cac3c7c0 00000000ffff3729 00000000ffff3729 000000013fff3728
(XEN)    ffffffff81b907c0 00000000ffff3729 00007d3b7fd600c7 ffff82c48014de60
(XEN)    00000000ffff3729 ffffffff81b907c0 000000013fff3728 00000000ffff3729
(XEN)    ffffffff81a01e18 00000000ffff3729 0000000000000000 0000000000007ff0
(XEN)    0000000000000000 ffff88041820eb60 ffff8803fd1820a8 ffffffff81b90a88
(XEN)    000000000000002a 000000000000002a 00000000ffff372a 0000002000000000
(XEN)    ffffffff8105dd5a 000000000000e033 0000000000000246 ffffffff81a01db8
(XEN)    000000000000e02b 0000000000000000 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000 ffff8300ca9a0000 0000000000000000
(XEN)    0000000000000000
(XEN) Xen call trace:
(XEN)    [<ffff82c48015e288>] smp_irq_move_cleanup_interrupt+0x1c3/0x23d
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) Assertion 'test_bit(vector, cfg->used_vectors)' failed at io_apic.c:542
(XEN) ****************************************

Best Regards / Pozdrawiam,
Marek Marczykowski
Invisible Things Lab

