Re: [Xen-devel] High CPU temp, suspend problem - xen 4.1.5-pre, linux 3.7.x

On 22.03.2013 17:56, Konrad Rzeszutek Wilk wrote:
> On Fri, Mar 22, 2013 at 04:34:11PM +0100, Marek Marczykowski wrote:
>> I've switched to 3.8.4, on which the problem is much easier to reproduce
>> (almost every startup).
>> On a bad bootup, xen-acpi-processor didn't find any C-states: for each CPU
>> _pr.flags.power and _pr->power.count were 0 (but flags.power_setup_done=1). In
>> this case suspend (or shutdown) always ends up with a reset.
> This is you booting the machine from a cold-state or a warm one?

Doesn't matter - in both cases the same result.

> There are some BIOSes out there that I know that use the scratchpad registers
> in IOH (so depending on the platform that can be 0:0e.1, Reg 0x84). If Xen or
> Linux touch it then the P-states and C-states that the BIOS generates are buggy.
> But that is not the case here - you are saying that the DSDT after
> disassembling (so cat /sys/firmware/acpi/tables/DSDT, or SSDT* and the iasl -d
> on them), the _PSD, _PSS, and _PCT look the same?

The binary versions are the same, so I assume the disassembled ones are too.
I've copied the full /sys/firmware/acpi/tables directory at several startups,
and in all cases (both cold and warm startups) all the tables were the same.
If I notice any difference, I will check the disassembled versions.
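
For anyone who wants to repeat this comparison, it can be scripted. The
following is only a minimal sketch, not what was actually run here: it assumes
each startup's copy of /sys/firmware/acpi/tables was saved to its own directory
(the names snapshot_digests and diff_snapshots, and whatever directory names are
passed in, are hypothetical) and it compares the copies byte for byte via
SHA-256 digests.

```python
import hashlib
import sys
from pathlib import Path


def snapshot_digests(root):
    """Map each file under root (by relative path) to the SHA-256 of its bytes."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in root.rglob("*")
        if p.is_file()
    }


def diff_snapshots(a, b):
    """Return the sorted relative paths that differ or exist on only one side."""
    da, db = snapshot_digests(a), snapshot_digests(b)
    return sorted(name for name in da.keys() | db.keys() if da.get(name) != db.get(name))


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (hypothetical script and directory names):
    #   python compare_tables.py tables-cold-boot tables-warm-boot
    changed = diff_snapshots(sys.argv[1], sys.argv[2])
    print("\n".join(changed) if changed else "all tables identical")
```

An empty result only says the raw tables match; it says nothing about how Xen
or the kernel evaluated them at runtime.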

> You could also look at the FACP table and see if they are different.
>> On a good one xen-acpi-processor got C1-C3 states for each CPU, then suspend
>> succeeded, but after resume CPU0 had C1-C3, but the others only C1. Reloading
>> xen-acpi-processor (rmmod -f...) fixes this (according to xl debug-key c),
>> but the temperature still stays high. Regardless of xen-acpi-processor
>> reloading, the next suspend always fails.
> If you reload, and look at the runqueues, are all of them using the ACPI
> idler or the default one?

The ACPI one (before reload and after).

>> Not sure how C-states can be related to S3 suspend, but perhaps something
>> more general with ACPI is wrong?
> This reminds me of something. I recall a long long time ago seeing something
> like this....
> Completely forgot about this until now. The difference was whether Xen's
> cpu_idle was running a) the acpi_idle (so using the different C-states), or
> b) the default one (so just using HLT).
> With b), during resume it would get half-way through
> (http://darnok.org/xen/devel.acpi-s3.v1.serial.log) while with a) it would
> actually continue on - http://darnok.org/xen/devel.acpi-s3.v0.serial.log
> This was on some MSI MS-7680/H61M-P23 (MS-7680) motherboard.
> Oh look: http://lists.xen.org/archives/html/xen-devel/2011-06/msg02059.html
> And it looks like Kevin's recommendation was to use the a) case with
> max_cstate=1 to narrow it down.

When default_idle is used, resume doesn't work at all (not even the first one).

(1) With max_cstate=1, without xen-acpi-processor module: default_idle used.
Suspend succeeds, but resume always hangs.

(2) With max_cstate=1, with xen-acpi-processor module loaded: acpi_idle used.
Suspend succeeds, and so does resume, but after resume the problem described
above appears (high temperature, C2-C3 states present only on CPU0, subsequent
suspends always end in a reboot).

(3) Without max_cstate=1, with xen-acpi-processor module loaded: same as (2).

(4) Without max_cstate=1, without xen-acpi-processor module loaded: same as (1).

One more observation: when Xen is compiled with debug=y, cases (2) and (4)
behave the same as (1).

Hopefully I will have a real serial console sometime this week and will be
able to get more details from the hang and reboot cases.

BTW, is there any chance of the Xen ACPI S3 patches getting into the upstream kernel?

Best Regards / Pozdrawiam,
Marek Marczykowski
Invisible Things Lab
