
Re: [Xen-devel] Only CPU0 active after ACPI S3, xen 4.1.3

On 16/01/13 11:22, George Dunlap wrote:
On 03/01/13 08:52, Jan Beulich wrote:
On 31.12.12 at 13:51, Ben Guthro <ben.guthro@xxxxxxxxx> wrote:
My current suspicion is irq delivery, because of the following messages I
see on the console on the way down:

(XEN) Preparing system for ACPI S3 state.
(XEN) Disabling non-boot CPUs ...
(XEN) Broke affinity for irq 1
(XEN) Broke affinity for irq 9
(XEN) Broke affinity for irq 12
(XEN) Broke affinity for irq 26
(XEN) Broke affinity for irq 30
(XEN) Broke affinity for irq 1
(XEN) Broke affinity for irq 1
(XEN) Entering ACPI S3 state.
No, that's normal behavior. But you ought to be able to verify by
pinning Dom0's vCPU 0 to pCPU 0, and within Dom0 setting the
affinities of all interrupts to CPU 0 - that should make all of these
messages go away.
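
A minimal sketch of that check, to be run as root inside Dom0. It assumes the standard Linux /proc/irq/<N>/smp_affinity interface (a hex CPU bitmask) and the xl toolstack; the helper names cpu_mask, pin_irqs and pin_dom0_vcpu0 are made up for illustration and are not part of any Xen or Linux tooling.

```python
import os
import subprocess


def cpu_mask(cpus):
    """Return the hex bitmask string that /proc/irq/<N>/smp_affinity expects."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, "x")


def pin_irqs(cpus=(0,), proc_irq="/proc/irq"):
    """Write the mask for `cpus` to every numbered IRQ directory.

    Returns (ok, failed): the IRQ numbers that accepted the new mask and
    the ones that refused it.
    """
    mask = cpu_mask(cpus)
    ok, failed = [], []
    for name in sorted(os.listdir(proc_irq), key=lambda n: (len(n), n)):
        if not name.isdigit():
            continue  # skip entries such as default_smp_affinity
        path = os.path.join(proc_irq, name, "smp_affinity")
        try:
            with open(path, "w") as f:
                f.write(mask)
            ok.append(int(name))
        except OSError:
            failed.append(int(name))  # some IRQs do not allow affinity changes
    return ok, failed


def pin_dom0_vcpu0():
    """Pin Dom0's vCPU 0 to pCPU 0 (assumes xl; arguments: domain, vCPU, pCPU)."""
    subprocess.check_call(["xl", "vcpu-pin", "0", "0", "0"])
```

Calling pin_dom0_vcpu0() and then pin_irqs() before suspending should, if the above holds, leave no affinity for Xen to break when the non-boot CPUs go down. IRQs listed in the failed result cannot be moved this way and may still show up in the log.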

Jan - any suggestions on how to proceed with this? FWIW, Xen 4.0.y suspends
on this machine reliably.
With two scheduler related changesets having got spotted as
problematic by now (23255:1f95b55ef427 and 23269:d67e4d12723f,
albeit the latter not really scheduler specific), I'm really very much
hoping for George to have an idea, the more that ...


Sorry I haven't been following the thread -- have you tested this with 4.2, with and without the corresponding patch reverted (25079:d5ccb2d1dbd1)? That might tell us whether the patch itself was wrong, or whether there was a mistake in back-porting the patch (possibly because of different invariants outside of the patched code).

Jan, the commit message isn't very informative -- can you point me to a conversation describing the problem you're fixing wrt suspend/resume, and/or describe what you were trying to do? Given the results, the whole thing about not disabling scheduling during suspend seems a bit suspect...

In particular, just on a fairly cursory bit of function call skimming, it looks like:
* This change means that cpupool.c:cpu_callback() won't call cpupool_cpu_add() when resuming
* cpupool_cpu_add() does a bunch of paperwork (which would be unnecessary given the changes re suspend), but also calls cpupool_assign_cpu_locked()
* cpupool_assign_cpu_locked() calls schedule_cpu_switch()
* schedule_cpu_switch() calls the scheduler's tick_resume()

So is it possible that on resume ticks are not being re-enabled, or something like that?

(And possibly related to Ben's problem, ticks are not being disabled on suspend?)
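
To make the suspected chain concrete, here is a toy model of it. This is not Xen code: only the function names cpu_callback, cpupool_cpu_add, cpupool_assign_cpu_locked, schedule_cpu_switch and tick_resume come from the list above. ToyScheduler, suspend_resume, the skip_add_on_resume flag and the assumption that non-boot CPUs lose their tick on the way down are all invented for illustration.

```python
class ToyScheduler:
    """Tracks which CPUs currently have a scheduler tick (toy model only)."""

    def __init__(self):
        self.ticking = set()

    def tick_suspend(self, cpu):
        self.ticking.discard(cpu)

    def tick_resume(self, cpu):
        self.ticking.add(cpu)


def schedule_cpu_switch(sched, cpu):
    # In this model, the only effect that matters is re-arming the tick.
    sched.tick_resume(cpu)


def cpupool_assign_cpu_locked(sched, cpu):
    schedule_cpu_switch(sched, cpu)


def cpupool_cpu_add(sched, cpu):
    # The "paperwork" is omitted; only the call that continues the chain is kept.
    cpupool_assign_cpu_locked(sched, cpu)


def cpu_callback(sched, cpu, resuming, skip_add_on_resume):
    # Hypothetical flag standing in for the behaviour change under suspicion.
    if resuming and skip_add_on_resume:
        return
    cpupool_cpu_add(sched, cpu)


def suspend_resume(ncpus, skip_add_on_resume):
    """Run one suspend/resume cycle and return the set of CPUs still ticking."""
    sched = ToyScheduler()
    for cpu in range(ncpus):
        sched.tick_resume(cpu)   # before suspend, every CPU is ticking
    for cpu in range(1, ncpus):
        sched.tick_suspend(cpu)  # assumption: non-boot CPUs lose their tick
    for cpu in range(1, ncpus):
        cpu_callback(sched, cpu, resuming=True,
                     skip_add_on_resume=skip_add_on_resume)
    return sched.ticking


print(sorted(suspend_resume(4, skip_add_on_resume=True)))   # [0]
print(sorted(suspend_resume(4, skip_add_on_resume=False)))  # [0, 1, 2, 3]
```

If the real code behaves anything like this model, skipping cpupool_cpu_add() on resume would leave only CPU 0 with a tick, which would match the symptom in the subject line. Whether the real tick state actually follows this path is exactly what needs checking.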

