Re: [Xen-devel] High CPU temp, suspend problem - xen 4.1.5-pre, linux 3.7.x
On 01.04.2013 15:53, Ben Guthro wrote:
> On Thu, Mar 28, 2013 at 3:03 PM, Marek Marczykowski
> <marmarek@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>> (XEN) Restoring affinity for d2v3
>> (XEN) Assertion '!cpus_empty(cpus) && cpu_isset(cpu, cpus)' failed at
>> sched_credit.c:481
>
>
> I think the "fix-suspend-scheduler-*" patches posted here are applicable here:
> http://markmail.org/message/llj3oyhgjzvw3t23
>
>
> Specifically, I think you need this bit:
>
> diff --git a/xen/common/cpu.c b/xen/common/cpu.c
> index 630881e..e20868c 100644
> --- a/xen/common/cpu.c
> +++ b/xen/common/cpu.c
> @@ -5,6 +5,7 @@
>  #include <xen/init.h>
>  #include <xen/sched.h>
>  #include <xen/stop_machine.h>
> +#include <xen/sched-if.h>
>
>  unsigned int __read_mostly nr_cpu_ids = NR_CPUS;
>  #ifndef nr_cpumask_bits
> @@ -212,6 +213,8 @@ void enable_nonboot_cpus(void)
>              BUG_ON(error == -EBUSY);
>              printk("Error taking CPU%d up: %d\n", cpu, error);
>          }
> +        if (system_state == SYS_STATE_resume)
> +            cpumask_set_cpu(cpu, cpupool0->cpu_valid);
>      }
>
>      cpumask_clear(&frozen_cpus);
>

Indeed, this makes things better, but it is still not ideal. After resume all
CPUs are now in Pool-0, which is good. But CPU0 is strongly preferred over the
others (see xl vcpu-list). For example, if I start 4 busy loops in dom0, I get
(even after some time):

[user@dom0 ~]$ xl vcpu-list
Name                                ID  VCPU   CPU State   Time(s) CPU Affinity
dom0                                 0     0     0   r--      98.5  any cpu
dom0                                 0     1     0   ---     181.3  any cpu
dom0                                 0     2     2   r--     262.4  any cpu
dom0                                 0     3     3   r--     230.8  any cpu
netvm                                1     0     0   -b-      18.4  any cpu
netvm                                1     1     0   -b-       9.1  any cpu
netvm                                1     2     0   -b-       7.1  any cpu
netvm                                1     3     0   -b-       5.4  any cpu
firewallvm                           2     0     0   -b-      10.7  any cpu
firewallvm                           2     1     0   -b-       3.0  any cpu
firewallvm                           2     2     0   -b-       2.5  any cpu
firewallvm                           2     3     3   -b-       3.6  any cpu

If I remove some CPU from Pool-0 and re-add it, things go back to normal for
that particular CPU (so I get two equally used CPUs) - to fully restore the
system I must remove all CPUs but CPU0 from Pool-0 and add them again.
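
To put a number on the imbalance, here is a minimal sketch that tallies how many vCPUs the listing shows on each physical CPU. It is not part of Xen or xl: the function name vcpus_per_cpu is made up for illustration, and it assumes the column layout shown in this message and domain names without whitespace.

```python
from collections import Counter

# Sample copied from the `xl vcpu-list` output quoted in this message.
SAMPLE = """\
Name        ID  VCPU  CPU State  Time(s) CPU Affinity
dom0         0     0    0   r--     98.5  any cpu
dom0         0     1    0   ---    181.3  any cpu
dom0         0     2    2   r--    262.4  any cpu
dom0         0     3    3   r--    230.8  any cpu
netvm        1     0    0   -b-     18.4  any cpu
netvm        1     1    0   -b-      9.1  any cpu
netvm        1     2    0   -b-      7.1  any cpu
netvm        1     3    0   -b-      5.4  any cpu
firewallvm   2     0    0   -b-     10.7  any cpu
firewallvm   2     1    0   -b-      3.0  any cpu
firewallvm   2     2    0   -b-      2.5  any cpu
firewallvm   2     3    3   -b-      3.6  any cpu
"""


def vcpus_per_cpu(text):
    """Count vCPUs per physical CPU from `xl vcpu-list`-style text.

    Hypothetical helper: assumes whitespace-separated columns in the
    order Name, ID, VCPU, CPU, State, Time(s), Affinity.
    """
    counts = Counter()
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and the header row (its ID column is not numeric).
        if len(fields) < 6 or not fields[1].isdigit():
            continue
        cpu = fields[3]
        # Ignore rows whose CPU column is not a number.
        if cpu.isdigit():
            counts[int(cpu)] += 1
    return counts


if __name__ == "__main__":
    for cpu, n in sorted(vcpus_per_cpu(SAMPLE).items()):
        print(f"CPU{cpu}: {n} vCPU(s)")
```

On the listing above this reports 9 vCPUs on CPU0, 1 on CPU2, 2 on CPU3 and none on CPU1, which matches the skew described. The same function could be fed live output from `xl vcpu-list` to compare placement before and after re-adding CPUs to Pool-0.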

Also, still only CPU0 has all C-states (C0-C3); all the others have only
C0-C1. This could probably be fixed by your "xen: Re-upload processor PM data
to hypervisor after S3 resume" patch (reloading the xen-acpi-processor module
helps here). But I don't think that is the right way: it isn't necessary on
other systems (with somewhat older hardware). Something must be missing on the
resume path. The question is what... Perhaps someone needs to go through
enable_nonboot_cpus() (__cpu_up?) and check whether it restores everything
disabled in disable_nonboot_cpus() (__cpu_disable?). Unfortunately I don't
know the x86 details well enough to follow that code...

-- 
Best Regards / Pozdrawiam,
Marek Marczykowski
Invisible Things Lab

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel