Xen project Mailing List

Re: [Xen-devel] [PATCH] x86/S3: Fix cpu pool scheduling after suspend/resume

To: Ben Guthro <benjamin.guthro@xxxxxxxxxx>

From: Juergen Gross <juergen.gross@xxxxxxxxxxxxxx>

Date: Tue, 09 Apr 2013 14:57:02 +0200

Delivery-date: Tue, 09 Apr 2013 12:58:21 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

On 09.04.2013 14:46, Ben Guthro wrote:

This patch addresses another S3 scheduler problem related to the system_state
variable introduced with the following changeset:
http://xenbits.xen.org/gitweb/?p=xen.git;a=commit;h=269f543ea750ed567d18f2e819e5d5ce58eda5c5

Specifically, the problem is in the cpu_callback function, which takes the CPU
down during suspend and brings it back up during resume.
We were seeing situations where, after S3, only CPU0 was in cpupool0. Guest 
performance suffered greatly, since all vcpus were only on a single pcpu. 
Guests under high CPU load showed the problem much more quickly than an idle 
guest.

Removing the if condition in cpu_callback forces the CPUs to go through the
expected online/offline states, so that they are properly scheduled after S3.

This also includes a necessary partial change proposed earlier by Tomasz 
Wroblewski here:
http://lists.xen.org/archives/html/xen-devel/2013-01/msg02206.html

It should also resolve the issues discussed in this thread:
http://lists.xen.org/archives/html/xen-devel/2012-11/msg01801.html

Signed-off-by: Ben Guthro <benjamin.guthro@xxxxxxxxxx>
---
  xen/common/cpu.c     |    3 +++
  xen/common/cpupool.c |    5 -----
  2 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/xen/common/cpu.c b/xen/common/cpu.c
index 630881e..e20868c 100644
--- a/xen/common/cpu.c
+++ b/xen/common/cpu.c
@@ -5,6 +5,7 @@
  #include <xen/init.h>
  #include <xen/sched.h>
  #include <xen/stop_machine.h>
+#include <xen/sched-if.h>

  unsigned int __read_mostly nr_cpu_ids = NR_CPUS;
  #ifndef nr_cpumask_bits
@@ -212,6 +213,8 @@ void enable_nonboot_cpus(void)
              BUG_ON(error == -EBUSY);
              printk("Error taking CPU%d up: %d\n", cpu, error);
          }
+        if (system_state == SYS_STATE_resume)
+            cpumask_set_cpu(cpu, cpupool0->cpu_valid);

This might solve YOUR problem, but it reintroduces the problem that the
original change was made to fix: ALL cpus will be in cpupool0 after resume!

So: NAK


Juergen

-- 
Juergen Gross                 Principal Developer Operating Systems
PBG PDG ES&S SWE OS6                   Telephone: +49 (0) 89 3222 2967
Fujitsu Technology Solutions              e-mail: juergen.gross@xxxxxxxxxxxxxx
Domagkstr. 28                           Internet: ts.fujitsu.com
D-80807 Muenchen                 Company details: ts.fujitsu.com/imprint.html

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
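
To make the objection in the reply concrete, here is a minimal toy model in C. It is not Xen code: every name in it (toy_pool, toy_suspend, toy_resume_force_pool0, toy_resume_restore) is hypothetical, and the real cpupool and scheduler state is considerably more involved. It only assumes that each pool tracks its CPUs in a bitmask, and contrasts two resume policies: adding every non-boot CPU to pool 0, as the quoted cpu.c hunk appears to do, versus returning each CPU to the pool it belonged to before suspend.

```c
#include <assert.h>
#include <stdint.h>

#define NR_TOY_CPUS  4
#define NR_TOY_POOLS 2

/* Toy stand-in for a cpupool: one bit per CPU the pool may schedule on. */
struct toy_pool {
    uint32_t cpu_valid;
};

/* Return the index of the pool that owns cpu, or -1 if no pool does. */
static int toy_pool_of(const struct toy_pool *pools, unsigned int cpu)
{
    for (int i = 0; i < NR_TOY_POOLS; i++)
        if (pools[i].cpu_valid & (UINT32_C(1) << cpu))
            return i;
    return -1;
}

/*
 * Suspend: remember each non-boot CPU's pool in saved[], then remove the
 * CPU from every pool. CPU 0 (the boot CPU) is left untouched.
 */
static void toy_suspend(struct toy_pool *pools, int *saved)
{
    for (unsigned int cpu = 1; cpu < NR_TOY_CPUS; cpu++) {
        saved[cpu] = toy_pool_of(pools, cpu);
        for (int i = 0; i < NR_TOY_POOLS; i++)
            pools[i].cpu_valid &= ~(UINT32_C(1) << cpu);
    }
}

/* Resume policy A: every CPU that comes back up is added to pool 0. */
static void toy_resume_force_pool0(struct toy_pool *pools)
{
    for (unsigned int cpu = 1; cpu < NR_TOY_CPUS; cpu++)
        pools[0].cpu_valid |= UINT32_C(1) << cpu;
}

/* Resume policy B: every CPU returns to the pool recorded at suspend. */
static void toy_resume_restore(struct toy_pool *pools, const int *saved)
{
    for (unsigned int cpu = 1; cpu < NR_TOY_CPUS; cpu++)
        if (saved[cpu] >= 0)
            pools[saved[cpu]].cpu_valid |= UINT32_C(1) << cpu;
}
```

Starting from pool 0 = {CPU0, CPU1} and pool 1 = {CPU2, CPU3}, policy A leaves all four CPUs in pool 0 and pool 1 empty after a suspend/resume cycle, which is the regression the NAK describes. Policy B restores the original split.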
