[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 11/16] xen/arm: Implement PSCI system suspend


  • To: Mykola Kvach <xakep.amatop@xxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • From: Grygorii Strashko <grygorii_strashko@xxxxxxxx>
  • Date: Fri, 21 Mar 2025 16:46:50 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=epam.com; dmarc=pass action=none header.from=epam.com; dkim=pass header.d=epam.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=Ao5jplHr/16EZLRN/tTjgLLC/6oDzQFGFPiXju5oWo4=; b=IvUb18iIxCJ4hT0VblZngeBsv7UDaFqiHI3R+5+xq9zgUiv1LgpVKfNWlzmGVYtQ+a8kaCrtdfA+OJM4EkOM2rEMNTzp3+HIgNJhYNcTiVXWeLWQIOjD+1O5jjqYEJYtSTw6prsgP10OmxiGrYNaL1FpN0OAhZEcCZXcVv7OhkCfXl1emRU8OzxeNGLQi12JOldD21LsdGdPFXdWigO8jd9wkGCHQuz092IH3PXvAFSyOXk29MpsqrMmKISmYXE69vuJXnGUqilPbMnsrC08tL7WZgXWhX/1OYcYTrtO0u+XiMDcekRH8q+yElcT4bkDfmzc/jtjbTeMgVIUivTJUQ==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=bEdCoXU1X4EPYIJPmVY75P4G1xzY9woOmxRVb/FgVWVaq5WK77068b7BEA2X5ft+cRpph9kUWHEYyJ9YVwS5/iWYeUsVzwOka1eh1et9Pt0prOwsy45+db54qMdB3Jhr2YRjKRP9JFqEppbvjcSnGwXhj5KtxUGOrp9++qAC9KkNrIm0P1IDt2gvEf+MgnKfqGyZVK1stEhoCY6FJsoa4Jx7+4A2+wRg9fMRsvIHB6qfJew6hqAOif9x1l+cf9+HPC6BMYBu66t4e6TW+UsyIaP4bJVjpg4TjX07EGrcNliyN0remZU8fSwM95dLOqKtt+BjALiPLWVpIPq3Py5hWA==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=epam.com;
  • Cc: Stefano Stabellini <sstabellini@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Bertrand Marquis <bertrand.marquis@xxxxxxx>, Michal Orzel <michal.orzel@xxxxxxx>, Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>, Saeed Nowshadi <saeed.nowshadi@xxxxxxxxxx>, Mykyta Poturai <mykyta_poturai@xxxxxxxx>, Mykola Kvach <mykola_kvach@xxxxxxxx>
  • Delivery-date: Fri, 21 Mar 2025 14:47:13 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Hi Mykola,

On 05.03.25 11:11, Mykola Kvach wrote:
From: Mirela Simonovic <mirela.simonovic@xxxxxxxxxx>

The implementation consists of:
-Adding PSCI system suspend call as new PSCI function
-Trapping PSCI system_suspend HVC
-Implementing PSCI system suspend call (virtual interface that allows
  guests to suspend themselves), but currently it is only partially
  implemented, so suspend/resume will correctly work only for dom0

The PSCI system suspend should be called by a guest from its boot
VCPU. Non-boot VCPUs of the guest should be hot-unplugged using PSCI
CPU_OFF call prior to issuing PSCI system suspend. Interrupts that
are left enabled by the guest are assumed to be its wake-up interrupts.
Therefore, a wake-up interrupt triggers the resume of the guest.

Guest
should resume regardless of the state of Xen (suspended or not).

This is strange statement - nothing can be resumed unles Xen itself is
resumed.


When a guest calls PSCI system suspend the respective domain will be
suspended if the following conditions are met:
1) Given resume entry point is not invalid

I think, you meant here - "is valid"

2) Other (if any) VCPUs of the calling guest are hot-unplugged


If the conditions above are met the calling domain is labeled as
suspended and the calling VCPU is blocked. If nothing else wouldn't
be done the suspended domain would resume from the place where it
called PSCI system suspend. This is expected if processing of the PSCI
system suspend call fails.

Could you clarify my understanding here pls about implementation?
Note - below is related only to Linux Arm64 mostly

1) this patch alone, actually enables Suspend2ram of the quest domains, but not
a Xen System suspend2ram.
2) with this patch domain can actually enter suspend, but
   - only from it's own console by issuing "echo mem > /sys/power/state"
   -- guest end up in Xen by issuing vPSCI PSCI_1_0_FN64_SYSTEM_SUSPEND HVC
      (psci is considered standard suspend mechanism for arm64)
   -- Xen blocks last guest vcpu and suspend considered done
3) at this point guest is suspended, but no way to resume it.
4) Xen remote/external "control" interface is not available, so neither
   suspend/neither resume can't be triggered from control domain using
   "xl suspend/resume".

5) Xen System suspend can happens only from HWDOM when HWDOM is iteslf 
suspended.
   The Xen system suspend process end up issuing PSCI SYSTEM_SUSPEND to EL3 FW.

   There is a prerequisite requirement that all guests, except HWDOM, have been 
suspended already.

6) No wakeup-source abstraction is not defined for guests, so they can be 
resumed only manually

...

However, in the case of success the calling
guest should resume (continue execution after the wake-up) from the entry
point which is given as the first argument of the PSCI system suspend
call.
In addition to the entry point, the guest expects to start within
the environment whose state matches the state after reset. This means
that the guest should find reset register values, MMU disabled, etc.
Thereby, the context of VCPU should be 'reset' (as if the system is
comming out of reset), the program counter should contain entry point,
which is 1st argument, and r0/x0 should contain context ID which is 2nd
argument of PSCI system suspend call.

I'm not sure that above is really needed in case of Xen guest domains, because:
- RAM state is retained
- Xen quest is running on virt CPU, interrupt-controller and timer, neither of 
them
  will lose context comparing to the real HW.

As result, just ublocking vCPU(boot) will cause proper resume of the quest and 
it will
continue execution from the point of issuing  vPSCI SYSTEM_SUSPEND HVC.


Actually I've tried:
- applied only patches 6 and 11
- applied test diff below
- triggered suspend in quest and then resume it by sending Xen "q" cmd

Guest can wake up.
(and no manipulations with vCPU state)

====
--- a/xen/common/keyhandler.c
+++ b/xen/common/keyhandler.c
@@ -361,6 +361,7 @@ static void cf_check dump_domains(unsigned char key)
             printk("Notifying guest %d:%d (virq %d, port %d)\n",
                    d->domain_id, v->vcpu_id,
                    VIRQ_DEBUG, v->virq_to_evtchn[VIRQ_DEBUG]);
+            vcpu_unblock(v);
             send_guest_vcpu_virq(v, VIRQ_DEBUG);
         }
     }

Above should work nicely for Xen anaware guest, but Xen aware guest, 
specifically Linux,
need to be updated as drivers/xen/manage.c code doesn't support standard 
Suspend-2-ram
sequence properly - it's tied to hibernation.

It seems "xl suspend/resume" can't be used as is for ARM64, but many parts can 
be reused, probably.
I could be mistaken here - still studying interaction between control domain, 
remote domain and Xen.

One thing, I worry in case of Linux, is that changing System PM state
triggered by writing into some xenstore property and this happens inside Kernel,
while Linux, by design, expect changing System PM state only from User space,
at least it's true for suspend2ram which can be triggered only by:
- writing to /sys/power/state
- by auto-suspend + wakelocks.

The context of VCPU is set
accordingly when the PSCI system suspend is processed, so that nothing
needs to be done on resume/wake-up path.

Signed-off-by: Mirela Simonovic <mirela.simonovic@xxxxxxxxxx>
Signed-off-by: Saeed Nowshadi <saeed.nowshadi@xxxxxxxxxx>
Signed-off-by: Mykyta Poturai <mykyta_poturai@xxxxxxxx>
Signed-off-by: Mykola Kvach <mykola_kvach@xxxxxxxx>
---
Changes in V3:
Dropped all domain flags and related code (which touched common functions like
vcpu_unblock), keeping only the necessary changes for Xen suspend/resume, i.e.
suspend/resume is now fully supported only for the hardware domain.
Proper support for domU suspend/resume will be added in a future patch.
This patch does not yet include VCPU context reset or domain context
restoration in VCPU.
---
  xen/arch/arm/Makefile                 |  1 +
  xen/arch/arm/include/asm/domain.h     |  3 ++
  xen/arch/arm/include/asm/perfc_defn.h |  1 +
  xen/arch/arm/include/asm/psci.h       |  2 +
  xen/arch/arm/include/asm/suspend.h    | 18 +++++++
  xen/arch/arm/suspend.c                | 67 +++++++++++++++++++++++++++
  xen/arch/arm/vpsci.c                  | 32 +++++++++++++
  7 files changed, 124 insertions(+)
  create mode 100644 xen/arch/arm/include/asm/suspend.h
  create mode 100644 xen/arch/arm/suspend.c

diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index 43ab5e8f25..70d4b5daf8 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -53,6 +53,7 @@ obj-y += smpboot.o
  obj-$(CONFIG_STATIC_EVTCHN) += static-evtchn.init.o
  obj-$(CONFIG_STATIC_MEMORY) += static-memory.init.o
  obj-$(CONFIG_STATIC_SHM) += static-shmem.init.o
+obj-$(CONFIG_SYSTEM_SUSPEND) += suspend.o
  obj-y += sysctl.o
  obj-y += time.o
  obj-y += traps.o
diff --git a/xen/arch/arm/include/asm/domain.h 
b/xen/arch/arm/include/asm/domain.h
index 50b6a4b009..8b1bdf3d74 100644
--- a/xen/arch/arm/include/asm/domain.h
+++ b/xen/arch/arm/include/asm/domain.h
@@ -233,6 +233,9 @@ struct arch_vcpu
      struct vtimer virt_timer;
      bool   vtimer_initialized;
+ register_t suspend_ep;
+    register_t suspend_cid;
+
      /*
       * The full P2M may require some cleaning (e.g when emulation
       * set/way). As the action can take a long time, it requires
diff --git a/xen/arch/arm/include/asm/perfc_defn.h 
b/xen/arch/arm/include/asm/perfc_defn.h
index 3ab0391175..5049563718 100644
--- a/xen/arch/arm/include/asm/perfc_defn.h
+++ b/xen/arch/arm/include/asm/perfc_defn.h
@@ -33,6 +33,7 @@ PERFCOUNTER(vpsci_system_reset,        "vpsci: system_reset")
  PERFCOUNTER(vpsci_cpu_suspend,         "vpsci: cpu_suspend")
  PERFCOUNTER(vpsci_cpu_affinity_info,   "vpsci: cpu_affinity_info")
  PERFCOUNTER(vpsci_features,            "vpsci: features")
+PERFCOUNTER(vpsci_system_suspend,      "vpsci: system_suspend")
PERFCOUNTER(vcpu_kick, "vcpu: notify other vcpu") diff --git a/xen/arch/arm/include/asm/psci.h b/xen/arch/arm/include/asm/psci.h
index 4780972621..48a93e6b79 100644
--- a/xen/arch/arm/include/asm/psci.h
+++ b/xen/arch/arm/include/asm/psci.h
@@ -47,10 +47,12 @@ void call_psci_system_reset(void);
  #define PSCI_0_2_FN32_SYSTEM_OFF          PSCI_0_2_FN32(8)
  #define PSCI_0_2_FN32_SYSTEM_RESET        PSCI_0_2_FN32(9)
  #define PSCI_1_0_FN32_PSCI_FEATURES       PSCI_0_2_FN32(10)
+#define PSCI_1_0_FN32_SYSTEM_SUSPEND      PSCI_0_2_FN32(14)
#define PSCI_0_2_FN64_CPU_SUSPEND PSCI_0_2_FN64(1)
  #define PSCI_0_2_FN64_CPU_ON              PSCI_0_2_FN64(3)
  #define PSCI_0_2_FN64_AFFINITY_INFO       PSCI_0_2_FN64(4)
+#define PSCI_1_0_FN64_SYSTEM_SUSPEND      PSCI_0_2_FN64(14)
/* PSCI v0.2 affinity level state returned by AFFINITY_INFO */
  #define PSCI_0_2_AFFINITY_LEVEL_ON      0
diff --git a/xen/arch/arm/include/asm/suspend.h 
b/xen/arch/arm/include/asm/suspend.h
new file mode 100644
index 0000000000..745377dbcf
--- /dev/null
+++ b/xen/arch/arm/include/asm/suspend.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#ifndef __ASM_ARM_SUSPEND_H__
+#define __ASM_ARM_SUSPEND_H__
+
+int32_t domain_suspend(register_t epoint, register_t cid);
+
+#endif
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/arm/suspend.c b/xen/arch/arm/suspend.c
new file mode 100644
index 0000000000..27fab8c999
--- /dev/null
+++ b/xen/arch/arm/suspend.c
@@ -0,0 +1,67 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#include <xen/sched.h>
+#include <asm/cpufeature.h>
+#include <asm/event.h>
+#include <asm/psci.h>
+#include <asm/suspend.h>
+#include <asm/platform.h>
+#include <public/sched.h>
+
+static void vcpu_suspend_prepare(register_t epoint, register_t cid)
+{
+    struct vcpu *v = current;
+
+    v->arch.suspend_ep = epoint;
+    v->arch.suspend_cid = cid;
+}
+
+int32_t domain_suspend(register_t epoint, register_t cid)
+{
+    struct vcpu *v;
+    struct domain *d = current->domain;
+    bool is_thumb = epoint & 1;
+
+    dprintk(XENLOG_DEBUG,
+            "Dom%d suspend: epoint=0x%"PRIregister", cid=0x%"PRIregister"\n",
+            d->domain_id, epoint, cid);
+
+    /* THUMB set is not allowed with 64-bit domain */
+    if ( is_64bit_domain(d) && is_thumb )
+        return PSCI_INVALID_ADDRESS;
+
+    /* TODO: care about locking here */
+    /* Ensure that all CPUs other than the calling one are offline */
+    for_each_vcpu ( d, v )
+    {
+        if ( v != current && is_vcpu_online(v) )
+            return PSCI_DENIED;
+    }
+
+    /*
+     * Prepare the calling VCPU for suspend (save entry point into pc and
+     * context ID into r0/x0 as specified by PSCI SYSTEM_SUSPEND)
+     */
+    vcpu_suspend_prepare(epoint, cid);
+
+    /* Disable watchdogs of this domain */
+    watchdog_domain_suspend(d);
+
+    /*
+     * The calling domain is suspended by blocking its last running VCPU. If an
+     * event is pending the domain will resume right away (VCPU will not block,
+     * but when scheduled in it will resume from the given entry point).
+     */
+    vcpu_block_unless_event_pending(current);
+
+    return PSCI_SUCCESS;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/arm/vpsci.c b/xen/arch/arm/vpsci.c
index d1615be8a6..96eef06c18 100644
--- a/xen/arch/arm/vpsci.c
+++ b/xen/arch/arm/vpsci.c
@@ -7,6 +7,7 @@
  #include <asm/vgic.h>
  #include <asm/vpsci.h>
  #include <asm/event.h>
+#include <asm/suspend.h>
#include <public/sched.h> @@ -197,6 +198,15 @@ static void do_psci_0_2_system_reset(void)
      domain_shutdown(d,SHUTDOWN_reboot);
  }
+static int32_t do_psci_1_0_system_suspend(register_t epoint, register_t cid)
+{
+#ifdef CONFIG_SYSTEM_SUSPEND
+    return domain_suspend(epoint, cid);
+#else
+    return PSCI_NOT_SUPPORTED;
+#endif
+}
+
  static int32_t do_psci_1_0_features(uint32_t psci_func_id)
  {
      /* /!\ Ordered by function ID and not name */
@@ -214,6 +224,8 @@ static int32_t do_psci_1_0_features(uint32_t psci_func_id)
      case PSCI_0_2_FN32_SYSTEM_OFF:
      case PSCI_0_2_FN32_SYSTEM_RESET:
      case PSCI_1_0_FN32_PSCI_FEATURES:
+    case PSCI_1_0_FN32_SYSTEM_SUSPEND:
+    case PSCI_1_0_FN64_SYSTEM_SUSPEND:
      case ARM_SMCCC_VERSION_FID:
          return 0;
      default:
@@ -344,6 +356,26 @@ bool do_vpsci_0_2_call(struct cpu_user_regs *regs, 
uint32_t fid)
          return true;
      }
+ case PSCI_1_0_FN32_SYSTEM_SUSPEND:
+    case PSCI_1_0_FN64_SYSTEM_SUSPEND:
+    {
+        register_t epoint = PSCI_ARG(regs,1);
+        register_t cid = PSCI_ARG(regs,2);
+        register_t ret;
+
+        perfc_incr(vpsci_system_suspend);
+        /* Set the result to PSCI_SUCCESS if the call fails.
+         * Otherwise preserve the context_id in x0. For now
+         * we don't support the case where the system is suspended
+         * to a shallower level and PSCI_SUCCESS is returned to the
+         * caller.
+         */
+        ret = do_psci_1_0_system_suspend(epoint, cid);
+        if ( ret != PSCI_SUCCESS )
+            PSCI_SET_RESULT(regs, ret);
+        return true;
+    }
+
      default:
          return false;
      }

--
Best regards,
-grygorii



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.