[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v10 12/13] xen/arm: Add vPSCI SYSTEM_SUSPEND policy





On 5/21/26 20:45, Mykola Kvach wrote:

Hello Mykola

From: Mykola Kvach <mykola_kvach@xxxxxxxx>

Introduce CONFIG_HAS_HWDOM_SYSTEM_SUSPEND as an architecture-selected
capability for platforms where the hardware domain can be parked with
SHUTDOWN_suspend without calling hwdom_shutdown().

Expose PSCI SYSTEM_SUSPEND as a vPSCI operation for all domains. For
non-control domains, including the hardware domain when it is not acting
as a control domain, the call is handled as a guest/domain suspend request
and parks the domain in SHUTDOWN_suspend.

Control domains need additional sequencing because their SYSTEM_SUSPEND
request is used to coordinate host-wide suspend. A non-last awake control
domain may be parked in SHUTDOWN_suspend without requiring the host
suspend path to be available. The last awake control domain is treated as
the point where the request becomes a host-suspend request, and it may
only proceed when all non-control domains are already in SHUTDOWN_suspend
and the host suspend path is available.

Keep the control-domain sequencing and domain-readiness checks out of
PSCI_FEATURES. They are per-attempt runtime conditions rather than stable
PSCI function availability. Advertise SYSTEM_SUSPEND as implemented by
vPSCI and report attempt-time policy failures as PSCI_DENIED.

Select HAS_HWDOM_SYSTEM_SUSPEND independently from CONFIG_SYSTEM_SUSPEND
so that SHUTDOWN_suspend from the hardware domain can be treated as a
domain suspend state rather than as a hardware-domain initiated host
shutdown. This does not by itself imply that host-wide suspend is
available.

Add host_system_suspend_allowed() to combine the host PSCI SYSTEM_SUSPEND
capability with runtime blockers reported by Xen-owned subsystems. Add
runtime blockers for registered serial, IOMMU, GIC and SMMUv3 MSI IRQ
paths lacking suspend/resume support. These blockers are runtime based,
so they only apply to drivers or paths that Xen actually uses on the
platform. For SMMUv3, the blocker applies only when Xen actually uses the
MSI IRQ path, since resume does not restore the SMMU *_IRQ_CFGn MSI
registers yet.

Add a struct domain forward declaration to xen/suspend.h so the generic
header can expose arch_domain_resume() without requiring a full domain.h
include.

Signed-off-by: Mykola Kvach <mykola_kvach@xxxxxxxx>
---
Changes in V10:
- Return PSCI_DENIED rather than PSCI_NOT_SUPPORTED when the last awake
   control domain cannot proceed to host suspend, keeping PSCI_FEATURES
   stable once SYSTEM_SUSPEND is advertised.
- Shorten SYSTEM_SUSPEND blocker messages and use %pd when logging the
   control domain.
- Mark serial_suspend_available as __ro_after_init.
- Mention the struct domain forward declaration added to xen/suspend.h.

I did not spot obvious issues while reviewing this patch, so you can add:

Reviewed-by: Oleksandr Tyshchenko <Oleksandr_Tyshchenko@xxxxxxxx>

Several nits/remarks below, feel free to skip them if you find them irrelevant or not useful.



Changes in V9:
- Select HAS_HWDOM_SYSTEM_SUSPEND independently from CONFIG_SYSTEM_SUSPEND
   so that hardware-domain SHUTDOWN_suspend support is not tied to
   host-wide system suspend availability.
- Add runtime host suspend blockers for Xen-owned subsystems lacking
   suspend/resume support.
- Keep vPSCI SYSTEM_SUSPEND advertised through PSCI_FEATURES and enforce
   control-domain sequencing in the call handler.
---
  xen/arch/arm/Kconfig                  |   1 +
  xen/arch/arm/gic.c                    |   6 ++
  xen/arch/arm/include/asm/psci.h       |   3 +
  xen/arch/arm/include/asm/suspend.h    |  10 ++-
  xen/arch/arm/psci.c                   |   7 ++
  xen/arch/arm/suspend.c                |  40 +++++++++
  xen/arch/arm/vpsci.c                  | 114 +++++++++++++++++++++++---
  xen/common/Kconfig                    |   3 +
  xen/common/domain.c                   |   7 +-
  xen/drivers/char/serial.c             |  12 +++
  xen/drivers/passthrough/arm/iommu.c   |   4 +
  xen/drivers/passthrough/arm/smmu-v3.c |   4 +
  xen/include/xen/serial.h              |   1 +
  xen/include/xen/suspend.h             |   2 +
  14 files changed, 201 insertions(+), 13 deletions(-)

diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index 79622b46a1..54a5bfb9ae 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -19,6 +19,7 @@ config ARM
        select HAS_ALTERNATIVE if HAS_VMAP
        select HAS_DEVICE_TREE_DISCOVERY
        select HAS_DOM0LESS
+       select HAS_HWDOM_SYSTEM_SUSPEND if !MPU
        select HAS_GRANT_CACHE_FLUSH if GRANT_TABLE
        select HAS_STACK_PROTECTOR
        select HAS_UBSAN
diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index 7727ffed5a..60488c95b4 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -26,6 +26,7 @@
  #include <asm/device.h>
  #include <asm/io.h>
  #include <asm/gic.h>
+#include <asm/suspend.h>
  #include <asm/vgic.h>
  #include <asm/acpi.h>
@@ -44,6 +45,11 @@ static void __init __maybe_unused build_assertions(void)
  void register_gic_ops(const struct gic_hw_operations *ops)
  {
      gic_hw_ops = ops;
+
+#ifdef CONFIG_SYSTEM_SUSPEND
+    if ( !ops->suspend || !ops->resume )
+        host_system_suspend_disable("GIC driver lacks suspend support");
+#endif
  }
static void clear_cpu_lr_mask(void)
diff --git a/xen/arch/arm/include/asm/psci.h b/xen/arch/arm/include/asm/psci.h
index bb3c73496e..142fa1bfe5 100644
--- a/xen/arch/arm/include/asm/psci.h
+++ b/xen/arch/arm/include/asm/psci.h
@@ -24,6 +24,9 @@ void call_psci_cpu_off(void);
  void call_psci_system_off(void);
  void call_psci_system_reset(void);
  int call_psci_system_suspend(void);
+#ifdef CONFIG_SYSTEM_SUSPEND
+bool psci_system_suspend_allowed(void);
+#endif
/* Range of allocated PSCI function numbers */
  #define       PSCI_FNUM_MIN_VALUE                 _AC(0,U)
diff --git a/xen/arch/arm/include/asm/suspend.h 
b/xen/arch/arm/include/asm/suspend.h
index c848fc6340..50dc6e9fdf 100644
--- a/xen/arch/arm/include/asm/suspend.h
+++ b/xen/arch/arm/include/asm/suspend.h
@@ -39,7 +39,15 @@ extern struct resume_cpu_context resume_cpu_context;
int prepare_resume_ctx(void);
  void hyp_resume(void);
-#endif /* CONFIG_SYSTEM_SUSPEND */
+bool host_system_suspend_allowed(void);
+void host_system_suspend_disable(const char *reason);
+
+#else /* !CONFIG_SYSTEM_SUSPEND */
+
+static inline bool host_system_suspend_allowed(void) { return false; }
+static inline void host_system_suspend_disable(const char *reason) {}
+
+#endif
#endif /* ARM_SUSPEND_H */ diff --git a/xen/arch/arm/psci.c b/xen/arch/arm/psci.c
index e05dae1133..e9d78668fd 100644
--- a/xen/arch/arm/psci.c
+++ b/xen/arch/arm/psci.c
@@ -41,6 +41,13 @@ static bool __ro_after_init has_psci_system_suspend;
#define PSCI_RET(res) ((int32_t)(res).a0) +#ifdef CONFIG_SYSTEM_SUSPEND
+bool psci_system_suspend_allowed(void)
+{
+    return has_psci_system_suspend;
+}
+#endif
+
  int call_psci_cpu_on(int cpu)
  {
      struct arm_smccc_res res;
diff --git a/xen/arch/arm/suspend.c b/xen/arch/arm/suspend.c
index 6ea4a0f9cc..98ddd46a47 100644
--- a/xen/arch/arm/suspend.c
+++ b/xen/arch/arm/suspend.c
@@ -1,9 +1,49 @@
  /* SPDX-License-Identifier: GPL-2.0-only */
+#include <asm/psci.h>
  #include <asm/suspend.h>
+#include <xen/lib.h>
+#include <xen/serial.h>
+
  struct resume_cpu_context resume_cpu_context;
+/*
+ * Non-PSCI infrastructure can make host suspend impossible even when the PSCI
+ * SYSTEM_SUSPEND conduit is present, e.g. when a Xen-owned driver has no valid
+ * suspend/resume path.
+ *
+ * This gate is checked only when the last awake control domain attempts to
+ * turn a guest SYSTEM_SUSPEND request into a host-suspend request.
+ */
+static bool host_system_suspend_runtime_allowed = true;

All callers of host_system_suspend_disable() appear to execute during boot. I noticed you added __ro_after_init to serial_suspend_available (same lifecycle), making the omission here a bit inconsistent. Should host_system_suspend_runtime_allowed be marked as __ro_after_init as well? Or I missed something?

+
+static bool host_serial_suspend_allowed(void)
+{
+    if ( serial_suspend_supported() )
+        return true;
+
+    printk_once(XENLOG_INFO
+                "Host SYSTEM_SUSPEND blocked: serial unsupported\n");
+
+    return false;
+}
+
+bool host_system_suspend_allowed(void)
+{
+    return psci_system_suspend_allowed() &&
+           host_serial_suspend_allowed() &&
+           host_system_suspend_runtime_allowed;
+}
+
+void host_system_suspend_disable(const char *reason)
+{
+    host_system_suspend_runtime_allowed = false;
+
+    printk(XENLOG_INFO "Host SYSTEM_SUSPEND blocked: %s\n",
+           reason ? reason : "unsupported suspend/resume path");

On a system with N SMMUv3 instances using MSIs (each calling host_system_suspend_disable() from arm_smmu_setup_msis()),
the same message is printed N times...

+}
+
  /*
   * Local variables:
   * mode: C
diff --git a/xen/arch/arm/vpsci.c b/xen/arch/arm/vpsci.c
index ac6af6118f..0bae42c1bd 100644
--- a/xen/arch/arm/vpsci.c
+++ b/xen/arch/arm/vpsci.c
@@ -5,6 +5,7 @@
#include <asm/current.h>
  #include <asm/domain.h>
+#include <asm/suspend.h>
  #include <asm/vgic.h>
  #include <asm/vpsci.h>
  #include <asm/event.h>
@@ -219,6 +220,89 @@ static void do_psci_0_2_system_reset(void)
      domain_shutdown(d,SHUTDOWN_reboot);
  }
+/*
+ * Serialise SYSTEM_SUSPEND policy decisions with the domain suspend 
transition,
+ * so multiple control domains cannot all observe each other as still awake.
+ */
+static DEFINE_SPINLOCK(vpsci_system_suspend_lock);
+
+static bool domain_in_suspend_state(struct domain *d)
+{
+    bool suspended;
+
+    spin_lock(&d->shutdown_lock);
+    suspended = d->is_shut_down && d->shutdown_code == SHUTDOWN_suspend;
+    spin_unlock(&d->shutdown_lock);
+
+    return suspended;
+}
+
+static int32_t domain_psci_system_suspend_policy(struct domain *d)
+{
+    struct domain *other;
+    bool last_awake_control_domain = true;
+    bool awake_non_control_domain = false;
+
+    /* Only control domains participate in sequencing policy. */
+    if ( !is_control_domain(d) )
+        return 0;
+
+    rcu_read_lock(&domlist_read_lock);
+
+    for_each_domain ( other )
+    {
+        bool suspended;
+
+        if ( other == d )
+            continue;
+
+        suspended = domain_in_suspend_state(other);
+        if ( suspended )
+            continue;
+
+        if ( is_control_domain(other) )
+        {
+            last_awake_control_domain = false;
+            break;
+        }
+
+        awake_non_control_domain = true;
+    }
+
+    rcu_read_unlock(&domlist_read_lock);
+
+    /*
+     * Another control domain is still awake. This request is only the first
+     * phase of the sequencing: park this control domain and leave the host
+     * running. Host-wide suspend gates must not block this intermediate state.
+     */
+    if ( !last_awake_control_domain )
+        return 0;
+
+    /*
+     * This is the last awake control domain. It must not be parked unless the
+     * request can proceed as a host-suspend request; otherwise Xen would lose
+     * the last domain that can coordinate the system suspend.
+     */
+    if ( awake_non_control_domain )
+    {
+        printk(XENLOG_DEBUG
+               "SYSTEM_SUSPEND denied for %pd: non-control domains awake\n",
+               d);
+        return PSCI_DENIED;
+    }
+
+    /*
+     * Host-wide gates are relevant only for the last-control-domain case. They
+     * must not block parking of a non-last control domain, but they must deny
+     * the last control domain when host suspend is not currently available.
+     */
+    if ( !host_system_suspend_allowed() )
+        return PSCI_DENIED;
+
+    return 0;
+}
+
  static int32_t do_psci_1_0_system_suspend(register_t epoint, register_t cid)
  {
      int32_t rc;
@@ -232,10 +316,6 @@ static int32_t do_psci_1_0_system_suspend(register_t 
epoint, register_t cid)
      if ( is_64bit_domain(d) && is_thumb )
          return PSCI_INVALID_ADDRESS;
- /* SYSTEM_SUSPEND is not supported for the hardware domain yet */
-    if ( is_hardware_domain(d) )
-        return PSCI_NOT_SUPPORTED;
-
      /* Ensure that all CPUs other than the calling one are offline */
      domain_lock(d);
      for_each_vcpu ( d, v )
@@ -252,16 +332,29 @@ static int32_t do_psci_1_0_system_suspend(register_t 
epoint, register_t cid)
      if ( rc )
          return PSCI_DENIED;
- rc = domain_shutdown(d, SHUTDOWN_suspend);
+    spin_lock(&vpsci_system_suspend_lock);
+
+    rc = domain_psci_system_suspend_policy(d);
+    if ( !rc )
+    {
+        rc = domain_shutdown(d, SHUTDOWN_suspend);
+        if ( rc )
+            rc = PSCI_DENIED;
+        else
+        {
+            rctx->ctxt = ctxt;
+            rctx->wake_cpu = current;
+        }
+    }
+
+    spin_unlock(&vpsci_system_suspend_lock);
+
      if ( rc )
      {
          free_vcpu_guest_context(ctxt);
-        return PSCI_DENIED;
+        return rc;
      }
- rctx->ctxt = ctxt;
-    rctx->wake_cpu = current;
-
      gprintk(XENLOG_DEBUG,
              "SYSTEM_SUSPEND requested, epoint=%#"PRIregister", 
cid=%#"PRIregister"\n",
              epoint, cid);
@@ -287,10 +380,9 @@ static int32_t do_psci_1_0_features(uint32_t psci_func_id)
      case PSCI_0_2_FN32_SYSTEM_RESET:
      case PSCI_1_0_FN32_PSCI_FEATURES:
      case ARM_SMCCC_VERSION_FID:
-        return 0;
      case PSCI_1_0_FN32_SYSTEM_SUSPEND:
      case PSCI_1_0_FN64_SYSTEM_SUSPEND:
-        return is_hardware_domain(current->domain) ? PSCI_NOT_SUPPORTED : 0;
+        return 0;
      default:
          return PSCI_NOT_SUPPORTED;
      }
diff --git a/xen/common/Kconfig b/xen/common/Kconfig
index 5ff71480ee..816a1a4ecb 100644
--- a/xen/common/Kconfig
+++ b/xen/common/Kconfig
@@ -140,6 +140,9 @@ config HAS_EX_TABLE
  config HAS_FAST_MULTIPLY
        bool
+config HAS_HWDOM_SYSTEM_SUSPEND
+       bool
+
  config HAS_IOPORTS
        bool
diff --git a/xen/common/domain.c b/xen/common/domain.c
index bb9e210c28..d3edfb2a13 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -1375,6 +1375,11 @@ void __domain_crash(struct domain *d)
      domain_shutdown(d, SHUTDOWN_crash);
  }
+static inline bool want_hwdom_shutdown(uint8_t reason)
+{
+    return !IS_ENABLED(CONFIG_HAS_HWDOM_SYSTEM_SUSPEND) ||
+           reason != SHUTDOWN_suspend;
+}
int domain_shutdown(struct domain *d, u8 reason)
  {
@@ -1391,7 +1396,7 @@ int domain_shutdown(struct domain *d, u8 reason)
          d->shutdown_code = reason;
      reason = d->shutdown_code;
- if ( is_hardware_domain(d) )
+    if ( is_hardware_domain(d) && want_hwdom_shutdown(reason) )
          hwdom_shutdown(reason);
if ( d->is_shutting_down )
diff --git a/xen/drivers/char/serial.c b/xen/drivers/char/serial.c
index adb312d796..e5348b5445 100644
--- a/xen/drivers/char/serial.c
+++ b/xen/drivers/char/serial.c
@@ -497,6 +497,8 @@ const struct vuart_info *serial_vuart_info(int idx)
#ifdef CONFIG_SYSTEM_SUSPEND +static bool __ro_after_init serial_suspend_available = true;
+
  void serial_suspend(void)
  {
      int i;
@@ -513,6 +515,11 @@ void serial_resume(void)
              com[i].driver->resume(&com[i]);
  }
+bool serial_suspend_supported(void)
+{
+    return serial_suspend_available;
+}
+
  #endif /* CONFIG_SYSTEM_SUSPEND */
void __init serial_register_uart(int idx, struct uart_driver *driver,
@@ -521,6 +528,11 @@ void __init serial_register_uart(int idx, struct 
uart_driver *driver,
      /* Store UART-specific info. */
      com[idx].driver = driver;
      com[idx].uart   = uart;
+
+#ifdef CONFIG_SYSTEM_SUSPEND
+    if ( !driver->suspend || !driver->resume )
+        serial_suspend_available = false;
+#endif
  }
void __init serial_async_transmit(struct serial_port *port)
diff --git a/xen/drivers/passthrough/arm/iommu.c 
b/xen/drivers/passthrough/arm/iommu.c
index 100545e23f..547048af05 100644
--- a/xen/drivers/passthrough/arm/iommu.c
+++ b/xen/drivers/passthrough/arm/iommu.c
@@ -19,6 +19,7 @@
  #include <xen/device_tree.h>
  #include <xen/iommu.h>
  #include <xen/lib.h>
+#include <xen/suspend.h>
#include <asm/device.h> @@ -46,6 +47,9 @@ void __init iommu_set_ops(const struct iommu_ops *ops)
      }
iommu_ops = ops;
+
+    if ( !ops->suspend || !ops->resume )
+        host_system_suspend_disable("IOMMU driver lacks suspend support");

I was initially wondering why the suspend/resume checks for the GIC and serial are wrapped in #ifdef CONFIG_SYSTEM_SUSPEND, but the IOMMU check is not, despite host_system_suspend_disable() acting as no-op when CONFIG_SYSTEM_SUSPEND=n. Then I realized the divergence here because the suspend/resume members in struct iommu_ops are not gated with #ifdef, unlike those in struct gic_hw_operations and struct uart_driver.


  }
int __init iommu_hardware_setup(void)
diff --git a/xen/drivers/passthrough/arm/smmu-v3.c 
b/xen/drivers/passthrough/arm/smmu-v3.c
index be8028c036..1b6bb1bc5f 100644
--- a/xen/drivers/passthrough/arm/smmu-v3.c
+++ b/xen/drivers/passthrough/arm/smmu-v3.c
@@ -91,6 +91,7 @@
  #include <asm/io.h>
  #include <asm/iommu_fwspec.h>
  #include <asm/platform.h>
+#include <asm/suspend.h>
#include "smmu-v3.h" @@ -1903,6 +1904,9 @@ static void arm_smmu_setup_msis(struct arm_smmu_device *smmu)
                }
        }
+ host_system_suspend_disable(
+               "SMMUv3 MSI IRQ path is unsupported for host suspend");
+
        /* Add callback to free MSIs on teardown */
        devm_add_action(dev, arm_smmu_free_msis, dev);
  }
diff --git a/xen/include/xen/serial.h b/xen/include/xen/serial.h
index 8e18445552..418b00ead0 100644
--- a/xen/include/xen/serial.h
+++ b/xen/include/xen/serial.h
@@ -137,6 +137,7 @@ const struct vuart_info* serial_vuart_info(int idx);
  /* Serial suspend/resume. */
  void serial_suspend(void);
  void serial_resume(void);
+bool serial_suspend_supported(void);
  #endif
/*
diff --git a/xen/include/xen/suspend.h b/xen/include/xen/suspend.h
index 6f94fd53b0..a941331035 100644
--- a/xen/include/xen/suspend.h
+++ b/xen/include/xen/suspend.h
@@ -6,6 +6,8 @@
  #if __has_include(<asm/suspend.h>)
  #include <asm/suspend.h>
  #else
+struct domain;
+
  static inline void arch_domain_resume(struct domain *d) {}
  #endif




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.