|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v11 1/4] xen/arm: Implement PSCI SYSTEM_SUSPEND call for guests
Hi Volodymyr,
On Thu, Aug 28, 2025 at 7:32 PM Volodymyr Babchuk
<Volodymyr_Babchuk@xxxxxxxx> wrote:
>
>
> Hi Mykola,
>
> Mykola Kvach <xakep.amatop@xxxxxxxxx> writes:
>
> > From: Mykola Kvach <mykola_kvach@xxxxxxxx>
> >
> > Add support for the PSCI SYSTEM_SUSPEND function in the vPSCI interface,
> > allowing guests to request suspend via the PSCI v1.0 SYSTEM_SUSPEND call
> > (both 32-bit and 64-bit variants).
> >
> > Implementation details:
> > - Add SYSTEM_SUSPEND function IDs to PSCI definitions
> > - Trap and handle SYSTEM_SUSPEND in vPSCI
> > - Allow only non-hardware domains to invoke SYSTEM_SUSPEND; return
> > PSCI_NOT_SUPPORTED for the hardware domain to avoid halting the system
> > in hwdom_shutdown() via domain_shutdown
> > - Require all secondary VCPUs of the calling domain to be offline before
> > suspend, as mandated by the PSCI specification
> >
> > The arch_domain_resume() function is an architecture-specific hook that is
> > invoked during domain resume to perform any necessary setup or restoration
> > steps required by the platform.
> >
> > The new vpsci_vcpu_up_prepare() helper is called on the resume path to set
> > up
> > the vCPU context (such as entry point, some system regs and context ID)
> > before
> > resuming a suspended guest. This keeps ARM/vPSCI-specific logic out of
> > common
> > code and avoids intrusive changes to the generic resume flow.
> >
> > Usage:
> >
> > For Linux-based guests, suspend can be initiated with:
> > echo mem > /sys/power/state
> > or via:
> > systemctl suspend
> >
> > Resuming the guest is performed from control domain using:
> > xl resume <domain>
> >
> > Signed-off-by: Mykola Kvach <mykola_kvach@xxxxxxxx>
>
> Reviewed-by: Volodymyr Babchuk <volodymyr_babchuk@xxxxxxxx>
>
> But check a small comment below
>
> > ---
> > Changes in V11:
> > - introduce arch_domain_resume() and vpsci_vcpu_up_prepare(), which are now
> > called on the resume path to avoid extra modifications to common code.
> > The vCPU context is now updated during domain resume.
> >
> > Changes in V10:
> > - small changes to the commit message reflect updates introduced in this
> > version of the patch.
> > - Comments are improved, clarified, and expanded, especially regarding PSCI
> > requirements and context handling.
> > - An ARM-specific helper (domain_resume_nopause_helper)
> > - gprintk() and PRIregister are used for logging in vPSCI code.
> > - An isb() is added before p2m_save_state
> > - The is_64bit_domain check is dropped when masking the upper part of entry
> > point and cid for SMC32 SYSTEM_SUSPEND PSCI calls
> >
> > Changes in V9:
> > - no functional changes
> > - cosmetic chnages after review
> > - enhance commit message and add extra comment to the code after review
> >
> > Changes in V8:
> > - GIC and virtual timer context must be saved when the domain suspends
> > - rework locking
> > - minor changes after code review
> >
> > Changes in V7:
> > - add proper locking
> > - minor changes after code review
> >
> > Changes in V6:
> > - skip execution of ctxt_switch_from for vcpu that is in paused domain
> > - add implementation of domain_resume without domain_pause
> > - add helper function to determine if vcpu is suspended or not
> > - ignore upper 32 bits of argument values when the domain is 64-bit
> > and calls the SMC32 SYSTEM_SUSPEND function
> > - cosmetic changes after review
> >
> > Changes in V5:
> > - don't use standby mode, restore execution in a provided by guest point
> > - move checking that all CPUs, except current one, are offline to after
> > pausing the vCPUs
> > - provide ret status from arch_domain_shutdown and handle it in
> > domain_shutdown
> > - adjust VPSCI_NR_FUNCS to reflect the number of newly added PSCI functions
> >
> > Changes in V4:
> > Dropped all changes related to watchdog, domain is marked as shutting
> > down in domain_shutdown and watchdog timeout handler won't trigger
> > because of it.
> >
> > Previous versions included code to manage Xen watchdog timers during
> > suspend,
> > but this was removed. When a guest OS starts the Xen watchdog (either via
> > the
> > kernel driver or xenwatchdogd), it is responsible for managing that state
> > across suspend/resume. On Linux, the Xen kernel driver properly stops the
> > watchdog during suspend. However, when xenwatchdogd is used instead, suspend
> > handling is incomplete, potentially leading to watchdog-triggered resets on
> > resume. Xen leaves watchdog handling to the guest OS and its services.
> >
> > Dropped all changes related to VCPU context, because instead domain_shutdown
> > is used, so we don't need any extra changes for suspending domain.
> >
> > Changes in V3:
> > Dropped all domain flags and related code (which touched common functions
> > like
> > vcpu_unblock), keeping only the necessary changes for Xen suspend/resume,
> > i.e.
> > suspend/resume is now fully supported only for the hardware domain.
> > Proper support for domU suspend/resume will be added in a future patch.
> > This patch does not yet include VCPU context reset or domain context
> > restoration in VCPU.
> > ---
> > xen/arch/arm/domain.c | 22 +++++
> > xen/arch/arm/include/asm/domain.h | 6 ++
> > xen/arch/arm/include/asm/perfc_defn.h | 1 +
> > xen/arch/arm/include/asm/psci.h | 2 +
> > xen/arch/arm/include/asm/vpsci.h | 5 +-
> > xen/arch/arm/vpsci.c | 114 +++++++++++++++++++++-----
> > xen/arch/ppc/stubs.c | 5 ++
> > xen/arch/riscv/stubs.c | 5 ++
> > xen/arch/x86/domain.c | 5 ++
> > xen/common/domain.c | 9 ++
> > xen/include/xen/domain.h | 2 +
> > 11 files changed, 156 insertions(+), 20 deletions(-)
> >
> > diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> > index 863ae18157..6fd73eedde 100644
> > --- a/xen/arch/arm/domain.c
> > +++ b/xen/arch/arm/domain.c
> > @@ -12,6 +12,8 @@
> > #include <xen/softirq.h>
> > #include <xen/wait.h>
> >
> > +#include <public/sched.h>
> > +
> > #include <asm/arm64/sve.h>
> > #include <asm/cpuerrata.h>
> > #include <asm/cpufeature.h>
> > @@ -27,6 +29,7 @@
> > #include <asm/tee/tee.h>
> > #include <asm/vfp.h>
> > #include <asm/vgic.h>
> > +#include <asm/vpsci.h>
> > #include <asm/vtimer.h>
> >
> > #include "vpci.h"
> > @@ -880,6 +883,25 @@ void arch_domain_creation_finished(struct domain *d)
> > p2m_domain_creation_finished(d);
> > }
> >
> > +int arch_domain_resume(struct domain *d)
> > +{
> > + int rc;
> > + typeof(d->arch.resume_ctx) *ctx = &d->arch.resume_ctx;
> > +
> > + if ( !d->is_shutting_down || d->shutdown_code != SHUTDOWN_suspend )
> > + {
> > + dprintk(XENLOG_WARNING,
> > + "%pd: Invalid domain state for resume:
> > is_shutting_down=%d, shutdown_code=%d\n",
> > + d, d->is_shutting_down, d->shutdown_code);
> > + return -EINVAL;
> > + }
>
> This check probably can go into common domain_resume() function, as
> there is nothing arch-specific in here. Probably this can be done during
> commit, to save us from v12? If commiters are okay with this.
It is more complicated than I expected, because this function is also used
for the "soft-reset" action (see domain_soft_reset). Currently, it may only
be called in the x86 case, but as I understand it, almost for any event an
action can be rewritten:
https://xenbits.xen.org/docs/unstable/man/xl.cfg.5.html#Event-Actions.
It also appears that domain_resume() is used to clean up state after
domain_shutdown(). So, if I simply add a condition to check for a specific
shutdown_code, it could break the "soft-reset" functionality.
This needs a deeper investigation. I propose to leave this check inside
arch_domain_resume() for now.
>
> > +
> > + rc = vpsci_vcpu_up_prepare(ctx->wake_cpu , ctx->ep, ctx->cid);
> > + memset(ctx, 0, sizeof(*ctx));
> > +
> > + return rc;
> > +}
> > +
> > static int is_guest_pv32_psr(uint32_t psr)
> > {
> > switch (psr & PSR_MODE_MASK)
> > diff --git a/xen/arch/arm/include/asm/domain.h
> > b/xen/arch/arm/include/asm/domain.h
> > index a3487ca713..68185fc4d6 100644
> > --- a/xen/arch/arm/include/asm/domain.h
> > +++ b/xen/arch/arm/include/asm/domain.h
> > @@ -121,6 +121,12 @@ struct arch_domain
> > void *tee;
> > #endif
> >
> > + struct resume_info {
> > + register_t ep;
> > + register_t cid;
> > + struct vcpu *wake_cpu;
> > + } resume_ctx;
> > +
> > } __cacheline_aligned;
> >
> > struct arch_vcpu
> > diff --git a/xen/arch/arm/include/asm/perfc_defn.h
> > b/xen/arch/arm/include/asm/perfc_defn.h
> > index effd25b69e..8dfcac7e3b 100644
> > --- a/xen/arch/arm/include/asm/perfc_defn.h
> > +++ b/xen/arch/arm/include/asm/perfc_defn.h
> > @@ -33,6 +33,7 @@ PERFCOUNTER(vpsci_system_reset, "vpsci:
> > system_reset")
> > PERFCOUNTER(vpsci_cpu_suspend, "vpsci: cpu_suspend")
> > PERFCOUNTER(vpsci_cpu_affinity_info, "vpsci: cpu_affinity_info")
> > PERFCOUNTER(vpsci_features, "vpsci: features")
> > +PERFCOUNTER(vpsci_system_suspend, "vpsci: system_suspend")
> >
> > PERFCOUNTER(vcpu_kick, "vcpu: notify other vcpu")
> >
> > diff --git a/xen/arch/arm/include/asm/psci.h
> > b/xen/arch/arm/include/asm/psci.h
> > index 4780972621..48a93e6b79 100644
> > --- a/xen/arch/arm/include/asm/psci.h
> > +++ b/xen/arch/arm/include/asm/psci.h
> > @@ -47,10 +47,12 @@ void call_psci_system_reset(void);
> > #define PSCI_0_2_FN32_SYSTEM_OFF PSCI_0_2_FN32(8)
> > #define PSCI_0_2_FN32_SYSTEM_RESET PSCI_0_2_FN32(9)
> > #define PSCI_1_0_FN32_PSCI_FEATURES PSCI_0_2_FN32(10)
> > +#define PSCI_1_0_FN32_SYSTEM_SUSPEND PSCI_0_2_FN32(14)
> >
> > #define PSCI_0_2_FN64_CPU_SUSPEND PSCI_0_2_FN64(1)
> > #define PSCI_0_2_FN64_CPU_ON PSCI_0_2_FN64(3)
> > #define PSCI_0_2_FN64_AFFINITY_INFO PSCI_0_2_FN64(4)
> > +#define PSCI_1_0_FN64_SYSTEM_SUSPEND PSCI_0_2_FN64(14)
> >
> > /* PSCI v0.2 affinity level state returned by AFFINITY_INFO */
> > #define PSCI_0_2_AFFINITY_LEVEL_ON 0
> > diff --git a/xen/arch/arm/include/asm/vpsci.h
> > b/xen/arch/arm/include/asm/vpsci.h
> > index 0cca5e6830..d790ab3715 100644
> > --- a/xen/arch/arm/include/asm/vpsci.h
> > +++ b/xen/arch/arm/include/asm/vpsci.h
> > @@ -23,12 +23,15 @@
> > #include <asm/psci.h>
> >
> > /* Number of function implemented by virtual PSCI (only 0.2 or later) */
> > -#define VPSCI_NR_FUNCS 12
> > +#define VPSCI_NR_FUNCS 14
> >
> > /* Functions handle PSCI calls from the guests */
> > bool do_vpsci_0_1_call(struct cpu_user_regs *regs, uint32_t fid);
> > bool do_vpsci_0_2_call(struct cpu_user_regs *regs, uint32_t fid);
> >
> > +int vpsci_vcpu_up_prepare(struct vcpu *v, register_t entry_point,
> > + register_t context_id);
> > +
> > #endif /* __ASM_VPSCI_H__ */
> >
> > /*
> > diff --git a/xen/arch/arm/vpsci.c b/xen/arch/arm/vpsci.c
> > index 7ba9ccd94b..50cf5fd96c 100644
> > --- a/xen/arch/arm/vpsci.c
> > +++ b/xen/arch/arm/vpsci.c
> > @@ -10,30 +10,16 @@
> >
> > #include <public/sched.h>
> >
> > -static int do_common_cpu_on(register_t target_cpu, register_t entry_point,
> > - register_t context_id)
> > +int vpsci_vcpu_up_prepare(struct vcpu *v, register_t entry_point,
> > + register_t context_id)
> > {
> > - struct vcpu *v;
> > - struct domain *d = current->domain;
> > - struct vcpu_guest_context *ctxt;
> > int rc;
> > + struct domain *d = current->domain;
> > bool is_thumb = entry_point & 1;
> > - register_t vcpuid;
> > -
> > - vcpuid = vaffinity_to_vcpuid(target_cpu);
> > -
> > - if ( (v = domain_vcpu(d, vcpuid)) == NULL )
> > - return PSCI_INVALID_PARAMETERS;
> > -
> > - /* THUMB set is not allowed with 64-bit domain */
> > - if ( is_64bit_domain(d) && is_thumb )
> > - return PSCI_INVALID_ADDRESS;
> > -
> > - if ( !test_bit(_VPF_down, &v->pause_flags) )
> > - return PSCI_ALREADY_ON;
> > + struct vcpu_guest_context *ctxt;
> >
> > if ( (ctxt = alloc_vcpu_guest_context()) == NULL )
> > - return PSCI_DENIED;
> > + return -ENOMEM;
> >
> > vgic_clear_pending_irqs(v);
> >
> > @@ -76,8 +62,37 @@ static int do_common_cpu_on(register_t target_cpu,
> > register_t entry_point,
> > free_vcpu_guest_context(ctxt);
> >
> > if ( rc < 0 )
> > + return rc;
> > +
> > + return 0;
> > +}
> > +
> > +static int do_common_cpu_on(register_t target_cpu, register_t entry_point,
> > + register_t context_id)
> > +{
> > + struct vcpu *v;
> > + struct domain *d = current->domain;
> > + int rc;
> > + bool is_thumb = entry_point & 1;
> > + register_t vcpuid;
> > +
> > + vcpuid = vaffinity_to_vcpuid(target_cpu);
> > +
> > + if ( (v = domain_vcpu(d, vcpuid)) == NULL )
> > + return PSCI_INVALID_PARAMETERS;
> > +
> > + /* THUMB set is not allowed with 64-bit domain */
> > + if ( is_64bit_domain(d) && is_thumb )
> > + return PSCI_INVALID_ADDRESS;
> > +
> > + if ( !test_bit(_VPF_down, &v->pause_flags) )
> > + return PSCI_ALREADY_ON;
> > +
> > + rc = vpsci_vcpu_up_prepare(v, entry_point, context_id);
> > + if ( rc )
> > return PSCI_DENIED;
> >
> > + vgic_clear_pending_irqs(v);
> > vcpu_wake(v);
> >
> > return PSCI_SUCCESS;
> > @@ -197,6 +212,48 @@ static void do_psci_0_2_system_reset(void)
> > domain_shutdown(d,SHUTDOWN_reboot);
> > }
> >
> > +static int32_t do_psci_1_0_system_suspend(register_t epoint, register_t
> > cid)
> > +{
> > + int32_t rc;
> > + struct vcpu *v;
> > + struct domain *d = current->domain;
> > + bool is_thumb = epoint & 1;
> > +
> > + /* THUMB set is not allowed with 64-bit domain */
> > + if ( is_64bit_domain(d) && is_thumb )
> > + return PSCI_INVALID_ADDRESS;
> > +
> > + /* SYSTEM_SUSPEND is not supported for the hardware domain yet */
> > + if ( is_hardware_domain(d) )
> > + return PSCI_NOT_SUPPORTED;
> > +
> > + /* Ensure that all CPUs other than the calling one are offline */
> > + domain_lock(d);
> > + for_each_vcpu ( d, v )
> > + {
> > + if ( v != current && is_vcpu_online(v) )
> > + {
> > + domain_unlock(d);
> > + return PSCI_DENIED;
> > + }
> > + }
> > + domain_unlock(d);
> > +
> > + rc = domain_shutdown(d, SHUTDOWN_suspend);
> > + if ( rc )
> > + return PSCI_DENIED;
> > +
> > + d->arch.resume_ctx.ep = epoint;
> > + d->arch.resume_ctx.cid = cid;
> > + d->arch.resume_ctx.wake_cpu = current;
> > +
> > + gprintk(XENLOG_DEBUG,
> > + "SYSTEM_SUSPEND requested, epoint=0x%"PRIregister",
> > cid=0x%"PRIregister,
> > + epoint, cid);
> > +
> > + return rc;
> > +}
> > +
> > static int32_t do_psci_1_0_features(uint32_t psci_func_id)
> > {
> > /* /!\ Ordered by function ID and not name */
> > @@ -214,6 +271,8 @@ static int32_t do_psci_1_0_features(uint32_t
> > psci_func_id)
> > case PSCI_0_2_FN32_SYSTEM_OFF:
> > case PSCI_0_2_FN32_SYSTEM_RESET:
> > case PSCI_1_0_FN32_PSCI_FEATURES:
> > + case PSCI_1_0_FN32_SYSTEM_SUSPEND:
> > + case PSCI_1_0_FN64_SYSTEM_SUSPEND:
> > case ARM_SMCCC_VERSION_FID:
> > return 0;
> > default:
> > @@ -344,6 +403,23 @@ bool do_vpsci_0_2_call(struct cpu_user_regs *regs,
> > uint32_t fid)
> > return true;
> > }
> >
> > + case PSCI_1_0_FN32_SYSTEM_SUSPEND:
> > + case PSCI_1_0_FN64_SYSTEM_SUSPEND:
> > + {
> > + register_t epoint = PSCI_ARG(regs, 1);
> > + register_t cid = PSCI_ARG(regs, 2);
> > +
> > + if ( fid == PSCI_1_0_FN32_SYSTEM_SUSPEND )
> > + {
> > + epoint &= GENMASK(31, 0);
> > + cid &= GENMASK(31, 0);
> > + }
> > +
> > + perfc_incr(vpsci_system_suspend);
> > + PSCI_SET_RESULT(regs, do_psci_1_0_system_suspend(epoint, cid));
> > + return true;
> > + }
> > +
> > default:
> > return false;
> > }
> > diff --git a/xen/arch/ppc/stubs.c b/xen/arch/ppc/stubs.c
> > index bdaf474c5c..0db0627b5c 100644
> > --- a/xen/arch/ppc/stubs.c
> > +++ b/xen/arch/ppc/stubs.c
> > @@ -224,6 +224,11 @@ void arch_domain_creation_finished(struct domain *d)
> > BUG_ON("unimplemented");
> > }
> >
> > +int arch_domain_resume(struct domain *d)
> > +{
> > + return 0;
> > +}
> > +
> > int arch_set_info_guest(struct vcpu *v, vcpu_guest_context_u c)
> > {
> > BUG_ON("unimplemented");
> > diff --git a/xen/arch/riscv/stubs.c b/xen/arch/riscv/stubs.c
> > index 1a8c86cd8d..52532ae14d 100644
> > --- a/xen/arch/riscv/stubs.c
> > +++ b/xen/arch/riscv/stubs.c
> > @@ -198,6 +198,11 @@ void arch_domain_creation_finished(struct domain *d)
> > BUG_ON("unimplemented");
> > }
> >
> > +int arch_domain_resume(struct domain *d)
> > +{
> > + return 0;
> > +}
> > +
> > int arch_set_info_guest(struct vcpu *v, vcpu_guest_context_u c)
> > {
> > BUG_ON("unimplemented");
> > diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
> > index 19fd86ce88..94a06bc697 100644
> > --- a/xen/arch/x86/domain.c
> > +++ b/xen/arch/x86/domain.c
> > @@ -1138,6 +1138,11 @@ void arch_domain_creation_finished(struct domain *d)
> > hvm_domain_creation_finished(d);
> > }
> >
> > +int arch_domain_resume(struct domain *d)
> > +{
> > + return 0;
> > +}
> > +
> > #ifdef CONFIG_COMPAT
> > #define xen_vcpu_guest_context vcpu_guest_context
> > #define fpu_ctxt fpu_ctxt.x
> > diff --git a/xen/common/domain.c b/xen/common/domain.c
> > index 104e917f07..d73a88ced5 100644
> > --- a/xen/common/domain.c
> > +++ b/xen/common/domain.c
> > @@ -1352,6 +1352,7 @@ int domain_shutdown(struct domain *d, u8 reason)
> > void domain_resume(struct domain *d)
> > {
> > struct vcpu *v;
> > + int rc;
> >
> > /*
> > * Some code paths assume that shutdown status does not get reset under
> > @@ -1359,6 +1360,14 @@ void domain_resume(struct domain *d)
> > */
> > domain_pause(d);
> >
> > + rc = arch_domain_resume(d);
> > + if ( rc )
> > + {
> > + domain_unpause(d);
> > + printk("%pd: Failed to resume domain (ret %d)\n", d, rc);
> > + return;
> > + }
> > +
> > spin_lock(&d->shutdown_lock);
> >
> > d->is_shutting_down = d->is_shut_down = 0;
> > diff --git a/xen/include/xen/domain.h b/xen/include/xen/domain.h
> > index e10baf2615..5f77ffadf1 100644
> > --- a/xen/include/xen/domain.h
> > +++ b/xen/include/xen/domain.h
> > @@ -109,6 +109,8 @@ int arch_domain_soft_reset(struct domain *d);
> >
> > void arch_domain_creation_finished(struct domain *d);
> >
> > +int arch_domain_resume(struct domain *d);
> > +
> > void arch_p2m_set_access_required(struct domain *d, bool access_required);
> >
> > int arch_set_info_guest(struct vcpu *v, vcpu_guest_context_u c);
>
> --
> WBR, Volodymyr
Best regards,
Mykola
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |