[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v8 13/13] xen/arm: Add support for system suspend triggered by hardware domain


  • To: Mykola Kvach <xakep.amatop@xxxxxxxxx>
  • From: Luca Fancellu <Luca.Fancellu@xxxxxxx>
  • Date: Fri, 8 May 2026 14:30:15 +0000
  • Accept-language: en-GB, en-US
  • Arc-authentication-results: i=2; mx.microsoft.com 1; spf=pass (sender ip is 4.158.2.129) smtp.rcpttodomain=gmail.com smtp.mailfrom=arm.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com; dkim=pass (signature was verified) header.d=arm.com; arc=pass (0 oda=1 ltdi=1 spf=[1,1,smtp.mailfrom=arm.com] dkim=[1,1,header.d=arm.com] dmarc=[1,1,header.from=arm.com])
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass header.d=arm.com; arc=none
  • Arc-message-signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=5iKWRb5SMys4cCNq04uSSJ1M6DF+kI3YEVZCXIrJo5I=; b=nKMoMYMXn6WfMwxpFkLZhA2u5NE86TjrcDe7H0WqiOvqIbBMJ+ssXtjYZBzHJ2CWfR8glKd5Aw+wILtNlBUfcogqSuuXCessKxNHgiiV3laBv8aYaJB+OrLU5QZ5yGW2lTVfWU4idLYNnPoMPYu8d21QK4uWqJr0JMvSbNtat4b7Io7H+FRxr8kaR606MziGptnWSvpa/vpgAII4HVfXcDwkbxKeb720ML9g0jFfVFtTjKCyaR+rrC6mcylpCOrIHdpUTKN89iw3kWeDiqF/pnBw/qfC41DHAOh9ilSEJ363pDqPX3RSfyNlncE7rdJpvdI1paa2IIzhq8RZ86eXfA==
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=5iKWRb5SMys4cCNq04uSSJ1M6DF+kI3YEVZCXIrJo5I=; b=nYfUwfc/OUxPZ+gtS2o8j7D7yudLLqjMDSXNp5gaZGc4Pj0sqwo3DiHX3EOLNXcU2A9jgnw9C+GXDOiP7wK4YjN1cHDG8c2+neRYZ3+1Uqa/1Nw8gDfYbY5+HLBMObtSVt3H0mHqBcgxo3v9a0IO4ObijgZy6PdBSuL514ZKyqPEahf/Nci2X3pO3QPqZo+I0S82aRave/3HNnaFWSCkU7svpRgNREJLQsCm2IU2KGP7yqdevMcqQKnvibqquxVGtZqJuBd5OZR4L0ZCzMou4pkFFuTsXnb2H3nLYTxI+FCEIAE5IbFfwQ+ikCGOCi2t/euVToGTRJWyxPiQx9MNSA==
  • Arc-seal: i=2; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=pass; b=FnjVv4Su5F6rH55wfX6KcZweMK/UnFbILLLOEco2aAN1VFkfjIkqLUpSXPgwjzdd2SipuEMStvWc3BTx1DqZ2QdhkqWduIAQ4RH/SpZshl3YtDcDJT+V/gpWcPq89l9sSbvpegPDLMLlo/qlqC+8swOU69KTblklF5bXyrEDubO+roqXWNJlbClqFNcHMu+ydZO9SJdiu2orth2pFmBCqAS0gwbRiPlF7TDrsjGSukikXGWkGCRzD1K55uQGRxBjp4+CWzMcmp9JOMy2QlBnwhKYz1fNpl+ckRmwZvXZ2Rdq+s7Z7HS1ALuwM4RFXwWzUWI5WZS/zsBjTvJkOTPO3g==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=W6N7Kbymj45Oteag9NWPgvwGR12cgFxR4cCKyqhcXKmdhCk7QzOZdLkEr9pjgt7/8ehVqx/locHbMwBpygHkQkzbQIA/7Ims71k9otjAkB5i7tDdF8Kt19SnufZ3QPPrtgnw9QaKty0U1lVeYG2tKL/yj+sznORCHod3FE+h1dzPQP+CxmoqPboM5S0xz4dZEaKecoLdH3aIZkJrs2m3+hXbksMFTrKE+leR0nEdpwzvdBjPiAif9TKe33gAgPdfxRkLDQep3SVKmRkmOYVlT6kwmu9eLLZNP5fD3WXYbOypnpy7dn0xVpQC1p3gVRyAvFNX0Xz7FF3TBJosTYi4lA==
  • Authentication-results: eu.smtp.expurgate.cloud; dkim=pass header.s=selector1 header.d=arm.com header.i="@arm.com" header.h="From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck"; dkim=pass header.s=selector1 header.d=arm.com header.i="@arm.com" header.h="From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck"
  • Authentication-results-original: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com;
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Mykola Kvach <mykola_kvach@xxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Bertrand Marquis <Bertrand.Marquis@xxxxxxx>, Michal Orzel <michal.orzel@xxxxxxx>, Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Anthony PERARD <anthony.perard@xxxxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Rahul Singh <Rahul.Singh@xxxxxxx>
  • Delivery-date: Fri, 08 May 2026 14:31:46 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Nodisclaimer: true
  • Thread-index: AQHc16717IQ0/wsQJ0u4S7T7zIC8eLX/7bkAgARRIwA=
  • Thread-topic: [PATCH v8 13/13] xen/arm: Add support for system suspend triggered by hardware domain

Hi Mykola,

>>> +/* Xen suspend. data identifies the domain that initiated suspend. */
>>> +static void system_suspend(void *data)
>>> +{
>>> +    int status;
>>> +    unsigned long flags;
>>> +    struct domain *d = (struct domain *)data;
>>> +
>>> +    BUG_ON(system_state != SYS_STATE_active);
>>> +
>>> +    system_state = SYS_STATE_suspend;
>>> +
>>> +    printk("Xen suspending...\n");
>>> +
>>> +    freeze_domains();
>>> +    scheduler_disable();
>>> +
>>> +    status = can_system_suspend();
>>> +    if ( status )
>>> +    {
>>> +        system_state = SYS_STATE_resume;
>>> +        goto resume_scheduler;
>> 
>> When we have an error and we get the resume_scheduler path, we apply back the
>> context of the guest saved previously in do_psci_1_0_system_suspend(), so am 
>> I
>> correct saying the guest won’t get any PSCI error back and we resume the 
>> guest
>> from the guest resume entrypoint?
>> 
>> In case, should we have a different path that returns a PSCI error (PSCI_*) 
>> into the guest
>> x0, and skips the context restore?
> 
> You are right about the current control flow: once the virtual
> SYSTEM_SUSPEND request has been accepted and the domain has been parked, a
> later failure in the Xen-wide suspend path resumes the domain through the 
> normal
> domain resume path, rather than returning a PSCI error from the original call.
> 
> This is intentional in the current design. The virtual PSCI SYSTEM_SUSPEND
> path parks the domain and saves its resume context. The actual Xen-wide host
> suspend is a separate step that is attempted only after all domains are
> suspended.
> 
> So a failure in the later Xen-wide suspend step is treated as an abort of the
> host suspend attempt after the domain suspend was already accepted. The domain
> is then resumed through the existing domain resume path, similarly to the
> toolstack/xl suspend-resume flow, rather than by re-entering the guest PSCI
> call path and modifying the saved vCPU context again.
> 
> I agree this design is not obvious from the patch. I will clarify the commit
> message and comments. If you or the maintainers think that failures before the
> physical SYSTEM_SUSPEND call succeeds should be reported back through the
> original virtual PSCI call, then this would require a different flow. I was
> trying to avoid that extra complexity in this series.

Ok I understand, I’m wondering if inside do_psci_1_0_system_suspend() we could 
do something
like:

[…]
if ( is_control_domain(d) && !other_domains_ready_for_suspend(d) )
    return PSCI_DENIED;

[…]

But I’m ok also to only document this behaviour.


>>> 
>>> diff --git a/xen/arch/arm/vpsci.c b/xen/arch/arm/vpsci.c
>>> index bd87ec430d..8fb9172186 100644
>>> --- a/xen/arch/arm/vpsci.c
>>> +++ b/xen/arch/arm/vpsci.c
>>> @@ -5,6 +5,7 @@
>>> 
>>> #include <asm/current.h>
>>> #include <asm/domain.h>
>>> +#include <asm/suspend.h>
>>> #include <asm/vgic.h>
>>> #include <asm/vpsci.h>
>>> #include <asm/event.h>
>>> @@ -232,8 +233,7 @@ static int32_t do_psci_1_0_system_suspend(register_t 
>>> epoint, register_t cid)
>>>    if ( is_64bit_domain(d) && is_thumb )
>>>        return PSCI_INVALID_ADDRESS;
>>> 
>>> -    /* SYSTEM_SUSPEND is not supported for the hardware domain yet */
>>> -    if ( is_hardware_domain(d) )
>>> +    if ( !IS_ENABLED(CONFIG_SYSTEM_SUSPEND) && is_hardware_domain(d) )
>>>        return PSCI_NOT_SUPPORTED;
>>> 
>>>    /* Ensure that all CPUs other than the calling one are offline */
>>> @@ -266,6 +266,9 @@ static int32_t do_psci_1_0_system_suspend(register_t 
>>> epoint, register_t cid)
>>>            "SYSTEM_SUSPEND requested, epoint=%#"PRIregister", 
>>> cid=%#"PRIregister"\n",
>>>            epoint, cid);
>>> 
>>> +    if ( is_control_domain(d) )
>> 
>> Why is_control_domain() here and not is_hardware_domain() ?
> 
> The use of is_control_domain() is intentional.
> 
> The intended model is that Xen-wide host suspend is orchestrated by the
> privileged management/control domain. The control domain coordinates the
> toolstack side, asks other domains to enter suspend, and then issues the final
> SYSTEM_SUSPEND request to Xen.
> 
> This does not have to be the same entity as the hardware domain. If the
> hardware domain is separate, it is one of the domains that the control domain
> parks before the final host suspend step.
> 
> The hwdom-specific checks in this patch have a different purpose: they avoid
> the old hwdom_shutdown() path for SHUTDOWN_suspend and allow the hardware
> domain to be parked as part of the suspend sequence. They do not define the
> policy for who is allowed to trigger Xen-wide host suspend.
> 
> That said, this policy may not be optimal for all configurations, especially
> when the control and hardware domain roles are split. I would appreciate your
> view, as well as the maintainers' views, on whether the trigger should remain
> control-domain based, be tied to the hardware domain instead, or be expressed
> through a separate host-suspend capability/helper.

In the commit message and title I saw HW domain, so maybe the commit should be 
updated
to say control domain instead?

At this point however I’m wondering about this code above:
```
    if ( !IS_ENABLED(CONFIG_SYSTEM_SUSPEND) && is_hardware_domain(d) )
        return PSCI_NOT_SUPPORTED;
```
and in do_psci_1_0_features(), shouldn’t we use consistently 
is_control_domain()?

> 
>> 
>>> +        host_system_suspend(d);
>>> +
>>>    return rc;
>>> }
>>> 
>>> @@ -290,7 +293,10 @@ static int32_t do_psci_1_0_features(uint32_t 
>>> psci_func_id)
>>>        return 0;
>>>    case PSCI_1_0_FN32_SYSTEM_SUSPEND:
>>>    case PSCI_1_0_FN64_SYSTEM_SUSPEND:
>>> -        return is_hardware_domain(current->domain) ? PSCI_NOT_SUPPORTED : 
>>> 0;
>>> +        if ( IS_ENABLED(CONFIG_SYSTEM_SUSPEND) ||
>>> +             !is_hardware_domain(current->domain) )
>> 
>> Should this have also the condition that “is hardware domain and psci_ver >= 
>> PSCI_VERSION(1, 0)”?
>> Otherwise if the host machine doestn’t support PSCI 1.0 we would return OK 
>> here but the call would
>> fail later in call_psci_system_suspend()?
> 
> Good point.
> 
> I agree that, for the domain allowed to trigger Xen-wide suspend, Xen should
> not advertise SYSTEM_SUSPEND if the host suspend path cannot be used.
> 
> I think this should be checked as an explicit host SYSTEM_SUSPEND capability,
> rather than only as psci_ver >= PSCI_VERSION(1, 0). The same capability check
> also needs to be enforced in the actual SYSTEM_SUSPEND handler before parking
> the domain, because a caller may invoke SYSTEM_SUSPEND directly without first
> querying PSCI_FEATURES.
> 
> For ordinary guests, the physical PSCI version is not relevant because they
> cannot trigger host suspend; their SYSTEM_SUSPEND path is virtual.
> 
> I will make this consistent in v9: PSCI_FEATURES will advertise SYSTEM_SUSPEND
> for the host-suspend-triggering domain only when the host SYSTEM_SUSPEND 
> backend
> is available, and the actual SYSTEM_SUSPEND path will enforce the same check.

ok

>>> 
>>> diff --git a/xen/drivers/passthrough/arm/smmu.c 
>>> b/xen/drivers/passthrough/arm/smmu.c
>>> index 22d306d0cb..45f29ef8ec 100644
>>> --- a/xen/drivers/passthrough/arm/smmu.c
>>> +++ b/xen/drivers/passthrough/arm/smmu.c
>>> @@ -2947,6 +2947,13 @@ static void arm_smmu_iommu_domain_teardown(struct 
>>> domain *d)
>>> xfree(xen_domain);
>>> }
>>> 
>>> +#ifdef CONFIG_SYSTEM_SUSPEND
>>> +static int arm_smmu_suspend(void)
>>> +{
>>> + return -ENOSYS;
>>> +}
>>> +#endif
>> 
>> Maybe we want to gate the feature also to !CONFIG_ARM_SMMU ? I would wait 
>> for the maintainers
>> view on this.
> 
> I feel that gating this strictly on !CONFIG_ARM_SMMU might not be the most
> optimal approach here.
> 
> CONFIG_ARM_SMMU is a build-time option and does not mean that an old SMMUv1/v2
> device is actually present. Using it would disable system suspend even on
> platforms where only SMMUv3 is used, because CONFIG_ARM_SMMU is enabled by
> default for Arm.
> 
> The condition should be runtime-based: whether the active/probed IOMMU devices
> have system suspend/resume support. For the old ARM SMMU driver this is not
> implemented today, so a platform with an SMMUv1/v2 instance should not expose
> or attempt host suspend.
> 
> I think we should handle this by tracking whether any old ARM SMMUv1/v2 device
> was actually probed, or by adding a generic IOMMU suspend capability check. 
> Then
> the host suspend availability check can reject system suspend only when such 
> an
> unsupported IOMMU is present, instead of disabling the feature for all
> Arm builds
> with CONFIG_ARM_SMMU enabled.
> 
> I would be interested to hear if you or the maintainers see a better way to
> express this capability.

ok, let’s address Jan comment now and we can see what the maintainers think 
about this.

Cheers,
Luca



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.