|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v8 13/13] xen/arm: Add support for system suspend triggered by hardware domain
Hi Mykola,
>>> +/* Xen suspend. data identifies the domain that initiated suspend. */
>>> +static void system_suspend(void *data)
>>> +{
>>> + int status;
>>> + unsigned long flags;
>>> + struct domain *d = (struct domain *)data;
>>> +
>>> + BUG_ON(system_state != SYS_STATE_active);
>>> +
>>> + system_state = SYS_STATE_suspend;
>>> +
>>> + printk("Xen suspending...\n");
>>> +
>>> + freeze_domains();
>>> + scheduler_disable();
>>> +
>>> + status = can_system_suspend();
>>> + if ( status )
>>> + {
>>> + system_state = SYS_STATE_resume;
>>> + goto resume_scheduler;
>>
>> When we have an error and we get the resume_scheduler path, we apply back the
>> context of the guest saved previously in do_psci_1_0_system_suspend(), so am
>> I
>> correct saying the guest won’t get any PSCI error back and we resume the
>> guest
>> from the guest resume entrypoint?
>>
>> In case, should we have a different path that returns a PSCI error (PSCI_*)
>> into the guest
>> x0, and skips the context restore?
>
> You are right about the current control flow: once the virtual
> SYSTEM_SUSPEND request has been accepted and the domain has been parked, a
> later failure in the Xen-wide suspend path resumes the domain through the
> normal
> domain resume path, rather than returning a PSCI error from the original call.
>
> This is intentional in the current design. The virtual PSCI SYSTEM_SUSPEND
> path parks the domain and saves its resume context. The actual Xen-wide host
> suspend is a separate step that is attempted only after all domains are
> suspended.
>
> So a failure in the later Xen-wide suspend step is treated as an abort of the
> host suspend attempt after the domain suspend was already accepted. The domain
> is then resumed through the existing domain resume path, similarly to the
> toolstack/xl suspend-resume flow, rather than by re-entering the guest PSCI
> call path and modifying the saved vCPU context again.
>
> I agree this design is not obvious from the patch. I will clarify the commit
> message and comments. If you or the maintainers think that failures before the
> physical SYSTEM_SUSPEND call succeeds should be reported back through the
> original virtual PSCI call, then this would require a different flow. I was
> trying to avoid that extra complexity in this series.
Ok I understand, I’m wondering if inside do_psci_1_0_system_suspend() we could
do something
like:
[…]
if ( is_control_domain(d) && !other_domains_ready_for_suspend(d) )
return PSCI_DENIED;
[…]
But I’m ok also to only document this behaviour.
>>>
>>> diff --git a/xen/arch/arm/vpsci.c b/xen/arch/arm/vpsci.c
>>> index bd87ec430d..8fb9172186 100644
>>> --- a/xen/arch/arm/vpsci.c
>>> +++ b/xen/arch/arm/vpsci.c
>>> @@ -5,6 +5,7 @@
>>>
>>> #include <asm/current.h>
>>> #include <asm/domain.h>
>>> +#include <asm/suspend.h>
>>> #include <asm/vgic.h>
>>> #include <asm/vpsci.h>
>>> #include <asm/event.h>
>>> @@ -232,8 +233,7 @@ static int32_t do_psci_1_0_system_suspend(register_t
>>> epoint, register_t cid)
>>> if ( is_64bit_domain(d) && is_thumb )
>>> return PSCI_INVALID_ADDRESS;
>>>
>>> - /* SYSTEM_SUSPEND is not supported for the hardware domain yet */
>>> - if ( is_hardware_domain(d) )
>>> + if ( !IS_ENABLED(CONFIG_SYSTEM_SUSPEND) && is_hardware_domain(d) )
>>> return PSCI_NOT_SUPPORTED;
>>>
>>> /* Ensure that all CPUs other than the calling one are offline */
>>> @@ -266,6 +266,9 @@ static int32_t do_psci_1_0_system_suspend(register_t
>>> epoint, register_t cid)
>>> "SYSTEM_SUSPEND requested, epoint=%#"PRIregister",
>>> cid=%#"PRIregister"\n",
>>> epoint, cid);
>>>
>>> + if ( is_control_domain(d) )
>>
>> Why is_control_domain() here and not is_hardware_domain() ?
>
> The use of is_control_domain() is intentional.
>
> The intended model is that Xen-wide host suspend is orchestrated by the
> privileged management/control domain. The control domain coordinates the
> toolstack side, asks other domains to enter suspend, and then issues the final
> SYSTEM_SUSPEND request to Xen.
>
> This does not have to be the same entity as the hardware domain. If the
> hardware domain is separate, it is one of the domains that the control domain
> parks before the final host suspend step.
>
> The hwdom-specific checks in this patch have a different purpose: they avoid
> the old hwdom_shutdown() path for SHUTDOWN_suspend and allow the hardware
> domain to be parked as part of the suspend sequence. They do not define the
> policy for who is allowed to trigger Xen-wide host suspend.
>
> That said, this policy may not be optimal for all configurations, especially
> when the control and hardware domain roles are split. I would appreciate your
> view, as well as the maintainers' views, on whether the trigger should remain
> control-domain based, be tied to the hardware domain instead, or be expressed
> through a separate host-suspend capability/helper.
In the commit message and title I saw HW domain, so maybe the commit should be
updated
to say control domain instead?
At this point however I’m wondering about this code above:
```
if ( !IS_ENABLED(CONFIG_SYSTEM_SUSPEND) && is_hardware_domain(d) )
return PSCI_NOT_SUPPORTED;
```
and in do_psci_1_0_features(), shouldn’t we use consistently
is_control_domain()?
>
>>
>>> + host_system_suspend(d);
>>> +
>>> return rc;
>>> }
>>>
>>> @@ -290,7 +293,10 @@ static int32_t do_psci_1_0_features(uint32_t
>>> psci_func_id)
>>> return 0;
>>> case PSCI_1_0_FN32_SYSTEM_SUSPEND:
>>> case PSCI_1_0_FN64_SYSTEM_SUSPEND:
>>> - return is_hardware_domain(current->domain) ? PSCI_NOT_SUPPORTED :
>>> 0;
>>> + if ( IS_ENABLED(CONFIG_SYSTEM_SUSPEND) ||
>>> + !is_hardware_domain(current->domain) )
>>
>> Should this have also the condition that “is hardware domain and psci_ver >=
>> PSCI_VERSION(1, 0)”?
>> Otherwise if the host machine doestn’t support PSCI 1.0 we would return OK
>> here but the call would
>> fail later in call_psci_system_suspend()?
>
> Good point.
>
> I agree that, for the domain allowed to trigger Xen-wide suspend, Xen should
> not advertise SYSTEM_SUSPEND if the host suspend path cannot be used.
>
> I think this should be checked as an explicit host SYSTEM_SUSPEND capability,
> rather than only as psci_ver >= PSCI_VERSION(1, 0). The same capability check
> also needs to be enforced in the actual SYSTEM_SUSPEND handler before parking
> the domain, because a caller may invoke SYSTEM_SUSPEND directly without first
> querying PSCI_FEATURES.
>
> For ordinary guests, the physical PSCI version is not relevant because they
> cannot trigger host suspend; their SYSTEM_SUSPEND path is virtual.
>
> I will make this consistent in v9: PSCI_FEATURES will advertise SYSTEM_SUSPEND
> for the host-suspend-triggering domain only when the host SYSTEM_SUSPEND
> backend
> is available, and the actual SYSTEM_SUSPEND path will enforce the same check.
ok
>>>
>>> diff --git a/xen/drivers/passthrough/arm/smmu.c
>>> b/xen/drivers/passthrough/arm/smmu.c
>>> index 22d306d0cb..45f29ef8ec 100644
>>> --- a/xen/drivers/passthrough/arm/smmu.c
>>> +++ b/xen/drivers/passthrough/arm/smmu.c
>>> @@ -2947,6 +2947,13 @@ static void arm_smmu_iommu_domain_teardown(struct
>>> domain *d)
>>> xfree(xen_domain);
>>> }
>>>
>>> +#ifdef CONFIG_SYSTEM_SUSPEND
>>> +static int arm_smmu_suspend(void)
>>> +{
>>> + return -ENOSYS;
>>> +}
>>> +#endif
>>
>> Maybe we want to gate the feature also to !CONFIG_ARM_SMMU ? I would wait
>> for the maintainers
>> view on this.
>
> I feel that gating this strictly on !CONFIG_ARM_SMMU might not be the most
> optimal approach here.
>
> CONFIG_ARM_SMMU is a build-time option and does not mean that an old SMMUv1/v2
> device is actually present. Using it would disable system suspend even on
> platforms where only SMMUv3 is used, because CONFIG_ARM_SMMU is enabled by
> default for Arm.
>
> The condition should be runtime-based: whether the active/probed IOMMU devices
> have system suspend/resume support. For the old ARM SMMU driver this is not
> implemented today, so a platform with an SMMUv1/v2 instance should not expose
> or attempt host suspend.
>
> I think we should handle this by tracking whether any old ARM SMMUv1/v2 device
> was actually probed, or by adding a generic IOMMU suspend capability check.
> Then
> the host suspend availability check can reject system suspend only when such
> an
> unsupported IOMMU is present, instead of disabling the feature for all
> Arm builds
> with CONFIG_ARM_SMMU enabled.
>
> I would be interested to hear if you or the maintainers see a better way to
> express this capability.
ok, let’s address Jan comment now and we can see what the maintainers think
about this.
Cheers,
Luca
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |