[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: S3 regression on AMD in 4.20



On Mon, Mar 17, 2025 at 06:35:08PM +0100, Roger Pau Monné wrote:
> On Mon, Mar 17, 2025 at 05:35:51PM +0100, Marek Marczykowski-Górecki wrote:
> > On Mon, Mar 17, 2025 at 04:56:15PM +0100, Roger Pau Monné wrote:
> > > On Sat, Mar 15, 2025 at 12:02:50AM +0000, Andrew Cooper wrote:
> > > > On 14/03/2025 11:53 pm, Marek Marczykowski-Górecki wrote:
> > > > > On Fri, Mar 14, 2025 at 11:23:28PM +0100, Marek Marczykowski-Górecki 
> > > > > wrote:
> > > > >> On Fri, Mar 14, 2025 at 02:19:19PM -0700, Stefano Stabellini wrote:
> > > > >>> On Fri, 14 Mar 2025, Marek Marczykowski-Górecki wrote:
> > > > >>>> This is AMD Zen2 (Ryzen 5 4500U specifically), in a HP Probook 445 
> > > > >>>> G7.
> > > > >>>>
> > > > >>>> This one has working S3, so add a test for it here.
> > > > >>>>
> > > > >>>> Signed-off-by: Marek Marczykowski-Górecki 
> > > > >>>> <marmarek@xxxxxxxxxxxxxxxxxxxxxx>
> > > > >>>> ---
> > > > >>>> Cc: Jan Beulich <jbeulich@xxxxxxxx>
> > > > >>>> Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> > > > >>>>
> > > > >>>> The suspend test added here currently fails on staging[1], but 
> > > > >>>> passes on
> > > > >>>> staging-4.19[2]. So the regression wants fixing before committing 
> > > > >>>> this
> > > > >>>> patch.
> > > > >>>>
> > > > >>>> [1] 
> > > > >>>> https://gitlab.com/xen-project/people/marmarek/xen/-/jobs/9408437140
> > > > >>>> [2] 
> > > > >>>> https://gitlab.com/xen-project/people/marmarek/xen/-/jobs/9408943441
> > > > >>> We could commit the patch now without the s3 test.
> > > > >>>
> > > > >>> I don't know what the x86 maintainers think about fixing the suspend
> > > > >>> bug, but one idea would be to run a bisection between 4.20 and 4.19.
> > > > >> I'm on it already, but it's annoying. Lets convert this thread to
> > > > >> discussion about the issue:
> > > > >>
> > > > >> So, I bisected it between staging-4.19 and master. The breakage is
> > > > >> somewhere between (inclusive):
> > > > >> eb21ce14d709 x86/boot: Rewrite EFI/MBI2 code partly in C
> > > > >> and
> > > > >> 47990ecef286 x86/boot: Improve MBI2 structure check
> > > > >>
> > > > >> But, the first one breaks booting on this system and it remains 
> > > > >> broken
> > > > >> until the second commit (or its parent) - at which point S3 is 
> > > > >> already
> > > > >> broken. So, there is a range of 71 commits that may be responsible...
> > > > >>
> > > > >> But then, based on a matrix chat and Jan's observation I've tried
> > > > >> reverting f75780d26b2f "xen: move per-cpu area management into common
> > > > >> code" just on top of 47990ecef286, and that fixed suspend.
> > > > >> Applying "xen/percpu: don't initialize percpu on resume" on top of
> > > > >> 47990ecef286 fixes suspend too.
> > > > >> But applying it on top of master
> > > > >> (91772f8420dfa2fcfe4db68480c216db5b79c512 specifically) does not fix 
> > > > >> it,
> > > > >> but the failure mode is different than without the patch - system 
> > > > >> resets
> > > > >> on S3 resume, with no crash message on the serial console (even with
> > > > >> sync_console), instead of hanging.
> > > > >> And one more data point: reverting f75780d26b2f on top of master is 
> > > > >> the
> > > > >> same as applying "xen/percpu: don't initialize percpu on resume" on
> > > > >> master - system reset on S3 resume.
> > > > >> So, it looks like there are more issues...
> > > > > Another bisection round and I have the second culprit:
> > > > >
> > > > >     8e60d47cf011 x86/iommu: avoid MSI address and data writes if IRT 
> > > > > index hasn't changed
> > > > >
> > > > > With master+"xen/percpu: don't initialize percpu on resume"+revert of
> > > > > 8e60d47cf011 suspend works again on this AMD system.
> > > > 
> > > > That's not surprising in the slightest.
> > > > 
> > > > Caching hardware values in Xen isn't safe across S3, which QubesOS has
> > > > found time and time again, and for which we still have outstanding bugs.
> > > > 
> > > > S3 turns most of the system off.  RAM gets preserved, but devices and
> > > > plenty of internal registers don't.
> > > 
> > > I think I've spotted the issue.  enable_iommu() called on resume
> > > (ab)uses set_msi_affinity() to force an MSI register write, as it's
> > > previous behavior was to unconditionally propagate the values.  With
> > > my change it would no longer perform such writes on resume.
> > > 
> > > I think the patch below should help.
> > 
> > It doesn't, I still got reboot on resume, with no crash message on
> > serial (even with sync_console).
> 
> There was an error with the previous patch, and it's also a bug in the
> original patch.  Could you give a try to the updated patch below?
> 
> Sorry for bothering you, but ATM I haven't found a way to trigger
> this myself.

This one does fix the issue :)


> Thanks, Roger.
> ---
> commit 70ea4301d5ca663ac6d850658b3fe832950ec363
> Author: Roger Pau Monne <roger.pau@xxxxxxxxxx>
> Date:   Mon Mar 17 15:40:11 2025 +0100
> 
>     x86/msi: always propagate MSI writes when not in active system mode
>     
>     Relax the limitation on MSI register writes, and only apply it when the
>     system is in active state.  For example AMD IOMMU drivers rely on using
>     set_msi_affinity() to force an MSI register write on resume from
>     suspension.
>     
>     The original patch intention was to reduce the number of MSI register
>     writes when the system is in active state.  Leave the other states to
>     always perform the writes, as it's safer given the existing code, and it's
>     expected to not make a difference performance wise.
>     
>     For such propagation to work even when the IRT index is not updated the 
> MSI
>     message must be adjusted in all success cases for AMD IOMMU, not just when
>     the index has been newly allocated.
>     
>     Reported-by: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>
>     Fixes: ('8e60d47cf011 x86/iommu: avoid MSI address and data writes if IRT 
> index hasn't changed')
>     Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> 
> diff --git a/xen/arch/x86/msi.c b/xen/arch/x86/msi.c
> index 163ccf874720..8bb3bb18af61 100644
> --- a/xen/arch/x86/msi.c
> +++ b/xen/arch/x86/msi.c
> @@ -189,6 +189,15 @@ static int write_msi_msg(struct msi_desc *entry, struct 
> msi_msg *msg,
>  {
>      entry->msg = *msg;
>  
> +    if ( unlikely(system_state != SYS_STATE_active) )
> +        /*
> +         * Always propagate writes when not in the 'active' state.  The
> +         * optimization to avoid the MSI address and data registers write is
> +         * only relevant for runtime state, and drivers on resume (at least)
> +         * rely on set_msi_affinity() to update the hardware state.
> +         */
> +        force = true;
> +
>      if ( iommu_intremap != iommu_intremap_off )
>      {
>          int rc;
> diff --git a/xen/drivers/passthrough/amd/iommu_intr.c 
> b/xen/drivers/passthrough/amd/iommu_intr.c
> index 9abdc38053d7..08766122b421 100644
> --- a/xen/drivers/passthrough/amd/iommu_intr.c
> +++ b/xen/drivers/passthrough/amd/iommu_intr.c
> @@ -546,7 +546,7 @@ int cf_check amd_iommu_msi_msg_update_ire(
>      rc = update_intremap_entry_from_msi_msg(iommu, bdf, nr,
>                                              &msi_desc->remap_index,
>                                              msg, &data);
> -    if ( rc > 0 )
> +    if ( rc >= 0 )
>      {
>          for ( i = 1; i < nr; ++i )
>              msi_desc[i].remap_index = msi_desc->remap_index + i;
> 

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

Attachment: signature.asc
Description: PGP signature


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.