[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: S3 regression on AMD in 4.20


  • To: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Mon, 17 Mar 2025 17:11:56 +0100
  • Autocrypt: addr=jbeulich@xxxxxxxx; keydata= xsDiBFk3nEQRBADAEaSw6zC/EJkiwGPXbWtPxl2xCdSoeepS07jW8UgcHNurfHvUzogEq5xk hu507c3BarVjyWCJOylMNR98Yd8VqD9UfmX0Hb8/BrA+Hl6/DB/eqGptrf4BSRwcZQM32aZK 7Pj2XbGWIUrZrd70x1eAP9QE3P79Y2oLrsCgbZJfEwCgvz9JjGmQqQkRiTVzlZVCJYcyGGsD /0tbFCzD2h20ahe8rC1gbb3K3qk+LpBtvjBu1RY9drYk0NymiGbJWZgab6t1jM7sk2vuf0Py O9Hf9XBmK0uE9IgMaiCpc32XV9oASz6UJebwkX+zF2jG5I1BfnO9g7KlotcA/v5ClMjgo6Gl MDY4HxoSRu3i1cqqSDtVlt+AOVBJBACrZcnHAUSuCXBPy0jOlBhxPqRWv6ND4c9PH1xjQ3NP nxJuMBS8rnNg22uyfAgmBKNLpLgAGVRMZGaGoJObGf72s6TeIqKJo/LtggAS9qAUiuKVnygo 3wjfkS9A3DRO+SpU7JqWdsveeIQyeyEJ/8PTowmSQLakF+3fote9ybzd880fSmFuIEJldWxp Y2ggPGpiZXVsaWNoQHN1c2UuY29tPsJgBBMRAgAgBQJZN5xEAhsDBgsJCAcDAgQVAggDBBYC AwECHgECF4AACgkQoDSui/t3IH4J+wCfQ5jHdEjCRHj23O/5ttg9r9OIruwAn3103WUITZee e7Sbg12UgcQ5lv7SzsFNBFk3nEQQCACCuTjCjFOUdi5Nm244F+78kLghRcin/awv+IrTcIWF hUpSs1Y91iQQ7KItirz5uwCPlwejSJDQJLIS+QtJHaXDXeV6NI0Uef1hP20+y8qydDiVkv6l IreXjTb7DvksRgJNvCkWtYnlS3mYvQ9NzS9PhyALWbXnH6sIJd2O9lKS1Mrfq+y0IXCP10eS FFGg+Av3IQeFatkJAyju0PPthyTqxSI4lZYuJVPknzgaeuJv/2NccrPvmeDg6Coe7ZIeQ8Yj t0ARxu2xytAkkLCel1Lz1WLmwLstV30g80nkgZf/wr+/BXJW/oIvRlonUkxv+IbBM3dX2OV8 AmRv1ySWPTP7AAMFB/9PQK/VtlNUJvg8GXj9ootzrteGfVZVVT4XBJkfwBcpC/XcPzldjv+3 HYudvpdNK3lLujXeA5fLOH+Z/G9WBc5pFVSMocI71I8bT8lIAzreg0WvkWg5V2WZsUMlnDL9 mpwIGFhlbM3gfDMs7MPMu8YQRFVdUvtSpaAs8OFfGQ0ia3LGZcjA6Ik2+xcqscEJzNH+qh8V m5jjp28yZgaqTaRbg3M/+MTbMpicpZuqF4rnB0AQD12/3BNWDR6bmh+EkYSMcEIpQmBM51qM EKYTQGybRCjpnKHGOxG0rfFY1085mBDZCH5Kx0cl0HVJuQKC+dV2ZY5AqjcKwAxpE75MLFkr wkkEGBECAAkFAlk3nEQCGwwACgkQoDSui/t3IH7nnwCfcJWUDUFKdCsBH/E5d+0ZnMQi+G0A nAuWpQkjM1ASeQwSHEeAWPgskBQL
  • Cc: Stefano Stabellini <sstabellini@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx, Doug Goldstein <cardoe@xxxxxxxxxx>, Oleksii Kurochko <oleksii.kurochko@xxxxxxxxx>, Mykola Kvach <xakep.amatop@xxxxxxxxx>, Mykyta Poturai <mykyta_poturai@xxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Mon, 17 Mar 2025 16:12:06 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 17.03.2025 16:56, Roger Pau Monné wrote:
> On Sat, Mar 15, 2025 at 12:02:50AM +0000, Andrew Cooper wrote:
>> On 14/03/2025 11:53 pm, Marek Marczykowski-Górecki wrote:
>>> On Fri, Mar 14, 2025 at 11:23:28PM +0100, Marek Marczykowski-Górecki wrote:
>>>> On Fri, Mar 14, 2025 at 02:19:19PM -0700, Stefano Stabellini wrote:
>>>>> On Fri, 14 Mar 2025, Marek Marczykowski-Górecki wrote:
>>>>>> This is AMD Zen2 (Ryzen 5 4500U specifically), in a HP Probook 445 G7.
>>>>>>
>>>>>> This one has working S3, so add a test for it here.
>>>>>>
>>>>>> Signed-off-by: Marek Marczykowski-Górecki 
>>>>>> <marmarek@xxxxxxxxxxxxxxxxxxxxxx>
>>>>>> ---
>>>>>> Cc: Jan Beulich <jbeulich@xxxxxxxx>
>>>>>> Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>>>>>>
>>>>>> The suspend test added here currently fails on staging[1], but passes on
>>>>>> staging-4.19[2]. So the regression wants fixing before committing this
>>>>>> patch.
>>>>>>
>>>>>> [1] https://gitlab.com/xen-project/people/marmarek/xen/-/jobs/9408437140
>>>>>> [2] https://gitlab.com/xen-project/people/marmarek/xen/-/jobs/9408943441
>>>>> We could commit the patch now without the s3 test.
>>>>>
>>>>> I don't know what the x86 maintainers think about fixing the suspend
>>>>> bug, but one idea would be to run a bisection between 4.20 and 4.19.
>>>> I'm on it already, but it's annoying. Lets convert this thread to
>>>> discussion about the issue:
>>>>
>>>> So, I bisected it between staging-4.19 and master. The breakage is
>>>> somewhere between (inclusive):
>>>> eb21ce14d709 x86/boot: Rewrite EFI/MBI2 code partly in C
>>>> and
>>>> 47990ecef286 x86/boot: Improve MBI2 structure check
>>>>
>>>> But, the first one breaks booting on this system and it remains broken
>>>> until the second commit (or its parent) - at which point S3 is already
>>>> broken. So, there is a range of 71 commits that may be responsible...
>>>>
>>>> But then, based on a matrix chat and Jan's observation I've tried
>>>> reverting f75780d26b2f "xen: move per-cpu area management into common
>>>> code" just on top of 47990ecef286, and that fixed suspend.
>>>> Applying "xen/percpu: don't initialize percpu on resume" on top of
>>>> 47990ecef286 fixes suspend too.
>>>> But applying it on top of master
>>>> (91772f8420dfa2fcfe4db68480c216db5b79c512 specifically) does not fix it,
>>>> but the failure mode is different than without the patch - system resets
>>>> on S3 resume, with no crash message on the serial console (even with
>>>> sync_console), instead of hanging.
>>>> And one more data point: reverting f75780d26b2f on top of master is the
>>>> same as applying "xen/percpu: don't initialize percpu on resume" on
>>>> master - system reset on S3 resume.
>>>> So, it looks like there are more issues...
>>> Another bisection round and I have the second culprit:
>>>
>>>     8e60d47cf011 x86/iommu: avoid MSI address and data writes if IRT index 
>>> hasn't changed
>>>
>>> With master+"xen/percpu: don't initialize percpu on resume"+revert of
>>> 8e60d47cf011 suspend works again on this AMD system.
>>
>> That's not surprising in the slightest.
>>
>> Caching hardware values in Xen isn't safe across S3, which QubesOS has
>> found time and time again, and for which we still have outstanding bugs.
>>
>> S3 turns most of the system off.  RAM gets preserved, but devices and
>> plenty of internal registers don't.
> 
> I think I've spotted the issue.  enable_iommu() called on resume
> (ab)uses set_msi_affinity() to force an MSI register write, as it's
> previous behavior was to unconditionally propagate the values.  With
> my change it would no longer perform such writes on resume.
> 
> I think the patch below should help.
> 
> I wonder if we should unconditionally propagate the write from
> __setup_msi_irq(), as it's also unlikely to make any difference to
> skip that write, and would further keep the previous behavior.

That might be better. In fact I wonder whether it wouldn't better be
callers of write_msi_msg() to use ...

> ---
> commit 1d9bfd0d45f6b547b19f0d2f752fc3bd10103971
> Author: Roger Pau Monne <roger.pau@xxxxxxxxxx>
> Date:   Mon Mar 17 15:40:11 2025 +0100
> 
>     x86/msi: always propagate MSI writes when not in active system mode
>     
>     Relax the limitation on MSI register writes, and only apply it when the
>     system is in active state.  For example AMD IOMMU drivers rely on using
>     set_msi_affinity() to force an MSI register write on resume from
>     suspension.
>     
>     The original patch intention was to reduce the number of MSI register
>     writes when the system is in active state.  Leave the other states to
>     always perform the writes, as it's safer given the existing code, and it's
>     expected to not make a difference performance wise.
>     
>     Reported-by: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>
>     Fixes: ('8e60d47cf011 x86/iommu: avoid MSI address and data writes if IRT 
> index hasn't changed')
>     Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> 
> --- a/xen/arch/x86/msi.c
> +++ b/xen/arch/x86/msi.c
> @@ -189,6 +189,15 @@ static int write_msi_msg(struct msi_desc *entry, struct 
> msi_msg *msg,
>  {
>      entry->msg = *msg;
>  
> +    if ( unlikely(system_state != SYS_STATE_active) )

... this condition as needed. Conceivably there might be cases where even
during resume writes aren't always necessary to propagate to hardware.

Jan

> +        /*
> +         * Always propagate writes when not in the 'active' state.  The
> +         * optimization to avoid the MSI address and data registers write is
> +         * only relevant for runtime state, and drivers on resume (at least)
> +         * rely on set_msi_affinity() to update the hardware state.
> +         */
> +        force = true;
> +
>      if ( iommu_intremap != iommu_intremap_off )
>      {
>          int rc;




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.