[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: S3 regression on AMD in 4.20
On Tue, Mar 18, 2025 at 07:24:31AM +0100, Jan Beulich wrote: > On 17.03.2025 18:38, Roger Pau Monné wrote: > > On Mon, Mar 17, 2025 at 05:11:56PM +0100, Jan Beulich wrote: > >> On 17.03.2025 16:56, Roger Pau Monné wrote: > >>> On Sat, Mar 15, 2025 at 12:02:50AM +0000, Andrew Cooper wrote: > >>>> On 14/03/2025 11:53 pm, Marek Marczykowski-Górecki wrote: > >>>>> On Fri, Mar 14, 2025 at 11:23:28PM +0100, Marek Marczykowski-Górecki > >>>>> wrote: > >>>>>> On Fri, Mar 14, 2025 at 02:19:19PM -0700, Stefano Stabellini wrote: > >>>>>>> On Fri, 14 Mar 2025, Marek Marczykowski-Górecki wrote: > >>>>>>>> This is AMD Zen2 (Ryzen 5 4500U specifically), in a HP Probook 445 > >>>>>>>> G7. > >>>>>>>> > >>>>>>>> This one has working S3, so add a test for it here. > >>>>>>>> > >>>>>>>> Signed-off-by: Marek Marczykowski-Górecki > >>>>>>>> <marmarek@xxxxxxxxxxxxxxxxxxxxxx> > >>>>>>>> --- > >>>>>>>> Cc: Jan Beulich <jbeulich@xxxxxxxx> > >>>>>>>> Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx> > >>>>>>>> > >>>>>>>> The suspend test added here currently fails on staging[1], but > >>>>>>>> passes on > >>>>>>>> staging-4.19[2]. So the regression wants fixing before committing > >>>>>>>> this > >>>>>>>> patch. > >>>>>>>> > >>>>>>>> [1] > >>>>>>>> https://gitlab.com/xen-project/people/marmarek/xen/-/jobs/9408437140 > >>>>>>>> [2] > >>>>>>>> https://gitlab.com/xen-project/people/marmarek/xen/-/jobs/9408943441 > >>>>>>> We could commit the patch now without the s3 test. > >>>>>>> > >>>>>>> I don't know what the x86 maintainers think about fixing the suspend > >>>>>>> bug, but one idea would be to run a bisection between 4.20 and 4.19. > >>>>>> I'm on it already, but it's annoying. Lets convert this thread to > >>>>>> discussion about the issue: > >>>>>> > >>>>>> So, I bisected it between staging-4.19 and master. The breakage is > >>>>>> somewhere between (inclusive): > >>>>>> eb21ce14d709 x86/boot: Rewrite EFI/MBI2 code partly in C > >>>>>> and > >>>>>> 47990ecef286 x86/boot: Improve MBI2 structure check > >>>>>> > >>>>>> But, the first one breaks booting on this system and it remains broken > >>>>>> until the second commit (or its parent) - at which point S3 is already > >>>>>> broken. So, there is a range of 71 commits that may be responsible... > >>>>>> > >>>>>> But then, based on a matrix chat and Jan's observation I've tried > >>>>>> reverting f75780d26b2f "xen: move per-cpu area management into common > >>>>>> code" just on top of 47990ecef286, and that fixed suspend. > >>>>>> Applying "xen/percpu: don't initialize percpu on resume" on top of > >>>>>> 47990ecef286 fixes suspend too. > >>>>>> But applying it on top of master > >>>>>> (91772f8420dfa2fcfe4db68480c216db5b79c512 specifically) does not fix > >>>>>> it, > >>>>>> but the failure mode is different than without the patch - system > >>>>>> resets > >>>>>> on S3 resume, with no crash message on the serial console (even with > >>>>>> sync_console), instead of hanging. > >>>>>> And one more data point: reverting f75780d26b2f on top of master is the > >>>>>> same as applying "xen/percpu: don't initialize percpu on resume" on > >>>>>> master - system reset on S3 resume. > >>>>>> So, it looks like there are more issues... > >>>>> Another bisection round and I have the second culprit: > >>>>> > >>>>> 8e60d47cf011 x86/iommu: avoid MSI address and data writes if IRT > >>>>> index hasn't changed > >>>>> > >>>>> With master+"xen/percpu: don't initialize percpu on resume"+revert of > >>>>> 8e60d47cf011 suspend works again on this AMD system. > >>>> > >>>> That's not surprising in the slightest. > >>>> > >>>> Caching hardware values in Xen isn't safe across S3, which QubesOS has > >>>> found time and time again, and for which we still have outstanding bugs. > >>>> > >>>> S3 turns most of the system off. RAM gets preserved, but devices and > >>>> plenty of internal registers don't. > >>> > >>> I think I've spotted the issue. enable_iommu() called on resume > >>> (ab)uses set_msi_affinity() to force an MSI register write, as it's > >>> previous behavior was to unconditionally propagate the values. With > >>> my change it would no longer perform such writes on resume. > >>> > >>> I think the patch below should help. > >>> > >>> I wonder if we should unconditionally propagate the write from > >>> __setup_msi_irq(), as it's also unlikely to make any difference to > >>> skip that write, and would further keep the previous behavior. > >> > >> That might be better. In fact I wonder whether it wouldn't better be > >> callers of write_msi_msg() to use ... > >> > >>> --- > >>> commit 1d9bfd0d45f6b547b19f0d2f752fc3bd10103971 > >>> Author: Roger Pau Monne <roger.pau@xxxxxxxxxx> > >>> Date: Mon Mar 17 15:40:11 2025 +0100 > >>> > >>> x86/msi: always propagate MSI writes when not in active system mode > >>> > >>> Relax the limitation on MSI register writes, and only apply it when > >>> the > >>> system is in active state. For example AMD IOMMU drivers rely on > >>> using > >>> set_msi_affinity() to force an MSI register write on resume from > >>> suspension. > >>> > >>> The original patch intention was to reduce the number of MSI register > >>> writes when the system is in active state. Leave the other states to > >>> always perform the writes, as it's safer given the existing code, and > >>> it's > >>> expected to not make a difference performance wise. > >>> > >>> Reported-by: Marek Marczykowski-Górecki > >>> <marmarek@xxxxxxxxxxxxxxxxxxxxxx> > >>> Fixes: ('8e60d47cf011 x86/iommu: avoid MSI address and data writes if > >>> IRT index hasn't changed') > >>> Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx> > >>> > >>> --- a/xen/arch/x86/msi.c > >>> +++ b/xen/arch/x86/msi.c > >>> @@ -189,6 +189,15 @@ static int write_msi_msg(struct msi_desc *entry, > >>> struct msi_msg *msg, > >>> { > >>> entry->msg = *msg; > >>> > >>> + if ( unlikely(system_state != SYS_STATE_active) ) > >> > >> ... this condition as needed. Conceivably there might be cases where even > >> during resume writes aren't always necessary to propagate to hardware. > > > > Isn't this going to be quite more cumbersome, and I don't think the > > cases outside of active mode are very interesting or relevant > > performance wise? > > Certainly. As to cumbersome - having the conditions at the call sites > would also serve as kind of documentation. That said, this was just a > suggestion; I'm not going to insist. I see from other traffic on this > thread the updated patch is now fully addressing the regression. The > secondary change, if not broken out to a separate patch, would want > mentioning in the description, though (which may have been the plan > anyway). Indeed, it's already mentioned in the description. Let me formally send the patch. Thanks, Roger.
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |