[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: S3 regression on AMD in 4.20 (was: Re: [PATCH] ci: add yet another HW runner)
On Fri, Mar 14, 2025 at 11:23:28PM +0100, Marek Marczykowski-Górecki wrote: > On Fri, Mar 14, 2025 at 02:19:19PM -0700, Stefano Stabellini wrote: > > On Fri, 14 Mar 2025, Marek Marczykowski-Górecki wrote: > > > This is AMD Zen2 (Ryzen 5 4500U specifically), in a HP Probook 445 G7. > > > > > > This one has working S3, so add a test for it here. > > > > > > Signed-off-by: Marek Marczykowski-Górecki > > > <marmarek@xxxxxxxxxxxxxxxxxxxxxx> > > > --- > > > Cc: Jan Beulich <jbeulich@xxxxxxxx> > > > Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx> > > > > > > The suspend test added here currently fails on staging[1], but passes on > > > staging-4.19[2]. So the regression wants fixing before committing this > > > patch. > > > > > > [1] https://gitlab.com/xen-project/people/marmarek/xen/-/jobs/9408437140 > > > [2] https://gitlab.com/xen-project/people/marmarek/xen/-/jobs/9408943441 > > > > We could commit the patch now without the s3 test. > > > > I don't know what the x86 maintainers think about fixing the suspend > > bug, but one idea would be to run a bisection between 4.20 and 4.19. > > I'm on it already, but it's annoying. Lets convert this thread to > discussion about the issue: > > So, I bisected it between staging-4.19 and master. The breakage is > somewhere between (inclusive): > eb21ce14d709 x86/boot: Rewrite EFI/MBI2 code partly in C > and > 47990ecef286 x86/boot: Improve MBI2 structure check > > But, the first one breaks booting on this system and it remains broken > until the second commit (or its parent) - at which point S3 is already > broken. So, there is a range of 71 commits that may be responsible... > > But then, based on a matrix chat and Jan's observation I've tried > reverting f75780d26b2f "xen: move per-cpu area management into common > code" just on top of 47990ecef286, and that fixed suspend. > Applying "xen/percpu: don't initialize percpu on resume" on top of > 47990ecef286 fixes suspend too. > But applying it on top of master > (91772f8420dfa2fcfe4db68480c216db5b79c512 specifically) does not fix it, > but the failure mode is different than without the patch - system resets > on S3 resume, with no crash message on the serial console (even with > sync_console), instead of hanging. > And one more data point: reverting f75780d26b2f on top of master is the > same as applying "xen/percpu: don't initialize percpu on resume" on > master - system reset on S3 resume. > So, it looks like there are more issues... Another bisection round and I have the second culprit: 8e60d47cf011 x86/iommu: avoid MSI address and data writes if IRT index hasn't changed With master+"xen/percpu: don't initialize percpu on resume"+revert of 8e60d47cf011 suspend works again on this AMD system. -- Best Regards, Marek Marczykowski-Górecki Invisible Things Lab Attachment:
signature.asc
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |