[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: S3 resume issue in xstate_init
On 17/08/2021 13:53, Andrew Cooper wrote: > On 17/08/2021 13:21, Jan Beulich wrote: >> On 17.08.2021 13:44, Marek Marczykowski-Górecki wrote: >>> On Tue, Aug 17, 2021 at 12:14:36PM +0100, Andrew Cooper wrote: >>>> On 17/08/2021 12:02, Marek Marczykowski-Górecki wrote: >>>>> On Tue, Aug 17, 2021 at 03:25:21AM +0200, Marek Marczykowski-Górecki >>>>> wrote: >>>>>> Hi, >>>>>> >>>>>> I've got another S3 issue: >>>>>> >>>>>> (XEN) Preparing system for ACPI S3 state. >>>>>> (XEN) Disabling non-boot CPUs ... >>>>>> (XEN) Broke affinity for IRQ1, new: ffff >>>>>> (XEN) Broke affinity for IRQ16, new: ffff >>>>>> (XEN) Broke affinity for IRQ9, new: ffff >>>>>> (XEN) Broke affinity for IRQ139, new: ffff >>>>>> (XEN) Broke affinity for IRQ8, new: ffff >>>>>> (XEN) Broke affinity for IRQ14, new: ffff >>>>>> (XEN) Broke affinity for IRQ20, new: ffff >>>>>> (XEN) Broke affinity for IRQ137, new: ffff >>>>>> (XEN) Broke affinity for IRQ138, new: ffff >>>>>> (XEN) Entering ACPI S3 state. >>>>>> (XEN) mce_intel.c:773: MCA Capability: firstbank 0, extended MCE MSR 0, >>>>>> BCAST, CMCI >>>>>> (XEN) CPU0 CMCI LVT vector (0xf1) already installed >>>>>> (XEN) Finishing wakeup from ACPI S3 state. >>>>>> (XEN) microcode: CPU0 updated from revision 0xca to 0xea, date = >>>>>> 2021-01-05 >>>>>> (XEN) xstate: size: 0x440 (uncompressed 0x440) and states: 0x1f >>>>>> (XEN) Enabling non-boot CPUs ... >>>>>> (XEN) xstate: size: 0x440 (uncompressed 0x240) and states: 0x1f >>>>>> (XEN) Xen BUG at xstate.c:673 >>>>>> (XEN) ----[ Xen-4.16-unstable x86_64 debug=y Not tainted ]---- >>>>>> (XEN) CPU: 1 >>>>>> (XEN) RIP: e008:[<ffff82d040350ee4>] xstate_init+0x24b/0x2ff >>>>>> (XEN) RFLAGS: 0000000000010087 CONTEXT: hypervisor >>>>>> (XEN) rax: 0000000000000240 rbx: 000000000000001f rcx: >>>>>> 0000000000000440 >>>>>> (XEN) rdx: 0000000000000001 rsi: 000000000000000a rdi: >>>>>> 000000000000001f >>>>>> (XEN) rbp: ffff83025dc9fd38 rsp: ffff83025dc9fd20 r8: >>>>>> 0000000000000001 >>>>>> (XEN) r9: ffff83025dc9fc88 r10: 0000000000000001 r11: >>>>>> 0000000000000001 >>>>>> (XEN) r12: ffff83025dc9fd80 r13: 000000000000001f r14: >>>>>> 0000000000000001 >>>>>> (XEN) r15: 0000000000000000 cr0: 000000008005003b cr4: >>>>>> 00000000003526e0 >>>>>> (XEN) cr3: 0000000049656000 cr2: 0000000000000000 >>>>>> (XEN) fsb: 0000000000000000 gsb: 0000000000000000 gss: >>>>>> 0000000000000000 >>>>>> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008 >>>>>> (XEN) Xen code around <ffff82d040350ee4> (xstate_init+0x24b/0x2ff): >>>>>> (XEN) ff e9 a2 00 00 00 0f 0b <0f> 0b 89 f8 89 f1 0f a2 89 f2 4c 8b 0d >>>>>> cb b4 0f >>>>>> (XEN) Xen stack trace from rsp=ffff83025dc9fd20: >>>>>> (XEN) 0000000000000240 ffff83025dc9fd80 0000000000000001 >>>>>> ffff83025dc9fd70 >>>>>> (XEN) ffff82d04027e7a1 000000004035a7f1 7ffafbbf01100800 >>>>>> 00000000bfebfbff >>>>>> (XEN) 0000000000000001 00000000000000c8 ffff83025dc9feb8 >>>>>> ffff82d0402e43ce >>>>>> (XEN) 000000160a9e0106 bfebfbff80000008 2c1008007ffaf3bf >>>>>> 0000000f00000121 >>>>>> (XEN) 00000000029c6fbf 0000000000000100 000000009c002e00 >>>>>> 02afcd7f00000000 >>>>>> (XEN) 756e654700000000 6c65746e49656e69 65746e4904b21920 >>>>>> 726f43202952286c >>>>>> (XEN) 376920294d542865 432048303537382d 322e322040205550 >>>>>> 000000007a484730 >>>>>> (XEN) ffff830000000000 ffff83025dc9fe18 00002400402e8e0b >>>>>> 000000085dc9fe30 >>>>>> (XEN) 00000002402e9f21 0000000000000001 ffffffff00000000 >>>>>> ffff82d0402e0040 >>>>>> (XEN) 00000000003526e0 ffff83025dc9fe68 ffff82d04027bd15 >>>>>> 0000000000000001 >>>>>> (XEN) ffff8302590a0000 0000000000000000 00000000000000c8 >>>>>> 0000000000000001 >>>>>> (XEN) 0000000000000001 ffff83025dc9feb8 ffff82d0402e32b7 >>>>>> 0000000000000001 >>>>>> (XEN) 0000000000000001 00000000000000c8 0000000000000001 >>>>>> ffff83025dc9fee8 >>>>>> (XEN) ffff82d04030e401 0000000000000001 0000000000000000 >>>>>> 0000000000000000 >>>>>> (XEN) 0000000000000000 0000000000000000 ffff82d040200122 >>>>>> 0800002000000002 >>>>>> (XEN) 0100000400010000 0000002000000000 2000000000100000 >>>>>> 0000001000000000 >>>>>> (XEN) 2000000000000000 0000000029000000 0000008000000000 >>>>>> 00110000a0000000 >>>>>> (XEN) 8000000080000000 4000000000000008 0000100000000000 >>>>>> 0200000040000080 >>>>>> (XEN) 0004000000000000 0000010000000002 0400002030000000 >>>>>> 0000000060000000 >>>>>> (XEN) 0400001000010000 0000000010000000 0000004010000000 >>>>>> 0000000000000000 >>>>>> (XEN) Xen call trace: >>>>>> (XEN) [<ffff82d040350ee4>] R xstate_init+0x24b/0x2ff >>>>>> (XEN) [<ffff82d04027e7a1>] F identify_cpu+0x318/0x4af >>>>>> (XEN) [<ffff82d0402e43ce>] F recheck_cpu_features+0x1f/0x72 >>>>>> (XEN) [<ffff82d04030e401>] F start_secondary+0x255/0x38a >>>>>> (XEN) [<ffff82d040200122>] F __high_start+0x82/0x91 >>>>>> (XEN) >>>>>> (XEN) >>>>>> (XEN) **************************************** >>>>>> (XEN) Panic on CPU 1: >>>>>> (XEN) Xen BUG at xstate.c:673 >>>>>> (XEN) **************************************** >>>>>> (XEN) >>>>>> (XEN) Reboot in five seconds... >>>>>> >>>>>> This is with added debug patch: >>>>>> >>>>>> diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c >>>>>> index 6aaf9a2f1546..7873a21b356a 100644 >>>>>> --- a/xen/arch/x86/xstate.c >>>>>> +++ b/xen/arch/x86/xstate.c >>>>>> @@ -668,6 +668,8 @@ void xstate_init(struct cpuinfo_x86 *c) >>>>>> else >>>>>> { >>>>>> BUG_ON(xfeature_mask != feature_mask); >>>>>> + printk("xstate: size: %#x (uncompressed %#x) and states: >>>>>> %#"PRIx64"\n", >>>>>> + xsave_cntxt_size, hw_uncompressed_size(feature_mask), >>>>>> feature_mask); >>>>>> BUG_ON(xsave_cntxt_size != hw_uncompressed_size(feature_mask)); >>>>>> } >>>>>> >>>>>> >>>>>> As can be seen above - the xsave size differs between BSP and other >>>>>> CPU(s) - likely because of (not) loaded ucode update there. >>>>>> I guess it's a matter of moving ucode loading somewhere else, right? >>>>> Few more data points: >>>>> >>>>> 1. The CPU is i7-8750H (family 6, model 158, stepping 10). >>>>> 2. I do have "smt=off" on the Xen cmdline, if that matters. >>>> As a datapoint, it would be interesting to confirm what the behaviour is >>>> with SMT enabled. >>>> >>>> I'd expect it not to make a difference, because smt=off is a purely Xen >>>> construct and doesn't change the hardware configuration. >>> Uhm, changing to smt=on actually _did_ change it. Now it doesn't crash! >>> >>> Let me add CPU number to the above printk - is smp_processor_id() the >>> thing I want? >>> With that, I get: >>> https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgist.github.com%2Fmarmarek%2Fae604a1e5cf49639a1eec9e220c037ca&data=04%7C01%7CAndrew.Cooper3%40citrix.com%7C9bcffe997fb34fe8b89408d9617e2986%7C335836de42ef43a2b145348c2ee9ca5b%7C0%7C0%7C637648016779334734%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=xK014jSmKmoKdw%2BiqvsX7TLpLswzf7uzCc04xM1C8co%3D&reserved=0 >>> Note that at boot all CPUs reports 0x440 (but only later are parked). >> And for a feature mask of 0x1f only 0x440 can possibly be correct. > As a complete tangent, we really want an mpx=off option (or extend > cpuid=no-mpx to clobber bnd{regs,csr} too). For systems not wanting to > use MPX in the first place - and I'd expect QubesOS falls into this > category - this will increase performance by not partitioning the > physical register file, as well as reducing the xstate size for client > parts. > >> I'm kind of guessing that set_xcr0() mistakenly skips the actual XCR0 >> write, due to the cached value matching the to-be-written one, yet >> the cache having gone stale across S3. > This is a rats nest, and area for concern, but ... > >> I think this is to be expected >> for previously parked CPUs, as those don't have their per-CPU data >> de-allocated (and hence also not re-setup, and thus also not starting >> out as zero). > ... we don't deallocate regular APs either, so I don't see why the smt= > setting would make a difference in this case. > > To be clear - I think your guess about set_xcr0() skipping the write is > correct, because 0x240 is correct for xcr0=X87, but I don't see why smt= > makes a difference. > >> I guess an easy fix would be to write 0 to >> this_cpu(xcr0) directly early in xstate_init(), maybe in an "else" >> to the early "if ( bsp )". >> >> I'm not sure though this would be a good fix longer term, as there >> might easily be other similar issues elsewhere. IOW we may need to >> see whether per-CPU data initialization wouldn't want changing. > We've got other registers too, like MSR_TSC_AUX, but I don't think we > want to be doing anything as drastic as changing how the initialisation > works. > > The S3 path needs to explicitly write every register we do lazy context > switching of. Actually no - that's a dumb suggestion because the APs don't know better, and we don't want for_each_cpu() loops running from the BSP. Perhaps we want the cpu_down() logic to explicitly invalidate their lazily cached values? ~Andrew
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |