Re: S3 resume crash in memguard_guard_stack (stable-4.14)
On 03.10.2020 15:57, Marek Marczykowski-Górecki wrote:
> With this, I get a crash on S3 resume:
>
> (XEN) Preparing system for ACPI S3 state.
> (XEN) Disabling non-boot CPUs ...
> (XEN) Entering ACPI S3 state.
> (XEN) [VT-D]Passed iommu=no-igfx option. Disabling IGD VT-d engine.
> (XEN) mce_intel.c:773: MCA Capability: firstbank 0, extended MCE MSR 0, BCAST, CMCI
> (XEN) CPU0 CMCI LVT vector (0xf1) already installed
> (XEN) Finishing wakeup from ACPI S3 state.
> (XEN) Enabling non-boot CPUs ...
> (XEN) ----[ Xen-4.14.1-pre x86_64 debug=y Not tainted ]----
> (XEN) CPU: 0
> (XEN) RIP: e008:[<ffff82d040311090>] memguard_guard_stack+0x7/0x1a5
> (XEN) RFLAGS: 0000000000010286 CONTEXT: hypervisor
> (XEN) rax: ffff830250ca03f8 rbx: 0000000000000001 rcx: ffff830250cb10b0
> (XEN) rdx: 0000003210739000 rsi: 0000000000000001 rdi: ffff830250ca0000
> (XEN) rbp: ffff830049a6fd70 rsp: ffff830049a6fd40 r8: 0000000000000001
> (XEN) r9: 0000000000000000 r10: 0000000000000001 r11: 0000000000000002
> (XEN) r12: 0000000000010000 r13: 0000000000000000 r14: 0000000000000001
> (XEN) r15: ffff82d040598440 cr0: 000000008005003b cr4: 00000000003526e0
> (XEN) cr3: 0000000049a5d000 cr2: ffff830250ca03f8
> (XEN) fsb: 0000000000000000 gsb: 0000000000000000 gss: 0000000000000000
> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008
> (XEN) Xen code around <ffff82d040311090> (memguard_guard_stack+0x7/0x1a5):
> (XEN) c3 48 8d 87 f8 03 00 00 <48> 89 87 f8 03 00 00 48 8d 87 f8 07 00 00 48 89
> (XEN) Xen stack trace from rsp=ffff830049a6fd40:
> (XEN) ffff82d040321c2e ffff82d040461b68 ffff82d040461b60 ffff82d040461240
> (XEN) 0000000000000001 0000000000000000 ffff830049a6fdb8 ffff82d040221f9c
> (XEN) ffff830049a6fde0 0000000000000001 0000000000000000 00000000ffffffef
> (XEN) ffff830049a6fe08 0000000000000001 ffff830250b66000 ffff830049a6fdd0
> (XEN) ffff82d0402036cf 0000000000000001 ffff830049a6fdf8 ffff82d040203a4d
> (XEN) 0000000000000000 0000000000000001 0000000000000010 ffff830049a6fe28
> (XEN) ffff82d040203d00 ffff830049a6fef8 0000000000000000 0000000000000003
> (XEN) 0000000000000200 ffff830049a6fe58 ffff82d040270c9a ffff830250139f70
> (XEN) ffff830250b45000 0000000000000000 0000000000000000 ffff830049a6fe78
> (XEN) ffff82d040207064 ffff830250b451b8 ffff82d0405781b0 ffff830049a6fe90
> (XEN) ffff82d04022b7bb ffff82d0405781a0 ffff830049a6fec0 ffff82d04022ba9c
> (XEN) 0000000000000000 ffff82d0405781b0 ffff82d04057ed00 ffff82d040598440
> (XEN) ffff830049a6fef0 ffff82d0402f33e3 ffff830252b0e000 ffff830250b45000
> (XEN) ffff830252b0f000 0000000000000000 ffff830049a6fdc8 ffff88818ce029e0
> (XEN) ffffc900026b7f08 0000000000000003 0000000000000000 0000000000003403
> (XEN) ffffffff8277a5a8 0000000000000246 0000000000000003 0000000000003403
> (XEN) 0000000000003403 0000000000000000 ffffffff810020ea 0000000000003403
> (XEN) 0000000000000010 deadbeefdeadf00d 0000010000000000 ffffffff810020ea
> (XEN) 000000000000e033 0000000000000246 ffffc900026b7cb8 000000000000e02b
> (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN) Xen call trace:
> (XEN) [<ffff82d040311090>] R memguard_guard_stack+0x7/0x1a5
> (XEN) [<ffff82d040321c2e>] S smpboot.c#cpu_smpboot_callback+0xe5/0x6d5
> (XEN) [<ffff82d040221f9c>] F notifier_call_chain+0x6b/0x96
> (XEN) [<ffff82d0402036cf>] F cpu.c#cpu_notifier_call_chain+0x1b/0x33
> (XEN) [<ffff82d040203a4d>] F cpu_up+0x5f/0xd5
> (XEN) [<ffff82d040203d00>] F enable_nonboot_cpus+0xea/0x1fb
> (XEN) [<ffff82d040270c9a>] F power.c#enter_state_helper+0x152/0x606
> (XEN) [<ffff82d040207064>] F domain.c#continue_hypercall_tasklet_handler+0x4c/0xb9
> (XEN) [<ffff82d04022b7bb>] F tasklet.c#do_tasklet_work+0x76/0xa9
> (XEN) [<ffff82d04022ba9c>] F do_tasklet+0x58/0x8a
> (XEN) [<ffff82d0402f33e3>] F domain.c#idle_loop+0x40/0x96
> (XEN)
> (XEN) Pagetable walk from ffff830250ca03f8:
> (XEN) L4[0x106] = 8000000049a5b063 ffffffffffffffff
> (XEN) L3[0x009] = 0000000250cae063 ffffffffffffffff
> (XEN) L2[0x086] = 0000000250cad063 ffffffffffffffff
> (XEN) L1[0x0a0] = 8000000250ca0161 ffffffffffffffff

Now this one's pretty obvious: The call to memguard_unguard_stack()
during bringing down the APs is conditional (in cpu_smpboot_free())
and hence memguard_guard_stack() may (at present) not assume the
stack is writable (by ordinary writes, i.e. write_sss_token()).

I guess we may want something like

    if ( stack_base[cpu] == NULL )
    {
        stack_base[cpu] = alloc_xenheap_pages(STACK_ORDER, memflags);
        if ( stack_base[cpu] == NULL )
            goto out;
    }
    else if ( IS_ENABLED(CONFIG_XEN_SHSTK) )
        memguard_unguard_stack(stack_base[cpu]);

in cpu_smpboot_alloc(). But of course the question is whether the
conditions here and there wouldn't better become cpu_has_xen_shstk,
since right now the breakage (afaict) needlessly extends to systems
that aren't CET-capable.

Jan
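
For illustration only, the asymmetry described above can be modelled in user space. The following is a toy sketch, not Xen code: every toy_* name, struct toy_stack, and the "remove" and "fixed" parameters are hypothetical stand-ins, and the condition in the free path is an assumption (the mail only says the unguard there is conditional). A boolean stands in for the read-only mapping, and a "false" return stands in for the page fault seen in the log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define NR_CPUS 4

/* Toy per-CPU stack. 'guarded' models the mapping that ordinary
 * writes fault on once the stack has been guarded. */
struct toy_stack {
    bool guarded;
    unsigned long token;
};

static struct toy_stack *stack_base[NR_CPUS];

/* Models memguard_guard_stack(): store a token with an ordinary
 * write, then guard the stack. Returns false where the real code
 * would take a page fault (write to an already guarded stack). */
static bool toy_guard_stack(struct toy_stack *s)
{
    if (s->guarded)
        return false;
    s->token = 1;
    s->guarded = true;
    return true;
}

/* Models memguard_unguard_stack(): make the stack writable again. */
static void toy_unguard_stack(struct toy_stack *s)
{
    s->guarded = false;
}

/* Models the conditional teardown: when 'remove' is false (assumed
 * here to be the suspend case) the stack is kept, still guarded. */
static void toy_cpu_free(unsigned int cpu, bool remove)
{
    assert(cpu < NR_CPUS);
    if (!remove || stack_base[cpu] == NULL)
        return;
    toy_unguard_stack(stack_base[cpu]);
    free(stack_base[cpu]);
    stack_base[cpu] = NULL;
}

/* Models the allocation path. With 'fixed' set, a reused stack is
 * unguarded before being guarded again, as in the suggested else
 * branch. Returns false on allocation failure or modelled fault. */
static bool toy_cpu_alloc(unsigned int cpu, bool fixed)
{
    assert(cpu < NR_CPUS);
    if (stack_base[cpu] == NULL) {
        stack_base[cpu] = calloc(1, sizeof(*stack_base[cpu]));
        if (stack_base[cpu] == NULL)
            return false;
    } else if (fixed) {
        toy_unguard_stack(stack_base[cpu]);
    }
    return toy_guard_stack(stack_base[cpu]);
}
```

In this model, bringing a CPU up, tearing it down with remove == false, and bringing it up again fails unless the else branch is taken, which mirrors the S3 resume sequence in the trace.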