Re: [Xen-devel] Need help with fixing the Xen waitqueue feature
On Sat, Nov 12, Keir Fraser wrote:
> On 11/11/2011 22:56, "Olaf Hering" <olaf@xxxxxxxxx> wrote:
> > Keir,
> >
> > just to dump my findings to the list:
> >
> > On Tue, Nov 08, Keir Fraser wrote:
> >
> >> Tbh I wonder anyway whether stale hypercall context would be likely to
> >> cause a silent machine reboot. Booting with max_cpus=1 would eliminate
> >> moving between CPUs as a cause of inconsistencies, or pin the guest
> >> under test. Another problem could be sleeping with locks held, but we do
> >> test for that (in debug builds at least) and I'd expect crash/hang
> >> rather than silent reboot. Another problem could be if the vcpu has its
> >> own state in an inconsistent/invalid state temporarily (e.g., its
> >> pagetable base pointers) which is then restored during a waitqueue
> >> wakeup. That could certainly cause a reboot, but I don't know of an
> >> example where this might happen.
> >
> > The crashes also happen with maxcpus=1 and a single guest cpu.
> > Today I added wait_event to ept_get_entry and this works.
> >
> > But at some point the codepath below is executed, and after that wake_up
> > the host hangs hard. I will trace it further next week; maybe the
> > backtrace gives a clue what the cause could be.
>
> So you run with a single CPU, and with wait_event() in one location, and
> that works for a while (actually doing full waitqueue work: executing
> wait() and wake_up()), but then hangs? That's weird, but pretty
> interesting if I've understood correctly.

Yes, that's what happens with a single cpu in dom0 and domU.
I have added some more debug. After the backtrace below I see one more
call to check_wakeup_from_wait() for dom0, then the host hangs hard.

> > Also, the 3K stacksize is still too small, this path uses 3096.
>
> I'll allocate a whole page for the stack then.

Thanks.
Olaf

> > (XEN) prep 127a 30 0
> > (XEN) wake 127a 30
> > (XEN) prep 1cf71 30 0
> > (XEN) wake 1cf71 30
> > (XEN) prep 1cf72 30 0
> > (XEN) wake 1cf72 30
> > (XEN) prep 1cee9 30 0
> > (XEN) wake 1cee9 30
> > (XEN) prep 121a 30 0
> > (XEN) wake 121a 30
> >
> > (This means 'gfn (p2m_unshare << 4) in_atomic')
> >
> > (XEN) prep 1ee61 20 0
> > (XEN) max stacksize c18
> > (XEN) Xen WARN at wait.c:126
> > (XEN) ----[ Xen-4.2.24114-20111111.221356  x86_64  debug=y  Tainted: C ]----
> > (XEN) CPU:    0
> > (XEN) RIP:    e008:[<ffff82c48012b85e>] prepare_to_wait+0x178/0x1b2
> > (XEN) RFLAGS: 0000000000010286   CONTEXT: hypervisor
> > (XEN) rax: 0000000000000000   rbx: ffff830201f76000   rcx: 0000000000000000
> > (XEN) rdx: ffff82c4802b7f18   rsi: 000000000000000a   rdi: ffff82c4802673f0
> > (XEN) rbp: ffff82c4802b73a8   rsp: ffff82c4802b7378   r8:  0000000000000000
> > (XEN) r9:  ffff82c480221da0   r10: 00000000fffffffa   r11: 0000000000000003
> > (XEN) r12: ffff82c4802b7f18   r13: ffff830201f76000   r14: ffff83003ea5c000
> > (XEN) r15: 000000000001ee61   cr0: 000000008005003b   cr4: 00000000000026f0
> > (XEN) cr3: 000000020336d000   cr2: 00007fa88ac42000
> > (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
> > (XEN) Xen stack trace from rsp=ffff82c4802b7378:
> > (XEN)    0000000000000020 000000000001ee61 0000000000000002 ffff830201aa9e90
> > (XEN)    ffff830201aa9f60 0000000000000020 ffff82c4802b7428 ffff82c4801e02f9
> > (XEN)    ffff830000000002 0000000000000000 ffff82c4802b73f8 ffff82c4802b73f4
> > (XEN)    0000000000000000 ffff82c4802b74e0 ffff82c4802b74e4 0000000101aa9e90
> > (XEN)    000000ffffffffff ffff830201aa9e90 000000000001ee61 ffff82c4802b74e4
> > (XEN)    0000000000000002 0000000000000000 ffff82c4802b7468 ffff82c4801d810f
> > (XEN)    ffff82c4802b74e0 000000000001ee61 ffff830201aa9e90 ffff82c4802b75bc
> > (XEN)    00000000002167f5 ffff88001ee61900 ffff82c4802b7518 ffff82c480211b80
> > (XEN)    ffff8302167f5000 ffff82c4801c168c 0000000000000000 ffff83003ea5c000
> > (XEN)    ffff88001ee61900 0000000001805063 0000000001809063 000000001ee001e3
> > (XEN)    000000001ee61067 00000000002167f5 000000000022ee70 000000000022ed10
> > (XEN)    ffffffffffffffff 0000000a00000007 0000000000000004 ffff82c48025db80
> > (XEN)    ffff83003ea5c000 ffff82c4802b75bc ffff88001ee61900 ffff830201aa9e90
> > (XEN)    ffff82c4802b7528 ffff82c480211cb1 ffff82c4802b7568 ffff82c4801da97f
> > (XEN)    ffff82c4801be053 0000000000000008 ffff82c4802b7b58 ffff88001ee61900
> > (XEN)    0000000000000000 ffff82c4802b78b0 ffff82c4802b75f8 ffff82c4801aaec8
> > (XEN)    0000000000000003 ffff88001ee61900 ffff82c4802b78b0 ffff82c4802b7640
> > (XEN)    ffff83003ea5c000 00000000000000a0 0000000000000900 0000000000000008
> > (XEN)    00000003802b7650 0000000000000004 00000003802b7668 0000000000000000
> > (XEN)    ffff82c4802b7b58 0000000000000001 0000000000000003 ffff82c4802b78b0
> > (XEN) Xen call trace:
> > (XEN)    [<ffff82c48012b85e>] prepare_to_wait+0x178/0x1b2
> > (XEN)    [<ffff82c4801e02f9>] ept_get_entry+0x81/0xd8
> > (XEN)    [<ffff82c4801d810f>] gfn_to_mfn_type_p2m+0x55/0x114
> > (XEN)    [<ffff82c480211b80>] hap_p2m_ga_to_gfn_4_levels+0x1c4/0x2d6
> > (XEN)    [<ffff82c480211cb1>] hap_gva_to_gfn_4_levels+0x1f/0x2e
> > (XEN)    [<ffff82c4801da97f>] paging_gva_to_gfn+0xae/0xc4
> > (XEN)    [<ffff82c4801aaec8>] hvmemul_linear_to_phys+0xf1/0x25c
> > (XEN)    [<ffff82c4801ab762>] hvmemul_rep_movs+0xe8/0x31a
> > (XEN)    [<ffff82c48018de07>] x86_emulate+0x4e01/0x10fde
> > (XEN)    [<ffff82c4801aab3c>] hvm_emulate_one+0x12d/0x1c5
> > (XEN)    [<ffff82c4801b68a9>] handle_mmio+0x4e/0x1d8
> > (XEN)    [<ffff82c4801b3a1e>] hvm_hap_nested_page_fault+0x1e7/0x302
> > (XEN)    [<ffff82c4801d1ff6>] vmx_vmexit_handler+0x12cf/0x1594
> > (XEN)
> > (XEN) wake 1ee61 20

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel