[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] HVM guests hanging in hpet_rtc_interrupt
>>> On 31.01.13 at 18:07, Olaf Hering <olaf@xxxxxxxxx> wrote: > xenctx output: I'm confused by this: For one, the mere fact that hpet_rtc_interrupt() is in use suggests that the kernel uses RTC emulation (and hence wouldn't depend on any of the RTC registers we emulate to be set correctly). The only issue then could be that we're raising way too many IRQ8-s. While that could explain hangs (due to the IRQ getting shut off by the kernel), has two other inconsistencies in itself: (a) hpet_rtc_interrupt() unconditionally returns IRQ_HANDLED, and hence the interrupt should never be regarded as spurious by the kernel (and hence never be shut off). (b) Once turned off (which we see happening in the kernel log you posted subsequently), hpet_rtc_interrupt() cannot possibly get onto the call stack at all anymore, hence a hang _here_ is impossible. So one question to you would be to clarify which hangs you see where under what conditions (I'm particularly unclear about the exact difference in behavior that you observe with xend vs xl). And then there is a problem with the xenctx output below, assuming they were taken subsequently on the same guest instance while in that hung state: > ... > rip: ffffffff81034b97 hpet_rtc_interrupt+0x97 > flags: 00000086 s nz p > rsp: ffff88001b003df8 > rax: 00000000bc004a66 rcx: 00000000000ee6b2 rdx: 00000000ba6b145a > rbx: 000000000000087d rsi: ffff88001aec5000 rdi: 0000000000000008 > rbp: ffff88001b003e68 r8: ffff88001ac06800 r9: ffff88001b003e38 > r10: 000000000000008c r11: 000000000000f800 r12: ffff88001ac06884 > r13: 0000000000000000 r14: 0000000000000000 r15: 0000000000000008 > cs: 0010 ss: 0018 ds: 0000 es: 0000 > fs: 0000 @ 0000000000000000 > gs: 0000 @ ffff88001b000000/0000000000000000 > Code (instr addr ffffffff81034b97) > 00 89 10 48 8b 05 74 da d0 00 83 c3 01 48 05 f0 00 00 00 8b 00 <3b> 05 d3 da > d0 00 79 c9 85 db 74 > > > Stack: > ffff88001b003e68 ffff88001aec5000 0000000000000000 0000000800000000 > ffff88001ad49c68 0000000000000000 00000008156570c6 ffffffff8165e9cc > ffff88001b0125c0 ffff880019278480 ffff88001ac06884 0000000000000000 > 0000000000000000 0000000000000008 ffff88001b003eb8 ffffffff8109aa2b > > Call Trace: > [<ffffffff81034b97>] hpet_rtc_interrupt+0x97 <-- > [<ffffffff8165e9cc>] _raw_spin_unlock_irqrestore+0xc > [<ffffffff8109aa2b>] handle_irq_event_percpu+0x4b > [<ffffffff8109ab8c>] handle_irq_event+0x3c > [<ffffffff8109d79d>] handle_edge_irq+0x6d > [<ffffffff8142b474>] __xen_evtchn_do_upcall+0x1a4 > [<ffffffff8142ca6a>] xen_evtchn_do_upcall+0x2a > [<ffffffff81660aad>] xen_hvm_callback_vector+0x6d > rip: ffffffff810360d6 native_safe_halt+0x6 > flags: 00000246 i z p > rsp: ffff88001acbdef8 > rax: 0000000000000000 rcx: 00000000ffffffff rdx: 0000000000000000 > rbx: ffff88001acbc010 rsi: 0140000000000000 rdi: 0000000000000086 > rbp: ffff88001acbdef8 r8: 0000000000000000 r9: 0000000000000000 > r10: 0000000000000001 r11: 0000000000000000 r12: ffffffff81c62330 > r13: 0000000000000000 r14: 0000000000000000 r15: 0000000000000000 > cs: 0010 ss: 0018 ds: 0000 es: 0000 > fs: 0000 @ 0000000000000000 > gs: 0000 @ ffff88001b040000/0000000000000000 > Code (instr addr ffffffff810360d6) > 48 89 e5 fb c9 c3 66 0f 1f 84 00 00 00 00 00 55 48 89 e5 fb f4 <c9> c3 0f 1f > 84 00 00 00 00 00 55 > > > Stack: > ffff88001acbdf08 ffffffff81014735 ffff88001acbdf28 ffffffff81014a68 > 0000000000000001 0000000000000000 ffff88001acbdf48 ffffffff8165824a > 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > > Call Trace: > [<ffffffff810360d6>] native_safe_halt+0x6 <-- > [<ffffffff81014735>] default_idle+0x45 > [<ffffffff81014a68>] cpu_idle+0x88 > [<ffffffff8165824a>] start_secondary+0x188 > > rip: ffffffff81034b97 hpet_rtc_interrupt+0x97 > flags: 00000086 s nz p > rsp: ffff88001b003df8 > rax: 00000000bc004a66 rcx: 00000000000ee6b2 rdx: 00000000ba6b145a > rbx: 000000000000087d rsi: ffff88001aec5000 rdi: 0000000000000008 > rbp: ffff88001b003e68 r8: ffff88001ac06800 r9: ffff88001b003e38 > r10: 000000000000008c r11: 000000000000f800 r12: ffff88001ac06884 > r13: 0000000000000000 r14: 0000000000000000 r15: 0000000000000008 > cs: 0010 ss: 0018 ds: 0000 es: 0000 > fs: 0000 @ 0000000000000000 > gs: 0000 @ ffff88001b000000/0000000000000000 > Code (instr addr ffffffff81034b97) > 00 89 10 48 8b 05 74 da d0 00 83 c3 01 48 05 f0 00 00 00 8b 00 <3b> 05 d3 da > d0 00 79 c9 85 db 74 > > > Stack: > ffff88001b003e68 ffff88001aec5000 0000000000000000 0000000800000000 > ffff88001ad49c68 0000000000000000 00000008156570c6 ffffffff8165e9cc > ffff88001b0125c0 ffff880019278480 ffff88001ac06884 0000000000000000 > 0000000000000000 0000000000000008 ffff88001b003eb8 ffffffff8109aa2b > > Call Trace: > [<ffffffff81034b97>] hpet_rtc_interrupt+0x97 <-- > [<ffffffff8165e9cc>] _raw_spin_unlock_irqrestore+0xc > [<ffffffff8109aa2b>] handle_irq_event_percpu+0x4b > [<ffffffff8109ab8c>] handle_irq_event+0x3c > [<ffffffff8109d79d>] handle_edge_irq+0x6d > [<ffffffff8142b474>] __xen_evtchn_do_upcall+0x1a4 > [<ffffffff8142ca6a>] xen_evtchn_do_upcall+0x2a > [<ffffffff81660aad>] xen_hvm_callback_vector+0x6d > rip: ffffffff810360d6 native_safe_halt+0x6 > flags: 00000246 i z p > rsp: ffff88001acbdef8 > rax: 0000000000000000 rcx: 00000000ffffffff rdx: 0000000000000000 > rbx: ffff88001acbc010 rsi: 0140000000000000 rdi: 0000000000000086 > rbp: ffff88001acbdef8 r8: 0000000000000000 r9: 0000000000000000 > r10: 0000000000000001 r11: 0000000000000000 r12: ffffffff81c62330 > r13: 0000000000000000 r14: 0000000000000000 r15: 0000000000000000 > cs: 0010 ss: 0018 ds: 0000 es: 0000 > fs: 0000 @ 0000000000000000 > gs: 0000 @ ffff88001b040000/0000000000000000 > Code (instr addr ffffffff810360d6) > 48 89 e5 fb c9 c3 66 0f 1f 84 00 00 00 00 00 55 48 89 e5 fb f4 <c9> c3 0f 1f > 84 00 00 00 00 00 55 > > > Stack: > ffff88001acbdf08 ffffffff81014735 ffff88001acbdf28 ffffffff81014a68 > 0000000000000001 0000000000000000 ffff88001acbdf48 ffffffff8165824a > 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > > Call Trace: > [<ffffffff810360d6>] native_safe_halt+0x6 <-- > [<ffffffff81014735>] default_idle+0x45 > [<ffffffff81014a68>] cpu_idle+0x88 > [<ffffffff8165824a>] start_secondary+0x188 > > rip: ffffffff81034b97 hpet_rtc_interrupt+0x97 > flags: 00000086 s nz p > rsp: ffff88001b003df8 > rax: 00000000bc004a66 rcx: 00000000000ee6b2 rdx: 00000000ba6b145a > rbx: 000000000000087d rsi: ffff88001aec5000 rdi: 0000000000000008 > rbp: ffff88001b003e68 r8: ffff88001ac06800 r9: ffff88001b003e38 > r10: 000000000000008c r11: 000000000000f800 r12: ffff88001ac06884 > r13: 0000000000000000 r14: 0000000000000000 r15: 0000000000000008 > cs: 0010 ss: 0018 ds: 0000 es: 0000 > fs: 0000 @ 0000000000000000 > gs: 0000 @ ffff88001b000000/0000000000000000 > Code (instr addr ffffffff81034b97) > 00 89 10 48 8b 05 74 da d0 00 83 c3 01 48 05 f0 00 00 00 8b 00 <3b> 05 d3 da > d0 00 79 c9 85 db 74 > > > Stack: > ffff88001b003e68 ffff88001aec5000 0000000000000000 0000000800000000 > ffff88001ad49c68 0000000000000000 00000008156570c6 ffffffff8165e9cc > ffff88001b0125c0 ffff880019278480 ffff88001ac06884 0000000000000000 > 0000000000000000 0000000000000008 ffff88001b003eb8 ffffffff8109aa2b > > Call Trace: > [<ffffffff81034b97>] hpet_rtc_interrupt+0x97 <-- > [<ffffffff8165e9cc>] _raw_spin_unlock_irqrestore+0xc > [<ffffffff8109aa2b>] handle_irq_event_percpu+0x4b > [<ffffffff8109ab8c>] handle_irq_event+0x3c > [<ffffffff8109d79d>] handle_edge_irq+0x6d > [<ffffffff8142b474>] __xen_evtchn_do_upcall+0x1a4 > [<ffffffff8142ca6a>] xen_evtchn_do_upcall+0x2a > [<ffffffff81660aad>] xen_hvm_callback_vector+0x6d > rip: ffffffff810360d6 native_safe_halt+0x6 > flags: 00000246 i z p > rsp: ffff88001acbdef8 > rax: 0000000000000000 rcx: 00000000ffffffff rdx: 0000000000000000 > rbx: ffff88001acbc010 rsi: 0140000000000000 rdi: 0000000000000086 > rbp: ffff88001acbdef8 r8: 0000000000000000 r9: 0000000000000000 > r10: 0000000000000001 r11: 0000000000000000 r12: ffffffff81c62330 > r13: 0000000000000000 r14: 0000000000000000 r15: 0000000000000000 > cs: 0010 ss: 0018 ds: 0000 es: 0000 > fs: 0000 @ 0000000000000000 > gs: 0000 @ ffff88001b040000/0000000000000000 > Code (instr addr ffffffff810360d6) > 48 89 e5 fb c9 c3 66 0f 1f 84 00 00 00 00 00 55 48 89 e5 fb f4 <c9> c3 0f 1f > 84 00 00 00 00 00 55 > > > Stack: > ffff88001acbdf08 ffffffff81014735 ffff88001acbdf28 ffffffff81014a68 > 0000000000000001 0000000000000000 ffff88001acbdf48 ffffffff8165824a > 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > > Call Trace: > [<ffffffff810360d6>] native_safe_halt+0x6 <-- > [<ffffffff81014735>] default_idle+0x45 > [<ffffffff81014a68>] cpu_idle+0x88 > [<ffffffff8165824a>] start_secondary+0x188 > > rip: ffffffff81034b97 hpet_rtc_interrupt+0x97 > flags: 00000086 s nz p > rsp: ffff88001b003df8 > rax: 00000000bc004a66 rcx: 00000000000ee6b2 rdx: 00000000ba6b145a > rbx: 000000000000087d rsi: ffff88001aec5000 rdi: 0000000000000008 > rbp: ffff88001b003e68 r8: ffff88001ac06800 r9: ffff88001b003e38 > r10: 000000000000008c r11: 000000000000f800 r12: ffff88001ac06884 > r13: 0000000000000000 r14: 0000000000000000 r15: 0000000000000008 > cs: 0010 ss: 0018 ds: 0000 es: 0000 > fs: 0000 @ 0000000000000000 > gs: 0000 @ ffff88001b000000/0000000000000000 > Code (instr addr ffffffff81034b97) > 00 89 10 48 8b 05 74 da d0 00 83 c3 01 48 05 f0 00 00 00 8b 00 <3b> 05 d3 da > d0 00 79 c9 85 db 74 > > > Stack: > ffff88001b003e68 ffff88001aec5000 0000000000000000 0000000800000000 > ffff88001ad49c68 0000000000000000 00000008156570c6 ffffffff8165e9cc > ffff88001b0125c0 ffff880019278480 ffff88001ac06884 0000000000000000 > 0000000000000000 0000000000000008 ffff88001b003eb8 ffffffff8109aa2b > > Call Trace: > [<ffffffff81034b97>] hpet_rtc_interrupt+0x97 <-- > [<ffffffff8165e9cc>] _raw_spin_unlock_irqrestore+0xc > [<ffffffff8109aa2b>] handle_irq_event_percpu+0x4b > [<ffffffff8109ab8c>] handle_irq_event+0x3c > [<ffffffff8109d79d>] handle_edge_irq+0x6d > [<ffffffff8142b474>] __xen_evtchn_do_upcall+0x1a4 > [<ffffffff8142ca6a>] xen_evtchn_do_upcall+0x2a > [<ffffffff81660aad>] xen_hvm_callback_vector+0x6d > rip: ffffffff810360d6 native_safe_halt+0x6 > flags: 00000246 i z p > rsp: ffff88001acbdef8 > rax: 0000000000000000 rcx: 00000000ffffffff rdx: 0000000000000000 > rbx: ffff88001acbc010 rsi: 0140000000000000 rdi: 0000000000000086 > rbp: ffff88001acbdef8 r8: 0000000000000000 r9: 0000000000000000 > r10: 0000000000000001 r11: 0000000000000000 r12: ffffffff81c62330 > r13: 0000000000000000 r14: 0000000000000000 r15: 0000000000000000 > cs: 0010 ss: 0018 ds: 0000 es: 0000 > fs: 0000 @ 0000000000000000 > gs: 0000 @ ffff88001b040000/0000000000000000 > Code (instr addr ffffffff810360d6) > 48 89 e5 fb c9 c3 66 0f 1f 84 00 00 00 00 00 55 48 89 e5 fb f4 <c9> c3 0f 1f > 84 00 00 00 00 00 55 > > > Stack: > ffff88001acbdf08 ffffffff81014735 ffff88001acbdf28 ffffffff81014a68 > 0000000000000001 0000000000000000 ffff88001acbdf48 ffffffff8165824a > 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > > Call Trace: > [<ffffffff810360d6>] native_safe_halt+0x6 <-- > [<ffffffff81014735>] default_idle+0x45 > [<ffffffff81014a68>] cpu_idle+0x88 > [<ffffffff8165824a>] start_secondary+0x188 > ... One of the vCPU-s (presumably vCPU0, but that doesn't really matter) has an _invariant_ register state, yet the loop it's in guarantees at least %ebx to change on each iteration (and one would expect %eax to change too). Which would imply the guest isn't making forward progress at all anymore. With interrupts being disabled during that code sequence, this also can't mean the guest is completely busy servicing interrupts - it must be getting no execution time anymore. Which in turn would mean either the vCPU is no longer in running state, or Xen itself is hung. As you didn't indicate the latter, could you check the former? Finally, while I don't see a connection (namely because I don't think periodic interrupts would be in use by the guest here, which is what this change to a large part revolves around), you might nevertheless want to try to apply the last outstanding change (attached) as an alternative to reverting. Jan Attachment:
x86-HVM-RTC.patch _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |