[Xen-devel] cpuidle and un-eoid interrupts at the local apic
Recently our automated testing system has caught a curious assertion while testing Xen 4.1.5 on a HaswellDT system.

(XEN) Assertion '(sp == 0) || (peoi[sp-1].vector < vector)' failed at irq.c:1030
(XEN) ----[ Xen-4.1.5 x86_64 debug=n Not tainted ]----
(XEN) CPU: 0
(XEN) RIP: e008:[<ffff82c48016b2b4>] do_IRQ+0x514/0x750
(XEN) RFLAGS: 0000000000010093 CONTEXT: hypervisor
(XEN) rax: 000000000000002f rbx: ffff830249841e80 rcx: ffff82c4803127c0
(XEN) rdx: 0000000000000004 rsi: 0000000000000027 rdi: 0000000000000001
(XEN) rbp: 0000000000001e00 rsp: ffff82c4802bfd48 r8: ffff82c480312abc
(XEN) r9: ffff8302498a5948 r10: 0000000000000009 r11: ffff8302498c6c80
(XEN) r12: ffff830243b07f50 r13: ffff8300a24f8000 r14: 00000af8373788e3
(XEN) r15: ffff830249841e80 cr0: 000000008005003b cr4: 00000000001026f0
(XEN) cr3: 00000002479e6000 cr2: 00000000e6d3c090
(XEN) ds: 007b es: 007b fs: 00d8 gs: 0000 ss: 0000 cs: e008
(XEN) Xen stack trace from rsp=ffff82c4802bfd48:
(XEN) ffff830249841eb4 ffff82c480312ec0 000000000000001e 0000001e00000000
(XEN) 0000000000000000 00000000498a5670 ffff830249841d80 ffff830249840080
(XEN) ffff830249841db4 0000000000000000 ffff8302498a55e0 ffff8302498a5670
(XEN) ffff8300a24f8000 00000af8373788e3 00000af83736b8ed ffff82c480162ca0
(XEN) 00000af83736b8ed 00000af8373788e3 ffff8300a24f8000 ffff8302498a5670
(XEN) ffff8302498a55e0 0000000000000000 ffff8302498c6c80 0000000000000009
(XEN) ffff8302498a5948 ffff82c480313000 0000000000007f40 0000000000000001
(XEN) 0000000000000000 0000000000000000 00000af80db652fd 0000002700000000
(XEN) ffff82c4801a50a0 000000000000e008 0000000000000246 ffff82c4802bfe78
(XEN) 0000000000000000 ffff8302498a5670 ffff82c4801a6a56 ffffffffffffffff
(XEN) ffff830249818000 0000000000000000 ffff8300a24f8000 ffff82c480122c11
(XEN) 00000af839021119 0000000000000000 0000000000000000 00000000802bff18
(XEN) 0000025c0000013b ffff82c4802e7580 ffff82c4802bff18 ffff8300a2838000
(XEN) ffff82c4802f61a0 ffff8300a24f8000 0000000000000002 00000af837304b45
(XEN) ffff82c48015b67a 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 00000000ee8a3f8c 0000000000000001
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 00000000ee8a3f74 0000000000000af8
(XEN) 0000000000000001 0000010000000000 00000000c01013a7 0000000000000061
(XEN) 0000000000000246 00000000ee8a3f70 0000000000000069 0000000000000000
(XEN) Xen call trace:
(XEN) [<ffff82c48016b2b4>] do_IRQ+0x514/0x750
(XEN) 15[<ffff82c480162ca0>] common_interrupt+0x20/0x30
(XEN) 32[<ffff82c4801a50a0>] lapic_timer_nop+0x0/0x10
(XEN) 38[<ffff82c4801a6a56>] acpi_processor_idle+0x376/0x740
(XEN) 43[<ffff82c480122c11>] do_block+0x71/0xd0
(XEN) 56[<ffff82c48015b67a>] idle_loop+0x1a/0x50
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) Assertion '(sp == 0) || (peoi[sp-1].vector < vector)' failed at irq.c:1030
(XEN) ****************************************

And the disassembly before the assertion:

ffff82c48016b29f: 48 8d 14 85 00 00 00    lea 0x0(,%rax,4),%rdx
ffff82c48016b2a6: 00
ffff82c48016b2a7: 0f b6 44 11 ff          movzbl -0x1(%rcx,%rdx,1),%eax
ffff82c48016b2ac: 39 c6                   cmp %eax,%esi
ffff82c48016b2ae: 0f 8f 5c ff ff ff       jg ffff82c48016b210 <do_IRQ+0x470>
ffff82c48016b2b4: 0f 0b                   ud2

Xen has been woken up by an interrupt of vector 0x27 (%rsi), but has vector 0x2f (%rax) on the top of the pending EOI stack for the local APIC.

I have put in more debugging to dump the LAPIC state of the two interesting vectors and the IOAPIC state, but I have no idea if/when the problem might reoccur.

My understanding of LAPIC priority leads me to think that Xen really shouldn't be woken up by a lower priority vector if a higher priority one is still un-eoi'd. There is not yet sufficient information to tell whether this is truly the case, or whether Xen has simply gotten confused about which vectors it has eoi'd.
Having said that, we do keep line-level interrupts un-eoi'd for extended periods while guests service the interrupt. Given that vectors are chosen at random, we could get into a situation where a line interrupt has vector 0xdf and stays pending for 150ms (which I measured as a not-overly-uncommon mean time until EOI for a line-level interrupt). This would starve any other guest interrupts for an extended period.

Given directed-EOI support in the past few generations of processor, the requirement for the pending EOI stack has disappeared as far as I am aware. Would it be a sensible idea in general to make use of the pending EOI stack conditional on not having/using directed EOI support?

~Andrew