Xen project Mailing List

[Xen-devel] HPET Stack overflow

To: Xen-devel List <xen-devel@xxxxxxxxxxxxx>, Jan Beulich <JBeulich@xxxxxxxx>

From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

Date: Mon, 30 Sep 2013 18:00:38 +0100

Delivery-date: Mon, 30 Sep 2013 17:01:11 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

Hello,

After some of the more urgent regressions XenServer testing has found have been fixed, I finally have time to get back to this one.

CA-113568 - Dumping interesting state:

local_irq_count was 8
num_hpets_used = 8
HPET[00] idx 0, cpu 3, flags 0x00000001, cpumask 00000008
  MSI: HPET 25 vec=41 fixed edge assert phys cpu dest=00000002 mask=1/0/?
HPET[01] idx 1, cpu 4294967295, flags 0x00000000, cpumask 00000000
  MSI: HPET 26 vec=c0 fixed edge assert phys cpu dest=00000002 mask=1/0/?
HPET[02] idx 2, cpu 4294967295, flags 0x00000000, cpumask 00000000
  MSI: HPET 27 vec=c8 fixed edge assert phys cpu dest=00000002 mask=1/0/?
HPET[03] idx 3, cpu 4294967295, flags 0x00000000, cpumask 00000000
  MSI: HPET 28 vec=d0 fixed edge assert phys cpu dest=00000006 mask=1/0/?
HPET[04] idx 4, cpu 4294967295, flags 0x00000000, cpumask 00000000
  MSI: HPET 29 vec=21 fixed edge assert phys cpu dest=00000006 mask=1/0/?
HPET[05] idx 5, cpu 2, flags 0x00000001, cpumask 00000004
  MSI: HPET 30 vec=29 fixed edge assert phys cpu dest=00000006 mask=1/0/?
HPET[06] idx 6, cpu 4294967295, flags 0x00000000, cpumask 00000000
  MSI: HPET 31 vec=31 fixed edge assert phys cpu dest=00000002 mask=1/0/?
HPET[07] idx 7, cpu 1, flags 0x00000001, cpumask 00000002
  MSI: HPET 32 vec=39 fixed edge assert phys cpu dest=00000006 mask=1/0/?

This debugging is taken after nmi_shootdown_cpus(), when all other pcpus are stopped. (In due course, I shall get around to submitting the debugging infrastructure patches, as I suspect others upstream might find them useful.)

This is one of the more interesting traces, but at other times every HPET has been seen with cpu set to -1.

local_irq_count proves that we have taken 8 nested interrupts, and the full stack debugging (not included here for brevity) shows that they were all HPET interrupts with different vectors.

On this particular hardware, using "maxcpus=4" does work around the issue, but is not a valid fix.
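A note on reading the dump: the cpu value 4294967295 is what -1 looks like when printed as an unsigned 32-bit integer, and for the channels that do have an owner, the cpumask has exactly the bit for the listed cpu set (for example cpumask 00000008 for cpu 3). A minimal C sketch of that decoding follows. The helper names are hypothetical and not taken from the Xen source, and the sketch assumes a 32-bit unsigned int.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not Xen code: return the index of the single CPU
 * set in a 32-bit cpumask word, or -1 if the mask is empty or has more
 * than one bit set.
 */
static int cpumask_single_cpu(uint32_t mask)
{
    int cpu = 0;

    /* mask & (mask - 1) clears the lowest set bit; nonzero means >1 bit. */
    if (mask == 0 || (mask & (mask - 1)) != 0)
        return -1;

    while ((mask & 1u) == 0) {
        mask >>= 1;
        cpu++;
    }
    return cpu;
}

/*
 * Hypothetical helper: does the printed "cpu" field mean "no owner"?
 * Assumption: the field holds -1 stored in an unsigned int, which is
 * why the dump shows 4294967295.
 */
static int hpet_cpu_is_unowned(unsigned int cpu)
{
    return cpu == (unsigned int)-1;
}
```

Applied to the dump above, HPET[00], HPET[05] and HPET[07] decode to cpus 3, 2 and 1 respectively, and the remaining five channels are unowned.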
Playing with the position of ack_APIC_irq() does appear to affect the problem (as suspected), but I am not convinced it is the correct fix either.

Having looked at the implementation, which uses regular irqs and irq migration, I have to admit to being surprised it even works in the slightest. For safety reasons, the irq migration code requires an irq to arrive at the old cpu, at which point state gets updated to point it at the new cpu. Using regular irqs also means that the vector is essentially random between 0x21 and 0xef, which plays havoc with the priorities of other interrupts, including line level interrupts (where a high priority line level interrupt with its ack sitting on the pending EOI stack can block a lower priority HPET timer interrupt for an indefinite period of time).

From my reading of the code, a pcpu which calls hpet_get_channel() and needs to share channels will cause an early HPET interrupt to occur on the old pcpu, which appears to then go out of its way to wake up the cpu which should have received the interrupt.

It occurs to me that a substantially more sane method of doing this is to install a high priority handler with the hpet handler directly wired in. This way, hpet_get_channel() becomes vastly simpler and just involves rewriting the MSI to change the destination. It also allows for correct use of ack_APIC_irq() (i.e. to prevent reentrancy), and frees up some space in the dynamic range.

Jan: as the author of the current code, is there anything I have overlooked?

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
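As an illustration of the vector priority point in the message above: on x86, the local APIC takes an interrupt's priority class from the upper four bits of its vector, and will not deliver a pending interrupt unless its class is strictly higher than that of the highest in-service (not yet EOI'd) interrupt. The sketch below is a simplified model of that rule only. It ignores the task priority register, the helper names are hypothetical rather than Xen functions, and the vectors in the usage note are the vec= values from the dump.

```c
#include <stdint.h>

/* Priority class of a vector: bits 7:4, per the x86 local APIC rules. */
static unsigned int apic_priority_class(uint8_t vector)
{
    return (unsigned int)vector >> 4;
}

/*
 * Simplified model (assumption: TPR is 0 and exactly one interrupt is
 * in service): a pending vector is delivered only if its priority class
 * is strictly higher than the class of the in-service vector.
 */
static int apic_would_deliver(uint8_t pending, uint8_t in_service)
{
    return apic_priority_class(pending) > apic_priority_class(in_service);
}
```

Under this model, an HPET channel that happened to be given vector 0x41 (class 4) stays pending for as long as an interrupt on vector 0xc0 (class 12) has not been EOI'd, while one on vector 0xd0 (class 13) would still get through. Two vectors in the same class, such as 0x21 and 0x29, also block each other.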
