
Re: [Xen-devel] Xen-unstable: pci-passthrough "irq 16: nobody cared" on HVM guest shutdown on irq of device not passed through.



Thursday, September 25, 2014, 7:02:02 PM, you wrote:


> Thursday, September 25, 2014, 6:14:43 PM, you wrote:

>>>>> On 25.09.14 at 17:49, <linux@xxxxxxxxxxxxxx> wrote:

>>> Thursday, September 25, 2014, 5:11:33 PM, you wrote:
>>> 
>>>>>>> On 25.09.14 at 16:36, <linux@xxxxxxxxxxxxxx> wrote:
>>>>> - When shutting down the HVM guest, when A happens the number of
>>>>>   interrupts in /proc/interrupts is still what it was, but when B
>>>>>   happens there seems to be an irq storm, and after the "irq ...
>>>>>   nobody cared" it ends with (always that 200000, so perhaps a
>>>>>   threshold?):
>>>>>   16:     200000          0          0          0          0          0  xen-pirq-ioapic-level  snd_hda_intel
>>> 
>>>> 100,000 is the traditional threshold, so I would expect the cited
>>>> instance to be the second one. It didn't really become clear to me
>>>> - is this observed in Dom0 or in the shutting down guest? 
>>> 
>>> This is from the /proc/interrupts of dom0 after the "irq nobody cared"
>>> message appeared in dom0 (so after B happened). Just after host boot and
>>> the first guest boot it was stable around 500. On the next start (after
>>> which B would happen on shutting the guest down again) it doubled to
>>> about 1000 (perhaps when the "Unsupported MSI delivery mode 3 for Dom2"
>>> occurred).

>> Something odd must then be going on - the threshold _is_ 100,000,
>> not 200,000.
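
For reference: that threshold comes from the kernel's spurious-IRQ accounting
in kernel/irq/spurious.c. A stand-alone toy sketch of the logic, paraphrased
from memory rather than quoting the exact kernel source:

    #include <stdio.h>
    #include <stdbool.h>

    struct irq_stats {
        unsigned int irq_count;      /* interrupts seen in this window   */
        unsigned int irqs_unhandled; /* how many of those nobody handled */
        bool disabled;
    };

    /* Called once per interrupt: per window of 100,000 interrupts, if
     * more than 99,900 went unhandled, report and disable the line. */
    static void note_interrupt(int irq, struct irq_stats *s, bool handled)
    {
        if (!handled)
            s->irqs_unhandled++;
        if (++s->irq_count < 100000)
            return;
        s->irq_count = 0;
        if (s->irqs_unhandled > 99900) {
            printf("irq %d: nobody cared (try booting with the \"irqpoll\" option)\n",
                   irq);
            printf("Disabling IRQ #%d\n", irq);
            s->disabled = true;
        }
        s->irqs_unhandled = 0;
    }

    int main(void)
    {
        struct irq_stats s = { 0 };
        /* a storm where no handler ever claims IRQ 16 */
        for (int i = 0; i < 300000 && !s.disabled; i++)
            note_interrupt(16, &s, false);
        return 0;
    }

If I read that right, a line whose interrupts were still mostly handled during
its first window of 100,000 only trips in the second window, which would fit
the counter sitting at 200,000 rather than 100,000.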

>>>> And did you really check that no other device (even if currently not
>>>> having an interrupt handler bound) is sitting on IRQ 16?
>>> 
>>> In what way could I check that to be certain?
>>> (If it's not bound, lspci and /proc/interrupts will probably be
>>> insufficient for that?)

>> If the BIOS sets these up, lspci might still be of help. Consulting
>> the kernel's boot messages may also provide some hints. Beyond that
>> I'm not really sure how to figure it out.

>> Jan
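
For what it's worth, /sys/bus/pci/devices/*/irq should also show the IRQ each
PCI device would use even when no driver is bound (assuming I'm reading that
sysfs attribute right), so that can be cross-checked against the lspci output.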

> lspci gives only one device with IRQ 16, the sound controller:

> 00:14.2 Audio device: Advanced Micro Devices [AMD] nee ATI SBx00 Azalia (Intel HDA) (rev 40)
>         Subsystem: Micro-Star International Co., Ltd. Device 7640
>         Flags: bus master, slow devsel, latency 64, IRQ 16
>         Memory at fdbf8000 (64-bit, non-prefetchable) [size=16K]
>         Capabilities: [50] Power Management version 2
>         Kernel driver in use: snd_hda_intel

> On boot I do get a message "Already setup the GSI :16", however that seems
> to happen for multiple devices and irqs/gsis.

> I did have a go at copying and pasting the (hopefully) most relevant
> messages around IRQs and MSIs for the different stages, but my untrained
> eye doesn't spot a difference that I can relate to the problem.


> ##Cold boot of the host system
>     [   35.556728] xen: registering gsi 16 triggering 0 polarity 1
>     [   35.573157] xen: --> pirq=16 -> irq=16 (gsi=16)
>     (XEN) [2014-09-25 13:08:55.771] IOAPIC[0]: Set PCI routing entry (6-16 -> 0x89 -> IRQ 16 Mode:1 Active:1)
>     [   38.575661] pciback 0000:09:00.0: enabling device (0000 -> 0003)
>     [   38.593584] xen: registering gsi 32 triggering 0 polarity 1
>     [   38.610461] xen: --> pirq=32 -> irq=32 (gsi=32)
>     (XEN) [2014-09-25 13:08:58.809] IOAPIC[1]: Set PCI routing entry (7-8 -> 0xc9 -> IRQ 32 Mode:1 Active:1)
>     [   42.713230] xen: registering gsi 16 triggering 0 polarity 1
>     [   42.713233] Already setup the GSI :16
>     
>     
>     (XEN) [2014-09-25 13:29:04.111]    IRQ:  16 affinity:01 vec:89 type=IO-APIC-level   status=00000030 in-flight=0 domain-list=0: 16(---),
>     (XEN) [2014-09-25 13:29:04.370]    IRQ:  32 affinity:3f vec:c9 type=IO-APIC-level   status=00000002 mapped, unbound
>
>     (XEN) [2014-09-25 13:29:06.583]     IRQ 16 Vec137:
>     (XEN) [2014-09-25 13:29:06.597]       Apic 0x00, Pin 16: vec=89 delivery=Fixed dest=L status=0 polarity=1 irr=0 trig=L mask=0 dest_id:1
>     (XEN) [2014-09-25 13:29:06.978]     IRQ 32 Vec201:
>     (XEN) [2014-09-25 13:29:06.991]       Apic 0x01, Pin  8: vec=00 delivery=Fixed dest=L status=0 polarity=1 irr=0 trig=L mask=1 dest_id:63
>     
>     (XEN) [2014-09-25 13:29:19.108] 0000:09:00.0 - dom 0   - MSIs < >
>     (XEN) [2014-09-25 13:29:19.442] 0000:00:14.2 - dom 0   - MSIs < >
>     
> ##Start of the HVM guest with pci device passed through (dom1).
>     (XEN) [2014-09-25 13:30:32.831] io.c:280: d1: bind: m_gsi=32 g_gsi=36 dev=00.00.5 intx=0
>     
>     (XEN) [2014-09-25 13:35:10.930]    IRQ:  16 affinity:01 vec:89 type=IO-APIC-level   status=00000030 in-flight=0 domain-list=0: 16(---),
>     (XEN) [2014-09-25 13:35:11.189]    IRQ:  32 affinity:02 vec:c9 type=IO-APIC-level   status=00000010 in-flight=0 domain-list=1: 32(-M-),
>     (XEN) [2014-09-25 13:35:12.498]    IRQ:  84 affinity:04 vec:aa type=PCI-MSI         status=00000030 in-flight=0 domain-list=1: 87(---),
>     
>     (XEN) [2014-09-25 13:35:13.443]     IRQ 16 Vec137:
>     (XEN) [2014-09-25 13:35:13.456]       Apic 0x00, Pin 16: vec=89 delivery=Fixed dest=L status=0 polarity=1 irr=0 trig=L mask=0 dest_id:1
>     (XEN) [2014-09-25 13:35:13.837]     IRQ 32 Vec201:
>     (XEN) [2014-09-25 13:35:13.851]       Apic 0x01, Pin  8: vec=c9 delivery=Fixed dest=L status=0 polarity=1 irr=0 trig=L mask=0 dest_id:2
>     
>     (XEN) [2014-09-25 13:35:28.164] 0000:09:00.0 - dom 1   - MSIs < 84 >
>     (XEN) [2014-09-25 13:35:28.515] 0000:00:14.2 - dom 0   - MSIs < >
>     
>     (XEN) [2014-09-25 13:35:37.013]  MSI     84 vec=aa lowest  edge   assert  log lowest dest=00000004 mask=0/1/?
>     
> ##Shutdown of the HVM guest with pci device passed through, A happened.
>     (XEN) [2014-09-25 13:38:27.974]    IRQ:  16 affinity:01 vec:89 type=IO-APIC-level   status=00000030 in-flight=0 domain-list=0: 16(---),
>     (XEN) [2014-09-25 13:38:28.233]    IRQ:  32 affinity:02 vec:c9 type=IO-APIC-level   status=00000002 mapped, unbound
>     
>     (XEN) [2014-09-25 13:38:30.446]     IRQ 16 Vec137:
>     (XEN) [2014-09-25 13:38:30.459]       Apic 0x00, Pin 16: vec=89 delivery=Fixed dest=L status=0 polarity=1 irr=0 trig=L mask=0 dest_id:1
>     (XEN) [2014-09-25 13:38:30.840]     IRQ 32 Vec201:
>     (XEN) [2014-09-25 13:38:30.854]       Apic 0x01, Pin  8: vec=c9 delivery=Fixed dest=L status=0 polarity=1 irr=0 trig=L mask=1 dest_id:2
>     
>     (XEN) [2014-09-25 13:38:39.255] 0000:09:00.0 - dom 0   - MSIs < >
>     (XEN) [2014-09-25 13:38:39.590] 0000:00:14.2 - dom 0   - MSIs < >  
>     
> ##Start of the HVM guest with pci device passed through (dom2).
>     (XEN) [2014-09-25 13:39:07.963] io.c:280: d2: bind: m_gsi=32 g_gsi=36 dev=00.00.5 intx=0
>     (XEN) [2014-09-25 13:39:48.149] d32767v2: Unsupported MSI delivery mode 3 for Dom2
>     
>     (XEN) [2014-09-25 13:40:44.831]    IRQ:  16 affinity:01 vec:89 type=IO-APIC-level   status=00000030 in-flight=0 domain-list=0: 16(---),
>     (XEN) [2014-09-25 13:40:45.089]    IRQ:  32 affinity:02 vec:c9 type=IO-APIC-level   status=00000010 in-flight=0 domain-list=2: 32(-M-),
>     (XEN) [2014-09-25 13:40:46.398]    IRQ:  84 affinity:02 vec:b2 type=PCI-MSI         status=00000030 in-flight=0 domain-list=2: 87(---),
>     
>     (XEN) [2014-09-25 13:40:47.343]     IRQ 16 Vec137:
>     (XEN) [2014-09-25 13:40:47.357]       Apic 0x00, Pin 16: vec=89 delivery=Fixed dest=L status=0 polarity=1 irr=0 trig=L mask=0 dest_id:1
>     (XEN) [2014-09-25 13:40:47.738]     IRQ 32 Vec201:
>     (XEN) [2014-09-25 13:40:47.751]       Apic 0x01, Pin  8: vec=c9 delivery=Fixed dest=L status=0 polarity=1 irr=0 trig=L mask=0 dest_id:2
>     
>     (XEN) [2014-09-25 13:40:57.567] 0000:09:00.0 - dom 2   - MSIs < 84 >
>     (XEN) [2014-09-25 13:40:57.901] 0000:00:14.2 - dom 0   - MSIs < >
>     
>     (XEN) [2014-09-25 13:41:01.051]  MSI     84 vec=b2 lowest  edge   assert  log lowest dest=00000002 mask=0/1/?
>     
> ##Shutdown of the HVM guest with pci device passed through, B happened.
>     [ 2265.395971] irq 16: nobody cared (try booting with the "irqpoll" option)
>     <call trace>
>     [ 2266.234031] Disabling IRQ #16

>     (XEN) [2014-09-25 13:46:54.844]    IRQ:  16 affinity:01 vec:89 type=IO-APIC-level   status=00000030 in-flight=1 domain-list=0: 16(PMM),
>     (XEN) [2014-09-25 13:46:55.103]    IRQ:  32 affinity:02 vec:c9 type=IO-APIC-level   status=00000002 mapped, unbound
>     
>     (XEN) [2014-09-25 13:46:57.316]     IRQ 16 Vec137:
>     (XEN) [2014-09-25 13:46:57.330]       Apic 0x00, Pin 16: vec=89 delivery=Fixed dest=L status=0 polarity=1 irr=1 trig=L mask=0 dest_id:1
>     (XEN) [2014-09-25 13:46:57.711]     IRQ 32 Vec201:
>     (XEN) [2014-09-25 13:46:57.724]       Apic 0x01, Pin  8: vec=c9 delivery=Fixed dest=L status=1 polarity=1 irr=0 trig=L mask=1 dest_id:2

>     (XEN) [2014-09-25 13:47:08.688] 0000:09:00.0 - dom 0   - MSIs < >
>     (XEN) [2014-09-25 13:47:09.022] 0000:00:14.2 - dom 0   - MSIs < >


Hrmm, there seems to be at least one omission in the debug-key logging code: a
delivery mode other than "fixed" or "lowest" is never shown as such:
msi.c:1311
                data & MSI_DATA_DELIVERY_LOWPRI ? "lowest" : "fixed",

Anything else is simply labelled "fixed" or "lowest" depending on bit 8 alone,
since that test only checks one bit of the three-bit delivery-mode field. In
particular the reserved mode 3 (binary 011) has bit 8 set, so it would be shown
as "lowest" too; the "lowest" in the debug-key output above therefore doesn't
conclusively rule mode 3 out. But how does it become delivery mode 3 during
guest start, at the moment vmsi_deliver() is called, which gives the
"Unsupported MSI delivery mode 3" message?
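
To illustrate what that one-bit test can and cannot distinguish, here is a
small stand-alone toy decoder for the delivery-mode field (bits 10:8 of the
MSI data register; mode names taken from the Intel/PCI layout, not from Xen's
headers):

    #include <stdio.h>
    #include <stdint.h>

    static const char *const mode_name[8] = {
        "fixed", "lowest", "SMI", "reserved(3)",
        "NMI", "INIT", "reserved(6)", "ExtINT",
    };

    static void decode_msi_data(uint32_t data)
    {
        unsigned int mode = (data >> 8) & 0x7;  /* delivery mode, bits 10:8 */

        printf("data=0x%08x: mode %u (%s); a bit-8-only test prints \"%s\"\n",
               data, mode, mode_name[mode],
               (data & (1u << 8)) ? "lowest" : "fixed");
    }

    int main(void)
    {
        decode_msi_data(0xaa | (1u << 8));  /* vec 0xaa, lowest priority */
        decode_msi_data(0xaa | (3u << 8));  /* vec 0xaa, reserved mode 3 */
        return 0;
    }

So a data value carrying the reserved mode 3 would come out of the debug key
as "lowest" as well.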


But this seems to be a red herring for the "irq 16: nobody cared" case anyway:
on retesting I just had it occur on the very first boot of the HVM guest after
the host booted, without the "Unsupported MSI delivery mode 3" message
appearing at all.

So it might be slightly related, but it's probably not the cause .. *sigh*.

--
Sander


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

