
Re: [Xen-devel] Xen-unstable: pci-passthrough "irq 16: nobody cared" on HVM guest shutdown on irq of device not passed through.

Wednesday, October 8, 2014, 2:56:53 PM, you wrote:

> On Tue, Oct 07, 2014 at 03:50:03PM +0100, Jan Beulich wrote:
>> >>> On 07.10.14 at 15:41, <konrad.wilk@xxxxxxxxxx> wrote:
>> > Could you attach also the full dmesg under baremetal with 'debug' and all
>> > kinds of debug enabled ? That should help a bit in figuring out why
>> > they get MSIs under baremetal but legacy interrupts under Xen.
>> The messages he sent don't really suggest that. The legacy pin
>> based IRQ always gets set up when a device gets enabled, no
>> matter whether in the end it would actually get used. And afaict
>> other messages clearly hint at MSI being used for the PCIe stuff.

> Correct. I fear that in domain0 we have set up an event for this
> particular GSI (16) which is also in use in the guest (and then somehow
> we did not tear this down when the PCIe device set up MSI).

> Xen will send events to both domains - and since domain0 does not
> have an IRQ handler for it - it will activate its anti-IRQ storm
> routine and disable the IRQ line.
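For reference, the "nobody cared" disable Konrad describes is Linux's spurious-IRQ detection (kernel/irq/spurious.c): roughly, if almost all of the last 100,000 interrupts on a line went unhandled, the line gets masked. A minimal Python model of that counting logic (the thresholds match the kernel's note_interrupt(), but this is a simplification, not the real code):

```python
# Simplified model of Linux's spurious-IRQ ("irq N: nobody cared") logic.
# The real code is kernel/irq/spurious.c:note_interrupt(); this mimics only
# the counting, not the timing heuristics or poll-based recovery.

class IrqDesc:
    def __init__(self, num):
        self.num = num
        self.irq_count = 0        # total interrupts seen on this line
        self.irqs_unhandled = 0   # interrupts no handler claimed
        self.disabled = False

    def note_interrupt(self, handled):
        self.irq_count += 1
        if not handled:
            self.irqs_unhandled += 1
        if self.irq_count >= 100000:          # evaluate once per 100k interrupts
            if self.irqs_unhandled > 99900:   # nearly all unhandled -> storm
                print(f"irq {self.num}: nobody cared")
                self.disabled = True          # the kernel masks the line here
            self.irq_count = 0
            self.irqs_unhandled = 0

# dom0 keeps receiving events for GSI 16 but has no handler for them:
desc = IrqDesc(16)
for _ in range(100000):
    desc.note_interrupt(handled=False)
assert desc.disabled
```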

>> Jan

Hi Jan / Konrad / David,

(added David since Konrad seems quite busy managing all his hats ;-) )

I'm still seeing pci-passthrough behaviour that I can't place; this time it's on:

- Intel hardware (instead of AMD).
- PV-guest (instead of HVM).
- Xen-unstable as of today.
- Linux 3.18-rc1 kernel.
- Passed through device is again a sound controller (00:1b.0).

- I added some debug printk's to the kernel; the diff is attached.
- I disabled all sound modules in the dom0 kernel config to rule out any
  interference from those drivers.
- The device passed through is seized by pciback on host boot.

I am seeing a couple of "oddities" (although I could be misinterpreting stuff):

- When MSI gets enabled during pci-passthrough as the guest boots, dev->irq
  (and with it the IRQ reported by lspci) changes from 22 to 68. It probably
  isn't/shouldn't be used, since we should be using MSI (55) now.
  However, this "68" doesn't seem to be listed anywhere as an IRQ: not in dom0's
  /proc/interrupts, and not in the hypervisor output of the debug-keys "i", "I", "M".
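To double-check that observation mechanically, a throwaway helper like this (hypothetical, not part of the attached debug.patch; the sample text below is made up) can scan a saved /proc/interrupts for a given IRQ number:

```python
import re

def irqs_listed(proc_interrupts_text):
    """Return the set of numeric IRQ lines present in /proc/interrupts output."""
    irqs = set()
    for line in proc_interrupts_text.splitlines():
        m = re.match(r"\s*(\d+):", line)   # data lines look like " 22:  1234  ..."
        if m:
            irqs.add(int(m.group(1)))
    return irqs

# Illustrative snippet only; real output has one column per CPU.
sample = """           CPU0
  22:       1234   xen-pirq   ioapic-level   some_device
  55:       5678   xen-pirq   msi            some_device
"""
assert 22 in irqs_listed(sample)
assert 68 not in irqs_listed(sample)   # the stale dev->irq value is absent
```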

- After the guest is fully booted, the hypervisor report shows an IRQ mapped to
  the guest, while 22 is still mapped to dom0:
    (XEN) [2014-10-21 02:11:03]    IRQ:  22 affinity:8 vec:d8 type=IO-APIC-level   status=00000030 in-flight=0 domain-list=0: 22(---),
    (XEN) [2014-10-21 02:11:03]    IRQ:  32 affinity:4 vec:69 type=PCI-MSI         status=00000030 in-flight=0 domain-list=1: 55(---),
  MSI 55 corresponds to what dom0 reports, but I can't place IRQ 32: it's not 22,
  and it's not 68.
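To correlate those dump lines, a small parser for the debug-key "i" output can map machine IRQ -> (owning domain, pirq). This is a sketch; the exact field layout can differ between Xen versions, and multi-domain lists aren't handled:

```python
import re

# Parse lines like:
#   (XEN) [ts] IRQ:  32 ... type=PCI-MSI status=... domain-list=1: 55(---),
LINE = re.compile(r"IRQ:\s*(\d+).*?domain-list=(\d+):\s*(\d+)")

def parse_irq_dump(text):
    """Map machine IRQ -> (owning domain, pirq inside that domain)."""
    table = {}
    for m in LINE.finditer(text):
        mirq, dom, pirq = (int(g) for g in m.groups())
        table[mirq] = (dom, pirq)
    return table

dump = """
(XEN) [2014-10-21 02:11:03]    IRQ:  22 affinity:8 vec:d8 type=IO-APIC-level status=00000030 in-flight=0 domain-list=0: 22(---),
(XEN) [2014-10-21 02:11:03]    IRQ:  32 affinity:4 vec:69 type=PCI-MSI status=00000030 in-flight=0 domain-list=1: 55(---),
"""
table = parse_irq_dump(dump)
assert table[22] == (0, 22)   # GSI 22 still owned by dom0
assert table[32] == (1, 55)   # machine IRQ 32 backs the guest's MSI pirq 55
```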

- When looking in the guest's dmesg I see:
     [    6.757131] snd_hda_intel 0000:00:00.0: enabling device (0000 -> 0002)
     [    6.764141] snd_hda_intel 0000:00:00.0: Xen PCI mapped GSI22 to IRQ34
     [    6.768823] snd_hda_intel 0000:00:00.0: enabling bus mastering

     [   14.303460] ALSA device list:
     [   14.346789]   #0: Loopback 1
     [   14.390610]   #1: HDA Intel PCH at 0xf7d30000 irq 35

     ~# lspci -v
    00:00.0 Audio device: Intel Corporation 7 Series/C210 Series Chipset Family High Definition Audio Controller (rev 04)
            Subsystem: Intel Corporation Device 204f
            Flags: bus master, fast devsel, latency 0, IRQ 35
            Memory at f7d30000 (64-bit, non-prefetchable) [size=16K]
            Capabilities: [50] Power Management version 2
            Capabilities: [60] MSI: Enable+ Count=1/1 Maskable- 64bit+
            Capabilities: [70] Express Root Complex Integrated Endpoint, MSI 00
            Capabilities: [100] Virtual Channel
            Capabilities: [130] Root Complex Link
            Kernel driver in use: snd_hda_intel

  Which also strikes me as odd: GSI22 seems to match the initial value in
  dom0 and the hypervisor, and gets mapped to IRQ 34.
  But why the change from IRQ 34 to IRQ 35?

- When the guest shuts down,
  "drivers/xen/xen-pciback/pciback_ops.c:xen_pcibk_disable_msi()" is not called
  in dom0, which seems odd to me.
  (I would expect symmetry with
  "drivers/xen/xen-pciback/pciback_ops.c:xen_pcibk_enable_msi()", which does get
  called on guest start.)

- When I destroy the guest (instead of shutting it down),
  "xen_pcibk_disable_msi()" does get called, but then I get an error from the
  toolstack: it reads the /sys/bus/pci/devices/<BDF>/irq value, which is still
  "68", and that is not assigned to the guest:
    libxl: error: libxl_pci.c:1319:do_pci_remove: xc_physdev_unmap_pirq irq=68: Invalid argument
    libxl: error: libxl_pci.c:1323:do_pci_remove: xc_domain_irq_permission irq=68: Invalid argument

  On guest start it called xc_physdev_map_pirq and xc_domain_irq_permission
  for irq=22.
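That "Invalid argument" is consistent with libxl asking the hypervisor to unmap a pirq that was never mapped for this domain: the sysfs irq file now says 68, but only 22 (and the MSI pirq) were ever granted. A toy model of that bookkeeping (purely illustrative, not libxl or Xen code):

```python
class PirqTable:
    """Toy model of the per-domain pirq mappings the hypervisor tracks."""
    def __init__(self):
        self.mapped = {}   # domid -> set of irqs mapped for that domain

    def map_pirq(self, domid, irq):
        self.mapped.setdefault(domid, set()).add(irq)
        return 0

    def unmap_pirq(self, domid, irq):
        if irq not in self.mapped.get(domid, set()):
            return -22     # -EINVAL: irq was never mapped for this domain
        self.mapped[domid].remove(irq)
        return 0

xen = PirqTable()
xen.map_pirq(1, 22)                  # toolstack mapped GSI 22 at guest start
assert xen.unmap_pirq(1, 68) == -22  # sysfs now says 68 -> "Invalid argument"
assert xen.unmap_pirq(1, 22) == 0    # the irq that was actually mapped unmaps fine
```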

Extra / better debug-patches are of course always welcome ..

I have attached the logs from the two cases:

- general
    - lspci-tv.txt
    - debug.patch
    - dotconfig (.config of dom0 kernel)
- from the sequence: host boot, guest start, guest shutdown
    - dmesg-dom0-shutdown.txt
    - xl-dmesg-shutdown.txt
    - lspci-vvvknn-dom0-before.txt (output lspci in dom0 before starting guest)
    - lspci-vvvknn-dom0-during.txt (output lspci in dom0 with guest fully booted)
- from the sequence: host boot, guest start, guest destroy
    - dmesg-dom0-destroy.txt
    - xl-dmesg-destroy.txt

Attachment: debug.patch
Description: Binary data

Attachment: dmesg-dom0-destroy.txt
Description: Text document

Attachment: dmesg-dom0-shutdown.txt
Description: Text document

Attachment: dotconfig
Description: Binary data

Attachment: lspci-tv.txt
Description: Text document

Attachment: lspci-vvvknn-dom0-before.txt
Description: Text document

Attachment: lspci-vvvknn-dom0-during.txt
Description: Text document

Attachment: xl-dmesg-destroy.txt
Description: Text document

Attachment: xl-dmesg-shutdown.txt
Description: Text document

Xen-devel mailing list


