RE: [Xen-devel] IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
I'm still trying to track down the problem of lost interrupts when I change /proc/irq/<irq#>/smp_affinity in domU. I'm now at Xen 3.5-unstable changeset 20320 with pvops dom0 2.6.31.1.

In domU, my PCI devices are at virtual slots 5, 6, 7 and 8, so I use "lspci -vv" to get their respective IRQs and MSI message address/data. I can also see their IRQs in /proc/interrupts (I'm not showing all 16 CPUs):

  lspci -vv -s 00:05.0 | grep IRQ; lspci -vv -s 00:06.0 | grep IRQ; lspci -vv -s 00:07.0 | grep IRQ; lspci -vv -s 00:08.0 | grep IRQ
        Interrupt: pin A routed to IRQ 48
        Interrupt: pin B routed to IRQ 49
        Interrupt: pin C routed to IRQ 50
        Interrupt: pin D routed to IRQ 51

  lspci -vv -s 00:05.0 | grep Address; lspci -vv -s 00:06.0 | grep Address; lspci -vv -s 00:07.0 | grep Address; lspci -vv -s 00:08.0 | grep Address
        Address: 00000000fee00000  Data: 4071    (vector=113)
        Address: 00000000fee00000  Data: 4089    (vector=137)
        Address: 00000000fee00000  Data: 4099    (vector=153)
        Address: 00000000fee00000  Data: 40a9    (vector=169)

  egrep '(HW_TACHYON|CPU0)' /proc/interrupts
            CPU0       CPU1
   48:   1571765          0   PCI-MSI-edge   HW_TACHYON
   49:   3204403          0   PCI-MSI-edge   HW_TACHYON
   50:   2643008          0   PCI-MSI-edge   HW_TACHYON
   51:   3270322          0   PCI-MSI-edge   HW_TACHYON

In dom0, the same devices show up as one 4-function device (0:07:0.0, 0:07:0.1, 0:07:0.2, 0:07:0.3), and I again use "lspci -vv" to get the IRQs and MSI info:

  lspci -vv -s 0:07:0.0 | grep IRQ; lspci -vv -s 0:07:0.1 | grep IRQ; lspci -vv -s 0:07:0.2 | grep IRQ; lspci -vv -s 0:07:0.3 | grep IRQ
        Interrupt: pin A routed to IRQ 11
        Interrupt: pin B routed to IRQ 10
        Interrupt: pin C routed to IRQ 7
        Interrupt: pin D routed to IRQ 5

  lspci -vv -s 0:07:0.0 | grep Address; lspci -vv -s 0:07:0.1 | grep Address; lspci -vv -s 0:07:0.2 | grep Address; lspci -vv -s 0:07:0.3 | grep Address
        Address: 00000000fee00000  Data: 403c    (vector=60)
        Address: 00000000fee00000  Data: 4044    (vector=68)
        Address: 00000000fee00000  Data: 404c    (vector=76)
        Address: 00000000fee00000  Data: 4054    (vector=84)
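(For reference, the "(vector=...)" annotations above are just me decoding the standard x86 MSI layout by hand, assuming I have it right: the vector sits in data[7:0] and the destination APIC ID in address[19:12]. A minimal sketch of that decoding, with the first domU pair plugged in:)

  /* Sketch: decode an lspci MSI Address/Data pair per the standard
   * x86 MSI layout (vector = data[7:0], dest APIC ID = addr[19:12]). */
  #include <stdio.h>
  #include <stdint.h>

  int main(void)
  {
      uint64_t addr = 0x00000000fee00000ULL;   /* lspci "Address:" */
      uint32_t data = 0x4071;                  /* lspci "Data:"    */

      printf("vector  = %u\n", data & 0xffu);               /* 113 (0x71) */
      printf("dest ID = %u\n", (unsigned)((addr >> 12) & 0xff)); /* 0     */
      return 0;
  }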
I used the "Ctrl-a" "Ctrl-a" "Ctrl-a" "i" key sequence on the Xen console to print the guest interrupt information and the PCI devices. The vectors shown here are actually the vectors as seen from dom0, so I don't understand the label "Guest interrupt information." Meanwhile, the IRQs (74 - 77) match neither those from dom0 (11, 10, 7, 5) nor those from domU (48, 49, 50, 51) as seen by "lspci -vv", but they do match those reported by the "Ctrl-a" key sequence followed by "Q" for PCI devices:

(XEN) Guest interrupt information:
(XEN)   IRQ: 74, IRQ affinity:0x00000001, Vec: 60 type=PCI-MSI status=00000010 in-flight=0 domain-list=1: 79(----),
(XEN)   IRQ: 75, IRQ affinity:0x00000001, Vec: 68 type=PCI-MSI status=00000010 in-flight=0 domain-list=1: 78(----),
(XEN)   IRQ: 76, IRQ affinity:0x00000001, Vec: 76 type=PCI-MSI status=00000010 in-flight=0 domain-list=1: 77(----),
(XEN)   IRQ: 77, IRQ affinity:0x00000001, Vec: 84 type=PCI-MSI status=00000010 in-flight=0 domain-list=1: 76(----),
(XEN) ==== PCI devices ====
(XEN) 07:00.3 - dom 1   - MSIs < 77 >
(XEN) 07:00.2 - dom 1   - MSIs < 76 >
(XEN) 07:00.1 - dom 1   - MSIs < 75 >
(XEN) 07:00.0 - dom 1   - MSIs < 74 >

If I look at /var/log/xen/qemu-dm-dpm.log, I see these 4 lines, whose pirq's match those in the last column of the guest interrupt information:

pt_msi_setup: msi mapped with pirq 4f (79)
pt_msi_setup: msi mapped with pirq 4e (78)
pt_msi_setup: msi mapped with pirq 4d (77)
pt_msi_setup: msi mapped with pirq 4c (76)

The gvec's (71, 89, 99, a9) match the vectors as seen by lspci in domU:

pt_msgctrl_reg_write: guest enabling MSI, disable MSI-INTx translation
pt_msi_update: Update msi with pirq 4f gvec 71 gflags 0
pt_msgctrl_reg_write: guest enabling MSI, disable MSI-INTx translation
pt_msi_update: Update msi with pirq 4e gvec 89 gflags 0
pt_msgctrl_reg_write: guest enabling MSI, disable MSI-INTx translation
pt_msi_update: Update msi with pirq 4d gvec 99 gflags 0
pt_msgctrl_reg_write: guest enabling MSI, disable MSI-INTx translation
pt_msi_update: Update msi with pirq 4c gvec a9 gflags 0

I see these same pirq's in the output of "xm dmesg":

(XEN) [VT-D]iommu.c:1289:d0 domain_context_unmap:PCIe: bdf = 7:0.0
(XEN) [VT-D]iommu.c:1175:d0 domain_context_mapping:PCIe: bdf = 7:0.0
(XEN) [VT-D]io.c:291:d0 VT-d irq bind: m_irq = 4f device = 5 intx = 0
(XEN) [VT-D]iommu.c:1289:d0 domain_context_unmap:PCIe: bdf = 7:0.1
(XEN) [VT-D]iommu.c:1175:d0 domain_context_mapping:PCIe: bdf = 7:0.1
(XEN) [VT-D]io.c:291:d0 VT-d irq bind: m_irq = 4e device = 6 intx = 0
(XEN) [VT-D]iommu.c:1289:d0 domain_context_unmap:PCIe: bdf = 7:0.2
(XEN) [VT-D]iommu.c:1175:d0 domain_context_mapping:PCIe: bdf = 7:0.2
(XEN) [VT-D]io.c:291:d0 VT-d irq bind: m_irq = 4d device = 7 intx = 0
(XEN) [VT-D]iommu.c:1289:d0 domain_context_unmap:PCIe: bdf = 7:0.3
(XEN) [VT-D]iommu.c:1175:d0 domain_context_mapping:PCIe: bdf = 7:0.3
(XEN) [VT-D]io.c:291:d0 VT-d irq bind: m_irq = 4c device = 8 intx = 0

The machine_gsi's match the pirq's, while the m_irq's match the IRQs from lspci in dom0. What are the guest_gsi's? (See my guess after the log below.)

(XEN) io.c:316:d0 pt_irq_destroy_bind_vtd: machine_gsi=79 guest_gsi=36, device=5, intx=0.
(XEN) io.c:371:d0 XEN_DOMCTL_irq_unmapping: m_irq = 0x4f device = 0x5 intx = 0x0
(XEN) [VT-D]io.c:291:d0 VT-d irq bind: m_irq = b device = 5 intx = 0
(XEN) io.c:316:d0 pt_irq_destroy_bind_vtd: machine_gsi=78 guest_gsi=40, device=6, intx=0.
(XEN) io.c:371:d0 XEN_DOMCTL_irq_unmapping: m_irq = 0x4e device = 0x6 intx = 0x0
(XEN) [VT-D]io.c:291:d0 VT-d irq bind: m_irq = a device = 6 intx = 0
(XEN) io.c:316:d0 pt_irq_destroy_bind_vtd: machine_gsi=77 guest_gsi=44, device=7, intx=0.
(XEN) io.c:371:d0 XEN_DOMCTL_irq_unmapping: m_irq = 0x4d device = 0x7 intx = 0x0
(XEN) [VT-D]io.c:291:d0 VT-d irq bind: m_irq = 7 device = 7 intx = 0
(XEN) io.c:316:d0 pt_irq_destroy_bind_vtd: machine_gsi=76 guest_gsi=17, device=8, intx=0.
(XEN) io.c:371:d0 XEN_DOMCTL_irq_unmapping: m_irq = 0x4c device = 0x8 intx = 0x0
(XEN) [VT-D]io.c:291:d0 VT-d irq bind: m_irq = 5 device = 8 intx = 0
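To partially answer my own question: if I'm reading the source right, the guest_gsi appears to come from hvm_pci_intx_gsi() (in xen/include/asm-x86/hvm/irq.h in my tree), which derives it from the virtual slot and INTx pin, offset past the 16 legacy ISA IRQs. A sketch of that formula, which does reproduce the values in the log above:

  /* Sketch of hvm_pci_intx_gsi() as I read it -- guest GSI derived
   * "barber-pole" style from the virtual slot (device) and INTx pin. */
  #include <stdio.h>

  static unsigned int hvm_pci_intx_gsi(unsigned int dev, unsigned int intx)
  {
      return ((((dev << 2) + (dev >> 3) + intx) & 31) + 16);
  }

  int main(void)
  {
      unsigned int dev;
      for (dev = 5; dev <= 8; dev++)               /* my virtual slots */
          printf("device=%u intx=0 -> guest_gsi=%u\n",
                 dev, hvm_pci_intx_gsi(dev, 0));
      return 0;   /* prints 36, 40, 44, 17 -- matching the log */
  }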
So now, when I finally get to the part where I change smp_affinity, I see a corresponding change in the guest interrupt information, in qemu-dm-dpm.log, and in lspci on both dom0 and domU:

cat /proc/irq/48/smp_affinity
ffff
echo 2 > /proc/irq/48/smp_affinity
cat /proc/irq/48/smp_affinity
0002

From "Guest interrupt information" (the IRQ affinity changed from 1 to 2, and the vector changed from 60 to 92):

(XEN)   IRQ: 74, IRQ affinity:0x00000002, Vec: 92 type=PCI-MSI status=00000010 in-flight=1 domain-list=1: 79(---M),

From qemu-dm-dpm.log (what is the significance of gflags 2? see my guess below):

pt_msi_update: Update msi with pirq 4f gvec 71 gflags 2
pt_msi_update: Update msi with pirq 4f gvec b1 gflags 2

In domU (the dest ID changed from 0 to 2, and the vector changed from 0x71 to 0xb1):

lspci -vv -s 00:05.0 | grep Address
        Address: 00000000fee02000  Data: 40b1

In dom0 (the vector changed from 0x3c (60 decimal) to 0x5c (92 decimal)):

lspci -vv -s 0:07:0.0 | grep Address
        Address: 00000000fee00000  Data: 405c
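On the gflags question: if I'm reading qemu-dm's hw/pt-msi.c correctly, gflags is just the guest's MSI routing fields packed into one word, with the destination ID in the low 8 bits -- so gflags 2 would simply mean "destination APIC ID 2", matching the fee02000 address domU now programs. A sketch of my reading (shift values paraphrased from the source; please correct me if the layout is different):

  /* Sketch of qemu-dm's __get_msi_gflags() as I read it: pack the
   * guest MSI address/data routing fields, dest ID in bits 0-7. */
  #include <stdio.h>
  #include <stdint.h>

  static uint32_t get_msi_gflags(uint32_t data, uint64_t addr)
  {
      uint32_t dest_id = (addr >> 12) & 0xff; /* destination APIC ID */
      uint32_t rh      = (addr >> 3)  & 0x1;  /* redirection hint    */
      uint32_t dm      = (addr >> 2)  & 0x1;  /* destination mode    */
      uint32_t dlv     = (data >> 8)  & 0x7;  /* delivery mode       */
      uint32_t trg     = (data >> 15) & 0x1;  /* trigger mode        */

      return dest_id | (rh << 8) | (dm << 9) | (dlv << 12) | (trg << 15);
  }

  int main(void)
  {
      /* domU's MSI values after the affinity change: */
      printf("gflags = %#x\n", get_msi_gflags(0x40b1, 0x00000000fee02000ULL));
      return 0;   /* prints 0x2: dest_id=2, all else zero -- matching the log */
  }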
I'm confused why there are 4 sets of IRQs: dom0 lspci [11, 10, 7, 5], domU lspci and /proc/interrupts [48, 49, 50, 51], pirq [76, 77, 78, 79], and guest interrupt information [74, 75, 76, 77]. Are the changes resulting from changing the IRQ smp_affinity consistent with what is expected? Any recommendation on where to go from here?

Thanks in advance.

Dante

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel