
RE: [Xen-devel] pv-ops domU not working with MSI interrupts on Nehalem




> (XEN) [VT-D]iommu.c:824: iommu_fault_status: Primary Pending Fault
> (XEN) [VT-D]iommu.c:799: DMAR:[DMA Write] Request device [07:00.0]
> fault addr c00000, iommu reg = ffff82c3fff57000

The driver gets the physical address 0xc0049c through the kernel function virt_to_phys() and programs it into the Tachyon chip as the DMA address. This address translation also involves the SWIOTLB library. Is there any known issue with SWIOTLB in the pvops kernel?
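
As a cross-check of the addresses above, here is a small Python sketch. It is not part of the driver, and the helper names `page_frame` and `vtd_indices` are made up for illustration. It assumes 4 KiB pages and 9 index bits per page-table level, which is consistent with the gmfn and l2_index values in the Xen dump. It checks whether the address handed to the chip (0xc0049c) lies in the same page as the VT-d fault address (c00000), and splits that frame number into the table indices that print_vtd_entries reports.

```python
PAGE_SHIFT = 12      # 4 KiB pages (assumption)
LEVEL_BITS = 9       # 512 entries per page-table level (assumption)
LEVEL_MASK = (1 << LEVEL_BITS) - 1


def page_frame(addr: int) -> int:
    """Return the page frame number that contains addr."""
    return addr >> PAGE_SHIFT


def vtd_indices(gmfn: int) -> dict:
    """Split a frame number into l4/l3/l2/l1 page-table indices."""
    return {
        "l4": (gmfn >> (3 * LEVEL_BITS)) & LEVEL_MASK,
        "l3": (gmfn >> (2 * LEVEL_BITS)) & LEVEL_MASK,
        "l2": (gmfn >> LEVEL_BITS) & LEVEL_MASK,
        "l1": gmfn & LEVEL_MASK,
    }


driver_addr = 0xC0049C   # value the driver got from virt_to_phys()
fault_addr = 0xC00000    # "fault addr" from the VT-d message

same_page = page_frame(driver_addr) == page_frame(fault_addr)
indices = vtd_indices(page_frame(fault_addr))
print(hex(page_frame(fault_addr)), same_page, indices)
```

The two frames match (gmfn c00, l2 index 6), so the chip appears to be writing to exactly the address the driver gave it, and the fault says that frame has no writable mapping for the device. One thing that may be worth checking: in a PV guest virt_to_phys() returns a pseudo-physical address rather than a machine address, so the value may need to come from the DMA mapping API (for example pci_map_single() or dma_map_single()) instead.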
 
 
 
 
-Ray
 




-----Original Message-----
From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@xxxxxxxxxx]
Sent: Tuesday, September 28, 2010 9:19 AM
To: Lin, Ray; JBeulich@xxxxxxxxxx
Cc: Bruce Edge; Jiang, Yunhong; xen-devel@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Xen-devel] pv-ops domU not working with MSI interrupts on Nehalem

On Tue, Sep 28, 2010 at 10:08:57AM -0600, Lin, Ray wrote:
>     I just checked the "xen dmesg" output. It looks like DMA/IOMMU is the root cause of this issue. To indicate the source of an interrupt, the Tachyon chip needs to do a DMA write to a dword memory location. Which iommu option do you recommend?

Let's get Jan involved in this discussion.

Jan, would some of your patches that inhibit the MSI write affect this in a PV guest?

>
> (XEN) [VT-D]iommu.c:824: iommu_fault_status: Primary Pending Fault
> (XEN) [VT-D]iommu.c:799: DMAR:[DMA Write] Request device [07:00.0]
> fault addr c00000, iommu reg = ffff82c3fff57000
> (XEN) DMAR:[fault reason 05h] PTE Write access is not set
> (XEN) print_vtd_entries: iommu = ffff83019fffa370 bdf = 7:0.0 gmfn = c00
> (XEN)     root_entry = ffff83019ff70000
> (XEN)     root_entry[7] = 19cf52001
> (XEN)     context = ffff83019cf52000
> (XEN)     context[0] = 102_706dc005
> (XEN)     l4 = ffff8300706dc000
> (XEN)     l4_index = 0
> (XEN)     l4[0] = 706db003
> (XEN)     l3 = ffff8300706db000
> (XEN)     l3_index = 0
> (XEN)     l3[0] = 706da003
> (XEN)     l2 = ffff8300706da000
> (XEN)     l2_index = 6
> (XEN)     l2[6] = 0
>
>
> -Ray
>
>
> ________________________________
> From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
> [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Bruce Edge
> Sent: Monday, September 27, 2010 9:46 PM
> To: Jiang, Yunhong
> Cc: xen-devel@xxxxxxxxxxxxxxxxxxx; Konrad Rzeszutek Wilk
> Subject: Re: [Xen-devel] pv-ops domU not working with MSI interrupts
> on Nehalem
>
> On Mon, Sep 27, 2010 at 8:26 PM, Jiang, Yunhong <yunhong.jiang@xxxxxxxxx> wrote:
> "xm dmesg" should give Xen's boot log, and sometimes it contains helpful information, especially if loglvl and guest_loglvl are set to all.
>
> I looked at the xm dmesg output and there's nothing more than what I already provided, aside from a bunch of commands from me poking at it.
>
> -Bruce
>
>
> Thanks
> --jyh
>
> From: Bruce Edge [mailto:bruce.edge@xxxxxxxxx]
> Sent: Tuesday, September 28, 2010 11:16 AM
> To: Jiang, Yunhong
> Cc: Konrad Rzeszutek Wilk; xen-devel@xxxxxxxxxxxxxxxxxxx
>
> Subject: Re: [Xen-devel] pv-ops domU not working with MSI interrupts
> on Nehalem
>
> On Mon, Sep 27, 2010 at 6:15 PM, Jiang, Yunhong <yunhong.jiang@xxxxxxxxx> wrote:
> Is the 07:0.0 your tachyon device? The VT-d fault is suspicious.
>
> Yes, there is one quad-port card in this system:
>
> 07:00.0 Fibre Channel: PMC-Sierra Inc. Device 8032 (rev 08)
> 07:00.1 Fibre Channel: PMC-Sierra Inc. Device 8032 (rev 08)
> 07:00.2 Fibre Channel: PMC-Sierra Inc. Device 8032 (rev 08)
> 07:00.3 Fibre Channel: PMC-Sierra Inc. Device 8032 (rev 08)
>
>
> Also is it possible to share the xen output?
>
> I attached the dom0 boot output. Let me know if you want something else.
>
> Also, here's the dom0 console output upon starting the VM. This lockdep warning started with the release of 2.6.32.21. Note that I'm running the same kernel for the domU and dom0.
>
> [ 1817.684097] ------------[ cut here ]------------
> [ 1817.684113] WARNING: at kernel/lockdep.c:2323 trace_hardirqs_on_caller+0x12f/0x190()
> [ 1817.684119] Hardware name: ProLiant DL380 G6
> [ 1817.684122] Modules linked in: xt_physdev ipv6 osa_mfgdom0 xenfs xen_gntdev fbcon tileblit font bitblit softcursor xen_evtchn xen_pciback radeon ttm drm_kms_helper tun drm i2c_algo_bit ipmi_si i2c_core ipmi_msghandler joydev serio_raw hpwdt hpilo bridge stp llc usbhid hid cciss usb_storage
> [ 1817.684190] Pid: 11, comm: xenwatch Not tainted 2.6.32.21-xenoprof-1 #1
> [ 1817.684195] Call Trace:
> [ 1817.684197]  <IRQ>  [<ffffffff810aa18f>] ? trace_hardirqs_on_caller+0x12f/0x190
> [ 1817.684209]  [<ffffffff8106bed0>] warn_slowpath_common+0x80/0xd0
> [ 1817.684217]  [<ffffffff815f2b80>] ? _spin_unlock_irq+0x30/0x40
> [ 1817.684223]  [<ffffffff8106bf34>] warn_slowpath_null+0x14/0x20
> [ 1817.684229]  [<ffffffff810aa18f>] trace_hardirqs_on_caller+0x12f/0x190
> [ 1817.684234]  [<ffffffff810aa1fd>] trace_hardirqs_on+0xd/0x10
> [ 1817.684240]  [<ffffffff815f2b80>] _spin_unlock_irq+0x30/0x40
> [ 1817.684266]  [<ffffffff813c4fc5>] add_to_net_schedule_list_tail+0x85/0xd0
> [ 1817.684271]  [<ffffffff813c6216>] netif_be_int+0x36/0x160
> [ 1817.684278]  [<ffffffff810e10d0>] handle_IRQ_event+0x70/0x180
> [ 1817.684284]  [<ffffffff810e36e9>] handle_edge_irq+0xc9/0x170
> [ 1817.684291]  [<ffffffff813b8d7f>] __xen_evtchn_do_upcall+0x1bf/0x1f0
> [ 1817.684297]  [<ffffffff813b92fd>] xen_evtchn_do_upcall+0x3d/0x60
> [ 1817.684304]  [<ffffffff8101647e>] xen_do_hypervisor_callback+0x1e/0x30
> [ 1817.684308]  <EOI>  [<ffffffff8100940a>] ? hypercall_page+0x40a/0x1010
> [ 1817.684319]  [<ffffffff8100940a>] ? hypercall_page+0x40a/0x1010
> [ 1817.684325]  [<ffffffff813bce54>] ? xb_write+0x1e4/0x290
> [ 1817.684330]  [<ffffffff813bd8ca>] ? xs_talkv+0x6a/0x1f0
> [ 1817.684336]  [<ffffffff813bd8d8>] ? xs_talkv+0x78/0x1f0
> [ 1817.684341]  [<ffffffff813bdbcd>] ? xs_single+0x4d/0x60
> [ 1817.684346]  [<ffffffff813be502>] ? xenbus_read+0x52/0x80
> [ 1817.684352]  [<ffffffff813c87fc>] ? frontend_changed+0x48c/0x770
> [ 1817.684358]  [<ffffffff813bf76d>] ? xenbus_otherend_changed+0xdd/0x1b0
> [ 1817.684365]  [<ffffffff8101122f>] ? xen_restore_fl_direct_end+0x0/0x1
> [ 1817.684371]  [<ffffffff810ac830>] ? lock_release+0xb0/0x230
> [ 1817.684376]  [<ffffffff813bfae0>] ? frontend_changed+0x10/0x20
> [ 1817.684382]  [<ffffffff813bd4f5>] ? xenwatch_thread+0x55/0x160
> [ 1817.684389]  [<ffffffff81093400>] ? autoremove_wake_function+0x0/0x40
> [ 1817.684394]  [<ffffffff813bd4a0>] ? xenwatch_thread+0x0/0x160
> [ 1817.684400]  [<ffffffff81093086>] ? kthread+0x96/0xb0
> [ 1817.684405]  [<ffffffff8101632a>] ? child_rip+0xa/0x20
> [ 1817.684410]  [<ffffffff81015c90>] ? restore_args+0x0/0x30
> [ 1817.684415]  [<ffffffff81016320>] ? child_rip+0x0/0x20
>
> -Bruce
>
>
>
> Thanks
> --jyh
>
> >-----Original Message-----
> >From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
> >[mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Bruce Edge
> >Sent: Tuesday, September 28, 2010 7:54 AM
> >To: Konrad Rzeszutek Wilk
> >Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
> >Subject: Re: [Xen-devel] pv-ops domU not working with MSI interrupts
> >on Nehalem
> >
> >On Mon, Sep 27, 2010 at 12:54 PM, Konrad Rzeszutek Wilk
> ><konrad.wilk@xxxxxxxxxx> wrote:
> >> On Mon, Sep 27, 2010 at 12:16:50PM -0700, Bruce Edge wrote:
> >>> On Mon, Sep 27, 2010 at 10:24 AM, Konrad Rzeszutek Wilk
> >>> <konrad.wilk@xxxxxxxxxx> wrote:
> >>> >
> >>> > On Mon, Sep 27, 2010 at 08:52:39AM -0700, Bruce Edge wrote:
> >>> > > One of our developers who is working on a tachyon driver is
> >>> > > complaining that the pvops domU kernel is not working for
> >>> > > these MSI interrupts.
> >>> > > This is using the current head of xen/2.6.32.x on both a
> >>> > > single Nehalem 920 and a dual E5540. This behavior is
> >>> > > consistent with Xen 4.0.1, 4.0.2.rc1-pre and 4.1.
> >>> > >
> >>> > > Here are his comments:
> >>> > >
> >>> > > - the driver has no problem enabling the MSI interrupt and
> >>> > > requesting the interrupt through the kernel functions
> >>> > > pci_enable_msi and request_irq
> >>> >
> >>> > What shows up in the Xen console when you send the 'q' key? Does
> >>> > it show that the vector is assigned to the appropriate guest?
> >>>
> >>> The Xen console q key shows that the domU is assigned:
> >>>
> >>> (XEN)     Interrupts { 32, 41-42, 47 }
> >>
> >> Aha!
> >>
> >>>
> >>> but the domU thinks it has:
> >>>
> >>> 124/125/126/127
> >>>
> >>> Is there some mapping that's taking place, or is this plain wrong?
> >>
> >> That looks wrong. The IRQ numbers (even though they are MSI
> >> vectors) are set up as IRQ numbers in the DomU guest. You should
> >> have seen
> >>
> >> 32:
> >> 41:
> >> 42:
> >> 47:
> >> in your /proc/interrupts on your DomU guest.
> >>
> >> I wonder what broke - can you use
> >> git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git
> >> devel/xen-pcifront-0.5 (or pv/pcifront-2.6.32)?
> >
> >Please forgive the git ignorance.
> >
> >Is this the right syntax?
> >
> >git clone git://git.kernel.org/pub/scm/linux/kernel/git/konrad:pv/pcifront-2.6.32 linux-2.6.32-pv-pcifront
> >
> >Initialized empty Git repository in /import/kaan/bedge/src/xen/kernel/pv-ops/linux-2.6.32-pv-pcifront/.git/
> >fatal: The remote end hung up unexpectedly
> >
> >Or:
> >
> > git clone git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git
> >
> >Initialized empty Git repository in /import/kaan/bedge/src/xen/kernel/pv-ops/xen/.git/
> >remote: error: Could not read 59eab2f8f04147c5aadc99f2034ca7e5b81e890f
> >remote: fatal: Failed to traverse parents of commit 979e121cb348add17ed8171bf447b27a3a9d1be3
> >remote: aborting due to possible repository corruption on the remote side.
> >fatal: early EOF
> >fatal: index-pack failed
> >
> >>
> >> It has the latest pcifront driver but without the PVonHVM
> >> enhancements, so we can try to eliminate the PVonHVM logic from the picture.
> >>
> >>>
> >>> >
> >>> > > - the interrupt does happen, but the interrupt service routine
> >>> > > of the tachyon driver doesn't detect any interrupt status related
> >>> > > to this interrupt, which keeps the tachyon chip from coming
> >>> > > on-line. There is also a high count of tachyon interrupts in
> >>> > > /proc/interrupts
> >>> >
> >>> > Is it checking the PCI_STATUS_INTERRUPT or the appropriate
> >>> > register in the MMIO BAR?
> >>> >
> >>>
> >>> The driver would check the appropriate register (tachyon
> >>> registers) in the MMIO to determine the source of interrupts.
> >>
> >> OK, so that isn't it. Is there anything at these vectors:
> >> 7c, 7d, 7e, and 7f? When you use xen debug-keys 'i' or 'q' it
> >> should give you an inkling what device this is set for.
> >
> >When I run a distro kernel in hvm mode, I get the expected irq mappings:
> >
> >'i' - Note 66 - 69
> >(XEN)    IRQ:  66 affinity:ffffffff,ffffffff,ffffffff,ffffffff vec:3a type=PCI-MSI         status=00000010 in-flight=0 domain-list=10:127(----),
> >(XEN)    IRQ:  67 affinity:ffffffff,ffffffff,ffffffff,ffffffff vec:42 type=PCI-MSI         status=00000010 in-flight=0 domain-list=10:126(----),
> >(XEN)    IRQ:  68 affinity:ffffffff,ffffffff,ffffffff,ffffffff vec:4a type=PCI-MSI         status=00000010 in-flight=0 domain-list=10:125(----),
> >(XEN)    IRQ:  69 affinity:ffffffff,ffffffff,ffffffff,ffffffff vec:52 type=PCI-MSI         status=00000010 in-flight=0 domain-list=10:124(----)
> >
> >
> >'q'
> >(XEN)     Interrupts { 32, 41-42, 47, 124-127 }
> >
> >
> >The same data with pv-ops kernel shows:
> >
> >'i'
> >IRQ numbers stop at 65, no 66 - 69 present:
> >
> >(XEN)    IRQ:  63 affinity:ffffffff,ffffffff,ffffffff,ffffffff vec:91 type=PCI-MSI         status=00000010 in-flight=0 domain-list=0:289(----),
> >(XEN)    IRQ:  64 affinity:ffffffff,ffffffff,ffffffff,ffffffff vec:99 type=PCI-MSI         status=00000002 mapped, unbound
> >(XEN)    IRQ:  65 affinity:ffffffff,ffffffff,ffffffff,ffffffff vec:b1 type=PCI-MSI         status=00000010 in-flight=0 domain-list=0:287(----),
> >(XEN) IO-APIC interrupt information:
> >
> >'q'
> >(XEN)     Interrupts { 32, 41-42, 47 }
> >
> >>
> >>>
> >>> > >
> >>> > > kaan-18-dpm:~# cat /proc/interrupts | grep TACH
> >>> > >
> >>> > > 124:     760415          0          0          0          0          0          0          0          0          0          0          0          0          0  xen-pirq-pcifront-msi  HW_TACHYON
> >>> > > 125:     762234          0          0          0          0          0          0          0          0          0          0          0          0          0  xen-pirq-pcifront-msi  HW_TACHYON
> >>> > > 126:     764180          0          0          0          0          0          0          0          0          0          0          0          0          0  xen-pirq-pcifront-msi  HW_TACHYON
> >>> > > 127:     764164          0          0          0          0          0          0          0          0          0          0          0          0          0  xen-pirq-pcifront-msi  HW_TACHYON
> >>> >
> >>> > Can you provide the full dmesg output?
> >>>
> >>> Attached.
> >>>
> >>> Some possibly related messages on dom0 console:
> >>>
> >>> [ 1882.269778] pciback 0000:07:00.0: enabling device (0000 -> 0003)
> >>> [ 1882.269800] xen: registering gsi 32 triggering 0 polarity 1
> >>> [ 1882.269827] xen_allocate_pirq: returning irq 32 for gsi 32
> >>> [ 1882.269834] xen: --> irq=32
> >>> [ 1882.269841] Already setup the GSI :32
> >>> [ 1882.269847] pciback 0000:07:00.0: PCI INT A -> GSI 32 (level, low) -> IRQ 32
> >>> [ 1882.269866] pciback 0000:07:00.0: setting latency timer to 64
> >>> [ 1882.270463] pciback 0000:07:00.0: Driver tried to write to a read-only configuration space field at offset 0x62, size 2. This may be harmless, but if you have problems with your device:
> >>
> >> Uhhh, for that I think you need to do 'lspci -vvv -xxx -s 07:00.00'
> >> to find out what is at the configuration space. You could enable it
> >> using the permissive attribute.
> >>
> >>> [ 1882.270465] 1) see permissive attribute in sysfs
> >>> [ 1882.270467] 2) report problems to the xen-devel mailing list along with details of your device obtained from lspci.
> >>> [ 1882.270615]   alloc irq_desc for 478 on node 0
> >>> [ 1882.270625]   alloc kstat_irqs on node 0
> >>
> >> So for 478: what do you see? xen-pciback I presume?
> >>> [ 1882.348411] pciback 0000:07:00.1: enabling device (0000 -> 0003)
> >>> [ 1882.348433] xen: registering gsi 42 triggering 0 polarity 1
> >>> [ 1882.348440] xen_allocate_pirq: returning irq 42 for gsi 42
> >>> [ 1882.348445] xen: --> irq=42
> >>> [ 1882.348472] Already setup the GSI :42
> >>> [ 1882.348479] pciback 0000:07:00.1: PCI INT B -> GSI 42 (level, low) -> IRQ 42
> >>> [ 1882.348497] pciback 0000:07:00.1: setting latency timer to 64
> >>> [ 1882.349063] pciback 0000:07:00.1: Driver tried to write to a read-only configuration space field at offset 0x62, size 2. This may be harmless, but if you have problems with your device:
> >>> [ 1882.349066] 1) see permissive attribute in sysfs
> >>> [ 1882.349067] 2) report problems to the xen-devel mailing list along with details of your device obtained from lspci.
> >>> [ 1882.349205]   alloc irq_desc for 477 on node 0
> >>> [ 1882.349215]   alloc kstat_irqs on node 0
> >>> [ 1882.402893] pciback 0000:07:00.2: enabling device (0000 -> 0003)
> >>> [ 1882.402908] xen: registering gsi 47 triggering 0 polarity 1
> >>> [ 1882.402913] xen_allocate_pirq: returning irq 47 for gsi 47
> >>> [ 1882.402916] xen: --> irq=47
> >>> [ 1882.402921] Already setup the GSI :47
> >>> [ 1882.402925] pciback 0000:07:00.2: PCI INT C -> GSI 47 (level, low) -> IRQ 47
> >>> [ 1882.402938] pciback 0000:07:00.2: setting latency timer to 64
> >>> [ 1882.403280] pciback 0000:07:00.2: Driver tried to write to a read-only configuration space field at offset 0x62, size 2. This may be harmless, but if you have problems with your device:
> >>> [ 1882.403282] 1) see permissive attribute in sysfs
> >>> [ 1882.403282] 2) report problems to the xen-devel mailing list along with details of your device obtained from lspci.
> >>> [ 1882.403380]   alloc irq_desc for 476 on node 0
> >>> [ 1882.403386]   alloc kstat_irqs on node 0
> >>> (XEN) [VT-D]iommu.c:824: iommu_fault_status: Primary Pending Fault
> >>> (XEN) [VT-D]iommu.c:799: DMAR:[DMA Write] Request device [07:00.0]
> >>> fault addr e6f80000, iommu reg = ffff82c3fff57000
> >>> (XEN) DMAR:[fault reason 05h] PTE Write access is not set
> >>> (XEN) print_vtd_entries: iommu = ffff83019fffa370 bdf = 7:0.0 gmfn = e6f80
> >>> (XEN)     root_entry = ffff83019ff70000
> >>> (XEN)     root_entry[7] = 19cf52001
> >>> (XEN)     context = ffff83019cf52000
> >>> (XEN)     context[0] = 102_706dc005
> >>> (XEN)     l4 = ffff8300706dc000
> >>> (XEN)     l4_index = 0
> >>> (XEN)     l4[0] = 706db003
> >>> (XEN)     l3 = ffff8300706db000
> >>> (XEN)     l3_index = 3
> >>> (XEN)     l3[3] = 702b6003
> >>> (XEN)     l2 = ffff8300702b6000
> >>> (XEN)     l2_index = 137
> >>> (XEN)     l2[137] = 0
> >>> (XEN)     l2[137] not present
> >>> (XEN) traps.c:466:d0 Unhandled nmi fault/trap [#2] on VCPU 0 [ec=0000]
> >>
> >> That is not good. What changed from your earlier emails that this was triggered?
> >
> >Nothing
> >> Or was it triggered all along?
> >
> >Yes, I just included it for completeness
> >
> >> What happens if you run the system without the iommu enabled?
> >
> >Haven't tried yet. Will check that next.
> >
> >-Bruce
> >
> >_______________________________________________
> >Xen-devel mailing list
> >Xen-devel@xxxxxxxxxxxxxxxxxxx
> >http://lists.xensource.com/xen-devel
>
>

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel

 

