
Re: [Xen-devel] Multi-bridged PCIe devices (Was: Re: iommuu/vt-d issues with LSI MegaSAS (PERC5i))



On Thu, Sep 12, 2013 at 06:20:18AM +0000, Zhang, Yang Z wrote:
> Jan Beulich wrote on 2013-09-11:
> >>>> On 11.09.13 at 15:26, Gordan Bobic <gordan@xxxxxxxxxx> wrote:
> >> On Wed, 11 Sep 2013 14:22:51 +0100, "Jan Beulich"
> >> <JBeulich@xxxxxxxx>
> >>  wrote:
> >>>>>> On 11.09.13 at 15:10, Gordan Bobic <gordan@xxxxxxxxxx> wrote:
> >>>> On Wed, 11 Sep 2013 14:03:14 +0100, "Jan Beulich"
> >>>> <JBeulich@xxxxxxxx>
> >>>>  wrote:
> >>>>>>>> On 11.09.13 at 14:45, Gordan Bobic <gordan@xxxxxxxxxx> wrote:
> >>>>>>  dmesg, xl dmesg, lspci -vvvnn and lspci -tvnn output is attached.
> >>>>>>  
> >>>>>>  I'll try adding one of my LSI cards and see the comparative
> >>>>>> behaviour. Right now I don't even know if the phantom device  is
> >>>>>> on the SAS card or the motherboard.
> >>>>> 
> >>>>> The Adaptec card being the only thing on bus 0f makes it pretty
> >>>>> likely that this other device also is on that card.
> >>>>> 
> >>>>> I guess the issue is mainly because the device itself is a PCI
> >>>>> one, while the immediately upstream bridge (where I mean only the
> >>>>> visible one) is PCIe. There _must_ be a PCIe-PCI bridge between
> >>>>> them. And as long as firmware doesn't know about that bridge and
> >>>>> the bridge doesn't properly handle config space accesses to it,
> >>>>> such a device just can't be used with an IOMMU (without some yet
> >>>>> to be invented workaround).
> >>>>> 
> >>>> I'm actually thinking about Konrad's proposed hack in that
> >>>> thread from 3 years ago. If the device IDs are parameterized out
> >>>> rather than hard-coded, then this could work in nearly the same
> >>>> way as xen-pciback in terms of usage. Pass the phantom device IDs
> >>>> as parameters to the module. Done that way it might even be
> >>>> considered clean enough to be fit for public consumption.
> >>> 
> >>> Except that, short of being able to determine it via config space
> >>> reads, we also need the resulting command line option to tell us
> >>> what kind of device that is.
> >>> 
> >>  Not sure I follow. Why do we need to know the device type?
> > 
> > Just look at set_msi_source_id() as well as
> > domain_context_{mapping,unmap}() (just the most prominent
> > examples): Behavior here heavily depends on the type of the device
> > itself _and_ that of the upstream bridge(s).
> It looks like many devices fail to work. I wonder whether the PCI/PCIe
> specification says how to detect a hidden device behind such devices
> (like the detection of phantom devices). If not, I think those devices are
> buggy, or we can say they are not really PCI/PCIe compliant. Since
> VT-d only covers PCI/PCIe devices, it's reasonable that a non-PCI/PCIe
> device fails to work under VT-d.
> 
> As Jan suggested, we need the user to tell us whether there is a hidden
> device or BDF behind another device that the OS is unaware of. We need to
> pass that info to Xen before passing through the device.
> 
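
To make the "user tells us the hidden BDF and its type" idea above concrete, here is a minimal sketch of parsing such an option. The syntax `[SSSS:]BB:DD.F=type`, the type labels, and the example value `0f:01.0` are all assumptions made up for illustration; this is not an existing Xen or xen-pciback option.

```python
import re
from typing import List, NamedTuple

# Hypothetical option syntax (NOT an existing Xen/xen-pciback parameter):
#   [SSSS:]BB:DD.F=type[,[SSSS:]BB:DD.F=type...]
# The type labels below are illustrative placeholders only.
KNOWN_TYPES = {"pcie", "pci", "pcie2pci", "pci2pci"}


class HiddenDevice(NamedTuple):
    seg: int
    bus: int
    dev: int
    fn: int
    kind: str


_ENTRY = re.compile(
    r"^(?:([0-9a-fA-F]{1,4}):)?"      # optional segment
    r"([0-9a-fA-F]{1,2}):"            # bus
    r"([0-9a-fA-F]{1,2})\.([0-7])"    # device.function
    r"=(\w+)$"                        # user-supplied device type
)


def parse_hidden_devices(option: str) -> List[HiddenDevice]:
    """Parse a comma-separated list of user-declared hidden devices."""
    devices = []
    for entry in option.split(","):
        m = _ENTRY.match(entry.strip())
        if m is None:
            raise ValueError(f"malformed entry: {entry!r}")
        seg, bus, dev, fn, kind = m.groups()
        if int(dev, 16) > 0x1F:
            raise ValueError(f"device number out of range: {entry!r}")
        if kind not in KNOWN_TYPES:
            raise ValueError(f"unknown device type: {kind!r}")
        devices.append(
            HiddenDevice(int(seg or "0", 16), int(bus, 16),
                         int(dev, 16), int(fn, 16), kind)
        )
    return devices
```

Requiring the type up front reflects Jan's point: the mapping code behaves differently depending on the device and bridge types, so a bare list of BDFs would not be enough.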

Interestingly enough I just hit this with my brand-new Haswell CPU and
new motherboard when passing in a capture card. It shows:

           +-1c.5-[07-09]----00.0-[08-09]--+-01.0-[09]--+-08.0  Brooktree Corporation Bt878 Video Capture
           |                               |            +-08.1  Brooktree Corporation Bt878 Audio Capture
           |                               |            +-09.0  Brooktree Corporation Bt878 Video Capture
           |                               |            +-09.1  Brooktree Corporation Bt878 Audio Capture
           |                               |            +-0a.0  Brooktree Corporation Bt878 Video Capture
           |                               |            +-0a.1  Brooktree Corporation Bt878 Audio Capture
           |                               |            +-0b.0  Brooktree Corporation Bt878 Video Capture
           |                               |            \-0b.1  Brooktree Corporation Bt878 Audio Capture
           |                               \-03.0  Texas Instruments TSB43AB22A IEEE-1394a-2000 Controller (PHY/Link) [iOHCI-Lynx]

And Xen says:
(XEN) [VT-D]iommu.c:885: iommu_fault_status: Fault Overflow
(XEN) [VT-D]iommu.c:887: iommu_fault_status: Primary Pending Fault
(XEN) [VT-D]iommu.c:865: DMAR:[DMA Read] Request device [0000:08:00.0] fault addr 36aa3000, iommu reg = ffff82c3ffd53000
(XEN) DMAR:[fault reason 02h] Present bit in context entry is clear
(XEN) print_vtd_entries: iommu ffff83083d4939b0 dev 0000:08:00.0 gmfn 36aa3
(XEN)     root_entry = ffff83083d47e000
(XEN)     root_entry[8] = 72569a001
(XEN)     context = ffff83072569a000
(XEN)     context[0] = 0_0
(XEN)     ctxt_entry[0] not present
(XEN) [VT-D]iommu.c:885: iommu_fault_status: Fault Overflow
(XEN) [VT-D]iommu.c:887: iommu_fault_status: Primary Pending Fault
(XEN) [VT-D]iommu.c:865: DMAR:[DMA Read] Request device [0000:08:00.0] fault addr 36aa3000, iommu reg = ffff82c3ffd53000
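
For reading the fault above: VT-d indexes its root table by bus number and the per-bus context table by devfn, which is why the dump shows root_entry[8] and context[0] for 0000:08:00.0. The sketch below just does that arithmetic. The helper names are made up for illustration and are not Xen functions. The last helper encodes an assumption, not something the log proves: that a PCIe-to-PCI bridge may tag forwarded requests with its secondary bus number and devfn 0, which would explain why the requester is 08:00.0 even though no such device appears in the lspci tree.

```python
import re
from typing import NamedTuple, Tuple


class Bdf(NamedTuple):
    seg: int
    bus: int
    dev: int
    fn: int


def parse_bdf(text: str) -> Bdf:
    """Parse 'SSSS:BB:DD.F' as printed in the Xen fault message."""
    m = re.fullmatch(
        r"([0-9a-fA-F]{4}):([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])", text
    )
    if m is None:
        raise ValueError(f"not a BDF: {text!r}")
    seg, bus, dev, fn = (int(g, 16) for g in m.groups())
    if dev > 0x1F:
        raise ValueError(f"device number out of range: {text!r}")
    return Bdf(seg, bus, dev, fn)


def vtd_table_indices(bdf: Bdf) -> Tuple[int, int]:
    """Return (root table index, context table index) for a requester."""
    devfn = (bdf.dev << 3) | bdf.fn   # 5-bit device, 3-bit function
    return bdf.bus, devfn


def pcie_to_pci_alias(seg: int, secondary_bus: int) -> Bdf:
    """Assumed requester ID for traffic forwarded by a PCIe-to-PCI bridge:
    the bridge's secondary bus with device 0, function 0."""
    return Bdf(seg, secondary_bus, 0, 0)


if __name__ == "__main__":
    requester = parse_bdf("0000:08:00.0")
    print(vtd_table_indices(requester))
    # Bridge 07:00.0 in the tree has secondary bus 08.
    print(pcie_to_pci_alias(0, 0x08) == requester)
```

If that assumption holds here, the context entry that needs to be present is the one for (bus 08, devfn 0), regardless of which Bt878 function on bus 09 actually issued the DMA.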


Oddly enough it was working fine in a box with an AMD IOMMU. But
to be fair - that machine was running with Xen 4.1.

The hack I developed: 
http://lists.xen.org/archives/html/xen-devel/2010-06/msg00093.html
ends up with this:

(XEN) alloc_pdev: unknown type: 0000:08:00.0
(XEN) [VT-D]iommu.c:1484: d0:unknown(0): 0000:08:00.0
(XEN) [VT-D]iommu.c:1888: d0: context mapping failed

(FYI, this is Xen 4.3.1)

Let me retry on the AMD box with the same version of Xen.
