
Re: [Xen-devel] Multi-bridged PCIe devices (Was: Re: iommuu/vt-d issues with LSI MegaSAS (PERC5i))



 >>> On 11.09.13 at 14:14, Gordan Bobic <gordan@xxxxxxxxxx> wrote:
> On Wed, 11 Sep 2013 12:53:09 +0100, "Jan Beulich" <JBeulich@xxxxxxxx> 
>  wrote:
>>>>> On 11.09.13 at 13:05, Gordan Bobic <gordan@xxxxxxxxxx> wrote:
>>> I found this:
>>>
>>>  http://lists.xen.org/archives/html/xen-devel/2010-06/msg00093.html 
>>>
>>>  while looking for a solution to a similar problem. I am
>>>  facing a similar issue with LSI (8408E, 3081E-R) and
>>>  Adaptec (31605) SAS cards. Was there ever a proper, more general
>>>  fix or workaround for this issue?
>>>
>>>  These SAS cards experience these problems in dom0. When running
>>>  a vanilla kernel on bare metal, they work OK without intel_iommu
>>>  set. As soon as I set intel_iommu, the same thing happens (on
>>>  bare metal, not dom0).
>>>
>>>  Clearly there is something badly broken with multiple layers
>>>  of bridges when it comes to IOMMU in my setup (Intel 5520 PCIe
>>>  root hub -> NF200 bridge -> Intel 80333 Bridge -> SAS controller)
>>
>> The link above has some (hackish) workarounds - did you try
>> them?
> 
>  Not yet. The thing that bothers me is that the workaround
>  involves hard-coding the PCI device ID which is _nasty_
>  and unstable.

I said "hackish", didn't I? Of course such a change would not
have the slightest chance of going into any repo. But knowing
whether it helps may allow thinking of an acceptable
workaround.

>> In any event, seeing a hypervisor log with "iommu=debug" might
>> shed further light on this: For one, we might be able to see which
>> exact devices are present in the ACPI tables. And we would see
>> which device(s) eventual faults originate from.
> 
>  The thing that bothers me is that this happens in dom0 even
>  with iommu=dom0-passthrough being set.
>  iommu=dom0-passthrough,workaround_bios_bug doesn't help,
>  either.

They're not meant to deal with this sort of an impossible (in
theory) situation.

>  And lo and behold, I do have phantom PCI devices after all!
>  lspci shows no device with ID 0000:0f:01.0

Not exactly: Phantom functions can't be at function 0. Irrespective
of that - do the device coordinates somehow correlate with the
problematic controller (IOW: lspci output and a full log would help)?
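
To make the function-0 point concrete, here is a minimal sketch that splits an lspci-style address such as 0000:0f:01.0 into segment, bus, device and function, and applies the rule stated above. The names PciAddress, parse_bdf and could_be_phantom_function are made up for illustration and are not part of Xen or pciutils; the check only encodes "function 0 rules out a phantom function", it does not detect phantom functions.

```python
import re
from typing import NamedTuple


class PciAddress(NamedTuple):
    """Hypothetical helper type: one PCI segment:bus:device.function tuple."""
    segment: int
    bus: int
    device: int
    function: int


# lspci-style address: optional 4-hex-digit segment, then bus:device.function.
_BDF_RE = re.compile(
    r"^(?:([0-9a-fA-F]{4}):)?([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])$"
)


def parse_bdf(text: str) -> PciAddress:
    """Parse e.g. '0000:0f:01.0' or '0f:01.0' into its numeric components."""
    match = _BDF_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a PCI address: {text!r}")
    seg, bus, dev, fn = match.groups()
    device = int(dev, 16)
    if device > 0x1F:
        # The device number is a 5-bit field, so 0x1f is the maximum.
        raise ValueError(f"device number out of range: {text!r}")
    return PciAddress(int(seg or "0", 16), int(bus, 16), device, int(fn))


def could_be_phantom_function(addr: PciAddress) -> bool:
    """Apply only the rule from the mail: function 0 is never phantom.

    A True result means 'not ruled out by this rule', nothing more.
    """
    return addr.function != 0


if __name__ == "__main__":
    addr = parse_bdf("0000:0f:01.0")
    print(addr, could_be_phantom_function(addr))
```

Run against the address from the report, the helper returns False, because the function number is 0. The address therefore has to be explained some other way, which is why the lspci output and a full log are needed.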

Jan

