Re: [Xen-devel] pci-passthrough loses msi-x interrupts ability after domain destroy

On Thu, Sep 21, 2017 at 1:27 PM, Sander Eikelenboom
<linux@xxxxxxxxxxxxxx> wrote:
> On Thu, September 21, 2017, 10:39:52 AM, Roger Pau Monné wrote:
>> On Wed, Sep 20, 2017 at 03:50:35PM -0400, Jérôme Oufella wrote:
>>> I'm using PCI pass-through to map a PCIe (intel i210) controller into
>>> a HVM domain. The system uses xen-pciback to hide the appropriate PCI
>>> device from Dom0.
>>> When creating the HVM domain after a hypervisor cold boot, the HVM
>>> domain can access and use the PCIe controller without problem.
>>> However, if the HVM domain is destroyed then restarted, it won't be
>>> able to use the pass-through PCI device anymore. The PCI device is
>>> seen and can be mapped, however, the interrupts will not be passed to
>>> the HVM domain anymore (this is visible under a Linux guest as
>>> /proc/interrupts counters remain 0). The behavior on a Windows10 guest
>>> is the same.
>>> A few interesting hints I noticed:
>>> - On Dom0, comparing 'lspci -vv' output for that PCIe device between
>>> the "working" and the "muted interrupts" states, I noted a difference
>>> in the MSI-X caps:
>>> - Capabilities: [70] MSI-X: Enable- Count=5 Masked- <-- IRQs will work if domain started
>>> + Capabilities: [70] MSI-X: Enable- Count=5 Masked+ <-- IRQs won't work if domain started
>>>                                             ^^^^^^^
>> IMHO it seems that either your device is not able to perform a reset
>> successfully, or Linux is not correctly performing such reset. I don't
>> think there's a lot that can be done from the Xen side.
> Unfortunately, for a lot of PCI devices a simple reset as performed by default
> isn't enough, but almost none support a real PCI FLR either.
> In the distant past Konrad made a patchset that implemented a bus reset and
> resetting of config space. (It piggybacked on the already existing libxl
> mechanism of trying to call a sysfs "do_flr" attribute, which triggers pciback
> to perform the bus reset and rewrite of config space for the device.)
> I have used that patchset ever since for my pci-passthrough needs and it works
> pretty well. I can shut down and restart VMs with PCI devices passed through
> (also AMD Radeon graphics cards).
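
For anyone wanting to check which of the two MSI-X states quoted above a
device is in before starting a domain, here is a minimal sketch in Python.
It only parses the capability line in the format shown in the quoted
'lspci -vv' output; the function name parse_msix and the returned field
names are my own choices for illustration, not part of any tool.

```python
import re

# Matches the MSI-X capability line as printed by 'lspci -vv', e.g.
#   Capabilities: [70] MSI-X: Enable- Count=5 Masked+
MSIX_RE = re.compile(
    r"MSI-X:\s+Enable(?P<enable>[+-])\s+Count=(?P<count>\d+)\s+Masked(?P<masked>[+-])"
)


def parse_msix(lspci_output):
    """Return the MSI-X state found in 'lspci -vv' text, or None if absent.

    parse_msix is a hypothetical helper; it reports only the first
    MSI-X capability line it finds in the given text.
    """
    match = MSIX_RE.search(lspci_output)
    if match is None:
        return None
    return {
        "enabled": match.group("enable") == "+",
        "count": int(match.group("count")),
        "masked": match.group("masked") == "+",
    }
```

Feeding it the output of 'lspci -vv -s <BDF>' run in Dom0 and looking at
the "masked" field would distinguish the two cases reported above.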

Just to confirm the utility of that piece of work: OpenXT also uses an
extended version of that same patch to perform device reset for

I've attached a copy of that OpenXT patch to this message and it can
also be obtained from our git repository:
This version creates a sysfs node named "reset_device" and the OpenXT
libxl toolstack is patched to use that node instead of "do_flr".

Konrad's original work encountered pushback on upstream acceptance at
the time it was developed. I'm not sure I've found where that
discussion ended. Is there any prospect of a more comprehensive reset
mechanism being accepted into xen-pciback, or elsewhere in the kernel?

As noted in the original LKML threads, vfio has similar relevant pci
device reset retry logic. (Thanks to Rich Persaud for this pointer:)

libvirt also performs similar reset logic, using a direct low-level
interface to config space (Thanks to Marek for this pointer; libvirt
is used by Qubes:)
I think this indicates that it would be possible to extend libxl to
do something similar, but that seems less satisfactory than
performing the work in a kernel-provided implementation.
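
To illustrate what a toolstack-side approach could look like, here is a
rough sketch, not the vfio, libvirt or OpenXT logic. It assumes the
generic per-device sysfs "reset" attribute under /sys/bus/pci/devices,
which the kernel only exposes when it has some reset method for the
device, and which is a different node from pciback's "do_flr" or the
OpenXT "reset_device". The function name, retry count and delay are
arbitrary choices for illustration, and I would not assume this kind of
reset is sufficient for the MSI-X problem reported above.

```python
import os
import time


def reset_pci_device(bdf, sysfs_root="/sys/bus/pci/devices",
                     attempts=3, delay=0.5):
    """Write '1' to a device's sysfs 'reset' attribute, retrying on error.

    bdf is the PCI address, e.g. '0000:01:00.0'. The retry policy here is
    an assumption for illustration, not taken from any existing tool.
    """
    path = os.path.join(sysfs_root, bdf, "reset")
    last_error = None
    for _ in range(attempts):
        try:
            with open(path, "w") as node:
                node.write("1")
            return True
        except OSError as err:
            # Missing attribute, permission problem, or the reset failed.
            last_error = err
            time.sleep(delay)
    raise RuntimeError(
        f"reset of {bdf} failed after {attempts} attempts: {last_error}"
    )
```

The sysfs_root parameter exists only so the function can be exercised
against a scratch directory rather than a real device.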

Is there a way forward to providing this functionality within Xen
software or Linux?



