Xen project Mailing List

Re: [Xen-devel] pci-passthrough loses msi-x interrupts ability after domain destroy

Hi,

On Fri, Sep 22, 2017 at 03:25:00PM -0400, Konrad Rzeszutek Wilk wrote:
> On Fri, Sep 22, 2017 at 09:35:40AM +0200, Sander Eikelenboom wrote:
> > On 22/09/17 04:09, Christopher Clark wrote:
> > > On Thu, Sep 21, 2017 at 1:27 PM, Sander Eikelenboom
> > > <linux@xxxxxxxxxxxxxx> wrote:
> > >>
> > >> On Thu, September 21, 2017, 10:39:52 AM, Roger Pau Monné wrote:
> > >>
> > >>> On Wed, Sep 20, 2017 at 03:50:35PM -0400, Jérôme Oufella wrote:
> > >>>>
> > >>>> I'm using PCI pass-through to map a PCIe (intel i210) controller into
> > >>>> a HVM domain. The system uses xen-pciback to hide the appropriate PCI
> > >>>> device from Dom0.
> > >>>>
> > >>>> When creating the HVM domain after an hypervisor cold boot, the HVM
> > >>>> domain can access and use the PCIe controller without problem.
> > >>>>
> > >>>> However, if the HVM domain is destroyed then restarted, it won't be
> > >>>> able to use the pass-through PCI device anymore. The PCI device is
> > >>>> seen and can be mapped, however, the interrupts will not be passed to
> > >>>> the HVM domain anymore (this is visible under a Linux guest as
> > >>>> /proc/interrupts counters remain 0). The behavior on a Windows10 guest
> > >>>> is the same.
> > >>>>
> > >>>> A few interesting hints I noticed:
> > >>>>
> > >>>> - On Dom0, 'lspci -vv' on that PCIe device between the "working" and
> > >>>> the "muted interrupts" states, I noted a difference between the
> > >>>> MSI-X caps:
> > >>>>
> > >>>> - Capabilities: [70] MSI-X: Enable- Count=5 Masked- <-- IRQs will work
> > >>>> if domain started
> > >>>> + Capabilities: [70] MSI-X: Enable- Count=5 Masked+ <-- IRQs won't
> > >>>> work if domain started
> > >>>> ^^^^^^^
> > >>
> > >>> IMHO it seems that either your device is not able to perform a reset
> > >>> successfully, or Linux is not correctly performing such reset. I don't
> > >>> think there's a lot that can be done from the Xen side.
> > >>
> > >> Unfortunately for a lot of pci-devices a simple reset as performed by
> > >> default isn't enough, but also almost none support a real pci FLR.
> > >>
> > >> In the distant past Konrad has made a patchset that implemented a bus
> > >> reset and resetting config space. (It piggybacked on an already
> > >> existing libxl mechanism of trying to call on a sysfs "do_flr"
> > >> attribute which triggers pciback to perform the bus reset and rewrite
> > >> of config space for the device.)
> > >>
> > >> I use that patchset ever since for my pci-passthrough needs and it
> > >> works pretty well. I can shutdown and restart VM's with pci devices
> > >> passed through (also AMD Radeon graphic cards).
> > >
> > > Just to confirm the utility of that piece of work: OpenXT also uses an
> > > extended version of that same patch to perform device reset for
> > > passthrough.
> > >
> > > I've attached a copy of that OpenXT patch to this message and it can
> > > also be obtained from our git repository:
> > > https://github.com/OpenXT/xenclient-oe/blob/f8d3b282a87231d9ae717b13d506e8e7e28c78c4/recipes-kernel/linux/4.9/patches/thorough-reset-interface-to-pciback-s-sysfs.patch
> > > This version creates a sysfs node named "reset_device" and the OpenXT
> > > libxl toolstack is patched to use that node instead of "do_flr".
> >
> > Nice to hear there are more users of this patch. On #xen on IRC there
> > were from time to time also users who tried pci-passthrough and ran into
> > this issue (and probably abandoning the idea since having to restart
> > your host before being able to use your passed-through device again
> > defies much of the use case).
> >
> > > Konrad's original work encountered pushback on upstream acceptance at
> > > the time it was developed. I'm not sure I've found where that
> > > discussion ended.
> > > Is there any prospect of a more comprehensive reset
> > > mechanism being accepted into xen-pciback, or elsewhere in the kernel?
> >
> > Yeah it was nacked by David Vrabel and the discussion somewhat bled to
> > death.
> > From what i remember the main issue was with the naming, since it
> > doesn't do a FLR, the sysfs hook shouldn't be called "do_flr".
> >
> > Some other perhaps minor issues i can think of are:
> > - No way to exempt pci-devices from this new way of resetting them.
> >   Perhaps there could be pci devices/topologies where this way of
> >   resetting causes more problems than it solves and could cause a
> >   regression. Unfortunately auto detecting what works doesn't seem to
> >   be possible. On the other hand (though only with my n=10) i haven't
> >   encountered such a device yet.
> >
> > - The communication path between libxl and the kernel via sysfs.
> >   I think the preference was for:
> >   a) having it use a more commonly used Xen communication channel or
> >   b) having it all self-contained in pci-back. (from my memory and the
> >      openxt patch description there could be some locking issue when
> >      trying to implement it this way, but the vfio guys had that solved
> >      for their reset implementation if i from one of the comments in
> >      their source code (patches by Alex Williamson if i remember
> >      correctly).
> >
> > - Not an issue back then when the patch was made, but as the question
> >   earlier to Roger, the hypervisor seems to grow more interference with
> >   pci devices with the PVH dom0 work.
> >   If and how does that relate to pci-back and pci-passthrough and (the
> >   location of) resetting mechanisms ?
> >
> >
> > So i think David's NACK was mostly for the patchset having some hackish
> > cosmetics.
>
> He didn't like 'do_flr' which made sense as the patchset did not do FLR.
> It made a bus-reset for more than one device (if those devices were
> assigned to pciback).
>
> >
> > On the upside one can conclude that this patchset is now pretty well
> > tested over the years ;)
> >
> > Since David has left, perhaps Jurgen/Boris/Konrad could express their
> > views (again) ?
> > (CC'ed them as well)
>
> I've asked Govinda (CC-ed) to refresh the patchset against the latest
> kernel and repost it and see where it goes.
>

Nice. Looking forward to seeing the refreshed patchset hit the mailinglist! :)

Thanks,

-- Pasi

> >
> > > As noted in the original LKML threads, vfio has similar relevant pci
> > > device reset retry logic. (Thanks to Rich Persaud for this pointer:)
> > > http://elixir.free-electrons.com/linux/v4.14-rc1/source/drivers/vfio/pci/vfio_pci.c#L1353
> > >
> > > libvirt also performs similar reset logic, using a direct low level
> > > interface to config space (Thanks to Marek for this pointer, libvirt
> > > is used by Qubes:)
> > > https://github.com/libvirt/libvirt/blob/master/src/util/virpci.c#L929
> > > I think this indicates that it would be possible to extend libxl to
> > > do something similar, but that seems less satisfactory compared to
> > > performing the work in a kernel-provided implementation.
> > >
> > > Is there a way forward to providing this functionality within Xen
> > > software or Linux?
> > >
> > > Christopher
> > > --
> > >
> > > openxt.org
> > >
> >

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
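
Editorial note: the symptom reported at the start of the thread (the MSI-X capability line in 'lspci -vv' on Dom0 changing from "Masked-" to "Masked+" after the domain is destroyed) can be checked mechanically. The following is only a minimal sketch, not something posted in the thread. The function names parse_msix and looks_stale are hypothetical, and the "masked while not enabled" heuristic is an assumption drawn solely from the two lspci lines quoted above; it may not hold for other devices.

```python
import re

# Matches the MSI-X capability line as printed by 'lspci -vv', e.g.
#   Capabilities: [70] MSI-X: Enable- Count=5 Masked+
MSIX_RE = re.compile(r"MSI-X:\s+Enable([+-])\s+Count=(\d+)\s+Masked([+-])")


def parse_msix(lspci_output):
    """Return the MSI-X state found in lspci text, or None if absent."""
    match = MSIX_RE.search(lspci_output)
    if match is None:
        return None
    return {
        "enabled": match.group(1) == "+",
        "count": int(match.group(2)),
        "masked": match.group(3) == "+",
    }


def looks_stale(lspci_output):
    """Hypothetical heuristic: MSI-X left masked while not enabled.

    This mirrors the 'Enable- ... Masked+' state the original poster
    saw after a domain destroy; it is an assumption, not a Xen rule.
    """
    state = parse_msix(lspci_output)
    return state is not None and state["masked"] and not state["enabled"]
```

The text passed in would be whatever 'lspci -vv' prints for the passed-through device on Dom0, captured before the guest is started again.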
