
Re: [Xen-devel] pci-passthrough loses msi-x interrupts ability after domain destroy



On Fri, Sep 22, 2017 at 09:35:40AM +0200, Sander Eikelenboom wrote:
> On 22/09/17 04:09, Christopher Clark wrote:
> > On Thu, Sep 21, 2017 at 1:27 PM, Sander Eikelenboom
> > <linux@xxxxxxxxxxxxxx> wrote:
> >>
> >> On Thu, September 21, 2017, 10:39:52 AM, Roger Pau Monné wrote:
> >>
> >>> On Wed, Sep 20, 2017 at 03:50:35PM -0400, Jérôme Oufella wrote:
> >>>>
> >>>> I'm using PCI pass-through to map a PCIe (intel i210) controller into
> >>>> a HVM domain. The system uses xen-pciback to hide the appropriate PCI
> >>>> device from Dom0.
> >>>>
> >>>> When creating the HVM domain after a hypervisor cold boot, the HVM
> >>>> domain can access and use the PCIe controller without problem.
> >>>>
> >>>> However, if the HVM domain is destroyed then restarted, it won't be
> >>>> able to use the pass-through PCI device anymore. The PCI device is
> >>>> seen and can be mapped, however, the interrupts will not be passed to
> >>>> the HVM domain anymore (this is visible under a Linux guest as
> >>>> /proc/interrupts counters remain 0). The behavior on a Windows10 guest
> >>>> is the same.
> >>>>
> >>>> A few interesting hints I noticed:
> >>>>
> >>>> - On Dom0, 'lspci -vv' on that PCIe device between the "working" and
> >>>> the "muted interrupts" states, I noted a difference between the
> >>>> MSI-X caps:
> >>>>
> >>>> - Capabilities: [70] MSI-X: Enable- Count=5 Masked- <-- IRQs will work if domain started
> >>>> + Capabilities: [70] MSI-X: Enable- Count=5 Masked+ <-- IRQs won't work if domain started
> >>>>                                             ^^^^^^^
> >>
> >>> IMHO it seems that either your device is not able to perform a reset
> >>> successfully, or Linux is not correctly performing such reset. I don't
> >>> think there's a lot that can be done from the Xen side.
> >>
> >> Unfortunately, for a lot of PCI devices the simple reset performed by
> >> default isn't enough, and almost none of them support a real PCI FLR.
> >>
> >> In the distant past Konrad made a patchset that implemented a bus reset
> >> and resetting of config space. (It piggybacked on the already existing
> >> libxl mechanism of trying to call a sysfs "do_flr" attribute, which
> >> triggers pciback to perform the bus reset and the rewrite of config space
> >> for the device.)
> >>
> >> I have used that patchset ever since for my PCI passthrough needs and it
> >> works pretty well. I can shut down and restart VMs with PCI devices passed
> >> through (also AMD Radeon graphics cards).
> > 
> > Just to confirm the utility of that piece of work: OpenXT also uses an
> > extended version of that same patch to perform device reset for
> > passthrough.
> > 
> > I've attached a copy of that OpenXT patch to this message and it can
> > also be obtained from our git repository:
> > https://github.com/OpenXT/xenclient-oe/blob/f8d3b282a87231d9ae717b13d506e8e7e28c78c4/recipes-kernel/linux/4.9/patches/thorough-reset-interface-to-pciback-s-sysfs.patch
> > This version creates a sysfs node named "reset_device" and the OpenXT
> > libxl toolstack is patched to use that node instead of "do_flr".
> 
> Nice to hear there are more users of this patch. On #xen on IRC there were,
> from time to time, also users who tried PCI passthrough and ran into this
> issue (and probably abandoned the idea, since having to restart your host
> before being able to use your passed-through device again defeats much of
> the use case).
>  
> > Konrad's original work encountered pushback on upstream acceptance at
> > the time it was developed. I'm not sure I've found where that
> > discussion ended. Is there any prospect of a more comprehensive reset
> > mechanism being accepted into xen-pciback, or elsewhere in the kernel?
> 
> Yeah, it was NACKed by David Vrabel and the discussion somewhat bled to
> death. From what I remember the main issue was with the naming: since it
> doesn't do a FLR, the sysfs hook shouldn't be called "do_flr".
> 
> Some other, perhaps minor, issues I can think of are:
> - No way to exempt PCI devices from this new way of resetting them.
>   Perhaps there could be PCI devices/topologies where this way of
>   resetting causes more problems than it solves and could cause a
>   regression. Unfortunately auto-detecting what works doesn't seem to
>   be possible. On the other hand (though only with my n=10) I haven't
>   encountered such a device yet.
> 
> - The communication path between libxl and the kernel via sysfs.
>   I think the preference was for either:
>   a) having it use a more commonly used Xen communication channel, or
>   b) having it all self-contained in pci-back. (From my memory and the OpenXT
>      patch description there could be some locking issue when trying to
>      implement it this way, but the vfio guys had that solved for their
>      reset implementation, going by one of the comments in their source
>      code (patches by Alex Williamson, if I remember correctly).)
> 
> - Not an issue back when the patch was made, but, as in the question earlier
>   to Roger, the hypervisor seems to be interfering more with PCI devices
>   with the PVH dom0 work. If and how does that relate to pci-back and
>   PCI passthrough and (the location of) resetting mechanisms?
> 
> 
> So I think David's NACK was mostly for the patchset having some hackish
> cosmetics.

He didn't like 'do_flr', which made sense as the patchset did not do an FLR.
It did a bus reset for more than one device (if those devices were assigned
to pciback).
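
To make the sysfs path under discussion concrete, here is a minimal sketch of how a toolstack could ask for the reset. It assumes the pciback attribute lives at /sys/bus/pci/drivers/pciback/<node> and accepts the device's BDF (true only on a kernel carrying the patch; "do_flr" and "reset_device" are the node names mentioned in this thread), and it falls back to the generic per-device "reset" attribute, which takes "1". The helper name reset_device() and the sysfs_root parameter are made up for illustration, and the fallback order reflects my understanding of libxl rather than a quote of its code.

```python
import os

# Assumed sysfs locations, relative to the sysfs mount point.
PCIBACK_DRIVER = "bus/pci/drivers/pciback"
PCI_DEVICES = "bus/pci/devices"


def reset_device(bdf, sysfs_root="/sys", node="do_flr"):
    """Request a reset of PCI device `bdf`.

    Returns the sysfs path that was written to, or None if no reset
    attribute was found. `node` is the pciback attribute name.
    """
    # Preferred: pciback's own attribute, which is handed the device's BDF.
    pciback_node = os.path.join(sysfs_root, PCIBACK_DRIVER, node)
    if os.path.exists(pciback_node):
        with open(pciback_node, "w") as f:
            f.write(bdf)
        return pciback_node

    # Fallback: the generic per-device reset attribute, which takes "1".
    generic = os.path.join(sysfs_root, PCI_DEVICES, bdf, "reset")
    if os.path.exists(generic):
        with open(generic, "w") as f:
            f.write("1")
        return generic

    return None
```

With the OpenXT variant this would be called as reset_device("0000:03:00.0", node="reset_device"), where the BDF is a placeholder. Which reset the kernel actually performs behind either attribute is up to the kernel, so the sketch only covers the request side.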

> 
> On the upside, one can conclude that this patchset has been pretty well
> tested over the years ;)
> 
> Since David has left, perhaps Jurgen/Boris/Konrad could express their views
> (again)? (CC'ed them as well)

I've asked Govinda (CC-ed) to refresh the patchset against the latest kernel,
repost it, and see where it goes.
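
For anyone testing a refreshed patchset, the "Masked+" symptom from the original report can also be checked without lspci by reading the device's config space. The following is a rough sketch, not part of any patch: it walks the standard capability list, finds the MSI-X capability (ID 0x11) and decodes its Message Control word (bit 15 = Enable, bit 14 = Function Mask, bits 10:0 = table size minus one). The function names and the sysfs_root parameter are made up for illustration.

```python
import struct

PCI_STATUS = 0x06        # status register; bit 4 = capability list present
PCI_CAP_PTR = 0x34       # offset of the pointer to the first capability
PCI_CAP_ID_MSIX = 0x11   # MSI-X capability ID


def msix_state(cfg):
    """Return (enabled, masked, count) for MSI-X, or None if not found.

    `cfg` is the raw config space as bytes (at least the first 64 bytes).
    """
    if len(cfg) < 64:
        return None
    (status,) = struct.unpack_from("<H", cfg, PCI_STATUS)
    if not status & 0x10:
        return None  # device has no capability list

    pos = cfg[PCI_CAP_PTR] & 0xFC
    seen = set()  # guards against a looping capability list
    while pos and pos not in seen and pos + 4 <= len(cfg):
        seen.add(pos)
        if cfg[pos] == PCI_CAP_ID_MSIX:
            (ctrl,) = struct.unpack_from("<H", cfg, pos + 2)
            enabled = bool(ctrl & 0x8000)   # lspci "Enable+/-"
            masked = bool(ctrl & 0x4000)    # lspci "Masked+/-"
            count = (ctrl & 0x07FF) + 1     # lspci "Count="
            return enabled, masked, count
        pos = cfg[pos + 1] & 0xFC           # follow the next pointer
    return None


def read_msix_state(bdf, sysfs_root="/sys"):
    """Read config space of `bdf` from sysfs and decode its MSI-X state."""
    with open(f"{sysfs_root}/bus/pci/devices/{bdf}/config", "rb") as f:
        return msix_state(f.read(256))
```

Run in Dom0 as root (unprivileged reads only return the first 64 bytes of config space, which would miss a capability at offset 0x70), before and after a destroy/restart cycle. If the second element of the tuple is still True while the device sits idle in pciback, the reset did not clear the function mask.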

> 
> > As noted in the original LKML threads, vfio has similar relevant pci
> > device reset retry logic. (Thanks to Rich Persaud for this pointer:)
> > http://elixir.free-electrons.com/linux/v4.14-rc1/source/drivers/vfio/pci/vfio_pci.c#L1353
> > 
> > libvirt also performs similar reset logic, using a direct low level
> > interface to config space (Thanks to Marek for this pointer, libvirt
> > is used by Qubes:)
> > https://github.com/libvirt/libvirt/blob/master/src/util/virpci.c#L929
> > I think this indicates that it would be possible to extend libxl to
> > do something similar, but that seems less satisfactory compared to
> > performing the work in a kernel-provided implementation.
> > 
> > Is there a way forward to providing this functionality within Xen
> > software or Linux?
> > 
> > Christopher
> > --
> > 
> > openxt.org
> > 
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
