Xen project Mailing List

Re: [Xen-devel] [PATCH] libxl: Don't insert PCI device into xenstore for HVM guests

To: Malcolm Crossley <malcolm.crossley@xxxxxxxxxx>

From: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>

Date: Tue, 2 Jun 2015 10:34:12 -0400

Cc: Wei Liu <wei.liu2@xxxxxxxxxx>, Ian Campbell <ian.campbell@xxxxxxxxxx>, Stefano Stabellini <stefano.stabellini@xxxxxxxxxxxxx>, Ian Jackson <ian.jackson@xxxxxxxxxxxxx>, xen-devel@xxxxxxxxxxxxx, Ross Lagerwall <ross.lagerwall@xxxxxxxxxx>

Delivery-date: Tue, 02 Jun 2015 14:34:30 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

On Tue, Jun 02, 2015 at 11:06:26AM +0100, Malcolm Crossley wrote: > On 01/06/15 18:55, Konrad Rzeszutek Wilk wrote: > > On Mon, Jun 01, 2015 at 05:03:14PM +0100, Malcolm Crossley wrote: > >> On 01/06/15 16:43, Ross Lagerwall wrote: > >>> On 06/01/2015 04:26 PM, Konrad Rzeszutek Wilk wrote: > >>>> On Fri, May 29, 2015 at 08:59:45AM +0100, Ross Lagerwall wrote: > >>>>> When doing passthrough of a PCI device for an HVM guest, don't insert > >>>>> the device into xenstore, otherwise pciback attempts to use it which > >>>>> conflicts with QEMU. > >>>> > >>>> How does it conflict? > >>> > >>> It doesn't work with repeated use. See below. > >>> > >>>>> > >>>>> This manifests itself such that the first time a device is passed to a > >>>>> domain, it succeeds. Subsequent attempts fail unless the device is > >>>>> unbound from pciback or the machine rebooted. > >>>> > >>>> Can you be more specific please? What are the issues? Why does it > >>>> fail? > >>> > >>> Without this patch, if a device (e.g. a GPU) is bound to pciback and > >>> then passed through to a guest using xl pci-attach, it appears in the > >>> guest and works fine. If the guest is rebooted, and the device is again > >>> passed through with xl pci-attach, it appears in the guest as before but > >>> does not work. In Windows, it gets something like Error Code 43 and on > >>> Linux, the Nouveau driver fails to initialize the device (with error -22 > >>> or something). The only way to get the device to work again is to reboot > >>> the host or unbind and rebind it to pciback. > >>> > >>> With this patch, it works as expected. The device is bound to pciback > >>> and works after being passed through, even after the VM is rebooted. > >>> > >>>> > >>>> There are certain things that pciback does to "prepare" an PCI device > >>>> which QEMU also does. Some of them - such as saving the configuration > >>>> registers (And then restoring them after the device has been detached) - > >>>> is something that QEMU does not do. > >>>> > >>> > >>> I really have no idea what the correct thing to do is, but the current > >>> code with qemu-trad doesn't seem to work (for me). I think I know what the problem is. Do you by any chance have the XSA133-addenum patch in? If not could you apply it and tell me if it works? > >> > >> The pciback pci_stub.c implements the pciback.hide and the device reset > >> logic. > >> > >> The rest of pciback implements the pciback xenbus device which PV guests > >> need in order to map/unmap MSI interrupts and access PCI config space. > >> > >> QEMU emulates and handles the MSI interrupt capabilities and PCI config > >> space directly. > > > > Right.. > >> > >> This is why a pciback xenbus device should not be created for > >> passthrough PCI device being handled by QEMU. > > > > To me that sounds that we should not have PV drivers because QEMU > > emulates IDE or network devices. > > That is different. We first boot with QEMU handling the devices and then > we explictly unplug QEMU's handling of IDE and network devices. > > That handover protocol does not currently exist for PCI passthrough > devices so we have to chose one mechanism or the other to manage the > passed through PCI device at boot time. Otherwise a HVM guest could load > pcifront and cause's all kinds of chaos with interrupt management or > outbound MMIO window management. Which would be fun! :-) > > > > > The crux here is that none of the operations that pciback performs > > should affect QEMU or guests. But it does - so there is a bug. > > I agree there is a bug but should we try to fix it based upon my > comments above? I am still thinking about it. I do like certain things that pciback does as part of it being notified that a device is to be used by a guest and performing the configuration save/reset (see pcistub_put_pci_dev in the pciback). If somehow that can still be done by libxl (or QEMU) via SysFS that would be good. Just to clarify: - I concur with you that having xen-pcifront loaded in HVM guest and doing odd things behind QEMU is not good. - I like the fact that xen-pciback does a bunch of safety things with the PCI device to prepare it for a guest. - Currently these 'safety things' are done when you 'unbind' or 'bind' the device to pciback. - Or when the guest is shutdown and via XenBus we are told and can do the 'safety things'. This is the crux - if there is a way to do this via SysFS this would be super. Or perhaps xenpciback can figure out that the guest is HVM and ignore any XenBus actions? > > > > I would like to understand which ones do it so I can fix in > > pciback - as it might be also be a problem with PV. > > > > Unless... are you by any chance using extra patches on top of the > > native pciback? > > We do have extra patches but they only allow us to do a SBR on PCI > device's which require it. They failure listed above occurs on devices > with device specific resets (e.g. FLR,D3) as well so those extra patches > aren't being used. > > > > >> > >> Malcolm > >> > >>> > >>> Regards > >> > _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.