[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] Xen/pciback: Implement PCI slot or bus reset with 'do_flr' SysFS attribute



On Mon, Nov 06, 2017 at 12:48:42PM -0500, Govinda Tatti wrote:
> The life-cycle of a PCI device in Xen pciback is complex and is constrained
> by the generic PCI locking mechanism.
> 
> - It starts with the device being bound to us, for which we do a function
>   reset (done via SysFS so the PCI lock is held).
> - If the device is unbound from us, we also do a function reset
>   (done via SysFS so the PCI lock is held).
> - If the device is un-assigned from a guest - we do a function reset
>   (no PCI lock is held).
> 
> All reset operations are done on the individual PCI function level
> (so bus:device:function).
> 
> The reset for an individual PCI function means device must support FLR
> (PCIe or AF), PM reset on D3hot->D0 device specific reset, or a secondary
> bus reset for a singleton device on a bus but FLR does not have widespread
> support or it is not reliable in some cases. So, we need to provide an
> alternate mechanism to users to perform a slot or bus level reset.
> 
> Currently, a slot or bus reset is not exposed in SysFS as there is no good
> way of exposing a bus topology there. This is due to the complexity -
> we MUST know that the different functions of a PCIe device are not in use
> by other drivers, or if they are in use (say one of them is assigned to a
> guest and the other is  idle) - it is still OK to reset the slot (assuming
> both of them are owned by Xen pciback).
> 
> This patch does that by doing a slot or bus reset (if slot not supported)
> if all of the functions of a PCIe device belong to Xen PCIback.
> 
> Due to the complexity with the PCI lock we cannot do the reset when a
> device is bound ('echo $BDF > bind') or when unbound ('echo $BDF > unbind')
> as the pci_[slot|bus]_reset also takes the same lock resulting in a
> dead-lock.
> 
> Putting the reset function in a work-queue or thread won't work either -
> as we have to do the reset function outside the 'unbind' context (it holds
> the PCI lock). But once you 'unbind' a device the device is no longer under
> the ownership of Xen pciback and the pci_set_drvdata has been reset, so
> we cannot use a thread for this.
> 
> Instead of doing all this complex dance, we depend on the tool-stack doing
> the right thing. As such, we implement the 'do_flr' SysFS attribute which
> 'xl' uses when a device is detached or attached from/to a guest. It
> bypasses the need to worry about the PCI lock.
> 
> To not inadvertently do a bus reset that would affect devices that are in
> use by other drivers (other than Xen pciback) prior to the reset, we check
> that all of the devices under the bridge are owned by Xen pciback. If they
> are not, we refrain from executing the bus (or slot) reset.
> 
> Signed-off-by: Govinda Tatti <Govinda.Tatti@xxxxxxxxxx>
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>
> ---
>  Documentation/ABI/testing/sysfs-driver-pciback |  12 +++
>  drivers/xen/xen-pciback/pci_stub.c             | 125 
> +++++++++++++++++++++++++
>  2 files changed, 137 insertions(+)
> 
> diff --git a/Documentation/ABI/testing/sysfs-driver-pciback 
> b/Documentation/ABI/testing/sysfs-driver-pciback
> index 6a733bf..ccf7dc0 100644
> --- a/Documentation/ABI/testing/sysfs-driver-pciback
> +++ b/Documentation/ABI/testing/sysfs-driver-pciback
> @@ -11,3 +11,15 @@ Description:
>                  #echo 00:19.0-E0:2:FF > /sys/bus/pci/drivers/pciback/quirks
>                  will allow the guest to read and write to the configuration
>                  register 0x0E.
> +
> +What:           /sys/bus/pci/drivers/pciback/do_flr
> +Date:           Nov 2017
> +KernelVersion:  4.15
> +Contact:        xen-devel@xxxxxxxxxxxxxxxxxxxx
> +Description:
> +                An option to perform a slot or bus reset when a PCI device
> +             is owned by Xen PCI backend. Writing a string of DDDD:BB:DD.F
> +             will cause the pciback driver to perform a slot or bus reset
> +             if the device supports it. It also checks to make sure that
> +             all of the devices under the bridge are owned by Xen PCI
> +             backend.
> diff --git a/drivers/xen/xen-pciback/pci_stub.c 
> b/drivers/xen/xen-pciback/pci_stub.c
> index 6331a95..2b2c269 100644
> --- a/drivers/xen/xen-pciback/pci_stub.c
> +++ b/drivers/xen/xen-pciback/pci_stub.c
> @@ -244,6 +244,96 @@ struct pci_dev *pcistub_get_pci_dev(struct 
> xen_pcibk_device *pdev,
>       return found_dev;
>  }
>  
> +struct pcistub_args {
> +     struct pci_dev *dev;

const?

> +     int dcount;

unsigned int.

> +};
> +
> +static int pcistub_search_dev(struct pci_dev *dev, void *data)

Seems like this function would better return a boolean rather than an
int.

> +{
> +     struct pcistub_device *psdev;
> +     struct pcistub_args *arg = data;
> +     bool found_dev = false;
> +     unsigned long flags;
> +
> +     spin_lock_irqsave(&pcistub_devices_lock, flags);
> +
> +     list_for_each_entry(psdev, &pcistub_devices, dev_list) {
> +             if (psdev->dev == dev) {
> +                     found_dev = true;
> +                     arg->dcount++;
> +                     break;
> +             }
> +     }
> +
> +     spin_unlock_irqrestore(&pcistub_devices_lock, flags);
> +
> +     /* Device not owned by pcistub, someone owns it. Abort the walk */
> +     if (!found_dev)
> +             arg->dev = dev;
> +
> +     return found_dev ? 0 : 1;
> +}
> +
> +static int pcistub_reset_dev(struct pci_dev *dev)
> +{
> +     struct xen_pcibk_dev_data *dev_data;
> +     bool slot = false, bus = false;
> +
> +     if (!dev)
> +             return -EINVAL;
> +
> +     dev_dbg(&dev->dev, "[%s]\n", __func__);
> +
> +     if (!pci_probe_reset_slot(dev->slot)) {
> +             slot = true;
> +     } else if (!pci_probe_reset_bus(dev->bus)) {
> +             /* We won't attempt to reset a root bridge. */
> +             if (!pci_is_root_bus(dev->bus))
> +                     bus = true;

Con't you join the two if, ie:

} else if (!pci_probe_reset_bus(dev->bus) && !pci_is_root_bus(dev->bus)) {

> +     }
> +
> +     if (!bus && !slot)
> +             return -EOPNOTSUPP;
> +
> +     if (!slot) {
> +             struct pcistub_args arg = { .dev = NULL, .dcount = 0 };
> +
> +             /*
> +              * Make sure all devices on this bus are owned by the
> +              * PCI backend so that we can safely reset the whole bus.
> +              */
> +             pci_walk_bus(dev->bus, pcistub_search_dev, &arg);
> +
> +             /* All devices under the bus should be part of pcistub! */
> +             if (arg.dev) {
> +                     dev_err(&dev->dev, "%s device on the bus is not owned 
> by pcistub\n",
> +                             pci_name(arg.dev));
> +
> +                     return -EBUSY;

Not sure EBUSY is the best return code here, EINVAL?

> +             }
> +
> +             dev_dbg(&dev->dev, "pcistub owns %d devices on the bus\n",
> +                     arg.dcount);
> +     }
> +
> +     dev_data = pci_get_drvdata(dev);
> +     if (!pci_load_saved_state(dev, dev_data->pci_saved_state))
> +             pci_restore_state(dev);
> +
> +     /* This disables the device. */
> +     xen_pcibk_reset_device(dev);
> +
> +     /* Cleanup up any emulated fields */
> +     xen_pcibk_config_reset_dev(dev);
> +
> +     dev_dbg(&dev->dev, "resetting %s device using %s reset\n",
> +             pci_name(dev), slot ? "slot" : "bus");
> +
> +     return slot ? pci_try_reset_slot(dev->slot) :
> +                     pci_try_reset_bus(dev->bus);
> +}
> +
>  /*
>   * Called when:
>   *  - XenBus state has been reconfigure (pci unplug). See 
> xen_pcibk_remove_device
> @@ -1434,6 +1524,34 @@ static ssize_t permissive_show(struct device_driver 
> *drv, char *buf)
>  static DRIVER_ATTR(permissive, S_IRUSR | S_IWUSR, permissive_show,
>                  permissive_add);
>  
> +static ssize_t pcistub_do_flr(struct device_driver *drv, const char *buf,
> +                           size_t count)
> +{
> +     struct pcistub_device *psdev;
> +     int domain, bus, slot, func;
> +     int err;
> +
> +     err = str_to_slot(buf, &domain, &bus, &slot, &func);
> +     if (err)
> +             goto out;
                return err;

> +
> +     psdev = pcistub_device_find(domain, bus, slot, func);
        if (!psdev)
                return -ENODEV;

> +     if (psdev) {
> +             err = pcistub_reset_dev(psdev->dev);
> +             pcistub_device_put(psdev);
> +     } else {
> +             err = -ENODEV;
> +     }
> +
> +out:
> +     if (!err)
> +             err = count;
> +
> +     return err;

What's the purpose of returning count here?

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.