[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v3 10/11] vpci: Add initial support for virtual PCI bus topology
On 30.09.2021 09:52, Oleksandr Andrushchenko wrote: > From: Oleksandr Andrushchenko <oleksandr_andrushchenko@xxxxxxxx> > > Assign SBDF to the PCI devices being passed through with bus 0. This reads a little odd: If bus is already known (and I think you imply segment to also be known), it's only DF which get assigned. > The resulting topology is where PCIe devices reside on the bus 0 of the > root complex itself (embedded endpoints). > This implementation is limited to 32 devices which are allowed on > a single PCI bus. Or up to 256 when there are multi-function ones. Imo you at least want to spell out how that case is intended to be handled (even if maybe the code doesn't cover that case yet, in which case a respective code comment would also want leaving). > --- a/xen/drivers/passthrough/pci.c > +++ b/xen/drivers/passthrough/pci.c > @@ -831,6 +831,66 @@ int pci_remove_device(u16 seg, u8 bus, u8 devfn) > return ret; > } > > +#ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT May I ask why the code enclosed by this conditional has been put here rather than under drivers/vpci/? > +static struct vpci_dev *pci_find_virtual_device(const struct domain *d, > + const struct pci_dev *pdev) > +{ > + struct vpci_dev *vdev; > + > + list_for_each_entry ( vdev, &d->vdev_list, list ) > + if ( vdev->pdev == pdev ) > + return vdev; > + return NULL; > +} No locking here or ... > +int pci_add_virtual_device(struct domain *d, const struct pci_dev *pdev) > +{ > + struct vpci_dev *vdev; > + > + ASSERT(!pci_find_virtual_device(d, pdev)); ... in this first caller that I've managed to spot? See also below as to up the call tree the pcidevs-lock being held (which at the very least you would then want to ASSERT() for here). > + /* Each PCI bus supports 32 devices/slots at max. */ > + if ( d->vpci_dev_next > 31 ) > + return -ENOSPC; Please avoid open-coding literals when they can be suitably expressed. > + vdev = xzalloc(struct vpci_dev); > + if ( !vdev ) > + return -ENOMEM; > + > + /* We emulate a single host bridge for the guest, so segment is always > 0. */ > + vdev->seg = 0; > + > + /* > + * The bus number is set to 0, so virtual devices are seen > + * as embedded endpoints behind the root complex. > + */ > + vdev->bus = 0; Strictly speaking both of these assignments are redundant with you using xzalloc(). I'd prefer if there was just a comment, as the compiler has no way recognizing this in order to eliminate these stores. > + vdev->devfn = PCI_DEVFN(d->vpci_dev_next++, 0); > + > + vdev->pdev = pdev; > + vdev->domain = d; > + > + pcidevs_lock(); > + list_add_tail(&vdev->list, &d->vdev_list); > + pcidevs_unlock(); I don't support a global lock getting (ab)used for per-domain list management. Apart from that I don't understand why you acquire the lock here. Does that mean the functions further were truly left without any locking, by you not having noticed that this lock is already being held by the sole caller? > --- a/xen/drivers/vpci/vpci.c > +++ b/xen/drivers/vpci/vpci.c > @@ -91,20 +91,32 @@ int __hwdom_init vpci_add_handlers(struct pci_dev *pdev) > /* Notify vPCI that device is assigned to guest. */ > int vpci_assign_device(struct domain *d, const struct pci_dev *dev) > { > + int rc; > + > /* It only makes sense to assign for hwdom or guest domain. */ > if ( is_system_domain(d) || !has_vpci(d) ) > return 0; > > - return vpci_bar_add_handlers(d, dev); > + rc = vpci_bar_add_handlers(d, dev); > + if ( rc ) > + return rc; > + > + return pci_add_virtual_device(d, dev); > } I've peeked at the earlier patch, and both there and here I'm struggling to see how undoing partially completed steps or fully completed earlier steps is intended to work. I'm not convinced it is legitimate to leave handlers in place until the tool stack manages to roll back the failed device assignment. > /* Notify vPCI that device is de-assigned from guest. */ > int vpci_deassign_device(struct domain *d, const struct pci_dev *dev) > { > + int rc; > + > /* It only makes sense to de-assign from hwdom or guest domain. */ > if ( is_system_domain(d) || !has_vpci(d) ) > return 0; > > + rc = pci_remove_virtual_device(d, dev); > + if ( rc ) > + return rc; > + > return vpci_bar_remove_handlers(d, dev); > } So what's the ultimate effect of a partially de-assigned device, where one of the later steps failed? In a number of places we do best-effort full cleanup, by recording errors but nevertheless continuing with subsequent cleanup steps. I wonder whether that's a model to use here as well. > --- a/xen/include/xen/pci.h > +++ b/xen/include/xen/pci.h > @@ -137,6 +137,24 @@ struct pci_dev { > struct vpci *vpci; > }; > > +#ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT > +struct vpci_dev { > + struct list_head list; > + /* Physical PCI device this virtual device is connected to. */ > + const struct pci_dev *pdev; > + /* Virtual SBDF of the device. */ > + union { > + struct { > + uint8_t devfn; > + uint8_t bus; > + uint16_t seg; > + }; > + pci_sbdf_t sbdf; Could you explain to me why pci_sbdf_t (a typedef of a union) isn't providing all you need? By putting it in a union with a custom struct you set yourself up for things going out of sync if anyone chose to alter pci_sbdf_t's layout. > @@ -167,6 +185,10 @@ const unsigned long *pci_get_ro_map(u16 seg); > int pci_add_device(u16 seg, u8 bus, u8 devfn, > const struct pci_dev_info *, nodeid_t node); > int pci_remove_device(u16 seg, u8 bus, u8 devfn); > +#ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT > +int pci_add_virtual_device(struct domain *d, const struct pci_dev *pdev); > +int pci_remove_virtual_device(struct domain *d, const struct pci_dev *pdev); > +#endif Like for their definitions I question the placement of these declarations. > --- a/xen/include/xen/sched.h > +++ b/xen/include/xen/sched.h > @@ -444,6 +444,14 @@ struct domain > > #ifdef CONFIG_HAS_PCI > struct list_head pdev_list; > +#ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT > + struct list_head vdev_list; > + /* > + * Current device number used by the virtual PCI bus topology > + * to assign a unique SBDF to a passed through virtual PCI device. > + */ > + int vpci_dev_next; In how far can the number stored here be negative? If it can't be, please use unsigned int. As to the comment - "current" is ambiguous: Is it the number that was used last, or the next one to be used? Jan
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |