
Re: [Xen-devel] [PATCH v2 11/11] vpci/sriov: add support for SR-IOV capability



>>> On 17.07.18 at 11:48, <roger.pau@xxxxxxxxxx> wrote:
> The current code detects enabling of the virtual functions feature and
> automatically adds the VFs to the domain. It also detects enabling of
> memory space and maps the VF BARs into the domain p2m. Disabling of
> the VF enable bit removes the devices and the BAR memory map from the
> domain p2m.

"The domain" is ambiguous here: DYM the domain owning the PF, or
the (perhaps various) one(s) owning the VFs? From the code I guess
the latter, but that doesn't really look correct.

> @@ -500,6 +506,16 @@ static int init_bars(struct pci_dev *pdev)
>      };
>      int rc;
>  
> +    if ( pdev->info.is_virtfn )
> +        /*
> +         * No need to set traps for the command register or the BAR registers
> +         * because those are not used by VFs. Memory decoding and position of
> +         * the VF BARs is controlled from the PF.

But iirc the BARs can be read and written (for sizing purposes at least),
so I think you still need to handle accesses.
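
For reference, handling could look something along these lines (purely an
illustrative sketch, not a suggestion of the final interface: struct vf_bar,
taking the sizes from the PF's vf_rlen[], and the wiring into init_bars()
are assumptions made up for the example; only the handler prototypes and
vpci_add_register() follow what's visible elsewhere in the series):

/*
 * Illustrative sketch only: trap accesses to the VF BAR registers so that
 * the usual "write all-ones, read back ~(size - 1)" probing is emulated
 * rather than left unhandled.
 */
struct vf_bar {
    uint32_t guest_val;   /* last value the guest wrote */
    uint64_t size;        /* VF BAR size, taken from the owning PF */
};

static uint32_t vf_bar_read(const struct pci_dev *pdev, unsigned int reg,
                            void *data)
{
    const struct vf_bar *bar = data;

    /* Clear the bits below the BAR size, as sizing on real hardware would. */
    return bar->guest_val & ~(uint32_t)(bar->size - 1);
}

static void vf_bar_write(const struct pci_dev *pdev, unsigned int reg,
                         uint32_t val, void *data)
{
    struct vf_bar *bar = data;

    /* Only latch the value: VF BAR placement is controlled from the PF. */
    bar->guest_val = val;
}

/* In init_bars(), for a VF, with a hypothetical array of struct vf_bar: */
    for ( i = 0; i < ARRAY_SIZE(vf_bars); i++ )
    {
        rc = vpci_add_register(pdev->vpci, vf_bar_read, vf_bar_write,
                               PCI_BASE_ADDRESS_0 + i * 4, 4, &vf_bars[i]);
        if ( rc )
            return rc;
    }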

> +         * TODO: add DomU support for VFs.

It's not clear to me whether this TODO is meant to cover the above.

> +static void modify_memory_mapping(const struct pci_dev *pdev, unsigned int pos,
> +                                  bool enable)
> +{
> +    const struct vpci_sriov *sriov = pdev->vpci->sriov;
> +    unsigned int i;
> +    int rc;
> +
> +    if ( enable )
> +    {
> +        struct pci_dev *pf_dev;
> +
> +        pcidevs_lock();
> +        /*
> +         * NB: a non-const pci_dev of the PF is needed in order to update
> +         * vf_rlen.
> +         */
> +        pf_dev = pci_get_pdev(pdev->seg, pdev->bus, pdev->devfn);
> +        pcidevs_unlock();
> +        ASSERT(pf_dev);

        ASSERT(pf_dev == pdev);

?

> +static void control_write(const struct pci_dev *pdev, unsigned int reg,
> +                          uint32_t val, void *data)
> +{
> +    struct vpci_sriov *sriov = data;
> +    unsigned int pos = reg - PCI_SRIOV_CTRL;
> +    uint16_t control = pci_conf_read16(pdev->seg, pdev->bus,
> +                                       PCI_SLOT(pdev->devfn),
> +                                       PCI_FUNC(pdev->devfn),
> +                                       pos + PCI_SRIOV_CTRL);
> +    bool enabled = control & PCI_SRIOV_CTRL_VFE;
> +    bool mem_enabled = control & PCI_SRIOV_CTRL_MSE;
> +    bool new_enabled = val & PCI_SRIOV_CTRL_VFE;
> +    bool new_mem_enabled = val & PCI_SRIOV_CTRL_MSE;
> +
> +    if ( new_enabled != enabled )
> +    {
> +        if ( new_enabled )
> +        {
> +            struct callback_data *cb = xmalloc(struct callback_data);
> +            struct vcpu *curr = current;
> +
> +            if ( !cb )
> +            {
> +                gprintk(XENLOG_WARNING, "%04x:%02x:%02x.%u: "
> +                        "unable to allocate memory for SR-IOV enable\n",
> +                        pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
> +                        PCI_FUNC(pdev->devfn));
> +                return;
> +            }
> +
> +            /*
> +             * Only update the number of active VFs when enabling, when
> +             * disabling use the cached value in order to always remove the
> +             * same number of VFs that were active.
> +             */
> +            sriov->num_vfs = pci_conf_read16(pdev->seg, pdev->bus,
> +                                             PCI_SLOT(pdev->devfn),
> +                                             PCI_FUNC(pdev->devfn),
> +                                             pos + PCI_SRIOV_NUM_VF);
> +
> +            /*
> +             * NB: VFE needs to be enabled before calling pci_add_device so Xen
> +             * can access the config space of VFs.
> +             */
> +            pci_conf_write16(pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
> +                             PCI_FUNC(pdev->devfn), reg,
> +                             control | PCI_SRIOV_CTRL_VFE);
> +
> +            /*
> +             * The spec states that the software must wait at least 100ms
> +             * before attempting to access VF registers when enabling virtual
> +             * functions on the PF.
> +             */
> +            cb->pdev = pdev;
> +            cb->sriov = sriov;
> +            cb->pos = pos;
> +            cb->value = val;
> +            cb->new_enabled = new_enabled;
> +            cb->new_mem_enabled = new_mem_enabled;
> +            curr->vpci.task = WAIT;
> +            curr->vpci.wait.callback = enable_callback;
> +            curr->vpci.wait.data = cb;
> +            curr->vpci.wait.end = get_cycles() + 100 * cpu_khz;
> +            return;

Not sure whether it's better to ask this here or on the previous
patch: doesn't this WAIT mechanism lead to the guest actually spinning?
We certainly don't want that for a 100ms period.
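
If it does, one non-spinning alternative might be to defer the post-enable
work with a Xen timer instead of the WAIT state. A rough sketch only, for
illustration (everything except the timer interface itself -
init_timer()/set_timer()/NOW()/MILLISECS() - is a made-up name, and how the
deferred control write is re-processed is left open):

#include <xen/timer.h>
#include <xen/xmalloc.h>

struct sriov_enable_work {
    struct timer timer;
    const struct pci_dev *pdev;
    struct vpci_sriov *sriov;
    unsigned int pos;
    uint32_t value;
};

static void sriov_enable_tail(void *data)
{
    struct sriov_enable_work *work = data;

    /*
     * Runs from timer (softirq) context once the mandatory 100ms have
     * elapsed: add the VFs and finish processing the deferred control
     * register write here, without any vCPU having spun in the meantime.
     */
    xfree(work);
}

/* In control_write(), replacing the curr->vpci.task = WAIT sequence: */
    struct sriov_enable_work *work = xmalloc(struct sriov_enable_work);

    if ( !work )
        return;

    work->pdev = pdev;
    work->sriov = sriov;
    work->pos = pos;
    work->value = val;
    init_timer(&work->timer, sriov_enable_tail, work, smp_processor_id());
    set_timer(&work->timer, NOW() + MILLISECS(100));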

> +static void teardown_sriov(struct pci_dev *pdev)
> +{
> +    if ( pdev->vpci->sriov )
> +    {
> +        /* TODO: removing PFs is not currently supported. */
> +        ASSERT_UNREACHABLE();

Irrespective of the TODO I think it would be nice if this didn't lead
to a host crash.
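
Even just warning and bailing would be friendlier, e.g. (purely a sketch,
the message text and the choice to simply leak the state are assumptions):

static void teardown_sriov(struct pci_dev *pdev)
{
    if ( pdev->vpci->sriov )
    {
        /* TODO: removing PFs with SR-IOV state is not yet supported. */
        gprintk(XENLOG_WARNING,
                "%04x:%02x:%02x.%u: SR-IOV state not torn down\n",
                pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
                PCI_FUNC(pdev->devfn));
    }
}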

> --- a/xen/drivers/vpci/vpci.c
> +++ b/xen/drivers/vpci/vpci.c
> @@ -459,10 +459,14 @@ void vpci_write(pci_sbdf_t sbdf, unsigned int reg, unsigned int size,
>          return;
>      }
>  
> -    spin_lock(&pdev->vpci_lock);
> +    /*
> +     * NB: use the recursive variant here so that mapping and unmapping of the
> +     * VF BARs works correctly and can recursively take the PF lock.
> +     */
> +    spin_lock_recursive(&pdev->vpci_lock);
>      if ( !pdev->vpci )
>      {
> -        spin_unlock(&pdev->vpci_lock);
> +        spin_unlock_recursive(&pdev->vpci_lock);

Wouldn't it make sense to split out the locking changes, or do the
conversion right in patch 1 (with a suitable explanation)?

> @@ -144,6 +143,11 @@ struct vpci {
>              struct vpci_arch_msix_entry arch;
>          } entries[];
>      } *msix;
> +
> +    struct vpci_sriov {
> +        uint16_t num_vfs;

Any particular reason for this to be uint16_t?

Jan

