
Re: [Xen-devel] [PATCH v2 11/11] vpci/sriov: add support for SR-IOV capability



>>> On 17.07.18 at 11:48, <roger.pau@xxxxxxxxxx> wrote:
> The current code detects enabling of the virtual functions feature and
> automatically adds the VFs to the domain. It also detects enabling of
> memory space and maps the VF BARs into the domain p2m. Disabling of
> the VF enable bit removes the devices and the BAR memory map from the
> domain p2m.

"The domain" is ambiguous here: DYM the domain owning the PF, or
the (perhaps various) one(s) owning the VFs? From the code I guess
the latter, but that doesn't really look correct.

> @@ -500,6 +506,16 @@ static int init_bars(struct pci_dev *pdev)
>      };
>      int rc;
>  
> +    if ( pdev->info.is_virtfn )
> +        /*
> +         * No need to set traps for the command register or the BAR registers
> +         * because those are not used by VFs. Memory decoding and position of
> +         * the VF BARs is controlled from the PF.

But iirc the BARs can be read and written (for sizing purposes at least),
so I think you still need to handle accesses.
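
For reference, handling could look something along these lines (purely an
illustrative sketch, not a suggestion of the final interface: struct vf_bar,
taking the sizes from the PF's vf_rlen[], and the wiring into init_bars()
are assumptions made up for the example; only the handler prototypes and
vpci_add_register() follow what's visible elsewhere in the series):

/*
 * Illustrative sketch only: trap accesses to the VF BAR registers so that
 * the usual "write all-ones, read back ~(size - 1)" probing is emulated
 * rather than left unhandled.
 */
struct vf_bar {
    uint32_t guest_val;   /* last value the guest wrote */
    uint64_t size;        /* VF BAR size, taken from the owning PF */
};

static uint32_t vf_bar_read(const struct pci_dev *pdev, unsigned int reg,
                            void *data)
{
    const struct vf_bar *bar = data;

    /* Clear the bits below the BAR size, as sizing on real hardware would. */
    return bar->guest_val & ~(uint32_t)(bar->size - 1);
}

static void vf_bar_write(const struct pci_dev *pdev, unsigned int reg,
                         uint32_t val, void *data)
{
    struct vf_bar *bar = data;

    /* Only latch the value: VF BAR placement is controlled from the PF. */
    bar->guest_val = val;
}

/* In init_bars(), for a VF, with a hypothetical array of struct vf_bar: */
    for ( i = 0; i < ARRAY_SIZE(vf_bars); i++ )
    {
        rc = vpci_add_register(pdev->vpci, vf_bar_read, vf_bar_write,
                               PCI_BASE_ADDRESS_0 + i * 4, 4, &vf_bars[i]);
        if ( rc )
            return rc;
    }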

> +         * TODO: add DomU support for VFs.

It's not clear to me whether this TODO is meant to cover the above.

> +static void modify_memory_mapping(const struct pci_dev *pdev, unsigned int pos,
> +                                  bool enable)
> +{
> +    const struct vpci_sriov *sriov = pdev->vpci->sriov;
> +    unsigned int i;
> +    int rc;
> +
> +    if ( enable )
> +    {
> +        struct pci_dev *pf_dev;
> +
> +        pcidevs_lock();
> +        /*
> +         * NB: a non-const pci_dev of the PF is needed in order to update
> +         * vf_rlen.
> +         */
> +        pf_dev = pci_get_pdev(pdev->seg, pdev->bus, pdev->devfn);
> +        pcidevs_unlock();
> +        ASSERT(pf_dev);

        ASSERT(pf_dev == pdev);

?

> +static void control_write(const struct pci_dev *pdev, unsigned int reg,
> +                          uint32_t val, void *data)
> +{
> +    struct vpci_sriov *sriov = data;
> +    unsigned int pos = reg - PCI_SRIOV_CTRL;
> +    uint16_t control = pci_conf_read16(pdev->seg, pdev->bus,
> +                                       PCI_SLOT(pdev->devfn),
> +                                       PCI_FUNC(pdev->devfn),
> +                                       pos + PCI_SRIOV_CTRL);
> +    bool enabled = control & PCI_SRIOV_CTRL_VFE;
> +    bool mem_enabled = control & PCI_SRIOV_CTRL_MSE;
> +    bool new_enabled = val & PCI_SRIOV_CTRL_VFE;
> +    bool new_mem_enabled = val & PCI_SRIOV_CTRL_MSE;
> +
> +    if ( new_enabled != enabled )
> +    {
> +        if ( new_enabled )
> +        {
> +            struct callback_data *cb = xmalloc(struct callback_data);
> +            struct vcpu *curr = current;
> +
> +            if ( !cb )
> +            {
> +                gprintk(XENLOG_WARNING, "%04x:%02x:%02x.%u: "
> +                        "unable to allocate memory for SR-IOV enable\n",
> +                        pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
> +                        PCI_FUNC(pdev->devfn));
> +                return;
> +            }
> +
> +            /*
> +             * Only update the number of active VFs when enabling, when
> +             * disabling use the cached value in order to always remove the
> +             * same number of VFs that were active.
> +             */
> +            sriov->num_vfs = pci_conf_read16(pdev->seg, pdev->bus,
> +                                             PCI_SLOT(pdev->devfn),
> +                                             PCI_FUNC(pdev->devfn),
> +                                             pos + PCI_SRIOV_NUM_VF);
> +
> +            /*
> +             * NB: VFE needs to be enabled before calling pci_add_device so Xen
> +             * can access the config space of VFs.
> +             */
> +            pci_conf_write16(pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
> +                             PCI_FUNC(pdev->devfn), reg,
> +                             control | PCI_SRIOV_CTRL_VFE);
> +
> +            /*
> +             * The spec states that the software must wait at least 100ms
> +             * before attempting to access VF registers when enabling virtual
> +             * functions on the PF.
> +             */
> +            cb->pdev = pdev;
> +            cb->sriov = sriov;
> +            cb->pos = pos;
> +            cb->value = val;
> +            cb->new_enabled = new_enabled;
> +            cb->new_mem_enabled = new_mem_enabled;
> +            curr->vpci.task = WAIT;
> +            curr->vpci.wait.callback = enable_callback;
> +            curr->vpci.wait.data = cb;
> +            curr->vpci.wait.end = get_cycles() + 100 * cpu_khz;
> +            return;

Not sure whether it's better to ask this here or on the previous
patch: doesn't this WAIT mechanism lead to the guest actually spinning?
We certainly don't want that for a 100ms period.
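
If it does, one non-spinning alternative might be to defer the post-enable
work with a Xen timer instead of the WAIT state. A rough sketch only, for
illustration (everything except the timer interface itself -
init_timer()/set_timer()/NOW()/MILLISECS() - is a made-up name, and how the
deferred control write is re-processed is left open):

#include <xen/timer.h>
#include <xen/xmalloc.h>

struct sriov_enable_work {
    struct timer timer;
    const struct pci_dev *pdev;
    struct vpci_sriov *sriov;
    unsigned int pos;
    uint32_t value;
};

static void sriov_enable_tail(void *data)
{
    struct sriov_enable_work *work = data;

    /*
     * Runs from timer (softirq) context once the mandatory 100ms have
     * elapsed: add the VFs and finish processing the deferred control
     * register write here, without any vCPU having spun in the meantime.
     */
    xfree(work);
}

/* In control_write(), replacing the curr->vpci.task = WAIT sequence: */
    struct sriov_enable_work *work = xmalloc(struct sriov_enable_work);

    if ( !work )
        return;

    work->pdev = pdev;
    work->sriov = sriov;
    work->pos = pos;
    work->value = val;
    init_timer(&work->timer, sriov_enable_tail, work, smp_processor_id());
    set_timer(&work->timer, NOW() + MILLISECS(100));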

> +static void teardown_sriov(struct pci_dev *pdev)
> +{
> +    if ( pdev->vpci->sriov )
> +    {
> +        /* TODO: removing PFs is not currently supported. */
> +        ASSERT_UNREACHABLE();

Irrespective of the TODO I think it would be nice if this didn't lead
to a host crash.
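
Even just warning and bailing would be friendlier, e.g. (purely a sketch,
the message text and the choice to simply leak the state are assumptions):

static void teardown_sriov(struct pci_dev *pdev)
{
    if ( pdev->vpci->sriov )
    {
        /* TODO: removing PFs with SR-IOV state is not yet supported. */
        gprintk(XENLOG_WARNING,
                "%04x:%02x:%02x.%u: SR-IOV state not torn down\n",
                pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
                PCI_FUNC(pdev->devfn));
    }
}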

> --- a/xen/drivers/vpci/vpci.c
> +++ b/xen/drivers/vpci/vpci.c
> @@ -459,10 +459,14 @@ void vpci_write(pci_sbdf_t sbdf, unsigned int reg, unsigned int size,
>          return;
>      }
>  
> -    spin_lock(&pdev->vpci_lock);
> +    /*
> +     * NB: use the recursive variant here so that mapping and unmapping of the
> +     * VF BARs works correctly and can recursively take the PF lock.
> +     */
> +    spin_lock_recursive(&pdev->vpci_lock);
>      if ( !pdev->vpci )
>      {
> -        spin_unlock(&pdev->vpci_lock);
> +        spin_unlock_recursive(&pdev->vpci_lock);

Wouldn't it make sense to split out the locking changes, or do the
conversion right in patch 1 (with a suitable explanation)?

> @@ -144,6 +143,11 @@ struct vpci {
>              struct vpci_arch_msix_entry arch;
>          } entries[];
>      } *msix;
> +
> +    struct vpci_sriov {
> +        uint16_t num_vfs;

Any particular reason for this to be uint16_t?

Jan

