[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v4] vpci: Add resizable bar support


  • To: Jiqian Chen <Jiqian.Chen@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Tue, 7 Jan 2025 11:06:33 +0100
  • Autocrypt: addr=jbeulich@xxxxxxxx; keydata= xsDiBFk3nEQRBADAEaSw6zC/EJkiwGPXbWtPxl2xCdSoeepS07jW8UgcHNurfHvUzogEq5xk hu507c3BarVjyWCJOylMNR98Yd8VqD9UfmX0Hb8/BrA+Hl6/DB/eqGptrf4BSRwcZQM32aZK 7Pj2XbGWIUrZrd70x1eAP9QE3P79Y2oLrsCgbZJfEwCgvz9JjGmQqQkRiTVzlZVCJYcyGGsD /0tbFCzD2h20ahe8rC1gbb3K3qk+LpBtvjBu1RY9drYk0NymiGbJWZgab6t1jM7sk2vuf0Py O9Hf9XBmK0uE9IgMaiCpc32XV9oASz6UJebwkX+zF2jG5I1BfnO9g7KlotcA/v5ClMjgo6Gl MDY4HxoSRu3i1cqqSDtVlt+AOVBJBACrZcnHAUSuCXBPy0jOlBhxPqRWv6ND4c9PH1xjQ3NP nxJuMBS8rnNg22uyfAgmBKNLpLgAGVRMZGaGoJObGf72s6TeIqKJo/LtggAS9qAUiuKVnygo 3wjfkS9A3DRO+SpU7JqWdsveeIQyeyEJ/8PTowmSQLakF+3fote9ybzd880fSmFuIEJldWxp Y2ggPGpiZXVsaWNoQHN1c2UuY29tPsJgBBMRAgAgBQJZN5xEAhsDBgsJCAcDAgQVAggDBBYC AwECHgECF4AACgkQoDSui/t3IH4J+wCfQ5jHdEjCRHj23O/5ttg9r9OIruwAn3103WUITZee e7Sbg12UgcQ5lv7SzsFNBFk3nEQQCACCuTjCjFOUdi5Nm244F+78kLghRcin/awv+IrTcIWF hUpSs1Y91iQQ7KItirz5uwCPlwejSJDQJLIS+QtJHaXDXeV6NI0Uef1hP20+y8qydDiVkv6l IreXjTb7DvksRgJNvCkWtYnlS3mYvQ9NzS9PhyALWbXnH6sIJd2O9lKS1Mrfq+y0IXCP10eS FFGg+Av3IQeFatkJAyju0PPthyTqxSI4lZYuJVPknzgaeuJv/2NccrPvmeDg6Coe7ZIeQ8Yj t0ARxu2xytAkkLCel1Lz1WLmwLstV30g80nkgZf/wr+/BXJW/oIvRlonUkxv+IbBM3dX2OV8 AmRv1ySWPTP7AAMFB/9PQK/VtlNUJvg8GXj9ootzrteGfVZVVT4XBJkfwBcpC/XcPzldjv+3 HYudvpdNK3lLujXeA5fLOH+Z/G9WBc5pFVSMocI71I8bT8lIAzreg0WvkWg5V2WZsUMlnDL9 mpwIGFhlbM3gfDMs7MPMu8YQRFVdUvtSpaAs8OFfGQ0ia3LGZcjA6Ik2+xcqscEJzNH+qh8V m5jjp28yZgaqTaRbg3M/+MTbMpicpZuqF4rnB0AQD12/3BNWDR6bmh+EkYSMcEIpQmBM51qM EKYTQGybRCjpnKHGOxG0rfFY1085mBDZCH5Kx0cl0HVJuQKC+dV2ZY5AqjcKwAxpE75MLFkr wkkEGBECAAkFAlk3nEQCGwwACgkQoDSui/t3IH7nnwCfcJWUDUFKdCsBH/E5d+0ZnMQi+G0A nAuWpQkjM1ASeQwSHEeAWPgskBQL
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Huang Rui <ray.huang@xxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • Delivery-date: Tue, 07 Jan 2025 10:06:57 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 19.12.2024 06:21, Jiqian Chen wrote:
> --- /dev/null
> +++ b/xen/drivers/vpci/rebar.c
> @@ -0,0 +1,131 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (C) 2024 Advanced Micro Devices, Inc. All Rights Reserved.
> + *
> + * Author: Jiqian Chen <Jiqian.Chen@xxxxxxx>
> + */
> +
> +#include <xen/sched.h>
> +#include <xen/vpci.h>
> +
> +static void cf_check rebar_ctrl_write(const struct pci_dev *pdev,
> +                                      unsigned int reg,
> +                                      uint32_t val,
> +                                      void *data)
> +{
> +    struct vpci_bar *bar = data;
> +    uint64_t size = PCI_REBAR_CTRL_SIZE(val);
> +
> +    if ( bar->enabled )
> +    {
> +        /*
> +         * Refuse to resize a BAR while memory decoding is enabled, as
> +         * otherwise the size of the mapped region in the p2m would become
> +         * stale with the newly set BAR size, and the position of the BAR
> +         * would be reset to undefined.  Note the PCIe specification also
> +         * forbids resizing a BAR with memory decoding enabled.
> +         */
> +        if ( size != bar->size )
> +            gprintk(XENLOG_ERR,
> +                    "%pp: refuse to resize BAR with memory decoding 
> enabled\n",
> +                    &pdev->sbdf);
> +        return;
> +    }
> +
> +    if ( !((size >> PCI_REBAR_SIZE_BIAS) & bar->resizable_sizes) )
> +        gprintk(XENLOG_WARNING,
> +                "%pp: new size %#lx is not supported by hardware\n",
> +                &pdev->sbdf, size);
> +
> +    bar->size = size;

Shouldn't at least this be in an "else" to the if() above?

> +    bar->addr = 0;

For maximum compatibility with the behavior on bare metal, would we
perhaps better ...

> +    bar->guest_addr = 0;
> +    pci_conf_write32(pdev->sbdf, reg, val);

... re-read the BAR from hardware after this write?

Similar consideration may apply to ->guest_addr: Driver writers knowing
how their hardware behaves may expect that merely some of the bits of
the address get cleared (if the size increases).

> +static int cf_check init_rebar(struct pci_dev *pdev)
> +{
> +    uint32_t ctrl;
> +    unsigned int nbars;
> +    unsigned int rebar_offset = pci_find_ext_capability(pdev->sbdf,
> +                                                        
> PCI_EXT_CAP_ID_REBAR);
> +
> +    if ( !rebar_offset )
> +        return 0;
> +
> +    if ( !is_hardware_domain(pdev->domain) )
> +    {
> +        printk(XENLOG_ERR "%pp: resizable BARs unsupported for unpriv %pd\n",
> +               &pdev->sbdf, pdev->domain);
> +        return -EOPNOTSUPP;
> +    }
> +
> +    ctrl = pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CTRL(0));
> +    nbars = MASK_EXTR(ctrl, PCI_REBAR_CTRL_NBAR_MASK);
> +
> +    for ( unsigned int i = 0; i < nbars; i++ )
> +    {
> +        int rc;
> +        struct vpci_bar *bar;
> +        unsigned int index;
> +
> +        ctrl = pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CTRL(i));
> +        index = ctrl & PCI_REBAR_CTRL_BAR_IDX;;

Nit: No double semicolons please.

> +        if ( index >= PCI_HEADER_NORMAL_NR_BARS )
> +        {
> +            /*
> +             * TODO: for failed pathes, need to hide ReBar capability
> +             * from hardware domain instead of returning an error.
> +             */
> +            printk(XENLOG_ERR "%pd %pp: too big BAR number %u in 
> REBAR_CTRL\n",
> +                   pdev->domain, &pdev->sbdf, index);
> +            return -EINVAL;

With the TODO unaddressed, is it actually appropriate to return an error
here? Shouldn't we continue in a best effort manner? (Question also to
Roger as the maintainer.)

> +        }
> +
> +        bar = &pdev->vpci->header.bars[index];
> +        if ( bar->type != VPCI_BAR_MEM64_LO && bar->type != VPCI_BAR_MEM32 )
> +        {
> +            printk(XENLOG_ERR "%pd %pp: BAR%u is not in memory space\n",
> +                   pdev->domain, &pdev->sbdf, index);
> +            return -EINVAL;

Same question here then.

> +        }
> +
> +        rc = vpci_add_register(pdev->vpci, vpci_hw_read32, vpci_hw_write32,
> +                               rebar_offset + PCI_REBAR_CAP(i), 4, NULL);
> +        if ( rc )
> +        {
> +            printk(XENLOG_ERR "%pd %pp: fail to add reg of REBAR_CAP 
> rc=%d\n",
> +                   pdev->domain, &pdev->sbdf, rc);
> +            return rc;
> +        }
> +
> +        rc = vpci_add_register(pdev->vpci, vpci_hw_read32, rebar_ctrl_write,
> +                               rebar_offset + PCI_REBAR_CTRL(i), 4, bar);
> +        if ( rc )
> +        {
> +            printk(XENLOG_ERR "%pd %pp: fail to add reg of REBAR_CTRL 
> rc=%d\n",
> +                   pdev->domain, &pdev->sbdf, rc);
> +            return rc;
> +        }
> +
> +        bar->resizable_sizes |=
> +            (pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CAP(i)) >>
> +             PCI_REBAR_CAP_SHIFT);

Imo this would better use = in place of |= and (see also below) would also
better use MASK_EXTR() just like ...

> +        bar->resizable_sizes |=
> +            ((uint64_t)MASK_EXTR(ctrl, PCI_REBAR_CTRL_SIZES) <<
> +             (32 - PCI_REBAR_CAP_SHIFT));

... this one does.

Further I think you want to truncate the value for 32-bit BARs, such that
rebar_ctrl_write() would properly reject attempts to set sizes of 4G and
above for them.

> --- a/xen/drivers/vpci/vpci.c
> +++ b/xen/drivers/vpci/vpci.c
> @@ -232,6 +232,12 @@ void cf_check vpci_hw_write16(
>      pci_conf_write16(pdev->sbdf, reg, val);
>  }
>  
> +void cf_check vpci_hw_write32(
> +    const struct pci_dev *pdev, unsigned int reg, uint32_t val, void *data)
> +{
> +    pci_conf_write32(pdev->sbdf, reg, val);
> +}

This function is being added just to handle writing of a r/o register.
Can't you better re-use vpci_ignored_write()?

> --- a/xen/include/xen/pci_regs.h
> +++ b/xen/include/xen/pci_regs.h
> @@ -459,6 +459,7 @@
>  #define PCI_EXT_CAP_ID_ARI   14
>  #define PCI_EXT_CAP_ID_ATS   15
>  #define PCI_EXT_CAP_ID_SRIOV 16
> +#define PCI_EXT_CAP_ID_REBAR 21      /* Resizable BAR */
>  
>  /* Advanced Error Reporting */
>  #define PCI_ERR_UNCOR_STATUS 4       /* Uncorrectable Error Status */
> @@ -541,6 +542,19 @@
>  #define  PCI_VNDR_HEADER_REV(x)      (((x) >> 16) & 0xf)
>  #define  PCI_VNDR_HEADER_LEN(x)      (((x) >> 20) & 0xfff)
>  
> +/* Resizable BARs */
> +#define PCI_REBAR_SIZE_BIAS  20

I think it would be best if all register definitions came first, and
auxiliary ones followed afterwards (maybe even separated by a brief
comment for clarity).

> +#define PCI_REBAR_CAP(n)     (4 + 8 * (n))   /* capability register */
> +#define  PCI_REBAR_CAP_SHIFT         4               /* shift for supported 
> BAR sizes */
> +#define PCI_REBAR_CTRL(n)    (8 + 8 * (n))   /* control register */

Something's odd with the padding here. Please be consistent with the use
of whitespace (ought to be only hard tabs here afaict).

> +#define  PCI_REBAR_CTRL_BAR_IDX      0x00000007      /* BAR index */
> +#define  PCI_REBAR_CTRL_NBAR_MASK    0x000000E0      /* # of resizable BARs 
> */
> +#define  PCI_REBAR_CTRL_BAR_SIZE     0x00001F00      /* BAR size */

This field is 6 bits wide in the spec I'm looking at. Or else BAR sizes
2^^52 and up can't be encoded.

> +#define  PCI_REBAR_CTRL_SIZE(v) \
> +            (1UL << (MASK_EXTR(v, PCI_REBAR_CTRL_BAR_SIZE) \
> +                     + PCI_REBAR_SIZE_BIAS))
> +#define  PCI_REBAR_CTRL_SIZES                0xFFFF0000U     /* supported 
> BAR sizes */

PCI_REBAR_CAP_SHIFT and PCI_REBAR_CTRL_SIZES don't fit together very well.
Imo both want representing as masks.

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.