[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v7] vpci: Add resizable bar support



On Thu, Feb 06, 2025 at 05:39:00PM +0800, Jiqian Chen wrote:
> Some devices, like discrete GPU of amd, support resizable bar
                                     ^ AMD?

> capability, but vpci of Xen doesn't support this feature, so
> they fail to resize bars and then cause probing failure.
> 
> According to PCIe spec, each bar that supports resizing has
> two registers, PCI_REBAR_CAP and PCI_REBAR_CTRL. So, add
> handlers for them to support resizing the size of BARs.

You need to update the commit message to note Xen will only trap
PCI_REBAR_CTRL, as PCI_REBAR_CAP is read-only and the hardware domain
will already get access to it without needing any setup.

> 
> Signed-off-by: Jiqian Chen <Jiqian.Chen@xxxxxxx>

Just one comment about you not needing to introduce vpci_hw_write32()
anymore.  The rest are nits to comments, which I'm not even sure are
better than your proposed text.

> ---
> Hi all,
> v6->v7 changes:
> * Deleted codes that add register for PCI_REBAR_CAP, and added comments to 
> explain why.
> * Added comments to explain why use "continue" when fail to add register for 
> PCI_REBAR_CTRL.
> 
> Best regards,
> Jiqian Chen.
> 
> v5->v6 changes:
> * Changed "1UL" to "1ULL" in PCI_REBAR_CTRL_SIZE idefinition for 32 bit 
> architecture.
> * In rebar_ctrl_write used "bar - pdev->vpci->header.bars" to get index 
> instead of reading
>   from register.
> * Added the index of BAR to error messages.
> * Changed to "continue" instead of "return an error" when vpci_add_register 
> failed.
> 
> v4->v5 changes:
> * Called pci_size_mem_bar in rebar_ctrl_write to get addr and size of BAR 
> instead of setting
>   their values directly after writing new size to hardware.
> * Changed from "return" to "continue" when index/type of BAR are not correct 
> during initializing
>   BAR.
> * Corrected the value of PCI_REBAR_CTRL_BAR_SIZE from "0x00001F00" to 
> "0x00003F00".
> * Renamed PCI_REBAR_SIZE_BIAS to PCI_REBAR_CTRL_SIZE_BIAS.
> * Re-defined "PCI_REBAR_CAP_SHIFT 4" to "PCI_REBAR_CAP_SIZES_MASK 
> 0xFFFFFFF0U".
> 
> v3->v4 changes:
> * Removed PCI_REBAR_CAP_SIZES since it was not needed, and added
>   PCI_REBAR_CAP_SHIFT and PCI_REBAR_CTRL_SIZES.
> * Added parameter resizable_sizes to struct vpci_bar to cache the support 
> resizable sizes and
>   added the logic in init_rebar().
> * Changed PCI_REBAR_CAP to PCI_REBAR_CAP(n) (4+8*(n)), changed PCI_REBAR_CTRL 
> to
>   PCI_REBAR_CTRL(n) (8+8*(n)).
> * Added domain info of pci_dev to printings of init_rebar().
> 
> v2->v3 changes:
> * Used "bar->enabled" to replace "pci_conf_read16(pdev->sbdf, PCI_COMMAND) & 
> PCI_COMMAND_MEMORY",
>   and added comments why it needs this check.
> * Added "!is_hardware_domain(pdev->domain)" check in init_rebar() to return 
> EOPNOTSUPP for domUs.
> * Moved BAR type and index check into init_rebar(), then only need to check 
> once.
> * Added 'U' suffix for macro PCI_REBAR_CAP_SIZES.
> * Added macro PCI_REBAR_SIZE_BIAS to represent 20.
> TODO: need to hide ReBar capability from hardware domain when init_rebar() 
> fails.
> 
> v1->v2 changes:
> * In rebar_ctrl_write, to check if memory decoding is enabled, and added
>   some checks for the type of Bar.
> * Added vpci_hw_write32 to handle PCI_REBAR_CAP's write, since there is
>   no write limitation of dom0.
> * And has many other minor modifications as well.
> ---
>  xen/drivers/vpci/Makefile  |   2 +-
>  xen/drivers/vpci/rebar.c   | 136 +++++++++++++++++++++++++++++++++++++
>  xen/drivers/vpci/vpci.c    |   6 ++
>  xen/include/xen/pci_regs.h |  15 ++++
>  xen/include/xen/vpci.h     |   3 +
>  5 files changed, 161 insertions(+), 1 deletion(-)
>  create mode 100644 xen/drivers/vpci/rebar.c
> 
> diff --git a/xen/drivers/vpci/Makefile b/xen/drivers/vpci/Makefile
> index 1a1413b93e76..a7c8a30a8956 100644
> --- a/xen/drivers/vpci/Makefile
> +++ b/xen/drivers/vpci/Makefile
> @@ -1,2 +1,2 @@
> -obj-y += vpci.o header.o
> +obj-y += vpci.o header.o rebar.o
>  obj-$(CONFIG_HAS_PCI_MSI) += msi.o msix.o
> diff --git a/xen/drivers/vpci/rebar.c b/xen/drivers/vpci/rebar.c
> new file mode 100644
> index 000000000000..64b56a9567fa
> --- /dev/null
> +++ b/xen/drivers/vpci/rebar.c
> @@ -0,0 +1,136 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (C) 2025 Advanced Micro Devices, Inc. All Rights Reserved.
> + *
> + * Author: Jiqian Chen <Jiqian.Chen@xxxxxxx>
> + */
> +
> +#include <xen/sched.h>
> +#include <xen/vpci.h>
> +
> +static void cf_check rebar_ctrl_write(const struct pci_dev *pdev,
> +                                      unsigned int reg,
> +                                      uint32_t val,
> +                                      void *data)
> +{
> +    struct vpci_bar *bar = data;
> +    const unsigned int index = bar - pdev->vpci->header.bars;
> +    uint64_t size = PCI_REBAR_CTRL_SIZE(val);
> +
> +    if ( bar->enabled )
> +    {
> +        /*
> +         * Refuse to resize a BAR while memory decoding is enabled, as
> +         * otherwise the size of the mapped region in the p2m would become
> +         * stale with the newly set BAR size, and the position of the BAR
> +         * would be reset to undefined.  Note the PCIe specification also
> +         * forbids resizing a BAR with memory decoding enabled.
> +         */
> +        if ( size != bar->size )
> +            gprintk(XENLOG_ERR,
> +                    "%pp: refuse to resize BAR#%u with memory decoding 
> enabled\n",
> +                    &pdev->sbdf, index);
> +        return;
> +    }
> +
> +    if ( !((size >> PCI_REBAR_CTRL_SIZE_BIAS) & bar->resizable_sizes) )
> +        gprintk(XENLOG_WARNING,
> +                "%pp: new BAR#%u size %#lx is not supported by hardware\n",
> +                &pdev->sbdf, index, size);
> +
> +    pci_conf_write32(pdev->sbdf, reg, val);
> +
> +    pci_size_mem_bar(pdev->sbdf,
> +                     PCI_BASE_ADDRESS_0 + index * 4,
> +                     &bar->addr,
> +                     &bar->size,
> +                     (index == PCI_HEADER_NORMAL_NR_BARS - 1) ?
> +                      PCI_BAR_LAST : 0);
> +    bar->guest_addr = bar->addr;
> +}
> +
> +static int cf_check init_rebar(struct pci_dev *pdev)
> +{
> +    uint32_t ctrl;
> +    unsigned int nbars;
> +    unsigned int rebar_offset = pci_find_ext_capability(pdev->sbdf,
> +                                                        
> PCI_EXT_CAP_ID_REBAR);
> +
> +    if ( !rebar_offset )
> +        return 0;
> +
> +    if ( !is_hardware_domain(pdev->domain) )
> +    {
> +        printk(XENLOG_ERR "%pp: resizable BARs unsupported for unpriv %pd\n",
> +               &pdev->sbdf, pdev->domain);
> +        return -EOPNOTSUPP;
> +    }
> +
> +    ctrl = pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CTRL(0));
> +    nbars = MASK_EXTR(ctrl, PCI_REBAR_CTRL_NBAR_MASK);
> +    for ( unsigned int i = 0; i < nbars; i++ )
> +    {
> +        int rc;
> +        struct vpci_bar *bar;
> +        unsigned int index;
> +
> +        ctrl = pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CTRL(i));
> +        index = ctrl & PCI_REBAR_CTRL_BAR_IDX;
> +        if ( index >= PCI_HEADER_NORMAL_NR_BARS )
> +        {
> +            printk(XENLOG_ERR "%pd %pp: too big BAR number %u in 
> REBAR_CTRL\n",
> +                   pdev->domain, &pdev->sbdf, index);
> +            continue;
> +        }
> +
> +        bar = &pdev->vpci->header.bars[index];
> +        if ( bar->type != VPCI_BAR_MEM64_LO && bar->type != VPCI_BAR_MEM32 )
> +        {
> +            printk(XENLOG_ERR "%pd %pp: BAR%u is not in memory space\n",
> +                   pdev->domain, &pdev->sbdf, index);
> +            continue;
> +        }
> +
> +        /*
> +         * Here not to add register for PCI_REBAR_CAP since it is read-only

"Here not to add" doesn't read right to me (but I'm not a native
speaker), I would rather use:

"Don't add a handler for PCI_REBAR_CAP since it is read-only ..."

TBH I would even drop the comment from here, at the end we allow the
hardware domain unmediated access to a lot of read-only devices, and
there's no comment for each of them.

> +         * register without other specific operations, and hardware domain
> +         * has no limitation of read/write access to all PCI config space.
> +         */
> +        rc = vpci_add_register(pdev->vpci, vpci_hw_read32, rebar_ctrl_write,
> +                               rebar_offset + PCI_REBAR_CTRL(i), 4, bar);
> +        if ( rc )
> +        {
> +            printk(XENLOG_ERR "%pd %pp: BAR%u fail to add reg of REBAR_CTRL 
> rc=%d\n",
> +                   pdev->domain, &pdev->sbdf, index, rc);
> +            /*
> +             * Ideally we would hide the ReBar capability here, but code
> +             * for doing so still needs to be written. And using continue
> +             * can keep any possible hooks working, instead, returning
> +             * failure would cause all vPCI hooks down and hardware domain
> +             * has entirely unmediated access to the device, which is worse.
> +             */

"Ideally we would hide the ReBar capability on error, but code for
doing so still needs to be written. Use continue instead to keep any
already setup register hooks, as returning an error will cause
the hardware domain to get unmediated access to all device registers."

Seems slightly easier to parse IMO (again I'm not a native speaker, so
your proposed comment might be better).

> +            continue;
> +        }
> +
> +        bar->resizable_sizes =
> +            MASK_EXTR(pci_conf_read32(pdev->sbdf,
> +                                      rebar_offset + PCI_REBAR_CAP(i)),
> +                      PCI_REBAR_CAP_SIZES_MASK);
> +        bar->resizable_sizes |=
> +            (((uint64_t)MASK_EXTR(ctrl, PCI_REBAR_CTRL_SIZES_MASK) << 32) /
> +             ISOLATE_LSB(PCI_REBAR_CAP_SIZES_MASK));
> +    }
> +
> +    return 0;
> +}
> +REGISTER_VPCI_INIT(init_rebar, VPCI_PRIORITY_LOW);
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
> index 1e6aa5d799b9..3349b98389b8 100644
> --- a/xen/drivers/vpci/vpci.c
> +++ b/xen/drivers/vpci/vpci.c
> @@ -232,6 +232,12 @@ void cf_check vpci_hw_write16(
>      pci_conf_write16(pdev->sbdf, reg, val);
>  }
>  
> +void cf_check vpci_hw_write32(
> +    const struct pci_dev *pdev, unsigned int reg, uint32_t val, void *data)
> +{
> +    pci_conf_write32(pdev->sbdf, reg, val);
> +}

I think you now longer need to introduce vpci_hw_write32() since it's
not used anywhere in this patch.

Thanks, Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.