[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v4] vpci: Add resizable bar support



On Tue, Jan 07, 2025 at 04:58:07PM +0100, Jan Beulich wrote:
> On 07.01.2025 15:38, Roger Pau Monné wrote:
> > On Tue, Jan 07, 2025 at 11:06:33AM +0100, Jan Beulich wrote:
> >> On 19.12.2024 06:21, Jiqian Chen wrote:
> >>> --- /dev/null
> >>> +++ b/xen/drivers/vpci/rebar.c
> >>> @@ -0,0 +1,131 @@
> >>> +/* SPDX-License-Identifier: GPL-2.0-only */
> >>> +/*
> >>> + * Copyright (C) 2024 Advanced Micro Devices, Inc. All Rights Reserved.
> >>> + *
> >>> + * Author: Jiqian Chen <Jiqian.Chen@xxxxxxx>
> >>> + */
> >>> +
> >>> +#include <xen/sched.h>
> >>> +#include <xen/vpci.h>
> >>> +
> >>> +static void cf_check rebar_ctrl_write(const struct pci_dev *pdev,
> >>> +                                      unsigned int reg,
> >>> +                                      uint32_t val,
> >>> +                                      void *data)
> >>> +{
> >>> +    struct vpci_bar *bar = data;
> >>> +    uint64_t size = PCI_REBAR_CTRL_SIZE(val);
> >>> +
> >>> +    if ( bar->enabled )
> >>> +    {
> >>> +        /*
> >>> +         * Refuse to resize a BAR while memory decoding is enabled, as
> >>> +         * otherwise the size of the mapped region in the p2m would 
> >>> become
> >>> +         * stale with the newly set BAR size, and the position of the BAR
> >>> +         * would be reset to undefined.  Note the PCIe specification also
> >>> +         * forbids resizing a BAR with memory decoding enabled.
> >>> +         */
> >>> +        if ( size != bar->size )
> >>> +            gprintk(XENLOG_ERR,
> >>> +                    "%pp: refuse to resize BAR with memory decoding 
> >>> enabled\n",
> >>> +                    &pdev->sbdf);
> >>> +        return;
> >>> +    }
> >>> +
> >>> +    if ( !((size >> PCI_REBAR_SIZE_BIAS) & bar->resizable_sizes) )
> >>> +        gprintk(XENLOG_WARNING,
> >>> +                "%pp: new size %#lx is not supported by hardware\n",
> >>> +                &pdev->sbdf, size);
> >>> +
> >>> +    bar->size = size;
> >>
> >> Shouldn't at least this be in an "else" to the if() above?
> > 
> > I think this was already raised in a previous version - would be good
> > to know how real hardware behaves when an invalid size is set.  Is the
> > BAR register still reset?
> 
> I'm pretty sure what happens is undefined. I'd expect though that the
> BAR size then doesn't change. Which would require the above assignment
> to not be unconditional.

Might be better to just re-size the BAR, like you suggested to fetch
the BAR position from the register, instead of assuming 0.

> >>> +        if ( index >= PCI_HEADER_NORMAL_NR_BARS )
> >>> +        {
> >>> +            /*
> >>> +             * TODO: for failed pathes, need to hide ReBar capability
> >>> +             * from hardware domain instead of returning an error.
> >>> +             */
> >>> +            printk(XENLOG_ERR "%pd %pp: too big BAR number %u in 
> >>> REBAR_CTRL\n",
> >>> +                   pdev->domain, &pdev->sbdf, index);
> >>> +            return -EINVAL;
> >>
> >> With the TODO unaddressed, is it actually appropriate to return an error
> >> here? Shouldn't we continue in a best effort manner? (Question also to
> >> Roger as the maintainer.)
> > 
> > It would indeed be better to shallow the error and return 0, however
> > the handlers added in this loop would need removing if no error is
> > returned.
> 
> Would they? For those BARs where things worked fine I would think they
> could be left in place.

Hm, it's kind of partial support, but yes, that could likely be fine.
Then the return here should be a continue instead.

> >>> +        }
> >>> +
> >>> +        bar = &pdev->vpci->header.bars[index];
> >>> +        if ( bar->type != VPCI_BAR_MEM64_LO && bar->type != 
> >>> VPCI_BAR_MEM32 )
> >>> +        {
> >>> +            printk(XENLOG_ERR "%pd %pp: BAR%u is not in memory space\n",
> >>> +                   pdev->domain, &pdev->sbdf, index);
> >>> +            return -EINVAL;
> >>
> >> Same question here then.
> >>
> >>> +        }
> >>> +
> >>> +        rc = vpci_add_register(pdev->vpci, vpci_hw_read32, 
> >>> vpci_hw_write32,
> >>> +                               rebar_offset + PCI_REBAR_CAP(i), 4, NULL);
> >>> +        if ( rc )
> >>> +        {
> >>> +            printk(XENLOG_ERR "%pd %pp: fail to add reg of REBAR_CAP 
> >>> rc=%d\n",
> >>> +                   pdev->domain, &pdev->sbdf, rc);
> >>> +            return rc;
> >>> +        }
> >>> +
> >>> +        rc = vpci_add_register(pdev->vpci, vpci_hw_read32, 
> >>> rebar_ctrl_write,
> >>> +                               rebar_offset + PCI_REBAR_CTRL(i), 4, bar);
> >>> +        if ( rc )
> >>> +        {
> >>> +            printk(XENLOG_ERR "%pd %pp: fail to add reg of REBAR_CTRL 
> >>> rc=%d\n",
> >>> +                   pdev->domain, &pdev->sbdf, rc);
> >>> +            return rc;
> >>> +        }
> >>> +
> >>> +        bar->resizable_sizes |=
> >>> +            (pci_conf_read32(pdev->sbdf, rebar_offset + 
> >>> PCI_REBAR_CAP(i)) >>
> >>> +             PCI_REBAR_CAP_SHIFT);
> >>
> >> Imo this would better use = in place of |= and (see also below) would also
> >> better use MASK_EXTR() just like ...
> >>
> >>> +        bar->resizable_sizes |=
> >>> +            ((uint64_t)MASK_EXTR(ctrl, PCI_REBAR_CTRL_SIZES) <<
> >>> +             (32 - PCI_REBAR_CAP_SHIFT));
> >>
> >> ... this one does.
> >>
> >> Further I think you want to truncate the value for 32-bit BARs, such that
> >> rebar_ctrl_write() would properly reject attempts to set sizes of 4G and
> >> above for them.
> > 
> > For the hardware domain at least we shouldn't add such restriction -
> > Xen in general allows dom0 to do things it would otherwise consider
> > invalid, in case it has to deal with hardware quirks.
> > 
> > Rather than reject Xen should just print a warning that the sizes
> > supported by the device are likely invalid.
> 
> And do what when memory decode is re-enabled on the device? What size a
> P2M update should it do then?

You did suggest to re-read the BARs positions after a ctrl write, we
might as well read the BAR size and use that to be on the safe side.

> >>> --- a/xen/drivers/vpci/vpci.c
> >>> +++ b/xen/drivers/vpci/vpci.c
> >>> @@ -232,6 +232,12 @@ void cf_check vpci_hw_write16(
> >>>      pci_conf_write16(pdev->sbdf, reg, val);
> >>>  }
> >>>  
> >>> +void cf_check vpci_hw_write32(
> >>> +    const struct pci_dev *pdev, unsigned int reg, uint32_t val, void 
> >>> *data)
> >>> +{
> >>> +    pci_conf_write32(pdev->sbdf, reg, val);
> >>> +}
> >>
> >> This function is being added just to handle writing of a r/o register.
> >> Can't you better re-use vpci_ignored_write()?
> > 
> > But vpci_ignored_write() ignores the write, OTOH here the write is
> > propagated to the hardware.
> 
> Right, just for the hardware to drop it. I wouldn't have commented if
> the function needed to do things like this already existed. Adding yet
> another cf_check function just for this is what made me give the remark.

According to the spec yes, they will be ignored.  Yet for the hardware
domain we try to avoid changing behavior from native as much as
possible, hence propagating the write seems more appropriate.

Thanks, Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.