[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v9 07/11] vpci/bars: add handlers to map the BARs



On Wed, Mar 14, 2018 at 10:13:16AM -0600, Jan Beulich wrote:
> >>> On 14.03.18 at 15:04, <roger.pau@xxxxxxxxxx> wrote:
> > +static void modify_decoding(const struct pci_dev *pdev, bool map, bool 
> > rom_only)
> > +{
> > +    struct vpci_header *header = &pdev->vpci->header;
> > +    uint8_t slot = PCI_SLOT(pdev->devfn), func = PCI_FUNC(pdev->devfn);
> > +    unsigned int i;
> > +
> > +    for ( i = 0; i < ARRAY_SIZE(header->bars); i++ )
> > +    {
> > +        if ( rom_only && header->bars[i].type == VPCI_BAR_ROM )
> > +        {
> > +            unsigned int rom_pos = (i == PCI_HEADER_NORMAL_NR_BARS)
> > +                                   ? PCI_ROM_ADDRESS : PCI_ROM_ADDRESS1;
> > +            uint32_t val = pci_conf_read32(pdev->seg, pdev->bus, slot, 
> > func,
> > +                                           rom_pos);
> > +
> > +            header->bars[i].enabled = header->rom_enabled = map;
> > +
> > +            val &= ~PCI_ROM_ADDRESS_ENABLE;
> > +            val |= map ? PCI_ROM_ADDRESS_ENABLE : 0;
> > +            pci_conf_write32(pdev->seg, pdev->bus, slot, func, rom_pos, 
> > val);
> > +            break;
> > +        }
> > +        if ( !rom_only && (header->bars[i].type != VPCI_BAR_ROM ||
> > +                           header->rom_enabled) )
> > +            header->bars[i].enabled = map;
> 
> While this second if() has benefited from the rename to "rom_only",
> I'm now wondering about the validity of the first if(): Why would
> this need doing only in the "ROM only" case, but not in the
> "everything" one? Or is the parameter still suffering from its name
> being misleading? This also needs to be viewed in context of the
> call here from vpci_process_pending(), which passes (dropping
> the conditional there) v->vpci.rom, which doesn't exactly mean
> "ROM only".

Sorry, I should have changed v->vpci.rom to v->vpci.rom_only.

> If there's really no name for the parameter that can properly
> convey its meaning, please attach a clarifying comment. (Having
> reached the end of the patch I now seem to understand / recall
> that this is for the case where the ROM BAR's enable bit changes.
> That's what a comment could usefully say here.)

I will add:

/*
 * The rom_only parameter is used to signal the map/unmap helpers that
 * the ROM BAR's enable bit has changed with the memory decoding bit
 * already enabled. If rom_only is not set then it's the memory
 * decoding bit the one that changed.
 */

> > +    }
> > +
> > +    if ( !rom_only )
> > +    {
> 
> Note how due to this conditional the "break" above could
> actually be "return", making more obvious that the rest of the
> function isn't be needed in that case.

Right, I thought about changing the break to a return, let me do that
now.

> > +static int maybe_defer_map(struct domain *d, struct pci_dev *pdev,
> > +                           struct rangeset *mem, bool map, bool rom)
> > +{
> > +    struct vcpu *curr = current;
> > +    int rc;
> > +
> > +    if ( is_idle_vcpu(curr) )
> > +    {
> > +        struct map_data data = { .d = d, .map = true };
> > +
> > +        /*
> > +         * Dom0 building runs on the idle vCPU, in which case it's not 
> > possible
> > +         * to defer the operation (like done in the else branch). Call
> > +         * rangeset_consume_ranges in order to establish the mappings right
> > +         * away.
> > +         */
> 
> For one I think this comment belongs above the if(), as that's what
> it explains, not the ASSERT() that follows. And then it clarifies only
> half of what needs clarifying: Why can't we make it here on an idle
> vCPU outside of Dom0 building (e.g. through a tasklet), or if we can,
> why is the given behavior the intended one?

Since this seems to be causing confusion, what about using:

system_state != SYS_STATE_active

Instead of checking if running on the idle vpcu. Do you think that
would make it clearer?

Currently all mapping/unmapping operations when the domain is running
will be handled from the domain vCPU. That's a design choice that I
can write down here if we decide to keep the is_idle_vcpu check.

> > +        ASSERT(map && !rom);
> 
> I can see why you assume it's not an un-mapping request (albeit
> I wonder whether you couldn't instead simply set .map above to
> the incoming value), but why the !rom part?

This branch will only be used at Dom0 build time, when none of the
BARs are mapped into the p2m, so asking for an unmap in this case
would be wrong.

> > +static int modify_bars(const struct pci_dev *pdev, bool map, bool rom_only)
> > +{
> > +    struct vpci_header *header = &pdev->vpci->header;
> > +    struct rangeset *mem = rangeset_new(NULL, NULL, 0);
> > +    struct pci_dev *tmp, *dev = NULL;
> > +    unsigned int i;
> > +    int rc;
> > +
> > +    if ( !mem )
> > +        return -ENOMEM;
> > +
> > +    /*
> > +     * Create a rangeset that represents the current device BARs memory 
> > region
> > +     * and compare it against all the currently active BAR memory regions. 
> > If
> > +     * an overlap is found, subtract it from the region to be
> > +     * mapped/unmapped.
> > +     *
> > +     * NB: the rangeset uses inclusive frame numbers.
> 
> Is this a worthwhile remark to make? All rangesets do, so if at all
> that's what the comment should say.
> 
> > +     */
> > +
> > +    /*
> > +     * First fill the rangeset with all the BARs of this device or with 
> > the ROM
> > +     * BAR only, depending on whether the guest is toggling the memory 
> > decode
> > +     * bit of the command register, or the enable bit of the ROM BAR 
> > register.
> > +     */
> > +    for ( i = 0; i < ARRAY_SIZE(header->bars); i++ )
> > +    {
> > +        const struct vpci_bar *bar = &header->bars[i];
> > +
> > +        if ( !MAPPABLE_BAR(bar) ||
> > +             (rom_only ? bar->type != VPCI_BAR_ROM
> > +                       : (bar->type == VPCI_BAR_ROM && 
> > !header->rom_enabled)) )
> > +            continue;
> > +
> > +        rc = rangeset_add_range(mem, PFN_DOWN(bar->addr),
> > +                                PFN_UP(bar->addr + bar->size - 1));
> > +        if ( rc )
> > +        {
> > +            printk(XENLOG_G_WARNING
> > +                   "Failed to add [%" PRI_gfn ", %" PRI_gfn "]: %d\n",
> > +                   PFN_DOWN(bar->addr), PFN_UP(bar->addr + bar->size - 1),
> 
> I thought we had agreed that the parenthesization of tuples
> like this one should match meaning they want to convey. I'm
> having a hard time to see how PFN_UP() could ever go together
> with a closing square bracket.

There's a -1 in the PFN_UP, and it's exactly what we are adding to the
rangeset. Ie: rangeset_add_range(..., e, e) is adding the range [e,
e], not [e, e).

> > +static void cmd_write(const struct pci_dev *pdev, unsigned int reg,
> > +                      uint32_t cmd, void *data)
> > +{
> > +    uint8_t slot = PCI_SLOT(pdev->devfn), func = PCI_FUNC(pdev->devfn);
> > +    uint16_t current_cmd = pci_conf_read16(pdev->seg, pdev->bus, slot, 
> > func,
> > +                                           reg);
> > +
> > +    /*
> > +     * Let Dom0 play with all the bits directly except for the memory
> > +     * decoding one.
> > +     */
> > +    if ( (cmd ^ current_cmd) & PCI_COMMAND_MEMORY )
> > +        /*
> > +         * Ignore the error. No memory has been added or removed from the 
> > p2m,
> > +         * and the memory decoding has not been changed, so leave 
> > everything
> > +         * as-is, hoping the guest will realize and try again.
> > +         */
> > +        modify_bars(pdev, cmd & PCI_COMMAND_MEMORY, false);
> 
> So that comment appears to be correct, but I wonder if the reader
> could get a little more assistance, as it's not exactly obvious why no
> p2m changes would have occurred in case of failure: modify_bars()
> produces all its errors before doing any mapping, and
> maybe_defer_map() takes the "else" branch which doesn't cause
> any (direct) errors. Same for the similar comment in rom_write().

Let me expand that a little bit then to give some more context:

/*
 * Ignore the error. No memory has been added or removed from the p2m
 * (because the actual p2m changes are deferred in maybe_defer_map)
 * and the memory decoding bit has not been changed, so leave
 * everything as-is, hoping the guest will realize and try again.
 */

> > +#else /* !CONFIG_HAS_VPCI */
> > +struct vpci_vcpu {
> > +};
> > +#endif
> 
> To make clear even from e.g. simple grep output that this is a
> dummy, could I talk you into making this
> 
> #else /* !CONFIG_HAS_VPCI */
> struct vpci_vcpu {};
> #endif
> 
> ?

Sure.

Thanks for the review, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.