|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v6 08/11] vpci/bars: add handlers to map the BARs
>>> On 19.09.17 at 17:29, <roger.pau@xxxxxxxxxx> wrote:
> @@ -48,6 +49,9 @@ bool_t hvm_io_pending(struct vcpu *v)
> struct domain *d = v->domain;
> struct hvm_ioreq_server *s;
>
> + if ( has_vpci(v->domain) && vpci_check_pending(v) )
has_vpci(d)
> + return 1;
Indentation.
> +static int vpci_map_range(unsigned long s, unsigned long e, void *data,
> + unsigned long *c)
> +{
> + const struct map_data *map = data;
> + int rc;
> +
> + for ( ; ; )
> + {
> + unsigned long size = e - s + 1;
> +
> + rc = (map->map ? map_mmio_regions : unmap_mmio_regions)
> + (map->d, _gfn(s), size, _mfn(s));
> + if ( rc == 0 )
> + {
> + *c += size;
> + break;
> + }
> + if ( rc < 0 )
> + {
> + printk(XENLOG_G_WARNING
> + "Failed to identity %smap [%" PRI_gfn ", %" PRI_gfn ")
> for d%d: %d\n",
> + map ? "" : "un", s, e, map->d->domain_id, rc);
> + break;
> + }
ASSERT(rc < size) ?
> +bool vpci_check_pending(struct vcpu *v)
"check" in the function name generally suggests (to me at least) that
the parameter ought to be const. Perhaps vpci_process_pending()?
> +{
> + if ( v->vpci.mem )
> + {
> + int rc = vpci_map_memory(v->domain, v->vpci.mem, v->vpci.map);
> +
> + if ( rc == -ERESTART )
> + return true;
There's no real need for the local variable if all other return values
are simply discarded here. However, ...
> + rangeset_destroy(v->vpci.mem);
> + v->vpci.mem = NULL;
> + }
> +
> + return false;
> +}
... I'm not convinced this is a good error handling model. I don't
recall how previous versions dealt with this, but iirc we agreed to
generally make all such Dom0 handling best effort (here: don't skip the
remaining ranges if mapping of one failed). An exception may want/need
to be -ENOMEM.
> +static int vpci_check_bar_overlap(const struct pci_dev *pdev,
> + const struct vpci_bar *rom,
> + struct rangeset *mem)
> +{
> + const struct pci_dev *cmp;
> +
> + /* Check for overlaps with other device's BARs. */
> + list_for_each_entry(cmp, &pdev->domain->arch.pdev_list, domain_list)
> + {
> + unsigned int i;
> +
> + if ( rom == NULL && pdev == cmp )
> + continue;
This check looks rather unmotivated (or even bogus) without a comment.
The other special casing of ROM BARs further down also isn't all that
obvious (and right now I can't even convince myself it's correct).
> +static void vpci_modify_bars(const struct pci_dev *pdev, bool map)
> +{
> + struct vpci_header *header = &pdev->vpci->header;
> + struct rangeset *mem = rangeset_new(NULL, NULL, 0);
> + unsigned int i;
> + int rc;
> +
> + if ( !mem )
> + return;
> +
> + /*
> + * Create a rangeset that represents the current device BARs memory
> region
> + * and compare it against all the currently active BAR memory regions. If
> + * an overlap is found, subtract it from the region to be
> + * mapped/unmapped.
> + *
> + * NB: the rangeset uses inclusive frame numbers.
> + */
> +
> + /* First fill the rangeset with all the BARs of this device. */
> + for ( i = 0; i < ARRAY_SIZE(header->bars); i++ )
> + {
> + const struct vpci_bar *bar = &header->bars[i];
> +
> + if ( !MAPPABLE_BAR(bar) ||
> + (bar->type == VPCI_BAR_ROM && !bar->rom_enabled) )
> + continue;
> +
> + rc = rangeset_add_range(mem, PFN_DOWN(bar->addr),
> + PFN_DOWN(bar->addr + bar->size - 1));
> + if ( rc )
> + {
> + rangeset_destroy(mem);
> + return;
I'm afraid -ENOMEM here (which sadly is possible, as we don't maintain
any reserves) would produce a very hard to diagnose misbehavior. I think
you want to log a message here.
> + }
> + }
> +
> + /* Check for overlaps with other device's BARs. */
> + rc = vpci_check_bar_overlap(pdev, NULL, mem);
Why is this not symmetrical with vpci_modify_rom() (which also checks
overlaps inside the current device)?
> + if ( rc )
> + {
> + rangeset_destroy(mem);
> + return;
Same error handling comment as above, despite failure here being less
likely (hopefully at least). Perhaps worth joining the two paths.
> + }
> +
> + rc = vpci_maybe_defer_map(pdev->domain, mem, map);
> + if ( !rc )
> + for ( i = 0; i < ARRAY_SIZE(header->bars); i++ )
> + if ( header->bars[i].type != VPCI_BAR_ROM ||
> + header->bars[i].rom_enabled )
> + header->bars[i].enabled = map;
Hmm, you're updating state here regardless of possible failure in the
deferred operation (see the discarded error code in
vpci_check_pending()).
> +static uint32_t vpci_cmd_read(const struct pci_dev *pdev, unsigned int reg,
> + void *data)
> +{
> + return pci_conf_read16(pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
> + PCI_FUNC(pdev->devfn), reg);
> +}
Wouldn't it be worthwhile having generic read functions dealing with
simple cases like this (and the BAR) one?
> +static void vpci_cmd_write(const struct pci_dev *pdev, unsigned int reg,
> + uint32_t cmd, void *data)
> +{
> + uint8_t seg = pdev->seg, bus = pdev->bus;
> + uint8_t slot = PCI_SLOT(pdev->devfn), func = PCI_FUNC(pdev->devfn);
> + uint16_t current_cmd = pci_conf_read16(seg, bus, slot, func, reg);
> +
> + /*
> + * Let the guest play with all the bits directly except for the
> + * memory decoding one.
> + */
Please could you make clear it's only Dom0 we apply this lax model to?
Perhaps simply s/the guest/Dom0/.
> +static void vpci_bar_write(const struct pci_dev *pdev, unsigned int reg,
> + uint32_t val, void *data)
> +{
> + struct vpci_bar *bar = data;
> + uint8_t seg = pdev->seg, bus = pdev->bus;
> + uint8_t slot = PCI_SLOT(pdev->devfn), func = PCI_FUNC(pdev->devfn);
> + bool hi = false;
> +
> + if ( pci_conf_read16(seg, bus, slot, func, PCI_COMMAND) &
> + PCI_COMMAND_MEMORY )
> + {
> + gprintk(XENLOG_WARNING,
> + "%04x:%02x:%02x.%u: ignored BAR write with memory decoding
> enabled\n",
> + seg, bus, slot, func);
Indentation. Also any chance to log which BAR it was?
> +static void vpci_rom_write(const struct pci_dev *pdev, unsigned int reg,
> + uint32_t val, void *data)
> +{
> + struct vpci_bar *rom = data;
> + uint8_t seg = pdev->seg, bus = pdev->bus;
> + uint8_t slot = PCI_SLOT(pdev->devfn), func = PCI_FUNC(pdev->devfn);
> + uint16_t cmd = pci_conf_read16(seg, bus, slot, func, PCI_COMMAND);
> +
> + if ( (pci_conf_read16(seg, bus, slot, func, PCI_COMMAND) &
Please use cmd here.
> + PCI_COMMAND_MEMORY) && rom->rom_enabled )
> + {
> + gprintk(XENLOG_WARNING,
> + "%04x:%02x:%02x.%u: ignored ROM BAR write with memory
> decoding enabled\n",
> + seg, bus, slot, func);
Indentation again.
> +static int vpci_init_bars(struct pci_dev *pdev)
> +{
> + uint8_t seg = pdev->seg, bus = pdev->bus;
> + uint8_t slot = PCI_SLOT(pdev->devfn), func = PCI_FUNC(pdev->devfn);
> + uint16_t cmd;
> + uint64_t addr, size;
> + unsigned int i, num_bars, rom_reg;
> + struct vpci_header *header = &pdev->vpci->header;
> + struct vpci_bar *bars = header->bars;
> + pci_sbdf_t sbdf = {
> + .seg = seg,
> + .bus = bus,
> + .dev = slot,
> + .func = func,
> + };
> + int rc;
> +
> + switch ( pci_conf_read8(seg, bus, slot, func, PCI_HEADER_TYPE) & 0x7f )
> + {
> + case PCI_HEADER_TYPE_NORMAL:
> + num_bars = 6;
> + rom_reg = PCI_ROM_ADDRESS;
> + break;
> + case PCI_HEADER_TYPE_BRIDGE:
> + num_bars = 2;
> + rom_reg = PCI_ROM_ADDRESS1;
> + break;
> + default:
> + return -EOPNOTSUPP;
> + }
> +
> + /* Setup a handler for the command register. */
> + rc = vpci_add_register(pdev, vpci_cmd_read, vpci_cmd_write, PCI_COMMAND,
> + 2, header);
> + if ( rc )
> + return rc;
> +
> + /* Disable memory decoding before sizing. */
> + cmd = pci_conf_read16(seg, bus, slot, func, PCI_COMMAND);
> + if ( cmd & PCI_COMMAND_MEMORY )
> + pci_conf_write16(seg, bus, slot, func, PCI_COMMAND,
> + cmd & ~PCI_COMMAND_MEMORY);
> +
> + for ( i = 0; i < num_bars; i++ )
> + {
> + uint8_t reg = PCI_BASE_ADDRESS_0 + i * 4;
> + uint32_t val = pci_conf_read32(seg, bus, slot, func, reg);
> +
> + if ( i && bars[i - 1].type == VPCI_BAR_MEM64_LO )
> + {
> + bars[i].type = VPCI_BAR_MEM64_HI;
> + rc = vpci_add_register(pdev, vpci_bar_read, vpci_bar_write, reg,
> 4,
> + &bars[i]);
> + if ( rc )
> + {
> + pci_conf_write16(seg, bus, slot, func, PCI_COMMAND, cmd);
> + return rc;
> + }
> +
> + continue;
> + }
You don't need val up to here - please defer the read.
> + if ( (val & PCI_BASE_ADDRESS_SPACE) == PCI_BASE_ADDRESS_SPACE_IO )
> + {
> + bars[i].type = VPCI_BAR_IO;
> + continue;
> + }
> + if ( (val & PCI_BASE_ADDRESS_MEM_TYPE_MASK) ==
> + PCI_BASE_ADDRESS_MEM_TYPE_64 )
> + bars[i].type = VPCI_BAR_MEM64_LO;
> + else
> + bars[i].type = VPCI_BAR_MEM32;
> +
> + /* Size the BAR and map it. */
Isn't the map part of this comment stale now? And without it,
considering the function name called, it is perhaps no longer worth
having it.
> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -20,6 +20,9 @@
> #include <xen/smp.h>
> #include <xen/perfc.h>
> #include <asm/atomic.h>
> +#ifdef CONFIG_HAS_PCI
> +#include <xen/vpci.h>
> +#endif
Perhaps the conditional would better live in that header.
> @@ -264,6 +267,11 @@ struct vcpu
>
> struct evtchn_fifo_vcpu *evtchn_fifo;
>
> +#ifdef CONFIG_HAS_PCI
> + /* vPCI per-vCPU area, used to store data for long running operations. */
> + struct vpci_vcpu vpci;
> +#endif
And perhaps the header would better provide an empty structure for the
"else" case. Another option would be to include the fields ...
> struct arch_vcpu arch;
... in this structure.
> --- a/xen/include/xen/vpci.h
> +++ b/xen/include/xen/vpci.h
> @@ -35,11 +35,52 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg,
> unsigned int size);
> void vpci_write(pci_sbdf_t sbdf, unsigned int reg, unsigned int size,
> uint32_t data);
>
> +/*
> + * Check for pending vPCI operations on this vcpu. Returns true if the vcpu
> + * should not run.
> + */
> +bool vpci_check_pending(struct vcpu *v);
> +
> struct vpci {
> /* List of vPCI handlers for a device. */
> struct list_head handlers;
> spinlock_t lock;
> +
> +#ifdef __XEN__
> + /* Hide the rest of the vpci struct from the user-space test harness. */
> + struct vpci_header {
> + /* Information about the PCI BARs of this device. */
> + struct vpci_bar {
> + paddr_t addr;
> + uint64_t size;
> + enum {
> + VPCI_BAR_EMPTY,
> + VPCI_BAR_IO,
> + VPCI_BAR_MEM32,
> + VPCI_BAR_MEM64_LO,
> + VPCI_BAR_MEM64_HI,
> + VPCI_BAR_ROM,
> + } type;
> + bool prefetchable;
> + /* Store whether the BAR is mapped into guest p2m. */
> + bool enabled;
> + /*
> + * Store whether the ROM enable bit is set (doesn't imply ROM BAR
> + * is mapped into guest p2m). Only used for type VPCI_BAR_ROM.
> + */
> + bool rom_enabled;
Especially with the error handling issue in mind that I've mentioned
earlier, I wonder whether this field shouldn't be dropped, along the
lines of you also no longer caching the memory decode enable bit in the
command register.
Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |