Re: [PATCH v2 1/3] vpci: allow queueing of mapping operations
On Fri, Aug 01, 2025 at 05:06:32PM -0400, Stewart Hildebrand wrote:
> On 7/25/25 03:58, Roger Pau Monné wrote:
> > On Thu, Jul 24, 2025 at 06:44:32PM +0200, Roger Pau Monné wrote:
> >> On Wed, Jul 23, 2025 at 12:37:41PM -0400, Stewart Hildebrand wrote:
> >>> @@ -283,7 +297,48 @@ static int __init apply_map(struct domain *d, const struct pci_dev *pdev,
> >>> return rc;
> >>> }
> >>>
> >>> -static void defer_map(const struct pci_dev *pdev, uint16_t cmd, bool rom_only)
> >>> +static struct vpci_map_task *alloc_map_task(const struct pci_dev *pdev,
> >>> + uint16_t cmd, bool rom_only)
> >>> +{
> >>> + struct vpci_map_task *task = xzalloc(struct vpci_map_task);
> >>
> >> xvzalloc() preferably.
> >>
> >> This however introduces run-time allocations as a result of guest
> >> actions, which is not ideal IMO. It would be preferable to do those
> >> allocations as part of the header initialization, and re-use them.
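For illustration, a minimal sketch of the xvzalloc() form being suggested here, assuming the usual xen/xvmalloc.h pairing with xvfree() (the body of alloc_map_task() beyond the allocation isn't shown in the quoted hunk):

static struct vpci_map_task *alloc_map_task(const struct pci_dev *pdev,
                                            uint16_t cmd, bool rom_only)
{
    /*
     * Sketch only: xvzalloc() transparently uses vmalloc()-backed storage
     * for allocations too large for xmalloc(), so the caller doesn't need
     * to care which backend ends up being used.
     */
    struct vpci_map_task *task = xvzalloc(struct vpci_map_task);

    if ( !task )
        return NULL;

    /* ... populate cmd, rom_only and the per-BAR state ... */

    return task;
}

The matching free path would then use xvfree(task) rather than xfree(task).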
> >
> > I've been thinking over this, as I've realized that while commenting
> > on it, I didn't provide any alternatives.
> >
> > The usage of rangesets to figure out the regions to map is already not
> > optimal, as adding/removing from a rangeset can lead to memory
> > allocations. It would be good if we could create rangesets with a
> > pre-allocated number of ranges (iow: a pool of struct ranges), but
> > that's for another patchset. I think Jan already commented on this
> > aspect long time ago.
>
> +1
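To illustrate the pre-allocation idea with a purely hypothetical interface (nothing like this exists in xen/common/rangeset.c today):

/*
 * Hypothetical sketch only: attach a free list of nr_ranges pre-allocated
 * struct ranges to the rangeset at creation time, so that later
 * rangeset_add_range()/rangeset_remove_range() calls can take entries from
 * that pool instead of calling xmalloc() at run time as a result of guest
 * actions.
 */
struct rangeset *rangeset_new_prealloc(struct domain *d, const char *name,
                                       unsigned int flags,
                                       unsigned int nr_ranges);

/* Allocation would then happen at initialisation time, e.g.: */
bar->mem = rangeset_new_prealloc(d, NULL, RANGESETF_no_print, 8);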
>
> > I'm considering whether to allocate the deferred mapping structures
> > per-vCPU instead of per-device. That would for example mean moving
> > the current vpci_bar->mem rangeset so it's allocated in vpci_vcpu
> > struct instead. The point would be to not have the rangesets per
> > device (because there can be a lot of devices, specially for the
> > hardware domain), but instead have those per-vCPU. This should work
> > because a vCPU can only queue a single vPCI operation, from a single
> > device.
> >
> > It should then be possible to allocate the deferred mapping structures
> > at vCPU creation. I also ponder if we really need a linked list to
> > queue them; AFAIK there can only ever be an unmapping and a mapping
> > operation pending (so 2 operations at most). Hence we could use a
> > more "fixed" structure like an array. For example in struct vpci_vcpu
> > you could introduce a struct vpci_map_task task[2] field?
> >
> > Sorry, I know this is not a minor change to request. It shouldn't
> > change the overall logic much, but it would inevitably affect the
> > code. Let me know what you think.
>
> Thanks for the feedback and suggestion. Yeah, I'll give this a try.
> Here's roughly what I'm thinking so far. I'll keep playing with it.
>
> diff --git a/xen/common/domain.c b/xen/common/domain.c
> index 5241a1629eeb..942c9fe7d364 100644
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -387,6 +387,16 @@ static int vmtrace_alloc_buffer(struct vcpu *v)
> */
> static int vcpu_teardown(struct vcpu *v)
> {
> +#ifdef CONFIG_HAS_VPCI
> + for ( unsigned int i = 0; i < ARRAY_SIZE(v->vpci.task); i++ )
> + {
> + struct vpci_map_task *task = &v->vpci.task[i];
> +
> + for ( unsigned int j = 0; j < ARRAY_SIZE(task->bars); j++ )
> + rangeset_destroy(task->bars[j].mem);
You might want to additionally do:
task->bars[j].mem = NULL;
> + }
> +#endif
> +
> vmtrace_free_buffer(v);
>
> return 0;
> @@ -467,6 +477,26 @@ struct vcpu *vcpu_create(struct domain *d, unsigned int vcpu_id)
> d->vcpu[prev_id]->next_in_list = v;
> }
>
> +#ifdef CONFIG_HAS_VPCI
> + for ( unsigned int i = 0; i < ARRAY_SIZE(v->vpci.task); i++ )
> + {
> + struct vpci_map_task *task = &v->vpci.task[i];
> +
> + for ( unsigned int j = 0; j < ARRAY_SIZE(task->bars); j++ )
> + {
> + struct vpci_bar_map *bar = &task->bars[j];
> + char str[32];
> +
> + snprintf(str, sizeof(str), "PCI map vcpu %u task %u BAR %u", vcpu_id, i, j);
> +
> + bar->mem = rangeset_new(v->domain, str, RANGESETF_no_print);
Not sure there's much point in naming those with that much detail -
those are scratch space for mapping calculations. You already pass
RANGESETF_no_print, which means the contents of the rangeset won't be
dumped, and hence the name is largely meaningless. In hindsight I shouldn't
have named the ones allocated in bar_add_rangeset() either.
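For reference, a sketch of how the simplified allocation loop might look, assuming rangeset_new() is passed a NULL name (which it handles internally):

/*
 * Scratch rangesets used for mapping calculations: with RANGESETF_no_print
 * they never show up in the rangeset dump, so no descriptive name is needed.
 */
for ( unsigned int j = 0; j < ARRAY_SIZE(task->bars); j++ )
{
    struct vpci_bar_map *bar = &task->bars[j];

    bar->mem = rangeset_new(v->domain, NULL, RANGESETF_no_print);
    if ( !bar->mem )
        goto fail_sched;
}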
> +
> + if ( !bar->mem )
> + goto fail_sched;
> + }
> + }
> +#endif
> +
> /* Must be called after making new vcpu visible to for_each_vcpu(). */
> vcpu_check_shutdown(v);
>
> diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
> index 17cfecb0aabf..afe78b00ffc9 100644
> --- a/xen/include/xen/vpci.h
> +++ b/xen/include/xen/vpci.h
> @@ -116,7 +116,6 @@ struct vpci {
> uint64_t guest_addr;
> uint64_t size;
> uint64_t resizable_sizes;
> - struct rangeset *mem;
> enum {
> VPCI_BAR_EMPTY,
> VPCI_BAR_IO,
> @@ -207,14 +206,23 @@ struct vpci {
> #endif
> };
>
> +#ifdef __XEN__
> struct vpci_vcpu {
> /* Per-vcpu structure to store state while {un}mapping of PCI BARs. */
> const struct pci_dev *pdev;
> - uint16_t cmd;
> - bool rom_only : 1;
> + struct domain *domain;
> + unsigned int nr_pending_ops;
Not sure you really need a pending ops counter? Hard to tell without
seeing the code that makes use of it.
> + struct vpci_map_task {
> + struct vpci_bar_map {
> + uint64_t addr;
> + uint64_t guest_addr;
> + struct rangeset *mem;
> + } bars[PCI_HEADER_NORMAL_NR_BARS + 1];
> + uint16_t cmd;
> + bool rom_only : 1;
> + } task[2];
Don't you need a way to differentiate between map/unmap operations?
Do you plan to use slot 0 as unmap and slot 1 as map? Or would you
rather introduce a boolean field to signal it in struct vpci_map_task?
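If the boolean route is taken, a minimal sketch could be (the "map" field name is made up here, not taken from the patch):

struct vpci_map_task {
    struct vpci_bar_map {
        uint64_t addr;
        uint64_t guest_addr;
        struct rangeset *mem;
    } bars[PCI_HEADER_NORMAL_NR_BARS + 1];
    uint16_t cmd;
    bool rom_only : 1;
    bool map : 1;       /* true: map the BARs; false: unmap them. */
} task[2];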
Overall seems OK, but obviously it all needs to fit together with the
current code :).
Thanks, Roger.