|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v2 3/6] x86/iommu: introduce a rangeset to perform hwdom IOMMU setup
On Tue, Dec 05, 2023 at 03:50:44PM +0100, Jan Beulich wrote:
> On 04.12.2023 10:43, Roger Pau Monne wrote:
> > This change just introduces the boilerplate code in order to use a rangeset
> > when setting up the hardware domain IOMMU mappings. The rangeset is never
> > populated in this patch, so it's a non-functional change as far as the
> > mappings
> > the domain gets established.
> >
> > Note there's a change for HVM domains (ie: PVH dom0) that will get switched
> > to
> > create the p2m mappings using map_mmio_regions() instead of
> > p2m_add_identity_entry(), so that ranges can be mapped with a single
> > function
> > call if possible. Note that the interface of map_mmio_regions() doesn't
> > allow
> > creating read-only mappings, but so far there are no such mappings created
> > for
> > PVH dom0 in arch_iommu_hwdom_init().
>
> I don't understand this paragraph: The rangeset remains empty, so nothing is
> changing right here. DYM there is going to be such a change as a result of
> this patch, but in a later part of this series?
Yes, when the rangeset is populated and mappings are created based on
its contents, map_mmio_regions() will be used instead of
p2m_add_identity_entry(). I guess the '... that will get switched to
create the p2m ...' is not clear enough.
> > --- a/xen/drivers/passthrough/x86/iommu.c
> > +++ b/xen/drivers/passthrough/x86/iommu.c
> > @@ -370,10 +370,77 @@ static unsigned int __hwdom_init
> > hwdom_iommu_map(const struct domain *d,
> > return perms;
> > }
> >
> > +struct map_data {
> > + struct domain *d;
> > + unsigned int flush_flags;
> > + bool ro;
> > +};
> > +
> > +static int __hwdom_init cf_check identity_map(unsigned long s, unsigned
> > long e,
> > + void *data)
> > +{
> > + struct map_data *info = data;
> > + struct domain *d = info->d;
> > + long rc;
> > +
> > + if ( iommu_verbose )
> > + printk(XENLOG_INFO " [%010lx, %010lx] R%c\n",
> > + s, e, info->ro ? 'O' : 'W');
> > +
> > + if ( paging_mode_translate(d) )
> > + {
> > + if ( info->ro )
> > + {
> > + ASSERT_UNREACHABLE();
> > + return 0;
> > + }
> > + while ( (rc = map_mmio_regions(d, _gfn(s), e - s + 1, _mfn(s))) >
> > 0 )
> > + {
> > + s += rc;
> > + process_pending_softirqs();
> > + }
> > + }
> > + else
> > + {
> > + const unsigned int perms = IOMMUF_readable | IOMMUF_preempt |
> > + (info->ro ? 0 : IOMMUF_writable);
> > +
> > + if ( info->ro && !iomem_access_permitted(d, s, e) )
>
> How is r/o-ness related to iomem_access_permitted()? The present callers
> are such that there is a connection, but that's invisible here. I guess
> either the field wants to change name (maybe mmio_ro or ro_mmio or even
> just mmio), or there wants to be a comment.
Will add:
"read-only ranges are only created based on the contents of
mmio_ro_ranges, and hence need the additional iomem_access_permitted()
check."
> > + {
> > + /*
> > + * Should be more fine grained in order to not map the
> > forbidden
> > + * frame instead of rejecting the region as a whole, but it's
> > only
> > + * for read-only MMIO regions, which are very limited.
> > + */
>
> How certain are you/we that no two adjacent ones may appear, with
> different permissions granted to Dom0?
Yeah, I was already not very convinced by this, and I think the only
solution here is to iterate over the read-only ranges with page
granularity. In any case read-only ranges are both few and small in
size, hence this is unlikely to be noticeable performance wise.
> > + printk(XENLOG_DEBUG
> > + "IOMMU read-only mapping of region [%lx, %lx]
> > forbidden\n",
> > + s, e);
> > + return 0;
> > + }
> > + while ( (rc = iommu_map(d, _dfn(s), _mfn(s), e - s + 1,
> > + perms, &info->flush_flags)) > 0 )
> > + {
> > + s += rc;
> > + process_pending_softirqs();
> > + }
> > + }
> > + ASSERT(rc <= 0);
> > + if ( rc )
> > + printk(XENLOG_WARNING
> > + "IOMMU identity mapping of [%lx, %lx] failed: %ld\n",
> > + s, e, rc);
> > +
> > + /* Ignore errors and attempt to map the remaining regions. */
> > + return 0;
> > +}
> > +
> > void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
> > {
> > unsigned long i, top, max_pfn, start, count;
> > unsigned int flush_flags = 0, start_perms = 0;
> > + struct rangeset *map;
> > + struct map_data map_data = { .d = d };
> > + int rc;
> >
> > BUG_ON(!is_hardware_domain(d));
> >
> > @@ -397,6 +464,10 @@ void __hwdom_init arch_iommu_hwdom_init(struct domain
> > *d)
> > if ( iommu_hwdom_passthrough )
> > return;
> >
> > + map = rangeset_new(NULL, NULL, 0);
> > + if ( !map )
> > + panic("IOMMU init: unable to allocate rangeset\n");
> > +
> > max_pfn = (GB(4) >> PAGE_SHIFT) - 1;
> > top = max(max_pdx, pfn_to_pdx(max_pfn) + 1);
> >
> > @@ -451,6 +522,24 @@ void __hwdom_init arch_iommu_hwdom_init(struct domain
> > *d)
> > goto commit;
> > }
> >
> > + if ( iommu_verbose )
> > + printk(XENLOG_INFO "d%u: identity mappings for IOMMU:\n",
> > + d->domain_id);
>
> %pd: ?
Indeed, I probably copied this from a different chunk and didn't
adjust to use %pd.
> > + rc = rangeset_report_ranges(map, 0, ~0UL, identity_map, &map_data);
> > + if ( rc )
> > + panic("IOMMU unable to create mappings: %d\n", rc);
> > + if ( is_pv_domain(d) )
> > + {
> > + map_data.ro = true;
> > + rc = rangeset_report_ranges(mmio_ro_ranges, 0, ~0UL, identity_map,
> > + &map_data);
> > + if ( rc )
> > + panic("IOMMU unable to create read-only mappings: %d\n", rc);
> > + }
> > +
> > + rangeset_destroy(map);
>
> This could move up, couldn't it?
Yes, could be moved just after the rangeset_report_ranges(map...)
call.
> > /* Use if to avoid compiler warning */
> > if ( iommu_iotlb_flush_all(d, flush_flags) )
>
> Don't you need to fold map.flush_flags into flush_flags ahead of this call?
> Or can the variable perhaps go away altogether, being replaced by the struct
> field?
Yes, the variable ends up being replaced in a later patch, hence I
didn't touch it here.
Thanks, Roger.
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |