|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v5 02/15] IOMMU/x86: perform PV Dom0 mappings in batches
On Fri, May 27, 2022 at 01:12:48PM +0200, Jan Beulich wrote:
> For large page mappings to be easily usable (i.e. in particular without
> un-shattering of smaller page mappings) and for mapping operations to
> then also be more efficient, pass batches of Dom0 memory to iommu_map().
> In dom0_construct_pv() and its helpers (covering strict mode) this
> additionally requires establishing the type of those pages (albeit with
> zero type references).
>
> The earlier establishing of PGT_writable_page | PGT_validated requires
> the existing places where this gets done (through get_page_and_type())
> to be updated: For pages which actually have a mapping, the type
> refcount needs to be 1.
>
> There is actually a related bug that gets fixed here as a side effect:
> Typically the last L1 table would get marked as such only after
> get_page_and_type(..., PGT_writable_page). While this is fine as far as
> refcounting goes, the page did remain mapped in the IOMMU in this case
> (when "iommu=dom0-strict").
>
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
Acked-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> ---
> Subsequently p2m_add_identity_entry() may want to also gain an order
> parameter, for arch_iommu_hwdom_init() to use. While this only affects
> non-RAM regions, systems typically have 2-16Mb of reserved space
> immediately below 4Gb, which hence could be mapped more efficiently.
>
> Eventually we may want to overhaul this logic to use a rangeset based
> approach instead, punching holes into originally uniformly large-page-
> mapped regions. Doing so right here would first and foremost be yet more
> of a change.
>
> The installing of zero-ref writable types has in fact shown (observed
> while putting together the change) that despite the intention by the
> XSA-288 changes (affecting DomU-s only) for Dom0 a number of
> sufficiently ordinary pages (at the very least initrd and P2M ones as
> well as pages that are part of the initial allocation but not part of
> the initial mapping) still have been starting out as PGT_none, meaning
> that they would have gained IOMMU mappings only the first time these
> pages would get mapped writably. Consequently an open question is
> whether iommu_memory_setup() should set the pages to PGT_writable_page
> independent of need_iommu_pt_sync().
Hm, I see, non strict PV dom0s won't get the pages set to
PGT_writable_page even when accessible by devices by virtue of such
domain having all RAM mapped in the IOMMU page-tables.
I guess it does make sense to also have the pages set as
PGT_writable_page by default in that case, as tthe pages _are_
writable by the IOMMU. Do pages added during runtime (ie: ballooned
in) also get PGT_writable_page set?
> --- a/xen/drivers/passthrough/x86/iommu.c
> +++ b/xen/drivers/passthrough/x86/iommu.c
> @@ -363,8 +363,8 @@ static unsigned int __hwdom_init hwdom_i
>
> void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
> {
> - unsigned long i, top, max_pfn;
> - unsigned int flush_flags = 0;
> + unsigned long i, top, max_pfn, start, count;
> + unsigned int flush_flags = 0, start_perms = 0;
>
> BUG_ON(!is_hardware_domain(d));
>
> @@ -395,9 +395,9 @@ void __hwdom_init arch_iommu_hwdom_init(
> * First Mb will get mapped in one go by pvh_populate_p2m(). Avoid
> * setting up potentially conflicting mappings here.
> */
> - i = paging_mode_translate(d) ? PFN_DOWN(MB(1)) : 0;
> + start = paging_mode_translate(d) ? PFN_DOWN(MB(1)) : 0;
>
> - for ( ; i < top; i++ )
> + for ( i = start, count = 0; i < top; )
> {
> unsigned long pfn = pdx_to_pfn(i);
> unsigned int perms = hwdom_iommu_map(d, pfn, max_pfn);
> @@ -406,20 +406,41 @@ void __hwdom_init arch_iommu_hwdom_init(
> if ( !perms )
> rc = 0;
> else if ( paging_mode_translate(d) )
> + {
> rc = p2m_add_identity_entry(d, pfn,
> perms & IOMMUF_writable ?
> p2m_access_rw
> :
> p2m_access_r,
> 0);
> + if ( rc )
> + printk(XENLOG_WARNING
> + "%pd: identity mapping of %lx failed: %d\n",
> + d, pfn, rc);
> + }
> + else if ( pfn != start + count || perms != start_perms )
> + {
> + commit:
> + rc = iommu_map(d, _dfn(start), _mfn(start), count, start_perms,
> + &flush_flags);
> + if ( rc )
> + printk(XENLOG_WARNING
> + "%pd: IOMMU identity mapping of [%lx,%lx) failed:
> %d\n",
> + d, pfn, pfn + count, rc);
> + SWAP(start, pfn);
> + start_perms = perms;
> + count = 1;
> + }
> else
> - rc = iommu_map(d, _dfn(pfn), _mfn(pfn), 1ul << PAGE_ORDER_4K,
> - perms, &flush_flags);
> + {
> + ++count;
> + rc = 0;
> + }
>
> - if ( rc )
> - printk(XENLOG_WARNING "%pd: identity %smapping of %lx failed:
> %d\n",
> - d, !paging_mode_translate(d) ? "IOMMU " : "", pfn, rc);
>
> - if (!(i & 0xfffff))
> + if ( !(++i & 0xfffff) )
> process_pending_softirqs();
> +
> + if ( i == top && count )
Nit: do you really need to check for count != 0? AFAICT this is only
possible in the first iteration.
Thanks, Roger.
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |