Xen project Mailing List

Re: [PATCH v4 06/21] IOMMU/x86: perform PV Dom0 mappings in batches

From: Roger Pau Monné <roger.pau@xxxxxxxxxx>

Date: Wed, 4 May 2022 15:55:45 +0200

Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Paul Durrant <paul@xxxxxxx>, Wei Liu <wl@xxxxxxx>

Delivery-date: Wed, 04 May 2022 13:56:02 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Wed, May 04, 2022 at 02:27:14PM +0200, Jan Beulich wrote:
> On 04.05.2022 13:20, Roger Pau Monné wrote:
> > On Wed, May 04, 2022 at 11:46:37AM +0200, Jan Beulich wrote:
> >> On 03.05.2022 16:49, Roger Pau Monné wrote:
> >>> On Mon, Apr 25, 2022 at 10:34:59AM +0200, Jan Beulich wrote:
> >>> It would seem to me that doing it that way would also allow the
> >>> mappings to get established in blocks for domUs.
> >>
> >> ... then this would perhaps be possible.
> >>
> >>>> The installing of zero-ref writable types has in fact shown (observed
> >>>> while putting together the change) that despite the intention by the
> >>>> XSA-288 changes (affecting DomU-s only) for Dom0 a number of
> >>>> sufficiently ordinary pages (at the very least initrd and P2M ones as
> >>>> well as pages that are part of the initial allocation but not part of
> >>>> the initial mapping) still have been starting out as PGT_none, meaning
> >>>> that they would have gained IOMMU mappings only the first time these
> >>>> pages would get mapped writably. Consequently an open question is
> >>>> whether iommu_memory_setup() should set the pages to PGT_writable_page
> >>>> independent of need_iommu_pt_sync().
> >>>
> >>> I think I'm confused, doesn't the setting of PGT_writable_page happen
> >>> as a result of need_iommu_pt_sync() and having those pages added to
> >>> the IOMMU page tables? (so they can be properly tracked and IOMMU
> >>> mappings are removed if the page is also removed)
> >>
> >> In principle yes - in guest_physmap_add_page(). But this function isn't
> >> called for the pages I did enumerate in the remark. XSA-288 really only
> >> cared about getting this right for DomU-s.
> >
> > Would it make sense to change guest_physmap_add_page() to be able to
> > pass the page_order parameter down to iommu_map(), and then use it for
> > dom0 build instead of introducing iommu_memory_setup()?
>
> To be quite frank: This is something that I might have been willing to
> do months ago, when this series was still fresh. If I was to start
> re-doing all of this code now, it would take far more time than it
> would have taken back then. Hence I'd like to avoid a full re-work here
> unless entirely unacceptable in the way currently done (which largely
> fits with how we have been doing Dom0 setup).

Sorry, I would have really liked to be more on time with reviews of
this, but there's always something that comes up.

> Furthermore, guest_physmap_add_page() doesn't itself call iommu_map().
> What you're suggesting would require get_page_and_type() to be able to
> work on higher-order pages. I view adjustments like this as well out
> of scope for this series.

Well, my initial thinking was to do something similar to what you
currently have in iommu_memory_setup: a direct call to iommu_map and
adjust the page types manually, but I think this will only work for
dom0 because pages are fresh at that point. For domUs we must use
get_page_and_type so any previous mapping is also removed.

> > I think guest_physmap_add_page() will need to be adjusted at some
> > point for domUs, and hence it could be unified with dom0 usage
> > also?
>
> As an optimization - perhaps. I view it as more important to have HVM
> guests work reasonably well (which includes the performance aspect of
> setting them up).

OK, I'm fine with focusing on HVM.

> >>>> --- a/xen/drivers/passthrough/x86/iommu.c
> >>>> +++ b/xen/drivers/passthrough/x86/iommu.c
> >>>> @@ -347,8 +347,8 @@ static unsigned int __hwdom_init hwdom_i
> >>>>
> >>>>  void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
> >>>>  {
> >>>> -    unsigned long i, top, max_pfn;
> >>>> -    unsigned int flush_flags = 0;
> >>>> +    unsigned long i, top, max_pfn, start, count;
> >>>> +    unsigned int flush_flags = 0, start_perms = 0;
> >>>>
> >>>>      BUG_ON(!is_hardware_domain(d));
> >>>>
> >>>> @@ -379,9 +379,9 @@ void __hwdom_init arch_iommu_hwdom_init(
> >>>>       * First Mb will get mapped in one go by pvh_populate_p2m(). Avoid
> >>>>       * setting up potentially conflicting mappings here.
> >>>>       */
> >>>> -    i = paging_mode_translate(d) ? PFN_DOWN(MB(1)) : 0;
> >>>> +    start = paging_mode_translate(d) ? PFN_DOWN(MB(1)) : 0;
> >>>>
> >>>> -    for ( ; i < top; i++ )
> >>>> +    for ( i = start, count = 0; i < top; )
> >>>>      {
> >>>>          unsigned long pfn = pdx_to_pfn(i);
> >>>>          unsigned int perms = hwdom_iommu_map(d, pfn, max_pfn);
> >>>> @@ -390,20 +390,41 @@ void __hwdom_init arch_iommu_hwdom_init(
> >>>>          if ( !perms )
> >>>>              rc = 0;
> >>>>          else if ( paging_mode_translate(d) )
> >>>> +        {
> >>>>              rc = p2m_add_identity_entry(d, pfn,
> >>>>                                          perms & IOMMUF_writable ? p2m_access_rw
> >>>>                                                                  : p2m_access_r,
> >>>>                                          0);
> >>>> +            if ( rc )
> >>>> +                printk(XENLOG_WARNING
> >>>> +                       "%pd: identity mapping of %lx failed: %d\n",
> >>>> +                       d, pfn, rc);
> >>>> +        }
> >>>> +        else if ( pfn != start + count || perms != start_perms )
> >>>> +        {
> >>>> +        commit:
> >>>> +            rc = iommu_map(d, _dfn(start), _mfn(start), count, start_perms,
> >>>> +                           &flush_flags);
> >>>> +            if ( rc )
> >>>> +                printk(XENLOG_WARNING
> >>>> +                       "%pd: IOMMU identity mapping of [%lx,%lx) failed: %d\n",
> >>>> +                       d, pfn, pfn + count, rc);
> >>>> +            SWAP(start, pfn);
> >>>> +            start_perms = perms;
> >>>> +            count = 1;
> >>>> +        }
> >>>>          else
> >>>> -            rc = iommu_map(d, _dfn(pfn), _mfn(pfn), 1ul << PAGE_ORDER_4K,
> >>>> -                           perms, &flush_flags);
> >>>> +        {
> >>>> +            ++count;
> >>>> +            rc = 0;
> >>>
> >>> Seeing as we want to process this in blocks now, I wonder whether it
> >>> would make sense to take a different approach, and use a rangeset to
> >>> track which regions need to be mapped. What gets added would be based
> >>> on the host e820 plus the options
> >>> iommu_hwdom_{strict,inclusive,reserved}. We would then punch holes
> >>> based on the logic in hwdom_iommu_map() and finally we could iterate
> >>> over the regions afterwards using rangeset_consume_ranges().
> >>>
> >>> Not that you strictly need to do it here, just think the end result
> >>> would be clearer.
> >>
> >> The end result might indeed be, but it would be more of a change right
> >> here. Hence I'd prefer to leave that out of the series for now.
> >
> > OK. I think it might be nice to add a comment in that regard, mostly
> > because I tend to forget myself.
>
> Sure, I've added another post-commit-message remark.

Sorry for being confused, but are those reflected in the final commit
message, or in the code itself?

Thanks, Roger.
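
For readers following the batching idea in the quoted hunk, here is a minimal standalone sketch of coalescing consecutive page frames with identical permissions into one mapping call per run. It is not Xen code and makes several assumptions: struct page_desc, struct batch_log, map_range(), map_in_batches() and the PERM_* bits are all hypothetical names invented for illustration, and map_range() merely stands in for an iommu_map()-style call. The translated-domain branch, error handling and flush flags from the real patch are left out.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical permission bits; stand-ins for the IOMMUF_* flags. */
#define PERM_R 1u
#define PERM_W 2u
#define MAX_BATCHES 16u

/* Hypothetical input: one page frame and the permissions chosen for it. */
struct page_desc {
    unsigned long pfn;
    unsigned int perms;            /* 0 means "do not map" */
};

/* Hypothetical record of the batched calls, so they can be inspected. */
struct batch_log {
    unsigned long start[MAX_BATCHES];
    unsigned long count[MAX_BATCHES];
    unsigned int perms[MAX_BATCHES];
    unsigned int n;
};

/* Stand-in for a single mapping call covering 'count' pages from 'start'. */
static void map_range(struct batch_log *log, unsigned long start,
                      unsigned long count, unsigned int perms)
{
    if (!count)
        return;                    /* nothing accumulated yet */

    assert(log->n < MAX_BATCHES);
    log->start[log->n] = start;
    log->count[log->n] = count;
    log->perms[log->n] = perms;
    log->n++;

    printf("map [%lx,%lx) perms %u\n", start, start + count, perms);
}

/*
 * Walk the pages in order. Extend the current run while the next frame is
 * contiguous and has the same permissions; otherwise commit the run with
 * one map_range() call and start a new one.
 */
static void map_in_batches(const struct page_desc *pages, unsigned long nr,
                           struct batch_log *log)
{
    unsigned long start = 0, count = 0;
    unsigned int start_perms = 0;

    for (unsigned long i = 0; i < nr; i++) {
        unsigned long pfn = pages[i].pfn;
        unsigned int perms = pages[i].perms;

        if (count && pfn == start + count && perms == start_perms) {
            ++count;               /* same run, just grow it */
            continue;
        }

        map_range(log, start, count, start_perms);  /* commit previous run */
        start = pfn;
        start_perms = perms;
        count = perms ? 1 : 0;     /* unmapped pages never open a run */
    }

    map_range(log, start, count, start_perms);      /* commit the tail */
}
```

Calling map_in_batches() on frames 0x10 and 0x11 (read/write), 0x12 (read-only), 0x13 (no permissions) and 0x20, 0x21 (read-only) would issue three map_range() calls instead of five per-page ones. Note the final commit after the loop: the last run is otherwise never mapped, which is presumably what the "commit:" label in the quoted patch is there to handle.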