
Re: [PATCH 1/3] amd-vi: use the same IOMMU page table levels for PV and HVM



On Mon, Nov 20, 2023 at 10:45:29AM +0100, Jan Beulich wrote:
> On 17.11.2023 12:55, Andrew Cooper wrote:
> > On 17/11/2023 9:47 am, Roger Pau Monne wrote:
> >> Using different page table levels for HVM and PV guests is not helpful,
> >> and is not in line with the IOMMU implementation used by the other
> >> architecture vendor (VT-d).
> >>
> >> Switch to uniformly use DEFAULT_DOMAIN_ADDRESS_WIDTH in order to set the
> >> AMD-Vi page table levels.
> >>
> >> Note using the max RAM address for PV was bogus anyway, as there's no
> >> guarantee there can't be device MMIO or reserved regions past the maximum
> >> RAM region.
> > 
> > Indeed - and the MMIO regions do matter for P2P DMA.
> 
> So what about any such living above the 48-bit boundary (i.e. not covered
> by DEFAULT_DOMAIN_ADDRESS_WIDTH)?

That would only work for PV guests AFAICT, as HVM guests will already
refuse to create such mappings even before getting into the IOMMU
code: p2m_pt_set_entry() will return an error, as the p2m code only
deals with 4-level page tables.
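
To make the arithmetic explicit (an illustration only, not Xen code; the
helper name below is made up): with 4K pages and 512 entries per level,
each page table level resolves 9 bits of address, so 4 levels cover
9 * 4 + 12 = 48 bits, i.e. exactly 1UL << DEFAULT_DOMAIN_ADDRESS_WIDTH.
Anything above that boundary would need a 5th level.

/* Guest physical address bits covered by an N-level page table
 * (4K pages, 512 entries per level).  Illustrative helper only. */
static unsigned int pt_levels_to_addr_bits(unsigned int levels)
{
    return 12 + 9 * levels;  /* 4 levels -> 48 bits */
}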

> 
> >> Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> > 
> > Variable-height IOMMU pagetables are not worth the security
> > vulnerabilities they're made of.  I regret not fighting hard enough to
> > kill them entirely several years ago...
> > 
> > Acked-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, although...
> > 
> >> ---
> >>  xen/drivers/passthrough/amd/pci_amd_iommu.c | 20 ++++++++------------
> >>  1 file changed, 8 insertions(+), 12 deletions(-)
> >>
> >> diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c b/xen/drivers/passthrough/amd/pci_amd_iommu.c
> >> index 6bc73dc21052..f9e749d74da2 100644
> >> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
> >> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
> >> @@ -359,21 +359,17 @@ int __read_mostly amd_iommu_min_paging_mode = 1;
> >>  static int cf_check amd_iommu_domain_init(struct domain *d)
> >>  {
> >>      struct domain_iommu *hd = dom_iommu(d);
> >> +    int pgmode = amd_iommu_get_paging_mode(
> >> +        1UL << (DEFAULT_DOMAIN_ADDRESS_WIDTH - PAGE_SHIFT));
> > 
> > "paging mode" comes from the spec, but it's a very backwards way of
> > spelling height.
> > 
> > Can we at least start to improve the comprehensibility by renaming this
> > variable.
> > 
> >> +
> >> +    if ( pgmode < 0 )
> >> +        return pgmode;
> >>  
> >>      /*
> >> -     * Choose the number of levels for the IOMMU page tables.
> >> -     * - PV needs 3 or 4, depending on whether there is RAM (including hotplug
> >> -     *   RAM) above the 512G boundary.
> >> -     * - HVM could in principle use 3 or 4 depending on how much guest
> >> -     *   physical address space we give it, but this isn't known yet so use 4
> >> -     *   unilaterally.
> >> -     * - Unity maps may require an even higher number.
> >> +     * Choose the number of levels for the IOMMU page tables, taking into
> >> +     * account unity maps.
> >>       */
> >> -    hd->arch.amd.paging_mode = max(amd_iommu_get_paging_mode(
> >> -            is_hvm_domain(d)
> >> -            ? 1UL << (DEFAULT_DOMAIN_ADDRESS_WIDTH - PAGE_SHIFT)
> >> -            : get_upper_mfn_bound() + 1),
> >> -        amd_iommu_min_paging_mode);
> >> +    hd->arch.amd.paging_mode = max(pgmode, amd_iommu_min_paging_mode);
> > 
> > I think these min/max variables can be dropped now we're not doing
> > variable height IOMMU pagetables, which further simplifies this expression.
> 
> Did you take unity maps into account? At least $subject and comment looks
> to not be consistent in this regard: Either unity maps need considering
> specially (and then we don't uniformly use the same depth), or they don't
> need mentioning in the comment (anymore).

Unity maps that require an address width > DEFAULT_DOMAIN_ADDRESS_WIDTH
will currently only work on PV at best, as the HVM p2m code is limited
to 4-level page tables, so even if the IOMMU page tables support a
greater address width, the call to map such regions will trigger an
error in the p2m code well before attempting to create any IOMMU
mappings.

We could do:

hd->arch.amd.paging_mode =
    is_hvm_domain(d) ? pgmode : max(pgmode, amd_iommu_min_paging_mode);

Placing IVMD/RMRR regions at addresses that require the use of 5-level
page tables would be a very short-sighted move by vendors IMO.

And it would put us back in a situation where PV vs HVM can get
different IOMMU page table levels, which is undesirable.  It might be
better to just assume all domains use DEFAULT_DOMAIN_ADDRESS_WIDTH and
hide devices that have IVMD/RMRR regions above that limit.
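
Roughly the kind of check I have in mind (a sketch only, not existing
Xen code; the helper name is made up and the #define simply mirrors the
48-bit value discussed above):

#include <stdbool.h>
#include <stdint.h>

#define DEFAULT_DOMAIN_ADDRESS_WIDTH 48

/* Does an IVMD/RMRR (unity map) region fit below the uniform 48-bit
 * limit?  If not, hide the owning device rather than growing the
 * IOMMU page table height. */
static bool unity_map_fits(uint64_t base, uint64_t length)
{
    uint64_t limit = 1ULL << DEFAULT_DOMAIN_ADDRESS_WIDTH;

    return base < limit && length <= limit - base;
}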

Thanks, Roger.



 

