|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH for-4.18 v2] iommu/vt-d: fix SAGAW capability parsing
On Thu, Oct 19, 2023 at 10:49:29AM +0200, Jan Beulich wrote:
> On 19.10.2023 10:15, Roger Pau Monné wrote:
> > On Thu, Oct 19, 2023 at 09:41:41AM +0200, Jan Beulich wrote:
> >> On 18.10.2023 18:07, Roger Pau Monne wrote:
> >>> SAGAW is a bitmap field, with bits 1, 2 and 3 signaling support for 3, 4
> >>> and 5
> >>> level page tables respectively. According to the Intel VT-d
> >>> specification, an
> >>> IOMMU can report multiple SAGAW bits being set.
> >>>
> >>> Commit 859d11b27912 claims to replace the open-coded
> >>> find_first_set_bit(), but
> >>> it's actually replacing an open coded implementation to find the last set
> >>> bit.
> >>> The change forces the used AGAW to the lowest supported by the IOMMU
> >>> instead of
> >>> the highest one between 1 and 2.
> >>>
> >>> Restore the previous SAGAW parsing by using fls() instead of
> >>> find_first_set_bit(), in order to get the highest (supported) AGAW to be
> >>> used.
> >>>
> >>> However there's a caveat related to the value the AW context entry field
> >>> must
> >>> be set to when using passthrough mode:
> >>>
> >>> "When the Translation-type (TT) field indicates pass-through processing
> >>> (10b),
> >>> this field must be programmed to indicate the largest AGAW value
> >>> supported by
> >>> hardware." [0]
> >>>
> >>> Newer Intel IOMMU implementations support 5 level page tables for the
> >>> IOMMU,
> >>> and signal such support in SAGAW bit 3.
> >>>
> >>> Enabling 5 level paging support (AGAW 3) at this point in the release is
> >>> too
> >>> risky, so instead put a bodge to unconditionally disable passthough mode
> >>> if
> >>> SAGAW has any bits greater than 2 set. Ignore bit 0, it's reserved in the
> >>> specification but unlikely to have any meaning in the future.
> >>
> >> May be worth mentioning that in earlier versions this indicated 2-level
> >> paging support.
> >
> > Oh, that's not even present in my copy of the spec from 2016. I guess
> > it was removed very, very long time ago?
>
> Indeed, as mentioned in the commit you're fixing. Version 1.3 of the
> spec still has it. Judging by the document numbers 2.2 may have been
> its direct successor (i.e. no further 1.x and no 2.0 or 2.1).
>
> >>> --- a/xen/drivers/passthrough/vtd/iommu.c
> >>> +++ b/xen/drivers/passthrough/vtd/iommu.c
> >>> @@ -1327,15 +1327,24 @@ int __init iommu_alloc(struct acpi_drhd_unit
> >>> *drhd)
> >>>
> >>> /* Calculate number of pagetable levels: 3 or 4. */
> >>> sagaw = cap_sagaw(iommu->cap);
> >>> - if ( sagaw & 6 )
> >>> - agaw = find_first_set_bit(sagaw & 6);
> >>> - if ( !agaw )
> >>> + agaw = fls(sagaw & 6) - 1;
> >>> + if ( agaw == -1 )
> >>
> >> Would you mind making this "< 0" or even "<= 0"? The latter in particular
> >> would already cover the likely future change of dropping the masking by 6.
> >
> > My plan wasn't to drop the masking, but use 0xe if we support AGAW 3.
>
> But we will also need to deal with bit 4 (at which point applying a mask
> is going to be useless code). We can either guess that it's going to mean
> 6-level paging, or we need to treat it as having unknown meaning when set
> (implying that we'd then still need to either fail initialization or
> disable pass-through mode).
I wouldn't enable support for AGAW 4 unless we have a way to test it.
It's safer to just disable passthrough mode if SAGAW bit 4 is set.
> > I'm fine with using < or <= if you think it's more robust.
>
> Good, will do so then.
>
> >>> {
> >>> printk(XENLOG_ERR VTDPREFIX "IOMMU: unsupported sagaw %x\n",
> >>> sagaw);
> >>> print_iommu_regs(drhd);
> >>> rc = -ENODEV;
> >>> goto free;
> >>> }
> >>> + if ( sagaw >> 3 )
> >>> + {
> >>> + printk_once(XENLOG_WARNING VTDPREFIX
> >>> + "IOMMU: unhandled bits set in sagaw (%#x)%s\n",
> >>
> >> I think IOMMU: is redundant with VTDPREFIX (or alternatively iommu->index
> >> would also want logging). Also note that VTDPREFIX (bogusly) has no
> >> trailing space. (I realize both apply to the other log message in context
> >> as well, but still. I'd be inclined to adjust that at the same time,
> >> including switching to %#x as you have it in the new log message.)
> >
> > Oh, I didn't realize VTDPREFIX had no trailing space.
> >
> > Since it's a printk_once(), not sure iommu->index is really useful
> > here, as we would report just one IOMMU has having an unhandled SAGAW.
> > IMO if we switch to printing iommu->index we must also use a plain
> > printk. But I don't see a lot of benefit in printing this for likely
> > each IOMMU on the system, and hence I would rather use printk_once()
> > and not print the index.
>
> Well, logging the index in printk_once() has the benefit of identifying
> the first IOMMU with the issue, which may help further analysis if not
> all of them have bits beyond 2 set. But I'm not going to insist on this
> aspect.
>
> > Feel free to drop the IOMMU prefix, but I'm not sure what to do with
> > VTDPREFIX and the missing trialing space, as some users of VTDPREFIX
> > already account for such missing space.
>
> I'd simply insert a leading space in the string literal.
Ack.
> >>> + sagaw,
> >>> + iommu_hwdom_passthrough ? " disabling passthrough" :
> >>> "" );
> >>
> >> May want a leading comma (or some other separator) in the string.
> >>
> >>> + if ( iommu_hwdom_passthrough )
> >>> + iommu_hwdom_passthrough = false;
> >>
> >> No real need for if() here.
> >
> > Not really, but also no need for a write to iommu_hwdom_passthrough
> > every time an IOMMU is initialized if the condition is removed.
>
> This is init-time code, and hence the excess writes aren't going to be
> noticable.
I don't have a strong opinion, TBH I used to have it without the
conditional and added it later. If you prefer to drop the conditional
I won't oppose.
Thanks, Roger.
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |