
Re: [PATCH v3] iommu/amd-vi: do not zero IOMMU MMIO region


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Thu, 7 May 2026 16:29:21 +0200
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Jason Andryuk <jason.andryuk@xxxxxxx>, Teddy Astie <teddy.astie@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • Delivery-date: Thu, 07 May 2026 14:29:47 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Thu, May 07, 2026 at 01:20:25PM +0200, Jan Beulich wrote:
> On 07.05.2026 12:21, Roger Pau Monné wrote:
> > On Thu, May 07, 2026 at 10:51:18AM +0200, Jan Beulich wrote:
> >> On 07.05.2026 10:46, Roger Pau Monné wrote:
> >>> On Thu, May 07, 2026 at 10:03:05AM +0200, Jan Beulich wrote:
> >>>> On 06.05.2026 18:51, Roger Pau Monne wrote:
> >>>>> Attempting to memset the whole IOMMU MMIO region to zero is dangerous to
> >>>>> say the least.  We don't know what registers might be there, nor which
> >>>>> values might be safe for those registers.  On a forthcoming platform doing
> >>>>> the zeroing of the MMIO region does put the IOMMU in a broken state, which
> >>>>> is not recoverable by the IOMMU initialization procedure in Xen.
> >>>>>
> >>>>> Instead just zero the control register, which mimics the current behavior
> >>>>> with regards to how the control register is handled, and ensures the IOMMU
> >>>>> setup is done with the unit disabled.  This approach will need revisiting
> >>>>> in order to support Preboot DMA Protection.
> >>>>>
> >>>>> Fold map_iommu_mmio_region() into its only caller, as the function body is
> >>>>> just an ioremap() call after the removal of the memset().
> >>>>>
> >>>>> Fixes: 0700c962ac2d ("Add AMD IOMMU support into hypervisor")
> >>>>> Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> >>>>
> >>>> While you got Andrew's R-b, I don't view that as enough to commit it. My
> >>>> prior concern towards ...
> >>>>
> >>>>> --- a/xen/drivers/passthrough/amd/iommu_init.c
> >>>>> +++ b/xen/drivers/passthrough/amd/iommu_init.c
> >>>>> @@ -42,18 +42,6 @@ static bool iommu_has_ht_flag(struct amd_iommu *iommu, u8 mask)
> >>>>>      return iommu->ht_flags & mask;
> >>>>>  }
> >>>>>  
> >>>>> -static int __init map_iommu_mmio_region(struct amd_iommu *iommu)
> >>>>> -{
> >>>>> -    iommu->mmio_base = ioremap(iommu->mmio_base_phys,
> >>>>> -                               IOMMU_MMIO_REGION_LENGTH);
> >>>>> -    if ( !iommu->mmio_base )
> >>>>> -        return -ENOMEM;
> >>>>> -
> >>>>> -    memset(iommu->mmio_base, 0, IOMMU_MMIO_REGION_LENGTH);
> >>>>> -
> >>>>> -    return 0;
> >>>>> -}
> >>>>
> >>>> ... this part of the change wasn't addressed, neither verbally nor by an
> >>>> adjustment to the description of what was committed. As previously stated,
> >>>> blindly memset()-ing the entire area may not be the best of all options,
> >>>> but the downsides of not doing this need to somehow be addressed. As
> >>>> indicated, once they run out of bits in the main control register, they
> >>>> likely will add a 2nd one. That'll then also need clearing, yet we have
> >>>> no code to do so anymore.
> >>>
> >>> I could introduce an opt-in command line option that forces the
> >>> zeroing of the MMIO region (to have the option to resort to the
> >>> previous behavior),
> >>
> >> But we don't want to fully go back to this. We'd need a form that zeroes
> >> what may be zeroed, without causing the issue you're trying to address.
> > 
> > But how do we know what needs to be zeroed?  We are then in the same
> > position where the introduction of a new control register would cause
> > the zeroing to no longer be accurate.
> 
> An option may be to zero everything we don't know about (plus perhaps
> everything we know about, but don't otherwise use), on the assumption
> that new (writable) registers added are okay to zero.

I don't know, I wouldn't feel very comfortable in zeroing everything
we don't know about - there's a risk of zeroing hidden registers set
up by the firmware.
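
To make the trade-off concrete, here is a minimal sketch of the "only clear what we know" approach against a simulated register window. It is not the code from the patch: the array standing in for the mapped region, the window size, the control register offset and the sim_* names are all assumptions made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a mapped MMIO window, seen as 64-bit registers. */
#define SIM_MMIO_QWORDS  0x800u  /* illustrative window size, not from the patch */
#define SIM_CONTROL_OFF  0x18u   /* control register byte offset assumed here */

/* Store a 64-bit value at a byte offset inside the simulated window. */
static void sim_writeq(volatile uint64_t *regs, size_t off, uint64_t val)
{
    regs[off / sizeof(uint64_t)] = val;
}

/*
 * Clear only the one register the caller knows how to handle.  Every other
 * register, including any the firmware may have programmed, is left alone.
 */
static void sim_disable_unit(volatile uint64_t *regs)
{
    sim_writeq(regs, SIM_CONTROL_OFF, 0);
}
```

With this shape a value placed by firmware in a register the driver knows nothing about survives the reset, which is exactly the property a blanket memset() over the region cannot give. The flip side is the one raised above: a second control register added later would not be cleared.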

> >>> but I was (wrongly) under the impression that we
> >>> had agreement that the proposed approach was the least bad of the ones
> >>> available, sorry.
> >>>
> >>> Note how VT-d doesn't zero the IOMMU register MMIO page either, nor
> >>> does it seem to zero the Global Command Register, which I'm not
> >>> saying is correct, but it is at least a (possibly wrong) precedent.
> >>> I don't think there's much we can do about enabled bits in registers
> >>> that are possibly not known to or handled by Xen.  Like on VT-d, we
> >>> probably need to rely on the firmware to leave the IOMMU in a
> >>> half-sane configuration, with no enabled features on registers Xen
> >>> doesn't know about.
> >>
> >> As indicated before, for firmware we can likely rely on that. Pre-boot
> >> non-firmware environments and especially Xen being kexec-ed (or being
> >> run past something which was kexec-ed) may be of more concern.
> > 
> > Do we really support booting from such environments?  We would need
> > much more careful handling of enabled features IMO, as blindly zeroing
> > the whole MMIO register area is likely to not make the IOMMU happy if
> > it was in an enabled state.
> > 
> > Note for example how Xen was zeroing the command and log buffer
> > pointers ahead of disabling the features in the control register, just
> > because those registers are ahead of the control register in the MMIO
> > space.
> 
> Hmm, yes, such ordering issues could also appear with new registers.
> Then again, with the IOMMU as a whole disabled (which we would still
> want to do up front), perhaps the order of other stores can be assumed
> to not matter?

I would assume so, yes, but for the issue here the order of the writes
did matter, even when the IOMMU was fully disabled.
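
As a rough illustration of the ordering point, the sketch below records the order in which stores hit a simulated register file. It assumes a clear that walks the region in ascending address order (memset() makes no such promise, but it is the typical pattern), and the offsets and sim_* names are hypothetical placeholders rather than anything taken from Xen.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte offsets assumed for the sketch; buffers sit below the control register. */
#define SIM_CMD_BASE_OFF  0x08u
#define SIM_EVT_BASE_OFF  0x10u
#define SIM_CTRL_OFF      0x18u
#define SIM_MAX_LOG       8u

struct sim_iommu {
    uint64_t regs[4];          /* registers at offsets 0x00 .. 0x18 */
    size_t log[SIM_MAX_LOG];   /* offsets, in the order they were written */
    size_t nlog;
};

/* Store a value and remember which offset was written. */
static void sim_write(struct sim_iommu *u, size_t off, uint64_t val)
{
    u->regs[off / sizeof(uint64_t)] = val;
    if (u->nlog < SIM_MAX_LOG)
        u->log[u->nlog++] = off;
}

/* Ascending clear: buffer pointers are zeroed before the control register. */
static void sim_clear_ascending(struct sim_iommu *u)
{
    for (size_t off = 0; off <= SIM_CTRL_OFF; off += sizeof(uint64_t))
        sim_write(u, off, 0);
}

/* Explicit ordering: disable via the control register first, then buffers. */
static void sim_clear_control_first(struct sim_iommu *u)
{
    sim_write(u, SIM_CTRL_OFF, 0);
    sim_write(u, SIM_CMD_BASE_OFF, 0);
    sim_write(u, SIM_EVT_BASE_OFF, 0);
}
```

This only shows that the two strategies produce different store sequences. Whether writing the control register first is sufficient on real hardware is a separate question, since on the platform discussed here the write order mattered even with the unit disabled.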

I've inquired to see if there's a recommended way to clear any
previous state from an IOMMU.

Thanks, Roger.



 

