
Re: [Xen-devel] [PATCH v3] IOMMU: make DMA containment of quarantined devices optional


  • To: "paul@xxxxxxx" <paul@xxxxxxx>, 'Jan Beulich' <jbeulich@xxxxxxxx>
  • From: "Tian, Kevin" <kevin.tian@xxxxxxxxx>
  • Date: Tue, 17 Mar 2020 06:10:05 +0000
  • Accept-language: en-US
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, 'Andrew Cooper' <andrew.cooper3@xxxxxxxxxx>
  • Delivery-date: Tue, 17 Mar 2020 06:10:33 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-topic: [Xen-devel] [PATCH v3] IOMMU: make DMA containment of quarantined devices optional

> From: Paul Durrant <xadimgnik@xxxxxxxxx>
> Sent: Friday, March 13, 2020 5:26 PM
> 
> > -----Original Message-----
> > From: Tian, Kevin <kevin.tian@xxxxxxxxx>
> > Sent: 13 March 2020 03:23
> > To: paul@xxxxxxx; 'Jan Beulich' <jbeulich@xxxxxxxx>
> > Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx; 'Andrew Cooper' <andrew.cooper3@xxxxxxxxxx>
> > Subject: RE: [Xen-devel] [PATCH v3] IOMMU: make DMA containment of quarantined devices optional
> >
> > > From: Paul Durrant <xadimgnik@xxxxxxxxx>
> > > Sent: Wednesday, March 11, 2020 12:05 AM
> > >
> > [...]
> > >
> > > >
> > > > > However, is it really saying that things will break if any of the
> > > > > PTEs has its present bit clear?
> > > >
> > > > Well, you said that read faults are fatal (to the host). Reads will,
> > > > for any address with an unpopulated PTE, result in a fault and hence
> > > > by implication be fatal.
> > >
> > > Oh I see. I thought there was an implication that the IOMMU could not
> > > cope with non-present PTEs in some way. Agreed that, when the device is
> > > assigned to the guest, then it can arrange (via ballooning) for a
> > > non-present entry to be hit by a read transaction, resulting in a
> > > lock-up. But dealing with a malicious guest was not the issue at hand...
> > > dealing with a buggy device that still tried to DMA after reset and
> > > whilst in quarantine was the problem.
> > >
> >
> > Thinking about this more, I wonder whether the scratch page is sufficient,
> > or whether we should support such a device in the first place. Looking at
> > 0c35d446:
> > --
> >     The reason for doing this is that some hardware may continue to re-try
> >     DMA (despite FLR) in the event of an error, or even BME being cleared,
> >     and will fail to deal with DMA read faults gracefully. Having a scratch
> >     page mapped will allow pending DMA reads to complete and thus such buggy
> >     hardware will eventually be quiesced.
> > --
> >
> > 'eventually'... what exactly does it mean?
> 
> It means after a period of time that we can only determine empirically.
> 
> > How would a user know a
> > device has been quiesced before attempting to re-assign the device
> > to another domU or dom0? By guessing?
> 
> Yes, a guess, but an educated one.
> 
> > Note the exact behavior of such a
> > device, after different guest behaviors (hang, kill, bug, etc.), is not
> > documented. Who knows whether an in-flight DMA may be triggered when
> > the new owner starts to initialize the device again? How much stale
> > state remains on such a device which, even if it does not trigger
> > in-flight DMAs, may change the behavior desired by the new owner? E.g.
> > a control register may have been configured by the old owner but not
> > touched by the new owner. If it cannot be reset, what's the point of
> > supporting assignment of such a bogus device?
> >
> 
> Because I'm afraid it is quite ubiquitous and we need to deal with it.

It sounds like the whole of passthrough is in danger if your statement is true...

> 
> > Therefore I feel any support for such a bogus device should be maintained
> > out of tree, instead of in upstream Xen. Thoughts?
> >
> 
> I don't see the harm in the code being upstream. There may well be other
> devices with similar issues and it provides an option for an admin to try.
> 