[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v3] IOMMU: make DMA containment of quarantined devices optional
> From: Paul Durrant <xadimgnik@xxxxxxxxx> > Sent: Friday, March 13, 2020 5:26 PM > > > -----Original Message----- > > From: Jan Beulich <jbeulich@xxxxxxxx> > > Sent: 13 March 2020 08:10 > > To: Tian, Kevin <kevin.tian@xxxxxxxxx> > > Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx; Andrew Cooper > <andrew.cooper3@xxxxxxxxxx>; Paul Durrant > > <paul@xxxxxxx> > > Subject: Re: [PATCH v3] IOMMU: make DMA containment of quarantined > devices optional > > > > On 13.03.2020 04:05, Tian, Kevin wrote: > > >> From: Jan Beulich <jbeulich@xxxxxxxx> > > >> Sent: Tuesday, March 10, 2020 6:31 PM > > >> > > >> On 10.03.2020 06:30, Tian, Kevin wrote: > > >>>> From: Jan Beulich <jbeulich@xxxxxxxx> > > >>>> Sent: Monday, March 9, 2020 7:09 PM > > >>>> > > >>>> Containing still in flight DMA was introduced to work around certain > > >>>> devices / systems hanging hard upon hitting a "not-present" IOMMU > fault. > > >>>> Passing through (such) devices (on such systems) is inherently > insecure > > >>>> (as guests could easily arrange for IOMMU faults of any kind to occur). > > >>>> Defaulting to a mode where admins may not even become aware of > > >> issues > > >>>> with devices can be considered undesirable. Therefore convert this > mode > > >>>> of operation to an optional one, not one enabled by default. > > >>> > > >>> Here is another thought. The whole point of quarantine is to contain > > >>> the device after it is deassigned from untrusted guest. > > >> > > >> I'd question the "untrusted" here. Assigning devices to untrusted > > >> guests is problematic anyway, unless you're the device manufacturer > > >> and device firmware writer, and hence you can guarantee the device > > >> to not offer any backdoors or alike. Therefore I view quarantining > > > > > > Aren't all guests untrusted from hypervisor p.o.v, which is the reason > > > why IOMMU was introduced in the first place? > > > > "Untrusted" is always meant from the perspective of the host admin. > > > > > I may overlook the history of quarantine feature. Based on my study > > > of quarantine related changes, looks initially Paul pointed out such > > > problem that a device may have in-fly DMAs to touch dom0 memory > > > after it is deassigned. Then he introduced the quarantine concept by > > > putting a quarantined device into dom_io w/o any valid mapping, > > > with a whitelist approach. You later extended as a default behavior > > > for all PCI devices. Now Paul found some device which cannot tolerate > > > read-fault and then came up this scratch-page idea. > > > > > > Now I wonder why we are doing such explicit quarantine in the first > > > place. Shouldn't we always seek resetting the device when it is deassigned > > > from a guest? 'reset' should cancel/quiescent all in-fly DMAs for most > > > devices if they implement 'reset' correctly. > > > > And the important word here is "should". Paul and colleagues found > > it may not do so in reality. > > Yeah... we have to live with what we've got. Yes, it's buggy as hell but we > have to do our best to stop it wedging hosts whilst trying to prevent > scribbling over critical parts of memory. 'should' is applied to most devices who can gracefully handle the reset request, and then for exception we come with ad-hoc band-aid. > > > > > > If doing that way, we don't > > > need a quarantine option at all, and then the bogus device in Paul's > > > latest finding could be handled by a standalone option w/o struggling > > > whether 'full' is a right name vs. 'basic'. or we may introduce a reset > > > flag when assigning such device to indicate such special requirement, > > > so a scratch page/dom_io can be linked specifically for such device > > > post reset, assuming it is not a platform-level bug from Paul's response? > > > > Which would imply host admins to know such properties of their > > devices, and better _without_ first having run into problems. > > > > It is a device-level bug. We could, I guess, have a per-device quirk to say > whether it should get a context entry pointing at a scratch page or not. > If all except me think supporting such device in upstream is necessary, per-device quirk at least sounds better to me than enforcing a global quarantine policy for all devices. Thanks Kevin _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |