Re: [Xen-devel] [PATCH v3] IOMMU: make DMA containment of quarantined devices optional

On 13.03.2020 04:05, Tian, Kevin wrote:
>> From: Jan Beulich <jbeulich@xxxxxxxx>
>> Sent: Tuesday, March 10, 2020 6:31 PM
>> On 10.03.2020 06:30, Tian, Kevin wrote:
>>>> From: Jan Beulich <jbeulich@xxxxxxxx>
>>>> Sent: Monday, March 9, 2020 7:09 PM
>>>> Containing still in flight DMA was introduced to work around certain
>>>> devices / systems hanging hard upon hitting a "not-present" IOMMU fault.
>>>> Passing through (such) devices (on such systems) is inherently insecure
>>>> (as guests could easily arrange for IOMMU faults of any kind to occur).
>>>> Defaulting to a mode where admins may not even become aware of issues
>>>> with devices can be considered undesirable. Therefore convert this mode
>>>> of operation to an optional one, not one enabled by default.
>>> Here is another thought. The whole point of quarantine is to contain
>>> the device after it is deassigned from an untrusted guest.
>> I'd question the "untrusted" here. Assigning devices to untrusted
>> guests is problematic anyway, unless you're the device manufacturer
>> and device firmware writer, and hence you can guarantee the device
>> to not offer any backdoors or alike. Therefore I view quarantining
> Aren't all guests untrusted from the hypervisor's p.o.v., which is the
> reason why the IOMMU was introduced in the first place?

"Untrusted" is always meant from the perspective of the host admin.

> I may be overlooking the history of the quarantine feature. Based on my
> study of the quarantine-related changes, it looks like Paul initially
> pointed out the problem that a device may have in-flight DMAs touching
> dom0 memory after it is deassigned. He then introduced the quarantine
> concept by putting a quarantined device into dom_io w/o any valid
> mapping, with a whitelist approach. You later extended this to be the
> default behavior for all PCI devices. Now Paul has found some device
> which cannot tolerate read faults, and came up with this scratch-page
> idea.
> Now I wonder why we are doing such explicit quarantine in the first
> place. Shouldn't we always seek to reset the device when it is
> deassigned from a guest? 'reset' should cancel/quiesce all in-flight
> DMAs for most devices if they implement 'reset' correctly.

And the important word here is "should". Paul and colleagues found
it may not do so in reality.

> If done that way, we don't
> need a quarantine option at all, and then the bogus device in Paul's
> latest finding could be handled by a standalone option w/o struggling
> over whether 'full' is the right name vs. 'basic'. Or we may introduce a
> reset flag when assigning such a device to indicate this special
> requirement, so a scratch page/dom_io can be linked specifically for
> such a device post reset, assuming it is not a platform-level bug going
> by Paul's response?

Which would require host admins to know such properties of their
devices, and preferably _without_ first having run into problems.

