
Re: [Xen-devel] [RFC] Dom0 PV IOMMU control design (draft A)



>>> On 14.04.14 at 17:48, <malcolm.crossley@xxxxxxxxxx> wrote:
> On 14/04/14 14:47, Jan Beulich wrote:
>>>>> On 11.04.14 at 19:28, <malcolm.crossley@xxxxxxxxxx> wrote:
>>> Purpose
>>> =======
>>>
>>> Allow Xen Domain 0 PV guests to create/modify/destroy IOMMU mappings for
>>> hardware devices that Domain 0 has access to. This enables Domain 0 to
>>> program a bus address space mapping which matches its GPFN mapping. Once
>>> a 1:1 mapping of GPFN to bus address space is created, a bounce buffer
>>> region is not required for the IO devices connected to the IOMMU.
>> So the idea then is for the domain to
>> - create this 1:1 mapping once at boot, and update it as pages get
>>   ballooned in/out, or
>> - zap the entire IOMMU mapping at boot, and establish individual
>>   mappings as needed based on the driver's use of the DMA map ops?
>>
> 
> Yes, both of those are the general idea; I will document this better in
> draft B as per David's suggestion.
> The first strategy has better performance (fewer IOMMU updates) but lower
> security against bad devices; the second strategy has better security but
> higher IOMMU mapping overhead.
> 
> The current idea for maximum performance is to create a 1:1 mapping for
> dom0 GPFNs, plus an offset 1:1 mapping (offset by the maximum dom0 GPFN)
> of the whole MFN address space, which will be used for grant mapped
> pages. This should result in IOMMU updates only for ballooned pages.

That's not fully scalable: you can't cope with systems making use of
the full physical address space (e.g. by sparsely allocating where
each node's memory goes). HP has had such systems for a couple of
years.
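
For illustration, here is a minimal sketch of how dom0 might build such a
1:1 bfn:gpfn mapping in batches with the proposed hypercall. The
pv_iommu_op() wrapper, the flag names and the subop number are placeholders
for whatever the final ABI defines, and pfn_to_mfn() stands in for dom0's
existing p2m lookup:

    /*
     * Sketch only: dom0 establishing a 1:1 bfn:gpfn mapping in batches.
     * pv_iommu_op(), the flag names and the subop number are placeholders;
     * pfn_to_mfn() stands in for dom0's existing p2m lookup.
     */
    #include <stdint.h>

    struct iommu_map_op {
        uint64_t bfn;
        uint64_t mfn;
        uint32_t flags;
        int32_t  status;
    };

    #define IOMMU_MAP_OP_readable  (1u << 0)   /* placeholder flag names */
    #define IOMMU_MAP_OP_writeable (1u << 1)
    #define IOMMUOP_map_page       1           /* placeholder subop number */
    #define BATCH_SIZE             256

    /* Hypothetical 3-argument hypercall wrapper: subop, op array, count. */
    extern int pv_iommu_op(unsigned int subop, struct iommu_map_op *ops,
                           unsigned int count);
    extern uint64_t pfn_to_mfn(uint64_t pfn);

    static int iommu_map_identity(uint64_t start_pfn, uint64_t nr_pfns)
    {
        struct iommu_map_op ops[BATCH_SIZE];
        uint64_t pfn = start_pfn;

        while ( nr_pfns )
        {
            unsigned int i, n = nr_pfns < BATCH_SIZE ? nr_pfns : BATCH_SIZE;

            for ( i = 0; i < n; i++, pfn++ )
            {
                ops[i].bfn    = pfn;              /* bus address == GPFN */
                ops[i].mfn    = pfn_to_mfn(pfn);
                ops[i].flags  = IOMMU_MAP_OP_readable |
                                IOMMU_MAP_OP_writeable;
                ops[i].status = 0;
            }

            if ( pv_iommu_op(IOMMUOP_map_page, ops, n) )
                return -1;                        /* hypercall itself failed */

            for ( i = 0; i < n; i++ )
                if ( ops[i].status )
                    return ops[i].status;         /* per-element failure */

            nr_pfns -= n;
        }

        return 0;
    }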

>>> IOMMUOP_unmap_page
>>> ------------------
>>> First argument, pointer to array of `struct iommu_map_op`
>>> Second argument, integer count of `struct iommu_map_op` elements in array
>>>
>>> This subop will attempt to unmap each element in the `struct 
>>> iommu_map_op` array
>>> and record the mapping status back into the array itself. If an 
>> unmapping?
> 
> Agreed
>>
>>> unmapping fault
>> mapping?
> 
> This is an unmap hypercall, so I believe it would be an unmapping fault
> that gets reported.

No - an individual unmap operation failing should result in the
status being set to -EFAULT (or whatever). The only time you'd
fail the entire hypercall with -EFAULT would be when you can't
read an interface structure array element. Hence if anything this
would be a mapping fault (although I didn't like the term in the
IOMMUOP_map_page description either).
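
In other words, a rough hypervisor-side sketch of the intended semantics;
the guest handle copy helpers are modelled on Xen's
copy_from_guest_offset()/copy_to_guest_offset(), while iommu_map_op_t and
do_iommu_unmap() are placeholders for the real interface type and unmap
logic:

    static long iommuop_unmap_pages(struct domain *d,
                                    XEN_GUEST_HANDLE_PARAM(iommu_map_op_t) ops,
                                    unsigned int count)
    {
        unsigned int i;

        for ( i = 0; i < count; i++ )
        {
            struct iommu_map_op op;

            /* Only a fault on the interface array fails the whole hypercall. */
            if ( copy_from_guest_offset(&op, ops, i, 1) )
                return -EFAULT;

            /* An individual unmap failure is reported through op.status. */
            op.status = do_iommu_unmap(d, op.bfn);

            if ( copy_to_guest_offset(ops, i, &op, 1) )
                return -EFAULT;
        }

        /* IOMMU TLB flush on completion / continuation creation omitted. */
        return 0;
    }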

>>> The IOMMU TLB will only be flushed when the hypercall completes or a 
>>> hypercall
>>> continuation is created.
>>>
>>>      struct iommu_map_op {
>>>          uint64_t bfn;
>>>          uint64_t mfn;
>>>          uint32_t flags;
>>>          int32_t status;
>>>      };
>>>
>>> --------------------------------------------------------------------
>>> Field          Purpose
>>> -----          -----------------------------------------------------
>>> `bfn`          [in] Bus address frame number to be unmapped
>>>
>>> `mfn`          [in] This field is ignored for unmap subop
>>>
>>> `flags`        [in] This field is ignored for unmap subop
>>>
>>> `status`       [out] Mapping status of this unmap operation, 0 indicates
>>> success
>> Hmm, mfn and flags ignored would call for an abbreviated interface
>> structure. Or how about clearing both IOMMU_MAP_OP_{readable,writeable}
>> implying unmap, in which case for now you'd need only one op. And
>> with that I wonder whether this shouldn't simply be a new
>> PHYSDEVOP_iommu_map.
> 
> I initially tried to add it to the PHYSDEV hypercall as a subop, but when
> adding batch capability it seemed cleaner to use a 3-argument hypercall
> to allow the count to be passed in as an argument rather than as a
> field in a struct.

Yes, I see - batching really becomes easier that way.
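
As an illustration of the batched form, a sketch of a dom0-side unmap call,
reusing the declarations from the map sketch earlier in this mail;
pv_iommu_op() and the subop number are again placeholders:

    #define IOMMUOP_unmap_page 2               /* placeholder subop number */

    static int iommu_unmap_batch(uint64_t start_bfn, unsigned int n)
    {
        /* n is assumed to be <= BATCH_SIZE here. */
        struct iommu_map_op ops[BATCH_SIZE];
        unsigned int i;
        int rc;

        for ( i = 0; i < n; i++ )
        {
            ops[i].bfn    = start_bfn + i;     /* only input used for unmap */
            ops[i].mfn    = 0;                 /* ignored for unmap */
            ops[i].flags  = 0;                 /* ignored for unmap */
            ops[i].status = 0;
        }

        rc = pv_iommu_op(IOMMUOP_unmap_page, ops, n);
        if ( rc )
            return rc;                         /* interface array fault etc. */

        for ( i = 0; i < n; i++ )
            if ( ops[i].status )
                return ops[i].status;          /* individual unmap failed */

        return 0;
    }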

>>> Security Implications of allowing Domain 0 IOMMU control
>>> ========================================================
>>>
>>> Xen currently allows IO devices attached to Domain 0 to have direct
>>> access to all of the MFN address space (except Xen hypervisor memory
>>> regions), provided the Xen IOMMU option dom0-strict is not enabled.
>>>
>>> The PV IOMMU feature provides the same level of access to the MFN address
>>> space, and the feature is not enabled when the Xen IOMMU option dom0-strict
>>> is enabled. Therefore security is not affected by the PV IOMMU feature.
>> While I was about to reply with the same request to extend this to
>> driver domains, this last section pretty much excludes such an
>> extension. I guess that would benefit from revisiting.
> 
> I agree supporting driver domains would not be straightforward with
> regard to security, so I would suggest that driver domain support could
> be added later as a separate design.

That's odd - you design something anew and want to immediately
postpone one of the obvious potential uses of it? As just said in
another reply, I think you will want to support dom0-strict mode
anyway, and with that I would expect PV driver domain support to
fall out almost naturally.

Jan
