Re: [Xen-devel] [RFC] Dom0 PV IOMMU control design (draft A)
>>> On 14.04.14 at 17:48, <malcolm.crossley@xxxxxxxxxx> wrote:
> On 14/04/14 14:47, Jan Beulich wrote:
>>>>> On 11.04.14 at 19:28, <malcolm.crossley@xxxxxxxxxx> wrote:
>>> Purpose
>>> =======
>>>
>>> Allow Xen Domain 0 PV guests to create/modify/destroy IOMMU mappings for
>>> hardware devices that Domain 0 has access to. This enables Domain 0 to
>>> program a bus address space mapping which matches its GPFN mapping. Once
>>> a 1:1 mapping of GPFN to bus address space is created, a bounce buffer
>>> region is not required for the IO devices connected to the IOMMU.
>> So the idea then is for the domain to
>> - create this 1:1 mapping once at boot, and update it as pages get
>>   ballooned in/out, or
>> - zap the entire IOMMU mapping at boot, and establish individual
>>   mappings as needed based on the driver's use of the DMA map ops?
>>
>
> Yes, both of those are the general idea; I will document this better in
> draft B as per David's suggestion.
> The first strategy has better performance (fewer IOMMU updates) but lower
> security against misbehaving devices; the second strategy has better
> security but higher IOMMU mapping overhead.
>
> The current idea for maximum performance is to create a 1:1 mapping for
> dom0 GPFNs and an offset (offset by the maximum dom0 GPFN) 1:1 mapping of
> the whole MFN address space, which will be used for grant mapped pages.
> This should result in IOMMU updates only for ballooned pages.

That's not fully scalable: you can't cope with systems making use of the
full physical address space (e.g. by sparsely allocating where each node's
memory goes). HP has had such systems for a couple of years.

>>> IOMMUOP_unmap_page
>>> ------------------
>>> First argument, pointer to array of `struct iommu_map_op`
>>> Second argument, integer count of `struct iommu_map_op` elements in array
>>>
>>> This subop will attempt to unmap each element in the `struct iommu_map_op`
>>> array and record the mapping status back into the array itself. If an
>> unmapping?
>
> Agreed
>>
>>> unmapping fault
>> mapping?
>
> This is an unmap hypercall, so I believe it would be an unmapping fault
> that is reported.

No - an individual unmap operation failing should result in the status
being set to -EFAULT (or whatever). The only time you'd fail the entire
hypercall with -EFAULT would be when you can't read an interface structure
array element. Hence, if anything, this would be a mapping fault (albeit I
didn't like the term in the IOMMU_map_page description either).

>>> The IOMMU TLB will only be flushed when the hypercall completes or a
>>> hypercall continuation is created.
>>>
>>>     struct iommu_map_op {
>>>         uint64_t bfn;
>>>         uint64_t mfn;
>>>         uint32_t flags;
>>>         int32_t status;
>>>     };
>>>
>>> --------------------------------------------------------------------
>>> Field     Purpose
>>> -----     -----------------------------------------------------
>>> `bfn`     [in] Bus address frame number to be unmapped
>>>
>>> `mfn`     [in] This field is ignored for the unmap subop
>>>
>>> `flags`   [in] This field is ignored for the unmap subop
>>>
>>> `status`  [out] Mapping status of this unmap operation, 0 indicates
>>>           success
>> Hmm, mfn and flags being ignored would call for an abbreviated interface
>> structure. Or how about the absence of IOMMU_MAP_OP_{readable,writeable}
>> implying unmap, in which case for now you'd need only one op. And
>> with that I wonder whether this shouldn't simply be a new
>> PHYSDEVOP_iommu_map.
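[Editorial sketch] As a rough illustration of how a dom0 kernel might drive the unmap subop described above, here is a minimal sketch. The struct iommu_map_op layout and the IOMMUOP_unmap_page name are taken from the draft; the subop number, the HYPERVISOR_iommu_op wrapper and the unmap_bfn_range helper are hypothetical placeholders, and the error handling follows the per-element status convention discussed above.

    #include <stdint.h>

    /* Interface structure as given in the draft. */
    struct iommu_map_op {
        uint64_t bfn;     /* [in]  bus address frame number                */
        uint64_t mfn;     /* [in]  machine frame number (ignored on unmap) */
        uint32_t flags;   /* [in]  mapping flags (ignored on unmap)        */
        int32_t  status;  /* [out] per-element result, 0 on success        */
    };

    /* Placeholder subop number and hypercall wrapper; the draft fixes
     * neither, so these are assumptions for illustration only. */
    #define IOMMUOP_unmap_page 1

    extern long HYPERVISOR_iommu_op(unsigned int subop,
                                    struct iommu_map_op *ops,
                                    unsigned int count);

    /* Batch-unmap 'count' consecutive bus frames starting at 'first_bfn'.
     * A negative hypercall return means the whole call failed (e.g. an
     * unreadable array element); otherwise each element's status field
     * reports the result of that individual unmap. */
    static int unmap_bfn_range(uint64_t first_bfn, unsigned int count,
                               struct iommu_map_op *ops)
    {
        long rc;
        unsigned int i;

        for (i = 0; i < count; i++) {
            ops[i].bfn = first_bfn + i;
            ops[i].mfn = 0;       /* ignored by the unmap subop */
            ops[i].flags = 0;     /* ignored by the unmap subop */
            ops[i].status = 0;
        }

        rc = HYPERVISOR_iommu_op(IOMMUOP_unmap_page, ops, count);
        if (rc < 0)
            return (int)rc;

        for (i = 0; i < count; i++)
            if (ops[i].status)
                return ops[i].status;   /* first failed element */

        return 0;
    }

Since the draft flushes the IOMMU TLB only at hypercall completion or continuation, a single large batch amortises both the hypercall and the flush cost.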
>
> I initially tried to add it to the PHYSDEV hypercall as a subop, but when
> adding batch capability it seemed cleaner to use a 3-argument hypercall,
> to allow the count to be passed as an argument rather than as a field in
> a struct.

Yes, I see - batching really becomes easier that way.

>>> Security Implications of allowing Domain 0 IOMMU control
>>> ========================================================
>>>
>>> Xen currently allows IO devices attached to Domain 0 to have direct
>>> access to all of the MFN address space (except Xen hypervisor memory
>>> regions), provided the Xen IOMMU option dom0-strict is not enabled.
>>>
>>> The PV IOMMU feature provides the same level of access to the MFN
>>> address space, and the feature is not enabled when the Xen IOMMU option
>>> dom0-strict is enabled. Therefore security is not affected by the PV
>>> IOMMU feature.
>> While I was about to reply with the same request to extend this to
>> driver domains, this last section pretty much excludes such an
>> extension. I guess that would benefit from revisiting.
>
> I agree supporting driver domains would not be straightforward with
> regard to security, so I would suggest that driver domain support be
> added later as a separate design.

That's odd - you design something anew and want to immediately postpone
one of the obvious potential uses of it? As just said in another reply, I
think you will want to support dom0-strict mode anyway, and with that I
would expect PV driver domain support to fall out almost naturally.

Jan
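[Editorial sketch] For reference on the interface-shape question discussed above, a minimal sketch of the two alternatives. The struct physdev_iommu_op layout and the PHYSDEVOP_iommu_map constant are hypothetical illustrations (a real Xen interface would use a guest handle rather than a bare pointer), not a settled ABI.

    #include <stdint.h>

    struct iommu_map_op;            /* as defined in the draft */

    /* (a) PHYSDEVOP-style subop: the element count has to travel inside
     *     the argument structure handed to the existing physdev hypercall:
     *         rc = HYPERVISOR_physdev_op(PHYSDEVOP_iommu_map, &arg);      */
    struct physdev_iommu_op {
        uint32_t count;             /* number of elements in 'ops'         */
        struct iommu_map_op *ops;   /* guest handle in a real interface    */
    };

    /* (b) Dedicated three-argument hypercall: the count is passed directly,
     *     which keeps batching and hypercall continuations simpler:
     *         rc = HYPERVISOR_iommu_op(IOMMUOP_unmap_page, ops, count);   */

Shape (b) presumably makes batching easier because a hypercall continuation can adjust the pointer and count arguments directly, as other batched Xen hypercalls do, instead of rewriting a field in guest memory.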