Xen project Mailing List

Re: [Xen-devel] One question about the hypercall to translate gfn to mfn.

To: Tim Deegan <tim@xxxxxxx>

From: "Tian, Kevin" <kevin.tian@xxxxxxxxx>

Date: Tue, 6 Jan 2015 08:56:35 +0000

Accept-language: en-US

Cc: "keir@xxxxxxx" <keir@xxxxxxx>, "Xen-devel@xxxxxxxxxxxxx" <Xen-devel@xxxxxxxxxxxxx>, "Paul.Durrant@xxxxxxxxxx" <Paul.Durrant@xxxxxxxxxx>, "Yu, Zhang" <yu.c.zhang@xxxxxxxxxxxxxxx>, David Vrabel <david.vrabel@xxxxxxxxxx>, "JBeulich@xxxxxxxx" <JBeulich@xxxxxxxx>, Malcolm Crossley <malcolm.crossley@xxxxxxxxxx>

Delivery-date: Tue, 06 Jan 2015 08:57:05 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

> From: Tim Deegan [mailto:tim@xxxxxxx]
> Sent: Thursday, December 18, 2014 11:47 PM
>
> Hi,
>
> At 07:24 +0000 on 12 Dec (1418365491), Tian, Kevin wrote:
> > > I'm afraid not. There's nothing worrying per se in a backend knowing
> > > the MFNs of the pages -- the worry is that the backend can pass the
> > > MFNs to hardware. If the check happens only at lookup time, then XenGT
> > > can (either through a bug or a security breach) just pass _any_ MFN to
> > > the GPU for DMA.
> > >
> > > But even without considering the security aspects, this model has bugs
> > > that may be impossible for XenGT itself to even detect. E.g.:
> > >  1. Guest asks its virtual GPU to DMA to a frame of memory;
> > >  2. XenGT looks up the GFN->MFN mapping;
> > >  3. Guest balloons out the page;
> > >  4. Xen allocates the page to a different guest;
> > >  5. XenGT passes the MFN to the GPU, which DMAs to it.
> > >
> > > Whereas if stage 2 is a _mapping_ operation, Xen can refcount the
> > > underlying memory and make sure it doesn't get reallocated until XenGT
> > > is finished with it.
> >
> > yes, I see your point. Now we can't support ballooning in VM given above
> > reason, and refcnt is required to close that gap.
> >
> > but just to confirm one point. from my understanding whether it's a
> > mapping operation doesn't really matter. We can invent an interface
> > to get p2m mapping and then increase refcnt. the key is refcnt here.
> > when XenGT constructs a shadow GPU page table, it creates a reference
> > to guest memory page so the refcnt must be increased. :-)
>
> True. :) But Xen does need to remember all the refcounts that were
> created (so it can tidy up if the domain crashes). If Xen is already
> doing that it might as well do it in the IOMMU tables since that
> solves other problems.

Would a refcnt in the p2m layer be enough, so that we don't need separate
refcnts in both the EPT and the IOMMU page tables?

> > > [First some hopefully-helpful diagrams to explain my thinking.
> > > I'll borrow 'BFN' from Malcolm's discussion of IOMMUs to describe the
> > > addresses that devices issue their DMAs in:
> >
> > what's 'BFN' short for? Bus Frame Number?
>
> Yes, I think so.
>
> > > If we replace that lookup with a _map_ hypercall, either with Xen
> > > choosing the BFN (as happens in the PV grant map operation) or with
> > > the guest choosing an unused address (as happens in the HVM/PVH
> > > grant map operation), then:
> > >  - the only extra code in XenGT itself is that you need to unmap
> > >    when you change the GTT;
> > >  - Xen can track and control exactly which MFNs XenGT/the GPU can
> > >    access;
> > >  - running XenGT in a driver domain or PVH dom0 ought to work; and
> > >  - we fix the race condition I described above.
> >
> > ok, I see your point here. It does sound like a better design to meet
> > Xen hypervisor's security requirement and can also work with PVH
> > Dom0 or driver domain. Previously even when we said a MFN is
> > required, it's actually a BFN due to IOMMU existence, and it works
> > just because we have a 1:1 identity mapping in-place. And by finding
> > a BFN
> >
> > some follow-up think here:
> >
> > - one extra unmap call will have some performance impact, especially
> > for media processing workloads where GPU page table modifications
> > are hot. but suppose this can be optimized with batch request
>
> Yep. In general I'd hope that the extra overhead of unmap is small
> compared with the trap + emulate + ioreq + schedule that's just
> happened. Though I know that IOTLB shootdowns are potentially rather
> expensive right now so it might want some measurement.

Yes, that's the hard part, and it requires experiments to find a good
balance between complexity and performance. The IOMMU page table is not
designed for modifications as frequent as those to CPU/GPU page tables,
but following the approach above ties them together.
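
To make the refcount argument above concrete, here is a minimal toy model
in Python. It is only a sketch: ToyXen, map_foreign, unmap and balloon_out
are hypothetical names invented for illustration and are not Xen
interfaces. It assumes the 1:1 BFN == MFN case mentioned in the thread,
follows the grant-map-like shape (caller passes domid + gfn, gets back a
status and a BFN), and assumes one possible policy in which a ballooned-out
frame is simply held back from reuse while a device mapping still
references it.

```python
class ToyXen:
    """Toy hypervisor state; every name here is hypothetical."""

    def __init__(self):
        self.p2m = {}              # (domid, gfn) -> mfn
        self.owner = {}            # mfn -> owning domid, None when reusable
        self.refs = {}             # mfn -> number of live device mappings
        self.iommu = {}            # bfn -> mfn reachable by the device
        self.pending_free = set()  # ballooned-out frames still referenced

    def populate(self, domid, gfn, mfn):
        self.p2m[(domid, gfn)] = mfn
        self.owner[mfn] = domid
        self.refs[mfn] = 0

    def map_foreign(self, domid, gfn):
        """Return (status, bfn); takes a reference on the backing frame."""
        mfn = self.p2m.get((domid, gfn))
        if mfn is None:
            return (-1, None)
        self.refs[mfn] += 1
        bfn = mfn                  # assumption: identity BFN == MFN
        self.iommu[bfn] = mfn
        return (0, bfn)

    def unmap(self, bfn):
        """Drop one reference; the last unmap releases a deferred frame."""
        mfn = self.iommu.get(bfn)
        if mfn is None:
            return -1
        self.refs[mfn] -= 1
        if self.refs[mfn] == 0:
            del self.iommu[bfn]
            if mfn in self.pending_free:
                self.pending_free.discard(mfn)
                self.owner[mfn] = None
        return 0

    def balloon_out(self, domid, gfn):
        """Return True only if the frame became reusable immediately."""
        mfn = self.p2m.pop((domid, gfn), None)
        if mfn is None:
            return False
        if self.refs[mfn] > 0:
            self.pending_free.add(mfn)   # defer reuse until the last unmap
            return False
        self.owner[mfn] = None
        return True


# Replay of the race from the thread, with step 2 as a map, not a lookup.
xen = ToyXen()
xen.populate(domid=1, gfn=0x10, mfn=0x99)
status, bfn = xen.map_foreign(1, 0x10)   # backend maps the guest frame
freed_early = xen.balloon_out(1, 0x10)   # guest balloons the page out
owner_during_dma = xen.owner[0x99]       # frame must not be reusable yet
xen.unmap(bfn)                           # backend rewrites the GTT entry
owner_after = xen.owner[0x99]
```

In this model the frame only becomes reusable after the unmap, which is
the property the lookup-only interface cannot give. Whether the real
reference should live in the p2m layer or in the IOMMU tables is the open
question in the thread, and the sketch does not answer it.
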
Another option might be to reserve enough BFNs at boot time to cover all
available guest memory, so as to eliminate the run-time modification
overhead.

> > - is there existing _map_ call for this purpose per your knowledge, or
> > a new one is required? If the latter, what's the additional logic to be
> > implemented there?
>
> For PVH, the XENMEM_add_to_physmap (gmfn_foreign) path ought to do
> what you need, I think. For PV, I think we probably need a new map
> operation with sensible semantics. My inclination would be to have it
> follow the grant-map semantics (i.e. caller supplies domid + gfn,
> hypervisor supplies BFN and success/failure code).

Setting up the mapping is not a big problem. It's more about finding
available BFNs in a way that does not conflict with other usages, e.g.
memory hotplug and ballooning (well, for this I'm not sure now whether
it's only for existing gfns, from the other thread...)

> Malcolm might have opinions about this -- it starts looking like the
> sort of PV IOMMU interface he's suggested before.

We'd like to hear Malcolm's suggestions here.

> > - when you say _map_, do you expect this mapped into dom0's virtual
> > address space, or just guest physical space?
>
> For PVH, I mean into guest physical address space (and iommu tables,
> since those are the same). For PV, I mean just the IOMMU tables --
> since the guest controls its own PFN space entirely there's nothing
> Xen can to map things into it.
>
> > - how is BFN or unused address (what do you mean by address here?)
> > allocated? does it need present in guest physical memory at boot time,
> > or just finding some holes?
>
> That's really a question for the xen maintainers in the linux kernel.
> I presume that whatever bookkeeping they currently do for grant-mapped
> memory would suffice here just as well.

Will study that part.

> > - graphics memory size could be large. starting from BDW, there'll
> > be 64bit page table format. Do you see any limitation here on finding
> > BFN or address?
>
> Not really. The IOMMU tables are also 64-bit so there must be enough
> addresses to map all of RAM. There shouldn't be any need for these
> mappings to be _contiguous_, btw. You just need to have one free
> address for each mapping. Again, following how grant maps work, I'd
> imagine that PVH guests will allocate an unused GFN for each mapping
> and do enough bookkeeping to make sure they don't clash with other GFN
> users (grant mapping, ballooning, &c). PV guests will probably be
> given a BFN by the hypervisor at map time (which will be == MFN in
> practice) and just needs to pass the same BFN to the unmap call later
> (it can store it in the GTT meanwhile).

If possible, I'd prefer to make both consistent, i.e. always finding an
unused GFN?

> > > The default policy I'm suggesting is that the XenGT backend domain
> > > should be marked IS_PRIV_FOR (or similar) over the XenGT client VMs,
> > > which will need a small extension in Xen since at the moment struct
> > > domain has only one "target" field.
> >
> > Is that connection setup by toolstack or by hypervisor today?
>
> It's set up by the toolstack using XEN_DOMCTL_set_target. Extending
> that to something like XEN_DOMCTL_set_target_list would be OK, I
> think, along with some sort of lookup call. Or maybe an
> add_target/remove_target pair would be easier?

Thanks for the suggestions. Yu and I will study this in detail and work
out a proposal. :-)

Thanks
Kevin
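
As an illustration of the "one free address per mapping, with enough
bookkeeping not to clash with other users" idea discussed above, here is
a small Python sketch. It is not how Linux or Xen actually track
grant-mapped memory; AddressBookkeeper and its methods are hypothetical
names, and the frame-number range and the reserved set are made-up
values. The point is only that mappings need not be contiguous: any
frame number not claimed by another user will do.

```python
class AddressBookkeeper:
    """Hands out one unused frame number per mapping (hypothetical helper)."""

    def __init__(self, first, last, reserved=()):
        self.first = first
        self.last = last
        self.cursor = first            # next never-used candidate
        self.in_use = set(reserved)    # frames claimed by any user
        self.recycled = set()          # frames released and safe to reuse

    def reserve(self, fn):
        """Claim a specific frame for another user (e.g. ballooning)."""
        if fn in self.in_use:
            return False
        self.recycled.discard(fn)
        self.in_use.add(fn)
        return True

    def alloc(self):
        """Return any unused frame number, or None if the range is full."""
        if self.recycled:
            fn = self.recycled.pop()
        else:
            # Skip frames that some other user has already claimed.
            while self.cursor <= self.last and self.cursor in self.in_use:
                self.cursor += 1
            if self.cursor > self.last:
                return None
            fn = self.cursor
            self.cursor += 1
        self.in_use.add(fn)
        return fn

    def free(self, fn):
        """Release a frame after the matching unmap."""
        if fn not in self.in_use:
            raise ValueError("frame %#x is not in use" % fn)
        self.in_use.remove(fn)
        if self.first <= fn <= self.last:
            self.recycled.add(fn)


# Four-frame range with one frame already taken by another user.
book = AddressBookkeeper(first=0x100, last=0x103, reserved={0x101})
a = book.alloc()
b = book.alloc()     # skips the reserved frame
c = book.alloc()
d = book.alloc()     # range exhausted
book.free(b)
e = book.alloc()     # reuses the frame that was just released
```

A PV-style interface where the hypervisor picks the BFN would move this
bookkeeping to the other side of the hypercall, but the caller would
still need to remember the returned value for the later unmap.
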
