Re: [Xen-devel] One question about the hypercall to translate gfn to mfn.



At 01:14 +0000 on 10 Dec (1418170461), Tian, Kevin wrote:
> > From: Tim Deegan [mailto:tim@xxxxxxx]
> > Sent: Tuesday, December 09, 2014 6:47 PM
> > 
> > At 18:10 +0800 on 09 Dec (1418145055), Yu, Zhang wrote:
> > > Hi all,
> > >
> > >    As you can see, we are pushing our XenGT patches upstream. One
> > > feature we need in Xen is to translate guests' gfn to mfn in the
> > > XenGT dom0 device model.
> > >
> > >    Here we may have 2 similar solutions:
> > >    1> Paul told me (and thank you, Paul :)) that there used to be a
> > > hypercall, XENMEM_translate_gpfn_list, which was removed by Keir in
> > > commit 2d2f7977a052e655db6748be5dabf5a58f5c5e32, because there was no
> > > usage at that time.
> > 
> > It's been suggested before that we should revive this hypercall, and I
> > don't think it's a good idea.  Whenever a domain needs to know the
> > actual MFN of another domain's memory it's usually because the
> > security model is problematic.  In particular, finding the MFN is
> > usually followed by a brute-force mapping from a dom0 process, or by
> > passing the MFN to a device for unprotected DMA.
> 
> In our case it's not because the security model is problematic. It's
> because GPU virtualization is done in Dom0 while memory virtualization
> is done in the hypervisor. We need a means to query GPFN->MFN so we
> can set up the shadow GPU page tables in Dom0 correctly for a VM.

I don't think we understand each other.  Let me try to explain what I
mean.  My apologies if this sounds patronising; I'm just trying to be
as clear as I can.

It is Xen's job to isolate VMs from each other.  As part of that, Xen
uses the MMU, nested paging, and IOMMUs to control access to RAM.  Any
software component that can pass a raw MFN to hardware breaks that
isolation, because Xen has no way of controlling what that component
can do (including taking over the hypervisor).  This is why I am
afraid when developers ask for GFN->MFN translation functions.

So if the XenGT model allowed the backend component to (cause the GPU
to) perform arbitrary DMA without IOMMU checks, then that component
would have complete access to the system and (from a security pov)
might as well be running in the hypervisor.  That would be very
problematic, but AFAICT that's not what's going on.  From your reply
on the other thread it seems like the GPU is behind the IOMMU, so
that's OK. :)

When the backend component gets a GFN from the guest, it wants an
address that it can give to the GPU for DMA that will map the right
memory.  That address must be mapped in the IOMMU tables that the GPU
will be using, which means the IOMMU tables of the backend domain,
IIUC[1].  So the hypercall it needs is not "give me the MFN that matches
this GFN" but "please map this GFN into my IOMMU tables".

Asking for the MFN will only work if the backend domain's IOMMU
tables have an existing 1:1 r/w mapping of all guest RAM, which
happens to be the case if the backend component is in dom0 _and_ dom0
is PV _and_ we're not using strict IOMMU tables.  Restricting XenGT to
work in only those circumstances would be short-sighted, not only
because it would mean XenGT could never work as a driver domain, but
also because it seems like PVH dom0 is going to be the default at some
point.

If the existing hypercalls that make IOMMU mappings are not right for
XenGT then we can absolutely consider adding some more.  But we need
to talk about what policy Xen will enforce on the mapping requests.
If the shared backend is allowed to map any page of any VM, then it
can easily take control of any VM on the host (even though the IOMMU
will prevent it from taking over the hypervisor itself).  The
absolute minimum we should allow here is some toolstack-controlled
list of which VMs the XenGT backend is serving, so that it can refuse
to map other VMs' memory (like an extension of IS_PRIV_FOR, which does
this job for Qemu).
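
As a rough sketch of the kind of check I mean (all names hypothetical,
modelled loosely on the IS_PRIV_FOR relationship):

    #include <errno.h>
    #include <stdbool.h>
    #include <stdint.h>

    typedef uint16_t domid_t;

    /* Hypothetical policy table, written by the toolstack when it
     * assigns a VM to this XenGT backend. */
    #define MAX_SERVED 16
    static domid_t served[MAX_SERVED];
    static unsigned int nr_served;

    /* True iff the toolstack has marked 'target' as served here. */
    static bool backend_serves(domid_t target)
    {
        for ( unsigned int i = 0; i < nr_served; i++ )
            if ( served[i] == target )
                return true;
        return false;
    }

    /* Gate every mapping request on that policy: refuse to touch
     * the memory of any VM this backend is not explicitly serving. */
    static int check_map_request(domid_t target)
    {
        return backend_serves(target) ? 0 : -EPERM;
    }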

I would also strongly advise using privilege separation in the backend
between the GPUPT shadow code (which needs mapping rights and is
trusted to maintain isolation between the VMs that are sharing the
GPU) and the rest of the XenGT backend (which doesn't/isn't).  But
that's outside my remit as a hypervisor maintainer so it goes no
further than an "I told you so". :)
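
For what it's worth, the minimal shape of that split might be a small
trusted process that alone holds whatever handle grants mapping
rights, with the bulk of the backend asking it for mappings over a
local socket.  A bare-bones sketch (assumed design, not XenGT code):

    #include <sys/socket.h>
    #include <unistd.h>

    int main(void)
    {
        int sv[2];
        if ( socketpair(AF_UNIX, SOCK_SEQPACKET, 0, sv) < 0 )
            return 1;

        if ( fork() == 0 )
        {
            /* Untrusted half: no mapping rights.  Close the trusted
             * end, drop privileges, and from here on send GFN-map
             * requests over sv[1]. */
            close(sv[0]);
            /* setuid()/chroot()/seccomp would go here. */
            _exit(0);
        }

        /* Trusted half: owns the (hypothetical) mapping handle and
         * validates every request arriving on sv[0] against the
         * per-VM policy before acting on it. */
        close(sv[1]);
        return 0;
    }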

Cheers,

Tim.

[1] That is, AIUI this GPU doesn't context-switch which set of IOMMU
    tables it's using for DMA, SR-IOV-style, and that's why you need a
    software component in the first place.
