Re: [Xen-devel] One question about the hypercall to translate gfn to mfn.
Hi, again. :)

As promised, I'm going to talk about more abstract design considerations. This will be a lot less concrete than in the other email, and about a larger range of things. Some of them may not be really desirable - or even possible.

[ TL;DR: read the other reply with the practical suggestions in it :) ]

I'm talking from the point of view of a hypervisor maintainer, looking at introducing this new XenGT component and thinking about what security properties we would like the _system_ to have once XenGT is introduced. I'm going to lay out a series of broadly increasing levels of security goodness and talk about what we'd need to do to get there.

For the purposes of this discussion, Xen does not _trust_ XenGT. By that I mean that Xen can't rely on the correctness/integrity of XenGT itself to maintain system security. Now, we can decide that for some properties we _will_ choose to trust XenGT, but the default is to assume that XenGT could be compromised or buggy. (This is not intended as a slur on XenGT, btw -- this is how we reason about device driver domains, qemu-dm and other components. There will be bugs in any component, and we're designing the system to minimise the effect of those bugs.)

OK. Properties we would like to have:

LEVEL 0: Protect Xen itself from XenGT
--------------------------------------

Bugs in XenGT should not be able to crash the host, and a compromised XenGT should not be able to take over the hypervisor.

We're not there in the current design, purely because XenGT has to be in dom0 (so it can trivially DoS Xen by rebooting the host).

But it doesn't seem too hard: as soon as we can run XenGT in a driver domain, and with IOMMU tables that restrict the GPU from writing to Xen's data structures, we'll have this property.

[BTW, this whole discussion assumes that the GPU has no 'back door' access to issue DMA that is not translated by the IOMMU. I have heard rumours in the past that such things exist.
:) If the GPU can issue untranslated DMA, then whatever controls it can take over the entire system, and so we can't make _any_ security guarantees about it.]

LEVEL 1: Isolate XenGT's clients from other VMs
-----------------------------------------------

In other words we partition the machine into VMs XenGT can touch (i.e. its clients) and those it can't. Then a malicious client that compromises XenGT only gains access to other VMs that share a GPU with it. That means we can deploy XenGT for some VMs without increasing the risk to other tenants.

Again we're not there yet, but I think the design I was talking about in my other email would do it: if XenGT must map all the memory it wants to let the GPU DMA to, and Xen's policy is to deny mappings for non-client-vm memory, then VMs that aren't using XenGT are protected.

LEVEL 2: Isolate XenGT's clients from each other
------------------------------------------------

This is trickier, as you pointed out. We could:

a) Decide that we will trust XenGT to provide this property. After all, that's its main purpose! This is how we treat other shared backends: if a NIC device driver domain is compromised, the attacker controls the network traffic for all its frontends. OTOH, we don't trust qemu in that way -- instead we use stub domains and IS_PRIV_FOR to enforce isolation.

b) Move all of XenGT into Xen. This is just defining the problem away and would probably do more harm than good - after all, keeping it separate has other advantages.

c) Use privilege separation: break XenGT into parts, isolated from each other, with the principle of least privilege applied to them. E.g.
   - GPU emulation could be in a per-client component that doesn't share state with the other clients' emulators;
   - Shadowing GTTs and auditing GPU commands could move into Xen, with a clean interface to the emulation parts.
   That way, even if a client VM can exploit a bug in the emulator, it can't affect other clients because it can't see their emulator state, and it can't bypass the safety rules because they're enforced by Xen.

When I talked about privilege separation before I was suggesting something like this, but without moving anything into Xen -- e.g. the device-emulation code for each client could be in a per-client, non-root process. The code that audits and issues commands to the GPU would be in a separate process, which is allowed to make hypercalls, and which does not trust the emulator processes.

My apologies if you're already doing this -- I know XenGT has some components in a kernel driver and some elsewhere but I haven't looked at the details.

LEVEL 3: Isolate XenGT's clients from XenGT itself
--------------------------------------------------

XenGT should not be able to access parts of its client VMs that they have not given it permission to. E.g. XenGT should not be able to read a client VM's crypto keys unless it displays them on the framebuffer or uses the GPU to accelerate crypto.

Unlike level 2, device driver domains _do_ have this property: this is what the grant tables are used for. A compromised NIC driver domain can MITM the frontend guest but it can't read any memory in the guest other than network buffers.

Again there are a few approaches, like:

a) Declare that we don't care (i.e. that we will trust XenGT for this property too). In a way it's no worse than trusting the firmware on a dedicated pass-through GPU. But on the other hand the client VM is sharing that firmware with some other VMs... :(

b) Make the GPU driver in the client use grant tables for all RAM that it gives to the GPU. Probably not practical!

c) Move just the code that builds the GTTs into Xen. That way Xen would guarantee that the GPU never accessed memory it wasn't allowed to.

I'm sure there are other ideas too.
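
To make the "Xen builds the GTTs" idea a bit more concrete, here is a toy Python model of the kind of checks Xen could apply before installing a GTT entry. Everything in it (ToyHypervisor, map_for_gpu, the dictionaries and the frame numbers) is made up for illustration. It is not Xen's or XenGT's actual interface, and how the checks line up with levels 1 to 3 is my reading of the discussion above rather than a worked-out design.

```python
from dataclasses import dataclass, field


@dataclass
class ToyHypervisor:
    """Toy stand-in for Xen's bookkeeping; hypothetical, not a real Xen API."""
    # Which domain owns each machine frame (mfn -> domid).
    frame_owner: dict = field(default_factory=dict)
    # Domains that are clients of the GPU mediator.
    gpu_clients: set = field(default_factory=set)
    # Frames each client has explicitly offered to the GPU (domid -> set of mfns).
    gpu_visible: dict = field(default_factory=dict)
    # The GPU translation table as built by the hypervisor (index -> mfn).
    gtt: dict = field(default_factory=dict)

    def map_for_gpu(self, client: int, index: int, mfn: int) -> bool:
        """Install a GTT entry for `client` only if every check passes."""
        # Level 1: only VMs registered as clients may be mapped at all.
        if client not in self.gpu_clients:
            return False
        # Level 2: the frame must belong to the client the entry is for.
        if self.frame_owner.get(mfn) != client:
            return False
        # Level 3: the client must have offered this frame to the GPU.
        if mfn not in self.gpu_visible.get(client, set()):
            return False
        self.gtt[index] = mfn
        return True


# Example: domains 1 and 2 are clients, domain 3 is an unrelated tenant.
xen = ToyHypervisor(
    frame_owner={0x100: 1, 0x101: 1, 0x200: 2, 0x300: 3},
    gpu_clients={1, 2},
    gpu_visible={1: {0x100}, 2: {0x200}},
)
results = [
    xen.map_for_gpu(1, 0, 0x100),  # client 1's own GPU buffer
    xen.map_for_gpu(1, 1, 0x101),  # client 1's frame, never offered to the GPU
    xen.map_for_gpu(1, 2, 0x200),  # frame owned by another client
    xen.map_for_gpu(3, 3, 0x300),  # domain 3 is not a client
]
print(results)  # [True, False, False, False]
```

The point of the sketch is only that the decision lives outside the emulator: a buggy or compromised caller can ask for any mapping it likes, but only the first request ends up in the table.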

Conclusion
----------

That's enough rambling from me -- time to come back down to earth. While I think it's useful to think about all these things, we don't want to get carried away. :) And as I said, for some things we can decide to trust XenGT to provide them, as long as we're clear about what that means.

I think that a reasonable minimum standard to expect is to enforce levels 0 and 1 in Xen, and trust XenGT for levels 2 and 3. And I think we can do that without needing any huge engineering effort; as I said, I think that's covered in my earlier reply.

Cheers,

Tim.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel