Re: [Xen-devel] [PATCH] docs/qemu-deprivilege: Revise and update with status and future plans
Thanks for this update!
George Dunlap writes ("[PATCH] docs/qemu-deprivilege: Revise and update with
status and future plans"):
...
> +# Technical details
> +
> +## Restrictions done
This makes this doc into a mixture of a design doc and a user doc, I
think.
It might be worth stating the design intent, which I think is this:
* Even if there is a bug (for example in qemu) which permits a domain
to compromise the device model, the compromised device model
process is prevented from violating the system's overall security
properties. Ie, a guest cannot "escape" from the virtualisation by
using a qemu bug.
This design intent is not yet achieved. Right now an attacker is
impeded and their attack is complicated; in some circumstances they
will be limited to denial of service.
I'm not sure the individual restrictions need to be in a user-facing
doc.
Maybe the user-facing wording from your patch should be moved to
xl.cfg.doc.5 ?
> +'''Description''': Close and restrict Xen-related file descriptors.
> +Specifically, make sure that only one `privcmd` instance is open, and
> +that the IOCTL_EVTCHN_RESTRICT_DOMID ioctl has been called.
> +
> +XXX Also, make sure that only one `xenstore` fd remains open, and that
> +it's restricted.
No. Firstly, in each case, all relevant descriptors are restricted.
This is the purpose of the xentoolcore__restrict_* stuff. Secondly,
xenstore *is* covered - but the xs fd is squashed so as to be totally
unusable: xs.c uses xentoolcore__restrict_by_dup2_null.
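To illustrate what "squashed" means here: the general technique is to dup2 a /dev/null descriptor over the live one, so the fd number stays valid but no longer refers to the original object. The following is a minimal sketch of that technique only, not the libxentoolcore implementation; `squash_fd_to_null` is a hypothetical name.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not the xentoolcore code): make `fd` refer to
 * /dev/null instead of whatever it referred to before.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int squash_fd_to_null(int fd)
{
    assert(fd >= 0);

    int nullfd = open("/dev/null", O_RDONLY);
    if (nullfd < 0)
        return -1;

    /* dup2 closes the old `fd` and repoints it in a single step. */
    int r = dup2(nullfd, fd);
    int saved_errno = errno;

    /* Drop the temporary descriptor unless it already is `fd`. */
    if (nullfd != fd)
        close(nullfd);

    errno = saved_errno;
    return r < 0 ? -1 : 0;
}
```

Code that still holds the fd number afterwards just sees end-of-file on read, rather than getting access to the original resource.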
> +### Namespaces for unused functionality
> +
> +'''Description''': Enter QEMU into its own mount & IPC namespaces.
> +This means that even if other restrictions fail, the process won't be
> +able to even name system mount points or existing non-file-based IPC
> +descriptors to attempt to attack them.
> +
> +'''Implementation''':
> +
> +In theory this could be done in QEMU (similar to -sandbox, -runas,
> +-chroot, and so on), but a patch doing this in QEMU was NAKed
> +upstream. They preferred that this was done as a setup step by
> +whatever executes QEMU; i.e., have the process which exec's QEMU first
> +call:
> +
> + unshare(CLONE_NEWNS | CLONE_NEWIPC)
This would mean we would have to pass qemu fds for both the network
tap devices and any vnc consoles. That makes life considerably more
complicated. I think we should perhaps revisit this upstream.
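For reference, the "setup step by whatever executes QEMU" would look roughly like the sketch below: fork, unshare in the child, then exec. This is an illustration under assumptions, not libxl code; `spawn_unshared` is a hypothetical name, and the exit code 127 for "could not unshare or exec" is a convention chosen here. It does not address the fd-passing problem described above.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical launcher: run argv[0] in fresh mount and IPC namespaces.
 * Returns the child's exit status, 127 if unshare/exec failed in the
 * child, or -1 if fork/waitpid failed or the child died from a signal.
 */
static int spawn_unshared(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: usually needs CAP_SYS_ADMIN, else unshare gives EPERM. */
        if (unshare(CLONE_NEWNS | CLONE_NEWIPC) < 0)
            _exit(127);
        execvp(argv[0], argv);
        _exit(127);             /* only reached if exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Anything the child needs from outside the new namespaces has to be opened before the unshare call and inherited across the exec.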
> +'''Implementation''': Enable from the command-line:
> +
> + -sandbox on,obsolete=deny,elevateprivileges=allow,spawn=deny,resourcecontrol=deny
> +
> +`elevateprivileges` is currently required to allow `-runas` to work.
> +Removing this requirement would mean making sure that the uid change
> +happened before the seccomp2 call, perhaps by changing the uid before
> +executing QEMU. (But this would then require other changes to create
> +the QMP socket, VNC socket, and so on).
See what I say above.
> +### Further RLIMITs
> +
> +RLIMIT_AS limits the total amount of memory; but this includes the
> +virtual memory which QEMU uses as a mapcache. xen-mapcache.c already
> +fiddles with this; it would be straightforward to make it *set* the
> +rlimit to what it thinks a sensible limit is.
> +
> +Other things that would take some cleverness / changes to QEMU to
> +utilize due to ordering constraints:
> + - RLIMIT_NPROC (after uid changes to a unique uid)
> + - RLIMIT_NOFILES (after all necessary files are opened)
I think there is little difficulty with RLIMIT_NPROC since our qemu
does not fork. I think we can set it to a value which is currently
violated for the current uid ?
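A minimal sketch of that idea, assuming a limit of 0 is acceptable because the process never needs to fork: with both soft and hard limits at 0, any existing process under the uid already exceeds the limit, so fork() should fail with EAGAIN. `forbid_new_processes` is a hypothetical name. Note that the kernel does not enforce RLIMIT_NPROC for processes with CAP_SYS_ADMIN or CAP_SYS_RESOURCE, so this only bites after the uid change.

```c
#include <assert.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: forbid this process (and its uid, as far as
 * RLIMIT_NPROC accounting goes) from creating new processes.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int forbid_new_processes(void)
{
    const struct rlimit none = { .rlim_cur = 0, .rlim_max = 0 };

    /* Lowering the hard limit is always permitted and is irreversible
     * for an unprivileged process. */
    if (setrlimit(RLIMIT_NPROC, &none) < 0)
        return -1;

    /* Read the limit back as a sanity check. */
    struct rlimit now;
    if (getrlimit(RLIMIT_NPROC, &now) < 0)
        return -1;
    assert(now.rlim_cur == 0 && now.rlim_max == 0);

    return 0;
}
```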
> +### libxl UID cleanup
...
> +kill(-1,sig) sends a signal to "every process to which the calling
> +process has permission to send a signal". So in theory:
> + setuid(X)
> + kill(-1,KILL)
> +should do the trick.
We need to check whether a malicious qemu process could kill this
one.
> +### Disks
> +
> +The chroot (and seccomp?) happens late enough such that QEMU can
> +initialize itself and open its disks. If you want to add a disk at run
> +time via or insert a CD, you can't pass a path because QEMU is
> +chrooted. Instead use the add-fd QMP command and use
> +/dev/fdset/<fdset-id> as the path.
I don't think we (Xen) really support hotplug of emulated disks right
now. So it's just cd insert that's a problem.
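For the cd insert case, the add-fd route relies on the toolstack opening the image and handing the descriptor to QEMU over the QMP unix socket as SCM_RIGHTS ancillary data. The sketch below shows only that kernel mechanism; the QMP JSON framing is omitted, and `send_fd` / `recv_fd` are hypothetical names.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one int (one fd). */
union fd_cmsg {
    struct cmsghdr hdr;
    char buf[CMSG_SPACE(sizeof(int))];
};

/* Send `fd` over the unix socket `sock`. Returns 0 or -1. */
static int send_fd(int sock, int fd)
{
    assert(sock >= 0 && fd >= 0);

    char byte = 0;              /* at least one data byte must be sent */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_cmsg u;
    struct msghdr msg;

    memset(&u, 0, sizeof(u));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    struct cmsghdr *c = CMSG_FIRSTHDR(&msg);
    c->cmsg_level = SOL_SOCKET;
    c->cmsg_type = SCM_RIGHTS;
    c->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(c), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one fd from `sock`. Returns the new fd, or -1. */
static int recv_fd(int sock)
{
    assert(sock >= 0);

    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_cmsg u;
    struct msghdr msg;

    memset(&u, 0, sizeof(u));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;

    struct cmsghdr *c = CMSG_FIRSTHDR(&msg);
    if (c == NULL || c->cmsg_level != SOL_SOCKET ||
        c->cmsg_type != SCM_RIGHTS || c->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    int fd;
    memcpy(&fd, CMSG_DATA(c), sizeof(int));
    return fd;
}
```

The receiver gets a new descriptor for the same open file, so the chrooted process never needs to resolve a path.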
> +### Network
>
> +If QEMU runs in its own network namespace, it can't open the tap
> +device itself because the interface won't be visible outside of its
> +own namespace. So instead, have the toolstack open the device and pass
> +it as an fd on the command-line:
I think this could be solved by doing these things in a different
order.
Thanks,
Ian.
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel