
Re: [Xen-devel] Qemu disaggregation in Xen environment

On 03/05/2012 10:06 PM, Ian Campbell wrote:
I'm not aware of any code existing to do this.

There's a bunch of interesting stuff to do on the Xen side to make this
stuff work.

Firstly you would need to add support to the hypervisor for dispatching
I/O requests to multiple qemu instances (via multiple io req rings). I
think at the moment there is only support for a single ring (or maybe
it's one sync and one buffered I/O ring).

I have modified Xen to create "ioreq servers". An ioreq server
contains a list of I/O ranges and a list of BDFs whose I/O is trapped
on behalf of a single running QEMU instance.
Each ioreq server can be associated with an event channel, so
I/O events can be delivered to different processes.
One ioreq server is created for each QEMU. QEMU must specify
which PCI devices (by BDF) and which I/O ranges it handles.
I added some hypercalls:
    - to register an ioreq server
    - to register/unregister a BDF
    - to register/unregister an I/O range
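
As a rough illustration of the bookkeeping described above (this is not the actual patch; every struct, function name, and size limit below is hypothetical), an ioreq server could be modelled as a pair of small tables that the hypervisor searches when an I/O is trapped:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RANGES 8   /* arbitrary limits chosen for this sketch */
#define MAX_BDFS   8

/* Hypothetical: inclusive range of I/O addresses claimed by one QEMU. */
struct io_range {
    uint64_t start;
    uint64_t end;
};

/* Hypothetical ioreq server: everything one QEMU instance has registered. */
struct ioreq_server {
    unsigned int id;
    uint32_t evtchn_port;              /* event channel used to notify this QEMU */
    struct io_range ranges[MAX_RANGES];
    size_t nr_ranges;
    uint16_t bdfs[MAX_BDFS];           /* bus[15:8] device[7:3] function[2:0] */
    size_t nr_bdfs;
};

/* Pack bus/device/function into the usual 16-bit BDF encoding. */
static uint16_t make_bdf(uint8_t bus, uint8_t dev, uint8_t fn)
{
    return (uint16_t)((bus << 8) | ((dev & 0x1f) << 3) | (fn & 0x7));
}

/* Register an inclusive I/O range; fails if the table is full or the range is inverted. */
static bool server_register_range(struct ioreq_server *s, uint64_t start, uint64_t end)
{
    assert(s != NULL);
    if (start > end || s->nr_ranges == MAX_RANGES)
        return false;
    s->ranges[s->nr_ranges].start = start;
    s->ranges[s->nr_ranges].end = end;
    s->nr_ranges++;
    return true;
}

/* Register a PCI device by BDF; fails if the table is full. */
static bool server_register_bdf(struct ioreq_server *s, uint16_t bdf)
{
    assert(s != NULL);
    if (s->nr_bdfs == MAX_BDFS)
        return false;
    s->bdfs[s->nr_bdfs++] = bdf;
    return true;
}

/* Return the first server whose ranges contain addr, or NULL if nobody claimed it. */
static const struct ioreq_server *
find_server_for_io(const struct ioreq_server *servers, size_t n, uint64_t addr)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < servers[i].nr_ranges; j++)
            if (addr >= servers[i].ranges[j].start &&
                addr <= servers[i].ranges[j].end)
                return &servers[i];
    return NULL;
}

/* Return the first server that registered bdf, or NULL if nobody claimed it. */
static const struct ioreq_server *
find_server_for_bdf(const struct ioreq_server *servers, size_t n, uint16_t bdf)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < servers[i].nr_bdfs; j++)
            if (servers[i].bdfs[j] == bdf)
                return &servers[i];
    return NULL;
}
```

A NULL result from either lookup corresponds to an access that no QEMU has claimed. The sketch does not check for overlapping registrations between servers; the first match simply wins.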

For the moment all QEMUs share the same pages (buffered and
I/O request). For better security, I would like to make these
pages private to each ioreq server. I saw that these pages are allocated by
the toolstack. Can we assume that the toolstack knows at domain
creation time how many QEMUs it is going to spawn?

You'd also need to make sure that qemu explicitly requests all the MMIO
regions it is interested in. Currently the hypervisor forwards any
unknown MMIO to qemu so the explicit registration is probably not done
as consistently as it could be. If you want to have N qemus then you
need to make sure that at least N-1 of them register for everything they are
interested in.

I have modified QEMU to register all the I/O ranges and PCI devices it needs.
Any unregistered I/O is discarded by Xen.

Currently the PCI config space decode is done within qemu which is a bit
tricky if you are wanting to have different emulated PCI devices in
different qemu processes. We think it would independently be an
architectural improvement to have the hypervisor do the PCI config space
decode anyway. This would allow it to forward the I/O to the correct
qemu (there are other benefits to this change, e.g. relating to PCI
passthrough and the handling of MSI configuration etc)

I have created a patch which allows Xen to trap the I/O port
registers 0xcf8 through 0xcff. Xen goes through the list of ioreq servers to
find which server handles the PCI device and prepares the request.
For that I added a new I/O request type, IOREQ_TYPE_PCI_CONFIG.
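
To make the decode step concrete, here is a minimal sketch of how a BDF and register offset can be recovered from the standard PCI configuration mechanism #1 (address latched via port 0xcf8, data accessed via 0xcfc through 0xcff). It is not the code from the patch; the struct and function names are hypothetical, and the resulting BDF is what would then be matched against each ioreq server's list:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_CONFIG_ADDRESS 0xcf8u  /* 32-bit address register */
#define PCI_CONFIG_DATA    0xcfcu  /* data window: 0xcfc..0xcff */

/* Hypothetical result of decoding one config-space access. */
struct pci_cfg_access {
    uint16_t bdf;  /* bus[15:8] device[7:3] function[2:0] */
    uint16_t reg;  /* byte offset into config space, 0..255 */
};

/*
 * Decode an access to the data window, given the value last written to
 * 0xcf8. Returns false if the port is outside 0xcfc..0xcff or if the
 * enable bit (bit 31) of the latched address is clear.
 */
static bool pci_cfg_decode(uint32_t cf8, uint16_t port, struct pci_cfg_access *out)
{
    assert(out != NULL);
    if (port < PCI_CONFIG_DATA || port > PCI_CONFIG_DATA + 3)
        return false;
    if (!(cf8 & 0x80000000u))
        return false;

    /* Bits 23:16 bus, 15:11 device, 10:8 function. */
    out->bdf = (uint16_t)((cf8 >> 8) & 0xffffu);
    /* Bits 7:2 select the dword; the data port picks the byte within it. */
    out->reg = (uint16_t)((cf8 & 0xfcu) + (port - PCI_CONFIG_DATA));
    return true;
}

/* Helpers to split a BDF back into its fields. */
static uint8_t bdf_bus(uint16_t bdf) { return (uint8_t)(bdf >> 8); }
static uint8_t bdf_dev(uint16_t bdf) { return (uint8_t)((bdf >> 3) & 0x1f); }
static uint8_t bdf_fn(uint16_t bdf)  { return (uint8_t)(bdf & 0x7); }
```

For example, with 0x80001810 latched in 0xcf8, an access to port 0xcfe decodes to bus 0, device 3, function 0, register offset 0x12.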

Then you'd need to do a bunch of toolstack level work to start and
manage the multiple qemu processes instead of the existing single

I have begun to modify the toolstack. For the moment, I just
handle a new type of device model for my own testing.
