Re: [Xen-devel] Qemu disaggregation in Xen environment
On 03/05/2012 10:06 PM, Ian Campbell wrote:
I'm not aware of any code existing to do this. There's a bunch of interesting stuff to do on the Xen side to make this stuff work. Firstly you would need to add support to the hypervisor for dispatching I/O requests to multiple qemu instances (via multiple io req rings). I think at the moment there is only support for a single ring (or maybe it's one sync and one buffered I/O ring).
I have modified Xen to create "ioreq servers". An ioreq server contains a list of I/O ranges and a list of BDFs whose I/O is trapped on behalf of one running QEMU instance. Each ioreq server can be associated with an event channel, so I/O events can be delivered to different processes. One ioreq server is created for each QEMU, and QEMU must specify which PCI devices (by BDF) and which I/O ranges it handles.

I added some hypercalls:
- to register an ioreq server
- to register/unregister a BDF
- to register/unregister an I/O range

For the moment all QEMUs share the same pages (buffered and I/O request). For better security, I would like to make these pages private to each ioreq server. I saw that these pages are allocated by the toolstack. Can we assume that the toolstack knows at domain creation time how many QEMUs it is going to spawn?
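To make the idea concrete, here is a minimal sketch of how an ioreq server table and its I/O range lookup could look. It is not the code from the patch: every name (struct ioreq_server, ioreq_server_register, find_server_for_io, the fixed-size arrays and their limits) is a hypothetical choice for illustration, and the real hypercall interface is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limits, chosen only to keep the sketch simple. */
#define MAX_RANGES  8
#define MAX_BDFS    8
#define MAX_SERVERS 4

struct io_range {
    uint64_t start, end;            /* inclusive bounds */
};

/* One ioreq server per QEMU instance (hypothetical layout). */
struct ioreq_server {
    int             evtchn;         /* event channel used to notify this QEMU */
    size_t          nr_ranges;
    struct io_range ranges[MAX_RANGES];
    size_t          nr_bdfs;
    uint16_t        bdfs[MAX_BDFS]; /* bus:8, device:5, function:3 */
};

/* Must be zero-initialized before use. */
struct ioreq_server_table {
    size_t              nr;
    struct ioreq_server servers[MAX_SERVERS];
};

/* Register a new server; returns its index, or -1 if the table is full. */
int ioreq_server_register(struct ioreq_server_table *t, int evtchn)
{
    assert(t != NULL);
    if (t->nr == MAX_SERVERS)
        return -1;
    t->servers[t->nr].evtchn = evtchn;
    t->servers[t->nr].nr_ranges = 0;
    t->servers[t->nr].nr_bdfs = 0;
    return (int)t->nr++;
}

/* Register an I/O range for a server; returns 0 on success, -1 on error. */
int ioreq_server_add_range(struct ioreq_server *s, uint64_t start, uint64_t end)
{
    assert(s != NULL);
    if (start > end || s->nr_ranges == MAX_RANGES)
        return -1;
    s->ranges[s->nr_ranges].start = start;
    s->ranges[s->nr_ranges].end = end;
    s->nr_ranges++;
    return 0;
}

/* Register a PCI device (by BDF) for a server; returns 0 or -1. */
int ioreq_server_add_bdf(struct ioreq_server *s, uint16_t bdf)
{
    assert(s != NULL);
    if (s->nr_bdfs == MAX_BDFS)
        return -1;
    s->bdfs[s->nr_bdfs++] = bdf;
    return 0;
}

/* Find the server that registered this address.
 * NULL means nobody claimed it, so the access would be discarded. */
const struct ioreq_server *
find_server_for_io(const struct ioreq_server_table *t, uint64_t addr)
{
    assert(t != NULL);
    for (size_t i = 0; i < t->nr; i++) {
        const struct ioreq_server *s = &t->servers[i];
        for (size_t j = 0; j < s->nr_ranges; j++)
            if (addr >= s->ranges[j].start && addr <= s->ranges[j].end)
                return s;
    }
    return NULL;
}
```

With two servers registered, one for ports 0x1f0-0x1f7 and one for 0xc000-0xc0ff, an access to 0x1f4 resolves to the first server's event channel, while an access to an unclaimed port returns NULL.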
You'd also need to make sure that qemu explicitly requests all the MMIO regions it is interested in. Currently the hypervisor forwards any unknown MMIO to qemu so the explicit registration is probably not done as consistently as it could be. If you want to have N qemus then you need to make sure that at least N-1 of them register for everything they are interested in.
I have modified QEMU to register all the I/O ranges and PCI devices it needs. All unregistered I/O is discarded by Xen.
Currently the PCI config space decode is done within qemu which is a bit tricky if you are wanting to have different emulated PCI devices in different qemu processes. We think it would independently be an architectural improvement to have the hypervisor do the PCI config space decode anyway. This would allow it to forward the I/O to the correct qemu (there are other benefits to this change, e.g. relating to PCI passthrough and the handling of MSI configuration etc)
I have created a patch which allows Xen to catch accesses to the I/O port registers 0xcf8 through 0xcff. Xen goes through the list of ioreq servers to find which server handles the PCI device and prepares the request. For that I added a new I/O request type, IOREQ_TYPE_PCI_CONFIG.
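As an illustration of the decode step, the sketch below splits a 0xcf8 address value into a BDF and a register offset, following the standard PCI configuration mechanism #1 layout (bit 31 enable, bits 23-16 bus, 15-11 device, 10-8 function, 7-2 dword-aligned register). The struct and function names are hypothetical and this is not the code from the patch; it only shows the bit manipulation involved before looking up the owning ioreq server.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded PCI config access (hypothetical structure). */
struct pci_cfg_access {
    uint16_t bdf;   /* bus:8, device:5, function:3 */
    uint8_t  bus;
    uint8_t  dev;
    uint8_t  fn;
    uint16_t reg;   /* byte offset into config space */
};

/* Decode the value last written to port 0xcf8 together with the data
 * port being accessed (0xcfc..0xcff). Returns false if the enable bit
 * is clear or the port is outside the data window. */
bool decode_cf8(uint32_t cf8, uint16_t data_port, struct pci_cfg_access *out)
{
    assert(out != NULL);

    if (!(cf8 & 0x80000000u))
        return false;                    /* config cycle not enabled */
    if (data_port < 0xcfc || data_port > 0xcff)
        return false;                    /* not a config data port */

    out->bdf = (uint16_t)((cf8 >> 8) & 0xffff);
    out->bus = (uint8_t)((cf8 >> 16) & 0xff);
    out->dev = (uint8_t)((cf8 >> 11) & 0x1f);
    out->fn  = (uint8_t)((cf8 >> 8) & 0x7);
    /* The register field is dword-aligned; the low two bits come from
     * which byte of the 0xcfc..0xcff window is accessed. */
    out->reg = (uint16_t)((cf8 & 0xfc) | (data_port & 0x3));
    return true;
}
```

The resulting BDF could then be matched against the BDF list of each ioreq server to pick the QEMU that should receive the request.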
Then you'd need to do a bunch of toolstack level work to start and manage the multiple qemu processes instead of the existing single process.
I have begun to modify the toolstack. For the moment, I just handle a new type of device model for my own testing.