
Re: oxenstored performance issue when starting VMs in parallel

On 22.09.20 15:42, Paul Durrant wrote:
-----Original Message-----
From: Edwin Torok <edvin.torok@xxxxxxxxxx>
Sent: 22 September 2020 14:29
To: sstabellini@xxxxxxxxxx; Anthony Perard <anthony.perard@xxxxxxxxxx>;
xen-devel@xxxxxxxxxxxxxxxxxxxx; paul@xxxxxxx
Cc: xen-users@xxxxxxxxxxxxxxxxxxxx; jerome.leseinne@xxxxxxxxx; julien@xxxxxxx
Subject: Re: oxenstored performance issue when starting VMs in parallel

On Tue, 2020-09-22 at 15:17 +0200, jerome leseinne wrote:

Edwin, you rock! This call in qemu is indeed the culprit!
I have disabled this xen_bus_add_watch call and re-run the test on our
big server:

- oxenstored is now between 10% and 20% CPU usage (previously it was at
100% all the time)
- All our VMs are responsive
- All our VMs start in less than 10 seconds (before the fix some VMs
could take more than one minute to be fully up)
- Dom0 is more responsive

Disabling the watch may not be the ideal solution (I will let the qemu
experts answer this and comment on the possible side effects),

CC-ed Qemu maintainer of Xen code, please see this discussion about
scalability issues with the backend watching code in qemu 4.1+.

I think the scalability issue is due to this code in qemu, which causes
an instance of qemu to see watches from all devices (even those
belonging to other qemu instances), such that adding a single device
causes watches to be fired on each of the N instances of qemu:
       xenbus->backend_watch =
            xen_bus_add_watch(xenbus, "", /* domain root node */
                              "backend", xen_bus_backend_changed,

I can understand that for backwards compatibility you might need this
code, but is there a way that an up-to-date (xl) toolstack could tell
qemu what it needs to look at (e.g. via QMP, or other keys in xenstore)
instead of relying on an overly broad watch?
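
To make the fan-out described above concrete, here is a minimal, self-contained sketch (not qemu or oxenstored code). It assumes one device is added per guest, that a watch fires for the watched node and everything beneath it, and that devices live under a path of the form "backend/vbd/<domid>/0". The helper names watch_matches and count_events, and the narrower per-guest watch used for comparison, are hypothetical and only serve the illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum { NODE_LEN = 64 };

/* Returns 1 if `changed` is the watched node or lies beneath it.
 * This models the prefix semantics of a watch (assumption: paths are
 * relative and use '/' as separator). */
static int watch_matches(const char *watched, const char *changed)
{
    size_t n = strlen(watched);

    if (strncmp(watched, changed, n) != 0)
        return 0;
    return changed[n] == '\0' || changed[n] == '/';
}

/* Counts watch events delivered when each of n guests gets one device
 * and there is one qemu instance per guest.
 * per_guest == 0: every instance watches the broad "backend" node.
 * per_guest != 0: instance q watches only "backend/vbd/<q>"
 *                 (hypothetical narrower watch). */
static long count_events(int n, int per_guest)
{
    char watched[NODE_LEN], changed[NODE_LEN];
    long events = 0;

    assert(n >= 0);
    for (int dev = 1; dev <= n; dev++) {
        /* One device node is created for guest `dev`. */
        snprintf(changed, sizeof(changed), "backend/vbd/%d/0", dev);
        for (int q = 1; q <= n; q++) {
            if (per_guest)
                snprintf(watched, sizeof(watched), "backend/vbd/%d", q);
            else
                snprintf(watched, sizeof(watched), "backend");
            events += watch_matches(watched, changed);
        }
    }
    return events;
}
```

Under these assumptions, count_events(12, 0) yields 144 events for 12 guests, while count_events(12, 1) yields 12, which is the quadratic versus linear difference the paragraph above points at. A real device add touches several xenstore nodes, so actual event counts would be higher in both cases.
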

I think this could be made more efficient. The call to
"module_call_init(MODULE_INIT_XEN_BACKEND)" just prior to this watch will
register backends that do auto-creation, so we could register individual
watches for the various backend types instead of this single one.

The watch should be on guest domain level, e.g. for:


We have one qemu process per guest, after all.
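
The two suggestions above (one watch per backend type, scoped to the guest domain) could be combined roughly as follows. This is only a sketch under stated assumptions: the list of backend types is hypothetical, the node layout "backend/<type>/<domid>" is assumed from the usual xenstore backend directory structure, and build_watch_node and build_guest_watch_nodes are illustrative names, not qemu functions. It does not reproduce the example that is missing from the message above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum { NODE_LEN = 64 };

/* Hypothetical set of backend types that support auto-creation. */
static const char *const backend_types[] = { "vbd", "vif", "console" };
enum { NUM_BACKEND_TYPES = sizeof(backend_types) / sizeof(backend_types[0]) };

/* Writes "backend/<type>/<domid>" into buf.
 * Returns 0 on success, -1 if the result would not fit in len bytes. */
static int build_watch_node(char *buf, size_t len,
                            const char *type, unsigned domid)
{
    int n;

    assert(buf != NULL && type != NULL);
    n = snprintf(buf, len, "backend/%s/%u", type, domid);
    return (n >= 0 && (size_t)n < len) ? 0 : -1;
}

/* Fills one watch node per backend type for a single guest, i.e. the
 * set of nodes one qemu instance would watch instead of "backend". */
static int build_guest_watch_nodes(char nodes[][NODE_LEN], unsigned domid)
{
    for (int i = 0; i < NUM_BACKEND_TYPES; i++) {
        if (build_watch_node(nodes[i], NODE_LEN, backend_types[i], domid) != 0)
            return -1;
    }
    return 0;
}
```

With this layout, a qemu instance serving domain 7 would watch "backend/vbd/7", "backend/vif/7" and "backend/console/7", so device changes for other guests would no longer wake it up. Whether qemu can rely on such paths for every backend type is a question for the maintainers.
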



