
Re: live migration fails: qemu placing pci devices at different locations



On Tue, Oct 31, 2023 at 10:07:29AM +0000, James Dingwall wrote:
> Hi,
> 
> I'm having a bit of trouble performing live migration between hvm guests.  The
> sending side is xen 4.14.5 (qemu 5.0), receiving 4.15.5 (qemu 5.1).  The error
> message recorded in qemu-dm-<name>--incoming.log:
> 
> qemu-system-i386: Unknown savevm section or instance '0000:00:04.0/vga' 0. 
> Make sure that your current VM setup matches your saved VM setup, including 
> any hotplugged devices
> 
> I have patched libxl_dm.c to explicitly assign `addr=xx` values for various
> devices and when these are correct the domain migrates correctly.  However
> the configuration differences between guests means that the values are not
> consistent.  The domain config file doesn't allow the pci address to be
> expressed in the configuration for, e.g. `soundhw="DEVICE"`
> 
> e.g. 
> 
> diff --git a/tools/libs/light/libxl_dm.c b/tools/libs/light/libxl_dm.c
> index 6e531863ac0..daa7c49846f 100644
> --- a/tools/libs/light/libxl_dm.c
> +++ b/tools/libs/light/libxl_dm.c
> @@ -1441,7 +1441,7 @@ static int libxl__build_device_model_args_new(libxl__gc *gc,
>              flexarray_append(dm_args, "-spice");
>              flexarray_append(dm_args, spiceoptions);
>              if (libxl_defbool_val(b_info->u.hvm.spice.vdagent)) {
> -                flexarray_vappend(dm_args, "-device", "virtio-serial",
> +                flexarray_vappend(dm_args, "-device", "virtio-serial,addr=04",
>                      "-chardev", "spicevmc,id=vdagent,name=vdagent", "-device",
>                      "virtserialport,chardev=vdagent,name=com.redhat.spice.0",
>                      NULL);
> 
> The order of devices on the qemu command line (below) appears to be the same
> so my assumption is that the internals of qemu have resulted in things being
> connected in a different order.  The output of a Windows `lspci` tool is
> also included.
> 
> Could anyone make any additional suggestions on how I could try to gain
> consistency between the different qemu versions?

After a bit more head scratching we worked out the cause and a solution for
our case.  Xen 4.15.4 introduced commit
d65ebacb78901b695bc5e8a075ad1ad865a78928, which stops using the deprecated
qemu `-soundhw` option.  The qemu device initialisation code looks like:

...
    soundhw_init(); // handles old -soundhw option
...
    /* init generic devices */
    rom_set_order_override(FW_CFG_ORDER_OVERRIDE_DEVICE);
    qemu_opts_foreach(qemu_find_opts("device"),
                      device_init_func, NULL, &error_fatal);
...

So the old -soundhw option was processed before any -device options: the
sound card was assigned the next available slot on the bus, and the -device
entries were then added in command line order.  After that xen change the
sound card is itself added as a -device, so depending on the other emulated
hardware it can land in a different slot than it did with the equivalent
-soundhw option.  By re-ordering the qemu command line building in libxl_dm.c
so that the sound card is the first -device, we restore the old placement,
which resolves the migration problem.
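
To illustrate the ordering effect described above, here is a minimal toy
model in C.  It is not qemu code: the `toy_bus` type, its helper functions
and the device names are all hypothetical, and the only assumption it encodes
is that a device without an explicit `addr=` takes the lowest free slot.

```c
#include <stddef.h>
#include <string.h>

#define TOY_MAX_SLOTS 32

/* Toy PCI bus (hypothetical): slot[i] names the device in slot i, or NULL. */
struct toy_bus {
    const char *slot[TOY_MAX_SLOTS];
};

/* Put dev in the lowest free slot, modelling a device with no addr=.
 * Returns the slot number, or -1 if the bus is full. */
static int toy_bus_add(struct toy_bus *bus, const char *dev)
{
    for (int i = 0; i < TOY_MAX_SLOTS; i++) {
        if (bus->slot[i] == NULL) {
            bus->slot[i] = dev;
            return i;
        }
    }
    return -1;
}

/* Return the slot holding dev, or -1 if it is not on the bus. */
static int toy_bus_find(const struct toy_bus *bus, const char *dev)
{
    for (int i = 0; i < TOY_MAX_SLOTS; i++) {
        if (bus->slot[i] != NULL && strcmp(bus->slot[i], dev) == 0)
            return i;
    }
    return -1;
}

/* Populate an empty bus.  If legacy_sound is non-NULL it is added first,
 * mirroring soundhw_init() running before the -device list; the n entries
 * of devices are then added in command line order. */
static void toy_bus_build(struct toy_bus *bus, const char *legacy_sound,
                          const char *const *devices, int n)
{
    for (int i = 0; i < TOY_MAX_SLOTS; i++)
        bus->slot[i] = NULL;
    if (legacy_sound != NULL)
        toy_bus_add(bus, legacy_sound);
    for (int i = 0; i < n; i++)
        toy_bus_add(bus, devices[i]);
}
```

In this model, building a bus with a legacy sound card plus the -device list
{"virtio-serial", "vga"} puts "vga" one slot higher than building it from
{"virtio-serial", "vga", "sound"} with no legacy sound card, which is the kind
of mismatch the incoming qemu complains about.  Moving "sound" to the front of
the -device list gives the same layout as the legacy case.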

I think this would also have been a problem for live migration between 4.15.3
and 4.15.4 for a vm with a sound card and not just the major version jump we
are doing.

James



 

