[Xen-users] Xen 4.10: domU crashes during/after live-migrate
Hi all,

We (at Mendix) are upgrading our dom0s to Xen 4.10 (PV) running on Debian Stretch (Linux 4.9), but we are running into an issue with live migration. We are experiencing domU crashes during live migration and in the seconds after the live migration has completed.

This doesn't happen every time, but we can reproduce the issue within 1 to at most 10 live migrations between 2 dom0s. So far we've reproduced it with domUs running Linux 4.9.82-1+deb9u3 (Debian Stretch) and 4.15.11-1 (Debian Buster).

Attached are the kernel traces and oopses for each crash, as logged by the domUs and retrieved via "xen console" in the seconds after the live migration completed. In some cases the domU keeps running or stays visible via "xen list"; in other cases the domU disappears from "xen list" after a short time.

From the logging on our dom0s, in most cases everything looks fine:

Apr 12 16:58:20 altair socat[738]: migration target: Ready to receive domain.
Apr 12 16:58:20 altair socat[738]: Loading new save file <incoming migration stream> (new xl fmt info 0x3/0x0/1250)
Apr 12 16:58:20 altair socat[738]: Savefile contains xl domain config in JSON format
Apr 12 16:58:20 altair socat[738]: Parsing config from <saved>
Apr 12 16:58:20 altair socat[738]: libxl: info: libxl_create.c:109:libxl__domain_build_info_setdefault: qemu-xen is unavailable, using q
Apr 12 16:58:20 altair socat[738]: xc: info: Found x86 PV domain from Xen 4.10
Apr 12 16:58:20 altair socat[738]: xc: info: Restoring domain
Apr 12 16:58:28 altair socat[738]: xc: info: Restore successful
Apr 12 16:58:28 altair socat[738]: xc: info: XenStore: mfn 0xce734b, dom 0, evt 1
Apr 12 16:58:28 altair socat[738]: xc: info: Console: mfn 0xce734c, dom 0, evt 2

.. but 1 second later the domU gets a kernel panic (see attachment oops-1.txt).

There are cases where the dom0 logs a failure.
After this failure the domU disappeared:

Apr 12 14:17:55 altair socat[738]: migration target: Ready to receive domain.
Apr 12 14:17:55 altair socat[738]: Loading new save file <incoming migration stream> (new xl fmt info 0x3/0x0/1250)
Apr 12 14:17:55 altair socat[738]: Savefile contains xl domain config in JSON format
Apr 12 14:17:55 altair socat[738]: Parsing config from <saved>
Apr 12 14:17:55 altair socat[738]: libxl: info: libxl_create.c:109:libxl__domain_build_info_setdefault: qemu-xen is unavailable, using qemu-xen-traditional instead: No such file or directory
Apr 12 14:17:55 altair socat[738]: xc: info: Found x86 PV domain from Xen 4.10
Apr 12 14:17:55 altair socat[738]: xc: info: Restoring domain
Apr 12 14:18:00 altair socat[738]: libxl-save-helper: xc_sr_restore_x86_pv.c:7: pfn_to_mfn: Assertion `pfn <= ctx->x86_pv.max_pfn' failed.
Apr 12 14:18:00 altair socat[738]: libxl: error: libxl_utils.c:510:libxl_read_exactly: file/stream truncated reading ipc msg header from domain 7 save/restore helper stdout pipe
Apr 12 14:18:00 altair socat[738]: libxl: error: libxl_exec.c:129:libxl_report_child_exitstatus: domain 7 save/restore helper [18962] died due to fatal signal Aborted
Apr 12 14:18:00 altair socat[738]: libxl: error: libxl_create.c:1264:domcreate_rebuild_done: Domain 7:cannot (re-)build domain: -3
Apr 12 14:18:00 altair socat[738]: libxl: error: libxl_domain.c:1000:libxl__destroy_domid: Domain 7:Non-existant domain
Apr 12 14:18:00 altair socat[738]: libxl: error: libxl_domain.c:959:domain_destroy_callback: Domain 7:Unable to destroy guest
Apr 12 14:18:00 altair socat[738]: libxl: error: libxl_domain.c:886:domain_destroy_cb: Domain 7:Destruction of domain failed
Apr 12 14:18:00 altair socat[738]: migration target: Domain creation failed (code -3).
Apr 12 14:18:00 altair socat[18950]: E write(5, 0x559e0ffc85c0, 8192): Broken pipe

And in this case the domU was running on the destination dom0, but it crashed immediately (see attachment oops-2.txt):

Apr 12 14:44:24 rho socat[725]: migration target: Ready to receive domain.
Apr 12 14:44:24 rho socat[725]: Loading new save file <incoming migration stream> (new xl fmt info 0x3/0x0/1250)
Apr 12 14:44:24 rho socat[725]: Savefile contains xl domain config in JSON format
Apr 12 14:44:24 rho socat[725]: Parsing config from <saved>
Apr 12 14:44:24 rho socat[725]: libxl: info: libxl_create.c:109:libxl__domain_build_info_setdefault: qemu-xen is unavailable, using qemu-xen-traditional instead: No such file or directory
Apr 12 14:44:24 rho socat[725]: xc: info: Found x86 PV domain from Xen 4.10
Apr 12 14:44:24 rho socat[725]: xc: info: Restoring domain
Apr 12 14:45:31 rho socat[725]: xc: error: Failed to read Record Header from stream (0 = Success): Internal error
Apr 12 14:45:31 rho socat[725]: xc: error: Restore failed (0 = Success): Internal error
Apr 12 14:45:31 rho socat[725]: libxl: error: libxl_stream_read.c:850:libxl__xc_domain_restore_done: restoring domain: Success
Apr 12 14:45:31 rho socat[725]: libxl: error: libxl_create.c:1264:domcreate_rebuild_done: Domain 11:cannot (re-)build domain: -3
Apr 12 14:45:31 rho socat[725]: libxl: error: libxl_domain.c:1000:libxl__destroy_domid: Domain 11:Non-existant domain
Apr 12 14:45:31 rho socat[725]: libxl: error: libxl_domain.c:959:domain_destroy_callback: Domain 11:Unable to destroy guest
Apr 12 14:45:31 rho socat[725]: libxl: error: libxl_domain.c:886:domain_destroy_cb: Domain 11:Destruction of domain failed
Apr 12 14:45:31 rho socat[725]: migration target: Domain creation failed (code -3).

We have been running Xen 4.4 on Debian Jessie (Linux 3.16.51-3+deb8u1) on the same hardware flawlessly for the past years.

Does anyone have similar experiences with Xen 4.10? How can we help debug and find the cause of these issues?

Thanks!

--
Pim van den Berg

Attachment: oops-1.txt
Attachment: oops-2.txt
Attachment: oops-3.txt
Attachment: oops-4.txt

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-users
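
For anyone comparing their own dom0 logs against the three patterns quoted in this message, here is a minimal Python sketch that groups the socat-relayed lines into migration attempts and labels each one. It assumes the syslog line shape shown in the excerpts; the function name and the outcome labels are made up for illustration and are not part of Xen, xl, or libxl.

```python
import re

# Assumed syslog shape, taken from the excerpts in this message:
#   "Apr 12 16:58:20 altair socat[738]: <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d\d:\d\d:\d\d) (?P<host>\S+) socat\[\d+\]: (?P<msg>.*)$"
)

# Every receive attempt in the excerpts begins with this message.
START = "migration target: Ready to receive domain."


def classify_migrations(lines):
    """Group log lines into migration attempts and label each outcome.

    The labels are hypothetical names chosen for this sketch.
    """
    attempts = []
    current = None
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # not a socat-relayed line
        msg = m.group("msg")
        if msg.startswith(START):
            current = {
                "host": m.group("host"),
                "start": m.group("ts"),
                "outcome": "unknown",
            }
            attempts.append(current)
        elif current is None:
            continue  # output seen before any attempt started
        elif "Assertion" in msg and "failed" in msg:
            current["outcome"] = "helper-assertion"
        elif "Failed to read Record Header" in msg:
            current["outcome"] = "stream-read-error"
        elif "Restore successful" in msg:
            current["outcome"] = "restored"
        elif "Domain creation failed" in msg and current["outcome"] == "unknown":
            # Only used when no more specific cause was logged earlier.
            current["outcome"] = "creation-failed"
    return attempts
```

Calling `classify_migrations(open(path))` on a syslog file (the path is whatever your dom0 uses) returns one dict per attempt. Note that a "restored" label only reflects the dom0 side: in the first case above the restore succeeded and the domU still panicked a second later, so the guest console output has to be checked separately.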