
[Xen-devel] xm migrate, xc_save failed


  • To: xen-devel@xxxxxxxxxxxxx
  • From: Armin Zentai <armin.zentai@xxxxxxx>
  • Date: Fri, 03 Oct 2014 13:09:56 +0200
  • Delivery-date: Fri, 03 Oct 2014 11:18:07 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xen.org>

Dear Xen Developers!


We've encountered a problem while live-migrating a VM between hypervisors.

The process is the following:

# xm migrate fnavnzt3lm0r6f 10.0.20.24 --live
Error: /usr/lib/xen/bin/xc_save 20 13 0 0 1 failed

Usage: xm migrate <Domain> <Host>

Migrate a domain to another machine.

Options:

-h, --help           Print this help.
-l, --live           Use live migration.
-p=portnum, --port=portnum
                     Use specified port for migration.
-n=nodenum, --node=nodenum
                     Use specified NUMA node on target.
-s, --ssl            Use ssl connection for migration.
-c, --change_home_server
                     Change home server for managed domains.


The command takes about 15 minutes to finish. The VM runs fine while the command is running, and it continues to run on the source hypervisor after the xm migrate command fails.

In a normal case the xc_save process runs for only a few seconds, but in this case it keeps running for about 15 minutes and then times out.

In the xend.log we've found the following lines:
[2014-10-03 02:52:48 9020] DEBUG (XendCheckpoint:124) [xc_save]: /usr/lib/xen/bin/xc_save 29 17 0 0 1
[2014-10-03 02:52:48 9020] INFO (XendCheckpoint:423) xc_save: failed to get the suspend evtchn port
[2014-10-03 02:52:48 9020] INFO (XendCheckpoint:423)

After about 15 minutes it times out, and the following appears in the log:

[2014-10-03 03:05:54 9020] INFO (XendCheckpoint:423) xc: error: Error when writing to state file (4c) (errno 110) (110 = Connection timed out): Internal error
[2014-10-03 03:05:54 9020] ERROR (XendCheckpoint:185) Save failed on domain b415gk79eo345x (24) - resuming.
Traceback (most recent call last):
  File "/usr/lib64/python2.6/site-packages/xen/xend/XendCheckpoint.py", line 146, in save
    forkHelper(cmd, fd, saveInputHandler, False)
  File "/usr/lib64/python2.6/site-packages/xen/xend/XendCheckpoint.py", line 411, in forkHelper
    raise XendError("%s failed" % string.join(cmd))
XendError: /usr/lib/xen/bin/xc_save 25 24 0 0 1 failed
[2014-10-03 03:05:54 9020] DEBUG (XendDomainInfo:3141) XendDomainInfo.resumeDomain(24)
[2014-10-03 03:08:25 9020] INFO (XendCheckpoint:423) xc: error: Error when writing to state file (4a) (errno 110) (110 = Connection timed out): Internal error
[2014-10-03 03:08:25 9020] ERROR (XendCheckpoint:185) Save failed on domain y1xeszf11s89ab (17) - resuming.
Traceback (most recent call last):
  File "/usr/lib64/python2.6/site-packages/xen/xend/XendCheckpoint.py", line 146, in save
    forkHelper(cmd, fd, saveInputHandler, False)
  File "/usr/lib64/python2.6/site-packages/xen/xend/XendCheckpoint.py", line 411, in forkHelper
    raise XendError("%s failed" % string.join(cmd))
XendError: /usr/lib/xen/bin/xc_save 29 17 0 0 1 failed
[2014-10-03 03:08:25 9020] DEBUG (XendDomainInfo:3141) XendDomainInfo.resumeDomain(17)
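
To put a number on how long each save hangs, the log excerpt above can be processed with a small script. The following is only a minimal sketch in Python, not part of xend: the function names (split_entries, save_durations) are hypothetical, and it assumes that the second argument of xc_save is the domain ID, which is consistent with the lines above (xc_save 29 17 ... and "Save failed on domain ... (17)").

```python
import re
from datetime import datetime

# Header of an xend.log entry, e.g.
# [2014-10-03 02:52:48 9020] DEBUG (XendCheckpoint:124) ...
ENTRY = re.compile(
    r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \d+\] \w+ \(\w+:\d+\) ?"
)
# Assumption: the second xc_save argument is the domain ID.
START = re.compile(r"\[xc_save\]: \S+ \d+ (\d+)\b")
FAIL = re.compile(r"Save failed on domain \S+ \((\d+)\)")


def split_entries(text):
    """Yield (timestamp, message) pairs, even when several entries
    ended up on the same physical line."""
    matches = list(ENTRY.finditer(text))
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
        yield ts, text[m.end():end].strip()


def save_durations(text):
    """Map domain ID -> seconds between xc_save start and 'Save failed'."""
    started, durations = {}, {}
    for ts, msg in split_entries(text):
        m = START.search(msg)
        if m:
            started[m.group(1)] = ts
            continue
        m = FAIL.search(msg)
        if m and m.group(1) in started:
            delta = ts - started[m.group(1)]
            durations[m.group(1)] = delta.total_seconds()
    return durations
```

Calling save_durations(open("xend.log").read()) on a file containing the lines above would report 937 seconds for domain 17 (02:52:48 to 03:08:25). Domain 24 is skipped because its xc_save start line is not in the excerpt.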


We've encountered this error on multiple hypervisors, with multiple VMs.

Some info about the hypervisors:
# uname -a
Linux c2-node15 3.10.43-11.el6.centos.alt.x86_64 #1 SMP Mon Jun 16 14:22:02 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

# xm info
host                   : c2-node15
release                : 3.10.43-11.el6.centos.alt.x86_64
version                : #1 SMP Mon Jun 16 14:22:02 UTC 2014
machine                : x86_64
nr_cpus                : 12
nr_nodes               : 1
cores_per_socket       : 6
threads_per_core       : 2
cpu_mhz                : 2660
hw_caps                : bfebfbff:2c100800:00000000:00003f40:029ee3ff:00000000:00000001:00000000
virt_caps              : hvm
total_memory           : 49139
free_memory            : 18959
free_cpus              : 0
xen_major              : 4
xen_minor              : 2
xen_extra              : .4-33.el6
xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler          : credit
xen_pagesize           : 4096
platform_params        : virt_start=0xffff800000000000
xen_changeset          : unavailable
xen_commandline        : dom0_mem=3145728 noreboot=true pcie_asmp=off dom0_max_vcpus=6
cc_compiler            : gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-4)
cc_compile_by          : mockbuild
cc_compile_domain      : centos.org
cc_compile_date        : Mon Jun 16 17:22:14 UTC 2014
xend_config_format     : 4


CPU:  Intel(R) Xeon(R) CPU X5650  @ 2.67GHz
Memory: 48GB

All hypervisors are Dell R410 machines with the same CPU and the same amount of memory.


Thanks for your help,
 - Armin Zentai



