
Re: [Xen-devel] Internal error during live migration saving


  • To: rshriram@xxxxxxxxx
  • From: Nathan March <nathan@xxxxxx>
  • Date: Wed, 14 Sep 2011 10:58:49 -0700
  • Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
  • Delivery-date: Wed, 14 Sep 2011 11:03:21 -0700
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>


On 9/14/2011 10:53 AM, Shriram Rajagopalan wrote:
On Tue, Sep 13, 2011 at 2:01 PM, Nathan March<nathan@xxxxxx>  wrote:
Just wondering if this is a known bug?

Trying to migrate the VM off to a different dom0 results in the error below.
Other VMs (started at around the same time as this VM) migrated off fine, and
I've tried a few different target servers, all with the same result.

Were the other domains running Linux 3.0.3 as well?
All the dom0s are 3.0.3 and all the domUs are 2.6.32.27 (w/ grsec).

I did a cold reboot of the VM and now it migrates properly.

[2011-09-13 13:48:24 3996] DEBUG (XendCheckpoint:124) [xc_save]:
/usr/lib/xen/bin/xc_save 29 77 0 0 1
[2011-09-13 13:48:24 3996] INFO (XendCheckpoint:423) xc_save: failed to get
the suspend evtchn port
[2011-09-13 13:48:24 3996] INFO (XendCheckpoint:423)
[2011-09-13 13:49:03 3996] DEBUG (XendCheckpoint:394) suspend
[2011-09-13 13:49:03 3996] DEBUG (XendCheckpoint:127) In saveInputHandler
suspend
[2011-09-13 13:49:03 3996] DEBUG (XendCheckpoint:129) Suspending 77 ...
[2011-09-13 13:49:03 3996] DEBUG (XendDomainInfo:524)
XendDomainInfo.shutdown(suspend)
[2011-09-13 13:49:03 3996] DEBUG (XendDomainInfo:1881)
XendDomainInfo.handleShutdownWatch
[2011-09-13 13:50:06 3996] DEBUG (XendDomainInfo:1881)
XendDomainInfo.handleShutdownWatch
[2011-09-13 13:50:06 3996] INFO (XendCheckpoint:423) xc: error: Suspend
request failed: Internal error
[2011-09-13 13:50:06 3996] INFO (XendCheckpoint:423) xc: error: Domain
appears not to have suspended: Internal error
[2011-09-13 13:50:06 3996] ERROR (XendCheckpoint:185) Save failed on domain
globish (77) - resuming.
Traceback (most recent call last):
  File "/usr/lib64/python2.6/site-packages/xen/xend/XendCheckpoint.py", line
146, in save
    forkHelper(cmd, fd, saveInputHandler, False)
  File "/usr/lib64/python2.6/site-packages/xen/xend/XendCheckpoint.py", line
395, in forkHelper
    inputHandler(line, child.tochild)
  File "/usr/lib64/python2.6/site-packages/xen/xend/XendCheckpoint.py", line
131, in saveInputHandler
    dominfo.waitForSuspend()
  File "/usr/lib64/python2.6/site-packages/xen/xend/XendDomainInfo.py", line
2998, in waitForSuspend
    raise XendError(msg)
XendError: Timeout waiting for domain 77 to suspend
[2011-09-13 13:50:06 3996] DEBUG (XendDomainInfo:3135)
XendDomainInfo.resumeDomain(77)

xend-debug.log and the target dom0 logs don't show anything of value.

This is Xen 4.1.1 on Linux 3.0.3.

Did you try xm save -c (or the xl equivalent)? This should exercise the same
code path where this error seems to appear.

Also, make sure you have CONFIG_XEN_SAVE_RESTORE enabled.
Unfortunately, I didn't think to try it. I do have CONFIG_XEN_SAVE_RESTORE set on both dom0 and domU.
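
For anyone wanting to double-check the kernel option on their own hosts, a minimal Python 3 sketch follows. The function names are invented for illustration, and the config locations are assumptions: /proc/config.gz only exists when the kernel was built with CONFIG_IKCONFIG_PROC, and /boot/config-<release> depends on the distribution.

```python
import gzip
import os
import platform
import re


def kconfig_option(config_text, name):
    """Hypothetical helper: return the value of a kernel config option
    ('y', 'm', ...) or None when it is commented out or absent."""
    pattern = re.compile(r"^%s=(.*)$" % re.escape(name), re.MULTILINE)
    m = pattern.search(config_text)
    return m.group(1) if m else None


def load_running_kconfig():
    """Read the running kernel's config from the assumed locations."""
    if os.path.exists("/proc/config.gz"):
        with gzip.open("/proc/config.gz", "rt") as fh:
            return fh.read()
    # Fallback path is a common distribution convention, not guaranteed.
    with open("/boot/config-" + platform.release()) as fh:
        return fh.read()
```

Run inside dom0 and inside the domU, kconfig_option(load_running_kconfig(), "CONFIG_XEN_SAVE_RESTORE") should come back as "y" where the option is enabled.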

- Nathan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel