
Re: [Xen-users] Migration stalls with 2.6.26.5 kernel


  • To: "Trevor Bentley" <trevor.bentley@xxxxxxxxxxxxxxxxxxx>
  • From: "Thiago Camargo Martins Cordeiro" <thiagocmartinsc@xxxxxxxxx>
  • Date: Thu, 18 Sep 2008 17:19:37 -0300
  • Cc: xen-users@xxxxxxxxxxxxxxxxxxx
  • Delivery-date: Thu, 18 Sep 2008 13:20:19 -0700
  • List-id: Xen user discussion <xen-users.lists.xensource.com>

Trevor,

 Let me share my experience with distros that ship Xen support in their trees.

 The most stable packaged Xen-3.2 setup is Debian Lenny with the 2.6.18-xen kernel from Debian Etch. See http://wiki.debian.org/Xen, section "Installation on lenny".

 If you are "nutz", use Debian Lenny with Xen-3.3 and Linux-2.6.18.8-xen-3.3 from xen.org, compiled yourself.

 I have tried Ubuntu Hardy with kernel 2.6.24-21-xen and it isn't stable at all: I can't use nosmp with it, and there are some aacraid bugs. My dom0s are all Ubuntu Hardy with Xen-3.3 and Linux 2.6.18.8-xen from xen.org, but now I have a problem with the python-xml package. On Debian I do not see any bugs or instability.

 That's my opinion!

P.S.: Sorry about my English...   ;-)

Regards,
Thiago

2008/9/18 Trevor Bentley <trevor.bentley@xxxxxxxxxxxxxxxxxxx>
Hello,

I have been struggling through the task of moving our infrastructure over to Xen VMs. We were initially using Ubuntu packages for both dom0 and our domUs, but experienced extreme instability, so we moved to CentOS, which has been much more reliable for dom0. Since we already had a bunch of Ubuntu VMs, we left them using the Ubuntu 2.6.24-19-xen kernel, but this has turned out to be a mistake: we get frequent kernel oopses during heavy disk I/O. We modified the kernel to add NFS-root support, but that is the only change we made to the original config. All of our domUs mount their root file systems over NFS.

My problem is that I tried to upgrade the domU kernels to the latest kernel.org stable release (2.6.26.5) and did manage to get it working after some initial trouble (TCP checksum offloading was breaking NFS).  However, the new kernel will not live migrate anymore.  When I execute the live migrate command:

# xm migrate --live testvm 192.168.1.20

Migration hangs forever.  The VM changes name to "migrate-testvm" and keeps running normally on the system it was on, and appears as "testvm" with state "-br---" on the destination machine with 0 CPU time.  I left tcpdump running on the destination machine and captured an 84MB pcap file which looked pretty normal up until all traffic just completely stopped.  If I just change the "kernel=" line in the config script to the Ubuntu kernel migration works again.
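To make a stall like this easier to catch when testing kernels, the migrate command can be wrapped in a timeout. This is only a minimal sketch: the function names and the 600-second default are assumptions chosen for illustration, not Xen defaults. Note that when the timeout fires, `subprocess.run` kills the `xm` client process, and the half-migrated domain may still need manual cleanup on both hosts.

```python
import subprocess


def build_migrate_command(domain, destination, live=True):
    """Return the xm argument list for migrating `domain` to `destination`."""
    cmd = ["xm", "migrate"]
    if live:
        cmd.append("--live")
    cmd += [domain, destination]
    return cmd


def migrate_with_timeout(domain, destination, timeout_s=600, runner=subprocess.run):
    """Run the migration; return True on success, False on failure or timeout.

    timeout_s is an arbitrary illustrative value, not a Xen default.
    `runner` is injectable so the logic can be exercised without xm installed.
    """
    cmd = build_migrate_command(domain, destination)
    try:
        result = runner(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # The xm client did not return in time: treat it as a stalled migration.
        return False
    return result.returncode == 0
```

For the case above this would be called as `migrate_with_timeout("testvm", "192.168.1.20")`.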

Here's my VM configuration:
-------------------
name        = 'testvm'
kernel      = '/xen_vm/global/kernels/vmlinuz-2.6.26.5'
ramdisk     = '/xen_vm/global/kernels/initrd.img-xen-latest'
memory      = '256'
disk        = ['tap:aio:/xen_vm/global/swaps/testvm.img,xvda1,w']
vif         = [
              'mac=00:16:3e:5b:8d:5d,bridge=xenbr0',
              'mac=00:16:3e:99:9b:e7,bridge=xenbr1'
            ]
on_reboot   = 'restart'
on_crash    = 'restart'
extra       = '2 console=hvc0 root=/dev/nfs ip=:192.168.1.12::::eth1:'
nfs_server  = '192.168.1.12'
nfs_root    = '/xen_vm/testvm'
-------------------
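Because xm config files are plain Python assignments, a config like the one above can be loaded into a dict to compare the working and non-working variants, for example to confirm that only the "kernel=" line differs. The sketch below is illustrative only: `load_xm_config` and `uses_nfs_root` are hypothetical helper names, and the sample text is abbreviated from the config above.

```python
SAMPLE_CONFIG = """
name        = 'testvm'
kernel      = '/xen_vm/global/kernels/vmlinuz-2.6.26.5'
memory      = '256'
vif         = [
              'mac=00:16:3e:5b:8d:5d,bridge=xenbr0',
              'mac=00:16:3e:99:9b:e7,bridge=xenbr1'
            ]
extra       = '2 console=hvc0 root=/dev/nfs ip=:192.168.1.12::::eth1:'
"""


def load_xm_config(text):
    """Evaluate xm-style config text (simple assignments) into a dict.

    Builtins are disabled, so this only handles literal assignments.
    Only use it on config files you trust.
    """
    namespace = {}
    exec(text, {"__builtins__": {}}, namespace)
    return namespace


def uses_nfs_root(cfg):
    """Return True if the kernel command line in `extra` asks for an NFS root."""
    return "root=/dev/nfs" in str(cfg.get("extra", "")).split()


def diff_configs(a, b):
    """Return the sorted keys whose values differ between two config dicts."""
    return sorted(k for k in set(a) | set(b) if a.get(k) != b.get(k))
```

Loading the 2.6.26.5 config and the Ubuntu-kernel config and passing both to `diff_configs` should list only `kernel` if nothing else changed between the two runs.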


xend.log on source:
-------------------
[2008-09-18 15:51:11 xend 3751] DEBUG (balloon:127) Balloon: 786956 KiB free; need 2048; done.
[2008-09-18 15:51:11 xend 3751] DEBUG (XendCheckpoint:89) [xc_save]: /usr/lib/xen/bin/xc_save 33 38 0 0 1
-------------------

xend.log on destination:
-------------------
...
[2008-09-18 15:51:11 xend.XendDomainInfo 3331] DEBUG (XendDomainInfo:1350) XendDomainInfo.construct: None
[2008-09-18 15:51:11 xend 3331] DEBUG (balloon:127) Balloon: 262832 KiB free; need 2048; done.
...
[2008-09-18 15:51:11 xend 3331] DEBUG (blkif:24) exception looking up device number for xvda1: [Errno 2] No such file or directory: '/dev/xvda1'
[2008-09-18 15:51:11 xend 3331] DEBUG (DevController:110) DevController: writing {'backend-id': '0', 'virtual-device': '51713', 'device-type': 'disk', 'state': '1', 'backend': '/local/domain/0/backend/tap/10/51713'} to /local/domain/10/device/vbd/51713.
...
[2008-09-18 15:51:12 xend 3331] DEBUG (XendCheckpoint:198) restore:shadow=0x0, _static_max=0x100, _static_min=0x100,
[2008-09-18 15:51:12 xend 3331] DEBUG (balloon:127) Balloon: 262832 KiB free; need 262144; done.
[2008-09-18 15:51:12 xend 3331] DEBUG (XendCheckpoint:215) [xc_restore]: /usr/lib/xen/bin/xc_restore 24 10 1 2 0 0 0
-------------------
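To see where each side stopped, it can help to pull the last parsed entry out of each xend.log. The following is a rough sketch that assumes every entry follows the layout of the lines quoted above (timestamp, logger name, PID, level, module:line, message); the regular expression and function names are illustrative and have not been checked against other xend versions.

```python
import re
from datetime import datetime

# Layout inferred from the log excerpts above:
# [YYYY-MM-DD HH:MM:SS logger pid] LEVEL (module:line) message
LOG_LINE = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<logger>\S+) (?P<pid>\d+)\] "
    r"(?P<level>[A-Z]+) "
    r"\((?P<module>[^:()]+):(?P<line>\d+)\) "
    r"(?P<msg>.*)$"
)


def parse_line(line):
    """Parse one xend.log line into a dict, or return None if it does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    entry["ts"] = datetime.strptime(entry["ts"], "%Y-%m-%d %H:%M:%S")
    entry["pid"] = int(entry["pid"])
    entry["line"] = int(entry["line"])
    return entry


def last_entry(lines):
    """Return the last parseable entry from an iterable of log lines, or None."""
    last = None
    for line in lines:
        entry = parse_line(line)
        if entry is not None:
            last = entry
    return last
```

Applied to the excerpts above, the last entry on the source is the `xc_save` call and the last entry on the destination is the `xc_restore` call, which matches the picture of both helpers starting and then nothing further being logged.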


Xen version: xen-3.0-x86_32p
dom0: 2.6.18-92.1.10.el5xen

Anybody know what would cause this, or have any suggestions for tracking down the problem?  I did find a post from someone who was seeing identical behavior who claimed he fixed it by enabling CPU Hotplug support, but I already have that enabled in the kernel.

Thanks,

Trevor

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users


 

