
Re: [Xen-devel] Re: [Xen-users] Problem with restore/migration with Xen 4.0.0 and Jeremy kernel (2.6.32.12)



Hi Jeremy,
No, I wasn't aware of any big save/restore performance differences.  Is
the difference caused by a pvops dom0 or domU or both?

On the domU, I tried a 2.6.32.12 pvops kernel and the standard "2.6.32-22-server" kernel from Ubuntu Lucid. It makes no difference.

What makes the difference is using a "xenlinux" kernel in dom0 (2.6.32.10 with Andrew Lyon's patches).

One materially different thing is that pvops kernels support preemption,
which requires all processes to be frozen before a suspend.  I wonder if
disabling preemption makes a difference (assuming that it is the domU
which is causing the slowdown).

Ah, but the report is that it's the restore which is very slow.  Which
suggests that it is the dom0 environment which is causing problems.
Does "top" show a particular process is very cpu-bound during the
restore?  Or is it IO bound?
I tried restoring a small domU with 256 MB RAM. It took around 30s with the pvops kernel:

root@narbonne:~# time xm restore  pp

real    0m26.905s
user    0m0.070s
sys    0m0.020s

Here is the result of "iostat -c 5 100" during the time of the restore:

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.15    0.00    1.00    0.20    0.15   98.51

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    0.27    0.00    0.14   99.58

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    0.94    0.00    0.35   98.70

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    0.88    0.00    0.46   98.67

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.01    0.00    0.38    0.09    0.15   99.38

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.06    0.00    0.97    0.44    0.30   98.23

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    0.00    0.00    0.08   99.92

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    0.01    0.00    0.00   99.99

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    0.00    0.00    0.00  100.00

So it seems to be neither CPU-bound nor I/O-bound.
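
One way to check that conclusion is to summarize the samples above. The following is a minimal Python sketch, not part of the original report; the helper name parse_iostat_cpu is made up for illustration, and the input is simply the iostat text pasted above.

```python
# Sketch: summarize the `iostat -c` samples captured during the restore.
# parse_iostat_cpu is a hypothetical helper, not part of any Xen tooling.
IOSTAT_OUTPUT = """\
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.15    0.00    1.00    0.20    0.15   98.51

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    0.27    0.00    0.14   99.58

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    0.94    0.00    0.35   98.70

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    0.88    0.00    0.46   98.67

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.01    0.00    0.38    0.09    0.15   99.38

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.06    0.00    0.97    0.44    0.30   98.23

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    0.00    0.00    0.08   99.92

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    0.01    0.00    0.00   99.99

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    0.00    0.00    0.00  100.00
"""


def parse_iostat_cpu(text):
    """Return one dict per sample, keyed by column name without the '%'."""
    samples = []
    header = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "avg-cpu:":
            # Header line: remember the column names for the next data line.
            header = [name.lstrip("%") for name in fields[1:]]
        elif header is not None:
            samples.append(dict(zip(header, map(float, fields))))
            header = None
    return samples


samples = parse_iostat_cpu(IOSTAT_OUTPUT)
max_iowait = max(s["iowait"] for s in samples)
max_busy = max(s["user"] + s["system"] for s in samples)
mean_idle = sum(s["idle"] for s in samples) / len(samples)
print(f"samples={len(samples)} max_iowait={max_iowait:.2f}% "
      f"max_busy={max_busy:.2f}% mean_idle={mean_idle:.2f}%")
```

On these nine samples the peak iowait is 0.44% and the mean idle is about 99.2%, which is consistent with dom0 mostly waiting during the restore rather than computing or doing disk I/O.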

I rebooted the server with my xenlinux kernel, and the same restore took:

root@narbonne:~# time xm restore pp

real    0m10.006s
user    0m0.080s
sys    0m0.000s
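
The two "time" results can be compared directly. This is a small Python sketch under the assumption that the output has the bash "time" layout shown above; parse_real_seconds is a hypothetical helper name, and the two strings are copied from the runs in this message.

```python
import re


def parse_real_seconds(time_output):
    """Extract the 'real' (wall-clock) value from bash `time` output, in seconds."""
    match = re.search(r"^real\s+(\d+)m([\d.]+)s", time_output, re.MULTILINE)
    if match is None:
        raise ValueError("no 'real' line found")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# Output of `time xm restore pp` under each dom0 kernel, as pasted above.
PVOPS = """real    0m26.905s
user    0m0.070s
sys    0m0.020s"""

XENLINUX = """real    0m10.006s
user    0m0.080s
sys    0m0.000s"""

pvops_s = parse_real_seconds(PVOPS)
xenlinux_s = parse_real_seconds(XENLINUX)
ratio = pvops_s / xenlinux_s
print(f"pvops: {pvops_s:.3f}s  xenlinux: {xenlinux_s:.3f}s  ratio: {ratio:.2f}x")
```

For this 256 MB domU the pvops dom0 is roughly 2.7 times slower. The user and sys figures only cover the xm client process, so they say little about where xend or xc_restore spend their time.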

My domU sits on top of DRBD, using the drbd block script in the config file.

Live migration can show a larger difference between the two dom0 kernels: 10s versus 1 minute for the migration of the same domU.

Hardware: Dell R610 with 32GB RAM, dual quad-core CPUs, one RAID10 container (hardware RAID).

I will recompile my dom0 pvops kernel and triple-check the kernel config. I will also try without DRBD backends.

If I have new results, I will let you know.

Many thanks,
Pierre


     J

-- Pasi

On Wed, May 12, 2010 at 05:28:26PM -0400, Pierre POMES wrote:

Hi,

First, sorry for the double posting...

I just built a 2.6.32.10 kernel with Andrew Lyon patches (so it is a
"xenlinux" kernel, not a "pvops" kernel).

Live migration and restore operations are between 4 and 10 times faster
with this kernel. Furthermore, during live migration, hang times in the
domU are shorter (1-2 seconds versus 1 to 15 seconds for a domU with
256M RAM).

Error messages "Error when reading batch size" / "error when buffering
batch, finishing" are still in my logs.

Timings are now similar to what I had with Xen 3.x on top of
xenlinux kernels.

Regards,
Pierre




Hi all,

I am using Xen 4.0.0 on top of Ubuntu Lucid (amd64), with the Jeremy
kernel taken from git (xen/stable-2.6.32.x branch, 2.6.32.12 as I
write this email). This kernel is also used in my domU.

I can save a domU without any problem, but restoring it can take from
2 to 5 minutes from a 1G checkpoint file (the domU has 1GB RAM). There
are also errors in /var/log/xen/xend.log, "Error when reading batch size"
and "error when buffering batch, finishing":

[2010-05-08 04:23:16 9497] DEBUG (XendDomainInfo:1804) Storing domain
details: {'image/entry': '18446744071587529216', 'console/port': '2',
'image/loader': 'generic', 'vm':
'/vm/156ea44d-6707-cbe6-2d58-7bea4792dff4',
'control/platform-feature-multiprocessor-suspend': '1',
'image/hv-start-low': '18446603336221196288', 'image/guest-os':
'linux', 'image/virt-base': '18446744071562067968', 'memory/target':
'1048576', 'image/guest-version': '2.6', 'image/pae-mode': 'yes',
'description': '', 'console/limit': '1048576', 'image/paddr-offset':
'0', 'image/hypercall-page': '18446744071578882048',
'image/suspend-cancel': '1', 'cpu/0/availability': 'online',
'image/features/pae-pgdir-above-4gb': '1',
'image/features/writable-page-tables': '0', 'console/type':
'xenconsoled', 'name': 'domusample', 'domid': '10',
'image/xen-version': 'xen-3.0', 'store/port': '1'}
[2010-05-08 04:23:16 9497] DEBUG (XendCheckpoint:286)
restore:shadow=0x0, _static_max=0x40000000, _static_min=0x0,
[2010-05-08 04:23:16 9497] DEBUG (XendCheckpoint:305) [xc_restore]:
/usr/lib/xen/bin/xc_restore 22 10 1 2 0 0 0 0
[2010-05-08 04:23:16 9497] INFO (XendCheckpoint:423) xc_domain_restore
start: p2m_size = 40000
[2010-05-08 04:23:16 9497] INFO (XendCheckpoint:423) Reloading memory
pages:   0%
[2010-05-08 04:25:53 9497] INFO (XendCheckpoint:423) ERROR Internal
error: Error when reading batch size
[2010-05-08 04:25:53 9497] INFO (XendCheckpoint:423) ERROR Internal
error: error when buffering batch, finishing
[2010-05-08 04:25:53 9497] INFO (XendCheckpoint:423)
[2010-05-08 04:25:53 9497] INFO (XendCheckpoint:423) ^H^H^H^H100%
[2010-05-08 04:25:53 9497] INFO (XendCheckpoint:423) Memory reloaded
(0 pages)
[2010-05-08 04:25:53 9497] INFO (XendCheckpoint:423) read VCPU 0
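
The length of the stall can be read off the timestamps in this log. The sketch below is only an illustration in Python: log_time is a hypothetical helper, the timestamp pattern is inferred from the lines above, and the two sample entries are the log lines above with their mail line wrapping joined back together.

```python
import re
from datetime import datetime

# Two entries from the xend.log excerpt above, unwrapped onto single lines.
LOG_LINES = [
    "[2010-05-08 04:23:16 9497] INFO (XendCheckpoint:423) Reloading memory pages:   0%",
    "[2010-05-08 04:25:53 9497] INFO (XendCheckpoint:423) ERROR Internal error: Error when reading batch size",
]

# Assumed prefix layout: [YYYY-MM-DD HH:MM:SS <pid>]
STAMP = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \d+\]")


def log_time(line):
    """Parse the timestamp at the start of an xend.log line."""
    match = STAMP.match(line)
    if match is None:
        raise ValueError(f"no timestamp in line: {line!r}")
    return datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")


stall = log_time(LOG_LINES[1]) - log_time(LOG_LINES[0])
stall_seconds = stall.total_seconds()
print(f"time between start of page reload and first error: {stall_seconds:.0f}s")
```

Here the gap between the start of the page reload (04:23:16) and the first error (04:25:53) is 157 seconds, which matches the "2 to 5 minutes" reported for the restore.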

Live migration has the same problem: it may take several minutes to
complete. Please note that restore and migration do not fail, but
they are very slow.

My domu is on top of DRBD, and the config file is:


-------------
kernel      = '/boot/vmlinuz-2.6.32.12-it-xen'
ramdisk     = '/boot/initrd.img-2.6.32.12-it-xen'
memory      = '1024'

#
#  Disk device(s).
#
root        = '/dev/xvda2 ro'
disk        = [
                   'drbd:domusampleswap,xvda1,w',
                   'drbd:domusampleslash,xvda2,w',
               ]



#
#  Hostname
#
name        = 'domusample'

#
#  Networking
#
vif         = [ 'mac=00:16:3E:58:FC:F9' ]

#
#  Behaviour
#
on_poweroff = 'destroy'
on_reboot   = 'restart'
on_crash    = 'restart'

extra = '2 console=hvc0'
----------

I have no idea what is going on here.

Has anybody already had (and solved) this issue?

Thanks.
Pierre


_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel



