Xen project Mailing List

[Xen-users] random reboots on Debian Squeeze 6.0.2.1 + Xen 4.02 on top of OCFS2 1.4.4-3

From: Benjamin Weaver <benjamin.weaver@xxxxxxxxxxxxx>

Date: Thu, 04 Aug 2011 14:50:08 +0100

Delivery-date: Thu, 04 Aug 2011 06:51:25 -0700

List-id: Xen user discussion <xen-users.lists.xensource.com>

This might be related to a posting a couple of days ago on random reboots, but the problem arises from a different environment and situation.

We are running a two-node cluster. Both nodes run Debian Squeeze 6.0.2.1 + Xen 4.02 on top of OCFS2 1.4.4-3. The kernel is 2.6.32-5-xen-amd64. Both nodes store and run vms on the ocfs2 partition, which both boxes access via iSCSI. We run a network stress test in which the 2 vms pass a large file between them. One vm has an nfs share with the file in it, and the other vm copies this file (an arbitrary choice, a large 4.6 GB debian.iso) to and from the nfs share and its own local directory. Currently the network configuration is giving us no problems: no lost packets, collisions, etc.
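A minimal sketch of the copy loop described above, for anyone who wants to reproduce the load. It is an illustration only, not the script actually used in the test: the function names, the checksum verification, and the round count are all assumptions added here.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so a multi-gigabyte ISO is never held in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def stress_copy(share_file: Path, local_dir: Path, rounds: int) -> int:
    """Copy share_file to local_dir and back `rounds` times.

    Each copy is verified against the checksum of the original file
    (the verification step is an assumption, not part of the original test).
    Returns the number of completed rounds.
    """
    expected = sha256_of(share_file)
    local_file = local_dir / share_file.name
    completed = 0
    for _ in range(rounds):
        # nfs share -> local directory
        shutil.copyfile(share_file, local_file)
        if sha256_of(local_file) != expected:
            raise RuntimeError(f"checksum mismatch after copy to {local_file}")
        # local directory -> nfs share
        shutil.copyfile(local_file, share_file)
        if sha256_of(share_file) != expected:
            raise RuntimeError(f"checksum mismatch after copy to {share_file}")
        completed += 1
    return completed
```

On the copying vm this could be called as `stress_copy(Path("/mnt/nfs/debian.iso"), Path("/home/user/stress"), rounds=100)`, where both paths are hypothetical.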

The vms are lucid instances (ubuntu 10.04) created by the following command:

sudo xen-create-image --hostname lucidxentest --ip 163.1.86.9 --pygrub
plus these xen-tools.conf parameters: size = 8 Gb, image = full, memory = 512, swap = 512

The stress test runs successfully for anywhere from 1 to 12 hours, then one of the nodes reboots. At that point the file copy is interrupted and the vms have crashed.

I have noticed occasional reports of a kernel error (linux/mm/slub.c 2969!), similar to a Debian bug (#634047). But I find no firm correlation with the reboots, as the kern.log and messages logs usually do not report this kernel error.
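One way to check that correlation more systematically is to pull every slub.c line out of the logs and compare the timestamps with the reboot times. The sketch below is only an assumption about how that could be done. The pattern simply matches any line mentioning slub.c, and the function name is made up for illustration.

```python
import re
from typing import Iterable, List

# Assumed pattern: any log line that mentions slub.c, as in the error quoted above.
SLUB_PATTERN = re.compile(r"slub\.c")


def find_slub_errors(lines: Iterable[str]) -> List[str]:
    """Return the log lines that mention slub.c, with trailing newlines removed."""
    return [line.rstrip("\n") for line in lines if SLUB_PATTERN.search(line)]
```

For example, `with open("/var/log/kern.log") as log: hits = find_slub_errors(log)` collects the matching lines from kern.log on one node.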

Some things I have tried:

a basic reinstall of all the components of the system (squeeze + xen + ocfs2)

a memtest on both nodes. (no problems).

changing the default Debian IO scheduler in combination with ocfs2, trying each of: cfq, deadline, anticipatory, noop.

still to be investigated, not yet tried: (1) adjusting the halt state setting in the BIOS; (2) setting cpufreq=dom0-kernel (frequency scaling).
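For the IO scheduler item above, the active scheduler of a block device can be read from /sys/block/<device>/queue/scheduler, where the kernel lists the available schedulers and marks the active one in square brackets. A minimal sketch for checking which scheduler is in effect on each node follows. The helper names and the device name used in the example are assumptions for illustration.

```python
from pathlib import Path
from typing import List, Tuple


def parse_scheduler(text: str) -> Tuple[str, List[str]]:
    """Parse sysfs scheduler text such as 'noop anticipatory deadline [cfq]'.

    Returns (active, available); active is '' if no name is bracketed.
    """
    active = ""
    available = []
    for name in text.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]  # strip the brackets that mark the active scheduler
            active = name
        available.append(name)
    return active, available


def read_scheduler(device: str, sysfs_root: Path = Path("/sys/block")) -> Tuple[str, List[str]]:
    """Read and parse the scheduler file of one block device."""
    scheduler_file = sysfs_root / device / "queue" / "scheduler"
    return parse_scheduler(scheduler_file.read_text())
```

For example, `read_scheduler("sda")` reports the scheduler for a device named sda (a hypothetical name; the iSCSI disk may appear under a different one). Writing a scheduler name to the same file as root switches the scheduler at runtime.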

Any suggestions are welcome!

Ben Weaver

