[Xen-devel] xen zombie while starting on "Secondary" DRBD device



Hi all,

I am implementing an HA infrastructure composed of 2 physical servers and
8 virtual servers with Xen.

Xen 3.0.2
DRBD 0.7 (debian)

Data redundancy and replication is handled by DRBD.

I used these howtos:
http://lists.xensource.com/archives/html/xen-devel/2005-06/msg00544.html
http://www.option-c.com/xwiki/XenLvmDrbd

I am stacking several technologies:
Xen over DRBD over LVM over RAID1 (software)

The HA is handled by heartbeat.
I found a very bad race condition that, when it occurs, requires a hard
reboot of the machine (a standard reboot doesn't work and hangs).

Basically, if a Xen virtual server starts while the corresponding DRBD
device is in Secondary state, it becomes a zombie and it's not possible to
remove it. It's not even possible to reboot the server, because
"xendomains stop" hangs the reboot process.
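
To illustrate the kind of guard that would avoid the race, here is a
minimal sketch in Python. It assumes the DRBD role is available as a
"Local/Peer" string such as "Secondary/Primary" (for example as printed by
"drbdadm state <resource>", which is an assumption about the installed
DRBD tools). The function names are hypothetical and not part of Xen,
DRBD, or heartbeat.

```python
def local_drbd_role(state_output: str) -> str:
    """Return the local role from a 'Local/Peer' string.

    Assumption: the input looks like 'Primary/Secondary'; the part
    before the slash is the role of the local node.
    """
    local, _sep, _peer = state_output.strip().partition("/")
    return local


def safe_to_start_domain(state_output: str) -> bool:
    """True only if the local DRBD device reports itself as Primary."""
    return local_drbd_role(state_output) == "Primary"


if __name__ == "__main__":
    # Hypothetical sample values, not captured from a real system.
    for sample in ("Primary/Secondary", "Secondary/Primary", ""):
        print(repr(sample), "->", safe_to_start_domain(sample))
```

A start script could call such a check and refuse to run "xm create"
unless it returns True, so the virtual server is never started on a
Secondary device.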

Running "xm list" shows this:
Zombie-admin-server0           15       32     2 ----cd     0.4
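
For monitoring, a line like the one above can be checked by a script. The
following is only a sketch: it assumes the columns are name, ID, memory,
VCPUs, state and time, in that order, and that a zombie is recognisable by
the "Zombie-" name prefix seen here. The function names are hypothetical.

```python
def parse_xm_list_line(line: str) -> dict:
    """Split one data line of 'xm list' output into named fields.

    Assumption: the last five whitespace-separated columns are
    ID, Mem, VCPUs, State and Time; everything before is the name.
    """
    name, dom_id, mem, vcpus, state, time_s = line.rsplit(None, 5)
    return {
        "name": name.strip(),
        "id": int(dom_id),
        "mem": int(mem),
        "vcpus": int(vcpus),
        "state": state,
        "time": float(time_s),
    }


def is_zombie(line: str) -> bool:
    """True if the domain name carries the 'Zombie-' prefix."""
    return parse_xm_list_line(line)["name"].startswith("Zombie-")


if __name__ == "__main__":
    sample = "Zombie-admin-server0           15       32     2 ----cd     0.4"
    print(parse_xm_list_line(sample))
    print(is_zombie(sample))
```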

Zombie Xen servers are a very serious problem in this kind of infrastructure.

Is this a problem in Xen, in DRBD, or in both?

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

