
Re: [Xen-users] drbd 8 primary/primary and xen migration on RHEL 5



On 2008-07-31 21:58, nathan@xxxxxxxxxxxx wrote:
> I am running DRBD primary/primary on CentOS 5.2 with CLVM and GFS with no problems. The only issue I have with live migration is that the ARP entry takes 10-15 seconds to get refreshed, so you lose connectivity during that time. I have the problem with the 3.0-ish Xen on CentOS 5.2 as well as with Xen 3.2.1.

One can run a job on the VM that generates a packet every second or two to work around this; a ping loop should do it.
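For example, something along these lines inside the DomU (the gateway address is only a placeholder; use whatever makes sense on your network):

    # keep traffic flowing so switches and neighbours re-learn the MAC quickly
    while true; do ping -c 1 -W 1 192.168.1.1 >/dev/null 2>&1; sleep 2; done &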

My scenario doesn't involve any clustered filesystem. I'm using phy: drbd devices as the backing for the vm, not files. As far as I understand things, a clustered filesystem shouldn't be necessary, as long as the drbd devices are in sync at the moment migration occurs.
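In other words, each domU config names the drbd device directly, along these lines (device and target names are just examples):

    # domU disk line pointing straight at the drbd device
    disk = [ 'phy:/dev/drbd0,hda,w' ]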

But the question remains whether that condition is guaranteed, and I hope to hear from someone who knows the answer to that question...

> Anyway, other than the ARP issue, I have this working in production with about two dozen DomUs.

> Note: If you want to use LVM for Xen rather than files on GFS/LVM/DRBD, you need to run the latest DRBD, which supports max-bio-bvecs.

I'm actually running drbd on top of LVM. But I'll look into the max-bio-bvecs thing anyway out of curiosity.
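(In case it helps anyone else: as I understand it, max-bio-bvecs is a disk-section option in drbd.conf, roughly like this; the resource name is only an example:)

    resource r0 {
      disk {
        max-bio-bvecs 1;   # limit bios to one bvec; reportedly needed when exporting drbd via phy: to Xen
      }
    }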

Thanks for the reply.

> On Thu, 31 Jul 2008, Antibozo wrote:
>> Greetings.

>> I've reviewed the list archives, particularly the posts from Zakk on this subject, and found results similar to his. drbd provides a block-drbd script, but with full virtualization, at least on RHEL 5, this does not work; by the time the block script is run, qemu-dm has already been started.
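(For context: with paravirtualized guests the block-drbd helper is driven by naming the drbd resource in the domU disk line, roughly as below; the resource name is made up. With full virtualization, qemu-dm is already running by the time that hook fires, which is the problem described here.)

    # PV domU config using DRBD's block-drbd helper script; "vmdata" is an example resource name
    disk = [ 'drbd:vmdata,xvda,w' ]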

>> Instead, I've simply been considering the possibility of keeping the drbd devices in the primary/primary state at all times. I'm concerned about a race condition, however, and want to ask whether others have examined this alternative.

>> I'm thinking of a scenario where the vm is running on node A and has a process that is writing to disk at full speed, so that the drbd device on node B is lagging. If I perform a live migration from node A to B under this condition, the local device on node B might not be in sync at the time the vm is started on that node. Maybe.

>> If I use drbd protocol C, theoretically at least, a sync on the device on node A shouldn't return until node B is fully in sync. So I guess my main question is: during migration, does xend force a device sync on node A before the vm is started on node B?

>> A secondary question I have (and this may be a question for the drbd folks as well) is: why is the block-drbd script necessary? I.e. why not simply leave drbd in primary/primary at all times--what benefit is there to marking the device secondary on the standby node?
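(For reference, the always-dual-primary setup being mused about would look roughly like this in drbd.conf; the resource name is only an example:)

    resource vmdata {
      protocol C;              # a write is not acknowledged until the peer has it on stable storage
      net {
        allow-two-primaries;   # permit primary/primary so the device can be open on both nodes
      }
      startup {
        become-primary-on both;
      }
    }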

>> Or am I just very confused? Does anyone else have thoughts or experience on this matter? All responses are appreciated.

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users


 

