Re: [Xen-devel] Remus: Possible disk replication consistency bug
Hi dmelisso,

On Fri, Feb 10, 2012 at 11:02 AM, Dimitrios Melissovas <dimitrios.melissovas@xxxxxxx> wrote:
> Greetings,

It used to be possible. I have run Remus with 3 blktap2 disks. I even
remember fixing an issue in the blktap2 code in unstable/4.1-testing
that broke the replication (but that was almost a year ago). Let me
check what has changed in the unstable repo to break the blktap2
replication again.

> 2. [Possible bug] How does Remus guarantee that when, after failover, a

Currently there is no synchronization mechanism. I certainly agree that
it is a bug, and a very, very rare one, for I have not been able to
reproduce the bug or observe its side effects. I even have a fix for the
DRBD version, though not in the daemonized form that you are talking
about. I haven't found time to add a similar fix to the blktap2 version,
which is why I haven't sent out a fix. Remus uses two separate

> - Epilogue

Yes, I agree. Stefano posted patches on xen-devel introducing callbacks
in xc_domain_restore. My intention is to add one or more callbacks that
would (a) send explicit checkpoint acknowledgements to the primary
(rather than relying on the fsync() on the primary) and (b) only send
the checkpoint ack after ensuring that all disks have received their
checkpoints. With the explicit ack from the xc_domain_restore callback,
one could actually get rid of the disk-level acknowledgements. The
primary would only send a "barrier" or "flush" to delimit checkpoints.

Cheers!

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
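[Editor's illustration] The acknowledgement scheme described in the reply
(checkpoints delimited by a "barrier", with the backup acking only after
every disk has committed its checkpoint) can be sketched roughly as
follows. This is a hedged toy model, not Remus or libxc code: the names
DiskReplica, Backup, and on_checkpoint_barrier are all hypothetical and
do not correspond to actual Xen APIs.

```python
# Toy model of the proposed checkpoint-ack protocol. The primary streams
# disk writes, then a barrier; the backup confirms ("ack") only once ALL
# disk replicas have atomically committed their buffered checkpoint, so
# memory and disk state cannot diverge on failover. All class and method
# names are illustrative assumptions, not Xen/Remus interfaces.

class DiskReplica:
    def __init__(self, name):
        self.name = name
        self.buffered = []   # writes belonging to the in-flight checkpoint
        self.committed = []  # writes from acknowledged checkpoints

    def write(self, data):
        # Buffer writes until the checkpoint barrier arrives.
        self.buffered.append(data)

    def barrier(self):
        # Commit the buffered checkpoint atomically, then confirm.
        self.committed.extend(self.buffered)
        self.buffered = []
        return True  # barrier confirmed for this disk


class Backup:
    def __init__(self, disks):
        self.disks = disks
        self.acked_checkpoints = 0

    def on_checkpoint_barrier(self):
        # Send a single explicit ack to the primary only after every
        # disk has confirmed the barrier; this replaces per-disk acks.
        if all(d.barrier() for d in self.disks):
            self.acked_checkpoints += 1
            return "ack"
        return "nack"
```

Under this model the primary needs no fsync()-based inference: the one
"ack" per barrier already implies that every disk committed the
checkpoint, which is why the disk-level acknowledgements become
redundant.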