Thank you for your response,
I compiled Linux 3.0.101 and sch_plug, and reinstalled remus-drbd. I still receive the same error when starting Remus:
###
xc: error: rdexact failed (select returned 0): Internal error
xc: error: Error when reading batch size (110 = Connection timed out): Internal error
xc: error: error when buffering batch, finishing (110 = Connection timed out): Internal error
###
With shorter checkpoint intervals, the error shows up after fewer plug/unplug cycles.
I detailed my installation steps on my crude blog I just made, hopefully it helps:
For Linux I used 3.4 or 3.0 and added the necessary options in make menuconfig. For 3.0 I had to get a separate copy of sch_plug.
The only place where I had to deviate is the DomU config: I had to use ["phy:/dev/drbd1,w,xvda"] instead of ["drbd:ubuntu_vm_1,w,xvda"].
On a side note: interestingly, after changing the kernel from 3.4 to 3.0 and installing the external version of sch_plug provided on the Xen Wiki, pings to the DomU don't show the delay that network buffering causes, but at longer intervals you can feel the extra time the pings take to respond. Weird.
Many thanks,
Anthony

Hi,

On 12/19/2014 05:48 AM, Anthony Korzan wrote:
Hello!
I have only managed to get Xen 4.5's Remus "working" on Linux kernels older than 3.5. The provided remus-drbd, as detailed in docs/README.remus and available from https://github.com/rshriram/remus-drbd, will not compile with Linux kernels 3.6 and above.
The DRBD you get from https://github.com/rshriram/remus-drbd is DRBD 8.3.11, and this version is only compatible with Linux 3.0~3.4; see the table on this page: http://www.drbd.org/download/mainline/
I'm afraid DRBD 8.3.11 is currently the only version that Remus works with. In the past, Remus disk replication was based on blktap2, but I think blktap2 is being deprecated; it has had no maintainers or patches in recent years.
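To make the compatibility range above easy to check before building, here is a minimal Python sketch. It only encodes the 3.0~3.4 range quoted in this message; the function names are made up for illustration and are not part of Remus or DRBD.

```python
import re

# Range taken from the message above: DRBD 8.3.11 builds on Linux 3.0 through 3.4.
MIN_KERNEL = (3, 0)
MAX_KERNEL = (3, 4)


def parse_kernel_release(release):
    """Return (major, minor) from a release string such as '3.4.0-xen'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    return int(match.group(1)), int(match.group(2))


def kernel_in_drbd_range(release):
    """True if the kernel's major.minor lies inside the assumed 3.0-3.4 range."""
    return MIN_KERNEL <= parse_kernel_release(release) <= MAX_KERNEL
```

On the build host, the string to pass in would be the output of `uname -r` (or `platform.release()` from Python).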
If you are interested, there's a new FT solution based on Remus: http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping
This solution uses blktap2 for disk replication, and it has lots of patches to get blktap2 working with xl.
Furthermore, we are working on a better disk replication solution for both Remus and COLO. COLO is supposed to get into Xen 4.6.
One of these errors is that remus-drbd uses the two-argument version of the macro kunmap_atomic, found in include/linux/highmem.h. This form was deprecated and is no longer included in kernels 3.6 and above.
"error: macro "kunmap_atomic" passed 2 arguments, but takes just 1"
Is there a patch available? If not, what setup do the Remus devs use to test? I just need a "stable-ish" platform to modify Remus on.
Now I did get Remus "working" on Linux 3.4, Ubuntu 14.04, and the custom remus-drbd. The issue I run into is that Remus only plugs and unplugs a few hundred times before it fails with a "Connection timed out" error. It could be that I am using an "old" Linux kernel version without much Xen integration, but I'm stumped about this error:
Can you try Linux 3.0 to see if the error still exists? I will take a look at this problem to see if I can reproduce it.
###
...
xc: progress: Reloading memory pages: 895015/65536 1365%
xc: Saving memory: iter 1416 (last sent 568 skipped 0): 65536/65536 100%
...
xc: Saving memory: iter 1420 (last sent 567 skipped 0): 65536/65536 100%
xc: error: rdexact failed (select returned 0): Internal error
xc: error: Error when reading batch size (110 = Connection timed out): Internal error
xc: error: error when buffering batch, finishing (110 = Connection timed out): Internal error
migration target: Remus Failover for domain 5
libxl: error: libxl_utils.c:430:libxl_read_exactly: file/stream truncated reading ipc msg header from domain 5 save/restore helper stdout pipe
libxl: error: libxl_exec.c:129:libxl_report_child_exitstatus: domain 5 save/restore helper [-1] died due to fatal signal Broken pipe
libxl: warning: libxl_dom.c:2015:domain_suspend_done: Remus: Domain suspend terminated with rc -3, teardown Remus devices...
Remus: Backup failed? resuming domain at primary.
xc: error: Dom 5 not suspended: (shutdown 0, reason 255): Internal error
libxl: error: libxl.c:505:libxl__domain_resume: xc_domain_resume failed for domain 5: Invalid argument
###
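When comparing runs (for example different kernels or checkpoint intervals), it can help to pull the last `iter` value and the first error out of a captured log like the one above. The following is a rough Python sketch, assuming the output was saved with one message per line; the helper name and the regular expressions are made up for illustration and only match the line shapes seen in this log.

```python
import re

# Patterns derived from the log lines quoted above; other Xen versions may differ.
ITER_RE = re.compile(r"xc: Saving memory: iter (\d+)")
ERROR_RE = re.compile(r"^(?:xc|libxl): error: (.*)$")


def summarize_remus_log(text):
    """Return (last_iter, first_error) from captured Remus output.

    last_iter is the highest-positioned 'iter N' seen before the first
    error line (None if there is none); first_error is the message of
    the first 'xc: error:' or 'libxl: error:' line (None if there is none).
    """
    last_iter = None
    for line in text.splitlines():
        line = line.strip()
        iter_match = ITER_RE.search(line)
        if iter_match:
            last_iter = int(iter_match.group(1))
            continue
        error_match = ERROR_RE.match(line)
        if error_match:
            # Stop at the first error; later lines are follow-on failures.
            return last_iter, error_match.group(1)
    return last_iter, None
```

Running it over logs from two configurations gives a quick way to see whether the failure arrives after more or fewer iterations.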
Sincerely, Anthony
-- Thanks, Yang.