Re: [Xen-devel] [Xen 4.5-rc] remus-drbd incompatible with Linux 3.6+ headers

Thank you for your response,

I compiled Linux 3.0.101 and sch_plug, and reinstalled remus-drbd.  I still receive the same error when starting Remus:
xc: error: rdexact failed (select returned 0): Internal error
xc: error: Error when reading batch size (110 = Connection timed out): Internal error
xc: error: error when buffering batch, finishing (110 = Connection timed out): Internal error

With shorter checkpoint intervals, the error occurs after fewer plug/unplug cycles.
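
For reference, the first two messages are what one would expect from a read loop that waits on select() with a timeout: select() returning 0 means no data arrived on the migration stream in time, and 110 is ETIMEDOUT on Linux. Below is a minimal userspace sketch of that pattern, not the actual libxc code. The name read_exact_timeout and the timeout_ms parameter are hypothetical, chosen only for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (NOT the libxc function): read exactly `size` bytes
 * from `fd`, waiting at most `timeout_ms` for each chunk of data.
 * Returns 0 on success, -1 on error with errno set. */
static int read_exact_timeout(int fd, void *buf, size_t size, int timeout_ms)
{
    size_t offset = 0;

    while (offset < size) {
        fd_set rfds;
        struct timeval tv;
        int ready;
        ssize_t len;

        /* select() may modify both arguments, so rebuild them each time. */
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;

        ready = select(fd + 1, &rfds, NULL, NULL, &tv);
        if (ready == -1 && errno == EINTR)
            continue;               /* interrupted by a signal: retry */
        if (ready == -1)
            return -1;              /* errno already set by select() */
        if (ready == 0) {
            errno = ETIMEDOUT;      /* the sender went quiet for too long */
            return -1;
        }

        len = read(fd, (char *)buf + offset, size - offset);
        if (len == -1 && errno == EINTR)
            continue;
        if (len == 0)
            errno = 0;              /* stream closed before `size` bytes */
        if (len <= 0)
            return -1;
        offset += (size_t)len;
    }
    return 0;
}
```

If this is roughly what happens on the backup side, the timeout would point at the primary failing to send the next checkpoint in time rather than at a problem with the data itself.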

I detailed my installation steps on a crude blog I just made; hopefully it helps:

For Linux I used 3.4 or 3.0 and added the necessary options in make menuconfig.  For 3.0 I had to get a separate copy of sch_plug.

The only thing I had to do differently is that, in the DomU config, I had to use ["phy:/dev/drbd1,w,xvda"] instead of ["drbd:ubuntu_vm_1,w,xvda"].

On a side note: interestingly, after changing the kernel from 3.4 to 3.0 and installing the external version of sch_plug provided on the Xen Wiki, pings to the DomU don't show the delay that the network buffering causes, but at longer intervals you can feel the extra time the pings take to respond.  Weird.

Many Thanks,

On Dec 18, 2014, at 9:09 PM, Hongyang Yang <yanghy@xxxxxxxxxxxxxx> wrote:


On 12/19/2014 05:48 AM, Anthony Korzan wrote:

I have only managed to get Xen 4.5's Remus "working" on Linux Kernels less than
3.5. The provided remus-drbd, as detailed in docs/README.remus and available
from https://github.com/rshriram/remus-drbd will not compile with Linux Kernels
3.6 and above.

The DRBD you get from https://github.com/rshriram/remus-drbd is DRBD 8.3.11,
and this version is only compatible with Linux 3.0~3.4; see the table on this page:

I'm afraid DRBD 8.3.11 is the only version that Remus currently works with.
In the past, Remus disk replication was based on blktap2, but I think blktap2
is being deprecated; it has had no maintainers or patches in recent years.

If you are interested, there's a new FT solution based on Remus:

This solution uses blktap2 for disk replication, and it has lots of patches to
get blktap2 working with xl.

Furthermore, we are working on a better disk replication solution for both
Remus and COLO. COLO is expected to get into Xen 4.6.

One of these errors is that remus-drbd uses a two-argument version of the macro
kunmap_atomic found in include/linux/highmem.h.
This form was deprecated and is no longer included in kernels 3.6 and above.

"error: macro "kunmap_atomic" passed 2 arguments, but takes just 1"
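
A workaround for this one error could be a version-gated wrapper macro that drops the second argument on newer kernels. The sketch below is untested against the remus-drbd sources. The name compat_kunmap_atomic is hypothetical, and the non-kernel branch contains stand-in definitions (a fake version code and a stub kunmap_atomic) so the fragment can be compiled outside a kernel tree. It addresses only this macro, not the other build errors.

```c
#include <stddef.h>

#ifdef __KERNEL__
#include <linux/version.h>
#include <linux/highmem.h>
#else
/* Userspace stand-ins, for illustration only (assumptions, not kernel code):
 * pretend to be a 3.13 kernel and count calls to a stub kunmap_atomic. */
#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))
#define LINUX_VERSION_CODE KERNEL_VERSION(3, 13, 0)
static int unmap_calls;
static void kunmap_atomic_stub(void *addr) { (void)addr; unmap_calls++; }
#define kunmap_atomic(addr) kunmap_atomic_stub(addr)
#endif

/* Hypothetical wrapper: callers keep the old two-argument form.
 * On 3.6+ the km_type argument never appears in the expansion, so the
 * preprocessor discards it and it need not name a defined symbol. */
#if LINUX_VERSION_CODE >= KERNEL_VERSION(3, 6, 0)
#define compat_kunmap_atomic(addr, km_type) kunmap_atomic(addr)
#else
#define compat_kunmap_atomic(addr, km_type) kunmap_atomic(addr, km_type)
#endif
```

Call sites would then use compat_kunmap_atomic(addr, KM_USER0) in place of kunmap_atomic(addr, KM_USER0).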

Is there a patch available?  If not, what set up do the Remus devs use to test?
 I just need a "stable-ish" platform to modify remus on.

Now I did get Remus "working" on Linux 3.4, Ubuntu 14.04, and the custom
remus-drbd.  The issue I run into is that Remus only plugs and unplugs a few
hundred times until there is a "Connection timeout error."  It could be that I
am using an "old" linux kernel version without much Xen integration, but I'm
stumped about this error:

Can you try using Linux 3.0 to see if the error still exists?
I will take a look at this problem to see if I can reproduce it.

xc: progress: Reloading memory pages: 895015/65536  1365%
xc: Saving memory: iter 1416 (last sent 568 skipped 0): 65536/65536  100%
xc: Saving memory: iter 1420 (last sent 567 skipped 0): 65536/65536  100%
xc: error: rdexact failed (select returned 0): Internal error
xc: error: Error when reading batch size (110 = Connection timed out): Internal
xc: error: error when buffering batch, finishing (110 = Connection timed out):
Internal error
migration target: Remus Failover for domain 5
libxl: error: libxl_utils.c:430:libxl_read_exactly: file/stream truncated
reading ipc msg header from domain 5 save/restore helper stdout pipe
libxl: error: libxl_exec.c:129:libxl_report_child_exitstatus: domain 5
save/restore helper [-1] died due to fatal signal Broken pipe
libxl: warning: libxl_dom.c:2015:domain_suspend_done: Remus: Domain suspend
terminated with rc -3, teardown Remus devices...
Remus: Backup failed? resuming domain at primary.
xc: error: Dom 5 not suspended: (shutdown 0, reason 255): Internal error
libxl: error: libxl.c:505:libxl__domain_resume: xc_domain_resume failed for
domain 5: Invalid argument



Xen-devel mailing list