
[Xen-devel] [RFC PATCH COLO v5 01/29] Add readme



From: Wen Congyang <wency@xxxxxxxxxxxxxx>

Signed-off-by: Wen Congyang <wency@xxxxxxxxxxxxxx>
Signed-off-by: Yang Hongyang <yanghy@xxxxxxxxxxxxxx>
---
 docs/README.colo | 92 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 92 insertions(+)
 create mode 100644 docs/README.colo

diff --git a/docs/README.colo b/docs/README.colo
new file mode 100644
index 0000000..60f487d
--- /dev/null
+++ b/docs/README.colo
@@ -0,0 +1,92 @@
+COLO provides fault tolerance for virtual machines by sending continuous
+checkpoints to a backup, which will activate if the target VM fails. It
+only supports HVM guests (without PV extensions).
+
+Requirements:
+1. Hardware requirements
+   There must be at least one directly connected NIC to forward packets from
+   the client to the secondary VM. The directly connected NIC must not be used
+   for any other purpose. If your guest has more than one NIC, you need one
+   directly connected NIC for each guest NIC. If you don't have enough
+   directly connected NICs, you can use VLANs.
+2. Dom0 requirements
+   - The kernel must support running as dom0
+   - Kernel modules:
+        sch_ingress
+        cls_basic
+        cls_tcindex
+        cls_u32
+        act_mirred
+   - libnl-tools >= 3.0. This package provides the nl-qdisc-list command,
+     which COLO needs.
+   - If your host OS has OEM-released Xen tools, please uninstall them first.
+   - You can load modules that are not provided by the OEM.
+3. Guest requirements
+   Only HVM guests (without PV extensions) are supported now. If you want to
+   use an OEM-released guest OS, please use SUSE. Red Hat and Ubuntu are not
+   supported now because I have not found any way to disable PV extensions.
+   If you want to use Red Hat or Ubuntu, you need to build the newest
+   kernel, which has the xen_nopv parameter.
+
+Network link topology
+   Please refer to: http://wiki.qemu.org/Features/COLO#Network_link_topology
+
+Steps to set up the COLO environment:
+You need to recompile your host kernel because the colo-proxy module needs to
+cooperate with the Linux kernel.
+Please refer to: http://wiki.qemu.org/Features/COLO#Test_environment_prepare
+1. Build and install xen
+2. Apply the patches for qemu-xen, and rebuild the Xen tools:
+    - cd tools/qemu-xen-dir
+    - use git am to apply the patches:
+      https://raw.githubusercontent.com/wencongyang/colo-files/master/patch_for_qemu/*.patch
+    - make tools && make install-tools
+    Note: You must use qemu-xen. qemu-xen-traditional is not supported.
+3. Install the COLO proxy module:
+    3.1 Download the COLO proxy, then compile and install it:
+        https://github.com/gao-feng/colo-proxy.git
+    3.2 Download the iptables patch (based on v1.4.21), then compile and install it:
+        https://github.com/gao-feng/colo-proxy/blob/master/colo-patch-for-kernel.patch
+4. Install the guest
+    4.1 Add "xen_platform_pci=0" to the guest config file
+    4.2 If you use SUSE, please select "physical machine"
+    4.3 Copy the disk image to the secondary host
+5. Update your guest config file for COLO:
+    5.1 disk
+        disk = [
+        'format=raw,devtype=disk,access=w,vdev=hda,backendtype=qdisk,colo,colo-params=192.168.3.1:9000:exportname=qdisk1,active-disk=/mnt/ramfs/active_disk.img,hidden-disk=/mnt/ramfs/hidden_disk.img,target=/root/images/colo-hvm.img' ]
+    5.2 nic
+        vif = [ 'mac=00:16:4f:00:00:11, bridge=br0, model=e1000, forwarddev=eth0, forwardbr=br1' ]
+    Note:
+    a. The IP/port in colo-params must be the secondary host's. Don't use the
+       directly connected NIC's IP.
+    b. forwarddev is the directly connected NIC.
+    c. If you have more than one disk, the host/port in colo-params must be the
+       same for every disk, and the exportname must be different for each disk.
+6. Run COLO:
+    xl remus -c -u <domname> <secondary host IP>
+    Note: The IP must not be the directly connected NIC's IP.
+Note:
+The secondary host only needs to do steps 1-3.
+
+Known problems:
+1. The secondary VM may crash due to a triple fault.
+2. The heartbeat is not reliable. If you want to test the performance,
+   please disable the heartbeat (by modifying the Xen code). You can use the
+   branch colo-v4-noheartbeat.
+3. Suspending the VM fails, and the error message is:
+    libxl: error: libxl_qmp.c:429:qmp_next: timeout
+
+Problems 1 and 3 don't happen every time, so you can run COLO again to
+work around them.
+
+Virtio-Net:
+1. If you want to get better performance, you can use virtio-net.
+
+Troubleshooting:
+If an error occurs when starting COLO, try the following:
+1. Make sure all the modules that dom0 needs are present on both sides.
+2. Make sure you have followed all the instructions in this README.
+3. Try rebooting both the primary and secondary hosts.
+4. If you still have problems, collect the error logs and contact
+   Wen Congyang(wency@xxxxxxxxxxxxxx)/Yang Hongyang(yanghy@xxxxxxxxxxxxxx).
-- 
1.9.1
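
The kernel modules named under "Dom0 requirements" in the README can be
checked before starting COLO. The following is a minimal sketch and not part
of the patch: the function name missing_colo_modules is hypothetical, and it
assumes the modules are built as loadable modules, since code compiled into
the kernel does not appear in /proc/modules.

```sh
#!/bin/sh
# Kernel modules listed under "Dom0 requirements" in the README.
COLO_MODULES="sch_ingress cls_basic cls_tcindex cls_u32 act_mirred"

# Hypothetical helper: reads text in /proc/modules format on stdin and
# prints every required module that is not listed, one name per line.
missing_colo_modules() {
    # Collect the first column (module name) into " name1 name2 ... ".
    loaded=" $(awk '{ printf "%s ", $1 }')"
    for m in $COLO_MODULES; do
        case "$loaded" in
            *" $m "*) ;;                 # module is loaded
            *) printf '%s\n' "$m" ;;     # module is missing
        esac
    done
}
```

On a dom0 this could be run as "missing_colo_modules < /proc/modules" on both
the primary and the secondary host; each name it prints can then be loaded
with modprobe.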

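Note c under step 5 of the README requires that all disks share one host/port
in colo-params and use distinct exportname values. A sketch of that check,
assuming one disk specification per input line in the format shown in step
5.1; the function names colo_params and check_colo_params are hypothetical
and are not part of xl.

```sh
#!/bin/sh
# Hypothetical helper: prints "host:port exportname" for every disk
# specification on stdin that carries a colo-params= field.
colo_params() {
    sed -n 's/.*colo-params=\([^:,]*:[^:,]*\):exportname=\([^,]*\).*/\1 \2/p'
}

# Hypothetical helper: succeeds only if at least one COLO disk is found,
# all disks share one host:port, and no exportname is repeated.
check_colo_params() {
    params=$(colo_params)
    [ -n "$params" ] || return 1
    # Number of distinct host:port values.
    hosts=$(printf '%s\n' "$params" | awk '{ print $1 }' | sort -u |
            awk 'END { print NR }')
    # Number of exportname values that occur more than once.
    dups=$(printf '%s\n' "$params" | awk '{ print $2 }' | sort | uniq -d |
           awk 'END { print NR }')
    [ "$hosts" -eq 1 ] && [ "$dups" -eq 0 ]
}
```

The disk strings from a guest config file can be piped in one per line; a
non-zero exit status means the disks do not satisfy note c.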
