
[Xen-devel] [RFC PATCH COLO v5 00/29] COarse-grain LOck-stepping Virtual Machines for Non-stop Service



This patchset is for xen-4.6. The main differences from previous versions are:
1. Use qdisk block replication
   http://wiki.qemu.org/Features/BlockReplication
2. Nic replication based on colo-proxy
   http://wiki.qemu.org/Features/COLO#Components
Note that the COLO feature is under active development; this version is not
well tested and has some known problems.
We are posting it early to give you a brief impression of how COLO will be
implemented, and we would welcome comments on both the general idea of COLO
and the implementation. If you have any ideas or suggestions on COLO, please
do not hesitate to share them. Thanks in advance.

Virtual machine (VM) replication is a well-known technique for providing
application-agnostic, software-implemented hardware fault tolerance
("non-stop service"). Currently, Remus provides this function, but it buffers
all output packets, and the resulting latency is unacceptable.
At Xen Summit 2012, we introduced a new VM replication solution: COLO
(COarse-grain LOck-stepping virtual machine). The presentation is available
at the following URL:
http://www.slideshare.net/xen_com_mgr/colo-coarsegrain-lockstepping-virtual-machines-for-nonstop-service

Here is the summary of the solution:
From the client's point of view, as long as the client observes identical
responses from the primary and secondary VMs, according to the service
semantics, the secondary VM is a valid replica of the primary VM and can
successfully take over when a hardware failure of the primary VM is
detected.
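
To make the idea concrete, here is a minimal sketch of the comparison step in
C. It is not code from this patchset: the names colo_packet, colo_action and
colo_compare are hypothetical, and it assumes that a mismatch between the two
outputs is handled by forcing a checkpoint before anything is released to the
client. It also compares raw bytes only, whereas a real comparison would
follow the service semantics.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names for illustration; none of them come from the patchset. */
enum colo_action {
    COLO_RELEASE,    /* outputs match: the response can go to the client */
    COLO_CHECKPOINT  /* outputs differ: resynchronise the secondary first (assumed policy) */
};

struct colo_packet {
    const unsigned char *data;
    size_t len;
};

/* Byte-wise equality of two output packets. */
static bool colo_packets_equal(const struct colo_packet *p,
                               const struct colo_packet *s)
{
    assert(p != NULL && s != NULL);
    if (p->len != s->len)
        return false;
    return p->len == 0 || memcmp(p->data, s->data, p->len) == 0;
}

/* Decide what to do with one pair of responses from primary and secondary. */
static enum colo_action colo_compare(const struct colo_packet *primary,
                                     const struct colo_packet *secondary)
{
    return colo_packets_equal(primary, secondary) ? COLO_RELEASE
                                                  : COLO_CHECKPOINT;
}
```

A caller would pass each pair of responses to colo_compare() and only forward
the primary's packet when the result is COLO_RELEASE. This is what separates
the approach from buffering every output packet until the next checkpoint.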

This patchset is based on migration v1.
Only HVM guests are supported for now. The code is also hosted on GitHub:
https://github.com/macrosheep/xen/tree/COLO_RFC_v5

TODO list:
1. Code reviews and bug fixes
2. Switch to migration v2
3. Support PV guests

Known bugs:
1. The secondary VM may crash due to a triple fault.

Wiki pages:
http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping
http://wiki.qemu.org/Features/COLO

Patch 1    : Add readme
Patch 2-8  : Refactoring and preparatory work
Patch 9-12 : Update Remus so that its device code can be reused
Patch 13-21: COLO framework code
Patch 22-23: Implement disk replication
Patch 24-29: Implement NIC replication

Changelog from v4 to v5:
1. rebase to the latest xen upstream
2. disk replication: blktap2->qdisk
3. nic replication: colo-agent->colo-proxy

Changelog from v3 to v4:
1. rebase to newest xen
2. bug fix

Changelog from v2 to v3:
1. rebase to newest remus
2. add nic replication support

Changelog from v1 to v2:
1. rebase to newest remus
2. add disk replication support

Wen Congyang (23):
  Add readme
  Refactor domain_suspend_callback_common()
  tools: libxl: introduce a new API libxl__domain_restore() to read qemu
    state
  Update libxl__domain_suspend_common_switch_qemu_logdirty() for colo
  Introduce a new internal API libxl__domain_unpause()
  Update libxl__domain_unpause() to support qemu-xen
  support to resume uncooperative HVM guests
  tools/libxl: Introduce bitops macros
  move remus related codes to libxl_remus.c
  rename remus device to checkpoint device
  adjust the indentation
  don't touch remus in checkpoint_device
  Update libxl_save_msgs_gen.pl to support return data from xl to xc
  Allow slave sends data to master
  secondary vm suspend/resume/checkpoint code
  primary vm suspend/get_dirty_pfn/resume/checkpoint code
  xc_domain_save: flush cache before calling callbacks->postcopy() in
    colo mode
  COLO: xc related codes
  send store mfn and console mfn to xl before resuming secondary vm
  implement the cmdline for COLO
  tools: xc_doamin_restore: zero ioreq page only one time
  Support colo mode for qemu disk
  COLO: use qemu block replication

Yang Hongyang (6):
  COLO proxy: implement setup/teardown of COLO proxy module
  COLO proxy: preresume, postresume and checkpoint
  COLO nic: implement COLO nic subkind
  setup and control colo proxy on primary side
  setup and control colo proxy on secondary side
  cmdline switches and config vars to control colo-proxy

 docs/README.colo                      |   92 +++
 docs/man/xl.conf.pod.5                |    6 +
 docs/man/xl.pod.1                     |   11 +-
 tools/hotplug/Linux/Makefile          |    1 +
 tools/hotplug/Linux/colo-proxy-setup  |  128 ++++
 tools/libxc/include/xenguest.h        |   40 ++
 tools/libxc/xc_domain_restore.c       |  106 ++-
 tools/libxc/xc_domain_save.c          |   71 +-
 tools/libxc/xc_resume.c               |   20 +-
 tools/libxl/Makefile                  |    6 +-
 tools/libxl/libxl.c                   |  185 +++--
 tools/libxl/libxl_bitops.h            |   79 +++
 tools/libxl/libxl_checkpoint_device.c |  282 ++++++++
 tools/libxl/libxl_colo.h              |   53 ++
 tools/libxl/libxl_colo_nic.c          |  313 +++++++++
 tools/libxl/libxl_colo_proxy.c        |  267 ++++++++
 tools/libxl/libxl_colo_qdisk.c        |  209 ++++++
 tools/libxl/libxl_colo_restore.c      | 1190 +++++++++++++++++++++++++++++++++
 tools/libxl/libxl_colo_save.c         |  782 ++++++++++++++++++++++
 tools/libxl/libxl_create.c            |  166 ++++-
 tools/libxl/libxl_device.c            |   38 ++
 tools/libxl/libxl_dm.c                |  262 +++++++-
 tools/libxl/libxl_dom.c               |  569 +++++++---------
 tools/libxl/libxl_internal.h          |  302 ++++++---
 tools/libxl/libxl_netbuffer.c         |  117 ++--
 tools/libxl/libxl_nonetbuffer.c       |   10 +-
 tools/libxl/libxl_qmp.c               |   41 ++
 tools/libxl/libxl_remus.c             |  373 +++++++++++
 tools/libxl/libxl_remus.h             |   27 +
 tools/libxl/libxl_remus_device.c      |  327 ---------
 tools/libxl/libxl_remus_disk_drbd.c   |   57 +-
 tools/libxl/libxl_save_callout.c      |   37 +-
 tools/libxl/libxl_save_helper.c       |   17 +
 tools/libxl/libxl_save_msgs_gen.pl    |   74 +-
 tools/libxl/libxl_types.idl           |   20 +-
 tools/libxl/libxlu_disk_l.l           |    5 +
 tools/libxl/xl.c                      |    3 +
 tools/libxl/xl.h                      |    1 +
 tools/libxl/xl_cmdimpl.c              |  101 ++-
 tools/libxl/xl_cmdtable.c             |    4 +-
 40 files changed, 5413 insertions(+), 979 deletions(-)
 create mode 100644 docs/README.colo
 create mode 100755 tools/hotplug/Linux/colo-proxy-setup
 create mode 100644 tools/libxl/libxl_bitops.h
 create mode 100644 tools/libxl/libxl_checkpoint_device.c
 create mode 100644 tools/libxl/libxl_colo.h
 create mode 100644 tools/libxl/libxl_colo_nic.c
 create mode 100644 tools/libxl/libxl_colo_proxy.c
 create mode 100644 tools/libxl/libxl_colo_qdisk.c
 create mode 100644 tools/libxl/libxl_colo_restore.c
 create mode 100644 tools/libxl/libxl_colo_save.c
 create mode 100644 tools/libxl/libxl_remus.c
 create mode 100644 tools/libxl/libxl_remus.h
 delete mode 100644 tools/libxl/libxl_remus_device.c

-- 
1.9.1

