
Re: [Xen-devel] [PATCH RESEND v5 6/6] xen/arm: Implement toolstack for xl restore/save and migrate



12.11.2013 21:22, Ian Campbell wrote:
On Fri, 2013-11-08 at 16:50 +0900, Jaeyong Yoo wrote:
From: Alexey Sokolov <sokolov.a@xxxxxxxxxxx>

Implement the xl restore/save operations (which are also used for migrate) in 
xc_arm_migrate.c and make it compilable.
The overall process of save is the following:
1) save guest parameters (i.e., memory map, console and store pfn, etc)
2) save memory (if it is live, perform dirty-page tracking)
3) save hvm states (i.e., gic, timer, vcpu etc)

Signed-off-by: Alexey Sokolov <sokolov.a@xxxxxxxxxxx>
---
  config/arm32.mk              |   1 +
  tools/libxc/Makefile         |   6 +-
  tools/libxc/xc_arm_migrate.c | 712 +++++++++++++++++++++++++++++++++++++++++++
  tools/libxc/xc_dom_arm.c     |   4 +-
  tools/misc/Makefile          |   4 +-
  5 files changed, 723 insertions(+), 4 deletions(-)
  create mode 100644 tools/libxc/xc_arm_migrate.c

diff --git a/config/arm32.mk b/config/arm32.mk
index aa79d22..01374c9 100644
--- a/config/arm32.mk
+++ b/config/arm32.mk
@@ -1,6 +1,7 @@
  CONFIG_ARM := y
  CONFIG_ARM_32 := y
  CONFIG_ARM_$(XEN_OS) := y
+CONFIG_MIGRATE := y
  CONFIG_XEN_INSTALL_SUFFIX :=
diff --git a/tools/libxc/Makefile b/tools/libxc/Makefile
index 4c64c15..05dfef4 100644
--- a/tools/libxc/Makefile
+++ b/tools/libxc/Makefile
@@ -42,8 +42,13 @@ CTRL_SRCS-$(CONFIG_MiniOS) += xc_minios.c
  GUEST_SRCS-y :=
  GUEST_SRCS-y += xg_private.c xc_suspend.c
  ifeq ($(CONFIG_MIGRATE),y)
+ifeq ($(CONFIG_X86),y)
  GUEST_SRCS-y += xc_domain_restore.c xc_domain_save.c
  GUEST_SRCS-y += xc_offline_page.c xc_compression.c
+endif
+ifeq ($(CONFIG_ARM),y)
+GUEST_SRCS-y += xc_arm_migrate.c
I know you are just following the example above but I think this can be
GUEST_SRCS-$(CONFIG_ARM) += xc_arm...
OK

+endif
  else
  GUEST_SRCS-y += xc_nomigrate.c
  endif
@@ -63,7 +68,6 @@ $(patsubst %.c,%.opic,$(ELF_SRCS-y)): CFLAGS += -Wno-pointer-sign
  GUEST_SRCS-y                 += xc_dom_core.c xc_dom_boot.c
  GUEST_SRCS-y                 += xc_dom_elfloader.c
  GUEST_SRCS-$(CONFIG_X86)     += xc_dom_bzimageloader.c
-GUEST_SRCS-$(CONFIG_X86)     += xc_dom_decompress_lz4.c
I don't think this was intentional, was it?
Oops, sure.
  GUEST_SRCS-$(CONFIG_ARM)     += xc_dom_armzimageloader.c
  GUEST_SRCS-y                 += xc_dom_binloader.c
  GUEST_SRCS-y                 += xc_dom_compat_linux.c
diff --git a/tools/libxc/xc_arm_migrate.c b/tools/libxc/xc_arm_migrate.c
new file mode 100644
index 0000000..461e339
--- /dev/null
+++ b/tools/libxc/xc_arm_migrate.c
@@ -0,0 +1,712 @@
Is this implementing the exact protocol as described in
tools/libxc/xg_save_restore.h or is it a variant? Are there any docs of
the specifics of the ARM protocol?
This implements a protocol quite different from the one in tools/libxc/xg_save_restore.h. It is much simpler, because we do not need some of the things that are implemented for x86. So you're right, it has to be documented. Should we use a separate header for the documentation (and some definitions), for example xc_arm_migrate.h?
We will eventually need to make a statement about the stability of the
protocol, i.e on x86 we support X->X+1 migrations across Xen versions. I
think we'd need to make similar guarantees on ARM before we would remove
the "tech preview" label from the migration feature.
So, would you accept our test results (and if so, where should we put this statement), or would you run tests on your side?
So the docs are useful so we can review the intended protocol for
forward compatibility problems etc. We needn't necessarily implement the
x86 one from xg_save_restore.h.

In particular it would be nice if the protocol and each of the "chunks"
in it were explicitly versioned etc. For example the code assumes that
the HVM context implicitly follows the last iteration -- this caused
untold pain on x86 when remus was added...
OK, we will provide such documentation soon.
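
To make the "explicitly versioned chunks" idea concrete, here is a minimal sketch of what a per-chunk header could look like. Everything in it is an assumption for illustration: struct chunk_hdr, CHUNK_HVM_CONTEXT, CHUNK_MAX_VERSION, chunk_write() and chunk_read() are hypothetical names, not part of this patch or of xg_save_restore.h, and the header is stored in host byte order.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record header; not taken from the patch or from libxc. */
struct chunk_hdr {
    uint32_t type;     /* hypothetical chunk id, e.g. memory or HVM context */
    uint32_t version;  /* bumped whenever the payload layout changes */
    uint64_t length;   /* number of payload bytes following the header */
};

#define CHUNK_HVM_CONTEXT 2u   /* hypothetical type id */
#define CHUNK_MAX_VERSION 1u   /* newest layout this reader understands */

/* Append header + payload to buf; returns bytes written, 0 if it does not fit. */
static size_t chunk_write(uint8_t *buf, size_t cap, uint32_t type,
                          uint32_t version, const void *payload, uint64_t len)
{
    struct chunk_hdr h = { type, version, len };

    if ( cap < sizeof(h) || len > cap - sizeof(h) )
        return 0;
    memcpy(buf, &h, sizeof(h));
    memcpy(buf + sizeof(h), payload, (size_t)len);
    return sizeof(h) + (size_t)len;
}

/* Parse one header; returns 0 on success, -1 if truncated or too new. */
static int chunk_read(const uint8_t *buf, size_t avail, struct chunk_hdr *out)
{
    if ( avail < sizeof(*out) )
        return -1;
    memcpy(out, buf, sizeof(*out));
    if ( out->version > CHUNK_MAX_VERSION ||
         out->length > avail - sizeof(*out) )
        return -1;
    return 0;
}
```

With a header like this the receiver does not have to assume that the HVM context follows the last memory iteration: it dispatches on the type field and can reject a chunk whose version it does not know.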

+/******************************************************************************
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation;
+ * version 2.1 of the License.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA
+ *
+ * Copyright (c) 2013, Samsung Electronics
+ */
+
+#include <inttypes.h>
+#include <errno.h>
+#include <xenctrl.h>
+#include <xenguest.h>
+
+#include <unistd.h>
+#include <xc_private.h>
+#include <xc_dom.h>
+#include "xc_bitops.h"
+#include "xg_private.h"
+
+/* Guest RAM base */
+#define GUEST_RAM_BASE 0x80000000
+/*
+ *  XXX: Use correct definition for RAM base when the following patch
+ *  xen: arm: 64-bit guest support and domU FDT autogeneration
+ *  will be upstreamed.
+ */
+
+#define DEF_MAX_ITERS          29 /* limit us to 30 times round loop   */
+#define DEF_MAX_FACTOR         3  /* never send more than 3x p2m_size  */
+#define DEF_MIN_DIRTY_PER_ITER 50 /* dirty page count to define last iter */
+#define DEF_PROGRESS_RATE      50 /* progress bar update rate */
+
+/* Enable this macro for debug only: "static" migration instead of live */
+/*
+#define DISABLE_LIVE_MIGRATION
+*/
I don't think this is needed, the caller can be hacked if necessary.
OK, I'll remove it. It is an internal hack.

+
+/* Enable this macro for debug only: additional debug info */
+/*
+#define ARM_MIGRATE_VERBOSE
+*/
Likewise.
OK

+/* ============ Memory ============= */
+static int save_memory(xc_interface *xch, int io_fd, uint32_t dom,
+                       struct save_callbacks *callbacks,
+                       uint32_t max_iters, uint32_t max_factor,
+                       guest_params_t *params)
+{
+    int live =  !!(params->flags & XCFLAGS_LIVE);
+    int debug =  !!(params->flags & XCFLAGS_DEBUG);
+    xen_pfn_t i;
+    char reportbuf[80];
+    int iter = 0;
+    int last_iter = !live;
+    int total_dirty_pages_num = 0;
+    int dirty_pages_on_prev_iter_num = 0;
+    int count = 0;
+    char *page = 0;
+    xen_pfn_t *busy_pages = 0;
+    int busy_pages_count = 0;
+    int busy_pages_max = 256;
+
+    DECLARE_HYPERCALL_BUFFER(unsigned long, to_send);
+
+    xen_pfn_t start = params->start_gpfn;
+    const xen_pfn_t end = params->max_gpfn;
+    const xen_pfn_t mem_size = end - start;
+
+    if ( debug )
+    {
+        IPRINTF("(save mem) start=%llx end=%llx!\n", start, end);
+    }
FYI you don't need the {}'s for cases like this.
Right, we don't need the {}. We added them because we were not sure whether this macro could expand to nothing.

is if ( debug ) IPRINTF(...) not the equivalent of DPRINTF?
This equivalence is not obvious to me, because in the current code we obtain the debug flag from the XCFLAGS_DEBUG mask (set when the --debug option is passed).
If they are equivalent, I'll use DPRINTF.

+
+    if ( live )
+    {
+        if ( xc_shadow_control(xch, dom, XEN_DOMCTL_SHADOW_OP_ENABLE_LOGDIRTY,
+                    NULL, 0, NULL, 0, NULL) < 0 )
+        {
+            ERROR("Couldn't enable log-dirty mode !\n");
+            return -1;
+        }
+
+        max_iters  = max_iters  ? : DEF_MAX_ITERS;
+        max_factor = max_factor ? : DEF_MAX_FACTOR;
+
+        if ( debug )
+            IPRINTF("Log-dirty mode enabled, max_iters=%d, max_factor=%d!\n",
+                    max_iters, max_factor);
+    }
+
+    to_send = xc_hypercall_buffer_alloc_pages(xch, to_send,
+                                              NRPAGES(bitmap_size(mem_size)));
+    if ( !to_send )
+    {
+        ERROR("Couldn't allocate to_send array!\n");
+        return -1;
+    }
+
+    /* send all pages on first iter */
+    memset(to_send, 0xff, bitmap_size(mem_size));
+
+    for ( ; ; )
+    {
+        int dirty_pages_on_current_iter_num = 0;
+        int frc;
+        iter++;
+
+        snprintf(reportbuf, sizeof(reportbuf),
+                 "Saving memory: iter %d (last sent %u)",
+                 iter, dirty_pages_on_prev_iter_num);
+
+        xc_report_progress_start(xch, reportbuf, mem_size);
+
+        if ( (iter > 1 &&
+              dirty_pages_on_prev_iter_num < DEF_MIN_DIRTY_PER_ITER) ||
+             (iter == max_iters) ||
+             (total_dirty_pages_num >= mem_size*max_factor) )
+        {
+            if ( debug )
+                IPRINTF("Last iteration");
+            last_iter = 1;
+        }
+
+        if ( last_iter )
+        {
+            if ( suspend_and_state(callbacks->suspend, callbacks->data,
+                                   xch, dom) )
+            {
+                ERROR("Domain appears not to have suspended");
+                return -1;
+            }
+        }
+        if ( live && iter > 1 )
+        {
+            frc = xc_shadow_control(xch, dom, XEN_DOMCTL_SHADOW_OP_CLEAN,
+                                    HYPERCALL_BUFFER(to_send), mem_size,
+                                                     NULL, 0, NULL);
+            if ( frc != mem_size )
+            {
+                ERROR("Error peeking shadow bitmap");
+                xc_hypercall_buffer_free_pages(xch, to_send,
+                                               NRPAGES(bitmap_size(mem_size)));
+                return -1;
+            }
+        }
+
+        busy_pages = malloc(sizeof(xen_pfn_t) * busy_pages_max);
+
+        for ( i = start; i < end; ++i )
+        {
+            if ( test_bit(i - start, to_send) )
+            {
+                page = xc_map_foreign_range(xch, dom, PAGE_SIZE, PROT_READ, i);
On x86 we try to do this in batches to reduce the overheads. I suppose
that could be a future enhancement.
OK. I'll add a TODO comment.
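
As a rough sketch of the batching idea, the dirty bitmap could first be turned into arrays of gpfns, so that each array can be handed to a single bulk mapping call instead of one xc_map_foreign_range() per page. This is only an illustration: BATCH_SIZE, bitmap_test() and next_dirty_batch() are hypothetical names, and bitmap_test() is a local stand-in for libxc's test_bit() so that the sketch builds without the Xen headers.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define BATCH_SIZE 1024  /* hypothetical; not the value the x86 code uses */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Local stand-in for libxc's test_bit(), so the sketch builds without Xen. */
static int bitmap_test(const unsigned long *map, uint64_t nr)
{
    return (map[nr / BITS_PER_WORD] >> (nr % BITS_PER_WORD)) & 1;
}

/*
 * Fill batch[] with up to BATCH_SIZE dirty gpfns, scanning from *cursor.
 * Bit (gpfn - start) of to_send marks gpfn as dirty, as in save_memory().
 * Returns the number collected and advances *cursor; 0 means the scan
 * has reached end.
 */
static size_t next_dirty_batch(const unsigned long *to_send,
                               uint64_t start, uint64_t end,
                               uint64_t *cursor, uint64_t *batch)
{
    size_t n = 0;

    while ( *cursor < end && n < BATCH_SIZE )
    {
        if ( bitmap_test(to_send, *cursor - start) )
            batch[n++] = *cursor;
        (*cursor)++;
    }
    return n;
}
```

The caller would loop until next_dirty_batch() returns 0, mapping and writing out one batch per pass. Which bulk mapping function to use for that step is left open here.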

+                if ( !page )
+                {
+                    /* This page is mapped elsewhere, should be resent later */
What does this ("busy") mean? When does this happen?
Oh, it looks like a workaround for a problem, and we may not need it any more.
On a dom0 with 2 CPUs, xc_map_foreign_range can return NULL even though the guest page exists. A later call to xc_map_foreign_range on the same page returns a valid pointer.
I'll remove this.
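
For reference while reviewing the loop, the last-iteration test from save_memory() quoted above can be read as a standalone predicate. This is a sketch only: is_last_iter() is a hypothetical helper that does not exist in the patch, and the parameter names are shortened. The conditions themselves are copied from the quoted code.

```c
#include <stdint.h>

#define DEF_MIN_DIRTY_PER_ITER 50 /* same value as in the quoted patch */

/*
 * Hypothetical helper mirroring the stop conditions of the save loop.
 * A non-live save starts with last_iter set, so it always returns 1.
 */
static int is_last_iter(int live, uint32_t iter, uint32_t max_iters,
                        uint32_t max_factor, uint64_t mem_size,
                        uint64_t dirty_prev, uint64_t total_dirty)
{
    if ( !live )
        return 1;
    /* Few pages were dirtied in the previous round: converged. */
    if ( iter > 1 && dirty_prev < DEF_MIN_DIRTY_PER_ITER )
        return 1;
    /* Iteration budget used up. */
    if ( iter == max_iters )
        return 1;
    /* Already sent max_factor times the guest memory size. */
    if ( total_dirty >= mem_size * max_factor )
        return 1;
    return 0;
}
```

Since the quoted loop increments iter before the test, iter is 1 on the first pass, so with max_iters = DEF_MAX_ITERS (29) the loop runs at most 29 rounds.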


[...]
+
+static int restore_guest_params(xc_interface *xch, int io_fd,
+                                uint32_t dom, guest_params_t *params)
+{
[...]
+    if ( xc_domain_setmaxmem(xch, dom, maxmemkb) )
+    {
+        ERROR("Can't set memory map");
+        return -1;
+    }
+
+    /* Set max. number of vcpus as max_vcpu_id + 1 */
+    if ( xc_domain_max_vcpus(xch, dom, params->max_vcpu_id + 1) )
Does the higher level toolstack not take care of vcpus and maxmem? I
thought so. I think this is how it should be.

In my tests, the guest config information is not transferred from the high-level toolstack in the ARM case. On the receiving side of the migration, the toolstack always creates a new domain with vcpus=1 and the default max. mem. So we have to send the guest information in our local guest_params structure (at the beginning of migration). This is an easy way to make "xl save" and "xl migrate" work without modifying the libxl level, but maybe you have another idea?
Also, the toolstack_restore callback is not set (NULL) in the ARM case.


+    {
+        ERROR("Can't set max vcpu number for domain");
+        return -1;
+    }
+
+    return 0;
+}
+[...]
diff --git a/tools/misc/Makefile b/tools/misc/Makefile
index 17aeda5..0824100 100644
--- a/tools/misc/Makefile
+++ b/tools/misc/Makefile
@@ -11,7 +11,7 @@ HDRS     = $(wildcard *.h)
TARGETS-y := xenperf xenpm xen-tmem-list-parse gtraceview gtracestat xenlockprof xenwatchdogd xencov
  TARGETS-$(CONFIG_X86) += xen-detect xen-hvmctx xen-hvmcrash xen-lowmemd xen-mfndump
-TARGETS-$(CONFIG_MIGRATE) += xen-hptool
+TARGETS-$(CONFIG_X86) += xen-hptool
  TARGETS := $(TARGETS-y)
SUBDIRS := $(SUBDIRS-y)
@@ -23,7 +23,7 @@ INSTALL_BIN := $(INSTALL_BIN-y)
  INSTALL_SBIN-y := xen-bugtool xen-python-path xenperf xenpm xen-tmem-list-parse gtraceview \
        gtracestat xenlockprof xenwatchdogd xen-ringwatch xencov
  INSTALL_SBIN-$(CONFIG_X86) += xen-hvmctx xen-hvmcrash xen-lowmemd xen-mfndump
-INSTALL_SBIN-$(CONFIG_MIGRATE) += xen-hptool
+INSTALL_SBIN-$(CONFIG_X86) += xen-hptool
  INSTALL_SBIN := $(INSTALL_SBIN-y)
INSTALL_PRIVBIN-y := xenpvnetboot
You could resend these last two separately and they could probably go
straight in.
OK

Ian.



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


