
Re: [Xen-devel] [PATCH Remus v3 3/3] libxc/restore: implement Remus checkpointed restore



On 05/14/2015 12:16 AM, Andrew Cooper wrote:
On 13/05/15 09:35, Yang Hongyang wrote:
With Remus, the restore flow should be:
the first full migration stream -> { periodic checkpoint streams }

Signed-off-by: Yang Hongyang <yanghy@xxxxxxxxxxxxxx>
Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
CC: Ian Campbell <Ian.Campbell@xxxxxxxxxx>
CC: Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx>
CC: Wei Liu <wei.liu2@xxxxxxxxxx>
---
  tools/libxc/xc_sr_common.h  |  13 +++++
  tools/libxc/xc_sr_restore.c | 124 +++++++++++++++++++++++++++++++++++++++-----
  2 files changed, 125 insertions(+), 12 deletions(-)

diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h
index f8121e7..3740b89 100644
--- a/tools/libxc/xc_sr_common.h
+++ b/tools/libxc/xc_sr_common.h
@@ -208,6 +208,19 @@ struct xc_sr_context
              /* Plain VM, or checkpoints over time. */
              bool checkpointed;

+            /* Currently buffering records between checkpoints */
+            bool buffer_all_records;
+
+/*
+ * With Remus, we buffer the records sent by primary at checkpoint,

"sent by the primary"

+ * in case the primary fails, we can recover from the last
+ * checkpoint state.
+ * This should be enough because the primary only sends dirty
+ * pages at a checkpoint.
+ */
+#define MAX_BUF_RECORDS 1024

As a general point, it occurs to me that this might be better as a
linked list, rather than having a hard limit.  A burst of activity on
the primary could potentially hit this limit.

There are only a few records at each checkpoint in my tests, mostly under 10,
probably because I don't perform many operations in the guest. This limit can
be adjusted later after further testing.
Even with a linked list, we would still need a limit IMO; otherwise it could
eat too much memory.
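
For illustration, a capped linked list along the lines discussed could look
like the sketch below (the node type, helper name, and cap handling are
hypothetical, not part of this patch):

    /* Hypothetical capped linked list of buffered records (sketch only). */
    struct buffered_rec
    {
        struct xc_sr_record rec;
        struct buffered_rec *next;
    };

    /*
     * Append a copy of 'rec'; refuse once the cap is hit so memory stays
     * bounded even during a burst of activity on the primary.
     */
    static int buffer_append(struct buffered_rec **head, unsigned int *count,
                             const struct xc_sr_record *rec)
    {
        struct buffered_rec *node;

        if ( *count >= MAX_BUF_RECORDS )
            return -1;

        node = malloc(sizeof(*node));
        if ( !node )
            return -1;

        node->rec = *rec;
        node->next = NULL;

        /* Walk to the tail and link the new node. */
        while ( *head )
            head = &(*head)->next;
        *head = node;

        ++*count;
        return 0;
    }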


+            struct xc_sr_record *buffered_records[MAX_BUF_RECORDS];

Your data handling will be much simpler if this were struct
xc_sr_record *buffered_records; and you do a one-time allocation of
MAX_BUF_RECORDS * sizeof(struct xc_sr_record), although it will require
an index integer as well.

Good point; it will avoid the alloc/free of each struct xc_sr_record. I
will also add a 'buffered_rec_num' to the context.
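
A minimal sketch of the replacement context fields under that scheme
(illustrative only; 'buffered_rec_num' is the name proposed above):

    /* Illustrative replacement context fields (sketch). */
    struct xc_sr_record *buffered_records;  /* One-time calloc() of
                                               MAX_BUF_RECORDS entries */
    unsigned int buffered_rec_num;          /* Entries currently in use */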


It would probably be best to factor out setup() and cleanup() just like
you did for the save side.

OK.
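
Something along these lines, mirroring the save side's shape (a rough
sketch with hypothetical function bodies, not the final patch):

    /* Hypothetical setup()/cleanup() pair for the restore side (sketch). */
    static int setup(struct xc_sr_context *ctx)
    {
        ctx->restore.buffered_records =
            calloc(MAX_BUF_RECORDS, sizeof(struct xc_sr_record));
        if ( !ctx->restore.buffered_records )
        {
            ERROR("Unable to allocate memory for buffered records");
            return -1;
        }

        return 0;
    }

    static void cleanup(struct xc_sr_context *ctx)
    {
        unsigned int i;

        /* Free the payload of any records still buffered at exit. */
        for ( i = 0; i < ctx->restore.buffered_rec_num; ++i )
            free(ctx->restore.buffered_records[i].data);

        free(ctx->restore.buffered_records);
        ctx->restore.buffered_records = NULL;
    }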


+
              /*
               * Xenstore and Console parameters.
               * INPUT:  evtchn & domid
diff --git a/tools/libxc/xc_sr_restore.c b/tools/libxc/xc_sr_restore.c
index 0e512ec..85534a8 100644
--- a/tools/libxc/xc_sr_restore.c
[...]
@@ -541,7 +618,27 @@ static int restore(struct xc_sr_context *ctx)
      {
          rc = read_record(ctx, &rec);
          if ( rc )
-            goto err;
+        {
+            if ( ctx->restore.buffer_all_records )
+                goto err_buf;

Please can we choose a label sufficiently different to "err".

"resume_from_checkpoint" perhaps?

I think "remus_failover" is better? If you don't have objections, I will
use it as the label.


~Andrew

+            else
+                goto err;
+        }
+
+#ifdef XG_LIBXL_HVM_COMPAT
+        if ( ctx->dominfo.hvm &&
+             (rec.type == REC_TYPE_END || rec.type == REC_TYPE_CHECKPOINT) )
+        {
+            rc = read_qemu(ctx);
+            if ( rc )
+            {
+                if ( ctx->restore.buffer_all_records )
+                    goto err_buf;
+                else
+                    goto err;
+            }
+        }
+#endif

          rc = process_record(ctx, &rec);
          if ( rc )
@@ -549,15 +646,11 @@ static int restore(struct xc_sr_context *ctx)

      } while ( rec.type != REC_TYPE_END );

-#ifdef XG_LIBXL_HVM_COMPAT
-    if ( ctx->dominfo.hvm )
-    {
-        rc = read_qemu(ctx);
-        if ( rc )
-            goto err;
-    }
-#endif
-
+ err_buf:
+    /*
+     * With Remus, if we reach here there must have been an error on
+     * the primary; fail over from the last checkpoint state.
+     */
      rc = ctx->restore.ops.stream_complete(ctx);
      if ( rc )
          goto err;
@@ -571,6 +664,13 @@ static int restore(struct xc_sr_context *ctx)
      PERROR("Restore failed");

   done:
+    for ( i = 0; i < MAX_BUF_RECORDS; i++ )
+    {
+        if ( ctx->restore.buffered_records[i] )
+        {
+            free(ctx->restore.buffered_records[i]->data);
+            free(ctx->restore.buffered_records[i]);
+        }
+    }
      free(ctx->restore.populated_pfns);
      rc = ctx->restore.ops.cleanup(ctx);
      if ( rc )
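
To make the buffering cycle concrete, here is a rough sketch of how records
could be buffered between checkpoints and replayed on a CHECKPOINT record,
using the flat array discussed above (the helper names are illustrative, not
the exact patch):

    /* Hypothetical buffer-and-flush helpers (sketch only). */
    static int buffer_record(struct xc_sr_context *ctx,
                             struct xc_sr_record *rec)
    {
        if ( ctx->restore.buffered_rec_num >= MAX_BUF_RECORDS )
        {
            ERROR("Too many records within one checkpoint");
            return -1;
        }

        /* Take ownership of rec->data; the caller must not free it. */
        ctx->restore.buffered_records[ctx->restore.buffered_rec_num++] = *rec;
        return 0;
    }

    static int flush_buffered_records(struct xc_sr_context *ctx)
    {
        unsigned int i;
        int rc = 0;

        /* Replay everything received since the previous checkpoint. */
        for ( i = 0; !rc && i < ctx->restore.buffered_rec_num; ++i )
            rc = process_record(ctx, &ctx->restore.buffered_records[i]);

        ctx->restore.buffered_rec_num = 0;
        return rc;
    }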



--
Thanks,
Yang.
