[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH Remus v5 2/2] libxc/restore: implement Remus checkpointed restore



On Thu, 2015-05-14 at 14:17 +0100, Andrew Cooper wrote:
> On 14/05/15 14:05, Ian Campbell wrote:
> > On Thu, 2015-05-14 at 18:06 +0800, Yang Hongyang wrote:
> >> With Remus, the restore flow should be:
> >> the first full migration stream -> { periodically restore stream }
> >>
> >> Signed-off-by: Yang Hongyang <yanghy@xxxxxxxxxxxxxx>
> >> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> >> CC: Ian Campbell <Ian.Campbell@xxxxxxxxxx>
> >> CC: Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx>
> >> CC: Wei Liu <wei.liu2@xxxxxxxxxx>
> >> ---
> >>  tools/libxc/xc_sr_common.h  |  14 ++++++
> >>  tools/libxc/xc_sr_restore.c | 113 
> >> ++++++++++++++++++++++++++++++++++++++++----
> >>  2 files changed, 117 insertions(+), 10 deletions(-)
> >>
> >> diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h
> >> index f8121e7..3bf27f1 100644
> >> --- a/tools/libxc/xc_sr_common.h
> >> +++ b/tools/libxc/xc_sr_common.h
> >> @@ -208,6 +208,20 @@ struct xc_sr_context
> >>              /* Plain VM, or checkpoints over time. */
> >>              bool checkpointed;
> >>  
> >> +            /* Currently buffering records between a checkpoint */
> >> +            bool buffer_all_records;
> >> +
> >> +/*
> >> + * With Remus, we buffer the records sent by the primary at checkpoint,
> >> + * in case the primary will fail, we can recover from the last
> >> + * checkpoint state.
> >> + * This should be enough because primary only send dirty pages at
> >> + * checkpoint.
> > I'm not sure how it then follows that 1024 buffers is guaranteed to be
> > enough, unless there is something on the sending side arranging it to be
> > so?
> >
> >> + */
> >> +#define MAX_BUF_RECORDS 1024
> >> +            struct xc_sr_record *buffered_records;
> >> +            unsigned buffered_rec_num;
> >> +
> >>              /*
> >>               * Xenstore and Console parameters.
> >>               * INPUT:  evtchn & domid
> >> diff --git a/tools/libxc/xc_sr_restore.c b/tools/libxc/xc_sr_restore.c
> >> index 9ab5760..8468ffc 100644
> >> --- a/tools/libxc/xc_sr_restore.c
> >> +++ b/tools/libxc/xc_sr_restore.c
> >> @@ -468,11 +468,69 @@ static int handle_page_data(struct xc_sr_context 
> >> *ctx, struct xc_sr_record *rec)
> >>      return rc;
> >>  }
> >>  
> >> +static int process_record(struct xc_sr_context *ctx, struct xc_sr_record 
> >> *rec);
> >> +static int handle_checkpoint(struct xc_sr_context *ctx)
> >> +{
> >> +    xc_interface *xch = ctx->xch;
> >> +    int rc = 0;
> >> +    unsigned i;
> >> +
> >> +    if ( !ctx->restore.checkpointed )
> >> +    {
> >> +        ERROR("Found checkpoint in non-checkpointed stream");
> >> +        rc = -1;
> > Is it usual in migrv2 to set errno as well?
> 
> If a relevant errno is to be had.

EINVAL or ENOSYS perhaps?
> 
> There are a lot of cases which are waiting for some real libxc error
> codes before they can propagate numeric error information, although in
> all cases the log messages will be accurate (and hopefully helpful).
> 
> ~Andrew
> 
> >
> >> +        goto err;
> >> +    }
> >> +
> >> +    if ( ctx->restore.buffer_all_records )
> >> +    {
> >> +        IPRINTF("All records buffered");
> >> +
> >> +        /*
> >> +         * We need to set buffer_all_records to false in
> >> +         * order to process records instead of buffer records.
> >> +         * buffer_all_records should be set back to true after
> >> +         * we successfully processed all records.
> >> +         */
> >> +        ctx->restore.buffer_all_records = false;
> > I'm not personally a fan of changing global state in order to simulate
> > the action of what should be a parameter to a function.
> >
> > Preferable IMHO would be to have process_record gain a parameter to
> > override the ctx state but become an internal helper (perhaps with a
> > name change) and then have API function process_record and
> > process_buffered_records or some such which call it in the right way.
> >
> > Andy may have a differing opinion though.
> 
> Hmm yes - it would be nice to split the buffering logic away from the
> processing logic.
> 
> However, the two are slightly related.
> 
> Perhaps a process_or_buffer_record() helper, and removing all buffering
> logic from process_record().

That seems like it would work.




_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.