[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH Remus v3 1/3] libxc/save: implement Remus checkpointed save



On 13/05/15 09:34, Yang Hongyang wrote:
> With Remus, the save flow should be:
> live migration->{ periodically save(checkpointed save) }
>
> Signed-off-by: Yang Hongyang <yanghy@xxxxxxxxxxxxxx>
> CC: Ian Campbell <Ian.Campbell@xxxxxxxxxx>
> CC: Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx>
> CC: Wei Liu <wei.liu2@xxxxxxxxxx>
> CC: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> ---
>  tools/libxc/xc_sr_save.c | 73 
> +++++++++++++++++++++++++++++++++++-------------
>  1 file changed, 54 insertions(+), 19 deletions(-)
>
> diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
> index 1d0a46d..3890c4d 100644
> --- a/tools/libxc/xc_sr_save.c
> +++ b/tools/libxc/xc_sr_save.c
> @@ -57,6 +57,16 @@ static int write_end_record(struct xc_sr_context *ctx)
>  }
>  
>  /*
> + * Writes an CHECKPOINT record into the stream.
> + */
> +static int write_checkpoint_record(struct xc_sr_context *ctx)
> +{
> +    struct xc_sr_record checkpoint = { REC_TYPE_CHECKPOINT, 0, NULL };
> +
> +    return write_record(ctx, &checkpoint);
> +}
> +
> +/*
>   * Writes a batch of memory as a PAGE_DATA record into the stream.  The batch
>   * is constructed in ctx->save.batch_pfns.
>   *
> @@ -467,6 +477,9 @@ static int send_domain_memory_live(struct xc_sr_context 
> *ctx)
>      DECLARE_HYPERCALL_BUFFER_SHADOW(unsigned long, dirty_bitmap,
>                                      &ctx->save.dirty_bitmap_hbuf);
>  
> +    if ( !ctx->save.live )
> +        goto last_iter;
> +

I really want to avoid this code getting into the same spiders web as it
used to be with goto, but I cannot suggest a better alternative.

However, please leave a comment explaining that subsequent checkpointed
loop skip the live part and pause straight away.

>      rc = enable_logdirty(ctx);
>      if ( rc )
>          goto out;
> @@ -505,6 +518,7 @@ static int send_domain_memory_live(struct xc_sr_context 
> *ctx)
>              goto out;
>      }
>  
> + last_iter:
>      rc = suspend_domain(ctx);
>      if ( rc )
>          goto out;
> @@ -667,29 +681,50 @@ static int save(struct xc_sr_context *ctx, uint16_t 
> guest_type)
>      if ( rc )
>          goto err;
>  
> -    rc = ctx->save.ops.start_of_checkpoint(ctx);
> -    if ( rc )
> -        goto err;
> +    do {
> +        rc = ctx->save.ops.start_of_checkpoint(ctx);
> +        if ( rc )
> +            goto err;
>  
> -    if ( ctx->save.live )
> -        rc = send_domain_memory_live(ctx);
> -    else
> -        rc = send_domain_memory_nonlive(ctx);
> +        if ( ctx->save.live || ctx->save.checkpointed )
> +            rc = send_domain_memory_live(ctx);
> +        else
> +            rc = send_domain_memory_nonlive(ctx);
>  
> -    if ( rc )
> -        goto err;
> +        if ( rc )
> +            goto err;
>  
> -    if ( !ctx->dominfo.shutdown ||
> -         (ctx->dominfo.shutdown_reason != SHUTDOWN_suspend) )
> -    {
> -        ERROR("Domain has not been suspended");
> -        rc = -1;
> -        goto err;
> -    }
> +        if ( !ctx->dominfo.shutdown ||
> +            (ctx->dominfo.shutdown_reason != SHUTDOWN_suspend) )
> +        {
> +            ERROR("Domain has not been suspended");
> +            rc = -1;
> +            goto err;
> +        }
>  
> -    rc = ctx->save.ops.end_of_checkpoint(ctx);
> -    if ( rc )
> -        goto err;
> +        rc = ctx->save.ops.end_of_checkpoint(ctx);
> +        if ( rc )
> +            goto err;
> +
> +        if ( ctx->save.checkpointed ) {
> +            if ( ctx->save.live ) {

Coding style.  libxc is admittedly inconsistent, but follows hypervisor
style, so { on the next line.

> +                /* End of live migration, we are sending checkpointed stream 
> */
> +                ctx->save.live = false;
> +            }
> +
> +            rc = write_checkpoint_record(ctx);
> +            if ( rc )
> +                goto err;
> +
> +            ctx->save.callbacks->postcopy(ctx->save.callbacks->data);
> +
> +            rc = ctx->save.callbacks->checkpoint(ctx->save.callbacks->data);
> +            if ( rc > 0 )
> +                IPRINTF("Checkpointed save");

This should probably be progress information , so
xc_report_progress_single()

Otherwise, Reviewed-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

(I think this code is a bit of an improvement over xc_domain_save.c)

> +            else
> +                ctx->save.checkpointed = false;
> +        }
> +    } while ( ctx->save.checkpointed );
>  
>      xc_report_progress_single(xch, "End of stream");
>  


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.