[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 4 of 5 V2] tools/libxl: Control network buffering in remus callbacks



Shriram Rajagopalan writes ("[Xen-devel] [PATCH 4 of 5 V2] tools/libxl: Control 
network buffering in remus      callbacks"):
> tools/libxl: Control network buffering in remus callbacks
...
> @@ -1256,10 +1272,36 @@ static void libxl__remus_domain_checkpoi
>  static void remus_checkpoint_dm_saved(libxl__egc *egc,
>                                        libxl__domain_suspend_state *dss, int 
> rc)
>  {

I think there are some error handling problems in this function.

> -    /* REMUS TODO: Wait for disk and memory ack, release network buffer */
> -    /* REMUS TODO: make this asynchronous */
> -    assert(!rc); /* REMUS TODO handle this error properly */
> -    usleep(dss->remus_ctx->interval * 1000);
> +    /*
> +     * REMUS TODO: Wait for disk and explicit memory ack (through restore
> +     * callback from remote) before releasing network buffer.
> +     */
> +    libxl__remus_ctx *remus_ctx = dss->remus_ctx;
> +    struct timespec epoch;
> +    int ret;
> +    STATE_AO_GC(dss->ao);
> +
> +    if (rc) {
> +        LOG(ERROR, "Failed to save device model. Terminating Remus..");
> +        libxl__xc_domain_saverestore_async_callback_done(egc,
> +                                                         &dss->shs, rc);

Firstly, it would be much clearer if you used the "goto out" error
handling style, rather than calling ..._callback_done and return
separately each time.

Secondly, here you return rc, a libxl error code.

> +        ret = libxl__remus_netbuf_release_prev_epoch(gc, dss->domid,
> +                                                     remus_ctx);
> +        if (ret) {
> +            libxl__xc_domain_saverestore_async_callback_done(egc,
> +                                                             &dss->shs,
> +                                                             ret);

Here you return "ret" which is what exactly ?

> +            return;
...
>      libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, 1);

And here the original function returns "1".

The specification for the callback is in xenguest.h's struct
save_callbacks, the "checkpoint" member.  It says the return value
must be 0 or 1.

So if there is a failure I think you need to stash the error code
somewhere because libxc won't pass it through for you.

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.