Re: [Xen-devel] [PATCH 1 of 2 V3] libxl: Remus - suspend/postflush/commit callbacks
On Fri, 2012-02-03 at 07:00 +0000, rshriram@xxxxxxxxx wrote:
> # HG changeset patch
> # User Shriram Rajagopalan <rshriram@xxxxxxxxx>
> # Date 1328251593 28800
> # Node ID 90e59c643c00c079996e13b75f89d1f0cd931a02
> # Parent c7abecc14cceb18140335ebe20faad826282cd1f
> libxl: Remus - suspend/postflush/commit callbacks
>
> * Add libxl callback functions for Remus checkpoint suspend, postflush
> (aka resume) and checkpoint commit callbacks.
> * suspend callback is a stub that just bounces off
> libxl__domain_suspend_common_callback - which suspends the domain and
> saves the devices model state to a file.
> * resume callback currently just resumes the domain (and the device model).
> * commit callback just writes out the saved device model state to the
> network and sleeps for the checkpoint interval.
> * Introduce a new public API, libxl_domain_remus_start (currently a stub)
> that sets up the network and disk buffer and initiates continuous
> checkpointing.
>
> * Future patches will augument these callbacks/functions with more
> functionalities
"augment"
> like issuing network buffer plug/unplug commands, disk checkpoint
> commands, etc.
>
> Signed-off-by: Shriram Rajagopalan <rshriram@xxxxxxxxx>
>
> diff -r c7abecc14cce -r 90e59c643c00 tools/libxl/libxl.c
> --- a/tools/libxl/libxl.c Thu Feb 02 22:46:33 2012 -0800
> +++ b/tools/libxl/libxl.c Thu Feb 02 22:46:33 2012 -0800
> @@ -471,6 +471,41 @@ libxl_vminfo * libxl_list_vm(libxl_ctx *
> return ptr;
> }
>
> +/* TODO: Explicit Checkpoint acknowledgements via recv_fd. */
> +int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
> + uint32_t domid, int send_fd, int recv_fd)
> +{
> + GC_INIT(ctx);
> + libxl_domain_type type = libxl__domain_type(gc, domid);
> + int rc = 0;
> +
> + if (info == NULL) {
> + LIBXL__LOG(ctx, LIBXL__LOG_ERROR,
> + "No remus_info structure supplied for domain %d", domid);
> + rc = ERROR_INVAL;
> + goto remus_fail;
> + }
> +
> + /* TBD: Remus setup - i.e. attach qdisc, enable disk buffering, etc */
Is it worth checking that the domain has no disks or network devices
(IOW, is this dangerous if it does)?
[...]
> @@ -791,7 +837,27 @@ int libxl__domain_suspend_common(libxl__
> }
>
> memset(&callbacks, 0, sizeof(callbacks));
> - callbacks.suspend = libxl__domain_suspend_common_callback;
> + if (r_info != NULL) {
> + /* save_callbacks:
> + * suspend - called after expiration of checkpoint interval,
> + * to *suspend* the domain.
> + *
> + * postcopy - called after the domain's dirty pages have been
> + * copied into an output buffer. We *resume* the domain
> + * & the device model, return to the caller. Caller then
> + * flushes the output buffer, while the domain continues to run.
> + *
> + * checkpoint - called after the memory checkpoint has been flushed out
> + * into the network. Send the saved device state, *wait*
> + * for checkpoint ack and *release* the network buffer (TBD).
> + * Then *sleep* for the checkpoint interval.
> + */
I think this comment would be more useful in xenguest.h next to the
callback struct.
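To illustrate what I mean, here is a rough, untested sketch of the ordering
that comment describes, with the descriptions attached to the struct fields.
The struct below is only a stand-in, not the real definition from xenguest.h:
the demo_* names, the trace helper and the "non-zero means success" return
convention are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the callback struct; only the three callback names come
 * from the patch comment, the rest is hypothetical. */
struct demo_save_callbacks {
    /* Called after expiration of the checkpoint interval,
     * to suspend the domain. */
    int (*suspend)(void *data);
    /* Called after the domain's dirty pages have been copied into an
     * output buffer. Resume the domain and the device model. */
    int (*postcopy)(void *data);
    /* Called after the memory checkpoint has been flushed out to the
     * network. Send the saved device state, then sleep for the
     * checkpoint interval. */
    int (*checkpoint)(void *data);
    void *data;
};

/* Records the order in which the callbacks fire (illustration only). */
struct demo_trace {
    char order[8];
    size_t n;
};

static int demo_record(void *data, char step)
{
    struct demo_trace *t = data;
    assert(t != NULL);
    if (t->n + 1 >= sizeof(t->order))
        return 0; /* trace full: report failure */
    t->order[t->n++] = step;
    t->order[t->n] = '\0';
    return 1;
}

static int demo_suspend(void *data)    { return demo_record(data, 'S'); }
static int demo_postcopy(void *data)   { return demo_record(data, 'P'); }
static int demo_checkpoint(void *data) { return demo_record(data, 'C'); }

/* One checkpoint epoch. Returns 1 on success, 0 as soon as a callback
 * fails (assumed convention for this sketch). */
static int demo_run_epoch(const struct demo_save_callbacks *cb)
{
    if (!cb->suspend(cb->data))
        return 0;
    /* ...dirty pages would be copied into the output buffer here... */
    if (!cb->postcopy(cb->data))
        return 0;
    /* ...output buffer would be flushed while the domain runs... */
    if (!cb->checkpoint(cb->data))
        return 0;
    return 1;
}
```

Filling a demo_save_callbacks with the three demo_* functions and a
demo_trace, then calling demo_run_epoch() once, leaves "SPC" in the trace,
i.e. suspend, then postcopy, then checkpoint.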
Otherwise the patch looks good.
Ian.
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel