Re: [Xen-devel] [PATCH 1 of 2 V3] libxl: Remus - suspend/postflush/commit callbacks
On Fri, 2012-02-03 at 07:00 +0000, rshriram@xxxxxxxxx wrote:
> # HG changeset patch
> # User Shriram Rajagopalan <rshriram@xxxxxxxxx>
> # Date 1328251593 28800
> # Node ID 90e59c643c00c079996e13b75f89d1f0cd931a02
> # Parent c7abecc14cceb18140335ebe20faad826282cd1f
> libxl: Remus - suspend/postflush/commit callbacks
>
> * Add libxl callback functions for Remus checkpoint suspend, postflush
>   (aka resume) and checkpoint commit callbacks.
> * suspend callback is a stub that just bounces off
>   libxl__domain_suspend_common_callback - which suspends the domain and
>   saves the devices model state to a file.
> * resume callback currently just resumes the domain (and the device model).
> * commit callback just writes out the saved device model state to the
>   network and sleeps for the checkpoint interval.
> * Introduce a new public API, libxl_domain_remus_start (currently a stub)
>   that sets up the network and disk buffer and initiates continuous
>   checkpointing.
>
> * Future patches will augument these callbacks/functions with more
>   functionalities

"augment"

>   like issuing network buffer plug/unplug commands, disk checkpoint
>   commands, etc.
>
> Signed-off-by: Shriram Rajagopalan <rshriram@xxxxxxxxx>
>
> diff -r c7abecc14cce -r 90e59c643c00 tools/libxl/libxl.c
> --- a/tools/libxl/libxl.c Thu Feb 02 22:46:33 2012 -0800
> +++ b/tools/libxl/libxl.c Thu Feb 02 22:46:33 2012 -0800
> @@ -471,6 +471,41 @@ libxl_vminfo * libxl_list_vm(libxl_ctx *
>      return ptr;
>  }
>
> +/* TODO: Explicit Checkpoint acknowledgements via recv_fd. */
> +int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
> +                             uint32_t domid, int send_fd, int recv_fd)
> +{
> +    GC_INIT(ctx);
> +    libxl_domain_type type = libxl__domain_type(gc, domid);
> +    int rc = 0;
> +
> +    if (info == NULL) {
> +        LIBXL__LOG(ctx, LIBXL__LOG_ERROR,
> +                   "No remus_info structure supplied for domain %d", domid);
> +        rc = ERROR_INVAL;
> +        goto remus_fail;
> +    }
> +
> +    /* TBD: Remus setup - i.e. attach qdisc, enable disk buffering, etc */

Is it worth checking that the domain has no disks or network (IOW is
this dangerous if they do?)

[...]
> @@ -791,7 +837,27 @@ int libxl__domain_suspend_common(libxl__
>      }
>
>      memset(&callbacks, 0, sizeof(callbacks));
> -    callbacks.suspend = libxl__domain_suspend_common_callback;
> +    if (r_info != NULL) {
> +        /* save_callbacks:
> +         * suspend - called after expiration of checkpoint interval,
> +         *           to *suspend* the domain.
> +         *
> +         * postcopy - called after the domain's dirty pages have been
> +         *            copied into an output buffer. We *resume* the domain
> +         *            & the device model, return to the caller. Caller then
> +         *            flushes the output buffer, while the domain continues to run.
> +         *
> +         * checkpoint - called after the memory checkpoint has been flushed out
> +         *              into the network. Send the saved device state, *wait*
> +         *              for checkpoint ack and *release* the network buffer (TBD).
> +         *              Then *sleep* for the checkpoint interval.
> +         */

I think this comment would be more useful in xenguest.h next to the
callback struct.

Otherwise the patch looks good.

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
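
For readers following the save_callbacks comment quoted above, here is a
minimal sketch of the call order it describes: suspend, then postcopy,
then checkpoint, repeated once per checkpoint interval. Everything in it
is an illustrative assumption rather than the patch's code: the demo_*
names are hypothetical, the struct only mimics the shape of a callback
table and is not the real xenguest.h definition, and the return
conventions (nonzero means success, and a nonzero checkpoint result
means "take another checkpoint") are assumed for the sketch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback table; NOT the real struct from xenguest.h. */
struct demo_save_callbacks {
    int (*suspend)(void *data);    /* assumed: nonzero on success */
    int (*postcopy)(void *data);   /* assumed: nonzero on success */
    int (*checkpoint)(void *data); /* assumed: nonzero = take another checkpoint */
    void *data;
};

/* Hypothetical per-run state: remaining checkpoints plus a trace of calls. */
struct demo_state {
    int epochs_left;
    char trace[128];
};

/* Append one letter to the trace, leaving room for the terminator. */
static void demo_trace_add(struct demo_state *st, char c)
{
    size_t n = strlen(st->trace);
    if (n + 1 < sizeof(st->trace)) {
        st->trace[n] = c;
        st->trace[n + 1] = '\0';
    }
}

/* suspend: the real callback would pause the domain and save device state. */
static int demo_suspend(void *data)
{
    demo_trace_add(data, 'S');
    return 1;
}

/* postcopy: the real callback would resume the domain and device model. */
static int demo_postcopy(void *data)
{
    demo_trace_add(data, 'P');
    return 1;
}

/* checkpoint: the real callback would send device state, then sleep. */
static int demo_checkpoint(void *data)
{
    struct demo_state *st = data;
    demo_trace_add(st, 'C');
    return --st->epochs_left > 0;
}

/* Stand-in for the save loop: returns completed rounds, or -1 on failure. */
static int demo_run(struct demo_save_callbacks *cb)
{
    int rounds = 0;
    do {
        if (!cb->suspend(cb->data))
            return -1;
        /* dirty pages would be copied into the output buffer here */
        if (!cb->postcopy(cb->data))
            return -1;
        /* the caller would flush the output buffer here */
        rounds++;
    } while (cb->checkpoint(cb->data));
    return rounds;
}
```

With epochs_left set to 2, demo_run makes two passes and the trace reads
"SPCSPC". The domain is only paused between the S and P steps of each
pass, which is why the quoted comment has the output buffer flushed
after postcopy, while the guest is already running again.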