
Re: [Xen-devel] [PATCH Remus v2 00/10] Remus support for Migration-v2



On 11/05/15 11:48, Hongyang Yang wrote:
>
> On 05/11/2015 05:00 PM, Andrew Cooper wrote:
>> On 11/05/15 07:28, Hongyang Yang wrote:
>>> On 05/09/2015 02:12 AM, Andrew Cooper wrote:
>>>> On 08/05/15 10:33, Yang Hongyang wrote:
>>>>> This patchset implements Remus support for Migration v2, but
>>>>> without memory compression.
> [...]
>>
>>>
>>> <last iter of memory>
>>>
>>>> end_of_checkpoint()
>>>> Checkpoint record
>>>
>>> ctx->save.callbacks->postcopy()
>>> This callback should not be omitted; it does some necessary work
>>> before resuming the primary (such as calling the Remus devices'
>>> preresume callbacks to ensure the disk data is consistent) and then
>>> resumes the primary guest.  I think this callback should be renamed
>>> to ctx->save.callbacks->resume().
>>
>> That looks to be a useful cleanup (and answers one of my questions of
>> what exactly postcopy was)
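>>
>> For reference, the relevant part of struct save_callbacks in
>> tools/libxc/include/xenguest.h would become (a sketch only; field
>> names and comments approximate, not a final interface):
>>
>>     struct save_callbacks {
>>         /* Suspend the domain before the final memory iteration. */
>>         int (*suspend)(void *data);
>>
>>         /* Formerly postcopy(): commit Remus device state (disk
>>          * replication etc.) and unpause the primary domain. */
>>         int (*resume)(void *data);
>>
>>         /* Decide whether another checkpoint should be taken. */
>>         int (*checkpoint)(void *data);
>>
>>         /* Opaque token passed back to every callback. */
>>         void *data;
>>     };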
>>
>>>
>>>>               ctx->save.callbacks->checkpoint()
>>>>                                   libxl qemu record
>>>
>>> Maybe we should add another callback to send the qemu record instead
>>> of using the checkpoint callback.  We could call it
>>> ctx->save.callbacks->save_qemu().
>>
>> This is another layering violation.  libxc should not prescribe what
>> libxl might or might not do.  One example we are experimenting with in
>> XenServer at the moment is support for multiple emulators attached to a
>> single domain, which would necessitate two LIBXL_EMULATOR records to be
>> sent per checkpoint.  libxl might also want to send an updated json blob
>> or such.
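>>
>> To illustrate the sort of thing libxl may want to do per checkpoint
>> (hypothetical helpers invented for this sketch; not existing code):
>>
>>     static int write_libxl_checkpoint_records(libxl__stream_state *s)
>>     {
>>         unsigned int i;
>>
>>         /* One EMULATOR record per attached emulator. */
>>         for (i = 0; i < s->nr_emulators; ++i)
>>             if (write_emulator_record(s, i))
>>                 return -1;
>>
>>         /* Perhaps an updated json config blob as well. */
>>         if (write_json_record(s))
>>             return -1;
>>
>>         return write_end_of_checkpoint_record(s);
>>     }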
>
> OK, so we'd better not introduce a save_qemu() callback.
>
>>
>>> Then in the checkpoint callback, we only call the Remus devices'
>>> commit callbacks (which release the network buffer etc.) and then
>>> decide whether to do another checkpoint or quit the checkpointed
>>> stream.
>>> With Remus, the checkpoint callback only waits for 200ms (the
>>> interval can be specified by -i) and then returns.
>>> With COLO, the checkpoint callback asks the COLO proxy whether we
>>> need to do a checkpoint, and returns when the COLO proxy module
>>> indicates a checkpoint is needed.
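>>>
>>> Schematically (hypothetical names; the interval is the -i option,
>>> 200ms by default):
>>>
>>>     static int checkpoint_cb(void *data)
>>>     {
>>>         struct remus_state *rs = data;
>>>
>>>         if (rs->colo)
>>>             /* Block until the COLO proxy indicates that a
>>>              * checkpoint is needed. */
>>>             return colo_proxy_wait_for_checkpoint(rs);
>>>
>>>         usleep(rs->interval_ms * 1000);
>>>         return 1;  /* take another checkpoint */
>>>     }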
>>
>> That sounds like COLO wants a should_checkpoint() callback which
>> separates the decision to make a checkpoint from the logic of
>> implementing a checkpoint.
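>>
>> i.e. something like (names invented for the sketch):
>>
>>     /* Decision only: returns 0 to end the checkpointed stream, or 1
>>      * to take another checkpoint.  Remus would sleep here; COLO
>>      * would query its proxy here. */
>>     int (*should_checkpoint)(void *data);
>>
>>     /* Mechanism only: quiesce devices, write records, commit. */
>>     int (*checkpoint)(void *data);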
>
> We currently use the checkpoint callback to do the should_checkpoint()
> thing; libxc checks the checkpoint callback's return value.

But that causes a chicken & egg problem.

I am planning to use a CHECKPOINT record to synchronise the transfer of
ownership of the FD between libxc and libxl.  Therefore, a CHECKPOINT
record must be in the stream ahead of the checkpoint() callback, as
libxl will then write/read some records of its own.

As a result, the checkpoint() callback itself can't be used to gate
whether a CHECKPOINT record is written by libxc.
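
Roughly the ordering I have in mind (pseudo-C; write_checkpoint_record()
and friends are placeholders rather than existing functions):

    /* libxc, save side, once per checkpoint: */
    rc = write_checkpoint_record(ctx);  /* stream now belongs to libxl */
    if (rc)
        goto err;

    /* libxl writes/reads its own records on the same fd, and owns the
     * fd until checkpoint() returns. */
    rc = ctx->save.callbacks->checkpoint(ctx->save.callbacks->data);

    /* The CHECKPOINT record in the stream, not the callback's return
     * value, is what tells the restore side to hand the fd over. */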

>
>>
>>>
>>>>                                   ...
>>>>                                   libxl end-of-checkpoint record
>>>>               ctx->save.callbacks->checkpoint() returns
>>>> start_of_checkpoint()
>>>
>>> ctx->save.callbacks->suspend()
>>>
>>>> <memory>
>>>> end_of_checkpoint()
>>>> Checkpoint record
>>>> etc...
>>>>
>>>> This will eventually allow both libxc and libxl to send checkpoint
>>>> data
>>>> (and by the looks of it, remove the need for postcopy()).  With this
>>>> libxc/remus work it is fine to use XG_LIBXL_HVM_COMPAT to cover the
>>>> current qemu situation, but I would prefer not to be also retrofitting
>>>> libxc checkpoint records when doing the libxl/migv2 work.
>>>>
>>>> Does this look plausible for Remus (and eventually COLO) support?
>>>
>>> Given the comments above, I would suggest the save flow below:
>>>
>>> libxc writes:                   libxl writes:
>>>
>>> live migration:
>>> Image Header
>>> Domain Header
>>> start_of_stream()
>>> start_of_checkpoint()
>>> <live memory>
>>> ctx->save.callbacks->suspend()
>>> <last iter memory>
>>> end_of_checkpoint()
>>> if ( checkpointed )
>>>    End of Checkpoint record
>>>    /* If the restore side receives this record, the input fd should
>>>       be handed to libxl */
>>> else
>>>    goto end
>>>
>>> loop of checkpointed stream:
>>> ctx->save.callbacks->resume()
>>> ctx->save.callbacks->save_qemu()
>>>                                  libxl qemu record
>>>                                  ...
>>>                                  libxl end-of-checkpoint record
>>> /* If the restore side receives this record, the input fd should be
>>>    handed to libxc */
>>> ctx->save.callbacks->save_qemu() returns
>>> ctx->save.callbacks->checkpoint()
>>> start_of_checkpoint()
>>> ctx->save.callbacks->suspend()
>>> <memory>
>>> end_of_checkpoint()
>>> End of Checkpoint record
>>> goto 'loop of checkpointed stream'
>>>
>>> end:
>>> END record
>>> /* If the restore side receives this record, the input fd should be
>>>    handed to libxl */
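>>>
>>> In pseudo-C, the libxc side of this flow would look roughly like the
>>> following (helper names are placeholders):
>>>
>>>     static int save(struct xc_sr_context *ctx)
>>>     {
>>>         struct save_callbacks *cb = ctx->save.callbacks;
>>>
>>>         write_headers(ctx);            /* Image + Domain headers */
>>>         start_of_stream(ctx);
>>>         start_of_checkpoint(ctx);
>>>         send_live_memory(ctx);
>>>         cb->suspend(cb->data);
>>>         send_last_iteration(ctx);
>>>         end_of_checkpoint(ctx);
>>>
>>>         if (!ctx->save.checkpointed)
>>>             goto end;
>>>
>>>         for (;;) {
>>>             write_checkpoint_record(ctx);  /* restore: fd -> libxl */
>>>             cb->resume(cb->data);
>>>             cb->save_qemu(cb->data);  /* libxl records + its EOC */
>>>             if (cb->checkpoint(cb->data) <= 0)
>>>                 break;
>>>             start_of_checkpoint(ctx);
>>>             cb->suspend(cb->data);
>>>             send_dirty_memory(ctx);
>>>             end_of_checkpoint(ctx);
>>>         }
>>>
>>>      end:
>>>         return write_end_record(ctx);  /* restore: fd -> libxl */
>>>     }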
>>>
>>>
>>> In order to keep it simple, we can keep the current
>>> ctx->save.callbacks->checkpoint() as it is, which does the save_qemu
>>> thing, calls the Remus devices' commit callbacks, and then decides
>>> whether we need a checkpoint.  We could also combine
>>> ctx->save.callbacks->resume() with ctx->save.callbacks->checkpoint();
>>> with only one checkpoint() callback, we would do the following things
>>> (sketched after the list):
>>>   - Call Remus devices preresume callbacks
>>>   - Resume the primary
>>>   - Save qemu records
>>>   - Call Remus devices commit callbacks
>>>   - Decide whether we need a checkpoint
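>>>
>>> As a sketch (libxl side; names illustrative only):
>>>
>>>     static int combined_checkpoint_cb(void *data)
>>>     {
>>>         libxl__remus_state *rs = data;
>>>
>>>         if (remus_devices_preresume(rs))   /* disk consistency */
>>>             return -1;
>>>         if (resume_primary(rs))
>>>             return -1;
>>>         if (save_qemu_records(rs))         /* libxl's own records */
>>>             return -1;
>>>         if (remus_devices_commit(rs))      /* release net buffer */
>>>             return -1;
>>>
>>>         return should_checkpoint(rs);      /* 1 = checkpoint again */
>>>     }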
>>>
>>> Overall, there are 3 options for the save flow:
>>> 1. keep the current callbacks, rename postcopy() to resume()
>>> 2. split the checkpoint() callback to save_qemu() and checkpoint()
>>> 3. combine the current postcopy() and checkpoint()
>>> Which one do you think is the best?
>>
>> I have a 4th alternative in mind, but would like your feedback on my
>> comments in this email first.
>
> So what's the 4th alternative?

I have some corrections to my patch series based on David's feedback
and your comments.  After that, it should hopefully be far easier to
describe.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

