[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v6 COLO 06/15] libxc/save: support COLO save
On 09/06/2015 04:15, Yang Hongyang wrote: > > > On 06/08/2015 09:04 PM, Andrew Cooper wrote: >> On 08/06/15 04:45, Yang Hongyang wrote: >>> call callbacks->get_dirty_pfn() after suspend primary vm to >>> get dirty pages on secondary vm, and send pages both dirty on >>> primary/secondary to secondary. >>> >>> Signed-off-by: Yang Hongyang <yanghy@xxxxxxxxxxxxxx> >>> Signed-off-by: Wen Congyang <wency@xxxxxxxxxxxxxx> >>> CC: Andrew Cooper <andrew.cooper3@xxxxxxxxxx> >>> --- >>> tools/libxc/xc_sr_save.c | 49 >>> +++++++++++++++++++++++++++++++++++++++++++++++- >>> 1 file changed, 48 insertions(+), 1 deletion(-) >>> >>> diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c >>> index d63b783..cda61ed 100644 >>> --- a/tools/libxc/xc_sr_save.c >>> +++ b/tools/libxc/xc_sr_save.c >>> @@ -515,6 +515,31 @@ static int send_memory_live(struct >>> xc_sr_context *ctx) >>> return rc; >>> } >>> >>> +static int update_dirty_bitmap(uint8_t *(*get_dirty_pfn)(void *), >>> void *data, >>> + unsigned long p2m_size, unsigned >>> long *bitmap) >> >> This function should take a ctx rather than having the caller expand 3 >> parameters. Also, "update_dirty_bitmap" is a little misleading, as it >> isn't querying the hypervisor for the dirty bitmap. > > ok. (Merging the other thread) > how about merge_secondary_dirty_bitmap()? Much better! > >> >>> +{ >>> + uint64_t *pfn_list; >>> + uint64_t count, i; >>> + uint64_t pfn; >>> + >>> + pfn_list = (uint64_t *)get_dirty_pfn(data); >> >> This looks like a recipe for width-errors. The get_dirty_pfn() call >> should take a pointer to a struct for it to fill. > > but the size is unknown for the caller.pfn_list[0] is the count of > pfn. > >> >>> + assert(pfn_list); >> >> This should turn into an error rather than an abort(). > > Even if there are no dirty pages on secondary, pfn_list shouldn't be > NULL, it's just that pfn_list[0] will be 0. if pfn_list is NULL, > there might be unexpected error happened. get_dirty_pfn() should be declared alongside a struct pfn_data { uint64_t count; uint64_t *pfns; }; and this function here should create one of these on the stack and pass it by pointer to get_dirty_pfn(). I might also be tempted to rename this to get_remote_logdirty() or similar, to indicate that it is a source of logdirty data from something other than the current hypervisor. > >> >>> + >>> + count = pfn_list[0]; >>> + for (i = 0; i < count; i++) { >> >> style >> >>> + pfn = pfn_list[i + 1]; >>> + if (pfn > p2m_size) { >>> + errno = EINVAL; >>> + return -1; >>> + } >>> + >>> + set_bit(pfn, bitmap); >>> + } >>> + >>> + free(pfn_list); >>> + return 0; >>> +} >>> + >>> /* >>> * Suspend the domain and send dirty memory. >>> * This is the last iteration of the live migration and the >>> @@ -555,6 +580,19 @@ static int suspend_and_send_dirty(struct >>> xc_sr_context *ctx) >>> >>> bitmap_or(dirty_bitmap, ctx->save.deferred_pages, >>> ctx->save.p2m_size); >>> >>> + if ( !ctx->save.live && ctx->save.callbacks->get_dirty_pfn ) >>> + { >> >> Shouldn't get_dirty_pfn be mandatory for COLO streams (even if it is a >> noop to start with) ? > > It should be mandatory, it shouldn't be noop under COLO. perhaps we > should > add sanity check at the beginning. But problem is save side do not > have a param > passed from libxl to indicate the stream type(like checkpointed_stream in > restore side). So we may need to add another XCFLAGS? Currently there is > XCFLAGS_CHECKPOINTED which represents Remus, we might need to change > this to > XCFLAGS_STREAM_REMUS > XCFLAGS_STREAM_COLO > so that we can know what kind of stream we are handling? checkpointed_stream started out as a bugfix for a legacy stream migration breakage. Really, this information should have been passed right from the start. It would probably be best to take the enum{} suggested elsewhere and make it a top level ctx item, and have it present for both save and restore, with sutable parameters passed in from the top. (When I am finally able to take out the legacy code, there is going to be a severe pruning/consolidation of the parameters.) ~Andrew > >> >> ~Andrew >> >>> + rc = update_dirty_bitmap(ctx->save.callbacks->get_dirty_pfn, >>> + ctx->save.callbacks->data, >>> + ctx->save.p2m_size, >>> + dirty_bitmap); >>> + if ( rc ) >>> + { >>> + PERROR("Failed to get secondary vm's dirty pages"); >>> + goto out; >>> + } >>> + } >>> + >>> rc = send_dirty_pages(ctx, stats.dirty_count + >>> ctx->save.nr_deferred_pages); >>> if ( rc ) >>> goto out; >>> @@ -784,7 +822,16 @@ static int save(struct xc_sr_context *ctx, >>> uint16_t guest_type) >>> if ( rc ) >>> goto err; >>> >>> - ctx->save.callbacks->postcopy(ctx->save.callbacks->data); >>> + rc = >>> ctx->save.callbacks->postcopy(ctx->save.callbacks->data); >>> + if ( !rc ) { >>> + if ( !errno ) >>> + { >>> + /* Postcopy request failed (without errno, >>> using EINVAL) */ >>> + errno = EINVAL; >>> + } >>> + rc = -1; >>> + goto err; >>> + } >>> >>> rc = >>> ctx->save.callbacks->checkpoint(ctx->save.callbacks->data); >>> if ( rc <= 0 ) >> >> . >> > _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |