Re: [Xen-devel] [PATCH v3 COLOPre 16/26] tools/libx{l, c}: add back channel to libxc
On 07/01/2015 07:01 PM, Andrew Cooper wrote:
> On 01/07/15 11:42, Ian Campbell wrote:
>> On Wed, 2015-07-01 at 10:38 +0800, Yang Hongyang wrote:
>>> On 06/30/2015 06:10 PM, Ian Campbell wrote:
>>>> On Thu, 2015-06-25 at 14:25 +0800, Yang Hongyang wrote:
>>>>> We need to send secondary's dirty page pfns back to primary.
>>>>
>>>> In v2 Ian asked (<21888.2988.774072.32946@xxxxxxxxxxxxxxxxxxxxxxxx>):
>>>>
>>>>     In the pdf
>>>>     http://www.socc2013.org/home/program/a3-dong.pdf?attredirects=0
>>>>     linked from the wiki page
>>>>     http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping
>>>>     it says that the secondary keeps a copy of the original contents
>>>>     of its dirty pages. So I don't understand why you need to send
>>>>     the dirty bitmap to the primary.
>>>>
>>>> Which I don't see an answer for in my archive. Have I missed (or
>>>> misplaced) the answer?
>>>
>>> Sorry, seems that I misplaced the answer to:
>>> [PATCH v2 COLOPre 09/13] tools/libxl: Update libxl_save_msgs_gen.pl to support return data from xl to xc
>>>
>>> > Thanks for this. I would have some comments on the details, but first
>>> > I want to properly understand your use case. So while I'm the author
>>> > and maintainer of this save helper, I won't review this in detail just
>>> > yet. I'm following the thread about what this is for...
>>>
>>> We need to send secondary's dirty page pfn back to primary. Primary
>>> will then send pages that are both dirtied on primary/secondary to
>>> secondary. In this way the secondary's memory will be consistent with
>>> primary.
>>>
>>> As we discussed in
>>> [PATCH v2 COLOPre 04/13] tools/libxc: export xc_bitops.h
>>> If we move this operation to libxc layer, this patch could be dropped.
>>
>> This doesn't seem to be a response to Ian's question which I quoted
>> above.
>>
>> The crux of the question is that the design contained in those links
>> does not appear to require a back channel, because it does not require
>> a dirty bitmap to go from secondary to primary. Asserting a need to do
>> so does not answer the question.
>
> It very definitely does require a dirty bitmap moving from the
> secondary to the primary.
> Let's see whether I can try explaining it in a different way.
>
> In COLO mode, both VMs are running, and are considered in sync if the
> visible network traffic is identical.
>
> After some time, they fall out of sync. At this point, the two VMs
> have definitely diverged. Let's call the primary dirty bitmap set A,
> and the secondary dirty bitmap set B. Sets A and B are different.
>
> Under normal migration, the page data for set A will be sent from the
> primary to the secondary.
>
> However, the set difference B - A (let's call this C) is out-of-date on
> the secondary (with respect to the primary) and will not be sent by the
> primary, as it was not memory dirtied by the primary. The secondary
> needs the page data for C to reconstruct an exact copy of the primary
> at the checkpoint.
>
> The secondary cannot calculate C as it doesn't know A. Instead, the
> secondary must send B to the primary, at which point the primary
> calculates the union of A and B (let's call this D), which is all the
> pages dirtied by either the primary or the secondary, and sends all
> page data covered by D.
>
> In the general case, D is a superset of both A and B. Without the
> backchannel dirty bitmap, a COLO checkpoint can't reconstruct a valid
> copy of the primary.

Thank you Andy! The explanation is clear enough. Do you mind if I copy
your comments into the code comment or commit message, with your
Signed-off-by?

> ~Andrew
>
> P.S. I have suggested an investigation of the CoW support in Xen as a
> potential optimisation, as this could be used to prevent the secondary
> losing C, but this is very definitely future work and not appropriate
> at this point in COLO.

--
Thanks,
Yang.
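
For readers following the set argument above, here is a minimal sketch of the A/B/C/D relationship. It is only an illustration and not the libxc implementation: the dirty bitmaps are modelled as Python integers (bit n set means pfn n is dirty), and the helper names `bitmap_from_pfns`, `pfns_from_bitmap` and `checkpoint_send_set` are hypothetical, as are the example pfn values.

```python
def bitmap_from_pfns(pfns):
    """Build a toy dirty bitmap: bit n is set when pfn n is dirty."""
    bitmap = 0
    for pfn in pfns:
        bitmap |= 1 << pfn
    return bitmap


def pfns_from_bitmap(bitmap):
    """Return the sorted list of pfns whose bit is set in the bitmap."""
    return [pfn for pfn in range(bitmap.bit_length()) if (bitmap >> pfn) & 1]


def checkpoint_send_set(primary_dirty, secondary_dirty):
    """D = A union B: every page the primary must send at a checkpoint."""
    return primary_dirty | secondary_dirty


# Hypothetical example values, chosen only for illustration.
a = bitmap_from_pfns([1, 2, 5])   # A: pages dirtied on the primary
b = bitmap_from_pfns([2, 7, 9])   # B: pages dirtied on the secondary

# C = B - A: stale on the secondary, but never dirtied on the primary,
# so a plain migration-style pass over A would not resend these pages.
c = b & ~a

# D can only be computed where both A and B are known, which is why
# B has to travel over the back channel to the primary.
d = checkpoint_send_set(a, b)

print("C =", pfns_from_bitmap(c))
print("D =", pfns_from_bitmap(d))
```

The secondary only holds B, so it cannot derive C on its own; the primary, once it has received B, can compute D and send the page data for every pfn in it.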