Re: [Xen-devel] Buggy interaction of live migration and p2m updates
On 27/11/14 15:28, Tim Deegan wrote:
> At 15:16 +0000 on 27 Nov (1417097812), Andrew Cooper wrote:
>> On 27/11/14 15:00, Tim Deegan wrote:
>>> At 10:54 +0000 on 21 Nov (1416563695), Andrew Cooper wrote:
>>>> On 21/11/14 10:43, Jan Beulich wrote:
>>>>>>>> On 20.11.14 at 19:28, <andrew.cooper3@xxxxxxxxxx> wrote:
>>>>>> Should the guest change the p2m structure during live migration, the
>>>>>> toolstack ends up with a stale p2m with a non-p2m frame in the middle,
>>>>>> resulting in bogus cross-referencing. Should the guest change an entry
>>>>>> in the p2m, the p2m frame itself will be resent as it would be marked as
>>>>>> dirty in the logdirty bitmap, but the target pfn will remain unsent and
>>>>>> probably stale on the receiving side.
>>>>> MMU_MACHPHYS_UPDATE processing marks the page being changed
>>>>> as dirty. Perhaps guest_physmap_{add,remove}_page() (or certain
>>>>> callers thereof) should do so too?
>>>>>
>>>>> Jan
>>>>>
>>>> This is certainly needed to fix HVM ballooning and live migration
>>>> issues
>>> Agreed. We should be marking HVM frames dirty when they have any p2m
>>> update that changes the mapping. Maybe in paging_write_p2m_entry() or
>>> the various implementation-specific versions.
>>>
>>>> , although now you point it out, it applies just as much to PV
>>>> guests as HVM guests.
>>>>
>>>> I believe this might allow the toolstack to avoid keeping a second copy
>>>> of the p2m.
>>> I don't think so. :( Because the toolstack is reading the guest's own
>>> p2m, there is still a race where:
>>>
>>> - guest calls physmap_add_page, as part of which Xen marks the pfn dirty;
>>> - toolstack reads + cleans the dirty bitmap;
>>> - toolstack reads the guest p2m and DTRT for this pfn;
>>> - guest updates its p2m with the result of the physmap_add_page call.
>>>
>>> After that, if the guest doesn't dirty that pfn again it won't be
>>> fixed up.
>> It will (I think).
>>
>> In the above scenario, step 3 will (certainly in v2) fail the p2m/m2p
>> consistency check. This error is currently fatal, but needs to be made
>> nonfatal during the live part, and mark the pfn as deferred.
> That doesn't work if the guest is mapping a new entry: the guest p2m
> will show the pfn as unallocated, which is fine.

Hmm - so it will.

>
> There's a similar race if the guest wants to move a frame from one pfn
> to another, unless you mandate that the guest must do the m2p update
> after all p2m updates.

We certainly can't retroactively enforce that, and thus far are in a
position to provide this safety to older PV guests.

It looks like we absolutely do need a second copy of the guest's p2m.
While unfortunate, I suspect admins will begrudgingly accept extra
memory usage in preference to potential VM memory corruption on migrate.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
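
The four-step race quoted in this thread (Xen marks the pfn dirty, the toolstack reads and cleans the dirty bitmap, the toolstack consults the guest p2m, and only then does the guest update its p2m) can be illustrated with a toy model. This is a minimal sketch and not Xen or libxc code: `ToyDomain`, its `physmap_add_page` and `read_and_clean_dirty` methods, `send_round`, and `simulate` are all hypothetical names chosen to mirror the ordering discussed above, and pfn 42 is arbitrary.

```python
class ToyDomain:
    """Toy stand-in for a PV guest plus the hypervisor state it touches."""

    def __init__(self):
        self.dirty = set()       # stands in for Xen's log-dirty bitmap
        self.guest_p2m = {}      # guest-maintained pfn -> mfn table
        self._next_mfn = 0x1000  # arbitrary starting frame number

    def physmap_add_page(self, pfn):
        """Hypothetical model of the call: allocate a frame, mark pfn dirty."""
        mfn = self._next_mfn
        self._next_mfn += 1
        self.dirty.add(pfn)
        return mfn

    def read_and_clean_dirty(self):
        """Return the current dirty set and reset it, as one operation."""
        snapshot = self.dirty
        self.dirty = set()
        return snapshot


def send_round(dom, dirty_pfns, sent):
    """Toolstack side: send only pfns the guest p2m shows as populated."""
    for pfn in dirty_pfns:
        if pfn in dom.guest_p2m:
            sent.add(pfn)
        # An unpopulated pfn looks unallocated, so nothing is sent for it.


def simulate(guest_updates_p2m_late):
    """Run one dirty-bitmap round; return (missed pfns, remaining dirty set)."""
    dom, sent, pfn = ToyDomain(), set(), 42
    mfn = dom.physmap_add_page(pfn)      # step 1: Xen marks the pfn dirty
    if not guest_updates_p2m_late:
        dom.guest_p2m[pfn] = mfn         # well-ordered case: p2m updated first
    dirty = dom.read_and_clean_dirty()   # step 2: toolstack reads + cleans
    send_round(dom, dirty, sent)         # step 3: toolstack reads guest p2m
    if guest_updates_p2m_late:
        dom.guest_p2m[pfn] = mfn         # step 4: guest updates its p2m
    missed = set(dom.guest_p2m) - sent   # populated in the p2m but never sent
    return missed, dom.dirty
```

In this model, `simulate(guest_updates_p2m_late=True)` ends with the pfn populated in the guest p2m, never sent, and no dirty bit left to trigger a resend, which is the situation the thread says will not be fixed up unless the guest dirties the pfn again. With `guest_updates_p2m_late=False` the pfn is sent in the same round.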