
Re: [Xen-devel] Buggy interaction of live migration and p2m updates



At 15:16 +0000 on 27 Nov (1417097812), Andrew Cooper wrote:
> On 27/11/14 15:00, Tim Deegan wrote:
> > At 10:54 +0000 on 21 Nov (1416563695), Andrew Cooper wrote:
> >> On 21/11/14 10:43, Jan Beulich wrote:
> >>>>>> On 20.11.14 at 19:28, <andrew.cooper3@xxxxxxxxxx> wrote:
> >>>> Should the guest change the p2m structure during live migration, the
> >>>> toolstack ends up with a stale p2m with a non-p2m frame in the middle,
> >>>> resulting in bogus cross-referencing.  Should the guest change an entry
> >>>> in the p2m, the p2m frame itself will be resent as it would be marked as
> >>>> dirty in the logdirty bitmap, but the target pfn will remain unsent and
> >>>> probably stale on the receiving side.
> >>> MMU_MACHPHYS_UPDATE processing marks the page being changed
> >>> as dirty. Perhaps guest_physmap_{add,remove}_page() (or certain
> >>> callers thereof) should do so too?
> >>>
> >>> Jan
> >>>
> >> This is certainly needed to fix HVM ballooning and live migration
> >> issues
> > Agreed.  We should be marking HVM frames dirty when they have any p2m
> > update that changes the mapping.  Maybe in paging_write_p2m_entry() or
> > the various implementation-specific versions.
> >
> >> , although now you point it out, it applies just as much to PV
> >> guests as HVM guests.
> >>
> >> I believe this might allow the toolstack to avoid keeping a second copy
> >> of the p2m.
> > I don't think so. :(  Because the toolstack is reading the guest's own
> > p2m, there is still a race where:
> >
> >  - guest calls physmap_add_page, as part of which Xen marks the pfn dirty;
> >  - toolstack reads + cleans the dirty bitmap;
> >  - toolstack reads the guest p2m and DTRT for this pfn;
> >  - guest updates its p2m with the result of the physmap_add_page call.
> >
> > After that, if the guest doesn't dirty that pfn again it won't be
> > fixed up.
> 
> It will (I think).
> 
> In the above scenario, step 3 will (certainly in v2) fail the p2m/m2p
> consistency check.  This error is currently fatal, but needs to be made
> nonfatal during the live part, and mark the pfn as deferred.

That doesn't work if the guest is mapping a new entry: the guest p2m
will still show the pfn as unallocated, which is a valid state, so the
consistency check won't fail.

There's a similar race if the guest wants to move a frame from one pfn
to another, unless you mandate that the guest must do the m2p update
after all p2m updates.
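
To make both races concrete, here is a toy Python model.  It is only a
sketch and not Xen or libxc code: every name in it is hypothetical, and
it assumes that an m2p update marks the target pfn dirty while a guest
write to its own p2m does not.

```python
# Toy model only: not Xen or libxc code; every name here is hypothetical.
UNALLOCATED = None


class Migration:
    def __init__(self):
        self.m2p = {}          # mfn -> pfn, maintained by Xen
        self.guest_p2m = {}    # pfn -> mfn, maintained by the guest
        self.dirty = set()     # logdirty bitmap, modelled as a set of pfns
        self.sent = set()      # pfns whose contents the toolstack has sent
        self.deferred = set()  # pfns that failed the p2m/m2p check

    def m2p_update(self, mfn, pfn):
        """Xen-side update; assumed to mark the target pfn dirty."""
        self.m2p[mfn] = pfn
        self.dirty.add(pfn)

    def p2m_update(self, pfn, mfn):
        """Guest writes its own p2m; the target pfn is NOT marked dirty."""
        self.guest_p2m[pfn] = mfn

    def toolstack_pass(self):
        """Read and clean the dirty bitmap, then process each dirty pfn."""
        batch, self.dirty = self.dirty, set()
        for pfn in batch:
            mfn = self.guest_p2m.get(pfn, UNALLOCATED)
            if mfn is UNALLOCATED:
                continue                # looks unallocated: nothing to check
            if self.m2p.get(mfn) != pfn:
                self.deferred.add(pfn)  # inconsistent: would be retried later
                continue
            self.sent.add(pfn)


def new_mapping_race():
    m = Migration()
    m.m2p_update(mfn=100, pfn=5)   # Xen side of physmap_add_page
    m.toolstack_pass()             # bitmap cleaned, guest p2m read
    m.p2m_update(pfn=5, mfn=100)   # guest records the result afterwards
    return m


def move_frame(m2p_last):
    m = Migration()
    # Frame 100 starts at pfn 5 and has already been sent once.
    m.p2m_update(5, 100)
    m.m2p_update(100, 5)
    m.toolstack_pass()
    if m2p_last:
        # Guest finishes all p2m updates before the m2p update.
        m.p2m_update(5, UNALLOCATED)
        m.p2m_update(6, 100)
        m.toolstack_pass()
        m.m2p_update(100, 6)
        m.toolstack_pass()
    else:
        # m2p update first; the toolstack runs before the p2m catches up.
        m.m2p_update(100, 6)
        m.toolstack_pass()
        m.p2m_update(5, UNALLOCATED)
        m.p2m_update(6, 100)
    return m
```

In this model new_mapping_race() and move_frame(m2p_last=False) both
end with the new pfn neither sent, deferred nor dirty, so nothing would
ever fix it up.  With move_frame(m2p_last=True) the final m2p update
re-dirties pfn 6 once the p2m is already consistent, and it gets sent.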

Tim.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel