
Re: [Xen-devel] Migration memory corruption - PV backends need to quiesce



On 27/06/14 19:15, Tim Deegan wrote:
> At 18:28 +0100 on 27 Jun (1403890088), David Vrabel wrote:
>> On 27/06/14 17:51, Andrew Cooper wrote:
>>> Overall, it would appear that there needs to be a hook for all PV
>>> drivers to force quiescence.  In particular, a backend must guarantee to
>>> unmap all active grant maps (so the frames get properly reflected in the
>>> dirty bitmap), and never process subsequent requests (so no new frames
>>> appear dirty in the bitmap after the guest has been paused).
>> I think this would be much too expensive for snapshots and things like
>> remus.  Waiting for all outstanding I/O could take seconds.
> The other option we talked about yesterday was a flag to the log-dirty
> operation that reports all grant-mapped frames as dirty.  Then the
> tools would add such frames to the final pass.  That could take a long
> time too, of course.
>
> I'm not sure how you would synchronize the final pass with backends
> that were doing grant copy operations -- you could exclude copies for
> the duration, but I'm not sure what that would look like for the
> backend.
>
> Tim.

Hmm - I have a crazy idea.

As identified by David, it is impractical to wait for backends to
complete any outstanding requests and unmap the grants, as this could
take seconds.

However, what the backend can do very quickly is guarantee that it will
never start processing any further requests, and never mark
subsequently-completed requests as complete in the ring.

This means that the backend will not submit any new grant copy
operations, or regular copies to/from persistent grants, and even if a
hardware device has a DMA mapping of an active grant, the request will
not be marked as completed in the ring. Even if the pages targeted by
that DMA end up dirty, the frontend will replay the uncompleted requests
in the ring and be mostly fine[1].
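
To make the intended backend behaviour concrete, here is a minimal sketch
of the quiesce rule. It is a toy model only: `toy_ring`, `toy_backend` and
every function name are hypothetical and are not the real Xen ring macros
or any existing backend API. It shows just the two guarantees above: no
new requests are picked up, and completions that arrive after the quiesce
are not published in the ring.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model of a PV ring; NOT the real Xen ring layout. */
struct toy_ring {
    unsigned req_prod;  /* requests published by the frontend */
    unsigned req_cons;  /* requests picked up by the backend */
    unsigned rsp_prod;  /* responses published by the backend */
};

struct toy_backend {
    struct toy_ring ring;
    bool quiesced;      /* set once, never cleared on the source side */
    unsigned inflight;  /* picked up, I/O not yet finished */
    unsigned withheld;  /* finished after quiesce, response not published */
};

/* The cheap operation: just flip a flag, do not wait for any I/O. */
void backend_quiesce(struct toy_backend *be)
{
    be->quiesced = true;
}

/* Pick up pending requests. Returns how many were newly started. */
unsigned backend_poll(struct toy_backend *be)
{
    unsigned started = 0;

    if (be->quiesced)
        return 0;       /* never start processing further requests */

    while (be->ring.req_cons != be->ring.req_prod) {
        be->ring.req_cons++;
        be->inflight++;
        started++;
    }
    return started;
}

/* Called when the underlying device finishes one in-flight request. */
void backend_io_done(struct toy_backend *be)
{
    assert(be->inflight > 0);
    be->inflight--;

    if (be->quiesced) {
        be->withheld++; /* do not mark it complete in the ring */
        return;
    }
    be->ring.rsp_prod++;
}

/* Requests without a published response, which the frontend would replay. */
unsigned frontend_requests_to_replay(const struct toy_ring *ring)
{
    return ring->req_prod - ring->rsp_prod;
}
```

In this model a request whose I/O finishes after the quiesce still counts
as "to replay", which is exactly the case the footnote worries about.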

Combined with a XEN_DOMCTL_SHADOW_OP_PEEK_INCLUDING_ACTIVE_GRANTS (name
subject to improvement), the migration code can guarantee that there
will be no corruption of the ring, and no relevant corruption of guest
memory.
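
As a rough illustration of what such an operation would mean for the final
pass, the sketch below treats a frame as needing to be sent if it is set in
the log-dirty bitmap or is currently grant-mapped. This is an assumption
about the semantics, not an existing interface: the function names and the
separate `grant_mapped` bitmap are hypothetical, and in practice the
hypervisor might well fold the two together before the tools ever see
them.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical helper: compute the set of frames to send in the final
 * pass as (logged dirty) OR (actively grant-mapped).
 */
void final_pass_bitmap(unsigned long *to_send,
                       const unsigned long *log_dirty,
                       const unsigned long *grant_mapped,
                       size_t nr_frames)
{
    size_t words = (nr_frames + BITS_PER_WORD - 1) / BITS_PER_WORD;
    size_t i;

    assert(to_send && log_dirty && grant_mapped);

    for (i = 0; i < words; i++)
        to_send[i] = log_dirty[i] | grant_mapped[i];
}

/* Test whether a given frame number is set in a bitmap. */
int frame_is_set(const unsigned long *bitmap, size_t pfn)
{
    return (int)((bitmap[pfn / BITS_PER_WORD] >> (pfn % BITS_PER_WORD)) & 1UL);
}
```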

I *believe* this covers all the cases, and doesn't depend on waiting for
the backends to fully complete all outstanding requests.

~Andrew

[1] The caveat is a pending read followed by a write of the same block
which, once replayed, might be out-of-order if the write did take effect
on the source side.  Any frontends which care about this must wait for
all write requests to complete before entering the suspend state.
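
For a frontend that does care, the rule in [1] could look roughly like the
following. Again this is only a sketch with hypothetical names
(`toy_frontend`, `fe_may_suspend`), not code from any existing frontend:
outstanding reads are allowed to be replayed, but suspend is refused while
any write is still outstanding.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-device bookkeeping in a frontend. */
struct toy_frontend {
    unsigned reads_outstanding;
    unsigned writes_outstanding;
};

/* Account for a request placed on the ring. */
void fe_submit(struct toy_frontend *fe, bool is_write)
{
    if (is_write)
        fe->writes_outstanding++;
    else
        fe->reads_outstanding++;
}

/* Account for a response seen on the ring. */
void fe_complete(struct toy_frontend *fe, bool is_write)
{
    if (is_write) {
        assert(fe->writes_outstanding > 0);
        fe->writes_outstanding--;
    } else {
        assert(fe->reads_outstanding > 0);
        fe->reads_outstanding--;
    }
}

/* Reads may be replayed after resume; writes must have drained first. */
bool fe_may_suspend(const struct toy_frontend *fe)
{
    return fe->writes_outstanding == 0;
}
```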
