
Re: [Xen-devel] Migration memory corruption - PV backends need to quiesce



At 10:47 +0100 on 30 Jun (1404121679), David Vrabel wrote:
> Shared ring updates are strictly ordered with respect to the writes to
> data pages (either via grant map or grant copy).  This means that if the
> guest sees a response in the ring it is guaranteed that all writes to
> the associated pages are also present.

Is the ring update also strictly ordered wrt the grant unmap operation?

> The write of the response and the write of the producer index are
> strictly ordered.  If the backend is in the process of writing a
> response and the page is saved then the partial (corrupt) response is
> not visible to the guest.  The write of the producer index is atomic so
> the saver cannot see a partial producer index write.

Yes.  The (suggested) problem is that live migration does not preserve
that write ordering.  So we have to worry about something like this:

1. Toolstack pauses the domain for the final pass.  Reads the final
   log-dirty (LGD) bitmap, which happens to include the shared ring but
   not the data pages.
2. Backend writes the data.
3. Backend unmaps the data page, marking it dirty.
4. Backend writes the ring.
5. Toolstack sends the ring page across in the last pass.
6. Guest resumes, seeing the I/O marked as complete, but without the
   data.
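
To make the sequence above concrete, here is a toy model of it.  This is
only a sketch: guest memory is a dict with one ring page and one data
page, the dirty bitmap is a set, and every name (simulate, final_pass,
backend_completes_io) is made up for illustration and corresponds to no
real Xen or toolstack interface.  It assumes the final pass copies
exactly the pages named in the bitmap snapshot and nothing else.

```python
# Toy model of the final migration pass.  All names are hypothetical;
# none of this is a real Xen, libxc or backend interface.

def final_pass(src, snapshot, dst):
    """Copy every page named in the dirty snapshot from src to dst."""
    for page in snapshot:
        dst[page] = src[page]


def simulate(backend_quiesced_first):
    # Source guest memory: one shared ring page and one data page.
    src = {"ring": "request pending", "data": "stale"}
    # Destination contents as left by the earlier passes.
    dst = dict(src)
    # Dirty set at the time of the pause: the ring, but not the data page.
    log_dirty = {"ring"}

    def backend_completes_io():
        src["data"] = "fresh"            # step 2: backend writes the data
        log_dirty.add("data")            # step 3: unmap marks the page dirty
        src["ring"] = "response: done"   # step 4: backend writes the ring

    if backend_quiesced_first:
        # Backend finishes (detaches) before the bitmap is read.
        backend_completes_io()

    snapshot = set(log_dirty)            # step 1: toolstack reads the bitmap

    if not backend_quiesced_first:
        # Backend is still active after the bitmap was read.
        backend_completes_io()

    final_pass(src, snapshot, dst)       # step 5: last pass sends the pages
    return dst                           # step 6: what the guest resumes with


racy = simulate(backend_quiesced_first=False)
safe = simulate(backend_quiesced_first=True)
print(racy)
print(safe)
```

In the racy run the destination ends up with the completed response on
the ring next to the stale data page, because the data page was dirtied
only after the snapshot was taken.  If the backend finishes before the
bitmap is read, both pages arrive consistent.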

ISTR working through this before and being convinced that the backends
were correctly detaching before the final pass.  That was a long time
ago, though.

Tim.



 

