
Re: [Xen-devel] [PATCH 5/5] X86/vMCE: guest broken page handling when migration



Sorry for the delayed response.

On Tue, Nov 20, 2012 at 11:57 AM, George Dunlap <george.dunlap@xxxxxxxxxxxxx> wrote:
On 22/10/12 20:26, Shriram Rajagopalan wrote:
On Mon, Oct 22, 2012 at 3:54 AM, Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx> wrote:
George Dunlap writes ("Re: [Xen-devel] [PATCH 5/5] X86/vMCE: guest broken page handling when migration"):
> On Fri, Oct 19, 2012 at 4:14 PM, Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx> wrote:
> > This looks plausible to me, as far as the tools go.  Can you explain
> > how you have tested this ?  Did you manage to do any tests of the
> > remus codepaths ?
>
> I'm pretty sure that this shouldn't cause any problems with Remus.  If
> it's difficult for Jinsong to test Remus, I think probably OK to
> commit it, and then revert it if the Remus guys have any problems.

OK.


You can easily test it with Remus. With xl, the memory replication functionality is already
in place, so the xl remus command should work.

Should "xl remus $domain localhost" work?  How would one test the fail-over mechanism?  Are there any other requirements for the guest, the kernel, &c?


xl remus $domain localhost should work, and xl remus $domain remoteHost will work too.
At least that was the case when the patches went in a few months ago.

If you are using a 3.0+ kernel for the guest, things should work.
There are no other requirements for the guest or its kernel. HVM-wise it should work, since it's basically
doing xl migrate -l $domain $host continually (just memory).
 
I just ran the above command on xen-unstable, and after 10 minutes or so the guest crashed with some kind of a kernel double-fault.


As for the kernel double-fault, I have one possible candidate. The dirty page compression could
potentially be causing this, since IIRC there were some proposals to re-introduce superpages, etc.,
which changed a lot of xc_domain_restore code.

So try "xl remus -u $domain localhost" [disables checkpoint compression].

Are we running any remus stuff in our testing infrastructure?


Nope. Wish I could though :).

shriram 
 -George


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 

