
Re: [Xen-devel] Problems using xl migrate



On Mon, 2014-11-24 at 12:06 +0000, M A Young wrote:
> 
> On Mon, 24 Nov 2014, George Dunlap wrote:
> 
> > On Mon, Nov 24, 2014 at 12:07 AM, M A Young <m.a.young@xxxxxxxxxxxx> wrote:
> >> On Sat, 22 Nov 2014, M A Young wrote:
> >>
> >>> While investigating a bug reported on Red Hat Bugzilla
> >>> https://bugzilla.redhat.com/show_bug.cgi?id=1166461
> >>> I discovered the following
> >>>
> >>> xl migrate --debug domid localhost does indeed fail for Xen 4.4 pv (the
> >>> bug report is for Xen 4.3 hvm) when xl migrate domid localhost works.
> >>> There are actually two issues here
> >>>
> >>> * the segfault in libxl-save-helper --restore-domain (as reported in the
> >>> bug above) occurs if the guest memory is 1024M (on my 4G box) and is
> >>> presumably because the allocated memory eventually runs out
> >>
> >>
> >> I have found a bit more out about this. The segfault is at line 1378 of
> >> tools/libxc/xc_domain_restore.c which is
> >>                 DPRINTF("************** pfn=%lx type=%lx gotcs=%08lx "
> >>                         "actualcs=%08lx\n", pfn, pagebuf->pfn_types[pfn],
> >>                         csum_page(region_base + (i + curbatch)*PAGE_SIZE),
> >>                         csum_page(buf));
> >> and is because pfn in pagebuf->pfn_types[pfn] is beyond the end of the
> >> array. This occurs in the verification phase.
> >>
> >>> * the segfault doesn't occur if the guest memory is 128M, but the
> >>> migration still fails. The first attached file contains the log from a run
> >>> with xl -v migrate --debug domid localhost (with mfn and duplicated lines
> >>> stripped out to make the size manageable).
> >>
> >>
> >> The difference actually seems to be down to how active the VM is rather
> >> than the memory size (my small memory test system was doing very little,
> >> my larger system was a full OS install). In the non-segfault case the
> >> problem was the printf and printf_info commands in the create_domain()
> >> routine in tools/libxl/xl_cmdimpl.c. As xl migrate uses stdout to pass
> >> status messages back from the restoring dom0, these commands cause an
> >> unexpected message. If you move them onto stderr then the migration
> >> completes in the non-segfault case.
> >
> > Good job tracking those down -- are there patches in the works?
> 
> I have a partial patch for the printf printf_info problem, which works for 
> me but doesn't cover printing the info in sxp format.
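
To illustrate the stdout point for anyone following along, here is a
minimal sketch of why a stray printf breaks a sender that expects an exact
status message on the control channel. The token "ready\n" and the name
control_stream_ok() are made up for the example; they are not the strings
or functions xl actually uses.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical status token; NOT the real xl migrate protocol string. */
static const char READY_TOKEN[] = "ready\n";

/*
 * Sender-side check: the bytes read from the receiver's stdout must be
 * exactly the expected token. Returns 1 on an exact match, 0 otherwise.
 */
int control_stream_ok(const char *received)
{
    assert(received != NULL);
    return strcmp(received, READY_TOKEN) == 0;
}
```

With this check, "ready\n" on its own is accepted, but the same token
preceded by any informational output is rejected, which is why moving
such output to stderr keeps the channel clean.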

Am I right that this is all related to the use of --debug and/or -vm, and
that a plain "xl migrate" works OK?

It's still a bug, of course, but that would change the severity (though I'm
not sure to what extent).

>  I haven't worked out 
> what is leading up to the segfault yet.
> 
>       Michael Young
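
On the segfault: the description quoted above (a subscript past the end of
pagebuf->pfn_types) is an out-of-bounds read. The sketch below only shows
the general shape of a guarded lookup. struct page_batch and
batch_type_at() are invented for the example and are not libxc types or
functions, and which subscript is actually correct in
xc_domain_restore.c is not settled in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a page buffer; not libxc's structure. */
struct page_batch {
    unsigned long *pfn_types;  /* one entry per page in the batch */
    size_t nr_entries;         /* number of valid entries in pfn_types */
};

/*
 * Guarded lookup: store the entry in *out and return 1 when idx is in
 * range; return 0 without touching memory when it is not.
 */
int batch_type_at(const struct page_batch *b, size_t idx, unsigned long *out)
{
    assert(b != NULL && out != NULL);
    if (b->pfn_types == NULL || idx >= b->nr_entries)
        return 0;
    *out = b->pfn_types[idx];
    return 1;
}
```

A debug print that went through a helper like this would report the bad
index instead of crashing the restore helper.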


