
Re: [Xen-devel] Problem about dump-core



On 09/16/2014 10:35 AM, Wen Congyang wrote:
> Hi, everyone:
> 
> The command 'xl dump-core' fails after migration. The guest is an HVM
> guest (without PV drivers).
> I tested with the newest staging branch. Both the source and dest dom0 use
> the same kernel.

The kernel version is 3.2, and it only supports IOCTL_PRIVCMD_MMAPBATCH.

After more investigation, I found the cause: the mfn is ~0UL, and
xc_map_foreign_range() does not return NULL on the dest host.

This patch fixes the problem:

From: Wen Congyang <wency@xxxxxxxxxxxxxx>
Date: Tue, 16 Sep 2014 14:56:03 +0800
Subject: [PATCH] check if mfn is valid before checking if PRIVCMD_MMAPBATCH_MFN_ERROR is set

If mfn is invalid, ioctl(fd, IOCTL_PRIVCMD_MMAPBATCH, ..) still returns 0,
and the mfn is set to mfn | PRIVCMD_MMAPBATCH_MFN_ERROR. But if mfn is
~0UL, the OR leaves it unchanged, so pfn[i] ^ arr[i] evaluates to 0 and
the error goes undetected. So we should check whether mfn is valid before
testing pfn[i] ^ arr[i].

Signed-off-by: Wen Congyang <wency@xxxxxxxxxxxxxx>
---
 tools/libxc/xc_linux_osdep.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/tools/libxc/xc_linux_osdep.c b/tools/libxc/xc_linux_osdep.c
index a19e4b6..baa36e6 100644
--- a/tools/libxc/xc_linux_osdep.c
+++ b/tools/libxc/xc_linux_osdep.c
@@ -333,6 +333,13 @@ static void *linux_privcmd_map_foreign_bulk(xc_interface *xch, xc_osdep_handle h
 
         for ( i = 0; i < num; ++i )
         {
+            if ( arr[i] & PRIVCMD_MMAPBATCH_MFN_ERROR )
+            {
+                /* Invalid mfn, and pfn[i] may be equal to arr[i] */
+                err[i] = -EINVAL;
+                continue;
+            }
+
             switch ( pfn[i] ^ arr[i] )
             {
             case 0:
-- 
1.9.3
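
To make the masking effect concrete, here is a minimal standalone sketch of the XOR-based error decode, with and without the extra check. It is a simplified model, not the libxc code: the helper names (mark_failed, decode_unpatched, decode_patched) are hypothetical, the two flag values are what I believe Linux's include/xen/privcmd.h defines, and the retry handling for paged-out frames is left out.

```c
#include <errno.h>

/* Error flags; values believed to match Linux's include/xen/privcmd.h. */
#define PRIVCMD_MMAPBATCH_MFN_ERROR   0xf0000000U
#define PRIVCMD_MMAPBATCH_PAGED_ERROR 0x80000000U

/*
 * Hypothetical helper: models how a failed entry is flagged in place,
 * by ORing the error bits into the frame number that was passed in.
 */
static unsigned long mark_failed(unsigned long mfn)
{
    return mfn | PRIVCMD_MMAPBATCH_MFN_ERROR;
}

/* Simplified decode as it behaves without the patch. */
static int decode_unpatched(unsigned long requested, unsigned long returned)
{
    switch ( returned ^ requested )
    {
    case 0:
        return 0;           /* no bits changed: treated as success */
    case PRIVCMD_MMAPBATCH_PAGED_ERROR:
        return -ENOENT;     /* frame is paged out */
    default:
        return -EINVAL;     /* any other difference: invalid frame */
    }
}

/* Simplified decode with the check added by the patch. */
static int decode_patched(unsigned long requested, unsigned long returned)
{
    /* The requested value already carries the error bits, so the XOR
     * below cannot show whether the entry was flagged. */
    if ( requested & PRIVCMD_MMAPBATCH_MFN_ERROR )
        return -EINVAL;

    return decode_unpatched(requested, returned);
}
```

With requested = ~0UL, mark_failed() returns the same value, so decode_unpatched() reports success while decode_patched() reports -EINVAL. For an ordinary frame number such as 0x1234, both variants report -EINVAL once the entry has been flagged.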

> 
> I used gdb to run 'xl dump-core' on the dest dom0:
> # gdb --args xl dump-core 1 vmcore
> GNU gdb (GDB) Red Hat Enterprise Linux (7.2-56.el6)
> Copyright (C) 2010 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-redhat-linux-gnu".
> For bug reporting instructions, please see:
> <http://www.gnu.org/software/gdb/bugs/>...
> Reading symbols from /usr/sbin/xl...done.
> (gdb) b main
> Breakpoint 1 at 0x406ad8: file xl.c, line 298.
> (gdb) b xc_core.c:482
> No source file named xc_core.c.
> Make breakpoint pending on future shared library load? (y or [n]) n
> (gdb) r
> Starting program: /usr/sbin/xl dump-core 1 vmcore
> [Thread debugging using libthread_db enabled]
> 
> Breakpoint 1, main (argc=4, argv=0x7fffffffe3d8) at xl.c:298
> 298       void *config_data = 0;
> Missing separate debuginfos, use: debuginfo-install 
> bzip2-libs-1.0.5-7.el6_0.x86_64 glibc-2.12-1.80.el6.x86_64 
> libuuid-2.17.2-12.7.el6.x86_64 yajl-1.0.7-3.el6.x86_64 
> zlib-1.2.3-27.el6.x86_64
> (gdb) b xc_core.c:482
> Breakpoint 2 at 0x7ffff794559d: file xc_core.c, line 482.
> (gdb) c
> Continuing.
> 
> Breakpoint 2, xc_domain_dumpcore_via_callback (xch=0x6262d0, domid=1, 
> args=0x7fffffffe140, dump_rtn=0x7ffff79450c0 <local_file_dump>) at 
> xc_core.c:482
> 482       live_shinfo = xc_map_foreign_range(xch, domid, PAGE_SIZE,
> (gdb) p live_shinfo 
> $1 = (shared_info_any_t *) 0x0
> (gdb) n
> 484       if ( !live_shinfo && !info.hvm )
> (gdb) p live_shinfo 
> $2 = (shared_info_any_t *) 0x7ffff7ffb000
> (gdb) p *live_shinfo 
> Cannot access memory at address 0x7ffff7ffb000    <==== We cannot access live_shinfo
> (gdb) b 763
> Breakpoint 3 at 0x7ffff7946588: file xc_core.c, line 763.
> (gdb) c
> Continuing.
> 
> Breakpoint 3, xc_domain_dumpcore_via_callback (xch=0x6262d0, domid=1, 
> args=0x7fffffffe140, dump_rtn=0x7ffff79450c0 <local_file_dump>) at 
> xc_core.c:763
> 763           sts = dump_rtn(xch, args, (char*)live_shinfo, PAGE_SIZE);
> (gdb) s
> local_file_dump (xch=0x6262d0, args=0x7fffffffe140, buffer=0x7ffff7ffb000 
> <Address 0x7ffff7ffb000 out of bounds>, length=4096) at xc_core.c:931
> 931       if ( write_exact(da->fd, buffer, length) == -1 )
> (gdb) s
> write_exact (fd=14, data=0x7ffff7ffb000, size=4096) at xc_private.c:848
> 848       while ( offset < size )
> (gdb) n
> 850           len = write(fd, (const char *)data + offset, size - offset);    <==== We write live_shinfo to the core file, and fail
> (gdb) p data
> $3 = (const void *) 0x7ffff7ffb000
> (gdb) p *data
> Attempt to dereference a generic pointer.
> (gdb) n
> 851           if ( (len == -1) && (errno == EINTR) )
> (gdb) p len
> $4 = -1
> (gdb) p errno
> $5 = 14
> (gdb) 
> 
> I tried it on the source dom0:
> gdb --args xl dump-core 1 vmcore 
> GNU gdb (GDB) Red Hat Enterprise Linux (7.2-56.el6)
> Copyright (C) 2010 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-redhat-linux-gnu".
> For bug reporting instructions, please see:
> <http://www.gnu.org/software/gdb/bugs/>...
> Reading symbols from /usr/sbin/xl...done.
> (gdb) b main
> Breakpoint 1 at 0x406ad8: file xl.c, line 298.
> (gdb) r
> Starting program: /usr/sbin/xl dump-core 1 vmcore
> [Thread debugging using libthread_db enabled]
> 
> Breakpoint 1, main (argc=4, argv=0x7fffffffe438) at xl.c:298
> 298       void *config_data = 0;
> Missing separate debuginfos, use: debuginfo-install 
> bzip2-libs-1.0.5-7.el6_0.x86_64 glibc-2.12-1.80.el6.x86_64 
> libuuid-2.17.2-12.7.el6.x86_64 yajl-1.0.7-3.el6.x86_64 
> zlib-1.2.3-27.el6.x86_64
> (gdb) b xc_core.c:482
> Breakpoint 2 at 0x7ffff794459d: file xc_core.c, line 482.
> (gdb) c
> Continuing.
> 
> Breakpoint 2, xc_domain_dumpcore_via_callback (xch=0x6262d0, domid=1, 
> args=0x7fffffffe1a0, dump_rtn=0x7ffff79440c0 <local_file_dump>) at 
> xc_core.c:482
> 482       live_shinfo = xc_map_foreign_range(xch, domid, PAGE_SIZE,
> (gdb) p live_shinfo 
> $1 = (shared_info_any_t *) 0x0
> (gdb) n
> 484       if ( !live_shinfo && !info.hvm )
> (gdb) p live_shinfo 
> $2 = (shared_info_any_t *) 0x7ffff7ffb000
> (gdb) p *live_shinfo 
> Cannot access memory at address 0x7ffff7ffb000    <==== We also cannot access live_shinfo
> (gdb) b 763
> Breakpoint 4 at 0x7ffff7945588: file xc_core.c, line 763.
> (gdb) c
> Continuing.
> 
> Breakpoint 4, xc_domain_dumpcore_via_callback (xch=0x6262d0, domid=1, 
> args=0x7fffffffe1a0, dump_rtn=0x7ffff79440c0 <local_file_dump>) at 
> xc_core.c:763
> 763           sts = dump_rtn(xch, args, (char*)live_shinfo, PAGE_SIZE);
> (gdb) s
> local_file_dump (xch=0x6262d0, args=0x7fffffffe1a0, buffer=0x7ffff7ffb000 
> <Address 0x7ffff7ffb000 out of bounds>, length=4096) at xc_core.c:931
> 931       if ( write_exact(da->fd, buffer, length) == -1 )
> (gdb) p buffer
> $3 = 0x7ffff7ffb000 <Address 0x7ffff7ffb000 out of bounds>
> (gdb) p *buffer
> Cannot access memory at address 0x7ffff7ffb000
> (gdb) n
> 937       if ( length >= (DUMP_INCREMENT * PAGE_SIZE) )    <==== But we can write live_shinfo to corefile. Why???
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxx
> http://lists.xen.org/xen-devel
> .
> 
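
For reference, errno 14 in the dest dom0 session quoted above is EFAULT ("Bad address") on Linux, which is what write() reports when the kernel cannot read the user buffer. The sketch below reproduces that failure mode without Xen. Two things in it are my own stand-ins and not taken from libxc: a PROT_NONE anonymous page plays the role of the mapping that has no usable frame behind it, and a pipe serves as a sink that really has to read the buffer. The function name is hypothetical.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std modes */
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Try to write() one page that the kernel cannot read.
 * Returns the errno reported by write(), 0 if the write unexpectedly
 * succeeded, or -1 if the test setup itself failed.
 */
static int errno_for_inaccessible_buffer(void)
{
    const size_t len = 4096;
    int fds[2];
    int result = -1;

    /* Stand-in for the bad mapping: a page with no access rights. */
    void *buf = mmap(NULL, len, PROT_NONE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if ( buf == MAP_FAILED )
        return -1;

    if ( pipe(fds) == 0 )
    {
        /* The kernel has to copy from buf, which it cannot do. */
        ssize_t n = write(fds[1], buf, len);
        result = (n == -1) ? errno : 0;

        close(fds[0]);
        close(fds[1]);
    }

    munmap(buf, len);
    return result;
}
```

On Linux I would expect this to return EFAULT, the same value gdb printed for errno after the failing write() in write_exact().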

