Re: [Xen-devel] Fwd: NetBSD xl core-dump not working... Memory fault (core dumped)
On Thu, 2013-11-07 at 21:04 +0000, Miguel C. wrote:
> yes its 4.2 from pkgsrc.

Thanks, that might be enough.

> how can i get the changeset id?

That'd be one for the port-xen folks, I think. It might be printed in the
Xen dmesg, but that depends on how it was built, and 4.2 may be too old to
have such functionality.

> Ian Campbell <Ian.Campbell@xxxxxxxxxx> wrote:
> >On Mon, 2013-11-04 at 22:13 +0000, Mike C. wrote:
> >> On 31.10.13 04:34, Miguel Clara wrote:
> >>
> >> > I was trying to get a core-dump for a domU with xl and got this error:
> >> >
> >> > # xl dump-core 20 test.core
> >> > Memory fault
> >> >
> >> > GDB shows this:
> >> >
> >> > a# gdb xl xl.core
> >> > GNU gdb (GDB) 7.3.1
> >> > Copyright (C) 2011 Free Software Foundation, Inc.
> >> > License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> >> > This is free software: you are free to change and redistribute it.
> >> > There is NO WARRANTY, to the extent permitted by law. Type "show copying"
> >> > and "show warranty" for details.
> >> > This GDB was configured as "x86_64--netbsd".
> >> > For bug reporting instructions, please see:
> >> > <http://www.gnu.org/software/gdb/bugs/>...
> >> > Reading symbols from /usr/sbin/xl...done.
> >> > [New process 1]
> >> > Core was generated by `xl'.
> >> > Program terminated with signal 11, Segmentation fault.
> >> > #0 0x00007f7ff7007b45 in xc_domain_dumpcore_via_callback
> >> > (xch=0x7f7ff7b0d800, domid=20, args=0x7f7fffffdae0,
> >> > dump_rtn=0x7f7ff700632c<local_file_dump>)
> >> > at xc_core.c:860
> >

In 4.2.0 this corresponds to
    memcpy(dump_mem, vaddr, PAGE_SIZE);
which is a plausible source of a segfault.

xc_core.c has only changed in immaterial ways (although ways which caused
all the line numbers to shift) since 4.2.0 AFAICT, so it is likely that
this bug is still present.

Can you tell via gdb what the faulting address was and whether it
corresponds to dump_mem or vaddr? gdb's "info locals" might give you at
least some of that?
Also, you can use disas to identify the precise instruction at
0x00007f7ff7007b45, which will show you the registers involved and might
lead you to the faulting address.

vaddr is certainly not NULL; it's checked right before. It could be
non-NULL and still invalid if xc_map_foreign_range were buggy on NetBSD,
but that is surely used elsewhere? I suppose it might have mapped an MFN
which was invalid (or became invalid, but your bug is deterministic,
right?).

IIRC NetBSD's privcmd foreign mappings are populated lazily, not
immediately as on Linux? If that were the case (and I'm only vaguely aware
of how NetBSD operates) then it would be plausible that
xc_map_foreign_range would succeed but a subsequent attempt to access the
region would fault.

dump_mem isn't NULL; it's a pointer into the dump_mem_start array, whose
allocation is checked for failure. Since dump_mem is just normal process
memory and vaddr is a magic foreign mapping, I'd be inclined to suspect
vaddr was not right in some way...

Does "xl -vvv dump-core" give any useful additional logging?

Unfortunately I don't think anyone has done valgrind support for debugging
processes which use Xen hypercalls on *BSD. (If you were very keen you
could probably follow what was done for Linux,
http://blog.xen.org/index.php/2013/01/18/using-valgrind-to-debug-xen-toolstacks/
and wire up the BSD privcmd ioctl to the generic Xen hypercall code I
added.)

I fear this bug is going to take someone on the ground with a NetBSD
system and the ability to dive into BSD kernel internals to get to the
bottom of...

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel