Re: [Xen-devel] xenstored crashes with SIGSEGV
Hello Ian,

On 15.12.2014 18:45, Ian Campbell wrote:
> On Mon, 2014-12-15 at 14:50 +0000, Ian Campbell wrote:
>> On Mon, 2014-12-15 at 15:19 +0100, Philipp Hahn wrote:
>>> I just noticed something strange:
>>>
>>>> #3 0x000000000040a684 in tdb_open (name=0xff00000000 <Address
>>>> 0xff00000000 out of bounds>, hash_size=0,
>>>> tdb_flags=4254928, open_flags=-1, mode=3119127560) at tdb.c:1773
...
> I'm reasonably convinced now that this is just a weird artefact of
> running gdb on an optimised binary, probably a shortcoming in the debug
> info leading to gdb getting confused.
>
> Unfortunately this also calls into doubt the parameter to talloc_free,
> perhaps in that context 0xff0000000 is a similar artefact.
>
> Please can you print the entire contents of tdb in the second frame
> ("print *tdb" ought to do it). I'm curious whether it is all sane or
> not.

(gdb) print *tdb
$1 = {name = 0x0, map_ptr = 0x0, fd = 47, map_size = 65280,
  read_only = 16711680, locked = 0xff0000000000, ecode = 16711680,
  header = {
    magic_food = "\000\000\000\000\000\000\000\000\000\377\000\000\000\000\377\000\000\000\000\000\000\000\000\000\000\377\000\000\000\000\377",
    version = 0, hash_size = 0, rwlocks = 65280,
    reserved = {16711680, 0, 0, 65280, 16711680, 0, 0, 65280,
      16711680, 0, 0, 65280, 16711680, 0, 0, 65280,
      16711680, 0, 0, 65280, 16711680, 0, 0, 65280,
      16711680, 0, 0, 65280, 16711680, 0, 0}},
  flags = 0,
  travlocks = {next = 0xff0000, off = 0, hash = 65280},
  next = 0xff0000, device = 280375465082880, inode = 16711680,
  log_fn = 0x4093b0 <null_log_fn>,
  hash_fn = 0x4092f0 <default_tdb_hash>, open_flags = 2}

> Please can you also print "info regs" at the point of the segv (in frame
> 0) as well as "disas" at that point.

(gdb) info registers
rax            0x0                 0
rbx            0x16bff70           23854960
rcx            0xffffffffffffffff  -1
rdx            0x40ecd0            4254928
rsi            0x0                 0
rdi            0xff0000000000      280375465082880
rbp            0x7fcaed6c96a8      0x7fcaed6c96a8
rsp            0x7fff9dc86330      0x7fff9dc86330
r8             0x7fcaece54c08      140509534571528
r9             0xff00000000000000  -72057594037927936
r10            0x7fcaed08c14c      140509536895308
r11            0x246               582
r12            0xd                 13
r13            0xff0000000000      280375465082880
r14            0x4093b0            4232112
r15            0x167d620           23582240
rip            0x4075c4            0x4075c4 <talloc_chunk_from_ptr+4>
eflags         0x10206             [ PF IF RF ]
cs             0x33                51
ss             0x2b                43
ds             0x0                 0
es             0x0                 0
fs             0x0                 0
gs             0x0                 0
fctrl          0x0                 0
fstat          0x0                 0
ftag           0x0                 0
fiseg          0x0                 0
fioff          0x0                 0
foseg          0x0                 0
fooff          0x0                 0
fop            0x0                 0
mxcsr          0x0                 [ ]

(gdb) disassemble
Dump of assembler code for function talloc_chunk_from_ptr:
0x00000000004075c0 <talloc_chunk_from_ptr+0>:   sub    $0x8,%rsp
0x00000000004075c4 <talloc_chunk_from_ptr+4>:   mov    -0x8(%rdi),%edx
0x00000000004075c7 <talloc_chunk_from_ptr+7>:   lea    -0x50(%rdi),%rax
0x00000000004075cb <talloc_chunk_from_ptr+11>:  mov    %edx,%ecx
0x00000000004075cd <talloc_chunk_from_ptr+13>:  and    $0xfffffffffffffff0,%ecx
0x00000000004075d0 <talloc_chunk_from_ptr+16>:  cmp    $0xe814ec70,%ecx
0x00000000004075d6 <talloc_chunk_from_ptr+22>:  jne    0x4075e2 <talloc_chunk_from_ptr+34>
0x00000000004075d8 <talloc_chunk_from_ptr+24>:  and    $0x1,%edx
0x00000000004075db <talloc_chunk_from_ptr+27>:  jne    0x4075e2 <talloc_chunk_from_ptr+34>
0x00000000004075dd <talloc_chunk_from_ptr+29>:  add    $0x8,%rsp
0x00000000004075e1 <talloc_chunk_from_ptr+33>:  retq
0x00000000004075e2 <talloc_chunk_from_ptr+34>:  nopw   0x0(%rax,%rax,1)
0x00000000004075e8 <talloc_chunk_from_ptr+40>:  callq  0x401b98 <abort@plt>

> Can you also "p $_siginfo._sifields._sigfault.si_addr" (in frame 0).
> This ought to be the actual faulting address, which ought to give a hint
> on how much we can trust the parameters in the stack trace.

Hmm, my gdb refused to access $_siginfo:

(gdb) show convenience
$_siginfo = Unable to read siginfo

> Since I'm asking for the world I may as well ask you to dump the raw
> stack too "x/64x $sp" ought to be a good starting point.

(gdb) x/64x $sp
0x7fff9dc86330: 0xed6c96a8 0x00007fca 0x00407edf 0x00000000
0x7fff9dc86340: 0x00000000 0x00000000 0x016bff70 0x00000000
0x7fff9dc86350: 0xed6c96a8 0x00007fca 0x0000000d 0x00000000
0x7fff9dc86360: 0x00000000 0x00000000 0x004093b0 0x00000000
0x7fff9dc86370: 0x0167d620 0x00000000 0x0040a348 0x00000000
0x7fff9dc86380: 0x00000000 0x00000000 0x00000000 0x00000000
0x7fff9dc86390: 0x00000000 0x00000000 0x00000000 0x00000000
0x7fff9dc863a0: 0x00000011 0x00000000 0x411d4816 0x00000000
0x7fff9dc863b0: 0x00000001 0x00000000 0x000081a0 0x00000000
0x7fff9dc863c0: 0x00000000 0x00000000 0x00000000 0x00000000
0x7fff9dc863d0: 0x00096000 0x00000000 0x00001000 0x00000000
0x7fff9dc863e0: 0x000004b0 0x00000000 0x5438ba01 0x00000000
0x7fff9dc863f0: 0x07fd332e 0x00000000 0x5438ba01 0x00000000
0x7fff9dc86400: 0x07fd332e 0x00000000 0x5438ba01 0x00000000
0x7fff9dc86410: 0x07fd332e 0x00000000 0x00000000 0x00000000
0x7fff9dc86420: 0x00000000 0x00000000 0x00000000 0x00000000

> I notice in your bugzilla (for a different occurrence, I think):
>> [2090451.721705] univention-conf[2512]: segfault at ff00000000 ip
>> 000000000045e238 sp 00007ffff68dfa30 error 6 in python2.6[400000+21e000]
>
> Which appears to have faulted access 0xff000000000 too. It looks like
> this process is a python thing, it's nothing to do with xenstored I
> assume?

Yes, that's one univention-config, which is completely independent of
xen(stored).

> It seems rather coincidental that it should be accessing the
> same sort of address and be faulting.

Yes, good catch. I'll have another look at those core dumps.

> Ian.

Thank you for your help.
Philipp Hahn

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
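
Since gdb could not read $_siginfo, one way to cross-check the dumps in this mail is to work out by hand which address the instruction at talloc_chunk_from_ptr+4 tried to load, and to look at the suspicious decimal fields of *tdb in hex. The following is only a rough sketch under assumptions: the constants are read off the disassembly and register dump quoted above, all function and constant names are made up for illustration, and it models the check as the disassembly appears to perform it rather than reproducing talloc's source.

```python
# Constants read off the disassembly in this mail; the names are illustrative only.
CHUNK_MAGIC = 0xE814EC70   # value compared against %ecx
FLAG_MASK = 0x0F           # low four bits are cleared before the compare
FLAG_FREE = 0x01           # bit tested by "and $0x1,%edx"
HEADER_SIZE = 0x50         # "lea -0x50(%rdi),%rax"
FLAGS_OFFSET = 0x08        # "mov -0x8(%rdi),%edx"

# A few decimal values copied from the "print *tdb" output.
SUSPECT_VALUES = {
    "map_size": 65280,
    "read_only": 16711680,
    "device": 280375465082880,
}


def flags_load_address(ptr: int) -> int:
    """Address read by the 32-bit load at talloc_chunk_from_ptr+4."""
    return ptr - FLAGS_OFFSET


def chunk_header_address(ptr: int) -> int:
    """Address the function would return as the chunk header."""
    return ptr - HEADER_SIZE


def chunk_flags_ok(flags: int) -> bool:
    """Model of the two tests that otherwise branch to abort()."""
    flags &= 0xFFFFFFFF  # the load is 32 bits wide
    magic_matches = (flags & ~FLAG_MASK & 0xFFFFFFFF) == CHUNK_MAGIC
    marked_free = bool(flags & FLAG_FREE)
    return magic_matches and not marked_free


def hex_view(values: dict) -> dict:
    """Show decimal fields in hex so a repeating byte pattern is visible."""
    return {name: hex(value) for name, value in values.items()}


if __name__ == "__main__":
    rdi = 0xFF0000000000  # %rdi from "info registers"
    print("load address:  ", hex(flags_load_address(rdi)))
    print("header address:", hex(chunk_header_address(rdi)))
    for name, value in hex_view(SUSPECT_VALUES).items():
        print(f"{name:10s} {value}")
```

With %rdi taken from the register dump, the load address is simply %rdi minus 8, and the hex view shows that the odd-looking decimal fields each consist of a single 0xff byte at different positions.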