Re: [Xen-devel] xenstored crashes with SIGSEGV
2014-12-16 11:06 GMT+00:00 Ian Campbell <Ian.Campbell@xxxxxxxxxx>:
> On Tue, 2014-12-16 at 10:45 +0000, Ian Campbell wrote:
>> On Mon, 2014-12-15 at 23:29 +0100, Philipp Hahn wrote:
>> > > I notice in your bugzilla (for a different occurrence, I think):
>> > >> [2090451.721705] univention-conf[2512]: segfault at ff00000000 ip
>> > >> 000000000045e238 sp 00007ffff68dfa30 error 6 in python2.6[400000+21e000]
>> > >
>> > > Which appears to have faulted access 0xff000000000 too. It looks like
>> > > this process is a python thing, it's nothing to do with xenstored I
>> > > assume?
>> >
>> > Yes, that's one univention-config, which is completely independent of
>> > xen(stored).
>> >
>> > > It seems rather coincidental that it should be accessing the
>> > > same sort of address and be faulting.
>> >
>> > Yes, good catch. I'll have another look at those core dumps.
>>
>> With this in mind, please can you confirm what model of machines you've
>> seen this on, and in particular whether they are all the same class of
>> machine or whether they are significantly different.
>>
>> The reason being that randomly placed 0xff values in a field of 0x00
>> could possibly indicate hardware (e.g. a GPU) DMAing over the wrong
>> memory pages.
>
> Thanks for giving me access to the core files. This is very suspicious:
>
> (gdb) frame 2
> #2  0x000000000040a348 in tdb_open_ex (name=0x1941fb0
> "/var/lib/xenstored/tdb.0x1935bb0", hash_size=<value optimized out>,
> tdb_flags=0, open_flags=<value optimized out>, mode=<value optimized out>,
> log_fn=0x4093b0 <null_log_fn>, hash_fn=<value optimized out>) at
> tdb.c:1958
> 1958            SAFE_FREE(tdb->locked);
>
> (gdb) x/96x tdb
> 0x1921270:      0x00000000      0x00000000      0x00000000      0x00000000
> 0x1921280:      0x0000001f      0x000000ff      0x0000ff00      0x000000ff
> 0x1921290:      0x00000000      0x000000ff      0x0000ff00      0x000000ff
> 0x19212a0:      0x00000000      0x000000ff      0x0000ff00      0x000000ff
> 0x19212b0:      0x00000000      0x000000ff      0x0000ff00      0x000000ff
> 0x19212c0:      0x00000000      0x000000ff      0x0000ff00      0x000000ff
> 0x19212d0:      0x00000000      0x000000ff      0x0000ff00      0x000000ff
> 0x19212e0:      0x00000000      0x000000ff      0x0000ff00      0x000000ff
> 0x19212f0:      0x00000000      0x000000ff      0x0000ff00      0x000000ff
> 0x1921300:      0x00000000      0x000000ff      0x0000ff00      0x000000ff
> 0x1921310:      0x00000000      0x000000ff      0x0000ff00      0x000000ff
> 0x1921320:      0x00000000      0x000000ff      0x0000ff00      0x000000ff
> 0x1921330:      0x00000000      0x000000ff      0x0000ff00      0x000000ff
> 0x1921340:      0x00000000      0x00000000      0x0000ff00      0x000000ff
> 0x1921350:      0x00000000      0x000000ff      0x0000ff00      0x000000ff
> 0x1921360:      0x00000000      0x000000ff      0x0000ff00      0x000000ff
> 0x1921370:      0x004093b0      0x00000000      0x004092f0      0x00000000
> 0x1921380:      0x00000002      0x00000000      0x00000091      0x00000000
> 0x1921390:      0x0193de70      0x00000000      0x01963600      0x00000000
> 0x19213a0:      0x00000000      0x00000000      0x0193fbb0      0x00000000
> 0x19213b0:      0x00000000      0x00000000      0x00000000      0x00000000
> 0x19213c0:      0x00405870      0x00000000      0x0040e3e0      0x00000000
> 0x19213d0:      0x00000038      0x00000000      0xe814ec70      0x6f2f6567
> 0x19213e0:      0x01963650      0x00000000      0x0193dec0      0x00000000
>
> Something has clearly done a number on the ram of this process.
> 0x1921270 through 0x192136f is 256 bytes...
>
> Since it appears to be happening to other processes too I would hazard
> that this is not a xenstored issue.
>
> Ian.
>

Good catch Ian!
Strange corruption. Probably not related to xenstored, as you suggested.
I would be curious to see what comes before the tdb pointer and where the
corruption starts.

I also don't understand where the "fd = 47" in a previous mail came from.
0x1f is 31, not 47 (which is 0x2f).

I would not be surprised by a strange bug in libc or the kernel.

Frediano

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
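
The address and fd arithmetic discussed in this thread can be double-checked with a short script. This is only a sketch: the words are transcribed by hand from the first 16 rows of the quoted `x/96x tdb` dump, the names `BASE`, `PATTERN`, `rows`, `span_bytes` and `deviating_rows` are made up for illustration, and treating `00000000 000000ff 0000ff00 000000ff` as the repeating pattern is an assumption rather than something established in the thread.

```python
BASE = 0x1921270   # address of `tdb` in the quoted dump
ROW_BYTES = 16     # x/96x prints four 32-bit words per row

# Assumed repeating pattern, as seen in most rows of the dump.
PATTERN = (0x00000000, 0x000000ff, 0x0000ff00, 0x000000ff)

# First 16 rows of the dump, transcribed by hand (hypothetical helper data).
rows = (
    [(0x00000000, 0x00000000, 0x00000000, 0x00000000)]    # 0x1921270
    + [(0x0000001f, 0x000000ff, 0x0000ff00, 0x000000ff)]  # 0x1921280
    + [PATTERN] * 11                                      # 0x1921290..0x1921330
    + [(0x00000000, 0x00000000, 0x0000ff00, 0x000000ff)]  # 0x1921340
    + [PATTERN] * 2                                       # 0x1921350..0x1921360
)


def span_bytes(first, last):
    """Size in bytes of the inclusive address range [first, last]."""
    return last - first + 1


def deviating_rows(rows, base=BASE):
    """Addresses of rows that do not match the assumed pattern exactly."""
    return [base + i * ROW_BYTES
            for i, row in enumerate(rows)
            if row != PATTERN]


if __name__ == "__main__":
    print(span_bytes(0x1921270, 0x192136f))        # size of the suspicious range
    print(len(rows) * ROW_BYTES)                   # bytes covered by the 16 rows
    print([hex(a) for a in deviating_rows(rows)])  # rows that break the pattern
    print(0x1f, 0x2f)                              # the two candidate fd values
```

Run as a script, it prints the size of the range Ian quotes, the rows that differ from the assumed pattern, and the decimal values of 0x1f and 0x2f. It only restates what is already in the dump and says nothing about the cause of the corruption.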