Re: [Xen-devel] [PATCH] xenstore: set READ_THREAD_STACKSIZE to a sane value
On Tue, 2014-03-11 at 16:55 +0100, Roger Pau Monné wrote:
> On 11/03/14 15:12, Ian Campbell wrote:
> > On Tue, 2014-03-11 at 14:52 +0100, Roger Pau Monné wrote:
> >> On 11/03/14 14:24, Ian Campbell wrote:
> >>> On Mon, 2014-03-10 at 17:12 +0000, Ian Jackson wrote:
> >>>> Roger Pau Monne writes ("[PATCH] xenstore: set READ_THREAD_STACKSIZE to
> >>>> a sane value"):
> >>>>> On FreeBSD PTHREAD_STACK_MIN is 2048 by default, which is obviously
> >>>>> too low.
> >
> > It occurs to me that 2048 is < PAGE_SIZE. Which makes this seem like an
> > interesting choice of stack min, especially combined with the fact that
> > the failure seems to involve malloc.
> >
> > Perhaps the stack is malloc'd (rather than coming from brk or an anon
> > mmap), so overrunning would cause heap corruption which seems to be what
> > you are seeing.
> >
> >>> How does this manifest itself? (I suppose this may be answered as part
> >>> of answering Ian J)
> >>
> >> Yes, I'm still looking into it, this gdb output:
> >>
> >> Starting program: /usr/local/bin/xenstore-watch /foo
> >> [New LWP 100169]
> >> [New Thread 801406800 (LWP 100182/xenstore-watch)]
> >>
> >> Program received signal SIGSEGV, Segmentation fault.
> >> [Switching to Thread 801406800 (LWP 100182/xenstore-watch)]
> >> 0x0000000800ac1258 in sbrk () from /lib/libc.so.7
> >> (gdb) bt
> >> #0 0x0000000800ac1258 in sbrk () from /lib/libc.so.7
> >> #1 0x0000000800ac110e in sbrk () from /lib/libc.so.7
> >> #2 0x0000000800ac9ee8 in sbrk () from /lib/libc.so.7
> >> #3 0x0000000800ac456b in sbrk () from /lib/libc.so.7
> >> #4 0x0000000800ac447d in sbrk () from /lib/libc.so.7
> >> #5 0x0000000800aaf6ce in syscall () from /lib/libc.so.7
> >> #6 0x0000000800acb37b in malloc () from /lib/libc.so.7
> >> #7 0x00000008008202b9 in read_message (h=0x801417080, nonblocking=0) at
> >> xs.c:313
> >> #8 0x0000000800820a06 in read_thread (arg=0x801417080) at xs.c:313
> >> #9 0x0000000800dc64a4 in pthread_create () from /lib/libthr.so.3
> >> #10 0x0000000000000000 in ?? ()
> >
> > Does
> > frame 1 ; print $sp
> > frame 2 ; print $sp
> > etc
> > tell you anything useful about the stack usage at each level?
>
> Thanks, I've been able to get the stack pointer at each frame, here are
> the results (from frame 0 to frame 10):
>
> 0x7fffffbfcff0
<-PAGE BOUNDARY HERE
Hence the segfault, I expect...
> 0x7fffffbfd0a0
> 0x7fffffbfd0e0
> 0x7fffffbfd120
> 0x7fffffbfd160
> 0x7fffffbfd1a0
> 0x7fffffbfd1e0
> 0x7fffffbfd6a0
> 0x7fffffbfd7a0
> 0x7fffffbfd7c0
> 0x7fffffbfd800
>
> Doing:
>
> 0x7fffffbfd800 - 0x7fffffbfcff0 = 0x810
>
> Which is 2064 in decimal. The biggest culprit seems to be malloc, which
> is using 1216 bytes of the stack.
Wow!
http://cvsweb.netbsd.org/bsdweb.cgi/src/lib/libc/stdlib/malloc.c?rev=1.54.10.1&content-type=text/x-cvsweb-markup
I suppose? malloc itself looks fairly small, but there's a lot of inlining in
that function... I don't see any large on-stack allocations (e.g. arrays), but
I suppose it all adds up.
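
To redo the arithmetic per frame, here is a small sketch that takes the stack pointers quoted above and attributes the gap between frame i and frame i+1 to frame i. The example_* names are made up for illustration:

```c
#include <stddef.h>
#include <stdint.h>

/* Stack pointers for frames 0..10, copied from the gdb results quoted above. */
static const uint64_t example_sp[] = {
    0x7fffffbfcff0, 0x7fffffbfd0a0, 0x7fffffbfd0e0, 0x7fffffbfd120,
    0x7fffffbfd160, 0x7fffffbfd1a0, 0x7fffffbfd1e0, 0x7fffffbfd6a0,
    0x7fffffbfd7a0, 0x7fffffbfd7c0, 0x7fffffbfd800,
};
#define EXAMPLE_NFRAMES (sizeof(example_sp) / sizeof(example_sp[0]))

/* Bytes between frame i's stack pointer and its caller's (frame i + 1). */
static uint64_t example_frame_bytes(size_t i)
{
    return example_sp[i + 1] - example_sp[i];
}

/* Bytes between the innermost and outermost stack pointers. */
static uint64_t example_total_bytes(void)
{
    return example_sp[EXAMPLE_NFRAMES - 1] - example_sp[0];
}

/* Index of the frame with the largest gap to its caller. */
static size_t example_largest_frame(void)
{
    size_t best = 0;

    for (size_t i = 1; i + 1 < EXAMPLE_NFRAMES; i++)
        if (example_frame_bytes(i) > example_frame_bytes(best))
            best = i;
    return best;
}
```

With these inputs the total is 0x810 (2064) bytes and the largest gap is at frame 6, the malloc frame, at 0x4c0 (1216) bytes, which matches the figures quoted above.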
> >> I've also tried to debug it using valgrind,
> >
> > Under BSD? Did someone wire up the dom0 OS specific bit? If so: Neat!
>
> No, I don't think anyone has wired the Dom0 specific bits, maybe they
> don't show up because this is just the xenstore client, which is not
> using any ioctls?
Oh yes, that makes sense, you'd be using the Unix domain socket.
> >> and here's what I got:
> >>
> >> [root@loki ~/xen/xen]# valgrind xenstore-watch /foo
> >> ==1901== Memcheck, a memory error detector
> >> ==1901== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
> >> ==1901== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
> >> ==1901== Command: xenstore-watch /foo
> >> ==1901==
> >> ==1901== Syscall param socketcall.connect(serv_addr..sa_len) points to
> >> uninitialised byte(s)
> >> ==1901== at 0x152A14A: connect (in /lib/libc.so.7)
> >> ==1901== by 0x1210B46: get_handle (xs.c:205)
> >> ==1901== by 0x1210CEC: xs_open (xs.c:297)
> >> ==1901== by 0x4027B1: main (xenstore_client.c:635)
> >> ==1901== Address 0x7ff000a70 is on thread 1's stack
> >> ==1901==
> >> /foo
> >>
> >> Strangely enough, when running under valgrind it doesn't segfault,
> >
> > valgrind interposes its own malloc and stuff which will change
> > behaviour, and I wouldn't be all that surprised if it were getting its
> > fingers into some of the pthread stuff too.
> >
> >> and
> >> I'm still trying to figure out why valgrind complains.
> >
> > It seems to be an unrelated issue though?
>
> I think so, it seems like valgrind doesn't really like the cast done in
> connect from sockaddr_un to sockaddr.
Not all that surprising I guess, it's a bit of an odd interface!
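
Note that the message names sa_len, the length byte that BSD-style sockaddr structures carry. One common way to avoid handing uninitialised bytes to connect() is to zero the whole sockaddr_un before filling it in. The sketch below shows that pattern; whether it silences this particular warning is an untested assumption, and example_fill_unix_addr is a hypothetical helper, not a function in xs.c:

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/*
 * Fill *addr for a Unix domain socket at `path`.
 * Returns 0 on success, -1 if the path does not fit in sun_path.
 */
static int example_fill_unix_addr(struct sockaddr_un *addr, const char *path)
{
    size_t len = strlen(path);

    if (len >= sizeof(addr->sun_path))
        return -1;

    /* Zero everything first so no field or padding byte is left unset. */
    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    memcpy(addr->sun_path, path, len + 1);   /* includes the trailing NUL */
    return 0;
}
```

The caller would then pass (struct sockaddr *)addr and sizeof(*addr) to connect() as usual.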
Ian.
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel