Re: [Xen-devel] xend segfaults when starting
On Wed, 2010-08-18 at 15:02 +0100, Christoph Egger wrote:
> On Wednesday 18 August 2010 14:14:19 Ian Campbell wrote:
> > > In unlock_pages, the address and length passed to munlock() is:
> > >
> > > laddr 0x7f7ffdfe7000, llen 0x2000
> > >
> > > The reason why munlock() fails is that mlock() hasn't been called before.
> > > The hcall_buf_prep() is not called at all before the first call to
> > > _xc_clean_hcall_buf().
> >
> > If hcall_buf_prep() has never been called then
> > "pthread_getspecific(hcall_buf_pkey)" should return NULL and
> > _xc_clean_hcall_buf will never be called from xc_clean_hcall_buf.
> > _xc_clean_hcall_buf also ignores NULL values itself.
> > Who calls hcall_buf_prep() in your case ?
>
> Only hypercalls call hcall_buf_prep().
> What if no hypercalls are not called during xend startup ?

Then I would have expected pthread_getspecific(hcall_buf_pkey) to return
NULL (because _xc_init_hcall_buf was never called) and therefore
xc_clean_hcall_buf not to do any unlocking.

However I think my expectation was wrong. If _xc_init_hcall_buf is never
called then hcall_buf_pkey is undefined but not necessarily invalid --
and it seems to be the case on your system that it turns out to be valid
(perhaps an uninitialised pthread_key_t happens to be valid on NetBSD and
invalid on Linux or something like that) and therefore we try to unlock
some random address.

My updated patch ensured that hcall_buf_pkey is always initialised
before use.

> If you call xc_clean_hcall_buf() from xc_interface_close()
> then you should also call hcall_buf_prep() from xc_interface_open().
>
> > However you say that hcall_buf_pkey is not NULL, but rather contains a
> > valid hcall_buf containing 0x7f7ffdfe7040.
>
> hcall_buf itself has the address 0x7f7ffdfe7000.
> hcall_buf->buf has the address 0x7f7ffdfe7040.

That's very odd -- hcall_buf->buf is allocated with xc_memalign and
therefore should be page aligned. Are you sure the addresses aren't the
other way round?
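
For reference, the "always initialised before use" idea can be sketched roughly as below. This is not the libxc patch, only a minimal illustration assuming a pthread_once() guard around pthread_key_create(); every demo_* name is hypothetical and does not exist in libxc. Once the key has been created, POSIX guarantees that pthread_getspecific() returns NULL in a thread until pthread_setspecific() stores something, so a clean-up path that runs before any prepare step finds nothing to release.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for hcall_buf_pkey and friends; not libxc code. */
static pthread_key_t  demo_buf_pkey;
static pthread_once_t demo_buf_once = PTHREAD_ONCE_INIT;
static int            demo_key_ok;

static void demo_key_create(void)
{
    /* No destructor here; a real implementation would probably want one. */
    demo_key_ok = (pthread_key_create(&demo_buf_pkey, NULL) == 0);
}

/* Every access goes through this, so the key is never read uninitialised. */
static int demo_key_ready(void)
{
    pthread_once(&demo_buf_once, demo_key_create);
    return demo_key_ok;
}

/* Returns the calling thread's buffer, or NULL if none was ever prepared. */
void *demo_get_buf(void)
{
    if (!demo_key_ready())
        return NULL;
    return pthread_getspecific(demo_buf_pkey);
}

/* Records buf as the calling thread's buffer; returns 0 on success. */
int demo_set_buf(void *buf)
{
    if (!demo_key_ready())
        return -1;
    return pthread_setspecific(demo_buf_pkey, buf);
}

/* Safe to call from a close path even if no buffer was ever prepared. */
void demo_clean_buf(void)
{
    if (demo_get_buf() == NULL)
        return;                 /* nothing to unlock or free */
    /* Unlocking and freeing of the buffer would go here (omitted). */
    demo_set_buf(NULL);
    assert(demo_get_buf() == NULL);
}
```

Calling demo_clean_buf() as the very first operation is then harmless, which is the property the close path relies on.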
> > The only call to "pthread_setspecific(hcall_buf_pkey, ...)" with a non-NULL
> > value is in hcall_buf_prep(), so it must have been called at some point.
>
> In that case, I am puzzled why I don't get the trace.
> Something really fishy is going on.
>
> > Please can you confirm if _xc_init_hcall_buf() is ever called and what
> > the behaviour of "pthread_getspecific(hcall_buf_pkey)" is if
> > _xc_init_hcall_buf() has never been called. I think it is supposed to
> > return NULL in this case and we certainly rely on that.
>
> _xc_init_hcall_buf() is not called. pthread_getspecific() should return NULL
> but doesn't.
>
> I am starting to ask myself "How did libxc ever work?". It feels like we are
> hunting down a long-term hidden bug.

Previously _xc_clean_hcall_buf would be called IFF hcall_buf_prep had
been called. My patch changed this to also be called on close (even if
hcall_buf_prep was never called) and could therefore access an
uninitialised hcall_buf_pkey. I am reasonably confident that before my
patch libxc was OK.

> > pthread_getspecific(hcall_buf_pkey) is supposed to return NULL on error,
> > however hcall_buf_pkey is uninitialised until _xc_init_hcall_buf,
> > perhaps on NetBSD the uninitialised value somehow looks valid? It's not
> > clear what the correct value to initialise a pthread_key_t to in order
> > for it to appear invalid until it is properly setup is, but I suppose we
> > should be initialising it before use. Please can you try this patch:
>
> I tried the replacement patch from the other mail.
> With it, xend does not crash, hcall_buf is NULL,
> pthread_getspecific() returns NULL,

OK, I think that suggests that my updated patch does the right thing
here.

> and I am not able to start a guest with 'xm'
>
> Xend has probably crashed! Invalid or missing HTTP status code.

There was another HTTP (XML/RPC) related mail on the list this morning
-- is this related to that? Are you sure it is related to the libxc
patch?
(Did you by any chance update to python2.7 recently?)

> > If that doesn't work perhaps you can reduce the issue to a simple test
> > case like the attached? (which doesn't reproduce the issue for me on
> > Linux) If you can do that then please run it with the attached libxc
> > patch and post the output.
>
> xc_interface is 0x7f7ffdb03800
> before prep buf is 0x7f7ffdb0b000 / 0x7f7ffdb0b040
> after prep buf is 0x7f7ffdb0b000 / 0x7f7ffdb20000
> after release buf is 0x7f7ffdb0b000 / 0x7f7ffdb0b040
> xc interface close returned 0
>
> No crash. Is this the expected output ?

It looks correct but didn't reproduce the crash so is of limited
utility.

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel