Re: [Xen-devel] xend segfaults when starting
On Wednesday 18 August 2010 16:59:30 Ian Campbell wrote:
> On Wed, 2010-08-18 at 15:02 +0100, Christoph Egger wrote:
> > On Wednesday 18 August 2010 14:14:19 Ian Campbell wrote:
> > > > In unlock_pages, the address and length passed to munlock() is:
> > > >
> > > > laddr 0x7f7ffdfe7000, llen 0x2000
> > > >
> > > > The reason why munlock() fails is that mlock() hasn't been called
> > > > before. The hcall_buf_prep() is not called at all before the first
> > > > call to _xc_clean_hcall_buf().
> > >
> > > If hcall_buf_prep() has never been called then
> > > "pthread_getspecific(hcall_buf_pkey)" should return NULL and
> > > _xc_clean_hcall_buf will never be called from xc_clean_hcall_buf.
> > > _xc_clean_hcall_buf also ignores NULL values itself.
> >
> > Who calls hcall_buf_prep() in your case ?
> >
> > Only hypercalls call hcall_buf_prep().
> > What if no hypercalls are called during xend startup ?
>
> Then I would have expected pthread_getspecific(hcall_buf_pkey) to return
> NULL (because _xc_init_hcall_buf was never called) and therefore for
> xc_clean_hcall_buf to not do any unlocking.
>
> However I think my expectation was wrong. If _xc_init_hcall_buf is never
> called then hcall_buf_pkey is undefined but not necessarily invalid --
> and it seems to be the case on your system that it turns out to be valid
> (perhaps pthread_key_t is valid on NetBSD and invalid on Linux or
> something like that) and therefore we try to unlock some random address.

To make it even more mysterious, the "random" address is always the same,
even across machine reboots.

>
> My updated patch ensured that hcall_buf_pkey is always initialised
> before use.

Yes, but we also need to figure out why hcall_buf_prep is never called.
Who calls hcall_buf_prep() on your machine ?
Can you provide a call trace when hcall_buf_prep() is called the first
time, please ?

> > If you call xc_clean_hcall_buf() from xc_interface_close()
> > then you should also call hcall_buf_prep() from xc_interface_open().
> >
> > > However you say that hcall_buf_pkey is not NULL, but rather contains a
> > > valid hcall_buf containing 0x7f7ffdfe7040.
> >
> > hcall_buf itself has the address 0x7f7ffdfe7000.
> >
> > hcall_buf->buf has the address 0x7f7ffdfe7040.
>
> That's very odd -- hcall_buf->buf is allocated with xc_memalign and
> therefore should be page aligned. Are you sure the addresses aren't the
> other way round?

Yes, I am.

>
> > > The only call to "pthread_setspecific(hcall_buf_pkey, ...)" with a
> > > non-NULL value is in hcall_buf_prep(), so it must have been called at
> > > some point.
> >
> > In that case, I am puzzled why I don't get the trace.
> > Something really fishy is going on.
> >
> > > Please can you confirm if _xc_init_hcall_buf() is ever called and what
> > > the behaviour of "pthread_getspecific(hcall_buf_pkey)" is if
> > > _xc_init_hcall_buf() has never been called. I think it is supposed to
> > > return NULL in this case and we certainly rely on that.
> >
> > _xc_init_hcall_buf() is not called. pthread_getspecific() should return
> > NULL but doesn't.
> >
> > I am starting to ask myself "How did libxc ever work?". It feels like we
> > are hunting down a long-term hidden bug.
>
> Previously _xc_clean_hcall_buf would be called IFF hcall_buf_prep had
> been called. My patch changed this to also be called on close (even if
> hcall_buf_prep was never called) and could therefore access an
> uninitialised hcall_buf_pkey.

Calling _xc_clean_hcall_buf() unconditionally and hcall_buf_prep()
conditionally sounds to me like calling free() unconditionally and
malloc() conditionally.
I will give calling hcall_buf_prep() from xc_interface_open() a try
with your patch tomorrow.

> I am reasonably confident that before my patch libxc was OK.

And is ok again after it has been backed out. :)

> > > pthread_getspecific(hcall_buf_pkey) is supposed to return NULL on
> > > error, however hcall_buf_pkey is uninitialised until
> > > _xc_init_hcall_buf, perhaps on NetBSD the uninitialised value somehow
> > > looks valid? It's not clear what the correct value to initialise a
> > > pthread_key_t to in order for it to appear invalid until it is properly
> > > set up is, but I suppose we should be initialising it before use. Please
> > > can you try this patch:
> >
> > I tried the replacement patch from the other mail.
> > With it, xend does not crash, hcall_buf is NULL,
> > pthread_getspecific() returns NULL,
>
> OK, I think that suggests that my updated patch does the right thing
> here.

Is it possible that xend can call xc_interface_close() during startup
and hcall_buf_prep() later, when xend interacts with xm ?

> > and I am not able to start a guest with 'xm'
> >
> > Xend has probably crashed! Invalid or missing HTTP status code.
>
> There was another HTTP (XML/RPC) related mail on the list this morning

I saw this mail. No, I don't think it is related to this.

> -- is this related to that? Are you sure it is related to the libxc
> patch?

Yes.

> (did you by any chance update to python2.7 recently?)

No, I am on python 2.5.

> > > If that doesn't work perhaps you can reduce the issue to a simple test
> > > case like the attached? (which doesn't reproduce the issue for me on
> > > Linux) If you can do that then please run it with the attached libxc
> > > patch and post the output.
> >
> > xc_interface is 0x7f7ffdb03800
> > before prep buf is 0x7f7ffdb0b000 / 0x7f7ffdb0b040
> > after prep buf is 0x7f7ffdb0b000 / 0x7f7ffdb20000
> > after release buf is 0x7f7ffdb0b000 / 0x7f7ffdb0b040
> > xc interface close returned 0
> >
> > No crash. Is this the expected output ?
>
> It looks correct but didn't reproduce the crash so is of limited
> utility.
>
> Ian.

Christoph

--
---to satisfy European Law for business letters:
Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach b. Muenchen
Geschaeftsfuehrer: Alberto Bozzo, Andrew Bowd
Sitz: Dornach, Gemeinde Aschheim, Landkreis Muenchen
Registergericht Muenchen, HRB Nr. 43632

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
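
Editorial illustration (not part of the original mail, and not the libxc source): the point discussed in this thread, that a pthread_key_t must be created before pthread_getspecific() is used on it, can be sketched with pthread_once(). All names below (buf_key, buf_prep, buf_clean) are hypothetical stand-ins for illustration only. The sketch assumes that guarding every access with pthread_once() is acceptable, and it uses plain malloc()/free() in place of any page locking.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical names; this is a sketch, not the libxc implementation. */
static pthread_key_t buf_key;
static pthread_once_t buf_key_once = PTHREAD_ONCE_INIT;

/* Runs exactly once per process, whichever function is called first. */
static void buf_key_init(void)
{
    /* No destructor in this sketch; buf_clean() releases explicitly. */
    int rc = pthread_key_create(&buf_key, NULL);
    assert(rc == 0);
    (void)rc; /* silence unused warning when NDEBUG is defined */
}

/* Allocate a per-thread scratch buffer on first use and remember it. */
void *buf_prep(size_t len)
{
    pthread_once(&buf_key_once, buf_key_init);

    void *buf = pthread_getspecific(buf_key);
    if (buf == NULL) {
        buf = malloc(len);
        if (buf != NULL && pthread_setspecific(buf_key, buf) != 0) {
            free(buf);
            buf = NULL;
        }
    }
    return buf;
}

/*
 * Safe to call even if buf_prep() never ran on this thread: the key is
 * guaranteed to exist, so pthread_getspecific() returns NULL in that case.
 * Returns 1 if a buffer was released, 0 if there was nothing to release.
 */
int buf_clean(void)
{
    pthread_once(&buf_key_once, buf_key_init);

    void *buf = pthread_getspecific(buf_key);
    if (buf == NULL)
        return 0;

    free(buf);
    pthread_setspecific(buf_key, NULL);
    return 1;
}
```

With this arrangement a close path can call buf_clean() unconditionally: if buf_prep() was never called, the lookup yields NULL and nothing is released, which is the behaviour the thread expected from the cleanup function.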