Re: [Xen-devel] Xen 4.3 + tmem = Xen BUG at domain_page.c:143
On Tue, Jun 11, 2013 at 4:30 PM, konrad wilk <konrad.wilk@xxxxxxxxxx> wrote:
> I think this is a more subtle bug.
> I applied a debug patch (see attached) and with the help of it and the logs:
>
> (XEN) domain_page.c:160:d1 mfn (1ebe96) -> 6 idx: 32(i:1,j:0), branch:1
> (XEN) domain_page.c:166:d1 [0] idx=26, mfn=0x1ebcd8, refcnt: 0
> (XEN) domain_page.c:166:d1 [1] idx=12, mfn=0x1ebcd9, refcnt: 0
> (XEN) domain_page.c:166:d1 [2] idx=2, mfn=0x210e9a, refcnt: 0
> (XEN) domain_page.c:166:d1 [3] idx=14, mfn=0x210e9b, refcnt: 0
> (XEN) domain_page.c:166:d1 [4] idx=7, mfn=0x210e9c, refcnt: 0
> (XEN) domain_page.c:166:d1 [5] idx=10, mfn=0x210e9d, refcnt: 0
> (XEN) domain_page.c:166:d1 [6] idx=5, mfn=0x210e9e, refcnt: 0
> (XEN) domain_page.c:166:d1 [7] idx=13, mfn=0x1ebe97, refcnt: 0
> (XEN) Xen BUG at domain_page.c:169
>
> (XEN) ----[ Xen-4.3-unstable x86_64 debug=y Not tainted ]----
> (XEN) CPU: 3
> (XEN) RIP: e008:[<ffff82c4c01606a7>] map_domain_page+0x61d/0x6e1
>
> (XEN) RFLAGS: 0000000000010046 CONTEXT: hypervisor
> (XEN) rax: 0000000000000000 rbx: ffff8300c68f9000 rcx: 0000000000000000
> (XEN) rdx: ffff8302125b2020 rsi: 000000000000000a rdi: ffff82c4c027a6e8
> (XEN) rbp: ffff8302125afcc8 rsp: ffff8302125afc48 r8: 0000000000000004
> (XEN) r9: 0000000000000004 r10: 0000000000000004 r11: 0000000000000001
> (XEN) r12: ffff83022e2ef000 r13: 00000000001ebe96 r14: 0000000000000020
> (XEN) r15: ffff8300c68f9080 cr0: 0000000080050033 cr4: 00000000000426f0
> (XEN) cr3: 0000000209541000 cr2: ffffffffff600400
>
> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008
> (XEN) Xen stack trace from rsp=ffff8302125afc48:
> (XEN) 00000000001ebe97 0000000000000000 0000000000000000 ffff830200000001
> (XEN) ffff8302125afcc8 ffff82c400000000 00000000001ebe97 000000080000000d
> (XEN) ffff83022e2ef2d8 0000000000000286 ffff82c4c0127b6b ffff83022e2ef000
> (XEN) ffff82e003d7d2c0 ffff8302125afd60 00000000001ebe96 0000000000000000
> (XEN) ffff8302125afd38 ffff82c4c01373de 0000000000000000 ffffffffffffffff
> (XEN) 0000000000000001 ffff8302125afd58 ffff83022e2ef2d8 0000000000000286
>
> (XEN) 0000000000000027 0000000000000000 0000000000001000 0000000000000000
> (XEN) 0000000000000000 00000000001ebe96 ffff8302125afd98 ffff82c4c01377c4
> (XEN) 0000000000000000 ffff820040017000 ffff82e003d7d2c0 00000000001ebe96
> (XEN) ffff8302125afd98 ffff830210ecf390 00000000fffffff4 ffff820040009010
> (XEN) ffff820040000f50 ffff83022e2f0c90 ffff8302125afe18 ffff82c4c0135929
> (XEN) 000000160000001e ffff820040000f50 0000000000000000 00000000001ebe96
> (XEN) 0000000000000000 0000000000000000 0000a2f6125afe28 ffff8302125afe00
> (XEN) 0000001675f02b51 ffff83022e2f0c90 ffff830210ecf390 0000000000000000
> (XEN) 0000000000000001 0000000000000065 ffff8302125afef8 ffff82c4c0136510
> (XEN) ffff830200001000 0000000000000000 ffff8302125afe90 255ece02125b2040
> (XEN) 00000003125afe68 00000016742667d1 ffff8302125b2100 0000003d52299000
> (XEN) ffff8300c68f9000 0000000001c9c380 ffff8302125b2100 ffff8302125b1808
> (XEN) 0000000000000004 0000000000000004 0000000000000000 0000000000000000
> (XEN) 000000000000a2f6 0000000000000000 00000000001ebe96 ffff82c4c0126e77
> (XEN) Xen call trace:
> (XEN) [<ffff82c4c01606a7>] map_domain_page+0x61d/0x6e1
>
> (XEN) [<ffff82c4c01373de>] cli_get_page+0x15e/0x17b
> (XEN) [<ffff82c4c01377c4>] tmh_copy_from_client+0x150/0x284
> (XEN) [<ffff82c4c0135929>] do_tmem_put+0x323/0x5c4
> (XEN) [<ffff82c4c0136510>] do_tmem_op+0x5a0/0xbd0
> (XEN) [<ffff82c4c022391b>] syscall_enter+0xeb/0x145
>
> (XEN)
> (XEN)
> (XEN) ****************************************
> (XEN) Panic on CPU 3:
> (XEN) Xen BUG at domain_page.c:169
>
> (XEN) ****************************************
> (XEN)
> (XEN) Manual reset required ('noreboot' specified)
>
> It looks as if the path that is taken is:
>
> 110 idx = find_next_zero_bit(dcache->inuse, dcache->entries,
> dcache->cursor);
> 111 if ( unlikely(idx >= dcache->entries) )
> 112 {
>
> 115 /* /First/, clean the garbage map and update the inuse list. */
> 116 for ( i = 0; i < BITS_TO_LONGS(dcache->entries); i++ )
> 117 {
> 118 dcache->inuse[i] &= ~xchg(&dcache->garbage[i], 0);
> 119 accum |= ~dcache->inuse[i];
>
> This is where accum is computed.
> 120 }
> 121
> 122 if ( accum )
> 123 idx = find_first_zero_bit(dcache->inuse, dcache->entries);
>
> Ok, finds the idx (32),
> 124 else
> 125 {
> .. does not go here.
> 142 }
> 143 BUG_ON(idx >= dcache->entries);
>
> And hits the BUG_ON().
>
> But I am not sure if that is appropriate. Perhaps the BUG_ON was meant as a
> check for the loop (lines 128 -> 141), in case it looped around and never
> found an empty place. But if that is the condition, then that would also look
> suspect, as it might have found an empty hash entry and the idx would still
> end up being 32.
Right -- it is really curious that "accum |= ~dcache->inuse[i]"
managed to leave accum non-zero, while find_first_zero_bit() goes off
the end (as it seems).
It seems like you should add a printk in the first loop:
if(~dcache->inuse[i]) printk(...);
Also, I don't think you've printed what dcache->entries is -- is it 32?
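If dcache->entries does turn out to be 32, one possible explanation (a guess, not something the logs confirm) is that the bits of inuse[0] above bit 31 are always clear, so ~inuse[0] is non-zero even when all 32 valid slots are taken. The sketch below models that in user space. It assumes 64-bit words, as on x86_64; the sketch_* names are hypothetical, and sketch_find_first_zero_bit() is a simplified stand-in, not Xen's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: one bitmap word is 64 bits wide, like unsigned long on x86_64. */
#define WORD_BITS 64u

/*
 * Simplified stand-in for find_first_zero_bit(): returns the index of the
 * first clear bit among the first `size` bits, or `size` if every one of
 * them is set.
 */
unsigned int sketch_find_first_zero_bit(const uint64_t *map, unsigned int size)
{
    assert(map != NULL);
    for (unsigned int i = 0; i < size; i++)
        if (!(map[i / WORD_BITS] & (UINT64_C(1) << (i % WORD_BITS))))
            return i;
    return size;
}

/*
 * Mirrors the shape of the quoted loop: OR together the complement of every
 * whole word, then search only the first `entries` bits. Returns 1 when
 * accum claims a free slot exists but the search finds none.
 */
int sketch_accum_is_misleading(const uint64_t *inuse, unsigned int entries)
{
    uint64_t accum = 0;
    unsigned int nwords = (entries + WORD_BITS - 1) / WORD_BITS;

    assert(inuse != NULL);
    for (unsigned int i = 0; i < nwords; i++)
        accum |= ~inuse[i];   /* also picks up bits beyond `entries` */

    return accum != 0 &&
           sketch_find_first_zero_bit(inuse, entries) >= entries;
}
```

With a single word whose low 32 bits are all set and entries == 32, accum is non-zero but the search returns 32, which is the combination that would trip BUG_ON(idx >= dcache->entries). Whether this is what actually happens here depends on the real value of dcache->entries, hence the question above.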
-George
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel