
Re: [Xen-ia64-devel] BUG at mm.c:605



Hi Alex.

I think this is the same issue as the one reported in the thread below.
http://lists.xensource.com/archives/html/xen-ia64-devel/2006-09/msg00204.html
I tried to convince him, but failed.
I think the best way forward is to revert the patch and then look
for the right fix.

I guess that enabling xennet.rx_copy by default is what started this.
The root cause is that relinquish_memory() writes the P2M table at the
same time as __acquire_grant_for_copy() reads it.
relinquish_memory() on IA64 assumes that no one else reads the P2M
table at the same time. However, that assumption is no longer true.
Although I haven't dug into its details, I suspect that
the race between relinquish_memory() and get_page()
(or __acquire_grant_for_copy()) exists not only on ia64,
but also on x86.
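
To make the suspected interleaving concrete, here is a toy,
single-threaded replay of it in C. This is only a sketch and not Xen
code: struct toy_domain, TOY_INVALID_MFN and the toy_* functions are
hypothetical names made up for illustration, and the flat array stands
in for the real P2M table. It only shows why a reader that assumes the
entry is always present breaks once teardown has cleared it, and how a
reader that checks the entry can fail gracefully instead.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_P2M_ENTRIES 4
#define TOY_INVALID_MFN (~0UL)   /* hypothetical "no mapping" marker */

/* Hypothetical stand-in for a domain and its P2M table. */
struct toy_domain {
    unsigned long p2m[TOY_P2M_ENTRIES];
};

/* Teardown side: clears one entry, as a relinquish path might. */
static void toy_relinquish_step(struct toy_domain *d, size_t gpfn)
{
    d->p2m[gpfn] = TOY_INVALID_MFN;
}

/* Reader that assumes the entry is always populated. */
static unsigned long toy_lookup_unchecked(const struct toy_domain *d,
                                          size_t gpfn)
{
    return d->p2m[gpfn];
}

/*
 * Reader that tolerates teardown having run first.
 * Returns 0 and stores the frame in *mfn on success, -1 otherwise.
 */
static int toy_lookup_checked(const struct toy_domain *d, size_t gpfn,
                              unsigned long *mfn)
{
    unsigned long v;

    if (gpfn >= TOY_P2M_ENTRIES)
        return -1;
    v = d->p2m[gpfn];
    if (v == TOY_INVALID_MFN)
        return -1;
    *mfn = v;
    return 0;
}
```

Calling toy_relinquish_step() on an entry and then
toy_lookup_unchecked() on the same entry hands back TOY_INVALID_MFN to
a caller that never expected it, while toy_lookup_checked() returns -1.
A real fix would also have to make the read and the write safe against
truly concurrent execution, which this replay does not attempt.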

thanks.

On Tue, Oct 17, 2006 at 05:13:21PM -0600, Alex Williamson wrote:
> 
>    I seem to be hitting the BUG below with increasing regularity lately.
> Typically it occurs right when a domU is in the middle of rebooting (it
> appears to have shut down completely, I get a dom0 console prompt, then
> the BUG below).  It's random enough that I haven't been able to isolate
> it to a particular changeset.  Has anyone else seen this or (better yet)
> have a fix?  It's getting in the way of testing all the new patches
> since I'm seeing this on xen-ia64-unstable.hg.  Thanks,
> 
>       Alex
> 
> -- 
> Alex Williamson                             HP Open Source & Linux Org.
> 
> (XEN) BUG at mm.c:605
> (XEN) die_if_kernel: bug check 0
> (XEN) d 0xf000000007c88080 domid 0
> (XEN) vcpu 0xf000000007c60000 vcpu 0
> (XEN) 
> (XEN) CPU 0
> (XEN) psr : 0000101008226038 ifs : 8000000000000288 ip  : [<f000000004048e50>]
> (XEN) ip is at __bug+0x40/0x60
> (XEN) unat: 0000000000000000 pfs : 0000000000000288 rsc : 0000000000000003
> (XEN) rnat: 0000000000004000 bsps: 000000000000435b pr  : 000000000059aa69
> (XEN) ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c8a70033f
> (XEN) csd : 0000000000000000 ssd : 0000000000000000
> (XEN) b0  : f000000004048e50 b6  : f000000004049e10 b7  : a00000010064afa0
> (XEN) f6  : 0fffbccccccccc8c00000 f7  : 0ffdbf300000000000000
> (XEN) f8  : 10001c000000000000000 f9  : 10002a000000000000000
> (XEN) f10 : 0fffe9999999996900000 f11 : 1003e0000000000000000
> (XEN) r1  : f000000004326f00 r2  : 000000000000435b r3  : f0000040fdad7781
> (XEN) r8  : 0000000000000000 r9  : 0000000000000000 r10 : 0000000000000000
> (XEN) r11 : 00000000005a0969 r12 : f000000007c67900 r13 : f000000007c60000
> (XEN) r14 : 0000000000004000 r15 : f00000000412ff2a r16 : 0000000000004001
> (XEN) r17 : f0000000041294fc r18 : 000000000000035b r19 : f0000000041294f8
> (XEN) r20 : f000000007c67860 r21 : f000000007c60010 r22 : 0000000000000080
> (XEN) r23 : 000000000002003c r24 : f000000007c67e20 r25 : f000000007c67e28
> (XEN) r26 : 0000000000000000 r27 : 0000000000000000 r28 : 0000000000000000
> (XEN) r29 : 0000000000000001 r30 : 0000000000000000 r31 : f000000004133db8
> (XEN) 
> (XEN) Call Trace:
> (XEN)  [<f0000000040a0330>] show_stack+0x80/0xa0
> (XEN)                                 sp=f000000007c67520 bsp=f000000007c612c0
> (XEN)  [<f00000000407abc0>] die_if_kernel+0x90/0xe0
> (XEN)                                 sp=f000000007c676f0 bsp=f000000007c61290
> (XEN)  [<f00000000406f330>] ia64_handle_break+0x220/0x2d0
> (XEN)                                 sp=f000000007c676f0 bsp=f000000007c61258
> (XEN)  [<f00000000409d220>] ia64_leave_kernel+0x0/0x310
> (XEN)                                 sp=f000000007c67700 bsp=f000000007c61258
> (XEN)  [<f000000004048e50>] __bug+0x40/0x60
> (XEN)                                 sp=f000000007c67900 bsp=f000000007c61218
> (XEN)  [<f000000004063f10>] lookup_noalloc_domain_pte+0x40/0x130
> (XEN)                                 sp=f000000007c67900 bsp=f000000007c611e8
> (XEN)  [<f000000004065d40>] lookup_domain_mpa+0x30/0x250
> (XEN)                                 sp=f000000007c67900 bsp=f000000007c611b8
> (XEN)  [<f0000000040664f0>] gmfn_to_mfn_foreign+0xc0/0xf0
> (XEN)                                 sp=f000000007c67900 bsp=f000000007c61190
> (XEN)  [<f000000004026910>] __acquire_grant_for_copy+0x3a0/0x490
> (XEN)                                 sp=f000000007c67900 bsp=f000000007c61140
> (XEN)  [<f000000004029870>] do_grant_table_op+0x1f80/0x2c20
> (XEN)                                 sp=f000000007c67900 bsp=f000000007c61038
> (XEN)  [<f00000000402f060>] do_multicall+0x2b0/0x390
> (XEN)                                 sp=f000000007c67960 bsp=f000000007c60fa0
> (XEN)  [<f00000000405ec40>] ia64_hypercall+0x990/0x1020
> (XEN)                                 sp=f000000007c67960 bsp=f000000007c60f40
> (XEN)  [<f00000000406f260>] ia64_handle_break+0x150/0x2d0
> (XEN)                                 sp=f000000007c67df0 bsp=f000000007c60f08
> (XEN)  [<f00000000409d220>] ia64_leave_kernel+0x0/0x310
> (XEN)                                 sp=f000000007c67e00 bsp=f000000007c60f08
> (XEN) domain_crash_sync called from xenmisc.c:108
> (XEN) Domain 0 (vcpu#0) crashed on cpu#0:
> (XEN) d 0xf000000007c88080 domid 0
> (XEN) vcpu 0xf000000007c60000 vcpu 0
> (XEN) 
> (XEN) CPU 0
> (XEN) psr : 0000101208026030 ifs : 800000000000040b ip  : [<a00000010006d4d0>]
> (XEN) ip is at ???
> (XEN) unat: 0000000000000000 pfs : 800000000000040b rsc : 000000000000000b
> (XEN) rnat: 0000000000000000 bsps: e000000079609260 pr  : 00000000005a5a65
> (XEN) ldrs: 0000000001680000 ccv : 0000000000000000 fpsr: 0009804c8a70433f
> (XEN) csd : 0000000000000000 ssd : 0000000000000000
> (XEN) b0  : a00000010006d4a0 b6  : a0000001006aee00 b7  : a00000010064afa0
> (XEN) f6  : 1003e0000000000000000 f7  : 1003e000000000000023a
> (XEN) f8  : 0fffe8e7fdc6000000000 f9  : 1003efffffffffffffc00
> (XEN) f10 : 000000000000000000000 f11 : 000000000000000000000
> (XEN) r1  : a00000010102cc90 r2  : 000000000000000d r3  : 8000000004fb7758
> (XEN) r8  : 0000000004fb7758 r9  : 0000000000000000 r10 : 0000000000000000
> (XEN) r11 : a000000100da9248 r12 : e00000007960fbd0 r13 : e000000079608000
> (XEN) r14 : 8000000004fb7758 r15 : 0000000000000001 r16 : a000000100da9200
> (XEN) r17 : 0000000000000001 r18 : e00000007960fc38 r19 : 00000000000007ff
> (XEN) r20 : ffffffffffff0060 r21 : a000000100fb7758 r22 : 0000000000000080
> (XEN) r23 : e00000007960fc30 r24 : 0000005000000080 r25 : 0000000000000000
> (XEN) r26 : 0000000000000000 r27 : 00000000000003ed r28 : 00028000000403ed
> (XEN) r29 : 0000000000000001 r30 : 0000000000000000 r31 : e00000007960fc28
> (XEN) 
> (XEN) Call Trace:
> (XEN)  [<f0000000040a0330>] show_stack+0x80/0xa0
> (XEN)                                 sp=f000000007c67520 bsp=f000000007c61300
> (XEN)  [<f00000000401f070>] __domain_crash+0xd0/0x110
> (XEN)                                 sp=f000000007c676f0 bsp=f000000007c612d8
> (XEN)  [<f00000000401f0d0>] __domain_crash_synchronous+0x20/0x50
> (XEN)                                 sp=f000000007c676f0 bsp=f000000007c612c0
> (XEN)  [<f00000000407ac00>] die_if_kernel+0xd0/0xe0
> (XEN)                                 sp=f000000007c676f0 bsp=f000000007c61290
> (XEN)  [<f00000000406f330>] ia64_handle_break+0x220/0x2d0
> (XEN)                                 sp=f000000007c676f0 bsp=f000000007c61258
> (XEN)  [<f00000000409d220>] ia64_leave_kernel+0x0/0x310
> (XEN)                                 sp=f000000007c67700 bsp=f000000007c61258
> (XEN)  [<f000000004048e50>] __bug+0x40/0x60
> (XEN)                                 sp=f000000007c67900 bsp=f000000007c61218
> (XEN)  [<f000000004063f10>] lookup_noalloc_domain_pte+0x40/0x130
> (XEN)                                 sp=f000000007c67900 bsp=f000000007c611e8
> (XEN)  [<f000000004065d40>] lookup_domain_mpa+0x30/0x250
> (XEN)                                 sp=f000000007c67900 bsp=f000000007c611b8
> (XEN)  [<f0000000040664f0>] gmfn_to_mfn_foreign+0xc0/0xf0
> (XEN)                                 sp=f000000007c67900 bsp=f000000007c61190
> (XEN)  [<f000000004026910>] __acquire_grant_for_copy+0x3a0/0x490
> (XEN)                                 sp=f000000007c67900 bsp=f000000007c61140
> (XEN)  [<f000000004029870>] do_grant_table_op+0x1f80/0x2c20
> (XEN)                                 sp=f000000007c67900 bsp=f000000007c61038
> (XEN)  [<f00000000402f060>] do_multicall+0x2b0/0x390
> (XEN)                                 sp=f000000007c67960 bsp=f000000007c60fa0
> (XEN)  [<f00000000405ec40>] ia64_hypercall+0x990/0x1020
> (XEN)                                 sp=f000000007c67960 bsp=f000000007c60f40
> (XEN)  [<f00000000406f260>] ia64_handle_break+0x150/0x2d0
> (XEN)                                 sp=f000000007c67df0 bsp=f000000007c60f08
> (XEN)  [<f00000000409d220>] ia64_leave_kernel+0x0/0x310
> (XEN)                                 sp=f000000007c67e00 bsp=f000000007c60f08
> (XEN) 
> (XEN) Call Trace:
> (XEN)  [<f0000000040a0330>] show_stack+0x80/0xa0
> (XEN)                                 sp=f000000007c67520 bsp=f000000007c61300
> (XEN)  [<f00000000401f080>] __domain_crash+0xe0/0x110
> (XEN)                                 sp=f000000007c676f0 bsp=f000000007c612d8
> (XEN)  [<f00000000401f0d0>] __domain_crash_synchronous+0x20/0x50
> (XEN)                                 sp=f000000007c676f0 bsp=f000000007c612c0
> (XEN)  [<f00000000407ac00>] die_if_kernel+0xd0/0xe0
> (XEN)                                 sp=f000000007c676f0 bsp=f000000007c61290
> (XEN)  [<f00000000406f330>] ia64_handle_break+0x220/0x2d0
> (XEN)                                 sp=f000000007c676f0 bsp=f000000007c61258
> (XEN)  [<f00000000409d220>] ia64_leave_kernel+0x0/0x310
> (XEN)                                 sp=f000000007c67700 bsp=f000000007c61258
> (XEN)  [<f000000004048e50>] __bug+0x40/0x60
> (XEN)                                 sp=f000000007c67900 bsp=f000000007c61218
> (XEN)  [<f000000004063f10>] lookup_noalloc_domain_pte+0x40/0x130
> (XEN)                                 sp=f000000007c67900 bsp=f000000007c611e8
> (XEN)  [<f000000004065d40>] lookup_domain_mpa+0x30/0x250
> (XEN)                                 sp=f000000007c67900 bsp=f000000007c611b8
> (XEN)  [<f0000000040664f0>] gmfn_to_mfn_foreign+0xc0/0xf0
> (XEN)                                 sp=f000000007c67900 bsp=f000000007c61190
> (XEN)  [<f000000004026910>] __acquire_grant_for_copy+0x3a0/0x490
> (XEN)                                 sp=f000000007c67900 bsp=f000000007c61140
> (XEN)  [<f000000004029870>] do_grant_table_op+0x1f80/0x2c20
> (XEN)                                 sp=f000000007c67900 bsp=f000000007c61038
> (XEN)  [<f00000000402f060>] do_multicall+0x2b0/0x390
> (XEN)                                 sp=f000000007c67960 bsp=f000000007c60fa0
> (XEN)  [<f00000000405ec40>] ia64_hypercall+0x990/0x1020
> (XEN)                                 sp=f000000007c67960 bsp=f000000007c60f40
> (XEN)  [<f00000000406f260>] ia64_handle_break+0x150/0x2d0
> (XEN)                                 sp=f000000007c67df0 bsp=f000000007c60f08
> (XEN)  [<f00000000409d220>] ia64_leave_kernel+0x0/0x310
> (XEN)                                 sp=f000000007c67e00 bsp=f000000007c60f08
> (XEN) Domain 0 crashed: rebooting machine in 5 seconds.
> 
> 
> 
> _______________________________________________
> Xen-ia64-devel mailing list
> Xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-ia64-devel

-- 
yamahata



 

