
Re: [Xen-devel] LVM userspace causing dom0 crash



On Mon, May 07, 2012 at 11:36:22AM -0400, Christopher S. Aker wrote:
> Xen: 4.1.3-rc1-pre (xenbits @ 23285)
> Dom0: 3.2.6 PAE and 3.3.4 PAE

This looks suspiciously like a fix that went in some time ago, ah:

2cd1c8d x86/paravirt: PTE updates in k(un)map_atomic need to be synchronous, 
regardless of lazy_mmu mode

but that went in 3.2, and you are hitting this on 3.2.6 and 3.3.4, so
that can't be it.
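
For reference, the faulting address in the oops quoted below does look
like a PTE read through a kmap_atomic mapping. A rough sketch of the
arithmetic, assuming that b76c5000 (which shows up twice in the stack
dump) is the user address being unmapped and that bffff000 (EAX) is the
address kmap_atomic returned for the page table. pte_offset is just a
helper name made up for this example:

```shell
#!/bin/bash
# Hypothetical helper: byte offset of the PTE for a virtual address inside
# its page table under x86 PAE (512 entries per table, 8 bytes per entry).
pte_offset() {
    local vaddr=$1
    echo $(( ((vaddr >> 12) & 0x1ff) * 8 ))
}

# Assumption: b76c5000 is the address being unmapped (taken from the stack
# dump) and bffff000 is the mapped page table (EAX in the register dump).
off=$(pte_offset 0xb76c5000)
printf 'pte offset  = 0x%x\n' "$off"
printf 'pte address = 0x%x\n' $(( 0xbffff000 + off ))
```

This prints 0x628 and 0xbffff628, and the latter matches CR2, so the
fault appears to be on the page table mapping itself rather than on
the user address.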

Hm, can you give more details on what parameters you are passing
to dom0 and the hypervisor so I can reproduce it?

Also, could you send me your .config file? Is the underlying
storage SCSI?

And is this only happening on these SuperMicro boxes or are you seeing
this on other hardware as well?
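
Something along these lines, run in dom0, should collect most of what
is asked for above. It is only a sketch: collect_report is a made-up
name, the config locations are the usual conventions and may differ on
your distro (/proc/config.gz only exists with CONFIG_IKCONFIG_PROC), and
it assumes either the xl or the xm toolstack is installed:

```shell
#!/bin/bash
# Sketch: gather boot parameters, kernel config location and SCSI details
# for a bug report. Every step falls back to a placeholder if unavailable.
collect_report() {
    echo "== dom0 kernel command line =="
    cat /proc/cmdline 2>/dev/null || echo "(unavailable)"

    echo "== hypervisor command line =="
    if command -v xl >/dev/null 2>&1; then
        xl info 2>/dev/null | grep xen_commandline
    elif command -v xm >/dev/null 2>&1; then
        xm info 2>/dev/null | grep xen_commandline
    else
        echo "(no xl or xm found)"
    fi

    echo "== kernel config =="
    if [ -r /proc/config.gz ]; then
        echo "/proc/config.gz"
    elif [ -r "/boot/config-$(uname -r)" ]; then
        echo "/boot/config-$(uname -r)"
    else
        echo "(not found, please attach the .config by hand)"
    fi

    echo "== SCSI devices =="
    if [ -r /proc/scsi/scsi ]; then
        cat /proc/scsi/scsi
    else
        echo "(no /proc/scsi/scsi)"
    fi
    return 0
}

collect_report
```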
> 
> We are seeing the below crash on 3.x dom0s.  A simple lvcreate/lvremove
> loop deployed to a few dozen boxes will hit it quite reliably within
> a short time.  This happens with both an older LVM userspace and the
> newest, and in production we have seen this hit on lvremove,
> lvrename, and lvdelete.
> 
> #!/bin/bash
> while true; do
>    lvcreate -L 256M -n test1 vg1; lvremove -f vg1/test1
> done
> 
> BUG: unable to handle kernel paging request at bffff628
> IP: [<c10ebc58>] __page_check_address+0xb8/0x170
> *pdpt = 0000000003cfb027 *pde = 0000000013873067 *pte = 0000000000000000
> Oops: 0000 [#1] SMP
> Modules linked in: ebt_comment ebt_arp ebt_set ebt_limit ebt_ip6
> ebt_ip ip_set_hash_net ip_set ebtable_nat xen_gntdev e1000e
> Pid: 27902, comm: lvremove Not tainted 3.2.6-1 #1 Supermicro X8DT6/X8DT6
> EIP: 0061:[<c10ebc58>] EFLAGS: 00010246 CPU: 6
> EIP is at __page_check_address+0xb8/0x170
> EAX: bffff000 EBX: cbf76dd8 ECX: 00000000 EDX: 00000000
> ESI: bffff628 EDI: e49ed900 EBP: c80ffe60 ESP: c80ffe4c
>  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0069
> Process lvremove (pid: 27902, ti=c80fe000 task=d29adca0 task.ti=c80fe000)
> Stack:
>  e4205000 00000fff da9b6bc0 d0068dc0 e49ed900 c80ffe94 c10ec769 c80ffe84
>  00000000 00000129 00000125 b76c5000 00000001 00000000 d0068c08 d0068dc0
>  b76c5000 e49ed900 c80fff24 c10ecb73 00000002 00000005 35448025 c80ffec4
> Call Trace:
>  [<c10ec769>] try_to_unmap_one+0x29/0x310
>  [<c10ecb73>] try_to_unmap_file+0x83/0x560
>  [<c1005829>] ? xen_pte_val+0xb9/0x140
>  [<c1004116>] ? __raw_callee_save_xen_pte_val+0x6/0x8
>  [<c10e1bf8>] ? vm_normal_page+0x28/0xc0
>  [<c1038e95>] ? kmap_atomic_prot+0x45/0x110
>  [<c10ed13c>] try_to_munlock+0x1c/0x40
>  [<c10e7109>] munlock_vma_page+0x49/0x90
>  [<c10e7247>] munlock_vma_pages_range+0x57/0xa0
>  [<c10e7352>] mlock_fixup+0xc2/0x130
>  [<c10e742c>] do_mlockall+0x6c/0x80
>  [<c10e7469>] sys_munlockall+0x29/0x50
>  [<c166f1d8>] sysenter_do_call+0x12/0x28
> Code: ff c1 ee 09 81 e6 f8 0f 00 00 81 e1 ff 0f 00 00 0f ac ca 0c c1
> e2 05 03 55 ec 89 d0 e8 12 d3 f4 ff 8b 4d 0c 85 c9 8d 34 30 75 0c
> <f7> 06 01 01 00 00 0f 84 84 00 00 00 8b 0d 00 0e 9b c1 89 4d f0
> EIP: [<c10ebc58>] __page_check_address+0xb8/0x170 SS:ESP 0069:c80ffe4c
> CR2: 00000000bffff628
> ---[ end trace 8039aeca9c19f5ab ]---
> note: lvremove[27902] exited with preempt_count 1
> BUG: scheduling while atomic: lvremove/27902/0x00000001
> Modules linked in: ebt_comment ebt_arp ebt_set ebt_limit ebt_ip6
> ebt_ip ip_set_hash_net ip_set ebtable_nat xen_gntdev e1000e
> Pid: 27902, comm: lvremove Tainted: G      D      3.2.6-1 #1
> Call Trace:
>  [<c1040fcd>] __schedule_bug+0x5d/0x70
>  [<c1666fb9>] __schedule+0x679/0x830
>  [<c100828b>] ? xen_restore_fl_direct_reloc+0x4/0x4
>  [<c10a05fc>] ? rcu_enter_nohz+0x3c/0x60
>  [<c13b2070>] ? xen_evtchn_do_upcall+0x20/0x30
>  [<c1001227>] ? hypercall_page+0x227/0x1000
>  [<c10079ea>] ? xen_force_evtchn_callback+0x1a/0x30
>  [<c1667250>] schedule+0x30/0x50
>  [<c166890d>] rwsem_down_failed_common+0x9d/0xf0
>  [<c1668992>] rwsem_down_read_failed+0x12/0x14
>  [<c1346b63>] call_rwsem_down_read_failed+0x7/0xc
>  [<c166814d>] ? down_read+0xd/0x10
>  [<c1086f9a>] acct_collect+0x3a/0x170
>  [<c105028a>] do_exit+0x62a/0x7d0
>  [<c104cb37>] ? kmsg_dump+0x37/0xc0
>  [<c1669ac0>] oops_end+0x90/0xd0
>  [<c1032dbe>] no_context+0xbe/0x190
>  [<c1032f28>] __bad_area_nosemaphore+0x98/0x140
>  [<c1008089>] ? xen_clocksource_read+0x19/0x20
>  [<c10081f7>] ? xen_vcpuop_set_next_event+0x47/0x80
>  [<c1032fe2>] bad_area_nosemaphore+0x12/0x20
>  [<c166bc12>] do_page_fault+0x2d2/0x3f0
>  [<c106e389>] ? hrtimer_interrupt+0x1a9/0x2b0
>  [<c10079ea>] ? xen_force_evtchn_callback+0x1a/0x30
>  [<c1008294>] ? check_events+0x8/0xc
>  [<c100828b>] ? xen_restore_fl_direct_reloc+0x4/0x4
>  [<c1668a44>] ? _raw_spin_unlock_irqrestore+0x14/0x20
>  [<c166b940>] ? spurious_fault+0x130/0x130
>  [<c166932e>] error_code+0x5a/0x60
>  [<c166b940>] ? spurious_fault+0x130/0x130
>  [<c10ebc58>] ? __page_check_address+0xb8/0x170
>  [<c10ec769>] try_to_unmap_one+0x29/0x310
>  [<c10ecb73>] try_to_unmap_file+0x83/0x560
>  [<c1005829>] ? xen_pte_val+0xb9/0x140
>  [<c1004116>] ? __raw_callee_save_xen_pte_val+0x6/0x8
>  [<c10e1bf8>] ? vm_normal_page+0x28/0xc0
>  [<c1038e95>] ? kmap_atomic_prot+0x45/0x110
>  [<c10ed13c>] try_to_munlock+0x1c/0x40
>  [<c10e7109>] munlock_vma_page+0x49/0x90
>  [<c10e7247>] munlock_vma_pages_range+0x57/0xa0
>  [<c10e7352>] mlock_fixup+0xc2/0x130
>  [<c10e742c>] do_mlockall+0x6c/0x80
>  [<c10e7469>] sys_munlockall+0x29/0x50
>  [<c166f1d8>] sysenter_do_call+0x12/0x28
> 
> Thanks,
> -Chris
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxx
> http://lists.xen.org/xen-devel
