[Xen-users] xen 3.0 amd64 crash... seems to be tied into disk i/o, > 4 gig ram
This seems to be a repeatable crash: just do some disk-intensive stuff in domU and then type "sync" :(

The box is a dual Opteron 720 with 8 gig of RAM, one domU and (duh) one dom0, both with approx. 500 meg of RAM allocated. The box has remote power control and a serial console, and I can provide developer access if it helps.

The kernel was compiled locally (on CentOS 4.2 amd64, domU and dom0). The box seems stable under raw Linux 2.6.14.2, but it does generate occasional MCE messages pointing at the northbridge/GART... I spent a day researching that, and didn't come to any conclusion other than that it could be a bogus report specific to amd64 systems with > 4 gig of RAM. There is an IBM page to that effect for an older RHE system... The box has a 3ware controller and SATA drives.

Anyhow, any help would be appreciated. I'm probably going to try to see if the PAE stuff is more stable... but obviously not tonight.

In theory this is a 3.0.0 box, but it might be 3.0-testing... This is pretty Greek to me, but given that it seems reproducible, I should be able to produce any other info required...? Or should I be dumping this into bugzilla?

-Tom

>From root@xxxxxxxxxxxxxxxxxxxxx Thu Dec 8 00:33:19 2005
Date: Thu, 8 Dec 2005 00:21:56 -0800
From: root <root@xxxxxxxxxxxxxxxxxxxxx>
To: tbrown@xxxxxxxxxxxxx
Subject: oops.2.ksymoops

ksymoops 2.4.11 on x86_64 2.6.12.6-xen0.
Options used
     -V (default)
     -K (specified)
     -l /proc/modules (default)
     -o /lib/modules/2.6.12.6-xen0/ (default)
     -m /boot/System.map-2.6.12.6-xen0 (specified)

No modules in ksyms, skipping objects
No ksyms, skipping lsmod
Unable to handle kernel paging request at ffff88001e61b000 RIP:
<ffffffff80220bfb>{memcpy+11}
Oops: 0003 [1]
CPU 0
Pid: 0, comm: swapper Not tainted 2.6.12.6-xen0
RIP: e030:[<ffffffff80220bfb>] <ffffffff80220bfb>{memcpy+11}
Using defaults from ksymoops -t elf64-x86-64 -a i386:x86-64
RSP: e02b:ffffffff80525d50 EFLAGS: 00010246
RAX: ffff88001e61b000 RBX: 000000000000500c RCX: 0000000000000200
RDX: 0000000000000000 RSI: ffff8800040a2000 RDI: ffff88001e61b000
RBP: 0000000000000002 R08: 0000000000000002 R09: ffff8800040a2000
R10: ffff8800040a2000 R11: 0000000000000246 R12: 0000000000000000
R13: ffff800000000000 R14: 7fffffffffffffff R15: 6db6db6db6db6db7
FS: 00002aaaaaac9360(0000) GS:ffffffff80511a00(0000) knlGS:0000000055572460
CS: e033 DS: 0000 ES: 0000
Stack: ffffffff8011a094 ffff8800016a55e8 0000000000000000 ffff880005ac42d8
       ffffffff8011a2cd ffff8800016a55e8 0000000000000000 0000000100000000
       ffff8800147221c0 0000000000000001
Call Trace:<ffffffff8011a094>{__sync_single+100}
       <ffffffff8011a2cd>{unmap_single+109}
       <ffffffff8011aa40>{swiotlb_unmap_sg+192}
       <ffffffff802eb517>{tw_interrupt+1799}
       <ffffffff8014cd9d>{handle_IRQ_event+61}
       <ffffffff8014ce87>{__do_IRQ+167}
       <ffffffff80114dc4>{do_IRQ+52}
       <ffffffff8010d958>{evtchn_do_upcall+136}
       <ffffffff80111e7d>{do_hypervisor_callback+17}
       <ffffffff8010f793>{xen_idle+83}
       <ffffffff8010f793>{xen_idle+83}
       <ffffffff8010f7cf>{cpu_idle+31}
       <ffffffff8052671f>{start_kernel+495}
       <ffffffff80526193>{_sinittext+403}
Code: f3 48 a5 89 d1 f3 a4 c3 66 66 66 90 66 66 66 90 66 66 66 90

>>RIP; ffffffff80220bfb <memcpy+b/b0>   <=====

>>RAX; ffff88001e61b000 <__start___xen_guest+ffff88001e612144/ffffffff800f7144>
>>RSI; ffff8800040a2000 <__start___xen_guest+ffff880004099144/ffffffff800f7144>
>>RDI; ffff88001e61b000 <__start___xen_guest+ffff88001e612144/ffffffff800f7144>
>>R09; ffff8800040a2000 <__start___xen_guest+ffff880004099144/ffffffff800f7144>
>>R10; ffff8800040a2000 <__start___xen_guest+ffff880004099144/ffffffff800f7144>
>>R13; ffff800000000000 <__start___xen_guest+ffff7fffffff7144/ffffffff800f7144>
>>R14; 7fffffffffffffff <__start___xen_guest+7fffffffffff7143/ffffffff800f7144>
>>R15; 6db6db6db6db6db7 <__start___xen_guest+6db6db6db6dadefb/ffffffff800f7144>

Trace; ffffffff8011a094 <__sync_single+64/70>
Trace; ffffffff8011aa40 <swiotlb_unmap_sg+c0/e0>
Trace; ffffffff8014cd9d <handle_IRQ_event+3d/80>
Trace; ffffffff80114dc4 <do_IRQ+34/50>
Trace; ffffffff80111e7d <do_hypervisor_callback+11/18>
Trace; ffffffff8010f793 <xen_idle+53/70>
Trace; ffffffff8052671f <start_kernel+1ef/200>

Code;  ffffffff80220bfb <memcpy+b/b0>
0000000000000000 <_RIP>:
Code;  ffffffff80220bfb <memcpy+b/b0>   <=====
   0:   f3 48 a5    repz movsq %ds:(%rsi),%es:(%rdi)   <=====
Code;  ffffffff80220bfe <memcpy+e/b0>
   3:   89 d1       mov    %edx,%ecx
Code;  ffffffff80220c00 <memcpy+10/b0>
   5:   f3 a4       repz movsb %ds:(%rsi),%es:(%rdi)
Code;  ffffffff80220c02 <memcpy+12/b0>
   7:   c3          retq
Code;  ffffffff80220c03 <memcpy+13/b0>
   8:   66          data16
Code;  ffffffff80220c04 <memcpy+14/b0>
   9:   66          data16
Code;  ffffffff80220c05 <memcpy+15/b0>
   a:   66          data16
Code;  ffffffff80220c06 <memcpy+16/b0>
   b:   90          nop
Code;  ffffffff80220c07 <memcpy+17/b0>
   c:   66          data16
Code;  ffffffff80220c08 <memcpy+18/b0>
   d:   66          data16
Code;  ffffffff80220c09 <memcpy+19/b0>
   e:   66          data16
Code;  ffffffff80220c0a <memcpy+1a/b0>
   f:   90          nop
Code;  ffffffff80220c0b <memcpy+1b/b0>
  10:   66          data16
Code;  ffffffff80220c0c <memcpy+1c/b0>
  11:   66          data16
Code;  ffffffff80220c0d <memcpy+1d/b0>
  12:   66          data16
Code;  ffffffff80220c0e <memcpy+1e/b0>
  13:   90          nop

CR2: ffff88001e61b000
 <0>Kernel panic - not syncing: Aiee, killing interrupt handler!
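For anyone else reading the decode above, here is a minimal sketch of how the "Oops: 0003" code and the RCX value can be read. It assumes the standard x86 page-fault error-code bit layout; the helper names (decode_oops_code, rep_movsq_bytes) are made up for illustration and are not part of ksymoops or the kernel.

```python
# Decode the page-fault error code printed after "Oops:" in the trace.
# Each entry: (bit mask, meaning when set, meaning when clear).
# Bit layout is the architectural x86 page-fault error code.
PF_BITS = [
    (0x01, "protection violation (page present)", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "kernel mode"),
    (0x08, "reserved bit set in page table", None),
    (0x10, "instruction fetch", None),
]


def decode_oops_code(code):
    """Return a list of human-readable flags for a page-fault error code."""
    flags = []
    for mask, if_set, if_clear in PF_BITS:
        text = if_set if code & mask else if_clear
        if text is not None:
            flags.append(text)
    return flags


def rep_movsq_bytes(rcx):
    """Bytes still to be copied by 'repz movsq' with the given RCX count."""
    return rcx * 8  # movsq moves one 8-byte quadword per iteration


# Values taken from the oops above.
oops_code = int("0003", 16)
rcx = 0x200

print(decode_oops_code(oops_code))
print(rep_movsq_bytes(rcx))
```

Read this way, the fault looks like a kernel-mode write to a page that is mapped but not writable, with 4096 bytes (one 4 KiB page) still to copy. CR2 equals RDI, so the faulting address is the destination of the memcpy called from __sync_single. That is only a reading of the register dump, not a diagnosis of the underlying bug.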
>From root@xxxxxxxxxxxxxxxxxxxxx Thu Dec 8 00:43:16 2005
Date: Thu, 8 Dec 2005 00:40:51 -0800
From: root <root@xxxxxxxxxxxxxxxxxxxxx>
To: tbrown@xxxxxxxxxxxxx
Subject: tmpx3.ksymoops

ksymoops 2.4.11 on x86_64 2.6.12.6-xen0. Options used
     -V (default)
     -K (specified)
     -l /proc/modules (default)
     -o /lib/modules/2.6.12.6-xen0/ (default)
     -m /usr/src/linux/System.map (default)

No modules in ksyms, skipping objects
No ksyms, skipping lsmod
Unable to handle kernel paging request at ffff88001e527000 RIP:
<ffffffff80220bfb>{memcpy+11}
Oops: 0003 [1]
CPU 0
Pid: 0, comm: swapper Not tainted 2.6.12.6-xen0
RIP: e030:[<ffffffff80220bfb>] <ffffffff80220bfb>{memcpy+11}
Using defaults from ksymoops -t elf64-x86-64 -a i386:x86-64
RSP: e02b:ffffffff80525d50 EFLAGS: 00010246
RAX: ffff88001e527000 RBX: 0000000000003968 RCX: 0000000000000200
RDX: 0000000000000000 RSI: ffff880003550000 RDI: ffff88001e527000
RBP: 0000000000000002 R08: 0000000000000002 R09: ffff880003550000
R10: ffff880003550000 R11: 0000000000000246 R12: 0000000000000000
R13: ffff800000000000 R14: 7fffffffffffffff R15: 6db6db6db6db6db7
FS: 00002aaaabe8f280(0000) GS:ffffffff80511a00(0000) knlGS:0000000055572460
CS: e033 DS: 0000 ES: 0000
Stack: ffffffff8011a094 ffff8800016a2088 ffffffff00000000 ffff880005ac42d8
       ffffffff8011a2cd ffff8800016a2088 ffffffff00000000 0000000100000000
       ffff8800078caf20 0000000000000001
Call Trace:<ffffffff8011a094>{__sync_single+100}
       <ffffffff8011a2cd>{unmap_single+109}
       <ffffffff8011aa40>{swiotlb_unmap_sg+192}
       <ffffffff802eb517>{tw_interrupt+1799}
       <ffffffff8014cd9d>{handle_IRQ_event+61}
       <ffffffff8014ce87>{__do_IRQ+167}
       <ffffffff80114dc4>{do_IRQ+52}
       <ffffffff8010d958>{evtchn_do_upcall+136}
       <ffffffff80111e7d>{do_hypervisor_callback+17}
       <ffffffff8010f793>{xen_idle+83}
       <ffffffff8010f793>{xen_idle+83}
       <ffffffff8010f7cf>{cpu_idle+31}
       <ffffffff8052671f>{start_kernel+495}
       <ffffffff80526193>{_sinittext+403}
Code: f3 48 a5 89 d1 f3 a4 c3 66 66 66 90 66 66 66 90 66 66 66 90

>>RIP; ffffffff80220bfb <bitmap_parse+bb/210>   <=====

>>RAX; ffff88001e527000 <phys_startup_64+ffff88001e426f00/ffffffff7fffff00>
>>RSI; ffff880003550000 <phys_startup_64+ffff88000344ff00/ffffffff7fffff00>
>>RDI; ffff88001e527000 <phys_startup_64+ffff88001e426f00/ffffffff7fffff00>
>>R09; ffff880003550000 <phys_startup_64+ffff88000344ff00/ffffffff7fffff00>
>>R10; ffff880003550000 <phys_startup_64+ffff88000344ff00/ffffffff7fffff00>
>>R13; ffff800000000000 <phys_startup_64+ffff7fffffefff00/ffffffff7fffff00>
>>R14; 7fffffffffffffff <phys_startup_64+7fffffffffeffeff/ffffffff7fffff00>
>>R15; 6db6db6db6db6db7 <phys_startup_64+6db6db6db6cb6cb7/ffffffff7fffff00>

Trace; ffffffff8011a094 <touch_nmi_watchdog+4/30>
Trace; ffffffff8011aa40 <pin_2_irq+60/130>
Trace; ffffffff8014cd9d <kfifo_init+8d/90>
Trace; ffffffff80114dc4 <pda_init+94/110>
Trace; ffffffff80111e7d <handle_lost_ticks+13d/170>
Trace; ffffffff8010f793 <oops_begin+23/70>
Trace; ffffffff8052671f <__log_buf+e15f/20000>

Code;  ffffffff80220bfb <bitmap_parse+bb/210>
0000000000000000 <_RIP>:
Code;  ffffffff80220bfb <bitmap_parse+bb/210>   <=====
   0:   f3 48 a5    repz movsq %ds:(%rsi),%es:(%rdi)   <=====
Code;  ffffffff80220bfe <bitmap_parse+be/210>
   3:   89 d1       mov    %edx,%ecx
Code;  ffffffff80220c00 <bitmap_parse+c0/210>
   5:   f3 a4       repz movsb %ds:(%rsi),%es:(%rdi)
Code;  ffffffff80220c02 <bitmap_parse+c2/210>
   7:   c3          retq
Code;  ffffffff80220c03 <bitmap_parse+c3/210>
   8:   66          data16
Code;  ffffffff80220c04 <bitmap_parse+c4/210>
   9:   66          data16
Code;  ffffffff80220c05 <bitmap_parse+c5/210>
   a:   66          data16
Code;  ffffffff80220c06 <bitmap_parse+c6/210>
   b:   90          nop
Code;  ffffffff80220c07 <bitmap_parse+c7/210>
   c:   66          data16
Code;  ffffffff80220c08 <bitmap_parse+c8/210>
   d:   66          data16
Code;  ffffffff80220c09 <bitmap_parse+c9/210>
   e:   66          data16
Code;  ffffffff80220c0a <bitmap_parse+ca/210>
   f:   90          nop
Code;  ffffffff80220c0b <bitmap_parse+cb/210>
  10:   66          data16
Code;  ffffffff80220c0c <bitmap_parse+cc/210>
  11:   66          data16
Code;  ffffffff80220c0d <bitmap_parse+cd/210>
  12:   66          data16
Code;  ffffffff80220c0e <bitmap_parse+ce/210>
  13:   90          nop

CR2: ffff88001e527000
 <0>Kernel panic - not syncing: Aiee, killing interrupt handler!
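A note on the second decode: it was run with -m defaulting to /usr/src/linux/System.map, and it resolves RIP to bitmap_parse+bb, while the kernel's own annotation and the first decode (run with -m /boot/System.map-2.6.12.6-xen0) both say memcpy+b. That suggests the default map does not match the running kernel, so the symbol names in the second decode are probably not trustworthy. The sketch below shows the nearest-preceding-symbol lookup that produces this effect. The two map excerpts are reconstructed from the offsets printed in the decodes (the real files are not shown here), the "T" type letter is an assumption, and the function names are hypothetical.

```python
import bisect


def load_system_map(lines):
    """Parse 'address type name' lines into a sorted list of (addr, name)."""
    symbols = []
    for line in lines:
        parts = line.split()
        if len(parts) >= 3:
            symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def resolve(symbols, addr):
    """Map an address to 'name+0xoffset' using the nearest symbol at or below it."""
    addrs = [a for a, _ in symbols]
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return None  # address lies below the first symbol in the map
    base, name = symbols[i]
    return "%s+0x%x" % (name, addr - base)


# Reconstructed excerpts, NOT the real files: each address is RIP minus the
# offset that the corresponding decode printed.
boot_map = load_system_map(["ffffffff80220bf0 T memcpy"])
default_map = load_system_map(["ffffffff80220b40 T bitmap_parse"])

rip = 0xFFFFFFFF80220BFB
print(resolve(boot_map, rip))
print(resolve(default_map, rip))
```

The same address lands in a different function depending on which map is loaded, so it seems safest to rely on the decode made with the map built alongside the running kernel.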