
[Xen-devel] kernel BUG after killing tapdisk2 process


  • To: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>
  • From: tsk <aixt2006@xxxxxxxxx>
  • Date: Fri, 7 May 2010 19:45:58 +0800
  • Delivery-date: Fri, 07 May 2010 04:46:52 -0700
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

Hi folks,

      When a VM is running on Xen 4.0 and I kill tapdisk2, sometimes the kernel only prints:
kernel: end_request: I/O error, dev tapdevh, sector 76068621
kernel: end_request: I/O error, dev tapdevh, sector 76068621
kernel: end_request: I/O error, dev tapdevh, sector 76068621
kernel: end_request: I/O error, dev tapdevh, sector 76068621
kernel: end_request: I/O error, dev tapdevh, sector 76068621
kernel: end_request: I/O error, dev tapdevh, sector 76068621
kernel: end_request: I/O error, dev tapdevh, sector 76068621
.........

Other times a kernel BUG occurs. Note that pid 15130 in the trace is not the process I killed:

May  7 11:41:11 r21a02010 kernel: nbd10: NBD_DISCONNECT
May  7 11:41:11 r21a02010 kernel: nbd10: Receive control failed (result -32)
May  7 11:41:11 r21a02010 kernel: nbd10: shutting down socket
May  7 11:41:11 r21a02010 kernel: nbd10: queue cleared
May  7 11:47:08 r21a02010 kernel: physdev match: using --physdev-out in the OUTPUT, FORWARD and POSTROUTING chains for non-bridged traffic is not supported anymore.
May  7 11:47:08 r21a02010 last message repeated 11 times
May  7 12:01:01 r21a02010 kernel: BUG: unable to handle kernel NULL pointer dereference at (null)
May  7 12:01:01 r21a02010 kernel: IP: [<ffffffff811e6168>] sg_set_page+0x9/0x27
May  7 12:01:01 r21a02010 kernel: PGD 13aee7067 PUD 1f3901067 PMD 0
May  7 12:01:01 r21a02010 kernel: Oops: 0000 [#1] SMP
May  7 12:01:01 r21a02010 kernel: last sysfs file: /sys/hypervisor/properties/capabilities
May  7 12:01:01 r21a02010 kernel: CPU 0
May  7 12:01:01 r21a02010 kernel: Pid: 15130, comm: tapdisk2 Not tainted 2.6.31.13hw #3 RH2285
May  7 12:01:01 r21a02010 kernel: RIP: e030:[<ffffffff811e6168>]  [<ffffffff811e6168>] sg_set_page+0x9/0x27
May  7 12:01:01 r21a02010 kernel: RSP: e02b:ffff88011e1d1768  EFLAGS: 00010202
May  7 12:01:01 r21a02010 kernel: RAX: 0000000000000000 RBX: 0000000000000019 RCX: 0000000000000000
May  7 12:01:01 r21a02010 kernel: RDX: 0000000000001000 RSI: ffff88002ffb4520 RDI: 0000000000000000
May  7 12:01:01 r21a02010 kernel: RBP: ffff88011e1d1768 R08: 00000000000476bc R09: ffff88011e1d18d8
May  7 12:01:01 r21a02010 kernel: R10: 00003ffffffff000 R11: 0000000000000000 R12: ffff8801ef5687d8
May  7 12:01:01 r21a02010 kernel: R13: ffff88026dd04940 R14: ffff8801ef5687d8 R15: ffff8801f9c3fec0
May  7 12:01:01 r21a02010 kernel: FS:  00007f201ec25730(0000) GS:ffffc90000000000(0000) knlGS:0000000000000000
May  7 12:01:01 r21a02010 kernel: CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
May  7 12:01:01 r21a02010 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
May  7 12:01:01 r21a02010 kernel: Process tapdisk2 (pid: 15130, threadinfo ffff88011e1d0000, task ffff8801f8c15b40)
May  7 12:01:01 r21a02010 kernel: Stack:
May  7 12:01:01 r21a02010 kernel:  ffff88011e1d17e8 ffffffff811e6992 ffff88026dd00ae8 ffff880034e2b7a0
May  7 12:01:01 r21a02010 kernel: <0> 000088011e1d1798 ffff8801f9c3feb0 0000000000000000 0000000181071679
May  7 12:01:01 r21a02010 kernel: <0> 0000100000000001 ffff8801f9c3fe40 000000011e1d17e8 0000000000000019
May  7 12:01:01 r21a02010 kernel: Call Trace:
May  7 12:01:01 r21a02010 kernel:  [<ffffffff811e6992>] blk_rq_map_sg+0x20d/0x34f
May  7 12:01:01 r21a02010 kernel:  [<ffffffffa012b4c7>] blktap_device_do_request+0x2bd/0xc05 [blktap]
May  7 12:01:01 r21a02010 kernel:  [<ffffffff8102fa2e>] ? paravirt_leave_lazy_mmu+0x13/0x15
May  7 12:01:01 r21a02010 kernel:  [<ffffffff8100c82b>] ? xen_leave_lazy_mmu+0x13/0x15
May  7 12:01:01 r21a02010 kernel:  [<ffffffff810d554d>] ? unmap_vmas+0x7b5/0x7ca
May  7 12:01:01 r21a02010 kernel:  [<ffffffff811dc88e>] ? freed_request+0x34/0x54
May  7 12:01:01 r21a02010 kernel:  [<ffffffff811dc963>] ? __blk_put_request+0xb5/0xbe
May  7 12:01:01 r21a02010 kernel:  [<ffffffff8100eb79>] ? xen_force_evtchn_callback+0xd/0xf
May  7 12:01:01 r21a02010 kernel:  [<ffffffff8100f2a2>] ? check_events+0x12/0x20
May  7 12:01:01 r21a02010 kernel:  [<ffffffff811dd2e0>] __blk_run_queue+0x42/0x71
May  7 12:01:01 r21a02010 kernel:  [<ffffffff811dfee5>] blk_start_queue+0x43/0x48
May  7 12:01:01 r21a02010 kernel:  [<ffffffffa012bf5d>] blktap_device_restart+0x77/0x90 [blktap]
May  7 12:01:01 r21a02010 kernel:  [<ffffffffa012a861>] blktap_run_deferred+0xcb/0xe2 [blktap]
May  7 12:01:01 r21a02010 kernel:  [<ffffffffa012a013>] blktap_ring_ioctl+0x1be/0x373 [blktap]
May  7 12:01:01 r21a02010 kernel:  [<ffffffff81041643>] ? should_resched+0xe/0x2f
May  7 12:01:01 r21a02010 kernel:  [<ffffffff81402ecc>] ? _cond_resched+0xe/0x22
May  7 12:01:01 r21a02010 kernel:  [<ffffffff811005db>] ? set_fd_set+0x3e/0x48
May  7 12:01:01 r21a02010 kernel:  [<ffffffff8110135d>] ? core_sys_select+0x1de/0x212
May  7 12:01:01 r21a02010 kernel:  [<ffffffff81121f63>] ? aio_read_evt+0xc1/0xd0
May  7 12:01:01 r21a02010 kernel:  [<ffffffff810ff332>] vfs_ioctl+0x63/0x7c
May  7 12:01:01 r21a02010 kernel:  [<ffffffff810ff841>] do_vfs_ioctl+0x479/0x4cc
May  7 12:01:01 r21a02010 kernel:  [<ffffffff8100f0de>] ? xen_clocksource_read+0x21/0x23
May  7 12:01:01 r21a02010 kernel:  [<ffffffff8100f1ad>] ? xen_clocksource_get_cycles+0x9/0x1c
May  7 12:01:01 r21a02010 kernel:  [<ffffffff81070fb3>] ? clocksource_read+0xf/0x11
May  7 12:01:01 r21a02010 kernel:  [<ffffffff81071679>] ? getnstimeofday+0x5b/0xbb
May  7 12:01:01 r21a02010 kernel:  [<ffffffff810ff8f0>] sys_ioctl+0x5c/0x7f
May  7 12:01:01 r21a02010 kernel:  [<ffffffff81058061>] ? sys_gettimeofday+0x39/0x72
May  7 12:01:01 r21a02010 kernel:  [<ffffffff81013db2>] system_call_fastpath+0x16/0x1b
May  7 12:01:01 r21a02010 kernel: RIP  [<ffffffff811e6168>] sg_set_page+0x9/0x27
May  7 12:01:01 r21a02010 kernel:  RSP <ffff88011e1d1768>
May  7 12:01:01 r21a02010 kernel: CR2: 0000000000000000
May  7 12:01:01 r21a02010 kernel: ---[ end trace bdf85010baffa3b6 ]---
May  7 12:01:01 r21a02010 kernel: blktap_device_fail_pending_requests: 252:7: failing pending write of 3 pages
May  7 12:01:01 r21a02010 kernel: end_request: I/O error, dev tapdevh, sector 76067029
May  7 12:01:01 r21a02010 kernel: blktap_device_fail_pending_requests: 252:7: failing pending write of 2 pages
May  7 12:01:01 r21a02010 kernel: end_request: I/O error, dev tapdevh, sector 76067093
May  7 12:01:01 r21a02010 kernel: blktap_device_fail_pending_requests: 252:7: failing pending write of 1 pages
May  7 12:01:01 r21a02010 kernel: end_request: I/O error, dev tapdevh, sector 76067309
May  7 12:01:01 r21a02010 kernel: blktap_device_fail_pending_requests: 252:7: failing pending write of 3 pages
May  7 12:01:01 r21a02010 kernel: end_request: I/O error, dev tapdevh, sector 76067357
May  7 12:01:01 r21a02010 kernel: blktap_device_fail_pending_requests: 252:7: failing pending write of 1 pages
May  7 12:01:01 r21a02010 kernel: end_request: I/O error, dev tapdevh, sector 76067421
May  7 12:01:01 r21a02010 kernel: blktap_device_fail_pending_requests: 252:7: failing pending write of 1 pages
May  7 12:01:01 r21a02010 kernel: end_request: I/O error, dev tapdevh, sector 76074525
May  7 12:01:01 r21a02010 kernel: blktap_device_fail_pending_requests: 252:7: failing pending write of 3 pages
May  7 12:01:01 r21a02010 kernel: end_request: I/O error, dev tapdevh, sector 76067621
May  7 12:01:01 r21a02010 kernel: blktap_device_fail_pending_requests: 252:7: failing pending write of 2 pages
May  7 12:01:01 r21a02010 kernel: end_request: I/O error, dev tapdevh, sector 76067893
May  7 12:01:01 r21a02010 kernel: blktap_device_fail_pending_requests: 252:7: failing pending write of 1 pages
May  7 12:01:01 r21a02010 kernel: end_request: I/O error, dev tapdevh, sector 76068109
May  7 12:01:01 r21a02010 kernel: blktap_device_fail_pending_requests: 252:7: failing pending write of 3 pages
May  7 12:01:01 r21a02010 kernel: end_request: I/O error, dev tapdevh, sector 76068621
May  7 12:01:01 r21a02010 kernel: end_request: I/O error, dev tapdevh, sector 76074717
May  7 12:01:01 r21a02010 kernel: end_request: I/O error, dev tapdevh, sector 76068685
May  7 12:01:01 r21a02010 kernel: end_request: I/O error, dev tapdevh, sector 76068749
May  7 12:01:01 r21a02010 kernel: end_request: I/O error, dev tapdevh, sector 76062981
May  7 12:01:01 r21a02010 kernel: end_request: I/O error, dev tapdevh, sector 76062901
May  7 12:01:01 r21a02010 kernel: end_request: I/O error, dev tapdevh, sector 76070221
May  7 12:01:01 r21a02010 kernel: end_request: I/O error, dev tapdevh, sector 76064677
May  7 12:01:01 r21a02010 kernel: end_request: I/O error, dev tapdevh, sector 76064341
May  7 12:01:01 r21a02010 kernel: end_request: I/O error, dev tapdevh, sector 76063005
May  7 12:01:01 r21a02010 kernel: end_request: I/O error, dev tapdevh, sector 76066661
May  7 12:01:01 r21a02010 kernel: end_request: I/O error, dev tapdevh, sector 76074637
May  7 12:01:01 r21a02010 kernel: end_request: I/O error, dev tapdevh, sector 76063037
May  7 12:01:01 r21a02010 kernel: end_request: I/O error, dev tapdevh, sector 76062797
May  7 12:01:01 r21a02010 kernel: end_request: I/O error, dev tapdevh, sector 76070165
May  7 12:01:01 r21a02010 kernel: end_request: I/O error, dev tapdevh, sector 18069756
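
To make the call trace above easier to read, here is a rough Python sketch that keeps only the frames not prefixed with "?". In a kernel stack dump the "?" entries are addresses found on the stack that the unwinder could not confirm, so they may be stale. The function name reliable_frames and the regular expression are my own assumptions for illustration; they are not part of any Xen or kernel tool.

```python
import re

# Matches call-trace entries such as:
#   [<ffffffff811e6992>] blk_rq_map_sg+0x20d/0x34f
#   [<ffffffff8102fa2e>] ? paravirt_leave_lazy_mmu+0x13/0x15
#   [<ffffffffa012b4c7>] blktap_device_do_request+0x2bd/0xc05 [blktap]
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<sym>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def reliable_frames(log_text):
    """Return (symbol, module) for each call-trace frame not marked '?'.

    module is None for frames that belong to the core kernel image.
    """
    frames = []
    in_trace = False
    for line in log_text.splitlines():
        if "Call Trace:" in line:
            in_trace = True
            continue
        if not in_trace:
            continue
        match = FRAME_RE.search(line)
        # The trace ends at the trailing "RIP" line or at the first
        # line that is not a frame at all.
        if match is None or "RIP" in line:
            in_trace = False
            continue
        if not match.group("unreliable"):
            frames.append((match.group("sym"), match.group("module")))
    return frames


# A few lines copied from the log above, with the syslog prefix shortened.
SAMPLE = """\
kernel: Call Trace:
kernel:  [<ffffffff811e6992>] blk_rq_map_sg+0x20d/0x34f
kernel:  [<ffffffffa012b4c7>] blktap_device_do_request+0x2bd/0xc05 [blktap]
kernel:  [<ffffffff8102fa2e>] ? paravirt_leave_lazy_mmu+0x13/0x15
kernel:  [<ffffffff811dd2e0>] __blk_run_queue+0x42/0x71
kernel: RIP  [<ffffffff811e6168>] sg_set_page+0x9/0x27
"""

if __name__ == "__main__":
    for sym, module in reliable_frames(SAMPLE):
        print(sym, "[%s]" % module if module else "")
```

Run against the full log, this should leave the path from sys_ioctl through blktap_ring_ioctl, blktap_run_deferred, blktap_device_restart and blktap_device_do_request down to blk_rq_map_sg, which is where sg_set_page faults.
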



So the tapdisk2 process should never be killed while the VM is running, or you will need to power-cycle the machine.
What can we do to avoid this?


T
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel

 

