Xen project Mailing List

Re: [Xen-devel] netback Oops then xenwatch stuck in D state

To: "Christopher S. Aker" <caker@xxxxxxxxxxxx>

Date: Mon, 11 Feb 2013 11:45:03 +0000

Cc: wei.liu2@xxxxxxxxxx, "xen-devel@xxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxx>

Delivery-date: Mon, 11 Feb 2013 11:45:44 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

On Sun, 2013-02-10 at 22:03 +0000, Christopher S. Aker wrote:
> And another this afternoon on a different machine:
>
> BUG: unable to handle kernel NULL pointer dereference at 00000000000008b8

OK, so the guest is faulting at a different offset now. It is very likely
that there is an OOM or a race condition in other places. And judging from
your two emails, I presume this bug can be reproduced reliably.

> IP: [<ffffffff81011dda>] xen_spin_lock_flags+0x3a/0x80
> PGD 0
> Oops: 0002 [#1] SMP
> Modules linked in: ebt_comment ebt_arp ebt_set ebt_limit ebt_ip6 ebt_ip
> ip_set_hash_net ip_set ebtable_nat xen_gntdev bonding ebtable_filter e1000e
> CPU 5
> Pid: 1550, comm: netback/5 Not tainted 3.7.6-1-x86_64 #1 Supermicro
> X8DT6/X8DT6
> RIP: e030:[<ffffffff81011dda>] [<ffffffff81011dda>]
> xen_spin_lock_flags+0x3a/0x80
> RSP: e02b:ffff8800836e7b58 EFLAGS: 00010006
> RAX: 0000000000000400 RBX: 00000000000008b8 RCX: 000000000045de5d
> RDX: 0000000000000001 RSI: 0000000000000211 RDI: 00000000000008b8
> RBP: ffff8800836e7b78 R08: 0000000000000068 R09: 0000000000000000
> R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000001
> R13: 0000000000000200 R14: 0000000000000400 R15: 000000000045de5d
> FS: 00007f474a465700(0000) GS:ffff880100740000(0000) knlGS:0000000000000000
> CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 00000000000008b8 CR3: 0000000001c0b000 CR4: 0000000000002660
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process netback/5 (pid: 1550, threadinfo ffff8800836e6000, task
> ffff880084510000)
> Stack:
> 0000000000000211 00000000000008b8 ffff8800771e5700 ffff8800771e57d8
> ffff8800836e7b98 ffffffff817605da 0000000000000000 00000000000008b8
> ffff8800836e7bd8 ffffffff8154446f ffff8800771e5000 0000000000000000
> Call Trace:
> [<ffffffff817605da>] _raw_spin_lock_irqsave+0x2a/0x40
> [<ffffffff8154446f>] xen_netbk_schedule_xenvif+0x8f/0x100
> [<ffffffff81544505>] xen_netbk_check_rx_xenvif+0x25/0x60
> [<ffffffff815445eb>] netbk_tx_err+0x5b/0x70
> [<ffffffff8154518c>] xen_netbk_tx_build_gops+0xb8c/0xbc0
> [<ffffffff81012880>] ? __switch_to+0x160/0x4f0
> [<ffffffff810891b8>] ? idle_balance+0xf8/0x150
> [<ffffffff81080150>] ? finish_task_switch+0x60/0xd0
> [<ffffffff8175f7b4>] ? __schedule+0x394/0x750
> [<ffffffff815452af>] xen_netbk_kthread+0xef/0x9d0
> [<ffffffff81080150>] ? finish_task_switch+0x60/0xd0
> [<ffffffff810720c0>] ? wake_up_bit+0x40/0x40
> [<ffffffff815451c0>] ? xen_netbk_tx_build_gops+0xbc0/0xbc0
> [<ffffffff81071a06>] kthread+0xc6/0xd0
> [<ffffffff810037b9>] ? xen_end_context_switch+0x19/0x20
> [<ffffffff81071940>] ? kthread_freezable_should_stop+0x70/0x70
> [<ffffffff8176847c>] ret_from_fork+0x7c/0xb0
> [<ffffffff81071940>] ? kthread_freezable_should_stop+0x70/0x70

[snip]

>
> We're not so good at this, but it looks like xl->lock deref is what we
> hit? The lock was gone?
>

A quick check of the xen_spinlock struct shows that its offset should not
be 0x8b8. Reading the backtrace suggests that it is the netbk struct that
is gone.

Do you manipulate the number of vcpus Dom0 has after it's up?


Wei.

> -Chris
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxx
> http://lists.xen.org/xen-devel
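For readers following the reasoning about the 0x8b8 address, here is a minimal sketch of how the numbers in the quoted oops can be read. It is not part of the original message. The constants are copied from the oops; the function names and the 4 KiB page size are assumptions made for illustration, and treating RDI as the lock pointer relies on the usual x86-64 calling convention rather than on anything confirmed in the thread.

```python
# Values copied from the quoted oops. Function names below are illustrative.
OOPS_ERROR_CODE = 0x0002          # from "Oops: 0002"
CR2 = 0x00000000000008B8          # faulting address reported by the CPU
RDI = 0x00000000000008B8          # usually the first argument on x86-64

PAGE_SIZE = 4096                  # assumption: 4 KiB pages


def decode_pf_error(code):
    """Decode the low three bits of an x86 page-fault error code."""
    return {
        "present": bool(code & 0x1),  # 0: page not present, 1: protection fault
        "write": bool(code & 0x2),    # 0: read access, 1: write access
        "user": bool(code & 0x4),     # 0: kernel mode, 1: user mode
    }


def looks_like_null_member_access(fault_addr, page_size=PAGE_SIZE):
    """True if the address lies in the first page, i.e. NULL plus a small offset."""
    return 0 <= fault_addr < page_size


if __name__ == "__main__":
    print(decode_pf_error(OOPS_ERROR_CODE))
    print(hex(CR2), looks_like_null_member_access(CR2), CR2 == RDI)
```

Read this way, the fault is a kernel-mode write to a non-present page at an address inside the first page, and that address matches the pointer passed down to the spinlock code. That pattern is what one would expect if the lock is a member at offset 0x8b8 of a larger structure whose base pointer is NULL, which is consistent with the suggestion that the netbk struct, not the lock itself, is what went missing.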
