
Re: [Xen-devel] netback Oops then xenwatch stuck in D state



On 2/13/13 7:47 AM, Wei Liu wrote:
> On Wed, 2013-02-13 at 02:51 +0000, Christopher S. Aker wrote:
>> On 2/12/13 4:58 AM, Ian Campbell wrote:
>>> Have you applied the XSA-39 fixes to this kernel?
>>
>> Yes!  When I rebuilt with Wei's suggested patch for my original
>> netback/xenwatch problem I also brought us up to date with XSA patches.
> Good to have more context.

We have found a way to reproduce a very similar BUG: keep a guest's network I/O busy, then run "ifconfig down" on its vif from the host. The result is the dump below. It only triggers while the guest is doing network I/O.

We can reproduce this regardless of whether dom0 is patched for XSA-39, and it is triggerable at least as far back as a 3.2.6 dom0.

Procedure:

Launch a guest, run iperf [in TCP mode] between it and another box on the network, then bring down its vif on the host.

root@dom0:~# ifconfig vif14.0 down <-- insta-boom
br0: port 3(vif14.0) entered disabled state
unable to handle kernel NULL pointer dereference at 00000000000008b8
IP: [<ffffffff81011dda>] xen_spin_lock_flags+0x3a/0x80
PGD 0
Oops: 0002 [#1] SMP
Modules linked in: ebt_comment ebt_arp ebt_set ebt_limit ebt_ip6 ebt_ip ip_set_hash_net ip_set xt_physdev iptable_filter ip_tables ebtable_nat xen_gntdev bonding ebtable_filter igb
CPU 1
Pid: 0, comm: swapper/1 Not tainted 3.7.6-1-x86_64 #1 Supermicro X9SRE/X9SRE-3F/X9SRi/X9SRi-3F/X9SRE/X9SRE-3F/X9SRi/X9SRi-3F
RIP: e030:[<ffffffff81011dda>] [<ffffffff81011dda>] xen_spin_lock_flags+0x3a/0x80
RSP: e02b:ffff880141843d60  EFLAGS: 00010006
RAX: 0000000000000400 RBX: 00000000000008b8 RCX: 0000000000012739
RDX: 0000000000000001 RSI: 0000000000000222 RDI: 00000000000008b8
RBP: ffff880141843d80 R08: 0000000000000144 R09: 0000000000000003
R10: 0000000000000003 R11: dead000000200200 R12: 0000000000000001
R13: 0000000000000200 R14: 0000000000000400 R15: ffff8800216ba700
FS: 00007f03fa88a700(0000) GS:ffff880141840000(0000) knlGS:0000000000000000
CS:  e033 DS: 002b ES: 002b CR0: 000000008005003b
CR2: 00000000000008b8 CR3: 0000000001c0b000 CR4: 0000000000002660
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper/1 (pid: 0, threadinfo ffff880101138000, task ffff8801011049c0)
Stack:
 0000000000000222 00000000000008b8 ffff8800216ba700 ffff8800216ba7d8
 ffff880141843da0 ffffffff817605da 0000000000000000 00000000000008b8
 ffff880141843de0 ffffffff8154446f ffff88014184e5b8 ffff88014184e578
Call Trace:
 <IRQ>
 [<ffffffff817605da>] _raw_spin_lock_irqsave+0x2a/0x40
 [<ffffffff8154446f>] xen_netbk_schedule_xenvif+0x8f/0x100
 [<ffffffff81544540>] ? xen_netbk_check_rx_xenvif+0x60/0x60
 [<ffffffff81544505>] xen_netbk_check_rx_xenvif+0x25/0x60
 [<ffffffff81544589>] tx_credit_callback+0x49/0x50
 [<ffffffff8105ee04>] call_timer_fn+0x44/0x120
 [<ffffffff8105f411>] run_timer_softirq+0x241/0x2b0
 [<ffffffff81544540>] ? xen_netbk_check_rx_xenvif+0x60/0x60
 [<ffffffff8105731f>] __do_softirq+0xcf/0x250
 [<ffffffff810c1253>] ? handle_percpu_irq+0x43/0x60
 [<ffffffff8176971c>] call_softirq+0x1c/0x30
 [<ffffffff81015425>] do_softirq+0x65/0xa0
 [<ffffffff8105710d>] irq_exit+0xbd/0xe0
 [<ffffffff8141a73f>] xen_evtchn_do_upcall+0x2f/0x40
 [<ffffffff8176977e>] xen_do_hypervisor_callback+0x1e/0x30
 <EOI>
 [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
 [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
 [<ffffffff81009ae0>] ? xen_safe_halt+0x10/0x20
 [<ffffffff8101c168>] ? default_idle+0x58/0x1b0
 [<ffffffff8101b8a8>] ? cpu_idle+0x88/0xd0
 [<ffffffff817541de>] ? cpu_bringup_and_idle+0xe/0x10
Code: 24 08 4c 89 6c 24 10 4c 89 74 24 18 49 89 f5 48 89 fb 41 81 e5 00 02 00 00 41 bc 01 00 00 00 41 be 00 04 00 00 44 89 f0 44 89 e2 <86> 13 84 d2 74 0b f3 90 80 3b 00 74 f3 ff c8 75 f5 84 d2 75 15
RIP  [<ffffffff81011dda>] xen_spin_lock_flags+0x3a/0x80
 RSP <ffff880141843d60>
CR2: 00000000000008b8
---[ end trace 337eb85a44e2f0ef ]---
Kernel panic - not syncing: Fatal exception in interrupt
(XEN) Domain 0 crashed: rebooting machine in 5 seconds.
(XEN) Resetting with ACPI MEMORY or I/O RESET_REG.
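
For anyone who wants to try this, the procedure above can be sketched roughly as the script below. It is only a sketch under assumptions: the peer address (192.0.2.10) and the 600 second iperf duration are placeholders that are not from our setup, and the function names are made up for illustration. It defaults to a dry run that only prints the commands, since the real thing took dom0 down for us.

```shell
#!/bin/sh
# Sketch of the reproduction steps. VIF and PEER are placeholders.
# DRY_RUN=1 (the default) only prints the commands instead of running them.
VIF="${VIF:-vif14.0}"        # backend interface of the guest, as seen in dom0
PEER="${PEER:-192.0.2.10}"   # hypothetical address of the box running "iperf -s"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Step 1 (inside the guest): keep TCP traffic flowing to the peer.
guest_step() {
    run iperf -c "$PEER" -t 600
}

# Step 2 (in dom0, while iperf is still running): take the vif down.
host_step() {
    run ifconfig "$VIF" down
}
```

Call guest_step inside the guest and host_step in dom0 while the transfer is still running. Set DRY_RUN=0 only on a host you can afford to have reboot.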

-Chris


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

