Re: [Xen-devel] netback Oops then xenwatch stuck in D state
On Wed, Feb 13, 2013 at 08:12:44PM +0000, Christopher S. Aker wrote:
> On 2/13/13 7:47 AM, Wei Liu wrote:
> > On Wed, 2013-02-13 at 02:51 +0000, Christopher S. Aker wrote:
> >> On 2/12/13 4:58 AM, Ian Campbell wrote:
> >>> Have you applied the XSA-39 fixes to this kernel?
> >>
> >> Yes! When I rebuilt with Wei's suggested patch for my original
> >> netback/xenwatch problem I also brought us up to date with XSA patches.
> > Good to have more context.
>
> We have found a way to reproduce a very similar BUG by keeping a guest's
> network IO busy and then from the host "ifconfig down" the vif. It
> results in the following dump. This only works if the guest is doing
> network I/O.
>
> We can reproduce regardless if dom0 is patched with XSA-39, and is
> trigger-able at least as far back as 3.2.6 dom0.
>
> Procedure:
>
> Launch a guest and configure iperf [in TCP mode] between it and another
> box on the network then bring down its vif on the host.
>
> root@dom0:~# ifconfig vif14.0 down <-- insta-boom
> br0: port 3(vif14.0) entered disabled state
> unable to handle kernel NULL pointer dereference at 00000000000008b8
> IP: [<ffffffff81011dda>] xen_spin_lock_flags+0x3a/0x80
> PGD 0
> Oops: 0002 [#1] SMP
> Modules linked in: ebt_comment ebt_arp ebt_set ebt_limit ebt_ip6 ebt_ip
> ip_set_hash_net ip_set xt_physdev iptable_filter ip_tables ebtable_nat
> xen_gntdev bonding ebtable_filter igb
> CPU 1
> Pid: 0, comm: swapper/1 Not tainted 3.7.6-1-x86_64 #1 Supermicro
> X9SRE/X9SRE-3F/X9SRi/X9SRi-3F/X9SRE/X9SRE-3F/X9SRi/X9SRi-3F
> RIP: e030:[<ffffffff81011dda>] [<ffffffff81011dda>]
> xen_spin_lock_flags+0x3a/0x80
> RSP: e02b:ffff880141843d60 EFLAGS: 00010006
> RAX: 0000000000000400 RBX: 00000000000008b8 RCX: 0000000000012739
> RDX: 0000000000000001 RSI: 0000000000000222 RDI: 00000000000008b8
> RBP: ffff880141843d80 R08: 0000000000000144 R09: 0000000000000003
> R10: 0000000000000003 R11: dead000000200200 R12: 0000000000000001
> R13: 0000000000000200 R14: 0000000000000400 R15: ffff8800216ba700
> FS: 00007f03fa88a700(0000) GS:ffff880141840000(0000)
> knlGS:0000000000000000
> CS: e033 DS: 002b ES: 002b CR0: 000000008005003b
> CR2: 00000000000008b8 CR3: 0000000001c0b000 CR4: 0000000000002660
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process swapper/1 (pid: 0, threadinfo ffff880101138000, task
> ffff8801011049c0)
> Stack:
> 0000000000000222 00000000000008b8 ffff8800216ba700 ffff8800216ba7d8
> ffff880141843da0 ffffffff817605da 0000000000000000 00000000000008b8
> ffff880141843de0 ffffffff8154446f ffff88014184e5b8 ffff88014184e578
> Call Trace:
> <IRQ>
> [<ffffffff817605da>] _raw_spin_lock_irqsave+0x2a/0x40
> [<ffffffff8154446f>] xen_netbk_schedule_xenvif+0x8f/0x100
> [<ffffffff81544540>] ? xen_netbk_check_rx_xenvif+0x60/0x60
> [<ffffffff81544505>] xen_netbk_check_rx_xenvif+0x25/0x60
> [<ffffffff81544589>] tx_credit_callback+0x49/0x50
> [<ffffffff8105ee04>] call_timer_fn+0x44/0x120
> [<ffffffff8105f411>] run_timer_softirq+0x241/0x2b0
> [<ffffffff81544540>] ? xen_netbk_check_rx_xenvif+0x60/0x60
> [<ffffffff8105731f>] __do_softirq+0xcf/0x250
> [<ffffffff810c1253>] ? handle_percpu_irq+0x43/0x60
> [<ffffffff8176971c>] call_softirq+0x1c/0x30
> [<ffffffff81015425>] do_softirq+0x65/0xa0
> [<ffffffff8105710d>] irq_exit+0xbd/0xe0
> [<ffffffff8141a73f>] xen_evtchn_do_upcall+0x2f/0x40
> [<ffffffff8176977e>] xen_do_hypervisor_callback+0x1e/0x30

Notice that the trace log is different here; this looks like a vanilla
bug to me. It is the timer callback that triggers the oops.

This one should be simple to fix: we should also shut down the timer
when shutting down the vif.

I will get to this tomorrow. I need to have a rest now. :-)

Wei.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
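To illustrate the fix Wei describes (shut down the timer when shutting down the vif), here is a minimal userspace model in C. It is only a sketch, not the netback code. All `model_*` names and structures are hypothetical stand-ins. It assumes the oops comes from the credit timer callback running after the vif's backend pointer has been cleared, which would be consistent with the faulting address being a small offset from NULL. In the kernel, the cancel step would presumably be done with something like `del_timer_sync()` on the vif's credit timer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a kernel timer. */
struct model_timer {
    bool pending;
    void (*fn)(void *data);
    void *data;
};

/* Hypothetical stand-in for the per-backend state the callback touches. */
struct model_netbk {
    int scheduled; /* how many times a vif was scheduled */
};

/* Hypothetical stand-in for a vif. */
struct model_vif {
    struct model_netbk *netbk;       /* NULL once the vif is shut down */
    struct model_timer credit_timer; /* models the tx credit timer */
    int would_oops;                  /* callbacks that ran with netbk == NULL */
};

static void model_timer_arm(struct model_timer *t, void (*fn)(void *), void *data)
{
    assert(fn != NULL);
    t->fn = fn;
    t->data = data;
    t->pending = true;
}

/* Cancel a pending timer; returns true if it was still pending. */
static bool model_timer_cancel(struct model_timer *t)
{
    bool was_pending = t->pending;
    t->pending = false;
    return was_pending;
}

/* Simulate timer expiry: run the callback only if the timer is pending. */
static void model_timer_fire(struct model_timer *t)
{
    if (!t->pending)
        return;
    t->pending = false;
    t->fn(t->data);
}

/* Models the credit callback: it needs vif->netbk to be valid. */
static void model_credit_callback(void *data)
{
    struct model_vif *vif = data;

    if (vif->netbk == NULL) {
        /* The real kernel would dereference NULL here; we only count it. */
        vif->would_oops++;
        return;
    }
    vif->netbk->scheduled++;
}

/* Shut the vif down, optionally cancelling the timer first (the fix). */
static void model_vif_down(struct model_vif *vif, bool cancel_timer)
{
    if (cancel_timer)
        model_timer_cancel(&vif->credit_timer);
    vif->netbk = NULL;
}

/* Arm the timer, take the vif down, then let the timer expire.
 * Returns how many times the callback ran against a dead vif. */
int model_run(bool cancel_timer)
{
    struct model_netbk netbk = { 0 };
    struct model_vif vif = { .netbk = &netbk };

    model_timer_arm(&vif.credit_timer, model_credit_callback, &vif);
    model_vif_down(&vif, cancel_timer);
    model_timer_fire(&vif.credit_timer);
    return vif.would_oops;
}
```

Calling `model_run(false)` models the reported sequence (timer still armed when the vif goes down), while `model_run(true)` models the proposed ordering, where the timer is cancelled before the state it depends on is torn down.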