Re: [Xen-devel] Trying to unmap invalid handle! pending_idx: @ drivers/net/xen-netback/netback.c:998 causes kernel panic/reboot
Hello

On Mon, Jul 14, 2014 at 04:25:54AM +0200, Armin Zentai wrote:
> Dear Xen Developers!
>
> We're running Xen on multiple machines, most of them Dell R410 or SM
> X8DTL, with one E5645 CPU and 48 GB of RAM. We've updated the kernel to
> 3.15.4 after some of our hypervisors started rebooting at random times.
>
> The logs were empty, and we had no information about the crashes. We
> tried some tricks, and in the end the netconsole kernel module helped,
> so we can do a very thin layer of remote kernel logging. We found the
> following in the remote logs:

It's good you've got netconsole working. I would still like to point out
that we have a wiki page on setting up serial console on Xen, which might
be helpful.

http://wiki.xen.org/wiki/Xen_Serial_Console

> Jul 13 00:46:58 node11 [157060.106323] vif vif-2-0 h14z4mzbvfrrhb: Trying to unmap invalid handle! pending_idx: c
> Jul 13 00:46:58 node11 [157060.106476] ------------[ cut here ]------------
> Jul 13 00:46:58 node11 [157060.106546] kernel BUG at drivers/net/xen-netback/netback.c:998!
> Jul 13 00:46:58 node11 [157060.106616] invalid opcode: 0000 [#1] SMP
> Jul 13 00:46:58 node11 [...]
> Jul 13 00:46:58 node11 [157060.112705] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G E 3.15.4 #1
> Jul 13 00:46:58 node11 [157060.112776] Hardware name: Supermicro X8DTL/X8DTL, BIOS 1.1b 03/19/2010
> Jul 13 00:46:58 node11 [157060.112848] task: ffffffff81c1b480 ti: ffffffff81c00000 task.ti: ffffffff81c00000
> Jul 13 00:46:58 node11 [157060.112936] RIP: e030:[<ffffffffa025f61d>] [<ffffffffa025f61d>] xenvif_idx_unmap+0x11d/0x130 [xen_netback]
> Jul 13 00:46:58 node11 [157060.113078] RSP: e02b:ffff88008ea03d48 EFLAGS: 00010292
> Jul 13 00:46:58 node11 [157060.113147] RAX: 000000000000004a RBX: 000000000000000c RCX: 0000000000000000
> Jul 13 00:46:58 node11 [157060.113234] RDX: ffff88008a40b600 RSI: ffff88008ea03a18 RDI: 000000000000021b
> Jul 13 00:46:58 node11 [157060.113321] RBP: ffff88008ea03d88 R08: 0000000000000000 R09: ffff88008a40b600
> Jul 13 00:46:58 node11 [157060.113408] R10: ffff88008a0004e8 R11: 00000000000006d8 R12: ffff8800569708c0
> Jul 13 00:46:58 node11 [157060.113495] R13: ffff88006558fec0 R14: ffff8800569708c0 R15: 0000000000000001
> Jul 13 00:46:58 node11 [157060.113589] FS: 00007f351684b700(0000) GS:ffff88008ea00000(0000) knlGS:0000000000000000
> Jul 13 00:46:58 node11 [157060.113679] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> Jul 13 00:46:58 node11 [157060.113747] CR2: 00007fc2a4372000 CR3: 00000000049f3000 CR4: 0000000000002660
> Jul 13 00:46:58 node11 [157060.113835] Stack:
> Jul 13 00:46:58 node11 [157060.113896] ffff880056979f90 ff00000000000001 ffff880b0605e000 0000000000000000
> Jul 13 00:46:58 node11 [157060.114143] ffff0000ffffffff 00000000fffffff6 0000000000000001 ffff8800569769d0
> Jul 13 00:46:58 node11 [157060.114390] ffff88008ea03e58 ffffffffa02622fc ffff88008ea03dd8 ffffffff810b5223
> Jul 13 00:46:58 node11 [157060.114637] Call Trace:
> Jul 13 00:46:58 node11 [157060.114700] <IRQ>
> Jul 13 00:46:58 node11 [157060.114750] [<ffffffffa02622fc>] xenvif_tx_action+0x27c/0x7f0 [xen_netback]
> Jul 13 00:46:58 node11 [157060.114927] [<ffffffff810b5223>] ? __wake_up+0x53/0x70
> Jul 13 00:46:58 node11 [157060.114998] [<ffffffff810ca077>] ? handle_irq_event_percpu+0xa7/0x1b0
> Jul 13 00:46:58 node11 [157060.115073] [<ffffffffa02647d1>] xenvif_poll+0x31/0x64 [xen_netback]
> Jul 13 00:46:58 node11 [157060.115147] [<ffffffff81653d4b>] net_rx_action+0x10b/0x290
> Jul 13 00:46:58 node11 [157060.115221] [<ffffffff81071c73>] __do_softirq+0x103/0x320
> Jul 13 00:46:58 node11 [157060.115292] [<ffffffff81072015>] irq_exit+0x135/0x140
> Jul 13 00:46:58 node11 [157060.115363] [<ffffffff8144759c>] xen_evtchn_do_upcall+0x3c/0x50
> Jul 13 00:46:58 node11 [157060.115436] [<ffffffff8175c07e>] xen_do_hypervisor_callback+0x1e/0x30
> Jul 13 00:46:58 node11 [157060.115506] <EOI>
> Jul 13 00:46:58 node11 [157060.115551] [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
> Jul 13 00:46:58 node11 [157060.115722] [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
> Jul 13 00:46:58 node11 [157060.115794] [<ffffffff8100a200>] ? xen_safe_halt+0x10/0x20
> Jul 13 00:46:58 node11 [157060.115869] [<ffffffff8101dbbf>] ? default_idle+0x1f/0xc0
> Jul 13 00:46:58 node11 [157060.115939] [<ffffffff8101d38f>] ? arch_cpu_idle+0xf/0x20
> Jul 13 00:46:58 node11 [157060.116009] [<ffffffff810b5aa1>] ? cpu_startup_entry+0x201/0x360
> Jul 13 00:46:58 node11 [157060.116084] [<ffffffff817420a7>] ? rest_init+0x77/0x80
> Jul 13 00:46:58 node11 [157060.116156] [<ffffffff81d3a156>] ? start_kernel+0x406/0x413
> Jul 13 00:46:58 node11 [157060.116227] [<ffffffff81d39b6e>] ? repair_env_string+0x5b/0x5b
> Jul 13 00:46:58 node11 [157060.116298] [<ffffffff81d39603>] ? x86_64_start_reservations+0x2a/0x2c
> Jul 13 00:46:58 node11 [157060.116373] [<ffffffff81d3d5dc>] ? xen_start_kernel+0x584/0x586
> [...]
> Jul 13 00:46:58 node11 [157060.119179] RIP [<ffffffffa025f61d>] xenvif_idx_unmap+0x11d/0x130 [xen_netback]
> Jul 13 00:46:58 node11 [157060.119312] RSP <ffff88008ea03d48>
> Jul 13 00:46:58 node11 [157060.119395] ---[ end trace 7e021c96c8cfea53 ]---
> Jul 13 00:46:58 node11 [157060.119465] Kernel panic - not syncing: Fatal exception in interrupt
>
> h14z4mzbvfrrhb was the name of a VIF. This VIF belongs to a Windows Server
> 2008 R2 x64 virtual machine. We have had 6 random reboots so far; all of
> the VIFs belonged to the same operating system, but to different virtual
> machines. So only the virtual interfaces of Windows Server 2008 R2 x64
> systems caused the crashes, and these systems were provisioned from
> different installs or templates. The GPLPV drivers' versions are also
> different.

Unfortunately I don't have Windows Server 2008 R2. :-(

This bug is in the guest TX path. What's the workload of your guest? Is
there any pattern to its traffic?

I've checked the changesets between 3.15.4 and 3.16-rc5; there's no fix
for this, so this is the first report of this issue. If there's a
reliable way to reproduce it, that would be great.

Zoltan, have you seen this before? Can your work on pktgen help?
> [root@c2-node11 ~]# uname -a
> Linux c2-node11 3.15.4 #1 SMP Tue Jul 8 17:58:26 CEST 2014 x86_64 x86_64 x86_64 GNU/Linux
>
> The xm create config file of the specified VM (the other VMs' config
> files are the same):
>
> kernel = "/usr/lib/xen/boot/hvmloader"
> device_model = "/usr/lib64/xen/bin/qemu-dm"
> builder = "hvm"
> memory = "2000"
> name = "vna3mhwnv9pn4m"
> vcpus = "1"
>
> timer_mode = "2"
> viridian = "1"
>
> vif = [ "type=ioemu, mac=00:16:3e:64:c8:ba, bridge=x0evss6g1ztoa4, ip=..., vifname=h14z4mzbvfrrhb, rate=100Mb/s" ]
>
> disk = [ "phy:/dev/q7jiqc2gh02b2b/xz7wget4ycmp77,ioemu:hda,w" ]
> vnc = 1
> vncpasswd="aaaaa1"
> usbdevice="tablet"
>
> The HV's networking looks like the following:
> We are using dual Emulex 10Gbit network adapters with bonding (LACP),
> and on top of the bond we're using VLANs for the VM, management and
> iSCSI traffic.
> We've tried to reproduce the error, but we couldn't; the crash/reboot
> happened at random times every time.

In that case you will need to instrument netback to spit out more
information.

Zoltan, is there any other information that you would like to know?

Wei.

> Thanks for your help,
>
> - Armin Zentai
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxx
> http://lists.xen.org/xen-devel