Re: [Xen-devel] kernel BUG at drivers/net/xen-netfront.c:473!
Hi,

The BUG_ON condition looks like this:

    struct page *page = skb_frag_page(frag);
    len = skb_frag_size(frag);
    offset = frag->page_offset;
    /* Data must not cross a page boundary. */
    BUG_ON(len + offset > PAGE_SIZE<<compound_order(page));

That seems to be a pretty seriously screwed-up skb frag. The stack trace suggests a packet arrived on a vif, then the TCP stack either turned it back or generated a response to it. Could you reproduce the problem with some debug printouts? The BUG_ON line should be replaced with this:

    if (len + offset > PAGE_SIZE<<compound_order(page)) {
        netdev_err(dev, "len %d offset %d order %d PageHead %d i %d nr_frags %d\n",
                   len, offset, compound_order(page), PageHead(page), i,
                   skb_shinfo(skb)->nr_frags);
        BUG();
    }

This can provide some insight into what exactly is wrong with this packet.

Regards,

Zoltan

On 24/10/14 18:12, Christopher S. Aker wrote:

Xen: 4.4.1-pre++ (xenbits @ 28414:b2a1758e87a8) + xsa100.patch
Dom0: 3.10.40-2 + futex patchset
DomU: 3.15.4-x86_64 (straight up kernel.org)

Guest kernel binary and other stuff is available here:
<http://vin.fo/~caker/xen/bugs/xen-netfront.c:473/>

The host's networking consists of 4x 10G links, bonded, in a bridge, and then a single vif per guest on the bridge.

We have a user who is able to reliably (although painfully) reproduce the following guest kernel crash. The guest is using HAProxy as a load balancer for a handful of backends, so the network was being used heavily(?).

kernel BUG at drivers/net/xen-netfront.c:473!
invalid opcode: 0000 [#1] SMP
Modules linked in:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.15.4-x86_64-linode45 #1
task: ffffffff81c18450 ti: ffffffff81c00000 task.ti: ffffffff81c00000
RIP: e030:[<ffffffff81568e41>]  [<ffffffff81568e41>] xennet_make_frags+0x247/0x40b
RSP: e02b:ffff88007fa037a8  EFLAGS: 00010002
RAX: ffffea0001dfcb40 RBX: ffff880079ee0740 RCX: 0000000000000000
RDX: ffff880079ed1a9c RSI: 0000000000001040 RDI: 0000000000001000
RBP: ffff880079bee6e8 R08: 00000000000005a8 R09: 00000000000000a6
R10: ffffffff81742dc9 R11: ffff88007978a000 R12: 0000000000000f82
R13: 00000000000000be R14: 0000000000000027 R15: ffffea0001df2300
FS:  0000000000000000(0000) GS:ffff88007fa00000(0000) knlGS:ffff8800ff300000
CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000001240000 CR3: 00000000775c3000 CR4: 0000000000042660
Stack:
 00000000000005a8 00000000000000dd ffff880079ed1000 00005ade816e5a45
 0000000000000020 0000000277db1000 ffff880079bee7cc 0000001d815de4b4
 ffff880079ee1030 0000000400054803 ffff880079bee6e8 ffff880079ee0740
Call Trace:
 <IRQ>
 [<ffffffff8156a577>] ? xennet_start_xmit+0x3a9/0x4a7
 [<ffffffff815ebc85>] ? dev_hard_start_xmit+0x319/0x410
 [<ffffffff816050d6>] ? sch_direct_xmit+0x6a/0x191
 [<ffffffff815ebf9e>] ? __dev_queue_xmit+0x222/0x444
 [<ffffffff8169afe8>] ? ip_options_echo+0x2f0/0x2f0
 [<ffffffff8169dd0d>] ? ip_finish_output_gso+0x329/0x40a
 [<ffffffff8169ddee>] ? ip_finish_output_gso+0x40a/0x40a
 [<ffffffff8169de41>] ? ip_finish_output+0x53/0x3c4
 [<ffffffff8169d51e>] ? ip_queue_xmit+0x2be/0x2e9
 [<ffffffff816af12c>] ? tcp_transmit_skb+0x74e/0x791
 [<ffffffff816acb33>] ? tcp_clean_rtx_queue+0x5c1/0x6b2
 [<ffffffff816b1c6e>] ? tcp_write_xmit+0x3eb/0x542
 [<ffffffff816b1e1a>] ? __tcp_push_pending_frames+0x24/0x7f
 [<ffffffff816adc88>] ? tcp_rcv_established+0x115/0x5a1
 [<ffffffff816df148>] ? ipv4_confirm+0xbf/0xc9
 [<ffffffff816b4715>] ? tcp_v4_do_rcv+0xa3/0x1f5
 [<ffffffff816b4c2b>] ? tcp_v4_rcv+0x3c4/0x715
 [<ffffffff816341d1>] ? nf_hook_slow+0x72/0x107
 [<ffffffff816988c4>] ? ip_rcv+0x317/0x317
 [<ffffffff816989d6>] ? ip_local_deliver_finish+0x112/0x1cd
 [<ffffffff815e72a5>] ? __netif_receive_skb_core+0x4e8/0x520
 [<ffffffff815e7564>] ? netif_receive_skb_internal+0x71/0x77
 [<ffffffff815eb44d>] ? napi_gro_receive+0xa7/0xe5
 [<ffffffff8156aec2>] ? handle_incoming_queue+0xe1/0x138
 [<ffffffff8156b41b>] ? xennet_poll+0x502/0x5cc
 [<ffffffff815e6252>] ? __napi_schedule+0x4c/0x4e
 [<ffffffff815e7773>] ? net_rx_action+0xa7/0x1f6
 [<ffffffff810a68cf>] ? __do_softirq+0xd1/0x1db
 [<ffffffff810a6a5e>] ? irq_exit+0x40/0x87
 [<ffffffff814e49c9>] ? xen_evtchn_do_upcall+0x2f/0x3a
 [<ffffffff817b96fe>] ? xen_do_hypervisor_callback+0x1e/0x30
 <EOI>
 [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
 [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
 [<ffffffff81007124>] ? xen_safe_halt+0xc/0x15
 [<ffffffff8101287a>] ? default_idle+0x5/0x8
 [<ffffffff810d34bc>] ? cpuidle_idle_call+0x3a/0x7f
 [<ffffffff810d3585>] ? cpu_idle_loop+0x84/0xab
 [<ffffffff81caff44>] ? start_kernel+0x308/0x30e
 [<ffffffff81cafa76>] ? repair_env_string+0x58/0x58
 [<ffffffff810071f1>] ? xen_setup_runstate_info+0x27/0x34
 [<ffffffff81cb2dc5>] ? xen_start_kernel+0x400/0x405
Code: 01 44 8b 69 0c 44 8b 61 08 48 8b 30 31 c9 f7 c6 00 40 00 00 74 03 8b 48 68 43 8d 74 25 00 bf 00 10 00 00 48 d3 e7 48 39 fe 76 04 <0f> 0b eb fe 45 89 e7 41 81 e4 ff 0f 00 00 41 c1 ef 0c 45 89 ff
RIP  [<ffffffff81568e41>] xennet_make_frags+0x247/0x40b
 RSP <ffff88007fa037a8>
---[ end trace e681a3f19fa83070 ]---
Kernel panic - not syncing: Fatal exception in interrupt

Thanks,
-Chris

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel