[Xen-users] Re: [Xen-devel] kernel oops/IRQ exception when networking between many domUs
On Monday, 06.06.2005, at 09:23 +0100, Keir Fraser wrote:
> On 5 Jun 2005, at 17:57, Birger Toedtmann wrote:
>
> > Apparently it is happening somewhere here:
> >
> > [...]
> > 0xc028cbe5 <net_rx_action+1135>: test %eax,%eax
> > 0xc028cbe7 <net_rx_action+1137>: je 0xc028ca82 <net_rx_action+780>
> > 0xc028cbed <net_rx_action+1143>: mov %esi,%eax
> > 0xc028cbef <net_rx_action+1145>: shr $0xc,%eax
> > 0xc028cbf2 <net_rx_action+1148>: mov %eax,(%esp)
> > 0xc028cbf5 <net_rx_action+1151>: call 0xc028c4c4 <free_mfn>
> > 0xc028cbfa <net_rx_action+1156>: mov $0xffffffff,%ecx
> > ^^^^^^^^^^
>
> Most likely the driver has tried to send a bogus page to a domU.
> Because it's bogus the transfer fails. The driver then tries to free
> the page back to Xen, but that also fails because the page is bogus.
> This confuses the driver, which then BUG()s out.

I commented out the free_mfn() and status= lines: the kernel now
reports the following after it configured the 10th domU and ~80th vif,
with approx. 20-25 bridges up. Just an idea: the number of vifs +
bridges is somewhere around the magic 128 (NR_IRQS problem in 2.0.x!)
when the crash happens - could this hint at something?

[...]
Jun 6 10:12:14 lomin kernel: 10.2.23.8: port 2(vif10.3) entering forwarding state
Jun 6 10:12:14 lomin kernel: 10.2.35.16: topology change detected, propagating
Jun 6 10:12:14 lomin kernel: 10.2.35.16: port 2(vif10.4) entering forwarding state
Jun 6 10:12:14 lomin kernel: 10.2.35.20: topology change detected, propagating
Jun 6 10:12:14 lomin kernel: 10.2.35.20: port 2(vif10.5) entering forwarding state
Jun 6 10:12:20 lomin kernel: c014cea4
Jun 6 10:12:20 lomin kernel: [do_page_fault+643/1665] do_page_fault +0x469/0x738
Jun 6 10:12:20 lomin kernel: [<c0115720>] do_page_fault+0x469/0x738
Jun 6 10:12:20 lomin kernel: [fixup_4gb_segment+2/12] page_fault +0x2e/0x34
Jun 6 10:12:20 lomin kernel: [<c0109a7e>] page_fault+0x2e/0x34
Jun 6 10:12:20 lomin kernel: [do_page_fault+49/1665] do_page_fault +0x217/0x738
Jun 6 10:12:20 lomin kernel: [<c01154ce>] do_page_fault+0x217/0x738
Jun 6 10:12:20 lomin kernel: [fixup_4gb_segment+2/12] page_fault +0x2e/0x34
Jun 6 10:12:20 lomin kernel: [<c0109a7e>] page_fault+0x2e/0x34
Jun 6 10:12:20 lomin kernel: PREEMPT
Jun 6 10:12:20 lomin kernel: Modules linked in: dm_snapshot pcmcia bridge ipt_REJECT ipt_state iptable_filter ipt_MASQUERADE iptable_nat ip_conntrack ip_tables autofs4 snd_seq snd_seq_device evdev usbhid rfcomm l2cap bluetooth dm_mod cryptoloop snd_pcm_oss snd_mixer_oss snd_intel8x0 snd_ac97_codec snd_pcm snd_timer snd soundcore snd_page_alloc tun uhci_hcd usb_storage usbcore irtty_sir sir_dev ircomm_tty ircomm irda yenta_socket rsrc_nonstatic pcmcia_core 3c59x
Jun 6 10:12:20 lomin kernel: CPU: 0
Jun 6 10:12:20 lomin kernel: EIP: 0061:[do_wp_page+622/1175] Not tainted VLI
Jun 6 10:12:20 lomin kernel: EIP: 0061:[<c014cea4>] Not tainted VLI
Jun 6 10:12:20 lomin kernel: EFLAGS: 00010206 (2.6.11.11-xen0)
Jun 6 10:12:20 lomin kernel: EIP is at handle_mm_fault+0x5d/0x222
Jun 6 10:12:20 lomin kernel: eax: 15555b18 ebx: d8788000 ecx: 00000b18 edx: 15555b18
Jun 6 10:12:20 lomin kernel: esi: dcfc3b4c edi: dcaf5580 ebp: d8789ee4 esp: d8789ebc
Jun 6 10:12:20 lomin kernel: ds: 0069 es: 0069 ss: 0069
Jun 6 10:12:20 lomin kernel: Process python (pid: 4670, threadinfo=d8788000 task=de1a1520)
Jun 6 10:12:20 lomin kernel: Stack: 00000040 00000001 d40e687c d40e6874 00000006 d40e685c d8789f14 dcaf5580
Jun 6 10:12:20 lomin kernel: dcaf55ac d40e6b1c d8789fbc c01154ce dcaf5580 d40e6b1c b4ec6ff0 00000001
Jun 6 10:12:20 lomin kernel: 00000001 de1a1520 b4ec6ff0 00000006 d8789fc4 d8789fc4 c03405b0 00000006
Jun 6 10:12:20 lomin kernel: Call Trace:
Jun 6 10:12:20 lomin kernel: [dump_stack+16/32] show_stack+0x80/0x96
Jun 6 10:12:20 lomin kernel: [<c0109c51>] show_stack+0x80/0x96
Jun 6 10:12:20 lomin kernel: [show_registers+384/457] show_registers +0x15a/0x1d1
Jun 6 10:12:20 lomin kernel: [<c0109de1>] show_registers+0x15a/0x1d1
Jun 6 10:12:20 lomin kernel: [die+301/458] die+0x106/0x1c4
Jun 6 10:12:20 lomin kernel: [<c010a001>] die+0x106/0x1c4
Jun 6 10:12:20 lomin kernel: [do_page_fault+675/1665] do_page_fault +0x489/0x738
Jun 6 10:12:20 lomin kernel: [<c0115740>] do_page_fault+0x489/0x738
Jun 6 10:12:20 lomin kernel: [fixup_4gb_segment+2/12] page_fault +0x2e/0x34
Jun 6 10:12:20 lomin kernel: [<c0109a7e>] page_fault+0x2e/0x34
Jun 6 10:12:20 lomin kernel: [do_page_fault+49/1665] do_page_fault +0x217/0x738
Jun 6 10:12:20 lomin kernel: [<c01154ce>] do_page_fault+0x217/0x738
Jun 6 10:12:20 lomin kernel: [fixup_4gb_segment+2/12] page_fault +0x2e/0x34
Jun 6 10:12:20 lomin kernel: [<c0109a7e>] page_fault+0x2e/0x34
Jun 6 10:12:20 lomin kernel: Code: 8b 47 1c c1 ea 16 83 43 14 01 8d 34 90 85 f6 0f 84 52 01 00 00 89 f2 8b 4d 10 89 f8 e8 4a d1 ff ff 85 c0 89 c2 0f 84 3c 01 00 00 <8b> 00 a8 81 75 3d 85 c0 0f 84 01 01 00 00 a8 40 0f 84 a4 00 00

>
> It's not at all clear where the bogus address comes from: the driver
> basically just reads the address out of an skbuff, and converts it from
> virtual to physical address. But something is obviously going wrong,
> perhaps under memory pressure. :-(

Where, within the domUs or dom0?
The latter has plenty of memory at hand; the domUs are quite strapped
for memory. I'll try to find out...

Regards,
-- 
Birger Tödtmann
Technik der Rechnernetze,
Institut für Experimentelle Mathematik
Universität Duisburg-Essen, Campus Essen
email:btoedtmann@xxxxxxxxxxxxxx skype:birger.toedtmann pgp:0x6FB166C9

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users
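
One rough way to check the vifs + bridges idea raised in this message is to tally the distinct vifs and bridges that appear in the bridge "entering forwarding state" log lines and compare the sum with 128. The following is only a minimal sketch under stated assumptions: the regular expression, the function name `count_vifs_and_bridges` and the constant `ASSUMED_LIMIT` are illustrative and not taken from Xen or the kernel, 128 is simply the figure mentioned in the message, and the tally covers only ports that actually logged a forwarding-state transition.

```python
import re

# Matches bridge messages of the form seen in the log above, e.g.
# "... kernel: 10.2.23.8: port 2(vif10.3) entering forwarding state"
PORT_RE = re.compile(
    r"kernel: (\S+): port \d+\((vif\d+\.\d+)\) entering forwarding state"
)

# Assumption: the "magic 128" from the message, not a verified kernel constant.
ASSUMED_LIMIT = 128


def count_vifs_and_bridges(log_lines):
    """Return (distinct vifs, distinct bridges) seen in forwarding-state lines."""
    vifs, bridges = set(), set()
    for line in log_lines:
        match = PORT_RE.search(line)
        if match:
            bridges.add(match.group(1))
            vifs.add(match.group(2))
    return len(vifs), len(bridges)


sample = [
    "Jun 6 10:12:14 lomin kernel: 10.2.23.8: port 2(vif10.3) entering forwarding state",
    "Jun 6 10:12:14 lomin kernel: 10.2.35.16: topology change detected, propagating",
    "Jun 6 10:12:14 lomin kernel: 10.2.35.16: port 2(vif10.4) entering forwarding state",
    "Jun 6 10:12:14 lomin kernel: 10.2.35.20: port 2(vif10.5) entering forwarding state",
]

n_vifs, n_bridges = count_vifs_and_bridges(sample)
print(n_vifs, n_bridges, n_vifs + n_bridges >= ASSUMED_LIMIT)
```

Feeding the full syslog through the same function, instead of the four sample lines, would show how close the total was to 128 at the time of the crash.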