Re: [Xen-devel] General protection fault in netback
On Fri, Feb 03, 2012 at 07:32:40PM +0300, Anton Samsonov wrote:
> I was experimenting with DomU redundancy and load balancing,
> and I think this GPF started to show up after a couple of DomUs
> with CARP and HAProxy were added that constantly generate
> a strong flow of network traffic by pinging target machines
> and each other as well. Or maybe it is not related to CARP
> and pinging, but just depends on traffic volume: the more VMs
> added and running, the more chances that Dom0-DomU networking
> will collapse, the critical point being 8 guest domains, while I need 10.
>
> I can't give exact steps to reproduce, as it happens randomly,
> usually without any correlated user activity, after several hours
> (or several minutes) of normal performance. But sometimes
> it happens not so long after a balancer's DomU startup or shutdown.
> After GPF happens, all VMs lose their networking connectivity.
>
> Dom0 is openSUSE 12.1 on AMD64 (Linux 3.1.0-1.2-xen)

Do you get the same issue with a pv-ops dom0? So also 3.1, but from kernel.org?

> with Xen version 4.1.2_05-1.9, which is patched as described
> in openSUSE bug 727081 (bugzilla.novell.com/show_bug.cgi?id=727081).
> Supposedly "offending" DomU is paravirtualized NetBSD 5.1.1
> for AMD64 with recompiled kernel (CARP enabled, no more changes).

What is CARP?

> Other VMs are openSUSE 11.4 and 12.1 for AMD64.
>
>
> Trace log in /var/log/messages always looks similar (varying digits
> replaced with asterisks ***):
>
>
> general protection fault: 0000 [#1] SMP
> CPU {core-number}
> Modules linked in: 8250 8250_pnp af_packet asus_wmi ata_generic
> blkback_pagemap blkbk blktap bridge btrfs button cdrom dm_mod
> domctl drm drm_kms_helper edd eeepc_wmi ehci_hcd evtchn fuse
> gntdev hid hwmon i2c_algo_bit i2c_core i2c_i801 i915
> iTCO_vendor_support iTCO_wdt linear llc lzo_compress mei(C)
> microcode netbk parport parport_pc pata_via pci_hotplug pcspkr
> ppdev processor r8169 rfkill serial_core [serio_raw] sg
> snd snd_hda_codec snd_hda_codec_hdmi snd_hda_codec_realtek
> snd_hda_intel snd_hwdep snd_mixer_oss snd_page_alloc snd_pcm
> snd_pcm_oss snd_seq snd_seq_device snd_timer soundcore
> sparse_keymap sr_mod stp thermal_sys uas usbbk usbcore
> usbhid usb_storage video wmi xenblk xenbus_be xennet zlib_deflate
>
> Pid: {process-id}, comm: netback/{0/1} Tainted: G
> C 3.1.0-1.2-xen #1 System manufacturer System Product Name/P8H67-M
> RIP: e030:[<ffffffff803e7451>] [<ffffffff803e7451>]
> skb_release_data.part.47+0x61/0xc0
> RSP: e02b:ffff880******d40 EFLAGS: 00010202
> RAX: 0000000000000000 RBX: ffff880********0 RCX: ffff880******000
> RDX: {..RCX.+.0e80..} RSI: 00000000000000** RDI: 00***c**00000000
> RBP: {.....RBX......} R08: {..RCX.-.cff0..} R09: 0000000*********
> R10: 000000000000000* R11: {.task.+.0470..} R12: ffff880026a51000
> R13: ffff880********0 R14: ffffc900048****0 R15: 0000000000000001
> FS: 00007f*******7*0(0000) GS:ffff880******000(0000) knlGS:0000000000000000
> CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 0000***********0 CR3: 0000000******000 CR4: 0000000000042660
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process netback/{0/1} (pid: {process-id}, threadinfo ffff880******000,
> task ffff880********0)
> Stack:
> 0000000000000000 {.....RBX......} 0000000000000000 ffffffff803e7511
> {.....RBX......} ffffffffa0***d2c {.....task.....} {thread.+.1e00.}
> {thread.+.1db0.} {.R14.-.22a40..} ffffc9000000000* 0000000000000000

Hm, that is a pretty neat stack output. Wonder which patch of theirs does that.

> Call Trace:
> [<ffffffff803e7511>] __kfree_skb+0x11/0x20
> [<ffffffffa0***d2c>] net_rx_action+0x66c/0x9c0 [netbk]
> [<ffffffffa0***72a>] netbk_action_thread+0x5a/0x270 [netbk]
> [<ffffffff8006438e>] kthread+0x7e/0x90
> [<ffffffff8050f814>] kernel_thread_helper+0x4/0x10
> Code: 48 8b 7c 02 08 e8 90 69 cf ff 8b 95 d0 00 00 00
> 48 8b 8d d8 00 00 00 48 01 ca 0f b7 02 39 c3 7c
> d1 f6 42 0c 10 74 1e 48 8b 7a 30
> RIP [<ffffffff803e7451>] skb_release_data.part.47+0x61/0xc0
> RSP <ffff880******d40>
> ---[ end trace **************** ]---
>
>
> Preceding and subsequent messages don't seem to be related to the GPF;
> the time gap is from minutes to half an hour or even more. But if this could give
> some insight, I will post them, too.
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel