[Xen-devel] General protection fault in netback

I was experimenting with DomU redundancy and load balancing,
and I think this GPF started to show up after I added a couple of DomUs
running CARP and HAProxy, which constantly generate
a heavy flow of network traffic by pinging target machines
and each other. Or maybe it is not related to CARP
and pinging at all, but just depends on traffic volume: the more VMs
are added and running, the higher the chance that Dom0-DomU networking
will collapse. The critical point is 8 guest domains, while I need 10.

I can't give exact steps to reproduce, as it happens randomly,
usually without any correlated user activity, after several hours
(or several minutes) of normal operation. Sometimes, though,
it happens shortly after a balancer DomU starts up or shuts down.
After the GPF happens, all VMs lose their network connectivity.

Dom0 is openSUSE 12.1 on AMD64 (Linux 3.1.0-1.2-xen)
with Xen version 4.1.2_05-1.9, which is patched as described
in openSUSE bug 727081 (bugzilla.novell.com/show_bug.cgi?id=727081).
The supposedly "offending" DomU is a paravirtualized NetBSD 5.1.1
for AMD64 with a recompiled kernel (CARP enabled, no other changes).
Other VMs are openSUSE 11.4 and 12.1 for AMD64.

The trace log in /var/log/messages always looks similar (varying digits
are replaced with asterisks ***):

general protection fault: 0000 [#1] SMP
CPU {core-number}
Modules linked in: 8250 8250_pnp af_packet asus_wmi ata_generic
blkback_pagemap blkbk blktap bridge btrfs button cdrom dm_mod
domctl drm drm_kms_helper edd eeepc_wmi ehci_hcd evtchn fuse
gntdev hid hwmon i2c_algo_bit i2c_core i2c_i801 i915
iTCO_vendor_support iTCO_wdt linear llc lzo_compress mei(C)
microcode netbk parport parport_pc pata_via pci_hotplug pcspkr
ppdev processor r8169 rfkill serial_core [serio_raw] sg
snd snd_hda_codec snd_hda_codec_hdmi snd_hda_codec_realtek
snd_hda_intel snd_hwdep snd_mixer_oss snd_page_alloc snd_pcm
snd_pcm_oss snd_seq snd_seq_device snd_timer soundcore
sparse_keymap sr_mod stp thermal_sys uas usbbk usbcore
usbhid usb_storage video wmi xenblk xenbus_be xennet zlib_deflate

Pid: {process-id}, comm: netback/{0/1} Tainted: G
         C  3.1.0-1.2-xen #1 System manufacturer System Product Name/P8H67-M
RIP: e030:[<ffffffff803e7451>]  [<ffffffff803e7451>]
RSP: e02b:ffff880******d40  EFLAGS: 00010202
RAX: 0000000000000000 RBX: ffff880********0 RCX: ffff880******000
RDX: {..RCX.+.0e80..} RSI: 00000000000000** RDI: 00***c**00000000
RBP: {.....RBX......} R08: {..RCX.-.cff0..} R09: 0000000*********
R10: 000000000000000* R11: {.task.+.0470..} R12: ffff880026a51000
R13: ffff880********0 R14: ffffc900048****0 R15: 0000000000000001
FS:  00007f*******7*0(0000) GS:ffff880******000(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000***********0 CR3: 0000000******000 CR4: 0000000000042660
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process netback/{0/1} (pid: {process-id}, threadinfo ffff880******000,
task ffff880********0)
 0000000000000000 {.....RBX......} 0000000000000000 ffffffff803e7511
 {.....RBX......} ffffffffa0***d2c {.....task.....} {thread.+.1e00.}
 {thread.+.1db0.} {.R14.-.22a40..} ffffc9000000000* 0000000000000000
Call Trace:
 [<ffffffff803e7511>] __kfree_skb+0x11/0x20
 [<ffffffffa0***d2c>] net_rx_action+0x66c/0x9c0 [netbk]
 [<ffffffffa0***72a>] netbk_action_thread+0x5a/0x270 [netbk]
 [<ffffffff8006438e>] kthread+0x7e/0x90
 [<ffffffff8050f814>] kernel_thread_helper+0x4/0x10
Code: 48 8b 7c 02 08 e8 90 69 cf ff 8b 95 d0 00 00 00
  48 8b 8d d8 00 00 00 48 01 ca 0f b7 02 39 c3 7c
  d1 f6 42 0c 10 74 1e 48 8b 7a 30
RIP  [<ffffffff803e7451>] skb_release_data.part.47+0x61/0xc0
 RSP <ffff880******d40>
---[ end trace **************** ]---
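
For anyone who wants to compare several such traces the same way, the
masking used above (keep what is constant, replace what varies with an
asterisk) can be sketched roughly as follows. This is only an illustration:
the function name mask_varying and the two sample RSP lines are made up
for the example and are not taken from the actual logs.

```python
def mask_varying(lines):
    """Merge equal-length strings, replacing positions that differ with '*'."""
    if not lines:
        return ""
    length = len(lines[0])
    if any(len(s) != length for s in lines):
        raise ValueError("all samples must have the same length")
    # zip(*lines) yields one tuple per character position across all samples
    return "".join(
        chars[0] if len(set(chars)) == 1 else "*"
        for chars in zip(*lines)
    )


# Hypothetical sample lines (invented values, same layout as the trace above)
samples = [
    "RSP: e02b:ffff88011a2b3d40  EFLAGS: 00010202",
    "RSP: e02b:ffff88003c4d5d40  EFLAGS: 00010202",
]
print(mask_varying(samples))
```

It only handles lines of identical length, which is the case for the
fixed-width register dump but not necessarily for the rest of the log.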

Preceding and subsequent messages don't seem to be related to the GPF;
the time gap ranges from minutes to half an hour or even more. But if they
could give some insight, I will post them, too.
