Re: [Xen-devel] General protection fault in netback

2012/2/10 Konrad Rzeszutek Wilk <konrad@xxxxxxxxxx>:

AS>> Dom0 is openSUSE 12.1 on AMD64 (Linux 3.1.0-1.2-xen)
KRW> Do you get the same issue with a pv-ops dom0? So also 3.1, but
KRW> from kernel.org?

Unfortunately, I'm not skilled at compiling kernels myself. I tried building
the newest one, 3.2.6, with every Xen option enabled that I could find by
searching for the "Xen" keyword, but the resulting system had no netback.ko
module at all, barely booted, and xm was not able to communicate with the
hypervisor.
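
For anyone retrying such a build, a check along the following lines could show
which Xen options actually ended up in the generated .config. This is only a
sketch: the option names in WANTED are assumptions about what a mainline 3.x
dom0 needs (in mainline the network backend is, as far as I know, selected by
CONFIG_XEN_NETDEV_BACKEND and built as xen-netback rather than netback), and
the function names are made up for illustration.

```python
import re

# Assumed option names for a mainline 3.x dom0; verify them against the
# Kconfig files of the tree actually being built.
WANTED = [
    "CONFIG_XEN",
    "CONFIG_XEN_DOM0",
    "CONFIG_XEN_BACKEND",
    "CONFIG_XEN_NETDEV_BACKEND",
    "CONFIG_XEN_BLKDEV_BACKEND",
    "CONFIG_XENFS",
]

_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Map option name -> value ('y', 'm', 'n' or a literal) from .config text."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        m = _SET.match(line)
        if m:
            options[m.group(1)] = m.group(2)
            continue
        m = _UNSET.match(line)
        if m:
            options[m.group(1)] = "n"
    return options


def report(text, wanted=WANTED):
    """Return {option: state}; options absent from the file are 'missing'."""
    options = parse_config(text)
    return {name: options.get(name, "missing") for name in wanted}
```

Calling report(open(".config").read()) from the top of the build tree returns
one entry per option; anything reported as "n" or "missing" is a candidate
explanation for a backend module that was never built.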

As for the vanilla kernel package provided by openSUSE, it is not Xen-enabled.

Meanwhile, an update was released, so I have been testing 3.1.9-1.4-xen for
about a week, but the faults still occur.

AS>> Supposedly "offending" DomU is paravirtualized NetBSD 5.1.1
AS>> for AMD64 with recompiled kernel (CARP enabled, no more changes).
KRW> What is CARP?

CARP is the Common Address Redundancy Protocol, a "non-patented version of
VRRP" used for high availability and load balancing. It is supported in the
NetBSD kernel (although a user-space implementation, uCARP, exists as well),
but it is not compiled in by default. All I did to enable it was a quite
simple recompilation with the following config:

include         "arch/amd64/conf/XEN3_DOMU"
pseudo-device   carp

It looks like the GPFs happen only when those load-balancing DomUs are running.
At least, when they are shut off, no fault is observed for a whole day, but
when they run, a fault can happen after only a few minutes of Dom0 uptime,
especially while DomUs are stopping or starting.

KRW> Hm, that is a pretty neat stack output. Wonder which patch of
KRW> theirs does that.

It was not a verbatim dump, but a condensed summary. If you are interested,
here is an excerpt from /var/log/messages for the penultimate GPF (with date
and hostname removed):

===[ Preceding entries ]===

(These may be entirely unrelated to the GPF, but all 3 faults since the kernel
update happened during VM shutdown, whether of many VMs or of a single one.)

21:18:33 avahi-daemon[3086]: Withdrawing workstation service for vif10.0.
21:18:33 kernel: [29722.267359] br1: port 10(vif10.0) entering forwarding state
21:18:33 kernel: [29722.267443] br1: port 10(vif10.0) entering disabled state
21:18:33 logger: /etc/xen/scripts/vif-bridge: offline type_if=vif
21:18:33 logger: /etc/xen/scripts/vif-bridge: brctl delif br1 vif10.0 failed
21:18:33 logger: /etc/xen/scripts/vif-bridge: ifconfig vif10.0 down failed
21:18:33 logger: /etc/xen/scripts/vif-bridge: Successful vif-bridge offline for vif10.0, bridge br1.
21:18:33 logger: /etc/xen/scripts/xen-hotplug-cleanup:
21:18:33 logger: /etc/xen/scripts/xen-hotplug-cleanup:
21:18:33 logger: /etc/xen/scripts/xen-hotplug-cleanup:
21:18:33 logger: /etc/xen/scripts/block: remove XENBUS_PATH=backend/vbd/10/51712
21:18:33 logger: /etc/xen/scripts/xen-hotplug-cleanup:
21:18:33 logger: /etc/xen/scripts/xen-hotplug-cleanup:
21:18:53 avahi-daemon[3086]: Withdrawing workstation service for vif9.0.
21:18:53 kernel: [29742.222676] br1: port 9(vif9.0) entering forwarding state
21:18:53 kernel: [29742.222779] br1: port 9(vif9.0) entering disabled state
21:18:53 logger: /etc/xen/scripts/vif-bridge: offline type_if=vif
21:18:53 logger: /etc/xen/scripts/vif-bridge: brctl delif br1 vif9.0 failed
21:18:53 logger: /etc/xen/scripts/vif-bridge: ifconfig vif9.0 down failed
21:18:53 logger: /etc/xen/scripts/vif-bridge: Successful vif-bridge offline for vif9.0, bridge br1.
21:18:53 logger: /etc/xen/scripts/xen-hotplug-cleanup:
21:18:53 logger: /etc/xen/scripts/xen-hotplug-cleanup:
21:18:53 logger: /etc/xen/scripts/xen-hotplug-cleanup:
21:18:53 logger: /etc/xen/scripts/xen-hotplug-cleanup:
21:18:53 logger: /etc/xen/scripts/block: remove XENBUS_PATH=backend/vbd/9/51712
21:18:53 logger: /etc/xen/scripts/xen-hotplug-cleanup:
21:19:13 avahi-daemon[3086]: Withdrawing workstation service for vif8.0.
21:19:13 kernel: [29762.605500] br1: port 8(vif8.0) entering forwarding state
21:19:13 kernel: [29762.605572] br1: port 8(vif8.0) entering disabled state
21:19:13 logger: /etc/xen/scripts/vif-bridge: offline type_if=vif
21:19:13 logger: /etc/xen/scripts/vif-bridge: brctl delif br1 vif8.0 failed
21:19:13 logger: /etc/xen/scripts/vif-bridge: ifconfig vif8.0 down failed
21:19:13 logger: /etc/xen/scripts/vif-bridge: Successful vif-bridge offline for vif8.0, bridge br1.
21:19:13 logger: /etc/xen/scripts/xen-hotplug-cleanup:
21:19:13 logger: /etc/xen/scripts/xen-hotplug-cleanup:
21:19:13 logger: /etc/xen/scripts/xen-hotplug-cleanup:
21:19:13 logger: /etc/xen/scripts/xen-hotplug-cleanup:
21:19:13 logger: /etc/xen/scripts/block: remove XENBUS_PATH=backend/vbd/8/51712
21:19:13 logger: /etc/xen/scripts/xen-hotplug-cleanup:
21:19:26 avahi-daemon[3086]: Withdrawing workstation service for vif7.0.
21:19:26 kernel: [29775.558990] br1: port 7(vif7.0) entering forwarding state
21:19:26 kernel: [29775.559105] br1: port 7(vif7.0) entering disabled state
21:19:26 logger: /etc/xen/scripts/vif-bridge: offline type_if=vif
21:19:26 logger: /etc/xen/scripts/vif-bridge: brctl delif br1 vif7.0 failed
21:19:26 logger: /etc/xen/scripts/vif-bridge: ifconfig vif7.0 down failed
21:19:26 logger: /etc/xen/scripts/vif-bridge: Successful vif-bridge offline for vif7.0, bridge br1.
21:19:26 logger: /etc/xen/scripts/xen-hotplug-cleanup:
21:19:26 logger: /etc/xen/scripts/xen-hotplug-cleanup:
21:19:26 logger: /etc/xen/scripts/xen-hotplug-cleanup:
21:19:26 logger: /etc/xen/scripts/block: remove XENBUS_PATH=backend/vbd/7/51712
21:19:26 logger: /etc/xen/scripts/xen-hotplug-cleanup:
21:19:26 logger: /etc/xen/scripts/xen-hotplug-cleanup:

===[ Fault alert itself ]===

21:19:37 kernel: [29786.610984] general protection fault: 0000 [#1] SMP
21:19:37 kernel: [29786.610992] CPU 0
21:19:37 kernel: [29786.610993] Modules linked in: fuse ip6t_LOG
xt_tcpudp xt_pkttype xt_physdev ipt_LOG xt_limit nfsd lockd nfs_acl
auth_rpcgss sunrpc usbbk netbk blkbk blkback_pagemap blktap domctl
xenbus_be gntdev evtchn af_packet bridge stp llc edd ip6t_REJECT
nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_raw xt_NOTRACK ipt_REJECT
iptable_raw iptable_filter ip6table_mangle nf_conntrack_netbios_ns
nf_conntrack_broadcast nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables
xt_conntrack nf_conntrack ip6table_filter ip6_tables x_tables
microcode snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device eeepc_wmi
asus_wmi sparse_keymap rfkill usb_storage ppdev pci_hotplug uas
8250_pnp sg i2c_i801 sr_mod wmi parport_pc snd_hda_codec_hdmi
snd_hda_codec_realtek parport r8169 pcspkr mei(C) 8250 serial_core
iTCO_wdt iTCO_vendor_support snd_hda_intel snd_hda_codec snd_hwdep
snd_pcm snd_timer snd soundcore snd_page_alloc usbhid hid dm_mod
linear btrfs zlib_deflate lzo_compress i915 drm_kms_helper drm
i2c_algo_bit ehci_hcd usbcor
21:19:37 kernel: e i2c_core button video processor thermal_sys hwmon
xenblk cdrom xennet ata_generic pata_via
21:19:37 kernel: [29786.611076]
21:19:37 kernel: [29786.611078] Pid: 3461, comm: netback/1 Tainted: G
       C  3.1.9-1.4-xen #1 System manufacturer System Product
21:19:37 kernel: [29786.611084] RIP: e030:[<ffffffff803e7f81>] [<ffffffff803e7f81>] skb_release_data.part.46+0x61/0xc0
21:19:37 kernel: [29786.611092] RSP: e02b:ffff8802c8339d40  EFLAGS: 00010202
21:19:37 kernel: [29786.611095] RAX: 0000000000000000 RBX: ffff88007ccf39c0 RCX: ffff8800e70db000
21:19:37 kernel: [29786.611098] RDX: ffff8800e70dbe80 RSI: 000000000000001f RDI: 0028f49c00000000
21:19:37 kernel: [29786.611101] RBP: ffff88007ccf39c0 R08: ffff8800e70d0010 R09: 000000000000004e
21:19:37 kernel: [29786.611103] R10: 0000000000000003 R11: ffff8802d0074c30 R12: ffff8802bb22f000
21:19:37 kernel: [29786.611106] R13: ffff88027ea382c0 R14: ffffc900048cb960 R15: 0000000000000001
21:19:37 kernel: [29786.611114] FS:  00007f303913f700(0000)
GS:ffff8802de3c2000(0000) knlGS:0000000000000000
21:19:37 kernel: [29786.611117] CS:  e033 DS: 0000 ES: 0000 CR0:
21:19:37 kernel: [29786.611119] CR2: 00000000006b6e30 CR3:
00000002c93fb000 CR4: 0000000000042660
21:19:37 kernel: [29786.611126] DR0: 0000000000000000 DR1:
0000000000000000 DR2: 0000000000000000
21:19:37 kernel: [29786.611131] DR3: 0000000000000000 DR6:
00000000ffff0ff0 DR7: 0000000000000400
21:19:37 kernel: [29786.611136] Process netback/1 (pid: 3461,
threadinfo ffff8802c8338000, task ffff8802d00747c0)
21:19:37 kernel: [29786.611140] Stack:
21:19:37 kernel: [29786.611144]  0000000000000000 ffff88007ccf39c0
0000000000000000 ffffffff803e8041
21:19:37 kernel: [29786.611151]  ffff88007ccf39c0 ffffffffa059fd3c
ffff8802d00747c0 ffff8802c8339e00
21:19:37 kernel: [29786.611157]  ffff8802c8339db0 ffffc900048a8f20
ffffc90000000002 0000000000000000
21:19:37 kernel: [29786.611164] Call Trace:
21:19:37 kernel: [29786.611173]  [<ffffffff803e8041>] __kfree_skb+0x11/0x20
21:19:37 kernel: [29786.611182]  [<ffffffffa059fd3c>] net_rx_action+0x66c/0x9c0 [netbk]
21:19:37 kernel: [29786.611201]  [<ffffffffa05a173a>] netbk_action_thread+0x5a/0x270 [netbk]
21:19:37 kernel: [29786.611211]  [<ffffffff8006444e>] kthread+0x7e/0x90
21:19:37 kernel: [29786.611220]  [<ffffffff80510d24>]
21:19:37 kernel: [29786.611225] Code: 48 8b 7c 02 08 e8 a0 60 cf ff 8b
95 d0 00 00 00 48 8b 8d d8 00 00 00 48 01 ca 0f b7 02 39 c3 7c d1 f6
42 0c 10 74 1e 48 8b 7a 30
21:19:37 kernel: [29786.611265] RIP  [<ffffffff803e7f81>]
21:19:37 kernel: [29786.611271]  RSP <ffff8802c8339d40>
21:19:37 kernel: [29786.671491] ---[ end trace 6875b40b2a9f1d46 ]---
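
One detail of the register dump may be worth a look. On x86-64, one common
cause of a general protection fault is an access through a non-canonical
address, that is, one whose upper bits are not all copies of bit 47. The sketch
below tests a few of the register values from the dump, assuming 48-bit virtual
addresses; it flags RDI only. Which register the faulting instruction actually
used cannot be determined from this excerpt, so treat this as a hint, not a
diagnosis. The function names are illustrative.

```python
def is_canonical(addr, va_bits=48):
    """True if bits 63..(va_bits-1) of a 64-bit address are all equal."""
    top = (addr & 0xFFFFFFFFFFFFFFFF) >> (va_bits - 1)
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return top == 0 or top == all_ones


# Pointer-like register values copied from the dump above.
REGS = {
    "RBX": 0xffff88007ccf39c0,
    "RCX": 0xffff8800e70db000,
    "RDX": 0xffff8800e70dbe80,
    "RDI": 0x0028f49c00000000,
    "R14": 0xffffc900048cb960,
}


def non_canonical(regs):
    """Names of registers whose value is not a canonical address."""
    return [name for name, value in regs.items() if not is_canonical(value)]
```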

(Note that the offsets after "+" in the call trace did not change after the
kernel update, compared with the trace posted previously, although the
absolute addresses did.)
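
That comparison can be done mechanically. The following is a minimal sketch,
assuming the traces are pasted as text in the format shown above; it extracts
symbol, offset, size and module from each frame and ignores the absolute
addresses. The names frames and same_fault_site are made up for illustration.

```python
import re

# Matches e.g. "[<ffffffffa059fd3c>] net_rx_action+0x66c/0x9c0 [netbk]"
FRAME = re.compile(
    r"\[<(?P<addr>[0-9a-f]{16})>\]\s+"
    r"(?P<sym>[\w.]+)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<mod>\w+)\])?"
)


def frames(oops_text):
    """Return (symbol, offset, size, module) tuples, ignoring addresses."""
    # Mail clients wrap long log lines, so collapse whitespace before matching.
    flat = " ".join(oops_text.split())
    return [
        (m.group("sym"), int(m.group("off"), 16),
         int(m.group("size"), 16), m.group("mod"))
        for m in FRAME.finditer(flat)
    ]


def same_fault_site(oops_a, oops_b):
    """True if both traces contain identical symbol+offset frames."""
    return frames(oops_a) == frames(oops_b)
```

Frames without a resolved symbol, such as a bare "[<...>]" entry, are skipped,
so two traces compare equal only on the frames the kernel could name.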

===[ Subsequent entries ]===

(Again, these can be entirely unrelated to the GPF, and sometimes happen
minutes later.)

21:19:38 avahi-daemon[3086]: Withdrawing workstation service for vif6.0.
21:19:38 kernel: [29787.904571] br1: port 6(vif6.0) entering forwarding state
21:19:38 kernel: [29787.904649] br1: port 6(vif6.0) entering disabled state
21:19:38 logger: /etc/xen/scripts/vif-bridge: offline type_if=vif
21:19:38 logger: /etc/xen/scripts/vif-bridge: brctl delif br1 vif6.0 failed
21:19:38 logger: /etc/xen/scripts/vif-bridge: ifconfig vif6.0 down failed
21:19:38 logger: /etc/xen/scripts/vif-bridge: Successful vif-bridge offline for vif6.0, bridge br1.
21:19:39 logger: /etc/xen/scripts/xen-hotplug-cleanup:
21:19:39 logger: /etc/xen/scripts/xen-hotplug-cleanup:
21:19:39 logger: /etc/xen/scripts/xen-hotplug-cleanup:
21:19:39 logger: /etc/xen/scripts/block: remove XENBUS_PATH=backend/vbd/6/51712
21:19:39 logger: /etc/xen/scripts/xen-hotplug-cleanup:
21:19:39 logger: /etc/xen/scripts/xen-hotplug-cleanup:
21:19:58 kernel: [29807.860561] br1: port 5(vif5.0) entering forwarding state
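
To check whether the faults really cluster around VM shutdown, the timestamps
in such excerpts can be compared mechanically. A minimal sketch, assuming
syslog lines in the format above (date and hostname stripped, all entries from
the same day); the function name and the default 60-second window are
arbitrary choices for illustration.

```python
import re
from datetime import datetime, timedelta

STAMP = re.compile(r"^(\d{2}:\d{2}:\d{2}) ")
VIF_DOWN = re.compile(r"port \d+\((vif\d+\.\d+)\) entering disabled state")
GPF = "general protection fault"


def _time(line):
    """Parse the leading HH:MM:SS stamp, or return None if there is none."""
    m = STAMP.match(line)
    return datetime.strptime(m.group(1), "%H:%M:%S") if m else None


def vifs_disabled_before_faults(lines, window_seconds=60):
    """For each GPF line, list the vifs disabled within the window before it."""
    window = timedelta(seconds=window_seconds)
    downs = []    # (time, vif name) for every bridge port disable seen so far
    result = []
    for line in lines:
        when = _time(line)
        if when is None:
            continue  # continuation of a wrapped line, no timestamp
        m = VIF_DOWN.search(line)
        if m:
            downs.append((when, m.group(1)))
        elif GPF in line:
            # Does not handle excerpts that cross midnight.
            recent = [vif for t, vif in downs
                      if timedelta(0) <= when - t <= window]
            result.append((when.strftime("%H:%M:%S"), recent))
    return result
```

Fed with the lines of a saved excerpt, it returns one entry per fault with the
interfaces that went down shortly before it; an empty list for a fault would
argue against the shutdown correlation.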

