Jan,
I updated to 3.9.3 from the stable branch.
Dom0:
hv04 # xl info
host : hv04.edss.local
release : 3.9.3-1.g0b5d8f5-xen
version : #1 SMP Mon May 20 09:43:24 UTC 2013 (0b5d8f5)
machine : x86_64
nr_cpus : 24
max_cpu_id : 23
nr_nodes : 2
cores_per_socket : 6
threads_per_core : 2
cpu_mhz : 2666
hw_caps : bfebfbff:2c100800:00000000:00003f40:029ee3ff:00000000:00000001:00000000
virt_caps : hvm hvm_directio
total_memory : 98294
free_memory : 92985
sharing_freed_memory : 0
sharing_used_memory : 0
free_cpus : 0
xen_major : 4
xen_minor : 2
xen_extra : .2_01-240.2
xen_caps : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler : credit
xen_pagesize : 4096
platform_params : virt_start=0xffff800000000000
xen_changeset : 26064
xen_commandline : vga=mode-0x314 dom0_max_vcpus=8 dom0_mem=4G,max:4G msi=on iommu=on console=ttyS0,115200 xen-netback.netback_max_groups=6
cc_compiler : gcc (SUSE Linux) 4.7.2 20130108 [gcc-4_7-branch revision 195012]
cc_compile_by : abuild
cc_compile_domain :
cc_compile_date : Fri May 10 15:05:06 UTC 2013
xend_config_format : 4
Both DomUs:
test:/home/local # uname -a
Linux test.edss.local 3.9.3-1.g0b5d8f5-xen #1 SMP Mon May 20 09:43:24 UTC 2013 (0b5d8f5) x86_64 x86_64 x86_64 GNU/Linux
From the xl console:
[ 4] local 10.251.2.201 port 5001 connected with 10.251.2.202 port 26946
[ ID] Interval Transfer Bandwidth
[ 4] 0.0- 2.0 sec 8.38 MBytes 35.2 Mbits/sec
[ 4] 2.0- 4.0 sec 90.3 MBytes 379 Mbits/sec
[ 4] 4.0- 6.0 sec 7.30 MBytes 30.6 Mbits/sec
[ 105.391855] BUG: unable to handle kernel paging request at ffff880078c6f000
[ 105.391918] IP: [<ffffffffa001a75c>] netif_poll+0x49c/0xe80 [xennet]
[ 105.391970] PGD a85067 PUD a95067 PMD 7fc2b067 PTE 8010000078c6f065
[ 105.392029] Oops: 0003 [#1] SMP
[ 105.392058] Modules linked in: iptable_filter ip_tables x_tables af_packet hwmon domctl crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel ablk_helper joydev cryptd lrw aes_x86_64 xts gf128mul autofs4 scsi_dh_emc scsi_dh_alua scsi_dh_rdac scsi_dh_hp_sw scsi_dh xenblk cdrom xennet ata_generic ata_piix
[ 105.392319] CPU 0
[ 105.392336] Pid: 0, comm: swapper/0 Not tainted 3.9.3-1.g0b5d8f5-xen #1
[ 105.392383] RIP: e030:[<ffffffffa001a75c>] [<ffffffffa001a75c>] netif_poll+0x49c/0xe80 [xennet]
[ 105.392450] RSP: e02b:ffff88007b403d18 EFLAGS: 00010286
[ 105.392485] RAX: ffff88007da85638 RBX: ffff880078c6eec0 RCX: ffff880078c6f000
[ 105.392531] RDX: ffff880078b54170 RSI: ffff880078c6eec0 RDI: ffff880078479280
[ 105.392576] RBP: ffff88007869c6c0 R08: 0000000000000dc0 R09: 0000000000000000
[ 105.392622] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000011
[ 105.392667] R13: 000000000000b826 R14: ffff88007b403dd8 R15: ffff880078428800
[ 105.393781] FS: 00007f11b3900700(0000) GS:ffff88007b400000(0000) knlGS:0000000000000000
[ 105.394907] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 105.395840] CR2: ffff880078c6f000 CR3: 00000000797f5000 CR4: 0000000000002660
[ 105.395840] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 105.395840] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 105.395840] Process swapper/0 (pid: 0, threadinfo ffffffff808b8000, task ffffffff808c3960)
[ 105.395840] Stack:
[ 105.395840] ffffffff00000269 0000002200000022 ffff8800784288d0 ffff88007b403db8
[ 105.395840] ffff88007b403d98 000000150000b815 ffff880078429d50 0000000000000000
[ 105.395840] ffff880078428858 ffff88007b40e878 0000000000000012 000000400000b82a
[ 105.395840] Call Trace:
[ 105.395840] [<ffffffff804419c0>] net_rx_action+0x170/0x2c0
[ 105.395840] [<ffffffff80036dae>] __do_softirq+0xee/0x230
[ 105.395840] [<ffffffff80037075>] irq_exit+0x95/0xa0
[ 105.395840] [<ffffffff803937cd>] evtchn_do_upcall+0x2ad/0x2f0
[ 105.395840] [<ffffffff80535a0e>] do_hypervisor_callback+0x1e/0x30
[ 105.395840] [<ffffffff800033aa>] HYPERVISOR_sched_op_new+0xa/0x20
[ 105.395840] [<ffffffff8000e671>] xen_idle+0x41/0x110
[ 105.395840] [<ffffffff8000e7ef>] cpu_idle+0xaf/0x110
[ 105.395840] [<ffffffff80946b1f>] start_kernel+0x424/0x42f
[ 105.395840] Code: 44 21 ea 48 8d 54 d0 40 8b 87 d8 00 00 00 44 0f bf 42 06 44 0f b7 4a 02 48 8b 44 01 30 49 63 cc 48 83 c1 03 48 c1 e1 04 48 01 f1 <48> 89 01 44 89 49 08 44 89 41 0c 48 8b 08 80 e5 80 0f 85 54 09
[ 105.395840] RIP [<ffffffffa001a75c>] netif_poll+0x49c/0xe80 [xennet]
[ 105.395840] RSP <ffff88007b403d18>
[ 105.395840] CR2: ffff880078c6f000
[ 105.395840] ---[ end trace 67dcc8bdc485cab5 ]---
[ 105.395840] Kernel panic - not syncing: Fatal exception in interrupt
--
Best regards,
Eugene Istomin
On Tuesday, May 21, 2013 10:55:00 AM Jan Beulich wrote:
> >>> On 17.05.13 at 15:00, Eugene Istomin <e.istomin@xxxxxxx> wrote:
> > Bump, here it is:
> Okay, but I think we're still lacking information on what your
> Dom0 kernel is.
>
> Ian, Wei - looking at the forward ported as well as the upstream
> frontends, I'm wondering if there isn't a fundamental flaw in
> ..._get_responses(): It allows up to MAX_SKB_FRAGS + 1 slots/
> frags, and ..._fill_frags() then fills as many fragments as got
> queued onto the respective list. Only after both of them are done,
> __pskb_pull_tail() gets invoked reducing the fragment count by
> one if the condition is met that made ..._get_responses() bump
> the limit by one.
>
> Am I overlooking something? I'm asking because working through
> disassembly and register values of the dump Eugene had sent I
> clearly see that ..._fill_frags() is processing the 18th fragment,
> while in that kernel version MAX_SKB_FRAGS is only 17 (but
> possibly, hence the first question above, the Dom0 kernel still is
> one with MAX_SKB_FRAGS being 18, or the packet turns out to
> be such that it fills 17 fragments and the header is smaller than
> RX_COPY_THRESHOLD).
>
> Jan