Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network troubles
Wednesday, February 26, 2014, 10:14:42 AM, you wrote:

> Friday, February 21, 2014, 7:32:08 AM, you wrote:
>> On 2014/2/20 19:18, Sander Eikelenboom wrote:
>>> Thursday, February 20, 2014, 10:49:58 AM, you wrote:
>>>
>>>> On 2014/2/19 5:25, Sander Eikelenboom wrote:
>>>>> Hi All,
>>>>>
>>>>> I'm currently having some network trouble with Xen and recent Linux
>>>>> kernels.
>>>>>
>>>>> - When running with a 3.14-rc3 kernel in dom0 and a 3.13 kernel in domU,
>>>>>   I get what seems to be described in this thread:
>>>>>   http://www.spinics.net/lists/netdev/msg242953.html
>>>>>
>>>>> In the guest:
>>>>> [57539.859584] net eth0: rx->offset: 0, size: 4294967295
>>>>> [57539.859599] net eth0: rx->offset: 0, size: 4294967295
>>>>> [57539.859605] net eth0: rx->offset: 0, size: 4294967295
>>>>> [57539.859610] net eth0: Need more slots
>>>>> [58157.675939] net eth0: Need more slots
>>>>> [58725.344712] net eth0: Need more slots
>>>>> [61815.849180] net eth0: rx->offset: 0, size: 4294967295
>>>>> [61815.849205] net eth0: rx->offset: 0, size: 4294967295
>>>>> [61815.849216] net eth0: rx->offset: 0, size: 4294967295
>>>>> [61815.849225] net eth0: Need more slots
>>>> This issue is familiar... and I thought it got fixed.
>>>> From the original analysis of a similar issue I hit before, the root cause
>>>> is that netback still creates a response when the ring is full. I remember
>>>> a larger MTU could trigger this issue before; what is the MTU size?
>>> In dom0 both the physical NICs and the guest vifs have MTU=1500.
>>> In domU the eth0 also has MTU=1500.
>>>
>>> So it's not jumbo frames .. just the same plain defaults everywhere.
>>>
>>> With the patch from Wei that solves the other issue, I'm still seeing the
>>> "Need more slots" issue on 3.14-rc3 + Wei's patch now.
>>> I have extended the "need more slots" warning to also print cons, slots,
>>> max, rx->offset and size; hope that gives some more insight.
>>> But it is indeed the VM where I had similar issues before; the primary
>>> thing this VM does is 2 simultaneous rsyncs (one push, one pull) with
>>> some gigabytes of data.
>>>
>>> This time it was also accompanied by a "grant_table.c:1857:d0 Bad grant
>>> reference" as seen below; I don't know if that's a cause or an effect though.
>> The log "grant_table.c:1857:d0 Bad grant reference" was also seen before.
>> Probably the response overlaps the request, and the grant copy returns an
>> error when using the wrong grant reference. Netback then returns resp->status
>> with XEN_NETIF_RSP_ERROR (-1), which is the 4294967295 printed above by the
>> frontend.
>> Would it be possible to print a log in xenvif_rx_action of netback to see
>> whether something is wrong with max slots and used slots?
>> Thanks
>> Annie
> Looking more closely, these are perhaps 2 different issues ... the bad grant
> references do not happen at the same time as the netfront messages in the
> guest.
> I added some debug patches to the kernel netback, netfront and Xen grant
> table code (see below).
> One of the things was to simplify the code for the debug key that prints the
> grant tables; the present code takes too long to execute and brings down the
> box due to stalls and NMIs. So it now only prints the number of entries per
> domain.
>
> Issue 1: grant_table.c:1858:d0 Bad grant reference
>
> After running the box for just one night (with 15 VMs) I get these mentions
> of "Bad grant reference".
> The maptrack also seems to increase quite fast, and the number of entries
> seems to have gone up quite fast as well.
> Most domains have just one disk (blkfront/blkback) and one NIC; a few have
> a second disk.
> The blk drivers use persistent grants, so I would assume they would reuse
> those and not increase the count (by much).
> Domain 1 seems to have increased its nr_grant_entries from 2048 to 3072
> somewhere during the night.
> Domain 7 is the domain that happens to give the netfront messages.
> I also don't get why it is reporting the "Bad grant reference" for domain 0,
> which seems to have 0 active entries ..
> Also, is this number of grant entries "normal", or could it be a leak
> somewhere?
>
> (XEN) [2014-02-26 00:00:38] grant_table.c:1250:d1 Expanding dom (1) grant table from (4) to (5) frames.
> (XEN) [2014-02-26 00:00:38] grant_table.c:1250:d1 Expanding dom (1) grant table from (5) to (6) frames.
> (XEN) [2014-02-26 00:00:38] grant_table.c:290:d0 Increased maptrack size to 13/256 frames
> (XEN) [2014-02-26 00:01:13] grant_table.c:290:d0 Increased maptrack size to 14/256 frames
> (XEN) [2014-02-26 04:02:55] grant_table.c:1858:d0 Bad grant reference 4325377 | 2048 | 1 | 0
> (XEN) [2014-02-26 04:15:33] grant_table.c:290:d0 Increased maptrack size to 15/256 frames
> (XEN) [2014-02-26 04:15:53] grant_table.c:290:d0 Increased maptrack size to 16/256 frames
> (XEN) [2014-02-26 04:15:56] grant_table.c:290:d0 Increased maptrack size to 17/256 frames
> (XEN) [2014-02-26 04:15:56] grant_table.c:290:d0 Increased maptrack size to 18/256 frames
> (XEN) [2014-02-26 04:15:57] grant_table.c:290:d0 Increased maptrack size to 19/256 frames
> (XEN) [2014-02-26 04:15:57] grant_table.c:290:d0 Increased maptrack size to 20/256 frames
> (XEN) [2014-02-26 04:15:59] grant_table.c:290:d0 Increased maptrack size to 21/256 frames
> (XEN) [2014-02-26 04:16:00] grant_table.c:290:d0 Increased maptrack size to 22/256 frames
> (XEN) [2014-02-26 04:16:00] grant_table.c:290:d0 Increased maptrack size to 23/256 frames
> (XEN) [2014-02-26 04:16:00] grant_table.c:290:d0 Increased maptrack size to 24/256 frames
> (XEN) [2014-02-26 04:16:10] grant_table.c:290:d0 Increased maptrack size to 25/256 frames
> (XEN) [2014-02-26 04:16:10] grant_table.c:290:d0 Increased maptrack size to 26/256 frames
> (XEN) [2014-02-26 04:16:17] grant_table.c:290:d0 Increased maptrack size to 27/256 frames
> (XEN) [2014-02-26 04:16:20] grant_table.c:290:d0 Increased maptrack size to 28/256 frames
> (XEN) [2014-02-26 04:16:56] grant_table.c:290:d0 Increased maptrack size to 29/256 frames
> (XEN) [2014-02-26 05:15:04] grant_table.c:290:d0 Increased maptrack size to 30/256 frames
> (XEN) [2014-02-26 05:15:05] grant_table.c:290:d0 Increased maptrack size to 31/256 frames
> (XEN) [2014-02-26 05:21:15] grant_table.c:1858:d0 Bad grant reference 107085839 | 2048 | 1 | 0
> (XEN) [2014-02-26 05:29:47] grant_table.c:1858:d0 Bad grant reference 268435460 | 2048 | 1 | 0
> (XEN) [2014-02-26 07:53:20] gnttab_usage_print_all [ key 'g' pressed
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 0 (v1) nr_grant_entries: 2048
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 0 active entries: 0
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 1 (v1) nr_grant_entries: 3072
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 1 (v1)
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 1 active entries: 2117
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 2 (v1) nr_grant_entries: 2048
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 2 (v1)
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 2 active entries: 1060
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 3 (v1) nr_grant_entries: 2048
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 3 (v1)
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 3 active entries: 1060
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 4 (v1) nr_grant_entries: 2048
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 4 (v1)
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 4 active entries: 1060
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 5 (v1) nr_grant_entries: 2048
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 5 (v1)
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 5 active entries: 1060
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 6 (v1) nr_grant_entries: 2048
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 6 (v1)
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 6 active entries: 1060
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 7 (v1) nr_grant_entries: 2048
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 7 (v1)
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 7 active entries: 1060
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 8 (v1) nr_grant_entries: 2048
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 8 (v1)
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 8 active entries: 1060
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 9 (v1) nr_grant_entries: 2048
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 9 (v1)
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 9 active entries: 1060
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 10 (v1) nr_grant_entries: 2048
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 10 (v1)
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 10 active entries: 1060
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 11 (v1) nr_grant_entries: 2048
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 11 (v1)
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 11 active entries: 1061
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 12 (v1) nr_grant_entries: 2048
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 12 (v1)
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 12 active entries: 1045
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 13 (v1) nr_grant_entries: 2048
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 13 (v1)
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 13 active entries: 1060
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 14 (v1) nr_grant_entries: 2048
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 14 (v1)
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 14 active entries: 709
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 15 (v1) nr_grant_entries: 2048
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 15 (v1)
> (XEN) [2014-02-26 07:53:20] grant-table for remote domain: 15 active entries: 163
> (XEN) [2014-02-26 07:53:20] gnttab_usage_print_all ] done
> (XEN) [2014-02-26 07:55:09] grant_table.c:1858:d0 Bad grant reference 4325377 | 2048 | 1 | 0
> (XEN) [2014-02-26 08:37:16] grant_table.c:1858:d0 Bad grant reference 268435460 | 2048 | 1 | 0
>
> Issue 2: net eth0: rx->offset: 0, size: xxxxxxxxxx
>
> In the guest (domain 7):
> Feb 26 08:55:09 backup kernel: [39258.090375] net eth0: rx->offset: 0, size: 4294967295
> Feb 26 08:55:09 backup kernel: [39258.090392] net eth0: me here .. cons:15177803 slots:1 rp:15177807 max:18 err:0 rx->id:74 rx->offset:0 size:4294967295 ref:533
> Feb 26 08:55:09 backup kernel: [39258.090401] net eth0: rx->offset: 0, size: 4294967295
> Feb 26 08:55:09 backup kernel: [39258.090406] net eth0: me here .. cons:15177803 slots:2 rp:15177807 max:18 err:-22 rx->id:76 rx->offset:0 size:4294967295 ref:686
> Feb 26 08:55:09 backup kernel: [39258.090415] net eth0: rx->offset: 0, size: 4294967295
> Feb 26 08:55:09 backup kernel: [39258.090420] net eth0: me here .. cons:15177803 slots:3 rp:15177807 max:18 err:-22 rx->id:77 rx->offset:0 size:4294967295 ref:571
>
> In dom0 I don't see any specific netback warnings related to this domain at
> these specific times. The printk's I added do trigger quite a few times, but
> those are probably not erroneous; they seem to occur only on the vif of
> domain 7 (probably the only domain that is swamping the network by doing
> rsync and webdav, which causes some fragmented packets).

Another addition ...
the guest doesn't shut down anymore on "xl shutdown" .. it just does .. erhmm, nothing (tried multiple times).
After that I ssh'ed into the guest and did a "halt -p" ... the guest shut down, but it remained in "xl list" in the blocked state.
Doing an "xl console" shows:

[30024.559656] net eth0: me here .. cons:8713451 slots:1 rp:8713462 max:18 err:0 rx->id:234 rx->offset:0 size:4294967295 ref:-131941395332550
[30024.559666] net eth0: rx->offset: 0, size: 4294967295
[30024.559671] net eth0: me here .. cons:8713451 slots:2 rp:8713462 max:18 err:-22 rx->id:236 rx->offset:0 size:4294967295 ref:-131941395332504
[30024.559680] net eth0: rx->offset: 0, size: 4294967295
[30024.559686] net eth0: me here .. cons:8713451 slots:3 rp:8713462 max:18 err:-22 rx->id:1 rx->offset:0 size:4294967295 ref:-131941395332390
[30536.665135] net eth0: Need more slots cons:9088533 slots:6 rp:9088539 max:17 err:0 rx-id:26 rx->offset:0 size:0 ref:687
[39258.090375] net eth0: rx->offset: 0, size: 4294967295
[39258.090392] net eth0: me here .. cons:15177803 slots:1 rp:15177807 max:18 err:0 rx->id:74 rx->offset:0 size:4294967295 ref:533
[39258.090401] net eth0: rx->offset: 0, size: 4294967295
[39258.090406] net eth0: me here .. cons:15177803 slots:2 rp:15177807 max:18 err:-22 rx->id:76 rx->offset:0 size:4294967295 ref:686
[39258.090415] net eth0: rx->offset: 0, size: 4294967295
[39258.090420] net eth0: me here .. cons:15177803 slots:3 rp:15177807 max:18 err:-22 rx->id:77 rx->offset:0 size:4294967295 ref:571
INIT: Switching to runlevel: 0
INIT: Sending processes the TERM signal
[info] Using makefile-style concurrent boot in runlevel 0.
Stopping openntpd: ntpd.
[ ok ] Stopping mail-transfer-agent: nullmailer.
[ ok ] Stopping web server: apache2 ... waiting .
[ ok ] Asking all remaining processes to terminate...done.
[ ok ] All processes ended within 2 seconds...done.
[ ok ] Stopping enhanced syslogd: rsyslogd.
[ ok ] Deconfiguring network interfaces...done.
[ ok ] Deactivating swap...done.
[65015.958259] EXT4-fs (xvda1): re-mounted. Opts: (null)
[info] Will now halt.
[65018.166546] vif vif-0: 5 starting transaction
[65160.490419] INFO: task halt:4846 blocked for more than 120 seconds.
[65160.490464]       Not tainted 3.14.0-rc4-20140225-vanilla-nfnbdebug2+ #1
[65160.490485] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[65160.490510] halt            D ffff88001d6cfc38     0  4846   4838 0x00000000
[65280.490470] INFO: task halt:4846 blocked for more than 120 seconds.
[65280.490517]       Not tainted 3.14.0-rc4-20140225-vanilla-nfnbdebug2+ #1
[65280.490540] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[65280.490564] halt            D ffff88001d6cfc38     0  4846   4838 0x00000000

Especially the "[65018.166546] vif vif-0: 5 starting transaction" after the halt surprises me ..

--
Sander

> Feb 26 08:53:20 serveerstertje kernel: [39324.917255] vif vif-7-0 vif7.0: !?!?!?! skb may not fit .. bail out now max_slots_needed:4 GSO:1 vif->rx_last_skb_slots:0 nr_frags:2 prod:15101115 cons:15101112 j:8
> Feb 26 08:53:56 serveerstertje kernel: [39361.001436] vif vif-7-0 vif7.0: !?!?!?! skb may not fit .. bail out now max_slots_needed:4 GSO:1 vif->rx_last_skb_slots:0 nr_frags:1 prod:15127649 cons:15127648 j:13
> Feb 26 08:54:00 serveerstertje kernel: [39364.725613] vif vif-7-0 vif7.0: !?!?!?! skb may not fit .. bail out now max_slots_needed:4 GSO:1 vif->rx_last_skb_slots:0 nr_frags:1 prod:15130263 cons:15130261 j:2
> Feb 26 08:54:04 serveerstertje kernel: [39368.739504] vif vif-7-0 vif7.0: !?!?!?! skb may not fit .. bail out now max_slots_needed:4 GSO:1 vif->rx_last_skb_slots:0 nr_frags:2 prod:15133143 cons:15133141 j:0
> Feb 26 08:54:20 serveerstertje kernel: [39384.665044] vif vif-7-0 vif7.0: !?!?!?! skb may not fit .. bail out now max_slots_needed:4 GSO:1 vif->rx_last_skb_slots:0 nr_frags:1 prod:15144113 cons:15144112 j:0
> Feb 26 08:54:29 serveerstertje kernel: [39393.569871] vif vif-7-0 vif7.0: !?!?!?! skb may not fit .. bail out now max_slots_needed:4 GSO:1 vif->rx_last_skb_slots:0 nr_frags:1 prod:15150203 cons:15150200 j:0
> Feb 26 08:54:40 serveerstertje kernel: [39404.586566] vif vif-7-0 vif7.0: !?!?!?! skb may not fit .. bail out now max_slots_needed:3 GSO:1 vif->rx_last_skb_slots:0 nr_frags:1 prod:15157706 cons:15157704 j:12
> Feb 26 08:54:56 serveerstertje kernel: [39420.759769] vif vif-7-0 vif7.0: !?!?!?! skb may not fit .. bail out now max_slots_needed:6 GSO:1 vif->rx_last_skb_slots:0 nr_frags:2 prod:15168839 cons:15168835 j:0
> Feb 26 08:54:56 serveerstertje kernel: [39421.001372] vif vif-7-0 vif7.0: !?!?!?! skb may not fit .. bail out now max_slots_needed:4 GSO:1 vif->rx_last_skb_slots:0 nr_frags:1 prod:15169002 cons:15168999 j:8
> Feb 26 08:55:00 serveerstertje kernel: [39424.515073] vif vif-7-0 vif7.0: !?!?!?! skb may not fit .. bail out now max_slots_needed:4 GSO:1 vif->rx_last_skb_slots:0 nr_frags:1 prod:15171450 cons:15171447 j:0
> Feb 26 08:55:10 serveerstertje kernel: [39435.154510] vif vif-7-0 vif7.0: !?!?!?! skb may not fit .. bail out now max_slots_needed:4 GSO:1 vif->rx_last_skb_slots:0 nr_frags:1 prod:15178773 cons:15178770 j:1
> Feb 26 08:56:19 serveerstertje kernel: [39504.195908] vif vif-7-0 vif7.0: !?!?!?! skb may not fit .. bail out now max_slots_needed:4 GSO:1 vif->rx_last_skb_slots:0 nr_frags:1 prod:15227444 cons:15227444 j:0
> Feb 26 08:57:39 serveerstertje kernel: [39583.799392] vif vif-7-0 vif7.0: !?!?!?! skb may not fit .. bail out now max_slots_needed:4 GSO:1 vif->rx_last_skb_slots:0 nr_frags:1 prod:15283346 cons:15283344 j:8
> Feb 26 08:57:55 serveerstertje kernel: [39599.517673] vif vif-7-0 vif7.0: !?!?!?! skb may not fit .. bail out now max_slots_needed:4 GSO:1 vif->rx_last_skb_slots:0 nr_frags:2 prod:15293937 cons:15293935 j:0
> Feb 26 08:58:07 serveerstertje kernel: [39612.156622] vif vif-7-0 vif7.0: !?!?!?! skb may not fit .. bail out now max_slots_needed:3 GSO:1 vif->rx_last_skb_slots:0 nr_frags:1 prod:15302891 cons:15302889 j:19
> Feb 26 08:58:07 serveerstertje kernel: [39612.400907] vif vif-7-0 vif7.0: !?!?!?! skb may not fit .. bail out now max_slots_needed:3 GSO:1 vif->rx_last_skb_slots:0 nr_frags:1 prod:15303034 cons:15303033 j:0
> Feb 26 08:58:18 serveerstertje kernel: [39623.439383] vif vif-7-0 vif7.0: !?!?!?! skb may not fit .. bail out now max_slots_needed:6 GSO:1 vif->rx_last_skb_slots:0 nr_frags:2 prod:15310915 cons:15310911 j:0
> Feb 26 08:58:39 serveerstertje kernel: [39643.521808] vif vif-7-0 vif7.0: !?!?!?! skb may not fit .. bail out now max_slots_needed:6 GSO:1 vif->rx_last_skb_slots:0 nr_frags:1 prod:15324769 cons:15324766 j:1
> Feb 26 09:27:07 serveerstertje kernel: [41351.622501] vif vif-7-0 vif7.0: !?!?!?! skb may not fit .. bail out now max_slots_needed:5 GSO:1 vif->rx_last_skb_slots:0 nr_frags:1 prod:16502932 cons:16502932 j:8
> Feb 26 09:27:19 serveerstertje kernel: [41363.541003] vif vif-7-0 vif7.0: !?!?!?! skb may not fit .. bail out now max_slots_needed:5 GSO:1 vif->rx_last_skb_slots:0 nr_frags:2 prod:16510837 cons:16510834 j:7
> Feb 26 09:27:23 serveerstertje kernel: [41368.133306] vif vif-7-0 vif7.0: !?!?!?! skb may not fit .. bail out now max_slots_needed:5 GSO:1 vif->rx_last_skb_slots:0 nr_frags:1 prod:16513940 cons:16513937 j:0
> Feb 26 09:27:43 serveerstertje kernel: [41388.025147] vif vif-7-0 vif7.0: !?!?!?! skb may not fit .. bail out now max_slots_needed:3 GSO:1 vif->rx_last_skb_slots:0 nr_frags:1 prod:16527870 cons:16527868 j:0
> Feb 26 09:27:47 serveerstertje kernel: [41391.530802] vif vif-7-0 vif7.0: !?!?!?! skb may not fit .. bail out now max_slots_needed:5 GSO:1 vif->rx_last_skb_slots:0 nr_frags:2 prod:16530437 cons:16530437 j:7
> Feb 26 09:27:51 serveerstertje kernel: [41395.521166] vif vif-7-0 vif7.0: !?!?!?! skb may not fit .. bail out now max_slots_needed:5 GSO:1 vif->rx_last_skb_slots:0 nr_frags:1 prod:16533320 cons:16533317 j:6
> Feb 26 09:27:51 serveerstertje kernel: [41395.767066] vif vif-7-0 vif7.0: !?!?!?! skb may not fit .. bail out now max_slots_needed:3 GSO:1 vif->rx_last_skb_slots:0 nr_frags:1 prod:16533469 cons:16533469 j:0
> Feb 26 09:27:51 serveerstertje kernel: [41395.802319] vif vif-7-0 vif7.0: !?!?!?! skb may not fit .. bail out now max_slots_needed:1 GSO:0 vif->rx_last_skb_slots:0 nr_frags:0 prod:16533533 cons:16533533 j:24
> Feb 26 09:27:51 serveerstertje kernel: [41395.837456] vif vif-7-0 vif7.0: !?!?!?! skb may not fit .. bail out now max_slots_needed:1 GSO:0 vif->rx_last_skb_slots:0 nr_frags:0 prod:16533534 cons:16533534 j:1
> Feb 26 09:27:51 serveerstertje kernel: [41395.872587] vif vif-7-0 vif7.0: !?!?!?! skb may not fit .. bail out now max_slots_needed:3 GSO:1 vif->rx_last_skb_slots:0 nr_frags:1 prod:16533597 cons:16533596 j:25
> Feb 26 09:27:51 serveerstertje kernel: [41396.192784] vif vif-7-0 vif7.0: !?!?!?! skb may not fit .. bail out now max_slots_needed:3 GSO:1 vif->rx_last_skb_slots:0 nr_frags:1 prod:16533833 cons:16533832 j:3
> Feb 26 09:27:51 serveerstertje kernel: [41396.235611] vif vif-7-0 vif7.0: !?!?!?! skb may not fit .. bail out now max_slots_needed:3 GSO:1 vif->rx_last_skb_slots:0 nr_frags:1 prod:16533890 cons:16533890 j:30
> Feb 26 09:27:51 serveerstertje kernel: [41396.271047] vif vif-7-0 vif7.0: !?!?!?! skb may not fit .. bail out now max_slots_needed:3 GSO:1 vif->rx_last_skb_slots:0 nr_frags:1 prod:16533898 cons:16533896 j:3
> --
> Sander
>>>
>>> Will keep you posted when it triggers again with the extra info in the warn.
>>>
>>> --
>>> Sander
>>>
>>>> Thanks
>>>> Annie
>>>>> Xen reports:
>>>>> (XEN) [2014-02-18 03:22:47] grant_table.c:1857:d0 Bad grant reference 19791875
>>>>> (XEN) [2014-02-18 03:42:33] grant_table.c:1857:d0 Bad grant reference 268435460
>>>>> (XEN) [2014-02-18 04:15:23] grant_table.c:289:d0 Increased maptrack size to 14 frames
>>>>> (XEN) [2014-02-18 04:15:27] grant_table.c:289:d0 Increased maptrack size to 15 frames
>>>>> (XEN) [2014-02-18 04:15:48] grant_table.c:289:d0 Increased maptrack size to 16 frames
>>>>> (XEN) [2014-02-18 04:15:50] grant_table.c:289:d0 Increased maptrack size to 17 frames
>>>>> (XEN) [2014-02-18 04:15:55] grant_table.c:289:d0 Increased maptrack size to 18 frames
>>>>> (XEN) [2014-02-18 04:15:55] grant_table.c:289:d0 Increased maptrack size to 19 frames
>>>>> (XEN) [2014-02-18 04:15:56] grant_table.c:289:d0 Increased maptrack size to 20 frames
>>>>> (XEN) [2014-02-18 04:15:56] grant_table.c:289:d0 Increased maptrack size to 21 frames
>>>>> (XEN) [2014-02-18 04:15:59] grant_table.c:289:d0 Increased maptrack size to 22 frames
>>>>> (XEN) [2014-02-18 04:15:59] grant_table.c:289:d0 Increased maptrack size to 23 frames
>>>>> (XEN) [2014-02-18 04:16:00] grant_table.c:289:d0 Increased maptrack size to 24 frames
>>>>> (XEN) [2014-02-18 04:16:05] grant_table.c:289:d0 Increased maptrack size to 25 frames
>>>>> (XEN) [2014-02-18 04:16:05] grant_table.c:289:d0 Increased maptrack size to 26 frames
>>>>> (XEN) [2014-02-18 04:16:06] grant_table.c:289:d0 Increased maptrack size to 27 frames
>>>>> (XEN) [2014-02-18 04:16:12] grant_table.c:289:d0 Increased maptrack size to 28 frames
>>>>> (XEN) [2014-02-18 04:16:18] grant_table.c:289:d0 Increased maptrack size to 29 frames
>>>>> (XEN) [2014-02-18 04:17:00] grant_table.c:1857:d0 Bad grant reference 268435460
>>>>> (XEN) [2014-02-18 04:17:00] grant_table.c:1857:d0 Bad grant reference 268435460
>>>>> (XEN) [2014-02-18 04:34:03] grant_table.c:1857:d0 Bad grant reference 4325377
>>>>>
>>>>> Another issue with networking is when running both dom0 and domUs with
>>>>> a 3.14-rc3 kernel:
>>>>> - I can ping the guests from dom0
>>>>> - I can ping dom0 from the guests
>>>>> - But I can't ssh or access things by http
>>>>> - I don't see any relevant error messages ...
>>>>> - This is with the same system and kernel config as with the 3.14 and
>>>>>   3.13 combination above (that previously worked fine)
>>>>>
>>>>> --
>>>>> Sander
>>>>>
>>>>> _______________________________________________
>>>>> Xen-devel mailing list
>>>>> Xen-devel@xxxxxxxxxxxxx
>>>>> http://lists.xen.org/xen-devel
>>>
>>>
> diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
> index 4fc46eb..4d720b4 100644
> --- a/tools/libxl/xl_cmdimpl.c
> +++ b/tools/libxl/xl_cmdimpl.c
> @@ -1667,6 +1667,8 @@ skip_vfb:
>              b_info->u.hvm.vga.kind = LIBXL_VGA_INTERFACE_TYPE_STD;
>          } else if (!strcmp(buf, "cirrus")) {
>              b_info->u.hvm.vga.kind = LIBXL_VGA_INTERFACE_TYPE_CIRRUS;
> +        } else if (!strcmp(buf, "none")) {
> +            b_info->u.hvm.vga.kind = LIBXL_VGA_INTERFACE_TYPE_NONE;
>          } else {
>              fprintf(stderr, "Unknown vga \"%s\" specified\n", buf);
>              exit(1);
> diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
> index 107b000..ab56927 100644
> --- a/xen/common/grant_table.c
> +++ b/xen/common/grant_table.c
> @@ -265,9 +265,10 @@ get_maptrack_handle(
>      while ( unlikely((handle = __get_maptrack_handle(lgt)) == -1) )
>      {
>          nr_frames = nr_maptrack_frames(lgt);
> -        if ( nr_frames >= max_nr_maptrack_frames() )
> +        if ( nr_frames >= max_nr_maptrack_frames() ){
> +            gdprintk(XENLOG_INFO, "Already at max maptrack size: %u/%u frames\n", nr_frames, max_nr_maptrack_frames());
>              break;
> -
> +        }
>          new_mt = alloc_xenheap_page();
>          if ( !new_mt )
>              break;
> @@ -285,8 +286,8 @@ get_maptrack_handle(
>          smp_wmb();
>          lgt->maptrack_limit = new_mt_limit;
> -        gdprintk(XENLOG_INFO, "Increased maptrack size to %u frames\n",
> -                 nr_frames + 1);
> +        gdprintk(XENLOG_INFO, "Increased maptrack size to %u/%u frames\n",
> +                 nr_frames + 1, max_nr_maptrack_frames());
>      }
>      spin_unlock(&lgt->lock);
> @@ -1854,7 +1855,7 @@ __acquire_grant_for_copy(
>      if ( unlikely(gref >= nr_grant_entries(rgt)) )
>          PIN_FAIL(unlock_out, GNTST_bad_gntref,
> -                 "Bad grant reference %ld\n", gref);
> +                 "Bad grant reference %ld | %d | %d | %d \n", gref, nr_grant_entries(rgt), rgt->gt_version, ldom);
>      act = &active_entry(rgt, gref);
>      shah = shared_entry_header(rgt, gref);
> @@ -2830,15 +2831,19 @@ static void gnttab_usage_print(struct domain *rd)
>      int first = 1;
>      grant_ref_t ref;
>      struct grant_table *gt = rd->grant_table;
> -
> +    unsigned int active=0;
> +/*
>      printk("      -------- active --------       -------- shared --------\n");
>      printk("[ref] localdom mfn      pin          localdom gmfn     flags\n");
> -
> +*/
>      spin_lock(&gt->lock);
>      if ( gt->gt_version == 0 )
>          goto out;
> +    printk("grant-table for remote domain:%5d (v%d) nr_grant_entries: %d\n",
> +           rd->domain_id, gt->gt_version, nr_grant_entries(gt));
> +
>      for ( ref = 0; ref != nr_grant_entries(gt); ref++ )
>      {
>          struct active_grant_entry *act;
> @@ -2875,19 +2880,22 @@ static void gnttab_usage_print(struct domain *rd)
>                     rd->domain_id, gt->gt_version);
>              first = 0;
>          }
> -
> +        active++;
>          /*      [ddd]    ddddd 0xXXXXXX 0xXXXXXXXX      ddddd 0xXXXXXX 0xXX */
> -        printk("[%3d]    %5d 0x%06lx 0x%08x      %5d 0x%06"PRIx64" 0x%02x\n",
> -               ref, act->domid, act->frame, act->pin,
> -               sha->domid, frame, status);
> +        /* printk("[%3d]    %5d 0x%06lx 0x%08x      %5d 0x%06"PRIx64" 0x%02x\n", ref, act->domid, act->frame, act->pin, sha->domid, frame, status); */
>      }
>  out:
>      spin_unlock(&gt->lock);
> +    printk("grant-table for remote domain:%5d active entries: %d\n",
> +           rd->domain_id, active);
> +/*
>      if ( first )
>          printk("grant-table for remote domain:%5d ... "
>                 "no active grant table entries\n", rd->domain_id);
> +*/
> +
>  }
>  static void gnttab_usage_print_all(unsigned char key)
> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
> index e5284bc..6d93358 100644
> --- a/drivers/net/xen-netback/netback.c
> +++ b/drivers/net/xen-netback/netback.c
> @@ -482,20 +482,23 @@ static void xenvif_rx_action(struct xenvif *vif)
>          .meta  = vif->meta,
>      };
> +    int j=0;
> +
>      skb_queue_head_init(&rxq);
>      while ((skb = skb_dequeue(&vif->rx_queue)) != NULL) {
>          RING_IDX max_slots_needed;
>          int i;
> +        int nr_frags;
>          /* We need a cheap worse case estimate for the number of
>           * slots we'll use.
>           */
>          max_slots_needed = DIV_ROUND_UP(offset_in_page(skb->data) +
> -                                        skb_headlen(skb),
> -                                        PAGE_SIZE);
> -        for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
> +                                        skb_headlen(skb), PAGE_SIZE);
> +        nr_frags = skb_shinfo(skb)->nr_frags;
> +        for (i = 0; i < nr_frags; i++) {
>              unsigned int size;
>              size = skb_frag_size(&skb_shinfo(skb)->frags[i]);
>              max_slots_needed += DIV_ROUND_UP(size, PAGE_SIZE);
> @@ -508,6 +511,9 @@ static void xenvif_rx_action(struct xenvif *vif)
>          if (!xenvif_rx_ring_slots_available(vif, max_slots_needed)) {
>              skb_queue_head(&vif->rx_queue, skb);
>              need_to_notify = true;
> +            if (net_ratelimit())
> +                netdev_err(vif->dev, "!?!?!?! skb may not fit .. bail out now max_slots_needed:%d GSO:%d vif->rx_last_skb_slots:%d nr_frags:%d prod:%d cons:%d j:%d\n",
> +                           max_slots_needed, (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV4 || skb_shinfo(skb)->gso_type & SKB_GSO_TCPV6) ? 1 : 0, vif->rx_last_skb_slots, nr_frags, vif->rx.sring->req_prod, vif->rx.req_cons, j);
>              vif->rx_last_skb_slots = max_slots_needed;
>              break;
>          } else
> @@ -518,6 +524,7 @@ static void xenvif_rx_action(struct xenvif *vif)
>          BUG_ON(sco->meta_slots_used > max_slots_needed);
>          __skb_queue_tail(&rxq, skb);
> +        j++;
>      }
>      BUG_ON(npo.meta_prod > ARRAY_SIZE(vif->meta));
> @@ -541,7 +548,7 @@ static void xenvif_rx_action(struct xenvif *vif)
>              resp->offset = vif->meta[npo.meta_cons].gso_size;
>              resp->id = vif->meta[npo.meta_cons].id;
>              resp->status = sco->meta_slots_used;
> -
> +
>              npo.meta_cons++;
>              sco->meta_slots_used--;
>          }
> @@ -705,7 +712,7 @@ static int xenvif_count_requests(struct xenvif *vif,
>           */
>          if (!drop_err && slots >= XEN_NETBK_LEGACY_SLOTS_MAX) {
>              if (net_ratelimit())
> -                netdev_dbg(vif->dev,
> +                netdev_err(vif->dev,
>                             "Too many slots (%d) exceeding limit (%d), dropping packet\n",
>                             slots, XEN_NETBK_LEGACY_SLOTS_MAX);
>              drop_err = -E2BIG;
> @@ -728,7 +735,7 @@ static int xenvif_count_requests(struct xenvif *vif,
>           */
>          if (!drop_err && txp->size > first->size) {
>              if (net_ratelimit())
> -                netdev_dbg(vif->dev,
> +                netdev_err(vif->dev,
>                             "Invalid tx request, slot size %u > remaining size %u\n",
>                             txp->size, first->size);
>              drop_err = -EIO;
> diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
> index f9daa9e..67d5221 100644
> --- a/drivers/net/xen-netfront.c
> +++ b/drivers/net/xen-netfront.c
> @@ -753,6 +753,7 @@ static int xennet_get_responses(struct netfront_info *np,
>              if (net_ratelimit())
>                  dev_warn(dev, "rx->offset: %x, size: %u\n",
>                           rx->offset, rx->status);
> +            dev_warn(dev, "me here .. cons:%d slots:%d rp:%d max:%d err:%d rx->id:%d rx->offset:%x size:%u ref:%ld\n", cons, slots, rp, max, err, rx->id, rx->offset, rx->status, ref);
>              xennet_move_rx_slot(np, skb, ref);
>              err = -EINVAL;
>              goto next;
> @@ -784,7 +785,7 @@ next:
>          if (cons + slots == rp) {
>              if (net_ratelimit())
> -                dev_warn(dev, "Need more slots\n");
> +                dev_warn(dev, "Need more slots cons:%d slots:%d rp:%d max:%d err:%d rx-id:%d rx->offset:%x size:%u ref:%ld\n", cons, slots, rp, max, err, rx->id, rx->offset, rx->status, ref);
>              err = -ENOENT;
>              break;
>          }
> @@ -803,7 +804,6 @@ next:
>      if (unlikely(err))
>          np->rx.rsp_cons = cons + slots;
> -
>      return err;
>  }
> @@ -907,6 +907,7 @@ static int handle_incoming_queue(struct net_device *dev,
>          /* Ethernet work: Delayed to here as it peeks the header. */
>          skb->protocol = eth_type_trans(skb, dev);
> +        skb_reset_network_header(skb);
>          if (checksum_setup(dev, skb)) {
>              kfree_skb(skb);

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel