
Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network troubles "bisected"



Friday, March 7, 2014, 12:55:18 PM, you wrote:

> Friday, March 7, 2014, 12:19:29 PM, you wrote:

>> On Fri, Mar 07, 2014 at 11:33:21AM +0100, Sander Eikelenboom wrote:
>> [...]
>>> 
>>> >> >> 
>>> >> >> > My suggestion is, if you have a working base line, you can try to setup
>>> >> >> > different frontend / backend combination to help narrow down the
>>> >> >> > problem.
>>> >> >> 
>>> >> >> Will see what i can do after the weekend
>>> >> >> 
>>> A small update
>>> 
>>> I tried reverting the latest netback / netfront patches .. but to no avail ..
>>> Also tried if i could trigger it somehow by using netperf and generating a lot
>>> of frags (as that would make it more easily reproducible).
>>> But that was also to no avail .. it seems to only trigger sometimes with my
>>> specific workload.
>>> 
>>> So i took a flight forward by trying out Zoltan's series v6
>>> (since it also had changes to the way the network code uses the granttables),
>>> got that running overnight applying the same workload as before and
>>> i haven't triggered anything yet .. looking good so far :-)
>>> 

>> Thanks for letting us know. If there's any update don't hesitate to post
>> to xen-devel.

> *sigh* .. it seems posting to xen-devel triggers things ;-)
> back to square one again:

> Guest kernel:
> Mar  7 11:45:29 backup kernel: [49954.928062] net eth0: rx->offset: 0, size: 4294967295
> Mar  7 11:45:29 backup kernel: [49954.928081] net eth0: rx->offset: 0, size: 4294967295
> Mar  7 11:45:29 backup kernel: [49954.928086] net eth0: rx->offset: 0, size: 4294967295
> Mar  7 11:45:29 backup kernel: [49954.928092] net eth0: rx->offset: 0, size: 4294967295
> Mar  7 11:45:29 backup kernel: [49954.928096] net eth0: rx->offset: 0, size: 4294967295
> Mar  7 11:45:29 backup kernel: [49954.928101] net eth0: Need more slots
> Mar  7 11:45:29 backup kernel: [49954.928196] net eth0: rx->offset: 0, size: 4294967295
> Mar  7 11:45:29 backup kernel: [49954.928202] net eth0: rx->offset: 0, size: 4294967295
> Mar  7 11:45:29 backup kernel: [49954.928206] net eth0: rx->offset: 0, size: 4294967295
> Mar  7 11:45:29 backup kernel: [49954.928210] net eth0: rx->offset: 0, size: 4294967295
> Mar  7 11:50:42 backup kernel: [50267.397350] net_ratelimit: 14 callbacks suppressed
> Mar  7 11:50:42 backup kernel: [50267.397366] net eth0: rx->offset: 0, size: 4294967295
> Mar  7 11:50:42 backup kernel: [50267.397372] net eth0: rx->offset: 0, size: 4294967295
> Mar  7 11:50:42 backup kernel: [50267.397377] net eth0: rx->offset: 0, size: 4294967295
> Mar  7 11:50:42 backup kernel: [50267.397381] net eth0: rx->offset: 0, size: 4294967295
> Mar  7 11:50:42 backup kernel: [50267.397386] net eth0: rx->offset: 0, size: 4294967295

> Xen:
> (XEN) [2014-03-07 10:45:29] grant_table.c:1857:d0v3 Bad grant reference 20316163
> (XEN) [2014-03-07 10:45:29] grant_table.c:1857:d0v3 Bad grant reference 4325377
> (XEN) [2014-03-07 10:45:29] grant_table.c:1857:d0v3 Bad grant reference 6684675
> (XEN) [2014-03-07 10:45:29] grant_table.c:1857:d0v3 Bad grant reference 13238275
> (XEN) [2014-03-07 10:45:29] grant_table.c:1857:d0v3 Bad grant reference 20054019
> (XEN) [2014-03-07 10:45:29] grant_table.c:1857:d0v3 Bad grant reference 4325377
> (XEN) [2014-03-07 10:45:29] grant_table.c:1857:d0v3 Bad grant reference 3538945
> (XEN) [2014-03-07 10:45:29] grant_table.c:1857:d0v3 Bad grant reference 3538945
> (XEN) [2014-03-07 10:45:29] grant_table.c:1857:d0v3 Bad grant reference 3538945
> (XEN) [2014-03-07 10:45:29] grant_table.c:1857:d0v3 Bad grant reference 3538945
> (XEN) [2014-03-07 10:45:29] grant_table.c:1857:d0v3 Bad grant reference 3538945
> (XEN) [2014-03-07 10:45:29] grant_table.c:1857:d0v3 Bad grant reference 4325377
> (XEN) [2014-03-07 10:45:29] grant_table.c:1857:d0v3 Bad grant reference 7471105
> (XEN) [2014-03-07 10:45:29] grant_table.c:1857:d0v3 Bad grant reference 4325377
> (XEN) [2014-03-07 10:45:29] grant_table.c:1857:d0v3 Bad grant reference 4325377
> (XEN) [2014-03-07 10:45:29] grant_table.c:1857:d0v3 Bad grant reference 107085839
> (XEN) [2014-03-07 10:45:29] grant_table.c:1857:d0v3 Bad grant reference 107085839
> (XEN) [2014-03-07 10:45:29] grant_table.c:1857:d0v3 Bad grant reference 268435460
> (XEN) [2014-03-07 10:45:29] grant_table.c:1857:d0v3 Bad grant reference 268435460
> (XEN) [2014-03-07 10:45:29] grant_table.c:1857:d0v3 Bad grant reference 268435460
> (XEN) [2014-03-07 10:45:29] grant_table.c:1857:d0v3 Bad grant reference 268435460
> (XEN) [2014-03-07 10:45:29] grant_table.c:1857:d0v3 Bad grant reference 268435460
> (XEN) [2014-03-07 10:45:29] grant_table.c:1857:d0v3 Bad grant reference 268435460
> (XEN) [2014-03-07 10:45:29] grant_table.c:1857:d0v3 Bad grant reference 268435460
> (XEN) [2014-03-07 10:50:42] grant_table.c:1857:d0v4 Bad grant reference 4325379

> Will be testing 3.13 vanilla .. see how that works out and if there is a
> baseline somewhere.

>> Wei.

Hi Paul,

It seems a commit of yours, ca2f09f2b2c6c25047cfc545d057c4edfcfe561c
("xen-netback: improve guest-receive-side flow control"),
is the first one that gives the "Bad grant reference" errors.
Later patches seem to partly prevent or mask the issue, which makes it harder
to trigger.
With only this commit applied I can trigger it quite quickly.
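
For anyone comparing their own logs against the ones quoted above, here is a
minimal Python sketch that pulls the grant reference numbers out of the Xen
messages and prints them in hex, and that reinterprets the guest's "size"
value as a signed 32-bit number. The helper names (bad_grefs, as_signed32,
summarize) are made up for this example. That 4294967295 equals 2^32 - 1 is
plain arithmetic; that it is a negative status printed as unsigned is only an
assumption, not something confirmed in this thread.

```python
import re
from collections import Counter

# A few lines copied from the Xen log quoted above (unwrapped).
XEN_LOG = """\
(XEN) [2014-03-07 10:45:29] grant_table.c:1857:d0v3 Bad grant reference 20316163
(XEN) [2014-03-07 10:45:29] grant_table.c:1857:d0v3 Bad grant reference 4325377
(XEN) [2014-03-07 10:45:29] grant_table.c:1857:d0v3 Bad grant reference 4325377
(XEN) [2014-03-07 10:45:29] grant_table.c:1857:d0v3 Bad grant reference 268435460
"""

GREF_RE = re.compile(r"Bad grant reference (\d+)")


def bad_grefs(log_text):
    """Return every grant reference number reported as bad, in log order."""
    return [int(m.group(1)) for m in GREF_RE.finditer(log_text)]


def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as two's-complement signed."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def summarize(log_text):
    """Map each distinct bad grant reference to (hex string, occurrence count)."""
    counts = Counter(bad_grefs(log_text))
    return {gref: (f"0x{gref:08x}", n) for gref, n in counts.items()}


if __name__ == "__main__":
    for gref, (hexval, n) in summarize(XEN_LOG).items():
        print(f"{gref:>10}  {hexval}  seen {n}x")
    # The guest log reports size 4294967295; as a signed 32-bit value:
    print(as_signed32(4294967295))
```

Seeing the references in hex makes it easier to judge whether they look like
plausible grant references or like unrelated data ending up in that field.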

This is the result of:
- First testing a baseline that worked OK for several days (3.13.6 for both
  dom0 and domU).
- Testing domU 3.14-rc5 with dom0 3.13.6: this worked OK.
- Testing dom0 3.14-rc5 with domU 3.13.6: this failed.
- After that, taking 3.13.6 as the base and first applying all the general
  Xen-related patches for the dom0 kernel: that works OK.
- Then starting to apply the netback changes for 3.14: that failed after the
  commit stated above.
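
The last step in the list above amounts to a search for the first patch in an
ordered series that makes the workload fail. A rough Python sketch of that
idea follows; first_failing_patch and the workload_ok callback are
hypothetical names, and the sketch assumes the test gives a reliable answer,
which is optimistic for a problem that only triggers intermittently.

```python
from typing import Callable, List, Optional, Sequence


def first_failing_patch(
    patches: Sequence[str],
    workload_ok: Callable[[List[str]], bool],
) -> Optional[str]:
    """Apply patches in order on top of a known-good base.

    After each patch, workload_ok() is called with the list of patches
    applied so far. Returns the first patch after which the workload
    fails, or None if the whole series passes.
    """
    applied: List[str] = []
    for patch in patches:
        applied.append(patch)
        if not workload_ok(applied):
            return patch
    return None


if __name__ == "__main__":
    # Placeholder patch names, purely for illustration.
    series = ["patch-1", "patch-2", "patch-3", "patch-4"]

    # Stand-in for "run the real workload long enough and check the logs".
    def fake_workload_ok(applied: List[str]) -> bool:
        return "patch-3" not in applied

    print(first_failing_patch(series, fake_workload_ok))
```

In practice each workload_ok call would mean building and booting a kernel
and running the workload for a while, so a long series is expensive to walk
through this way.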

So I'm quite confident I'm reporting the right thing now :-)
If you would like me to run debug patches on top of this commit, don't hesitate
to send them!

--
Sander






 

