
Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network troubles "bisected"



On Wed, Mar 12, 2014 at 02:42:43AM +0100, Sander Eikelenboom wrote:
[...]
> Ok you asked for it .. so here we go .. ;-) :
> 
> - Xen-unstable
> - DomU:
>      - PV guest
>      - Debian Wheezy
>      - Kernel: 3.14-rc5 (.config see dom0 .config)
>      - Running netserver
> - Dom0:
>      - Debian Wheezy
>      - Kernel: 3.13.6 + additional patches (see attached git log and .config)
>      - 2 physical NICs
>      - Autoballooning is prevented by using dom0_mem=1536M,max:1536M on the 
> Xen command line and mem=1536M on the dom0 kernel command line in GRUB
>      - vCPU 0 is pinned to pCPU 0, which is reserved exclusively for dom0
> 
> - Networking:
>      - Routed bridge
>      - eth0 = internet
>      - eth1 = lan 172.16.x.x
>      - xen_bridge = bridge for VMs 192.168.1.x
>      - iptables NAT and routing  
>      - attached dom0 and domU ifconfig output
>      - attached ethtool -k output for the bridge, vif and guest eth0
> 
> Triggering workload:
>      - Well, that's where the problem is :-)
>      - The Guest has a normal disk and swap (phy/lvm) and
>        shared storage with glusterfs (glusterfs server is on dom0)
>      - The Guest exposes this storage via webdavs 
> 
>      What triggers it is:
>      - The Guest runs its rsync of the shared storage to a remote client on 
> the internet
>        So this causes traffic from Dom0 to domU (reading storage) ..
>        back from domU to dom0 and via iptables NAT on to eth0 (actual rsync)
>        or vice versa when syncing the other way around
>      - At the same time, do a backup from a Windows machine on the LAN to 
> the webdavs server
>        So this causes traffic from eth1 to domU (webdav) .. 
>        back from domU to dom0 (writing to the shared storage) 
> 
> So in essence it is doing quite some netback && netfront stress testing :-)
> 
> It only seems to trigger when doing both the rsync and the webdav 
> simultaneously.
> 
> I tried my best to emulate any of this with netperf (and multiple instances);
> I tried various (odd) packet sizes, and the packet/byte rates transmitted 
> are higher than with the workload above ... but it doesn't seem to trigger 
> with netperf.
> 
> So I don't think it will be easy to replicate ...
> 

Sorry, this is probably not something I can set up. *sigh*

> Perhaps we can run through the available logging again .. and try to answer 
> some questions ... this is just with one guest, running the same kernels as 
> before, only with added debugging in netfront and Xen (diffs attached):
> 
> Mar 12 02:00:44 backup kernel: [  496.840646] net eth0: rx->offset: 0, size: 
> 4294967295

This "size" here is actually "status", status=-1, it's an error...

> Mar 12 02:00:44 backup kernel: [  496.840665] net eth0: cons:1346005 slots:1 
> rp:1346013 max:18 err:0 rx->id:212 rx->offset:0 size:4294967295 ref:572 
> pagesize:4096 skb_ipsummed:0 is_gso:0 gso_size:0 gso_type:0 gso_segs:0 
> RING_HAS_UNCONSUMED_RESPONSES:9 cons_changed:1 cons_before:1346004 
> xennet_get_extras_err:0
> Mar 12 02:00:44 backup kernel: [  496.840680] net eth0: rx->offset: 0, size: 
> 4294967295
> Mar 12 02:00:44 backup kernel: [  496.840687] net eth0: cons:1346005 slots:2 
> rp:1346013 max:18 err:-22 rx->id:214 rx->offset:0 size:4294967295 ref:657 
> pagesize:4096 skb_ipsummed:0 is_gso:0 gso_size:0 gso_type:0 gso_segs:0 
> RING_HAS_UNCONSUMED_RESPONSES:9 cons_changed:1 cons_before:1346004 
> xennet_get_extras_err:0
> Mar 12 02:00:44 backup kernel: [  496.840701] net eth0: rx->offset: 0, size: 
> 4294967295
> Mar 12 02:00:44 backup kernel: [  496.840712] net eth0: cons:1346005 slots:3 
> rp:1346013 max:18 err:-22 rx->id:215 rx->offset:0 size:4294967295 ref:667 
> pagesize:4096 skb_ipsummed:0 is_gso:0 gso_size:0 gso_type:0 gso_segs:0 
> RING_HAS_UNCONSUMED_RESPONSES:9 cons_changed:1 cons_before:1346004 
> xennet_get_extras_err:0
> Mar 12 02:00:44 backup kernel: [  496.840733] net eth0: rx->offset: 0, size: 
> 4294967295
> Mar 12 02:00:44 backup kernel: [  496.840740] net eth0: cons:1346005 slots:4 
> rp:1346013 max:18 err:-22 rx->id:216 rx->offset:0 size:4294967295 ref:716 
> pagesize:4096 skb_ipsummed:0 is_gso:0 gso_size:0 gso_type:0 gso_segs:0 
> RING_HAS_UNCONSUMED_RESPONSES:9 cons_changed:1 cons_before:1346004 
> xennet_get_extras_err:0
> Mar 12 02:00:44 backup kernel: [  496.840757] net eth0: rx->offset: 0, size: 
> 4294967295
> Mar 12 02:00:44 backup kernel: [  496.840764] net eth0: cons:1346005 slots:5 
> rp:1346013 max:18 err:-22 rx->id:217 rx->offset:0 size:4294967295 ref:755 
> pagesize:4096 skb_ipsummed:0 is_gso:0 gso_size:0 gso_type:0 gso_segs:0 
> RING_HAS_UNCONSUMED_RESPONSES:9 cons_changed:1 cons_before:1346004 
> xennet_get_extras_err:0
> Mar 12 02:00:44 backup kernel: [  496.840778] net eth0: rx->offset: 0, size: 
> 4294967295
> Mar 12 02:00:44 backup kernel: [  496.840784] net eth0: cons:1346005 slots:6 
> rp:1346013 max:18 err:-22 rx->id:218 rx->offset:0 size:4294967295 ref:592 
> pagesize:4096 skb_ipsummed:0 is_gso:0 gso_size:0 gso_type:0 gso_segs:0 
> RING_HAS_UNCONSUMED_RESPONSES:9 cons_changed:1 cons_before:1346004 
> xennet_get_extras_err:0
> Mar 12 02:00:44 backup kernel: [  496.840801] net eth0: rx->offset: 0, size: 
> 4294967295
> Mar 12 02:00:44 backup kernel: [  496.840807] net eth0: cons:1346005 slots:7 
> rp:1346013 max:18 err:-22 rx->id:219 rx->offset:0 size:4294967295 ref:633 
> pagesize:4096 skb_ipsummed:0 is_gso:0 gso_size:0 gso_type:0 gso_segs:0 
> RING_HAS_UNCONSUMED_RESPONSES:9 cons_changed:1 cons_before:1346004 
> xennet_get_extras_err:0
> Mar 12 02:00:44 backup kernel: [  496.841043] net eth0: rx->offset: 0, size: 
> 4294967295
> 
> Mar 12 02:00:44 backup kernel: [  496.841052] net eth0: cons:1346025 slots:1 
> rp:1346038 max:18 err:0 rx->id:232 rx->offset:0 size:4294967295 
> ref:-131941395332491 pagesize:4096 skb_ipsummed:0 is_gso:0 gso_size:0 
> gso_type:0 gso_segs:0 RING_HAS_UNCONSUMED_RESPONSES:13 cons_changed:1 
> cons_before:1346024 xennet_get_extras_err:0
> Mar 12 02:00:44 backup kernel: [  496.841067] net eth0: rx->offset: 0, size: 
> 4294967295
> Mar 12 02:00:44 backup kernel: [  496.841074] net eth0: cons:1346025 slots:2 
> rp:1346038 max:18 err:-22 rx->id:234 rx->offset:0 size:4294967295 
> ref:-131941395332579 pagesize:4096 skb_ipsummed:0 is_gso:0 gso_size:0 
> gso_type:0 gso_segs:0 RING_HAS_UNCONSUMED_RESPONSES:29 cons_changed:1 
> cons_before:1346024 xennet_get_extras_err:0
> Mar 12 02:00:44 backup kernel: [  496.841092] net eth0: rx->offset: 0, size: 
> 4294967295
> Mar 12 02:00:44 backup kernel: [  496.841101] net eth0: cons:1346025 slots:3 
> rp:1346038 max:18 err:-22 rx->id:235 rx->offset:0 size:4294967295 
> ref:-131941395332408 pagesize:4096 skb_ipsummed:0 is_gso:0 gso_size:0 
> gso_type:0 gso_segs:0 RING_HAS_UNCONSUMED_RESPONSES:29 cons_changed:1 
> cons_before:1346024 xennet_get_extras_err:0
> 
> 
> (XEN) [2014-03-12 01:00:44] grant_table.c:1856:d0v2 Bad grant reference 
> 4325377 gt_version:1 ldom:0 readonly:0 allow_transitive:1
> (XEN) [2014-03-12 01:00:44] grant_table.c:2100:d0v2 acquire_grant_for_copy 
> failed .. dest_is_gref rc:-3 source.domid:32752 dest.domid:1 s_frame:5478146 
> source_off:0 source_len:4096 op->source.offset:0 op->len:1168

rc = -3 when acquiring the grant. This means the grant reference is
unrecognised or inappropriate. The grant reference is 4325377, which is
obviously too large for a sensible grant ref. It's just garbage.

> (XEN) [2014-03-12 01:00:44] grant_table.c:1856:d0v2 Bad grant reference 
> 19791875 gt_version:1 ldom:0 readonly:0 allow_transitive:1
> (XEN) [2014-03-12 01:00:44] grant_table.c:2100:d0v2 acquire_grant_for_copy 
> failed .. dest_is_gref rc:-3 source.domid:32752 dest.domid:1 s_frame:5497610 
> source_off:0 source_len:4096 op->source.offset:0 op->len:2476
> (XEN) [2014-03-12 01:00:44] grant_table.c:1856:d0v2 Bad grant reference 
> 4325379 gt_version:1 ldom:0 readonly:0 allow_transitive:1
> (XEN) [2014-03-12 01:00:44] grant_table.c:2100:d0v2 acquire_grant_for_copy 
> failed .. dest_is_gref rc:-3 source.domid:32752 dest.domid:1 s_frame:5478282 
> source_off:0 source_len:4096 op->source.offset:0 op->len:1634
> (XEN) [2014-03-12 01:00:44] grant_table.c:1856:d0v2 Bad grant reference 
> 4325379 gt_version:1 ldom:0 readonly:0 allow_transitive:1
> (XEN) [2014-03-12 01:00:44] grant_table.c:2100:d0v2 acquire_grant_for_copy 
> failed .. dest_is_gref rc:-3 source.domid:32752 dest.domid:1 s_frame:5497610 
> source_off:0 source_len:4096 op->source.offset:1634 op->len:1620
> (XEN) [2014-03-12 01:00:44] grant_table.c:1856:d0v2 Bad grant reference 
> 4325377 gt_version:1 ldom:0 readonly:0 allow_transitive:1
> (XEN) [2014-03-12 01:00:44] grant_table.c:2100:d0v2 acquire_grant_for_copy 
> failed .. dest_is_gref rc:-3 source.domid:32752 dest.domid:1 s_frame:5497609 
> source_off:0 source_len:4096 op->source.offset:0 op->len:4096
> (XEN) [2014-03-12 01:00:44] grant_table.c:1856:d0v2 Bad grant reference 
> 19791875 gt_version:1 ldom:0 readonly:0 allow_transitive:1
> 
> - Sometimes (but not always) netfront also spits out:
>   dev_warn(dev, "Invalid extra type: %d\n", extra->type);
>   where the extra type seems to be a random value (196 and 31 have been seen)
> - Sometimes (but not always) netfront also spits out:
>   dev_warn(dev, "Need more slots\n");
> - Sometimes (but not always) netfront also spits out:
>   dev_warn(dev, "Missing extra info\n");
> 

That's because garbage is pushed to the frontend, which then tries to
decode random values.
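
For completeness, the "Invalid extra type" and "Missing extra info"
warnings come from xennet_get_extras(), abridged here from the 3.13-era
drivers/net/xen-netfront.c ("Need more slots" comes from
xennet_get_responses() in the same file):

    if (unlikely(cons + 1 == rp)) {
        if (net_ratelimit())
            dev_warn(dev, "Missing extra info\n");
        err = -EBADR;
        break;
    }

    /* The next ring slot is reinterpreted as extra info, so if the
     * backend wrote garbage there, extra->type is a random byte. */
    extra = (struct xen_netif_extra_info *)
            RING_GET_RESPONSE(&np->rx, ++cons);

    if (unlikely(!extra->type ||
                 extra->type >= XEN_NETIF_EXTRA_TYPE_MAX)) {
        if (net_ratelimit())
            dev_warn(dev, "Invalid extra type: %d\n", extra->type);
        err = -EINVAL;
    }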

> 
> First question that comes to my mind:
> - Are the warnings netfront spits out the cause of Xen reporting the bad 
> grant references?
>   Or
>   Are the bad grant references Xen is reporting causing netfront to spit 
> out the warnings?

No. They are not directly connected.

0. backend breaks down a skb into slots.
1. backend gets a request from the rx ring (a slot with a gref provided
   by frontend), until all slots of the skb are handled.
2. backend grant-copies data to that slot (with that gref); see the
   sketch after this list.
3. backend pushes a response to frontend, whose content depends on the
   result of the previous step.
4. frontend handles that response.
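
For reference, the copy in step 2 is a GNTTABOP_copy hypercall whose
destination is named by the guest's grant reference. A rough sketch
using the public grant-table interface follows; the local variable
names are illustrative, this is not the actual xen-netback code:

    struct gnttab_copy op = {
        .source.u.gmfn = src_frame,   /* backend-local frame with the data */
        .source.domid  = DOMID_SELF,  /* 0x7ff0 == 32752, cf. the log above */
        .dest.u.ref    = rx_gref,     /* gref from the frontend's rx request */
        .dest.domid    = guest_domid, /* 1 in your case */
        .len           = len,
        .flags         = GNTCOPY_dest_gref, /* dest is a gref, not a frame */
    };

If rx_gref is garbage, Xen fails the op with GNTST_bad_gntref (-3,
"unrecognised or inappropriate gntref"), which is exactly the rc:-3 in
your hypervisor log; that failure is then reflected as the status=-1
response in step 3.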

So the grant copy error in the hypervisor you're seeing is in 2. The
complaint from the frontend is in 4, when the backend pushes a response
with an error status in it. The bug is most likely in 0, when the
calculation goes wrong - either in the breakdown process, or in the
macro that returns the number of usable slots (this can lead to the
backend thinking there are enough slots when there are not).
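
To illustrate the kind of miscount meant in 0, here is a minimal
standalone sketch (not the actual netback code, and not necessarily the
exact bug): an estimate that only rounds sizes up to whole pages
undercounts data that starts at a non-zero page offset.

    #include <stdio.h>

    #define PG 4096u

    /* Estimate in the style of "round the size up to whole pages"... */
    static unsigned slots_estimated(unsigned size)
    {
        return (size + PG - 1) / PG;
    }

    /* ...versus what the copy phase actually consumes when the data
     * starts at a page offset and straddles a page boundary. */
    static unsigned slots_needed(unsigned offset, unsigned size)
    {
        return (offset + size - 1) / PG - offset / PG + 1;
    }

    int main(void)
    {
        /* 4096 bytes starting 10 bytes into a page: estimated 1 slot,
         * but the copy actually spans 2 pages. */
        printf("estimated=%u needed=%u\n",
               slots_estimated(PG), slots_needed(10, PG));
        return 0;
    }

If an estimate of that shape wins, the backend could overrun the slots
it actually owns, and whatever sits in the extra slots would go out to
the frontend as "responses" - consistent with the garbage you're seeing.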

> - What is that "if (rx->flags & XEN_NETRXF_extra_info) {}" part in 
> xen-netfront.c doing there, changing cons midway?
>   The commit message from f942dc2552b8bfdee607be867b12a8971bb9cd85 that 
> introduced the if says:
> 
>     One major change from xen.git is that the guest transmit path (i.e. what
>     looks like receive to netback) has been significantly reworked to remove
>     the dependency on the out of tree PageForeign page flag (a core kernel
>     patch which enables a per page destructor callback on the final
>     put_page). This page flag was used in order to implement a grant map
>     based transmit path (where guest pages are mapped directly into SKB
>     frags). Instead this version of netback uses grant copy operations into
>     regular memory belonging to the backend domain. Reinstating the grant
>     map functionality is something which I would like to revisit in the
>     future.
> 
>   It *is* using grant copy now .. so should this part have been removed ?
>   And
>   Could Paul's commit that seems to be the first to trigger these events 
> affect this ?
> 

This depicts the guest *transmit* path, but the issue you're seeing is
in the guest *receive* path. So that's a totally different thing.

Wei.
