Re: [Xen-devel] large packet support in netfront driver and guest network throughput
On Sep 13, 2013, at 4:44 AM, Wei Liu <wei.liu2@xxxxxxxxxx> wrote:

> On Thu, Sep 12, 2013 at 05:53:02PM +0000, Anirban Chakraborty wrote:
>> Hi All,
>>
>> I am sure this has been answered somewhere on the list in the past, but I
>> can't find it. I was wondering if the Linux guest netfront driver has GRO
>> support in it. tcpdump shows packets coming in with 1500 bytes, although
>> eth0 in dom0 and the vif corresponding to the Linux guest in dom0 show
>> that they receive large packets:
>>
>> In dom0:
>> eth0      Link encap:Ethernet  HWaddr 90:E2:BA:3A:B1:A4
>>           UP BROADCAST RUNNING PROMISC MULTICAST  MTU:1500  Metric:1
>> tcpdump -i eth0 -nnvv -s 1500 src 10.84.20.214
>> 17:38:25.155373 IP (tos 0x0, ttl 64, id 54607, offset 0, flags [DF], proto
>> TCP (6), length 29012)
>> 10.84.20.214.51041 > 10.84.20.213.5001: Flags [.], seq 276592:305552, ack
>> 1, win 229, options [nop,nop,TS val 65594025 ecr 65569225], length 28960
>>
>> vif4.0    Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
>>           UP BROADCAST RUNNING NOARP PROMISC  MTU:1500  Metric:1
>> tcpdump -i vif4.0 -nnvv -s 1500 src 10.84.20.214
>> 17:38:25.156364 IP (tos 0x0, ttl 64, id 54607, offset 0, flags [DF], proto
>> TCP (6), length 29012)
>> 10.84.20.214.51041 > 10.84.20.213.5001: Flags [.], seq 276592:305552, ack
>> 1, win 229, options [nop,nop,TS val 65594025 ecr 65569225], length 28960
>>
>> In the guest:
>> eth0      Link encap:Ethernet  HWaddr CA:FD:DE:AB:E1:E4
>>           inet addr:10.84.20.213  Bcast:10.84.20.255  Mask:255.255.255.0
>>           inet6 addr: fe80::c8fd:deff:feab:e1e4/64 Scope:Link
>>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>> tcpdump -i eth0 -nnvv -s 1500 src 10.84.20.214
>> 10:38:25.071418 IP (tos 0x0, ttl 64, id 15074, offset 0, flags [DF], proto
>> TCP (6), length 1500)
>> 10.84.20.214.51040 > 10.84.20.213.5001: Flags [.], seq 17400:18848, ack
>> 1, win 229, options [nop,nop,TS val 65594013 ecr 65569213], length 1448
>>
>> Is the packet segmented into MTU-sized pieces on its way from netback to
>> netfront? Is GRO not supported in the guest?
>
> Here is what I see in the guest, iperf server running in the guest and iperf
> client running in Dom0. tcpdump runs with the rune you provided.
>
> 10.80.238.213.38895 > 10.80.239.197.5001: Flags [.], seq
> 5806480:5818064, ack 1, win 229, options [nop,nop,TS val 21968973 ecr
> 21832969], length 11584
>
> This is an upstream kernel. The throughput from Dom0 to DomU is ~7.2Gb/s.

Thanks for your reply. The tcpdump was captured on dom0 of the guest [at both
the vif and the physical interface], i.e. on the receive path of the server.
The iperf server was running on the guest (10.84.20.213) and the client was
another guest (on a different server) with IP 10.84.20.214. The traffic was
between two guests, not between dom0 and the guest.

>> I am seeing extremely low throughput on a 10Gb/s link. Two Linux guests
>> (CentOS 6.4 64-bit, 4 VCPUs and 4GB of memory) are running on two different
>> XenServer 6.1 hosts, and an iperf session between them shows at most
>> 3.2 Gbps.
>
> XenServer might use a different Dom0 kernel with their own tuning. You can
> also try to contact XenServer support for a better idea?

XenServer 6.1 is running a 2.6.32.43 kernel. Since the issue appears to be in
the netfront driver, as the tcpdump suggests, that's why I thought I'd post it
here. Note that the checksum offloads of the interfaces (virtual and physical)
were not touched; the default setting (which was on) was used.
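(For completeness, this is roughly how the offload state can be checked on
both ends with ethtool; a sketch only, with the interface names taken from the
output above:)

# dom0: offload settings on the physical NIC and on the guest's vif
ethtool -k eth0
ethtool -k vif4.0

# guest: check whether GRO is listed and enabled on eth0; forcing it on
# may fail if the netfront driver does not expose GRO
ethtool -k eth0
ethtool -K eth0 gro on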
> In general, off-host communication can be affected by various things. It
> would be quite useful to identify the bottleneck first.
>
> Try to run:
> 1. Dom0 to Dom0 iperf (or your workload)
> 2. Dom0 to DomU iperf
> 3. DomU to Dom0 iperf
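(Those three cases can be run along these lines; a sketch of the obvious way
to do it rather than exact invocations, with 10.84.20.213 being the guest IP
from above and the dom0 addresses and the -t 30 duration just placeholders:)

# 1. Dom0 to Dom0: server on one dom0, client on the other
[dom0-B] iperf -s
[dom0-A] iperf -c <dom0-B address> -t 30

# 2. Dom0 to DomU: server in the guest, client in dom0
[guest]  iperf -s
[dom0-A] iperf -c 10.84.20.213 -t 30

# 3. DomU to Dom0: server in dom0, client in the guest
[dom0-A] iperf -s
[guest]  iperf -c <dom0-A address> -t 30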
I tried dom0 to dom0 and I got 9.4 Gbps, which is what I expected (with GRO
turned on on the physical interface). However, when I run guest to guest,
things fall off. Is large packet support missing in netfront? I thought
otherwise. I looked at the code and I do not see any call to
napi_gro_receive(); rather it is using netif_receive_skb(). netback seems to
be sending GSO packets to netfront, but they are being segmented into 1500
byte packets (as it appears from the tcpdump).

> In order to get line rate, you need to at least get line rate from Dom0
> to Dom0 IMHO. 10Gb/s line rate from guest to guest has not yet been
> achieved at the moment…

What is the current number, without VCPU pinning etc., for 1500 byte MTU? I am
getting 2.2-3.2 Gbps for a 4 VCPU guest with 4GB of memory. It is the only VM
running on that server, without any other traffic.

-Anirban

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel