
Re: [Xen-devel] Poor network performance between DomU with multiqueue support



> On Fri, Dec 05, 2014 at 03:20:55PM +0000, Zoltan Kiss wrote:
> >
> >
> > On 04/12/14 14:31, Zhangleiqiang (Trump) wrote:
> > >>-----Original Message-----
> > >>From: Zoltan Kiss [mailto:zoltan.kiss@xxxxxxxxxx]
> > >>Sent: Thursday, December 04, 2014 9:35 PM
> > >>To: Zhangleiqiang (Trump); Wei Liu; xen-devel@xxxxxxxxxxxxx
> > >>Cc: Xiaoding (B); Zhuangyuxin; zhangleiqiang; Luohao (brian); Yuzhou
> > >>(C)
> > >>Subject: Re: [Xen-devel] Poor network performance between DomU with
> > >>multiqueue support
> > >>
> > >>
> > >>
> > >>On 04/12/14 12:09, Zhangleiqiang (Trump) wrote:
> > >>>>I think that's expected, because the guest RX data path still uses
> > >>>>grant_copy, while guest TX uses grant_map to do zero-copy transmit.
> > >>>As I understand it, the RX process is as follows:
> > >>>1. The physical NIC receives the packet.
> > >>>2. The Xen hypervisor triggers an interrupt to Dom0.
> > >>>3. Dom0's NIC driver does the "RX" operation, and the packet is
> > >>>stored into an SKB which is also owned/shared with netback.
> > >>It's not quite that simple. There is something between the NIC driver
> > >>and netback which directs the packets, e.g. the old bridge driver, ovs,
> > >>or the kernel's IP stack.
> > >>>4. Netback notifies netfront through the event channel that a packet
> > >>>is arriving.
> > >>>5. Netfront grants a buffer for receiving and notifies netback of the
> > >>>grant reference (GR) through the I/O ring (if a grant-reuse mechanism
> > >>>is used, netfront just notifies netback of the GR).
> > >>It looks a bit confusing in the code, but netfront puts "requests" on
> > >>the ring buffer, which contain the grant ref of the guest page to
> > >>which the backend can copy. When the packet comes in, netback consumes
> > >>these requests and sends back a response telling the guest that the
> > >>grant copy of the packet has finished and it can start handling the
> > >>data. (Sending a response means placing a response in the ring and
> > >>triggering the event channel.) Ideally netback should always have
> > >>requests in the ring, so it doesn't have to wait for the guest to fill
> > >>it up.
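> > >>Roughly, the request/response pair exchanged over the shared RX ring
> > >>looks like this (a simplified sketch, paraphrased from the Xen public
> > >>netif interface; see xen/include/public/io/netif.h for the exact
> > >>definitions and padding):
> > >>
> > >>    #include <stdint.h>
> > >>
> > >>    typedef uint32_t grant_ref_t;
> > >>
> > >>    /* Posted by netfront: "here is a granted guest page you can
> > >>     * copy a packet into". */
> > >>    struct netif_rx_request {
> > >>        uint16_t    id;    /* echoed back in the matching response */
> > >>        grant_ref_t gref;  /* grant ref of the guest receive page   */
> > >>    };
> > >>
> > >>    /* Posted by netback once the grant copy has finished. */
> > >>    struct netif_rx_response {
> > >>        uint16_t id;       /* id of the request being answered        */
> > >>        uint16_t offset;   /* offset of the data within the page      */
> > >>        uint16_t flags;    /* NETRXF_* flags, e.g. "more data follows" */
> > >>        int16_t  status;   /* negative: error; otherwise byte count   */
> > >>    };
> > >>
> > >>So the guest keeps the ring pre-filled with rx_requests, and netback
> > >>turns each consumed request into an rx_response after copying the data.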
> > >
> > >>>6. Netback does the grant_copy to copy the packet from its SKB to the
> > >>>buffer referenced by the GR, and notifies netfront through the event
> > >>>channel.
> > >>>7. Netfront copies the data from the buffer to the user-level app's SKB.
> > >>Or wherever that SKB should go, yes. Like with any received packet
> > >>on a real network interface.
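> > >>For step 6, the per-packet cost is a GNTTABOP_copy hypercall issued by
> > >>netback. A rough sketch of one copy descriptor (illustrative only;
> > >>names such as skb_page_mfn, skb_offset, frag_len and guest_domid are
> > >>placeholders, and the exact struct layout lives in
> > >>xen/include/public/grant_table.h):
> > >>
> > >>    struct gnttab_copy op;
> > >>
> > >>    /* Source: the dom0 page holding the received packet data. */
> > >>    op.source.u.gmfn = skb_page_mfn;   /* frame backing the SKB data */
> > >>    op.source.domid  = DOMID_SELF;
> > >>    op.source.offset = skb_offset;
> > >>
> > >>    /* Destination: the guest page named by the rx_request's grant ref. */
> > >>    op.dest.u.ref    = rx_req.gref;
> > >>    op.dest.domid    = guest_domid;
> > >>    op.dest.offset   = 0;
> > >>
> > >>    op.len   = frag_len;
> > >>    op.flags = GNTCOPY_dest_gref;      /* only the destination is a gref */
> > >>
> > >>    /* The op is batched with others and handed to the hypervisor,
> > >>     * which performs the actual memcpy; that per-packet copy is what
> > >>     * burns dom0 CPU on the RX path. */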
> > >>>
> > >>>Am I right? Why not use zero-copy in the guest RX data path too?
> > >>Because that means you are mapping that memory into the guest, and you
> > >>have no guarantee of when the guest will release it. And netback can't
> > >>just unmap it forcibly after a timeout, because finding a correct
> > >>timeout value would be quite impossible.
> > >>A malicious/buggy/overloaded guest can hold on to Dom0 memory
> > >>indefinitely, and it gets even worse if the memory came from another
> > >>guest: you can't shut down that guest, for example, until all its
> > >>memory has been returned to it.
> > >
> > >Thanks for your detailed explanation of the RX data path, I get it
> > >now. :)
> > >
> > >Regarding the issue mentioned in my previous mail, poor performance
> > >between DomUs but high throughput from Dom0 to a remote Dom0/DomU: do
> > >you have any idea about it?
> > >
> > >I am wondering if netfront/netback can be optimized to reach 10Gbps
> > >throughput between DomUs running on different hosts connected by a 10GE
> > >network. Currently it seems that TX is not the bottleneck, because we
> > >can reach an aggregate throughput of 9Gbps when sending packets from
> > >one DomU to three other DomUs running on a different host. So I think
> > >the bottleneck may be the RX path; do you agree?
> > >
> > >I am also wondering what the main reason is that prevents RX from
> > >reaching higher throughput. Compared to KVM+virtio+vhost, which can
> > >reach high throughput, the RX path has the extra grant-copy operation,
> > >which may be one reason. Do you have any idea about that too?
> > It's quite certain that the grant copy is the bottleneck for
> > single-queue RX traffic. I don't know what the plan is to help with
> > that; currently only a faster CPU can help you.
> 
> Could the Intel QuickData help with that?

Thanks for your hint.
I am looking for a method which is independent of the hardware, because I have
seen that virtio can reach 10Gbps throughput, and I think the PV network
protocol, which is the mainline approach in Xen, should also reach that
throughput. However, the testing results show that it is not ideal, so I am
wondering what the possible reason is and whether the PV network protocol can
be optimized.

> >
> > >
> > >>
> > >>Regards,
> > >>
> > >>Zoli
> >



 

