
Re: [Xen-devel] Poor network performance between DomU with multiqueue support

On Fri, Dec 05, 2014 at 03:20:55PM +0000, Zoltan Kiss wrote:
> On 04/12/14 14:31, Zhangleiqiang (Trump) wrote:
> >>-----Original Message-----
> >>From: Zoltan Kiss [mailto:zoltan.kiss@xxxxxxxxxx]
> >>Sent: Thursday, December 04, 2014 9:35 PM
> >>To: Zhangleiqiang (Trump); Wei Liu; xen-devel@xxxxxxxxxxxxx
> >>Cc: Xiaoding (B); Zhuangyuxin; zhangleiqiang; Luohao (brian); Yuzhou (C)
> >>Subject: Re: [Xen-devel] Poor network performance between DomU with
> >>multiqueue support
> >>
> >>
> >>
> >>On 04/12/14 12:09, Zhangleiqiang (Trump) wrote:
> >>>>I think that's expected, because guest RX data path still uses
> >>>>grant_copy while
> >>>>>guest TX uses grant_map to do zero-copy transmit.
> >>>As I understand, the RX process is as follows:
> >>>1. The physical NIC receives the packet
> >>>2. The Xen hypervisor triggers an interrupt to Dom0
> >>>3. Dom0's NIC driver does the "RX" operation, and the packet is stored
> >>>into an SKB which is also owned/shared with netback
> >>Not that easy. There is something between the NIC driver and netback which
> >>directs the packets, e.g. the old bridge driver, ovs, or the IP stack of 
> >>the kernel.
> >>>4. Netback notifies netfront through the event channel that a packet is
> >>>arriving
> >>>5. Netfront grants a buffer for receiving and notifies netback of the GR
> >>>(if using the grant-reuse mechanism, netfront just notifies netback of
> >>>the GR) through the IO ring
> >>It looks a bit confusing in the code, but netfront puts "requests" on the
> >>ring buffer, which contain the grant refs of the guest pages where the
> >>backend can copy. When a packet comes, netback consumes these requests
> >>and sends back a response telling the guest that the grant copy of the
> >>packet has finished and it can start handling the data. (Sending a
> >>response means placing a response in the ring and triggering the event
> >>channel.) Ideally netback should always have requests in the ring, so it
> >>doesn't have to wait for the guest to fill it up.
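The request/response flow described above can be sketched as a minimal model. This is a hypothetical simplification for illustration only (the structure and function names are invented); the real shared-ring layout lives in xen/include/public/io/netif.h and uses separate producer/consumer indices shared between domains:

```c
#include <assert.h>

#define RING_SIZE 8  /* power of two, mimicking the real shared ring */

struct rx_request  { int id; int gref; };   /* grant ref of a guest RX page */
struct rx_response { int id; int status; }; /* bytes copied, or an error code */

struct rx_ring {
    struct rx_request  req[RING_SIZE];
    struct rx_response rsp[RING_SIZE];
    unsigned req_prod, req_cons;   /* frontend produces, backend consumes */
    unsigned rsp_prod, rsp_cons;   /* backend produces, frontend consumes */
};

/* Frontend: pre-post a request so the backend never has to wait for a buffer. */
static int frontend_post_request(struct rx_ring *r, int id, int gref)
{
    if (r->req_prod - r->req_cons == RING_SIZE)
        return -1;                              /* ring full */
    r->req[r->req_prod % RING_SIZE] = (struct rx_request){ id, gref };
    r->req_prod++;
    return 0;
}

/* Backend: consume a request, "grant-copy" the packet, push a response.
   Real netback issues a GNTTABOP_copy hypercall into the page named by
   req.gref and then notifies the frontend via the event channel. */
static int backend_deliver(struct rx_ring *r, const char *pkt, int len)
{
    if (r->req_cons == r->req_prod)
        return -1;                              /* no buffer posted by the guest */
    struct rx_request req = r->req[r->req_cons % RING_SIZE];
    r->req_cons++;
    (void)pkt;                                  /* copy elided in this sketch */
    r->rsp[r->rsp_prod % RING_SIZE] = (struct rx_response){ req.id, len };
    r->rsp_prod++;
    return 0;
}
```

Note how delivery fails when no request is pending: that is exactly why the frontend keeps the ring topped up with requests.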
> >
> >>>6. Netback does the grant_copy to copy the packet from its SKB to the
> >>>buffer referenced by the GR, and notifies netfront through the event
> >>>channel
> >>>7. Netfront copies the data from the buffer to the user-level app's SKB
> >>Or wherever that SKB should go, yes. Like with any received packet on a real
> >>network interface.
> >>>
> >>>Am I right? Why not use zero-copy transmit in the guest RX data path too?
> >>Because that means you are mapping that memory to the guest, and you won't
> >>have any guarantee when the guest will release them. And netback can't just
> >>unmap them forcibly after a timeout, because finding a correct timeout value
> >>would be quite impossible.
> >>A malicious/buggy/overloaded guest can hold on to Dom0 memory
> >>indefinitely, and it becomes even worse if the memory came from another
> >>guest: you can't shut down that guest, for example, until all its memory
> >>is returned to it.
> >
> >Thanks for your detailed explanation of the RX data path, now I get it, :)
> >
> >About the issue of poor performance between DomU and DomU, but high 
> >throughput between Dom0 and a remote Dom0/DomU, mentioned in my previous 
> >mail: do you have any idea about it?
> >
> >I am wondering if netfront/netback can be optimized to reach 10Gbps 
> >throughput between DomUs running on different hosts connected by a 10GE 
> >network. Currently, it seems TX is not the bottleneck, because we can 
> >reach an aggregate throughput of 9Gbps when sending packets from one 
> >DomU to three other DomUs running on a different host. So I think the 
> >bottleneck may be RX; do you agree?
> >
> >I am wondering what the main reason is that prevents RX from reaching 
> >higher throughput? Compared to KVM+virtio+vhost, which can reach high 
> >throughput, the RX path has an extra grant-copy operation, and that may 
> >be one reason for it. Do you have any idea about it too?
> It's quite certain that the grant copy is the bottleneck for single-queue
> RX traffic. I don't know what the plan is to help with that; currently only
> a faster CPU can help you there.

Could the Intel QuickData help with that?
> >
> >>
> >>Regards,
> >>
> >>Zoli
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxx
> http://lists.xen.org/xen-devel
