
Re: [Xen-devel] Poor network performance between DomU with multiqueue support



> On Mon, Dec 08, 2014 at 01:08:18PM +0000, Zhangleiqiang (Trump) wrote:
> > > On Mon, Dec 08, 2014 at 06:44:26AM +0000, Zhangleiqiang (Trump) wrote:
> > > > > On Fri, Dec 05, 2014 at 01:17:16AM +0000, Zhangleiqiang (Trump)
> wrote:
> > > > > [...]
> > > >
> > > > The newest mail about persistent grants that I can find was sent on
> > > > 16 Nov 2012
> > > > (http://lists.xen.org/archives/html/xen-devel/2012-11/msg00832.html).
> > > > Why is it not done right, and why was it not merged upstream?
> > >
> > > AFAICT there's one more memcpy than necessary, i.e. frontend memcpy
> > > data into the pool then backend memcpy data out of the pool, when
> > > backend should be able to use the page in pool directly.
> >
> > Memcpy should be cheaper than grant_copy, because the former does not
> > need the hypercall that causes a VM exit into the Xen hypervisor, am I
> > right? For the RX path, memcpy based on persistent grants may therefore
> > give higher performance than the grant copy used now.
> 
> In theory yes. Unfortunately nobody has benchmarked that properly.
> 
> If you're interested in doing work on optimising RX performance, you might
> want to sync up with XenServer folks?

What is the recommended way to get in touch with the XenServer folks? Through
the XenServer forum or a separate mailing list? Most of the discussions I can
find in the forum are about the XenServer product itself.

> >
> > I have seen "move grant copy to guest" and "Fix grant copy alignment
> > problem" as optimization methods used in "NetChannel2"
> > (http://www-archive.xenproject.org/files/xensummit_fall07/16_JoseRenatoSantos.pdf).
> > Unfortunately, NetChannel2 does not seem to be supported since 2.6.32. Do
> > you know about them, and would they be helpful for optimizing the RX path
> > in the current upstream implementation?
> 
> Not sure, that's long before I ever started working on Xen.
> 
> >
> > By the way, after revisiting the test results for the multi-queue PV
> > implementation (kernel 3.17.4 + Xen 4.4), I find that when using four
> > queues for netback/netfront, there are about three netback processes
> > running with high CPU usage on the receiving Dom0 (about 85% usage per
> > process, each on its own CPU core), and the aggregate throughput is only
> > about 5 Gbps. I suspect there may be a bug or pitfall in the current
> > multi-queue implementation, because consuming nearly three full CPU cores
> > to receive 5 Gbps seems abnormal.
> >
> 
> 3.17.4 doesn't contain David Vrabel's fixes.
> 
> Look for
>   bc96f648df1bbc2729abbb84513cf4f64273a1f1
>   f48da8b14d04ca87ffcffe68829afd45f926ec6a
>   ecf08d2dbb96d5a4b4bcc53a39e8d29cc8fef02e
> in David Miller's net tree.
> 
> BTW there are some improvements planned for 4.6: "[Xen-devel] [PATCH v3 0/2]
> gnttab: Improve scaleability". This is orthogonal to the problem you're
> trying to solve, but it should help improve performance in general.

Thanks for your pointer, it is helpful.
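
Regarding the multi-queue observation above, one thing I still want to rule
out on my side is how the test flows are spread across the four queues. My
understanding is that steering is hash based, roughly like the sketch below
(illustrative only, not the actual xen-netback/xen-netfront selection code),
so a small number of flows can easily end up on only three of the four queues:

#include <linux/netdevice.h>
#include <linux/skbuff.h>

/* Illustrative sketch of hash-based queue steering; hypothetical, not
 * the real driver code. */
static u16 demo_select_queue(struct net_device *dev, struct sk_buff *skb)
{
        unsigned int num_queues = dev->real_num_tx_queues;

        /* Flow hash over the packet headers, folded onto the queues:
         * packets of the same flow always land on the same queue.    */
        return (u16)(skb_get_hash(skb) % num_queues);
}

That said, even with a perfect spread, roughly 85% of a core per queue for a
share of about 1.7 Gbps still looks expensive, so I will retest with David
Vrabel's fixes applied.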

> 
> Wei.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

