
Re: [Xen-users] strangeness with high BDP tests



Diwaker Gupta wrote:
Any ideas why I'm getting such bad performance from the VMs
on high BDP links? I'm willing and interested to help in
debugging and fixing this issue, but I need some leads :)

The first thing to do is to look at the CPU usage in dom0 and domU. If
you can run them on different CPUs or even different hyperthreads it
might make the experiment simpler to understand. The first thing to find
out is whether you're maxed out on CPU, or whether this is an IO
blocking issue. xm list should show you how much CPU each domain is
burning.


I had caught glimpses on the list of a top-like utility for viewing
CPU usage. Is that a reality yet? I haven't followed up on that
thread. The problem is that xm list is fine for very coarse-grained
measurements, but it's a pain to do real-time, fine-grained
measurements with it. Sure, I could always write my own little
Python script using the xm interface, but it would be great if we had
something like top.
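
A minimal sketch of the "little Python script" idea, assuming xm list
prints a header row with a cumulative CPU time column whose name starts
with "Time" (for example "Time(s)") and that domain names contain no
spaces. The function names are made up for illustration. It turns two
samples of that output into a per-domain CPU percentage:

```python
def parse_cpu_seconds(xm_list_output):
    """Map domain name -> cumulative CPU seconds from `xm list`-style text."""
    lines = [ln for ln in xm_list_output.splitlines() if ln.strip()]
    header = lines[0].split()
    # Assumption: the cumulative CPU time column is labelled "Time..." in the header.
    time_col = next(i for i, name in enumerate(header) if name.startswith("Time"))
    usage = {}
    for line in lines[1:]:
        fields = line.split()
        usage[fields[0]] = float(fields[time_col])
    return usage


def cpu_percent(before, after, interval_s):
    """Percent of one CPU used by each domain between two samples."""
    return {
        name: 100.0 * (after[name] - before[name]) / interval_s
        for name in after
        if name in before
    }
```

To poll a live system, one could capture the text of xm list twice, a
known number of seconds apart, and pass both samples through these two
functions.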


Also, you might want to play around with the rate limiting function in
netback. If you set it to a few hundred Mb/s you might help promote
batching.


Sorry if this is dumb, but what's the rate limiting function in
netback? Is it a run-time parameter or something in the code? What
does it do? If I set it too high, won't it lead to bad performance
with low b/w flows? I guess I should just look at the code :)

Hi Diwaker! Sorry I'm coming to this thread late, I was out
sick the last couple of days. I just started looking into the
net flow control problem. Ian is speculating that the rate
limiting function will actually help data get pushed
faster. We're looking into where exactly our latencies are.
If you could run some debug patches for me, I'd really appreciate
it.

Btw, have you tried using the -i and -I options to netperf?
-i 30,10 will at least ensure a minimum of 10 runs (and a maximum
of 30) for each measurement, and -I 99,5 can be used to specify a
99% confidence level with a 5% interval width. Even if it's
consistent, I wouldn't trust the 10 second run time for the test.

Netperf uses setsockopt() to set its own buffer sizes, so
increasing the system sysctl values will not affect your test
in any way (or shouldn't ;)).
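
For reference, a rough sketch of what an application-level buffer
request looks like, together with the bandwidth-delay product it is
usually sized against. This is not netperf's code; the function names
and the 1 Gb/s, 100 ms figures in the usage note are hypothetical. The
kernel may adjust the requested size, so it is worth reading the value
back with getsockopt():

```python
import socket


def bdp_bytes(bandwidth_bits_per_s, rtt_ms):
    """Bandwidth-delay product in bytes: bits/s * seconds / 8."""
    return bandwidth_bits_per_s * rtt_ms // 8000


def socket_with_buffers(size_bytes):
    """Create a TCP socket and request explicit send/receive buffer sizes."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, size_bytes)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size_bytes)
    return sock
```

For example, bdp_bytes(1_000_000_000, 100) gives 12500000 bytes, and
sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF) on the returned
socket reports what was actually granted.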


I'm also concerned that dummynet is pretty terrible when operating at
such high speeds, and the whole thing might be just a bad interaction
between Xen's batching and dummynet's. Why not set up a real experiment
across Abilene just to check?


I think that's a separate debate. For now, I just want to get the same
performance levels from a VM as from dom0, for all possible
environments, dummynet just being one of them. Setting up a real
experiment is a good idea though, I'm looking into it. BTW, where can
I learn more on Xen's "batching"?

The question is how frequently should the frontend kick the
backend, and how frequently should the backend pass along packets
to the real device. Aggregating requests improves the efficiency
of the transfers but impacts latency.
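
To make the trade-off concrete, here is a toy model, not Xen's actual
netfront/netback logic. It assumes a fixed cost per notification, a
fixed cost per packet, and packets arriving at a steady interval; all
parameter names and numbers are made up for illustration:

```python
def batching_tradeoff(batch_size, notify_cost_us, per_packet_cost_us, arrival_gap_us):
    """Return (average cost per packet, worst-case added wait), in microseconds."""
    # One notification is shared by every packet in the batch.
    cost_per_packet = notify_cost_us / batch_size + per_packet_cost_us
    # The first packet of a batch waits for the remaining ones to arrive.
    added_wait = (batch_size - 1) * arrival_gap_us
    return cost_per_packet, added_wait
```

With a 10 us notification, 2 us per packet and 5 us between arrivals,
a batch of 1 costs 12 us per packet with no added wait, while a batch
of 10 costs 3 us per packet but can hold the first packet for 45 us.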

thanks,
Nivedita




_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users


 

