
Re: [Xen-devel] CAP and performance problem



On 05/06/13 17:50, Dario Faggioli wrote:
On mar, 2013-05-21 at 11:02 +0200, Massimo Canonico wrote:
Hi,

Hi again,

I posted the following problem to the xen-users ML but got no answer. I
hope I'll get one on this ML.

My application is written in standard C++ and performs a matrix
multiplication, so it uses only CPU and memory (no I/O, no network).

I'm quite surprised that with CAP = 100% I got my results in about 600
seconds and with CAP = 50% I got my results in about 1800 seconds
(around 3 times longer).

For this kind of application I was expecting to get results in about
1200 seconds (2 times longer) for the second scenario with respect to
the first one.

Of course, the HW and SW are exactly the same for the 2 experiments.

Am I wrong, or is the CAP mechanism not working well?

Ok, I found a minute to run your code myself on my test box. It's quite
a large one, but since the VM has only 1 vcpu, that shouldn't really
make much difference.

I configured vcpu-pinning in such a way that there should be no room for
interference of any kind, i.e., dedicating a core to the VM and making
sure even its sibling thread is not busy (which matters in a
hyperthreaded system):

# xl vcpu-list
Name                                ID  VCPU   CPU State   Time(s) CPU Affinity
Domain-0                             0     0    7   -b-      38.7  0-7
Domain-0                             0     1    3   -b-       2.3  0-7
Domain-0                             0     2    2   -b-       3.3  0-7
Domain-0                             0     3    6   -b-       6.8  0-7
Domain-0                             0     4    4   -b-       3.2  0-7
Domain-0                             0     5    2   -b-       3.6  0-7
Domain-0                             0     6    4   -b-       2.1  0-7
Domain-0                             0     7    1   -b-       1.8  0-7
Domain-0                             0     8    0   -b-       2.2  0-7
Domain-0                             0     9    7   -b-       1.7  0-7
Domain-0                             0    10    1   -b-       1.8  0-7
Domain-0                             0    11    5   r--      10.4  0-7
Domain-0                             0    12    1   -b-       3.5  0-7
Domain-0                             0    13    2   -b-       3.5  0-7
Domain-0                             0    14    3   -b-       2.7  0-7
Domain-0                             0    15    0   -b-       1.9  0-7
vm1                                  1     0   11   -b-     677.0  11

The numbers I'm getting are, I think, much more consistent with the
expectations:

  * no cap:
   Client served in 299.024
   Client served in 298.783
   Client served in 298.445
  * cap 50%:
   Client served in 643.668
   Client served in 643.372
   Client served in 644.342

Which means time roughly doubles.
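
To make the comparison concrete, here is a minimal sketch that computes the slowdown from the timings above and from the figures in the original report. The helper name `slowdown` is just an illustration, not part of any Xen tool.

```python
from statistics import mean

# Timings in seconds, copied from the runs reported above.
no_cap = [299.024, 298.783, 298.445]
cap_50 = [643.668, 643.372, 644.342]


def slowdown(baseline, capped):
    """Return the ratio of the mean capped time to the mean baseline time."""
    return mean(capped) / mean(baseline)


ratio = slowdown(no_cap, cap_50)
reported = 1800 / 600  # approximate figures from the original report

print(f"measured slowdown with cap 50%: {ratio:.2f}x (ideal would be 2.00x)")
print(f"originally reported slowdown:   {reported:.2f}x")
```

On these numbers the measured ratio comes out slightly above 2, while the originally reported one is 3.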

I tried without pinning as well, and I'm getting pretty much the same
values.

At this point, I'm not sure what could be going on on your side. If you
want to try producing some traces, we can help inspect them, looking for
something weird. You can find some information about how to produce and
better interpret traces in this blog post:

http://blog.xen.org/index.php/2012/09/27/tracing-with-xentrace-and-xenalyze/

Perhaps you can share your VM config file and Dom0 configuration
(basically, Xen and Linux boot command lines), to check whether there is
something strange there. Also, you might have said this already (in
which case I forgot), but what versions of Xen and Linux are we talking
about?

I really am out of good ideas... George, any clue?

Well for one, from the scheduler's perspective, the promise isn't that you'll get 50% of the *performance*, but 50% of the *cpu time*. I haven't been following the thread terribly closely, but I don't remember seeing any xentop or xentrace reports. The first question is, other than performance, do you have any reason to believe that the VM is not getting 50% of the cpu time?
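
As a rough way to check the CPU-time share without tracing, one could sample the cumulative Time(s) value that `xl vcpu-list` prints for the VM's vcpu twice and divide the difference by the wall-clock interval. The sketch below assumes that approach. The function name `cpu_share` and the sample values are hypothetical, not measurements from this thread.

```python
def cpu_share(cpu_t0, cpu_t1, wall_t0, wall_t1):
    """Fraction of wall-clock time the vcpu spent running between two samples.

    cpu_t0, cpu_t1:   cumulative CPU seconds at the first and second sample
    wall_t0, wall_t1: wall-clock seconds at which the samples were taken
    """
    elapsed = wall_t1 - wall_t0
    if elapsed <= 0:
        raise ValueError("second sample must be taken after the first")
    return (cpu_t1 - cpu_t0) / elapsed


# Hypothetical samples: 5 seconds of CPU time accumulated over 10 seconds.
share = cpu_share(677.0, 682.0, 0.0, 10.0)
print(f"vcpu ran for {share:.0%} of the interval")
```

With a cap of 50% and a CPU-bound guest, a share close to 0.5 would suggest the scheduler is delivering the promised CPU time, and that any extra slowdown comes from somewhere else.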

At some point while your test is running, could you execute the following command in dom0:

xentrace -D -e 0x21000 -T 10 /tmp/test.trace

This will take a 10-second trace of just the scheduling events, placing the result in /tmp/test.trace

Then download and build xenalyze from the hg repo here:

http://xenbits.xen.org/ext/xenalyze

and run the following command:

xenalyze -s /tmp/test.trace > /tmp/test.summary

And post the results here?

Thanks,
 -George



 

