Re: [Xen-devel] Virt overehead with HT [was: Re: Xen 4.5 development update]

On 07/14/2014 11:44 PM, Dario Faggioli wrote:
On lun, 2014-07-14 at 19:31 +0100, Gordan Bobic wrote:
On 07/14/2014 06:22 PM, Dario Faggioli wrote:

I'll try more runs, e.g. with the number of VCPUs equal to or less than
nr_cores/2 and see what happens.

Again, thoughts?

Have you tried it with VCPUs pinned to appropriate PCPUs?

Define "appropriate".

I have a run for which I pinned VCPU#1-->PCPU#1, VCPU#2-->PCPU#2, and so
on, and the result is even worse:

Average Half load -j 4 Run (std deviation):
  Elapsed Time 37.808 (0.538999)
Average Optimal load -j 8 Run (std deviation):
  Elapsed Time 26.594 (0.235223)
Average Maximal load -j Run (std deviation):
  Elapsed Time 27.9 (0.131149)

This is actually something I expected, since you do not allow the VCPUs
to move away from an HT with a busy sibling, even when they could have.

In fact, you may expect better results from pinning only if you were to
pin not only the VCPUs to the PCPUs, but also kernbench's build jobs
to the appropriate (V)CPUs in the guest... but that's something not only
really impractical, but also hardly representative as a benchmark, I

If you pin VCPU#1 to PCPU#1 and VCPU#2 to PCPU#2, with PCPU#1 and PCPU#2
being HT siblings, what prevents Linux (in the guest) from running two of
the four build jobs on VCPU#1 and VCPU#2 (i.e., on sibling PCPUs!!) for
the whole length of the benchmark? Nothing, I think.
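
To put a rough number on the point above, here is a minimal sketch that counts how many ways four jobs can land on eight 1:1-pinned VCPUs without any two of them sharing a core. It assumes a hypothetical 4-core/8-thread host where PCPUs 2k and 2k+1 are the two threads of core k, and it treats every placement as equally likely, which a real guest scheduler does not guarantee. The function name and the layout are illustrative only.

```python
from itertools import combinations


def sibling_free_placements(n_cores, n_jobs):
    """Count placements of n_jobs on distinct CPUs that never share a core.

    Assumed (hypothetical) layout: CPUs 2k and 2k+1 are HT siblings on core k.
    Returns (placements_without_sibling_sharing, all_placements).
    """
    cpus = range(2 * n_cores)
    clean = 0
    total = 0
    for placement in combinations(cpus, n_jobs):
        total += 1
        # Map each CPU to its core; no sharing means as many cores as jobs.
        cores = {cpu // 2 for cpu in placement}
        if len(cores) == n_jobs:
            clean += 1
    return clean, total


clean, total = sibling_free_placements(n_cores=4, n_jobs=4)
print(f"{clean} of {total} placements keep every job on its own core")
```

Under these assumptions only 16 of the 70 possible placements avoid sibling sharing, so a guest that cannot see the topology would be expected to double up on a core most of the time.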

That would imply that Xen can somehow make a better decision than the domU's kernel scheduler, something that doesn't seem that likely. I would expect not pinning CPUs to increase process migration because Xen might migrate the CPU even though the kernel in domU decided which presented CPU was most lightly loaded.

And in fact, pinning would also result in good (near to native,
perhaps?) performance, if we were exposing the SMT topology details to
guests as, in that case, Linux would do the balancing properly. However,
that's not the case either. :-(
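
As a toy illustration of what topology-aware balancing would buy here, the sketch below fills one thread per core before using any sibling. It is not how the Linux scheduler is implemented; it only shows the placement a scheduler could reach if it knew which CPUs were siblings. The layout (CPUs 2k and 2k+1 sharing core k) and the function name are assumptions.

```python
def place_jobs(n_cores, n_jobs):
    """Spread jobs across cores first, then fall back to sibling threads.

    Assumed (hypothetical) layout: CPUs 2k and 2k+1 are HT siblings on core k.
    Returns the list of CPUs chosen, in the order they were assigned.
    """
    placement = []
    for thread in range(2):            # first pass: thread 0 of every core
        for core in range(n_cores):    # second pass: the sibling threads
            if len(placement) == n_jobs:
                return placement
            placement.append(2 * core + thread)
    return placement


print(place_jobs(n_cores=4, n_jobs=4))  # one job per core, no siblings shared
print(place_jobs(n_cores=4, n_jobs=6))  # siblings used only once cores run out
```

With four jobs on four cores, every job gets a core to itself, which is the case the half-load run would benefit from.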

I see, so you are referring specifically to the HT case. I can see how that could cause a problem. Does pinning improve the performance with HT disabled?


