
Re: [Xen-devel] About vcpu wakeup and runq tickling in credit



(Cc-ing David as it looks like he uses xenalyze quite a bit, and I'm 
 seeking any advice on how to squeeze data out of it too :-P)

On Thu, 2012-11-15 at 12:18 +0000, George Dunlap wrote:
> Maybe what we should do is do the wake-up based on who is likely to run 
> on the current cpu: i.e., if "current" is likely to be pre-empted, look 
> at idlers based on "current"'s mask; if "new" is likely to be put on the 
> queue, look at idlers based on "new"'s mask.
> 
Ok, find attached the two (trivial) patches that I produced and have
been testing these days. Unfortunately, early results show that I/we
might be missing something.
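
Before getting to the numbers, and since I can't easily paste the hunks
inline here, here's a toy, standalone model of the heuristic the patches
aim at (all the names and types below are made up purely for
illustration and do not match the actual sched_credit.c code; the real
thing lives in the attached patches, on top of __runq_tickle() & co.):

/*
 * Toy, standalone model of the tickling heuristic discussed above.
 * Simplified stand-ins only, *not* the real Xen structures.
 */
#include <stdint.h>
#include <inttypes.h>
#include <stdio.h>

typedef uint32_t cpumask_t;        /* one bit per pcpu (up to 32 pcpus) */

struct toy_vcpu {
    int       prio;                /* higher value == higher priority   */
    cpumask_t affinity;            /* pcpus this vcpu is allowed on     */
};

/*
 * When 'new' wakes up on the pcpu currently running 'cur', decide
 * which idle pcpus are worth tickling:
 *  - if 'new' beats 'cur', it is 'cur' that will likely be preempted
 *    and end up waiting, so look for idlers in *cur*'s affinity mask;
 *  - otherwise 'new' will sit on the runqueue, so look for idlers in
 *    *new*'s affinity mask.
 */
static cpumask_t cpus_to_tickle(const struct toy_vcpu *cur,
                                const struct toy_vcpu *new,
                                cpumask_t idlers)
{
    const struct toy_vcpu *waiter = (new->prio > cur->prio) ? cur : new;

    return idlers & waiter->affinity;
}

int main(void)
{
    struct toy_vcpu cur = { .prio = 1, .affinity = 0x0003 };  /* pcpus 0-1 */
    struct toy_vcpu new = { .prio = 2, .affinity = 0x00f0 };  /* pcpus 4-7 */
    cpumask_t idlers    = 0x00f2;                             /* 1,4-7 idle */

    /* 'new' preempts 'cur', so we tickle idlers usable by 'cur': pcpu 1. */
    printf("tickle mask: 0x%04" PRIx32 "\n",
           cpus_to_tickle(&cur, &new, idlers));
    return 0;
}

The whole point, as per your suggestion, is just picking whose affinity
mask gets intersected with the idlers, depending on who is going to end
up waiting.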

In fact, although I don't yet have the numbers for the NUMA-aware
scheduling case (which is what originated all this! :-D), comparing
'upstream' and 'patched' (i.e., 'upstream' plus the two attached
patches) I can spot some performance regressions. :-(

Here are the results of running some benchmarks with 2, 6 and 10 VMs.
Each VM has 2 VCPUs, and they all run the benchmarks concurrently on a
16-CPU host. (Each test is repeated 3 times, and avg +/- stddev is what
is reported.)

Also, the VCPUs were statically pinned to the host's PCPUs. As already
said, numbers for no-pinning and NUMA-scheduling will follow.

+ sysbench --test=memory (throughput in MB/s, higher is better)
 #VMs | upstream                | patched
    2 | 550.97667 +/- 2.3512355 | 540.185   +/- 21.416892
    6 | 443.15    +/- 5.7471797 | 442.66389 +/- 2.1071732
   10 | 313.89233 +/- 1.3237493 | 305.69567 +/- 0.3279853

+ sysbench --test=cpu (time, lower is better)
 #VMs | upstream                | patched
    2 | 47.8211   +/- 0.0215503 | 47.816117 +/- 0.0174079
    6 | 62.689122 +/- 0.0877172 | 62.789883 +/- 0.1892171
   10 | 90.321097 +/- 1.4803867 | 91.197767 +/- 0.1032667

+ specjbb2005 (throughput, higher is better)
 #VMs | upstream                | patched
    2 | 49591.057 +/- 952.93384 | 50008.28  +/- 1502.4863
    6 | 33538.247 +/- 1089.2115 | 33647.873 +/- 1007.3538
   10 | 21927.87  +/- 831.88742 | 21869.654 +/- 578.236


So, as you can easily see, the numbers are very similar, with some cases
where the patches produce a slight performance reduction, while I was
expecting the opposite, i.e., similar but a little bit better with the
patches.

For most of the runs of all the benchmarks I have the full traces
(although only for SCHED-* events, IIRC), so I can investigate further.
It's a huge amount of data, though, so it's really hard to make sense of
it, and any advice or direction on that would be much appreciated.


For instance, looking at one of the sysbench-memory runs with 10 VMs,
here's what I found. The memory throughput reported by one of the VMs
during that run is as follows:

 upstream: 315.68 MB/s
 patched:  306.69 MB/s

I then went through the traces and found out that the patched case
lasted longer (for transferring the same amount of memory, hence the
lower throughput), but with the following runstate-related results:

 upstream: running for 73.67% of the time
           runnable for 24.94% of the time

 patched:  running for 74.57% of the time
           runnable for 24.10% of the time

And that is consistent with other random instances I checked. So it
looks like the patches are, after all, doing their job of increasing (at
least a little) the running time of the various VCPUs, at the expense of
their runnable time, but the benefit of that is being entirely eaten by
some other effect --to the point that sometimes things go even worse--
which I'm not able to identify... For now! :-P
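
(Putting some very rough numbers on that, for this particular run: the
running share only grows from 73.67% to 74.57%, i.e. by a factor of
~1.012, while the throughput drops from 315.68 to 306.69 MB/s, i.e. by a
factor of ~0.972, so whatever we gain on the runnable-->running side is
clearly not showing up in the benchmark.)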

Any idea about what's going on and what I should check to better figure
that out?


Thanks a lot and Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://retis.sssup.it/people/faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

Attachment: xen-sched_credit-clarify-cpumask-and-during-tickle.patch
Description: Text Data

Attachment: xen-sched_credit-fix-tickling
Description: Text document

Attachment: signature.asc
Description: This is a digitally signed message part

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 

