Re: [Xen-users] Xen VMs and Unixbench: single vs multiple cpu behaviour
[Cc-ing George, which I should have done earlier, sorry! :-/]

On Sat, 2015-11-21 at 11:41 +0100, Marko Đukić wrote:
> And the results of vcpu pinning:

So, let me see if I can put the numbers together and recap (a sketch of the xl commands behind the various pinning configurations is at the bottom of this mail). With a 4 vCPUs VM, we have:

                                       no pinning  all on 1 pCPU  1-to-1 pin
Dhrystone 2 using register variables       3355.0         3359.4      3385.2
Double-Precision Whetstone                  787.6          785.3       784.2
Execl Throughput                            298.8          193.0       303.7
File Copy 1024 bufsize 2000 maxblocks      3292.7         3303.1      3294.0
File Copy 256 bufsize 500 maxblocks        2078.2         2089.2      2083.3
File Copy 4096 bufsize 8000 maxblocks      5516.9         5559.8      5576.7
Pipe Throughput                            1855.9         1857.8      1856.1
Pipe-based Context Switching                999.9          987.6       999.5
Process Creation                            254.4          826.4       354.1
Shell Scripts (1 concurrent)                818.0          840.1       815.8
Shell Scripts (8 concurrent)               6493.1         1100.4      6497.7
System Call Overhead                       2870.2         2866.0      2847.9
System Benchmarks Index Score              1564.2         1438.5      1611.2

With a 1 vCPU VM, the _same_ benchmarks behave like this:

                                       no pinning  pinned on 1 pCPU
Dhrystone 2 using register variables       3403.6            3391.0
Double-Precision Whetstone                  785.5             786.4
Execl Throughput                           1853.5            1857.8
File Copy 1024 bufsize 2000 maxblocks      3909.4            3901.2
File Copy 256 bufsize 500 maxblocks        2468.3            2459.8
File Copy 4096 bufsize 8000 maxblocks      6212.0            6191.4
Pipe Throughput                            2079.8            2080.8
Pipe-based Context Switching               1101.3            1100.9
Process Creation                           1811.4            1877.4
Shell Scripts (1 concurrent)               3084.2            3054.2
Shell Scripts (8 concurrent)               2838.6            2816.9
System Call Overhead                       3511.4            3517.4
System Benchmarks Index Score              2407.4            2409.6

It looks to me like the numbers are pretty much the same, regardless of pinning. Considering that this is a 'sequential' workload (only one copy of any single benchmark runs at any given time), I think this makes sense.

The most notable exception to the above is, in the 4 vCPUs case, "Shell Scripts (8 concurrent)", which is noticeably slower if all the vCPUs are pinned to 1 pCPU. That also makes sense, though, as this is the only test that is _not_really_ sequential.

There are other differences, still in the 4 vCPUs case:
 - "Execl Throughput" slows down a bit in the "all vCPUs pinned to
   1 pCPU" case;
 - "Process Creation", quite weirdly, is boosted in the "all vCPUs
   pinned to 1 pCPU" case, and behaves worst in the "no pinning" case.

So, it looks to me that the Xen scheduler (Credit1, I assume; is that correct, Marko?) *per* *se* is doing ok. Still, things do slow down in the case where the VM, basically, should have 3 idle vCPUs.

I'd be tempted to say that it could be one of those Xen-vs-Linux scheduler (mis)interactions, and it must be a Xen-specific one, as Marko reported that KVM, despite being slightly worse in general, is not affected by this particular glitch.

I'm a bit clueless for now, so I'll keep trying to reproduce this and, as soon as I manage to, collect some traces (sketches of both the UnixBench invocation and the tracing I have in mind are below as well). I'll let you know...
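For reference, this is roughly how the three pinning configurations of the 4 vCPUs case translate into xl commands. Just a sketch: the domain name "unixbench-vm" is a placeholder, and I'm assuming a host with (at least) 4 pCPUs:

  # "all on 1 pCPU": every vCPU of the guest pinned to pCPU 0
  for v in 0 1 2 3; do xl vcpu-pin unixbench-vm $v 0; done

  # "1-to-1 pin": vCPU N pinned to pCPU N
  for v in 0 1 2 3; do xl vcpu-pin unixbench-vm $v $v; done

  # "no pinning": affinity reset, vCPUs free to run anywhere
  xl vcpu-pin unixbench-vm all all

  # check the resulting placement
  xl vcpu-list unixbench-vm

The same can also be done at domain creation time, via cpus= in the guest config file.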
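As far as the benchmarks themselves go, I'm assuming single-copy runs, obtained from UnixBench's Run script more or less like this (again a sketch; -c sets how many parallel copies of each test are run):

  # one copy of each test, i.e., the 'sequential' workload
  # the numbers above should reflect
  ./Run -c 1

  # for comparison, one could also ask for, e.g., 4 parallel copies
  ./Run -c 4

Marko, if your invocation was different, do say so.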
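And the tracing I'm talking about would be along these lines, run in dom0 while the benchmark is running in the guest (a sketch as well; 0x0002f000 should select the scheduler class of trace records, but double-check it against the Xen version in use):

  # collect scheduler trace records, discarding any stale buffer content
  xentrace -D -e 0x0002f000 /tmp/sched.trace
  # ...let it gather data for a while, then stop it with Ctrl-C...

  # summarize per-vCPU runstates and pCPU usage
  xenalyze --summary /tmp/sched.trace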
Regards,
Dario
--
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)