Re: [Xen-devel] Poor SMP performance pv_ops domU
On 05/18/2010 10:34 AM, John Morrison wrote:
> Hi,
>
> Over the last year we have tried many times to get acceptable performance
> from pv_ops kernels.
>
> Tests done with 1,2,4 and 8 cores. The more cores the lower the score.
>
> Inside the domU it shows all cores, top -s shows all cores in use.
> xentop in dom0 never shows over 99% cpu.
>
> 2.6.18.8-xenU kernel shows over 700% cpu and the scores are about 8 x the
> pv_ops score.
>
> Any ideas ?
>

Well, I guess some kind of bad serialization is going on in there, and
it should be fairly obvious with a bit of examination.

Have you tried building your own pvops domU kernels? Does enabling PV
spinlocks make any difference? Also, enabling some of the lock
debugging/profiling/contention monitoring stuff may give useful results.

Can you post the corresponding 2.6.18 results? Are there specific
sub-tests which show the effect more strongly than the others? How does
the 2.6.32 kernel fare when booted native?

Thanks,
    J

>
> John
>
>
> 1 core
>
> BYTE UNIX Benchmarks (Version 4.1-wht.2)
> System -- Linux test 2.6.32-21-server #32-Ubuntu SMP Fri Apr 16 09:17:34 UTC
> 2010 x86_64 GNU/Linux
> /dev/xvda1 141110136 1066476 132875660 1% /
>
> Start Benchmark Run: Tue May 18 13:54:54 BST 2010
> 13:54:54 up 0 min, 1 user, load average: 0.00, 0.00, 0.00
>
> End Benchmark Run: Tue May 18 14:06:12 BST 2010
> 14:06:12 up 11 min, 2 users, load average: 11.48, 5.20, 2.43
>
>
> INDEX VALUES
> TEST                                    BASELINE      RESULT   INDEX
>
> Dhrystone 2 using register variables    376783.7   8950813.0   237.6
> Double-Precision Whetstone                  83.1      2103.7   253.2
> Execl Throughput                           188.3      1568.4    83.3
> File Copy 1024 bufsize 2000 maxblocks     2672.0     64198.0   240.3
> File Copy 256 bufsize 500 maxblocks       1077.0     17781.0   165.1
> File Read 4096 bufsize 8000 maxblocks    15382.0    643717.0   418.5
> Pipe-based Context Switching             15448.6     85379.4    55.3
> Pipe Throughput                         111814.6    478490.1    42.8
> Process Creation                           569.3      3329.6    58.5
> Shell Scripts (8 concurrent)                44.8       380.7    85.0
> System Call Overhead                    114433.5    498712.3    43.6
>                                                            =========
> FINAL SCORE                                                    114.1
>
> 2-cores
>
> ==============================================================
> BYTE UNIX Benchmarks (Version 4.1-wht.2)
> System -- Linux test 2.6.32-21-server #32-Ubuntu SMP Fri Apr 16 09:17:34 UTC
> 2010 x86_64 GNU/Linux
> /dev/xvda1 141110136 1066548 132875588 1% /
>
> Start Benchmark Run: Tue May 18 14:07:27 BST 2010
> 14:07:27 up 0 min, 1 user, load average: 0.00, 0.00, 0.00
>
> End Benchmark Run: Tue May 18 14:18:04 BST 2010
> 14:18:04 up 10 min, 1 user, load average: 12.78, 5.53, 2.49
>
>
> INDEX VALUES
> TEST                                    BASELINE      RESULT   INDEX
>
> Dhrystone 2 using register variables    376783.7  10124838.6   268.7
> Double-Precision Whetstone                  83.1      1188.7   143.0
> Execl Throughput                           188.3      1596.2    84.8
> File Copy 1024 bufsize 2000 maxblocks     2672.0     58323.0   218.3
> File Copy 256 bufsize 500 maxblocks       1077.0     17776.0   165.1
> File Read 4096 bufsize 8000 maxblocks    15382.0    568217.0   369.4
> Pipe-based Context Switching             15448.6     86111.3    55.7
> Pipe Throughput                         111814.6    469957.8    42.0
> Process Creation                           569.3      3298.1    57.9
> Shell Scripts (8 concurrent)                44.8       378.9    84.6
> System Call Overhead                    114433.5    532828.4    46.6
>                                                            =========
> FINAL SCORE                                                    107.9
>
> 4-cores
>
> ==============================================================
> BYTE UNIX Benchmarks (Version 4.1-wht.2)
> System -- Linux test 2.6.32-21-server #32-Ubuntu SMP Fri Apr 16 09:17:34 UTC
> 2010 x86_64 GNU/Linux
> /dev/xvda1 141110136 1066628 132875508 1% /
>
> Start Benchmark Run: Tue May 18 14:19:17 BST 2010
> 14:19:17 up 0 min, 1 user, load average: 0.00, 0.00, 0.00
>
> End Benchmark Run: Tue May 18 14:29:53 BST 2010
> 14:29:53 up 10 min, 1 user, load average: 13.59, 6.35, 2.97
>
>
> INDEX VALUES
> TEST                                    BASELINE      RESULT   INDEX
>
> Dhrystone 2 using register variables    376783.7  10185429.8   270.3
> Double-Precision Whetstone                  83.1       759.8    91.4
> Execl Throughput                           188.3      1386.2    73.6
> File Copy 1024 bufsize 2000 maxblocks     2672.0     62331.0   233.3
> File Copy 256 bufsize 500 maxblocks       1077.0     16492.0   153.1
> File Read 4096 bufsize 8000 maxblocks    15382.0    563402.0   366.3
> Pipe-based Context Switching             15448.6     87176.0    56.4
> Pipe Throughput                         111814.6    481068.1    43.0
> Process Creation                           569.3      3128.9    55.0
> Shell Scripts (8 concurrent)                44.8       394.9    88.1
> System Call Overhead                    114433.5    539996.1    47.2
>                                                            =========
> FINAL SCORE                                                    102.6
>
> 8-cores
>
> ==============================================================
> BYTE UNIX Benchmarks (Version 4.1-wht.2, 8 threads)
> System -- Linux test 2.6.32-21-server #32-Ubuntu SMP Fri Apr 16 09:17:34 UTC
> 2010 x86_64 GNU/Linux
> /dev/xvda1 141110136 1066680 132875456 1% /
>
> Start Benchmark Run: Tue May 18 14:30:59 BST 2010
> 14:30:59 up 0 min, 1 user, load average: 0.07, 0.02, 0.00
>
> End Benchmark Run: Tue May 18 14:42:52 BST 2010
> 14:42:52 up 12 min, 1 user, load average: 25.56, 10.84, 4.96
>
>
> INDEX VALUES
> TEST                                    BASELINE      RESULT   INDEX
>
> Dhrystone 2 using register variables    376783.7   9972130.3   264.7
> Double-Precision Whetstone                  83.1       755.2    90.9
> Execl Throughput                           188.3      1584.7    84.2
> File Copy 1024 bufsize 2000 maxblocks     2672.0     58981.0   220.7
> File Copy 256 bufsize 500 maxblocks       1077.0     16904.0   157.0
> File Read 4096 bufsize 8000 maxblocks    15382.0    557735.0   362.6
> Pipe-based Context Switching             15448.6     80738.2    52.3
> Pipe Throughput                         111814.6    450891.2    40.3
> Process Creation                           569.3      2948.5    51.8
> Shell Scripts (8 concurrent)                44.8       378.1    84.4
> System Call Overhead                    114433.5    537443.2    47.0
>                                                            =========
> FINAL SCORE                                                    100.9
>
>
>
> --
> Professional hosting without compromise
> www.clustered.net
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel
>
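
For anyone comparing these runs against a 2.6.18 or native boot, here is a rough Python sketch of how the scores above appear to be derived. The scoring rule is an assumption inferred from the posted numbers, not taken from the benchmark source: each INDEX looks like 10 * RESULT / BASELINE, and FINAL SCORE looks like the geometric mean of the eleven INDEX values. The function names are made up for illustration.

```python
import math

def unixbench_index(baseline, result):
    # Assumed rule: index = 10 * result / baseline (inferred from the tables).
    return 10.0 * result / baseline

def final_score(indices):
    # Assumed rule: final score = geometric mean of the per-test indices.
    return math.exp(sum(math.log(i) for i in indices) / len(indices))

# INDEX columns copied from the 1-core and 8-core runs quoted above.
ONE_CORE = [237.6, 253.2, 83.3, 240.3, 165.1, 418.5,
            55.3, 42.8, 58.5, 85.0, 43.6]
EIGHT_CORE = [264.7, 90.9, 84.2, 220.7, 157.0, 362.6,
              52.3, 40.3, 51.8, 84.4, 47.0]

if __name__ == "__main__":
    # Dhrystone row of the 1-core run: baseline 376783.7, result 8950813.0.
    print("Dhrystone index (1 core): %.1f"
          % unixbench_index(376783.7, 8950813.0))
    print("final score, 1 core:  %.1f" % final_score(ONE_CORE))
    print("final score, 8 cores: %.1f" % final_score(EIGHT_CORE))
    # Per-test ratio of 8-core to 1-core index, to spot which sub-tests move.
    for one, eight in zip(ONE_CORE, EIGHT_CORE):
        print("  8-core/1-core index ratio: %.2f" % (eight / one))
```

Feeding the 2.6.18 INDEX columns through the same functions, once posted, would make it easier to see which sub-tests account for the gap.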