Re: [Xen-devel] Performance issues on dom0
 
 
Hi, 
 
A few tests (such as Pipe Throughput and Pipe-based Context
Switching) give very bad results under a Xen-enabled kernel. I got
similar results in my own benchmarks, and I am not sure why.
 
Also, I am not sure how these may affect real server performance. I
believe that running Apache (or any other service) and sending
requests to it on a Xen-enabled vs. a vanilla kernel makes more sense
as a way to measure the performance difference.
Below is the performance comparison between the two kernels, with each
result normalized to the non-Xen kernel.
Test                                    Non-xen kernel   xen kernel
Execl Throughput                                     1         0.28
File Copy 1024 bufsize 2000 maxblocks                1         0.76
File Copy 256 bufsize 500 maxblocks                  1         0.78
File Copy 4096 bufsize 8000 maxblocks                1         0.71
Pipe Throughput                                      1         0.25
Pipe-based Context Switching                         1         0.31
Process Creation                                     1         0.25
Shell Scripts (1 concurrent)                         1         0.36
Shell Scripts (16 concurrent)                        1         0.39
Shell Scripts (8 concurrent)                         1         0.39
System Call Overhead                                 1         0.53
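For anyone who wants to check the normalized figures, here is a minimal Python sketch that divides each dom0 result by the native one. The numbers are copied from the raw UnixBench output in the quoted message (native vs. dom0 2.6.30.2); the names `RESULTS` and `normalized_ratios` are only illustrative.

```python
# Raw UnixBench results (16 parallel copies) copied from the quoted message:
# test name -> (native result, dom0 2.6.30.2 result).
RESULTS = {
    "Execl Throughput": (40751.3, 11593.1),
    "File Copy 1024 bufsize 2000 maxblocks": (395854.5, 299238.3),
    "File Copy 256 bufsize 500 maxblocks": (103076.1, 79926.1),
    "File Copy 4096 bufsize 8000 maxblocks": (1363078.3, 974072.8),
    "Pipe Throughput": (15770868.9, 3922840.5),
    "Pipe-based Context Switching": (3671269.3, 1125963.3),
    "Process Creation": (110360.2, 27045.1),
    "Shell Scripts (1 concurrent)": (65731.1, 23418.9),
    "Shell Scripts (16 concurrent)": (4916.5, 1939.5),
    "Shell Scripts (8 concurrent)": (9776.3, 3826.3),
    "System Call Overhead": (6840143.9, 3592086.9),
}


def normalized_ratios(results):
    """Return dom0 throughput as a fraction of native throughput, per test."""
    return {
        name: round(dom0 / native, 2)
        for name, (native, dom0) in results.items()
    }


if __name__ == "__main__":
    for name, ratio in normalized_ratios(RESULTS).items():
        print(f"{name:<40} {ratio:.2f}")
```

All of these tests report throughput (higher is better), so a ratio below 1 means the Xen kernel is slower.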
 
  
regards, 
OZ
 
 2009/8/18 Frédéric VANNIÈRE  <frederic@xxxxxxxxxxx>
Hello, 
 
I'm benchmarking a brand new Nehalem server and I've noticed performance problems when running the Xen dom0 kernel without any VMs. 
 
Xen : 3.4.1 
OS: Debian 5.0 amd64 
HW: Supermicro Dual-Xeon Nehalem L5520 2.27 GHz, 24 GB memory, 16 CPUs (8 cores * 2 HT) 
native kernel : 2.6.30.4 
dom0 kernel 1 : 2.6.18-xen0 
dom0 kernel 2 : 2.6.30.2-xen0 (SUSE patches + config from http://x17.eu/xen) 
 
Benchmarks: tar xjf linux2.6.30.4.tar.bz2 
            build 2.6.30.4 with the default config and "make -j16" 
            filebench (mail) 
            unixbench -c 16 system 
 
The performance of 2.6.18-xen0 and 2.6.30.2-xen0 is very close (2.6.30 is a little faster), and the filesystem benchmark gives the same values for all kernels. 
 
 
The dom0 has all the memory and all the vCPUs, no VMs are running, and each vCPU is pinned to a physical CPU. 
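
The message does not say how the pinning was done. As a rough sketch only, a one-to-one mapping (vCPU n on physical CPU n, which is an assumption) could be generated with the `xm vcpu-pin` command from the Xen 3.x toolstack. The helper name `pin_commands` is hypothetical, and the script only prints the commands rather than running them.

```python
def pin_commands(domain="Domain-0", vcpus=16):
    """Build one `xm vcpu-pin` command per vCPU.

    Assumes a 1:1 mapping: vCPU n is pinned to physical CPU n.
    """
    return [f"xm vcpu-pin {domain} {n} {n}" for n in range(vcpus)]


if __name__ == "__main__":
    # Print the commands for review; run them as root on the dom0 if they
    # match the intended layout.
    print("\n".join(pin_commands()))
```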
 
Linux elrond 2.6.30.2-xen0 #4 SMP Fri Aug 14 14:31:11 CEST 2009 x86_64 GNU/Linux 
 
Name                                        ID   Mem VCPUs      State   Time(s) 
Domain-0                                     0 24147    16     r-----  23128.5 
 
1. tar xjf 
  - native : 13 seconds 
  - dom0 : 21 seconds 
2. make -j16 linux : 
  - native : 56 seconds 
  - dom0 : 67 seconds 
3. filebench : 
  - native : 4268 iops 
  - dom0 : 4219 iops 
4. unixbench : 
  - native : 5262 
  - dom0 : 2200 !!!! --> very bad  (1810 with 2.6.18-xen0) 
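
To put the four results on the same scale, here is a small Python sketch that expresses the dom0 performance as a fraction of native. It takes into account that tar and make are timings (lower is better) while filebench and unixbench are scores (higher is better). The names `MEASUREMENTS` and `relative_performance` are illustrative only.

```python
# (native, dom0, higher_is_better), taken from the figures listed above.
MEASUREMENTS = {
    "tar xjf (seconds)": (13, 21, False),
    "make -j16 (seconds)": (56, 67, False),
    "filebench (iops)": (4268, 4219, True),
    "unixbench (index)": (5262, 2200, True),
}


def relative_performance(native, dom0, higher_is_better):
    """Return dom0 performance as a fraction of native (1.0 = same speed)."""
    if higher_is_better:
        return dom0 / native
    # For timings, a longer run means lower performance, so invert the ratio.
    return native / dom0


if __name__ == "__main__":
    for name, (native, dom0, hib) in MEASUREMENTS.items():
        print(f"{name:<22} {relative_performance(native, dom0, hib):.2f}")
```

On these numbers filebench is essentially unchanged, while unixbench drops to well under half of the native score.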
 
 
 
 
Any idea what the cause could be? 
 
 
Regards, 
 
 
 
 
===========  Unixbench : native ============== 
Benchmark Run: Tue Aug 2009 13:24:36 - 13:51:10 
16 CPUs in system; running 16 parallel copies of tests 
 
Execl Throughput                              40751.3 lps   (30.0 s, 2 samples) 
File Copy 1024 bufsize 2000 maxblocks        395854.5 KBps  (30.0 s, 2 samples) 
File Copy 256 bufsize 500 maxblocks          103076.1 KBps  (30.0 s, 2 samples) 
File Copy 4096 bufsize 8000 maxblocks       1363078.3 KBps  (30.0 s, 2 samples) 
Pipe Throughput                            15770868.9 lps   (10.0 s, 7 samples) 
Pipe-based Context Switching                3671269.3 lps   (10.0 s, 7 samples) 
Process Creation                             110360.2 lps   (30.0 s, 2 samples) 
Shell Scripts (1 concurrent)                  65731.1 lpm   (60.0 s, 2 samples) 
Shell Scripts (16 concurrent)                  4916.5 lpm   (60.1 s, 2 samples) 
Shell Scripts (8 concurrent)                   9776.3 lpm   (60.0 s, 2 samples) 
System Call Overhead                        6840143.9 lps   (10.0 s, 7 samples) 
 
System Benchmarks Partial Index              BASELINE       RESULT    INDEX 
Execl Throughput                                 43.0      40751.3   9477.1 
File Copy 1024 bufsize 2000 maxblocks          3960.0     395854.5    999.6 
File Copy 256 bufsize 500 maxblocks            1655.0     103076.1    622.8 
File Copy 4096 bufsize 8000 maxblocks          5800.0    1363078.3   2350.1 
Pipe Throughput                               12440.0   15770868.9  12677.5 
Pipe-based Context Switching                   4000.0    3671269.3   9178.2 
Process Creation                                126.0     110360.2   8758.7 
Shell Scripts (1 concurrent)                     42.4      65731.1  15502.6 
Shell Scripts (16 concurrent)                     ---       4916.5      --- 
Shell Scripts (8 concurrent)                      6.0       9776.3  16293.8 
System Call Overhead                          15000.0    6840143.9   4560.1 
                                                                   ======== 
System Benchmarks Index Score (Partial Only)                         5262.1 
 
 
===========  Unixbench : dom0 2.6.30.2 ============== 
Benchmark Run: Tue Aug 2009 15:07:45 - 15:34:31 
16 CPUs in system; running 16 parallel copies of tests 
 
Execl Throughput                              11593.1 lps   (29.9 s, 2 samples) 
File Copy 1024 bufsize 2000 maxblocks        299238.3 KBps  (30.0 s, 2 samples) 
File Copy 256 bufsize 500 maxblocks           79926.1 KBps  (30.0 s, 2 samples) 
File Copy 4096 bufsize 8000 maxblocks        974072.8 KBps  (30.0 s, 2 samples) 
Pipe Throughput                             3922840.5 lps   (10.1 s, 7 samples) 
Pipe-based Context Switching                1125963.3 lps   (10.0 s, 7 samples) 
Process Creation                              27045.1 lps   (30.0 s, 2 samples) 
Shell Scripts (1 concurrent)                  23418.9 lpm   (60.0 s, 2 samples) 
Shell Scripts (16 concurrent)                  1939.5 lpm   (60.2 s, 2 samples) 
Shell Scripts (8 concurrent)                   3826.3 lpm   (60.1 s, 2 samples) 
System Call Overhead                        3592086.9 lps   (10.1 s, 7 samples) 
 
System Benchmarks Partial Index              BASELINE       RESULT    INDEX 
Execl Throughput                                 43.0      11593.1   2696.1 
File Copy 1024 bufsize 2000 maxblocks          3960.0     299238.3    755.7 
File Copy 256 bufsize 500 maxblocks            1655.0      79926.1    482.9 
File Copy 4096 bufsize 8000 maxblocks          5800.0     974072.8   1679.4 
Pipe Throughput                               12440.0    3922840.5   3153.4 
Pipe-based Context Switching                   4000.0    1125963.3   2814.9 
Process Creation                                126.0      27045.1   2146.4 
Shell Scripts (1 concurrent)                     42.4      23418.9   5523.3 
Shell Scripts (16 concurrent)                     ---       1939.5      --- 
Shell Scripts (8 concurrent)                      6.0       3826.3   6377.2 
System Call Overhead                          15000.0    3592086.9   2394.7 
                                                                   ======== 
System Benchmarks Index Score (Partial Only)                         2200.0 
 
 
 
 
 
 
 
 
_______________________________________________ 
Xen-devel mailing list 
Xen-devel@xxxxxxxxxxxxxxxxxxx 
http://lists.xensource.com/xen-devel 
  