
Re: [Xen-devel] DomU vs Dom0 performance.



On Sun, Sep 29, 2013 at 07:22:14PM -0400, sushrut shirole wrote:
> Hi,
> 
> I have been doing some disk I/O benchmarking of dom0 and domU (HVM). I ran
> into an issue where domU performed better than dom0, so I ran a few
> experiments to check whether it is just disk I/O performance.
> 
> I have Arch Linux (kernel 3.5.0) + Xen 4.2.2 installed on an Intel Core
> i7 Q720 machine. I have also installed Arch Linux (kernel 3.5.0) in a domU
> running on this machine. The domU runs with 8 vcpus, and I have allotted
> both dom0 and domU 4096M of RAM.

What kind of guest is it? PV or HVM?
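
(If in doubt, a couple of quick ways to check; the config path below is just
a placeholder for wherever your guest config lives:)

# From dom0, with the xl toolstack: an HVM guest's config normally
# contains builder = "hvm", while a PV guest's does not.
grep -i builder /etc/xen/<domU-config>

# From inside the guest: the kernel logs how it detected Xen at boot,
# which usually makes the PV vs HVM distinction clear.
dmesg | grep -i xen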

> 
> I performed the following experiments to compare the performance of domU
> vs dom0.
> 
> Experiment 1]
> 
> 1. Created a file.img of 5G
> 2. Mounted the file with an ext2 filesystem.
> 3. Ran sysbench with following command.
> 
> sysbench --num-threads=8 --test=fileio --file-total-size=1G
> --max-requests=1000000 prepare
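
(For anyone wanting to reproduce this, I assume steps 1-3 looked roughly like
the following; the mount point and dd parameters are my guesses:)

# Steps 1-2: create a 5G image, put an ext2 filesystem on it, loop-mount it.
dd if=/dev/zero of=file.img bs=1M count=5120
mkfs.ext2 -F file.img
mkdir -p /mnt/test
mount -o loop file.img /mnt/test

# Step 3: generate the 128 x 8MB sysbench test files inside that mount.
cd /mnt/test
sysbench --num-threads=8 --test=fileio --file-total-size=1G \
    --max-requests=1000000 prepare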
> 
> 4. Read files into memory
> 
> Script to read the files:
> 
> <snip>
> # pull the sysbench test files into the page cache before the read test
> for i in test_file.*
> do
>    sudo dd if="./$i" of=/dev/null
> done
> </snip>
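
(Side note: the step above pulls the whole 1G data set into the page cache,
so the random-read run below is largely hitting cached data rather than the
disk. A quick way to confirm is to compare the "cached" figure reported by
free before and after the loop:)

free -m        # note the cached value
for i in test_file.*; do sudo dd if="./$i" of=/dev/null; done
free -m        # cached should have grown by roughly 1024 MB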
> 
> 5. Ran sysbench.
> 
> sysbench --num-threads=8 --test=fileio --file-total-size=1G
> --max-requests=5000000 --file-test-mode=rndrd run
> 
> The output I got on dom0 is:
> 
> <output>
> Number of threads: 8
> 
> Extra file open flags: 0
> 128 files, 8Mb each
> 1Gb total file size
> Block size 16Kb
> Number of random requests for random IO: 5000000
> Read/Write ratio for combined random IO test: 1.50
> Periodic FSYNC enabled, calling fsync() each 100 requests.
> Calling fsync() at the end of test, Enabled.
> Using synchronous I/O mode
> Doing random read test
> 
> Operations performed:  5130322 Read, 0 Write, 0 Other = 5130322 Total
> Read 78.283Gb  Written 0b  Total transferred 78.283Gb  (4.3971Gb/sec)
> *288165.68 Requests/sec executed*
> 
> Test execution summary:
>     total time:                          17.8034s
>     total number of events:              5130322
>     total time taken by event execution: 125.3102
>     per-request statistics:
>          min:                                  0.01ms
>          avg:                                  0.02ms
>          max:                                 55.55ms
>          approx.  95 percentile:               0.02ms
> 
> Threads fairness:
>     events (avg/stddev):           641290.2500/10057.89
>     execution time (avg/stddev):   15.6638/0.02
> </output>
> 
> 6. Performed the same experiment on domU, and the result I got is:
> 
> <output>
> Number of threads: 8
> 
> Extra file open flags: 0
> 128 files, 8Mb each
> 1Gb total file size
> Block size 16Kb
> Number of random requests for random IO: 5000000
> Read/Write ratio for combined random IO test: 1.50
> Periodic FSYNC enabled, calling fsync() each 100 requests.
> Calling fsync() at the end of test, Enabled.
> Using synchronous I/O mode
> Doing random read test
> 
> Operations performed:  5221490 Read, 0 Write, 0 Other = 5221490 Total
> Read 79.674Gb  Written 0b  Total transferred 79.674Gb  (5.9889Gb/sec)
> *392489.34 Requests/sec executed*
> 
> Test execution summary:
>     total time:                          13.3035s
>     total number of events:              5221490
>     total time taken by event execution: 98.7121
>     per-request statistics:
>          min:                                  0.01ms
>          avg:                                  0.02ms
>          max:                                 49.75ms
>          approx.  95 percentile:               0.02ms
> 
> Threads fairness:
>     events (avg/stddev):           652686.2500/1494.93
>     execution time (avg/stddev):   12.3390/0.02
> 
> </output>
> 
> I was expecting dom0 to perform better than domU, so to debug this further
> I ran the lmbench microbenchmarks.
> 
> Experiment 2] bw_mem benchmark
> 
> 1. ./bw_mem 1000m wr
> 
> dom0 output:
> 
> 1048.58 3640.60
> 
> domU output:
> 
> 1048.58 4719.32
> 
> 2. ./bw_mem 1000m rd
> 
> dom0 output:
> 
> 1048.58 5780.56
> 
> domU output:
> 
> 1048.58 6258.32
> 
> 
> Experiment 3] lat_syscall benchmark
> 
> 1. ./lat_syscall write
> 
> dom0 output:
> Simple write: 1.9659 microseconds
> 
> domU output:
> Simple write: 0.4256 microseconds
> 
> 2. ./lat_syscall read
> 
> dom0 output:
> Simple read: 1.9399 microseconds
> 
> domU output:
> Simple read: 0.3764 microseconds
> 
> 3. ./lat_syscall stat
> 
> dom0 output:
> Simple stat: 3.9667 microseconds
> 
> domU output:
> Simple stat: 1.2711 microseconds
> 
> I am not able to understand why domU has performed better than dom0, when
> the obvious guess is that dom0 should perform better than domU. I would
> really appreciate any help if anyone knows the reason behind this issue.
> 
> Thank you,
> Sushrut.



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

