
[Xen-devel] Re: poor domU VBD performance.



Ian Pratt <m+Ian.Pratt <at> cl.cam.ac.uk> writes:

> 
> > I'll check the xen block driver to see if there's anything 
> > else that sticks out.
> >
> > Jens Axboe
> 
> Jens, I'd really appreciate this.
> 
> The blkfront/blkback drivers have rather evolved over time, and I don't
> think any of the core team fully understand the block-layer differences
> between 2.4 and 2.6. 
> 
> There's also some junk left in there from when the backend was in Xen
> itself back in the days of 1.2, though Vincent has prepared a patch to
> clean this up and also make 'refreshing' of vbd's work (for size
> changes), and also allow the blkfront driver to import whole disks
> rather than partitions. We had this functionality on 2.4, but lost it in
> the move to 2.6.
> 
> My bet is that the true performance bug lies in the 2.6 backend. Using a
> 2.6 domU blkfront talking to a 2.4 dom0 blkback seems to give good
> performance under a wide variety of circumstances. Using a 2.6 dom0 is far
> more pernickety. I agree with Andrew: I suspect it's the work queue changes
> that are biting us when we don't have many outstanding requests.
> 
> Thanks,
> Ian
> 


I have done my simple dd on hde1 with two different settings of readahead:
256 sectors and 512 sectors.
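
(For completeness: the readahead can be switched per device with
blockdev --setra / --getra or directly via the BLKRASET/BLKRAGET block-device
ioctls. The little C sketch below only illustrates that mechanism; it is not
the exact tooling used for the runs reported here, and the device path is a
placeholder.)

/* ra_ioctl.c - minimal sketch: read and optionally set the per-device
 * readahead (in 512-byte sectors) via the block ioctls.  Equivalent to
 * "blockdev --getra/--setra <dev>". */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/fs.h>          /* BLKRAGET, BLKRASET */

int main(int argc, char *argv[])
{
    long ra = 0;
    int fd;

    if (argc < 2) {
        fprintf(stderr, "usage: %s <blockdev> [readahead-sectors]\n", argv[0]);
        return 1;
    }

    fd = open(argv[1], O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    if (argc > 2) {
        /* BLKRASET takes the new readahead, in sectors, as the ioctl
         * argument itself (not a pointer). */
        if (ioctl(fd, BLKRASET, (unsigned long)atol(argv[2])) < 0) {
            perror("BLKRASET");
            return 1;
        }
    }

    /* BLKRAGET writes the current readahead (in sectors) through a pointer. */
    if (ioctl(fd, BLKRAGET, &ra) < 0) { perror("BLKRAGET"); return 1; }
    printf("%s: readahead = %ld sectors\n", argv[1], ra);

    close(fd);
    return 0;
}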

These are the results:

DOM0 readahead 512s


Device:         rrqm/s  wrqm/s     r/s    w/s     rsec/s  wsec/s     rkB/s    wkB/s  avgrq-sz  avgqu-sz  await  svctm  %util
hde          115055.40    2.00  592.40   0.80  115647.80   22.40  57823.90    11.20    194.99      2.30   3.88   1.68  99.80
hda               0.00    0.00    0.00   0.00       0.00    0.00      0.00     0.00      0.00      0.00   0.00   0.00   0.00

avg-cpu:  %user   %nice %system %iowait   %idle
           0.20    0.00   31.60   14.20   54.00

DOMU readahead 512s

Device:         rrqm/s  wrqm/s       r/s    w/s     rsec/s  wsec/s     rkB/s    wkB/s  avgrq-sz  avgqu-sz  await  svctm   %util
hda1              0.00    0.20      0.00   0.00       0.00    3.20      0.00     1.60      0.00      0.00   0.00   0.00    0.00
hde1         102301.40    0.00  11571.00   0.00  113868.80    0.00  56934.40     0.00      9.84     68.45   5.92   0.09  100.00

avg-cpu:  %user   %nice %system %iowait   %idle
           0.00    0.00   35.00   65.00    0.00


DOM0 readahead 256s


Device:         rrqm/s  wrqm/s     r/s    w/s    rsec/s  wsec/s     rkB/s    wkB/s  avgrq-sz  avgqu-sz  await  svctm  %util
hde           28289.20    1.80  126.80   0.40  28416.00   17.60  14208.00     8.80    223.53      1.06   8.32   7.85  99.80
hda               0.00    0.00    0.00   0.00      0.00    0.00      0.00     0.00      0.00      0.00   0.00   0.00   0.00

avg-cpu:  %user   %nice %system %iowait   %idle
           0.20    0.00    1.60    5.60   92.60


DOMU readahead 256s

Device:         rrqm/s  wrqm/s      r/s    w/s    rsec/s  wsec/s     rkB/s    wkB/s  avgrq-sz  avgqu-sz  await  svctm   %util
hda1              0.00    0.20     0.00   0.40      0.00    4.80      0.00     2.40     12.00      0.00   0.00   0.00    0.00
hde1          25085.60    0.00  3330.40   0.00  28416.00    0.00  14208.00     0.00      8.53     30.54   9.17   0.30  100.00

avg-cpu:  %user   %nice %system %iowait   %idle
           0.20    0.00    1.40   98.40    0.00




What surprises me is that the service time for requests in DOM0 decreases
dramatically when readahead is increased from 256 to 512 sectors. If the
output of iostat is reliable, it tells me that requests in DOMU are assembled
to about 8 to 10 sectors in size, while DOM0 merges them into about 200 or
even more sectors.
Using a readahead of 256 sectors results in an average queue size of about 1,
while changing readahead to 512 sectors results in an average queue size of
slightly above 2 on DOM0. Service times in DOM0 with readahead of 256 sectors
seem to be in the range of the typical seek time of a modern IDE disk, while
they are significantly lower with readahead of 512 sectors.
As I have mentioned, this is the system with only one installed disk, which
explains the write activity on it. The two write requests per second go to a
different partition, and those result in four required seeks per second. That
alone should not be a reason for every request to take roughly a seek time
as its service time.
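
(As a sanity check of those request-size numbers: avgrq-sz is essentially
rsec/s divided by r/s when writes are negligible, so the ~195-sector versus
~10-sector figures can be re-derived directly from the tables above. A
throwaway calculation, using the 512-sector numbers:)

/* iostat_calc.c - back-of-the-envelope check of the iostat figures quoted
 * above (readahead 512 sectors).  Purely illustrative arithmetic. */
#include <stdio.h>

int main(void)
{
    /* DOM0 hde:  rsec/s, r/s, svctm (ms) from the table above */
    double dom0_rsec = 115647.80, dom0_r = 592.40,   dom0_svctm = 1.68;
    /* DOMU hde1 */
    double domu_rsec = 113868.80, domu_r = 11571.00, domu_svctm = 0.09;

    /* average request size in sectors = rsec/s divided by r/s */
    printf("DOM0 avg request: %.1f sectors\n", dom0_rsec / dom0_r);  /* ~195 */
    printf("DOMU avg request: %.1f sectors\n", domu_rsec / domu_r);  /* ~9.8 */

    /* throughput in MB/s, assuming 512-byte sectors */
    printf("throughput: %.1f MB/s (dom0), %.1f MB/s (domU)\n",
           dom0_rsec * 512 / 1e6, domu_rsec * 512 / 1e6);

    /* per-request service time as reported by iostat */
    printf("svctm: %.2f ms (dom0), %.2f ms (domU)\n", dom0_svctm, domu_svctm);
    return 0;
}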

I have done a number of further tests on various systems. In most cases I
failed to achieve service times below 8 msecs in Dom0; the only counterexample
is reported above. It seems to me that at low readahead values the amount of
data requested from disk per request is simply the readahead window. Such a
request takes about one seek time, and thus I get lower throughput when I
work with small readahead values.
What I do not understand at all is why throughput collapses with large
readahead sizes.
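
(To put a number on the small-readahead case: if every request fetches one
256-sector window and costs roughly one seek, with ~8 ms assumed as a
ballpark seek time, the expected rate is about 32000 sectors/s, which is in
the same region as the 28416 rsec/s measured above. A quick check:)

/* ra_model.c - sanity check of the "one readahead window per seek" picture
 * for the small-readahead case.  The ~8 ms seek time is an assumed ballpark
 * figure, not a measured value. */
#include <stdio.h>

int main(void)
{
    double seek_s   = 0.008;      /* assumed seek + settle time per request */
    double window   = 256.0;      /* readahead window, 512-byte sectors     */
    double measured = 28416.0;    /* rsec/s from the 256-sector dom0 table  */

    printf("predicted: %.0f sectors/s, measured: %.0f sectors/s\n",
           window / seek_s, measured);   /* 32000 vs 28416 - same ballpark */

    /* With readahead 512 the measured svctm (1.68 ms) is far below a seek,
     * so this simple model no longer applies - which is exactly the puzzle
     * described above. */
    return 0;
}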

I found in mm/readahead.c that the readahead size for a file is updated if
the readahead is not efficient. I suspect that this mechanism might lead to
readahead being switched off for this file.
With readahead set to 2048 sectors, the product of avgqu-sz and avgrq-sz
reported by iostat drops to 4 to 5 physical pages.
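
(To illustrate what I suspect, here is a deliberately simplified toy model of
such an adaptive policy. It is not the actual mm/readahead.c code, only a
sketch of how repeated "inefficient" readaheads could shrink the window
towards zero, i.e. effectively switch readahead off:)

/* ra_shrink.c - toy model (not the real 2.6 mm/readahead.c logic) of an
 * adaptive policy that shrinks the window whenever readahead looks
 * "inefficient", driving it down to nothing after enough penalties. */
#include <stdio.h>

struct ra_state {
    unsigned long max_pages;   /* configured upper bound                   */
    unsigned long cur_pages;   /* current window                           */
};

/* Hypothetical update rule: halve the window on an inefficiency signal,
 * grow it on success, never exceeding the configured maximum. */
static void ra_update(struct ra_state *ra, int efficient)
{
    if (efficient) {
        ra->cur_pages *= 2;
        if (ra->cur_pages > ra->max_pages)
            ra->cur_pages = ra->max_pages;
    } else {
        ra->cur_pages /= 2;    /* repeated penalties end up at 0 = disabled */
    }
}

int main(void)
{
    /* 2048 sectors = 1024 KB = 256 pages of 4 KB */
    struct ra_state ra = { .max_pages = 256, .cur_pages = 256 };

    for (int i = 0; i < 10; i++) {
        ra_update(&ra, 0);     /* a run of "inefficient" readaheads */
        printf("after miss %2d: window = %lu pages\n", i + 1, ra.cur_pages);
    }
    return 0;
}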

Peter 


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

