
Re: [Xen-devel] [RFC v1 0/5] VBD: enlarge max segment per request in blkfront



On Thu, Aug 16, 2012 at 09:34:57AM -0400, Konrad Rzeszutek Wilk wrote:
> On Thu, Aug 16, 2012 at 10:22:56AM +0000, Duan, Ronghui wrote:
> > Hi, list.
> > The max segments per request in the VBD queue is 11, while for Linux OS / other 
> > VMMs the parameter is set to 128 by default.
> 
> Like the FreeBSD one?
> 
> > This may be caused by the limited size of the ring between front and back. So I 
> > wonder whether we can put segment data into another ring and use it dynamically 
> > as each request needs. Here is a prototype that has not had much 
> > testing, but it works on a 64-bit Linux 3.4.6 kernel. I see the CPU% 
> > reduced to 1/3 of the original in the sequential test. But it brings 
> > some overhead, which makes the CPU utilization of random I/O increase a little.
> > 
> 
> Did you think also about expanding the ring size to something bigger?
> 
> > Here is a short version of the data, using only 1K random read and 64K sequential 
> > read in direct mode, testing a physical SSD disk as the blkback backend. 
> > CPU% is taken from xentop.
> 
> > Read 1K random      IOPS       Dom0 CPU%    DomU CPU%
> >     W               52005.9    86.6         71
> >     W/O             52123.1    85.8         66.9
> > 
> > Read 64K seq        BW MB/s    Dom0 CPU%    DomU CPU%
> >     W               250        27.1         10.6
> >     W/O             250        62.6         31.1
> > 
> > 
> > The patch would be simple if we only used the new method. But we need to consider 
> > that a user may run a new kernel as the backend with an older one as the frontend. 
> > We also need to consider the live migration case. So the change becomes huge...
> 
> OK? I think you are implementing the extension documented in
> 
> changeset:   24875:a59c1dcfe968
> user:        Justin T. Gibbs <justing@xxxxxxxxxxxxxxxx>
> date:        Thu Feb 23 10:03:07 2012 +0000
> summary:     blkif.h: Define and document the request number/size/segments 
> extension
> 
> changeset:   24874:f9789db96c39
> user:        Justin T. Gibbs <justing@xxxxxxxxxxxxxxxx>
> date:        Thu Feb 23 10:02:30 2012 +0000
> summary:     blkif.h: Document the Red Hat and Citrix blkif multi-page ring 
> extensions
> 
> so that would be the max-requests-segments one?
> 
> 
> 
> > [RFC v1 1/5] 
> >     In order to add the new segment ring, refactor the original code and split 
> > out some methods related to ring operations.
> > [RFC v1 2/5]
> >     Add the segment ring support in blkfront. Most of the code is about 
> > suspend/recover.
> > [RFC v1 3/5]
> >     Likewise, refactor the original code in blkback.
> > [RFC v1 4/5]
> >     In order to support different ring types in blkback, make the 
> > pending_req list per disk.
> 
> Not sure why you structured the patches this way, but it might
> make sense to order them 1, 3, 4, 2, 5. The per-disk 'pending_req' is an
> overall improvement that fixes a lot of concurrency issues. I tried to
> implement this and ran into an issue with grants still being active. Did you
> have issues with that, or did it work just fine for you?
> > [RFC v1 5/5]
> >     Add the segment ring support in blkback.
> 
> So .. where are the patches? Did I miss them?

Ah, they just arrived.

I took a brief look at them, and I think they are a step in the right
direction. One thing that is missing is the kfree in 4/5 for when the disk
goes away. Also, some of the code is commented out, and it's not clear to me
why.
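
To illustrate the kfree point, here is a minimal user-space sketch of a
per-disk pending_req pool whose allocation is paired with a teardown call.
It is not taken from the patches: struct disk_ctx, disk_alloc_pending() and
disk_free_pending() are hypothetical names, and calloc()/free() stand in for
the kernel allocator.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these names come from the patch series. */
struct pending_req {
    int in_use;
};

struct disk_ctx {
    struct pending_req *pending_reqs;  /* per-disk pool instead of a global one */
    unsigned int nr_pending;
};

/* Allocate the per-disk pool. Returns 0 on success, -1 on failure. */
int disk_alloc_pending(struct disk_ctx *d, unsigned int nr)
{
    assert(d != NULL);
    d->pending_reqs = calloc(nr, sizeof(*d->pending_reqs));
    if (d->pending_reqs == NULL) {
        d->nr_pending = 0;
        return -1;
    }
    d->nr_pending = nr;
    return 0;
}

/* Teardown path: has to run when the disk goes away, or the pool leaks. */
void disk_free_pending(struct disk_ctx *d)
{
    assert(d != NULL);
    free(d->pending_reqs);             /* free(NULL) is a no-op */
    d->pending_reqs = NULL;            /* makes a second call harmless */
    d->nr_pending = 0;
}
```

The point is only that every path which creates the per-disk list needs a
matching release on the disconnect path.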

Lastly, this protocol should be negotiated using the 'max-request-.. ' key, or
whichever is the proper one, not the blkfront-ring-type. It would also be good
to CC Justin, as he might have some guidance on this and could also test the
frontend against his backend (or vice versa). I am not sure what is involved in
setting up the FreeBSD backend that Spectra Logic is using. Though this might
also involve expanding the ring to a multi-page one, I think?
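
As a rough sketch of what such a negotiation could look like, assuming each
side advertises a per-request segment limit and an older peer advertises
nothing: the function names are hypothetical, the value 11 is the limit quoted
at the top of this thread, and one 4 KiB page per segment is an assumption.

```c
#include <assert.h>

#define LEGACY_MAX_SEGMENTS 11u   /* per-request limit cited in this thread */
#define SEGMENT_BYTES       4096u /* assumption: one 4 KiB page per segment */

/*
 * Hypothetical helper: pick the segment limit both ends can handle.
 * peer_advertised == 0 models a peer that published no limit at all,
 * for example an older kernel, so we fall back to the legacy value.
 */
unsigned int negotiate_max_segments(unsigned int ours,
                                    unsigned int peer_advertised)
{
    assert(ours >= LEGACY_MAX_SEGMENTS);
    if (peer_advertised == 0)
        return LEGACY_MAX_SEGMENTS;
    return ours < peer_advertised ? ours : peer_advertised;
}

/* Largest payload a single request can carry with that many segments. */
unsigned int max_request_bytes(unsigned int segments)
{
    return segments * SEGMENT_BYTES;
}
```

Under those assumptions 11 segments give at most 44 KiB per request and 128
segments give 512 KiB, so a 64K sequential read needs two requests with the
legacy limit and one with the larger one.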

And I wonder if you need such a huge list of ops? Can some of them be trimmed
down? The v1 and v2 ones look quite similar. Oh, and instead of v1 and v2 I
would just call them 'large_segment' and 'default_segment'. Or 'lgr_segment'
and 'def_segment' perhaps?

Maybe 'huge_segment' and 'generic_segment'; that sounds better.

Lastly, it's not clear to me why you are removing the padding on some of the
older blkif structures.
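
One general reason padding can matter in structures shared across a ring,
which may or may not be why it is there in this case: a 32-bit x86 guest
aligns 64-bit members to 4 bytes inside a struct, while a 64-bit one aligns
them to 8. The sketch below is a small layout calculator with hypothetical
names, not the real blkif definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Round offset up to the next multiple of align (align is a power of two). */
size_t align_up(size_t offset, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

/*
 * Offset at which a 64-bit member lands when it follows header_bytes of
 * smaller members, given the ABI's alignment for 64-bit types
 * (4 on i386, 8 on x86_64).
 */
size_t u64_field_offset(size_t header_bytes, size_t u64_align)
{
    return align_up(header_bytes, u64_align);
}
```

With a 4-byte header the member lands at offset 4 under 4-byte alignment and
at offset 8 under 8-byte alignment, so the two ends disagree. Padding the
header to 8 bytes explicitly puts it at offset 8 on both.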

Thanks for posting this!
> > -ronghui
> > 
> > 
> > 
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel@xxxxxxxxxxxxx
> > http://lists.xen.org/xen-devel
