
Re: [Xen-devel] [RFC v1 0/5] VBD: enlarge max segment per request in blkfront



> On Thu, Aug 16, 2012 at 09:34:57AM -0400, Konrad Rzeszutek Wilk wrote:
> > On Thu, Aug 16, 2012 at 10:22:56AM +0000, Duan, Ronghui wrote:
> > > Hi, list.
> > > The max number of segments per request in the VBD queue is 11, while for
> > > Linux and other VMMs the parameter defaults to 128.
> >
> > Like the FreeBSD one?
> >
Yeap.
> > > This may be caused by the limited size of the ring between frontend and
> > > backend, so I wondered whether we could put the segment data into a separate
> > > ring and let each request use entries from it dynamically. Here is a
> > > prototype which hasn't had much testing, but it works on a Linux 3.4.6
> > > 64-bit kernel. I can see CPU% reduced to 1/3 of the original in the
> > > sequential test, but it brings some overhead which makes random I/O's CPU
> > > utilization increase a little.
> > >
> >
> > Did you think also about expanding the ring size to something bigger?
> >
A separate ring would hold 1024 segments; I think that can feed most hardware's bandwidth.
> > > Here is a short set of numbers using only 1K random reads and 64K sequential
> > > reads in direct mode, testing a physical SSD disk as the blkback backend.
> > > CPU% is taken from xentop.
> >
> > > Read 1K random    IOPS      Dom0 CPU%   DomU CPU%
> > >   W               52005.9   86.6        71
> > >   W/O             52123.1   85.8        66.9
> > >
> > > Read 64K seq      BW MB/s   Dom0 CPU%   DomU CPU%
> > >   W               250       27.1        10.6
> > >   W/O             250       62.6        31.1
> > >
> > >
> > > The patch would be simple if we only used the new method, but we need to
> > > consider that a user may run a new kernel as the backend while an older one
> > > is the frontend, and we also need to handle the live migration case. So the
> > > change becomes large...
> >
> > OK? I think you are implementing the extension documented in
> >
> > changeset:   24875:a59c1dcfe968
> > user:        Justin T. Gibbs <justing@xxxxxxxxxxxxxxxx>
> > date:        Thu Feb 23 10:03:07 2012 +0000
> > summary:     blkif.h: Define and document the request number/size/segments extension
> >
> > changeset:   24874:f9789db96c39
> > user:        Justin T. Gibbs <justing@xxxxxxxxxxxxxxxx>
> > date:        Thu Feb 23 10:02:30 2012 +0000
> > summary:     blkif.h: Document the Red Hat and Citrix blkif multi-page ring extensions
> >
> > so that would be the max-requests-segments one?
Oh, I missed this info. But yes, I do increase max-request-segments.
> >
> >
> > > [RFC v1 1/5]
> > >   In order to add the new segment ring, refactor the original code and
> > >   split out some methods related to ring operations.
> > > [RFC v1 2/5]
> > >   Add the segment ring support in blkfront. Most of the code is about
> > >   suspend/recover.
> > > [RFC v1 3/5]
> > >   Likewise, the original code in blkback needs refactoring.
> > > [RFC v1 4/5]
> > >   In order to support different ring types in blkback, make the
> > >   pending_req list per disk.
> >
> > Not sure why you structured the patches this way, but it might
> > make sense to order them as 1, 3, 4, 2, 5. The per-disk 'pending_req'
> > change is an overall improvement that fixes a lot of concurrency issues.
> > I tried to implement this and ran into an issue with grants still being
> > active. Did you have issues with that, or did it work just fine for you?
> > > [RFC v1 5/5]
> > >   Add the segment ring support in blkback.
> >
> > So .. where are the patches? Did I miss them?
> 
> Ah, they just arrived.
> 
> I took a brief look at them, and I think they are the right step. The thing
> that is missing is the kfree in 4/5 for when the disk has gone away.
> Also, there is some code that is commented out, and it's not clear to me why
> that is.
I forgot to clean this up. I wanted to get advice on the protocol change first;
I can send out a 'patch' afterwards.
> Lastly, this protocol should be negotiated using the 'max-request-.. ' or
> whichever is the proper type, not the blkfront-ring-type. It also would be
> good to CC Justin, as he might have some guidance on this and could also test
> the frontend on his backend (or vice versa). Not sure what is involved in
> setting up the FreeBSD backend that Spectra Logic is using.. Though this might
> also involve expanding the ring to be a multi-page one, I think?
> 
I began with the multi-page ring; I also have a multi-page ring patch on top of
this one. But since it showed no real performance benefit, I dropped it.

> And I wonder if you need to have such a huge list of ops? Can some of them be
> trimmed down?
Yes, there could be fewer ops if we add a common structure, like the backend's.
> The v1 and v2 look quite similar. Oh, and instead of v1 and v2 I would just
> call them 'large_segment' and 'default_segment'. Or 'lgr_segment' and
> 'def_segment' perhaps?
> 
> Maybe 'huge_segment' and 'generic_segment'? That sounds better.
> 
> 
Fine with me; I will reconsider the naming.
> Lastly, it's not clear to me why you are removing the padding on some of the
> older blkif structures?
It causes a size mismatch between a 64-bit DomU and a 32-bit Dom0, so I removed
it. I will double-check the alignment afterwards.
> Thanks for posting this!
> > > -ronghui
> > >
> > >
> > >
> > > _______________________________________________
> > > Xen-devel mailing list
> > > Xen-devel@xxxxxxxxxxxxx
> > > http://lists.xen.org/xen-devel
