
Re: [Xen-devel] RFC v1: Xen block protocol overhaul - problem statement (with pictures!)



On Wed, Jan 23, 2013 at 09:53:14AM +0000, Ian Campbell wrote:
> On Tue, 2013-01-22 at 19:46 +0000, Konrad Rzeszutek Wilk wrote:
> > Correct. In your case we have one cacheline shared by two requests.
> > 
> > This means we will have to be extra careful in the backend (and frontend)
> > to call 'wmb()' after we have filled two entries. Otherwise we end up
> > doing something like this:
> > 
> >         fill out the req[0]
> >         a) wmb()   => writes the full 64-byte cacheline out.
> >         fill out the req[1]
> >         b) wmb()   <== has to throw out the cacheline and re-write
> >                        the new one.
> > 
> > With a 64-byte request we do not have to worry about that.
> 
> ... unless your cachelines are 128 or 256 bytes...
> 
> Short answer: the existing ring.h macros take care of this for you.

Nice.
> 
> Long answer: The RING_PUSH_{REQUESTS,RESPONSES} handle this by only
> issuing the wmb() over the entire current batch of things, not each time
> you queue something. i.e. you can queue requests, modifying req_prod_pvt
> as you go and then at the end of a suitable batch you call
> RING_PUSH_REQUESTS which updates req_prod and does the barriers so you
> end up with
>       fill out the req[0]
>       req_prod_pvt = 1
>       fill out the req[1]
>       req_prod_pvt = 2
>       fill out the req[2]
>       req_prod_pvt = 3
>       Batch done => RING_PUSH_REQUESTS
>       wmb()
>       req_prod = req_prod_pvt
> 
> The last req might cross a cache line but if you are at the end of a
> batch how much does that matter? I suppose you could push a nop request
> or something to align if it was an issue.

Oh, NOP. That is an idea.
> 
> Maybe you don't get this behaviour unless you are batching effectively
> though? Solve the batching problem and you solve the cacheline thing
> basically for free.

Right. And looking at the code it actually does this.
> 
> > > > Naturally this means we need to negotiate a 'feature-request-size'
> > > > where v1 says that the request is of 64-bytes length.
> 
> Have you considered variable-length requests? This would let the request
> size scale with the number of segments required for that request, and
> allow you to cache-align the ends of the requests without wasting the
> extra space that including the worst-case number of segments would
> imply. e.g. a small write would take 32 bytes (padded to 64 if you must)
> and a larger one would take 196 (padded to 256). You should end up with
> more efficient use of the space in the ring this way.

Yes. If I recall right, option D) had it as "we could negotiate a proper
alignment for each request/response structure". The idea is pretty much the
same as yours - pad to the cacheline if the cacheline is bigger than the
request or response.

> 
> This also allows for other things like inlining requests (maybe more
> interesting for net than blk) or including DIX requests without
> incurring the overhead of however many bytes that is on every request.
> 
> I'd really like it if the result of this conversation could be a new
> generic ring structure that was applicable to at least net and blk
> rather than a new blk specific protocol.
> 
> Ian.
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxx
> http://lists.xen.org/xen-devel
> 
