
Re: [Xen-devel] RFC v1: Xen block protocol overhaul - problem statement (with pictures!)

On Tue, 2013-01-22 at 19:46 +0000, Konrad Rzeszutek Wilk wrote:
> Correct. In your case we have one cacheline shared by two requests.
> This means we will have to be extra careful in the backend (And frontend)
> to call 'wmb()' after we have filled two entries. Otherwise we end up doing
> something like this:
>         fill out the req[0]
>         a) wmb()   => writes the full 64-bytes cacheline out.
>         fill out the req[1]
>         b) wmb()        <== has to throw out the cacheline and re-write
>                             the new one.
> With a 64-bytes one we do not have to worry about that.

... unless your cachelines are 128- or 256-bytes...

Short answer: the existing ring.h macros take care of this for you.

Long answer: The RING_PUSH_{REQUESTS,RESPONSES} macros handle this by
issuing the wmb() once over the entire current batch of things, not each
time you queue something. i.e. you can queue requests, modifying
req_prod_pvt as you go, and then at the end of a suitable batch you call
RING_PUSH_REQUESTS, which updates req_prod and does the barrier, so you
end up with
        fill out the req[0]
        req_prod_pvt = 1
        fill out the req[1]
        req_prod_pvt = 2
        fill out the req[2]
        req_prod_pvt = 3
        Batch done => RING_PUSH_REQUESTS
        wmb(); req_prod = req_prod_pvt

The last req might cross a cache line but if you are at the end of a
batch how much does that matter? I suppose you could push a nop request
or something to align if it was an issue.

Maybe you don't get this behaviour if you aren't batching effectively
though? But solve the batching problem and you solve the cacheline
problem basically for free.

> > > Naturally this means we need to negotiate a 'feature-request-size'
> > > where v1 says that the request is of 64-bytes length.

Have you considered variable-length requests? This would let the request
size scale with the number of segments required for that request, and
allow you to cache align the ends of the requests without wasting the
extra space that including the worst case number of segments would
imply. e.g. a small write would take 32-bytes (padded to 64 if you must)
and a larger one would take 196 (padded to 256). You should end up with
more efficient use of the space in the ring this way.

This also allows for other things like inlining requests (maybe more
interesting for net than blk) or including DIX requests without
incurring the overhead of however many bytes that is on every request.

I'd really like it if the result of this conversation could be a new
generic ring structure that was applicable to at least net and blk
rather than a new blk specific protocol.


Xen-devel mailing list