
Re: [Xen-devel] [PATCH 04/10] xen/blkfront: separate ring information to an new struct




> -----Original Message-----
> From: David Vrabel
> Sent: 19 February 2015 11:15
> To: Roger Pau Monne; Bob Liu; Felipe Franciosi
> Cc: 'Konrad Rzeszutek Wilk'; xen-devel@xxxxxxxxxxxxx; linux-
> kernel@xxxxxxxxxxxxxxx; axboe@xxxxxx; hch@xxxxxxxxxxxxx;
> avanzini.arianna@xxxxxxxxx
> Subject: Re: [PATCH 04/10] xen/blkfront: separate ring information to an new
> struct
> 
> On 19/02/15 11:08, Roger Pau Monné wrote:
> > El 19/02/15 a les 3.05, Bob Liu ha escrit:
> >>
> >>
> >> On 02/19/2015 02:08 AM, Felipe Franciosi wrote:
> >>>> -----Original Message-----
> >>>> From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@xxxxxxxxxx]
> >>>> Sent: 18 February 2015 17:38
> >>>> To: Roger Pau Monne
> >>>> Cc: Bob Liu; xen-devel@xxxxxxxxxxxxx; David Vrabel; linux-
> >>>> kernel@xxxxxxxxxxxxxxx; Felipe Franciosi; axboe@xxxxxx;
> >>>> hch@xxxxxxxxxxxxx; avanzini.arianna@xxxxxxxxx
> >>>> Subject: Re: [PATCH 04/10] xen/blkfront: separate ring information
> >>>> to an new struct
> >>>>
> >>>> On Wed, Feb 18, 2015 at 06:28:49PM +0100, Roger Pau Monné wrote:
> >>>>> El 15/02/15 a les 9.18, Bob Liu ha escrit:
> >>>>> AFAICT you seem to have a list of persistent grants, indirect
> >>>>> pages and a grant table callback for each ring, isn't this
> >>>>> supposed to be shared between all rings?
> >>>>>
> >>>>> I don't think we should be going down that route, or else we can
> >>>>> hoard a large amount of memory and grants.
> >>>>
> >>>> It does remove the lock that would have to be accessed by each ring
> >>>> thread to access those. Those values (grants) can be limited to be
> >>>> a smaller value such that the overall number is the same as it was with
> >>>> the previous version. As in:
> >>>> each ring has = MAX_GRANTS / nr_online_cpus().
> >>>>>
> >>>
> >>> We should definitely be concerned with the amount of memory consumed
> >>> on the backend for each plugged virtual disk. We have faced several problems
> >>> in XenServer around this area before; it drastically affects VBD scalability
> >>> per host.
> >>>
> >>
> >> Right, so we have to keep both the lock and the amount of memory
> >> consumed in mind.
> >>
> >>> This makes me think that all the persistent grants work was done as a
> >>> workaround while we were facing several performance problems around
> >>> concurrent grant un/mapping operations. Given all the recent submissions
> >>> made around this (grant ops) area, is this something we should perhaps revisit
> >>> and discuss whether we want to continue offering persistent grants as a
> >>> feature?
> >>>
> >>
> >> Agreed, life would be easier if we could remove the persistent grants
> >> feature.
> >
> > I was thinking about this yesterday, and IMHO we should remove
> > persistent grants now while it's not too entangled; leaving it for
> > later will just make our lives more miserable.
> >
> > While it's true that persistent grants provide a throughput increase
> > by preventing grant table operations and TLB flushes, they have several
> > problems that cannot be avoided:
> >
> >  - Memory/grants hoarding: we need to reserve the same amount of
> > memory as the amount of data that we want to have in-flight. While
> > this is not so critical for memory, it is for grants, since using too
> > many grants can basically deadlock other PV interfaces. There's no way
> > to avoid this since it's the design behind persistent grants.
> >
> >  - Memcopy: guest needs to perform a memcopy of all data that goes
> > through blkfront. While not so critical, Felipe found systems where
> > memcopy was more expensive than grant map/unmap in the backend (IIRC
> > those were AMD systems).
> >
> >  - Complexity/interactions: when persistent grants were designed, the
> > number of requests was limited to 32 and each request could only contain
> > 11 pages. This means we had to use 352 pages/grants, which was fine. Now
> > that we have indirect IO, and multiqueue on the horizon, this number has
> > gone up by orders of magnitude; I don't think this is viable/useful
> > any more.
> >
> > If Konrad/Bob agree I would like to send a patch to remove persistent
> > grants and then have the multiqueue series rebased on top of that.
> 
> I agree with this.
> 
> I think we can get better performance/scalability gains with improvements
> to grant table locking and TLB flush avoidance.
> 
> David

It doesn't change the fact that persistent grants (as well as the grant copy 
implementation we did for tapdisk3) were alternatives that allowed aggregate 
storage performance to increase drastically. Before committing to removing
something that allows Xen users to scale their deployments, I think we need to
revisit whether the recent improvements to the whole grant mechanism (grant
table locking, TLB flushing, batched calls, etc) are performing as we would 
(now) expect.

What I think should be done prior to committing to either direction is a proper 
performance assessment of grant mapping vs. persistent grants vs. grant copy 
for single and aggregate workloads. We need to test a meaningful set of host 
architectures, workloads and storage types. Last year at the Xen Developer Summit,
for example, we showed how grant copy scaled better than persistent grants at 
the cost of doing the copy on the back end.

I don't mean to propose tests that will delay innovation by weeks or months. 
However, it is very easy to find changes that improve this or that synthetic 
workload and ignore the fact that it might damage several (possibly very 
realistic) others. I think this is the time to run performance tests 
objectively without trying to dig too much into debugging and go from there.
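
As a rough sketch of how the assessment matrix described above could be
enumerated, the snippet below crosses the three mechanisms named in this
message with a few test axes. Only the mechanism names and the
single/aggregate split come from the thread; the architecture, workload and
storage values are hypothetical placeholders, and nothing here measures
anything.

```python
from itertools import product

# Mechanisms named in the thread.
MECHANISMS = ["grant mapping", "persistent grants", "grant copy"]

# Hypothetical axis values; the thread only asks for "a meaningful set".
ARCHITECTURES = ["intel-host", "amd-host"]
WORKLOADS = ["sequential-read", "random-write"]
STORAGE = ["hdd", "ssd"]
SCALES = ["single", "aggregate"]  # one guest vs. many guests at once


def build_matrix():
    """Return one dict per combination that should be measured."""
    keys = ("mechanism", "arch", "workload", "storage", "scale")
    axes = (MECHANISMS, ARCHITECTURES, WORKLOADS, STORAGE, SCALES)
    return [dict(zip(keys, combo)) for combo in product(*axes)]


matrix = build_matrix()
print(len(matrix))  # 48 combinations with the placeholder axes above
```

Each entry would then be run on the same hosts under each mechanism, so that
a gain on one workload can be checked against possible regressions on the
others.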

Felipe
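
For reference, the grant arithmetic quoted earlier in the thread (32 requests
of 11 pages each, and Konrad's MAX_GRANTS / nr_online_cpus() split) can be
worked through as below. The figures 32 and 11 are from the thread; the
indirect segment count and the number of rings are assumptions picked purely
for illustration, not values taken from blkfront.

```python
# Figures stated in the thread.
REQUESTS_PER_RING = 32      # original limit on in-flight requests
SEGMENTS_PER_REQUEST = 11   # pages (grants) per request without indirect IO

# Hypothetical values, chosen only for illustration (not from the thread).
INDIRECT_SEGMENTS = 256     # assumed pages per request with indirect IO
NR_RINGS = 8                # assumed number of rings, e.g. one per online CPU


def grants_needed(requests, segments, rings=1):
    """Grants held if every in-flight page is persistently granted."""
    return requests * segments * rings


def per_ring_budget(max_grants, nr_rings):
    """Split a fixed total evenly: MAX_GRANTS / nr_online_cpus()."""
    return max_grants // nr_rings


baseline = grants_needed(REQUESTS_PER_RING, SEGMENTS_PER_REQUEST)
indirect = grants_needed(REQUESTS_PER_RING, INDIRECT_SEGMENTS)
multiqueue = grants_needed(REQUESTS_PER_RING, INDIRECT_SEGMENTS, NR_RINGS)

print(baseline)    # 352
print(indirect)    # 8192
print(multiqueue)  # 65536
print(per_ring_budget(baseline, NR_RINGS))  # 44
```

Under these assumed values the per-device total grows from 352 to 65536
grants, which is the "orders of magnitude" concern; capping the total and
dividing it per ring keeps the total at 352 but leaves each ring only 44.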



 

