
Re: [Xen-devel] Pointed questions re Xen memory overcommit

> From: Andres Lagar-Cavilla [mailto:andres@xxxxxxxxxxxxxxxx]
> > OK, I suppose xenpaging by itself could be useful in a situation such as:
> >
> > 1) A failover occurs from machine A that has lots of RAM to machine B
> >    that has much less RAM, and even horribly bad performance is better
> >    than total service interruption.
> > 2) All currently running VMs have been ballooned down "far enough"
> >    and either have no swap device or insufficiently-sized swap devices,
> >    Xen simply has no more free space, and horrible performance is
> >    acceptable.
> >
> > The historical problem with "hypervisor-based-swapping" solutions such
> > as xenpaging is that it is impossible to ensure that "horribly bad
> > performance" doesn't start occurring under "normal" circumstances
> > specifically because (as Tim indirectly concurs below), policies
> > driven by heuristics and external inference (i.e. dom0 trying
> > to guess how much memory every domU "needs") just don't work.
> >
> > As a result, VMware customers outside of some very specific domains
> > (domains possibly overlapping with Snowflock?) will tell you
> > that "memory overcommit sucks and so we turn it off".
> >
> > Which is why I raised the question "why are we doing this?"...
> > If the answer is "Snowflock customers will benefit from it",
> How come SnowFlock crept in here? :)
> I can unequivocally assert there is no such thing as "SnowFlock customers".

Sorry, no ill will intended.  Tim (I think) earlier in this thread
suggested that page-sharing might benefit snowflock-like workloads.

> Olaf Hering from SUSE invested significant time and effort in getting
> paging to where it is, so you also have to add to the list whatever
> his/their motivations are.

Thus my curiosity... if Novell has some super-secret plans
that we aren't privy to, that's fine.  Otherwise,
I was trying to understand the motivations.

> You have to keep in mind that paging is 1. not bad to have, 2. powerful
> and generic, and 3. a far more generic mechanism for populating memory
> on demand than what is labeled in the hypervisor as "populate-on-demand".
> Re 2. you could implement a balloon using a pager -- or you could
> implement a version of ramster by putting the page file on a fuse fs with
> compression turned on. Not that you would want to, just to prove a point.
> And re 3. not that there's anything wrong with PoD, but it has several
> assumptions baked in about being a temporary balloon replacement. I
> predict that once the 32-bit hypervisor and shadow mode are phased out,
> PoD will also go away, as it will be a "simple" sub-case of paging.
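To make the "PoD is a sub-case of paging" point above concrete, here is a toy Python sketch (this is not Xen code; all class and method names are invented for illustration). A generic pager resolves a fault by supplying page contents; xenpaging-style eviction stores contents in a page file, while populate-on-demand is the degenerate pager whose fault handler just hands back a fresh zeroed page:

```python
# Conceptual sketch only -- not the Xen API. Models demand paging as the
# general mechanism and PoD as a pager that never stores anything.

PAGE_SIZE = 4096

class Pager:
    """Generic pager: resolve a fault on guest-frame number gfn."""
    def page_in(self, gfn):
        raise NotImplementedError

class FilePager(Pager):
    """xenpaging-style: evicted page contents live in a page file (a dict here)."""
    def __init__(self):
        self.pagefile = {}
    def evict(self, gfn, data):
        self.pagefile[gfn] = data          # write page out, free the RAM
    def page_in(self, gfn):
        return self.pagefile.pop(gfn)      # read it back on fault

class PoDPager(Pager):
    """PoD-style: no stored contents; every first touch yields a zero page."""
    def page_in(self, gfn):
        return b"\x00" * PAGE_SIZE

class GuestMemory:
    """Guest physical memory; unbacked frames fault into the pager."""
    def __init__(self, pager):
        self.present = {}                  # gfn -> contents currently in RAM
        self.pager = pager
    def read(self, gfn):
        if gfn not in self.present:        # "page fault"
            self.present[gfn] = self.pager.page_in(gfn)
        return self.present[gfn]

# xenpaging case: evict a page, then fault it back in with contents intact.
fp = FilePager()
mem = GuestMemory(fp)
mem.present[7] = b"live data"
fp.evict(7, mem.present.pop(7))
assert mem.read(7) == b"live data"

# PoD case: first touch of an unpopulated frame gets a zero-filled page.
pod = GuestMemory(PoDPager())
assert pod.read(3) == b"\x00" * PAGE_SIZE
```

The sketch shows why PoD could collapse into paging: both are a fault handler plus a policy, and PoD's "policy" is simply that there is nothing to read back.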

I *think* we are all working on the same goal of "reduce RAM
as a bottleneck *without* a big performance hit".  With xenpaging,
I fear Xen customers will be excited about "reduce/eliminate
RAM as a bottleneck" and then be surprised when there IS a
big performance hit.  I also fear that, with current policy
technology, it will be impossible to draw any sane line to implement
"I want to reduce/eliminate RAM as a bottleneck *as much as possible*
WITHOUT a big performance hit".  In other words, I am hoping
to avoid repeating the mistakes VMware has already made and
getting the same results its customers have already seen,
e.g. "memory overcommit sucks so just turn it off."
That would be, IMHO, the classic definition of insanity.

As for PoD and paging, if adding xenpaging or replacing PoD with
xenpaging ensures that a guest continues to run in situations
where PoD would have caused the guest to crash, great!  But
if xenpaging makes performance suck where PoD was doing just
fine, see above.

(And, P.S., apologies to Jan who HAS invested time and energy
into tmem.)

Xen-devel mailing list