
Re: [Xen-devel] Pointed questions re Xen memory overcommit

> From: George Dunlap [mailto:George.Dunlap@xxxxxxxxxxxxx]
> Subject: Re: [Xen-devel] Pointed questions re Xen memory overcommit
> Well, of course.  But IIUC, it's one of the goals of MS Research to
> impact product teams, and the incentive structure there is meant to make
> that happen.  So if you can convince somebody / a team at MS Research
> that tmem is a clever cool new idea that can help Microsoft's bottom
> line, then *they* should be the ones to shop around for broader
> support.
> In theory anyway. :-)

/Me worked in corporate research for 10 years and agrees
that theory and reality are very different for the above.

That said, I am willing to play a supporting role if there
are efforts to convince MS of the value of tmem, just not
willing to be the lone salesman.

> > I'm not sure putting raw memory into a tmem pool would do
> > more good than just freeing it.  Putting *data* into tmem
> > is what makes it valuable, and I think it takes a guest OS
> > to know how and when to put and get the data and (perhaps
> > most importantly) when to flush it to ensure coherency.
> The thing with the gains from sharing is that you can't really free
> it.  Suppose you have two 2GiB VMs, of which 1GiB is identical at some
> point in time.  That means the 2 VMs use only 3GiB between them, and
> you have an extra 1GiB of RAM.  However, unlike RAM freed by
> ballooning, this RAM isn't stable: at any point in time, either of the
> VMs might write to the shared pages, leaving you again with 2 VMs that
> each need 2GiB of RAM.  If this happens, you will need to either:
> * Page out 0.5GiB from each VM (*really* bad for performance), or
> * Take the 1GiB of RAM back somehow.
> In this situation, having that RAM in a tmem pool that the guests can
> use (or perhaps dom0, for file caches or whatever) is the best option.
>  I forget the name you had for the different types, but wasn't there a
> type of tmem where you tell the guest, "Feel free to store something
> here, but it might not be here when you ask for it again"?  That's
> just the kind of way to use this RAM -- then the hypervisor system can
> just yank it from the tmem pool if guests start to un-share pages.
> The other option would be to allow the guests to decrease their
> balloon size, allowing them to use the freed memory themselves; and
> then if a lot of things get unshared, just inflate the balloons again.
>  This is also a decent option, except that due to the semantic gap, we
> can't guarantee that the balloon won't end up grabbing shared pages --
> which doesn't actually free up any more memory.
> A really *bad* option, IMHO, is to start a 3rd guest with that 1GiB of
> freed RAM -- unless you can guarantee that the balloon driver in all
> of them will be able to react to unsharing events.
> Anyway, that's what I meant by using a tmem pool -- does that make
> sense?  Have I misunderstood something about tmem's capabilities?
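To make the arithmetic in that example concrete, here is a trivial sketch (illustrative numbers only, matching the 2GiB scenario above):

```python
# Two 2 GiB VMs with 1 GiB of identical pages deduplicated by page-sharing.
vm_size_gib = 2
num_vms = 2
shared_gib = 1

# With sharing: each VM keeps its private portion; the shared GiB is
# stored only once, so 1 GiB appears to be "freed".
host_ram_shared = num_vms * (vm_size_gib - shared_gib) + shared_gib
assert host_ram_shared == 3  # 3 GiB in use

# If either VM dirties all the shared pages, copy-on-write splits them
# and the host needs the full footprint back.
host_ram_unshared = num_vms * vm_size_gib
assert host_ram_unshared == 4  # back to 4 GiB
```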

One thing I think you may be missing is that pages in an ephemeral
pool are next in line after purely free pages, i.e. they are automatically
freed in FIFO order if a guest demands memory via ballooning,
or if the tools are creating a new guest, or, presumably, if a
shared page needs to be split/cow'ed.  IOW, if you have a mixed
environment where some guests are unknowingly using page-sharing
and others are using tmem, this should (in theory) already work.
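To illustrate those semantics, here is a toy model of an ephemeral pool (a minimal sketch, not the actual Xen implementation; the names and the exclusive-get behavior are simplifying assumptions): puts are only hints, gets may miss, guests must flush stale entries themselves, and the host reclaims pages oldest-first under pressure.

```python
from collections import OrderedDict

class EphemeralPool:
    """Toy model of a tmem ephemeral pool (illustrative only)."""

    def __init__(self):
        # Insertion-ordered map: (pool_id, object_id, index) -> page data.
        self.pages = OrderedDict()

    def put(self, key, data):
        # A put is only a hint; the page may vanish before the next get.
        self.pages[key] = data

    def get(self, key):
        # Simplification: a hit also removes the page (exclusive cache);
        # a miss returns None and the guest must refetch from disk.
        return self.pages.pop(key, None)

    def flush(self, key):
        # The guest flushes entries it knows are stale, for coherency.
        self.pages.pop(key, None)

    def evict(self, nr_pages):
        # Host-side reclaim in FIFO order, e.g. to satisfy a balloon
        # request, guest creation, or a CoW split of a shared page.
        for _ in range(min(nr_pages, len(self.pages))):
            self.pages.popitem(last=False)

pool = EphemeralPool()
pool.put(("p1", "inode42", 0), b"cached block A")
pool.put(("p1", "inode42", 1), b"cached block B")
pool.evict(1)                                    # host reclaims oldest page
assert pool.get(("p1", "inode42", 0)) is None    # evicted: a miss
assert pool.get(("p1", "inode42", 1)) == b"cached block B"
```

The point of the FIFO eviction here is that ephemeral pages sit just above free pages in reclaim priority, so parking recovered frames there costs nothing when something else needs the memory.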

See "Memory allocation interdependency" in docs/misc/tmem-internals.html

If you are talking about a "pure" environment where no guests
are tmem-aware, putting pageframes-recovered-due-to-sharing in
an ephemeral tmem pool isn't, AFAICT, any different from just
freeing them.  At least with the current policy and implementation,
the results will be the same.  But maybe I am missing something
important in your proposal.

In case anyone has time to read it, the following may be more
interesting with all of the "semantic gap" issues fresh in your
mind.  (Note some of the code links are very out-of-date.)
http://oss.oracle.com/projects/tmem/

And for a more Linux-centric overview: http://lwn.net/Articles/454795/ 


Xen-devel mailing list