
Re: [Xen-devel] Proposed XENMEM_claim_pages hypercall: Analysis of problem and alternate solutions

On Mon, 2013-01-07 at 18:41 +0000, Dan Magenheimer wrote:
> > From: Ian Campbell [mailto:Ian.Campbell@xxxxxxxxxx]
> > 
> > On Thu, 2013-01-03 at 18:49 +0000, Dan Magenheimer wrote:
> > >
> > > Well, perhaps my statement is a bit heavy-handed, but I don't see
> > > how it ends the discussion... you simply need to prove my statement
> > > incorrect! ;-)  To me, that would mean pointing out any existing
> > > implementation or even university research that successfully
> > > predicts or externally infers future memory demand for guests.
> > > (That's a good approximation of my definition of an omniscient
> > > toolstack.)
> > 
> > I don't think a solution involving massaging of tot_pages need involve
> > either frequent changes to tot_pages nor omniscience from the tool
> > stack.
> > 
> > Start by separating the lifetime_maxmem from current_maxmem. The
> > lifetime_maxmem is internal to the toolstack (it is effectively your
> > tot_pages from today) and current_maxmem becomes whatever the toolstack
> > has actually pushed down into tot_pages at any given time.
> > 
> > In the normal steady state lifetime_maxmem == current_maxmem.
> > 
> > When you want to claim some memory in order to start a new domain of
> > size M you *temporarily* reduce current_maxmem for some set of domains
> > on the chosen host and arrange that the total of all the current_maxmems
> > on the host is such that "HOST_MEM - SUM(current_maxmems) > M".
> > 
> > Once the toolstack has built (or failed to build) the domain it can set
> > all the current_maxmems back to their lifetime_maxmem values.
> > 
> > If you want to build multiple domains in parallel then M just becomes
> > the sum over all the domains currently being built.
> Hi Ian --
> Happy New Year!
> Perhaps you are missing an important point that is leading
> you to oversimplify and draw conclusions based on that
> oversimplification...
> We are _primarily_ discussing the case where physical RAM is
> overcommitted, or to use your terminology IIUC:
>    SUM(lifetime_maxmem) > HOST_MEM

I understand this perfectly well.

> Thus:
> > In the normal steady state lifetime_maxmem == current_maxmem.
> is a flawed assumption, except perhaps as an initial condition
> or in systems where RAM is almost never a bottleneck.

I see that I have incorrectly (but it seems at least consistently) said
"d->tot_pages" where I meant d->max_pages. This was no doubt extremely
confusing and does indeed render the scheme unworkable. Sorry.

AIUI you currently set d->max_pages == lifetime_maxmem. In the steady
state, therefore, current_maxmem == lifetime_maxmem == d->max_pages and
nothing changes compared with how things are for you today.

In the case where you are claiming some memory you change only max_pages
(and not tot_pages, as I incorrectly stated before; tot_pages can
continue to vary dynamically, albeit within a reduced range). So
d->max_pages == current_maxmem, which is derived as I described
previously (managing to keep my tot and max straight for once):

        When you want to claim some memory in order to start a new
        domain of size M you *temporarily* reduce current_maxmem for
        some set of domains on the chosen host and arrange that the
        total of all the current_maxmems on the host is such that
        "HOST_MEM - SUM(current_maxmems) > M".

I hope that clarifies what I was suggesting.
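The claim step above can be sketched in toolstack-style pseudocode. This is purely illustrative: the `Domain` type, `plan_claim` function, and the even-split reduction policy are hypothetical names and a placeholder policy, not anything in xl/libxl or prescribed by this thread.

```python
# Hypothetical sketch of the max_pages claim scheme: temporarily reduce
# current_maxmem across domains until HOST_MEM - SUM(current_maxmems) > M.
from dataclasses import dataclass

@dataclass
class Domain:
    name: str
    lifetime_maxmem: int  # pages; the value normally pushed into d->max_pages
    current_maxmem: int   # pages; what the toolstack has pushed down right now

def plan_claim(domains, host_mem, claim):
    """Return {name: new current_maxmem} such that, once applied,
    host_mem - SUM(current_maxmems) > claim.  The even split below is
    a placeholder policy; a real toolstack would apply per-domain policy."""
    deficit = sum(d.current_maxmem for d in domains) - (host_mem - claim) + 1
    if deficit <= 0:
        return {}  # enough headroom already; nothing to reduce
    plan = {}
    per_dom = -(-deficit // len(domains))  # ceiling division
    for d in domains:
        cut = min(per_dom, deficit, d.current_maxmem)
        plan[d.name] = d.current_maxmem - cut
        deficit -= cut
        if deficit == 0:
            break
    if deficit > 0:
        raise RuntimeError("host cannot satisfy the claim")
    return plan
```

Once the domain is built (or the build fails), the toolstack would simply push each domain's lifetime_maxmem back down into max_pages, returning to the steady state.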

> Without that assumption, in your model, the toolstack must
> make intelligent policy decisions about how to vary
> current_maxmem relative to lifetime_maxmem, across all the
> domains on the system.  Since the memory demands of any domain
> often vary frequently, dramatically and unpredictably (i.e.
> "spike") and since the performance consequences of inadequate
> memory can be dire (i.e. "swap storm"), that is why I say the
> toolstack (in your model) must both make frequent changes
> to tot_pages and "be omniscient".

Agreed, I was mistaken in saying tot_pages where I meant max_pages.

My intention was to describe a scheme where max_pages would change only
a) when you start building a new domain and b) when you finish building
a domain. There should be no need to make adjustments between those two
points.

The inputs into the calculations are the lifetime_maxmems of all
domains, the current number of domains in the system, the initial
allocation of any domain(s) currently being built (AKA the current
claim) and the total physical RAM present in the host. AIUI all of
those are either static, or dynamic but only actually changing when new
domains are introduced/removed (or otherwise changing only
infrequently).

> So, Ian, would you please acknowledge that the Oracle model
> is valid and, in such cases where your maxmem assumption
> is incorrect, that hypervisor-controlled capacity allocation
> (i.e. XENMEM_claim_pages) is an acceptable solution?

I have no problem with the validity of the Oracle model. I don't think
we have reached the consensus that hypervisor-controlled capacity
allocation is the only possible solution, or the preferable solution
from the PoV of the hypervisor maintainers. In that sense it is
"unacceptable", because things which can be done outside the hypervisor
should be, and so I cannot acknowledge what you ask.

Apologies again for my incorrect use of tot_pages, which has led to this
confusion.


Xen-devel mailing list