
Re: [Xen-devel] Proposed XENMEM_claim_pages hypercall: Analysis of problem and alternate solutions

On Mon, 2013-01-07 at 18:41 +0000, Dan Magenheimer wrote:
> > From: Ian Campbell [mailto:Ian.Campbell@xxxxxxxxxx]
> > 
> > On Thu, 2013-01-03 at 18:49 +0000, Dan Magenheimer wrote:
> > >
> > > Well, perhaps my statement is a bit heavy-handed, but I don't see
> > > how it ends the discussion... you simply need to prove my statement
> > > incorrect! ;-)  To me, that would mean pointing out any existing
> > > implementation or even university research that successfully
> > > predicts or externally infers future memory demand for guests.
> > > (That's a good approximation of my definition of an omniscient
> > > toolstack.)
> > 
> > I don't think a solution involving massaging of tot_pages need involve
> > either frequent changes to tot_pages nor omniscience from the tool
> > stack.
> > 
> > Start by separating the lifetime_maxmem from current_maxmem. The
> > lifetime_maxmem is internal to the toolstack (it is effectively your
> > tot_pages from today) and current_maxmem becomes whatever the toolstack
> > has actually pushed down into tot_pages at any given time.
> > 
> > In the normal steady state lifetime_maxmem == current_maxmem.
> > 
> > When you want to claim some memory in order to start a new domain of
> > size M you *temporarily* reduce current_maxmem for some set of domains
> > on the chosen host and arrange that the total of all the current_maxmems
> > on the host is such that "HOST_MEM - SUM(current_maxmems) > M".
> > 
> > Once the toolstack has built (or failed to build) the domain it can set
> > all the current_maxmems back to their lifetime_maxmem values.
> > 
> > If you want to build multiple domains in parallel then M just becomes
> > the sum over all the domains currently being built.
> Hi Ian --
> Happy New Year!
> Perhaps you are missing an important point that is leading
> you to oversimplify and draw conclusions based on that
> oversimplification...
> We are _primarily_ discussing the case where physical RAM is
> overcommitted, or to use your terminology IIUC:
>    SUM(lifetime_maxmem) > HOST_MEM

I understand this perfectly well.

> Thus:
> > In the normal steady state lifetime_maxmem == current_maxmem.
> is a flawed assumption, except perhaps as an initial condition
> or in systems where RAM is almost never a bottleneck.

I see that I have incorrectly (but it seems at least consistently) said
"d->tot_pages" where I meant d->max_pages. This was no doubt extremely
confusing and does indeed render the scheme unworkable. Sorry.

AIUI you currently set d->max_pages == lifetime_maxmem. In the steady
state, therefore, current_maxmem == lifetime_maxmem == d->max_pages and
nothing changes compared with how things are for you today.

In the case where you are claiming some memory you change only max_pages
(and not tot_pages, as I incorrectly stated before; tot_pages can
continue to vary dynamically, albeit within a reduced range). So
d->max_pages == current_maxmem, which is derived as I described
previously (managing to keep my tot and max straight for once):

        When you want to claim some memory in order to start a new
        domain of size M you *temporarily* reduce current_maxmem for
        some set of domains on the chosen host and arrange that the
        total of all the current_maxmems on the host is such that
        "HOST_MEM - SUM(current_maxmems) > M".

I hope that clarifies what I was suggesting.
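The claim step above can be sketched in toolstack-style pseudocode. This is purely illustrative: the `Domain` type, `plan_claim` function, and the even-split reduction policy are hypothetical names and a placeholder policy, not anything in xl/libxl or prescribed by this thread.

```python
# Hypothetical sketch of the max_pages claim scheme: temporarily reduce
# current_maxmem across domains until HOST_MEM - SUM(current_maxmems) > M.
from dataclasses import dataclass

@dataclass
class Domain:
    name: str
    lifetime_maxmem: int  # pages; the value normally pushed into d->max_pages
    current_maxmem: int   # pages; what the toolstack has pushed down right now

def plan_claim(domains, host_mem, claim):
    """Return {name: new current_maxmem} such that, once applied,
    host_mem - SUM(current_maxmems) > claim.  The even split below is
    a placeholder policy; a real toolstack would apply per-domain policy."""
    deficit = sum(d.current_maxmem for d in domains) - (host_mem - claim) + 1
    if deficit <= 0:
        return {}  # enough headroom already; nothing to reduce
    plan = {}
    per_dom = -(-deficit // len(domains))  # ceiling division
    for d in domains:
        cut = min(per_dom, deficit, d.current_maxmem)
        plan[d.name] = d.current_maxmem - cut
        deficit -= cut
        if deficit == 0:
            break
    if deficit > 0:
        raise RuntimeError("host cannot satisfy the claim")
    return plan
```

Once the domain is built (or the build fails), the toolstack would simply push each domain's lifetime_maxmem back down into max_pages, returning to the steady state.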

> Without that assumption, in your model, the toolstack must
> make intelligent policy decisions about how to vary
> current_maxmem relative to lifetime_maxmem, across all the
> domains on the system.  Since the memory demands of any domain
> often vary frequently, dramatically and unpredictably (i.e.
> "spike") and since the performance consequences of inadequate
> memory can be dire (i.e. "swap storm"), that is why I say the
> toolstack (in your model) must both make frequent changes
> to tot_pages and "be omniscient".

Agreed, I was mistaken in saying tot_pages where I meant max_pages.

My intention was to describe a scheme where max_pages would change only
a) when you start building a new domain and b) when you finish building
a domain. There should be no need to make adjustments between those two
points.

The inputs into the calculations are the lifetime_maxmems of all
domains, the current number of domains in the system, the initial
allocation of any domain(s) currently being built (AKA the current
claim) and the total physical RAM present in the host. AIUI all of
those are either static, or dynamic but only actually changing when new
domains are introduced/removed (or otherwise changing only
infrequently).

> So, Ian, would you please acknowledge that the Oracle model
> is valid and, in such cases where your maxmem assumption
> is incorrect, that hypervisor-controlled capacity allocation
> (i.e. XENMEM_claim_pages) is an acceptable solution?

I have no problem with the validity of the Oracle model. I don't think
we have reached the consensus that hypervisor-controlled capacity
allocation is the only possible solution, or the preferable solution
from the PoV of the hypervisor maintainers. In that sense it is
"unacceptable", because things which can be done outside the hypervisor
should be, and so I cannot acknowledge what you ask.

Apologies again for my incorrect use of tot_pages, which has led to this
confusion.


Xen-devel mailing list