
Re: [Xen-devel] domain creation vs querying free memory (xend and xl)

On Oct 4, 2012, at 6:06 AM, Tim Deegan wrote:

> At 14:56 -0700 on 02 Oct (1349189817), Dan Magenheimer wrote:
>>> AIUI xapi uses the domains' maximum allocations, centrally controlled,
>>> to place an upper bound on the amount of guest memory that can be in
>>> use.  Within those limits there can be ballooning activity.  But TBH I
>>> don't know the details.
>> Yes, that's the same as saying there is no memory-overcommit.
> I'd say there is - but it's all done by ballooning, and it's centrally
> enforced by lowering each domain's maxmem to its balloon target, so a
> badly behaved guest can't balloon up and confuse things. 
>> The original problem occurs only if there are multiple threads
>> of execution that can be simultaneously asking the hypervisor
>> to allocate memory without the knowledge of a single centralized
>> "controller".
> Absolutely.
>> Tmem argues that doing "memory capacity transfers" at a page granularity
>> can only be done efficiently in the hypervisor.  This is true for
>> page-sharing when it breaks a "share" also... it can't go ask the
>> toolstack to approve allocation of a new page every time a write to a shared
>> page occurs.
>> Does that make sense?
> Yes.  The page-sharing version can be handled by having a pool of
> dedicated memory for breaking shares, and the toolstack asynchronously
> replenish that, rather than allowing CoW to use up all memory in the
> system.

That is doable. One benefit is that it would reduce the chance of a VM
hitting ENOMEM on a CoW unshare. I don't see how it would avoid it altogether.
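
To illustrate why a pool reduces but does not eliminate the failure, here is
a minimal toy model in Python. It is only a sketch: the class, its method
names and the page counts are all hypothetical, and nothing here is existing
Xen or toolstack code. The point is that a burst of writes to shared pages
arriving between two asynchronous refills can still drain the pool.

```python
class UnsharePool:
    """Toy model of a dedicated page pool used only for breaking shares.

    Hypothetical illustration; not an existing Xen interface.
    """

    def __init__(self, capacity):
        self.capacity = capacity  # pages the toolstack tries to keep pooled
        self.free = capacity      # pages currently available for unsharing

    def unshare(self):
        """Handle a write to a shared page; False models a CoW ENOMEM."""
        if self.free == 0:
            return False
        self.free -= 1
        return True

    def replenish(self, host_free_pages):
        """Asynchronous toolstack refill; returns the pages moved in."""
        moved = min(self.capacity - self.free, host_free_pages)
        self.free += moved
        return moved


pool = UnsharePool(capacity=4)
# Six writes to shared pages arrive before the toolstack gets to refill.
results = [pool.unshare() for _ in range(6)]
print(results)  # [True, True, True, True, False, False]
```

A larger pool or a faster refill shrinks the window, but the last two writes
above still fail.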

If the objective is to cap the unpredictable growth of memory allocations
via CoW unsharing, two observations: (1) a VM's allocation will never grow
past its nominal footprint, and (2) one can impose a cap today by tweaking
d->max_pages: CoW will fail, the faulting vcpu will sleep, and things can be
kicked back into action at a later point.
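
The behaviour in (2) can be sketched as follows. This is a toy Python model
under stated assumptions, not hypervisor code: max_pages and tot_pages mirror
the per-domain counters, while the Domain class, cow_unshare(), raise_cap()
and the list of sleeping vcpus are hypothetical names used for illustration.

```python
class Domain:
    """Toy model of a domain whose CoW unsharing is capped by max_pages."""

    def __init__(self, tot_pages, max_pages):
        self.tot_pages = tot_pages  # pages currently allocated to the domain
        self.max_pages = max_pages  # cap set by the toolstack
        self.sleeping_vcpus = []    # vcpus whose CoW fault could not be served

    def cow_unshare(self, vcpu):
        """Allocate one page to break a share, or put the vcpu to sleep."""
        if self.tot_pages >= self.max_pages:
            self.sleeping_vcpus.append(vcpu)
            return False
        self.tot_pages += 1
        return True

    def raise_cap(self, new_max):
        """Toolstack raises the cap and returns the vcpus to wake and retry."""
        self.max_pages = new_max
        woken, self.sleeping_vcpus = self.sleeping_vcpus, []
        return woken


dom = Domain(tot_pages=99, max_pages=100)
first = dom.cow_unshare(vcpu=0)   # fits under the cap
second = dom.cow_unshare(vcpu=1)  # hits the cap, vcpu 1 sleeps
woken = dom.raise_cap(110)        # later: the toolstack makes room
retry = dom.cow_unshare(vcpu=1)   # the woken vcpu retries its fault
print(first, second, woken, retry)  # True False [1] True
```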

>> (rough proposed design re-attached below)
> Thanks for that.  It describes a sensible-looking hypervisor interface,
> but my question was really: what should xl do, in the presence of
> ballooning, sharing, paging and tmem, to
> - decide whether a VM can be started at all;
> - control those four systems to shuffle memory around; and
> - resolve races sensibly to avoid small VMs deferring large ones.
> (AIUI, xl already has some logic to handle the case of balloon-to-fit.)
> The second of those three is the interesting one.  It seems to me that
> if the tools can't force all other actors to give up memory (and not
> immediately take it back) then they can't guarantee to be able to start
> a new VM, even with the new reservation hypercalls.
> Cheers,
> Tim.
>>> From: Dan Magenheimer
>>> Sent: Monday, October 01, 2012 2:04 PM
>>>   :
>>>   :
>>> Back to design brainstorming:
>>> The way I am thinking about it, the tools need to be involved
>>> to the extent that they would need to communicate to the
>>> hypervisor the following facts (probably via new hypercall):
>>> X1) I am launching a domain X and it is eventually going to
>>>   consume up to a maximum of N MB.  Please tell me if
>>>   there is sufficient RAM available AND, if so, reserve
>>>   it until I tell you I am done. ("AND" implies transactional
>>>   semantics)

X1 does not need hypervisor support. We already coexist with a global daemon 
that is a single point of failure. I'm not arguing for xenstore to hold onto 
these reservations, but a daemon can. Xapi does it that way.
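
As a rough sketch of what such a daemon could look like, here is a minimal
in-process model in Python. Everything in it is an assumption made for
illustration: the class and method names are hypothetical, it is not how xapi
is implemented, and a real daemon would also need IPC, persistence and
handling of clients that crash while holding a reservation.

```python
import threading


class ReservationDaemon:
    """Toy user-space daemon that serialises memory reservations."""

    def __init__(self, host_free_mb):
        self._lock = threading.Lock()
        self._host_free_mb = host_free_mb  # unallocated RAM, ignoring reservations
        self._reserved = {}                # domain name -> MB still reserved

    def reserve(self, domain, mb):
        """Check and reserve atomically; False if the memory is not there."""
        with self._lock:
            available = self._host_free_mb - sum(self._reserved.values())
            if domain in self._reserved or mb > available:
                return False
            self._reserved[domain] = mb
            return True

    def allocate(self, domain, mb):
        """Record that the domain builder consumed part of its reservation."""
        with self._lock:
            if mb > self._reserved.get(domain, 0):
                return False
            self._reserved[domain] -= mb
            self._host_free_mb -= mb
            return True

    def release(self, domain):
        """Launch finished: drop whatever is left of the reservation."""
        with self._lock:
            self._reserved.pop(domain, None)

    def free_raw(self):
        with self._lock:
            return self._host_free_mb

    def free_reserved(self):
        with self._lock:
            return self._host_free_mb - sum(self._reserved.values())


daemon = ReservationDaemon(host_free_mb=1000)
ok_x = daemon.reserve("X", 600)  # succeeds
ok_y = daemon.reserve("Y", 600)  # refused: only 400 MB are unreserved
daemon.allocate("X", 500)        # X's builder allocates 500 MB
mid_raw, mid_reserved = daemon.free_raw(), daemon.free_reserved()
daemon.release("X")              # launch complete
print(ok_x, ok_y, mid_raw, mid_reserved, daemon.free_raw(), daemon.free_reserved())
```

Because the check and the reservation happen under one lock, two concurrent
launches cannot both be told the same memory is available, which is the
transactional property being asked for.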


>>> X2) The launch of X is complete and I will not be requesting
>>>   the allocation of any more RAM for it.  Please release
>>>   the reservation, whether or not I've requested a total
>>>   of N MB.
>>> The calls may be nested or partially ordered, i.e.
>>>   X1...Y1...Y2...X2
>>>   X1...Y1...X2...Y2
>>> and the hypervisor must be able to deal with this.
>>> Then there would need to be two "versions" of "xm/xl free".
>>> We can quibble about which should be the default, but
>>> they would be:
>>> - "xl --reserved free" asks the hypervisor how much RAM
>>>   is available taking into account reservations
>>> - "xm --raw free" asks the hypervisor for the instantaneous
>>>   amount of RAM unallocated, not counting reservations
>>> When the tools are not launching a domain (that is, there
>>> has been a matching X2 for every X1), the results of the
>>> above "free" queries are always identical.
>>> So, IanJ, does this match up with the design you were thinking
>>> about?
>>> Thanks,
>>> Dan
>>> [1] I think the core culprits are (a) the hypervisor accounts for
>>> memory allocation of pages strictly on a first-come-first-served
>>> basis and (b) the tools don't have any form of need-this-much-memory
>>> "transaction" model
