Re: [Xen-devel] domain creation vs querying free memory (xend and xl)
On Oct 4, 2012, at 6:06 AM, Tim Deegan wrote:
> At 14:56 -0700 on 02 Oct (1349189817), Dan Magenheimer wrote:
>>> AIUI xapi uses the domains' maximum allocations, centrally controlled,
>>> to place an upper bound on the amount of guest memory that can be in
>>> use. Within those limits there can be ballooning activity. But TBH I
>>> don't know the details.
>>
>> Yes, that's the same as saying there is no memory-overcommit.
>
> I'd say there is - but it's all done by ballooning, and it's centrally
> enforced by lowering each domain's maxmem to its balloon target, so a
> badly behaved guest can't balloon up and confuse things.
>
>> The original problem occurs only if there are multiple threads
>> of execution that can be simultaneously asking the hypervisor
>> to allocate memory without the knowledge of a single centralized
>> "controller".
>
> Absolutely.
>
>> Tmem argues that doing "memory capacity transfers" at a page granularity
>> can only be done efficiently in the hypervisor. This is true for
>> page-sharing when it breaks a "share" also... it can't go ask the
>> toolstack to approve allocation of a new page every time a write to a shared
>> page occurs.
>>
>> Does that make sense?
>
> Yes. The page-sharing version can be handled by having a pool of
> dedicated memory for breaking shares, and the toolstack asynchronously
> replenish that, rather than allowing CoW to use up all memory in the
> system.
That is doable. One benefit is that it would minimize the chance of a VM
hitting a CoW ENOMEM, though I don't see how it would altogether avoid it.
If the objective is to cap the otherwise unpredictable growth of memory
allocations via CoW unsharing, two observations: (1) that growth will never
exceed the nominal VM footprint; (2) one can impose a cap today by tweaking
d->max_pages -- CoW allocations will fail, the faulting vcpu will sleep, and
things can be kicked back into action at a later point.
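[Editorial note: the dedicated-pool idea Tim describes, with asynchronous
replenishment by the toolstack, could be sketched roughly as follows. This is
a hypothetical illustration only; the class, its names, and the watermark
policy are invented for exposition and are not an actual Xen interface.]

```python
import threading

class UnsharePool:
    """Hypothetical sketch: a dedicated page pool for breaking CoW shares.

    Instead of letting copy-on-write unshares consume arbitrary system
    memory, each unshare draws from this fixed pool; the toolstack refills
    it asynchronously once it drops below a low watermark."""

    def __init__(self, size_pages, low_watermark):
        self.free_pages = size_pages
        self.low_watermark = low_watermark
        self.lock = threading.Lock()
        self.refill_needed = threading.Event()   # toolstack waits on this

    def alloc_for_unshare(self):
        """Called on a write fault to a shared page. Returns False when the
        pool is exhausted (the faulting vcpu would then sleep)."""
        with self.lock:
            if self.free_pages == 0:
                return False
            self.free_pages -= 1
            if self.free_pages < self.low_watermark:
                self.refill_needed.set()   # wake the toolstack refiller
            return True

    def replenish(self, pages):
        """Toolstack side: return pages freed elsewhere to the pool."""
        with self.lock:
            self.free_pages += pages
            if self.free_pages >= self.low_watermark:
                self.refill_needed.clear()
```

This matches the point above: a VM can still hit exhaustion between refills
(the pool only bounds how much memory CoW can consume, it does not guarantee
availability), which is why the faulting vcpu must be able to sleep.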
>
>> (rough proposed design re-attached below)
>
> Thanks for that. It describes a sensible-looking hypervisor interface,
> but my question was really: what should xl do, in the presence of
> ballooning, sharing, paging and tmem, to
> - decide whether a VM can be started at all;
> - control those four systems to shuffle memory around; and
> - resolve races sensibly to avoid small VMs deferring large ones.
> (AIUI, xl already has some logic to handle the case of balloon-to-fit.)
>
> The second of those three is the interesting one. It seems to me that
> if the tools can't force all other actors to give up memory (and not
> immediately take it back) then they can't guarantee to be able to start
> a new VM, even with the new reservation hypercalls.
>
> Cheers,
>
> Tim.
>
>>> From: Dan Magenheimer
>>> Sent: Monday, October 01, 2012 2:04 PM
>>> :
>>> :
>>> Back to design brainstorming:
>>>
>>> The way I am thinking about it, the tools need to be involved
>>> to the extent that they would need to communicate to the
>>> hypervisor the following facts (probably via new hypercall):
>>>
>>> X1) I am launching a domain X and it is eventually going to
>>> consume up to a maximum of N MB. Please tell me if
>>> there is sufficient RAM available AND, if so, reserve
>>> it until I tell you I am done. ("AND" implies transactional
>>> semantics)
X1 does not need hypervisor support. We already coexist with a global daemon
that is a single point of failure. I'm not arguing for xenstore to hold onto
these reservations, but a daemon can. Xapi does it that way.
Andres
>>> X2) The launch of X is complete and I will not be requesting
>>> the allocation of any more RAM for it. Please release
>>> the reservation, whether or not I've requested a total
>>> of N MB.
>>>
>>> The calls may be nested or partially ordered, i.e.
>>> X1...Y1...Y2...X2
>>> X1...Y1...X2...Y2
>>> and the hypervisor must be able to deal with this.
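[Editorial note: as Andres observes below, the X1/X2 protocol, including the
nested orderings above, could live entirely in a toolstack daemon rather than
the hypervisor. A minimal Python sketch follows; every name and the MB
granularity are illustrative, not an actual Xen or xl interface.]

```python
import threading

class ReservationManager:
    """Hypothetical daemon-side sketch of the X1/X2 reservation protocol."""

    def __init__(self, total_free_mb):
        self.raw_free_mb = total_free_mb   # RAM actually unallocated
        self.reservations = {}             # domain -> reserved-but-unused MB
        self.lock = threading.Lock()

    def reserve(self, domain, max_mb):
        """X1: atomically check for max_mb of free RAM and reserve it.
        Check-and-reserve is one critical section ("AND" semantics), so two
        racing launches cannot both succeed against the same memory."""
        with self.lock:
            if self._reserved_free_mb() < max_mb:
                return False
            self.reservations[domain] = max_mb
            return True

    def allocate(self, domain, mb):
        """Domain-build path: draw mb from the domain's reservation."""
        with self.lock:
            assert self.reservations.get(domain, 0) >= mb
            self.reservations[domain] -= mb
            self.raw_free_mb -= mb

    def release(self, domain):
        """X2: launch complete; drop whatever reservation remains, whether
        or not the full max_mb was ever allocated."""
        with self.lock:
            self.reservations.pop(domain, None)

    def _reserved_free_mb(self):
        # Caller must hold self.lock.
        return self.raw_free_mb - sum(self.reservations.values())
```

Because reservations are keyed per domain, the nested and partially ordered
sequences (X1...Y1...Y2...X2 and X1...Y1...X2...Y2) fall out naturally.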
>>>
>>> Then there would need to be two "versions" of "xm/xl free".
>>> We can quibble about which should be the default, but
>>> they would be:
>>>
>>> - "xl --reserved free" asks the hypervisor how much RAM
>>> is available taking into account reservations
>>> - "xl --raw free" asks the hypervisor for the instantaneous
>>> amount of RAM unallocated, not counting reservations
>>>
>>> When the tools are not launching a domain (that is there
>>> has been a matching X2 for all X1), the results of the
>>> above "free" queries are always identical.
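[Editorial note: the two views of "free" proposed above differ only by the
outstanding reservations, so when every X1 has had its matching X2 they
coincide. A toy sketch of the accounting, with hypothetical names:]

```python
def free_views(unallocated_mb, reservations_mb):
    """Return (raw, reserved) free-memory views.

    raw      -- instantaneous unallocated RAM, ignoring reservations
               (the proposed "--raw free")
    reserved -- raw minus all outstanding X1 reservations
               (the proposed "--reserved free")"""
    raw = unallocated_mb
    reserved = raw - sum(reservations_mb)
    return raw, reserved
```

With an empty reservation list the two values are identical, which is the
invariant stated above for the quiescent (no launch in flight) case.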
>>>
>>> So, IanJ, does this match up with the design you were thinking
>>> about?
>>>
>>> Thanks,
>>> Dan
>>>
>>> [1] I think the core culprits are (a) the hypervisor accounts for
>>> memory allocation of pages strictly on a first-come-first-served
>>> basis and (b) the tools don't have any form of need-this-much-memory
>>> "transaction" model
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel