Re: [Xen-devel] domain creation vs querying free memory (xend and xl)

[Sorry, forgot to reply-to-all]

On Tue, Oct 16, 2012 at 6:51 PM, Dan Magenheimer
<dan.magenheimer@xxxxxxxxxx> wrote:
> If you reread my last response with the assumption in mind:
>   "tmem == an instance of a memory scheduler == grand vision"
> then does the discussion of the "memory reservation" hypercall
> make more sense?

Sort of. :-) Unfortunately, I think it shows a bit of confusion, which
is perhaps why it was hard to understand.

But let's go back for a minute to the problem at hand: you're afraid
of free memory disappearing between a toolstack checking for the
memory, and the toolstack actually creating the VM.

There are two ways this could happen:

1. Another admin command (perhaps by another administrator) has caused
the memory to go away -- i.e., another admin has called "xl create",
or has instructed a VM to balloon up to a higher amount of memory.

2. One of the self-directed processes in the system has allocated the
memory: a balloon driver has ballooned up, or the swapper has swapped
something in, or the page sharing daemon has had to un-share pages.

In the case of #1, I think the right answer to that is, "Don't do
that."  :-) The admins should co-ordinate with each other about what
to start where; if they both want to use a bit of memory, that's a
human interaction problem, not a technological one.  Alternately, if
we're talking a cloud orchestration layer, the cloud orchestration
should have an idea how much memory is available on each node, and not
allow different users to issue commands which would exceed that.
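
To make the orchestration case concrete, here is a minimal sketch of a
per-node ledger that checks and reserves memory in a single step, so two
competing "create" requests cannot both succeed against the same free
memory.  Everything here (NodeLedger, try_reserve, the MiB units) is a
hypothetical name for illustration; it is not an xl, xend or xapi
interface.

```python
import threading


class NodeLedger:
    """Hypothetical per-node memory ledger kept by an orchestration layer."""

    def __init__(self, total_mib):
        self.total_mib = total_mib
        self.reserved = {}              # VM name -> MiB promised to it
        self._lock = threading.Lock()

    def free_mib(self):
        """Memory not yet promised to any VM."""
        with self._lock:
            return self.total_mib - sum(self.reserved.values())

    def try_reserve(self, vm, mib):
        """Check and reserve under one lock; False means 'do not create'."""
        with self._lock:
            free = self.total_mib - sum(self.reserved.values())
            if vm in self.reserved or mib > free:
                return False
            self.reserved[vm] = mib
            return True

    def release(self, vm):
        """Return a VM's reservation, e.g. after it is destroyed."""
        with self._lock:
            self.reserved.pop(vm, None)
```

The point is only that the check and the reservation happen together;
the actual domain creation can then proceed at its own pace.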

In the case of #2, I think the answer is, "self-directed processes
should not be allowed to consume free memory without permission from
the toolstack".  The pager should not increase the memory footprint of
a VM unless told to, either by an admin or by a memory controller which
has been given authority by an admin.  (Yes, memory controller, not
scheduler -- more on that in another e-mail.)  A VM should be given a
fixed amount of memory above which the balloon driver cannot go.  The
page-sharing daemon should have a small amount set aside to handle
un-sharing requests; but this should be immediately replenished by
other methods (preferably by ballooning a VM down, or if necessary by
swapping pages out).  It should not be able to make arbitrarily large
allocations without permission from the toolstack.
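
As a rough illustration of those two rules, the sketch below models a VM
whose balloon driver cannot go past a fixed ceiling, and a small
un-sharing reserve that is refilled by ballooning a VM down.  The class
and method names are made up for this example, and the model is
deliberately simplified: it does not track which VM receives the
un-shared pages.

```python
class VM:
    """Toy VM with a toolstack-set ceiling on its memory (in MiB)."""

    def __init__(self, name, current_mib, max_mib):
        self.name = name
        self.current_mib = current_mib
        self.max_mib = max_mib          # fixed ceiling, set by the toolstack

    def balloon_up(self, mib):
        """Grow by at most `mib`, never past the ceiling; return MiB granted."""
        granted = max(0, min(mib, self.max_mib - self.current_mib))
        self.current_mib += granted
        return granted

    def balloon_down(self, mib):
        """Shrink by at most `mib`; return MiB actually released."""
        released = max(0, min(mib, self.current_mib))
        self.current_mib -= released
        return released


class SharingReserve:
    """Small pool set aside for un-sharing requests."""

    def __init__(self, size_mib):
        self.size_mib = size_mib
        self.available_mib = size_mib

    def unshare(self, mib):
        """Serve a request from the reserve; refuse anything larger."""
        if mib > self.available_mib:
            return False                # would need toolstack permission
        self.available_mib -= mib
        return True

    def replenish(self, donor):
        """Refill the reserve by ballooning the donor VM down."""
        shortfall = self.size_mib - self.available_mib
        self.available_mib += donor.balloon_down(shortfall)
```

Neither object ever touches host free memory on its own, which is the
property that matters here.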

I was chatting with Konrad yesterday, and he brought up
"self-ballooning" VMs, which apparently voluntarily choose to balloon
down to *below* their toolstack-dictated balloon target, in order to
induce Linux to swap some pages out to tmem, and will then balloon up to
the toolstack-dictated target later.

It seems to me that the Right Thing in this case is for the toolstack
to know that this "free" memory isn't really free -- that if your 2GiB
VM is only using 1.5GiB, you nonetheless don't touch that 0.5GiB,
because you know it may use it later.  This is what xapi does.
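
The arithmetic can be sketched as follows.  This is only an illustration
of the accounting idea, not a description of how xapi implements it; the
function names and the 4GiB host are assumptions for the example.

```python
def naive_free(host_mib, vms):
    """Free memory if we only count what each VM is using right now."""
    return host_mib - sum(in_use for _target, in_use in vms)


def free_for_new_vm(host_mib, vms):
    """Free memory if we count each VM's target as already committed.

    vms is a list of (target_mib, in_use_mib) pairs.
    """
    committed = sum(max(target, in_use) for target, in_use in vms)
    return host_mib - committed


# A 2GiB VM that has self-ballooned down to 1.5GiB on a 4GiB host.
vms = [(2048, 1536)]
print(naive_free(4096, vms))        # 2560: includes 0.5GiB the VM may reclaim
print(free_for_new_vm(4096, vms))   # 2048: safe to hand to a new VM
```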

Alternately, if you don't want to do that accounting, and just want to
use Xen's free memory to determine if you can start a VM, then you
could just have your "self-ballooning" processes *not actually free
the memory*.  That way the free memory would be an accurate
representation of how much memory is actually present on a system.

In all of this discussion, I don't see any reason to bring up tmem at
all (except to note the reason why a VM may balloon down).  It's just
another area to which memory can be allocated (along with Xen or a
domain).  It also should not be allowed to allocate free Xen memory to
itself without being specifically instructed by the toolstack, so it can't
cause the problem you're talking about.

Any system that follows the rules I've set above won't have to worry
about free memory disappearing half-way through domain creation.

I'm not fundamentally opposed to the idea of an "allocate memory to a
VM" hypercall; but the arguments adduced to support this seem
hopelessly confused, which does not bode well for the usefulness or
maintainability of such a hypercall.

