Re: [Xen-devel] [PATCH 4 of 4] xenpaging: initial libxl support

On Mon, 2012-01-09 at 19:21 +0000, Olaf Hering wrote:
> So there is that maxmem= setting to let the guest OS configure itself
> for a given amount of pseudo-physical memory. Then there is a way to cut
> down the guest OS memory usage, both with balloon driver in guest and
> later with PoD.
> Isn't paging a better (or just different) way to control the memory
> usage of a guest OS (it costs disk space in dom0)?

On the contrary, hypervisor swapping is definitely *much worse* than
using a balloon driver.  The balloon driver was an innovation developed
specifically to avoid hypervisor swapping if at all possible[1].  We
need hypervisor swapping as a back-stop for situations where the balloon
driver is non-existent, or can't function immediately for some reason
(e.g., we've been using page-sharing to do memory overcommit and
suddenly have a bunch of pages un-shared); but it should always be a
last resort, and would ideally be mitigated by the balloon driver as
soon as possible.

[1] http://www.waldspurger.org/carl/papers/esx-mem-osdi02.pdf

> If a guest OS is configured with maxmem=4096, but then restricted with
> memory=3072 in the next line, why is maxmem= there in the first place?

Because for HVM guests at least, the guest OS will never recognize more
memory than was reported in the e820 map at boot.  So if you boot with
maxmem=3072, the VM will *never* be able to see more than 3072 MiB
of RAM.  If you want to start a VM with 3072 MiB, but want the
flexibility of allowing the VM to use up to 4096 MiB at some point in
the future, you need to have 4096 MiB in the e820 map.
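
To make the relationship concrete, here is a minimal sketch of the two
settings, assuming memory= is what the guest may use at start and maxmem=
is what the e820 map reports at boot. The GuestMemory class and its
method names are hypothetical, for illustration only; they are not part
of libxl or xl.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GuestMemory:
    """Hypothetical model of the memory= / maxmem= pair (values in MiB)."""

    memory_mib: int  # memory=: what the guest may actually use at start
    maxmem_mib: int  # maxmem=: what the e820 map reports at boot

    def __post_init__(self):
        # Assumption: a start size larger than the e820 size makes no sense.
        if self.memory_mib > self.maxmem_mib:
            raise ValueError("memory= must not exceed maxmem=")

    @property
    def headroom_mib(self) -> int:
        """Memory the guest knows about but does not hold at start."""
        return self.maxmem_mib - self.memory_mib

    def can_grow_to(self, target_mib: int) -> bool:
        """The guest can never see more than the e820 map reported."""
        return target_mib <= self.maxmem_mib


flexible = GuestMemory(memory_mib=3072, maxmem_mib=4096)
fixed = GuestMemory(memory_mib=3072, maxmem_mib=3072)
print(flexible.headroom_mib, flexible.can_grow_to(4096))  # 1024 True
print(fixed.headroom_mib, fixed.can_grow_to(4096))        # 0 False
```

Both guests start with 3072 MiB; only the first can ever be grown to
4096 MiB without a reboot.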

> Would it be clearer to say: The guest OS has a certain workload which
> requires 3072MB. But maybe at some point the guest needs the full
> 4096MB, then it can access all of it at the cost of some IO due to
> swapping in dom0.

The very best thing is if the guest does its own swapping.  If its
working set is 4096MiB, but its available memory is only 3072MiB, it's
better to tell the guest it only has 3072MiB to work with, so it can do
the swapping optimally.

> I think the balloon driver in the guest is not really needed anymore, it
> could just be there and do nothing. IF there is physical memory to
> release to the host, the pager can do it on behalf of the balloon
> driver.

Hopefully it's clear that I disagree with this completely.

> What if the config format is like this:
> Do things as they were done until now (PoD + balloon driver):
>   memory=3072
>   maxmem=4096
>   paging=0 (or not specified at all)
> Do things with pager instead of balloon driver and/or PoD:
>   memory=3072
>   maxmem=4096
>   paging=1, or xenpaging=1
>   xenpaging_extra=[ '-f', '/path/to/pagefile_guestname' ] (optional)

Except that this makes paging and ballooning mutually exclusive.  What
we want is to make them work together -- to have paging as a back-up
when ballooning fails (or isn't fast enough).
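
As a rough sketch of what "working together" could mean: reclaim as much
as possible through the balloon driver, and use the pager only for the
shortfall.  The function name and parameters below are hypothetical and
describe a policy, not any existing libxl or xenpaging interface.

```python
def plan_reclaim(needed_mib: int, balloon_can_free_mib: int) -> tuple[int, int]:
    """Split a reclaim request between the balloon driver and the pager.

    needed_mib           -- how much memory must be taken back from the guest
    balloon_can_free_mib -- how much the balloon driver can release right now
                            (0 if the driver is absent or not responding)

    Returns (via_balloon_mib, via_paging_mib).
    """
    if needed_mib < 0 or balloon_can_free_mib < 0:
        raise ValueError("sizes must be non-negative")
    # Prefer the balloon: the guest picks which pages to give up.
    via_balloon = min(needed_mib, balloon_can_free_mib)
    # Paging is only the backstop for whatever the balloon cannot cover.
    via_paging = needed_mib - via_balloon
    return via_balloon, via_paging


print(plan_reclaim(1024, 1024))  # (1024, 0): balloon covers everything
print(plan_reclaim(1024, 256))   # (256, 768): pager covers the shortfall
print(plan_reclaim(1024, 0))     # (0, 1024): no balloon driver available
```

Once the balloon driver catches up, the same split can be recomputed so
that the paged-out amount shrinks back toward zero.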

We'd also like to experiment with having a special-case of paging
replace PoD; in that case, we need to start with this special-case
paging and then transition into ballooning.

It may be that we don't have time to make them work together before the
4.2 release; in that case, we may need to make them mutually exclusive
for that release, to be fixed up in 4.3.  But if we can make them work
together by 4.2, that would be the best; and in any case, we need to
make sure we're planning for them to work together, and minimize the
interface changes when we do.

> The builder could create some sort of PoD for a paged guest so
> that during startup only the amount of memory= needs to be allocated.
> This needs to be implemented; right now a starting guest needs the full
> amount of memory until the pager starts to page-out pages.

Yes, the builder needs to be able to start a guest with pages pre-paged
out, for the same reason we introduced PoD: that is, if you page a guest
from 4096MiB down to 3072MiB, and then reboot the guest, you may only
have 3072MiB available.  So if you want maxmem=4096 still, you need to
start with some pages "pre-paged" out.  We need that mechanism for
robustness anyway; we can then experiment with using it to replace PoD.
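
For a sense of scale, the number of pages the builder would have to
start "pre-paged" out is just the gap between maxmem= and memory=.  This
is a back-of-the-envelope sketch assuming 4 KiB pages; the function name
is hypothetical and not an existing builder interface.

```python
PAGE_SIZE_BYTES = 4096   # assumption: 4 KiB pages
MIB = 1024 * 1024


def pages_to_prepage(memory_mib: int, maxmem_mib: int,
                     page_size: int = PAGE_SIZE_BYTES) -> int:
    """Pages that must begin life paged out so that only memory_mib of
    host RAM is needed while the guest still sees maxmem_mib."""
    if memory_mib > maxmem_mib:
        raise ValueError("memory= must not exceed maxmem=")
    return (maxmem_mib - memory_mib) * MIB // page_size


print(pages_to_prepage(3072, 4096))  # 262144 pages (1024 MiB)
print(pages_to_prepage(4096, 4096))  # 0: nothing to pre-page
```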

