
Re: [Xen-devel] Proposed XENMEM_claim_pages hypercall: Analysis of problem and alternate solutions



On Wed, Jan 23, 2013 at 06:36:06PM +0000, Dave Scott wrote:
> Hi,
> 
> > On Mon, Jan 14, 2013 at 06:28:48PM +0000, George Dunlap wrote:
> > > I'm not fluent in OCaml either, I'm mainly going from memory based on
> > > the discussions I had with the author when it was being designed, as
> > > well as discussions with the xapi team when dealing with bugs at later
> > > points.
> 
> Konrad Rzeszutek Wilk replied:
> 
> > I was looking at xen-api/ocaml/xenops/squeeze.ml, just reading the
> > comments and feebly trying to understand how the OCaml code works.
> > As best I could understand, it takes various measurements, makes the
> > appropriate hypercalls and waits for everything to stabilize before allowing
> > the guest to start.
> > 
> > N.B: With tmem, the 'stabilization' might never happen.
> 
> In case it's useful I re-uploaded the squeezed design doc to the xen wiki:
> 
> http://wiki.xen.org/wiki/File:Squeezed.pdf
> 
> I think it got lost during the conversion from the old wiki to the new wiki.
> 
> Hopefully the doc gives a better "big picture" view than the code itself :-)
> 
> The quick summary is that squeezed tries to "balance" memory between the VMs 
> on the host by manipulating their balloon targets. When a VM is to be 
> started, xapi asks squeezed to "reserve" memory. Squeezed then lowers the 
> balloon targets (and sets maxmem as an absolute limit on allocation), waits 
> for something to happen, possibly concludes that some guests are being 
> "uncooperative", and asks the "cooperative" ones to balloon down some more, 
> and so on. It works, but the problems I would highlight are:
> 

How do you know whether the cooperative guests _can_ balloon further down? As
in, what if they are OK doing it but end up OOM-ing? That can happen right now
with Linux if you set the memory target too low.

> 0. since a VM which refuses to balloon down causes other VMs to be ballooned 
> harder, we needed a good way to signal misbehavior to the user
> 
> 1. freeing memory by ballooning can be quite slow (especially on Windows)
> 
> 2. to actually free 'x' MiB we have to know what number to set the 
> memory/target to. Experimentally it seems that, when an HVM guest has 
> finished ballooning, the domain's total_pages will equal the memory/target + 
> a constant offset. Squeezed performs an initial calibration but it is 
> potentially quite fragile.
> 
> 3. we want to keep as much memory in use as possible (i.e. allocated to guests) 
> but allocating domain structures often failed due to lack of (low? 
> contiguous?) memory. To work around this we balloon first and domain create 
> second, but this required us to track memory 'reservations' independently of 
> the domains so that we wouldn't leak over a crash. This is a bit complicated 
> but ok because all memory allocations are handled by squeezed.
> 
> 4. squeezed's memory management will clearly not work very well if some 
> degree of page sharing is in use :-)

Right, and 'tmem' is in the same "boat", so to speak.
> 
> 5. (more of a bug) the code for "balancing" would occasionally oscillate, 
> moving pages between VMs every few seconds. This caused quite a lot of log 
> spam.

Thank you for writing it up. I think I have a better understanding of it now.
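
To make point 2 and the OOM question above concrete, here is a minimal
sketch in Python of the arithmetic as I read it. It is not taken from
squeeze.ml. The names (Guest, floor_kib, plan_targets, missed_target), the
proportional split, and the per-guest floor are all assumptions made for
illustration. The only part that comes from the description above is the
relation total_pages = memory/target + constant offset.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Guest:
    name: str
    total_kib: int   # memory the hypervisor currently reports for the domain
    offset_kib: int  # calibrated constant: total = memory/target + offset
    floor_kib: int   # hypothetical lowest total we dare push the guest to


def calibrate_offset(total_kib: int, target_kib: int) -> int:
    """Offset observed once a guest has finished ballooning to target_kib."""
    return total_kib - target_kib


def plan_targets(guests: List[Guest], needed_kib: int) -> Optional[Dict[str, int]]:
    """Return a new memory/target per guest that frees needed_kib in total.

    The cut is split in proportion to each guest's headroom above its floor.
    Returns None if the guests cannot free that much without going below
    their floors (the case where a guest might otherwise OOM).
    """
    if needed_kib <= 0:
        return {g.name: g.total_kib - g.offset_kib for g in guests}

    headroom = {g.name: max(0, g.total_kib - g.floor_kib) for g in guests}
    available = sum(headroom.values())
    if available < needed_kib:
        return None

    # Proportional share, rounded down.
    cuts = {name: needed_kib * h // available for name, h in headroom.items()}

    # Hand the rounding remainder to guests that still have headroom left.
    shortfall = needed_kib - sum(cuts.values())
    for name, h in headroom.items():
        extra = min(shortfall, h - cuts[name])
        cuts[name] += extra
        shortfall -= extra

    # Convert the desired total back into a memory/target value.
    return {g.name: g.total_kib - cuts[g.name] - g.offset_kib for g in guests}


def missed_target(guest: Guest, target_kib: int, tolerance_kib: int) -> bool:
    """True if the guest's total is still not within tolerance of its target."""
    expected_total = target_kib + guest.offset_kib
    return abs(guest.total_kib - expected_total) > tolerance_kib
```

A guest for which missed_target() still returns True after some timeout
would be the "uncooperative" one in Dave's terms. The floor is only a guess
at what the guest can survive, which is exactly the part that is hard to
know from outside the guest.
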
> 
> HTH,
> 
> Dave
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxx
> http://lists.xen.org/xen-devel
> 
