
Re: [Xen-devel] Proposed XENMEM_claim_pages hypercall: Analysis of problem and alternate solutions

> From: Tim Deegan [mailto:tim@xxxxxxx]
> Subject: Re: [Xen-devel] Proposed XENMEM_claim_pages hypercall: Analysis of problem and alternate solutions
> Hi,

Happy New Year Tim, and thanks for trying to add some clarity to the
discussion.

> The question of starting VMs in parallel seems like a red herring to me:
> - TTBOMK Xapi already can start VMs in parallel.  Since it knows what
>   constraints it's placed on existing VMs and what VMs it's currently
>   building, there is nothing stopping it.  Indeed, AFAICS any toolstack
>   that can guarantee enough RAM to build one VM at a time could do the
>   same for multiple parallel builds with a bit of bookkeeping.
> - Dan's stated problem (failure during VM build in the presence of
>   unconstrained guest-controlled allocations) happens even if there is
>   only one VM being created.

Agreed.  The parallel VM discussion was simply trying to point out
that races can occur even without guest-controlled allocations,
so it is distracting from the actual issue (which is, according to
Wikipedia, one of the definitions of a "red herring").

(As an aside, your use of the word "unconstrained" is a red herring. ;-)
> > > > Andres Lagar-Cavilla says "... this is because of shortcomings in the
> > > > [Xen] mm layer and its interaction with wait queues, documented
> > > > elsewhere."  In other words, this batching proposal requires
> > > > significant changes to the hypervisor, which I think we
> > > > all agreed we were trying to avoid.
> > >
> > > Let me nip this at the bud. I use page sharing and other techniques in an 
> > > environment that doesn't
> use Citrix's DMC, nor is focused only on proprietary kernels...
> >
> > I believe Dan is saying that it is not enabled by default.
> > Meaning it does not get executed by /etc/init.d/xencommons and
> > as such it never gets run (or does it now?) - unless one knows
> > about it - or it is enabled by default in a product. But perhaps
> > we are both mistaken? Is it enabled by default now on xen-unstable?
> I think the point Dan was trying to make is that if you use page-sharing
> to do overcommit, you can end up with the same problem that self-balloon
> has: guest activity might consume all your RAM while you're trying to
> build a new VM.
> That could be fixed by a 'further hypervisor change' (constraining the
> total amount of free memory that CoW unsharing can consume).  I suspect
> that it can also be resolved by using d->max_pages on each shared-memory
> VM to put a limit on how much memory they can (severally) consume.

(I will respond to this in the context of Andres' response shortly...)
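Tim's suggestion of using d->max_pages to bound each shared-memory VM can be
sketched in miniature. The field names below echo Xen's struct domain
(tot_pages, max_pages), but this is an illustrative model of the accounting
check, not hypervisor source:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of a domain's page accounting.  Field names follow Xen's
 * struct domain, but the code is a simplified illustration. */
struct domain {
    uint64_t tot_pages;  /* pages currently allocated to the domain */
    uint64_t max_pages;  /* hard cap set by the toolstack */
};

/* Refuse any allocation (including a CoW unshare) that would push the
 * domain past its cap, so shared-memory guests cannot severally
 * consume all of the host's free memory. */
static bool try_allocate_pages(struct domain *d, uint64_t nr)
{
    if (d->tot_pages + nr > d->max_pages)
        return false;    /* over cap: the allocation (or unshare) fails */
    d->tot_pages += nr;
    return true;
}
```

The point of the sketch is that the cap is enforced per domain at allocation
time, which is why it bounds CoW unsharing without any further hypervisor
change beyond setting max_pages appropriately.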

> > Just as a summary as this is getting to be a long thread - my
> > understanding has been that the hypervisor is supposed to be
> > toolstack-independent.
> Let's keep calm.  If people were arguing "xl (or xapi) doesn't need this
> so we shouldn't do it"

Well Tim, I think this is approximately what some people ARE arguing.
AFAICT, "people" _are_ arguing that "the toolstack" must have knowledge
of and control over all memory allocation.  Since the primary toolstack
is "xl", even though xl does not currently have this knowledge/control
(and, IMHO, never can or should), I think people _are_ arguing:

"xl (or xapi) SHOULDn't need this so we shouldn't do it".

> that would certainly be wrong, but I don't think
> that's the case.  At least I certainly hope not!

I agree that would certainly be wrong, but it seems to be happening
anyway. :-(  Indeed, some are saying that we should disable existing
working functionality (eg. in-guest ballooning) so that the toolstack
CAN have complete knowledge and control.

So let me check, Tim, do you agree that some entity, either the toolstack
or the hypervisor, must have knowledge of and control over all memory
allocation, or the allocation race condition is present?

> The discussion ought to be around the actual problem, which is (as far
> as I can see) that in a system where guests are ballooning without
> limits, VM creation failure can happen after a long delay.  In
> particular it is the delay that is the problem, rather than the failure.
> Some solutions that have been proposed so far:
>  - don't do that, it's silly (possibly true but not helpful);
>  - this reservation hypercall, to pull the failure forward;
>  - make allocation faster to avoid the delay (a good idea anyway,
>    but can it be made fast enough?);
>  - use max_pages or similar to stop other VMs using all of RAM.

Good summary.  So, would you agree that the solution selection
comes down to: "Can max_pages or similar be used effectively to
stop other VMs using all of RAM? If so, who is implementing that?
Else the reservation hypercall is a good solution." ?

> My own position remains that I can live with the reservation hypercall,
> as long as it's properly done - including handling PV 32-bit and PV
> superpage guests.

Tim, would you at least agree that "properly" is a red herring?
Solving 100% of a problem is clearly preferable and I would gladly
change my loyalty to someone else's 100% solution.  But solving 98%*
of a problem while not making the other 2% any worse is not "improper",
just IMHO sensible engineering.

* I'm approximating the total number of PV 32-bit and PV superpage
guests as 2%.  Substitute a different number if you like, but
the number is certainly getting smaller over time, not growing.

Tim, thanks again for your useful input.


Xen-devel mailing list


