
Re: [Xen-devel] [RFC/PATCH v2] XENMEM_claim_pages (subop of existing) hypercall



> From: Ian Campbell [mailto:Ian.Campbell@xxxxxxxxxx]
> Subject: Re: [Xen-devel] [RFC/PATCH v2] XENMEM_claim_pages (subop of 
> existing) hypercall

Hi Ian --

> On Thu, 2012-11-15 at 19:15 +0000, Dan Magenheimer wrote:
> > > From: Ian Campbell [mailto:Ian.Campbell@xxxxxxxxxx]
> > > Subject: Re: [Xen-devel] [RFC/PATCH v2] XENMEM_claim_pages (subop of 
> > > existing) hypercall
> > >
> > > On Wed, 2012-11-14 at 23:55 +0000, Dan Magenheimer wrote:
> > >
> > > How does this interact with the feature which lets you create PV guests
> > > using only superpages? I believe this is something Oracle added and still
> > > maintains (Dave added to the CC).
> > >
> > > Also doesn't this fail to make any sort of guarantee if you are building
> > > a 32 bit PV guest, since they require memory under a certain host
> > > address limit (160GB IIRC)?
> > >
> > > Basically neither of those cases benefit from this hypercall at all? I
> > > don't know what the usecase for the superpage PV guests is (but I
> > > suppose it is important to Oracle, at least). The 32 bit PV guest use
> > > case is still a pretty significant one which I think ought to be
> > > handled, otherwise any system built on top of this functionality will
> > > only work reliably in a subset of cases.
> >
> > The claim mechanism will not benefit PV superpages.  IIUC, the
> > design of the PV superpages will cause a domain launch to fail
> > if it requests 10000 superpages but Xen can only successfully
> > allocate 9999.  That's already very fragile.  Since the only
> > way currently to find out if there are 10000 superpages available
> > is to start allocating them, claim can't really help.
> 
> Well, you could always account the number of free superpages in the
> system, which would allow you to cover this case too.

Because of the nature of the buddy allocator (i.e. 4MB chunks are
kept separately from 2MB chunks even though a 4MB page contains
two 2MB pages), I don't think this is trivial.  Also, once
memory is fragmented, creation of a PV-superpages domain will fail
(relatively quickly) whether claiming first or not.  The situation
could be improved by adding defrag code to Xen (or "compaction"
as Linux calls it), which I'd be interested in pursuing, but that
is also non-trivial.  Last, once PVH is an available option,
community (and Oracle's) interest in PV+superpages may fade away
quickly.
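
As a rough illustration of why a plain free-page count says little about
superpage availability, here is a toy model of per-order free lists. It is
a sketch under stated assumptions, not Xen code: the `free_lists` dict and
both function names are hypothetical, and the only hard fact used is that a
2MB superpage is 512 pages of 4KB (order 9) on x86.

```python
# Toy model (hypothetical, not Xen's data structures):
# free_lists[order] = number of free chunks of 2**order pages.
SUPERPAGE_ORDER = 9  # 2 MiB superpage = 512 pages of 4 KiB on x86


def free_pages(free_lists):
    """Total free 4 KiB pages, regardless of how fragmented they are."""
    return sum(count << order for order, count in free_lists.items())


def free_superpages(free_lists):
    """Superpages that could be carved out of the free chunks right now.

    A chunk of order k >= SUPERPAGE_ORDER holds 2**(k - SUPERPAGE_ORDER)
    superpages; smaller chunks contribute none, however many there are.
    """
    return sum(count << (order - SUPERPAGE_ORDER)
               for order, count in free_lists.items()
               if order >= SUPERPAGE_ORDER)


# A fragmented heap: plenty of small chunks, few large ones.
fragmented = {0: 4096, 8: 16, 9: 3, 10: 1}

print(free_pages(fragmented) // 512)   # 21 superpages' worth of free memory
print(free_superpages(fragmented))     # but only 5 usable as superpages
```

Computing the count from a snapshot like this is the easy part. Keeping it
correct as chunks split and merge on every allocation and free is where the
non-trivial work would lie.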

So I agree that superpages are worth adding to a to-do list
for future claim work, but I see that as a "would be nice to add
later" feature, not a showstopper.

> > For 32 bit PV guests, note that a claim does NOT have to be
> > staked prior to any allocation.  So if a toolstack needs
> > to enforce that some portion of allocated memory is under
> > a host address limit, it can (attempt to) allocate those pages,
> > then stake a claim for the rest.
> 
> For 32 bit PV guests this is *all* of the pages needed to build any 32
> bit PV guest.

I am ignorant of the details here, but this is _all_ of the
pages only on machines with >160GiB of physical memory, correct?
(If incorrect, just ignore the 160GiB parts below :-)
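
The "allocate the constrained pages first, then stake a claim for the rest"
ordering can be sketched with a toy accounting model. Everything here is
hypothetical: `ToyHost`, `build_domain` and their methods are illustrative
names, and the rule that a claim succeeds only if free pages minus
outstanding claims cover it is an assumption about how a claim might be
accounted, not something stated in this thread.

```python
class ToyHost:
    """Hypothetical stand-in for the hypervisor's page accounting."""

    def __init__(self, low_free, high_free):
        self.low_free = low_free    # free pages below the address limit
        self.high_free = high_free  # free pages above it
        self.claimed = 0            # pages promised to claims, not yet allocated

    def unclaimed(self):
        """Free pages not already promised to some other claim."""
        return self.low_free + self.high_free - self.claimed

    def allocate_low(self, n):
        """Allocate n pages below the limit, all or nothing."""
        if n > self.low_free or n > self.unclaimed():
            return False
        self.low_free -= n
        return True

    def claim(self, n):
        """Stake a claim for n pages; this only updates a counter."""
        if n > self.unclaimed():
            return False
        self.claimed += n
        return True


def build_domain(host, low_pages, other_pages):
    # Step 1: really allocate the pages that must sit under the limit.
    if not host.allocate_low(low_pages):
        return "failed: not enough memory under the address limit"
    # Step 2: claim the remainder, so a shortfall shows up immediately.
    # (A real toolstack would also release the step-1 pages on failure.)
    if not host.claim(other_pages):
        return "failed fast: claim refused"
    return "claim staked"


host = ToyHost(low_free=100, high_free=1000)
print(build_domain(host, 50, 900))   # claim staked
print(build_domain(host, 10, 200))   # failed fast: claim refused
```

The second build is refused as soon as the claim is attempted, rather than
after most of its pages have been allocated one by one.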

> > Or just not use the claim mechanism at all.
> 
> So your use case has no requirement to be able to start 32 bit domains?
> Or whatever requirement you have that leads to the claim mechanism
> somehow doesn't apply to those sorts of guests? If not then why not?

Claim doesn't prevent 32-bit PV domains from starting; it just doesn't
help them avoid "failing slowly" [on machines with >160GiB of RAM].

32-bit PV domains are highly likely to be legacy domains with
much smaller RAM requirements [always <= 64GiB?] so "failing slowly"
is much less of a concern.

32-bit PV domains are on their way to obsolescence so I am (and
I believe Oracle would be) quite happy documenting that launching
them might result in conditions not seen when creating other domain
types... especially when other failure conditions already exist for
32-bit PV domains.

> As it stands it seems that any toolstack which wants to use claim would
> still have to cope with the fact that a potentially significant
> proportion of guests may still fail to build even after a claim has been
> successfully staked.

The problem being addressed here is "slow failure" when creating
a domain and this can currently occur with 100% of domain creations.
The proposed claim hypercall solves this problem for:

- All HVM domains and all future PVH domains
- All 64-bit PV domains [when system RAM < 5TiB]
- [All 32-bit PV domains when system RAM < 160GiB]

and, as you've observed, currently doesn't solve the problem for

- All 32-bit PV domains [except when system RAM < 160GiB]
- PV domains with superpages=1 that "almost succeed" but fail

I see that as a pretty good start and, with the flags argument,
the proposal even has room for future extensions should they
be needed.
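
The two lists above amount to a small decision table, which can be written
down as a predicate. This is only a restatement of the lists, with the 5TiB
and 160GiB thresholds taken as given in this message; the function name and
the domain-type strings are made up for illustration.

```python
GIB = 1 << 30
TIB = 1 << 40


def claim_avoids_slow_failure(domain_type, host_ram_bytes, superpages=False):
    """True if, per the lists above, a successful claim rules out slow failure.

    domain_type is one of the hypothetical labels "hvm", "pvh", "pv64", "pv32".
    """
    if domain_type in ("hvm", "pvh"):
        return True                       # all HVM and future PVH domains
    if domain_type not in ("pv64", "pv32"):
        raise ValueError(f"unknown domain type: {domain_type!r}")
    if superpages:
        return False                      # PV with superpages=1 is not covered
    if domain_type == "pv64":
        return host_ram_bytes < 5 * TIB   # 64-bit PV, bracketed caveat
    return host_ram_bytes < 160 * GIB     # 32-bit PV, bracketed caveat


print(claim_avoids_slow_failure("hvm", 8 * TIB))     # True
print(claim_avoids_slow_failure("pv32", 64 * GIB))   # True
print(claim_avoids_slow_failure("pv32", 256 * GIB))  # False
```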

> Even if these shortcomings are acceptable in your specific scenario I
> don't see why we should be satisfied with a solution which is not more
> generally applicable.

I would welcome a solution which is more generally applicable
if you've got one.  Otherwise, I am very satisfied with this one.
If you'd like confirmation from Oracle management that they
are satisfied as well, I can encourage them to reply.

Dan
