
Re: [Xen-devel] [PATCH] x86/NUMA: make init_node_heap() respect Xen heap limit



On Fri, 2015-09-04 at 02:39 -0600, Jan Beulich wrote:
> On 04.09.15 at 10:27, <ian.campbell@xxxxxxxxxx> wrote:
> > On Fri, 2015-09-04 at 01:37 -0600, Jan Beulich wrote:
> > > On 03.09.15 at 22:58, <julien.grall@xxxxxxxxxx> wrote:
> > > > And found why! The last xenheap_bits changed from 39 to 38.
> > > > 
> > > > On x-gene the last max mfn used for the xenheap is 0x4400000; with
> > > > the new computation, it gives 38 bits, which doesn't cover the
> > > > entire xenheap range.
> > > > 
> > > > I have written a patch to fix the issue, but I'm not sure that it's
> > > > the right thing to do (see below).
> > > 
> > > No, this is wrong: xenheap_bits isn't meant to cover all RAM, it is
> > > meant to indicate how much (as an exact power of 2) of RAM is
> > > always accessible. I'm surprised anyway that ARM64 uses
> > > xenheap_max_mfn() (and even unconditionally); I thought all RAM
> > > is always accessible there. The invocation is off by one now in any
> > > case, but rather than correcting it that way the proper fix likely
> > > will involve more than just this simple an adjustment, as it looks
> > > like its use was wrong from the beginning (commit 5263507b1b).
> > 
> > What is the correct thing which arm64 should be doing, given that today
> > all RAM is indeed always mapped? Not call xenheap_max_mfn at all,
> > therefore leaving xenheap_bits == 0?
> 
> Yes, if all memory is always accessible, then no limit should be
> enforced at all, i.e. the call be dropped.
> 
> Just for the record - even with the call in place it escapes me why
> this causes any problem: All it tells the allocator is to not hand out
> pages above a certain limit when asked for Xen heap pages. Sadly
> I wasn't able to interpret the dumped information (after the crash)
> in a way telling me what actually went wrong.

"create_xen_entries: L2 failed" tells me, through code inspection rather
than usefulness of the logging, that alloc_xenheap_page has returned NULL.

I think this is simply because all RAM on Mustang is at physical address
128GB onwards or so, IOW the off-by-one error has resulted in the xenheap
appearing to be empty, since all RAM is above the limit.
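
To make the arithmetic concrete, here is a rough sketch of the two ways of
turning a max mfn into a bit count. The helper names, the two formulas and
PAGE_SHIFT == 12 are my assumptions, chosen to reproduce the 39 -> 38 change
reported above; they are not copied from the Xen source. The RAM start used
in the usage note is also hypothetical.

```c
#define PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* 1-based position of the highest set bit, 0 for x == 0 (hypothetical helper). */
static unsigned int fls_ull(unsigned long long x)
{
    unsigned int n = 0;

    while (x) {
        n++;
        x >>= 1;
    }
    return n;
}

/* Smallest bit count such that 2^bits covers every address up to mfn. */
static unsigned int bits_covering(unsigned long long mfn)
{
    return fls_ull(mfn) + PAGE_SHIFT;
}

/* Largest bit count such that every page below 2^bits is at or below mfn. */
static unsigned int bits_always_accessible(unsigned long long mfn)
{
    return fls_ull(mfn + 1) - 1 + PAGE_SHIFT;
}

/* Non-zero if RAM starting at ram_start_mfn has any page below 2^bits. */
static int heap_has_pages(unsigned long long ram_start_mfn, unsigned int bits)
{
    return ram_start_mfn < (1ULL << (bits - PAGE_SHIFT));
}
```

For mfn 0x4400000, bits_covering() gives 39 and bits_always_accessible()
gives 38. If RAM happened to start at mfn 0x4000000 (a made-up value for
illustration), heap_has_pages() would be true with 39 bits and false with
38, which is the "empty xenheap" case. With a RAM start below that mfn the
38-bit limit would still leave some pages, so the outcome depends on where
RAM really begins.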

> IOW - I'm afraid
> there's a second problem here that's going to be hidden again
> when the call gets removed.
> 
> Jan
> 
