
[Xen-devel] Re: An issue in xen_limit_pages_to_max_mfn() in Xenlinux Ver. 2.6.18



On Fri, Nov 11, 2011 at 08:18:57AM +0000, Jan Beulich wrote:
> >>> On 10.11.11 at 23:53, Daniel Kiper <dkiper@xxxxxxxxxxxx> wrote:
> > On Thu, Nov 10, 2011 at 10:27:57AM +0000, Jan Beulich wrote:
> >> >>> On 10.11.11 at 09:58, Daniel Kiper <dkiper@xxxxxxxxxxxx> wrote:
> >> > Hi Jan,
> >> >
> >> > During work on kexec/kdump for Xen domU I found that
> >> > xen_limit_pages_to_max_mfn() registers an undo_limit_pages()
> >> > destructor which breaks __free_pages(). When __free_pages()
> >> > is called, put_page_testzero() runs at the beginning of that
> >> > function and decrements the page count for the given page.
> >> > Later the undo_limit_pages() destructor is called, which once
> >> > again calls __free_pages(), and in consequence put_page_testzero()
> >> > fails (BUG_ON() is hit) because the page count is already 0.
> >>
> >> Seems like (on newer kernels, where this is a VM_BUG_ON()) this was
> >> never hit on a configuration with CONFIG_DEBUG_VM, and on the older
> >> kernels the function was never used for memory that would get freed
> >> later (only kexec uses it there).
> >>
> >> > It could be easily fixed; however, after reviewing
> >> > xen_limit_pages_to_max_mfn() I could not find any good reason
> >> > why the undo_limit_pages() destructor is registered. Maybe it
> >> > could be removed altogether, because all pages are freed when
> >> > __free_pages() is called and in this case we do not care where
> >> > they live. However, maybe I missed something important.
> >>
> >> It does matter - otherwise, over time, we could exhaust memory
> >> below a certain boundary in Xen.
> >
> > Ahhhh... I thought about that once; however, I could not find it
> > in the code. It is not obvious because there is no comment. As
>
> The name of the function is - imo - sufficient to describe its purpose.

Yes, it is. However, if you do not know the behavior of Xen's allocator,
it is quite difficult to find out where exactly this happens.

> > As I understand it, HYPERVISOR_memory_op(XENMEM_exchange, &exchange)
> > prefers to allocate pages from Xen which live at higher addresses (at
> > the end of physical memory) if exchange.out.address_bits is 0. This
> > behavior leads to a situation where pages with higher addresses are
> > preferred over those with lower addresses. In consequence pages are
> > moved from lower MFNs to higher MFNs.
> >
> > Please correct me if I am wrong.
>
> That's correct (with the slight adjustment that higher addresses are
> always preferred by Xen's allocator, just that address_bits being non-
> zero sets an upper bound).

Thanks for that explanation.

> >> So I think we need to add an init_page_count() call to undo_limit_pages().
> >
> > Hmmm... I do not think that is a good solution here. I suppose you
> > found this usage in the balloon driver; however, that is not the same
> > case. The balloon driver initializes the page structure for pages
> > allocated from Xen by calling init_page_count() (inter alia). That is,
> > it is used more or less in accordance with the comment in
> > include/linux/mm.h ("Setup the page count before being freed into the
> > page allocator for the first time (boot or memory hotplug)"). However,
> > I think we should not simply reset the page count in
> > undo_limit_pages(). It could lead to unexpected results.
>
> Why? The page count *must* be zero at this point (or it would be an
> error that this function was reached).

Yes, it is. If the page count != 0 then undo_limit_pages() is not reached.
I had missed one detail in put_page_testzero().

> > Now I am thinking about two solutions to that problem:
> >   1) We could simply remove the undo_limit_pages() destructor and move
> >      responsibility for relocating pages before freeing them to the
> >      caller. I prefer that solution because it gives more flexibility
> >      to the developer, and a similar solution must/will be used in the
> >      latest kernels (the PageForeign mechanism is not available there,
> >      or I missed something).
>
> I don't see what flexibility you have in mind. To me it's just an extra
> burden, namely because the caller would have to track for which
> pages it may have (successfully) called xen_limit_pages_...(). Just
> look at the uses of the function in a recent SLES or openSUSE kernel...

I would compare xen_limit_pages_to_max_mfn() calls to the analogous
SetPageReserved()/ClearPageReserved(). Those set a property of a given
page: if you want to reserve a page after allocation, you call
SetPageReserved(), and later you release it by calling
ClearPageReserved() (e.g. before freeing). The same idea should apply to
xen_limit_pages_to_max_mfn(): if required, the user moves pages to lower
MFNs by calling xen_limit_pages_to_max_mfn(), and later, if needed,
moves the relevant pages back to higher MFNs before freeing them. I
think this should not be done automagically as it is now. The Reserved
bit is not cleared by __free_pages(); if it was set by the user, it is
the user's role to clear it before freeing the relevant pages. You can
find a good example in kernel/kexec.c:kimage_alloc_pages() and
kernel/kexec.c:kimage_free_pages().

Additionally, if we do it in the way I described above, it will be in
line with what has to be done in this case in upstream kernels. Please
see below for more details.

> >   2) We could prepare new function, e.g.
> >
> >      fastcall void __free_pages_core(struct page *page, unsigned int order)
> >      {
> >              if (order == 0)
> >                      free_hot_page(page);
> >              else
> >                      __free_pages_ok(page, order);
> >      }
> >
> >      and call it from the undo_limit_pages() destructor instead of
> >      __free_pages().
> >
> > What do you think about that?
>
> If you can get upstream to accept such a function - fine. But without
> that, adding Xen-specific code in core Xen-unspecific files is generally
> only a last resort, even in XenoLinux, and requires a good reason and
> no alternative.

This function is not needed in upstream because the PageForeign
mechanism is not available there. It means that the user must call
xen_destroy_contiguous_region() (xen_limit_pages_to_max_mfn() is not
available yet) if he/she wants to move pages back to higher MFNs
(e.g. before freeing them).

Daniel

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel