
Re: [Xen-ia64-devel][PATCH][RFC] Task: support huge page RE: [Xen-ia64-devel] Xen/IA64 Healthiness Report -Cset#11460



Hi Anthony.

On Wed, Sep 27, 2006 at 09:56:11PM +0800, Xu, Anthony wrote:

> Currently, memory allocated for domU and VTI-domain is 16K contiguous.
> That means all huge page TLB entries must be broken into 16K TLB
> entries. This definitely impacts overall performance, for instance, in
> linux, region 7 is using 16M page size. IA64 is supposed to be used at
> high end server, many services running on IA64 are using huge page, like
> Oracle is using 256M page in region 4, if XEN/IA64 still use 16K
> contiguous physical page, we can imagine this can impact performance
> dramatically. So domU, VTI-domain and dom0 need to support huge page.
> 
> Attached patch is an experiment to use 16M page on VTI-domain. A very
> tricky way is used to allocate 16M contiguous memory for VTI-domain, so
> it's just for reference.
> Applying this patch, I can see 2%~3% performance gains when running KB
> on VTI-domain(UP), you may know performance of KB on VTI-domain is not
> bad:-), that means the improvement is somewhat big. 
> As we know, KB doesn't use 256M, the performance gain is coming from 16M
> page in region 7, if we run some applications, which use 256M huge page,
> and then we may get more improvement.

I agree with you that supporting TLB insertion with large page sizes and
hugetlbfs would be a big gain.


> In my mind, we need do below things (there may be more) if we want to
> support huge page.
> 1. Add an option "order" in configure file vtiexample.vti. if order=0,
> XEN/IA64 allocate 16K contiguous memory for domain, if order=1, allocate
> 32K, and so on. Thus user can choose page size for domain.

A fallback path should be implemented in case large page allocation
fails.
Or do you propose introducing a new page allocator with very large chunks?
With the order option, page fragmentation has to be taken care of.
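
To make the fallback idea concrete, here is a minimal sketch in plain C.
It is not Xen code: populate_with_fallback(), alloc_extent_fn and the toy
heap are hypothetical names used only for illustration. It assumes a 16KB
base page, so order 10 corresponds to a 16MB extent, and it simply drops
to the next smaller order whenever an allocation fails.

```c
#include <assert.h>
#include <stddef.h>

#define BASE_PAGE_SHIFT 14 /* 16KB base page, as discussed in this thread */

/* Hypothetical allocator hook: returns nonzero if one extent of the
 * given order could be allocated. */
typedef int (*alloc_extent_fn)(unsigned int order, void *ctx);

/* Bytes covered by one extent of the given order (order 10 -> 16MB). */
static unsigned long extent_bytes(unsigned int order)
{
    return 1UL << (BASE_PAGE_SHIFT + order);
}

/* Populate nr_base_pages base pages, preferring extents of max_order and
 * dropping to smaller orders when an allocation fails or the remainder is
 * too small. Returns the number of base pages actually populated. */
static unsigned long populate_with_fallback(unsigned long nr_base_pages,
                                            unsigned int max_order,
                                            alloc_extent_fn alloc, void *ctx)
{
    unsigned long done = 0;
    unsigned int order = max_order;

    assert(alloc != NULL);
    while (done < nr_base_pages) {
        unsigned long chunk = 1UL << order;

        if (chunk <= nr_base_pages - done && alloc(order, ctx)) {
            done += chunk;
            continue;
        }
        if (order == 0)
            break; /* even a single base page could not be allocated */
        order--;   /* fall back to the next smaller extent size */
    }
    return done;
}

/* Toy heap for illustration only: a limited number of large contiguous
 * extents is available, while smaller extents always succeed. */
struct toy_heap {
    unsigned int large_left;    /* remaining extents of large_order */
    unsigned int large_order;   /* order regarded as "large" */
    unsigned int small_extents; /* count of smaller extents handed out */
};

static int toy_alloc(unsigned int order, void *ctx)
{
    struct toy_heap *h = ctx;

    if (order >= h->large_order) {
        if (h->large_left == 0)
            return 0; /* fragmentation: no large contiguous chunk left */
        h->large_left--;
        return 1;
    }
    h->small_extents++;
    return 1;
}
```

With a toy heap that has only one 16MB extent left, a request for 32MB
worth of 16KB pages gets one order-10 extent and then falls back to two
order-9 extents. A real implementation would also have to record which
page size backs each range.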


> 2. This order option will be passed to increase_reservation() function as
> extent_order argument, increase_reservation() will allocate contiguous
> memory for domain.
> 
> 3.  There may be some memory blocks, which we also want
> increase_reservation to allocate for us, such as shared page, or
> firmware memory for VTI domain etc. So we may need to call
> increase_reservation() several times to allocate memories with different
> page size.
> 
> 4. Per_LP_VHPT may need to be modified to support huge page.

Do you mean hash collision?


> 5. VBD/VNIF may need to be modified to use copy mechanism instead of
> flipping page.
> 
> 6. Balloon driver may need to be modified to increase or decrease domain
> memory by page size not 16K.
> 
> Magnus, would you like to take this task?
> 
> Comments are always welcome.

Here are some random thoughts of mine.

* Presumably there are two goals
  - Support one large page size (e.g. 16MB) to map the kernel.
  - Support hugetlbfs, whose page size might be different from 16MB.

  I.e. support three page sizes: the normal page size of 16KB, the kernel
  mapping page size of 16MB and the hugetlbfs page size of 256MB.
  I think hugetlbfs support can be addressed in a specialized way.

hugetlbfs
* Some specialized path can be implemented to support hugetlbfs.
  - For domU
    Paravirtualize hugetlbfs for domU.
    Hook into alloc_fresh_huge_page() in Linux. Then xen/ia64 is aware of
    large pages.
    Probably a new flag in the p2m entry, or some other data structure,
    would be introduced.
    For xenLinux, the region number, RGN_HPAGE, can be checked before
    entering the hugetlbfs specialized path.
  - For domVTI
    Can the use of hugetlbfs be detected somehow?
    Probably some Linux-specific heuristic can be used,
    e.g. checking the region, RGN_HPAGE.
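
The region check mentioned above could look roughly like the following
sketch. On ia64 the region number is the top three bits of a 64-bit
virtual address. The value 4 for RGN_HPAGE is an assumption taken from
the Linux/ia64 layout referred to in this thread (256M pages in region 4)
and should be checked against the guest kernel in question. The helper
names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define RGN_SHIFT 61 /* ia64: region number is bits 63..61 of the address */
#define RGN_HPAGE 4U /* assumed Linux/ia64 hugetlb region; verify per kernel */

/* Extract the region number (0..7) from a virtual address. */
static inline unsigned int region_number(uint64_t vaddr)
{
    return (unsigned int)(vaddr >> RGN_SHIFT);
}

/* Lowest virtual address belonging to the given region. */
static inline uint64_t region_base(unsigned int rgn)
{
    assert(rgn < 8);
    return (uint64_t)rgn << RGN_SHIFT;
}

/* Heuristic only: treat any address in RGN_HPAGE as a hugetlbfs mapping.
 * A non-Linux guest may use this region for something else entirely. */
static inline int is_hugetlb_address(uint64_t vaddr)
{
    return region_number(vaddr) == RGN_HPAGE;
}
```

For a paravirtualized domU the check is reliable because xenLinux
controls the region layout. For domVTI it stays a Linux-specific guess.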

kernel mapping with large page size
* Page fragmentation should be addressed.
  Both 16KB and 16MB pages should be able to co-exist in the same domain.
  - Allocating a large contiguous region might fail,
    so a fallback path should be implemented.
  - A domain should be able to have pages of both sizes (16KB and 16MB)
    for a smooth code merge.
  Probably a new bit in the p2m entry, something like _PAGE_HPAGE,
  would be introduced to distinguish a large page from a normal page.
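
As a rough illustration of how such a bit could be used, here is a
self-contained sketch. The bit position chosen for _PAGE_HPAGE, the
address mask and the helper names are all assumptions made for this
example and are not taken from the existing xen/ia64 p2m code.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT_16K 14
#define PAGE_SHIFT_16M 24

/* Assumed for illustration: a software-defined bit marking a 16MB page. */
#define _PAGE_HPAGE   (1ULL << 53)
/* Assumed for illustration: address bits kept in the entry (bits 14..49). */
#define P2M_ADDR_MASK (((1ULL << 50) - 1) & ~((1ULL << PAGE_SHIFT_16K) - 1))

/* Build an entry for a machine address, optionally marked as a large page. */
static uint64_t p2m_make_entry(uint64_t maddr, int huge)
{
    unsigned int shift = huge ? PAGE_SHIFT_16M : PAGE_SHIFT_16K;

    /* The machine address must be aligned to the page size it maps. */
    assert((maddr & ((1ULL << shift) - 1)) == 0);
    return (maddr & P2M_ADDR_MASK) | (huge ? _PAGE_HPAGE : 0);
}

/* Page size (as a shift) described by an entry. */
static unsigned int p2m_page_shift(uint64_t entry)
{
    return (entry & _PAGE_HPAGE) ? PAGE_SHIFT_16M : PAGE_SHIFT_16K;
}

/* Translate a guest physical address through an entry, keeping as many
 * offset bits as the entry's page size allows. */
static uint64_t p2m_translate(uint64_t entry, uint64_t gpaddr)
{
    uint64_t offset_mask = (1ULL << p2m_page_shift(entry)) - 1;

    return (entry & P2M_ADDR_MASK & ~offset_mask) | (gpaddr & offset_mask);
}
```

With this layout a normal entry keeps 14 offset bits of the guest
physical address and a large entry keeps 24, so both page sizes can live
side by side in one domain's p2m table.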

* paravirtualized drivers (VBD/VNIF)
  This is a real issue.
  For a first prototype it is reasonable not to support page flipping
  and to resort to grant table memory copy instead.

  There are two kinds of page flipping: page mapping and page transfer.
  I guess page mapping should be supported somehow, assuming only dom0
  (or a driver domain) maps.
  We should measure page flipping and memory copy before giving it a try.
  I have no figures about it.
  I'm not sure which performs better.
  (I'm biased. I know of a vnif analysis on xen/x86.
   It said memory copy was cheaper than page flipping on x86...)
  If dom0 does only DMA, an I/O request can be completed without a copy or
  a TLB flush for VBD with the tlb tracking patch.
  Page transfer is difficult. I'm not sure that it's worthwhile to support
  page transfer, because I'm struggling to optimize it.
Another approach is
* Increase the xen page size.
  Probably simply increasing the page size wouldn't work well.
  In that case, increase only the domheap page size,
  or introduce a new zone like MEMZONE_HPAGE,
  or introduce a specialized page allocator for it.

-- 
yamahata

_______________________________________________
Xen-ia64-devel mailing list
Xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-ia64-devel


 

