
Re: [Xen-devel] alloc_heap_pages is low efficient with more CPUs

To: "Keir Fraser" <keir.xen@xxxxxxxxx>
From: tupeng212 <tupeng212@xxxxxxxxx>
Date: Sat, 13 Oct 2012 14:46:54 +0800
Cc: xen-devel <xen-devel@xxxxxxxxxxxxx>
Delivery-date: Sat, 13 Oct 2012 06:47:33 +0000
List-id: Xen developer discussion <xen-devel.lists.xen.org>

What if you replace tlbflush_filter() call with cpus_clear(&extra_cpus_mask)?

: You mean just clear it? That seems a little drastic. Or would you rather do it somewhere else?

I assume you see lots of looping in one of those two functions, but only single-page-at-a-time calls into alloc_domheap_pages()->alloc_heap_pages()?
: In populate_physmap(), all pages are 2 MB in size:

static void populate_physmap(struct memop_args *a)
{
    for ( i = a->nr_done; i < a->nr_extents; i++ )
    {
        /* alloc_domheap_pages() calls alloc_heap_pages();
         * a->extent_order = 9, so each allocation is always 2 MB. */
        page = alloc_domheap_pages(d, a->extent_order, a->memflags);
    }

    /* You mean move that block and the TLB flush here, to avoid doing it in the for loop? */
}

tupeng212

From: Keir Fraser
Date: 2012-10-13 14:30
To: tupeng212
Subject: Re: [Xen-devel] alloc_heap_pages is low efficient with more CPUs

What if you replace tlbflush_filter() call with cpus_clear(&extra_cpus_mask)? Seems a bit silly to do, but I'd like to know how much a few cpumask operations per page is costing (most are of course much quicker than tlbflush_filter as they operate on 64 bits per iteration, rather than one bit per iteration).
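To make the "64 bits per iteration rather than one bit per iteration" point concrete, here is a minimal sketch. It is not Xen code: cpumask_sketch_t, mask_or_words() and mask_or_bits() are hypothetical names, and NR_CPUS = 256 is an assumed value. Both functions produce the same mask, but the per-bit walk needs 64 times as many loop iterations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS    256                 /* assumed CPU count, for illustration */
#define MASK_WORDS (NR_CPUS / 64)

/* Hypothetical stand-in for a CPU mask: one bit per CPU. */
typedef struct { uint64_t bits[MASK_WORDS]; } cpumask_sketch_t;

/* Word-at-a-time: each iteration ORs 64 CPUs' worth of bits. */
static size_t mask_or_words(cpumask_sketch_t *dst, const cpumask_sketch_t *src)
{
    size_t iterations = 0;

    assert(dst && src);
    for (size_t w = 0; w < MASK_WORDS; w++, iterations++)
        dst->bits[w] |= src->bits[w];
    return iterations;
}

/* Bit-at-a-time: each iteration examines a single CPU, the way a
 * per-CPU filter has to. */
static size_t mask_or_bits(cpumask_sketch_t *dst, const cpumask_sketch_t *src)
{
    size_t iterations = 0;

    assert(dst && src);
    for (size_t cpu = 0; cpu < NR_CPUS; cpu++, iterations++) {
        uint64_t bit = UINT64_C(1) << (cpu % 64);

        if (src->bits[cpu / 64] & bit)
            dst->bits[cpu / 64] |= bit;
    }
    return iterations;
}
```

With the assumed 256 CPUs, mask_or_words() loops 4 times and mask_or_bits() loops 256 times for the same result.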

If that is suitably fast, I think we can have a go at fixing this by pulling the TLB-flush logic out of alloc_heap_pages() and into the loops in increase_reservation() and populate_physmap() in common/memory.c. I assume you see lots of looping in one of those two functions, but only single-page-at-a-time calls into alloc_domheap_pages()->alloc_heap_pages()?
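The general shape of "pulling the flush logic out into the loop" can be sketched as a small simulation. This is only an illustration under stated assumptions, not the actual Xen change: flush_tlb_sketch(), stale_cpus_for_extent(), populate_flush_each() and populate_flush_once() are all hypothetical names, and a 64-bit integer stands in for a CPU mask.

```c
#include <assert.h>
#include <stdint.h>

static unsigned int flush_count;       /* number of simulated flushes */

/* Hypothetical stand-in for an expensive TLB flush on the CPUs in 'mask'. */
static void flush_tlb_sketch(uint64_t mask)
{
    if (mask)
        flush_count++;
}

/* Hypothetical: the CPUs that may hold stale mappings for extent i. */
static uint64_t stale_cpus_for_extent(unsigned int i, unsigned int nr_cpus)
{
    assert(nr_cpus >= 1 && nr_cpus <= 64);
    return UINT64_C(1) << (i % nr_cpus);
}

/* Flush logic inside the allocation path: runs once per extent. */
static void populate_flush_each(unsigned int nr_extents, unsigned int nr_cpus)
{
    for (unsigned int i = 0; i < nr_extents; i++)
        flush_tlb_sketch(stale_cpus_for_extent(i, nr_cpus));
}

/* Flush logic hoisted into the caller: accumulate the mask in the loop,
 * then flush a single time after it. */
static uint64_t populate_flush_once(unsigned int nr_extents, unsigned int nr_cpus)
{
    uint64_t pending = 0;

    for (unsigned int i = 0; i < nr_extents; i++)
        pending |= stale_cpus_for_extent(i, nr_cpus);
    flush_tlb_sketch(pending);
    return pending;
}
```

For 512 extents on 8 CPUs, the first variant performs 512 simulated flushes and the second performs one, covering the same set of CPUs.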

-- Keir

On 13/10/2012 07:21, "tupeng212" <tupeng212@xxxxxxxxx> wrote:

If the tlbflush_filter() and cpumask_or() lines are commented out from the
if(need_tlbflush) block in alloc_heap_pages(), what are the domain creation
times like then?
: You mean remove this code from alloc_heap_pages() and then try it.
I didn't do exactly as you said, but I measured the total time spent in the if(need_tlbflush) block
using time1=NOW() ... block ... time2=NOW(), time=time2-time1. The unit is ns (1 s = 10^9 ns).
It accounts for most of the total time: when the whole start takes 30 s, this block may take about 29 s.
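The measurement approach described above can be sketched in user space as follows. This is an assumption-laden analogue, not hypervisor code: POSIX clock_gettime() stands in for Xen's NOW(), and now_ns(), timed_block(), busy_work() and total_block_ns are hypothetical names.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L        /* for clock_gettime() */
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* User-space analogue of NOW(): monotonic time in nanoseconds. */
static int64_t now_ns(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

    assert(rc == 0);
    (void)rc;
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* 1 s = 10^9 ns, so seconds = nanoseconds / 10^9. */
static double ns_to_seconds(int64_t ns)
{
    return (double)ns / 1e9;
}

static int64_t total_block_ns;         /* accumulated time spent in the block */

/* Time one execution of 'block' and add it to the running total. */
static void timed_block(void (*block)(void))
{
    int64_t time1 = now_ns();

    block();
    total_block_ns += now_ns() - time1;
}

/* Hypothetical workload standing in for the if(need_tlbflush) block. */
static void busy_work(void)
{
    for (volatile int i = 0; i < 100000; i++)
        ;
}
```

Calling timed_block(busy_work) once per allocation and printing ns_to_seconds(total_block_ns) at the end gives the block's share of the total run time.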

By the way it looks like you are not using xen-unstable or
xen-4.2, can you try with one of these later versions of Xen?
: Fortunately, other groups here have to use xen-4.2, so we repeated this experiment on
that source code too. It changed nothing: the second start still takes a very long time.

tupeng

