Re: [Xen-devel] alloc_heap_pages is low efficient with more CPUs
- To: tupeng212 <tupeng212@xxxxxxxxx>
- From: Keir Fraser <keir@xxxxxxx>
- Date: Sat, 13 Oct 2012 09:59:19 +0100
- Cc: xen-devel <xen-devel@xxxxxxxxxxxxx>
- Delivery-date: Sat, 13 Oct 2012 09:00:14 +0000
- List-id: Xen developer discussion <xen-devel.lists.xen.org>
- Thread-index: Ac2pIQeX6HxWHUTuf0G4n/Wqknktzg==
- Thread-topic: [Xen-devel] alloc_heap_pages is low efficient with more CPUs
If the allocations are 2M size, we can do better quite easily I think. Please try the attached patch.
-- Keir
On 13/10/2012 07:46, "tupeng212" <tupeng212@xxxxxxxxx> wrote:
What if you replace tlbflush_filter() call with cpus_clear(&extra_cpus_mask)?
: You mean just clearing it? That seems a little drastic. Or would you like to do it somewhere else?
I assume you see lots of looping in one of those two functions, but only single-page-at-a-time calls into alloc_domheap_pages()->alloc_heap_pages()?
: In populate_physmap, every allocation is 2M in size:
static void populate_physmap(struct memop_args *a)
{
    for ( i = a->nr_done; i < a->nr_extents; i++ )
    {
        /* Calls down into alloc_heap_pages(); a->extent_order = 9, so always 2M. */
        page = alloc_domheap_pages(d, a->extent_order, a->memflags);
    }
    /* Do you mean moving that block and the TLB flush here, out of the for loop? */
}
tupeng212
From: Keir Fraser <mailto:keir.xen@xxxxxxxxx>
Date: 2012-10-13 14:30
To: tupeng212 <mailto:tupeng212@xxxxxxxxx>
Subject: Re: [Xen-devel] alloc_heap_pages is low efficient with more CPUs
What if you replace tlbflush_filter() call with cpus_clear(&extra_cpus_mask)? Seems a bit silly to do, but I’d like to know how much a few cpumask operations per page is costing (most are of course much quicker than tlbflush_filter as they operate on 64 bits per iteration, rather than one bit per iteration).
If that is suitably fast, I think we can have a go at fixing this by pulling the TLB-flush logic out of alloc_heap_pages() and into the loops in increase_reservation() and populate_physmap() in common/memory.c. I assume you see lots of looping in one of those two functions, but only single-page-at-a-time calls into alloc_domheap_pages()->alloc_heap_pages()?
-- Keir
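[Editorial illustration, not part of the original message.] The cost being discussed comes from running a bit-at-a-time filter once per page. A minimal sketch of one way to reduce that is to remember the newest timestamp across a batch of pages and filter once at the end. Everything named toy_* below is a hypothetical stand-in, not Xen's real tlbflush_filter() or cpumask API, and timestamp wrap-around is ignored. Whether the attached patch takes this approach is not shown in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NR_CPUS 64   /* hypothetical: all CPUs fit in one 64-bit word */

/* Hypothetical model state, not Xen's real data structures. */
static uint32_t toy_cpu_last_flush[TOY_NR_CPUS]; /* time of each CPU's last TLB flush */
static unsigned long toy_filter_calls;           /* how many times the filter ran */

/* Toy filter: drop every CPU that flushed its TLB after `stamp`.
 * It inspects one CPU per iteration, which is what makes it costly. */
static void toy_tlbflush_filter(uint64_t *mask, uint32_t stamp)
{
    assert(mask != NULL);
    toy_filter_calls++;
    for (unsigned int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
        if (toy_cpu_last_flush[cpu] > stamp)
            *mask &= ~(UINT64_C(1) << cpu);
}

/* Per-page variant: filter once for every page, then OR the results. */
static uint64_t toy_mask_per_page(const uint32_t *stamps, size_t nr_pages)
{
    uint64_t total = 0;
    for (size_t i = 0; i < nr_pages; i++) {
        uint64_t extra = UINT64_MAX;
        toy_tlbflush_filter(&extra, stamps[i]);
        total |= extra;   /* word-wide OR: cheap compared with the filter */
    }
    return total;
}

/* Batched variant: track the newest stamp, then filter a single time. */
static uint64_t toy_mask_batched(const uint32_t *stamps, size_t nr_pages)
{
    uint64_t mask = 0;
    uint32_t newest = 0;
    for (size_t i = 0; i < nr_pages; i++)
        if (stamps[i] > newest)
            newest = stamps[i];
    if (nr_pages > 0) {
        mask = UINT64_MAX;
        toy_tlbflush_filter(&mask, newest);
    }
    return mask;
}
```

In this model both variants return the same mask, because a CPU survives the per-page union exactly when its last flush is not newer than the newest page stamp. For an order-9 (512-page) allocation the per-page variant runs the filter 512 times and the batched one runs it once.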
On 13/10/2012 07:21, "tupeng212" <tupeng212@xxxxxxxxx> wrote:
If the tlbflush_filter() and cpumask_or() lines are commented out from the
if(need_tlbflush) block in alloc_heap_pages(), what are the domain creation
times like then?
: You mean removing this code from alloc_heap_pages() and then trying again?
I didn't do exactly that, but I measured the total time spent in the if(need_tlbflush) block
using time1=NOW() ...block... time2=NOW(), time=time2-time1. The unit is ns, and 1 s = 10^9 ns.
This block accounts for most of the total time: if the whole start takes 30s, this block may take about 29s.
By the way it looks like you are not using xen-unstable or
xen-4.2, can you try with one of these later versions of Xen?
: Fortunately, other groups here have to use xen-4.2, so we repeated this experiment on
that source code too. Nothing changed: the second start still takes a very long time.
tupeng
Attachment:
00-reduce-tlbflush_filter
Description: Binary data
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel