Re: [Xen-devel] alloc_heap_pages is low efficient with more CPUs
On 16/10/2012 08:51, "Jan Beulich" <JBeulich@xxxxxxxx> wrote:

>>>> On 15.10.12 at 17:45, Keir Fraser <keir@xxxxxxx> wrote:
>> On 15/10/2012 14:27, "tupeng212" <tupeng212@xxxxxxxxx> wrote:
>>
>>> Please try the attached patch.
>>> : Great! you have done a good job, needless time decreases badly to 1s.
>>>
>>> If anybody has no proposal, I suggest you to commit this patch.
>>
>> I have applied it to xen-unstable. It probably makes sense to put it in 4.1
>> and 4.2 as well (cc'ed Jan, and attaching the backport for 4.1 again).
>
> Will do, but do you have an explanation how this simple, memory
> only operation (64 CPUs isn't that many) has this dramatic an
> effect on performance. Are we bouncing cache lines this badly? If
> so, which one(s)? I don't see what would be written frequently
> from multiple CPUs here - tlbflush_filter() itself only reads global
> variables, but never writes them.

It's just the small factors multiplying up. A 40G domain is 10M page
allocations, each of which does 64x per-cpu cpumask bitops and timestamp
compares. That's going on for a billion (10^9) times round
tlbflush_filter()s loop. Each iteration need only take a few CPU cycles for
the effect to actually be noticeable. If the stuff being touched were not in
the cache (which of course it is, since this path is so hot and not touching
much memory) it would probably take an hour to create the domain!

 -- Keir

> Jan
>

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
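
The multiplication Keir describes can be checked with a quick back-of-envelope script. This is only a sketch under stated assumptions: the 4 KiB page size is inferred from "40G domain is 10M page allocations", and the cycles-per-iteration and clock-rate figures in the example run are made-up illustrative values, not measurements from this thread. The function names are hypothetical and are not Xen code.

```python
# Back-of-envelope estimate of how often the per-CPU loop body runs
# while allocating memory for a domain one page at a time.

PAGE_SIZE = 4 * 1024  # assumption: 4 KiB pages (40G / 4K is roughly 10M)


def page_allocations(domain_bytes, page_size=PAGE_SIZE):
    """Number of single-page allocations needed for the domain."""
    return domain_bytes // page_size


def filter_iterations(domain_bytes, ncpus, page_size=PAGE_SIZE):
    """Total loop iterations: one per CPU for every page allocated."""
    return page_allocations(domain_bytes, page_size) * ncpus


def estimated_seconds(iterations, cycles_per_iteration, clock_hz):
    """Rough wall-clock cost if every iteration costs a fixed cycle count."""
    return iterations * cycles_per_iteration / clock_hz


if __name__ == "__main__":
    domain = 40 * 1024**3  # 40G domain, as in the thread
    ncpus = 64             # 64 CPUs, as in the thread
    iters = filter_iterations(domain, ncpus)
    print("page allocations:", page_allocations(domain))
    print("loop iterations: ", iters)
    # Illustrative assumption only: 5 cycles per iteration at 2.5 GHz.
    print("estimated seconds:", estimated_seconds(iters, 5, 2.5e9))
```

With these inputs the iteration count comes to roughly 0.67 * 10^9, which matches "going on for a billion". Even a handful of cycles per iteration then adds up to a delay on the order of a second.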