RE: [Xen-devel] [PATCH] x86: add SSE-based copy_page()
> From: Jan Beulich [mailto:jbeulich@xxxxxxxxxx]
>
> >>> Dan Magenheimer <dan.magenheimer@xxxxxxxxxx> 12.11.08 15:51 >>>
> >I assume the 12% faster is on a benchmark...
>
> It's the win for an application doing nothing but dirtying
> private mappings of a file. That seemed like the least overhead
> test that wouldn't require any special testing code in kernel
> or hypervisor.
>
> >Have you measured how much faster the copy_page_sse2
> >routine (standalone) is than the memcpy? Is it a
> >factor of 2?
>
> No, I didn't.

Hmmm... I'm working on a project that does extensive page-copying,
so I was eager to give it a spin on two test machines: one a
Core 2 Duo ("Weybridge"), the other an as-yet-unreleased Intel box.

I measured the routine with rdtsc, took many thousands of samples,
and looked at the smallest measurement. The hypervisor measured is
64-bit, so "cpu_has_xmm2" appears to always be true.

On the first machine, the change to use sse2 instructions made
no difference. On the second machine, using sse2 actually made
copy_page() *worse* (by 30-40%).

I'm poor enough with the x86 instruction set that I can't explain
my results, but thought I would report them. I'm not doubting that
you saw improvements on your box, just noting that YMMV.

Perhaps someone from Intel familiar with the microarchitectures
might be able to explain (and can query me offlist to identify
the as-yet-unreleased box).

Dan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
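For anyone wanting to repeat the comparison, the measurement described
above (time each call with rdtsc, take many samples, keep the smallest)
can be sketched in user space roughly as follows. This is only a sketch
under stated assumptions, not the code from the patch or from the test:
copy_page_sse2_sketch(), copy_page_memcpy(), min_cycles() and the
4096-byte PAGE_SIZE are hypothetical names and values chosen for
illustration, and the use of non-temporal stores is a guess at what an
SSE2 page copy might look like. It assumes GCC or Clang on x86.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <emmintrin.h>   /* SSE2 intrinsics */
#include <x86intrin.h>   /* __rdtsc() on GCC/Clang */

#define PAGE_SIZE 4096   /* assumed page size */

typedef void (*copy_fn)(void *dst, const void *src);

/* Baseline: plain memcpy of one page. */
static void copy_page_memcpy(void *dst, const void *src)
{
    memcpy(dst, src, PAGE_SIZE);
}

/* Hypothetical stand-in for an SSE2 page copy: 16-byte aligned loads
 * paired with non-temporal (cache-bypassing) stores. */
static void copy_page_sse2_sketch(void *dst, const void *src)
{
    /* Both buffers must be 16-byte aligned for these intrinsics. */
    assert(((uintptr_t)dst & 15) == 0 && ((uintptr_t)src & 15) == 0);

    __m128i *d = dst;
    const __m128i *s = src;
    for (size_t i = 0; i < PAGE_SIZE / sizeof(__m128i); i++)
        _mm_stream_si128(&d[i], _mm_load_si128(&s[i]));
    _mm_sfence();        /* make the streaming stores globally visible */
}

/* Time `samples` calls of fn with rdtsc and return the smallest
 * observed cycle count. */
static uint64_t min_cycles(copy_fn fn, void *dst, const void *src,
                           unsigned samples)
{
    uint64_t best = UINT64_MAX;
    for (unsigned i = 0; i < samples; i++) {
        uint64_t t0 = __rdtsc();
        fn(dst, src);
        uint64_t t1 = __rdtsc();
        if (t1 - t0 < best)
            best = t1 - t0;
    }
    return best;
}
```

A caller would allocate two page-aligned buffers (for example with
aligned_alloc(PAGE_SIZE, PAGE_SIZE)) and compare
min_cycles(copy_page_memcpy, dst, src, 10000) against
min_cycles(copy_page_sse2_sketch, dst, src, 10000). Note that rdtsc is
not a serializing instruction, so small differences between the two
numbers should be read with care, and taking the minimum measures the
warm-cache case, which may not match how copy_page() behaves in the
hypervisor.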