Re: [XenPPC] PATCH: Inline assembler for clear_page() and copy_page()
I would expect to see dcbtst in here, no?

Nah, dcbtst is expensive (it causes some non-cheap bus transactions) and not needed at all; dcbz is much better (but it can only be used if you kill the whole cache line, which is true here).

Both functions (copy and clear) could stand a little loop unrolling.

ldu ; stdu ; bdnz is not the best loop possible, esp. not on 970/P4/P5. You guys got Macs, use Shark (go to the code browser, cmd-shift-M, select "show 970 dispatch groups" and "show 970 details drawer"). In most cases the time spent in the loop will be dominated by memory (cache) speed, of course, but still.

I can understand if you're not *really* trying to optimize these, but in that case why do you want to add dcbz? Is there a noticeable performance improvement?

Yes, dcbz is (should be) a huge improvement.

Segher
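For illustration only, here is a rough C sketch of the two ideas discussed above: dcbz on each destination cache line (valid because the whole line gets overwritten) and a modestly unrolled copy loop in place of a one-doubleword-per-iteration ldu ; stdu ; bdnz loop. This is not the code from the patch under discussion. The names clear_page_sketch, copy_page_sketch and zero_line are hypothetical, and the 4096-byte page and 128-byte cache line are assumptions chosen to match a 970-class CPU. On non-PowerPC builds the dcbz is replaced by (or compiled out in favour of) plain stores, so the sketch still runs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE       4096u  /* assumption: 4 KiB pages */
#define CACHE_LINE_SIZE 128u   /* assumption: 128-byte lines, as on a 970 */

/* Zero one cache line of the destination. */
static inline void zero_line(void *line)
{
#if defined(__powerpc64__) && defined(__GNUC__)
    /* dcbz zeroes the whole line in the cache without first reading it
     * from memory; only safe because the entire line is being killed. */
    __asm__ __volatile__("dcbz 0,%0" : : "r"(line) : "memory");
#else
    /* Portable fallback: plain doubleword stores. */
    uint64_t *p = line;
    for (size_t i = 0; i < CACHE_LINE_SIZE / sizeof(uint64_t); i++)
        p[i] = 0;
#endif
}

/* Hypothetical clear_page(): one zero_line() per cache line. */
void clear_page_sketch(void *page)
{
    unsigned char *p = page;

    assert((uintptr_t)page % PAGE_SIZE == 0);
    for (size_t off = 0; off < PAGE_SIZE; off += CACHE_LINE_SIZE)
        zero_line(p + off);
}

/* Hypothetical copy_page(): dcbz the destination line, then fill it
 * with a copy loop unrolled by four (loads grouped before stores). */
void copy_page_sketch(void *dst, const void *src)
{
    uint64_t *d = dst;
    const uint64_t *s = src;
    const size_t words_per_line = CACHE_LINE_SIZE / sizeof(uint64_t);

    assert((uintptr_t)dst % PAGE_SIZE == 0);
    assert((uintptr_t)src % sizeof(uint64_t) == 0);

    for (size_t line = 0; line < PAGE_SIZE / CACHE_LINE_SIZE; line++) {
#if defined(__powerpc64__) && defined(__GNUC__)
        /* Establish the destination line without fetching its old contents. */
        __asm__ __volatile__("dcbz 0,%0" : : "r"(d) : "memory");
#endif
        for (size_t i = 0; i < words_per_line; i += 4) {
            uint64_t a = s[i], b = s[i + 1], c = s[i + 2], e = s[i + 3];
            d[i] = a;
            d[i + 1] = b;
            d[i + 2] = c;
            d[i + 3] = e;
        }
        d += words_per_line;
        s += words_per_line;
    }
}
```

Whether an unroll factor of four is the right choice would have to be checked against the dispatch groups on the actual CPU; as noted above, memory (cache) speed will dominate in most cases anyway.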