
Re: [XenPPC] PATCH: Inline assembler for clear_page() and copy_page()



> I would expect to see dcbtst in here, no?

Nah, dcbtst is expensive (it causes some non-cheap bus
transactions) and not needed at all.  dcbz is much better,
but it can only be used if you kill the whole cache line,
which is true here.
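
To make the dcbz point concrete, here is a minimal sketch of a page clear that zeroes one cache line per iteration. It is not the patch under discussion: the name clear_page_sketch, the 4 KiB page size, and the 128-byte line size (as on the 970) are assumptions for illustration, and the memset branch exists only so the sketch builds on non-PowerPC hosts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE       4096u  /* assumption: 4 KiB pages */
#define CACHE_LINE_SIZE 128u   /* assumption: 128-byte lines, as on the 970 */

/* Hypothetical helper: zero one page, one cache line at a time.
 * The page must be cache-line aligned and in cacheable memory. */
static void clear_page_sketch(void *page)
{
    unsigned char *p = page;
    size_t off;

    assert(((uintptr_t)page % CACHE_LINE_SIZE) == 0);

    for (off = 0; off < PAGE_SIZE; off += CACHE_LINE_SIZE) {
#if defined(__powerpc__) || defined(__powerpc64__)
        /* dcbz zeroes the whole line containing the address, so the
         * old contents never have to be fetched from memory. */
        __asm__ volatile ("dcbz 0,%0" : : "r" (p + off) : "memory");
#else
        /* Portable fallback so the sketch compiles elsewhere. */
        memset(p + off, 0, CACHE_LINE_SIZE);
#endif
    }
}
```

If the real line size is smaller than the assumed 128 bytes, the dcbz branch would leave parts of the page untouched, so real code has to use the line size of the actual CPU.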

> Both functions (copy and clear) could stand a little loop unrolling.

ldu ; stdu ; bdnz is not the best loop possible, especially not on
970/POWER4/POWER5.  You guys have Macs, so use Shark (go to the code
browser, cmd-shift-M, select "show 970 dispatch groups" and "show 970
details drawer").  In most cases the time spent in the loop will
be dominated by memory (cache) speed, of course, but still.
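
As a rough illustration of the unrolling idea, the sketch below copies four doublewords per iteration, with the loads grouped ahead of the stores. It is written in C rather than assembler, the name copy_page_unrolled and the 4 KiB page size are assumptions, and it makes no claim about how the result gets scheduled into 970 dispatch groups; that still has to be checked with a tool like Shark.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u  /* assumption: 4 KiB pages */

/* Hypothetical helper: copy one page, four 64-bit words per iteration,
 * instead of one load/store pair per loop trip. */
static void copy_page_unrolled(void *dst, const void *src)
{
    uint64_t *d = dst;
    const uint64_t *s = src;
    size_t n = PAGE_SIZE / (4 * sizeof(uint64_t));

    assert(((uintptr_t)dst % sizeof(uint64_t)) == 0);
    assert(((uintptr_t)src % sizeof(uint64_t)) == 0);

    while (n--) {
        /* Issue the four loads first, then the four stores. */
        uint64_t a = s[0], b = s[1], c = s[2], e = s[3];
        d[0] = a;
        d[1] = b;
        d[2] = c;
        d[3] = e;
        s += 4;
        d += 4;
    }
}
```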

> I can understand if you're not *really* trying to optimize these,
> but in that case why do you want to add dcbz?  Is there a noticeable
> performance improvement?

Yes, dcbz is (should be) a huge improvement.


Segher


_______________________________________________
Xen-ppc-devel mailing list
Xen-ppc-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-ppc-devel


 

