Re: [PATCH] [RFC] x86/cpu: rework instruction set selection
On Sun, 27 Apr 2025 at 12:17, Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote:
>
> ffs/fls are commonly found inside loops where x is the loop condition
> too. Therefore, using statically_true() to provide a form without the
> zero compatibility turns out to be a win.

We already have the version without the zero capability - it's just
called "__ffs()" and "__fls()", and performance-critical code uses
those.

So fls/ffs are the "standard" library functions that have to handle
zero, and add that stupid "+1" because that interface was designed by
some Pascal person who doesn't understand that we start counting from
0.

Standards bodies: "companies aren't sending their best people".

But it's silly that we then spend effort on magic cmov in inline asm
on those things when it's literally the "don't use this version unless
you don't actually care about performance" case.

I don't think it would be wrong to just make the x86-32 code just do
the check against zero ahead of time - in C.

And yes, that will generate some extra code - you'll test for zero
before, and then the caller might also test for a zero result that
then results in another test for zero that can't actually happen (but
the compiler doesn't know that).

But I suspect that on the whole, it is likely to generate better code
anyway just because the compiler sees that first test and can DTRT.

UNTESTED patch applied in case somebody wants to play with this. It
removes 10 lines of silly code, and along with them that 'cmov' use.

Anybody?

               Linus

Attachment: patch.diff
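
For readers following along, here is a rough sketch of the distinction the message draws between the zero-handling, 1-based ffs()/fls() and the 0-based __ffs()/__fls() that require a nonzero argument. This is not the attached patch and not the kernel's implementation: every sketch_* name is hypothetical, the types are narrowed to 32-bit unsigned int for illustration, and the GCC/Clang builtins __builtin_ctz()/__builtin_clz() stand in for the architecture-specific code. The zero check is done ahead of time in plain C, as the message suggests.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for __ffs(): 0-based index of the lowest set bit.
 * The result is undefined for x == 0, so callers must rule zero out first.
 * The assert is a debug-only contract check (compiled out with NDEBUG).
 */
static inline unsigned int sketch___ffs(unsigned int x)
{
	assert(x != 0);
	return (unsigned int)__builtin_ctz(x);
}

/* Hypothetical stand-in for __fls(): 0-based index of the highest set bit. */
static inline unsigned int sketch___fls(unsigned int x)
{
	assert(x != 0);
	return 31u - (unsigned int)__builtin_clz(x);
}

/*
 * "Standard" interface: 1-based bit position, 0 when no bit is set.
 * The zero test is ordinary C, so the compiler can see it and fold it
 * into whatever test the caller already performs.
 */
static inline int sketch_ffs(unsigned int x)
{
	if (!x)
		return 0;
	return (int)sketch___ffs(x) + 1;
}

static inline int sketch_fls(unsigned int x)
{
	if (!x)
		return 0;
	return (int)sketch___fls(x) + 1;
}

/*
 * The loop shape from the quoted text: x is the loop condition, so it is
 * known to be nonzero inside the body and the unchecked variant is safe.
 */
static inline int sketch_count_bits(unsigned int x)
{
	int n = 0;

	while (x) {
		unsigned int bit = sketch___ffs(x);

		x &= ~(1u << bit);	/* clear the bit just found */
		n++;
	}
	return n;
}
```

In the loop, the `while (x)` test already guarantees a nonzero argument, which is why such code would call the underscore variant directly rather than pay for the zero handling and the "+1" of the standard interface.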