Re: [PATCH] bitops/32: Convert variable_ffs() and fls() zero-case handling to C

* Ingo Molnar <mingo@xxxxxxxxxx> wrote:

> > UNTESTED patch applied in case somebody wants to play with this. It
> > removes 10 lines of silly code, and along with them that 'cmov' use.
> >
> > Anybody?
>
> Makes sense - it seems to boot here, but I only did some very light
> testing.
>
> There's a minor text size increase on x86-32 defconfig, GCC 14.2.0:
>
>       text       data        bss        dec        hex    filename
>   16577728    7598826    1744896   25921450    18b87aa    vmlinux.before
>   16577908    7598838    1744896   25921642    18b886a    vmlinux.after
>
> bloatometer output:
>
>   add/remove: 2/1 grow/shrink: 201/189 up/down: 5681/-3486 (2195)

And once we remove 486, I think we can do the optimization below to
just assume the output doesn't get clobbered by BS*L in the zero-case,
right?

In the text size space it's a substantial optimization on x86-32
defconfig:

        text       data        bss        dec        hex    filename
  16,577,728    7598826    1744896   25921450    18b87aa    vmlinux.vanilla      # CMOV+BS*L
  16,577,908    7598838    1744896   25921642    18b886a    vmlinux.linus_patch  # if()+BS*L
  16,573,568    7602922    1744896   25921386    18b876a    vmlinux.noclobber    # BS*L

Thanks,

	Ingo

---
 arch/x86/include/asm/bitops.h | 20 ++------------------
 1 file changed, 2 insertions(+), 18 deletions(-)

diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
index 6061c87f14ac..e3e94a806656 100644
--- a/arch/x86/include/asm/bitops.h
+++ b/arch/x86/include/asm/bitops.h
@@ -308,24 +308,16 @@ static __always_inline int variable_ffs(int x)
 {
 	int r;
 
-#ifdef CONFIG_X86_64
 	/*
 	 * AMD64 says BSFL won't clobber the dest reg if x==0; Intel64 says the
 	 * dest reg is undefined if x==0, but their CPU architect says its
 	 * value is written to set it to the same as before, except that the
 	 * top 32 bits will be cleared.
-	 *
-	 * We cannot do this on 32 bits because at the very least some
-	 * 486 CPUs did not behave this way.
 	 */
 	asm("bsfl %1,%0"
 	    : "=r" (r)
 	    : ASM_INPUT_RM (x), "0" (-1));
-#else
-	if (!x)
-		return 0;
-	asm("bsfl %1,%0" : "=r" (r) : "rm" (x));
-#endif
+
 	return r + 1;
 }
 
@@ -360,24 +352,16 @@ static __always_inline int fls(unsigned int x)
 	if (__builtin_constant_p(x))
 		return x ? 32 - __builtin_clz(x) : 0;
 
-#ifdef CONFIG_X86_64
 	/*
 	 * AMD64 says BSRL won't clobber the dest reg if x==0; Intel64 says the
 	 * dest reg is undefined if x==0, but their CPU architect says its
 	 * value is written to set it to the same as before, except that the
 	 * top 32 bits will be cleared.
-	 *
-	 * We cannot do this on 32 bits because at the very least some
-	 * 486 CPUs did not behave this way.
 	 */
 	asm("bsrl %1,%0"
 	    : "=r" (r)
 	    : ASM_INPUT_RM (x), "0" (-1));
-#else
-	if (!x)
-		return 0;
-	asm("bsrl %1,%0" : "=r" (r) : "rm" (x));
-#endif
+
 	return r + 1;
 }
 
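
For anyone who wants to convince themselves why preloading the output
with -1 makes the zero case come out right, here is a rough portable
model of the idea. It is not part of the patch. The names model_bsf(),
model_bsr(), sketch_ffs(), sketch_fls() and sketch_selfcheck() are made
up for illustration. The model simply assumes the "destination is left
untouched when the source is zero" behaviour, which is exactly the
property that still has to hold on every 32-bit CPU we keep supporting.
It also assumes a 32-bit unsigned int and the GCC/Clang builtins:

```c
#include <assert.h>

/*
 * Hypothetical software model of BSFL: returns the index of the lowest
 * set bit, or hands back the preloaded destination value when src == 0.
 * The "hands back dst" branch is the assumed no-clobber behaviour.
 */
static int model_bsf(int dst, unsigned int src)
{
	return src ? __builtin_ctz(src) : dst;
}

/* Same idea for BSRL: index of the highest set bit, or dst when src == 0. */
static int model_bsr(int dst, unsigned int src)
{
	return src ? 31 - __builtin_clz(src) : dst;
}

/* Mirrors the shape of the patched variable_ffs(): preload -1, add 1. */
static int sketch_ffs(int x)
{
	return model_bsf(-1, (unsigned int)x) + 1;
}

/* Mirrors the shape of the patched fls(): preload -1, add 1. */
static int sketch_fls(unsigned int x)
{
	return model_bsr(-1, x) + 1;
}

/* Small self-check covering the zero case and both ends of the range. */
void sketch_selfcheck(void)
{
	assert(sketch_ffs(0) == 0);		/* -1 + 1, no branch needed */
	assert(sketch_ffs(1) == 1);
	assert(sketch_ffs(8) == 4);
	assert(sketch_fls(0) == 0);
	assert(sketch_fls(1) == 1);
	assert(sketch_fls(0x80000000u) == 32);
}
```

With the preload, the zero input falls out of the same "r + 1" as every
other input, which is why the explicit if (!x) branch (or the CMOV
before it) becomes unnecessary once the no-clobber assumption is safe.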