[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] bitops/32: Convert variable_ffs() and fls() zero-case handling to C



On April 28, 2025 12:14:40 AM PDT, Ingo Molnar <mingo@xxxxxxxxxx> wrote:
>
>* Ingo Molnar <mingo@xxxxxxxxxx> wrote:
>
>> And once we remove 486, I think we can do the optimization below to 
>> just assume the output doesn't get clobbered by BS*L in the 
>> zero-case, right?
>> 
>> In the text size space it's a substantial optimization on x86-32 
>> defconfig:
>> 
>>         text    data        bss           dec            hex filename
>>   16,577,728    7598826    1744896      25921450        18b87aa 
>> vmlinux.vanilla      # CMOV+BS*L
>>   16,577,908 7598838    1744896      25921642        18b886a 
>> vmlinux.linus_patch  # if()+BS*L
>>   16,573,568 7602922    1744896      25921386        18b876a 
>> vmlinux.noclobber    # BS*L
>
>And BTW, *that* is a price that all of non-486 x86-32 was paying for 
>486 support...
>
>And, just out of intellectual curiosity, I also tried to measure the 
>code generation price of the +1 standards-quirk in the fls()/ffs() 
>interface as well:
>
>         text     data        bss           dec            hex filename
>   16,577,728   7598826    1744896      25921450        18b87aa 
> vmlinux.vanilla      # CMOV+BS*L
>   16,577,908  7598838    1744896      25921642        18b886a 
> vmlinux.linus_patch  # if()+BS*L
>   16,573,568  7602922    1744896      25921386        18b876a 
> vmlinux.noclobber    # BS*L
>   ..........
>   16,573,552  7602922    1744896      25921370        18b875a vmlinux.broken  
>      # BROKEN: 0 baseline instead of 1
>
>... and unless I messed up the patch, it seems to have a surprisingly 
>low impact - maybe because the compiler can amortize its cost by 
>adjusting all dependent code mostly at build time, so the +1 doesn't 
>end up being generated most of the time?
>
>Thanks,
>
>       Ingo
>
>===============================>
>
>This broken patch is broken: it intentionally breaks the ffs()/fls() 
>interface in an attempt to measure the code generation effects of 
>interface details.
>
>NOT-Signed-off-by: <anyone@anywhere.anytime>
>---
> arch/x86/include/asm/bitops.h | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
>diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
>index e3e94a806656..21707696bafe 100644
>--- a/arch/x86/include/asm/bitops.h
>+++ b/arch/x86/include/asm/bitops.h
>@@ -318,7 +318,7 @@ static __always_inline int variable_ffs(int x)
>           : "=r" (r)
>           : ASM_INPUT_RM (x), "0" (-1));
> 
>-      return r + 1;
>+      return r;
> }
> 
> /**
>@@ -362,7 +362,7 @@ static __always_inline int fls(unsigned int x)
>           : "=r" (r)
>           : ASM_INPUT_RM (x), "0" (-1));
> 
>-      return r + 1;
>+      return r;
> }
> 
> /**

My recollection was that you can't assume that even for 586; that it is only 
safe for 686, but it has been a long time...



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.