[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] bitops/32: Convert variable_ffs() and fls() zero-case handling to C



On Tue, 29 Apr 2025 at 14:24, H. Peter Anvin <hpa@xxxxxxxxx> wrote:
>
> Could you file a gcc bug? Gcc shouldn't generate worse code than inline asm 
> ...

Honestly, I've given up on that idea.

That's basically what the bug report I reported about clts was - gcc
generating worse code than inline asm.

But why would we use gcc builtins at all if we have inline asm that is
better than what som eversions of gcc generates? The inline asm works
for all versions.

Anyway, attached is a test file that seems to generate basically
"optimal" code. It's not a kernel patch, but a standalone C file for
testing with a couple of stupid test-cases to make sure that it gets
the constant argument case right, and that it optimizes the "I know
it's not zero" case.

That is why it also uses that "#if __SIZEOF_LONG__ == 4" instead of
something like CONFIG_64BIT.

So it's a proof-of-concept thing, and it does seem to be fairly simple.

The "important" parts are just the

  #define variable_ffs(x) \
        (statically_true((x)!=0) ? __ffs(x)+1 : do_variable_ffs(x))
  #define constant_ffs(x) __builtin_ffs(x)
  #define ffs(x) \
        (__builtin_constant_p(x) ? constant_ffs(x) : variable_ffs(x))

lines which end up picking the right choice. The rest is either the
"reimplement the kernel infrastructure for testing" or the trivial
do_variable_ffs() thing.

I just did

    gcc -m32 -O2 -S -fomit-frame-pointer t.c

(with, and without that -m32) and looked at the result to see that it
looks sane. No *actual* testing.


                Linus

Attachment: t.c
Description: Text Data


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.