 Re: [PATCH] bitops/32: Convert variable_ffs() and fls() zero-case handling to C
 
To: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
From: "H. Peter Anvin" <hpa@xxxxxxxxx>
Date: Tue, 29 Apr 2025 15:10:29 -0700
Cc: Ingo Molnar <mingo@xxxxxxxxxx>, Arnd Bergmann <arnd@xxxxxxxx>, Arnd Bergmann <arnd@xxxxxxxxxx>, Thomas Gleixner <tglx@xxxxxxxxxxxxx>, Ingo Molnar <mingo@xxxxxxxxxx>, Borislav Petkov <bp@xxxxxxxxx>, Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>, x86@xxxxxxxxxx, Juergen Gross <jgross@xxxxxxxx>, Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>, Alexander Usyskin <alexander.usyskin@xxxxxxxxx>, Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>, Mateusz Jończyk <mat.jonczyk@xxxxx>, Mike Rapoport <rppt@xxxxxxxxxx>, Ard Biesheuvel <ardb@xxxxxxxxxx>, Peter Zijlstra <peterz@xxxxxxxxxxxxx>, linux-kernel@xxxxxxxxxxxxxxx, xen-devel@xxxxxxxxxxxxxxxxxxxx
Delivery-date: Tue, 29 Apr 2025 22:11:27 +0000
Dkim-filter: OpenDKIM Filter v2.11.0 mail.zytor.com 53TMAU8h607259
List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

 On April 29, 2025 3:04:30 PM PDT, Linus Torvalds 
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>On Tue, 29 Apr 2025 at 14:59, Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote:
>>
>> do_variable_ffs() doesn't quite work.
>>
>> REP BSF is TZCNT, which unconditionally writes its output operand and
>> defeats the attempt to preload with -1.
>>
>> Drop the REP prefix, and it should work as intended.
>
>Bah. That's what I get for just doing it blindly without actually
>looking at the kernel source. I just copied the __ffs() thing - and
>there the 'rep' is not for the zero case - which we don't care about -
>but because tzcnt performs better on newer CPUs.
>
>So you're obviously right.
>
>            Linus
Yeah, the encoding of lzcnt was a real mistake, because the outputs are 
different (so you still need instruction-specific postprocessing.)
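
To make the difference concrete, here is a rough software model in plain C,
not kernel code. All function names are made up for illustration. It assumes,
as the preload trick in this thread does, that BSF/BSR leave the destination
untouched for a zero input, while TZCNT/LZCNT always write it (with the
operand size, 32, for a zero input).

```c
#include <stdint.h>

/* Model of BSR: index of the highest set bit. For x == 0 the destination
 * is assumed to keep whatever value was preloaded into it. */
static inline int model_bsr(uint32_t x, int dest)
{
    int idx = 31;

    if (x == 0)
        return dest;
    while (!(x & (UINT32_C(1) << idx)))
        idx--;
    return idx;
}

/* Model of BSF: index of the lowest set bit, same zero-input assumption. */
static inline int model_bsf(uint32_t x, int dest)
{
    int idx = 0;

    if (x == 0)
        return dest;
    while (!(x & (UINT32_C(1) << idx)))
        idx++;
    return idx;
}

/* Model of LZCNT: number of leading zero bits; always writes, 32 for 0. */
static inline int model_lzcnt(uint32_t x)
{
    int n = 0;

    for (uint32_t mask = UINT32_C(1) << 31; mask && !(x & mask); mask >>= 1)
        n++;
    return n;
}

/* Model of TZCNT: number of trailing zero bits; always writes, 32 for 0. */
static inline int model_tzcnt(uint32_t x)
{
    int n = 0;

    for (uint32_t mask = 1; mask && !(x & mask); mask <<= 1)
        n++;
    return n;
}

/* fls() built on BSR: preload -1, then add 1. */
static inline int fls_via_bsr(uint32_t x)
{
    return model_bsr(x, -1) + 1;
}

/* fls() built on LZCNT needs different postprocessing: 32 - count. */
static inline int fls_via_lzcnt(uint32_t x)
{
    return 32 - model_lzcnt(x);
}

/* ffs() built on BSF: preload -1, then add 1. */
static inline int ffs_via_bsf(uint32_t x)
{
    return model_bsf(x, -1) + 1;
}

/* Same postprocessing on top of TZCNT: the preload is overwritten, so the
 * zero case comes out as 33 instead of 0. */
static inline int ffs_via_tzcnt_naive(uint32_t x)
{
    return model_tzcnt(x) + 1;
}
```

For nonzero inputs BSF and TZCNT agree, which is why the prefix is harmless
in __ffs(); only the zero case differs. BSR and LZCNT disagree for every
input, so code using the LZCNT encoding cannot share the BSR fixup.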
 