[PATCH] x86/bitops: adjust partial first word handling in __find_next{,_zero}_bit()
There's no need to subtract "bit" from the width passed to __scanbit(): the
bits above that width are zero anyway after the shift (in __find_next_bit()),
or can be made so (in __find_next_zero_bit()) by swapping the order of
negation and shift. (We already leverage the same facts in
find_next{,_zero}_bit().) With the width being the full BITS_PER_LONG, the
TZCNT alternative in __scanbit() can be engaged.
Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
---
Register allocation (and hence the effect of this change on code size)
is pretty "interesting". The compiler doesn't appear to realize that while
for 64-bit insns it doesn't matter which GPR is used (a REX prefix is
needed anyway), 32-bit insns can be helped by preferring the low 8 GPRs.
(Granted the inline assembly in __scanbit() may also be a little difficult
for it to deal with.)
--- a/xen/arch/x86/bitops.c
+++ b/xen/arch/x86/bitops.c
@@ -35,8 +35,8 @@ unsigned int __find_next_bit(
if ( bit != 0 )
{
/* Look for a bit in the first word. */
- set = __scanbit(*p >> bit, BITS_PER_LONG - bit);
- if ( set < (BITS_PER_LONG - bit) )
+ set = __scanbit(*p >> bit, BITS_PER_LONG);
+ if ( set < BITS_PER_LONG )
return (offset + set);
offset += BITS_PER_LONG - bit;
p++;
@@ -85,8 +85,8 @@ unsigned int __find_next_zero_bit(
if ( bit != 0 )
{
/* Look for zero in the first word. */
- set = __scanbit(~(*p >> bit), BITS_PER_LONG - bit);
- if ( set < (BITS_PER_LONG - bit) )
+ set = __scanbit(~*p >> bit, BITS_PER_LONG);
+ if ( set < BITS_PER_LONG )
return (offset + set);
offset += BITS_PER_LONG - bit;
p++;