[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v9 02/15] xen: introduce generic non-atomic test_*bit()



On Wed, 2024-05-15 at 10:52 +0200, Jan Beulich wrote:
> On 06.05.2024 12:15, Oleksii Kurochko wrote:
> > The following generic functions were introduced:
> > * test_bit
> > * generic__test_and_set_bit
> > * generic__test_and_clear_bit
> > * generic__test_and_change_bit
> > 
> > Also, the patch introduces the following generics which are
> > used by the functions mentioned above:
> > * BITOP_BITS_PER_WORD
> > * BITOP_MASK
> > * BITOP_WORD
> > * BITOP_TYPE
> > 
> > These functions and macros can be useful for architectures
> > that don't have corresponding arch-specific instructions.
> 
> Logically this paragraph may better move ahead of the BITOP_* one.
> 
> > Because of that x86 has the following check in the macros
> > test_bit(),
> > __test_and_set_bit(), __test_and_clear_bit(),
> > __test_and_change_bit():
> >     if ( bitop_bad_size(addr) ) __bitop_bad_size();
> > It was necessary to make bitop bad size check generic too, so
> > arch_check_bitop_size() was introduced.
> 
> Not anymore, with the most recent adjustments? There's nothing arch-
> specific anymore in the checking.
> 
> > @@ -183,7 +180,7 @@ static inline int test_and_set_bit(int nr,
> > volatile void *addr)
> >   * If two examples of this operation race, one can appear to
> > succeed
> >   * but actually fail.  You must protect multiple accesses with a
> > lock.
> >   */
> > -static inline int __test_and_set_bit(int nr, void *addr)
> > +static inline int arch__test_and_set_bit(int nr, volatile void
> > *addr)
> 
> I think I raised this point before: Why arch__ here, ...
> 
> > @@ -232,7 +226,7 @@ static inline int test_and_clear_bit(int nr,
> > volatile void *addr)
> >   * If two examples of this operation race, one can appear to
> > succeed
> >   * but actually fail.  You must protect multiple accesses with a
> > lock.
> >   */
> > -static inline int __test_and_clear_bit(int nr, void *addr)
> > +static inline int arch__test_and_clear_bit(int nr, volatile void
> > *addr)
> 
> ... here, ...
> 
> > @@ -243,13 +237,10 @@ static inline int __test_and_clear_bit(int
> > nr, void *addr)
> >  
> >      return oldbit;
> >  }
> > -#define __test_and_clear_bit(nr, addr) ({               \
> > -    if ( bitop_bad_size(addr) ) __bitop_bad_size();     \
> > -    __test_and_clear_bit(nr, addr);                     \
> > -})
> > +#define arch__test_and_clear_bit arch__test_and_clear_bit
> >  
> >  /* WARNING: non atomic and it can be reordered! */
> > -static inline int __test_and_change_bit(int nr, void *addr)
> > +static inline int arch__test_and_change_bit(int nr, volatile void
> > *addr)
> 
> ... and here, while ...
> 
> > @@ -307,8 +295,7 @@ static inline int variable_test_bit(int nr,
> > const volatile void *addr)
> >      return oldbit;
> >  }
> >  
> > -#define test_bit(nr, addr) ({                           \
> > -    if ( bitop_bad_size(addr) ) __bitop_bad_size();     \
> > +#define arch_test_bit(nr, addr) ({                      \
> 
> ... just arch_ here? I don't like the double underscore infixes very
> much, but I'm okay with them as long as they're applied consistently.

Common code and x86 use __test_and_clear_bit(), and this patch provides
a generic version of __test_and_clear_bit(). To emphasize that
generic__test_and_clear_bit() is a common implementation of
__test_and_clear_bit(), the double underscore was retained. Also,
test_and_clear_bit() exists and if one day it will be needed to provide
a generic version of it, then it will be needed to have
generic__test_and_clear_bit() and generic_test_and_clear_bit()

A similar logic was chosen for test_bit.


> 
> > --- a/xen/include/xen/bitops.h
> > +++ b/xen/include/xen/bitops.h
> > @@ -65,10 +65,144 @@ static inline int generic_flsl(unsigned long
> > x)
> >   * scope
> >   */
> >  
> > +#define BITOP_MASK(nr)  ((bitop_uint_t)1 << ((nr) %
> > BITOP_BITS_PER_WORD))
> > +
> > +#define BITOP_WORD(nr)  ((nr) / BITOP_BITS_PER_WORD)
> > +
> > +extern void __bitop_bad_size(void);
> > +
> >  /* --------------------- Please tidy above here ------------------
> > --- */
> >  
> >  #include <asm/bitops.h>
> >  
> > +#ifndef arch_check_bitop_size
> > +
> > +#define bitop_bad_size(addr) sizeof(*(addr)) <
> > sizeof(bitop_uint_t)
> 
> Nit: Missing parentheses around the whole expression.
> 
> > +#define arch_check_bitop_size(addr) \
> > +    if ( bitop_bad_size(addr) ) __bitop_bad_size();
> 
> Apart from the arch_ prefix that now wants dropping, this macro (if
> we
> want one in the first place) 
What do you mean by 'want' here? I thought it is pretty necessary from
safety point of view to have this check.
Except arch_ prefix does it make sense to drop "#ifndef
arch_check_bitop_size" around this macros?

> also wants writing in a way such that it
> can be used as a normal expression, without double semicolons or
> other
> anomalies (potentially) resulting at use sites. E.g.
> 
> #define check_bitop_size(addr) do { \
>     if ( bitop_bad_size(addr) )     \
>         __bitop_bad_size();         \
> } while ( 0 )
> 
> > +#endif /* arch_check_bitop_size */
> > +
> > +/**
> > + * generic__test_and_set_bit - Set a bit and return its old value
> > + * @nr: Bit to set
> > + * @addr: Address to count from
> > + *
> > + * This operation is non-atomic and can be reordered.
> > + * If two examples of this operation race, one can appear to
> > succeed
> > + * but actually fail.  You must protect multiple accesses with a
> > lock.
> > + */
> > +static always_inline bool
> > +generic__test_and_set_bit(unsigned long nr, volatile void *addr)
> 
> The original per-arch functions all use "int" for their first
> parameter.
> Here you use unsigned long, without any mention in the description of
> the
> potential behavioral change. Which is even worse given that then x86
> ends
> up inconsistent with Arm and PPC in this regard, by ...
> 
> > +{
> > +    bitop_uint_t mask = BITOP_MASK(nr);
> > +    volatile bitop_uint_t *p = (volatile bitop_uint_t *)addr +
> > BITOP_WORD(nr);
> > +    bitop_uint_t old = *p;
> > +
> > +    *p = old | mask;
> > +    return (old & mask);
> > +}
> > +
> > +/**
> > + * generic__test_and_clear_bit - Clear a bit and return its old
> > value
> > + * @nr: Bit to clear
> > + * @addr: Address to count from
> > + *
> > + * This operation is non-atomic and can be reordered.
> > + * If two examples of this operation race, one can appear to
> > succeed
> > + * but actually fail.  You must protect multiple accesses with a
> > lock.
> > + */
> > +static always_inline bool
> > +generic__test_and_clear_bit(bitop_uint_t nr, volatile void *addr)
> > +{
> > +    bitop_uint_t mask = BITOP_MASK(nr);
> > +    volatile bitop_uint_t *p = (volatile bitop_uint_t *)addr +
> > BITOP_WORD(nr);
> > +    bitop_uint_t old = *p;
> > +
> > +    *p = old & ~mask;
> > +    return (old & mask);
> > +}
> > +
> > +/* WARNING: non atomic and it can be reordered! */
> > +static always_inline bool
> > +generic__test_and_change_bit(unsigned long nr, volatile void
> > *addr)
> > +{
> > +    bitop_uint_t mask = BITOP_MASK(nr);
> > +    volatile bitop_uint_t *p = (volatile bitop_uint_t *)addr +
> > BITOP_WORD(nr);
> > +    bitop_uint_t old = *p;
> > +
> > +    *p = old ^ mask;
> > +    return (old & mask);
> > +}
> > +/**
> > + * generic_test_bit - Determine whether a bit is set
> > + * @nr: bit number to test
> > + * @addr: Address to start counting from
> > + */
> > +static always_inline bool generic_test_bit(int nr, const volatile
> > void *addr)
> > +{
> > +    bitop_uint_t mask = BITOP_MASK(nr);
> > +    volatile bitop_uint_t *p = (volatile bitop_uint_t *)addr +
> > BITOP_WORD(nr);
> > +
> > +    return (*p & mask);
> > +}
> > +
> > +static always_inline bool
> > +__test_and_set_bit(unsigned long nr, volatile void *addr)
> > +{
> > +#ifndef arch__test_and_set_bit
> > +#define arch__test_and_set_bit generic__test_and_set_bit
> > +#endif
> > +
> > +    return arch__test_and_set_bit(nr, addr);
> 
> ... silently truncating and sign-converting nr here.
Missed that fact. AFAIU there is no specific reason for bit number to
be 'int' for this function, so it makes sense to update x86's prototype
of arch__test_and_set_bit() to:
   static inline int arch__test_and_set_bit(unsigned long nr, volatile
   void *addr).
   
But probably I can't use 'unsigned long' here too, as it should a
compilation error around 'btsl' instruction.

So it can be or 'unsinged int' or 'int'.
> 
> As to generic_test_bit() - please don't cast away const-ness there.
> 
> > +}
> > +#define __test_and_set_bit(nr, addr) ({             \
> > +    arch_check_bitop_size(addr);                    \
> > +    __test_and_set_bit(nr, addr);                   \
> > +})
> > +
> > +static always_inline bool
> > +__test_and_clear_bit(bitop_uint_t nr, volatile void *addr)
> 
> Oddly enough here at least you use bitop_uint_t, but that's still ...
> 
> > +{
> > +#ifndef arch__test_and_clear_bit
> > +#define arch__test_and_clear_bit generic__test_and_clear_bit
> > +#endif
> > +
> > +    return arch__test_and_clear_bit(nr, addr);
> 
> ... meaning a signedness conversion on x86 then. And beware: You
> can't
> simply change x86'es code to use bitop_uint_t. The underlying insns
> used
> interpret the bit position as a signed number, i.e. permitting
> accesses
> below the incoming pointer (whether it really makes sense to be that
> way
> is a separate question). I'm afraid I have no good suggestion how to
> deal
> with that: Any approach I can think of is either detrimental to the
> generic implementation or would have unwanted effects on the x86 one.
> Others, anyone?
Is the signedness conversion here a big problem? I suppose no one will
pass a negative number to nr.

It seems to me that it would be better to use 'int' everywhere and not
use bitop_uint_t for 'nr' since it is just a bit position. 'Int'
provides enough range for possible bit number.

> 
> > --- a/xen/include/xen/types.h
> > +++ b/xen/include/xen/types.h
> > @@ -64,6 +64,12 @@ typedef __u64 __be64;
> >  
> >  typedef unsigned int __attribute__((__mode__(__pointer__)))
> > uintptr_t;
> >  
> > +#ifndef BITOP_TYPE
> > +#define BITOP_BITS_PER_WORD 32
> > +
> > +typedef uint32_t bitop_uint_t;
> > +#endif
> 
> I think you mentioned to me before why this needs to live here, not
> in
> xen/bitops.h. Yet I don't recall the reason, and the description
> (hint,
> hint) doesn't say anything either.
If I remember correctly ( after this phrase I think I have to update
the description ) the reason was that if I put that to xen/bitops.h:

    ...
    #include <asm/bitops.h>
    
    #ifndef BITOP_TYPE
    #define BITOP_BITS_PER_WORD 32
    /* typedef uint32_t bitop_uint_t; */
    #endif
    ...

Then we will have an issue that we can't use the generic definition of 
BITOP_BITS_PER_WORD and bitop_uint_t in asm/bitops.h as it is defined
after inclusion of <asm/bitops.h>.

But if to put it to <xen/types.h> then such problem won't occur.

If it still makes sense, then I'll update the description in the next
patch version.

~ Oleksii

> 
> Jan




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.