[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v9 02/15] xen: introduce generic non-atomic test_*bit()



On Wed, 2024-05-15 at 17:41 +0200, Jan Beulich wrote:
> On 15.05.2024 17:29, Oleksii K. wrote:
> > On Wed, 2024-05-15 at 10:52 +0200, Jan Beulich wrote:
> > > On 06.05.2024 12:15, Oleksii Kurochko wrote:
> > > > The following generic functions were introduced:
> > > > * test_bit
> > > > * generic__test_and_set_bit
> > > > * generic__test_and_clear_bit
> > > > * generic__test_and_change_bit
> > > > 
> > > > Also, the patch introduces the following generics which are
> > > > used by the functions mentioned above:
> > > > * BITOP_BITS_PER_WORD
> > > > * BITOP_MASK
> > > > * BITOP_WORD
> > > > * BITOP_TYPE
> > > > 
> > > > These functions and macros can be useful for architectures
> > > > that don't have corresponding arch-specific instructions.
> > > 
> > > Logically this paragraph may better move ahead of the BITOP_*
> > > one.
> > > 
> > > > Because of that x86 has the following check in the macros
> > > > test_bit(),
> > > > __test_and_set_bit(), __test_and_clear_bit(),
> > > > __test_and_change_bit():
> > > >     if ( bitop_bad_size(addr) ) __bitop_bad_size();
> > > > It was necessary to make bitop bad size check generic too, so
> > > > arch_check_bitop_size() was introduced.
> > > 
> > > Not anymore, with the most recent adjustments? There's nothing
> > > arch-
> > > specific anymore in the checking.
> > > 
> > > > @@ -183,7 +180,7 @@ static inline int test_and_set_bit(int nr,
> > > > volatile void *addr)
> > > >   * If two examples of this operation race, one can appear to
> > > > succeed
> > > >   * but actually fail.  You must protect multiple accesses with
> > > > a
> > > > lock.
> > > >   */
> > > > -static inline int __test_and_set_bit(int nr, void *addr)
> > > > +static inline int arch__test_and_set_bit(int nr, volatile void
> > > > *addr)
> > > 
> > > I think I raised this point before: Why arch__ here, ...
> > > 
> > > > @@ -232,7 +226,7 @@ static inline int test_and_clear_bit(int
> > > > nr,
> > > > volatile void *addr)
> > > >   * If two examples of this operation race, one can appear to
> > > > succeed
> > > >   * but actually fail.  You must protect multiple accesses with
> > > > a
> > > > lock.
> > > >   */
> > > > -static inline int __test_and_clear_bit(int nr, void *addr)
> > > > +static inline int arch__test_and_clear_bit(int nr, volatile
> > > > void
> > > > *addr)
> > > 
> > > ... here, ...
> > > 
> > > > @@ -243,13 +237,10 @@ static inline int
> > > > __test_and_clear_bit(int
> > > > nr, void *addr)
> > > >  
> > > >      return oldbit;
> > > >  }
> > > > -#define __test_and_clear_bit(nr, addr) ({               \
> > > > -    if ( bitop_bad_size(addr) ) __bitop_bad_size();     \
> > > > -    __test_and_clear_bit(nr, addr);                     \
> > > > -})
> > > > +#define arch__test_and_clear_bit arch__test_and_clear_bit
> > > >  
> > > >  /* WARNING: non atomic and it can be reordered! */
> > > > -static inline int __test_and_change_bit(int nr, void *addr)
> > > > +static inline int arch__test_and_change_bit(int nr, volatile
> > > > void
> > > > *addr)
> > > 
> > > ... and here, while ...
> > > 
> > > > @@ -307,8 +295,7 @@ static inline int variable_test_bit(int nr,
> > > > const volatile void *addr)
> > > >      return oldbit;
> > > >  }
> > > >  
> > > > -#define test_bit(nr, addr) ({                           \
> > > > -    if ( bitop_bad_size(addr) ) __bitop_bad_size();     \
> > > > +#define arch_test_bit(nr, addr) ({                      \
> > > 
> > > ... just arch_ here? I don't like the double underscore infixes
> > > very
> > > much, but I'm okay with them as long as they're applied
> > > consistently.
> > 
> > Common code and x86 use __test_and_clear_bit(), and this patch
> > provides
> > a generic version of __test_and_clear_bit(). To emphasize that
> > generic__test_and_clear_bit() is a common implementation of
> > __test_and_clear_bit(), the double underscore was retained. Also,
> > test_and_clear_bit() exists and if one day it will be needed to
> > provide
> > a generic version of it, then it will be needed to have
> > generic__test_and_clear_bit() and generic_test_and_clear_bit()
> > 
> > A similar logic was chosen for test_bit.
> 
> Right, but in all of your reply arch_ doesn't appear at all.
I am a little confused here. According to my logic, should it be
arch___test_and_set_bit() and generic___test_and_set_bit()?

If you are asking why there is no generic implementation for
test_and_clear_bit() without the double underscores, the reason is that
Arm, PPC, and x86 don't use generic code and rely on specific
instructions for this operation. Therefore, I didn't see much sense in
providing a generic version of test_and_clear_bit(), at least, for now.


>  Yet the
> question was: Why then not arch__test_bit(), to match the other arch
> helpers?
Because no one uses __test_bit(). Everywhere is used test_bit().

> 
> > > > --- a/xen/include/xen/bitops.h
> > > > +++ b/xen/include/xen/bitops.h
> > > > @@ -65,10 +65,144 @@ static inline int generic_flsl(unsigned
> > > > long
> > > > x)
> > > >   * scope
> > > >   */
> > > >  
> > > > +#define BITOP_MASK(nr)  ((bitop_uint_t)1 << ((nr) %
> > > > BITOP_BITS_PER_WORD))
> > > > +
> > > > +#define BITOP_WORD(nr)  ((nr) / BITOP_BITS_PER_WORD)
> > > > +
> > > > +extern void __bitop_bad_size(void);
> > > > +
> > > >  /* --------------------- Please tidy above here --------------
> > > > ----
> > > > --- */
> > > >  
> > > >  #include <asm/bitops.h>
> > > >  
> > > > +#ifndef arch_check_bitop_size
> > > > +
> > > > +#define bitop_bad_size(addr) sizeof(*(addr)) <
> > > > sizeof(bitop_uint_t)
> > > 
> > > Nit: Missing parentheses around the whole expression.
> > > 
> > > > +#define arch_check_bitop_size(addr) \
> > > > +    if ( bitop_bad_size(addr) ) __bitop_bad_size();
> > > 
> > > Apart from the arch_ prefix that now wants dropping, this macro
> > > (if
> > > we
> > > want one in the first place) 
> > What do you mean by 'want' here? I thought it is pretty necessary
> > from
> > safety point of view to have this check.
> 
> I don't question the check. What I was questioning is the need for a
> macro to wrap that, seeing how x86 did without. I'm not outright
> objecting to such a macro, though.
We can follow x86 approach as we there's no any more intention to allow
arch overrides.

> 
> > Except arch_ prefix does it make sense to drop "#ifndef
> > arch_check_bitop_size" around this macros?
> 
> Of course, as with arch_ dropped from the name there's no intention
> to
> allow arch overrides.
> 
> > > > +/**
> > > > + * generic__test_and_set_bit - Set a bit and return its old
> > > > value
> > > > + * @nr: Bit to set
> > > > + * @addr: Address to count from
> > > > + *
> > > > + * This operation is non-atomic and can be reordered.
> > > > + * If two examples of this operation race, one can appear to
> > > > succeed
> > > > + * but actually fail.  You must protect multiple accesses with
> > > > a
> > > > lock.
> > > > + */
> > > > +static always_inline bool
> > > > +generic__test_and_set_bit(unsigned long nr, volatile void
> > > > *addr)
> > > 
> > > The original per-arch functions all use "int" for their first
> > > parameter.
> > > Here you use unsigned long, without any mention in the
> > > description of
> > > the
> > > potential behavioral change. Which is even worse given that then
> > > x86
> > > ends
> > > up inconsistent with Arm and PPC in this regard, by ...
> > > 
> > > > +{
> > > > +    bitop_uint_t mask = BITOP_MASK(nr);
> > > > +    volatile bitop_uint_t *p = (volatile bitop_uint_t *)addr +
> > > > BITOP_WORD(nr);
> > > > +    bitop_uint_t old = *p;
> > > > +
> > > > +    *p = old | mask;
> > > > +    return (old & mask);
> > > > +}
> > > > +
> > > > +/**
> > > > + * generic__test_and_clear_bit - Clear a bit and return its
> > > > old
> > > > value
> > > > + * @nr: Bit to clear
> > > > + * @addr: Address to count from
> > > > + *
> > > > + * This operation is non-atomic and can be reordered.
> > > > + * If two examples of this operation race, one can appear to
> > > > succeed
> > > > + * but actually fail.  You must protect multiple accesses with
> > > > a
> > > > lock.
> > > > + */
> > > > +static always_inline bool
> > > > +generic__test_and_clear_bit(bitop_uint_t nr, volatile void
> > > > *addr)
> > > > +{
> > > > +    bitop_uint_t mask = BITOP_MASK(nr);
> > > > +    volatile bitop_uint_t *p = (volatile bitop_uint_t *)addr +
> > > > BITOP_WORD(nr);
> > > > +    bitop_uint_t old = *p;
> > > > +
> > > > +    *p = old & ~mask;
> > > > +    return (old & mask);
> > > > +}
> > > > +
> > > > +/* WARNING: non atomic and it can be reordered! */
> > > > +static always_inline bool
> > > > +generic__test_and_change_bit(unsigned long nr, volatile void
> > > > *addr)
> > > > +{
> > > > +    bitop_uint_t mask = BITOP_MASK(nr);
> > > > +    volatile bitop_uint_t *p = (volatile bitop_uint_t *)addr +
> > > > BITOP_WORD(nr);
> > > > +    bitop_uint_t old = *p;
> > > > +
> > > > +    *p = old ^ mask;
> > > > +    return (old & mask);
> > > > +}
> > > > +/**
> > > > + * generic_test_bit - Determine whether a bit is set
> > > > + * @nr: bit number to test
> > > > + * @addr: Address to start counting from
> > > > + */
> > > > +static always_inline bool generic_test_bit(int nr, const
> > > > volatile
> > > > void *addr)
> > > > +{
> > > > +    bitop_uint_t mask = BITOP_MASK(nr);
> > > > +    volatile bitop_uint_t *p = (volatile bitop_uint_t *)addr +
> > > > BITOP_WORD(nr);
> > > > +
> > > > +    return (*p & mask);
> > > > +}
> > > > +
> > > > +static always_inline bool
> > > > +__test_and_set_bit(unsigned long nr, volatile void *addr)
> > > > +{
> > > > +#ifndef arch__test_and_set_bit
> > > > +#define arch__test_and_set_bit generic__test_and_set_bit
> > > > +#endif
> > > > +
> > > > +    return arch__test_and_set_bit(nr, addr);
> > > 
> > > ... silently truncating and sign-converting nr here.
> > Missed that fact. AFAIU there is no specific reason for bit number
> > to
> > be 'int' for this function, so it makes sense to update x86's
> > prototype
> > of arch__test_and_set_bit() to:
> >    static inline int arch__test_and_set_bit(unsigned long nr,
> > volatile
> >    void *addr).
> >    
> > But probably I can't use 'unsigned long' here too, as it should a
> > compilation error around 'btsl' instruction.
> > 
> > So it can be or 'unsinged int' or 'int'.
> > > 
> > > As to generic_test_bit() - please don't cast away const-ness
> > > there.
> > > 
> > > > +}
> > > > +#define __test_and_set_bit(nr, addr) ({             \
> > > > +    arch_check_bitop_size(addr);                    \
> > > > +    __test_and_set_bit(nr, addr);                   \
> > > > +})
> > > > +
> > > > +static always_inline bool
> > > > +__test_and_clear_bit(bitop_uint_t nr, volatile void *addr)
> > > 
> > > Oddly enough here at least you use bitop_uint_t, but that's still
> > > ...
> > > 
> > > > +{
> > > > +#ifndef arch__test_and_clear_bit
> > > > +#define arch__test_and_clear_bit generic__test_and_clear_bit
> > > > +#endif
> > > > +
> > > > +    return arch__test_and_clear_bit(nr, addr);
> > > 
> > > ... meaning a signedness conversion on x86 then. And beware: You
> > > can't
> > > simply change x86'es code to use bitop_uint_t. The underlying
> > > insns
> > > used
> > > interpret the bit position as a signed number, i.e. permitting
> > > accesses
> > > below the incoming pointer (whether it really makes sense to be
> > > that
> > > way
> > > is a separate question). I'm afraid I have no good suggestion how
> > > to
> > > deal
> > > with that: Any approach I can think of is either detrimental to
> > > the
> > > generic implementation or would have unwanted effects on the x86
> > > one.
> > > Others, anyone?
> > Is the signedness conversion here a big problem? I suppose no one
> > will
> > pass a negative number to nr.
> > 
> > It seems to me that it would be better to use 'int' everywhere and
> > not
> > use bitop_uint_t for 'nr' since it is just a bit position. 'Int'
> > provides enough range for possible bit number.
> 
> Indeed, and that's then hopefully less of a risk as to introducing
> hard
> to spot issues. Provided Arm and PPC are okay with that type then as
> well.
Then I will update prototypes of generic functions correspondingly.

> 
> > > > --- a/xen/include/xen/types.h
> > > > +++ b/xen/include/xen/types.h
> > > > @@ -64,6 +64,12 @@ typedef __u64 __be64;
> > > >  
> > > >  typedef unsigned int __attribute__((__mode__(__pointer__)))
> > > > uintptr_t;
> > > >  
> > > > +#ifndef BITOP_TYPE
> > > > +#define BITOP_BITS_PER_WORD 32
> > > > +
> > > > +typedef uint32_t bitop_uint_t;
> > > > +#endif
> > > 
> > > I think you mentioned to me before why this needs to live here,
> > > not
> > > in
> > > xen/bitops.h. Yet I don't recall the reason, and the description
> > > (hint,
> > > hint) doesn't say anything either.
> > If I remember correctly ( after this phrase I think I have to
> > update
> > the description ) the reason was that if I put that to
> > xen/bitops.h:
> > 
> >     ...
> >     #include <asm/bitops.h>
> >     
> >     #ifndef BITOP_TYPE
> >     #define BITOP_BITS_PER_WORD 32
> >     /* typedef uint32_t bitop_uint_t; */
> >     #endif
> >     ...
> > 
> > Then we will have an issue that we can't use the generic definition
> > of 
> > BITOP_BITS_PER_WORD and bitop_uint_t in asm/bitops.h as it is
> > defined
> > after inclusion of <asm/bitops.h>.
> > 
> > But if to put it to <xen/types.h> then such problem won't occur.
> > 
> > If it still makes sense, then I'll update the description in the
> > next
> > patch version.
> 
> I see. But we don't need to allow for arch overrides here anymore, so
> in
> xen/bitops.h couldn't we as well have
> 
> #define BITOP_BITS_PER_WORD 32
> typedef uint32_t bitop_uint_t;
> 
> ... (if necessary)
> 
> #include <asm/bitops.h>
> 
> ?
Good point. If there is not need to allow for arch overrides then we
can use suggested above approach. Thanks.

~ Oleksii



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.