
Re: [PATCH v7 04/19] xen: introduce generic non-atomic test_*bit()



On Thu, 2024-04-04 at 15:22 +0200, Jan Beulich wrote:
> On 03.04.2024 12:19, Oleksii Kurochko wrote:
> > The patch introduces the following generic functions:
> > * test_bit
> > * generic__test_and_set_bit
> > * generic__test_and_clear_bit
> > * generic__test_and_change_bit
> > 
> > Also, the patch introduces the following generics which are
> > used by the functions mentioned above:
> > * BITOP_BITS_PER_WORD
> > * BITOP_MASK
> > * BITOP_WORD
> > * BITOP_TYPE
> > 
> > These functions and macros can be useful for architectures
> > that don't have corresponding arch-specific instructions.
> > 
> > Because of that, x86 has the following check in the macros test_bit(),
> > __test_and_set_bit(), __test_and_clear_bit(), __test_and_change_bit():
> >     if ( bitop_bad_size(addr) ) __bitop_bad_size();
> > It was necessary to move the check to generic code and define it as 0
> > for other architectures, as they do not require this check.
> 
> Hmm, yes, the checks need to be in the outermost wrapper macros. While
> you're abstracting other stuff to arch_*(), wouldn't it make sense to
> also abstract this to e.g. arch_check_bitop_size(), with the expansion
> simply being (effectively) empty in the generic fallback case?
It would probably be better, for consistency, to introduce
arch_check_bitop_size().
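Something like the following, perhaps (a rough, untested sketch; the
arch_check_bitop_size() name follows your suggestion):

    /* x86's asm/bitops.h would provide: */
    #define arch_check_bitop_size(addr) \
        do { if ( bitop_bad_size(addr) ) __bitop_bad_size(); } while ( 0 )

    /* xen/bitops.h, after #include <asm/bitops.h>: */
    #ifndef arch_check_bitop_size
    #define arch_check_bitop_size(addr) ((void)0)
    #endif

The outermost wrappers would then reduce to e.g.:

    #define test_bit(nr, addr) ({    \
        arch_check_bitop_size(addr); \
        test_bit(nr, addr);          \
    })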

> 
> > --- a/xen/include/xen/bitops.h
> > +++ b/xen/include/xen/bitops.h
> > @@ -65,10 +65,164 @@ static inline int generic_flsl(unsigned long x)
> >   * scope
> >   */
> >  
> > +#define BITOP_BITS_PER_WORD 32
> > +/* typedef uint32_t bitop_uint_t; */
> > +#define bitop_uint_t uint32_t
> 
> So no arch overrides permitted anymore at all?
Not really; I agree that it is ugly, but I expected that an arch would
use #undef to override these.
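E.g. (untested; since asm/bitops.h is included after these definitions,
the #undef would take effect before the generic helpers are defined):

    /* In an arch's asm/bitops.h: */
    #undef BITOP_BITS_PER_WORD
    #undef bitop_uint_t
    #define BITOP_BITS_PER_WORD 64
    #define bitop_uint_t uint64_t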

> 
> > +#define BITOP_MASK(nr)  ((bitop_uint_t)1 << ((nr) % BITOP_BITS_PER_WORD))
> > +
> > +#define BITOP_WORD(nr)  ((nr) / BITOP_BITS_PER_WORD)
> > +
> >  /* --------------------- Please tidy above here --------------------- */
> >  
> >  #include <asm/bitops.h>
> >  
> > +#ifndef bitop_bad_size
> > +extern void __bitop_bad_size(void);
> 
> If not switching to arch_check_bitop_size() or alike as suggested above,
> why exactly does this need duplicating here and in x86? Can't the decl
> simply move ahead of the #include right above? (Sure, this will then
> require that nothing needing any of the functions you move here would
> still include asm/bitops.h; it would need to be xen/bitops.h everywhere.)
It could be done the way you suggest; I just wanted to keep the changes
minimal (without switching asm/bitops.h -> xen/bitops.h everywhere),
but we can consider that option.
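I.e. roughly (sketch):

    /* xen/bitops.h */
    extern void __bitop_bad_size(void);

    #include <asm/bitops.h>

    #ifndef bitop_bad_size
    #define bitop_bad_size(addr) 0
    #endif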

> 
> > +#define bitop_bad_size(addr) 0
> > +#endif
> > +
> > +/**
> > + * generic__test_and_set_bit - Set a bit and return its old value
> > + * @nr: Bit to set
> > + * @addr: Address to count from
> > + *
> > + * This operation is non-atomic and can be reordered.
> > + * If two examples of this operation race, one can appear to succeed
> > + * but actually fail.  You must protect multiple accesses with a lock.
> > + */
> > +static always_inline __pure bool
> > +generic__test_and_set_bit(unsigned long nr, volatile void *addr)
> 
> Does __pure actually fit with the use of volatile? The former says
> multiple accesses may be folded; the latter says they must not be.
I overlooked that; __pure should be dropped.
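For reference (illustrative only): __pure tells the compiler the result
depends solely on the arguments and unchanged global memory, so e.g.

    while ( !test_bit(nr, addr) )
        ;

could legally be compiled to a single read followed by an endless loop,
defeating the volatile access inside the helper.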

> 
> > +{
> > +    bitop_uint_t mask = BITOP_MASK(nr);
> > +    volatile bitop_uint_t *p = ((volatile bitop_uint_t *)addr) + BITOP_WORD(nr);
> 
> Nit: Slightly shorter line possible:
> 
>     volatile bitop_uint_t *p = (volatile bitop_uint_t *)addr + BITOP_WORD(nr);
> 
> > +    bitop_uint_t old = *p;
> > +
> > +    *p = old | mask;
> > +    return (old & mask);
> > +}
> > +
> > +/**
> > + * generic__test_and_clear_bit - Clear a bit and return its old value
> > + * @nr: Bit to clear
> > + * @addr: Address to count from
> > + *
> > + * This operation is non-atomic and can be reordered.
> > + * If two examples of this operation race, one can appear to succeed
> > + * but actually fail.  You must protect multiple accesses with a lock.
> > + */
> > +static always_inline __pure bool
> > +generic__test_and_clear_bit(bitop_uint_t nr, volatile void *addr)
> > +{
> > +    bitop_uint_t mask = BITOP_MASK(nr);
> > +    volatile bitop_uint_t *p = ((volatile bitop_uint_t *)addr) + BITOP_WORD(nr);
> > +    bitop_uint_t old = *p;
> > +
> > +    *p = old & ~mask;
> > +    return (old & mask);
> > +}
> > +
> > +/* WARNING: non atomic and it can be reordered! */
> > +static always_inline __pure bool
> > +generic__test_and_change_bit(unsigned long nr, volatile void *addr)
> > +{
> > +    bitop_uint_t mask = BITOP_MASK(nr);
> > +    volatile bitop_uint_t *p = ((volatile bitop_uint_t *)addr) + BITOP_WORD(nr);
> > +    bitop_uint_t old = *p;
> > +
> > +    *p = old ^ mask;
> > +    return (old & mask);
> > +}
> > +/**
> > + * generic_test_bit - Determine whether a bit is set
> > + * @nr: bit number to test
> > + * @addr: Address to start counting from
> > + */
> > +static always_inline __pure int generic_test_bit(int nr, const volatile void *addr)
> 
> Further up you use bool; why int here?
No specific reason, I have to update everything to bool.

> 
> > +{
> > +    const volatile bitop_uint_t *p = addr;
> > +    return 1 & (p[BITOP_WORD(nr)] >> (nr & (BITOP_BITS_PER_WORD - 1)));
> 
> Any reason not to use BITOP_MASK() here as well (once having switched
> to the bool return type)?
It seems we can use BITOP_MASK(); this implementation was simply copied
over from the arch-specific code.
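Something like this, then (untested sketch, with __pure dropped and the
bool return type as discussed above):

    static always_inline bool generic_test_bit(int nr, const volatile void *addr)
    {
        bitop_uint_t mask = BITOP_MASK(nr);
        const volatile bitop_uint_t *p =
            (const volatile bitop_uint_t *)addr + BITOP_WORD(nr);

        return (*p & mask);
    }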

> 
> > +}
> > +
> > +static always_inline __pure bool
> > +__test_and_set_bit(unsigned long nr, volatile void *addr)
> > +{
> > +#ifndef arch__test_and_set_bit
> > +#define arch__test_and_set_bit generic__test_and_set_bit
> > +#endif
> > +
> > +    return arch__test_and_set_bit(nr, addr);
> > +}
> > +#define __test_and_set_bit(nr, addr) ({             \
> > +    if ( bitop_bad_size(addr) ) __bitop_bad_size(); \
> > +    __test_and_set_bit(nr, addr);                   \
> > +})
> > +
> > +static always_inline __pure bool
> > +__test_and_clear_bit(bitop_uint_t nr, volatile void *addr)
> > +{
> > +#ifndef arch__test_and_clear_bit
> > +#define arch__test_and_clear_bit generic__test_and_clear_bit
> > +#endif
> > +
> > +    return arch__test_and_clear_bit(nr, addr);
> > +}
> > +#define __test_and_clear_bit(nr, addr) ({           \
> > +    if ( bitop_bad_size(addr) ) __bitop_bad_size(); \
> > +    __test_and_clear_bit(nr, addr);                 \
> > +})
> > +
> > +static always_inline __pure bool
> > +__test_and_change_bit(unsigned long nr, volatile void *addr)
> > +{
> > +#ifndef arch__test_and_change_bit
> > +#define arch__test_and_change_bit generic__test_and_change_bit
> > +#endif
> > +
> > +    return arch__test_and_change_bit(nr, addr);
> > +}
> > +#define __test_and_change_bit(nr, addr) ({              \
> > +    if ( bitop_bad_size(addr) ) __bitop_bad_size();     \
> > +    __test_and_change_bit(nr, addr);                    \
> > +})
> > +
> > +static always_inline __pure int test_bit(int nr, const volatile void *addr)
> 
> Further up you use bool; why int here?
> 
> > +{
> > +#ifndef arch_test_bit
> > +#define arch_test_bit generic_test_bit
> > +#endif
> > +
> > +    return arch_test_bit(nr, addr);
> > +}
> > +#define test_bit(nr, addr) ({                           \
> > +    if ( bitop_bad_size(addr) ) __bitop_bad_size();     \
> > +    test_bit(nr, addr);                                 \
> > +})
> 
> From here onwards, ...
> 
> > +static always_inline __pure int fls(unsigned int x)
> > +{
> > +    if (__builtin_constant_p(x))
> > +        return generic_fls(x);
> > +
> > +#ifndef arch_fls
> > +#define arch_fls generic_fls
> > +#endif
> > +
> > +    return arch_fls(x);
> > +}
> > +
> > +static always_inline __pure int flsl(unsigned long x)
> > +{
> > +    if (__builtin_constant_p(x))
> > +        return generic_flsl(x);
> > +
> > +#ifndef arch_flsl
> > +#define arch_flsl generic_flsl
> > +#endif
> > +
> > +    return arch_flsl(x);
> > +}
> 
> ... does all of this really belong here? Neither title nor description
> have any hint towards this.
No, it should be part of [PATCH v7 05/19] "xen/bitops: implement
fls{l}() in common logic". It ended up here due to a rebase issue; I'll
fix that.

> 
> >  /*
> >   * Find First Set bit.  Bits are labelled from 1.
> >   */
> 
> This context suggests there's a dependency on an uncommitted patch.
> Nothing above says so. I guess you have a remark in the cover letter,
> yet imo that's only partly helpful.
Is it really a hard dependency?
The current patch series does depend on ffs{l}(), and that was
mentioned in the cover letter (I'll reword the cover letter to explain
why exactly this dependency is needed), but this particular patch
doesn't really depend on Andrew's patch series, where ffs{l}() is
introduced.

~ Oleksii
