
Re: [PATCH v10 02/14] xen: introduce generic non-atomic test_*bit()



On Fri, 2024-05-24 at 08:48 +0200, Jan Beulich wrote:
> On 23.05.2024 18:40, Oleksii K. wrote:
> > On Thu, 2024-05-23 at 15:33 +0100, Julien Grall wrote:
> > > On 23/05/2024 15:11, Oleksii K. wrote:
> > > > On Thu, 2024-05-23 at 14:00 +0100, Julien Grall wrote:
> > > > > On 17/05/2024 14:54, Oleksii Kurochko wrote:
> > > > > > diff --git a/xen/arch/arm/arm64/livepatch.c b/xen/arch/arm/arm64/livepatch.c
> > > > > > index df2cebedde..4bc8ed9be5 100644
> > > > > > --- a/xen/arch/arm/arm64/livepatch.c
> > > > > > +++ b/xen/arch/arm/arm64/livepatch.c
> > > > > > @@ -10,7 +10,6 @@
> > > > > >    #include <xen/mm.h>
> > > > > >    #include <xen/vmap.h>
> > > > > >    
> > > > > > -#include <asm/bitops.h>
> > > > > 
> > > > > It is a bit unclear how this change is related to the patch.
> > > > > Can you explain in the commit message?
> > > > It probably isn't needed anymore. I will double check, and if this
> > > > change is not needed, I will drop it in the next patch version.
> > > > 
> > > > > 
> > > > > >    #include <asm/byteorder.h>
> > > > > >    #include <asm/insn.h>
> > > > > >    #include <asm/livepatch.h>
> > > > > > diff --git a/xen/arch/arm/include/asm/bitops.h b/xen/arch/arm/include/asm/bitops.h
> > > > > > index 5104334e48..8e16335e76 100644
> > > > > > --- a/xen/arch/arm/include/asm/bitops.h
> > > > > > +++ b/xen/arch/arm/include/asm/bitops.h
> > > > > > @@ -22,9 +22,6 @@
> > > > > >    #define __set_bit(n,p)            set_bit(n,p)
> > > > > >    #define __clear_bit(n,p)          clear_bit(n,p)
> > > > > >    
> > > > > > -#define BITOP_BITS_PER_WORD     32
> > > > > > -#define BITOP_MASK(nr)          (1UL << ((nr) % BITOP_BITS_PER_WORD))
> > > > > > -#define BITOP_WORD(nr)          ((nr) / BITOP_BITS_PER_WORD)
> > > > > >    #define BITS_PER_BYTE           8
> > > > > 
> > > > > OOI, any reason BITS_PER_BYTE has not been moved as well? I don't
> > > > > expect the value to change across arch.
> > > > I can move it to the generic header too in the next patch version.
> > > > 
> > > > > 
> > > > > [...]
> > > > > 
> > > > > > diff --git a/xen/include/xen/bitops.h b/xen/include/xen/bitops.h
> > > > > > index f14ad0d33a..6eeeff0117 100644
> > > > > > --- a/xen/include/xen/bitops.h
> > > > > > +++ b/xen/include/xen/bitops.h
> > > > > > @@ -65,10 +65,141 @@ static inline int generic_flsl(unsigned long x)
> > > > > >     * scope
> > > > > >     */
> > > > > >    
> > > > > > +#define BITOP_BITS_PER_WORD 32
> > > > > > +typedef uint32_t bitop_uint_t;
> > > > > > +
> > > > > > +#define BITOP_MASK(nr)  ((bitop_uint_t)1 << ((nr) % BITOP_BITS_PER_WORD))
> > > > > > +
> > > > > > +#define BITOP_WORD(nr)  ((nr) / BITOP_BITS_PER_WORD)
> > > > > > +
> > > > > > +extern void __bitop_bad_size(void);
> > > > > > +
> > > > > > +#define bitop_bad_size(addr) (sizeof(*(addr)) < sizeof(bitop_uint_t))
> > > > > > +
> > > > > >    /* --------------------- Please tidy above here --------------------- */
> > > > > >    
> > > > > >    #include <asm/bitops.h>
> > > > > >    
> > > > > > +/**
> > > > > > + * generic__test_and_set_bit - Set a bit and return its old value
> > > > > > + * @nr: Bit to set
> > > > > > + * @addr: Address to count from
> > > > > > + *
> > > > > > + * This operation is non-atomic and can be reordered.
> > > > > > + * If two examples of this operation race, one can appear to succeed
> > > > > > + * but actually fail.  You must protect multiple accesses with a lock.
> > > > > > + */
> > > > > 
> > > > > Sorry for only mentioning this on v10. I think this comment should
> > > > > be duplicated on top of test_bit() (or moved there) because this is
> > > > > what everyone will use. This will save the developer from having to
> > > > > follow the function calls and only notice the x86 version which says
> > > > > "This function is atomic and may not be reordered.", which would be
> > > > > wrong for all the other architectures.
> > > > It makes sense to add this comment on top of test_bit(), but I am
> > > > curious if it is needed to mention that for x86 arch_test_bit() "is
> > > > atomic and may not be reordered":
> > > 
> > > I would say no, because any developer modifying common code can't
> > > rely on it.
> > > 
> > > > 
> > > >   * This operation is non-atomic and can be reordered. ( Exception: for
> > > > * x86 arch_test_bit() is atomic and may not be reordered )
> > > >   * If two examples of this operation race, one can appear to succeed
> > > >   * but actually fail.  You must protect multiple accesses with a lock.
> > > >   */
> > > > 
> > > > > 
> > > > > > +static always_inline bool
> > > > > > +generic__test_and_set_bit(int nr, volatile void *addr)
> > > > > > +{
> > > > > > +    bitop_uint_t mask = BITOP_MASK(nr);
> > > > > > +    volatile bitop_uint_t *p = (volatile bitop_uint_t *)addr + BITOP_WORD(nr);
> > > > > > +    bitop_uint_t old = *p;
> > > > > > +
> > > > > > +    *p = old | mask;
> > > > > > +    return (old & mask);
> > > > > > +}
> > > > > > +
> > > > > > +/**
> > > > > > + * generic__test_and_clear_bit - Clear a bit and return its old value
> > > > > > + * @nr: Bit to clear
> > > > > > + * @addr: Address to count from
> > > > > > + *
> > > > > > + * This operation is non-atomic and can be reordered.
> > > > > > + * If two examples of this operation race, one can appear to succeed
> > > > > > + * but actually fail.  You must protect multiple accesses with a lock.
> > > > > > + */
> > > > > 
> > > > > Same applies here and ...
> > > > > 
> > > > > > +static always_inline bool
> > > > > > +generic__test_and_clear_bit(int nr, volatile void *addr)
> > > > > > +{
> > > > > > +    bitop_uint_t mask = BITOP_MASK(nr);
> > > > > > +    volatile bitop_uint_t *p = (volatile bitop_uint_t *)addr + BITOP_WORD(nr);
> > > > > > +    bitop_uint_t old = *p;
> > > > > > +
> > > > > > +    *p = old & ~mask;
> > > > > > +    return (old & mask);
> > > > > > +}
> > > > > > +
> > > > > > +/* WARNING: non atomic and it can be reordered! */
> > > > > 
> > > > > ... here.
> > > > > 
> > > > > > +static always_inline bool
> > > > > > +generic__test_and_change_bit(int nr, volatile void *addr)
> > > > > > +{
> > > > > > +    bitop_uint_t mask = BITOP_MASK(nr);
> > > > > > +    volatile bitop_uint_t *p = (volatile bitop_uint_t *)addr + BITOP_WORD(nr);
> > > > > > +    bitop_uint_t old = *p;
> > > > > > +
> > > > > > +    *p = old ^ mask;
> > > > > > +    return (old & mask);
> > > > > > +}
> > > > > > +/**
> > > > > > + * generic_test_bit - Determine whether a bit is set
> > > > > > + * @nr: bit number to test
> > > > > > + * @addr: Address to start counting from
> > > > > > + */
> > > > > > +static always_inline bool generic_test_bit(int nr, const volatile void *addr)
> > > > > > +{
> > > > > > +    bitop_uint_t mask = BITOP_MASK(nr);
> > > > > > +    const volatile bitop_uint_t *p =
> > > > > > +                        (const volatile bitop_uint_t *)addr + BITOP_WORD(nr);
> > > > > > +
> > > > > > +    return (*p & mask);
> > > > > > +}
> > > > > > +
> > > > > > +static always_inline bool
> > > > > > +__test_and_set_bit(int nr, volatile void *addr)
> > > > > > +{
> > > > > > +#ifndef arch__test_and_set_bit
> > > > > > +#define arch__test_and_set_bit generic__test_and_set_bit
> > > > > > +#endif
> > > > > > +
> > > > > > +    return arch__test_and_set_bit(nr, addr);
> > > > > > +}
> > > > > 
> > > > > NIT: It is a bit too late to change this one. But I have to admit,
> > > > > I don't understand the purpose of the static inline when you could
> > > > > have simply called...
> > > > > 
> > > > > > +#define __test_and_set_bit(nr, addr) ({             \
> > > > > > +    if ( bitop_bad_size(addr) ) __bitop_bad_size(); \
> > > > > > +    __test_and_set_bit(nr, addr);                   \
> > > > > 
> > > > > ... __arch__test_and_set_bit here.
> > > > I just wanted to be in sync with other non-atomic test*_bit()
> > > > operations and Andrew's patch series ( xen/bitops: Reduce the mess,
> > > > starting with ffs() ).
> > > 
> > > I looked at Andrew's series and I can't seem to find an example where
> > > he uses both the macro + static inline. He seems to only use a static
> > > inline. Do you have any pointer?
> > > 
> > > But by any chance, are you referring to the pattern on x86? If so, I
> > > would really like to understand what the benefit is of introducing a
> > > one-liner static inline which is "shadowed" by a macro...
> > Yes, I was referring to the x86 pattern.
> > 
> > I tried to find the reason in the commit but couldn't:
> > https://lists.xenproject.org/archives/html/xen-changelog/2008-03/msg00097.html
> > 
> > For some reason, I also couldn't find the mailing list thread for
> > this.
> > 
> > Perhaps Jan could help here; based on the commit message, he might
> > have a suggestion.
> 
> There's a lot of apparently unrelated context left, so I'm trying to
> guess what the actual question is from the old change you're pointing
> at: With
> 
> #define __test_and_set_bit(nr, addr) ({             \
>    if ( bitop_bad_size(addr) ) __bitop_bad_size(); \
>    __test_and_set_bit(nr, addr);                   \
> })
> 
> why do we have that wrapping macro? If that's indeed the question, then
> may I suggest you consider what would happen to the sizeof() in
> bitop_bad_size() if the invocation was moved to the inline function,
> taking pointer-to-void?
> 
> However, I can't relate that old change to the question I think Julien
> raised (invoking __test_and_set_bit() vs arch__test_and_set_bit()), so
> perhaps I'm missing something.
If I understand the question correctly, it is why we need the static
inline function at all. Can't we just do the following:

#ifndef arch_test_bit
#define arch_test_bit generic_test_bit
#endif

#define test_bit(nr, addr) ({                           \
    if ( bitop_bad_size(addr) ) __bitop_bad_size();     \
    arch_test_bit(nr, addr);                            \
})
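
For context, and purely as an illustration (none of this is part of the
patch, and the names below are made up), Jan's point about the sizeof()
can be shown with a minimal standalone sketch: the size check has to be
expanded as a macro, because only the macro still sees the caller's
pointer type. Inside a function taking 'volatile void *' that type
information is gone, so the check could not be done there, whichever way
we resolve the static inline question.

#include <stdint.h>
#include <stdio.h>

typedef uint32_t bitop_uint_t;

/* Same shape as in the patch: evaluated against the caller's pointer type. */
#define bitop_bad_size(addr) (sizeof(*(addr)) < sizeof(bitop_uint_t))

int main(void)
{
    uint8_t too_small = 0;
    bitop_uint_t fine = 0;

    /* Prints "1 0": the uint8_t operand is flagged, the 32-bit one is not. */
    printf("%d %d\n", (int)bitop_bad_size(&too_small),
           (int)bitop_bad_size(&fine));

    /*
     * A helper declared as taking 'volatile void *addr' could not perform
     * this check: the uint8_t/bitop_uint_t types are no longer visible
     * there, and sizeof(*addr) on a pointer-to-void is not valid C.
     */
    return 0;
}

So the macro wrapper is what keeps the size check meaningful; as I
understand it, the static inline underneath only selects between the
arch__* and generic__* implementations.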

~ Oleksii