Xen project Mailing List

Re: [PATCH v9 02/15] xen: introduce generic non-atomic test_*bit()

On Wed, 2024-05-15 at 17:41 +0200, Jan Beulich wrote: > On 15.05.2024 17:29, Oleksii K. wrote: > > On Wed, 2024-05-15 at 10:52 +0200, Jan Beulich wrote: > > > On 06.05.2024 12:15, Oleksii Kurochko wrote: > > > > The following generic functions were introduced: > > > > * test_bit > > > > * generic__test_and_set_bit > > > > * generic__test_and_clear_bit > > > > * generic__test_and_change_bit > > > > > > > > Also, the patch introduces the following generics which are > > > > used by the functions mentioned above: > > > > * BITOP_BITS_PER_WORD > > > > * BITOP_MASK > > > > * BITOP_WORD > > > > * BITOP_TYPE > > > > > > > > These functions and macros can be useful for architectures > > > > that don't have corresponding arch-specific instructions. > > > > > > Logically this paragraph may better move ahead of the BITOP_* > > > one. > > > > > > > Because of that x86 has the following check in the macros > > > > test_bit(), > > > > __test_and_set_bit(), __test_and_clear_bit(), > > > > __test_and_change_bit(): > > > > if ( bitop_bad_size(addr) ) __bitop_bad_size(); > > > > It was necessary to make bitop bad size check generic too, so > > > > arch_check_bitop_size() was introduced. > > > > > > Not anymore, with the most recent adjustments? There's nothing > > > arch- > > > specific anymore in the checking. > > > > > > > @@ -183,7 +180,7 @@ static inline int test_and_set_bit(int nr, > > > > volatile void *addr) > > > > * If two examples of this operation race, one can appear to > > > > succeed > > > > * but actually fail. You must protect multiple accesses with > > > > a > > > > lock. > > > > */ > > > > -static inline int __test_and_set_bit(int nr, void *addr) > > > > +static inline int arch__test_and_set_bit(int nr, volatile void > > > > *addr) > > > > > > I think I raised this point before: Why arch__ here, ... > > > > > > > @@ -232,7 +226,7 @@ static inline int test_and_clear_bit(int > > > > nr, > > > > volatile void *addr) > > > > * If two examples of this operation race, one can appear to > > > > succeed > > > > * but actually fail. You must protect multiple accesses with > > > > a > > > > lock. > > > > */ > > > > -static inline int __test_and_clear_bit(int nr, void *addr) > > > > +static inline int arch__test_and_clear_bit(int nr, volatile > > > > void > > > > *addr) > > > > > > ... here, ... > > > > > > > @@ -243,13 +237,10 @@ static inline int > > > > __test_and_clear_bit(int > > > > nr, void *addr) > > > > > > > > return oldbit; > > > > } > > > > -#define __test_and_clear_bit(nr, addr) ({ \ > > > > - if ( bitop_bad_size(addr) ) __bitop_bad_size(); \ > > > > - __test_and_clear_bit(nr, addr); \ > > > > -}) > > > > +#define arch__test_and_clear_bit arch__test_and_clear_bit > > > > > > > > /* WARNING: non atomic and it can be reordered! */ > > > > -static inline int __test_and_change_bit(int nr, void *addr) > > > > +static inline int arch__test_and_change_bit(int nr, volatile > > > > void > > > > *addr) > > > > > > ... and here, while ... > > > > > > > @@ -307,8 +295,7 @@ static inline int variable_test_bit(int nr, > > > > const volatile void *addr) > > > > return oldbit; > > > > } > > > > > > > > -#define test_bit(nr, addr) ({ \ > > > > - if ( bitop_bad_size(addr) ) __bitop_bad_size(); \ > > > > +#define arch_test_bit(nr, addr) ({ \ > > > > > > ... just arch_ here? I don't like the double underscore infixes > > > very > > > much, but I'm okay with them as long as they're applied > > > consistently. > > > > Common code and x86 use __test_and_clear_bit(), and this patch > > provides > > a generic version of __test_and_clear_bit(). To emphasize that > > generic__test_and_clear_bit() is a common implementation of > > __test_and_clear_bit(), the double underscore was retained. Also, > > test_and_clear_bit() exists and if one day it will be needed to > > provide > > a generic version of it, then it will be needed to have > > generic__test_and_clear_bit() and generic_test_and_clear_bit() > > > > A similar logic was chosen for test_bit. > > Right, but in all of your reply arch_ doesn't appear at all. I am a little confused here. According to my logic, should it be arch___test_and_set_bit() and generic___test_and_set_bit()? If you are asking why there is no generic implementation for test_and_clear_bit() without the double underscores, the reason is that Arm, PPC, and x86 don't use generic code and rely on specific instructions for this operation. Therefore, I didn't see much sense in providing a generic version of test_and_clear_bit(), at least, for now. > Yet the > question was: Why then not arch__test_bit(), to match the other arch > helpers? Because no one uses __test_bit(). Everywhere is used test_bit(). > > > > > --- a/xen/include/xen/bitops.h > > > > +++ b/xen/include/xen/bitops.h > > > > @@ -65,10 +65,144 @@ static inline int generic_flsl(unsigned > > > > long > > > > x) > > > > * scope > > > > */ > > > > > > > > +#define BITOP_MASK(nr) ((bitop_uint_t)1 << ((nr) % > > > > BITOP_BITS_PER_WORD)) > > > > + > > > > +#define BITOP_WORD(nr) ((nr) / BITOP_BITS_PER_WORD) > > > > + > > > > +extern void __bitop_bad_size(void); > > > > + > > > > /* --------------------- Please tidy above here -------------- > > > > ---- > > > > --- */ > > > > > > > > #include <asm/bitops.h> > > > > > > > > +#ifndef arch_check_bitop_size > > > > + > > > > +#define bitop_bad_size(addr) sizeof(*(addr)) < > > > > sizeof(bitop_uint_t) > > > > > > Nit: Missing parentheses around the whole expression. > > > > > > > +#define arch_check_bitop_size(addr) \ > > > > + if ( bitop_bad_size(addr) ) __bitop_bad_size(); > > > > > > Apart from the arch_ prefix that now wants dropping, this macro > > > (if > > > we > > > want one in the first place) > > What do you mean by 'want' here? I thought it is pretty necessary > > from > > safety point of view to have this check. > > I don't question the check. What I was questioning is the need for a > macro to wrap that, seeing how x86 did without. I'm not outright > objecting to such a macro, though. We can follow x86 approach as we there's no any more intention to allow arch overrides. > > > Except arch_ prefix does it make sense to drop "#ifndef > > arch_check_bitop_size" around this macros? > > Of course, as with arch_ dropped from the name there's no intention > to > allow arch overrides. > > > > > +/** > > > > + * generic__test_and_set_bit - Set a bit and return its old > > > > value > > > > + * @nr: Bit to set > > > > + * @addr: Address to count from > > > > + * > > > > + * This operation is non-atomic and can be reordered. > > > > + * If two examples of this operation race, one can appear to > > > > succeed > > > > + * but actually fail. You must protect multiple accesses with > > > > a > > > > lock. > > > > + */ > > > > +static always_inline bool > > > > +generic__test_and_set_bit(unsigned long nr, volatile void > > > > *addr) > > > > > > The original per-arch functions all use "int" for their first > > > parameter. > > > Here you use unsigned long, without any mention in the > > > description of > > > the > > > potential behavioral change. Which is even worse given that then > > > x86 > > > ends > > > up inconsistent with Arm and PPC in this regard, by ... > > > > > > > +{ > > > > + bitop_uint_t mask = BITOP_MASK(nr); > > > > + volatile bitop_uint_t *p = (volatile bitop_uint_t *)addr + > > > > BITOP_WORD(nr); > > > > + bitop_uint_t old = *p; > > > > + > > > > + *p = old | mask; > > > > + return (old & mask); > > > > +} > > > > + > > > > +/** > > > > + * generic__test_and_clear_bit - Clear a bit and return its > > > > old > > > > value > > > > + * @nr: Bit to clear > > > > + * @addr: Address to count from > > > > + * > > > > + * This operation is non-atomic and can be reordered. > > > > + * If two examples of this operation race, one can appear to > > > > succeed > > > > + * but actually fail. You must protect multiple accesses with > > > > a > > > > lock. > > > > + */ > > > > +static always_inline bool > > > > +generic__test_and_clear_bit(bitop_uint_t nr, volatile void > > > > *addr) > > > > +{ > > > > + bitop_uint_t mask = BITOP_MASK(nr); > > > > + volatile bitop_uint_t *p = (volatile bitop_uint_t *)addr + > > > > BITOP_WORD(nr); > > > > + bitop_uint_t old = *p; > > > > + > > > > + *p = old & ~mask; > > > > + return (old & mask); > > > > +} > > > > + > > > > +/* WARNING: non atomic and it can be reordered! */ > > > > +static always_inline bool > > > > +generic__test_and_change_bit(unsigned long nr, volatile void > > > > *addr) > > > > +{ > > > > + bitop_uint_t mask = BITOP_MASK(nr); > > > > + volatile bitop_uint_t *p = (volatile bitop_uint_t *)addr + > > > > BITOP_WORD(nr); > > > > + bitop_uint_t old = *p; > > > > + > > > > + *p = old ^ mask; > > > > + return (old & mask); > > > > +} > > > > +/** > > > > + * generic_test_bit - Determine whether a bit is set > > > > + * @nr: bit number to test > > > > + * @addr: Address to start counting from > > > > + */ > > > > +static always_inline bool generic_test_bit(int nr, const > > > > volatile > > > > void *addr) > > > > +{ > > > > + bitop_uint_t mask = BITOP_MASK(nr); > > > > + volatile bitop_uint_t *p = (volatile bitop_uint_t *)addr + > > > > BITOP_WORD(nr); > > > > + > > > > + return (*p & mask); > > > > +} > > > > + > > > > +static always_inline bool > > > > +__test_and_set_bit(unsigned long nr, volatile void *addr) > > > > +{ > > > > +#ifndef arch__test_and_set_bit > > > > +#define arch__test_and_set_bit generic__test_and_set_bit > > > > +#endif > > > > + > > > > + return arch__test_and_set_bit(nr, addr); > > > > > > ... silently truncating and sign-converting nr here. > > Missed that fact. AFAIU there is no specific reason for bit number > > to > > be 'int' for this function, so it makes sense to update x86's > > prototype > > of arch__test_and_set_bit() to: > > static inline int arch__test_and_set_bit(unsigned long nr, > > volatile > > void *addr). > > > > But probably I can't use 'unsigned long' here too, as it should a > > compilation error around 'btsl' instruction. > > > > So it can be or 'unsinged int' or 'int'. > > > > > > As to generic_test_bit() - please don't cast away const-ness > > > there. > > > > > > > +} > > > > +#define __test_and_set_bit(nr, addr) ({ \ > > > > + arch_check_bitop_size(addr); \ > > > > + __test_and_set_bit(nr, addr); \ > > > > +}) > > > > + > > > > +static always_inline bool > > > > +__test_and_clear_bit(bitop_uint_t nr, volatile void *addr) > > > > > > Oddly enough here at least you use bitop_uint_t, but that's still > > > ... > > > > > > > +{ > > > > +#ifndef arch__test_and_clear_bit > > > > +#define arch__test_and_clear_bit generic__test_and_clear_bit > > > > +#endif > > > > + > > > > + return arch__test_and_clear_bit(nr, addr); > > > > > > ... meaning a signedness conversion on x86 then. And beware: You > > > can't > > > simply change x86'es code to use bitop_uint_t. The underlying > > > insns > > > used > > > interpret the bit position as a signed number, i.e. permitting > > > accesses > > > below the incoming pointer (whether it really makes sense to be > > > that > > > way > > > is a separate question). I'm afraid I have no good suggestion how > > > to > > > deal > > > with that: Any approach I can think of is either detrimental to > > > the > > > generic implementation or would have unwanted effects on the x86 > > > one. > > > Others, anyone? > > Is the signedness conversion here a big problem? I suppose no one > > will > > pass a negative number to nr. > > > > It seems to me that it would be better to use 'int' everywhere and > > not > > use bitop_uint_t for 'nr' since it is just a bit position. 'Int' > > provides enough range for possible bit number. > > Indeed, and that's then hopefully less of a risk as to introducing > hard > to spot issues. Provided Arm and PPC are okay with that type then as > well. Then I will update prototypes of generic functions correspondingly. > > > > > --- a/xen/include/xen/types.h > > > > +++ b/xen/include/xen/types.h > > > > @@ -64,6 +64,12 @@ typedef __u64 __be64; > > > > > > > > typedef unsigned int __attribute__((__mode__(__pointer__))) > > > > uintptr_t; > > > > > > > > +#ifndef BITOP_TYPE > > > > +#define BITOP_BITS_PER_WORD 32 > > > > + > > > > +typedef uint32_t bitop_uint_t; > > > > +#endif > > > > > > I think you mentioned to me before why this needs to live here, > > > not > > > in > > > xen/bitops.h. Yet I don't recall the reason, and the description > > > (hint, > > > hint) doesn't say anything either. > > If I remember correctly ( after this phrase I think I have to > > update > > the description ) the reason was that if I put that to > > xen/bitops.h: > > > > ... > > #include <asm/bitops.h> > > > > #ifndef BITOP_TYPE > > #define BITOP_BITS_PER_WORD 32 > > /* typedef uint32_t bitop_uint_t; */ > > #endif > > ... > > > > Then we will have an issue that we can't use the generic definition > > of > > BITOP_BITS_PER_WORD and bitop_uint_t in asm/bitops.h as it is > > defined > > after inclusion of <asm/bitops.h>. > > > > But if to put it to <xen/types.h> then such problem won't occur. > > > > If it still makes sense, then I'll update the description in the > > next > > patch version. > > I see. But we don't need to allow for arch overrides here anymore, so > in > xen/bitops.h couldn't we as well have > > #define BITOP_BITS_PER_WORD 32 > typedef uint32_t bitop_uint_t; > > ... (if necessary) > > #include <asm/bitops.h> > > ? Good point. If there is not need to allow for arch overrides then we can use suggested above approach. Thanks. ~ Oleksii

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.