Re: [PATCH v5 11/23] xen/riscv: introduce cmpxchg.h
On Wed, 2024-03-06 at 15:56 +0100, Jan Beulich wrote:
> On 26.02.2024 18:38, Oleksii Kurochko wrote:
> > The header was taken from Linux kernel 6.4.0-rc1.
> >
> > Additionally, the following were updated:
> > * add emulation of {cmp}xchg for 1/2 byte types using 32-bit atomic
> >   access.
> > * replace tabs with spaces
> > * replace __* variables with *__
> > * introduce generic versions of xchg_* and cmpxchg_*.
> >
> > The implementation of the 4- and 8-byte cases was left as it is done
> > in the Linux kernel, in accordance with the RISC-V spec:
> > ```
> > Table A.5 (only part of the table was copied here)
> >
> > Linux Construct       RVWMO Mapping
> > atomic <op> relaxed   amo<op>.{w|d}
> > atomic <op> acquire   amo<op>.{w|d}.aq
> > atomic <op> release   amo<op>.{w|d}.rl
> > atomic <op>           amo<op>.{w|d}.aqrl
> >
> > Linux Construct       RVWMO LR/SC Mapping
> > atomic <op> relaxed   loop: lr.{w|d}; <op>; sc.{w|d}; bnez loop
> > atomic <op> acquire   loop: lr.{w|d}.aq; <op>; sc.{w|d}; bnez loop
> > atomic <op> release   loop: lr.{w|d}; <op>; sc.{w|d}.aqrl∗; bnez loop
> >                       OR fence.tso; loop: lr.{w|d}; <op>; sc.{w|d}∗; bnez loop
> > atomic <op>           loop: lr.{w|d}.aq; <op>; sc.{w|d}.aqrl; bnez loop
> >
> > The Linux mappings for release operations may seem stronger than
> > necessary, but these mappings are needed to cover some cases in which
> > Linux requires stronger orderings than the more intuitive mappings
> > would provide. In particular, as of the time this text is being
> > written, Linux is actively debating whether to require load-load,
> > load-store, and store-store orderings between accesses in one
> > critical section and accesses in a subsequent critical section in the
> > same hart and protected by the same synchronization object. Not all
> > combinations of FENCE RW,W/FENCE R,RW mappings with aq/rl mappings
> > combine to provide such orderings.
> > There are a few ways around this problem, including:
> > 1. Always use FENCE RW,W/FENCE R,RW, and never use aq/rl. This
> >    suffices but is undesirable, as it defeats the purpose of the
> >    aq/rl modifiers.
> > 2. Always use aq/rl, and never use FENCE RW,W/FENCE R,RW. This does
> >    not currently work due to the lack of load and store opcodes with
> >    aq and rl modifiers.
>
> As before I don't understand this point. Can you give an example of
> what sort of opcode / instruction is missing?
If I understand the spec correctly then l{b|h|w|d} and s{b|h|w|d}
instructions don't have aq or rl annotations. Here is the text from the
spec:

ARM Operation             RVWMO Mapping
Load                      l{b|h|w|d}
Load-Acquire              fence rw, rw; l{b|h|w|d}; fence r,rw
Load-Exclusive            lr.{w|d}
Load-Acquire-Exclusive    lr.{w|d}.aqrl
Store                     s{b|h|w|d}
Store-Release             fence rw,w; s{b|h|w|d}
Store-Exclusive           sc.{w|d}
Store-Release-Exclusive   sc.{w|d}.rl
dmb                       fence rw,rw
dmb.ld                    fence r,rw
dmb.st                    fence w,w
isb                       fence.i; fence r,r

Table A.4: Mappings from ARM operations to RISC-V operations

Table A.4 provides a mapping from ARM memory operations onto RISC-V
memory instructions. Since RISC-V does not currently have plain load and
store opcodes with aq or rl annotations, ARM load-acquire and
store-release operations should be mapped using fences instead.
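
As an illustration (my own sketch, not part of the patch): an acquire
load / release store on RISC-V therefore has to be built from fences,
along these lines (load_acquire()/store_release() are hypothetical
helper names):

/*
 * Sketch only: fence-based acquire/release accesses, given that plain
 * l{b|h|w|d}/s{b|h|w|d} cannot carry .aq/.rl annotations.
 */
static inline uint32_t load_acquire(const volatile uint32_t *p)
{
    uint32_t val = *p;

    /* Order the load before all subsequent accesses. */
    asm volatile ( "fence r, rw" ::: "memory" );

    return val;
}

static inline void store_release(volatile uint32_t *p, uint32_t val)
{
    /* Order all prior accesses before the store. */
    asm volatile ( "fence rw, w" ::: "memory" );

    *p = val;
}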
>
> > 3. Strengthen the mappings of release operations such that they
> >    would enforce sufficient orderings in the presence of either type
> >    of acquire mapping. This is the currently-recommended solution,
> >    and the one shown in Table A.5.
> > ```
> >
> > But in the Linux kernel the atomics were strengthened with fences:
> > ```
> > Atomics present the same issue with locking: release and acquire
> > variants need to be strengthened to meet the constraints defined
> > by the Linux-kernel memory consistency model [1].
> >
> > Atomics present a further issue: implementations of atomics such
> > as atomic_cmpxchg() and atomic_add_unless() rely on LR/SC pairs,
> > which do not give full-ordering with .aqrl; for example, current
> > implementations allow the "lr-sc-aqrl-pair-vs-full-barrier" test
> > below to end up with the state indicated in the "exists" clause.
> >
> > In order to "synchronize" LKMM and RISC-V's implementation, this
> > commit strengthens the implementations of the atomics operations
> > by replacing .rl and .aq with the use of ("lightweight") fences,
> > and by replacing .aqrl LR/SC pairs in sequences such as:
> >
> > 0: lr.w.aqrl %0, %addr
> > bne %0, %old, 1f
> > ...
> > sc.w.aqrl %1, %new, %addr
> > bnez %1, 0b
> > 1:
> >
> > with sequences of the form:
> >
> > 0: lr.w %0, %addr
> > bne %0, %old, 1f
> > ...
> > sc.w.rl %1, %new, %addr /* SC-release */
> > bnez %1, 0b
> > fence rw, rw /* "full" fence */
> > 1:
> >
> > following Daniel's suggestion.
> >
> > These modifications were validated with simulation of the RISC-V
> > memory consistency model.
> >
> > C lr-sc-aqrl-pair-vs-full-barrier
> >
> > {}
> >
> > P0(int *x, int *y, atomic_t *u)
> > {
> > int r0;
> > int r1;
> >
> > WRITE_ONCE(*x, 1);
> > r0 = atomic_cmpxchg(u, 0, 1);
> > r1 = READ_ONCE(*y);
> > }
> >
> > P1(int *x, int *y, atomic_t *v)
> > {
> > int r0;
> > int r1;
> >
> > WRITE_ONCE(*y, 1);
> > r0 = atomic_cmpxchg(v, 0, 1);
> > r1 = READ_ONCE(*x);
> > }
> >
> > exists (u=1 /\ v=1 /\ 0:r1=0 /\ 1:r1=0)
>
> While I'm entirely willing to trust this can happen, I can't bring
> this in line with the A extension spec.
>
> Additionally it's not clear to me in how far all of this applies when
> you don't really use LR/SC in the 4- and 8-byte cases (and going
> forward likely also not in the 1- and 2-byte cases, utilizing Zabha
> when available).
It just explains which combinations of fences, lr/sc, amoswap, and
.aq/.rl annotations can be used, and why the combinations introduced in
this patch were chosen.
>
> > ---
> > Changes in V5:
> > - update the commit message.
> > - drop ALIGN_DOWN().
> > - update the definition of emulate_xchg_1_2():
> > - lr.d -> lr.w, sc.d -> sc.w.
> > - drop ret argument.
> > - code style fixes around asm volatile.
> > - update prototype.
> > - use asm named operands.
> > - rename local variables.
> > - add comment above the macros
> > - update the definition of __xchg_generic:
> > - drop local ptr__ variable.
> > - code style fixes around switch()
> > - update prototype.
> > - introduce RISCV_FULL_BARRIES.
> > - redefine cmpxchg()
> > - update emulate_cmpxchg_1_2():
> > - update prototype
> > - update local variables names and usage of them
> > - use name asm operands.
> > - add comment above the macros
> > ---
> > Changes in V4:
> > - Code style fixes.
> > - enforce in __xchg_*() has the same type for new and *ptr, also
> > "\n"
> > was removed at the end of asm instruction.
> > - dependency from
> > https://lore.kernel.org/xen-devel/cover.1706259490.git.federico.serafini@xxxxxxxxxxx/
> > - switch from ASSERT_UNREACHABLE to STATIC_ASSERT_UNREACHABLE().
> > - drop xchg32(ptr, x) and xchg64(ptr, x) as they aren't used.
> > - drop cmpxcg{32,64}_{local} as they aren't used.
> > - introduce generic version of xchg_* and cmpxchg_*.
> > - update the commit message.
> > ---
> > Changes in V3:
> > - update the commit message
> > - add emulation of {cmp}xchg_... for 1 and 2 bytes types
> > ---
> > Changes in V2:
> > - update the comment at the top of the header.
> > - change xen/lib.h to xen/bug.h.
> > - sort inclusion of headers properly.
> > ---
> >  xen/arch/riscv/include/asm/cmpxchg.h | 258 ++++++++++++++++++++++++++++
> >  1 file changed, 258 insertions(+)
> >  create mode 100644 xen/arch/riscv/include/asm/cmpxchg.h
> >
> > diff --git a/xen/arch/riscv/include/asm/cmpxchg.h b/xen/arch/riscv/include/asm/cmpxchg.h
> > new file mode 100644
> > index 0000000000..66cbe26737
> > --- /dev/null
> > +++ b/xen/arch/riscv/include/asm/cmpxchg.h
> > @@ -0,0 +1,258 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/* Copyright (C) 2014 Regents of the University of California */
> > +
> > +#ifndef _ASM_RISCV_CMPXCHG_H
> > +#define _ASM_RISCV_CMPXCHG_H
> > +
> > +#include <xen/compiler.h>
> > +#include <xen/lib.h>
> > +
> > +#include <asm/fence.h>
> > +#include <asm/io.h>
> > +#include <asm/system.h>
> > +
> > +#define __amoswap_generic(ptr, new, ret, sfx, pre, post) \
> > +({ \
> > + asm volatile( \
>
> Nit: In Xen style this is lacking a blank ahead of the opening
> parenthesis.
>
> > + pre \
> > + " amoswap" sfx " %0, %2, %1\n" \
> > + post \
> > + : "=r" (ret), "+A" (*ptr) \
> > + : "r" (new) \
> > + : "memory" ); \
> > +})
> > +
> > +/*
> > + * For LR and SC, the A extension requires that the address held in rs1 be
> > + * naturally aligned to the size of the operand (i.e., eight-byte aligned
> > + * for 64-bit words and four-byte aligned for 32-bit words).
> > + * If the address is not naturally aligned, an address-misaligned exception
> > + * or an access-fault exception will be generated.
> > + *
> > + * Thereby:
> > + * - for 1-byte xchg access the containing word by clearing low two bits
> > + * - for 2-byte xchg access the containing word by clearing first bit.
>
> "first bit" can still be ambiguous. Better say "bit 1".
>
> > + *
>
> Here and apparently also elsewhere: Stray trailing blank. Git has a
> config setting to warn you about (maybe even to automatically strip?)
> such.
That would be useful for me. Thanks a lot for the recommendation.
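For example (just one possible setup, using settings from git's
documentation):

    # have 'git diff' highlight trailing whitespace
    git config core.whitespace trailing-space,space-before-tab
    # strip it automatically when applying patches
    git config apply.whitespace fix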
>
> > + * If the resulting 4-byte access is still misaligned, it will fault just
> > + * as a non-emulated 4-byte access would.
> > + */
> > +#define emulate_xchg_1_2(ptr, new, sc_sfx, pre, post) \
> > +({ \
> > +    uint32_t *aligned_ptr = (uint32_t *)((unsigned long)ptr & ~(0x4 - sizeof(*ptr))); \
>
> Here and elsewhere: sizeof(*(ptr)) (i.e. the inner parentheses are
> needed also there).
>
> > +    uint8_t new_val_pos = ((unsigned long)(ptr) & (0x4 - sizeof(*ptr))) * BITS_PER_BYTE; \
>
> Why uint8_t?
It is enough to cover the possible start bit positions of the value that
should be updated, so I decided to use uint8_t.
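For example, for a 2-byte access at an address ending in 0x6:
new_val_pos = (0x6 & (0x4 - 2)) * 8 = 16, i.e. the halfword occupies
bits 16-31 of the containing aligned word. The largest possible value
is 3 * 8 = 24 (a 1-byte access at offset 3), which comfortably fits in
a uint8_t.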
>
> > +    unsigned long mask = GENMASK(((sizeof(*ptr)) * BITS_PER_BYTE) - 1, 0) << new_val_pos; \
> > +    unsigned int new_ = new << new_val_pos; \
> > +    unsigned int old_val; \
> > +    unsigned int xchged_val; \
> > + \
> > +    asm volatile ( \
> > +        pre \
> > +        "0: lr.w %[op_oldval], %[op_aligned_ptr]\n" \
> > +        "   and %[op_xchged_val], %[op_oldval], %z[op_nmask]\n" \
> > +        "   or %[op_xchged_val], %[op_xchged_val], %z[op_new]\n" \
> > +        "   sc.w" sc_sfx " %[op_xchged_val], %[op_xchged_val], %[op_aligned_ptr]\n" \
> > +        "   bnez %[op_xchged_val], 0b\n" \
> > +        post \
> > +        : [op_oldval] "=&r" (old_val), [op_xchged_val] "=&r" (xchged_val), [op_aligned_ptr]"+A" (*aligned_ptr) \
>
> Too long line. Partly because you have op_ prefixes here which I can't
> recognize what they would be good for. The val / _val suffixes also
> don't appear to carry much useful information. And "xchged", being
> explicitly past tense, doesn't look to fit even up and until the SC,
> not to speak of afterwards. Anything wrong with calling this just tmp,
> aux, or scratch?
op_ can be dropped and the named operand can be equal to the local
variable name. I thought it would be useful to make clear that it is a
named operand, but after rethinking it looks like unneeded overhead.
In the case of emulate_xchg_1_2() there is no sense in the val/_val
suffixes, as the local variables don't clash with the macro's
arguments; the suffixes were added just to be in sync with
emulate_cmpxchg_1_2(). But in the case of emulate_cmpxchg_1_2(ptr, old,
new, sc_sfx, pre, post) the macro has an old argument, so _val was
added to distinguish them. Probably it would be better to rename it to
read or read_old.
>
> > + : [op_new] "rJ" (new_), [op_nmask] "rJ" (~mask) \
> > + : "memory" ); \
> > + \
> > + (__typeof__(*(ptr)))((old_val & mask) >> new_val_pos); \
> > +})
> > +
> > +#define __xchg_generic(ptr, new, size, sfx, pre, post) \
> > +({ \
> > + __typeof__(*(ptr)) new__ = (new); \
> > + __typeof__(*(ptr)) ret__; \
> > + switch ( size ) \
>
> Can't this use sizeof(*(ptr)), allowing for one less macro parameter?
>
> > + { \
> > + case 1: \
> > + case 2: \
> > + ret__ = emulate_xchg_1_2(ptr, new__, sfx, pre, post); \
> > + break; \
> > + case 4: \
> > + __amoswap_generic(ptr, new__, ret__,\
> > + ".w" sfx, pre, post); \
> > + break; \
> > + case 8: \
> > + __amoswap_generic(ptr, new__, ret__,\
> > + ".d" sfx, pre, post); \
> > + break; \
>
> In io.h you make sure to avoid rv64-only insns. Here you don't. The
> build would fail either way, but this still looks inconsistent.
>
> Also nit: Stray double blanks (twice) ahead of "pre". Plus with this
> style of line continuation you want to consistently have exactly one
> blank ahead of each backslash.
>
> > + default: \
> > + STATIC_ASSERT_UNREACHABLE(); \
> > + } \
> > + ret__; \
> > +})
> > +
> > +#define xchg_relaxed(ptr, x) \
> > +({ \
> > + __typeof__(*(ptr)) x_ = (x); \
>
> What is the purpose of this, when __xchg_generic() already does this
> same type conversion?
>
> > +    (__typeof__(*(ptr)))__xchg_generic(ptr, x_, sizeof(*(ptr)), "", "", ""); \
> > +})
> > +
> > +#define xchg_acquire(ptr, x) \
> > +({ \
> > + __typeof__(*(ptr)) x_ = (x); \
> > +    (__typeof__(*(ptr)))__xchg_generic(ptr, x_, sizeof(*(ptr)), \
> > +                                       "", "", RISCV_ACQUIRE_BARRIER); \
> > +})
> > +
> > +#define xchg_release(ptr, x) \
> > +({ \
> > + __typeof__(*(ptr)) x_ = (x); \
> > +    (__typeof__(*(ptr)))__xchg_generic(ptr, x_, sizeof(*(ptr)), \
> > +                                       "", RISCV_RELEASE_BARRIER, ""); \
> > +})
>
> As asked before: Are there going to be any uses of these three? Common
> code doesn't require them. And not needing to provide them would
> simplify things quite a bit, it seems.
I checked my private branches and it looks to me that I introduced them
only for the corresponding atomic operations (which were copied from
the Linux kernel) and which are also not used.
So we could definitely drop these macros for now, but should
__xchg_generic() be updated as well? If we look at:
#define xchg(ptr, x) __xchg_generic(ptr, (unsigned long)(x), sizeof(*(ptr)), \
                                    ".aqrl", "", "")
the last two arguments become unneeded, but I wanted to leave them in
case someone needs to bring back xchg_{release,acquire,...}. Does that
make any sense?
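For illustration (a sketch of my own): with the three variants gone and
pre/post dropped from __xchg_generic(), the definition would shrink to:

#define xchg(ptr, x) \
    __xchg_generic(ptr, (unsigned long)(x), sizeof(*(ptr)), ".aqrl")

with __xchg_generic() losing its pre and post parameters accordingly.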
>
> > +#define xchg(ptr, x) __xchg_generic(ptr, (unsigned long)(x), sizeof(*(ptr)), \
> > +                                    ".aqrl", "", "")
>
> According to the earlier comment (where I don't follow the example
> given), is .aqrl sufficient here? And even if it was for the 4- and
> 8-byte cases, is it sufficient in the 1- and 2-byte emulation case
> (where it then is appended to just the SC)?
If I understand your question correctly then, according to the spec,
.aqrl is enough for the amo<op>.{w|d} instructions:

Linux Construct       RVWMO AMO Mapping
atomic <op> relaxed   amo<op>.{w|d}
atomic <op> acquire   amo<op>.{w|d}.aq
atomic <op> release   amo<op>.{w|d}.rl
atomic <op>           amo<op>.{w|d}.aqrl

but in the case of lr/sc you are right, the sc requires a suffix too:

Linux Construct       RVWMO LR/SC Mapping
atomic <op> relaxed   loop: lr.{w|d}; <op>; sc.{w|d}; bnez loop
atomic <op> acquire   loop: lr.{w|d}.aq; <op>; sc.{w|d}; bnez loop
atomic <op> release   loop: lr.{w|d}; <op>; sc.{w|d}.aqrl∗; bnez loop
                      OR fence.tso; loop: lr.{w|d}; <op>; sc.{w|d}∗; bnez loop
atomic <op>           loop: lr.{w|d}.aq; <op>; sc.{w|d}.aqrl; bnez loop

I will add sc_sfx to emulate_xchg_1_2(). The only question left is
whether __xchg_generic(ptr, new, size, sfx, pre, post) should be
changed to __xchg_generic(ptr, new, size, sfx1, sfx2, pre, post) to
cover both cases, amo<op>.{w|d}.sfx1 and lr.{w|d}.sfx1 ...
sc.{w|d}.sfx2.
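Roughly like this (a sketch of my own, untested; only the switch body
changes relative to the patch):

#define __xchg_generic(ptr, new, size, sfx1, sfx2, pre, post) \
({ \
    __typeof__(*(ptr)) new__ = (new); \
    __typeof__(*(ptr)) ret__; \
    switch ( size ) \
    { \
    case 1: \
    case 2: \
        /* sfx2 ends up on the sc.w inside the emulation loop. */ \
        ret__ = emulate_xchg_1_2(ptr, new__, sfx2, pre, post); \
        break; \
    case 4: \
        /* AMOs take one combined annotation, e.g. ".w.aqrl". */ \
        __amoswap_generic(ptr, new__, ret__, ".w" sfx1, pre, post); \
        break; \
    case 8: \
        __amoswap_generic(ptr, new__, ret__, ".d" sfx1, pre, post); \
        break; \
    default: \
        STATIC_ASSERT_UNREACHABLE(); \
    } \
    ret__; \
})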
~ Oleksii
>
> > +#define __generic_cmpxchg(ptr, old, new, ret, lr_sfx, sc_sfx, pre, post) \
> > + ({ \
> > + register unsigned int rc; \
> > + asm volatile( \
> > + pre \
> > + "0: lr" lr_sfx " %0, %2\n" \
> > + " bne %0, %z3, 1f\n" \
> > + " sc" sc_sfx " %1, %z4, %2\n" \
> > + " bnez %1, 0b\n" \
> > + post \
> > + "1:\n" \
> > + : "=&r" (ret), "=&r" (rc), "+A" (*ptr) \
> > + : "rJ" (old), "rJ" (new) \
> > + : "memory"); \
> > + })
> > +
> > +/*
> > + * For LR and SC, the A extension requires that the address held in rs1 be
> > + * naturally aligned to the size of the operand (i.e., eight-byte aligned
> > + * for 64-bit words and four-byte aligned for 32-bit words).
> > + * If the address is not naturally aligned, an address-misaligned exception
> > + * or an access-fault exception will be generated.
> > + *
> > + * Thereby:
> > + * - for 1-byte xchg access the containing word by clearing low two bits
> > + * - for 2-byte xchg access the containing word by clearing first bit.
> > + *
> > + * If the resulting 4-byte access is still misaligned, it will fault just
> > + * as a non-emulated 4-byte access would.
> > + *
> > + * old_val was cast to unsigned long at the end of the macro because of
> > + * the following issue:
> > + * ./arch/riscv/include/asm/cmpxchg.h:166:5: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
> > + *   166 |     (__typeof__(*(ptr)))(old_val >> new_val_pos); \
> > + *       |     ^
> > + * ./arch/riscv/include/asm/cmpxchg.h:184:17: note: in expansion of macro 'emulate_cmpxchg_1_2'
> > + *   184 |         ret__ = emulate_cmpxchg_1_2(ptr, old, new, \
> > + *       |                 ^~~~~~~~~~~~~~~~~~~
> > + * ./arch/riscv/include/asm/cmpxchg.h:227:5: note: in expansion of macro '__cmpxchg_generic'
> > + *   227 |     __cmpxchg_generic(ptr, (unsigned long)(o), (unsigned long)(n), \
> > + *       |     ^~~~~~~~~~~~~~~~~
> > + * ./include/xen/lib.h:141:26: note: in expansion of macro '__cmpxchg'
> > + *   141 |     ((__typeof__(*(ptr)))__cmpxchg(ptr, (unsigned long)o_, \
> > + *       |                          ^~~~~~~~~
> > + * common/event_channel.c:109:13: note: in expansion of macro 'cmpxchgptr'
> > + *   109 |             cmpxchgptr(&xen_consumers[i], NULL, fn);
> > + */
>
> This is too much detail on the compile issue. Just mentioning that
> said cast is needed for cmpxchgptr() ought to be sufficient.
>
> > +#define emulate_cmpxchg_1_2(ptr, old, new, sc_sfx, pre, post) \
> > +({ \
> > +    uint32_t *aligned_ptr = (uint32_t *)((unsigned long)ptr & ~(0x4 - sizeof(*ptr))); \
> > +    uint8_t new_val_pos = ((unsigned long)(ptr) & (0x4 - sizeof(*ptr))) * BITS_PER_BYTE; \
> > +    unsigned long mask = GENMASK(((sizeof(*ptr)) * BITS_PER_BYTE) - 1, 0) << new_val_pos; \
> > +    unsigned int old_ = old << new_val_pos; \
> > +    unsigned int new_ = new << new_val_pos; \
> > +    unsigned int old_val; \
> > +    unsigned int xchged_val; \
> > + \
> > +    __asm__ __volatile__ ( \
> > +        pre \
> > +        "0: lr.w %[op_xchged_val], %[op_aligned_ptr]\n" \
> > +        "   and %[op_oldval], %[op_xchged_val], %z[op_mask]\n" \
> > +        "   bne %[op_oldval], %z[op_old], 1f\n" \
> > +        "   xor %[op_xchged_val], %[op_oldval], %[op_xchged_val]\n" \
> > +        "   or %[op_xchged_val], %[op_xchged_val], %z[op_new]\n" \
> > +        "   sc.w" sc_sfx " %[op_xchged_val], %[op_xchged_val], %[op_aligned_ptr]\n" \
> > +        "   bnez %[op_xchged_val], 0b\n" \
> > +        post \
> > +        "1:\n" \
> > +        : [op_oldval] "=&r" (old_val), [op_xchged_val] "=&r" (xchged_val), [op_aligned_ptr] "+A" (*aligned_ptr) \
> > +        : [op_old] "rJ" (old_), [op_new] "rJ" (new_), \
> > +          [op_mask] "rJ" (mask) \
> > +        : "memory" ); \
> > + \
> > +    (__typeof__(*(ptr)))((unsigned long)old_val >> new_val_pos); \
> > +})
> > +
> > +/*
> > + * Atomic compare and exchange. Compare OLD with MEM, if identical,
> > + * store NEW in MEM. Return the initial value in MEM. Success is
> > + * indicated by comparing RETURN with OLD.
> > + */
> > +#define __cmpxchg_generic(ptr, old, new, size, sc_sfx, pre, post) \
> > +({ \
> > + __typeof__(ptr) ptr__ = (ptr); \
> > + __typeof__(*(ptr)) old__ = (__typeof__(*(ptr)))(old); \
> > + __typeof__(*(ptr)) new__ = (__typeof__(*(ptr)))(new); \
> > + __typeof__(*(ptr)) ret__; \
> > + switch ( size ) \
> > + { \
> > + case 1: \
> > + case 2: \
> > + ret__ = emulate_cmpxchg_1_2(ptr, old, new, \
> > + sc_sfx, pre, post); \
> > + break; \
> > + case 4: \
> > + __generic_cmpxchg(ptr__, old__, new__, ret__, \
> > + ".w", ".w"sc_sfx, pre, post); \
> > + break; \
> > + case 8: \
> > + __generic_cmpxchg(ptr__, old__, new__, ret__, \
> > + ".d", ".d"sc_sfx, pre, post); \
> > + break; \
> > + default: \
> > + STATIC_ASSERT_UNREACHABLE(); \
> > + } \
> > + ret__; \
> > +})
> > +
> > +#define cmpxchg_relaxed(ptr, o, n) \
> > +({ \
> > + __typeof__(*(ptr)) o_ = (o); \
> > + __typeof__(*(ptr)) n_ = (n); \
> > + (__typeof__(*(ptr)))__cmpxchg_generic(ptr, \
> > + o_, n_, sizeof(*(ptr)), "", "", ""); \
> > +})
> > +
> > +#define cmpxchg_acquire(ptr, o, n) \
> > +({ \
> > + __typeof__(*(ptr)) o_ = (o); \
> > + __typeof__(*(ptr)) n_ = (n); \
> > +    (__typeof__(*(ptr)))__cmpxchg_generic(ptr, o_, n_, sizeof(*(ptr)), \
> > +                                          "", "", RISCV_ACQUIRE_BARRIER); \
> > +})
> > +
> > +#define cmpxchg_release(ptr, o, n) \
> > +({ \
> > + __typeof__(*(ptr)) o_ = (o); \
> > + __typeof__(*(ptr)) n_ = (n); \
> > +    (__typeof__(*(ptr)))__cmpxchg_release(ptr, o_, n_, sizeof(*(ptr)), \
>
> There's no __cmpxchg_release() afaics; dym __cmpxchg_generic()?
>
> Jan