Re: [PATCH v5 11/23] xen/riscv: introduce cmpxchg.h



On Thu, 2024-03-07 at 12:11 +0100, Jan Beulich wrote:
> On 07.03.2024 12:01, Oleksii wrote:
> > On Thu, 2024-03-07 at 11:46 +0100, Jan Beulich wrote:
> > > On 07.03.2024 11:35, Oleksii wrote:
> > > > On Wed, 2024-03-06 at 15:56 +0100, Jan Beulich wrote:
> > > > > On 26.02.2024 18:38, Oleksii Kurochko wrote:
> > > > > > > The header was taken from Linux kernel 6.4.0-rc1.
> > > > > > > 
> > > > > > > Additionally, the following were updated:
> > > > > > > * add emulation of {cmp}xchg for 1/2 byte types using 32-bit
> > > > > > >   atomic access.
> > > > > > > * replace tabs with spaces
> > > > > > > * replace __* variable with *__
> > > > > > > * introduce generic version of xchg_* and cmpxchg_*.
> > > > > > > 
> > > > > > > Implementation of 4- and 8-byte cases was left as it is done
> > > > > > > in the Linux kernel, as according to the RISC-V spec:
> > > > > > ```
> > > > > > Table A.5 ( only part of the table was copied here )
> > > > > > 
> > > > > > Linux Construct       RVWMO Mapping
> > > > > > atomic <op> relaxed    amo<op>.{w|d}
> > > > > > atomic <op> acquire    amo<op>.{w|d}.aq
> > > > > > atomic <op> release    amo<op>.{w|d}.rl
> > > > > > atomic <op>            amo<op>.{w|d}.aqrl
> > > > > > 
> > > > > > Linux Construct       RVWMO LR/SC Mapping
> > > > > > > atomic <op> relaxed    loop: lr.{w|d}; <op>; sc.{w|d}; bnez loop
> > > > > > > atomic <op> acquire    loop: lr.{w|d}.aq; <op>; sc.{w|d}; bnez loop
> > > > > > > atomic <op> release    loop: lr.{w|d}; <op>; sc.{w|d}.aqrl∗; bnez loop OR
> > > > > > >                        fence.tso; loop: lr.{w|d}; <op>; sc.{w|d}∗; bnez loop
> > > > > > > atomic <op>            loop: lr.{w|d}.aq; <op>; sc.{w|d}.aqrl; bnez loop
> 
> Note the load and store forms mentioned here. How would ...
> 
> > > > > > > The Linux mappings for release operations may seem stronger
> > > > > > > than necessary, but these mappings are needed to cover some
> > > > > > > cases in which Linux requires stronger orderings than the
> > > > > > > more intuitive mappings would provide.
> > > > > > > In particular, as of the time this text is being written,
> > > > > > > Linux is actively debating whether to require load-load,
> > > > > > > load-store, and store-store orderings between accesses in one
> > > > > > > critical section and accesses in a subsequent critical
> > > > > > > section in the same hart and protected by the same
> > > > > > > synchronization object.
> > > > > > > Not all combinations of FENCE RW,W/FENCE R,RW mappings with
> > > > > > > aq/rl mappings combine to provide such orderings.
> > > > > > > There are a few ways around this problem, including:
> > > > > > > 1. Always use FENCE RW,W/FENCE R,RW, and never use aq/rl.
> > > > > > >    This suffices but is undesirable, as it defeats the
> > > > > > >    purpose of the aq/rl modifiers.
> > > > > > > 2. Always use aq/rl, and never use FENCE RW,W/FENCE R,RW.
> > > > > > >    This does not currently work due to the lack of load and
> > > > > > >    store opcodes with aq and rl modifiers.
> > > > > 
> > > > > > As before I don't understand this point. Can you give an
> > > > > > example of what sort of opcode / instruction is missing?
> > > > > If I understand the spec correctly then l{b|h|w|d} and
> > > > > s{b|h|w|d} instructions don't have aq or rl annotation.
> > > 
> > > > How would load insns other than LR and store insns other than SC
> > > > come into play here?
> > 
> > > This part of the spec is not only about LR and SC, which cover the
> > > Load-Exclusive and Store-Exclusive cases, but also about
> > > non-Exclusive cases, for which l{b|h|w|d} and s{b|h|w|d} are used.
> 
> ... the spec (obviously covering other forms, too) be relevant when
> reasoning whether just suffixes or actual barrier insns need using?
Going by the 3 rules which are in the commit message and in the spec,
no option is preferred over the others (at least, I wasn't able to find
an explanation in that paragraph). But going by the tables provided in
the same paragraph (and partially in the commit message), if an
instruction has an .aq or .rl annotation, then that annotation should
be used.

As for the xchg and cmpxchg cases and their implementations, all the
instructions involved have .aq/.rl suffixes, so we'd prefer suffixes
over barriers.
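
To make the suffix-vs-barrier distinction concrete, here is a rough
sketch in portable C11 atomics rather than the inline asm from the
patch. The sketch_* names are made up for illustration and are not the
Xen or Linux macros. On a RISC-V toolchain I would expect the first
helper to become a single AMO carrying the .aq bit and the second to
become a plain AMO followed by a separate fence, but I have not checked
the generated code, so treat that as an assumption.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helper: ordering is attached to the atomic op itself
 * (the "suffix" style, e.g. an AMO with .aq on RISC-V; assumption). */
static inline uint32_t sketch_xchg_acquire(_Atomic uint32_t *ptr,
                                           uint32_t new_val)
{
    return atomic_exchange_explicit(ptr, new_val, memory_order_acquire);
}

/* Hypothetical helper: relaxed atomic op plus a separate fence
 * (the "barrier" style). */
static inline uint32_t sketch_xchg_fenced(_Atomic uint32_t *ptr,
                                          uint32_t new_val)
{
    uint32_t old = atomic_exchange_explicit(ptr, new_val,
                                            memory_order_relaxed);
    atomic_thread_fence(memory_order_acquire);
    return old;
}

/* Hypothetical helper: fully ordered compare-and-exchange that returns
 * the value found in *ptr, which equals 'old' when the swap happened. */
static inline uint32_t sketch_cmpxchg(_Atomic uint32_t *ptr,
                                      uint32_t old, uint32_t new_val)
{
    uint32_t expected = old;

    /* On failure, 'expected' is overwritten with the current value. */
    atomic_compare_exchange_strong_explicit(ptr, &expected, new_val,
                                            memory_order_seq_cst,
                                            memory_order_seq_cst);
    return expected;
}
```

Both xchg variants return the previous value; they differ only in
where the ordering is expressed, which is the choice being discussed
here.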

Does it make sense?

~ Oleksii



 

