[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v6 10/20] xen/riscv: introduce atomic.h
On 27.03.2024 11:28, Oleksii wrote: > On Wed, 2024-03-27 at 08:40 +0100, Jan Beulich wrote: > ... > >>>>>>> +/* This is required to provide a full barrier on success. >>>>>>> */ >>>>>>> +static inline int atomic_add_unless(atomic_t *v, int a, >>>>>>> int u) >>>>>>> +{ >>>>>>> + int prev, rc; >>>>>>> + >>>>>>> + asm volatile ( >>>>>>> + "0: lr.w %[p], %[c]\n" >>>>>>> + " beq %[p], %[u], 1f\n" >>>>>>> + " add %[rc], %[p], %[a]\n" >>>>>>> + " sc.w.rl %[rc], %[rc], %[c]\n" >>>>>>> + " bnez %[rc], 0b\n" >>>>>>> + RISCV_FULL_BARRIER >>>>>> >>>>>> With this and no .aq on the load, why the .rl on the store? >>>>> It is something that LKMM requires [1]. >>>>> >>>>> This is not fully clear to me what is so specific in LKMM, but >>>>> accoring >>>>> to the spec: >>>>> Ordering Annotation Fence-based Equivalent >>>>> l{b|h|w|d|r}.aq l{b|h|w|d|r}; fence r,rw >>>>> l{b|h|w|d|r}.aqrl fence rw,rw; l{b|h|w|d|r}; fence r,rw >>>>> s{b|h|w|d|c}.rl fence rw,w; s{b|h|w|d|c} >>>>> s{b|h|w|d|c}.aqrl fence rw,w; s{b|h|w|d|c} >>>>> amo<op>.aq amo<op>; fence r,rw >>>>> amo<op>.rl fence rw,w; amo<op> >>>>> amo<op>.aqrl fence rw,rw; amo<op>; fence rw,rw >>>>> Table 2.2: Mappings from .aq and/or .rl to fence-based >>>>> equivalents. >>>>> An alternative mapping places a fence rw,rw after the >>>>> existing >>>>> s{b|h|w|d|c} mapping rather than at the front of the >>>>> l{b|h|w|d|r} mapping. >>>> >>>> I'm afraid I can't spot the specific case in this table. None of >>>> the >>>> stores in the right column have a .rl suffix. >>> Yes, it is expected. >>> >>> I am reading this table as (f.e.) amo<op>.rl is an equivalent of >>> fence >>> rw,w; amo<op>. (without .rl) >> >> In which case: How does quoting the table answer my original >> question? > Agree, it is starting to be confusing, so let me give an answer to your > question below. > >> >>>>> It is also safe to translate any .aq, .rl, or .aqrl >>>>> annotation >>>>> into >>>>> the fence-based snippets of >>>>> Table 2.2. These can also be used as a legal implementation >>>>> of >>>>> l{b|h|w|d} or s{b|h|w|d} pseu- >>>>> doinstructions for as long as those instructions are not >>>>> added >>>>> to >>>>> the ISA. >>>>> >>>>> So according to the spec, it should be: >>>>> sc.w ... >>>>> RISCV_FULL_BARRIER. >>>>> >>>>> Considering [1] and how this code looks before, it seems to me >>>>> that >>>>> it >>>>> is safe to use lr.w.aq/sc.w.rl here or an fence equivalent. >>>> >>>> Here you say "or". Then why dos the code use sc.?.rl _and_ a >>>> fence? >>> I confused this line with amo<op>.aqrl, so based on the table 2.2 >>> above >>> s{b|h|w|d|c}.aqrl is transformed to "fence rw,w; s{b|h|w|d|c}", but >>> Linux kernel decided to strengthen it with full barrier: >>> - "0:\n\t" >>> - "lr.w.aqrl %[p], %[c]\n\t" >>> - "beq %[p], %[u], 1f\n\t" >>> - "add %[rc], %[p], %[a]\n\t" >>> - "sc.w.aqrl %[rc], %[rc], %[c]\n\t" >>> - "bnez %[rc], 0b\n\t" >>> - "1:" >>> + "0: lr.w %[p], %[c]\n" >>> + " beq %[p], %[u], 1f\n" >>> + " add %[rc], %[p], %[a]\n" >>> + " sc.w.rl %[rc], %[rc], %[c]\n" >>> + " bnez %[rc], 0b\n" >>> + " fence rw, rw\n" >>> + "1:\n" >>> As they have the following issue: >>> implementations of atomics such as atomic_cmpxchg() and >>> atomic_add_unless() rely on LR/SC pairs, >>> which do not give full-ordering with .aqrl; for example, current >>> implementations allow the "lr-sc-aqrl-pair-vs-full-barrier" test >>> below to end up with the state indicated in the "exists" clause. >> >> Is that really "current implementations", not "the abstract model"? >> If so, the use of an explicit fence would be more like a workaround >> (and would hence want commenting to that effect). >> >> In neither case can I see my original question answered: Why the .rl >> on the store when there is a full fence later? > The good explanation for that was provided in the commit addressing a > similar issue for ARM64 [ > https://patchwork.kernel.org/project/linux-arm-kernel/patch/1391516953-14541-1-git-send-email-will.deacon@xxxxxxx/ > ]. > The same holds true for RISC-V since ARM also employs WMO. > > I would also like to mention another point, as I indicated in another > email thread > [ https://lists.xen.org/archives/html/xen-devel/2024-03/msg01582.html ] > , that now this fence can be omitted and .aqrl can be used instead. > > This was confirmed by Dan (the author of the RVWMO spec) > [https://lore.kernel.org/linux-riscv/41e01514-74ca-84f2-f5cc-2645c444fd8e@xxxxxxxxxx/ > ] > > I hope this addresses your original question. Does it? I think it does, thanks. Some of this will need putting in at least the patch description, if not a code comment. Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |