[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v6 10/20] xen/riscv: introduce atomic.h


  • To: Oleksii <oleksii.kurochko@xxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Wed, 27 Mar 2024 12:07:26 +0100
  • Autocrypt: addr=jbeulich@xxxxxxxx; keydata= xsDiBFk3nEQRBADAEaSw6zC/EJkiwGPXbWtPxl2xCdSoeepS07jW8UgcHNurfHvUzogEq5xk hu507c3BarVjyWCJOylMNR98Yd8VqD9UfmX0Hb8/BrA+Hl6/DB/eqGptrf4BSRwcZQM32aZK 7Pj2XbGWIUrZrd70x1eAP9QE3P79Y2oLrsCgbZJfEwCgvz9JjGmQqQkRiTVzlZVCJYcyGGsD /0tbFCzD2h20ahe8rC1gbb3K3qk+LpBtvjBu1RY9drYk0NymiGbJWZgab6t1jM7sk2vuf0Py O9Hf9XBmK0uE9IgMaiCpc32XV9oASz6UJebwkX+zF2jG5I1BfnO9g7KlotcA/v5ClMjgo6Gl MDY4HxoSRu3i1cqqSDtVlt+AOVBJBACrZcnHAUSuCXBPy0jOlBhxPqRWv6ND4c9PH1xjQ3NP nxJuMBS8rnNg22uyfAgmBKNLpLgAGVRMZGaGoJObGf72s6TeIqKJo/LtggAS9qAUiuKVnygo 3wjfkS9A3DRO+SpU7JqWdsveeIQyeyEJ/8PTowmSQLakF+3fote9ybzd880fSmFuIEJldWxp Y2ggPGpiZXVsaWNoQHN1c2UuY29tPsJgBBMRAgAgBQJZN5xEAhsDBgsJCAcDAgQVAggDBBYC AwECHgECF4AACgkQoDSui/t3IH4J+wCfQ5jHdEjCRHj23O/5ttg9r9OIruwAn3103WUITZee e7Sbg12UgcQ5lv7SzsFNBFk3nEQQCACCuTjCjFOUdi5Nm244F+78kLghRcin/awv+IrTcIWF hUpSs1Y91iQQ7KItirz5uwCPlwejSJDQJLIS+QtJHaXDXeV6NI0Uef1hP20+y8qydDiVkv6l IreXjTb7DvksRgJNvCkWtYnlS3mYvQ9NzS9PhyALWbXnH6sIJd2O9lKS1Mrfq+y0IXCP10eS FFGg+Av3IQeFatkJAyju0PPthyTqxSI4lZYuJVPknzgaeuJv/2NccrPvmeDg6Coe7ZIeQ8Yj t0ARxu2xytAkkLCel1Lz1WLmwLstV30g80nkgZf/wr+/BXJW/oIvRlonUkxv+IbBM3dX2OV8 AmRv1ySWPTP7AAMFB/9PQK/VtlNUJvg8GXj9ootzrteGfVZVVT4XBJkfwBcpC/XcPzldjv+3 HYudvpdNK3lLujXeA5fLOH+Z/G9WBc5pFVSMocI71I8bT8lIAzreg0WvkWg5V2WZsUMlnDL9 mpwIGFhlbM3gfDMs7MPMu8YQRFVdUvtSpaAs8OFfGQ0ia3LGZcjA6Ik2+xcqscEJzNH+qh8V m5jjp28yZgaqTaRbg3M/+MTbMpicpZuqF4rnB0AQD12/3BNWDR6bmh+EkYSMcEIpQmBM51qM EKYTQGybRCjpnKHGOxG0rfFY1085mBDZCH5Kx0cl0HVJuQKC+dV2ZY5AqjcKwAxpE75MLFkr wkkEGBECAAkFAlk3nEQCGwwACgkQoDSui/t3IH7nnwCfcJWUDUFKdCsBH/E5d+0ZnMQi+G0A nAuWpQkjM1ASeQwSHEeAWPgskBQL
  • Cc: Alistair Francis <alistair.francis@xxxxxxx>, Bob Eshleman <bobbyeshleman@xxxxxxxxx>, Connor Davis <connojdavis@xxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • Delivery-date: Wed, 27 Mar 2024 11:07:36 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 27.03.2024 11:28, Oleksii wrote:
> On Wed, 2024-03-27 at 08:40 +0100, Jan Beulich wrote:
> ...
> 
>>>>>>> +/* This is required to provide a full barrier on success.
>>>>>>> */
>>>>>>> +static inline int atomic_add_unless(atomic_t *v, int a,
>>>>>>> int u)
>>>>>>> +{
>>>>>>> +       int prev, rc;
>>>>>>> +
>>>>>>> +    asm volatile (
>>>>>>> +        "0: lr.w     %[p],  %[c]\n"
>>>>>>> +        "   beq      %[p],  %[u], 1f\n"
>>>>>>> +        "   add      %[rc], %[p], %[a]\n"
>>>>>>> +        "   sc.w.rl  %[rc], %[rc], %[c]\n"
>>>>>>> +        "   bnez     %[rc], 0b\n"
>>>>>>> +        RISCV_FULL_BARRIER
>>>>>>
>>>>>> With this and no .aq on the load, why the .rl on the store?
>>>>> It is something that LKMM requires [1].
>>>>>
>>>>> This is not fully clear to me what is so specific in LKMM, but
>>>>> accoring
>>>>> to the spec:
>>>>>    Ordering Annotation Fence-based Equivalent
>>>>>    l{b|h|w|d|r}.aq     l{b|h|w|d|r}; fence r,rw
>>>>>    l{b|h|w|d|r}.aqrl   fence rw,rw; l{b|h|w|d|r}; fence r,rw
>>>>>    s{b|h|w|d|c}.rl     fence rw,w; s{b|h|w|d|c}
>>>>>    s{b|h|w|d|c}.aqrl   fence rw,w; s{b|h|w|d|c}
>>>>>    amo<op>.aq          amo<op>; fence r,rw
>>>>>    amo<op>.rl          fence rw,w; amo<op>
>>>>>    amo<op>.aqrl        fence rw,rw; amo<op>; fence rw,rw
>>>>>    Table 2.2: Mappings from .aq and/or .rl to fence-based
>>>>> equivalents.
>>>>>    An alternative mapping places a fence rw,rw after the
>>>>> existing 
>>>>>    s{b|h|w|d|c} mapping rather than at the front of the
>>>>>    l{b|h|w|d|r} mapping.
>>>>
>>>> I'm afraid I can't spot the specific case in this table. None of
>>>> the
>>>> stores in the right column have a .rl suffix.
>>> Yes, it is expected.
>>>
>>> I am reading this table as (f.e.) amo<op>.rl is an equivalent of
>>> fence
>>> rw,w; amo<op>. (without .rl) 
>>
>> In which case: How does quoting the table answer my original
>> question?
> Agree, it is starting to be confusing, so let me give an answer to your
> question below.
> 
>>
>>>>>    It is also safe to translate any .aq, .rl, or .aqrl
>>>>> annotation
>>>>> into
>>>>>    the fence-based snippets of
>>>>>    Table 2.2. These can also be used as a legal implementation
>>>>> of
>>>>>    l{b|h|w|d} or s{b|h|w|d} pseu-
>>>>>    doinstructions for as long as those instructions are not
>>>>> added
>>>>> to
>>>>>    the ISA.
>>>>>
>>>>> So according to the spec, it should be:
>>>>>  sc.w ...
>>>>>  RISCV_FULL_BARRIER.
>>>>>
>>>>> Considering [1] and how this code looks before, it seems to me
>>>>> that
>>>>> it
>>>>> is safe to use lr.w.aq/sc.w.rl here or an fence equivalent.
>>>>
>>>> Here you say "or". Then why dos the code use sc.?.rl _and_ a
>>>> fence?
>>> I confused this line with amo<op>.aqrl, so based on the table 2.2
>>> above
>>> s{b|h|w|d|c}.aqrl is transformed to "fence rw,w; s{b|h|w|d|c}", but
>>> Linux kernel decided to strengthen it with full barrier:
>>>    -              "0:\n\t"
>>>    -              "lr.w.aqrl  %[p],  %[c]\n\t"
>>>    -              "beq        %[p],  %[u], 1f\n\t"
>>>    -              "add       %[rc],  %[p], %[a]\n\t"
>>>    -              "sc.w.aqrl %[rc], %[rc], %[c]\n\t"
>>>    -              "bnez      %[rc], 0b\n\t"
>>>    -              "1:"
>>>    +               "0:     lr.w     %[p],  %[c]\n"
>>>    +               "       beq      %[p],  %[u], 1f\n"
>>>    +               "       add      %[rc], %[p], %[a]\n"
>>>    +               "       sc.w.rl  %[rc], %[rc], %[c]\n"
>>>    +               "       bnez     %[rc], 0b\n"
>>>    +               "       fence    rw, rw\n"
>>>    +               "1:\n"
>>> As they have the following issue:
>>>    implementations of atomics such as atomic_cmpxchg() and
>>>    atomic_add_unless() rely on LR/SC pairs,
>>>    which do not give full-ordering with .aqrl; for example, current
>>>    implementations allow the "lr-sc-aqrl-pair-vs-full-barrier" test
>>>    below to end up with the state indicated in the "exists" clause.
>>
>> Is that really "current implementations", not "the abstract model"?
>> If so, the use of an explicit fence would be more like a workaround
>> (and would hence want commenting to that effect).
>>
>> In neither case can I see my original question answered: Why the .rl
>> on the store when there is a full fence later?
> The good explanation for that was provided in the commit addressing a
> similar issue for ARM64 [
> https://patchwork.kernel.org/project/linux-arm-kernel/patch/1391516953-14541-1-git-send-email-will.deacon@xxxxxxx/
> ].
> The same holds true for RISC-V since ARM also employs WMO.
> 
> I would also like to mention another point, as I indicated in another
> email thread
>https://lists.xen.org/archives/html/xen-devel/2024-03/msg01582.html ]
> , that now this fence can be omitted and .aqrl can be used instead.
> 
> This was confirmed by Dan (the author of the RVWMO spec)
> [https://lore.kernel.org/linux-riscv/41e01514-74ca-84f2-f5cc-2645c444fd8e@xxxxxxxxxx/
> ]
> 
> I hope this addresses your original question. Does it?

I think it does, thanks. Some of this will need putting in at least the
patch description, if not a code comment.

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.