[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v6 07/20] xen/riscv: introduce bitops.h
On 21.03.2024 11:07, Oleksii wrote: > On Wed, 2024-03-20 at 17:03 +0100, Jan Beulich wrote: >> On 15.03.2024 19:06, Oleksii Kurochko wrote: >>> --- /dev/null >>> +++ b/xen/arch/riscv/include/asm/bitops.h >>> @@ -0,0 +1,144 @@ >>> +/* SPDX-License-Identifier: GPL-2.0 */ >>> +/* Copyright (C) 2012 Regents of the University of California */ >>> + >>> +#ifndef _ASM_RISCV_BITOPS_H >>> +#define _ASM_RISCV_BITOPS_H >>> + >>> +#include <asm/system.h> >>> + >>> +#define BITOP_BITS_PER_WORD BITS_PER_LONG >>> + >>> +#define BITOP_TYPE >>> +typedef uint64_t bitops_uint_t; >>> + >>> +#include <asm-generic/bitops/bitops-bits.h> >>> + >>> +#define __set_bit(n, p) set_bit(n, p) >>> +#define __clear_bit(n, p) clear_bit(n, p) >> >> If these cam with a TODO, I wouldn't say anything. But without I take >> it >> they're meant to remain that way, at which point I'd like to ask >> about >> the performance aspect: Surely the AMO insns are more expensive than >> whatever more basic insns could be used instead? I'd even go as far >> as >> wondering whether >> >> #define __set_bit(n, p) ((void)__test_and_set_bit(n, p)) >> #define __clear_bit(n, p) ((void)__test_and_clear_bit(n, p)) >> >> wouldn't be cheaper (the compiler would recognize the unused result >> and eliminate its calculation, I'm pretty sure). > It was implemented using atomic ops because of Arm: > /* > * Non-atomic bit manipulation. > * > * Implemented using atomics to be interrupt safe. Could alternatively > * implement with local interrupt masking. > */ > #define __set_bit(n,p) set_bit(n,p) > #define __clear_bit(n,p) clear_bit(n,p) > > I though that the same comment is true for x86, but after your comment > I checked x86 implementation, I realized that x86 uses non-atomic > operations. > > In this case, it seems to me there is a sense to use non-atomic for > RISC-V too. Hmm, wait: There's an important difference between x86 on one side and Arm/RISC-V/PPC and various other more or less RISC-like ones on the other. x86 has read-modify-write (memory) insns. Therefore even without using their atomic (LOCKed) forms, they do the update atomically as far as the local CPU is concerned. That's not the case when you need to use a three (or more) step load-op-store sequence. Had you retained Arm's comment, I probably wouldn't even have asked. Please add such a comment while sticking to this aliasing you have. Jan
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |