[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v4 12/30] xen/riscv: introduce cmpxchg.h
On Mon, 2024-02-26 at 10:45 +0100, Jan Beulich wrote: > On 23.02.2024 13:23, Oleksii wrote: > > > > > > > > > As 1- and 2-byte cases are emulated I decided that is not > > > > > > to > > > > > > provide > > > > > > sfx argument for emulation macros as it will not have to > > > > > > much > > > > > > affect on > > > > > > emulated types and just consume more performance on acquire > > > > > > and > > > > > > release > > > > > > version of sc/ld instructions. > > > > > > > > > > Question is whether the common case (4- and 8-byte accesses) > > > > > shouldn't > > > > > be valued higher, with 1- and 2-byte emulation being there > > > > > just > > > > > to > > > > > allow things to not break altogether. > > > > If I understand you correctly, it would make sense to add the > > > > 'sfx' > > > > argument for the 1/2-byte access case, ensuring that all > > > > options > > > > are > > > > available for 1/2-byte access case as well. > > > > > > That's one of the possibilities. As said, I'm not overly worried > > > about > > > the emulated cases. For the initial implementation I'd recommend > > > going > > > with what is easiest there, yielding the best possible result for > > > the > > > 4- and 8-byte cases. If later it turns out repeated > > > acquire/release > > > accesses are a problem in the emulation loop, things can be > > > changed > > > to explicit barriers, without touching the 4- and 8-byte cases. > > I am confused then a little bit if emulated case is not an issue. > > > > For 4- and 8-byte cases for xchg .aqrl is used, for relaxed and > > aqcuire > > version of xchg barries are used. > > > > The similar is done for cmpxchg. > > > > If something will be needed to change in emulation loop it won't > > require to change 4- and 8-byte cases. > > I'm afraid I don't understand your reply. IIUC, emulated cases it is implemented correctly in terms of usage barriers. And it also OK not to use sfx for lr/sc instructions and use only barriers. For 4- and 8-byte cases are used sfx + barrier depending on the specific case ( relaxed, acquire, release, generic xchg/cmpxchg ). What also looks to me correct. But you suggested to provide the best possible result for 4- and 8-byte cases. So I don't understand what the best possible result is as the current one usage of __{cmp}xchg_generic for each specific case ( relaxed, acquire, release, generic xchg/cmpxchg ) looks correct to me: xchg -> (..., ".aqrl", "", "") just suffix .aqrl suffix without barriers. xchg_release -> (..., "", RISCV_RELEASE_BARRIER, "" ) use only release barrier xchg_acquire -> (..., "", "", RISCV_ACQUIRE_BARRIER ), only acquire barrier xchg_relaxed -> (..., "", "", "") - no barries, no sfx The similar for cmpxchg(). ~ Oleksii
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |