|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH] x86/atomic: Improvements and simplifications to assembly constraints
On 22/11/2018 08:57, Jan Beulich wrote:
> >>> On 21.11.18 at 20:37, <andrew.cooper3@xxxxxxxxxx> wrote:
>> * Some of the single-byte versions specify "=q" as the output. AFAICT, there
>> was not a legitimate reason to restrict the use of %esi/%edi in the 32-bit
>> build. Either way, in 64-bit, it is equivelent to "=r".
> I'm confused about the 32-bit part here: Of course it was necessary
> to restrict the compiler to the low 4 registers in that case. It's just
> not clear to me whether you've just written it down wrongly, or
> whether you indeed think the way it reads to me.
Wait - are you saying that the combination of "=r" and %b0 would
actually fail to build if the compiler happened to chose %edi/%esi?
Now that you point it out, I can see why %esi/%edi aren't actually
encodable in this circumstance, but surely the fact that the compiler
has to pick a byte register means that it wouldn't end up choosing these?
>
>> * Constraints in the form "=r" (x) : "0" (x) can be folded to just "+r" (x)
>> * Switch to using named parameters (mostly for legibility) which in
>> particular helps with...
>> * __xchg(), __cmpxchg() and __cmpxchg_user() modify their memory operand, so
>> must list it as an output operand. This only works because they each have
>> a memory clobber to give the construct full compiler-barrier properties.
>> * Every memory operand has an explicit known size. Letting the compiler see
>> the real size rather than obscuring it with __xg() allows for the removal
>> of the instruction size suffixes without introducing ambiguity.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>> ---
>> CC: Jan Beulich <JBeulich@xxxxxxxx>
>> CC: Wei Liu <wei.liu2@xxxxxxxxxx>
>> CC: Roger Pau Monné <roger.pau@xxxxxxxxxx>
>>
>> Interestingly, switching to use output memory operands has the following
>> perturbance in the build:
>>
>> add/remove: 0/0 grow/shrink: 3/5 up/down: 70/-124 (-54)
>> Function old new delta
>> do_mmu_update 7041 7101 +60
>> mctelem_process_deferred 234 242 +8
>> cpufreq_governor_dbs 851 853 +2
>> _set_status 162 161 -1
>> create_irq 325 323 -2
>> do_tmem_put 2066 2062 -4
>> task_switch_load_seg 892 884 -8
>> _get_page_type 6057 5948 -109
>>
>> but as far as I can tell, it is exclusively down to different register
>> scheduling choices.
>> ---
>> xen/include/asm-x86/system.h | 99
>> +++++++++++++++++--------------------
>> xen/include/asm-x86/x86_64/system.h | 24 ++++-----
>> 2 files changed, 57 insertions(+), 66 deletions(-)
>>
>> diff --git a/xen/include/asm-x86/system.h b/xen/include/asm-x86/system.h
>> index 483cd20..8764e31 100644
>> --- a/xen/include/asm-x86/system.h
>> +++ b/xen/include/asm-x86/system.h
>> @@ -23,9 +23,6 @@
>> #define xchg(ptr,v) \
>> ((__typeof__(*(ptr)))__xchg((unsigned long)(v),(ptr),sizeof(*(ptr))))
>>
>> -struct __xchg_dummy { unsigned long a[100]; };
>> -#define __xg(x) ((volatile struct __xchg_dummy *)(x))
> I never fully understood why we have this, so I'm happy to see it
> go away. I see it has gone away in Linux back in 2.6.36.
I've been trying to get rid of it for several years now. I'm glad to
see it gone.
>
>> @@ -79,31 +72,27 @@ static always_inline unsigned long __cmpxchg(
>> switch ( size )
>> {
>> case 1:
>> - asm volatile ( "lock; cmpxchgb %b1,%2"
>> - : "=a" (prev)
>> - : "q" (new), "m" (*__xg(ptr)),
>> - "0" (old)
>> + asm volatile ( "lock; cmpxchg %b[new], %[ptr]"
>> + : "=a" (prev), [ptr] "+m" (*(uint8_t *)ptr)
>> + : [new] "r" (new), "0" (old)
> Any reason you retain the reference by number in the input
> constraint here, rather than giving its corresponding output
> one a name?
Not specifically. I suppose this is doable because the constraint is an
explicitly register.
>
> Also since you're playing with this anyway - is there a need to
> retain the bogus ; after "lock"?
Ok.
>
>> --- a/xen/include/asm-x86/x86_64/system.h
>> +++ b/xen/include/asm-x86/x86_64/system.h
>> @@ -25,7 +25,7 @@ static always_inline __uint128_t __cmpxchg16b(
>>
>> /* Don't use "=A" here - clang can't deal with that. */
>> asm volatile ( "lock; cmpxchg16b %2"
> Any reason not to change this to named operands as well?
Simply code perturbance, and the fact that it isn't exactly ambiguous to
begin with. I can change it.
>
>> @@ -63,36 +63,38 @@ static always_inline __uint128_t cmpxchg16b_local_(
>> * If no fault occurs then _o is updated to the value we saw at _p. If this
>> * is the same as the initial value of _o then _n is written to location _p.
>> */
>> -#define __cmpxchg_user(_p,_o,_n,_isuff,_oppre,_regtype) \
>> +#define __cmpxchg_user(_p, _o, _n, _oppre) \
>> stac(); \
>> asm volatile ( \
>> - "1: lock; cmpxchg"_isuff" %"_oppre"2,%3\n" \
>> + "1: lock; cmpxchg %"_oppre"[new], %[ptr]\n" \
>> "2:\n" \
>> ".section .fixup,\"ax\"\n" \
>> "3: movl $1,%1\n" \
> Any what about this?
I'm certain that I fix that at one point. I must have lost it in a rebase.
>
>> " jmp 2b\n" \
>> ".previous\n" \
>> _ASM_EXTABLE(1b, 3b) \
>> - : "=a" (_o), "=r" (_rc) \
>> - : _regtype (_n), "m" (*__xg((volatile void *)_p)), "0" (_o), "1"
>> (0) \
>> + : "+a" (_o), "=r" (_rc), \
>> + [ptr] "+m" (*(volatile typeof(*(_p)) *)(_p)) \
> Does casting to add "volatile" here really make any difference,
> considering the asm() itself is a volatile one and has a memory
> clobber?
Yes. mod_l1_entry() hits a BUG() without it.
Until I understand why, I purposefully didn't change the volatility of
any of these constructs in what is otherwise a cleanup patch.
~Andrew
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |