[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH] x86/atomic: Improvements and simplifications to assembly constraints
>>> On 21.11.18 at 20:37, <andrew.cooper3@xxxxxxxxxx> wrote: > * Some of the single-byte versions specify "=q" as the output. AFAICT, there > was not a legitimate reason to restrict the use of %esi/%edi in the 32-bit > build. Either way, in 64-bit, it is equivelent to "=r". I'm confused about the 32-bit part here: Of course it was necessary to restrict the compiler to the low 4 registers in that case. It's just not clear to me whether you've just written it down wrongly, or whether you indeed think the way it reads to me. > * Constraints in the form "=r" (x) : "0" (x) can be folded to just "+r" (x) > * Switch to using named parameters (mostly for legibility) which in > particular helps with... > * __xchg(), __cmpxchg() and __cmpxchg_user() modify their memory operand, so > must list it as an output operand. This only works because they each have > a memory clobber to give the construct full compiler-barrier properties. > * Every memory operand has an explicit known size. Letting the compiler see > the real size rather than obscuring it with __xg() allows for the removal > of the instruction size suffixes without introducing ambiguity. > > Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx> > --- > CC: Jan Beulich <JBeulich@xxxxxxxx> > CC: Wei Liu <wei.liu2@xxxxxxxxxx> > CC: Roger Pau Monné <roger.pau@xxxxxxxxxx> > > Interestingly, switching to use output memory operands has the following > perturbance in the build: > > add/remove: 0/0 grow/shrink: 3/5 up/down: 70/-124 (-54) > Function old new delta > do_mmu_update 7041 7101 +60 > mctelem_process_deferred 234 242 +8 > cpufreq_governor_dbs 851 853 +2 > _set_status 162 161 -1 > create_irq 325 323 -2 > do_tmem_put 2066 2062 -4 > task_switch_load_seg 892 884 -8 > _get_page_type 6057 5948 -109 > > but as far as I can tell, it is exclusively down to different register > scheduling choices. > --- > xen/include/asm-x86/system.h | 99 > +++++++++++++++++-------------------- > xen/include/asm-x86/x86_64/system.h | 24 ++++----- > 2 files changed, 57 insertions(+), 66 deletions(-) > > diff --git a/xen/include/asm-x86/system.h b/xen/include/asm-x86/system.h > index 483cd20..8764e31 100644 > --- a/xen/include/asm-x86/system.h > +++ b/xen/include/asm-x86/system.h > @@ -23,9 +23,6 @@ > #define xchg(ptr,v) \ > ((__typeof__(*(ptr)))__xchg((unsigned long)(v),(ptr),sizeof(*(ptr)))) > > -struct __xchg_dummy { unsigned long a[100]; }; > -#define __xg(x) ((volatile struct __xchg_dummy *)(x)) I never fully understood why we have this, so I'm happy to see it go away. I see it has gone away in Linux back in 2.6.36. > @@ -79,31 +72,27 @@ static always_inline unsigned long __cmpxchg( > switch ( size ) > { > case 1: > - asm volatile ( "lock; cmpxchgb %b1,%2" > - : "=a" (prev) > - : "q" (new), "m" (*__xg(ptr)), > - "0" (old) > + asm volatile ( "lock; cmpxchg %b[new], %[ptr]" > + : "=a" (prev), [ptr] "+m" (*(uint8_t *)ptr) > + : [new] "r" (new), "0" (old) Any reason you retain the reference by number in the input constraint here, rather than giving its corresponding output one a name? Also since you're playing with this anyway - is there a need to retain the bogus ; after "lock"? > --- a/xen/include/asm-x86/x86_64/system.h > +++ b/xen/include/asm-x86/x86_64/system.h > @@ -25,7 +25,7 @@ static always_inline __uint128_t __cmpxchg16b( > > /* Don't use "=A" here - clang can't deal with that. */ > asm volatile ( "lock; cmpxchg16b %2" Any reason not to change this to named operands as well? > @@ -63,36 +63,38 @@ static always_inline __uint128_t cmpxchg16b_local_( > * If no fault occurs then _o is updated to the value we saw at _p. If this > * is the same as the initial value of _o then _n is written to location _p. > */ > -#define __cmpxchg_user(_p,_o,_n,_isuff,_oppre,_regtype) \ > +#define __cmpxchg_user(_p, _o, _n, _oppre) \ > stac(); \ > asm volatile ( \ > - "1: lock; cmpxchg"_isuff" %"_oppre"2,%3\n" \ > + "1: lock; cmpxchg %"_oppre"[new], %[ptr]\n" \ > "2:\n" \ > ".section .fixup,\"ax\"\n" \ > "3: movl $1,%1\n" \ Any what about this? > " jmp 2b\n" \ > ".previous\n" \ > _ASM_EXTABLE(1b, 3b) \ > - : "=a" (_o), "=r" (_rc) \ > - : _regtype (_n), "m" (*__xg((volatile void *)_p)), "0" (_o), "1" (0) > \ > + : "+a" (_o), "=r" (_rc), \ > + [ptr] "+m" (*(volatile typeof(*(_p)) *)(_p)) \ Does casting to add "volatile" here really make any difference, considering the asm() itself is a volatile one and has a memory clobber? Jan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |