Xen project Mailing List

Re: [Xen-devel] [PATCH v2] arm: reduce power use by contented spin locks with WFE/SEV

On Mon, 3 Aug 2015, Ian Campbell wrote:
> From: David Vrabel <david.vrabel@xxxxxxxxxx>
>
> Instead of cpu_relax() while spinning and observing the ticket head,
> introduce arch_lock_relax() which executes a WFE instruction. After
> the ticket head is changed call arch_lock_signal() to execute an SEV
> instruction (with the required DSB first) to wake any spinners.
>
> This should improve power consumption when locks are contended and
> spinning.
>
> For consistency also move arch_lock_(acquire|release)_barrier to
> asm/spinlock.h.
>
> Booted the result on arm32 (Midway) and arm64 (Mustang). Build test
> only on amd64.
>
> Signed-off-by: David Vrabel <david.vrabel@xxxxxxxxxx>
> [ijc: add barrier, rename as arch_lock_*, move arch_lock_*_barrier, test]
> Signed-off-by: Ian Campbell <ian.campbell@xxxxxxxxxx>
> Cc: Stefano Stabellini <stefano.stabellini@xxxxxxxxxxxxx>
> Cc: Jan Beulich <jbeulich@xxxxxxxx>
> Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

I don't know why you are moving arch_lock_(acquire|release)_barrier to
spinlock.h as part of this patch, I would have thought that sticking to
one goal only would have been better for this kind of change. The code
should work though.

> v2 (ijc):
>   Add dsb(ishst) to spin_relax.
>   s/spin_(relax|signal)/arch_lock_\1/
>   Move arch_lock_(acquire|release)_barrier to asm/spinlock.h
>   (dropped Andy's Reviewed-by due to this change)
>
> In principle the SEV could be made unnecessary on arm64, but this
> requires a new hook before the wait loop as well as changing
> observe_head and _spin_unlock to use the Acquire/Release instructions
> instead of the non-atomic loads and stores used today, which is a lot
> more refactoring of the generic code than I think we can be bothered
> with at this stage.
>
> 4.6: I'm in two minds about this, the lack of WFE in the ticket
> spinlocks is not a regression (the old locks lacked them as well,
> oops!). On the other hand spinning like this isn't good. I think
> overall I'm inclined to say this should wait for 4.7 but be a
> candidate for backport to 4.6.1.

I agree

> ---
>  xen/common/spinlock.c          |  5 +++--
>  xen/include/asm-arm/spinlock.h |  9 ++++++++-
>  xen/include/asm-arm/system.h   |  3 ---
>  xen/include/asm-x86/spinlock.h | 14 ++++++++++++++
>  xen/include/asm-x86/system.h   | 11 -----------
>  5 files changed, 25 insertions(+), 17 deletions(-)
>
> diff --git a/xen/common/spinlock.c b/xen/common/spinlock.c
> index 29149d1..7f89694 100644
> --- a/xen/common/spinlock.c
> +++ b/xen/common/spinlock.c
> @@ -141,7 +141,7 @@ void _spin_lock(spinlock_t *lock)
>      while ( tickets.tail != observe_head(&lock->tickets) )
>      {
>          LOCK_PROFILE_BLOCK;
> -        cpu_relax();
> +        arch_lock_relax();
>      }
>      LOCK_PROFILE_GOT;
>      preempt_disable();
> @@ -170,6 +170,7 @@ void _spin_unlock(spinlock_t *lock)
>      preempt_enable();
>      LOCK_PROFILE_REL;
>      add_sized(&lock->tickets.head, 1);
> +    arch_lock_signal();
>  }
>
>  void _spin_unlock_irq(spinlock_t *lock)
> @@ -228,7 +229,7 @@ void _spin_barrier(spinlock_t *lock)
>      if ( sample.head != sample.tail )
>      {
>          while ( observe_head(&lock->tickets) == sample.head )
> -            cpu_relax();
> +            arch_lock_relax();
>  #ifdef LOCK_PROFILE
>          if ( lock->profile )
>          {
> diff --git a/xen/include/asm-arm/spinlock.h b/xen/include/asm-arm/spinlock.h
> index 81955d1..8cdf9e1 100644
> --- a/xen/include/asm-arm/spinlock.h
> +++ b/xen/include/asm-arm/spinlock.h
> @@ -1,6 +1,13 @@
>  #ifndef __ASM_SPINLOCK_H
>  #define __ASM_SPINLOCK_H
>
> -/* Nothing ARM specific. */
> +#define arch_lock_acquire_barrier() smp_mb()
> +#define arch_lock_release_barrier() smp_mb()
> +
> +#define arch_lock_relax() wfe()
> +#define arch_lock_signal() do { \
> +    dsb(ishst); \
> +    sev(); \
> +} while(0)
>
>  #endif /* __ASM_SPINLOCK_H */
> diff --git a/xen/include/asm-arm/system.h b/xen/include/asm-arm/system.h
> index f0e222f..2eb96e8 100644
> --- a/xen/include/asm-arm/system.h
> +++ b/xen/include/asm-arm/system.h
> @@ -53,9 +53,6 @@
>
>  #define arch_fetch_and_add(x, v) __sync_fetch_and_add(x, v)
>
> -#define arch_lock_acquire_barrier() smp_mb()
> -#define arch_lock_release_barrier() smp_mb()
> -
>  extern struct vcpu *__context_switch(struct vcpu *prev, struct vcpu *next);
>
>  #endif
> diff --git a/xen/include/asm-x86/spinlock.h b/xen/include/asm-x86/spinlock.h
> index 7d69e75..70a85af 100644
> --- a/xen/include/asm-x86/spinlock.h
> +++ b/xen/include/asm-x86/spinlock.h
> @@ -4,4 +4,18 @@
>  #define _raw_read_unlock(l) \
>      asm volatile ( "lock; dec%z0 %0" : "+m" ((l)->lock) :: "memory" )
>
> +/*
> + * On x86 the only reordering is of reads with older writes. In the
> + * lock case, the read in observe_head() can only be reordered with
> + * writes that precede it, and moving a write _into_ a locked section
> + * is OK. In the release case, the write in add_sized() can only be
> + * reordered with reads that follow it, and hoisting a read _into_ a
> + * locked region is OK.
> + */
> +#define arch_lock_acquire_barrier() barrier()
> +#define arch_lock_release_barrier() barrier()
> +
> +#define arch_lock_relax() cpu_relax()
> +#define arch_lock_signal()
> +
>  #endif /* __ASM_SPINLOCK_H */
> diff --git a/xen/include/asm-x86/system.h b/xen/include/asm-x86/system.h
> index 25a6a2a..9fb70f5 100644
> --- a/xen/include/asm-x86/system.h
> +++ b/xen/include/asm-x86/system.h
> @@ -185,17 +185,6 @@ static always_inline unsigned long __xadd(
>  #define set_mb(var, value) do { xchg(&var, value); } while (0)
>  #define set_wmb(var, value) do { var = value; wmb(); } while (0)
>
> -/*
> - * On x86 the only reordering is of reads with older writes. In the
> - * lock case, the read in observe_head() can only be reordered with
> - * writes that precede it, and moving a write _into_ a locked section
> - * is OK. In the release case, the write in add_sized() can only be
> - * reordered with reads that follow it, and hoisting a read _into_ a
> - * locked region is OK.
> - */
> -#define arch_lock_acquire_barrier() barrier()
> -#define arch_lock_release_barrier() barrier()
> -
>  #define local_irq_disable() asm volatile ( "cli" : : : "memory" )
>  #define local_irq_enable() asm volatile ( "sti" : : : "memory" )
>
> --
> 2.1.4
>

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

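For readers following the thread, the relax/signal pairing discussed above can be sketched outside of Xen roughly as follows. This is only an illustrative userspace approximation using C11 atomics, not the code from the patch: the names demo_ticket_lock_t, demo_spin_lock(), demo_spin_unlock(), demo_lock_relax(), demo_lock_signal() and demo_usage() are all hypothetical, the inline assembly is only assumed to be appropriate on AArch64, and every other target falls back to a plain busy-wait with an empty signal hook (similar in spirit to the x86 definitions in the patch).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical stand-ins for arch_lock_relax()/arch_lock_signal().
 * Assumption: on AArch64 the waiter sleeps in WFE and the releaser
 * issues DSB ISHST followed by SEV, mirroring the ARM macros above.
 * Elsewhere the hooks degrade to a busy-wait and a no-op.
 */
#if defined(__aarch64__)
static inline void demo_lock_relax(void)
{
    __asm__ volatile ( "wfe" ::: "memory" );
}
static inline void demo_lock_signal(void)
{
    /* Make the head update visible before waking the waiters. */
    __asm__ volatile ( "dsb ishst\n\tsev" ::: "memory" );
}
#else
static inline void demo_lock_relax(void)
{
    atomic_signal_fence(memory_order_seq_cst); /* compiler barrier only */
}
static inline void demo_lock_signal(void)
{
}
#endif

/* Minimal ticket lock: tail hands out tickets, head is "now serving". */
typedef struct {
    _Atomic uint16_t head;
    _Atomic uint16_t tail;
} demo_ticket_lock_t;

static inline void demo_spin_lock(demo_ticket_lock_t *lock)
{
    /* Take a ticket, then wait until it is being served. */
    uint16_t ticket = atomic_fetch_add_explicit(&lock->tail, 1,
                                                memory_order_relaxed);

    while ( atomic_load_explicit(&lock->head,
                                 memory_order_acquire) != ticket )
        demo_lock_relax();
}

static inline void demo_spin_unlock(demo_ticket_lock_t *lock)
{
    /* Serve the next ticket, then wake anything waiting in relax(). */
    atomic_fetch_add_explicit(&lock->head, 1, memory_order_release);
    demo_lock_signal();
}

/* Uncontended usage: each lock/unlock pair advances both counters. */
void demo_usage(void)
{
    demo_ticket_lock_t lock = { 0, 0 };

    demo_spin_lock(&lock);
    assert(atomic_load(&lock.tail) == 1);
    assert(atomic_load(&lock.head) == 0);
    demo_spin_unlock(&lock);
    assert(atomic_load(&lock.head) == 1);
}
```

The point of the sketch is the ordering in demo_spin_unlock(): the head is updated first and the signal hook runs afterwards, which is the same ordering the patch uses when it places arch_lock_signal() after add_sized() in _spin_unlock().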