Re: [Xen-devel] [PATCH 1/2] rwlock: add per-cpu reader-writer locks
>>> On 03.11.15 at 18:58, <malcolm.crossley@xxxxxxxxxx> wrote:
> Per-cpu read-write locks allow for the fast path read case to have low
> overhead
> by only setting/clearing a per-cpu variable for using the read lock.
> The per-cpu read fast path also avoids locked compare swap operations which
> can
> be particularly slow on coherent multi-socket systems, particularly if there
> is
> heavy usage of the read lock itself.
>
> The per-cpu reader-writer lock uses a global variable to control the read lock
> fast path. This allows a writer to disable the fast path and ensure the
> readers
> use the underlying read-write lock implementation.
>
> Once the writer has taken the write lock and disabled the fast path, it must
> poll the per-cpu variable for all CPU's which have entered the critical
> section
> for the specific read-write lock the writer is attempting to take. This design
> allows for a single per-cpu variable to be used for read/write locks belonging
> to seperate data structures as long as multiple per-cpu read locks are not
> simultaneously held by one particular cpu. This also means per-cpu
> reader-writer locks are not recursion safe.

This sounds like a pretty severe restriction, and something easy to get
wrong once more than a handful of users get introduced.

> Slow path readers which are unblocked set the per-cpu variable and drop the
> read lock. This simplifies the implementation and allows for fairness in the
> underlying read-write lock to be taken advantage of.
>
> There may be slightly more overhead on the per-cpu write lock path due to
> checking each CPUs fast path read variable but this overhead is likely be
> hidden
> by the required delay of waiting for readers to exit the critical section.

Hmm, the "slightly" here seems to depend on the number of CPUs.

> --- a/xen/common/spinlock.c
> +++ b/xen/common/spinlock.c
> @@ -492,6 +492,38 @@ int _rw_is_write_locked(rwlock_t *lock)
>      return (lock->lock == RW_WRITE_FLAG); /* writer in critical section? */
>  }
>
> +void percpu_write_lock(rwlock_t **per_cpudata, bool_t *writer_activating,
> +                       rwlock_t *rwlock)
> +{
> +    int cpu;

unsigned int

> +    cpumask_t active_readers;

Did you consider ways of avoiding putting such a large variable on the
stack, and did none turn out suitable?

> +    cpumask_copy(&active_readers, &cpu_online_map);

While we have other cases not fully hotplug safe, I don't think we can
allow this in basic locking infrastructure code. You need to protect
against cpu_online_map changing.

> +    /* First take the write lock to protect against other writers */

Most if not all of your comments are missing a full stop.

> +    write_lock(rwlock);
> +
> +    /* Now set the global variable so that readers start using read_lock */
> +    *writer_activating = true;
> +    smp_mb();
> +
> +    /* Check if there are any percpu readers in progress on our rwlock*/
> +    do
> +    {
> +        for_each_cpu(cpu, &active_readers)
> +        {
> +            /* Remove any percpu readers not contending
> +             * from our check mask */

Comment style.

> +            if ( per_cpu_ptr(per_cpudata, cpu) != rwlock )
> +                cpumask_clear_cpu(cpu, &active_readers);

This is a LOCKed op, and with the subject of the patch being to lower the
load on the fabric, I wonder whether the time isn't ripe for introducing
non-LOCKed variants of set, clear, and their test_and_ counterparts.
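Roughly the kind of thing I have in mind - an untested sketch only, with
the double-underscore names invented here, mirroring the existing
non-atomic __set_bit()/__clear_bit()/__test_and_clear_bit() bitops these
would be built upon:

static inline void __cpumask_set_cpu(unsigned int cpu, cpumask_t *dstp)
{
    /* Non-atomic, i.e. no LOCK prefix on x86. */
    __set_bit(cpu, dstp->bits);
}

static inline void __cpumask_clear_cpu(unsigned int cpu, cpumask_t *dstp)
{
    /* Non-atomic, i.e. no LOCK prefix on x86. */
    __clear_bit(cpu, dstp->bits);
}

static inline int __cpumask_test_and_clear_cpu(unsigned int cpu,
                                               cpumask_t *dstp)
{
    /* Non-atomic, i.e. no LOCK prefix on x86. */
    return __test_and_clear_bit(cpu, dstp->bits);
}

These would of course only be valid on masks not updated concurrently,
but that holds for the on-stack active_readers mask above.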
> +        }
> +        /* Check if we've cleared all percpu readers */
> +        if ( cpumask_empty(&active_readers) )
> +            break;
> +        /* Give the coherency fabric a break */
> +        cpu_relax();
> +    } while ( 1 );

While certainly a matter of taste, could I talk you into using for ( ; ; )
instead?

> --- a/xen/include/xen/spinlock.h
> +++ b/xen/include/xen/spinlock.h
> @@ -3,6 +3,7 @@
>
>  #include <asm/system.h>
>  #include <asm/spinlock.h>
> +#include <xen/stdbool.h>

No inclusion of this header in code other than code shared with the tools,
please.

> @@ -261,4 +262,40 @@ int _rw_is_write_locked(rwlock_t *lock);
>  #define rw_is_locked(l)               _rw_is_locked(l)
>  #define rw_is_write_locked(l)         _rw_is_write_locked(l)
>
> +static inline void percpu_read_lock(rwlock_t **per_cpudata, bool_t *writer_activating,
> +                                    rwlock_t *rwlock)
> +{
> +    /* Indicate this cpu is reading */
> +    this_cpu_ptr(per_cpudata) = rwlock;
> +    smp_mb();
> +    /* Check if a writer is waiting */
> +    if ( unlikely(*writer_activating) )
> +    {
> +        /* Let the waiting writer know we aren't holding the lock */
> +        this_cpu_ptr(per_cpudata) = NULL;
> +        /* Wait using the read lock to keep the lock fair */
> +        read_lock(rwlock);
> +        /* Set the per CPU data again and continue*/
> +        this_cpu_ptr(per_cpudata) = rwlock;
> +        /* Drop the read lock because we don't need it anymore */
> +        read_unlock(rwlock);
> +    }
> +}

So am I getting it right that the whole point of the introduction of this
lock flavor is to replace the cmpxchg() in read_lock() with just the
write+read above? If so, I'm sure you considered alternatives, namely
modifying the plain rw locks - would you mind sharing the outcome thereof?
I admit I can't see alternatives, but the price - as said above - seems
quite high. Or maybe I'm misunderstanding the "as long as multiple per-cpu
read locks are not simultaneously held by one particular cpu", as having
made it here I can't right away see where that restriction comes from.

> +static inline void percpu_read_unlock(rwlock_t **per_cpudata)
> +{
> +    this_cpu_ptr(per_cpudata) = NULL;
> +    smp_wmb();
> +}
> +
> +/* Don't inline percpu write lock as it's a complex function */
> +void percpu_write_lock(rwlock_t **per_cpudata, bool_t *writer_activating,
> +                       rwlock_t *rwlock);
> +
> +static inline void percpu_write_unlock(bool_t *writer_activating, rwlock_t *rwlock)
> +{
> +    *writer_activating = false;
> +    write_unlock(rwlock);
> +}

Considering that the arguments one needs to pass to any of the functions
here taking multiple arguments need to stay in close sync, I'm missing
abstracting macros which need to be passed just a single argument (plus a
declaration and a definition one) - see the sketch below.

Jan
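To make the abstracting-macros point concrete - a rough, untested sketch,
with all names invented here for illustration (not claiming to match how
the series actually structures things):

/*
 * Bundle the three pieces of state behind a single name, so call sites
 * pass just that one name and can't get the triple out of sync.
 */
#define DECLARE_PERCPU_RWLOCK_GLOBAL(name)                        \
    DECLARE_PER_CPU(rwlock_t *, name##_percpu);                   \
    extern bool_t name##_writer_activating;                       \
    extern rwlock_t name##_rwlock

#define DEFINE_PERCPU_RWLOCK_GLOBAL(name)                         \
    DEFINE_PER_CPU(rwlock_t *, name##_percpu);                    \
    bool_t name##_writer_activating;                              \
    DEFINE_RWLOCK(name##_rwlock)

#define percpu_rw_read_lock(name)                                 \
    percpu_read_lock(&get_per_cpu_var(name##_percpu),             \
                     &name##_writer_activating, &name##_rwlock)

#define percpu_rw_read_unlock(name)                               \
    percpu_read_unlock(&get_per_cpu_var(name##_percpu))

#define percpu_rw_write_lock(name)                                \
    percpu_write_lock(&get_per_cpu_var(name##_percpu),            \
                      &name##_writer_activating, &name##_rwlock)

#define percpu_rw_write_unlock(name)                              \
    percpu_write_unlock(&name##_writer_activating, &name##_rwlock)

A user would then do e.g. DEFINE_PERCPU_RWLOCK_GLOBAL(grant_rwlock) once
and use percpu_rw_read_lock(grant_rwlock) etc. at the call sites.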