
Re: [Xen-devel] [PATCH v5 1/8] qspinlock: Introducing a 4-byte queue spinlock implementation



On 03/02/2014 08:12 AM, Oleg Nesterov wrote:
On 02/26, Waiman Long wrote:
+void queue_spin_lock_slowpath(struct qspinlock *lock, int qsval)
+{
+       unsigned int cpu_nr, qn_idx;
+       struct qnode *node, *next;
+       u32 prev_qcode, my_qcode;
+
+       /*
+        * Get the queue node
+        */
+       cpu_nr = smp_processor_id();
+       node   = get_qnode(&qn_idx);
+
+       /*
+        * It should never happen that all the queue nodes are being used.
+        */
+       BUG_ON(!node);
+
+       /*
+        * Set up the new cpu code to be exchanged
+        */
+       my_qcode = queue_encode_qcode(cpu_nr, qn_idx);
+
+       /*
+        * Initialize the queue node
+        */
+       node->wait = true;
+       node->next = NULL;
+
+       /*
+        * The lock may be available at this point, try again if no task was
+        * waiting in the queue.
+        */
+       if (!(qsval >> _QCODE_OFFSET) && queue_spin_trylock(lock)) {
+               put_qnode();
+               return;
+       }
Cosmetic, but probably "goto release_node" would be more consistent.

Yes, that is true.

And I am wondering how much this "qsval >> _QCODE_OFFSET" check can help.
Note that this is the only usage of this arg, perhaps it would be better
to simply remove it and shrink the caller's code a bit? It is also used
in 3/8, but we can read the "fresh" value of ->qlcode (trylock does this
anyway), and perhaps it can actually help if it is already unlocked.

First of all, removing the qsval argument would not shrink the caller's code, at least on x86. The caller simply directs the return value of the cmpxchg instruction into the register used for the second function parameter.

When the lock is lightly contended, there isn't much difference between checking qsval and reading a fresh copy of qlcode. However, when the lock is heavily contended, every additional read or write contributes to the cacheline bouncing traffic. The code was written to minimize those optional read requests.
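
For illustration, the fast path looks roughly like the sketch below (a simplified sketch, not the exact code in the patch): the value returned by the failed cmpxchg is already sitting in a register, so handing it over as the second argument costs the caller nothing.

        static __always_inline void queue_spin_lock(struct qspinlock *lock)
        {
                int qsval;

                /* Uncontended case: 0 -> locked, no queue code set. */
                qsval = atomic_cmpxchg(&lock->qlcode, 0, _QSPINLOCK_LOCKED);
                if (likely(qsval == 0))
                        return;

                /*
                 * The failed cmpxchg already returned the current qlcode
                 * value in a register, so passing it on adds no extra
                 * instructions in the caller.
                 */
                queue_spin_lock_slowpath(lock, qsval);
        }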

+       prev_qcode = atomic_xchg(&lock->qlcode, my_qcode);
+       /*
+        * It is possible that we may accidentally steal the lock. If this is
+        * the case, we need to either release it if not the head of the queue
+        * or get the lock and be done with it.
+        */
+       if (unlikely(!(prev_qcode & _QSPINLOCK_LOCKED))) {
+               if (prev_qcode == 0) {
+                       /*
+                        * Got the lock since it is at the head of the queue
+                        * Now try to atomically clear the queue code.
+                        */
+                       if (atomic_cmpxchg(&lock->qlcode, my_qcode,
+                                         _QSPINLOCK_LOCKED) == my_qcode)
+                               goto release_node;
+                       /*
+                        * The cmpxchg fails only if one or more tasks
+                        * are added to the queue. In this case, we need to
+                        * notify the next one to be the head of the queue.
+                        */
+                       goto notify_next;
+               }
+               /*
+                * Accidentally steal the lock, release the lock and
+                * let the queue head get it.
+                */
+               queue_spin_unlock(lock);
+       } else
+               prev_qcode &= ~_QSPINLOCK_LOCKED;    /* Clear the lock bit */
You know, actually I started this email because I thought that "goto notify_next"
is wrong; I misread the patch as if this "goto" can happen even if prev_qcode != 0.

So feel free to ignore, all my comments are cosmetic/subjective, but to me it
would be more clean/clear to rewrite the code above as

        if (prev_qcode == 0) {
                if (atomic_cmpxchg(..., _QSPINLOCK_LOCKED) == my_qcode)
                        goto release_node;
                goto notify_next;
        }

        if (prev_qcode & _QSPINLOCK_LOCKED)
                prev_qcode &= ~_QSPINLOCK_LOCKED;
        else
                queue_spin_unlock(lock);


This part of the code causes confusion and makes it harder to read. I am planning to rewrite it to use cmpxchg to make sure that it won't accidentally steal the lock. That should make the code easier to understand and make it possible to write better optimized code in other parts of the function.
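
Roughly, the cmpxchg-based enqueue would look like the sketch below (illustrative only, not the planned code): the exchange replaces just the queue code and always carries the lock bit over unchanged, so a waiter can never pick up the lock by accident; if the lock happens to be free, the queue head still acquires it through the normal setlock path.

        for (;;) {
                u32 old = atomic_read(&lock->qlcode);
                u32 new = my_qcode | (old & _QSPINLOCK_LOCKED);

                /* Replace the queue code but keep the lock bit as-is. */
                if (atomic_cmpxchg(&lock->qlcode, old, new) == old) {
                        prev_qcode = old & ~_QSPINLOCK_LOCKED;
                        break;
                }
                cpu_relax();
        }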

+       while (true) {
+               u32 qcode;
+               int retval;
+
+               retval = queue_get_lock_qcode(lock, &qcode, my_qcode);
+               if (retval > 0)
+                       ;       /* Lock not available yet */
+               else if (retval < 0)
+                       /* Lock taken, can release the node & return */
+                       goto release_node;
I guess this is for 3/8, which adds the optimized version of
queue_get_lock_qcode(), so perhaps this "retval < 0" block can go into 3/8
as well.


Yes, that is true.

+               else if (qcode != my_qcode) {
+                       /*
+                        * Just get the lock with other spinners waiting
+                        * in the queue.
+                        */
+                       if (queue_spin_setlock(lock))
+                               goto notify_next;
OTOH, at least the generic (non-optimized) version of queue_spin_setlock()
could probably accept "qcode" and avoid atomic_read() + _QSPINLOCK_LOCKED
check.


Will do so.
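
For the generic version, that could be as simple as the sketch below (again just a sketch; the parameter is the qcode that queue_get_lock_qcode() already returned):

        static inline int queue_spin_setlock(struct qspinlock *lock, u32 qcode)
        {
                /*
                 * The caller already knows the current queue code, so try
                 * to set the lock bit on top of it directly instead of
                 * doing an atomic_read() plus a _QSPINLOCK_LOCKED test.
                 */
                return atomic_cmpxchg(&lock->qlcode, qcode,
                                      qcode | _QSPINLOCK_LOCKED) == qcode;
        }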

Thanks for the comments.

-Longman

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

