Re: [Xen-devel] [PATCH v5 3/8] qspinlock, x86: Add x86 specific optimization for 2 contending tasks

To: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
From: Waiman Long <waiman.long@xxxxxx>
Date: Tue, 04 Mar 2014 12:48:26 -0500
Cc: Jeremy Fitzhardinge <jeremy@xxxxxxxx>, Raghavendra K T <raghavendra.kt@xxxxxxxxxxxxxxxxxx>, Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>, virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx, Andi Kleen <andi@xxxxxxxxxxxxxx>, "H. Peter Anvin" <hpa@xxxxxxxxx>, Michel Lespinasse <walken@xxxxxxxxxx>, Alok Kataria <akataria@xxxxxxxxxx>, linux-arch@xxxxxxxxxxxxxxx, x86@xxxxxxxxxx, Ingo Molnar <mingo@xxxxxxxxxx>, Scott J Norton <scott.norton@xxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx, "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx>, Alexander Fyodorov <halcy@xxxxxxxxx>, Arnd Bergmann <arnd@xxxxxxxx>, Daniel J Blueman <daniel@xxxxxxxxxxxxx>, Rusty Russell <rusty@xxxxxxxxxxxxxxx>, Oleg Nesterov <oleg@xxxxxxxxxx>, Steven Rostedt <rostedt@xxxxxxxxxxx>, Chris Wright <chrisw@xxxxxxxxxxxx>, George Spelvin <linux@xxxxxxxxxxx>, Thomas Gleixner <tglx@xxxxxxxxxxxxx>, Aswin Chandramouleeswaran <aswin@xxxxxx>, Chegu Vinod <chegu_vinod@xxxxxx>, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>, linux-kernel@xxxxxxxxxxxxxxx, David Vrabel <david.vrabel@xxxxxxxxxx>, Paolo Bonzini <pbonzini@xxxxxxxxxx>, Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>, Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx>
Delivery-date: Tue, 04 Mar 2014 17:48:57 +0000
List-id: Xen developer discussion <xen-devel.lists.xen.org>

Peter,

I tried implementing the generic queue code exchange using cmpxchg, as
you suggested. However, when I gathered performance data, the code
performed worse than I expected at higher contention levels. Below are
the execution times of the benchmark tool that I sent you:

                                [xchg]          [cmpxchg]
  # of tasks    Ticket lock     Queue lock      Queue lock
  ----------    -----------     ----------      ----------
       1             135             135              135
       2             732            1315             1102
       3            1827            2372             2681
       4            2689            2934             5392
       5            3736            3658             7696
       6            4942            4434             9876
       7            6304            5176            11901
       8            7736            5955            14551

Below is the code that I used:

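/*
 * Returns 1 if the lock was free and has been acquired. Returns 0 if
 * ncode was stored as the new queue code; *ocode then holds the old
 * queue code with the lock bit cleared.
 */
static inline u32 queue_code_xchg(struct qspinlock *lock, u32 *ocode,
                                  u32 ncode)
{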
        while (true) {
                u32 qlcode = atomic_read(&lock->qlcode);

                if (qlcode == 0) {
                        /*
                         * Try to get the lock
                         */
                        if (atomic_cmpxchg(&lock->qlcode, 0,
                                           _QSPINLOCK_LOCKED) == 0)
                                return 1;
                } else if (qlcode & _QSPINLOCK_LOCKED) {
                        *ocode = atomic_cmpxchg(&lock->qlcode, qlcode,
                                                ncode | _QSPINLOCK_LOCKED);
                        if (*ocode == qlcode) {
                                /* Clear lock bit before return */
                                *ocode &= ~_QSPINLOCK_LOCKED;
                                return 0;
                        }
                }
                /*
                 * Wait if atomic_cmpxchg() fails or lock is temporarily free.
                 */
                arch_mutex_cpu_relax();
        }
}
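
For anyone who wants to experiment with this loop outside the kernel, it
can be approximated in userspace with C11 atomics. This is only a sketch
under stated assumptions, not the code from the patch: the struct name
qspinlock_sketch, the function name queue_code_xchg_sketch, and the
lock-bit value of 1 are hypothetical stand-ins, <stdatomic.h> replaces
the kernel atomic_t helpers, and the arch_mutex_cpu_relax() call is
replaced by a plain retry.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for _QSPINLOCK_LOCKED; the value is an assumption. */
#define QSPINLOCK_LOCKED 1u

/* Hypothetical stand-in for struct qspinlock. */
struct qspinlock_sketch {
    _Atomic uint32_t qlcode;    /* lock bit plus queue code */
};

/*
 * Returns 1 if the lock was free and has been acquired. Returns 0 if
 * ncode was stored as the new queue code; *ocode then holds the old
 * queue code with the lock bit cleared.
 */
static inline uint32_t queue_code_xchg_sketch(struct qspinlock_sketch *lock,
                                              uint32_t *ocode, uint32_t ncode)
{
    assert(ocode != NULL);

    for (;;) {
        uint32_t qlcode = atomic_load(&lock->qlcode);

        if (qlcode == 0) {
            /* Lock is free and nobody is queued: try to take it. */
            uint32_t expected = 0;
            if (atomic_compare_exchange_strong(&lock->qlcode, &expected,
                                               QSPINLOCK_LOCKED))
                return 1;
        } else if (qlcode & QSPINLOCK_LOCKED) {
            /* Lock is held: try to install our queue code. */
            uint32_t expected = qlcode;
            if (atomic_compare_exchange_strong(&lock->qlcode, &expected,
                                               ncode | QSPINLOCK_LOCKED)) {
                *ocode = qlcode & ~QSPINLOCK_LOCKED;
                return 0;
            }
        }
        /* The compare-exchange failed or the lock is temporarily free: retry. */
    }
}
```

Every failed compare-exchange sends the caller back around the loop to
reload qlcode and try again, whereas an unconditional exchange always
completes in one atomic operation.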

My cmpxchg code is not optimal, and I can probably tune the code to
make it perform better. Given the trend that I was seeing, however,
I think I will keep the current xchg code, but I will package it in
an inline function.

-Longman


