Re: [Xen-devel] [PATCH v6 04/11] qspinlock: Optimized code path for 2 contending tasks
On 03/12/2014 02:54 PM, Waiman Long wrote:

+
+	/*
+	 * Now wait until the lock bit is cleared
+	 */
+	while (smp_load_acquire(&qlock->qlcode) & _QSPINLOCK_LOCKED)
+		arch_mutex_cpu_relax();
+
+	/*
+	 * Set the lock bit & clear the waiting bit simultaneously
+	 * It is assumed that there is no lock stealing with this
+	 * quick path active.
+	 *
+	 * A direct memory store of _QSPINLOCK_LOCKED into the
+	 * lock_wait field causes problem with the lockref code, e.g.
+	 *   ACCESS_ONCE(qlock->lock_wait) = _QSPINLOCK_LOCKED;
+	 *
+	 * It is not currently clear why this happens. A workaround
+	 * is to use atomic instruction to store the new value.
+	 */
+	{
+		u16 lw = xchg(&qlock->lock_wait, _QSPINLOCK_LOCKED);
+		BUG_ON(lw != _QSPINLOCK_WAITING);
+	}
+	return 1;

It was found that when I used a direct memory store instead of an atomic op, the following kernel crash might happen at filesystem dismount time:

Red Hat Enterprise Linux Server 7.0 (Maipo)
Kernel 3.14.0-rc6-qlock on an x86_64

h11-kvm20 login: [ 1529.934047] BUG: Dentry ffff883f4c048480{i=30181e9e,n=libopcodes-2.23.52.0.1-15.el7.so} still in use (-1) [unmount of xfs dm-1]
[ 1529.935762] ------------[ cut here ]------------
[ 1529.936331] kernel BUG at fs/dcache.c:1343!
[ 1529.936714] invalid opcode: 0000 [#1] SMP
[ 1529.936714] Modules linked in: ext4 mbcache jbd2 binfmt_misc brd ip6t_rpfilter cfg80211 ip6t_REJECT rfkill ipt_REJECT xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw iptable_filter ip_tables sg ppdev snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd parport_pc parport soundcore serio_raw i2c_piix4 virtio_console virtio_balloon microcode pcspkr nfsd auth_rpcgss nfs_acl lockd sunrpc uinput xfs libcrc32c sr_mod cdrom ata_generic pata_acpi qxl virtio_blk virtio_net drm_kms_helper ttm drm ata_piix libata virtio_pci virtio_ring floppy i2c_core virtio dm_mirror dm_region_hash dm_log dm_mod
[ 1529.936714] CPU: 12 PID: 11106 Comm: umount Not tainted 3.14.0-rc6-qlock #1
[ 1529.936714] Hardware name: Red Hat KVM, BIOS Bochs 01/01/2011
[ 1529.936714] task: ffff881f9183b540 ti: ffff881f920fa000 task.ti: ffff881f920fa000
[ 1529.936714] RIP: 0010:[<ffffffff811c185c>]  [<ffffffff811c185c>] umount_collect+0xec/0x110
[ 1529.936714] RSP: 0018:ffff881f920fbdc8  EFLAGS: 00010282
[ 1529.936714] RAX: 0000000000000073 RBX: ffff883f4c048480 RCX: 0000000000000000
[ 1529.936714] RDX: 0000000000000001 RSI: 0000000000000046 RDI: 0000000000000246
[ 1529.936714] RBP: ffff881f920fbde0 R08: ffffffff819e42e0 R09: 0000000000000396
[ 1529.936714] R10: 0000000000000000 R11: ffff881f920fbb06 R12: ffff881f920fbe60
[ 1529.936714] R13: ffff883f8d458460 R14: ffff883f4c048480 R15: ffff883f8d4583c0
[ 1529.936714] FS:  00007f6027b0c880(0000) GS:ffff88403fc40000(0000) knlGS:0000000000000000
[ 1529.936714] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1529.936714] CR2: 00007f60276c4900 CR3: 0000003f421c0000 CR4: 00000000000006e0
[ 1529.936714] Stack:
[ 1529.936714]  ffff883f8edf4ac8 ffff883f4c048510 ffff883f910a02d0 ffff881f920fbe50
[ 1529.936714]  ffffffff811c2d03 0000000000000000 00ff881f920fbe50 0000896600000000
[ 1529.936714]  ffff883f8d4587d8 ffff883f8d458780 ffffffff811c1770 ffff881f920fbe60
[ 1529.936714] Call Trace:
[ 1529.936714]  [<ffffffff811c2d03>] d_walk+0xc3/0x260
[ 1529.936714]  [<ffffffff811c1770>] ? check_and_collect+0x30/0x30
[ 1529.936714]  [<ffffffff811c3985>] shrink_dcache_for_umount+0x75/0x120
[ 1529.936714]  [<ffffffff811adf21>] generic_shutdown_super+0x21/0xf0
[ 1529.936714]  [<ffffffff811ae207>] kill_block_super+0x27/0x70
[ 1529.936714]  [<ffffffff811ae4ed>] deactivate_locked_super+0x3d/0x60
[ 1529.936714]  [<ffffffff811aea96>] deactivate_super+0x46/0x60
[ 1529.936714]  [<ffffffff811ca277>] mntput_no_expire+0xa7/0x140
[ 1529.936714]  [<ffffffff811cb6ce>] SyS_umount+0x8e/0x100
[ 1529.936714]  [<ffffffff815d2c29>] system_call_fastpath+0x16/0x1b
[ 1529.936714] Code: 00 00 48 8b 40 28 4c 8b 08 48 8b 43 30 48 85 c0 74 2a 48 8b 50 40 48 89 34 24 48 c7 c7 e0 4a 7f 81 48 89 de 31 c0 e8 03 cb 3f 00 <0f> 0b 66 90 48 89 f7 e8 c8 fc ff ff e9 66 ff ff ff 31 d2 90 eb
[ 1529.936714] RIP  [<ffffffff811c185c>] umount_collect+0xec/0x110
[ 1529.936714]  RSP <ffff881f920fbdc8>
[ 1529.976523] ---[ end trace 6c8ce7cee0969bbb ]---
[ 1529.977137] Kernel panic - not syncing: Fatal exception
[ 1529.978119] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff)
[ 1529.978119] drm_kms_helper: panic occurred, switching back to text console

The crash was more readily reproducible in a KVM guest. It was harder to reproduce on a bare-metal machine, but the kernel crash still happened after several tries. I am not sure what exactly causes this crash, but it must have something to do with the interaction between the lockref and the qspinlock code. I would like more eyes on it to find the root cause.
-Longman

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel