Xen project Mailing List

[Xen-devel] Xen Linux deadlock

To: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>

From: Andre Przywara <andre.przywara@xxxxxxx>

Date: Wed, 7 Jun 2017 16:05:21 +0100

Cc: Juergen Gross <jgross@xxxxxxxx>, Julien Grall <julien.grall@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>

Delivery-date: Wed, 07 Jun 2017 15:05:21 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

Hi,

when booting Linux 4.12-rc4 as Dom0 under a recent Xen HV I saw the following lockdep splat after running xencommons start:

root@junor1:~# bash /etc/init.d/xencommons start
Setting domain 0 name, domid and JSON config...
[  247.979498] ======================================================
[  247.985688] WARNING: possible circular locking dependency detected
[  247.991882] 4.12.0-rc4-00022-gc4b25c0 #575 Not tainted
[  247.997040] ------------------------------------------------------
[  248.003232] xenbus/91 is trying to acquire lock:
[  248.007875]  (&u->msgbuffer_mutex){+.+.+.}, at: [<ffff00000863e904>] xenbus_dev_queue_reply+0x3c/0x230
[  248.017163]
[  248.017163] but task is already holding lock:
[  248.023096]  (xb_write_mutex){+.+...}, at: [<ffff00000863a940>] xenbus_thread+0x5f0/0x798
[  248.031267]
[  248.031267] which lock already depends on the new lock.
[  248.031267]
[  248.039615]
[  248.039615] the existing dependency chain (in reverse order) is:
[  248.047176]
[  248.047176] -> #1 (xb_write_mutex){+.+...}:
[  248.052943]        __lock_acquire+0x1728/0x1778
[  248.057498]        lock_acquire+0xc4/0x288
[  248.061630]        __mutex_lock+0x84/0x868
[  248.065755]        mutex_lock_nested+0x3c/0x50
[  248.070227]        xs_send+0x164/0x1f8
[  248.074015]        xenbus_dev_request_and_reply+0x6c/0x88
[  248.079427]        xenbus_file_write+0x260/0x420
[  248.084073]        __vfs_write+0x48/0x138
[  248.088113]        vfs_write+0xa8/0x1b8
[  248.091983]        SyS_write+0x54/0xb0
[  248.095768]        el0_svc_naked+0x24/0x28
[  248.099897]
[  248.099897] -> #0 (&u->msgbuffer_mutex){+.+.+.}:
[  248.106088]        print_circular_bug+0x80/0x2e0
[  248.110730]        __lock_acquire+0x1768/0x1778
[  248.115288]        lock_acquire+0xc4/0x288
[  248.119417]        __mutex_lock+0x84/0x868
[  248.123545]        mutex_lock_nested+0x3c/0x50
[  248.128016]        xenbus_dev_queue_reply+0x3c/0x230
[  248.133005]        xenbus_thread+0x788/0x798
[  248.137306]        kthread+0x110/0x140
[  248.141087]        ret_from_fork+0x10/0x40
[  248.145214]
[  248.145214] other info that might help us debug this:
[  248.145214]
[  248.153383]  Possible unsafe locking scenario:
[  248.153383]
[  248.159403]        CPU0                    CPU1
[  248.163960]        ----                    ----
[  248.168518]   lock(xb_write_mutex);
[  248.172045]                                lock(&u->msgbuffer_mutex);
[  248.178500]                                lock(xb_write_mutex);
[  248.184514]   lock(&u->msgbuffer_mutex);
[  248.188470]
[  248.188470]  *** DEADLOCK ***
[  248.188470]
[  248.194578] 2 locks held by xenbus/91:
[  248.198360]  #0:  (xs_response_mutex){+.+...}, at: [<ffff00000863a7b0>] xenbus_thread+0x460/0x798
[  248.207218]  #1:  (xb_write_mutex){+.+...}, at: [<ffff00000863a940>] xenbus_thread+0x5f0/0x798
[  248.215818]
[  248.215818] stack backtrace:
[  248.220293] CPU: 0 PID: 91 Comm: xenbus Not tainted 4.12.0-rc4-00022-gc4b25c0 #575
[  248.227858] Hardware name: ARM Juno development board (r1) (DT)
[  248.233792] Call trace:
[  248.236289] [<ffff00000808a748>] dump_backtrace+0x0/0x270
[  248.241707] [<ffff00000808aa94>] show_stack+0x24/0x30
[  248.246782] [<ffff0000084caa98>] dump_stack+0xb8/0xf0
[  248.251859] [<ffff000008139068>] print_circular_bug+0x1f8/0x2e0
[  248.257787] [<ffff00000813c090>] __lock_acquire+0x1768/0x1778
[  248.263548] [<ffff00000813c90c>] lock_acquire+0xc4/0x288
[  248.268882] [<ffff000008bdb28c>] __mutex_lock+0x84/0x868
[  248.274219] [<ffff000008bdbaac>] mutex_lock_nested+0x3c/0x50
[  248.279889] [<ffff00000863e904>] xenbus_dev_queue_reply+0x3c/0x230
[  248.286081] [<ffff00000863aad8>] xenbus_thread+0x788/0x798
[  248.291585] [<ffff000008108070>] kthread+0x110/0x140
[  248.296572] [<ffff000008083710>] ret_from_fork+0x10/0x40

Apparently it's not easily reproducible, but Julien confirmed that the deadlock condition as reported above is indeed in the Linux code.

Does anyone have an idea of how to fix this?

Cheers,
Andre.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
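
For readers unfamiliar with lockdep reports, the circular dependency in the splat above can be sketched in user space. This is a minimal Python illustration, not kernel code: the class LockOrderGraph and its acquire method are hypothetical names, and the held-lock sets are read off the "Possible unsafe locking scenario" table and the "2 locks held" list in the log. It only mimics the idea of recording lock-order edges and checking for a cycle; real lockdep is considerably more involved.

```python
from collections import defaultdict


class LockOrderGraph:
    """Toy lock-order tracker (hypothetical, for illustration only)."""

    def __init__(self):
        # edges[a] contains b if some task took b while holding a
        self.edges = defaultdict(set)

    def _reachable(self, src, dst):
        # Depth-first search: is there an ordering path src -> ... -> dst?
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges[node])
        return False

    def acquire(self, held, new):
        """Record 'held -> new' edges; return True if this closes a cycle."""
        # A cycle exists if 'new' already had to be taken before a held lock.
        inversion = any(self._reachable(new, h) for h in held)
        for h in held:
            self.edges[h].add(new)
        return inversion


graph = LockOrderGraph()

# Chain #1 in the log: the write path takes xb_write_mutex in xs_send()
# while msgbuffer_mutex is held (CPU1 column of the scenario table).
first = graph.acquire(held=["msgbuffer_mutex"], new="xb_write_mutex")

# Chain #0 in the log: xenbus_thread holds xs_response_mutex and
# xb_write_mutex, then takes msgbuffer_mutex in xenbus_dev_queue_reply().
second = graph.acquire(
    held=["xs_response_mutex", "xb_write_mutex"], new="msgbuffer_mutex"
)

print(first, second)
```

The first acquisition only records an ordering; the second one is flagged because it asks for the two mutexes in the opposite order, which is the condition lockdep warns about here.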
