lockdep splat when disconnecting NVMe-oF TCP initiator
Lucas Stach
l.stach at pengutronix.de
Wed Sep 23 05:02:05 EDT 2020
Hi nvme folks,
I've been experimenting with NVMe-oF TCP over the last few days and hit
the following lockdep splat on the target side when disconnecting the
initiator. I don't know if I'll have time to look into this anytime
soon, so I figured I should at least leave the report here.
Regards,
Lucas
[ 142.078964] WARNING: possible circular locking dependency detected
[ 142.085172] 5.9.0-rc5 #2 Not tainted
[ 142.091982] ------------------------------------------------------
[ 142.098186] kworker/0:3/74 is trying to acquire lock:
[ 142.103259] ffff000015228830 ((work_completion)(&queue->io_work)){+.+.}-{0:0}, at: __flush_work+0x54/0x510
[ 142.112981]
[ 142.112981] but task is already holding lock:
[ 142.118835] ffff800011efbdc0 ((work_completion)(&queue->release_work)){+.+.}-{0:0}, at: process_one_work+0x1ec/0x710
[ 142.129413]
[ 142.129413] which lock already depends on the new lock.
[ 142.129413]
[ 142.137618]
[ 142.137618] the existing dependency chain (in reverse order) is:
[ 142.145124]
[ 142.145124] -> #2 ((work_completion)(&queue->release_work)){+.+.}-{0:0}:
[ 142.153352] process_one_work+0x248/0x710
[ 142.157906] worker_thread+0x74/0x470
[ 142.162119] kthread+0x15c/0x160
[ 142.165894] ret_from_fork+0x10/0x38
[ 142.170005]
[ 142.170005] -> #1 ((wq_completion)events){+.+.}-{0:0}:
[ 142.176665] flush_workqueue+0x98/0x400
[ 142.181054] nvmet_tcp_install_queue+0x11c/0x130
[ 142.186217] nvmet_install_queue+0xbc/0x150
[ 142.190946] nvmet_execute_admin_connect+0x11c/0x200
[ 142.196458] nvmet_tcp_io_work+0x8c0/0x950
[ 142.201099] process_one_work+0x294/0x710
[ 142.205654] worker_thread+0x74/0x470
[ 142.209860] kthread+0x15c/0x160
[ 142.213630] ret_from_fork+0x10/0x38
[ 142.217742]
[ 142.217742] -> #0 ((work_completion)(&queue->io_work)){+.+.}-{0:0}:
[ 142.225542] __lock_acquire+0x13fc/0x2160
[ 142.230097] lock_acquire+0xec/0x4d0
[ 142.234215] __flush_work+0x7c/0x510
[ 142.238333] flush_work+0x14/0x20
[ 142.242192] nvmet_tcp_release_queue_work+0xb0/0x280
[ 142.247703] process_one_work+0x294/0x710
[ 142.252257] worker_thread+0x74/0x470
[ 142.256465] kthread+0x15c/0x160
[ 142.260234] ret_from_fork+0x10/0x38
[ 142.264346]
[ 142.264346] other info that might help us debug this:
[ 142.264346]
[ 142.272375] Chain exists of:
[ 142.272375] (work_completion)(&queue->io_work) --> (wq_completion)events --> (work_completion)(&queue->release_work)
[ 142.272375]
[ 142.287476] Possible unsafe locking scenario:
[ 142.287476]
[ 142.293415]        CPU0                    CPU1
[ 142.297962]        ----                    ----
[ 142.302508]   lock((work_completion)(&queue->release_work));
[ 142.308194]                               lock((wq_completion)events);
[ 142.314836]                               lock((work_completion)(&queue->release_work));
[ 142.323046]   lock((work_completion)(&queue->io_work));
[ 142.328296]
[ 142.328296] *** DEADLOCK ***
[ 142.328296]
[ 142.334239] 2 locks held by kworker/0:3/74:
[ 142.338438] #0: ffff000017405738 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x1ec/0x710
[ 142.347887] #1: ffff800011efbdc0 ((work_completion)(&queue->release_work)){+.+.}-{0:0}, at: process_one_work+0x1ec/0x710
[ 142.358901]
[ 142.358901] stack backtrace:
[ 142.363284] CPU: 0 PID: 74 Comm: kworker/0:3 Not tainted 5.9.0-rc5 #2
[ 142.372970] Hardware name: XXX (DT)
[ 142.378053] Workqueue: events nvmet_tcp_release_queue_work
[ 142.383563] Call trace:
[ 142.386032] dump_backtrace+0x0/0x1b0
[ 142.389717] show_stack+0x18/0x30
[ 142.393055] dump_stack+0xe8/0x15c
[ 142.396480] print_circular_bug+0x278/0x280
[ 142.400688] check_noncircular+0x164/0x1e0
[ 142.404808] __lock_acquire+0x13fc/0x2160
[ 142.408841] lock_acquire+0xec/0x4d0
[ 142.412437] __flush_work+0x7c/0x510
[ 142.416032] flush_work+0x14/0x20
[ 142.419369] nvmet_tcp_release_queue_work+0xb0/0x280
[ 142.424358] process_one_work+0x294/0x710
[ 142.428390] worker_thread+0x74/0x470
[ 142.432075] kthread+0x15c/0x160
[ 142.435322] ret_from_fork+0x10/0x38
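My reading of the chain: nvmet_tcp_release_queue_work() is executed on
the system "events" workqueue and does flush_work(&queue->io_work),
while nvmet_tcp_io_work() ends up in nvmet_tcp_install_queue(), which
flushes the system workqueue (the flush_workqueue in frame #1 looks
like flush_scheduled_work() to me). So the release work waits for the
io work, while the io work waits for everything queued on the workqueue
that the release work is currently running on.

Below is a minimal, nvmet-independent sketch of the pattern that should
trigger the same report with lockdep enabled. The repro_* names are
made up for illustration and don't exist anywhere in the tree:

/*
 * Reproducer sketch: work B runs from a private workqueue and flushes
 * the system workqueue; work A runs *on* the system workqueue and
 * flushes work B. Lockdep should record the same circular dependency
 * as in the nvmet/tcp report above.
 */
#include <linux/module.h>
#include <linux/workqueue.h>

static struct workqueue_struct *repro_wq;	/* stands in for the nvmet tcp wq */
static struct work_struct repro_io_work;	/* stands in for queue->io_work */
static struct work_struct repro_release_work;	/* stands in for queue->release_work */

static void repro_io_fn(struct work_struct *w)
{
	/* like the nvmet_tcp_install_queue() path: wait for the system
	 * workqueue, which may currently be running repro_release_fn() */
	flush_scheduled_work();
}

static void repro_release_fn(struct work_struct *w)
{
	/* like nvmet_tcp_release_queue_work(): runs on the system
	 * workqueue and waits for the io work to finish */
	flush_work(&repro_io_work);
}

static int __init repro_init(void)
{
	repro_wq = alloc_workqueue("repro_wq", WQ_MEM_RECLAIM, 0);
	if (!repro_wq)
		return -ENOMEM;

	INIT_WORK(&repro_io_work, repro_io_fn);
	INIT_WORK(&repro_release_work, repro_release_fn);

	queue_work(repro_wq, &repro_io_work);	/* io work on the private wq */
	schedule_work(&repro_release_work);	/* release work on the system wq */
	return 0;
}

static void __exit repro_exit(void)
{
	cancel_work_sync(&repro_release_work);
	cancel_work_sync(&repro_io_work);
	destroy_workqueue(repro_wq);
}

module_init(repro_init);
module_exit(repro_exit);
MODULE_LICENSE("GPL");

Naively, moving the release work off the system workqueue onto a
dedicated one would break the cycle, since the flush in the install
path would then no longer wait on it, but I haven't tried that.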