linux/drivers/infiniband/core
Jason Gunthorpe 621e55ff5b RDMA/devices: Do not deadlock during client removal
lockdep reports:

   WARNING: possible circular locking dependency detected

   modprobe/302 is trying to acquire lock:
   0000000007c8919c ((wq_completion)ib_cm){+.+.}, at: flush_workqueue+0xdf/0x990

   but task is already holding lock:
   000000002d3d2ca9 (&device->client_data_rwsem){++++}, at: remove_client_context+0x79/0xd0 [ib_core]

   which lock already depends on the new lock.

   the existing dependency chain (in reverse order) is:

   -> #2 (&device->client_data_rwsem){++++}:
          down_read+0x3f/0x160
          ib_get_net_dev_by_params+0xd5/0x200 [ib_core]
          cma_ib_req_handler+0x5f6/0x2090 [rdma_cm]
          cm_process_work+0x29/0x110 [ib_cm]
          cm_req_handler+0x10f5/0x1c00 [ib_cm]
          cm_work_handler+0x54c/0x311d [ib_cm]
          process_one_work+0x4aa/0xa30
          worker_thread+0x62/0x5b0
          kthread+0x1ca/0x1f0
          ret_from_fork+0x24/0x30

   -> #1 ((work_completion)(&(&work->work)->work)){+.+.}:
          process_one_work+0x45f/0xa30
          worker_thread+0x62/0x5b0
          kthread+0x1ca/0x1f0
          ret_from_fork+0x24/0x30

   -> #0 ((wq_completion)ib_cm){+.+.}:
          lock_acquire+0xc8/0x1d0
          flush_workqueue+0x102/0x990
          cm_remove_one+0x30e/0x3c0 [ib_cm]
          remove_client_context+0x94/0xd0 [ib_core]
          disable_device+0x10a/0x1f0 [ib_core]
          __ib_unregister_device+0x5a/0xe0 [ib_core]
          ib_unregister_device+0x21/0x30 [ib_core]
          mlx5_ib_stage_ib_reg_cleanup+0x9/0x10 [mlx5_ib]
          __mlx5_ib_remove+0x3d/0x70 [mlx5_ib]
          mlx5_ib_remove+0x12e/0x140 [mlx5_ib]
          mlx5_remove_device+0x144/0x150 [mlx5_core]
          mlx5_unregister_interface+0x3f/0xf0 [mlx5_core]
          mlx5_ib_cleanup+0x10/0x3a [mlx5_ib]
          __x64_sys_delete_module+0x227/0x350
          do_syscall_64+0xc3/0x6a4
          entry_SYSCALL_64_after_hwframe+0x49/0xbe

Which is due to the read side of the client_data_rwsem being obtained
recursively through a work queue flush during cm client removal.

The lock is being held across the remove in remove_client_context() so
that the function is a fence, once it returns the client is removed. This
is required so that the two callers do not proceed with destruction until
the client completes removal.

Instead of using client_data_rwsem use the existing device unregistration
refcount and add a similar client unregistration (client->uses) refcount.

This will fence the two unregistration paths without holding any locks.

Cc: <stable@vger.kernel.org>
Fixes: 921eab1143 ("RDMA/devices: Re-organize device.c locking")
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Link: https://lore.kernel.org/r/20190731081841.32345-2-leon@kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
2019-08-01 11:44:47 -04:00
..
addr.c RDMA/core: Fix race when resolving IP address 2019-07-09 16:27:04 -03:00
agent.c
agent.h
cache.c IB/core, ipoib: Do not overreact to SM LID change event 2019-05-07 16:06:03 -03:00
cgroup.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 288 2019-06-05 17:36:37 +02:00
cm_msgs.h RDMA: Use __packed annotation instead of __attribute__ ((packed)) 2019-03-25 21:14:12 -03:00
cm.c IB/cm: Reduce dependency on gid attribute ndev check 2019-05-03 11:10:02 -03:00
cma_configfs.c
cma_priv.h
cma.c RDMA/cma: Use rdma_read_gid_attr_ndev_rcu to access netdev 2019-05-03 11:10:03 -03:00
core_priv.h RDMA/nldev: Added configuration of RDMA dynamic interrupt moderation to netlink 2019-07-08 16:37:22 -03:00
counters.c IB/counters: Always initialize the port counter object 2019-07-25 11:45:48 -03:00
cq.c RDMA/core: Fix -Wunused-const-variable warnings 2019-07-11 11:49:55 -03:00
device.c RDMA/devices: Do not deadlock during client removal 2019-08-01 11:44:47 -04:00
fmr_pool.c
iwcm.c RDMA: Get rid of iw_cm_verbs 2019-05-03 10:56:56 -03:00
iwcm.h
iwpm_msg.c
iwpm_util.c netlink: make validation more configurable for future strictness 2019-04-27 17:07:21 -04:00
iwpm_util.h
mad_priv.h RDMA: Use __packed annotation instead of __attribute__ ((packed)) 2019-03-25 21:14:12 -03:00
mad_rmpp.c
mad_rmpp.h
mad.c IB/MAD: Add SMP details to MAD tracing 2019-03-27 15:52:01 -03:00
Makefile RDMA/counter: Add set/clear per-port auto mode support 2019-07-05 10:22:54 -03:00
mr_pool.c Linux 5.2-rc6 2019-06-28 21:18:23 -03:00
multicast.c IB/core, ipoib: Do not overreact to SM LID change event 2019-05-07 16:06:03 -03:00
netlink.c
nldev.c IB/core: Work on the caller socket net namespace in nldev_newlink() 2019-07-08 17:03:35 -03:00
opa_smi.h
packer.c
rdma_core.c uverbs: Convert idr to XArray 2019-04-25 12:27:11 -03:00
rdma_core.h RDMA/core: Clear out the udata before error unwind 2019-05-27 14:35:26 -03:00
restrack.c RDMA/restrack: Make is_visible_in_pid_ns() as an API 2019-07-05 10:22:54 -03:00
restrack.h RDMA/restrack: Make is_visible_in_pid_ns() as an API 2019-07-05 10:22:54 -03:00
roce_gid_mgmt.c drivers: use in_dev_for_each_ifa_rtnl/rcu 2019-06-02 18:06:26 -07:00
rw.c Linux 5.2-rc6 2019-06-28 21:18:23 -03:00
sa_query.c 5.2 Merge Window pull request 2019-05-09 09:02:46 -07:00
sa.h
security.c
smi.c
smi.h
sysfs.c RDMA/nldev: Allow get default counter statistics through RDMA netlink 2019-07-05 10:22:55 -03:00
ucma.c RDMA: Report available cdevs through RDMA_NLDEV_CMD_GET_CHARDEV 2019-06-18 22:44:08 -04:00
ud_header.c
umem_odp.c RDMA/odp: Do not leak dma maps when working with huge pages 2019-06-20 21:52:47 -04:00
umem.c RDMA: Check umem pointer validity prior to release 2019-06-20 15:17:59 -04:00
user_mad.c IB/core: Add mitigation for Spectre V1 2019-08-01 11:34:11 -04:00
uverbs_cmd.c RDMA/uverbs: remove redundant assignment to variable ret 2019-07-04 14:06:47 -03:00
uverbs_ioctl.c mm: security: introduce init_on_alloc=1 and init_on_free=1 boot options 2019-07-12 11:05:46 -07:00
uverbs_main.c RDMA: Report available cdevs through RDMA_NLDEV_CMD_GET_CHARDEV 2019-06-18 22:44:08 -04:00
uverbs_marshall.c
uverbs_std_types_counters.c IB: When attrs.udata/ufile is available use that instead of uobject 2019-04-08 13:05:25 -03:00
uverbs_std_types_cq.c Linux 5.2-rc6 2019-06-28 21:18:23 -03:00
uverbs_std_types_device.c
uverbs_std_types_dm.c IB: When attrs.udata/ufile is available use that instead of uobject 2019-04-08 13:05:25 -03:00
uverbs_std_types_flow_action.c IB: When attrs.udata/ufile is available use that instead of uobject 2019-04-08 13:05:25 -03:00
uverbs_std_types_mr.c Linux 5.2-rc6 2019-06-28 21:18:23 -03:00
uverbs_std_types.c IB: Remove 'uobject->context' dependency in object destroy APIs 2019-04-01 14:59:35 -03:00
uverbs_uapi.c RDMA: Move driver_id into struct ib_device_ops 2019-06-10 16:56:02 -03:00
uverbs.h uverbs: Convert idr to XArray 2019-04-25 12:27:11 -03:00
verbs.c RDMA/counter: Add "auto" configuration mode support 2019-07-05 10:22:54 -03:00