linux/drivers/infiniband/hw
Selvin Xavier 942c9b6ca8 RDMA/bnxt_re: Avoid Hard lockup during error CQE processing
Hitting the following hardlockup due to a race condition in
error CQE processing.

[26146.879798] bnxt_en 0000:04:00.0: QPLIB: FP: CQ Processed Req
[26146.886346] bnxt_en 0000:04:00.0: QPLIB: wr_id[1251] = 0x0 with status 0xa
[26156.350935] NMI watchdog: Watchdog detected hard LOCKUP on cpu 4
[26156.357470] Modules linked in: nfsd auth_rpcgss nfs_acl lockd grace
[26156.447957] CPU: 4 PID: 3413 Comm: kworker/4:1H Kdump: loaded
[26156.457994] Hardware name: Dell Inc. PowerEdge R430/0CN7X8,
[26156.466390] Workqueue: ib-comp-wq ib_cq_poll_work [ib_core]
[26156.472639] Call Trace:
[26156.475379]  <NMI>  [<ffffffff98d0d722>] dump_stack+0x19/0x1b
[26156.481833]  [<ffffffff9873f775>] watchdog_overflow_callback+0x135/0x140
[26156.489341]  [<ffffffff9877f237>] __perf_event_overflow+0x57/0x100
[26156.496256]  [<ffffffff98787c24>] perf_event_overflow+0x14/0x20
[26156.502887]  [<ffffffff9860a580>] intel_pmu_handle_irq+0x220/0x510
[26156.509813]  [<ffffffff98d16031>] perf_event_nmi_handler+0x31/0x50
[26156.516738]  [<ffffffff98d1790c>] nmi_handle.isra.0+0x8c/0x150
[26156.523273]  [<ffffffff98d17be8>] do_nmi+0x218/0x460
[26156.528834]  [<ffffffff98d16d79>] end_repeat_nmi+0x1e/0x7e
[26156.534980]  [<ffffffff987089c0>] ? native_queued_spin_lock_slowpath+0x1d0/0x200
[26156.543268]  [<ffffffff987089c0>] ? native_queued_spin_lock_slowpath+0x1d0/0x200
[26156.551556]  [<ffffffff987089c0>] ? native_queued_spin_lock_slowpath+0x1d0/0x200
[26156.559842]  <EOE>  [<ffffffff98d083e4>] queued_spin_lock_slowpath+0xb/0xf
[26156.567555]  [<ffffffff98d15690>] _raw_spin_lock+0x20/0x30
[26156.573696]  [<ffffffffc08381a1>] bnxt_qplib_lock_buddy_cq+0x31/0x40 [bnxt_re]
[26156.581789]  [<ffffffffc083bbaa>] bnxt_qplib_poll_cq+0x43a/0xf10 [bnxt_re]
[26156.589493]  [<ffffffffc083239b>] bnxt_re_poll_cq+0x9b/0x760 [bnxt_re]

The issue happens if RQ poll_cq or SQ poll_cq or Async error event tries to
put the error QP in flush list. Since SQ and RQ of each error qp are added
to two different flush list, we need to protect it using locks of
corresponding CQs. Difference in order of acquiring the lock in
SQ poll_cq and RQ poll_cq can cause a hard lockup.

Revisits the locking strategy and removes the usage of qplib_cq.hwq.lock.
Instead of this lock, introduces qplib_cq.flush_lock to handle
addition/deletion of QPs in flush list. Also, always invoke the flush_lock
in order (SQ CQ lock first and then RQ CQ lock) to avoid any potential
deadlock.

Other than the poll_cq context, the movement of QP to/from flush list can
be done in modify_qp context or from an async error event from HW.
Synchronize these operations using the bnxt_re verbs layer CQ locks.
To achieve this, adds a call back to the HW abstraction layer(qplib) to
bnxt_re ib_verbs layer in case of async error event. Also, removes the
buddy cq functions as it is no longer required.

Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-06 20:08:39 -07:00
..
bnxt_re RDMA/bnxt_re: Avoid Hard lockup during error CQE processing 2018-03-06 20:08:39 -07:00
cxgb3 Updates for 4.15 kernel merge window 2017-11-15 14:54:53 -08:00
cxgb4 iw_cxgb4: Change error/warn prints to pr_debug 2017-12-29 11:09:23 -07:00
hfi1 vfs: do bulk POLL* -> EPOLL* replacement 2018-02-11 14:34:03 -08:00
hns RDMA/hns: Fix the endian problem for hns 2018-02-05 10:48:48 -05:00
i40iw i40iw: Free IEQ resources 2018-01-16 20:38:18 -07:00
mlx4 IB/mlx4: Include GID type when deleting GIDs from HW table under RoCE 2018-03-06 20:08:38 -07:00
mlx5 IB/mlx5: Fix an error code in __mlx5_ib_modify_qp() 2018-03-06 20:08:39 -07:00
mthca IB/mthca: remove mthca_user.h 2018-01-28 14:07:16 -07:00
nes nes: Change accelerated flag to bool 2017-12-22 13:33:30 -07:00
ocrdma IB/ocrdma: Use zeroing memory allocator than allocator/memset 2018-01-02 11:20:13 -07:00
qedr RDMA/qedr: Fix iWARP write and send with immediate 2018-03-06 19:57:37 -07:00
qib vfs: do bulk POLL* -> EPOLL* replacement 2018-02-11 14:34:03 -08:00
usnic drivers: infiniband: remove duplicate includes 2017-12-22 09:39:35 -07:00
vmw_pvrdma RDMA/vmw_pvrdma: Fix usage of user response structures in ABI file 2018-02-15 15:31:28 -07:00
Makefile License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00