linux/drivers/infiniband/hw/hfi1
Mike Marciniszyn 3ffea7d8cd IB/{rdmavt, hfi1, qib}: Fix panic with post receive and SGE compression
The server side of qperf panics as follows:

[242446.336860] IP: report_bug+0x64/0x10
[242446.341031] PGD 1c0c067
[242446.341032] P4D 1c0c067
[242446.343951] PUD 1c0d063
[242446.346870] PMD 8587ea067
[242446.349788] PTE 800000083e14016
[242446.352901]
[242446.358352] Oops: 0003 [#1] SM
[242446.437919] CPU: 1 PID: 7442 Comm: irq/92-hfi1_0 k Not tainted 4.12.0-mam-asm #1
[242446.446365] Hardware name: Intel Corporation S2600WT2/S2600WT2, BIOS SE5C610.86B.01.01.0018.C4.072020161249 07/20/201
[242446.458397] task: ffff8808392d2b80 task.stack: ffffc9000664000
[242446.465097] RIP: 0010:report_bug+0x64/0x10
[242446.469859] RSP: 0018:ffffc900066439c0 EFLAGS: 0001000
[242446.475784] RAX: ffffffffa06647e4 RBX: ffffffffa06461e1 RCX: 000000000000000
[242446.483840] RDX: 0000000000000907 RSI: ffffffffa0675040 RDI: ffffffffffff740
[242446.491897] RBP: ffffc900066439e0 R08: 0000000000000001 R09: 000000000000025
[242446.499953] R10: ffffffff81a253df R11: 0000000000000133 R12: ffffc90006643b3
[242446.508010] R13: ffffffffa065bbf0 R14: 00000000000001e5 R15: 000000000000000
[242446.516067] FS:  0000000000000000(0000) GS:ffff88085f640000(0000) knlGS:000000000000000
[242446.525191] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003
[242446.531698] CR2: ffffffffa06647ee CR3: 0000000001c09000 CR4: 00000000001406e
[242446.539756] Call Trace
[242446.542582]  fixup_bug+0x2c/0x5
[242446.546277]  do_trap+0x12b/0x18
[242446.549972]  do_error_trap+0x89/0x11
[242446.554171]  ? hfi1_copy_sge+0x271/0x2b0 [hfi1
[242446.559324]  ? ttwu_do_wakeup+0x1e/0x14
[242446.563795]  ? ttwu_do_activate+0x77/0x8
[242446.568363]  do_invalid_op+0x20/0x3
[242446.572448]  invalid_op+0x1e/0x3
[242446.576247] RIP: 0010:hfi1_copy_sge+0x271/0x2b0 [hfi1
[242446.582075] RSP: 0018:ffffc90006643be8 EFLAGS: 0001004
[242446.587999] RAX: 0000000000000000 RBX: ffff88083e0fa240 RCX: 000000000000000
[242446.596058] RDX: 0000000000000000 RSI: ffff880842508000 RDI: ffff88083e0fa24
[242446.604116] RBP: ffffc90006643c28 R08: 0000000000000000 R09: 000000000000000
[242446.612172] R10: ffffc90009473640 R11: 0000000000000133 R12: 000000000000000
[242446.620228] R13: 0000000000000000 R14: 0000000000002000 R15: ffff88084250800
[242446.628293]  ? hfi1_copy_sge+0x1a1/0x2b0 [hfi1
[242446.633449]  hfi1_rc_rcv+0x3da/0x1270 [hfi1
[242446.638312]  ? sc_buffer_alloc+0x113/0x150 [hfi1
[242446.643662]  hfi1_ib_rcv+0x1c9/0x2e0 [hfi1
[242446.648428]  process_receive_ib+0x19a/0x270 [hfi1
[242446.653866]  ? process_rcv_qp_work+0xd2/0x160 [hfi1
[242446.659505]  handle_receive_interrupt_nodma_rtail+0x184/0x2e0 [hfi1
[242446.666693]  ? irq_finalize_oneshot+0x100/0x10
[242446.671846]  receive_context_thread+0x1b/0x140 [hfi1
[242446.677576]  irq_thread_fn+0x1e/0x4
[242446.681659]  irq_thread+0x13c/0x1b
[242446.685646]  ? irq_forced_thread_fn+0x60/0x6
[242446.690604]  kthread+0x112/0x15
[242446.694298]  ? irq_thread_check_affinity+0xe0/0xe
[242446.699738]  ? kthread_park+0x60/0x6
[242446.703919]  ? do_syscall_64+0x67/0x15
[242446.708292]  ret_from_fork+0x25/0x3
[242446.712374] Code: 63 78 04 44 0f b7 70 08 41 89 d0 4c 8d 2c 38 41 83 e0 01 f6 c2 02 74 17 66 45 85 c0 74 11 f6 c2 04 b9 01 00 00 00 75 bb 83 ca 04 <66> 89 50 0a 66 45 85 c0 74 52 0f b6 48 0b 41 0f b7 f6 4d 89 e0
[242446.733527] RIP: report_bug+0x64/0x100 RSP: ffffc900066439c
[242446.739935] CR2: ffffffffa06647e
[242446.743763] ---[ end trace 0e90a20d0aa494f7 ]--

The root cause is that the qib/hfi1 post receive call to rvt_lkey_ok()
doesn't interpret the new return value properly, leading to an MR
reference count underrun.
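
A toy model of how a miscounted return value can underrun a reference
count is sketched below. It assumes a three-way return convention (1 when
a new SGE is stored, 0 when the SGE is merged into the previous one,
negative on error). All names (toy_mr, toy_sge, toy_lkey_ok,
toy_post_recv, toy_release, toy_final_refcount) are hypothetical and are
not the rdmavt API; the miscounting shown is one plausible form of the
misinterpretation, not a copy of the driver code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are not the rdmavt types. */
struct toy_mr { int refcount; };
struct toy_sge { unsigned long addr; unsigned int len; };

/*
 * Assumed return convention: 1 = new SGE stored (takes one MR reference),
 * 0 = merged into the previous SGE (no new reference), <0 = error.
 */
static int toy_lkey_ok(struct toy_mr *mr, struct toy_sge *last,
                       struct toy_sge *slot, const struct toy_sge *in)
{
    if (in->len == 0)
        return -1;
    if (last && last->addr + last->len == in->addr) {
        last->len += in->len;   /* adjacent: extend the previous SGE */
        return 0;
    }
    *slot = *in;
    mr->refcount++;
    return 1;
}

/* Drop one reference per SGE the caller believes it holds. */
static void toy_release(struct toy_mr *mr, int held)
{
    while (held-- > 0)
        mr->refcount--;
}

/*
 * Build the receive SGE list. With count_by_ret != 0 the slot counter
 * advances by the return value; with count_by_ret == 0 it advances by one
 * per input SGE, which overcounts whenever a merge happened.
 * The caller must pass a zero-initialized out[] with room for n entries.
 */
static int toy_post_recv(struct toy_mr *mr, const struct toy_sge *in, int n,
                         struct toy_sge *out, int count_by_ret)
{
    int j = 0;

    assert(n > 0);
    for (int i = 0; i < n; i++) {
        int ret = toy_lkey_ok(mr, j ? &out[j - 1] : NULL, &out[j], &in[i]);

        if (ret < 0) {
            toy_release(mr, j);
            return -1;
        }
        j += count_by_ret ? ret : 1;
    }
    return j;
}

/* Post two adjacent SGEs, release them, and report the final refcount. */
static int toy_final_refcount(int count_by_ret)
{
    struct toy_mr mr = { 0 };
    const struct toy_sge in[2] = { { 0x1000, 0x100 }, { 0x1100, 0x100 } };
    struct toy_sge out[2] = { { 0, 0 }, { 0, 0 } };
    int held = toy_post_recv(&mr, in, 2, out, count_by_ret);

    toy_release(&mr, held);
    return mr.refcount;
}
```

In this model, toy_final_refcount(1) leaves the count at 0, while
toy_final_refcount(0) leaves it at -1: the two adjacent SGEs take only one
reference, but the caller releases two.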

Additionally, remove an unused argument in rvt_sge_adjacent()
as well as an unneeded incr local in rvt_post_one_wr().

Fixes: Commit 14fe13fcd3 ("IB/rdmavt: Compress adjacent SGEs in rvt_lkey_ok()")
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-31 15:18:38 -04:00
affinity.c IB/hfi1: Replace deprecated pci functions with new API 2017-06-27 16:58:11 -04:00
affinity.h IB/hfi1: Name function prototype parameters for affinity module 2017-06-27 16:56:33 -04:00
aspm.h IB/hfi1: Size rcd array index correctly and consistently 2017-07-31 15:17:55 -04:00
chip_registers.h RDMA/hfi1: Defer setting VL15 credits to link-up interrupt 2017-06-01 17:04:20 -04:00
chip.c IB/hfi1: Do not enable disabled port on cable insert 2017-07-31 15:18:37 -04:00
chip.h IB/hfi1: Pass the context pointer rather than the index 2017-07-31 15:17:55 -04:00
common.h IB/hfi1: Setup common IB fields in hfi1_packet struct 2017-06-27 16:56:33 -04:00
debugfs.c IB/hfi1: Virtual Network Interface Controller (VNIC) HW support 2017-04-20 15:19:35 -04:00
debugfs.h IB/hfi1: Add transmit fault injection feature 2017-04-05 14:45:09 -04:00
device.c infiniband: utilize the new cdev_set_parent function 2017-03-21 06:44:33 +01:00
device.h
driver.c IB/hfi1: Size rcd array index correctly and consistently 2017-07-31 15:17:55 -04:00
efivar.c IB/hfi1: Check upper-case EFI variables 2017-02-19 09:18:37 -05:00
efivar.h
eprom.c IB/hfi1: Handle missing magic values in config file 2017-06-27 16:58:13 -04:00
eprom.h IB/hfi1: Add ability to read platform config from the EPROM 2016-10-02 08:42:20 -04:00
exp_rcv.c IB/hfi1: Initialize TID lists to avoid crash on cleanup 2017-06-27 16:58:13 -04:00
exp_rcv.h IB/hfi1: Fix bar0 mapping to use write combining 2017-07-31 15:17:54 -04:00
file_ops.c IB/hfi1: Only set fd pointer when base context is completely initialized 2017-07-31 15:18:38 -04:00
firmware.c IB/hfi1: Fix initialization failure for debug firmware 2017-07-31 15:17:55 -04:00
hfi.h IB/hfi1: Move saving PCI values to a separate function 2017-07-31 15:18:37 -04:00
init.c IB/hfi1: Pass the context pointer rather than the index 2017-07-31 15:17:55 -04:00
intr.c IB/hfi1: Verify port data VLs credits on transition to Armed 2017-07-31 15:18:37 -04:00
iowait.h IB/hfi1: Serve the most starved iowait entry first 2017-07-31 15:17:54 -04:00
Kconfig Second round of merge items for 4.8 2016-08-04 20:26:31 -04:00
mad.c IB/hfi1: Send MAD traps until repressed 2017-07-31 15:17:55 -04:00
mad.h IB/hfi1: Send MAD traps until repressed 2017-07-31 15:17:55 -04:00
Makefile IB/hfi1: Create common expected receive verbs/PSM code 2017-06-27 16:58:13 -04:00
mmu_rb.c IB/hfi1: Don't remove RB entry when not needed. 2017-06-27 16:56:33 -04:00
mmu_rb.h IB/hfi1: Don't remove RB entry when not needed. 2017-06-27 16:56:33 -04:00
opa_compat.h
pcie.c IB/hfi1: Move saving PCI values to a separate function 2017-07-31 15:18:37 -04:00
pio_copy.c IB/hfi1: Optimize pio_buf and send_context structs 2016-11-15 16:37:27 -05:00
pio.c IB/hfi1: Serve the most starved iowait entry first 2017-07-31 15:17:54 -04:00
pio.h IB/hfi: Fix up comments in engine mapping 2017-04-25 15:24:51 -04:00
platform.c IB/hfi1: Disambiguate corruption and uninitialized error cases 2017-07-31 15:18:38 -04:00
platform.h IB/hfi1: Define platform_config_table_limits once 2016-12-11 15:29:42 -05:00
qp.c IB/hfi1: Serve the most starved iowait entry first 2017-07-31 15:17:54 -04:00
qp.h IB/{rdmavt, qib, hfi1}: Remove gfp flags argument 2017-07-17 21:21:23 -04:00
qsfp.c IB/hfi1: Extend i2c timeout 2016-10-02 08:42:13 -04:00
qsfp.h IB/hfi1: Fetch monitor values on-demand for CableInfo query 2016-08-22 14:31:41 -04:00
rc.c IB/hfi1: Setup common IB fields in hfi1_packet struct 2017-06-27 16:56:33 -04:00
ruc.c IB/{rdmavt, hfi1, qib}: Fix panic with post receive and SGE compression 2017-07-31 15:18:38 -04:00
sdma_txreq.h
sdma.c IB/hfi1: Serve the most starved iowait entry first 2017-07-31 15:17:54 -04:00
sdma.h IB/hfi1: Serve the most starved iowait entry first 2017-07-31 15:17:54 -04:00
sysfs.c RDMA/hfi1: fix array termination by appending NULL to attr array 2017-06-01 17:03:19 -04:00
trace_ctxts.h IB/hfi1: Clean up context initialization 2017-05-04 19:31:46 -04:00
trace_dbg.h IB/hfi1: Separate tracepoints into specific headers 2016-08-02 12:00:54 -04:00
trace_ibhdrs.h IB/hfi1: Separate input/output header tracing 2017-06-27 16:56:33 -04:00
trace_misc.h IB/hfi1: Add traces for TID operations 2017-06-27 16:58:13 -04:00
trace_rc.h IB/rdmavt, IB/hfi1: Fix timer migration regressions 2017-04-05 14:45:09 -04:00
trace_rx.h IB/hfi1: Size rcd array index correctly and consistently 2017-07-31 15:17:55 -04:00
trace_tx.h IB/hfi1: Fix yield logic in send engine 2017-05-04 19:31:46 -04:00
trace.c IB/hfi1: Separate input/output header tracing 2017-06-27 16:56:33 -04:00
trace.h IB/hfi1: Separate tracepoints into specific headers 2016-08-02 12:00:54 -04:00
uc.c IB/hfi1: Setup common IB fields in hfi1_packet struct 2017-06-27 16:56:33 -04:00
ud.c IB/hfi1,qib: Do not send QKey trap for UD qps 2017-06-27 16:58:12 -04:00
user_exp_rcv.c IB/hfi1: Only set fd pointer when base context is completely initialized 2017-07-31 15:18:38 -04:00
user_exp_rcv.h IB/hfi1: Only set fd pointer when base context is completely initialized 2017-07-31 15:18:38 -04:00
user_pages.c IB/hfi1: Virtual Network Interface Controller (VNIC) HW support 2017-04-20 15:19:35 -04:00
user_sdma.c IB/hfi1: Only set fd pointer when base context is completely initialized 2017-07-31 15:18:38 -04:00
user_sdma.h IB/hfi1: Only set fd pointer when base context is completely initialized 2017-07-31 15:18:38 -04:00
verbs_txreq.c IB/hfi1: Add unique txwait_lock for txreq events 2016-11-15 16:25:59 -05:00
verbs_txreq.h IB/hfi1: Remove dependence on qp->s_cur_size 2016-12-11 15:25:13 -05:00
verbs.c IB/hfi1: Send MAD traps until repressed 2017-07-31 15:17:55 -04:00
verbs.h IB/hfi1: Serve the most starved iowait entry first 2017-07-31 15:17:54 -04:00
vnic_main.c IB/hfi1: Pass the context pointer rather than the index 2017-07-31 15:17:55 -04:00
vnic_sdma.c IB/hfi1: Serve the most starved iowait entry first 2017-07-31 15:17:54 -04:00
vnic.h IB/hfi1: Serve the most starved iowait entry first 2017-07-31 15:17:54 -04:00