linux/drivers/infiniband/hw/hfi1
Ira Weiny e0cf75deab IB/hfi1: Fix mm_struct use after free
Testing with CONFIG_SLUB_DEBUG_ON=y resulted in the kernel panic below.

This is the result of the mm_struct sometimes being freed before
hfi1_file_close is called.

This was due to a combination of two factors:

1) hfi1_file_close is deferred in process exit and it therefore may not
   be called synchronously with process exit.
2) exit_mm is called prior to exit_files in do_exit.  Normally this is
   OK; however, our kernel bypass code requires access to the mm_struct
   for housekeeping both at "normal" close time and at process exit.

Therefore, the fix is to simply keep a reference to the mm_struct until
we are done with it.
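
The ordering problem and the fix can be illustrated with a small
user-space model. This is only a sketch and not the driver code: the
types mm_model and file_model and every function below are hypothetical
stand-ins. In the kernel, the equivalent would be taking a reference on
the mm_struct at open time and releasing it (for example with mmdrop())
once close-time housekeeping has finished.

```c
/* User-space model of "pin the mm until close"; NOT kernel code.
 * Every type and function name here is hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct mm_model {
	int refcount;	/* stand-in for the kernel's mm reference count */
	bool *freed;	/* set to true when the last reference is dropped */
};

struct file_model {
	struct mm_model *mm;	/* reference pinned at open time */
};

static struct mm_model *mm_create(bool *freed_flag)
{
	struct mm_model *mm = malloc(sizeof(*mm));

	if (!mm)
		return NULL;
	mm->refcount = 1;	/* the owning process's reference */
	mm->freed = freed_flag;
	*freed_flag = false;
	return mm;
}

static void mm_get(struct mm_model *mm)
{
	mm->refcount++;
}

static void mm_put(struct mm_model *mm)
{
	if (--mm->refcount == 0) {
		*mm->freed = true;
		free(mm);
	}
}

/* Open takes its own reference so the mm outlives process exit. */
static void file_open(struct file_model *f, struct mm_model *mm)
{
	mm_get(mm);
	f->mm = mm;
}

/* Returns true if the mm was still alive when close-time housekeeping ran. */
static bool file_close(struct file_model *f, const bool *freed_flag)
{
	bool alive = !*freed_flag;

	/* housekeeping that needs f->mm would run here */
	mm_put(f->mm);	/* drop the reference taken at open */
	f->mm = NULL;
	return alive;
}

/* Models the exit ordering: the mm is released first, the close runs later. */
static bool exit_then_close(void)
{
	bool freed;
	struct file_model f;
	struct mm_model *mm = mm_create(&freed);

	if (!mm)
		return false;
	file_open(&f, mm);
	mm_put(mm);	/* process exit drops its reference (the exit_mm step) */
	bool alive_at_close = file_close(&f, &freed);	/* deferred close */
	return alive_at_close && freed;
}
```

Calling exit_then_close() returns true in this model: the mm is still
valid while the deferred close runs and is freed only once the file
drops its reference. Without the mm_get() in file_open(), the mm_put()
at exit would free the object before close-time housekeeping touched
it, which is the use after free described above.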

[ 3006.340150] general protection fault: 0000 [#1] SMP
[ 3006.346469] Modules linked in: hfi1 rdmavt rpcrdma ib_isert iscsi_target_mod
 ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod
 ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm
 ib_cm iw_cm dm_mirror dm_region_hash dm_log dm_mod snd_hda_codec_realtek
 iTCO_wdt snd_hda_codec_generic iTCO_vendor_support sb_edac edac_core
 x86_pkg_temp_thermal intel_powerclamp coretemp kvm irqbypass
 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel lrw snd_hda_intel
 gf128mul snd_hda_codec glue_helper snd_hda_core ablk_helper
 snd_hwdep cryptd snd_seq snd_seq_device snd_pcm snd_timer snd soundcore pcspkr
 shpchp mei_me sg lpc_ich mei i2c_i801 mfd_core ioatdma ipmi_devintf
 wmi ipmi_si ipmi_msghandler acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd
 grace sunrpc ip_tables ext4 jbd2 mbcache mlx4_en ib_core sr_mod
 sd_mod cdrom crc32c_intel mgag200 drm_kms_helper syscopyarea sysfillrect igb
 sysimgblt fb_sys_fops ptp mlx4_core ttm isci pps_core ahci drm
 libsas libahci dca firewire_ohci i2c_algo_bit scsi_transport_sas firewire_core
 crc_itu_t i2c_core libata [last unloaded: mlx4_ib]
 [ 3006.461759] CPU: 16 PID: 11624 Comm: mpi_stress Not tainted 4.7.0-rc5+ #1
 [ 3006.469915] Hardware name: Intel Corporation W2600CR ........../W2600CR, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013
 [ 3006.483027] task: ffff8804102f0040 ti: ffff8804102f8000 task.ti: ffff8804102f8000
 [ 3006.491971] RIP: 0010:[<ffffffff810f0383>]  [<ffffffff810f0383>] __lock_acquire+0xb3/0x19e0
 [ 3006.501905] RSP: 0018:ffff8804102fb908  EFLAGS: 00010002
 [ 3006.508447] RAX: 6b6b6b6b6b6b6b6b RBX: 0000000000000001 RCX: 0000000000000000
 [ 3006.517012] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff880410b56a40
 [ 3006.525569] RBP: ffff8804102fb9b0 R08: 0000000000000001 R09: 0000000000000000
 [ 3006.534119] R10: ffff8804102f0040 R11: 0000000000000000 R12: 0000000000000000
 [ 3006.542664] R13: ffff880410b56a40 R14: 0000000000000000 R15: 0000000000000000
 [ 3006.551203] FS:  00007ff478c08700(0000) GS:ffff88042e200000(0000) knlGS:0000000000000000
 [ 3006.560814] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 [ 3006.567806] CR2: 00007f667f5109e0 CR3: 0000000001c06000 CR4: 00000000000406e0
 [ 3006.576352] Stack:
 [ 3006.579157]  ffffffff8124b819 ffffffffffffffff 0000000000000000 ffff8804102fb940
 [ 3006.588072]  0000000000000002 0000000000000000 ffff8804102f0040 0000000000000007
 [ 3006.596971]  0000000000000006 ffff8803cad6f000 0000000000000000 ffff8804102f0040
 [ 3006.605878] Call Trace:
 [ 3006.609220]  [<ffffffff8124b819>] ? uncharge_batch+0x109/0x250
 [ 3006.616382]  [<ffffffff810f2313>] lock_acquire+0xd3/0x220
 [ 3006.623056]  [<ffffffffa0a30bfc>] ? hfi1_release_user_pages+0x7c/0xa0 [hfi1]
 [ 3006.631593]  [<ffffffff81775579>] down_write+0x49/0x80
 [ 3006.638022]  [<ffffffffa0a30bfc>] ? hfi1_release_user_pages+0x7c/0xa0 [hfi1]
 [ 3006.646569]  [<ffffffffa0a30bfc>] hfi1_release_user_pages+0x7c/0xa0 [hfi1]
 [ 3006.654898]  [<ffffffffa0a2efb6>] cacheless_tid_rb_remove+0x106/0x330 [hfi1]
 [ 3006.663417]  [<ffffffff810efd36>] ? mark_held_locks+0x66/0x90
 [ 3006.670498]  [<ffffffff817771f6>] ? _raw_spin_unlock_irqrestore+0x36/0x60
 [ 3006.678741]  [<ffffffffa0a2f1ee>] tid_rb_remove+0xe/0x10 [hfi1]
 [ 3006.686010]  [<ffffffffa0a0c5d5>] hfi1_mmu_rb_unregister+0xc5/0x100 [hfi1]
 [ 3006.694387]  [<ffffffffa0a2fcb9>] hfi1_user_exp_rcv_free+0x39/0x120 [hfi1]
 [ 3006.702732]  [<ffffffffa09fc6ea>] hfi1_file_close+0x17a/0x330 [hfi1]
 [ 3006.710489]  [<ffffffff81263e9a>] __fput+0xfa/0x230
 [ 3006.716595]  [<ffffffff8126400e>] ____fput+0xe/0x10
 [ 3006.722696]  [<ffffffff810b95c6>] task_work_run+0x86/0xc0
 [ 3006.729379]  [<ffffffff81099933>] do_exit+0x323/0xc40
 [ 3006.735672]  [<ffffffff8109a2dc>] do_group_exit+0x4c/0xc0
 [ 3006.742371]  [<ffffffff810a7f55>] get_signal+0x345/0x940
 [ 3006.748958]  [<ffffffff810340c7>] do_signal+0x37/0x700
 [ 3006.755328]  [<ffffffff8127872a>] ? poll_select_set_timeout+0x5a/0x90
 [ 3006.763146]  [<ffffffff811609cb>] ? __audit_syscall_exit+0x1db/0x260
 [ 3006.770853]  [<ffffffff8110f3e3>] ? rcu_read_lock_sched_held+0x93/0xa0
 [ 3006.778765]  [<ffffffff812347a4>] ? kfree+0x1e4/0x2a0
 [ 3006.784986]  [<ffffffff8108e75a>] ? exit_to_usermode_loop+0x33/0xac
 [ 3006.792551]  [<ffffffff8108e785>] exit_to_usermode_loop+0x5e/0xac
 [ 3006.799907]  [<ffffffff81003dca>] do_syscall_64+0x12a/0x190
 [ 3006.806664]  [<ffffffff81777a7f>] entry_SYSCALL64_slow_path+0x25/0x25
 [ 3006.814396] Code: 24 08 44 89 44 24 10 89 4c 24 18 e8 a8 d8 ff ff 48 85 c0
 8b 4c 24 18 44 8b 44 24 10 44 8b 4c 24 08 4c 8b 14 24 0f 84 30
 08 00 00 <f0> ff 80 98 01 00 00 8b 3d 48 ad be 01 45 8b a2 90 0b 00 00 85
 [ 3006.837158] RIP  [<ffffffff810f0383>] __lock_acquire+0xb3/0x19e0
 [ 3006.844401]  RSP <ffff8804102fb908>
 [ 3006.851170] ---[ end trace b7b9f21cf06c27df ]---
 [ 3006.927420] Kernel panic - not syncing: Fatal exception
 [ 3006.933954] Kernel Offset: disabled
 [ 3006.940961] ---[ end Kernel panic - not syncing: Fatal exception
 [ 3006.948249] ------------[ cut here ]------------

Fixes: 3faa3d9a30 ("IB/hfi1: Make use of mm consistent")
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-08-22 15:00:42 -04:00
..
affinity.c IB/hfi1: Remove duplicated include from affinity.c 2016-08-22 14:27:14 -04:00
affinity.h IB/hfi1: Add sysfs entry to override SDMA interrupt affinity 2016-08-02 16:00:58 -04:00
aspm.h
chip_registers.h IB/hfi1: Read all firmware versions 2016-08-02 16:00:58 -04:00
chip.c IB/hfi1: Use the same capability state for all shared contexts 2016-08-02 22:46:21 -04:00
chip.h IB/hfi1: Use the same capability state for all shared contexts 2016-08-02 22:46:21 -04:00
common.h
debugfs.c IB/hfi1,IB/qib: Fix qp_stats sleep with rcu read lock held 2016-08-22 14:31:34 -04:00
debugfs.h
device.c
device.h
dma.c
driver.c IB/hfi1: Validate header in set_armed_active 2016-08-22 14:31:42 -04:00
efivar.c
efivar.h
eprom.c
eprom.h
file_ops.c IB/hfi1: Fix mm_struct use after free 2016-08-22 15:00:42 -04:00
firmware.c IB/hfi1: Read all firmware versions 2016-08-02 16:00:58 -04:00
hfi.h IB/hfi1: Improve J_KEY generation 2016-08-22 15:00:42 -04:00
init.c IB/hfi1: Using kfree_rcu() to simplify the code 2016-08-22 14:31:42 -04:00
intr.c
iowait.h
Kconfig Second round of merge items for 4.8 2016-08-04 20:26:31 -04:00
mad.c IB/hfi1: Return invalid field for non-QSFP CableInfo queries 2016-08-22 15:00:42 -04:00
mad.h IB/hfi1: Clean up port state structure definition 2016-08-02 12:00:54 -04:00
Makefile IB/hfi1: Remove TWSI references 2016-08-02 15:47:42 -04:00
mmu_rb.c IB/hfi1: Add cache evict LRU list 2016-08-02 22:46:21 -04:00
mmu_rb.h IB/hfi1: Remove unneeded mm argument in remove function 2016-08-02 22:46:21 -04:00
opa_compat.h
pcie.c IB/hfi1: Add static PCIe Gen3 CTLE tuning 2016-08-02 16:00:58 -04:00
pio_copy.c
pio.c IB/hfi1: Handle kzalloc failure in init_pervl_scs 2016-08-02 22:46:21 -04:00
pio.h
platform.c IB/hfi1: Reset QSFP on every run through channel tuning 2016-08-02 16:00:58 -04:00
platform.h
qp.c IB/hfi1,IB/qib: Fix qp_stats sleep with rcu read lock held 2016-08-22 14:31:34 -04:00
qp.h IB/hfi1: Rename struct ahg_ib_header to struct hfi1_ahg_info 2016-08-02 16:00:58 -04:00
qsfp.c IB/hfi1: Fetch monitor values on-demand for CableInfo query 2016-08-22 14:31:41 -04:00
qsfp.h IB/hfi1: Fetch monitor values on-demand for CableInfo query 2016-08-22 14:31:41 -04:00
rc.c IB/rdmavt, hfi1: Fix NFSoRDMA failure with FRMR enabled 2016-08-02 16:00:58 -04:00
ruc.c IB/hfi1: Rename hfi1_pio_header to hfi1_sdma_header. 2016-08-02 16:00:58 -04:00
sdma_txreq.h
sdma.c
sdma.h
sysfs.c IB/hfi1: Add sysfs entry to override SDMA interrupt affinity 2016-08-02 16:00:58 -04:00
trace_ctxts.h IB/hfi1: Fix trace sparse errors 2016-08-02 12:00:54 -04:00
trace_dbg.h IB/hfi1: Separate tracepoints into specific headers 2016-08-02 12:00:54 -04:00
trace_ibhdrs.h IB/hfi1: Separate tracepoints into specific headers 2016-08-02 12:00:54 -04:00
trace_misc.h IB/hfi1: Separate tracepoints into specific headers 2016-08-02 12:00:54 -04:00
trace_rc.h IB/hfi1: Separate tracepoints into specific headers 2016-08-02 12:00:54 -04:00
trace_rx.h IB/hfi1: Separate tracepoints into specific headers 2016-08-02 12:00:54 -04:00
trace_tx.h IB/hfi1: Fix trace sparse errors 2016-08-02 12:00:54 -04:00
trace.c IB/hfi1: Suppress sparse warnings 2016-06-06 19:37:23 -04:00
trace.h IB/hfi1: Separate tracepoints into specific headers 2016-08-02 12:00:54 -04:00
uc.c IB/rdmavt, hfi1: Fix NFSoRDMA failure with FRMR enabled 2016-08-02 16:00:58 -04:00
ud.c IB/qib, IB/hfi1: Fix grh creation in ud loopback 2016-08-02 16:00:58 -04:00
user_exp_rcv.c IB/hfi1: Fix memory leak during unexpected shutdown 2016-08-02 22:46:21 -04:00
user_exp_rcv.h
user_pages.c IB/hfi1: Make use of mm consistent 2016-08-02 22:46:21 -04:00
user_sdma.c IB/hfi1: Remove unneeded mm argument in remove function 2016-08-02 22:46:21 -04:00
user_sdma.h IB/hfi1: Use evict mmu rb operation 2016-08-02 22:46:21 -04:00
verbs_txreq.c IB/hfi1: Fix deadlock with txreq allocation slow path 2016-06-23 10:16:15 -04:00
verbs_txreq.h IB/hfi1: Rename hfi1_pio_header to hfi1_sdma_header. 2016-08-02 16:00:58 -04:00
verbs.c Second round of merge items for 4.8 2016-08-04 20:26:31 -04:00
verbs.h IB/hfi1: Rename hfi1_pio_header to hfi1_sdma_header. 2016-08-02 16:00:58 -04:00