linux/net/sunrpc
Chuck Lever e28b4fc652 svcrdma: Fix trace point use-after-free race
I hit this while testing nfsd-5.7 with kernel memory debugging
enabled on my server:

Mar 30 13:21:45 klimt kernel: BUG: unable to handle page fault for address: ffff8887e6c279a8
Mar 30 13:21:45 klimt kernel: #PF: supervisor read access in kernel mode
Mar 30 13:21:45 klimt kernel: #PF: error_code(0x0000) - not-present page
Mar 30 13:21:45 klimt kernel: PGD 3601067 P4D 3601067 PUD 87c519067 PMD 87c3e2067 PTE 800ffff8193d8060
Mar 30 13:21:45 klimt kernel: Oops: 0000 [#1] SMP DEBUG_PAGEALLOC PTI
Mar 30 13:21:45 klimt kernel: CPU: 2 PID: 1933 Comm: nfsd Not tainted 5.6.0-rc6-00040-g881e87a3c6f9 #1591
Mar 30 13:21:45 klimt kernel: Hardware name: Supermicro Super Server/X10SRL-F, BIOS 1.0c 09/09/2015
Mar 30 13:21:45 klimt kernel: RIP: 0010:svc_rdma_post_chunk_ctxt+0xab/0x284 [rpcrdma]
Mar 30 13:21:45 klimt kernel: Code: c1 83 34 02 00 00 29 d0 85 c0 7e 72 48 8b bb a0 02 00 00 48 8d 54 24 08 4c 89 e6 48 8b 07 48 8b 40 20 e8 5a 5c 2b e1 41 89 c6 <8b> 45 20 89 44 24 04 8b 05 02 e9 01 00 85 c0 7e 33 e9 5e 01 00 00
Mar 30 13:21:45 klimt kernel: RSP: 0018:ffffc90000dfbdd8 EFLAGS: 00010286
Mar 30 13:21:45 klimt kernel: RAX: 0000000000000000 RBX: ffff8887db8db400 RCX: 0000000000000030
Mar 30 13:21:45 klimt kernel: RDX: 0000000000000040 RSI: 0000000000000000 RDI: 0000000000000246
Mar 30 13:21:45 klimt kernel: RBP: ffff8887e6c27988 R08: 0000000000000000 R09: 0000000000000004
Mar 30 13:21:45 klimt kernel: R10: ffffc90000dfbdd8 R11: 00c068ef00000000 R12: ffff8887eb4e4a80
Mar 30 13:21:45 klimt kernel: R13: ffff8887db8db634 R14: 0000000000000000 R15: ffff8887fc931000
Mar 30 13:21:45 klimt kernel: FS:  0000000000000000(0000) GS:ffff88885bd00000(0000) knlGS:0000000000000000
Mar 30 13:21:45 klimt kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 30 13:21:45 klimt kernel: CR2: ffff8887e6c279a8 CR3: 000000081b72e002 CR4: 00000000001606e0
Mar 30 13:21:45 klimt kernel: Call Trace:
Mar 30 13:21:45 klimt kernel: ? svc_rdma_vec_to_sg+0x7f/0x7f [rpcrdma]
Mar 30 13:21:45 klimt kernel: svc_rdma_send_write_chunk+0x59/0xce [rpcrdma]
Mar 30 13:21:45 klimt kernel: svc_rdma_sendto+0xf9/0x3ae [rpcrdma]
Mar 30 13:21:45 klimt kernel: ? nfsd_destroy+0x51/0x51 [nfsd]
Mar 30 13:21:45 klimt kernel: svc_send+0x105/0x1e3 [sunrpc]
Mar 30 13:21:45 klimt kernel: nfsd+0xf2/0x149 [nfsd]
Mar 30 13:21:45 klimt kernel: kthread+0xf6/0xfb
Mar 30 13:21:45 klimt kernel: ? kthread_queue_delayed_work+0x74/0x74
Mar 30 13:21:45 klimt kernel: ret_from_fork+0x3a/0x50
Mar 30 13:21:45 klimt kernel: Modules linked in: ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue ib_umad ib_ipoib mlx4_ib sb_edac x86_pkg_temp_thermal iTCO_wdt iTCO_vendor_support coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel glue_helper crypto_simd cryptd pcspkr rpcrdma i2c_i801 rdma_ucm lpc_ich mfd_core ib_iser rdma_cm iw_cm ib_cm mei_me raid0 libiscsi mei sg scsi_transport_iscsi ioatdma wmi ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter nfsd nfs_acl lockd auth_rpcgss grace sunrpc ip_tables xfs libcrc32c mlx4_en sd_mod sr_mod cdrom mlx4_core crc32c_intel igb nvme i2c_algo_bit ahci i2c_core libahci nvme_core dca libata t10_pi qedr dm_mirror dm_region_hash dm_log dm_mod dax qede qed crc8 ib_uverbs ib_core
Mar 30 13:21:45 klimt kernel: CR2: ffff8887e6c279a8
Mar 30 13:21:45 klimt kernel: ---[ end trace 87971d2ad3429424 ]---

It's absolutely not safe to use resources pointed to by the @send_wr
argument of ib_post_send() _after_ that function returns. Those
resources are typically freed by the Send completion handler, which
can run before ib_post_send() returns.

Thus the trace points currently around ib_post_send() in the
server's RPC/RDMA transport are a hazard, even when they are
disabled. Rearrange them so that they touch the Work Request only
_before_ ib_post_send() is invoked.

Fixes: bd2abef333 ("svcrdma: Trace key RDMA API events")
Fixes: 4201c74647 ("svcrdma: Introduce svc_rdma_send_ctxt")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-04-17 12:40:38 -04:00
..
auth_gss NFS client updates for Linux 5.7 2020-04-07 13:51:39 -07:00
xprtrdma svcrdma: Fix trace point use-after-free race 2020-04-17 12:40:38 -04:00
addr.c SUNRPC: Use kmemdup_nul() in rpc_parse_scope_id() 2020-02-03 16:35:07 -05:00
auth_null.c SUNRPC: Add rpc_auth::au_ralign field 2019-02-14 11:48:36 -05:00
auth_unix.c SUNRPC: Use the client user namespace when encoding creds 2019-04-26 16:24:32 -04:00
auth.c SUNRPC: Remove broken gss_mech_list_pseudoflavors() 2020-01-15 10:54:32 -05:00
backchannel_rqst.c SUNRPC: Destroy the back channel when we destroy the host transport 2019-10-30 12:04:35 -04:00
cache.c SUNRPC/cache: Fix unsafe traverse caused double-free in cache_purge 2020-04-13 10:28:21 -04:00
clnt.c NFS client updates for Linux 5.7 2020-04-07 13:51:39 -07:00
debugfs.c NFS client updates for Linux 5.3 2019-07-18 14:32:33 -07:00
Kconfig SUNRPC: Drop redundant CONFIG_ from CONFIG_SUNRPC_DISABLE_INSECURE_ENCTYPES 2019-07-06 14:54:53 -04:00
Makefile SUNRPC: remove generic cred code. 2018-12-19 13:52:46 -05:00
netns.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
rpc_pipe.c kernel/notifier.c: remove blocking_notifier_chain_cond_register() 2019-12-04 19:44:12 -08:00
rpcb_clnt.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
sched.c SUNRPC: Don't start a timer on an already queued rpc task 2020-04-04 19:59:27 -04:00
socklib.c SUNRPC: Refactor xs_sendpages() 2020-03-16 12:04:33 -04:00
socklib.h SUNRPC: Refactor xs_sendpages() 2020-03-16 12:04:33 -04:00
stats.c proc: convert everything to "struct proc_ops" 2020-02-04 03:05:26 +00:00
sunrpc_syms.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
sunrpc.h SUNRPC: Teach server to use xprt_sock_sendmsg for socket sends 2020-03-16 12:04:33 -04:00
svc_xprt.c SUNRPC: Fix backchannel RPC soft lockups 2020-04-17 12:40:31 -04:00
svc.c SUNRPC: Teach server to use xprt_sock_sendmsg for socket sends 2020-03-16 12:04:33 -04:00
svcauth_unix.c nfsd: export upcalls must not return ESTALE when mountd is down 2020-03-16 12:04:33 -04:00
svcauth.c SUNRPC: Trace gssproxy upcall results 2019-10-30 16:32:07 -04:00
svcsock.c SUNRPC: Teach server to use xprt_sock_sendmsg for socket sends 2020-03-16 12:04:33 -04:00
sysctl.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
timer.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
xdr.c SUNRPC: Remove xdr_buf_read_mic() 2020-03-16 10:18:45 -04:00
xprt.c svcrdma: Create a generic tracing class for displaying xdr_buf layout 2020-03-16 12:04:32 -04:00
xprtmultipath.c SUNRPC: Optimise transport balancing code 2019-07-18 14:43:52 -04:00
xprtsock.c SUNRPC: Fix backchannel RPC soft lockups 2020-04-17 12:40:31 -04:00