mainlining shenanigans
Go to file
Sagi Grimberg 825619b09a nvme-tcp: fix possible use-after-completion
Commit db5ad6b7f8 ("nvme-tcp: try to send request in queue_rq
context") added a second context that may perform a network send.
This means that now RX and TX are not serialized in nvme_tcp_io_work
and can run concurrently.

While there is correct mutual exclusion in the TX path (where
the send_mutex protect the queue socket send activity) RX activity,
and more specifically request completion may run concurrently.

This means we must guarantee that any mutation of the request state
related to its lifetime, bytes sent must not be accessed when a completion
may have possibly arrived back (and processed).

The race may trigger when a request completion arrives, processed
_and_ reused as a fresh new request, exactly in the (relatively short)
window between the last data payload sent and before the request iov_iter
is advanced.

Consider the following race:
1. 16K write request is queued
2. The nvme command and the data is sent to the controller (in-capsule
   or solicited by r2t)
3. After the last payload is sent but before the req.iter is advanced,
   the controller sends back a completion.
4. The completion is processed, the request is completed, and reused
   to transfer a new request (write or read)
5. The new request is queued, and the driver reset the request parameters
   (nvme_tcp_setup_cmd_pdu).
6. Now context in (2) resumes execution and advances the req.iter

==> use-after-completion as this is already a new request.

Fix this by making sure the request is not advanced after the last
data payload send, knowing that a completion may have arrived already.

An alternative solution would have been to delay the request completion
or state change waiting for reference counting on the TX path, but besides
adding atomic operations to the hot-path, it may present challenges in
multi-stage R2T scenarios where a r2t handler needs to be deferred to
an async execution.

Reported-by: Narayan Ayalasomayajula <narayan.ayalasomayajula@wdc.com>
Tested-by: Anil Mishra <anil.mishra@wdc.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Cc: stable@vger.kernel.org # v5.8+
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-05-19 08:33:42 +02:00
arch The x86 MM changes in this cycle were: 2021-04-29 11:41:43 -07:00
block block/partitions/efi.c: Fix the efi_partition() kernel-doc header 2021-05-14 09:00:06 -06:00
certs certs: add 'x509_revocation_list' to gitignore 2021-04-26 10:48:07 -07:00
crypto for-5.13/drivers-2021-04-27 2021-04-28 14:39:37 -07:00
Documentation - removed get_fs/set_fs 2021-04-29 11:28:08 -07:00
drivers nvme-tcp: fix possible use-after-completion 2021-05-19 08:33:42 +02:00
fs block: reexpand iov_iter after read/write 2021-05-06 09:24:03 -06:00
include blkdev.h: remove unused codes blk_account_rq 2021-05-12 07:40:32 -06:00
init Merge branch 'for-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2021-04-27 18:47:42 -07:00
ipc fs: make helpers idmap mount aware 2021-01-24 14:27:20 +01:00
kernel The x86 MM changes in this cycle were: 2021-04-29 11:41:43 -07:00
lib - removed get_fs/set_fs 2021-04-29 11:28:08 -07:00
LICENSES LICENSES: Add the CC-BY-4.0 license 2020-12-08 10:33:27 -07:00
mm \n 2021-04-29 11:06:13 -07:00
net Scheduler updates for this cycle are: 2021-04-28 13:33:57 -07:00
samples vfio/mdev: Correct the function signatures for the mdev_type_attributes 2021-04-12 10:36:00 -06:00
scripts Devicetree updates for v5.13: 2021-04-28 15:50:24 -07:00
security Devicetree updates for v5.13: 2021-04-28 15:50:24 -07:00
sound - Core Frameworks 2021-04-28 15:59:13 -07:00
tools Perf events changes in this cycle were: 2021-04-28 13:03:44 -07:00
usr Kbuild updates for v5.12 2021-02-25 10:17:31 -08:00
virt KVM: x86/mmu: Consider the hva in mmu_notifier retry 2021-02-22 13:16:53 -05:00
.clang-format cxl for 5.12 2021-02-24 09:38:36 -08:00
.cocciconfig
.get_maintainer.ignore Opt out of scripts/get_maintainer.pl 2019-05-16 10:53:40 -07:00
.gitattributes .gitattributes: use 'dts' diff driver for dts files 2019-12-04 19:44:11 -08:00
.gitignore clang-lto series for v5.12-rc1 2021-02-23 09:28:51 -08:00
.mailmap It's been a relatively busy cycle in docsland, though more than usually 2021-04-26 13:22:43 -07:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS - Core Frameworks 2021-04-28 15:59:13 -07:00
Kbuild kbuild: rename hostprogs-y/always to hostprogs/always-y 2020-02-04 01:53:07 +09:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS Here's a collection of largely clk driver updates for the merge window. The 2021-04-28 17:13:56 -07:00
Makefile CFI on arm64 series for v5.13-rc1 2021-04-27 10:16:46 -07:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.