linux/net
Boris Pismenny c1318b39c7 tls: Add opt-in zerocopy mode of sendfile()
TLS device offload copies sendfile data to a bounce buffer before
transmitting. It allows to maintain the valid MAC on TLS records when
the file contents change and a part of TLS record has to be
retransmitted on TCP level.

In many common use cases (like serving static files over HTTPS) the file
contents are not changed on the fly. In many use cases breaking the
connection is totally acceptable if the file is changed during
transmission, because it would be received corrupted in any case.

This commit allows to optimize performance for such use cases to
providing a new optional mode of TLS sendfile(), in which the extra copy
is skipped. Removing this copy improves performance significantly, as
TLS and TCP sendfile perform the same operations, and the only overhead
is TLS header/trailer insertion.

The new mode can only be enabled with the new socket option named
TLS_TX_ZEROCOPY_SENDFILE on per-socket basis. It preserves backwards
compatibility with existing applications that rely on the copying
behavior.

The new mode is safe, meaning that unsolicited modifications of the file
being sent can't break integrity of the kernel. The worst thing that can
happen is sending a corrupted TLS record, which is in any case not
forbidden when using regular TCP sockets.

Sockets other than TLS device offload are not affected by the new socket
option. The actual status of zerocopy sendfile can be queried with
sock_diag.

Performance numbers in a single-core test with 24 HTTPS streams on
nginx, under 100% CPU load:

* non-zerocopy: 33.6 Gbit/s
* zerocopy: 79.92 Gbit/s

CPU: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz

Signed-off-by: Boris Pismenny <borisp@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20220518092731.1243494-1-maximmi@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-05-19 12:14:11 +02:00
..
6lowpan net: don't include ndisc.h from ipv6.h 2022-02-04 14:15:11 -08:00
9p xen/grant-table: remove readonly parameter from functions 2022-03-15 20:34:40 -05:00
802
8021q net: add netif_inherit_tso_max() 2022-05-06 12:07:56 +01:00
appletalk net: remove noblock parameter from skb_recv_datagram() 2022-04-06 13:45:26 +01:00
atm net: SO_RCVMARK socket option for SO_MARK with recvmsg() 2022-04-28 13:08:15 -07:00
ax25 ax25: merge repeat codes in ax25_dev_device_down() 2022-05-17 12:24:16 +02:00
batman-adv Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-05-12 16:15:30 -07:00
bluetooth Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-05-12 16:15:30 -07:00
bpf net: allow gso_max_size to exceed 65536 2022-05-16 10:18:55 +01:00
bpfilter uaccess: remove CONFIG_SET_FS 2022-02-25 09:36:06 +01:00
bridge rtnetlink: add extack support in fdb del handlers 2022-05-09 11:58:20 +01:00
caif net: remove noblock parameter from skb_recv_datagram() 2022-04-06 13:45:26 +01:00
can can: isotp: isotp_bind(): return -EINVAL on incorrect CAN ID formatting 2022-05-16 22:03:45 +02:00
ceph libceph: disambiguate cluster/pool full log message 2022-04-25 10:45:15 +02:00
core net: call skb_defer_free_flush() before each napi_poll() 2022-05-16 11:33:59 +01:00
dcb net: dcb: disable softirqs in dcbnl_flush_dev() 2022-03-03 08:01:55 -08:00
dccp dccp: use READ_ONCE() to read sk->sk_bound_dev_if 2022-05-16 10:31:06 +01:00
decnet dn_route: set rt neigh to blackhole_netdev instead of loopback_dev in ifdown 2022-05-17 18:03:23 -07:00
dns_resolver
dsa net: dsa: remove port argument from ->change_tag_protocol() 2022-05-12 16:38:55 -07:00
ethernet net: ethernet: set default assignment identifier to NET_NAME_ENUM 2022-04-07 21:04:03 -07:00
ethtool ethtool: Add 10base-T1L link mode entry 2022-05-01 17:45:35 +01:00
hsr net: add per-cpu storage and net->core_stats 2022-03-11 23:17:24 -08:00
ieee802154 net: SO_RCVMARK socket option for SO_MARK with recvmsg() 2022-04-28 13:08:15 -07:00
ife
ipv4 net: tcp: reset 'drop_reason' to NOT_SPCIFIED in tcp_v{4,6}_rcv() 2022-05-16 10:47:44 +01:00
ipv6 net: tcp: reset 'drop_reason' to NOT_SPCIFIED in tcp_v{4,6}_rcv() 2022-05-16 10:47:44 +01:00
iucv net: remove noblock parameter from skb_recv_datagram() 2022-04-06 13:45:26 +01:00
kcm
key net: SO_RCVMARK socket option for SO_MARK with recvmsg() 2022-04-28 13:08:15 -07:00
l2tp l2tp: use add READ_ONCE() to fetch sk->sk_bound_dev_if 2022-05-16 10:31:06 +01:00
l3mdev l3mdev: l3mdev_master_upper_ifindex_by_index_rcu should be using netdev_master_upper_dev_get_rcu 2022-04-15 14:27:24 -07:00
lapb
llc llc: only change llc->dev when bind() succeeds 2022-03-25 16:55:41 -07:00
mac80211 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-05-12 16:15:30 -07:00
mac802154 net: mac802154: Fix symbol durations 2022-04-30 20:29:47 +02:00
mctp net: SO_RCVMARK socket option for SO_MARK with recvmsg() 2022-04-28 13:08:15 -07:00
mpls net: mpls: fix memdup.cocci warning 2022-04-07 21:06:41 -07:00
mptcp mptcp: sockopt: add TCP_DEFER_ACCEPT support 2022-05-16 13:11:31 -07:00
ncsi all: replace find_next{,_zero}_bit with find_first{,_zero}_bit where appropriate 2022-01-15 08:47:31 -08:00
netfilter Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next 2022-05-16 10:10:37 +01:00
netlabel netlabel: fix out-of-bounds memory accesses 2022-03-21 10:59:11 +00:00
netlink Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-05-12 16:15:30 -07:00
netrom net: remove noblock parameter from skb_recv_datagram() 2022-04-06 13:45:26 +01:00
nfc Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-05-05 13:03:18 -07:00
nsh
openvswitch openvswitch: fix OOB access in reserve_sfa_size() 2022-04-15 11:50:02 +01:00
packet net: SO_RCVMARK socket option for SO_MARK with recvmsg() 2022-04-28 13:08:15 -07:00
phonet net: remove noblock parameter from recvmsg() entities 2022-04-12 15:00:25 +02:00
psample
qrtr net: remove noblock parameter from skb_recv_datagram() 2022-04-06 13:45:26 +01:00
rds net: rds: use maybe_get_net() when acquiring refcount on TCP sockets 2022-05-05 16:44:49 -07:00
rfkill rfkill: make new event layout opt-in 2022-03-18 13:09:17 +02:00
rose ROSE: Remove unused code and clean up some inconsistent indenting 2022-05-09 17:19:27 -07:00
rxrpc rxrpc: Enable IPv6 checksums on transport socket 2022-04-30 13:59:34 +01:00
sched net_sched: em_meta: add READ_ONCE() in var_sk_bound_if() 2022-05-16 10:31:06 +01:00
sctp sctp: read sk->sk_bound_dev_if once in sctp_rcv() 2022-05-16 10:31:06 +01:00
smc net/smc: rdma write inline if qp has sufficient inline space 2022-05-17 17:34:12 -07:00
strparser
sunrpc Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-05-12 16:15:30 -07:00
switchdev net: switchdev: remove lag_mod_cb from switchdev_handle_fdb_event_to_device 2022-02-24 21:31:43 -08:00
tipc Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-03-23 10:53:49 -07:00
tls tls: Add opt-in zerocopy mode of sendfile() 2022-05-19 12:14:11 +02:00
unix net: remove noblock parameter from recvmsg() entities 2022-04-12 15:00:25 +02:00
vmw_vsock vsock/virtio: add support for device suspend/resume 2022-05-02 16:04:34 -07:00
wireless Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-05-12 16:15:30 -07:00
x25 x25: remove redundant pointer dev 2022-05-10 11:59:22 +02:00
xdp Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-04-28 13:02:01 -07:00
xfrm xfrm: drop not needed flags variable in XFRM offload struct 2022-05-06 08:35:46 +02:00
compat.c
devres.c
Kconfig page_pool: Add allocation stats 2022-03-03 09:55:28 +00:00
Kconfig.debug net: add CONFIG_DEBUG_NET 2022-05-11 12:43:10 +01:00
Makefile
socket.c ptp: Support late timestamp determination 2022-05-10 09:48:08 +02:00
sysctl_net.c