linux/net/core
John Fastabend 6fa9201a89 bpf, sockmap: Avoid returning unneeded EAGAIN when redirecting to self
If a socket redirects to itself and it is under memory pressure it is
possible to get a socket stuck so that recv() returns EAGAIN and the
socket can not advance for some time. This happens because when
redirecting a skb to the same socket we received the skb on we first
check if it is OK to enqueue the skb on the receiving socket by checking
memory limits. But, if the skb is itself the object holding the memory
needed to enqueue the skb we will keep retrying from kernel side
and always fail with EAGAIN. Then userspace will get a recv() EAGAIN
error if there are no skbs in the psock ingress queue. This will continue
until either some skbs get kfree'd causing the memory pressure to
reduce far enough that we can enqueue the pending packet or the
socket is destroyed. In some cases its possible to get a socket
stuck for a noticeable amount of time if the socket is only receiving
skbs from sk_skb verdict programs. To reproduce I make the socket
memory limits ridiculously low so sockets are always under memory
pressure. More often though if under memory pressure it looks like
a spurious EAGAIN error on user space side causing userspace to retry
and typically enough has moved on the memory side that it works.

To fix skip memory checks and skb_orphan if receiving on the same
sock as already assigned.

For SK_PASS cases this is easy, its always the same socket so we
can just omit the orphan/set_owner pair.

For backlog cases we need to check skb->sk and decide if the orphan
and set_owner pair are needed.

Fixes: 51199405f9 ("bpf: skb_verdict, support SK_PASS on RX BPF path")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/160556572660.73229.12566203819812939627.stgit@john-XPS-13-9370
2020-11-18 00:12:41 +01:00
..
bpf_sk_storage.c bpf: Change bpf_sk_storage_*() to accept ARG_PTR_TO_BTF_ID_SOCK_COMMON 2020-09-25 13:58:01 -07:00
datagram.c net: zerocopy: combine pages in zerocopy_sg_from_iter() 2020-08-20 16:12:50 -07:00
datagram.h
dev_addr_lists.c net: core: add nested_level variable in net_device 2020-09-28 15:00:15 -07:00
dev_ioctl.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
dev.c random32: add noise from network and scheduling activity 2020-10-24 20:21:57 +02:00
devlink.c devlink: Unlock on error in dumpit() 2020-10-27 17:05:57 -07:00
drop_monitor.c genetlink: move to smaller ops wherever possible 2020-10-02 19:11:11 -07:00
dst_cache.c
dst.c net: Correct the comment of dst_dev_put() 2020-09-10 13:28:57 -07:00
failover.c
fib_notifier.c net: fib_notifier: propagate extack down to the notifier block callback 2019-10-04 11:10:56 -07:00
fib_rules.c fib: fix fib_rule_ops indirect call wrappers when CONFIG_IPV6=m 2020-09-08 20:09:08 -07:00
filter.c net: Properly typecast int values to set sk_max_pacing_rate 2020-10-22 12:18:25 -07:00
flow_dissector.c net: flow_dissector: avoid indirect call to DSA .flow_dissect for generic case 2020-09-26 14:17:59 -07:00
flow_offload.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-07-25 17:49:04 -07:00
gen_estimator.c net_sched: gen_estimator: extend packet counter to 64bit 2019-11-06 21:51:36 -08:00
gen_stats.c docs: networking: convert gen_stats.txt to ReST 2020-04-28 14:39:46 -07:00
gro_cells.c
hwbm.c
link_watch.c net: Add IF_OPER_TESTING 2020-04-20 12:43:24 -07:00
lwt_bpf.c net: add net available in build_state 2020-03-29 22:30:57 -07:00
lwtunnel.c net: ipv6: add rpl sr tunnel 2020-03-29 22:30:57 -07:00
Makefile ethtool: move to its own directory 2019-12-12 17:07:05 -08:00
neighbour.c net: neighbor: add fdb extended attribute 2020-06-24 14:36:33 -07:00
net_namespace.c bpf, net: Rework cookie generator as per-cpu one 2020-09-30 11:50:35 -07:00
net-procfs.c net-sysfs: add backlog len and CPU id to softnet data 2020-09-21 13:56:37 -07:00
net-sysfs.c net-sysfs: Fix inconsistent of format with argument type in net-sysfs.c 2020-10-01 18:45:23 -07:00
net-sysfs.h net-sysfs: add netdev_change_owner() 2020-02-26 20:07:25 -08:00
net-traces.c
netclassid_cgroup.c cgroup, netclassid: remove double cond_resched 2020-04-21 15:44:30 -07:00
netevent.c
netpoll.c net: make sure napi_list is safe for RCU traversal 2020-09-10 13:08:46 -07:00
netprio_cgroup.c netprio_cgroup: Fix unlimited memory leak of v2 cgroups 2020-05-09 20:59:21 -07:00
page_pool.c net: page pool: allow to pass zero flags to page_pool_init() 2020-03-29 21:49:20 -07:00
pktgen.c pktgen: Fix inconsistent of format with argument type in pktgen.c 2020-10-01 18:45:23 -07:00
ptp_classifier.c ptp: Add generic ptp v2 header parsing function 2020-08-19 16:07:49 -07:00
request_sock.c tcp: add rcu protection around tp->fastopen_rsk 2019-10-13 10:13:08 -07:00
rtnetlink.c rtnetlink: fix data overflow in rtnl_calcit() 2020-10-21 18:24:08 -07:00
scm.c fs: Add receive_fd() wrapper for __receive_fd() 2020-07-13 11:03:44 -07:00
secure_seq.c crypto: lib/sha1 - remove unnecessary includes of linux/cryptohash.h 2020-05-08 15:32:17 +10:00
skbuff.c networking changes for the 5.10 merge window 2020-10-15 18:42:13 -07:00
skmsg.c bpf, sockmap: Avoid returning unneeded EAGAIN when redirecting to self 2020-11-18 00:12:41 +01:00
sock_diag.c bpf, net: Rework cookie generator as per-cpu one 2020-09-30 11:50:35 -07:00
sock_map.c net, sockmap: Don't call bpf_prog_put() on NULL pointer 2020-10-15 21:05:23 +02:00
sock_reuseport.c udp: Copy has_conns in reuseport_grow(). 2020-07-21 15:31:02 -07:00
sock.c net: Properly typecast int values to set sk_max_pacing_rate 2020-10-22 12:18:25 -07:00
stream.c tcp: make sure EPOLLOUT wont be missed 2019-08-19 13:07:43 -07:00
sysctl_net_core.c net: add option to not create fall-back tunnels in root-ns as well 2020-08-28 06:52:44 -07:00
timestamping.c net: Introduce a new MII time stamping interface. 2019-12-25 19:51:33 -08:00
tso.c net: tso: add UDP segmentation support 2020-06-18 20:46:23 -07:00
utils.c net: Fix skb->csum update in inet_proto_csum_replace16(). 2020-01-24 20:54:30 +01:00
xdp.c bpf, xdp: Remove XDP_QUERY_PROG and XDP_QUERY_PROG_HW XDP commands 2020-07-25 20:37:02 -07:00