linux/net/core
Denis V. Lunev 66ba215cb5 neigh: fix possible DoS due to net iface start/stop loop
Normal processing of ARP request (usually this is Ethernet broadcast
packet) coming to the host is looking like the following:
* the packet comes to arp_process() call and is passed through routing
  procedure
* the request is put into the queue using pneigh_enqueue() if
  corresponding ARP record is not local (common case for container
  records on the host)
* the request is processed by timer (within 80 jiffies by default) and
  ARP reply is sent from the same arp_process() using
  NEIGH_CB(skb)->flags & LOCALLY_ENQUEUED condition (flag is set inside
  pneigh_enqueue())

And here the problem comes. Linux kernel calls pneigh_queue_purge()
which destroys the whole queue of ARP requests on ANY network interface
start/stop event through __neigh_ifdown().

This is actually not a problem within the original world as network
interface start/stop was accessible to the host 'root' only, which
could do more destructive things. But the world is changed and there
are Linux containers available. Here container 'root' has an access
to this API and could be considered as untrusted user in the hosting
(container's) world.

Thus there is an attack vector to other containers on node when
container's root will endlessly start/stop interfaces. We have observed
similar situation on a real production node when docker container was
doing such activity and thus other containers on the node become not
accessible.

The patch proposed doing very simple thing. It drops only packets from
the same namespace in the pneigh_queue_purge() where network interface
state change is detected. This is enough to prevent the problem for the
whole node preserving original semantics of the code.

v2:
	- do del_timer_sync() if queue is empty after pneigh_queue_purge()
v3:
	- rebase to net tree

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@kernel.org>
Cc: Yajun Deng <yajun.deng@linux.dev>
Cc: Roopa Prabhu <roopa@nvidia.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
Cc: Konstantin Khorenko <khorenko@virtuozzo.com>
Cc: kernel@openvz.org
Cc: devel@openvz.org
Investigated-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-15 11:25:09 +01:00
..
.gitignore net: skb: use auto-generation to convert skb drop reason to string 2022-06-07 12:51:41 +02:00
bpf_sk_storage.c bpf: Check the validity of max_rdwr_access for sock local storage map iterator 2022-08-10 10:12:48 -07:00
datagram.c iov_iter: advancing variants of iov_iter_get_pages{,_alloc}() 2022-08-08 22:37:22 -04:00
dev_addr_lists_test.c net: kunit: add a test for dev_addr_lists 2021-11-20 12:25:57 +00:00
dev_addr_lists.c net: extract a few internals from netdevice.h 2022-04-07 20:32:09 -07:00
dev_ioctl.c net: rename reference+tracking helpers 2022-06-09 21:52:55 -07:00
dev.c Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next 2022-07-22 16:55:44 -07:00
dev.h net: add skb_defer_max sysctl 2022-05-16 11:33:59 +01:00
devlink.c devlink: Fix use-after-free after a failed reload 2022-08-10 13:48:04 +01:00
drop_monitor.c drop_monitor: adopt u64_stats_t 2022-06-09 21:53:12 -07:00
dst_cache.c wireguard: device: reset peer src endpoint when netns exits 2021-11-29 19:50:45 -08:00
dst.c net: rename reference+tracking helpers 2022-06-09 21:52:55 -07:00
failover.c net: rename reference+tracking helpers 2022-06-09 21:52:55 -07:00
fib_notifier.c
fib_rules.c fib: expand fib_rule_policy 2021-12-16 07:18:35 -08:00
filter.c net: bpf: Use the protocol's set_rcvlowat behavior if there is one 2022-08-08 09:45:14 +01:00
flow_dissector.c Networking changes for 6.0. 2022-08-03 16:29:08 -07:00
flow_offload.c flow_offload: Introduce flow_match_pppoe 2022-07-26 10:49:27 -07:00
gen_estimator.c net: sched: Remove Qdisc::running sequence counter 2021-10-18 12:54:41 +01:00
gen_stats.c net: stats: Read the statistics in ___gnet_stats_copy_basic() instead of adding. 2021-10-21 12:47:56 +01:00
gro_cells.c net: add per-cpu storage and net->core_stats 2022-03-11 23:17:24 -08:00
gro.c net: allow gro_max_size to exceed 65536 2022-05-16 10:18:56 +01:00
hwbm.c
link_watch.c net: rename reference+tracking helpers 2022-06-09 21:52:55 -07:00
lwt_bpf.c bpf, lwt: Fix crash when using bpf_skb_set_tunnel_key() from bpf_xmit lwt hook 2022-04-22 17:45:25 +02:00
lwtunnel.c lwtunnel: Validate RTA_ENCAP_TYPE attribute length 2021-12-31 14:31:59 +00:00
Makefile net: skb: use auto-generation to convert skb drop reason to string 2022-06-07 12:51:41 +02:00
neighbour.c neigh: fix possible DoS due to net iface start/stop loop 2022-08-15 11:25:09 +01:00
net_namespace.c net: set proper memcg for net_init hooks allocations 2022-06-16 19:48:31 -07:00
net-procfs.c net: extract a few internals from netdevice.h 2022-04-07 20:32:09 -07:00
net-sysfs.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-06-23 12:33:24 -07:00
net-sysfs.h
net-traces.c tcp: add tracepoint for checksum errors 2021-05-14 15:26:03 -07:00
netclassid_cgroup.c bpf, cgroups: Fix cgroup v2 fallback on v1/v2 mixed mode 2021-09-13 16:35:58 -07:00
netevent.c net: core: Correct function name netevent_unregister_notifier() in the kerneldoc 2021-03-28 17:56:56 -07:00
netpoll.c net: rename reference+tracking helpers 2022-06-09 21:52:55 -07:00
netprio_cgroup.c bpf, cgroups: Fix cgroup v2 fallback on v1/v2 mixed mode 2021-09-13 16:35:58 -07:00
of_net.c Revert "of: net: support NVMEM cells with MAC in text format" 2022-01-12 14:14:36 +00:00
page_pool.c - The usual batches of cleanups from Baoquan He, Muchun Song, Miaohe 2022-08-05 16:32:45 -07:00
pktgen.c net: rename reference+tracking helpers 2022-06-09 21:52:55 -07:00
ptp_classifier.c ptp: Add generic PTP is_sync() function 2022-03-07 11:31:34 +00:00
request_sock.c
rtnetlink.c net: allow gro_max_size to exceed 65536 2022-05-16 10:18:56 +01:00
scm.c memcg: enable accounting for scm_fp_list objects 2021-07-20 06:00:38 -07:00
secure_seq.c tcp: Fix data-races around sysctl knobs related to SYN option. 2022-07-20 10:14:49 +01:00
selftests.c net: core: constify mac addrs in selftests 2021-10-24 13:59:44 +01:00
skbuff.c Merge branch 'io_uring-zerocopy-send' of git://git.kernel.org/pub/scm/linux/kernel/git/kuba/linux 2022-07-19 14:22:41 -07:00
skmsg.c Including fixes from bluetooth, bpf, can and netfilter. 2022-08-11 13:45:37 -07:00
sock_destructor.h skb_expand_head() adjust skb->truesize incorrectly 2021-10-22 12:35:51 -07:00
sock_diag.c net: Don't include filter.h from net/sock.h 2021-12-29 08:48:14 -08:00
sock_map.c bpf: Acquire map uref in .init_seq_private for sock{map,hash} iterator 2022-08-10 10:12:48 -07:00
sock_reuseport.c tcp: Fix data-races around sysctl_tcp_migrate_req. 2022-07-18 12:21:54 +01:00
sock.c tls: rx: periodically flush socket backlog 2022-07-06 12:56:35 +01:00
stream.c net: use WARN_ON_ONCE() in sk_stream_kill_queues() 2022-06-09 21:53:55 -07:00
sysctl_net_core.c Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next 2022-05-23 16:07:14 -07:00
timestamping.c
tso.c
utils.c net: core: Use csum_replace_by_diff() and csum_sub() instead of opencoding 2022-02-21 11:40:44 +00:00
xdp.c Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next 2022-03-22 11:18:49 -07:00