linux/net
Jozsef Kadlecsik f66ee0410b netfilter: ipset: Fix "INFO: rcu detected stall in hash_xxx" reports
In the case of huge hash:* types of sets, due to the single spinlock of
a set the processing of the whole set under spinlock protection could take
too long.

There were four places where the whole hash table of the set was processed
from bucket to bucket under holding the spinlock:

- During resizing a set, the original set was locked to exclude kernel side
  add/del element operations (userspace add/del is excluded by the
  nfnetlink mutex). The original set is actually just read during the
  resize, so the spinlocking is replaced with rcu locking of regions.
  However, thus there can be parallel kernel side add/del of entries.
  In order not to loose those operations a backlog is added and replayed
  after the successful resize.
- Garbage collection of timed out entries was also protected by the spinlock.
  In order not to lock too long, region locking is introduced and a single
  region is processed in one gc go. Also, the simple timer based gc running
  is replaced with a workqueue based solution. The internal book-keeping
  (number of elements, size of extensions) is moved to region level due to
  the region locking.
- Adding elements: when the max number of the elements is reached, the gc
  was called to evict the timed out entries. The new approach is that the gc
  is called just for the matching region, assuming that if the region
  (proportionally) seems to be full, then the whole set does. We could scan
  the other regions to check every entry under rcu locking, but for huge
  sets it'd mean a slowdown at adding elements.
- Listing the set header data: when the set was defined with timeout
  support, the garbage collector was called to clean up timed out entries
  to get the correct element numbers and set size values. Now the set is
  scanned to check non-timed out entries, without actually calling the gc
  for the whole set.

Thanks to Florian Westphal for helping me to solve the SOFTIRQ-safe ->
SOFTIRQ-unsafe lock order issues during working on the patch.

Reported-by: syzbot+4b0e9d4ff3cf117837e5@syzkaller.appspotmail.com
Reported-by: syzbot+c27b8d5010f45c666ed1@syzkaller.appspotmail.com
Reported-by: syzbot+68a806795ac89df3aa1c@syzkaller.appspotmail.com
Fixes: 23c42a403a ("netfilter: ipset: Introduction of new commands and protocol version 7")
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
2020-02-22 12:00:06 +01:00
..
6lowpan
9p
802 treewide: Use sizeof_field() macro 2019-12-09 10:36:44 -08:00
8021q Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-01-09 12:13:43 -08:00
appletalk
atm Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-01-26 10:40:21 +01:00
ax25 net: Make sock protocol value checks more specific 2020-01-09 18:41:40 -08:00
batman-adv Merge ra.kernel.org:/pub/scm/linux/kernel/git/netdev/net 2020-01-19 22:10:04 +01:00
bluetooth Bluetooth: Fix race condition in hci_release_sock() 2020-01-26 10:34:17 +02:00
bpf bpf: Allow to change skb mark in test_run 2019-12-18 17:05:58 -08:00
bpfilter
bridge net: bridge: vlan: add per-vlan state 2020-01-24 12:58:14 +01:00
caif caif_usb: fix spelling mistake "to" -> "too" 2020-01-24 08:12:06 +01:00
can can: j1939: j1939_sk_bind(): take priv after lock is held 2019-12-08 11:52:02 +01:00
ceph libceph, rbd, ceph: convert to use the new mount API 2019-11-27 22:28:37 +01:00
core net/core: Do not clear VF index for node/port GUIDs query 2020-01-30 15:20:26 +01:00
dcb
dccp treewide: Use sizeof_field() macro 2019-12-09 10:36:44 -08:00
decnet net: Make sock protocol value checks more specific 2020-01-09 18:41:40 -08:00
dns_resolver
dsa net: dsa: Fix use-after-free in probing of DSA switch tree 2020-01-27 11:12:46 +01:00
ethernet net: remove eth_change_mtu 2020-01-27 11:09:31 +01:00
ethtool net/core: Replace driver version to be kernel version 2020-01-27 13:47:22 +01:00
hsr net/hsr: remove seq_nr_after_or_eq 2020-01-21 12:03:21 +01:00
ieee802154 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2019-11-02 13:54:56 -07:00
ife
ipv4 tcp: Reduce SYN resend delay if a suspicous ACK is received 2020-02-02 13:33:21 -08:00
ipv6 mptcp: Fix undefined mptcp_handle_ipv6_mapped for modular IPV6 2020-01-30 10:55:54 +01:00
iucv treewide: Use sizeof_field() macro 2019-12-09 10:36:44 -08:00
kcm
key
l2tp l2tp: Remove redundant BUG_ON() check in l2tp_pernet 2020-01-03 12:26:07 -08:00
l3mdev
lapb
llc llc2: Fix return statement of llc_stat_ev_rx_null_dsap_xid_c (and _test_c) 2019-12-20 21:19:36 -08:00
mac80211 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next 2020-01-28 16:02:33 -08:00
mac802154
mpls net: ipv6_stub: use ip6_dst_lookup_flow instead of ip6_dst_lookup 2019-12-04 12:27:13 -08:00
mptcp mptcp: Fix undefined mptcp_handle_ipv6_mapped for modular IPV6 2020-01-30 10:55:54 +01:00
ncsi net/ncsi: Support for multi host mellanox card 2020-01-09 18:36:22 -08:00
netfilter netfilter: ipset: Fix "INFO: rcu detected stall in hash_xxx" reports 2020-02-22 12:00:06 +01:00
netlabel
netlink treewide: Use sizeof_field() macro 2019-12-09 10:36:44 -08:00
netrom net: core: add generic lockdep keys 2019-10-24 14:53:48 -07:00
nfc net: nfc: nci: fix a possible sleep-in-atomic-context bug in nci_uart_tty_receive() 2019-12-18 11:57:33 -08:00
nsh
openvswitch net: openvswitch: use skb_list_walk_safe helper for gso segments 2020-01-14 11:48:41 -08:00
packet y2038: core, driver and file system changes 2020-01-29 14:55:47 -08:00
phonet net: Remove redundant BUG_ON() check in phonet_pernet 2020-01-03 12:25:50 -08:00
psample net: psample: fix skb_over_panic 2019-11-26 14:40:13 -08:00
qrtr net: qrtr: Remove receive worker 2020-01-14 18:36:42 -08:00
rds net/rds: Use prefetch for On-Demand-Paging MR 2020-01-18 11:48:19 +02:00
rfkill rfkill: Fix incorrect check to avoid NULL pointer dereference 2019-12-16 10:15:49 +01:00
rose Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-01-26 10:40:21 +01:00
rxrpc rxrpc: Fix use-after-free in rxrpc_receive_data() 2020-01-27 10:56:30 +01:00
sched cls_rsvp: fix rsvp_policy 2020-02-01 12:25:06 -08:00
sctp Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-01-09 12:13:43 -08:00
smc net/smc: allow unprivileged users to read pnet table 2020-01-21 11:39:56 +01:00
strparser
sunrpc y2038: core, driver and file system changes 2020-01-29 14:55:47 -08:00
switchdev
tipc Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next 2020-01-28 16:02:33 -08:00
tls Merge ra.kernel.org:/pub/scm/linux/kernel/git/netdev/net 2020-01-19 22:10:04 +01:00
unix Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next 2020-01-21 12:18:20 +01:00
vmw_vsock Merge ra.kernel.org:/pub/scm/linux/kernel/git/netdev/net 2020-01-19 22:10:04 +01:00
wimax
wireless Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next 2020-01-28 16:02:33 -08:00
x25 net/x25: fix nonblocking connect 2020-01-09 18:39:33 -08:00
xdp xsk, net: Make sock_def_readable() have external linkage 2020-01-22 00:08:52 +01:00
xfrm Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-01-26 10:40:21 +01:00
compat.c y2038: socket: use __kernel_old_timespec instead of timespec 2019-11-15 14:38:29 +01:00
Kconfig mptcp: Add MPTCP socket stubs 2020-01-24 13:44:07 +01:00
Makefile mptcp: Add MPTCP socket stubs 2020-01-24 13:44:07 +01:00
socket.c socket: fix unused-function warning 2020-01-08 15:02:21 -08:00
sysctl_net.c