linux/net
Jiri Wiesner ebaf39e603 ipv4: ipv6: netfilter: Adjust the frag mem limit when truesize changes
The *_frag_reasm() functions are susceptible to miscalculating the byte
count of packet fragments in case the truesize of a head buffer changes.
The truesize member may be changed by the call to skb_unclone(), leaving
the fragment memory limit counter unbalanced even if all fragments are
processed. This miscalculation goes unnoticed as long as the network
namespace which holds the counter is not destroyed.

Should an attempt be made to destroy a network namespace that holds an
unbalanced fragment memory limit counter the cleanup of the namespace
never finishes. The thread handling the cleanup gets stuck in
inet_frags_exit_net() waiting for the percpu counter to reach zero. The
thread is usually in running state with a stacktrace similar to:

 PID: 1073   TASK: ffff880626711440  CPU: 1   COMMAND: "kworker/u48:4"
  #5 [ffff880621563d48] _raw_spin_lock at ffffffff815f5480
  #6 [ffff880621563d48] inet_evict_bucket at ffffffff8158020b
  #7 [ffff880621563d80] inet_frags_exit_net at ffffffff8158051c
  #8 [ffff880621563db0] ops_exit_list at ffffffff814f5856
  #9 [ffff880621563dd8] cleanup_net at ffffffff814f67c0
 #10 [ffff880621563e38] process_one_work at ffffffff81096f14

It is not possible to create new network namespaces, and processes
that call unshare() end up being stuck in uninterruptible sleep state
waiting to acquire the net_mutex.

The bug was observed in the IPv6 netfilter code by Per Sundstrom.
I thank him for his analysis of the problem. The parts of this patch
that apply to IPv4 and IPv6 fragment reassembly are preemptive measures.

Signed-off-by: Jiri Wiesner <jwiesner@suse.com>
Reported-by: Per Sundstrom <per.sundstrom@redqube.se>
Acked-by: Peter Oskolkov <posk@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-05 20:44:46 -08:00
..
6lowpan
9p Merge branch 'work.afs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-11-03 10:35:52 -07:00
802
8021q netpoll: allow cleanup to be synchronous 2018-10-19 17:01:43 -07:00
appletalk
atm Revert "net: simplify sock_poll_wait" 2018-10-23 10:57:06 -07:00
ax25
batman-adv batman-adv: Expand merged fragment buffer for full packet 2018-11-12 10:41:29 +01:00
bluetooth Merge branch 'work.afs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-11-01 19:58:52 -07:00
bpf bpf: refactor bpf_test_run() to separate own failures and test program result 2018-12-01 12:33:58 -08:00
bpfilter net: bpfilter: Set user mode helper's command line 2018-10-22 19:37:36 -07:00
bridge net: bridge: fix vlan stats use-after-free on destruction 2018-11-17 21:38:44 -08:00
caif Revert "net: simplify sock_poll_wait" 2018-10-23 10:57:06 -07:00
can can: raw: check for CAN FD capable netdev in raw_sendmsg() 2018-11-09 17:19:34 +01:00
ceph libceph: fall back to sendmsg for slab pages 2018-11-19 17:59:47 +01:00
core Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf 2018-12-05 16:30:30 -08:00
dcb
dccp Revert "net: simplify sock_poll_wait" 2018-10-23 10:57:06 -07:00
decnet decnet: Remove unnecessary check for dev->name 2018-09-21 19:48:36 -07:00
dns_resolver dns: Allow the dns resolver to retrieve a server set 2018-10-04 09:40:52 -07:00
dsa net: dsa: Fix tagging attribute location 2018-11-30 17:17:39 -08:00
ethernet
hsr
ieee802154 net/ipfrag: let ip[6]frag_high_thresh in ns be higher than in init_net 2018-09-21 19:45:52 -07:00
ife
ipv4 ipv4: ipv6: netfilter: Adjust the frag mem limit when truesize changes 2018-12-05 20:44:46 -08:00
ipv6 ipv4: ipv6: netfilter: Adjust the frag mem limit when truesize changes 2018-12-05 20:44:46 -08:00
iucv Revert "net: simplify sock_poll_wait" 2018-10-23 10:57:06 -07:00
kcm Revert "kcm: remove any offset before parsing messages" 2018-09-17 18:43:42 -07:00
key
l2tp l2tp: fix a sock refcnt leak in l2tp_tunnel_register 2018-11-14 22:49:31 -08:00
l3mdev
lapb
llc llc: do not use sk_eat_skb() 2018-10-22 19:59:20 -07:00
mac80211 mac80211: ignore NullFunc frames in the duplicate detection 2018-12-05 12:34:49 +01:00
mac802154 mac802154: Remove VLA usage of skcipher 2018-09-28 12:46:07 +08:00
mpls net/mpls: Handle kernel side filtering of route dumps 2018-10-16 00:14:07 -07:00
ncsi net/ncsi: Add NCSI Broadcom OEM command 2018-10-17 22:14:54 -07:00
netfilter netfilter: nf_tables: deactivate expressions in rule replecement routine 2018-11-28 10:56:40 +01:00
netlabel netlabel: check for IPV4MASK in addrinfo_get 2018-09-21 18:58:34 -07:00
netlink netlink: Add answer_flags to netlink_callback 2018-10-16 00:13:12 -07:00
netrom
nfc Merge branch 'work.tty-ioctl' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-10-24 14:43:41 +01:00
nsh
openvswitch openvswitch: fix spelling mistake "execeeds" -> "exceeds" 2018-11-30 13:18:09 -08:00
packet packet: copy user buffers before orphan or clone 2018-11-23 11:08:03 -08:00
phonet
psample
qrtr
rds Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-10-12 21:38:46 -07:00
rfkill Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-09-04 21:33:03 -07:00
rose
rxrpc rxrpc: Fix life check 2018-11-15 11:35:40 -08:00
sched net/sched: act_police: fix memory leak in case of invalid control action 2018-11-30 17:14:06 -08:00
sctp sctp: frag_point sanity check 2018-12-05 20:37:52 -08:00
smc net/smc: use after free fix in smc_wr_tx_put_slot() 2018-11-21 16:14:56 -08:00
strparser bpf, sockmap: convert to generic sk_msg interface 2018-10-15 12:23:19 -07:00
sunrpc NFS client bugfixes for Linux 4.20 2018-11-15 10:59:37 -06:00
switchdev
tipc tipc: fix lockdep warning during node delete 2018-11-27 16:30:39 -08:00
tls Merge branch 'work.afs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-11-01 19:58:52 -07:00
unix Revert "net: simplify sock_poll_wait" 2018-10-23 10:57:06 -07:00
vmw_vsock
wimax
wireless cfg80211: Fix busy loop regression in ieee80211_ie_split_ric() 2018-12-05 12:51:29 +01:00
x25 net/x25: handle call collisions 2018-11-29 14:25:36 -08:00
xdp Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-10-19 11:03:06 -07:00
xfrm Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-11-03 18:25:17 -07:00
compat.c y2038: socket: Change recvmmsg to use __kernel_timespec 2018-08-29 15:42:24 +02:00
Kconfig bpf, sockmap: convert to generic sk_msg interface 2018-10-15 12:23:19 -07:00
Makefile
socket.c socket: do a generic_file_splice_read when proto_ops has no splice_read 2018-11-17 21:34:11 -08:00
sysctl_net.c