linux/net/ipv4/netfilter
Taehee Yoo 5a86d68bcf netfilter: ipt_CLUSTERIP: fix deadlock in netns exit routine
When network namespace is destroyed, cleanup_net() is called.
cleanup_net() holds pernet_ops_rwsem then calls each ->exit callback.
So that clusterip_tg_destroy() is called by cleanup_net().
And clusterip_tg_destroy() calls unregister_netdevice_notifier().

But both cleanup_net() and clusterip_tg_destroy() hold same
lock(pernet_ops_rwsem). hence deadlock occurrs.

After this patch, only 1 notifier is registered when module is inserted.
And all of configs are added to per-net list.

test commands:
   %ip netns add vm1
   %ip netns exec vm1 bash
   %ip link set lo up
   %iptables -A INPUT -p tcp -i lo -d 192.168.0.5 --dport 80 \
	-j CLUSTERIP --new --hashmode sourceip \
	--clustermac 01:00:5e:00:00:20 --total-nodes 2 --local-node 1
   %exit
   %ip netns del vm1

splat looks like:
[  341.809674] ============================================
[  341.809674] WARNING: possible recursive locking detected
[  341.809674] 4.19.0-rc5+ #16 Tainted: G        W
[  341.809674] --------------------------------------------
[  341.809674] kworker/u4:2/87 is trying to acquire lock:
[  341.809674] 000000005da2d519 (pernet_ops_rwsem){++++}, at: unregister_netdevice_notifier+0x8c/0x460
[  341.809674]
[  341.809674] but task is already holding lock:
[  341.809674] 000000005da2d519 (pernet_ops_rwsem){++++}, at: cleanup_net+0x119/0x900
[  341.809674]
[  341.809674] other info that might help us debug this:
[  341.809674]  Possible unsafe locking scenario:
[  341.809674]
[  341.809674]        CPU0
[  341.809674]        ----
[  341.809674]   lock(pernet_ops_rwsem);
[  341.809674]   lock(pernet_ops_rwsem);
[  341.809674]
[  341.809674]  *** DEADLOCK ***
[  341.809674]
[  341.809674]  May be due to missing lock nesting notation
[  341.809674]
[  341.809674] 3 locks held by kworker/u4:2/87:
[  341.809674]  #0: 00000000d9df6c92 ((wq_completion)"%s""netns"){+.+.}, at: process_one_work+0xafe/0x1de0
[  341.809674]  #1: 00000000c2cbcee2 (net_cleanup_work){+.+.}, at: process_one_work+0xb60/0x1de0
[  341.809674]  #2: 000000005da2d519 (pernet_ops_rwsem){++++}, at: cleanup_net+0x119/0x900
[  341.809674]
[  341.809674] stack backtrace:
[  341.809674] CPU: 1 PID: 87 Comm: kworker/u4:2 Tainted: G        W         4.19.0-rc5+ #16
[  341.809674] Workqueue: netns cleanup_net
[  341.809674] Call Trace:
[ ... ]
[  342.070196]  down_write+0x93/0x160
[  342.070196]  ? unregister_netdevice_notifier+0x8c/0x460
[  342.070196]  ? down_read+0x1e0/0x1e0
[  342.070196]  ? sched_clock_cpu+0x126/0x170
[  342.070196]  ? find_held_lock+0x39/0x1c0
[  342.070196]  unregister_netdevice_notifier+0x8c/0x460
[  342.070196]  ? register_netdevice_notifier+0x790/0x790
[  342.070196]  ? __local_bh_enable_ip+0xe9/0x1b0
[  342.070196]  ? __local_bh_enable_ip+0xe9/0x1b0
[  342.070196]  ? clusterip_tg_destroy+0x372/0x650 [ipt_CLUSTERIP]
[  342.070196]  ? trace_hardirqs_on+0x93/0x210
[  342.070196]  ? __bpf_trace_preemptirq_template+0x10/0x10
[  342.070196]  ? clusterip_tg_destroy+0x372/0x650 [ipt_CLUSTERIP]
[  342.123094]  clusterip_tg_destroy+0x3ad/0x650 [ipt_CLUSTERIP]
[  342.123094]  ? clusterip_net_init+0x3d0/0x3d0 [ipt_CLUSTERIP]
[  342.123094]  ? cleanup_match+0x17d/0x200 [ip_tables]
[  342.123094]  ? xt_unregister_table+0x215/0x300 [x_tables]
[  342.123094]  ? kfree+0xe2/0x2a0
[  342.123094]  cleanup_entry+0x1d5/0x2f0 [ip_tables]
[  342.123094]  ? cleanup_match+0x200/0x200 [ip_tables]
[  342.123094]  __ipt_unregister_table+0x9b/0x1a0 [ip_tables]
[  342.123094]  iptable_filter_net_exit+0x43/0x80 [iptable_filter]
[  342.123094]  ops_exit_list.isra.10+0x94/0x140
[  342.123094]  cleanup_net+0x45b/0x900
[ ... ]

Fixes: 202f59afd4 ("netfilter: ipt_CLUSTERIP: do not hold dev")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-12-18 01:17:59 +01:00
..
arp_tables.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next 2018-03-30 11:41:18 -04:00
arpt_mangle.c
arptable_filter.c net: Drop pernet_operations::async 2018-03-27 13:18:09 -04:00
ip_tables.c netfilter: x_tables: set module owner for icmp(6) matches 2018-07-05 11:45:11 +02:00
ipt_ah.c netfilter: ipt_ah: return boolean instead of integer 2018-03-05 23:15:43 +01:00
ipt_CLUSTERIP.c netfilter: ipt_CLUSTERIP: fix deadlock in netns exit routine 2018-12-18 01:17:59 +01:00
ipt_ECN.c netfilter: x_tables: use pr ratelimiting in all remaining spots 2018-02-14 21:05:38 +01:00
ipt_MASQUERADE.c netfilter: add NAT support for shifted portmap ranges 2018-04-24 10:29:12 +02:00
ipt_REJECT.c netfilter: x_tables: use pr ratelimiting in all remaining spots 2018-02-14 21:05:38 +01:00
ipt_rpfilter.c netfilter: rpfilter: Convert rpfilter_lookup_reverse to new dev helper 2018-09-20 20:01:52 -07:00
ipt_SYNPROXY.c netfilter: ctnetlink: synproxy support 2018-03-20 14:39:31 +01:00
iptable_filter.c net: Drop pernet_operations::async 2018-03-27 13:18:09 -04:00
iptable_mangle.c net: Drop pernet_operations::async 2018-03-27 13:18:09 -04:00
iptable_nat.c netfilter: nf_nat: add nat type hooks to nat core 2018-05-23 09:14:06 +02:00
iptable_raw.c net: Drop pernet_operations::async 2018-03-27 13:18:09 -04:00
iptable_security.c net: Drop pernet_operations::async 2018-03-27 13:18:09 -04:00
Kconfig netfilter: nat: remove l4proto->manip_pkt 2018-12-17 23:33:29 +01:00
Makefile netfilter: nat: remove nf_nat_l4proto struct 2018-12-17 23:33:31 +01:00
nf_defrag_ipv4.c net: Drop pernet_operations::async 2018-03-27 13:18:09 -04:00
nf_dup_ipv4.c netfilter: kill the fake untracked conntrack objects 2017-04-15 11:47:57 +02:00
nf_flow_table_ipv4.c netfilter: nf_flow_table: move init code to nf_flow_table_core.c 2018-04-24 10:28:45 +02:00
nf_log_arp.c net: Drop pernet_operations::async 2018-03-27 13:18:09 -04:00
nf_log_ipv4.c netfilter: check if the socket netns is correct. 2018-06-28 22:21:32 +09:00
nf_nat_h323.c netfilter: add NAT support for shifted portmap ranges 2018-04-24 10:29:12 +02:00
nf_nat_l3proto_ipv4.c netfilter: nat: remove nf_nat_l4proto struct 2018-12-17 23:33:31 +01:00
nf_nat_masquerade_ipv4.c netfilter: masquerade: don't flush all conntracks if only one address deleted on device 2018-09-28 14:28:26 +02:00
nf_nat_pptp.c netfilter: nat: remove l4proto->manip_pkt 2018-12-17 23:33:29 +01:00
nf_nat_snmp_basic_main.c netfilter: nf_nat_snmp_basic: add missing helper alias name 2018-10-16 10:01:50 +02:00
nf_nat_snmp_basic.asn1 netfilter: nf_nat_snmp_basic: use asn1 decoder library 2018-01-19 13:59:07 +01:00
nf_reject_ipv4.c netfilter: nf_reject_ipv4: Fix use-after-free in send_reset 2017-11-01 12:15:29 +01:00
nf_socket_ipv4.c netfilter: nf_socket: Fix out of bounds access in nf_sk_lookup_slow_v{4,6} 2018-03-24 21:17:14 +01:00
nf_tproxy_ipv4.c netfilter: nf_tproxy: fix possible non-linear access to transport header 2018-07-06 14:32:44 +02:00
nft_chain_nat_ipv4.c netfilter: nf_nat: add nat type hooks to nat core 2018-05-23 09:14:06 +02:00
nft_chain_route_ipv4.c netfilter: nf_tables: nft_register_chain_type() returns void 2018-03-30 11:29:18 +02:00
nft_dup_ipv4.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-11-15 10:54:36 -05:00
nft_fib_ipv4.c netfilter: nft_fib: Convert nft_fib4_eval to new dev helper 2018-09-20 20:01:53 -07:00
nft_masq_ipv4.c netfilter: add NAT support for shifted portmap ranges 2018-04-24 10:29:12 +02:00
nft_redir_ipv4.c netfilter: nf_tables: fix mismatch in big-endian system 2017-03-13 13:30:28 +01:00
nft_reject_ipv4.c netfilter: nf_tables: use hook state from xt_action_param structure 2016-11-03 11:52:34 +01:00