linux/net/netfilter
Pablo Neira Ayuso e53376bef2 netfilter: nf_conntrack: don't release a conntrack with non-zero refcnt
With this patch, the conntrack refcount is initially set to zero and
it is bumped once it is added to any of the list, so we fulfill
Eric's golden rule which is that all released objects always have a
refcount that equals zero.

Andrey Vagin reports that nf_conntrack_free can't be called for a
conntrack with non-zero ref-counter, because it can race with
nf_conntrack_find_get().

A conntrack slab is created with SLAB_DESTROY_BY_RCU. Non-zero
ref-counter says that this conntrack is used. So when we release
a conntrack with non-zero counter, we break this assumption.

CPU1                                    CPU2
____nf_conntrack_find()
                                        nf_ct_put()
                                         destroy_conntrack()
                                        ...
                                        init_conntrack
                                         __nf_conntrack_alloc (set use = 1)
atomic_inc_not_zero(&ct->use) (use = 2)
                                         if (!l4proto->new(ct, skb, dataoff, timeouts))
                                          nf_conntrack_free(ct); (use = 2 !!!)
                                        ...
                                        __nf_conntrack_alloc (set use = 1)
 if (!nf_ct_key_equal(h, tuple, zone))
  nf_ct_put(ct); (use = 0)
   destroy_conntrack()
                                        /* continue to work with CT */

After applying the path "[PATCH] netfilter: nf_conntrack: fix RCU
race in nf_conntrack_find_get" another bug was triggered in
destroy_conntrack():

<4>[67096.759334] ------------[ cut here ]------------
<2>[67096.759353] kernel BUG at net/netfilter/nf_conntrack_core.c:211!
...
<4>[67096.759837] Pid: 498649, comm: atdd veid: 666 Tainted: G         C ---------------    2.6.32-042stab084.18 #1 042stab084_18 /DQ45CB
<4>[67096.759932] RIP: 0010:[<ffffffffa03d99ac>]  [<ffffffffa03d99ac>] destroy_conntrack+0x15c/0x190 [nf_conntrack]
<4>[67096.760255] Call Trace:
<4>[67096.760255]  [<ffffffff814844a7>] nf_conntrack_destroy+0x17/0x30
<4>[67096.760255]  [<ffffffffa03d9bb5>] nf_conntrack_find_get+0x85/0x130 [nf_conntrack]
<4>[67096.760255]  [<ffffffffa03d9fb2>] nf_conntrack_in+0x352/0xb60 [nf_conntrack]
<4>[67096.760255]  [<ffffffffa048c771>] ipv4_conntrack_local+0x51/0x60 [nf_conntrack_ipv4]
<4>[67096.760255]  [<ffffffff81484419>] nf_iterate+0x69/0xb0
<4>[67096.760255]  [<ffffffff814b5b00>] ? dst_output+0x0/0x20
<4>[67096.760255]  [<ffffffff814845d4>] nf_hook_slow+0x74/0x110
<4>[67096.760255]  [<ffffffff814b5b00>] ? dst_output+0x0/0x20
<4>[67096.760255]  [<ffffffff814b66d5>] raw_sendmsg+0x775/0x910
<4>[67096.760255]  [<ffffffff8104c5a8>] ? flush_tlb_others_ipi+0x128/0x130
<4>[67096.760255]  [<ffffffff8100bc4e>] ? apic_timer_interrupt+0xe/0x20
<4>[67096.760255]  [<ffffffff8100bc4e>] ? apic_timer_interrupt+0xe/0x20
<4>[67096.760255]  [<ffffffff814c136a>] inet_sendmsg+0x4a/0xb0
<4>[67096.760255]  [<ffffffff81444e93>] ? sock_sendmsg+0x13/0x140
<4>[67096.760255]  [<ffffffff81444f97>] sock_sendmsg+0x117/0x140
<4>[67096.760255]  [<ffffffff8102e299>] ? native_smp_send_reschedule+0x49/0x60
<4>[67096.760255]  [<ffffffff81519beb>] ? _spin_unlock_bh+0x1b/0x20
<4>[67096.760255]  [<ffffffff8109d930>] ? autoremove_wake_function+0x0/0x40
<4>[67096.760255]  [<ffffffff814960f0>] ? do_ip_setsockopt+0x90/0xd80
<4>[67096.760255]  [<ffffffff8100bc4e>] ? apic_timer_interrupt+0xe/0x20
<4>[67096.760255]  [<ffffffff8100bc4e>] ? apic_timer_interrupt+0xe/0x20
<4>[67096.760255]  [<ffffffff814457c9>] sys_sendto+0x139/0x190
<4>[67096.760255]  [<ffffffff810efa77>] ? audit_syscall_entry+0x1d7/0x200
<4>[67096.760255]  [<ffffffff810ef7c5>] ? __audit_syscall_exit+0x265/0x290
<4>[67096.760255]  [<ffffffff81474daf>] compat_sys_socketcall+0x13f/0x210
<4>[67096.760255]  [<ffffffff8104dea3>] ia32_sysret+0x0/0x5

I have reused the original title for the RFC patch that Andrey posted and
most of the original patch description.

Cc: Eric Dumazet <edumazet@google.com>
Cc: Andrew Vagin <avagin@parallels.com>
Cc: Florian Westphal <fw@strlen.de>
Reported-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
2014-02-05 17:46:06 +01:00
..
ipset Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2014-01-25 11:17:34 -08:00
ipvs ipvs: fix AF assignment in ip_vs_conn_new() 2014-02-04 21:13:47 +09:00
core.c netfilter: pass hook ops to hookfn 2013-10-14 11:29:31 +02:00
Kconfig Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nftables 2014-01-16 11:44:54 -08:00
Makefile Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next 2014-01-10 14:50:02 -05:00
nf_conntrack_acct.c netfilter: introduce nf_conn_acct structure 2013-11-03 21:48:49 +01:00
nf_conntrack_amanda.c netfilter: add my copyright statements 2013-04-18 20:27:55 +02:00
nf_conntrack_broadcast.c
nf_conntrack_core.c netfilter: nf_conntrack: don't release a conntrack with non-zero refcnt 2014-02-05 17:46:06 +01:00
nf_conntrack_ecache.c netfilter: add my copyright statements 2013-04-18 20:27:55 +02:00
nf_conntrack_expect.c netfilter: ctnetlink: fix incorrect NAT expectation dumping 2013-07-15 11:14:51 +02:00
nf_conntrack_extend.c netfilter: nf_ct_ext: support variable length extensions 2012-06-16 15:08:49 +02:00
nf_conntrack_ftp.c netfilter: Implement RFC 1123 for FTP conntrack 2013-05-27 13:32:43 +02:00
nf_conntrack_h323_asn1.c
nf_conntrack_h323_main.c netfilter: nf_conntrack: fix rt6i_gateway checks for H.323 helper 2013-10-21 18:37:01 -04:00
nf_conntrack_h323_types.c
nf_conntrack_helper.c netfilter: add my copyright statements 2013-04-18 20:27:55 +02:00
nf_conntrack_irc.c netfilter: add my copyright statements 2013-04-18 20:27:55 +02:00
nf_conntrack_l3proto_generic.c
nf_conntrack_labels.c netfilter: connlabels: remove unneeded includes 2013-07-31 16:39:18 +02:00
nf_conntrack_netbios_ns.c
nf_conntrack_netlink.c netfilter: ctnetlink: honor CTA_MARK_MASK when setting ctmark 2013-12-20 10:21:42 +01:00
nf_conntrack_pptp.c netfilter: add my copyright statements 2013-04-18 20:27:55 +02:00
nf_conntrack_proto_dccp.c netfilter: nf_conntrack_dccp: fix skb_header_pointer API usages 2014-01-06 17:40:02 +01:00
nf_conntrack_proto_generic.c netfilter: nf_conntrack: generalize nf_ct_l4proto_net 2012-07-04 19:37:22 +02:00
nf_conntrack_proto_gre.c netfilter: add my copyright statements 2013-04-18 20:27:55 +02:00
nf_conntrack_proto_sctp.c netfilter: add my copyright statements 2013-04-18 20:27:55 +02:00
nf_conntrack_proto_tcp.c netfilter: add SYNPROXY core/target 2013-08-28 00:27:54 +02:00
nf_conntrack_proto_udp.c netfilter: add my copyright statements 2013-04-18 20:27:55 +02:00
nf_conntrack_proto_udplite.c netfilter: nf_log: prepare net namespace support for loggers 2013-04-05 20:12:54 +02:00
nf_conntrack_proto.c netfilter: nf_conntrack: remove dead code 2014-01-03 23:41:37 +01:00
nf_conntrack_sane.c netfilter: nf_ct_helper: better logging for dropped packets 2013-02-19 02:48:05 +01:00
nf_conntrack_seqadj.c netfilter: only warn once on wrong seqadj usage 2014-01-06 14:23:17 +01:00
nf_conntrack_sip.c netfilter: nf_ct_sip: consolidate NAT hook functions 2013-10-01 12:47:09 +02:00
nf_conntrack_snmp.c netfilter: nf_ct_snmp: add include file 2013-01-18 00:28:18 +01:00
nf_conntrack_standalone.c net: Convert uses of typedef ctl_table to struct ctl_table 2013-06-13 02:36:09 -07:00
nf_conntrack_tftp.c netfilter: add my copyright statements 2013-04-18 20:27:55 +02:00
nf_conntrack_timeout.c netfilter: nf_ct_timeout: move initialization out of pernet_operations 2013-01-23 12:56:02 +01:00
nf_conntrack_timestamp.c netfilter: nf_ct_timestamp: Fix BUG_ON after netns deletion 2013-12-20 14:58:29 +01:00
nf_internals.h net: misc: Remove extern from function prototypes 2013-10-19 19:12:11 -04:00
nf_log.c net: Convert uses of typedef ctl_table to struct ctl_table 2013-06-13 02:36:09 -07:00
nf_nat_amanda.c netfilter: add my copyright statements 2013-04-18 20:27:55 +02:00
nf_nat_core.c netfilter: nf_nat: add full port randomization support 2014-01-03 23:41:26 +01:00
nf_nat_ftp.c netfilter: nf_ct_helper: better logging for dropped packets 2013-02-19 02:48:05 +01:00
nf_nat_helper.c netfilter: nf_conntrack: make sequence number adjustments usuable without NAT 2013-08-28 00:26:48 +02:00
nf_nat_irc.c netfilter: nf_nat: fix access to uninitialized buffer in IRC NAT helper 2014-01-06 14:17:17 +01:00
nf_nat_proto_common.c netfilter: nf_nat: add full port randomization support 2014-01-03 23:41:26 +01:00
nf_nat_proto_dccp.c netfilter: add protocol independent NAT core 2012-08-30 03:00:14 +02:00
nf_nat_proto_sctp.c net/sctp: Refactor SCTP skb checksum computation 2013-07-27 20:07:15 -07:00
nf_nat_proto_tcp.c netfilter: add protocol independent NAT core 2012-08-30 03:00:14 +02:00
nf_nat_proto_udp.c netfilter: add protocol independent NAT core 2012-08-30 03:00:14 +02:00
nf_nat_proto_udplite.c netfilter: add protocol independent NAT core 2012-08-30 03:00:14 +02:00
nf_nat_proto_unknown.c netfilter: add protocol independent NAT core 2012-08-30 03:00:14 +02:00
nf_nat_sip.c netfilter: nf_ct_sip: consolidate NAT hook functions 2013-10-01 12:47:09 +02:00
nf_nat_tftp.c netfilter: nf_ct_helper: better logging for dropped packets 2013-02-19 02:48:05 +01:00
nf_queue.c netfilter: move skb_gso_segment into nfnetlink_queue module 2013-04-29 20:09:05 +02:00
nf_sockopt.c
nf_synproxy_core.c netfilter: nf_conntrack: don't release a conntrack with non-zero refcnt 2014-02-05 17:46:06 +01:00
nf_tables_api.c netfilter: nf_tables: fix oops when deleting a chain with references 2014-02-05 13:16:17 +01:00
nf_tables_core.c netfilter: nf_tables: rename nft_do_chain_pktinfo() to nft_do_chain() 2014-01-09 20:17:16 +01:00
nf_tables_inet.c netfilter: nf_tables: fix error path in the init functions 2014-01-09 23:25:48 +01:00
nfnetlink_acct.c netfilter: nfnetlink_acct: fix incomplete dumping of objects 2013-06-05 12:36:36 +02:00
nfnetlink_cthelper.c netfilter: check return code from nla_parse_tested 2013-06-20 11:20:13 +02:00
nfnetlink_cttimeout.c netfilter: cttimeout: allow to set/get default protocol timeouts 2013-10-01 13:17:39 +02:00
nfnetlink_log.c Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next 2014-01-08 15:04:56 -05:00
nfnetlink_queue_core.c Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch 2014-01-06 19:48:38 -05:00
nfnetlink_queue_ct.c netfilter: nf_conntrack: make sequence number adjustments usuable without NAT 2013-08-28 00:26:48 +02:00
nfnetlink.c nfnetlink: do not ack malformed messages 2013-11-08 15:12:11 -05:00
nft_bitwise.c netfilter: nf_tables: expression ops overloading 2013-10-14 17:16:08 +02:00
nft_byteorder.c netfilter: nf_tables: expression ops overloading 2013-10-14 17:16:08 +02:00
nft_cmp.c netfilter: nf_tables: add compatibility layer for x_tables 2013-10-14 18:00:04 +02:00
nft_compat.c netfilter: nf_tables: add support for multi family tables 2014-01-07 23:55:46 +01:00
nft_counter.c netfilter: nf_tables: expression ops overloading 2013-10-14 17:16:08 +02:00
nft_ct.c netfilter: nft_ct: fix unconditional dump of 'dir' attr 2014-02-05 13:16:17 +01:00
nft_expr_template.c netfilter: nf_tables: expression ops overloading 2013-10-14 17:16:08 +02:00
nft_exthdr.c netfilter: nft_exthdr: call ipv6_find_hdr() with explicitly initialized offset 2013-12-20 11:25:10 +01:00
nft_hash.c Revert "netfilter: avoid get_random_bytes calls" 2014-01-06 14:00:55 +01:00
nft_immediate.c netfilter: nf_tables: add compatibility layer for x_tables 2013-10-14 18:00:04 +02:00
nft_limit.c netfilter: nf_tables: expression ops overloading 2013-10-14 17:16:08 +02:00
nft_log.c netfilter: nf_tables: add hook ops to struct nft_pktinfo 2014-01-07 23:50:43 +01:00
nft_lookup.c netfilter: nf_tables: expression ops overloading 2013-10-14 17:16:08 +02:00
nft_meta.c netfilter: nft_meta: fix lack of validation of the input register 2014-01-09 20:04:16 +01:00
nft_nat.c netfilter: nft_nat: Fix endianness issue reported by sparse 2013-10-28 17:41:49 +01:00
nft_payload.c netfilter: nf_tables: nft_payload: fix transport header base 2013-10-14 18:00:56 +02:00
nft_queue.c netfilter: nft: add queue module 2013-12-07 23:20:46 +01:00
nft_rbtree.c netfilter: nf_tables: add netlink set API 2013-10-14 17:16:07 +02:00
nft_reject.c netfilter: nf_tables: add hook ops to struct nft_pktinfo 2014-01-07 23:50:43 +01:00
x_tables.c netfilter: x_tables: fix ordering of jumpstack allocation and table update 2013-10-22 10:11:29 +02:00
xt_addrtype.c netfilter: xt_addrtype: fix trivial typo 2013-07-31 16:36:25 +02:00
xt_AUDIT.c netfilter: xt_AUDIT: only generate audit log when audit enabled 2013-03-04 14:45:25 +01:00
xt_bpf.c netfilter: x_tables: add xt_bpf match 2013-01-21 12:20:19 +01:00
xt_cgroup.c netfilter: x_tables: lightweight process control group matching 2014-01-03 23:41:44 +01:00
xt_CHECKSUM.c
xt_CLASSIFY.c
xt_cluster.c
xt_comment.c
xt_connbytes.c netfilter: introduce nf_conn_acct structure 2013-11-03 21:48:49 +01:00
xt_connlabel.c netfilter: add connlabel conntrack extension 2013-01-18 00:28:15 +01:00
xt_connlimit.c Revert "netfilter: avoid get_random_bytes calls" 2014-01-06 14:00:55 +01:00
xt_connmark.c netfilter: Fix FSF address in file headers 2013-12-06 12:37:57 -05:00
xt_CONNSECMARK.c
xt_conntrack.c netfilter: add my copyright statements 2013-04-18 20:27:55 +02:00
xt_cpu.c
xt_CT.c netfilter: nf_conntrack: don't release a conntrack with non-zero refcnt 2014-02-05 17:46:06 +01:00
xt_dccp.c
xt_devgroup.c
xt_dscp.c
xt_DSCP.c
xt_ecn.c
xt_esp.c
xt_hashlimit.c Revert "netfilter: avoid get_random_bytes calls" 2014-01-06 14:00:55 +01:00
xt_helper.c
xt_hl.c
xt_HL.c
xt_HMARK.c ipv6: Move ipv6_find_hdr() out of Netfilter code. 2012-11-09 17:05:07 -08:00
xt_IDLETIMER.c
xt_ipcomp.c netfilter: add IPv4/6 IPComp extension match support 2013-12-24 12:37:58 +01:00
xt_iprange.c
xt_ipvs.c ipvs: API change to avoid rescan of IPv6 exthdr 2012-09-28 11:34:33 +09:00
xt_l2tp.c netfilter: introduce l2tp match extension 2014-01-09 21:36:39 +01:00
xt_LED.c
xt_length.c
xt_limit.c netfilter: add my copyright statements 2013-04-18 20:27:55 +02:00
xt_LOG.c netfilter: xt_LOG: fix mark logging for IPv6 packets 2013-05-29 12:29:18 +02:00
xt_mac.c netfilter: Convert compare_ether_addr to ether_addr_equal 2012-05-09 20:49:18 -04:00
xt_mark.c
xt_multiport.c
xt_nat.c netfilter: xt_nat: fix incorrect hooks for SNAT and DNAT targets 2012-10-15 13:39:12 +02:00
xt_NETMAP.c netfilter: combine ipt_NETMAP and ip6t_NETMAP 2012-09-21 12:11:08 +02:00
xt_nfacct.c
xt_NFLOG.c netfilter: log: netns NULL ptr bug when calling from conntrack 2013-05-15 14:11:07 +02:00
xt_NFQUEUE.c netfilter: xt_NFQUEUE: separate reusable code 2013-12-07 23:20:45 +01:00
xt_osf.c netfilter: Fix FSF address in file headers 2013-12-06 12:37:57 -05:00
xt_owner.c userns: xt_owner: Add basic user namespace support. 2012-08-14 21:55:30 -07:00
xt_physdev.c
xt_pkttype.c
xt_policy.c
xt_quota.c
xt_rateest.c net_sched: add 64bit rate estimators 2013-06-11 02:51:03 -07:00
xt_RATEEST.c Revert "netfilter: avoid get_random_bytes calls" 2014-01-06 14:00:55 +01:00
xt_realm.c
xt_recent.c Revert "netfilter: avoid get_random_bytes calls" 2014-01-06 14:00:55 +01:00
xt_REDIRECT.c netfilter: combine ipt_REDIRECT and ip6t_REDIRECT 2012-09-21 12:12:05 +02:00
xt_repldata.h
xt_sctp.c
xt_SECMARK.c
xt_set.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2013-11-15 16:47:22 -08:00
xt_socket.c netfilter: xt_socket: use sock_gen_put() 2013-10-17 10:27:25 +02:00
xt_state.c
xt_statistic.c net: replace macros net_random and net_srandom with direct calls to prandom 2014-01-14 15:15:25 -08:00
xt_string.c
xt_tcpmss.c
xt_TCPMSS.c netfilter: xt_TCPMSS: lookup route from proper net namespace 2013-09-27 16:18:23 +02:00
xt_TCPOPTSTRIP.c netfilter: xt_TCPOPTSTRIP: fix possible off by one access 2013-08-01 11:45:15 +02:00
xt_tcpudp.c
xt_TEE.c net: pass info struct via netdevice notifier 2013-05-28 13:11:01 -07:00
xt_time.c netfilter: xt_time: add support to ignore day transition 2012-09-24 14:29:01 +02:00
xt_TPROXY.c ipv6: make lookups simpler and faster 2013-10-09 00:01:25 -04:00
xt_TRACE.c
xt_u32.c