linux/net
Daniel Borkmann 8c98653f05 sctp: sctp_close: fix release of bindings for deferred call_rcu's
It seems due to RCU usage, i.e. within SCTP's address binding list,
a, say, ``behavioral change'' was introduced which does actually
not conform to the RFC anymore. In particular consider the following
(fictional) scenario to demonstrate this:

  do:
    Two SOCK_SEQPACKET-style sockets are opened (S1, S2)
    S1 is bound to 127.0.0.1, port 1024 [server]
    S2 is bound to 127.0.0.1, port 1025 [client]
    listen(2) is invoked on S1
    From S2 we call one sendmsg(2) with msg.msg_name and
       msg.msg_namelen parameters set to the server's
       address
    S1, S2 are closed
    goto do

The first pass of this loop passes successful, while the second round
fails during binding of S1 (address still in use). What is happening?
In the first round, the initial handshake is being done, and, at the
time close(2) is called on S1, a non-graceful shutdown is performed via
ABORT since in S1's receive queue an unprocessed packet is present,
thus stating an error condition. This can be considered as a correct
behavior.

During close also all bound addresses are freed, thus nothing *must*
be active anymore. In reference to RFC2960:

  After checking the Verification Tag, the receiving endpoint shall
  remove the association from its record, and shall report the
  termination to its upper layer. (9.1 Abort of an Association)

Also, no half-open states are supported, thus after an ungraceful
shutdown, we leave nothing behind. However, this seems not to be
happening though. In a real-world scenario, this is exactly where
it breaks the lksctp-tools functional test suite, *for instance*:

  ./test_sockopt
  test_sockopt.c  1 PASS : getsockopt(SCTP_STATUS) on a socket with no assoc
  test_sockopt.c  2 PASS : getsockopt(SCTP_STATUS)
  test_sockopt.c  3 PASS : getsockopt(SCTP_STATUS) with invalid associd
  test_sockopt.c  4 PASS : getsockopt(SCTP_STATUS) with NULL associd
  test_sockopt.c  5 BROK : bind: Address already in use

The underlying problem is that sctp_endpoint_destroy() hasn't been
triggered yet while the next bind attempt is being done. It will be
triggered eventually (but too late) by sctp_transport_destroy_rcu()
after one RCU grace period:

  sctp_transport_destroy()
    sctp_transport_destroy_rcu() ----.
      sctp_association_put() [*]  <--+--> sctp_packet_free()
        sctp_association_destroy()          [...]
          sctp_endpoint_put()                 skb->destructor
            sctp_endpoint_destroy()             sctp_wfree()
              sctp_bind_addr_free()               sctp_association_put() [*]

Thus, we move out the condition with sctp_association_put() as well as
the sctp_packet_free() invocation and the issue can be solved. We also
better free the SCTP chunks first before putting the ref of the association.

With this patch, the example above (which simulates a similar scenario
as in the implementation of this test case) and therefore also the test
suite run successfully through. Tested by myself.

Cc: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-04 13:22:33 -05:00
..
9p virtio: 9p: correctly pass physical address to userspace for high pages 2012-10-22 18:19:36 +10:30
802
8021q net: disallow drivers with buggy VLAN accel to register_netdevice() 2013-01-29 22:58:40 -05:00
appletalk userns: Print out socket uids in a user namespace aware fashion. 2012-08-14 21:48:06 -07:00
atm atm: use scnprintf() instead of sprintf() 2012-12-17 20:50:51 -08:00
ax25 userns: Convert net/ax25 to use kuid_t where appropriate 2012-08-14 21:49:42 -07:00
batman-adv Included changes: 2013-01-29 15:59:45 -05:00
bluetooth Bluetooth: Check if the hci connection exists in SCO shutdown 2013-01-10 03:53:32 -02:00
bridge netns: bridge: allow unprivileged users add/delete mdb entry 2013-02-04 13:12:16 -05:00
caif caif_usb: Make the driver name check more efficient 2012-12-09 00:34:02 -05:00
can can: rework skb reserved data handling 2013-01-28 18:17:25 -05:00
ceph Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client 2013-01-02 17:32:49 -08:00
core netns: fdb: allow unprivileged users to add/del fdb entries 2013-02-04 13:12:16 -05:00
dcb net: Allow DCBnl to use other namespaces besides init_net 2012-12-10 14:09:01 -05:00
dccp inet: Fix kmemleak in tcp_v4/6_syn_recv_sock and dccp_v4/6_request_recv_sock 2012-12-14 13:14:07 -05:00
decnet decnet: use correct RCU API to deref sk_dst_cache field 2013-01-28 00:15:27 -05:00
dns_resolver Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2012-12-16 15:40:50 -08:00
dsa dsa: make dsa_switch_setup check for valid port names 2013-01-21 15:40:12 -05:00
ethernet net: split eth_mac_addr for better error handling 2013-01-21 14:07:44 -05:00
ieee802154 6lowpan: Handle uncompressed IPv6 packets over 6LoWPAN 2013-01-18 14:18:30 -05:00
ipv4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2013-01-29 15:32:13 -05:00
ipv6 ipv6 anycast: Convert ipv6_sk_ac_lock to spinlock. 2013-01-30 22:41:13 -05:00
ipx userns: Print out socket uids in a user namespace aware fashion. 2012-08-14 21:48:06 -07:00
irda irda: buffer overflow in irnet_ctrl_read() 2013-01-27 20:38:19 -05:00
iucv s390/irq: remove split irq fields from /proc/stat 2013-01-08 10:57:07 +01:00
key xfrm: Convert xfrm_addr_cmp() to boolean xfrm_addr_equal(). 2013-01-29 22:58:40 -05:00
l2tp Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2012-11-10 18:32:51 -05:00
lapb
llc net: Allow userns root to control llc, netfilter, netlink, packet, and xfrm 2012-11-18 20:32:45 -05:00
mac80211 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless 2013-01-28 13:54:03 -05:00
mac802154 mac802154: fix NOHZ local_softirq_pending 08 warning 2013-01-04 13:47:21 -08:00
netfilter netfilter ipset: Use ipv6_addr_equal() where appropriate. 2013-01-29 22:58:40 -05:00
netlabel Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2012-10-02 13:38:27 -07:00
netlink netlink: Use FIELD_SIZEOF() in netlink_proto_init(). 2013-01-09 23:38:23 -08:00
netrom net: change return values from -EACCES to -EPERM 2012-09-21 13:58:08 -04:00
nfc NFC: Use skb_copy_datagram_iovec 2013-01-11 14:56:32 +01:00
openvswitch openvswitch: Use FIELD_SIZEOF() in dp_init(). 2013-01-09 23:38:24 -08:00
packet net: Allow userns root to control llc, netfilter, netlink, packet, and xfrm 2012-11-18 20:32:45 -05:00
phonet net: Push capable(CAP_NET_ADMIN) into the rtnl methods 2012-11-18 20:32:44 -05:00
rds IB/rds: suppress incompatible protocol when version is known 2012-12-26 15:17:37 -08:00
rfkill Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2012-12-13 12:00:02 -08:00
rose
rxrpc rxrpc: Use FIELD_SIZEOF() in af_rxrpc_init(). 2013-01-09 23:38:24 -08:00
sched pkt_sched: namespace aware act_mirred 2013-01-14 15:09:36 -05:00
sctp sctp: sctp_close: fix release of bindings for deferred call_rcu's 2013-02-04 13:22:33 -05:00
sunrpc NFS client bugfixe for Linux 3.8 2013-01-11 12:09:04 -08:00
tipc tipc: refactor accept() code for improved readability 2012-12-07 17:23:24 -05:00
unix unix: Use FIELD_SIZEOF() in af_unix_init(). 2013-01-09 23:38:24 -08:00
wimax
wireless Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem 2013-01-28 14:43:00 -05:00
x25
xfrm xfrm: Convert xfrm_addr_cmp() to boolean xfrm_addr_equal(). 2013-01-29 22:58:40 -05:00
compat.c make get_file() return its argument 2012-09-26 21:10:25 -04:00
Kconfig wanrouter: completely decouple obsolete code from kernel. 2013-01-31 19:20:33 -05:00
Makefile wanrouter: completely decouple obsolete code from kernel. 2013-01-31 19:20:33 -05:00
nonet.c
socket.c wanrouter: completely decouple obsolete code from kernel. 2013-01-31 19:20:33 -05:00
sysctl_net.c user_ns: get rid of duplicate code in net_ctl_permissions 2012-11-18 20:32:45 -05:00