linux/net
Daniel Borkmann 2061dcd6bf net: sctp: fix race for one-to-many sockets in sendmsg's auto associate
I.e. one-to-many sockets in SCTP are not required to explicitly
call into connect(2) or sctp_connectx(2) prior to data exchange.
Instead, they can directly invoke sendmsg(2) and the SCTP stack
will automatically trigger connection establishment through 4WHS
via sctp_primitive_ASSOCIATE(). However, this in its current
implementation is racy: INIT is being sent out immediately (as
it cannot be bundled anyway) and the rest of the DATA chunks are
queued up for later xmit when connection is established, meaning
sendmsg(2) will return successfully. This behaviour can result
in an undesired side-effect that the kernel made the application
think the data has already been transmitted, although none of it
has actually left the machine, worst case even after close(2)'ing
the socket.

Instead, when the association from client side has been shut down
e.g. first gracefully through SCTP_EOF and then close(2), the
client could afterwards still receive the server's INIT_ACK due
to a connection with higher latency. This INIT_ACK is then considered
out of the blue and hence responded with ABORT as there was no
alive assoc found anymore. This can be easily reproduced f.e.
with sctp_test application from lksctp. One way to fix this race
is to wait for the handshake to actually complete.

The fix defers waiting after sctp_primitive_ASSOCIATE() and
sctp_primitive_SEND() succeeded, so that DATA chunks cooked up
from sctp_sendmsg() have already been placed into the output
queue through the side-effect interpreter, and therefore can then
be bundeled together with COOKIE_ECHO control chunks.

strace from example application (shortened):

socket(PF_INET, SOCK_SEQPACKET, IPPROTO_SCTP) = 3
sendmsg(3, {msg_name(28)={sa_family=AF_INET, sin_port=htons(8888), sin_addr=inet_addr("192.168.1.115")},
           msg_iov(1)=[{"hello", 5}], msg_controllen=0, msg_flags=0}, 0) = 5
sendmsg(3, {msg_name(28)={sa_family=AF_INET, sin_port=htons(8888), sin_addr=inet_addr("192.168.1.115")},
           msg_iov(1)=[{"hello", 5}], msg_controllen=0, msg_flags=0}, 0) = 5
sendmsg(3, {msg_name(28)={sa_family=AF_INET, sin_port=htons(8888), sin_addr=inet_addr("192.168.1.115")},
           msg_iov(1)=[{"hello", 5}], msg_controllen=0, msg_flags=0}, 0) = 5
sendmsg(3, {msg_name(28)={sa_family=AF_INET, sin_port=htons(8888), sin_addr=inet_addr("192.168.1.115")},
           msg_iov(1)=[{"hello", 5}], msg_controllen=0, msg_flags=0}, 0) = 5
sendmsg(3, {msg_name(28)={sa_family=AF_INET, sin_port=htons(8888), sin_addr=inet_addr("192.168.1.115")},
           msg_iov(0)=[], msg_controllen=48, {cmsg_len=48, cmsg_level=0x84 /* SOL_??? */, cmsg_type=, ...},
           msg_flags=0}, 0) = 0 // graceful shutdown for SOCK_SEQPACKET via SCTP_EOF
close(3) = 0

tcpdump before patch (fooling the application):

22:33:36.306142 IP 192.168.1.114.41462 > 192.168.1.115.8888: sctp (1) [INIT] [init tag: 3879023686] [rwnd: 106496] [OS: 10] [MIS: 65535] [init TSN: 3139201684]
22:33:36.316619 IP 192.168.1.115.8888 > 192.168.1.114.41462: sctp (1) [INIT ACK] [init tag: 3345394793] [rwnd: 106496] [OS: 10] [MIS: 10] [init TSN: 3380109591]
22:33:36.317600 IP 192.168.1.114.41462 > 192.168.1.115.8888: sctp (1) [ABORT]

tcpdump after patch:

14:28:58.884116 IP 192.168.1.114.35846 > 192.168.1.115.8888: sctp (1) [INIT] [init tag: 438593213] [rwnd: 106496] [OS: 10] [MIS: 65535] [init TSN: 3092969729]
14:28:58.888414 IP 192.168.1.115.8888 > 192.168.1.114.35846: sctp (1) [INIT ACK] [init tag: 381429855] [rwnd: 106496] [OS: 10] [MIS: 10] [init TSN: 2141904492]
14:28:58.888638 IP 192.168.1.114.35846 > 192.168.1.115.8888: sctp (1) [COOKIE ECHO] , (2) [DATA] (B)(E) [TSN: 3092969729] [...]
14:28:58.893278 IP 192.168.1.115.8888 > 192.168.1.114.35846: sctp (1) [COOKIE ACK] , (2) [SACK] [cum ack 3092969729] [a_rwnd 106491] [#gap acks 0] [#dup tsns 0]
14:28:58.893591 IP 192.168.1.114.35846 > 192.168.1.115.8888: sctp (1) [DATA] (B)(E) [TSN: 3092969730] [...]
14:28:59.096963 IP 192.168.1.115.8888 > 192.168.1.114.35846: sctp (1) [SACK] [cum ack 3092969730] [a_rwnd 106496] [#gap acks 0] [#dup tsns 0]
14:28:59.097086 IP 192.168.1.114.35846 > 192.168.1.115.8888: sctp (1) [DATA] (B)(E) [TSN: 3092969731] [...] , (2) [DATA] (B)(E) [TSN: 3092969732] [...]
14:28:59.103218 IP 192.168.1.115.8888 > 192.168.1.114.35846: sctp (1) [SACK] [cum ack 3092969732] [a_rwnd 106486] [#gap acks 0] [#dup tsns 0]
14:28:59.103330 IP 192.168.1.114.35846 > 192.168.1.115.8888: sctp (1) [SHUTDOWN]
14:28:59.107793 IP 192.168.1.115.8888 > 192.168.1.114.35846: sctp (1) [SHUTDOWN ACK]
14:28:59.107890 IP 192.168.1.114.35846 > 192.168.1.115.8888: sctp (1) [SHUTDOWN COMPLETE]

Looks like this bug is from the pre-git history museum. ;)

Fixes: 08707d5482df ("lksctp-2_5_31-0_5_1.patch")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-17 23:52:20 -05:00
..
6lowpan net/6lowpan: Remove FSF address from GPL statement. 2014-12-05 12:43:04 +01:00
9p 9p/trans_virtio: enable VQs early 2014-10-15 10:25:04 +10:30
802
8021q vlan: Add ability to always enable TSO/UFO 2014-12-12 10:58:53 -05:00
appletalk new helper: memcpy_from_msg() 2014-11-24 04:28:48 -05:00
atm put iov_iter into msghdr 2014-12-09 16:29:03 -05:00
ax25 new helper: memcpy_from_msg() 2014-11-24 04:28:48 -05:00
batman-adv batman-adv: fix potential TT client + orig-node memory leak 2015-01-06 11:07:01 +01:00
bluetooth Bluetooth: Fix accepting connections when not using mgmt 2014-12-24 20:02:00 +01:00
bridge bridge: only provide proxy ARP when CONFIG_INET is enabled 2015-01-14 15:08:02 -05:00
caif put iov_iter into msghdr 2014-12-09 16:29:03 -05:00
can can: fix spelling errors 2014-12-07 21:22:05 +01:00
ceph libceph: fix sparse endianness warnings 2015-01-08 20:36:57 +03:00
core net: rps: fix cpu unplug 2015-01-16 01:02:42 -05:00
dcb dcbnl : Disable software interrupts before taking dcb_lock 2014-11-16 14:50:52 -05:00
dccp net: introduce helper macro for_each_cmsghdr 2014-12-10 22:41:55 -05:00
decnet new helper: memcpy_to_msg() 2014-11-24 04:28:51 -05:00
dns_resolver Merge commit 'v3.16' into next 2014-10-01 00:44:04 +10:00
dsa Driver core patches for 3.19-rc1 2014-12-14 16:10:09 -08:00
ethernet
hsr
ieee802154 Merge tag 'master-2014-12-08' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next 2014-12-09 18:12:03 -05:00
ipv4 ip: zero sockaddr returned on error queue 2015-01-15 19:41:16 -05:00
ipv6 ip: zero sockaddr returned on error queue 2015-01-15 19:41:16 -05:00
ipx switch ipxrtr_route_packet() from iovec to msghdr 2014-11-24 04:28:49 -05:00
irda irda: Convert function pointer arrays and uses to const 2014-12-10 15:33:16 -05:00
iucv net: introduce helper macro for_each_cmsghdr 2014-12-10 22:41:55 -05:00
key new helper: memcpy_from_msg() 2014-11-24 04:28:48 -05:00
l2tp ip_generic_getfrag, udplite_getfrag: switch to passing msghdr 2014-12-09 16:28:22 -05:00
lapb lapb: move EXPORT_SYMBOL after functions. 2014-10-24 15:51:42 -04:00
llc llc: Make llc_sap_action_t function pointer arrays const 2014-12-10 15:21:24 -05:00
mac80211 mac80211: uninitialized return val in __ieee80211_sta_handle_tspec_ac_params 2015-01-07 13:57:34 +01:00
mac802154 mac802154: use goto label on failure 2014-12-05 14:18:42 +01:00
mpls mpls: Fix allowed protocols for mpls gso 2014-12-23 23:57:31 -05:00
netfilter Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf 2015-01-12 00:14:49 -05:00
netlabel netlabel: kernel-doc warning fix 2014-10-09 01:40:05 -04:00
netlink genetlink: synchronize socket closing and family removal 2015-01-16 17:04:25 -05:00
netrom new helper: memcpy_from_msg() 2014-11-24 04:28:48 -05:00
nfc Merge tag 'master-2014-12-08' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next 2014-12-09 18:12:03 -05:00
openvswitch openvswitch: packet messages need their own probe attribtue 2015-01-14 16:49:44 -05:00
packet packet: bail out of packet_snd() if L2 header creation fails 2015-01-11 21:54:03 -05:00
phonet new helper: memcpy_from_msg() 2014-11-24 04:28:48 -05:00
rds rds: Fix min() warning in rds_message_inc_copy_to_user() 2014-12-15 11:49:09 -05:00
rfkill Driver core patches for 3.19-rc1 2014-12-14 16:10:09 -08:00
rose new helper: memcpy_from_msg() 2014-11-24 04:28:48 -05:00
rxrpc net: introduce helper macro for_each_cmsghdr 2014-12-10 22:41:55 -05:00
sched Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-12-10 15:48:20 -05:00
sctp net: sctp: fix race for one-to-many sockets in sendmsg's auto associate 2015-01-17 23:52:20 -05:00
sunrpc rpc: fix xdr_truncate_encode to handle buffer ending on page boundary 2015-01-07 14:03:58 -05:00
switchdev bridge: call netdev_sw_port_stp_update when bridge port STP status changes 2014-12-02 20:01:22 -08:00
tipc tipc: fix bug in broadcast retransmit code 2015-01-12 16:01:59 -05:00
unix put iov_iter into msghdr 2014-12-09 16:29:03 -05:00
vmw_vsock put iov_iter into msghdr 2014-12-09 16:29:03 -05:00
wimax wimax: convert printk to pr_foo() 2014-10-07 20:28:44 -04:00
wireless Just two fixes - one for an uninialized variable and 2015-01-15 19:28:36 -05:00
x25 new helper: memcpy_from_msg() 2014-11-24 04:28:48 -05:00
xfrm Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next 2014-12-08 21:30:21 -05:00
compat.c put iov_iter into msghdr 2014-12-09 16:29:03 -05:00
Kconfig net: introduce generic switch devices support 2014-12-02 20:01:20 -08:00
Makefile Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2014-12-16 15:53:03 -08:00
socket.c [regression] chunk lost from bd9b51 2014-12-19 07:13:21 -05:00
sysctl_net.c