linux/net
Sowmini Varadhan 1a0e100fb2 RDS: TCP: Force every connection to be initiated by numerically smaller IP address
When 2 RDS peers initiate an RDS-TCP connection simultaneously,
there is a potential for "duelling syns" on either/both sides.
See commit 241b271952 ("RDS-TCP: Reset tcp callbacks if re-using an
outgoing socket in rds_tcp_accept_one()") for a description of this
condition, and the arbitration logic which ensures that the
numerically large IP address in the TCP connection is bound to the
RDS_TCP_PORT ("canonical ordering").

The rds_connection should not be marked as RDS_CONN_UP until the
arbitration logic has converged for the following reason. The sender
may start transmitting RDS datagrams as soon as RDS_CONN_UP is set,
and since the sender removes all datagrams from the rds_connection's
cp_retrans queue based on TCP acks. If the TCP ack was sent from
a tcp socket that got reset as part of duel aribitration (but
before data was delivered to the receivers RDS socket layer),
the sender may end up prematurely freeing the datagram, and
the datagram is no longer reliably deliverable.

This patch remedies that condition by making sure that, upon
receipt of 3WH completion state change notification of TCP_ESTABLISHED
in rds_tcp_state_change, we mark the rds_connection as RDS_CONN_UP
if, and only if, the IP addresses and ports for the connection are
canonically ordered. In all other cases, rds_tcp_state_change will
force an rds_conn_path_drop(), and rds_queue_reconnect() on
both peers will restart the connection to ensure canonical ordering.

A side-effect of enforcing this condition in rds_tcp_state_change()
is that rds_tcp_accept_one_path() can now be refactored for simplicity.
It is also no longer possible to encounter an RDS_CONN_UP connection in
the arbitration logic in rds_tcp_accept_one().

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-17 13:35:18 -05:00
..
6lowpan 6lowpan: ndisc: no overreact if no short address is available 2016-09-19 20:19:34 +02:00
9p IB/core: add support to create a unsafe global rkey to ib_create_pd 2016-09-23 13:47:44 -04:00
802 net: use core MTU range checking in misc drivers 2016-10-20 14:51:10 -04:00
8021q Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-10-30 12:42:58 -04:00
appletalk appletalk: use IS_ENABLED() instead of checking for built-in or module 2016-09-10 21:19:10 -07:00
atm net: remove MTU limits on a few ether_setup callers 2016-10-21 13:57:50 -04:00
ax25
batman-adv This feature and cleanup patchset includes the following changes: 2016-11-09 22:15:28 -05:00
bluetooth Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-10-30 12:42:58 -04:00
bridge netfilter: remove hook_entries field from nf_hook_state 2016-11-03 11:52:58 +01:00
caif net caif: insert missing spaces in pr_* messages and unbreak multi-line strings 2016-10-28 13:47:33 -04:00
can can: bcm: fix warning in bcm_connect/proc_register 2016-10-31 20:48:19 +01:00
ceph libceph: initialize last_linger_id with a large integer 2016-11-10 20:13:08 +01:00
core netpoll: more efficient locking 2016-11-16 18:32:02 -05:00
dcb
dccp Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-11-15 10:54:36 -05:00
decnet net: fix sleeping for sk_wait_event() 2016-11-14 13:17:21 -05:00
dns_resolver
dsa net: remove MTU limits on a few ether_setup callers 2016-10-21 13:57:50 -04:00
ethernet net: make default TX queue length a defined constant 2016-11-07 20:15:55 -05:00
hsr Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-10-30 12:42:58 -04:00
ieee802154 genetlink: mark families as __ro_after_init 2016-10-27 16:16:09 -04:00
ipv4 tcp: allow to enable the repair mode for non-listening sockets 2016-11-15 22:28:50 -05:00
ipv6 ipv6: sr: add option to control lwtunnel support 2016-11-16 11:29:46 -05:00
ipx
irda genetlink: mark families as __ro_after_init 2016-10-27 16:16:09 -04:00
iucv Subject: [PATCH] af_iucv: drop skbs rejected by filter 2016-10-12 01:56:04 -04:00
kcm Merge branch 'work.splice_read' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-07 15:36:58 -07:00
key
l2tp net: l2tp: fix negative assignment to unsigned int 2016-11-09 18:55:36 -05:00
l3mdev net: ipv6: Remove l3mdev_get_saddr6 2016-09-10 23:12:53 -07:00
lapb
llc net: fix sleeping for sk_wait_event() 2016-11-14 13:17:21 -05:00
mac80211 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-10-30 12:42:58 -04:00
mac802154 mac802154: use rate limited warnings for malformed frames 2016-09-19 20:19:34 +02:00
mpls lwt: Remove unused len field 2016-10-23 17:45:01 -04:00
ncsi net/ncsi: Improve HNCDSC AEN handler 2016-10-20 11:23:08 -04:00
netfilter Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-11-15 10:54:36 -05:00
netlabel genetlink: mark families as __ro_after_init 2016-10-27 16:16:09 -04:00
netlink Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-11-15 10:54:36 -05:00
netrom
nfc genetlink: mark families as __ro_after_init 2016-10-27 16:16:09 -04:00
openvswitch Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next 2016-11-13 22:41:25 -05:00
packet packet: on direct_xmit, limit tso and csum to supported devices 2016-10-29 15:02:15 -04:00
phonet net: fix sleeping for sk_wait_event() 2016-11-14 13:17:21 -05:00
qrtr
rds RDS: TCP: Force every connection to be initiated by numerically smaller IP address 2016-11-17 13:35:18 -05:00
rfkill
rose rose: limit sk_filter trim to payload 2016-07-13 11:53:40 -07:00
rxrpc udp: do fwd memory scheduling on dequeue 2016-11-07 13:24:41 -05:00
sched net_sched: sch_fq: use hash_ptr() 2016-11-17 13:28:56 -05:00
sctp sctp: use new rhlist interface on sctp transport rhashtable 2016-11-16 23:22:17 -05:00
strparser strparser: Propagate correct error code in strp_recv() 2016-10-12 01:51:49 -04:00
sunrpc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-11-15 10:54:36 -05:00
switchdev Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-10-30 12:42:58 -04:00
tipc net: fix sleeping for sk_wait_event() 2016-11-14 13:17:21 -05:00
unix Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-11-15 10:54:36 -05:00
vmw_vsock net: fix sleeping for sk_wait_event() 2016-11-14 13:17:21 -05:00
wimax genetlink: mark families as __ro_after_init 2016-10-27 16:16:09 -04:00
wireless Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-10-30 12:42:58 -04:00
x25 net: x25: remove null checks on arrays calling_ae and called_ae 2016-09-09 18:13:30 -07:00
xfrm Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next 2016-10-28 13:26:27 -04:00
compat.c
Kconfig strparser: Stream parser for messages 2016-08-17 19:36:23 -04:00
Makefile strparser: Stream parser for messages 2016-08-17 19:36:23 -04:00
socket.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-11-15 10:54:36 -05:00
sysctl_net.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2016-10-06 09:52:23 -07:00