linux/net/ipv4
Daniel Borkmann 81164413ad net: tcp: add per route congestion control
This work adds the possibility to define a per route/destination
congestion control algorithm. Generally, this opens up the possibility
for a machine with different links to enforce specific congestion
control algorithms with optimal strategies for each of them based
on their network characteristics, even transparently for a single
application listening on all links.

For our specific use case, this additionally facilitates deployment
of DCTCP, for example, applications can easily serve internal
traffic/dsts in DCTCP and external one with CUBIC. Other scenarios
would also allow for utilizing e.g. long living, low priority
background flows for certain destinations/routes while still being
able for normal traffic to utilize the default congestion control
algorithm. We also thought about a per netns setting (where different
defaults are possible), but given its actually a link specific
property, we argue that a per route/destination setting is the most
natural and flexible.

The administrator can utilize this through ip-route(8) by appending
"congctl [lock] <name>", where <name> denotes the name of a
congestion control algorithm and the optional lock parameter allows
to enforce the given algorithm so that applications in user space
would not be allowed to overwrite that algorithm for that destination.

The dst metric lookups are being done when a dst entry is already
available in order to avoid a costly lookup and still before the
algorithms are being initialized, thus overhead is very low when the
feature is not being used. While the client side would need to drop
the current reference on the module, on server side this can actually
even be avoided as we just got a flat-copied socket clone.

Joint work with Florian Westphal.

Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-05 22:55:24 -05:00
..
netfilter Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2014-12-11 14:27:06 -08:00
af_inet.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-11-29 20:47:48 -08:00
ah4.c ipsec: Remove obsolete MAX_AH_AUTH_LEN 2014-09-18 10:54:36 +02:00
arp.c neigh: remove dynamic neigh table registration support 2014-11-11 15:23:54 -05:00
cipso_ipv4.c cipso: remove NULL assignment on static 2014-11-04 15:09:52 -05:00
datagram.c net: Save TX flow hash in sock and set in skbuf on xmit 2014-07-07 21:14:21 -07:00
devinet.c ipv4: fail early when creating netdev named all or default 2014-07-29 11:43:50 -07:00
esp4.c net: esp: Convert NETDEBUG to pr_info 2014-11-06 15:11:10 -05:00
fib_frontend.c fib_trie: Push rcu_read_lock/unlock to callers 2014-12-31 18:25:54 -05:00
fib_lookup.h ipv4: make fib_detect_death static 2013-12-28 17:01:46 -05:00
fib_rules.c fib_trie: Push rcu_read_lock/unlock to callers 2014-12-31 18:25:54 -05:00
fib_semantics.c net: tcp: add RTAX_CC_ALGO fib handling 2015-01-05 22:55:24 -05:00
fib_trie.c fib_trie: Add tracking value for suffix length 2014-12-31 18:25:55 -05:00
fou.c ip: Move checksum convert defines to inet 2015-01-05 22:44:46 -05:00
geneve.c geneve: Check family when reusing sockets. 2015-01-04 22:21:33 -05:00
gre_demux.c net: Fix GRE RX to use skb_transport_header for GRE header offset 2014-09-08 15:23:05 -07:00
gre_offload.c gre: Set inner mac header in gro complete 2014-12-05 21:18:34 -08:00
icmp.c icmp: Remove some spurious dropped packet profile hits from the ICMP path 2014-11-18 15:28:28 -05:00
igmp.c ipv6: mld: fix add_grhead skb_over_panic for devs with large MTUs 2014-11-16 16:55:06 -05:00
inet_connection_sock.c ipv4: make ip_local_reserved_ports per netns 2014-05-14 15:31:45 -04:00
inet_diag.c inet_diag: fix inet_diag_dump_icsk() to use correct state for timewait sockets 2014-01-13 22:35:46 -08:00
inet_fragment.c net: Convert LIMIT_NETDEBUG to net_dbg_ratelimited 2014-11-11 14:10:31 -05:00
inet_hashtables.c net: use reciprocal_scale() helper 2014-08-23 12:21:21 -07:00
inet_lro.c lro: remove dead code 2013-12-29 16:34:25 -05:00
inet_timewait_sock.c
inetpeer.c inet: remove dead inetpeer sequence code 2014-09-08 16:42:42 -07:00
ip_forward.c net: rename local_df to ignore_df 2014-05-12 14:03:41 -04:00
ip_fragment.c net: Convert LIMIT_NETDEBUG to net_dbg_ratelimited 2014-11-11 14:10:31 -05:00
ip_gre.c gre: allow live address change 2014-12-31 14:18:28 -05:00
ip_input.c net: Fix memory leak if TPROXY used with TCP early demux 2014-01-27 16:22:11 -08:00
ip_options.c ipv4: rename ip_options_echo to __ip_options_echo() 2014-09-28 16:35:42 -04:00
ip_output.c put iov_iter into msghdr 2014-12-09 16:29:03 -05:00
ip_sockglue.c ip: Add offset parameter to ip_cmsg_recv 2015-01-05 22:44:46 -05:00
ip_tunnel_core.c ipv4: fix a potential use after free in ip_tunnel_core.c 2014-10-17 23:45:26 -04:00
ip_tunnel.c ip_tunnel: Add missing validation of encap type to ip_tunnel_encap_setup() 2014-12-16 15:20:41 -05:00
ip_vti.c ip_tunnel: the lack of vti_link_ops' dellink() cause kernel panic 2014-11-23 21:11:17 -05:00
ipcomp.c ipcomp4: Use the IPsec protocol multiplexer API 2014-02-25 07:04:17 +01:00
ipconfig.c ipv4: remove 0/NULL assignment on static 2014-11-04 15:09:52 -05:00
ipip.c fou: Fix typo in returning flags in netlink 2014-11-05 22:18:20 -05:00
ipmr.c net: set name_assign_type in alloc_netdev() 2014-07-15 16:12:48 -07:00
Kconfig net: Move fou_build_header into fou.c and refactor 2014-11-05 16:30:02 -05:00
Makefile net: Add Geneve tunneling protocol driver 2014-10-06 00:32:20 -04:00
netfilter.c netfilter: remove double colon 2014-02-19 11:41:25 +01:00
ping.c put iov_iter into msghdr 2014-12-09 16:29:03 -05:00
proc.c tcp_cubic: add SNMP counters to track how effective is Hystart 2014-12-09 14:58:23 -05:00
protocol.c net: Export inet_offloads and inet6_offloads 2014-09-19 17:15:31 -04:00
raw.c put iov_iter into msghdr 2014-12-09 16:29:03 -05:00
route.c ipv4: Do not cache routing failures due to disabled forwarding. 2014-10-30 19:20:40 -04:00
syncookies.c net: allow setting ecn via routing table 2014-11-04 16:06:09 -05:00
sysctl_net_ipv4.c tcp: allow for bigger reordering level 2014-10-29 15:05:15 -04:00
tcp_bic.c tcp: whitespace fixes 2014-09-01 18:12:45 -07:00
tcp_cong.c net: tcp: add key management to congestion control 2015-01-05 22:55:24 -05:00
tcp_cubic.c tcp_cubic: refine Hystart delay threshold 2014-12-09 14:58:23 -05:00
tcp_dctcp.c net: tcp: add DCTCP congestion control algorithm 2014-09-29 00:13:10 -04:00
tcp_diag.c tcp: whitespace fixes 2014-09-01 18:12:45 -07:00
tcp_fastopen.c tcp: remove unnecessary assignment. 2014-09-29 12:31:12 -04:00
tcp_highspeed.c tcp: whitespace fixes 2014-09-01 18:12:45 -07:00
tcp_htcp.c tcp: whitespace fixes 2014-09-01 18:12:45 -07:00
tcp_hybla.c tcp: whitespace fixes 2014-09-01 18:12:45 -07:00
tcp_illinois.c tcp: whitespace fixes 2014-09-01 18:12:45 -07:00
tcp_input.c switch tcp_sock->ucopy from iovec (ucopy.iov) to msghdr (ucopy.msg) 2014-12-09 16:28:22 -05:00
tcp_ipv4.c net: tcp: add per route congestion control 2015-01-05 22:55:24 -05:00
tcp_lp.c tcp: remove in_flight parameter from cong_avoid() methods 2014-05-03 19:23:07 -04:00
tcp_memcontrol.c mm: memcontrol: lockless page counters 2014-12-10 17:41:04 -08:00
tcp_metrics.c tcp: don't allow syn packets without timestamps to pass tcp_tw_recycle logic 2014-08-14 14:38:54 -07:00
tcp_minisocks.c net: tcp: add per route congestion control 2015-01-05 22:55:24 -05:00
tcp_offload.c net: Remove MPLS GSO feature. 2014-11-05 23:52:33 -08:00
tcp_output.c net: tcp: add per route congestion control 2015-01-05 22:55:24 -05:00
tcp_probe.c tcp: whitespace fixes 2014-09-01 18:12:45 -07:00
tcp_scalable.c tcp: whitespace fixes 2014-09-01 18:12:45 -07:00
tcp_timer.c net: Convert LIMIT_NETDEBUG to net_dbg_ratelimited 2014-11-11 14:10:31 -05:00
tcp_vegas.c tcp: whitespace fixes 2014-09-01 18:12:45 -07:00
tcp_vegas.h net: ipv4/ipv6: Remove extern from function prototypes 2013-10-19 19:12:11 -04:00
tcp_veno.c tcp: whitespace fixes 2014-09-01 18:12:45 -07:00
tcp_westwood.c net: tcp: split ack slow/fast events from cwnd_event 2014-09-29 00:13:10 -04:00
tcp_yeah.c tcp: whitespace fixes 2014-09-01 18:12:45 -07:00
tcp.c Merge branch 'for-davem-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2014-12-10 13:17:23 -05:00
tunnel4.c
udp_diag.c
udp_impl.h net: ipv4/ipv6: Remove extern from function prototypes 2013-10-19 19:12:11 -04:00
udp_offload.c net: Remove MPLS GSO feature. 2014-11-05 23:52:33 -08:00
udp_tunnel.c ip: Move checksum convert defines to inet 2015-01-05 22:44:46 -05:00
udp.c ip: Add offset parameter to ip_cmsg_recv 2015-01-05 22:44:46 -05:00
udplite.c net: Eliminate no_check from protosw 2014-05-23 16:28:53 -04:00
xfrm4_input.c xfrm4: Add IPsec protocol multiplexer 2014-02-25 07:04:16 +01:00
xfrm4_mode_beet.c ipv4: ERROR: code indent should use tabs where possible 2013-12-26 13:43:21 -05:00
xfrm4_mode_transport.c
xfrm4_mode_tunnel.c inetpeer: get rid of ip_id_count 2014-06-02 11:00:41 -07:00
xfrm4_output.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-05-24 00:32:30 -04:00
xfrm4_policy.c xfrm: Introduce xfrm_input_afinfo to access the the callbacks properly 2014-03-14 07:28:07 +01:00
xfrm4_protocol.c xfrm4: Remove duplicate semicolon 2014-06-30 07:49:47 +02:00
xfrm4_state.c inet: make no_pmtu_disc per namespace and kill ipv4_config 2013-12-18 16:58:20 -05:00
xfrm4_tunnel.c