linux/net
Eric Dumazet 789f558cfb tcp/dccp: get rid of central timewait timer
Using a timer wheel for timewait sockets was nice ~15 years ago when
memory was expensive and machines had a single processor.

This does not scale, code is ugly and source of huge latencies
(Typically 30 ms have been seen, cpus spinning on death_lock spinlock.)

We can afford to use an extra 64 bytes per timewait sock and spread
timewait load to all cpus to have better behavior.

Tested:

On following test, /proc/sys/net/ipv4/tcp_tw_recycle is set to 1
on the target (lpaa24)

Before patch :

lpaa23:~# ./super_netperf 200 -H lpaa24 -t TCP_CC -l 60 -- -p0,0
419594

lpaa23:~# ./super_netperf 200 -H lpaa24 -t TCP_CC -l 60 -- -p0,0
437171

While test is running, we can observe 25 or even 33 ms latencies.

lpaa24:~# ping -c 1000 -i 0.02 -qn lpaa23
...
1000 packets transmitted, 1000 received, 0% packet loss, time 20601ms
rtt min/avg/max/mdev = 0.020/0.217/25.771/1.535 ms, pipe 2

lpaa24:~# ping -c 1000 -i 0.02 -qn lpaa23
...
1000 packets transmitted, 1000 received, 0% packet loss, time 20702ms
rtt min/avg/max/mdev = 0.019/0.183/33.761/1.441 ms, pipe 2

After patch :

About 90% increase of throughput :

lpaa23:~# ./super_netperf 200 -H lpaa24 -t TCP_CC -l 60 -- -p0,0
810442

lpaa23:~# ./super_netperf 200 -H lpaa24 -t TCP_CC -l 60 -- -p0,0
800992

And latencies are kept to minimal values during this load, even
if network utilization is 90% higher :

lpaa24:~# ping -c 1000 -i 0.02 -qn lpaa23
...
1000 packets transmitted, 1000 received, 0% packet loss, time 19991ms
rtt min/avg/max/mdev = 0.023/0.064/0.360/0.042 ms

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-13 16:40:05 -04:00
..
6lowpan 6lowpan: nhc: add other known rfc6282 compressions 2015-02-14 23:08:44 +01:00
9p Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-03-20 18:51:09 -04:00
802 net: Kill dev_rebuild_header 2015-03-02 16:43:41 -05:00
8021q vlan: implement ndo_get_iflink 2015-04-02 14:05:00 -04:00
appletalk appletalk: Use eth_<foo>_addr instead of memset 2015-03-03 17:01:37 -05:00
atm atm: Use eth_<foo>_addr instead of memset 2015-03-03 17:01:37 -05:00
ax25 ax25: Fix the build when CONFIG_INET is disabled 2015-03-05 13:17:39 -05:00
batman-adv dev: introduce dev_get_iflink() 2015-04-02 14:04:59 -04:00
bluetooth Bluetooth: Read LE remote features during connection establishment 2015-04-09 08:36:54 +03:00
bridge Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next 2015-04-09 14:46:04 -04:00
caif Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-03-20 18:51:09 -04:00
can can: introduce new raw socket option to join the given CAN filters 2015-04-01 11:28:22 +02:00
ceph Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client 2015-02-19 14:14:42 -08:00
core net: use jump label patching for ingress qdisc in __netif_receive_skb_core 2015-04-13 13:34:40 -04:00
dcb net/dcb: Add IEEE QCN attribute 2015-03-06 21:50:02 -05:00
dccp tcp/dccp: get rid of central timewait timer 2015-04-13 16:40:05 -04:00
decnet netfilter: Pass socket pointer down through okfn(). 2015-04-07 15:25:55 -04:00
dns_resolver
dsa Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-04-06 22:34:15 -04:00
ethernet ethernet: Use eth_<foo>_addr instead of memset 2015-03-03 17:01:38 -05:00
hsr net/hsr: Fix NULL pointer dereference and refcnt bugs when deleting a HSR interface. 2015-03-01 13:40:23 -05:00
ieee802154 ieee802154: don't export static symbol 2015-03-14 17:11:31 +01:00
ipv4 tcp/dccp: get rid of central timewait timer 2015-04-13 16:40:05 -04:00
ipv6 tcp/dccp: get rid of central timewait timer 2015-04-13 16:40:05 -04:00
ipx net: Remove iocb argument from sendmsg and recvmsg 2015-03-02 13:06:31 -05:00
irda Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-03-09 23:38:02 -04:00
iucv Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-04-02 16:16:53 -04:00
key xfrm: simplify xfrm_address_t use 2015-03-31 13:58:35 -04:00
l2tp Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-04-06 22:34:15 -04:00
lapb
llc net: Remove iocb argument from sendmsg and recvmsg 2015-03-02 13:06:31 -05:00
mac80211 There isn't much left, but we have 2015-04-12 20:43:46 -04:00
mac802154 mac802154: cleanup concurrent check 2015-03-27 19:18:50 +01:00
mpls mpls: In mpls_egress verify the packet length. 2015-03-12 23:05:04 -04:00
netfilter tcp/dccp: get rid of central timewait timer 2015-04-13 16:40:05 -04:00
netlabel netlink: implement nla_put_in_addr and nla_put_in6_addr 2015-03-31 13:58:35 -04:00
netlink rhashtable: provide len to obj_hashfn 2015-03-25 17:18:33 +01:00
netrom net: Kill dev_rebuild_header 2015-03-02 16:43:41 -05:00
nfc nfc: Fix portid type in urelease_work 2015-04-13 16:35:16 -04:00
openvswitch udp_tunnel: Pass UDP socket down through udp_tunnel{, 6}_xmit_skb(). 2015-04-07 15:29:08 -04:00
packet af_packet: pass checksum validation status to the user 2015-03-23 22:01:28 -04:00
phonet net: Remove iocb argument from sendmsg and recvmsg 2015-03-02 13:06:31 -05:00
rds Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-03-20 18:51:09 -04:00
rfkill Last round of updates for net-next: 2015-02-04 14:57:45 -08:00
rose net: Kill dev_rebuild_header 2015-03-02 16:43:41 -05:00
rxrpc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-03-20 18:51:09 -04:00
sched net: use jump label patching for ingress qdisc in __netif_receive_skb_core 2015-04-13 13:34:40 -04:00
sctp sctp: avoid to repeatedly declare external variables 2015-03-25 11:40:16 -04:00
sunrpc sunrpc: make debugfs file creation failure non-fatal 2015-03-31 14:15:08 -04:00
switchdev switchdev: fix stp update API to work with layered netdevices 2015-03-23 16:44:56 -04:00
tipc udp_tunnel: Pass UDP socket down through udp_tunnel{, 6}_xmit_skb(). 2015-04-07 15:29:08 -04:00
unix net: Remove iocb argument from sendmsg and recvmsg 2015-03-02 13:06:31 -05:00
vmw_vsock net: Remove iocb argument from sendmsg and recvmsg 2015-03-02 13:06:31 -05:00
wimax
wireless cfg80211: don't allow disabling WEXT if it's required 2015-04-08 09:19:29 +02:00
x25 net: Remove iocb argument from sendmsg and recvmsg 2015-03-02 13:06:31 -05:00
xfrm Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next 2015-04-09 14:41:47 -04:00
compat.c net: socket: add support for async operations 2015-03-23 16:41:36 -04:00
Kconfig kconfig: use bool instead of boolean for type definition attributes 2015-01-07 13:08:04 +01:00
Makefile mpls: Refactor how the mpls module is built 2015-03-04 00:26:06 -05:00
socket.c net: socket: add support for async operations 2015-03-23 16:41:36 -04:00
sysctl_net.c