linux/include
Eric Dumazet 605ad7f184 tcp: refine TSO autosizing
Commit 95bd09eb27 ("tcp: TSO packets automatic sizing") tried to
control TSO size, but did this at the wrong place (sendmsg() time).

At sendmsg() time, we might have a pessimistic view of flow rate,
and we end up building very small skbs (with 2 MSS per skb).

This is bad because:

 - It sends small TSO packets even in Slow Start where rate quickly
   increases.
 - It tends to make socket write queue very big, increasing tcp_ack()
   processing time, but also increasing memory needs, not necessarily
   accounted for, as fast clones overhead is currently ignored.
 - It lowers GRO efficiency and generates more ACK packets.

Servers with a lot of short-lived connections suffer from this.

Let's instead fill skbs as much as possible (64KB of payload), but split
them at xmit time, when we have a precise idea of the flow rate.
skb split is actually quite efficient.

The patch looks bigger than necessary because the TCP Small Queues
decision now has to take place after the possible split.

As Neal suggested, introduce a new tcp_tso_autosize() helper, so that
tcp_tso_should_defer() can be aligned on the same goal.
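
A minimal userspace sketch of the kind of computation such a helper
could perform at xmit time. This is not the kernel code: struct
flow_state, its fields and tso_autosize_segs() are hypothetical names,
and the "about 1 ms of data at the pacing rate" target is an assumption
carried over from the earlier autosizing work, not something stated in
this changelog.

```c
#include <stdint.h>

/* Hypothetical stand-in for per-socket state (names are assumptions). */
struct flow_state {
	uint64_t pacing_rate;   /* estimated flow rate, bytes per second */
	uint32_t gso_max_size;  /* largest GSO packet the device accepts, bytes */
	uint32_t gso_max_segs;  /* largest segment count per GSO packet */
	uint32_t min_tso_segs;  /* floor on segments per packet */
	uint32_t max_header;    /* worst-case header room, bytes */
};

static uint32_t min_u32(uint32_t a, uint32_t b) { return a < b ? a : b; }
static uint32_t max_u32(uint32_t a, uint32_t b) { return a > b ? a : b; }

/*
 * Number of MSS-sized segments to put in one packet, computed at
 * transmit time from the current rate estimate.
 */
static uint32_t tso_autosize_segs(const struct flow_state *f, uint32_t mss)
{
	/* rate >> 10 approximates the bytes sent in one millisecond */
	uint64_t per_ms = f->pacing_rate >> 10;
	uint32_t cap = f->gso_max_size - 1 - f->max_header;
	uint32_t bytes = per_ms < cap ? (uint32_t)per_ms : cap;

	/* Never go below the configured floor, never above the device limit. */
	uint32_t segs = max_u32(bytes / mss, f->min_tso_segs);

	return min_u32(segs, f->gso_max_segs);
}
```

With these assumptions, a slow flow is clamped to the floor and a fast
flow to the device limit. Because the value is computed when the packet
is sent, it follows the rate estimate as it grows during Slow Start.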

Rename tp->xmit_size_goal_segs to tp->gso_segs, as this variable
contains the number of MSS-sized segments that we can put in a GSO
packet, and is not related to the autosizing goal anymore.

Tested:

40 ms RTT link

nstat >/dev/null
netperf -H remote -l -2000000 -- -s 1000000
nstat | egrep "IpInReceives|IpOutRequests|TcpOutSegs|IpExtOutOctets"

Before patch:

Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380 2000000 2000000    0.36         44.22
IpInReceives                    600                0.0
IpOutRequests                   599                0.0
TcpOutSegs                      1397               0.0
IpExtOutOctets                  2033249            0.0

After patch:

Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380 2000000 2000000    0.36       44.27
IpInReceives                    221                0.0
IpOutRequests                   232                0.0
TcpOutSegs                      1397               0.0
IpExtOutOctets                  2013953            0.0

Throughput and TcpOutSegs are essentially unchanged, but IpOutRequests
drops from 599 to 232, so each packet handed to IP carries about 6
segments on average instead of about 2.3, and IpInReceives drops from
600 to 221.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-09 16:39:22 -05:00
..
acpi ACPI and power management updates for 3.18-rc2 2014-10-24 11:29:31 -07:00
asm-generic Revert "fast_hash: avoid indirect function calls" 2014-11-14 16:36:25 -05:00
clocksource
crypto crypto: LLVMLinux: Add macro to remove use of VLAIS in crypto code 2014-10-14 10:51:22 +02:00
drm drm/radeon: remove invalid pci id 2014-10-28 10:44:36 -04:00
dt-bindings The fixes for the clock framework are all regressions in drivers, plus a 2014-11-25 17:52:56 -08:00
keys
kvm arm/arm64: KVM: Fix BE accesses to GICv2 EISR and ELRSR regs 2014-10-16 10:57:41 +02:00
linux tcp: refine TSO autosizing 2014-12-09 16:39:22 -05:00
math-emu
media Merge branch 'patchwork' into v4l_for_linus 2014-10-09 14:00:54 -03:00
memory
misc
net net: sched: cls: remove unused op put from tcf_proto_ops 2014-12-09 14:49:02 -05:00
pcmcia
ras
rdma
rxrpc
scsi scsi: set REQ_QUEUE for the blk-mq case 2014-10-28 09:53:43 +01:00
soc/tegra
sound ALSA: pcm: Add big-endian DSD sample formats and fix XMOS DSD sample format 2014-11-21 15:13:28 +01:00
target
trace Merge branch 'urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/urgent 2014-10-30 07:37:37 +01:00
uapi tcp_cubic: add SNMP counters to track how effective is Hystart 2014-12-09 14:58:23 -05:00
video fbdev changes for 3.18 2014-10-18 18:03:02 -07:00
xen
Kbuild