add documents for snmp counters

Add explaination of below counters:
TcpExtTCPRcvCoalesce
TcpExtTCPAutoCorking
TcpExtTCPOrigDataSent
TCPSynRetrans
TCPFastOpenActiveFail
TcpExtListenOverflows
TcpExtListenDrops
TcpExtTCPHystartTrainDetect
TcpExtTCPHystartTrainCwnd
TcpExtTCPHystartDelayDetect
TcpExtTCPHystartDelayCwnd

Signed-off-by: yupeng <yupeng0921@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit is contained in:
yupeng 2018-11-25 23:35:46 -08:00 committed by David S. Miller
parent 50853808ff
commit 712ee16c23

View File

@ -220,6 +220,68 @@ Defined in `RFC1213 tcpPassiveOpens`_
It means the TCP layer receives a SYN, replies a SYN+ACK, come into
the SYN-RCVD state.
* TcpExtTCPRcvCoalesce
When packets are received by the TCP layer and are not be read by the
application, the TCP layer will try to merge them. This counter
indicate how many packets are merged in such situation. If GRO is
enabled, lots of packets would be merged by GRO, these packets
wouldn't be counted to TcpExtTCPRcvCoalesce.
* TcpExtTCPAutoCorking
When sending packets, the TCP layer will try to merge small packets to
a bigger one. This counter increase 1 for every packet merged in such
situation. Please refer to the LWN article for more details:
https://lwn.net/Articles/576263/
* TcpExtTCPOrigDataSent
This counter is explained by `kernel commit f19c29e3e391`_, I pasted the
explaination below::
TCPOrigDataSent: number of outgoing packets with original data (excluding
retransmission but including data-in-SYN). This counter is different from
TcpOutSegs because TcpOutSegs also tracks pure ACKs. TCPOrigDataSent is
more useful to track the TCP retransmission rate.
* TCPSynRetrans
This counter is explained by `kernel commit f19c29e3e391`_, I pasted the
explaination below::
TCPSynRetrans: number of SYN and SYN/ACK retransmits to break down
retransmissions into SYN, fast-retransmits, timeout retransmits, etc.
* TCPFastOpenActiveFail
This counter is explained by `kernel commit f19c29e3e391`_, I pasted the
explaination below::
TCPFastOpenActiveFail: Fast Open attempts (SYN/data) failed because
the remote does not accept it or the attempts timed out.
.. _kernel commit f19c29e3e391: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f19c29e3e391a66a273e9afebaf01917245148cd
* TcpExtListenOverflows and TcpExtListenDrops
When kernel receives a SYN from a client, and if the TCP accept queue
is full, kernel will drop the SYN and add 1 to TcpExtListenOverflows.
At the same time kernel will also add 1 to TcpExtListenDrops. When a
TCP socket is in LISTEN state, and kernel need to drop a packet,
kernel would always add 1 to TcpExtListenDrops. So increase
TcpExtListenOverflows would let TcpExtListenDrops increasing at the
same time, but TcpExtListenDrops would also increase without
TcpExtListenOverflows increasing, e.g. a memory allocation fail would
also let TcpExtListenDrops increase.
Note: The above explanation is based on kernel 4.10 or above version, on
an old kernel, the TCP stack has different behavior when TCP accept
queue is full. On the old kernel, TCP stack won't drop the SYN, it
would complete the 3-way handshake. As the accept queue is full, TCP
stack will keep the socket in the TCP half-open queue. As it is in the
half open queue, TCP stack will send SYN+ACK on an exponential backoff
timer, after client replies ACK, TCP stack checks whether the accept
queue is still full, if it is not full, moves the socket to the accept
queue, if it is full, keeps the socket in the half-open queue, at next
time client replies ACK, this socket will get another chance to move
to the accept queue.
TCP Fast Open
============
When kernel receives a TCP packet, it has two paths to handler the
@ -331,6 +393,38 @@ TcpExtTCPAbortFailed will be increased.
.. _RFC2525 2.17 section: https://tools.ietf.org/html/rfc2525#page-50
TCP Hybrid Slow Start
====================
The Hybrid Slow Start algorithm is an enhancement of the traditional
TCP congestion window Slow Start algorithm. It uses two pieces of
information to detect whether the max bandwidth of the TCP path is
approached. The two pieces of information are ACK train length and
increase in packet delay. For detail information, please refer the
`Hybrid Slow Start paper`_. Either ACK train length or packet delay
hits a specific threshold, the congestion control algorithm will come
into the Congestion Avoidance state. Until v4.20, two congestion
control algorithms are using Hybrid Slow Start, they are cubic (the
default congestion control algorithm) and cdg. Four snmp counters
relate with the Hybrid Slow Start algorithm.
.. _Hybrid Slow Start paper: https://pdfs.semanticscholar.org/25e9/ef3f03315782c7f1cbcd31b587857adae7d1.pdf
* TcpExtTCPHystartTrainDetect
How many times the ACK train length threshold is detected
* TcpExtTCPHystartTrainCwnd
The sum of CWND detected by ACK train length. Dividing this value by
TcpExtTCPHystartTrainDetect is the average CWND which detected by the
ACK train length.
* TcpExtTCPHystartDelayDetect
How many times the packet delay threshold is detected.
* TcpExtTCPHystartDelayCwnd
The sum of CWND detected by packet delay. Dividing this value by
TcpExtTCPHystartDelayDetect is the average CWND which detected by the
packet delay.
examples
=======
@ -743,3 +837,111 @@ After run client_linger.py, check the output of nstat::
nstatuser@nstat-a:~$ nstat | grep -i abort
TcpExtTCPAbortOnLinger 1 0.0
TcpExtTCPRcvCoalesce
-------------------
On the server, we run a program which listen on TCP port 9000, but
doesn't read any data::
import socket
import time
port = 9000
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind(('0.0.0.0', port))
s.listen(1)
sock, addr = s.accept()
while True:
time.sleep(9999999)
Save the above code as server_coalesce.py, and run::
python3 server_coalesce.py
On the client, save below code as client_coalesce.py::
import socket
server = 'nstat-b'
port = 9000
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((server, port))
Run::
nstatuser@nstat-a:~$ python3 -i client_coalesce.py
We use '-i' to come into the interactive mode, then a packet::
>>> s.send(b'foo')
3
Send a packet again::
>>> s.send(b'bar')
3
On the server, run nstat::
ubuntu@nstat-b:~$ nstat
#kernel
IpInReceives 2 0.0
IpInDelivers 2 0.0
IpOutRequests 2 0.0
TcpInSegs 2 0.0
TcpOutSegs 2 0.0
TcpExtTCPRcvCoalesce 1 0.0
IpExtInOctets 110 0.0
IpExtOutOctets 104 0.0
IpExtInNoECTPkts 2 0.0
The client sent two packets, server didn't read any data. When
the second packet arrived at server, the first packet was still in
the receiving queue. So the TCP layer merged the two packets, and we
could find the TcpExtTCPRcvCoalesce increased 1.
TcpExtListenOverflows and TcpExtListenDrops
----------------------------------------
On server, run the nc command, listen on port 9000::
nstatuser@nstat-b:~$ nc -lkv 0.0.0.0 9000
Listening on [0.0.0.0] (family 0, port 9000)
On client, run 3 nc commands in different terminals::
nstatuser@nstat-a:~$ nc -v nstat-b 9000
Connection to nstat-b 9000 port [tcp/*] succeeded!
The nc command only accepts 1 connection, and the accept queue length
is 1. On current linux implementation, set queue length to n means the
actual queue length is n+1. Now we create 3 connections, 1 is accepted
by nc, 2 in accepted queue, so the accept queue is full.
Before running the 4th nc, we clean the nstat history on the server::
nstatuser@nstat-b:~$ nstat -n
Run the 4th nc on the client::
nstatuser@nstat-a:~$ nc -v nstat-b 9000
If the nc server is running on kernel 4.10 or higher version, you
won't see the "Connection to ... succeeded!" string, because kernel
will drop the SYN if the accept queue is full. If the nc client is running
on an old kernel, you would see that the connection is succeeded,
because kernel would complete the 3 way handshake and keep the socket
on half open queue. I did the test on kernel 4.15. Below is the nstat
on the server::
nstatuser@nstat-b:~$ nstat
#kernel
IpInReceives 4 0.0
IpInDelivers 4 0.0
TcpInSegs 4 0.0
TcpExtListenOverflows 4 0.0
TcpExtListenDrops 4 0.0
IpExtInOctets 240 0.0
IpExtInNoECTPkts 4 0.0
Both TcpExtListenOverflows and TcpExtListenDrops were 4. If the time
between the 4th nc and the nstat was longer, the value of
TcpExtListenOverflows and TcpExtListenDrops would be larger, because
the SYN of the 4th nc was dropped, the client was retrying.