add documents for snmp counters
Add explanations of the counters below:
TcpExtTCPRcvCoalesce
TcpExtTCPAutoCorking
TcpExtTCPOrigDataSent
TCPSynRetrans
TCPFastOpenActiveFail
TcpExtListenOverflows
TcpExtListenDrops
TcpExtTCPHystartTrainDetect
TcpExtTCPHystartTrainCwnd
TcpExtTCPHystartDelayDetect
TcpExtTCPHystartDelayCwnd

Signed-off-by: yupeng <yupeng0921@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
parent 50853808ff
commit 712ee16c23
@ -220,6 +220,68 @@ Defined in `RFC1213 tcpPassiveOpens`_
It means the TCP layer receives a SYN, replies with a SYN+ACK, and
enters the SYN-RCVD state.

* TcpExtTCPRcvCoalesce

When packets are received by the TCP layer but have not yet been read
by the application, the TCP layer tries to merge them. This counter
indicates how many packets are merged in this situation. If GRO is
enabled, many packets are merged by GRO first; those packets are not
counted in TcpExtTCPRcvCoalesce.

* TcpExtTCPAutoCorking

When sending packets, the TCP layer tries to merge small packets into
a bigger one. This counter increases by 1 for every packet merged in
this situation. Please refer to the LWN article for more details:
https://lwn.net/Articles/576263/

* TcpExtTCPOrigDataSent

This counter is explained by `kernel commit f19c29e3e391`_. The
explanation is quoted below::

  TCPOrigDataSent: number of outgoing packets with original data (excluding
  retransmission but including data-in-SYN). This counter is different from
  TcpOutSegs because TcpOutSegs also tracks pure ACKs. TCPOrigDataSent is
  more useful to track the TCP retransmission rate.

* TCPSynRetrans

This counter is explained by `kernel commit f19c29e3e391`_. The
explanation is quoted below::

  TCPSynRetrans: number of SYN and SYN/ACK retransmits to break down
  retransmissions into SYN, fast-retransmits, timeout retransmits, etc.

* TCPFastOpenActiveFail

This counter is explained by `kernel commit f19c29e3e391`_. The
explanation is quoted below::

  TCPFastOpenActiveFail: Fast Open attempts (SYN/data) failed because
  the remote does not accept it or the attempts timed out.

.. _kernel commit f19c29e3e391: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f19c29e3e391a66a273e9afebaf01917245148cd

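The quoted commit message says TCPOrigDataSent is useful for tracking
the TCP retransmission rate. A minimal sketch of that calculation
follows. Using TcpRetransSegs as the numerator is an assumption of this
sketch, and the sample values are hypothetical, not measurements.

```python
def retransmission_rate(retrans_segs: int, orig_data_sent: int) -> float:
    """Return retransmitted segments per original data segment."""
    if orig_data_sent == 0:
        # No original data sent yet, so the rate is undefined; report 0.
        return 0.0
    return retrans_segs / orig_data_sent


# Hypothetical counter values, as they might be read from nstat.
sample = {"TcpRetransSegs": 25, "TcpExtTCPOrigDataSent": 1000}

rate = retransmission_rate(sample["TcpRetransSegs"],
                           sample["TcpExtTCPOrigDataSent"])
print(f"retransmission rate: {rate:.2%}")  # prints: retransmission rate: 2.50%
```

Dividing by TcpOutSegs instead would understate the rate, because
TcpOutSegs also counts pure ACKs.
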
* TcpExtListenOverflows and TcpExtListenDrops

When the kernel receives a SYN from a client and the TCP accept queue
is full, the kernel drops the SYN and adds 1 to TcpExtListenOverflows.
At the same time, the kernel also adds 1 to TcpExtListenDrops. Whenever
a TCP socket is in the LISTEN state and the kernel needs to drop a
packet, the kernel adds 1 to TcpExtListenDrops. So an increase in
TcpExtListenOverflows always causes TcpExtListenDrops to increase as
well, but TcpExtListenDrops can also increase without
TcpExtListenOverflows increasing, e.g. a memory allocation failure
also increments TcpExtListenDrops.

Note: The above explanation is based on kernel 4.10 or later. On an
older kernel, the TCP stack behaves differently when the TCP accept
queue is full. The old kernel doesn't drop the SYN; it completes the
3-way handshake. As the accept queue is full, the TCP stack keeps the
socket in the TCP half-open queue. While the socket is in the half-open
queue, the TCP stack sends SYN+ACK on an exponential backoff timer.
After the client replies with an ACK, the TCP stack checks whether the
accept queue is still full. If it is not full, the stack moves the
socket to the accept queue. If it is full, the stack keeps the socket
in the half-open queue, and the next time the client replies with an
ACK, the socket gets another chance to move to the accept queue.


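Because every overflow also counts as a drop, the difference between
the two counters gives the listen drops that had some other cause. The
sketch below assumes the paired header/value line layout used by
/proc/net/netstat and parses a hypothetical sample string rather than
the real file; the helper names are illustrative only.

```python
def parse_netstat(text: str) -> dict:
    """Parse /proc/net/netstat style text into {"TcpExtListenDrops": 5, ...}."""
    counters = {}
    lines = text.strip().splitlines()
    # Lines come in pairs: a header line with counter names followed by a
    # line with the values, both starting with the same "Prefix:" label.
    for header, values in zip(lines[0::2], lines[1::2]):
        prefix, names = header.split(":", 1)
        _, numbers = values.split(":", 1)
        for name, number in zip(names.split(), numbers.split()):
            counters[prefix + name] = int(number)
    return counters


def non_overflow_listen_drops(counters: dict) -> int:
    """Listen drops not explained by a full accept queue."""
    return (counters.get("TcpExtListenDrops", 0)
            - counters.get("TcpExtListenOverflows", 0))


# Hypothetical sample; real files contain many more columns.
SAMPLE = """\
TcpExt: SyncookiesSent ListenOverflows ListenDrops
TcpExt: 0 4 5
"""

counters = parse_netstat(SAMPLE)
print(non_overflow_listen_drops(counters))  # prints: 1
```

On a Linux host, the same function could be fed the contents of
/proc/net/netstat instead of the sample string.
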
TCP Fast Open
=============
When kernel receives a TCP packet, it has two paths to handle the
@ -331,6 +393,38 @@ TcpExtTCPAbortFailed will be increased.

.. _RFC2525 2.17 section: https://tools.ietf.org/html/rfc2525#page-50

TCP Hybrid Slow Start
=====================
The Hybrid Slow Start algorithm is an enhancement of the traditional
TCP congestion window Slow Start algorithm. It uses two pieces of
information to detect whether the maximum bandwidth of the TCP path is
being approached: the ACK train length and the increase in packet
delay. For detailed information, please refer to the
`Hybrid Slow Start paper`_. When either the ACK train length or the
packet delay reaches a specific threshold, the congestion control
algorithm enters the Congestion Avoidance state. As of v4.20, two
congestion control algorithms use Hybrid Slow Start: cubic (the
default congestion control algorithm) and cdg. Four SNMP counters
relate to the Hybrid Slow Start algorithm.

.. _Hybrid Slow Start paper: https://pdfs.semanticscholar.org/25e9/ef3f03315782c7f1cbcd31b587857adae7d1.pdf

* TcpExtTCPHystartTrainDetect

How many times the ACK train length threshold is detected.

* TcpExtTCPHystartTrainCwnd

The sum of the CWND values at the moments the ACK train length
threshold was detected. Dividing this value by
TcpExtTCPHystartTrainDetect gives the average CWND at which the ACK
train length threshold was detected.

* TcpExtTCPHystartDelayDetect

How many times the packet delay threshold is detected.

* TcpExtTCPHystartDelayCwnd

The sum of the CWND values at the moments the packet delay threshold
was detected. Dividing this value by TcpExtTCPHystartDelayDetect gives
the average CWND at which the packet delay threshold was detected.

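The two divisions described above can be sketched as follows. The
counter values are hypothetical, and the helper name is illustrative
only.

```python
def average_cwnd(cwnd_sum: int, detect_count: int):
    """Average CWND at detection time, or None if nothing was detected."""
    if detect_count == 0:
        return None
    return cwnd_sum / detect_count


# Hypothetical counter values, as they might be read from nstat.
sample = {
    "TcpExtTCPHystartTrainDetect": 4,
    "TcpExtTCPHystartTrainCwnd": 200,
    "TcpExtTCPHystartDelayDetect": 0,
    "TcpExtTCPHystartDelayCwnd": 0,
}

train_avg = average_cwnd(sample["TcpExtTCPHystartTrainCwnd"],
                         sample["TcpExtTCPHystartTrainDetect"])
delay_avg = average_cwnd(sample["TcpExtTCPHystartDelayCwnd"],
                         sample["TcpExtTCPHystartDelayDetect"])
print(train_avg, delay_avg)  # prints: 50.0 None
```

Returning None for a zero detect count avoids a division by zero on
hosts where one of the two detectors has never fired.
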
examples
========

@ -743,3 +837,111 @@ After run client_linger.py, check the output of nstat::

  nstatuser@nstat-a:~$ nstat | grep -i abort
  TcpExtTCPAbortOnLinger          1                  0.0

TcpExtTCPRcvCoalesce
--------------------
On the server, we run a program which listens on TCP port 9000 but
doesn't read any data::

  import socket
  import time
  port = 9000
  s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
  s.bind(('0.0.0.0', port))
  s.listen(1)
  sock, addr = s.accept()
  while True:
      time.sleep(9999999)

Save the above code as server_coalesce.py, and run::

  python3 server_coalesce.py

On the client, save the code below as client_coalesce.py::

  import socket
  server = 'nstat-b'
  port = 9000
  s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
  s.connect((server, port))

Run::

  nstatuser@nstat-a:~$ python3 -i client_coalesce.py

We use '-i' to enter interactive mode, then send a packet::

  >>> s.send(b'foo')
  3

Send a packet again::

  >>> s.send(b'bar')
  3

On the server, run nstat::

  ubuntu@nstat-b:~$ nstat
  #kernel
  IpInReceives                    2                  0.0
  IpInDelivers                    2                  0.0
  IpOutRequests                   2                  0.0
  TcpInSegs                       2                  0.0
  TcpOutSegs                      2                  0.0
  TcpExtTCPRcvCoalesce            1                  0.0
  IpExtInOctets                   110                0.0
  IpExtOutOctets                  104                0.0
  IpExtInNoECTPkts                2                  0.0

The client sent two packets, and the server didn't read any data. When
the second packet arrived at the server, the first packet was still in
the receive queue. So the TCP layer merged the two packets, and we can
see that TcpExtTCPRcvCoalesce increased by 1.

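When scripting such a check, the nstat output can be turned into a
dictionary. The sketch below assumes the three-column layout shown in
the output above (name, value, rate) and parses a pasted copy of that
output; the helper name is illustrative only.

```python
def parse_nstat(output: str) -> dict:
    """Map counter names to integer values from nstat's text output."""
    counters = {}
    for line in output.splitlines():
        fields = line.split()
        # Skip the shell prompt, the "#kernel" header and blank lines;
        # counter lines have exactly three columns.
        if len(fields) != 3 or line.lstrip().startswith("#"):
            continue
        name, value, _rate = fields
        counters[name] = int(value)
    return counters


# Copy of the server-side output from the example above.
NSTAT_OUTPUT = """\
ubuntu@nstat-b:~$ nstat
#kernel
IpInReceives                    2                  0.0
IpInDelivers                    2                  0.0
IpOutRequests                   2                  0.0
TcpInSegs                       2                  0.0
TcpOutSegs                      2                  0.0
TcpExtTCPRcvCoalesce            1                  0.0
IpExtInOctets                   110                0.0
IpExtOutOctets                  104                0.0
IpExtInNoECTPkts                2                  0.0
"""

counters = parse_nstat(NSTAT_OUTPUT)
print(counters["TcpExtTCPRcvCoalesce"])  # prints: 1
```
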
TcpExtListenOverflows and TcpExtListenDrops
-------------------------------------------
On the server, run the nc command, listening on port 9000::

  nstatuser@nstat-b:~$ nc -lkv 0.0.0.0 9000
  Listening on [0.0.0.0] (family 0, port 9000)

On the client, run 3 nc commands in different terminals::

  nstatuser@nstat-a:~$ nc -v nstat-b 9000
  Connection to nstat-b 9000 port [tcp/*] succeeded!

The nc command only accepts 1 connection, and the accept queue length
is 1. In the current Linux implementation, setting the queue length to
n means the actual queue length is n+1. Now we create 3 connections:
1 is accepted by nc and 2 are in the accept queue, so the accept queue
is full.

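The arithmetic above can be modelled in a few lines. This is a
simplified sketch that assumes only the "n means n+1" rule stated in
the text. It is not the kernel's code, and the function names are
illustrative.

```python
def accept_queue_is_full(queued: int, backlog: int) -> bool:
    # Assumption taken from the text: a backlog of n holds n+1 entries,
    # so the queue is full only once it holds more than n connections.
    return queued > backlog


def simulate(connections: int, backlog: int, accepted_by_app: int) -> list:
    """Classify each incoming connection in arrival order."""
    queued = 0
    results = []
    for i in range(connections):
        if i < accepted_by_app:
            # Simplification: treat connections the application has
            # already accepted as never occupying the queue.
            results.append("accepted")
        elif not accept_queue_is_full(queued, backlog):
            queued += 1
            results.append("queued")
        else:
            results.append("dropped")
    return results


print(simulate(4, backlog=1, accepted_by_app=1))
# prints: ['accepted', 'queued', 'queued', 'dropped']
```

With a backlog of 1 and one connection held by nc, the fourth
connection is the first one that finds the queue full.
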
Before running the 4th nc, we clear the nstat history on the server::

  nstatuser@nstat-b:~$ nstat -n

Run the 4th nc on the client::

  nstatuser@nstat-a:~$ nc -v nstat-b 9000

If the nc server is running on kernel 4.10 or later, you won't see the
"Connection to ... succeeded!" string, because the kernel drops the
SYN when the accept queue is full. If the nc server is running on an
older kernel, you would see that the connection succeeded, because the
kernel would complete the 3-way handshake and keep the socket in the
half-open queue. I did the test on kernel 4.15. Below is the nstat
output on the server::

  nstatuser@nstat-b:~$ nstat
  #kernel
  IpInReceives                    4                  0.0
  IpInDelivers                    4                  0.0
  TcpInSegs                       4                  0.0
  TcpExtListenOverflows           4                  0.0
  TcpExtListenDrops               4                  0.0
  IpExtInOctets                   240                0.0
  IpExtInNoECTPkts                4                  0.0

Both TcpExtListenOverflows and TcpExtListenDrops were 4. If the time
between the 4th nc and the nstat had been longer, the values of
TcpExtListenOverflows and TcpExtListenDrops would have been larger,
because the SYN of the 4th nc was dropped and the client kept retrying.

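To see why the counters grow with time, the client's SYN retry schedule
can be sketched. The sketch assumes an initial retransmission timeout
of 1 second that doubles on each retry, and 6 retries (the usual
tcp_syn_retries default). It ignores network delay. These values are
assumptions and are not stated in the text above.

```python
def syn_send_times(initial_rto: float = 1.0, retries: int = 6) -> list:
    """Seconds after connect() at which each SYN is sent."""
    times = [0.0]  # the original SYN
    rto = initial_rto
    for _ in range(retries):
        times.append(times[-1] + rto)
        rto *= 2  # exponential backoff
    return times


def syns_seen_by(elapsed: float, times: list) -> int:
    """Number of SYNs sent within `elapsed` seconds."""
    return sum(1 for t in times if t <= elapsed)


times = syn_send_times()
print(times)                      # prints: [0.0, 1.0, 3.0, 7.0, 15.0, 31.0, 63.0]
print(syns_seen_by(10.0, times))  # prints: 4
```

Under these assumptions, a count of 4 corresponds to running nstat
between 7 and 15 seconds after starting the 4th nc.
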