net: tighten the definition of interface statistics
This patch is born out of an investigation into which IEEE statistics
correspond to which struct rtnl_link_stats64 members. Turns out that
there seems to be reasonable consensus on the matter, among many drivers.
To save others the time (and it took more time than I'm comfortable
admitting) I'm adding comments referring to IEEE attributes to
struct rtnl_link_stats64.
Up until now we had two forms of documentation for stats - in
Documentation/ABI/testing/sysfs-class-net-statistics and the comments
on struct rtnl_link_stats64 itself. While the former is very cautious
in defining the expected behavior, the latter feel quite dated and
may not be easy to understand for modern day driver author
(e.g. rx_over_errors). At the same time modern systems are far more
complex and once obvious definitions lost their clarity. For example
- does rx_packet count at the MAC layer (aFramesReceivedOK)?
packets processed correctly by hardware? received by the driver?
or maybe received by the stack?
I tried to clarify the expectations, further clarifications from
others are very welcome.
The part hardest to untangle is rx_over_errors vs rx_fifo_errors
vs rx_missed_errors. After much deliberation I concluded that for
modern HW only two of the counters will make sense. The distinction
between internal FIFO overflow and packets dropped due to back-pressure
from the host is likely too implementation (driver and device) specific
to expose in the standard stats.
Now - which two of those counters we select to use is anyone's pick:
sysfs documentation suggests rx_over_errors counts packets which
did not fit into buffers due to MTU being too small, which I reused.
There don't seem to be many modern drivers using it (well, CAN drivers
seem to love this statistic).
Of the remaining two I picked rx_missed_errors to report device drops.
bnxt reports it and it's folded into "drop"s in procfs (while
rx_fifo_errors is an error, and modern devices usually receive the frame
OK, they just can't admit it into the pipeline).
Of the drivers I looked at only AMD Lance-like and NS8390-like use all
three of these counters. rx_missed_errors counts missed frames,
rx_over_errors counts overflow events, and rx_fifo_errors counts frames
which were truncated because they didn't fit into buffers. This suggests
that rx_fifo_errors may be the correct stat for truncated packets, but
I'd think a FIFO stat counting truncated packets would be very confusing
to a modern reader.
v2:
- add driver developer notes about ethtool stat count and reset
- replace Ethernet with IEEE 802.3 to better indicate source of attrs
- mention byte counters don't count FCS
- clarify RX counter is from device to host
- drop "sightly" from sysfs paragraph
- add examples of ethtool stats
- s/incoming/received/ s/incoming/transmitted/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-03 23:14:31 +00:00

.. SPDX-License-Identifier: GPL-2.0

====================
Interface statistics
====================

Overview
========

This document is a guide to Linux network interface statistics.

There are three main sources of interface statistics in Linux:

 - standard interface statistics based on
   :c:type:`struct rtnl_link_stats64 <rtnl_link_stats64>`;
 - protocol-specific statistics; and
 - driver-defined statistics available via ethtool.

Standard interface statistics
-----------------------------

There are multiple interfaces to reach the standard statistics.
Most commonly used is the `ip` command from `iproute2`::

  $ ip -s -s link show dev ens4u1u1
  6: ens4u1u1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
    link/ether 48:2a:e3:4c:b1:d1 brd ff:ff:ff:ff:ff:ff
    RX: bytes  packets  errors  dropped overrun mcast
    74327665117 69016965 0       0       0       0
    RX errors: length   crc     frame   fifo    missed
               0        0       0       0       0
    TX: bytes  packets  errors  dropped carrier collsns
    21405556176 44608960 0       0       0       0
    TX errors: aborted  fifo    window  heartbeat transns
               0        0       0       0         128
    altname enp58s0u1u1

Note that `-s` has been specified twice to see all members of
:c:type:`struct rtnl_link_stats64 <rtnl_link_stats64>`.
If `-s` is specified once the detailed errors won't be shown.

`ip` supports JSON formatting via the `-j` option.
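
The JSON output is convenient for scripts. The following is a minimal
sketch of consuming it, not a reference implementation. The sample document
is hypothetical and heavily trimmed, and the key names (`ifname`, `stats64`,
`rx`, `tx` and the counter names) are assumptions about the layout produced
by `iproute2`; check them against the output of your own `ip` binary.

.. code-block:: python

  import json

  # Hypothetical, trimmed sample of `ip -j -s -s link show dev ens4u1u1`.
  # The key names are an assumption about iproute2's JSON layout.
  SAMPLE = """
  [{"ifname": "ens4u1u1",
    "stats64": {
      "rx": {"bytes": 74327665117, "packets": 69016965, "errors": 0,
             "dropped": 0, "over_errors": 0, "multicast": 0},
      "tx": {"bytes": 21405556176, "packets": 44608960, "errors": 0,
             "dropped": 0, "carrier_errors": 0, "collisions": 0}}}]
  """

  def link_counters(ip_json, direction):
      """Map interface name to the counters of one direction ("rx" or "tx")."""
      return {link["ifname"]: link["stats64"][direction]
              for link in json.loads(ip_json)}

  rx = link_counters(SAMPLE, "rx")
  tx = link_counters(SAMPLE, "tx")
  print(rx["ens4u1u1"]["packets"], tx["ens4u1u1"]["bytes"])

On a live system the same function could be fed the standard output of
`ip -j -s -s link show` instead of the embedded sample.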

Queue statistics
~~~~~~~~~~~~~~~~

Queue statistics are accessible via the netdev netlink family.

Currently no widely distributed CLI exists to access those statistics.
Kernel development tools (ynl) can be used to experiment with them,
see `Documentation/userspace-api/netlink/intro-specs.rst`.

Protocol-specific statistics
----------------------------

Protocol-specific statistics are exposed via relevant interfaces,
the same interfaces as are used to configure them.

ethtool
~~~~~~~

Ethtool exposes common low-level statistics.
All the standard statistics are expected to be maintained
by the device, not the driver (as opposed to driver-defined stats
described in the next section which mix software and hardware stats).
For devices which contain unmanaged
switches (e.g. legacy SR-IOV or multi-host NICs) the events counted
may not pertain exclusively to the packets destined to
the local host interface. In other words the events may
be counted at the network port (MAC/PHY blocks) without separation
for different host side (PCIe) devices. Such ambiguity must not
be present when the internal switch is managed by Linux (so called
switchdev mode for NICs).

Standard ethtool statistics can be accessed via the interfaces used
for configuration. For example, the ethtool interface used
to configure pause frames can report corresponding hardware counters::

  $ ethtool --include-statistics -a eth0
  Pause parameters for eth0:
  Autonegotiate:  on
  RX:             on
  TX:             on
  Statistics:
    tx_pause_frames: 1
    rx_pause_frames: 1

General Ethernet statistics not associated with any particular
functionality are exposed via ``ethtool -S $ifc`` by specifying
the ``--groups`` parameter::

  $ ethtool -S eth0 --groups eth-phy eth-mac eth-ctrl rmon
  Stats for eth0:
  eth-phy-SymbolErrorDuringCarrier: 0
  eth-mac-FramesTransmittedOK: 1
  eth-mac-FrameTooLongErrors: 1
  eth-ctrl-MACControlFramesTransmitted: 1
  eth-ctrl-MACControlFramesReceived: 0
  eth-ctrl-UnsupportedOpcodesReceived: 1
  rmon-etherStatsUndersizePkts: 1
  rmon-etherStatsJabbers: 0
  rmon-rx-etherStatsPkts64Octets: 1
  rmon-rx-etherStatsPkts65to127Octets: 0
  rmon-rx-etherStatsPkts128to255Octets: 0
  rmon-tx-etherStatsPkts64Octets: 2
  rmon-tx-etherStatsPkts65to127Octets: 3
  rmon-tx-etherStatsPkts128to255Octets: 0

Driver-defined statistics
-------------------------

Driver-defined ethtool statistics can be dumped using `ethtool -S $ifc`, e.g.::

  $ ethtool -S ens4u1u1
  NIC statistics:
       tx_single_collisions: 0
       tx_multi_collisions: 0
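
Because the names of driver-defined statistics differ between drivers,
scripts usually treat the dump as a generic list of name and value pairs.
A minimal sketch of such a parser follows; the function name and the
embedded sample are illustrative only, and the sketch assumes every counter
is printed as ``name: value`` with a non-negative integer value.

.. code-block:: python

  def parse_ethtool_stats(text):
      """Turn `name: value` lines of `ethtool -S` output into a dict of ints."""
      stats = {}
      for line in text.splitlines():
          name, sep, value = line.strip().rpartition(":")
          value = value.strip()
          # Headers such as "NIC statistics:" have no numeric value; skip them.
          if sep and name and value.isdigit():
              stats[name.strip()] = int(value)
      return stats

  # Sample copied from the dump shown above.
  SAMPLE = """\
  NIC statistics:
       tx_single_collisions: 0
       tx_multi_collisions: 0
  """

  stats = parse_ethtool_stats(SAMPLE)
  print(stats)

Splitting on the last colon keeps the parser tolerant of statistic names
that themselves contain a colon.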

uAPIs
=====

procfs
------

The historical `/proc/net/dev` text interface gives access to the list
of interfaces as well as their statistics.

Note that even though this interface is using
:c:type:`struct rtnl_link_stats64 <rtnl_link_stats64>`
internally it combines some of the fields.
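
As an illustration, the text format can be parsed with a few lines of
Python. This is a sketch under stated assumptions: the sample below is
hypothetical, and the column names are taken from the header line of the
file rather than from the structure, so a column may aggregate more than
one structure member.

.. code-block:: python

  RX_COLS = ("bytes", "packets", "errs", "drop", "fifo", "frame",
             "compressed", "multicast")
  TX_COLS = ("bytes", "packets", "errs", "drop", "fifo", "colls",
             "carrier", "compressed")

  def parse_proc_net_dev(text):
      """Return {ifname: {"rx": {...}, "tx": {...}}} from /proc/net/dev text."""
      result = {}
      for line in text.splitlines()[2:]:      # skip the two header lines
          name, sep, counters = line.partition(":")
          if not sep:
              continue
          values = [int(v) for v in counters.split()]
          result[name.strip()] = {
              "rx": dict(zip(RX_COLS, values[:8])),
              "tx": dict(zip(TX_COLS, values[8:16])),
          }
      return result

  # Hypothetical sample in the layout of /proc/net/dev.
  SAMPLE = """\
  Inter-|   Receive                                                |  Transmit
   face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
      lo:    1200      10    0    0    0     0          0         0     1200      10    0    0    0     0       0          0
  """

  devs = parse_proc_net_dev(SAMPLE)
  print(devs["lo"]["rx"]["packets"])

On a Linux host the function can be given the contents of `/proc/net/dev`
directly.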

sysfs
-----

Each device directory in sysfs contains a `statistics` directory (e.g.
`/sys/class/net/lo/statistics/`) with files corresponding to
members of :c:type:`struct rtnl_link_stats64 <rtnl_link_stats64>`.

This simple interface is convenient especially in constrained/embedded
environments without access to tools. However, it's inefficient when
reading multiple stats as it internally performs a full dump of
:c:type:`struct rtnl_link_stats64 <rtnl_link_stats64>`
and reports only the stat corresponding to the accessed file.

Sysfs files are documented in
`Documentation/ABI/testing/sysfs-class-net-statistics`.
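
Reading the files needs nothing beyond basic file I/O, as the sketch below
shows. The helper name is hypothetical, and a temporary directory stands in
for the real sysfs path so that the example runs on any system.

.. code-block:: python

  import os
  import tempfile

  def read_sysfs_stats(stat_dir):
      """Read every counter file in a sysfs `statistics` directory."""
      stats = {}
      for name in sorted(os.listdir(stat_dir)):
          with open(os.path.join(stat_dir, name)) as f:
              stats[name] = int(f.read())
      return stats

  # Stand-in directory; on a Linux host pass e.g.
  # "/sys/class/net/lo/statistics" instead.
  demo_dir = tempfile.mkdtemp()
  for name, value in (("rx_packets", 10), ("tx_packets", 7)):
      with open(os.path.join(demo_dir, name), "w") as f:
          f.write(f"{value}\n")

  demo_stats = read_sysfs_stats(demo_dir)
  print(demo_stats)

Note that the loop opens one file per counter, which is exactly the access
pattern described above as inefficient for reading many stats.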

netlink
-------

`rtnetlink` (`NETLINK_ROUTE`) is the preferred method of accessing
:c:type:`struct rtnl_link_stats64 <rtnl_link_stats64>` stats.

Statistics are reported both in the responses to link information
requests (`RTM_GETLINK`) and statistic requests (`RTM_GETSTATS`,
when `IFLA_STATS_LINK_64` bit is set in the `.filter_mask` of the request).
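
Once the attribute payload has been extracted from the netlink message, it
can be decoded as a sequence of 64-bit counters. The sketch below covers
only that decoding step, not the socket handling. The member order is an
assumption based on `include/uapi/linux/if_link.h` and should be verified
against the header in use; the payload is a hypothetical one built in place.

.. code-block:: python

  import struct

  # Member order assumed to follow struct rtnl_link_stats64 in
  # include/uapi/linux/if_link.h; verify against the header you build with.
  FIELDS = (
      "rx_packets", "tx_packets", "rx_bytes", "tx_bytes",
      "rx_errors", "tx_errors", "rx_dropped", "tx_dropped",
      "multicast", "collisions",
      "rx_length_errors", "rx_over_errors", "rx_crc_errors",
      "rx_frame_errors", "rx_fifo_errors", "rx_missed_errors",
      "tx_aborted_errors", "tx_carrier_errors", "tx_fifo_errors",
      "tx_heartbeat_errors", "tx_window_errors",
      "rx_compressed", "tx_compressed", "rx_nohandler",
  )

  def decode_link_stats64(payload):
      """Decode an attribute payload of native-endian 64-bit counters."""
      count = min(len(payload) // 8, len(FIELDS))
      values = struct.unpack_from(f"={count}Q", payload)
      return dict(zip(FIELDS, values))

  # Hypothetical payload: every counter zero except the first four.
  payload = struct.pack("=24Q", 5, 3, 500, 300, *([0] * 20))
  decoded = decode_link_stats64(payload)
  print(decoded["rx_packets"], decoded["tx_bytes"])

Decoding only as many members as both sides know about keeps the sketch
usable when the kernel reports a longer or shorter structure than the list
above.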

netdev (netlink)
~~~~~~~~~~~~~~~~

The `netdev` generic netlink family allows accessing page pool and per queue
statistics.

ethtool
-------

The ethtool IOCTL interface allows drivers to report implementation
specific statistics. Historically it has also been used to report
statistics for which other APIs did not exist, like per-device-queue
statistics, or standard-based statistics (e.g. RFC 2863).

Statistics and their string identifiers are retrieved separately.
Identifiers via `ETHTOOL_GSTRINGS` with `string_set` set to `ETH_SS_STATS`,
and values via `ETHTOOL_GSTATS`. User space should use `ETHTOOL_GDRVINFO`
to retrieve the number of statistics (`.n_stats`).
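
After the three requests have completed, user space holds a buffer of
fixed-size identifiers and a buffer of 64-bit values, and has to pair them
up by index. The sketch below shows only that pairing step and does not
issue any ioctl. The function name and the sample buffers are hypothetical;
the 32-byte slot size corresponds to `ETH_GSTRING_LEN`.

.. code-block:: python

  import struct

  ETH_GSTRING_LEN = 32  # size of one identifier slot in the strings buffer

  def pair_ethtool_stats(strings_buf, values_buf, n_stats):
      """Pair NUL-padded identifiers with their native-endian u64 values."""
      values = struct.unpack_from(f"={n_stats}Q", values_buf)
      stats = {}
      for i, value in enumerate(values):
          raw = strings_buf[i * ETH_GSTRING_LEN:(i + 1) * ETH_GSTRING_LEN]
          stats[raw.split(b"\0", 1)[0].decode()] = value
      return stats

  # Hypothetical buffers standing in for what the requests would return.
  names = [b"tx_single_collisions", b"tx_multi_collisions"]
  strings_buf = b"".join(n.ljust(ETH_GSTRING_LEN, b"\0") for n in names)
  values_buf = struct.pack("=2Q", 4, 9)

  paired = pair_ethtool_stats(strings_buf, values_buf, n_stats=2)
  print(paired)

The pairing is purely positional, so it is only valid if the number of
statistics did not change between the individual requests.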

ethtool-netlink
---------------

Ethtool netlink is a replacement for the older IOCTL interface.

Protocol-related statistics can be requested in get commands by setting
the `ETHTOOL_FLAG_STATS` flag in `ETHTOOL_A_HEADER_FLAGS`. Currently
statistics are supported in the following commands:

  - `ETHTOOL_MSG_PAUSE_GET`
  - `ETHTOOL_MSG_FEC_GET`
  - `ETHTOOL_MSG_MM_GET`

debugfs
-------

Some drivers expose extra statistics via `debugfs`.

struct rtnl_link_stats64
========================

.. kernel-doc:: include/uapi/linux/if_link.h
    :identifiers: rtnl_link_stats64

Notes for driver authors
========================

Drivers should report all statistics which have a matching member in
:c:type:`struct rtnl_link_stats64 <rtnl_link_stats64>` exclusively
via `.ndo_get_stats64`. Reporting such standard stats via ethtool
or debugfs will not be accepted.

Drivers must ensure best possible compliance with
:c:type:`struct rtnl_link_stats64 <rtnl_link_stats64>`.
Please note for example that detailed error statistics must be
added into the general `rx_errors` / `tx_errors` counters.
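
A simple consistency check can catch drivers which forget to do this.
The sketch below is illustrative only: which detailed counters feed each
total is an assumption made for the example, and the authoritative list is
the documentation of the structure members themselves.

.. code-block:: python

  # Which detailed counters feed each total is an assumption made for this
  # sketch; the authoritative list is the rtnl_link_stats64 documentation.
  RX_DETAIL = ("rx_length_errors", "rx_crc_errors", "rx_frame_errors")
  TX_DETAIL = ("tx_aborted_errors", "tx_carrier_errors", "tx_fifo_errors",
               "tx_heartbeat_errors", "tx_window_errors")

  def totals_cover_details(stats):
      """True if rx_errors/tx_errors are at least the sum of their details."""
      rx_sum = sum(stats.get(name, 0) for name in RX_DETAIL)
      tx_sum = sum(stats.get(name, 0) for name in TX_DETAIL)
      return (stats.get("rx_errors", 0) >= rx_sum
              and stats.get("tx_errors", 0) >= tx_sum)

  # Hypothetical counter snapshots.
  good = {"rx_errors": 3, "rx_crc_errors": 2, "rx_frame_errors": 1}
  bad = {"rx_errors": 0, "rx_crc_errors": 2}

  print(totals_cover_details(good), totals_cover_details(bad))

The check uses "at least" rather than "equal" because the totals may also
include errors which have no detailed counter of their own.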

The `.ndo_get_stats64` callback cannot sleep because of accesses
via `/proc/net/dev`. If the driver may sleep when retrieving the statistics
from the device it should do so periodically asynchronously and only return
a recent copy from `.ndo_get_stats64`. Ethtool interrupt coalescing interface
allows setting the frequency of refreshing statistics, if needed.
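
The pattern of refreshing in the background and returning a cached copy can
be sketched in user-space Python as below. This is an analogy, not driver
code: the class and function names are hypothetical, and a real driver would
use kernel facilities (for example a periodically scheduled work item) for
the refresh and suitable kernel locking for the copy.

.. code-block:: python

  import copy
  import threading

  class CachedStats:
      """Keep a recent snapshot so readers never wait for the slow source."""

      def __init__(self, slow_read):
          self._slow_read = slow_read     # may block, e.g. a firmware query
          self._lock = threading.Lock()
          self._snapshot = {}

      def refresh(self):
          """Run from a periodic worker; the slow read is done unlocked."""
          fresh = self._slow_read()
          with self._lock:
              self._snapshot = fresh

      def get(self):
          """Fast path: hand out a copy of the most recent snapshot."""
          with self._lock:
              return copy.copy(self._snapshot)

  # Hypothetical slow source returning ever-increasing counters.
  calls = []

  def fake_device_read():
      calls.append(1)
      return {"rx_packets": 100 * len(calls)}

  cache = CachedStats(fake_device_read)
  cache.refresh()
  first = cache.get()
  cache.refresh()
  second = cache.get()
  print(first, second)

The reader only ever touches the cached snapshot, so its latency does not
depend on how long the device takes to answer.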

Retrieving ethtool statistics is a multi-syscall process; drivers are advised
to keep the number of statistics constant to avoid race conditions with
user space trying to read them.

Statistics must persist across routine operations like bringing the interface
down and up.

Kernel-internal data structures
-------------------------------

The following structures are internal to the kernel; their members are
translated to netlink attributes when dumped. Drivers must not overwrite
the statistics they don't report with 0.

- ethtool_pause_stats()
- ethtool_fec_stats()