mirror of
https://github.com/torvalds/linux.git
synced 2024-11-29 15:41:36 +00:00
56af5e749f
Currently, when doing rate limiting using the tc-police(8) action, the easiest way is to simply drop the packets which exceed or conform the configured bandwidth limit. Add a new option to tc-skbmod(8), so that users may use the ECN [1] extension to explicitly inform the receiver about the congestion instead of dropping packets "on the floor". The 2 least significant bits of the Traffic Class field in IPv4 and IPv6 headers are used to represent different ECN states [2]: 0b00: "Non ECN-Capable Transport", Non-ECT 0b10: "ECN Capable Transport", ECT(0) 0b01: "ECN Capable Transport", ECT(1) 0b11: "Congestion Encountered", CE As an example: $ tc filter add dev eth0 parent 1: protocol ip prio 10 \ matchall action skbmod ecn Doing the above marks all ECT(0) and ECT(1) packets as CE. It does NOT affect Non-ECT or non-IP packets. In the tc-police scenario mentioned above, users may pipe a tc-police action and a tc-skbmod "ecn" action together to achieve ECN-based rate limiting. For TCP connections, upon receiving a CE packet, the receiver will respond with an ECE packet, asking the sender to reduce their congestion window. However ECN also works with other L4 protocols e.g. DCCP and SCTP [2], and our implementation does not touch or care about L4 headers. The updated tc-skbmod SYNOPSIS looks like the following: tc ... action skbmod { set SETTABLE | swap SWAPPABLE | ecn } ... Only one of "set", "swap" or "ecn" shall be used in a single tc-skbmod command. Trying to use more than one of them at a time is considered undefined behavior; pipe multiple tc-skbmod commands together instead. "set" and "swap" only affect Ethernet packets, while "ecn" only affects IPv{4,6} packets. It is also worth mentioning that, in theory, the same effect could be achieved by piping a "police" action and a "bpf" action using the bpf_skb_ecn_set_ce() helper, but this requires eBPF programming from the user, thus impractical. Depends on patch "net/sched: act_skbmod: Skip non-Ethernet packets". [1] https://datatracker.ietf.org/doc/html/rfc3168 [2] https://en.wikipedia.org/wiki/Explicit_Congestion_Notification Reviewed-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Peilin Ye <peilin.ye@bytedance.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
---|---|---|
.. | ||
6lowpan | ||
9p | ||
802 | ||
8021q | ||
appletalk | ||
atm | ||
ax25 | ||
batman-adv | ||
bluetooth | ||
bpf | ||
bpfilter | ||
bridge | ||
caif | ||
can | ||
ceph | ||
core | ||
dcb | ||
dccp | ||
decnet | ||
dns_resolver | ||
dsa | ||
ethernet | ||
ethtool | ||
hsr | ||
ieee802154 | ||
ife | ||
ipv4 | ||
ipv6 | ||
iucv | ||
kcm | ||
key | ||
l2tp | ||
l3mdev | ||
lapb | ||
llc | ||
mac80211 | ||
mac802154 | ||
mpls | ||
mptcp | ||
ncsi | ||
netfilter | ||
netlabel | ||
netlink | ||
netrom | ||
nfc | ||
nsh | ||
openvswitch | ||
packet | ||
phonet | ||
psample | ||
qrtr | ||
rds | ||
rfkill | ||
rose | ||
rxrpc | ||
sched | ||
sctp | ||
smc | ||
strparser | ||
sunrpc | ||
switchdev | ||
tipc | ||
tls | ||
unix | ||
vmw_vsock | ||
wireless | ||
x25 | ||
xdp | ||
xfrm | ||
compat.c | ||
devres.c | ||
Kconfig | ||
Makefile | ||
socket.c | ||
sysctl_net.c |