mirror of
https://github.com/torvalds/linux.git
synced 2024-11-25 21:51:40 +00:00
e420bed025
This work refactors and adds a lightweight extension ("tcx") to the tc BPF ingress and egress data path side for allowing BPF program management based on fds via bpf() syscall through the newly added generic multi-prog API. The main goal behind this work which we also presented at LPC [0] last year and a recent update at LSF/MM/BPF this year [3] is to support long-awaited BPF link functionality for tc BPF programs, which allows for a model of safe ownership and program detachment. Given the rise in tc BPF users in cloud native environments, this becomes necessary to avoid hard to debug incidents either through stale leftover programs or 3rd party applications accidentally stepping on each others toes. As a recap, a BPF link represents the attachment of a BPF program to a BPF hook point. The BPF link holds a single reference to keep BPF program alive. Moreover, hook points do not reference a BPF link, only the application's fd or pinning does. A BPF link holds meta-data specific to attachment and implements operations for link creation, (atomic) BPF program update, detachment and introspection. The motivation for BPF links for tc BPF programs is multi-fold, for example: - From Meta: "It's especially important for applications that are deployed fleet-wide and that don't "control" hosts they are deployed to. If such application crashes and no one notices and does anything about that, BPF program will keep running draining resources or even just, say, dropping packets. We at FB had outages due to such permanent BPF attachment semantics. With fd-based BPF link we are getting a framework, which allows safe, auto-detachable behavior by default, unless application explicitly opts in by pinning the BPF link." [1] - From Cilium-side the tc BPF programs we attach to host-facing veth devices and phys devices build the core datapath for Kubernetes Pods, and they implement forwarding, load-balancing, policy, EDT-management, etc, within BPF. Currently there is no concept of 'safe' ownership, e.g. we've recently experienced hard-to-debug issues in a user's staging environment where another Kubernetes application using tc BPF attached to the same prio/handle of cls_bpf, accidentally wiping all Cilium-based BPF programs from underneath it. The goal is to establish a clear/safe ownership model via links which cannot accidentally be overridden. [0,2] BPF links for tc can co-exist with non-link attachments, and the semantics are in line also with XDP links: BPF links cannot replace other BPF links, BPF links cannot replace non-BPF links, non-BPF links cannot replace BPF links and lastly only non-BPF links can replace non-BPF links. In case of Cilium, this would solve mentioned issue of safe ownership model as 3rd party applications would not be able to accidentally wipe Cilium programs, even if they are not BPF link aware. Earlier attempts [4] have tried to integrate BPF links into core tc machinery to solve cls_bpf, which has been intrusive to the generic tc kernel API with extensions only specific to cls_bpf and suboptimal/complex since cls_bpf could be wiped from the qdisc also. Locking a tc BPF program in place this way, is getting into layering hacks given the two object models are vastly different. We instead implemented the tcx (tc 'express') layer which is an fd-based tc BPF attach API, so that the BPF link implementation blends in naturally similar to other link types which are fd-based and without the need for changing core tc internal APIs. BPF programs for tc can then be successively migrated from classic cls_bpf to the new tc BPF link without needing to change the program's source code, just the BPF loader mechanics for attaching is sufficient. For the current tc framework, there is no change in behavior with this change and neither does this change touch on tc core kernel APIs. The gist of this patch is that the ingress and egress hook have a lightweight, qdisc-less extension for BPF to attach its tc BPF programs, in other words, a minimal entry point for tc BPF. The name tcx has been suggested from discussion of earlier revisions of this work as a good fit, and to more easily differ between the classic cls_bpf attachment and the fd-based one. For the ingress and egress tcx points, the device holds a cache-friendly array with program pointers which is separated from control plane (slow-path) data. Earlier versions of this work used priority to determine ordering and expression of dependencies similar as with classic tc, but it was challenged that for something more future-proof a better user experience is required. Hence this resulted in the design and development of the generic attach/detach/query API for multi-progs. See prior patch with its discussion on the API design. tcx is the first user and later we plan to integrate also others, for example, one candidate is multi-prog support for XDP which would benefit and have the same 'look and feel' from API perspective. The goal with tcx is to have maximum compatibility to existing tc BPF programs, so they don't need to be rewritten specifically. Compatibility to call into classic tcf_classify() is also provided in order to allow successive migration or both to cleanly co-exist where needed given its all one logical tc layer and the tcx plus classic tc cls/act build one logical overall processing pipeline. tcx supports the simplified return codes TCX_NEXT which is non-terminating (go to next program) and terminating ones with TCX_PASS, TCX_DROP, TCX_REDIRECT. The fd-based API is behind a static key, so that when unused the code is also not entered. The struct tcx_entry's program array is currently static, but could be made dynamic if necessary at a point in future. The a/b pair swap design has been chosen so that for detachment there are no allocations which otherwise could fail. The work has been tested with tc-testing selftest suite which all passes, as well as the tc BPF tests from the BPF CI, and also with Cilium's L4LB. Thanks also to Nikolay Aleksandrov and Martin Lau for in-depth early reviews of this work. [0] https://lpc.events/event/16/contributions/1353/ [1] https://lore.kernel.org/bpf/CAEf4BzbokCJN33Nw_kg82sO=xppXnKWEncGTWCTB9vGCmLB6pw@mail.gmail.com [2] https://colocatedeventseu2023.sched.com/event/1Jo6O/tales-from-an-ebpf-programs-murder-mystery-hemanth-malla-guillaume-fournier-datadog [3] http://vger.kernel.org/bpfconf2023_material/tcx_meta_netdev_borkmann.pdf [4] https://lore.kernel.org/bpf/20210604063116.234316-1-memxor@gmail.com Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/20230719140858.13224-3-daniel@iogearbox.net Signed-off-by: Alexei Starovoitov <ast@kernel.org>
956 lines
28 KiB
Plaintext
956 lines
28 KiB
Plaintext
# SPDX-License-Identifier: GPL-2.0-only
|
|
#
|
|
# Traffic control configuration.
|
|
#
|
|
|
|
menuconfig NET_SCHED
|
|
bool "QoS and/or fair queueing"
|
|
select NET_SCH_FIFO
|
|
help
|
|
When the kernel has several packets to send out over a network
|
|
device, it has to decide which ones to send first, which ones to
|
|
delay, and which ones to drop. This is the job of the queueing
|
|
disciplines, several different algorithms for how to do this
|
|
"fairly" have been proposed.
|
|
|
|
If you say N here, you will get the standard packet scheduler, which
|
|
is a FIFO (first come, first served). If you say Y here, you will be
|
|
able to choose from among several alternative algorithms which can
|
|
then be attached to different network devices. This is useful for
|
|
example if some of your network devices are real time devices that
|
|
need a certain minimum data flow rate, or if you need to limit the
|
|
maximum data flow rate for traffic which matches specified criteria.
|
|
This code is considered to be experimental.
|
|
|
|
To administer these schedulers, you'll need the user-level utilities
|
|
from the package iproute2+tc at
|
|
<https://www.kernel.org/pub/linux/utils/net/iproute2/>. That package
|
|
also contains some documentation; for more, check out
|
|
<http://www.linuxfoundation.org/collaborate/workgroups/networking/iproute2>.
|
|
|
|
This Quality of Service (QoS) support will enable you to use
|
|
Differentiated Services (diffserv) and Resource Reservation Protocol
|
|
(RSVP) on your Linux router if you also say Y to the corresponding
|
|
classifiers below. Documentation and software is at
|
|
<http://diffserv.sourceforge.net/>.
|
|
|
|
If you say Y here and to "/proc file system" below, you will be able
|
|
to read status information about packet schedulers from the file
|
|
/proc/net/psched.
|
|
|
|
The available schedulers are listed in the following questions; you
|
|
can say Y to as many as you like. If unsure, say N now.
|
|
|
|
if NET_SCHED
|
|
|
|
comment "Queueing/Scheduling"
|
|
|
|
config NET_SCH_HTB
|
|
tristate "Hierarchical Token Bucket (HTB)"
|
|
help
|
|
Say Y here if you want to use the Hierarchical Token Buckets (HTB)
|
|
packet scheduling algorithm. See
|
|
<http://luxik.cdi.cz/~devik/qos/htb/> for complete manual and
|
|
in-depth articles.
|
|
|
|
HTB is very similar to CBQ regarding its goals however is has
|
|
different properties and different algorithm.
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called sch_htb.
|
|
|
|
config NET_SCH_HFSC
|
|
tristate "Hierarchical Fair Service Curve (HFSC)"
|
|
help
|
|
Say Y here if you want to use the Hierarchical Fair Service Curve
|
|
(HFSC) packet scheduling algorithm.
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called sch_hfsc.
|
|
|
|
config NET_SCH_PRIO
|
|
tristate "Multi Band Priority Queueing (PRIO)"
|
|
help
|
|
Say Y here if you want to use an n-band priority queue packet
|
|
scheduler.
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called sch_prio.
|
|
|
|
config NET_SCH_MULTIQ
|
|
tristate "Hardware Multiqueue-aware Multi Band Queuing (MULTIQ)"
|
|
help
|
|
Say Y here if you want to use an n-band queue packet scheduler
|
|
to support devices that have multiple hardware transmit queues.
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called sch_multiq.
|
|
|
|
config NET_SCH_RED
|
|
tristate "Random Early Detection (RED)"
|
|
help
|
|
Say Y here if you want to use the Random Early Detection (RED)
|
|
packet scheduling algorithm.
|
|
|
|
See the top of <file:net/sched/sch_red.c> for more details.
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called sch_red.
|
|
|
|
config NET_SCH_SFB
|
|
tristate "Stochastic Fair Blue (SFB)"
|
|
help
|
|
Say Y here if you want to use the Stochastic Fair Blue (SFB)
|
|
packet scheduling algorithm.
|
|
|
|
See the top of <file:net/sched/sch_sfb.c> for more details.
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called sch_sfb.
|
|
|
|
config NET_SCH_SFQ
|
|
tristate "Stochastic Fairness Queueing (SFQ)"
|
|
help
|
|
Say Y here if you want to use the Stochastic Fairness Queueing (SFQ)
|
|
packet scheduling algorithm.
|
|
|
|
See the top of <file:net/sched/sch_sfq.c> for more details.
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called sch_sfq.
|
|
|
|
config NET_SCH_TEQL
|
|
tristate "True Link Equalizer (TEQL)"
|
|
help
|
|
Say Y here if you want to use the True Link Equalizer (TLE) packet
|
|
scheduling algorithm. This queueing discipline allows the combination
|
|
of several physical devices into one virtual device.
|
|
|
|
See the top of <file:net/sched/sch_teql.c> for more details.
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called sch_teql.
|
|
|
|
config NET_SCH_TBF
|
|
tristate "Token Bucket Filter (TBF)"
|
|
help
|
|
Say Y here if you want to use the Token Bucket Filter (TBF) packet
|
|
scheduling algorithm.
|
|
|
|
See the top of <file:net/sched/sch_tbf.c> for more details.
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called sch_tbf.
|
|
|
|
config NET_SCH_CBS
|
|
tristate "Credit Based Shaper (CBS)"
|
|
help
|
|
Say Y here if you want to use the Credit Based Shaper (CBS) packet
|
|
scheduling algorithm.
|
|
|
|
See the top of <file:net/sched/sch_cbs.c> for more details.
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called sch_cbs.
|
|
|
|
config NET_SCH_ETF
|
|
tristate "Earliest TxTime First (ETF)"
|
|
help
|
|
Say Y here if you want to use the Earliest TxTime First (ETF) packet
|
|
scheduling algorithm.
|
|
|
|
See the top of <file:net/sched/sch_etf.c> for more details.
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called sch_etf.
|
|
|
|
config NET_SCH_MQPRIO_LIB
|
|
tristate
|
|
help
|
|
Common library for manipulating mqprio queue configurations.
|
|
|
|
config NET_SCH_TAPRIO
|
|
tristate "Time Aware Priority (taprio) Scheduler"
|
|
select NET_SCH_MQPRIO_LIB
|
|
help
|
|
Say Y here if you want to use the Time Aware Priority (taprio) packet
|
|
scheduling algorithm.
|
|
|
|
See the top of <file:net/sched/sch_taprio.c> for more details.
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called sch_taprio.
|
|
|
|
config NET_SCH_GRED
|
|
tristate "Generic Random Early Detection (GRED)"
|
|
help
|
|
Say Y here if you want to use the Generic Random Early Detection
|
|
(GRED) packet scheduling algorithm for some of your network devices
|
|
(see the top of <file:net/sched/sch_red.c> for details and
|
|
references about the algorithm).
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called sch_gred.
|
|
|
|
config NET_SCH_NETEM
|
|
tristate "Network emulator (NETEM)"
|
|
help
|
|
Say Y if you want to emulate network delay, loss, and packet
|
|
re-ordering. This is often useful to simulate networks when
|
|
testing applications or protocols.
|
|
|
|
To compile this driver as a module, choose M here: the module
|
|
will be called sch_netem.
|
|
|
|
If unsure, say N.
|
|
|
|
config NET_SCH_DRR
|
|
tristate "Deficit Round Robin scheduler (DRR)"
|
|
help
|
|
Say Y here if you want to use the Deficit Round Robin (DRR) packet
|
|
scheduling algorithm.
|
|
|
|
To compile this driver as a module, choose M here: the module
|
|
will be called sch_drr.
|
|
|
|
If unsure, say N.
|
|
|
|
config NET_SCH_MQPRIO
|
|
tristate "Multi-queue priority scheduler (MQPRIO)"
|
|
select NET_SCH_MQPRIO_LIB
|
|
help
|
|
Say Y here if you want to use the Multi-queue Priority scheduler.
|
|
This scheduler allows QOS to be offloaded on NICs that have support
|
|
for offloading QOS schedulers.
|
|
|
|
To compile this driver as a module, choose M here: the module will
|
|
be called sch_mqprio.
|
|
|
|
If unsure, say N.
|
|
|
|
config NET_SCH_SKBPRIO
|
|
tristate "SKB priority queue scheduler (SKBPRIO)"
|
|
help
|
|
Say Y here if you want to use the SKB priority queue
|
|
scheduler. This schedules packets according to skb->priority,
|
|
which is useful for request packets in DoS mitigation systems such
|
|
as Gatekeeper.
|
|
|
|
To compile this driver as a module, choose M here: the module will
|
|
be called sch_skbprio.
|
|
|
|
If unsure, say N.
|
|
|
|
config NET_SCH_CHOKE
|
|
tristate "CHOose and Keep responsive flow scheduler (CHOKE)"
|
|
help
|
|
Say Y here if you want to use the CHOKe packet scheduler (CHOose
|
|
and Keep for responsive flows, CHOose and Kill for unresponsive
|
|
flows). This is a variation of RED which tries to penalize flows
|
|
that monopolize the queue.
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called sch_choke.
|
|
|
|
config NET_SCH_QFQ
|
|
tristate "Quick Fair Queueing scheduler (QFQ)"
|
|
help
|
|
Say Y here if you want to use the Quick Fair Queueing Scheduler (QFQ)
|
|
packet scheduling algorithm.
|
|
|
|
To compile this driver as a module, choose M here: the module
|
|
will be called sch_qfq.
|
|
|
|
If unsure, say N.
|
|
|
|
config NET_SCH_CODEL
|
|
tristate "Controlled Delay AQM (CODEL)"
|
|
help
|
|
Say Y here if you want to use the Controlled Delay (CODEL)
|
|
packet scheduling algorithm.
|
|
|
|
To compile this driver as a module, choose M here: the module
|
|
will be called sch_codel.
|
|
|
|
If unsure, say N.
|
|
|
|
config NET_SCH_FQ_CODEL
|
|
tristate "Fair Queue Controlled Delay AQM (FQ_CODEL)"
|
|
help
|
|
Say Y here if you want to use the FQ Controlled Delay (FQ_CODEL)
|
|
packet scheduling algorithm.
|
|
|
|
To compile this driver as a module, choose M here: the module
|
|
will be called sch_fq_codel.
|
|
|
|
If unsure, say N.
|
|
|
|
config NET_SCH_CAKE
|
|
tristate "Common Applications Kept Enhanced (CAKE)"
|
|
help
|
|
Say Y here if you want to use the Common Applications Kept Enhanced
|
|
(CAKE) queue management algorithm.
|
|
|
|
To compile this driver as a module, choose M here: the module
|
|
will be called sch_cake.
|
|
|
|
If unsure, say N.
|
|
|
|
config NET_SCH_FQ
|
|
tristate "Fair Queue"
|
|
help
|
|
Say Y here if you want to use the FQ packet scheduling algorithm.
|
|
|
|
FQ does flow separation, and is able to respect pacing requirements
|
|
set by TCP stack into sk->sk_pacing_rate (for locally generated
|
|
traffic)
|
|
|
|
To compile this driver as a module, choose M here: the module
|
|
will be called sch_fq.
|
|
|
|
If unsure, say N.
|
|
|
|
config NET_SCH_HHF
|
|
tristate "Heavy-Hitter Filter (HHF)"
|
|
help
|
|
Say Y here if you want to use the Heavy-Hitter Filter (HHF)
|
|
packet scheduling algorithm.
|
|
|
|
To compile this driver as a module, choose M here: the module
|
|
will be called sch_hhf.
|
|
|
|
config NET_SCH_PIE
|
|
tristate "Proportional Integral controller Enhanced (PIE) scheduler"
|
|
help
|
|
Say Y here if you want to use the Proportional Integral controller
|
|
Enhanced scheduler packet scheduling algorithm.
|
|
For more information, please see https://tools.ietf.org/html/rfc8033
|
|
|
|
To compile this driver as a module, choose M here: the module
|
|
will be called sch_pie.
|
|
|
|
If unsure, say N.
|
|
|
|
config NET_SCH_FQ_PIE
|
|
depends on NET_SCH_PIE
|
|
tristate "Flow Queue Proportional Integral controller Enhanced (FQ-PIE)"
|
|
help
|
|
Say Y here if you want to use the Flow Queue Proportional Integral
|
|
controller Enhanced (FQ-PIE) packet scheduling algorithm.
|
|
For more information, please see https://tools.ietf.org/html/rfc8033
|
|
|
|
To compile this driver as a module, choose M here: the module
|
|
will be called sch_fq_pie.
|
|
|
|
If unsure, say N.
|
|
|
|
config NET_SCH_INGRESS
|
|
tristate "Ingress/classifier-action Qdisc"
|
|
depends on NET_CLS_ACT
|
|
select NET_XGRESS
|
|
help
|
|
Say Y here if you want to use classifiers for incoming and/or outgoing
|
|
packets. This qdisc doesn't do anything else besides running classifiers,
|
|
which can also have actions attached to them. In case of outgoing packets,
|
|
classifiers that this qdisc holds are executed in the transmit path
|
|
before real enqueuing to an egress qdisc happens.
|
|
|
|
If unsure, say Y.
|
|
|
|
To compile this code as a module, choose M here: the module will be
|
|
called sch_ingress with alias of sch_clsact.
|
|
|
|
config NET_SCH_PLUG
|
|
tristate "Plug network traffic until release (PLUG)"
|
|
help
|
|
|
|
This queuing discipline allows userspace to plug/unplug a network
|
|
output queue, using the netlink interface. When it receives an
|
|
enqueue command it inserts a plug into the outbound queue that
|
|
causes following packets to enqueue until a dequeue command arrives
|
|
over netlink, causing the plug to be removed and resuming the normal
|
|
packet flow.
|
|
|
|
This module also provides a generic "network output buffering"
|
|
functionality (aka output commit), wherein upon arrival of a dequeue
|
|
command, only packets up to the first plug are released for delivery.
|
|
The Remus HA project uses this module to enable speculative execution
|
|
of virtual machines by allowing the generated network output to be rolled
|
|
back if needed.
|
|
|
|
For more information, please refer to <http://wiki.xenproject.org/wiki/Remus>
|
|
|
|
Say Y here if you are using this kernel for Xen dom0 and
|
|
want to protect Xen guests with Remus.
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called sch_plug.
|
|
|
|
config NET_SCH_ETS
|
|
tristate "Enhanced transmission selection scheduler (ETS)"
|
|
help
|
|
The Enhanced Transmission Selection scheduler is a classful
|
|
queuing discipline that merges functionality of PRIO and DRR
|
|
qdiscs in one scheduler. ETS makes it easy to configure a set of
|
|
strict and bandwidth-sharing bands to implement the transmission
|
|
selection described in 802.1Qaz.
|
|
|
|
Say Y here if you want to use the ETS packet scheduling
|
|
algorithm.
|
|
|
|
To compile this driver as a module, choose M here: the module
|
|
will be called sch_ets.
|
|
|
|
If unsure, say N.
|
|
|
|
menuconfig NET_SCH_DEFAULT
|
|
bool "Allow override default queue discipline"
|
|
help
|
|
Support for selection of default queuing discipline.
|
|
|
|
Nearly all users can safely say no here, and the default
|
|
of pfifo_fast will be used. Many distributions already set
|
|
the default value via /proc/sys/net/core/default_qdisc.
|
|
|
|
If unsure, say N.
|
|
|
|
if NET_SCH_DEFAULT
|
|
|
|
choice
|
|
prompt "Default queuing discipline"
|
|
default DEFAULT_PFIFO_FAST
|
|
help
|
|
Select the queueing discipline that will be used by default
|
|
for all network devices.
|
|
|
|
config DEFAULT_FQ
|
|
bool "Fair Queue" if NET_SCH_FQ
|
|
|
|
config DEFAULT_CODEL
|
|
bool "Controlled Delay" if NET_SCH_CODEL
|
|
|
|
config DEFAULT_FQ_CODEL
|
|
bool "Fair Queue Controlled Delay" if NET_SCH_FQ_CODEL
|
|
|
|
config DEFAULT_FQ_PIE
|
|
bool "Flow Queue Proportional Integral controller Enhanced" if NET_SCH_FQ_PIE
|
|
|
|
config DEFAULT_SFQ
|
|
bool "Stochastic Fair Queue" if NET_SCH_SFQ
|
|
|
|
config DEFAULT_PFIFO_FAST
|
|
bool "Priority FIFO Fast"
|
|
endchoice
|
|
|
|
config DEFAULT_NET_SCH
|
|
string
|
|
default "pfifo_fast" if DEFAULT_PFIFO_FAST
|
|
default "fq" if DEFAULT_FQ
|
|
default "fq_codel" if DEFAULT_FQ_CODEL
|
|
default "fq_pie" if DEFAULT_FQ_PIE
|
|
default "sfq" if DEFAULT_SFQ
|
|
default "pfifo_fast"
|
|
endif
|
|
|
|
comment "Classification"
|
|
|
|
config NET_CLS
|
|
bool
|
|
|
|
config NET_CLS_BASIC
|
|
tristate "Elementary classification (BASIC)"
|
|
select NET_CLS
|
|
help
|
|
Say Y here if you want to be able to classify packets using
|
|
only extended matches and actions.
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called cls_basic.
|
|
|
|
config NET_CLS_ROUTE4
|
|
tristate "Routing decision (ROUTE)"
|
|
depends on INET
|
|
select IP_ROUTE_CLASSID
|
|
select NET_CLS
|
|
help
|
|
If you say Y here, you will be able to classify packets
|
|
according to the route table entry they matched.
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called cls_route.
|
|
|
|
config NET_CLS_FW
|
|
tristate "Netfilter mark (FW)"
|
|
select NET_CLS
|
|
help
|
|
If you say Y here, you will be able to classify packets
|
|
according to netfilter/firewall marks.
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called cls_fw.
|
|
|
|
config NET_CLS_U32
|
|
tristate "Universal 32bit comparisons w/ hashing (U32)"
|
|
select NET_CLS
|
|
help
|
|
Say Y here to be able to classify packets using a universal
|
|
32bit pieces based comparison scheme.
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called cls_u32.
|
|
|
|
config CLS_U32_PERF
|
|
bool "Performance counters support"
|
|
depends on NET_CLS_U32
|
|
help
|
|
Say Y here to make u32 gather additional statistics useful for
|
|
fine tuning u32 classifiers.
|
|
|
|
config CLS_U32_MARK
|
|
bool "Netfilter marks support"
|
|
depends on NET_CLS_U32
|
|
help
|
|
Say Y here to be able to use netfilter marks as u32 key.
|
|
|
|
config NET_CLS_FLOW
|
|
tristate "Flow classifier"
|
|
select NET_CLS
|
|
help
|
|
If you say Y here, you will be able to classify packets based on
|
|
a configurable combination of packet keys. This is mostly useful
|
|
in combination with SFQ.
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called cls_flow.
|
|
|
|
config NET_CLS_CGROUP
|
|
tristate "Control Group Classifier"
|
|
select NET_CLS
|
|
select CGROUP_NET_CLASSID
|
|
depends on CGROUPS
|
|
help
|
|
Say Y here if you want to classify packets based on the control
|
|
cgroup of their process.
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called cls_cgroup.
|
|
|
|
config NET_CLS_BPF
|
|
tristate "BPF-based classifier"
|
|
select NET_CLS
|
|
help
|
|
If you say Y here, you will be able to classify packets based on
|
|
programmable BPF (JIT'ed) filters as an alternative to ematches.
|
|
|
|
To compile this code as a module, choose M here: the module will
|
|
be called cls_bpf.
|
|
|
|
config NET_CLS_FLOWER
|
|
tristate "Flower classifier"
|
|
select NET_CLS
|
|
help
|
|
If you say Y here, you will be able to classify packets based on
|
|
a configurable combination of packet keys and masks.
|
|
|
|
To compile this code as a module, choose M here: the module will
|
|
be called cls_flower.
|
|
|
|
config NET_CLS_MATCHALL
|
|
tristate "Match-all classifier"
|
|
select NET_CLS
|
|
help
|
|
If you say Y here, you will be able to classify packets based on
|
|
nothing. Every packet will match.
|
|
|
|
To compile this code as a module, choose M here: the module will
|
|
be called cls_matchall.
|
|
|
|
config NET_EMATCH
|
|
bool "Extended Matches"
|
|
select NET_CLS
|
|
help
|
|
Say Y here if you want to use extended matches on top of classifiers
|
|
and select the extended matches below.
|
|
|
|
Extended matches are small classification helpers not worth writing
|
|
a separate classifier for.
|
|
|
|
A recent version of the iproute2 package is required to use
|
|
extended matches.
|
|
|
|
config NET_EMATCH_STACK
|
|
int "Stack size"
|
|
depends on NET_EMATCH
|
|
default "32"
|
|
help
|
|
Size of the local stack variable used while evaluating the tree of
|
|
ematches. Limits the depth of the tree, i.e. the number of
|
|
encapsulated precedences. Every level requires 4 bytes of additional
|
|
stack space.
|
|
|
|
config NET_EMATCH_CMP
|
|
tristate "Simple packet data comparison"
|
|
depends on NET_EMATCH
|
|
help
|
|
Say Y here if you want to be able to classify packets based on
|
|
simple packet data comparisons for 8, 16, and 32bit values.
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called em_cmp.
|
|
|
|
config NET_EMATCH_NBYTE
|
|
tristate "Multi byte comparison"
|
|
depends on NET_EMATCH
|
|
help
|
|
Say Y here if you want to be able to classify packets based on
|
|
multiple byte comparisons mainly useful for IPv6 address comparisons.
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called em_nbyte.
|
|
|
|
config NET_EMATCH_U32
|
|
tristate "U32 key"
|
|
depends on NET_EMATCH
|
|
help
|
|
Say Y here if you want to be able to classify packets using
|
|
the famous u32 key in combination with logic relations.
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called em_u32.
|
|
|
|
config NET_EMATCH_META
|
|
tristate "Metadata"
|
|
depends on NET_EMATCH
|
|
help
|
|
Say Y here if you want to be able to classify packets based on
|
|
metadata such as load average, netfilter attributes, socket
|
|
attributes and routing decisions.
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called em_meta.
|
|
|
|
config NET_EMATCH_TEXT
|
|
tristate "Textsearch"
|
|
depends on NET_EMATCH
|
|
select TEXTSEARCH
|
|
select TEXTSEARCH_KMP
|
|
select TEXTSEARCH_BM
|
|
select TEXTSEARCH_FSM
|
|
help
|
|
Say Y here if you want to be able to classify packets based on
|
|
textsearch comparisons.
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called em_text.
|
|
|
|
config NET_EMATCH_CANID
|
|
tristate "CAN Identifier"
|
|
depends on NET_EMATCH && (CAN=y || CAN=m)
|
|
help
|
|
Say Y here if you want to be able to classify CAN frames based
|
|
on CAN Identifier.
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called em_canid.
|
|
|
|
config NET_EMATCH_IPSET
|
|
tristate "IPset"
|
|
depends on NET_EMATCH && IP_SET
|
|
help
|
|
Say Y here if you want to be able to classify packets based on
|
|
ipset membership.
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called em_ipset.
|
|
|
|
config NET_EMATCH_IPT
|
|
tristate "IPtables Matches"
|
|
depends on NET_EMATCH && NETFILTER && NETFILTER_XTABLES
|
|
help
|
|
Say Y here to be able to classify packets based on iptables
|
|
matches.
|
|
Current supported match is "policy" which allows packet classification
|
|
based on IPsec policy that was used during decapsulation
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called em_ipt.
|
|
|
|
config NET_CLS_ACT
|
|
bool "Actions"
|
|
select NET_CLS
|
|
select NET_XGRESS
|
|
help
|
|
Say Y here if you want to use traffic control actions. Actions
|
|
get attached to classifiers and are invoked after a successful
|
|
classification. They are used to overwrite the classification
|
|
result, instantly drop or redirect packets, etc.
|
|
|
|
A recent version of the iproute2 package is required to use
|
|
extended matches.
|
|
|
|
config NET_ACT_POLICE
|
|
tristate "Traffic Policing"
|
|
depends on NET_CLS_ACT
|
|
help
|
|
Say Y here if you want to do traffic policing, i.e. strict
|
|
bandwidth limiting. This action replaces the existing policing
|
|
module.
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called act_police.
|
|
|
|
config NET_ACT_GACT
|
|
tristate "Generic actions"
|
|
depends on NET_CLS_ACT
|
|
help
|
|
Say Y here to take generic actions such as dropping and
|
|
accepting packets.
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called act_gact.
|
|
|
|
config GACT_PROB
|
|
bool "Probability support"
|
|
depends on NET_ACT_GACT
|
|
help
|
|
Say Y here to use the generic action randomly or deterministically.
|
|
|
|
config NET_ACT_MIRRED
|
|
tristate "Redirecting and Mirroring"
|
|
depends on NET_CLS_ACT
|
|
help
|
|
Say Y here to allow packets to be mirrored or redirected to
|
|
other devices.
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called act_mirred.
|
|
|
|
config NET_ACT_SAMPLE
|
|
tristate "Traffic Sampling"
|
|
depends on NET_CLS_ACT
|
|
select PSAMPLE
|
|
help
|
|
Say Y here to allow packet sampling tc action. The packet sample
|
|
action consists of statistically choosing packets and sampling
|
|
them using the psample module.
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called act_sample.
|
|
|
|
config NET_ACT_IPT
|
|
tristate "IPtables targets"
|
|
depends on NET_CLS_ACT && NETFILTER && NETFILTER_XTABLES
|
|
help
|
|
Say Y here to be able to invoke iptables targets after successful
|
|
classification.
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called act_ipt.
|
|
|
|
config NET_ACT_NAT
|
|
tristate "Stateless NAT"
|
|
depends on NET_CLS_ACT
|
|
help
|
|
Say Y here to do stateless NAT on IPv4 packets. You should use
|
|
netfilter for NAT unless you know what you are doing.
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called act_nat.
|
|
|
|
config NET_ACT_PEDIT
|
|
tristate "Packet Editing"
|
|
depends on NET_CLS_ACT
|
|
help
|
|
Say Y here if you want to mangle the content of packets.
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called act_pedit.
|
|
|
|
config NET_ACT_SIMP
|
|
tristate "Simple Example (Debug)"
|
|
depends on NET_CLS_ACT
|
|
help
|
|
Say Y here to add a simple action for demonstration purposes.
|
|
It is meant as an example and for debugging purposes. It will
|
|
print a configured policy string followed by the packet count
|
|
to the console for every packet that passes by.
|
|
|
|
If unsure, say N.
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called act_simple.
|
|
|
|
config NET_ACT_SKBEDIT
|
|
tristate "SKB Editing"
|
|
depends on NET_CLS_ACT
|
|
help
|
|
Say Y here to change skb priority or queue_mapping settings.
|
|
|
|
If unsure, say N.
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called act_skbedit.
|
|
|
|
config NET_ACT_CSUM
|
|
tristate "Checksum Updating"
|
|
depends on NET_CLS_ACT && INET
|
|
select LIBCRC32C
|
|
help
|
|
Say Y here to update some common checksum after some direct
|
|
packet alterations.
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called act_csum.
|
|
|
|
config NET_ACT_MPLS
|
|
tristate "MPLS manipulation"
|
|
depends on NET_CLS_ACT
|
|
help
|
|
Say Y here to push or pop MPLS headers.
|
|
|
|
If unsure, say N.
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called act_mpls.
|
|
|
|
config NET_ACT_VLAN
|
|
tristate "Vlan manipulation"
|
|
depends on NET_CLS_ACT
|
|
help
|
|
Say Y here to push or pop vlan headers.
|
|
|
|
If unsure, say N.
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called act_vlan.
|
|
|
|
config NET_ACT_BPF
|
|
tristate "BPF based action"
|
|
depends on NET_CLS_ACT
|
|
help
|
|
Say Y here to execute BPF code on packets. The BPF code will decide
|
|
if the packet should be dropped or not.
|
|
|
|
If unsure, say N.
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called act_bpf.
|
|
|
|
config NET_ACT_CONNMARK
|
|
tristate "Netfilter Connection Mark Retriever"
|
|
depends on NET_CLS_ACT && NETFILTER
|
|
depends on NF_CONNTRACK && NF_CONNTRACK_MARK
|
|
help
|
|
Say Y here to allow retrieving of conn mark
|
|
|
|
If unsure, say N.
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called act_connmark.
|
|
|
|
config NET_ACT_CTINFO
|
|
tristate "Netfilter Connection Mark Actions"
|
|
depends on NET_CLS_ACT && NETFILTER
|
|
depends on NF_CONNTRACK && NF_CONNTRACK_MARK
|
|
help
|
|
Say Y here to allow transfer of a connmark stored information.
|
|
Current actions transfer connmark stored DSCP into
|
|
ipv4/v6 diffserv and/or to transfer connmark to packet
|
|
mark. Both are useful for restoring egress based marks
|
|
back onto ingress connections for qdisc priority mapping
|
|
purposes.
|
|
|
|
If unsure, say N.
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called act_ctinfo.
|
|
|
|
config NET_ACT_SKBMOD
|
|
tristate "skb data modification action"
|
|
depends on NET_CLS_ACT
|
|
help
|
|
Say Y here to allow modification of skb data
|
|
|
|
If unsure, say N.
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called act_skbmod.
|
|
|
|
config NET_ACT_IFE
|
|
tristate "Inter-FE action based on IETF ForCES InterFE LFB"
|
|
depends on NET_CLS_ACT
|
|
select NET_IFE
|
|
help
|
|
Say Y here to allow for sourcing and terminating metadata
|
|
For details refer to netdev01 paper:
|
|
"Distributing Linux Traffic Control Classifier-Action Subsystem"
|
|
Authors: Jamal Hadi Salim and Damascene M. Joachimpillai
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called act_ife.
|
|
|
|
config NET_ACT_TUNNEL_KEY
|
|
tristate "IP tunnel metadata manipulation"
|
|
depends on NET_CLS_ACT
|
|
help
|
|
Say Y here to set/release ip tunnel metadata.
|
|
|
|
If unsure, say N.
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called act_tunnel_key.
|
|
|
|
config NET_ACT_CT
|
|
tristate "connection tracking tc action"
|
|
depends on NET_CLS_ACT && NF_CONNTRACK && (!NF_NAT || NF_NAT) && NF_FLOW_TABLE
|
|
select NF_CONNTRACK_OVS
|
|
select NF_NAT_OVS if NF_NAT
|
|
help
|
|
Say Y here to allow sending the packets to conntrack module.
|
|
|
|
If unsure, say N.
|
|
|
|
To compile this code as a module, choose M here: the
|
|
module will be called act_ct.
|
|
|
|
config NET_ACT_GATE
|
|
tristate "Frame gate entry list control tc action"
|
|
depends on NET_CLS_ACT
|
|
help
|
|
Say Y here to allow to control the ingress flow to be passed at
|
|
specific time slot and be dropped at other specific time slot by
|
|
the gate entry list.
|
|
|
|
If unsure, say N.
|
|
To compile this code as a module, choose M here: the
|
|
module will be called act_gate.
|
|
|
|
config NET_IFE_SKBMARK
|
|
tristate "Support to encoding decoding skb mark on IFE action"
|
|
depends on NET_ACT_IFE
|
|
|
|
config NET_IFE_SKBPRIO
|
|
tristate "Support to encoding decoding skb prio on IFE action"
|
|
depends on NET_ACT_IFE
|
|
|
|
config NET_IFE_SKBTCINDEX
|
|
tristate "Support to encoding decoding skb tcindex on IFE action"
|
|
depends on NET_ACT_IFE
|
|
|
|
config NET_TC_SKB_EXT
|
|
bool "TC recirculation support"
|
|
depends on NET_CLS_ACT
|
|
select SKB_EXTENSIONS
|
|
|
|
help
|
|
Say Y here to allow tc chain misses to continue in OvS datapath in
|
|
the correct recirc_id, and hardware chain misses to continue in
|
|
the correct chain in tc software datapath.
|
|
|
|
Say N here if you won't be using tc<->ovs offload or tc chains offload.
|
|
|
|
endif # NET_SCHED
|
|
|
|
config NET_SCH_FIFO
|
|
bool
|