mainlining shenanigans
Go to file
Maxim Mikityanskiy d03b195b5a sch_htb: Hierarchical QoS hardware offload
HTB doesn't scale well because of contention on a single lock, and it
also consumes CPU. This patch adds support for offloading HTB to
hardware that supports hierarchical rate limiting.

In the offload mode, HTB passes control commands to the driver using
ndo_setup_tc. The driver has to replicate the whole hierarchy of classes
and their settings (rate, ceil) in the NIC. Every modification of the
HTB tree caused by the admin results in ndo_setup_tc being called.

After this setup, the HTB algorithm is done completely in the NIC. An SQ
(send queue) is created for every leaf class and attached to the
hierarchy, so that the NIC can calculate and obey aggregated rate
limits, too. In the future, it can be changed, so that multiple SQs will
back a single leaf class.

ndo_select_queue is responsible for selecting the right queue that
serves the traffic class of each packet.

The data path works as follows: a packet is classified by clsact, the
driver selects a hardware queue according to its class, and the packet
is enqueued into this queue's qdisc.

This solution addresses two main problems of scaling HTB:

1. Contention by flow classification. Currently the filters are attached
to the HTB instance as follows:

    # tc filter add dev eth0 parent 1:0 protocol ip flower dst_port 80
    classid 1:10

It's possible to move classification to clsact egress hook, which is
thread-safe and lock-free:

    # tc filter add dev eth0 egress protocol ip flower dst_port 80
    action skbedit priority 1:10

This way classification still happens in software, but the lock
contention is eliminated, and it happens before selecting the TX queue,
allowing the driver to translate the class to the corresponding hardware
queue in ndo_select_queue.

Note that this is already compatible with non-offloaded HTB and doesn't
require changes to the kernel nor iproute2.

2. Contention by handling packets. HTB is not multi-queue, it attaches
to a whole net device, and handling of all packets takes the same lock.
When HTB is offloaded, it registers itself as a multi-queue qdisc,
similarly to mq: HTB is attached to the netdev, and each queue has its
own qdisc.

Some features of HTB may be not supported by some particular hardware,
for example, the maximum number of classes may be limited, the
granularity of rate and ceil parameters may be different, etc. - so, the
offload is not enabled by default, a new parameter is used to enable it:

    # tc qdisc replace dev eth0 root handle 1: htb offload

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22 20:41:29 -08:00
arch arm64: dts: qcom: sdm845: kill IPA modem-remoteproc property 2021-01-21 20:42:46 -08:00
block block-5.11-2021-01-10 2021-01-10 12:53:08 -08:00
certs .gitignore: add SPDX License Identifier 2020-03-25 11:50:48 +01:00
crypto X.509: Fix crash caused by NULL pointer 2021-01-20 11:33:51 -08:00
Documentation dt-bindings: net: renesas,etheravb: Add r8a779a0 support 2021-01-22 18:06:03 -08:00
drivers net: hns3: replace skb->csum_not_inet with skb_csum_is_sctp 2021-01-22 19:27:15 -08:00
fs cachefiles: Drop superfluous readpages aops NULL check 2021-01-20 11:33:51 -08:00
include sch_htb: Hierarchical QoS hardware offload 2021-01-22 20:41:29 -08:00
init Revert "init/console: Use ttynull as a fallback when there is no console" 2021-01-08 11:02:18 -08:00
ipc Merge branch 'akpm' (patches from Andrew) 2020-12-15 12:53:37 -08:00
kernel Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-01-20 12:16:11 -08:00
lib Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-01-20 12:16:11 -08:00
LICENSES LICENSES: Add the CC-BY-4.0 license 2020-12-08 10:33:27 -07:00
mm mm: don't put pinned pages into the swap cache 2021-01-17 12:08:04 -08:00
net sch_htb: Hierarchical QoS hardware offload 2021-01-22 20:41:29 -08:00
samples bpf: Rename BPF_XADD and prepare to encode other atomics in .imm 2021-01-14 18:34:29 -08:00
scripts Kbuild fixes for v5.11 2021-01-10 13:24:55 -08:00
security dump_common_audit_data(): fix racy accesses to ->d_name 2021-01-16 15:11:35 -05:00
sound ALSA: hda/hdmi - enable runtime pm for CI AMD display audio 2021-01-12 16:06:01 +01:00
tools sch_htb: Hierarchical QoS hardware offload 2021-01-22 20:41:29 -08:00
usr Merge branch 'work.fdpic' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-08-07 13:29:39 -07:00
virt x86: 2021-01-08 15:06:02 -08:00
.clang-format RDMA 5.10 pull request 2020-10-17 11:18:18 -07:00
.cocciconfig
.get_maintainer.ignore Opt out of scripts/get_maintainer.pl 2019-05-16 10:53:40 -07:00
.gitattributes .gitattributes: use 'dts' diff driver for dts files 2019-12-04 19:44:11 -08:00
.gitignore .gitignore: docs: ignore sphinx_*/ directories 2020-09-10 10:44:31 -06:00
.mailmap MAINTAINERS: Update my email address 2021-01-15 23:55:16 +01:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS MAINTAINERS: dccp: move Gerrit Renker to CREDITS 2021-01-14 10:53:49 -08:00
Kbuild kbuild: rename hostprogs-y/always to hostprogs/always-y 2020-02-04 01:53:07 +09:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS MAINTAINERS: add entry for Arrow SpeedChips XRS7000 driver 2021-01-21 13:00:08 -08:00
Makefile Linux 5.11-rc4 2021-01-17 16:37:05 -08:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.