mainlining shenanigans
David S. Miller f0c227c7df mlx5-updates-2021-06-14

Merge tag 'mlx5-updates-2021-06-14' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2021-06-14

1) Trivial Lag refactoring in preparation for the upcoming Single FDB lag feature
 - First 3 patches

2) Scalable IRQ distribution for Sub-functions

A subfunction (SF) is a lightweight function that has a parent PCI
function (PF) on which it is deployed.

Currently, mlx5 subfunctions share the IRQs (MSI-X) of their
parent PCI function.

Before this series, the PF allocates enough IRQs to cover
all the cores in the system, and newly created SFs reuse all the IRQs
that the PF has allocated for itself.
Hence, the more SFs are created, the more EQs there are per IRQ. Therefore,
whenever we handle an interrupt, we need to poll all SF EQs and PF EQs,
instead of only the PF EQs when there are no SFs on the system. This has
a severe impact on the performance of both the SFs and the PF.

For example, on a machine with:
Intel(R) Xeon(R) CPU E5-2697 v3 @ 2.60GHz with 56 cores.
PCI Express 3 with BW of 126 Gb/s.
ConnectX-5 Ex; EDR IB (100Gb/s) and 100GbE; dual-port QSFP28; PCIe4.0 x16.

Test case: iperf TX BW on a single CPU; the app and the IRQ have the same affinity.
PF only: no SFs on the system, 56 IRQs.
SF (before): 250 SFs sharing the same 56 IRQs.
SF (now): 250 SFs + 255 available IRQs for the NIC (please see the IRQ spread scheme below).

             application  SF-IRQ   channel    BW (Gb/sec)   interrupts/sec
             (iperf TX)            affinity
PF only      cpu={0}      cpu={0}  cpu={0}    79            8200
SF (before)  cpu={0}      cpu={0}  cpu={0}    51.3 (-35%)   9500
SF (now)     cpu={0}      cpu={0}  cpu={0}    78 (-2%)      8200

command:
$ taskset -c 0 iperf -c 11.1.1.1 -P 3 -i 6 -t 30 | grep SUM

The difference between the SF examples is that before this series we
allocated num_cpus (56) IRQs, all of which were shared among the PF
and the SFs. After this series, we allocate 255 IRQs and spread
the SFs among them. This has significantly decreased the load
on each IRQ, and the number of EQs per IRQ is down by 95% (251->11).
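
As a quick sanity check of the sharing figures above, here is a minimal
Python sketch. It assumes that in the shared scheme each IRQ carries one
EQ per SF plus one PF EQ (an assumption made to reproduce the 251). The
"after" value of 11 is copied from the text, not derived; the function
names are hypothetical.

```python
def eqs_per_irq_shared(num_sfs, pf_eqs=1):
    """EQs on one IRQ when every SF shares every PF IRQ (assumed model)."""
    return num_sfs + pf_eqs


def reduction_percent(before, after):
    """Relative decrease from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


before = eqs_per_irq_shared(250)   # 250 SFs + 1 PF EQ
after = 11                         # figure quoted in the text, not computed here
print(before, after, round(reduction_percent(before, after), 1))
```

Under these assumptions the decrease works out to about 95.6%, which is
consistent with the 95% quoted above.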

In this patchset, the proposed solution is to have a dedicated IRQ pool
for SFs to use. The pool allocates a large number of IRQs
for SFs to grab from, in order to minimize IRQ sharing between the
different SFs.
IRQs are not requested from the OS until they are first requested by
an SF consumer, and they are released when the last SF consumer
releases them.
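
The lazy request and last-user release behaviour described above can be
modelled with a small reference-counted pool. The following Python sketch
is a toy model only, not the mlx5 driver code: the class and method names
are hypothetical, the "OS" is simulated by two counters, and the
"share the least-loaded vector" policy is an assumption standing in for
the actual spread scheme described in the last patch.

```python
class IrqPool:
    """Toy model of a dedicated IRQ pool with lazy request and last-user release."""

    def __init__(self, size):
        if size <= 0:
            raise ValueError("pool size must be positive")
        self.size = size
        self.refcount = {}    # vector index -> number of current consumers
        self.os_requests = 0  # times a vector was requested from the simulated OS
        self.os_releases = 0  # times a vector was handed back to the simulated OS

    def get(self):
        """Return a vector index for a new consumer."""
        if len(self.refcount) < self.size:
            # An unused vector exists: request it now, on first use.
            vec = next(i for i in range(self.size) if i not in self.refcount)
            self.os_requests += 1
            self.refcount[vec] = 1
            return vec
        # Pool exhausted: share the least-loaded vector (assumed policy).
        vec = min(self.refcount, key=lambda i: (self.refcount[i], i))
        self.refcount[vec] += 1
        return vec

    def put(self, vec):
        """Drop one consumer; release the vector when the last one leaves."""
        self.refcount[vec] -= 1
        if self.refcount[vec] == 0:
            del self.refcount[vec]
            self.os_releases += 1


pool = IrqPool(2)
a, b, c = pool.get(), pool.get(), pool.get()  # third consumer must share
pool.put(a)                                   # vector still in use by c
pool.put(c)                                   # last user gone: released
```

Nothing is requested at pool creation time; the counters only move when a
consumer actually takes the first reference or drops the last one.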

For the detailed IRQ spread and allocation scheme, please see the last patch:
("net/mlx5: Round-Robin EQs over IRQs")
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-15 11:14:21 -07:00
arch s390/qeth: remove QAOB's pointer to its TX buffer 2021-06-11 12:49:15 -07:00
block block-5.13-2021-05-22 2021-05-22 07:40:34 -10:00
certs Kbuild updates for v5.13 (2nd) 2021-05-08 10:00:11 -07:00
crypto for-5.13/drivers-2021-04-27 2021-04-28 14:39:37 -07:00
Documentation dt-bindings: dwmac: Add bindings for new Ingenic SoCs. 2021-06-14 13:06:52 -07:00
drivers mlx5-updates-2021-06-14 2021-06-15 11:14:21 -07:00
fs proc: Check /proc/$pid/attr/ writes against file opener 2021-05-25 10:24:41 -10:00
include net/mlx5: Enlarge interrupt field in CREATE_EQ 2021-06-14 20:58:00 -07:00
init Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf 2021-05-11 16:05:56 -07:00
ipc ipc/mqueue, msg, sem: avoid relying on a stack reference past its expiry 2021-05-22 15:09:07 -10:00
kernel Merge ra.kernel.org:/pub/scm/linux/kernel/git/netdev/net 2021-06-07 13:01:52 -07:00
lib lib: kunit: suppress a compilation warning of frame size 2021-05-22 15:09:07 -10:00
LICENSES LICENSES: Add the CC-BY-4.0 license 2020-12-08 10:33:27 -07:00
mm userfaultfd: hugetlbfs: fix new flag usage in error path 2021-05-22 15:09:07 -10:00
net net/sched: cls_flower: Remove match on n_proto 2021-06-15 10:26:51 -07:00
samples samples: pktgen: add UDP tx checksum support 2021-05-28 14:52:13 -07:00
scripts kbuild: Quote OBJCOPY var to avoid a pahole call break the build 2021-05-27 11:32:56 -07:00
security trusted-keys: match tpm_get_ops on all return paths 2021-05-12 22:36:37 +03:00
sound sound fixes for 5.13-rc3 2021-05-20 06:42:21 -10:00
tools testing: selftests: drivers: net: netdevsim: devlink: add test case for hard drop statistics 2021-06-14 13:04:25 -07:00
usr .gitignore: prefix local generated files with a slash 2021-05-02 00:43:35 +09:00
virt kvm: Cap halt polling at kvm->max_halt_poll_ns 2021-05-07 06:06:22 -04:00
.clang-format cxl for 5.12 2021-02-24 09:38:36 -08:00
.cocciconfig
.get_maintainer.ignore Opt out of scripts/get_maintainer.pl 2019-05-16 10:53:40 -07:00
.gitattributes .gitattributes: use 'dts' diff driver for dts files 2019-12-04 19:44:11 -08:00
.gitignore .gitignore: ignore only top-level modules.builtin 2021-05-02 00:43:35 +09:00
.mailmap Merge drm/drm-fixes into drm-misc-fixes 2021-05-11 13:35:52 +02:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS MAINTAINERS: move Murali Karicheri to credits 2021-04-29 15:47:30 -07:00
Kbuild kbuild: rename hostprogs-y/always to hostprogs/always-y 2020-02-04 01:53:07 +09:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS net: iosm: infrastructure 2021-06-13 13:49:39 -07:00
Makefile Linux 5.13-rc3 2021-05-23 11:42:48 -10:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.