Merge tag 'mlx5-updates-2022-01-06' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2022-01-06

1) Expose FEC per-lane block counters via ethtool
2) Trivial fixes/updates/cleanup to the mlx5e netdev driver
3) Fix an htmldoc build warning
4) Spread mlx5 SFs (sub-functions) across all available CPU cores (commits 1..5)

Shay Drory says:

================
Before this patchset, mlx5 sub-functions (SFs) shared the same IRQs (MSI-X)
with their peer sub-functions, which forced them onto the same CPU cores. At
large scale this is very undesirable: SFs use a small number of CPU cores,
and all of them end up packed on the same cores, leaving the remaining CPU
cores in the system unused.

This patchset achieves two things:
 a) Spread the IRQs used by SFs across all CPU cores.
 b) Pack fewer SFs into each IRQ, which results in multiple IRQs per core.

SFs are spread over all online CPUs available to mlx5 IRQs in a round-robin
manner: whenever an SF is created, pick the CPU core with the least number of
SF IRQs bound to it. SFs share IRQs on that core up to a certain limit; when
the limit is reached, a new IRQ is requested and added to that CPU core's IRQ
pool. When no more IRQs can be requested, pick the IRQ with the least number
of SF users. (An illustrative sketch of this selection policy follows right
after this description.)
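To make the selection policy concrete, the sketch below models it as plain
user-space C. It is only an illustration of the policy as described above,
not the mlx5 driver code: the fixed-size arrays and all names (NUM_CPUS,
SFS_PER_IRQ_LIMIT, irq_cpu, irq_users, assign_sf_to_irq) are hypothetical
simplifications of the driver's per-CPU IRQ pools.

/*
 * Illustrative sketch of the SF-to-IRQ selection policy described above.
 * Not the mlx5 driver code; data structures and names are hypothetical.
 */
#include <stdio.h>

#define NUM_CPUS          8     /* online CPUs usable by mlx5 IRQs (example) */
#define MAX_IRQS          16    /* total IRQs we allow ourselves to request  */
#define SFS_PER_IRQ_LIMIT 4     /* how many SFs may share one IRQ            */

static int irq_cpu[MAX_IRQS];   /* which CPU each allocated IRQ is bound to  */
static int irq_users[MAX_IRQS]; /* how many SFs are attached to each IRQ     */
static int nr_irqs;             /* IRQs allocated so far                     */

/* Pick the CPU that currently has the fewest SF IRQs bound to it. */
static int least_loaded_cpu(void)
{
	int load[NUM_CPUS] = { 0 };
	int best = 0, i;

	for (i = 0; i < nr_irqs; i++)
		load[irq_cpu[i]]++;
	for (i = 1; i < NUM_CPUS; i++)
		if (load[i] < load[best])
			best = i;
	return best;
}

/* Attach a newly created SF to an IRQ, following the policy above. */
static int assign_sf_to_irq(void)
{
	int cpu = least_loaded_cpu();
	int i, best = -1;

	/* Reuse the least-used IRQ on that CPU if it is below the limit. */
	for (i = 0; i < nr_irqs; i++)
		if (irq_cpu[i] == cpu && irq_users[i] < SFS_PER_IRQ_LIMIT &&
		    (best < 0 || irq_users[i] < irq_users[best]))
			best = i;

	/* Otherwise request a new IRQ and add it to that CPU's pool. */
	if (best < 0 && nr_irqs < MAX_IRQS) {
		best = nr_irqs++;
		irq_cpu[best] = cpu;
	}

	/* Out of IRQs: fall back to the IRQ with the fewest SF users. */
	if (best < 0)
		for (best = 0, i = 1; i < nr_irqs; i++)
			if (irq_users[i] < irq_users[best])
				best = i;

	irq_users[best]++;
	return best;
}

int main(void)
{
	int sf;

	for (sf = 0; sf < 40; sf++) {
		int irq = assign_sf_to_irq();
		printf("SF %2d -> IRQ %2d on CPU %d\n", sf, irq, irq_cpu[irq]);
	}
	return 0;
}

Running the sketch shows the intended behaviour: the first SFs each get a
fresh IRQ on a different core, later SFs share those IRQs up to the per-IRQ
limit, and additional IRQs are requested only once a core's existing IRQs are
full; if MAX_IRQS were exhausted, new SFs would fall back to the least-used
existing IRQ.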
This gives a better distribution of the SFs over all available CPUs, which
reduces application latency, as shown below.

Machine details:
  Intel(R) Xeon(R) CPU E5-2697 v3 @ 2.60GHz with 56 cores.
  PCI Express 3 with BW of 126 Gb/s.
  ConnectX-5 Ex; EDR IB (100Gb/s) and 100GbE; dual-port QSFP28; PCIe 4.0 x16.

Baseline test description:
  A single SF on the system, with one instance of netperf running on top of
  the SF.
  Numbers: latency = 15.136 usec, CPU util = 35%

Test description:
  There are 250 SFs on the system, with 3 instances of netperf running in
  parallel on top of three different SFs.

Perf numbers:

 #  netperf    SFs          latency (usec)     latency     CPU utilization
    affinity   affinity     (lower is better)  increase %
 1  cpu=0      cpu={0}      ~23 (app 1-3)      35%         75%
 2  cpu=0,2,4  cpu={0}      app 1: 21.625      30%         68% (CPU 0)
                            app 2-3: 16.5      9%          15% (CPU 2,4)
 3  cpu=0      cpu={0,2,4}  app 1: ~16         7%          84% (CPU 0)
                            app 2-3: ~17.9     14%         22% (CPU 2,4)
 4  cpu=0,2,4  cpu={0,2,4}  15.2 (app 1-3)     0%          33% (CPU 0,2,4)

- The first two entries (#1 and #2) show the current state, i.e. SFs are
  using the same CPU. The last two entries (#3 and #4) show the latency
  reduction achieved by this patchset, i.e. SFs are on different CPUs.
- Whenever several CPUs are used and their utilization differs, the
  utilization of each CPU is listed separately.
- Whenever the latency of the netperf instances differs, the latency of each
  netperf instance is listed separately.

Commands:
- netperf with CPU=0:
  $ for i in {1..3}; do taskset -c 0 netperf -H 1${i}.1.1.1 -t TCP_RR -- \
        -o RT_LATENCY -r8 & done
- netperf with CPU=0,2,4 (taskset -c $(( ($i - 1) * 2 )) pins the three
  instances to cores 0, 2 and 4):
  $ for i in {1..3}; do taskset -c $(( ($i - 1) * 2 )) netperf -H \
        1${i}.1.1.1 -t TCP_RR -- -o RT_LATENCY -r8 & done
================
====================

Signed-off-by: David S. Miller <davem@davemloft.net>