linux/drivers
Alexander Lobakin ea01fa7031 iommu/dma: avoid expensive indirect calls for sync operations
When the IOMMU is on, the actual synchronization happens in the same
cases as with direct DMA. Advertise %DMA_F_CAN_SKIP_SYNC in IOMMU DMA to
skip the (indirect) sync ops calls for non-SWIOTLB buffers.
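
The idea can be illustrated with a small user-space model. This is only a
sketch under stated assumptions, not the kernel code touched by this patch:
every name in it (model_dev, model_dma_ops, can_skip_sync,
has_bounce_buffers, model_run) is hypothetical, and the bounce-buffer flag
merely stands in for "a SWIOTLB buffer is in use". It shows a cheap
per-device check placed in front of the function-pointer call, so the
indirect call is only made when a sync may actually be needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel API. */
struct model_dev;

struct model_dma_ops {
	/* Backend sync hook, reached through a function pointer. */
	void (*sync_for_cpu)(struct model_dev *dev, void *buf, size_t len);
};

struct model_dev {
	const struct model_dma_ops *ops;
	bool can_skip_sync;          /* backend advertised "sync may be skipped" */
	bool has_bounce_buffers;     /* stand-in for "a SWIOTLB buffer is in use" */
	unsigned int indirect_calls; /* how often the backend hook actually ran */
};

/* Backend hook: here it only counts how many times it was reached. */
static void model_iommu_sync_for_cpu(struct model_dev *dev, void *buf,
				     size_t len)
{
	(void)buf;
	(void)len;
	dev->indirect_calls++;
}

static const struct model_dma_ops model_iommu_ops = {
	.sync_for_cpu = model_iommu_sync_for_cpu,
};

/* Cheap inline test that decides whether the indirect call is needed. */
static bool model_need_sync(const struct model_dev *dev)
{
	return !dev->can_skip_sync || dev->has_bounce_buffers;
}

/* Wrapper a driver would call: skips the function pointer when possible. */
static void model_sync_for_cpu(struct model_dev *dev, void *buf, size_t len)
{
	assert(dev && dev->ops && dev->ops->sync_for_cpu);

	if (!model_need_sync(dev))
		return;

	dev->ops->sync_for_cpu(dev, buf, len);
}

/* Issue n sync requests and report how many reached the backend hook. */
static unsigned int model_run(bool can_skip, bool bounce, unsigned int n)
{
	char buf[64];
	struct model_dev dev = {
		.ops = &model_iommu_ops,
		.can_skip_sync = can_skip,
		.has_bounce_buffers = bounce,
	};

	for (unsigned int i = 0; i < n; i++)
		model_sync_for_cpu(&dev, buf, sizeof(buf));

	return dev.indirect_calls;
}
```

In this model, model_run(true, false, n) never reaches the backend hook,
while either clearing can_skip_sync or setting has_bounce_buffers makes
every request take the indirect call again.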

perf profile before the patch:

    18.53%  [kernel]       [k] gq_rx_skb
    14.77%  [kernel]       [k] napi_reuse_skb
     8.95%  [kernel]       [k] skb_release_data
     5.42%  [kernel]       [k] dev_gro_receive
     5.37%  [kernel]       [k] memcpy
<*>  5.26%  [kernel]       [k] iommu_dma_sync_sg_for_cpu
     4.78%  [kernel]       [k] tcp_gro_receive
<*>  4.42%  [kernel]       [k] iommu_dma_sync_sg_for_device
     4.12%  [kernel]       [k] ipv6_gro_receive
     3.65%  [kernel]       [k] gq_pool_get
     3.25%  [kernel]       [k] skb_gro_receive
     2.07%  [kernel]       [k] napi_gro_frags
     1.98%  [kernel]       [k] tcp6_gro_receive
     1.27%  [kernel]       [k] gq_rx_prep_buffers
     1.18%  [kernel]       [k] gq_rx_napi_handler
     0.99%  [kernel]       [k] csum_partial
     0.74%  [kernel]       [k] csum_ipv6_magic
     0.72%  [kernel]       [k] free_pcp_prepare
     0.60%  [kernel]       [k] __napi_poll
     0.58%  [kernel]       [k] net_rx_action
     0.56%  [kernel]       [k] read_tsc
<*>  0.50%  [kernel]       [k] __x86_indirect_thunk_r11
     0.45%  [kernel]       [k] memset

After the patch, the lines marked with <*> no longer show up, and overall
CPU usage looks much better (~60% instead of ~72%):

    25.56%  [kernel]       [k] gq_rx_skb
     9.90%  [kernel]       [k] napi_reuse_skb
     7.39%  [kernel]       [k] dev_gro_receive
     6.78%  [kernel]       [k] memcpy
     6.53%  [kernel]       [k] skb_release_data
     6.39%  [kernel]       [k] tcp_gro_receive
     5.71%  [kernel]       [k] ipv6_gro_receive
     4.35%  [kernel]       [k] napi_gro_frags
     4.34%  [kernel]       [k] skb_gro_receive
     3.50%  [kernel]       [k] gq_pool_get
     3.08%  [kernel]       [k] gq_rx_napi_handler
     2.35%  [kernel]       [k] tcp6_gro_receive
     2.06%  [kernel]       [k] gq_rx_prep_buffers
     1.32%  [kernel]       [k] csum_partial
     0.93%  [kernel]       [k] csum_ipv6_magic
     0.65%  [kernel]       [k] net_rx_action

With iavf, Rx throughput improves by 10% in Mpps. This also unblocks
batched allocations of XSk buffers when the IOMMU is active.

Co-developed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2024-05-07 13:29:53 +02:00
accel accel/ivpu: Fix deadlock in context_xa 2024-04-08 10:55:01 +02:00
accessibility speakup: Avoid crash on very long word 2024-04-11 14:32:53 +02:00
acpi Merge branch 'acpi-cppc' 2024-04-25 19:25:54 +02:00
amba
android binder: check offset alignment in binder_get_object() 2024-04-11 15:19:12 +02:00
ata ata: libata-core: Allow command duration limits detection for ACS-4 drives 2024-04-13 10:42:28 +09:00
atm atm: fore200e: Convert to platform remove callback returning void 2024-03-07 20:36:32 -08:00
auxdisplay auxdisplay: img-ascii-lcd: Convert to platform remove callback returning void 2024-03-12 17:37:54 +02:00
base regmap: Fixes for v6.9 2024-04-05 17:21:16 -07:00
bcma
block nullblk: Fix cleanup order in null_add_dev() error path 2024-04-02 07:43:24 -06:00
bluetooth Bluetooth: qca: set power_ctrl_enabled on NULL returned by gpiod_get_optional() 2024-04-24 16:26:22 -04:00
bus Char/Misc and other driver subsystem updates for 6.9-rc1 2024-03-21 13:21:31 -07:00
cache cache: sifive_ccache: Partially convert to a platform driver 2024-03-28 22:40:56 +00:00
cdrom cdrom: gdrom: Convert to platform remove callback returning void 2024-03-07 11:53:30 -07:00
cdx cdx: add MSI support for CDX bus 2024-03-07 21:52:03 +00:00
char random: handle creditable entropy from atomic process context 2024-04-17 13:53:18 +02:00
clk clk: mediatek: mt7988-infracfg: fix clocks for 2nd PCIe port 2024-04-10 20:50:26 -07:00
clocksource A set of updates for clocksource and clockevent drivers: 2024-03-23 14:42:45 -07:00
comedi comedi: vmk80xx: fix incomplete endpoint checking 2024-04-11 15:16:23 +02:00
connector
counter
cpufreq RISC-V Patches for the 6.9 Merge Window 2024-03-22 10:41:13 -07:00
cpuidle RISC-V Patches for the 6.9 Merge Window 2024-03-22 10:41:13 -07:00
crypto x86/CPU/AMD: Track SNP host status with cc_platform_*() 2024-04-04 10:40:30 +02:00
cxl cxl/core: Fix potential payload size confusion in cxl_mem_get_poison() 2024-04-22 08:58:59 -07:00
dax libnvdimm updates for v6.9 2024-03-15 11:58:32 -07:00
dca
devfreq
dio dio: make dio_bus_type const 2024-03-07 20:37:04 +00:00
dma dmaengine: idxd: Fix oops during rmmod on single-CPU platforms 2024-04-07 17:56:06 +05:30
dma-buf Merge drm/drm-fixes into drm-misc-fixes 2024-03-25 21:11:58 +01:00
dpll dpll: fix dpll_pin_on_pin_register() for multiple parent pins 2024-04-25 08:32:09 -07:00
edac - Add a FRU (Field Replaceable Unit) memory poison manager which 2024-03-11 18:14:06 -07:00
eisa
extcon
firewire firewire: ohci: mask bus reset interrupts between ISR and bottom half 2024-04-06 09:36:46 +09:00
firmware Qualcomm driver fix for v6.9 2024-04-26 18:08:02 +02:00
fpga Char/Misc and other driver subsystem updates for 6.9-rc1 2024-03-21 13:21:31 -07:00
fsi
gnss
gpio intel-gpio for v6.9-2 2024-04-25 14:35:55 +02:00
gpu - Fix error paths on managed allocations 2024-04-26 12:56:58 +10:00
greybus Char/Misc and other driver subsystem updates for 6.9-rc1 2024-03-21 13:21:31 -07:00
hid HID: mcp-2221: cancel delayed_work only when CONFIG_IIO is enabled 2024-04-12 17:48:53 +02:00
hsi
hte
hv hyperv-fixes for v6.9-rc4 2024-04-11 16:23:56 -07:00
hwmon - Kuan-Wei Chiu has developed the well-named series "lib min_heap: Min 2024-03-14 18:03:09 -07:00
hwspinlock hwspinlock: omap: Use index to get hwspinlock pointer 2024-03-05 20:01:14 -08:00
hwtracing coresight-tpda: Change qcom,dsb-element-size to qcom,dsb-elem-bits 2024-02-27 11:26:45 +00:00
i2c i2c: smbus: fix NULL function pointer dereference 2024-04-27 12:57:57 +02:00
i3c
idle cpuidle: ACPI/intel: fix MWAIT hint target C-state computation 2024-03-05 21:25:18 +01:00
iio Char/Misc and other driver subsystem updates for 6.9-rc1 2024-03-21 13:21:31 -07:00
infiniband RDMA/mlx5: Fix port number for counter query in multi-port configuration 2024-04-08 13:33:10 +03:00
input TTY/Serial driver update for 6.9-rc1 2024-03-21 12:44:10 -07:00
interconnect interconnect fixes for v6.9-rc 2024-04-11 14:44:49 +02:00
iommu iommu/dma: avoid expensive indirect calls for sync operations 2024-05-07 13:29:53 +02:00
ipack ipack: make ipack_bus_type const 2024-03-07 20:32:47 +00:00
irqchip irqchip/gic-v3-its: Prevent double free on error 2024-04-25 14:30:46 +02:00
isdn mISDN: fix MISDN_TIME_STAMP handling 2024-04-09 17:01:01 -07:00
leds - Kuan-Wei Chiu has developed the well-named series "lib min_heap: Min 2024-03-14 18:03:09 -07:00
macintosh powerpc updates for 6.9 2024-03-15 17:53:48 -07:00
mailbox imx: add support for i.MX95 ELE/V2X MU 2024-03-13 12:23:36 -07:00
mcb mcb: constify the struct device_type usage 2024-03-07 20:38:15 +00:00
md - Fix 6.9 regression so that DM device removal is performed 2024-04-26 11:17:24 -07:00
media media: mediatek: vcodec: support 36 bits physical address 2024-03-26 09:52:59 +01:00
memory Char/Misc and other driver subsystem updates for 6.9-rc1 2024-03-21 13:21:31 -07:00
memstick MMC core: 2024-03-13 10:59:28 -07:00
message
mfd TTY/Serial driver update for 6.9-rc1 2024-03-21 12:44:10 -07:00
misc at24 fixes for v6.9-rc6 2024-04-26 07:59:20 +02:00
mmc MMC host: 2024-04-26 13:17:33 -07:00
most most: core: make mostbus const 2024-03-07 20:32:38 +00:00
mtd There has been OTP support improvements in the NVMEM subsystem, and 2024-04-26 13:05:34 -07:00
mux
net net: b44: set pause params only when interface is up 2024-04-25 08:34:18 -07:00
nfc NFC: trf7970a: disable all regulators on removal 2024-04-22 14:19:58 -07:00
ntb
nubus
nvdimm libnvdimm updates for v6.9 2024-03-15 11:58:32 -07:00
nvme nvme-fc: rename free_ctrl callback to match name pattern 2024-04-04 08:47:56 -07:00
nvmem nvmem: core: Print error on wrong bits DT property 2024-03-07 20:21:53 +00:00
of of: module: prevent NULL pointer dereference in vsnprintf() 2024-03-27 17:05:07 -05:00
opp OPP: Extend dev_pm_opp_data with turbo support 2024-03-11 10:39:24 +05:30
parisc parisc: led: Convert to platform remove callback returning void 2024-03-08 10:00:07 +01:00
parport parport: sunbpp: Convert to platform remove callback returning void 2024-03-07 21:50:06 +00:00
pci Revert "PCI: Mark LSI FW643 to avoid bus reset" 2024-03-29 11:57:12 -05:00
pcmcia pcmcia: cs: make pcmcia_socket_class constant 2024-03-10 09:07:00 +01:00
peci
perf drivers/perf: riscv: Disable PERF_SAMPLE_BRANCH_* while not supported 2024-03-26 14:09:18 -07:00
phy phy: ti: tusb1210: Resolve charger-det crash if charger psy is unregistered 2024-04-12 16:57:19 +05:30
pinctrl Kbuild fixes for v6.9 2024-03-31 11:23:51 -07:00
platform platform-drivers-x86 for v6.9-3 2024-04-18 07:15:33 -07:00
pmdomain Core: 2024-03-13 11:33:10 -07:00
pnp
power power supply and reset changes for the 6.9 series 2024-03-14 10:19:48 -07:00
powercap powercap: intel_rapl: Convert to platform remove callback returning void 2024-03-13 20:45:54 +01:00
pps pps: use cflags-y instead of EXTRA_CFLAGS 2024-03-07 21:51:39 +00:00
ps3
ptp Networking changes for 6.9. 2024-03-12 17:44:08 -07:00
pwm pwm: dwc: allow suspend/resume for 16 channels 2024-04-15 17:28:13 +02:00
rapidio
ras RAS: Avoid build errors when CONFIG_DEBUG_FS=n 2024-03-26 21:48:21 +01:00
regulator regulator: tps65132: Add of_match table 2024-03-25 19:28:27 +00:00
remoteproc remoteproc updates for v6.9 2024-03-21 10:37:39 -07:00
reset
rpmsg
rtc RTC for 6.9 2024-03-21 17:16:46 -07:00
s390 s390 updates for 6.9-rc5 2024-04-19 09:59:15 -07:00
sbus This includes the following changes related to sparc for v6.9: 2024-03-15 12:47:21 -07:00
scsi scsi: core: Fix handling of SCMD_FAIL_IF_RECOVERING 2024-04-08 21:40:29 -04:00
sh
siox SIOX changes for 6.9-rc1 2024-03-21 15:18:18 -07:00
slimbus slimbus: core: make slimbus_bus const 2024-03-07 20:21:39 +00:00
soc soc: mediatek: mtk-socinfo: depends on CONFIG_SOC_BUS 2024-04-23 12:09:12 +02:00
soundwire soundwire: amd: fix for wake interrupt handling for clockstop mode 2024-03-28 23:40:33 +05:30
spi spi: mchp-pci1xxx: Fix a possible null pointer dereference in pci1xxx_spi_probe 2024-04-03 11:04:58 +01:00
spmi
ssb
staging staging: vc04_services: fix information leak in create_component() 2024-03-25 19:10:01 +01:00
target scsi: target: Fix SELinux error when systemd-modules loads the target module 2024-04-05 21:37:54 -04:00
tc
tee ARM: SoC drivers for 6.9 2024-03-12 10:35:24 -07:00
thermal thermal/debugfs: Add missing count increment to thermal_debug_tz_trip_up() 2024-04-19 15:08:19 +02:00
thunderbolt thunderbolt: Avoid notify PM core about runtime PM resume 2024-04-10 10:49:58 +03:00
tty serial: stm32: Reset .throttled state in .startup() 2024-04-17 13:26:45 +02:00
ufs scsi: ufs: qcom: Add missing interconnect bandwidth values for Gear 5 2024-04-08 15:06:56 -04:00
uio hyperv-fixes for v6.9-rc4 2024-04-11 16:23:56 -07:00
usb USB-serial device ids for 6.9-rc5 2024-04-19 16:07:18 +02:00
vdpa vDPA: code clean for vhost_vdpa uapi 2024-04-22 17:07:13 -04:00
vfio VFIO updates for v6.9-rc1 2024-03-15 13:21:13 -07:00
vhost vhost: correct misleading printing information 2024-04-08 04:11:04 -04:00
video fbdev: fix incorrect address computation in deferred IO 2024-04-24 15:03:37 +02:00
virt Revert "vmgenid: emit uevent when VMGENID updates" 2024-04-18 14:47:23 +02:00
virtio virtio: store owner from modules with register_virtio_driver() 2024-04-08 04:11:04 -04:00
w1
watchdog linux-watchdog 6.9-rc1 tag 2024-03-17 12:06:10 -07:00
xen swiotlb: remove alloc_size argument to swiotlb_tbl_map_single() 2024-05-07 13:29:28 +02:00
zorro
Kconfig
Makefile Revert "leds: Only descend into leds directory when CONFIG_NEW_LEDS is set" 2024-03-07 08:48:10 +00:00