linux/arch/arm64/lib
Puranjay Mohan 7a4c32222b arm64, bpf: add internal-only MOV instruction to resolve per-CPU addrs
Support an instruction for resolving absolute addresses of per-CPU
data from their per-CPU offsets. This instruction is internal-only and
users are not allowed to use it directly. It will only be used for
internal inlining optimizations for now, between the BPF verifier and
the BPF JITs.

Since commit 7158627686 ("arm64: percpu: implement optimised pcpu
access using tpidr_el1"), the per-cpu offset for the CPU is stored in
the tpidr_el1/2 register of that CPU.

To support this BPF instruction in the ARM64 JIT, the following ARM64
instructions are emitted:

mov dst, src		// Move src to dst, if src != dst
mrs tmp, tpidr_el1/2	// Move the per-cpu offset of the current cpu into tmp.
add dst, dst, tmp	// Add the per-cpu offset to dst.
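
The net effect of the sequence above is dst = src + per-cpu offset of
the current CPU. A minimal userspace sketch of that arithmetic follows.
It is a model only, not the JIT code in this commit: NR_CPUS, the offset
table, and every model_* name are hypothetical, and the table merely
stands in for the value each CPU holds in tpidr_el1/2.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 4	/* hypothetical CPU count for this model */

/* Assumption: stand-in for the per-CPU offset each CPU keeps in tpidr_el1/2. */
static const uint64_t model_percpu_offset[NR_CPUS] = {
	0x0000, 0x1000, 0x2000, 0x3000,
};

/* Models "mrs tmp, tpidr_el1/2" as executed on the given CPU. */
static uint64_t model_read_tpidr(unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	return model_percpu_offset[cpu];
}

/* Models the three emitted instructions: mov, mrs, add. */
static uint64_t model_resolve_percpu_addr(uint64_t src, unsigned int cpu)
{
	uint64_t dst = src;			/* mov dst, src */
	uint64_t tmp = model_read_tpidr(cpu);	/* mrs tmp, tpidr_el1/2 */

	dst += tmp;				/* add dst, dst, tmp */
	return dst;
}
```

In this model the same per-CPU offset (src) resolves to a different
absolute address on each CPU, because only the value read in the mrs
step changes.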

To measure the performance improvement provided by this change, the
benchmark in [1] was used:

Before:
glob-arr-inc   :   23.597 ± 0.012M/s
arr-inc        :   23.173 ± 0.019M/s
hash-inc       :   12.186 ± 0.028M/s

After:
glob-arr-inc   :   23.819 ± 0.034M/s
arr-inc        :   23.285 ± 0.017M/s
hash-inc       :   12.419 ± 0.011M/s

[1] https://github.com/anakryiko/linux/commit/8dec900975ef

Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240502151854.9810-4-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12 16:54:34 -07:00
clear_page.S arm64: clean up symbol aliasing 2022-02-22 16:21:34 +00:00
clear_user.S arm64: extable: consolidate definitions 2021-10-21 10:45:22 +01:00
copy_from_user.S arm64: lib: __arch_copy_from_user(): fold fixups into body 2021-10-21 10:45:21 +01:00
copy_page.S arm64: Get rid of ARM64_HAS_NO_HW_PREFETCH 2023-12-05 12:02:52 +00:00
copy_template.S treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234 2019-06-19 17:09:07 +02:00
copy_to_user.S arm64: lib: __arch_copy_to_user(): fold fixups into body 2021-10-21 10:45:21 +01:00
crc32.S arm64: lib: accelerate crc32_be 2022-01-31 11:21:43 +11:00
csum.c arm64: csum: Fix OoB access in IP checksum code for negative lengths 2023-09-07 10:15:20 +01:00
delay.c arm64: Avoid cpus_have_const_cap() for ARM64_HAS_WFXT 2023-10-16 14:17:05 +01:00
error-inject.c arm64: Add support for function error injection 2019-08-07 13:53:09 +01:00
insn.c arm64, bpf: add internal-only MOV instruction to resolve per-CPU addrs 2024-05-12 16:54:34 -07:00
kasan_sw_tags.S arm64: Use BTI C directly and unconditionally 2021-12-14 18:12:58 +00:00
Makefile isystem: delete global -isystem compile option 2021-09-22 09:26:24 +09:00
memchr.S arm64: clean up symbol aliasing 2022-02-22 16:21:34 +00:00
memcmp.S arm64: clean up symbol aliasing 2022-02-22 16:21:34 +00:00
memcpy.S arm64: clean up symbol aliasing 2022-02-22 16:21:34 +00:00
memset.S arm64: clean up symbol aliasing 2022-02-22 16:21:34 +00:00
mte.S arm64/sysreg: Remove duplicate definitions from asm/sysreg.h 2022-12-01 17:31:12 +00:00
strchr.S arm64: clean up symbol aliasing 2022-02-22 16:21:34 +00:00
strcmp.S Merge branch 'for-next/strings' into for-next/core 2022-03-14 19:02:52 +00:00
strlen.S arm64: clean up symbol aliasing 2022-02-22 16:21:34 +00:00
strncmp.S Merge branch 'for-next/strings' into for-next/core 2022-03-14 19:02:52 +00:00
strnlen.S arm64: clean up symbol aliasing 2022-02-22 16:21:34 +00:00
strrchr.S arm64: clean up symbol aliasing 2022-02-22 16:21:34 +00:00
tishift.S arm64: lib: Use modern annotations for assembly functions 2020-01-08 12:23:02 +00:00
uaccess_flushcache.c arm: uaccess: Remove memcpy_page_flushcache() 2023-03-27 16:26:19 +01:00
xor-neon.c arm64: xor-neon: mark xor_arm64_neon_*() static 2023-05-25 17:44:01 +01:00