39491867ac
As pointed out by Ilya and explained in the new comment, there's a
discrepancy between x86 and BPF CMPXCHG semantics: BPF always loads
the value from memory into r0, while x86 only does so when r0 and the
value in memory are different. The same issue affects s390.
At first this might sound like pure semantics, but it makes a real
difference when the comparison is 32-bit, since the load will
zero-extend r0/rax.
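
The difference can be reproduced in a small user-space model. The sketch below is not kernel or JIT code: `bpf_cmpxchg32` and `x86_cmpxchg32` are hypothetical helpers that treat a 64-bit integer as r0/rax and only mimic how each instruction set writes that register after a 32-bit compare-and-exchange.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of the BPF semantics: r0 always receives the old
 * 32-bit value from memory, zero-extended to 64 bits.
 */
static uint64_t bpf_cmpxchg32(uint32_t *mem, uint64_t r0, uint32_t src)
{
	uint32_t old;

	assert(mem != NULL);
	old = *mem;
	if (old == (uint32_t)r0)
		*mem = src;
	return (uint64_t)old;	/* upper 32 bits are always cleared */
}

/*
 * Hypothetical model of the x86 semantics: eax is written (which
 * zero-extends rax) only when the comparison fails.
 */
static uint64_t x86_cmpxchg32(uint32_t *mem, uint64_t rax, uint32_t src)
{
	uint32_t old;

	assert(mem != NULL);
	old = *mem;
	if (old == (uint32_t)rax) {
		*mem = src;
		return rax;	/* rax untouched: stale upper 32 bits survive */
	}
	return (uint64_t)old;
}
```

With r0 = 0xdeadbeef00000001 and memory holding 1, the comparison succeeds in both models, but the BPF model returns 0x1 while the x86 model returns the original 0xdeadbeef00000001. The stale upper half is exactly what an explicit zero-extension removes.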
The fix is to explicitly zero-extend rax after doing such a
CMPXCHG. Since this problem affects multiple archs, this is done in
the verifier by patching in a BPF_ZEXT_REG instruction after every
32-bit cmpxchg. Any archs that don't need such manual zero-extension
can do a look-ahead with insn_is_zext to skip the unnecessary mov.
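
As a rough illustration of the patching idea, the following stand-alone sketch appends a zero-extension of r0 after every 32-bit cmpxchg unless one already follows. It is an assumption-laden model, not the verifier code: `struct insn`, `is_cmpxchg32`, `is_zext` and `patch_zext` are hypothetical names, the opcode constants are local copies of the uapi values, and the jump-offset adjustment a real patching pass needs is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local copies of the uapi opcode values (see linux/bpf.h). */
#define BPF_STX		0x03
#define BPF_ALU		0x04
#define BPF_W		0x00
#define BPF_DW		0x18
#define BPF_X		0x08
#define BPF_MOV		0xb0
#define BPF_ATOMIC	0xc0
#define BPF_CMPXCHG	0xf1	/* 0xf0 | BPF_FETCH */
#define BPF_REG_0	0

/* Simplified stand-in for struct bpf_insn. */
struct insn {
	uint8_t code;
	uint8_t dst_reg;
	uint8_t src_reg;
	int16_t off;
	int32_t imm;
};

/* True for a 32-bit atomic compare-and-exchange. */
static bool is_cmpxchg32(const struct insn *i)
{
	return i->code == (BPF_STX | BPF_ATOMIC | BPF_W) &&
	       i->imm == BPF_CMPXCHG;
}

/* True for the special "mov32 with imm == 1" zero-extension marker. */
static bool is_zext(const struct insn *i)
{
	return i->code == (BPF_ALU | BPF_MOV | BPF_X) && i->imm == 1;
}

/*
 * Copy in[] to out[], adding zext(r0) after each 32-bit cmpxchg that is
 * not already followed by one. Returns the number of instructions
 * written, or 0 if out[] is too small. Jump offsets are NOT fixed up.
 */
static size_t patch_zext(const struct insn *in, size_t n,
			 struct insn *out, size_t cap)
{
	size_t i, w = 0;

	assert(in != NULL && out != NULL);
	for (i = 0; i < n; i++) {
		bool has_zext = i + 1 < n && is_zext(&in[i + 1]) &&
				in[i + 1].dst_reg == BPF_REG_0;
		bool need = is_cmpxchg32(&in[i]) && !has_zext;

		if (w + 1 + (need ? 1 : 0) > cap)
			return 0;
		out[w++] = in[i];
		if (need)
			out[w++] = (struct insn){
				.code = BPF_ALU | BPF_MOV | BPF_X,
				.dst_reg = BPF_REG_0,
				.src_reg = BPF_REG_0,
				.imm = 1,
			};
	}
	return w;
}
```

A JIT that zero-extends r0 on its own could use the same `is_zext`-style check on the next instruction to skip emitting the redundant mov.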
Note this still goes on top of Ilya's patch:
https://lore.kernel.org/bpf/20210301154019.129110-1-iii@linux.ibm.com/T/#u
Differences v5->v6[1]:
- Moved is_cmpxchg_insn and ensured it can be safely re-used. Also renamed it
and removed 'inline' to match the style of the is_*_function helpers.
- Fixed up comments in verifier test (thanks for the careful review, Martin!)
Differences v4->v5[1]:
- Moved the logic entirely into opt_subreg_zext_lo32_rnd_hi32, thanks to Martin
for suggesting this.
Differences v3->v4[1]:
- Moved the optimization against pointless zext into the correct place:
opt_subreg_zext_lo32_rnd_hi32 is called _after_ fixup_bpf_calls.
Differences v2->v3[1]:
- Moved patching into fixup_bpf_calls (patch incoming to rename this function)
- Added extra commentary on bpf_jit_needs_zext
- Added check to avoid adding a pointless zext(r0) if there's already one there.
Difference v1->v2[1]: Now solved centrally in the verifier instead of
specifically for the x86 JIT. Thanks to Ilya and Daniel for the suggestions!
[1] v5: https://lore.kernel.org/bpf/CA+i-1C3ytZz6FjcPmUg5s4L51pMQDxWcZNvM86w4RHZ_o2khwg@mail.gmail.com/T/#t
v4: https://lore.kernel.org/bpf/CA+i-1C3ytZz6FjcPmUg5s4L51pMQDxWcZNvM86w4RHZ_o2khwg@mail.gmail.com/T/#t
v3: https://lore.kernel.org/bpf/08669818-c99d-0d30-e1db-53160c063611@iogearbox.net/T/#t
v2: https://lore.kernel.org/bpf/08669818-c99d-0d30-e1db-53160c063611@iogearbox.net/T/#t
v1: https://lore.kernel.org/bpf/d7ebaefb-bfd6-a441-3ff2-2fdfe699b1d2@iogearbox.net/T/#t
Reported-by: Ilya Leoshkevich <iii@linux.ibm.com>
Fixes:

benchs
bpf_testmod
gnu
map_tests
prog_tests
progs
verifier
.gitignore
bench.c
bench.h
bpf_legacy.h
bpf_rand.h
bpf_rlimit.h
bpf_sockopt_helpers.h
bpf_tcp_helpers.h
bpf_util.h
btf_helpers.c
btf_helpers.h
cgroup_helpers.c
cgroup_helpers.h
config
flow_dissector_load.c
flow_dissector_load.h
get_cgroup_id_user.c
ima_setup.sh
Makefile
netcnt_common.h
network_helpers.c
network_helpers.h
README.rst
settings
test_bpftool_build.sh
test_bpftool_metadata.sh
test_bpftool.py
test_bpftool.sh
test_btf.h
test_cgroup_storage.c
test_cpp.cpp
test_dev_cgroup.c
test_flow_dissector.c
test_flow_dissector.sh
test_ftrace.sh
test_iptunnel_common.h
test_kmod.sh
test_lirc_mode2_user.c
test_lirc_mode2.sh
test_lpm_map.c
test_lru_map.c
test_lwt_ip_encap.sh
test_lwt_seg6local.sh
test_maps.c
test_maps.h
test_netcnt.c
test_offload.py
test_progs.c
test_progs.h
test_select_reuseport_common.h
test_skb_cgroup_id_user.c
test_skb_cgroup_id.sh
test_sock_addr.c
test_sock_addr.sh
test_sock.c
test_sockmap.c
test_stub.c
test_sysctl.c
test_tag.c
test_tc_edt.sh
test_tc_redirect.sh
test_tc_tunnel.sh
test_tcp_check_syncookie_user.c
test_tcp_check_syncookie.sh
test_tcp_hdr_options.h
test_tcpbpf.h
test_tcpnotify_user.c
test_tcpnotify.h
test_tunnel.sh
test_verifier_log.c
test_verifier.c
test_xdp_meta.sh
test_xdp_redirect.sh
test_xdp_veth.sh
test_xdp_vlan_mode_generic.sh
test_xdp_vlan_mode_native.sh
test_xdp_vlan.sh
test_xdping.sh
test_xsk.sh
testing_helpers.c
testing_helpers.h
trace_helpers.c
trace_helpers.h
urandom_read.c
vmtest.sh
with_addr.sh
with_tunnels.sh
xdping.c
xdping.h
xdpxceiver.c
xdpxceiver.h
xsk_prereqs.sh

==================
BPF Selftest Notes
==================
General instructions on running selftests can be found in
`Documentation/bpf/bpf_devel_QA.rst`__.

__ /Documentation/bpf/bpf_devel_QA.rst#q-how-to-run-bpf-selftests

=========================
Running Selftests in a VM
=========================

It's now possible to run the selftests using ``tools/testing/selftests/bpf/vmtest.sh``.
The script tries to ensure that the tests are run with the same environment as they
would be run post-submit in the CI used by the Maintainers.

This script downloads a suitable Kconfig and VM userspace image from the system used by
the CI. It builds the kernel (without overwriting your existing Kconfig), recompiles the
bpf selftests, runs them (by default ``tools/testing/selftests/bpf/test_progs``) and
saves the resulting output (by default in ``~/.bpf_selftests``).

For more information about using the script, run:

.. code-block:: console

  $ tools/testing/selftests/bpf/vmtest.sh -h

.. note:: The script uses pahole and clang based on the host environment settings.
          If you want to change pahole and llvm, you can change the `PATH` environment
          variable at the beginning of the script.

.. note:: The script currently only supports x86_64.

Additional information about selftest failures is
documented here.

profiler[23] test failures with clang/llvm <12.0.0
==================================================

With clang/llvm <12.0.0, the profiler[23] test may fail.
The symptom looks like

.. code-block:: c

  // r9 is a pointer to map_value
  // r7 is a scalar
  17: bf 96 00 00 00 00 00 00 r6 = r9
  18: 0f 76 00 00 00 00 00 00 r6 += r7
  math between map_value pointer and register with unbounded min value is not allowed

  // the instructions below will not be seen in the verifier log
  19: a5 07 01 00 01 01 00 00 if r7 < 257 goto +1
  20: bf 96 00 00 00 00 00 00 r6 = r9
  // r6 is used here

The verifier will reject such code with the above error.
At insn 18, r7 is indeed unbounded. The later insn 19 checks the bounds and
insn 20 undoes the map_value addition. It is currently impossible for the
verifier to understand such speculative pointer arithmetic.
Hence `this patch`__ addresses it on the compiler side. It was committed on llvm 12.

__ https://reviews.llvm.org/D85570

The corresponding C code

.. code-block:: c

  for (int i = 0; i < MAX_CGROUPS_PATH_DEPTH; i++) {
          filepart_length = bpf_probe_read_str(payload, ...);
          if (filepart_length <= MAX_PATH) {
                  barrier_var(filepart_length); // workaround
                  payload += filepart_length;
          }
  }

bpf_iter test failures with clang/llvm 10.0.0
=============================================

With clang/llvm 10.0.0, the following two bpf_iter tests failed:

* ``bpf_iter/ipv6_route``
* ``bpf_iter/netlink``

The symptom for ``bpf_iter/ipv6_route`` looks like

.. code-block:: c

  2: (79) r8 = *(u64 *)(r1 +8)
  ...
  14: (bf) r2 = r8
  15: (0f) r2 += r1
  ; BPF_SEQ_PRINTF(seq, "%pi6 %02x ", &rt->fib6_dst.addr, rt->fib6_dst.plen);
  16: (7b) *(u64 *)(r8 +64) = r2
  only read is supported

The symptom for ``bpf_iter/netlink`` looks like

.. code-block:: c

  ; struct netlink_sock *nlk = ctx->sk;
  2: (79) r7 = *(u64 *)(r1 +8)
  ...
  15: (bf) r2 = r7
  16: (0f) r2 += r1
  ; BPF_SEQ_PRINTF(seq, "%pK %-3d ", s, s->sk_protocol);
  17: (7b) *(u64 *)(r7 +0) = r2
  only read is supported

This is due to an llvm BPF backend bug. `The fix`__
has been pushed to the llvm 10.x release branch and will be
available in 10.0.1. The patch is available in llvm 11.0.0 trunk.

__ https://reviews.llvm.org/D78466

BPF CO-RE-based tests and Clang version
=======================================

A set of selftests use BPF target-specific built-ins, which might require
bleeding-edge Clang versions (Clang 12 nightly at this time).

A few sub-tests of the core_reloc test suite (part of the test_progs test runner)
require the following built-ins, listed with the corresponding Clang diffs
introducing them to Clang/LLVM. These sub-tests are going to be skipped if Clang
is too old to support them; they shouldn't cause build failures or runtime test
failures:

- __builtin_btf_type_id() [0_, 1_, 2_];
- __builtin_preserve_type_info(), __builtin_preserve_enum_value() [3_, 4_].

.. _0: https://reviews.llvm.org/D74572
.. _1: https://reviews.llvm.org/D74668
.. _2: https://reviews.llvm.org/D85174
.. _3: https://reviews.llvm.org/D83878
.. _4: https://reviews.llvm.org/D83242