linux

mirror of https://github.com/torvalds/linux.git synced 2024-12-02 17:11:33 +00:00

Author	SHA1	Message	Date
Ilya Leoshkevich	e85465e420	libbpf: Simplify barrier_var() Use a single "+r" constraint instead of the separate "=r" and "0". Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230128000650.1516334-22-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-01-28 12:45:14 -08:00
Ilya Leoshkevich	1b5e385325	selftests/bpf: Fix profiler on s390x Use bpf_probe_read_kernel() and bpf_probe_read_kernel_str() instead of bpf_probe_read() and bpf_probe_read_kernel(). Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230128000650.1516334-21-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-01-28 12:45:14 -08:00
Ilya Leoshkevich	438a2edf26	selftests/bpf: Fix xdp_synproxy/tc on s390x Use the correct datatype for the values map values; currently the test works by accident, since on little-endian machines it is sometimes acceptable to access u64 as u32. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230128000650.1516334-20-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-01-28 12:45:14 -08:00
Ilya Leoshkevich	d504270a23	selftests/bpf: Fix vmlinux test on s390x Use a syscall macro to access the nanosleep()'s first argument; currently the code uses gprs[2] instead of orig_gpr2. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230128000650.1516334-18-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-01-28 12:30:09 -08:00
Ilya Leoshkevich	26e8a01494	selftests/bpf: Fix test_xdp_adjust_tail_grow2 on s390x s390x cache line size is 256 bytes, so skb_shared_info must be aligned on a much larger boundary than for x86. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230128000650.1516334-17-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-01-28 12:30:09 -08:00
Ilya Leoshkevich	207612eb12	selftests/bpf: Fix test_lsm on s390x Use syscall macros to access the setdomainname() arguments; currently the code uses gprs[2] instead of orig_gpr2 for the first argument. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230128000650.1516334-16-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-01-28 12:30:09 -08:00
Ilya Leoshkevich	be6b5c10ec	selftests/bpf: Add a sign-extension test for kfuncs s390x ABI requires the caller to zero- or sign-extend the arguments. eBPF already deals with zero-extension (by definition of its ABI), but not with sign-extension. Add a test to cover that potentially problematic area. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230128000650.1516334-15-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-01-28 12:30:09 -08:00
Ilya Leoshkevich	80a611904e	selftests/bpf: Increase SIZEOF_BPF_LOCAL_STORAGE_ELEM on s390x sizeof(struct bpf_local_storage_elem) is 512 on s390x: struct bpf_local_storage_elem { struct hlist_node map_node; /* 0 16 / struct hlist_node snode; / 16 16 / struct bpf_local_storage local_storage; /* 32 8 / struct callback_head rcu __attribute__((__aligned__(8))); / 40 16 / / XXX 200 bytes hole, try to pack / / --- cacheline 1 boundary (256 bytes) --- / struct bpf_local_storage_data sdata __attribute__((__aligned__(256))); / 256 8 / / size: 512, cachelines: 2, members: 5 / / sum members: 64, holes: 1, sum holes: 200 / / padding: 248 / / forced alignments: 2, forced holes: 1, sum forced holes: 200 */ } __attribute__((__aligned__(256))); As the existing comment suggests, use a larger number in order to be future-proof. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230128000650.1516334-14-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-01-28 12:30:08 -08:00
Ilya Leoshkevich	2934565f04	selftests/bpf: Check stack_mprotect() return value If stack_mprotect() succeeds, errno is not changed. This can produce misleading error messages, that show stale errno. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230128000650.1516334-13-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-01-28 12:30:08 -08:00
Ilya Leoshkevich	06cea99e68	selftests/bpf: Fix cgrp_local_storage on s390x Sync the definition of socket_cookie between the eBPF program and the test. Currently the test works by accident, since on little-endian it is sometimes acceptable to access u64 as u32. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230128000650.1516334-12-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-01-28 12:30:08 -08:00
Ilya Leoshkevich	06c1865b0b	selftests/bpf: Fix xdp_do_redirect on s390x s390x cache line size is 256 bytes, so skb_shared_info must be aligned on a much larger boundary than for x86. This makes the maximum packet size smaller. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230128000650.1516334-11-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-01-28 12:30:08 -08:00
Ilya Leoshkevich	56e1a50483	selftests/bpf: Fix verify_pkcs7_sig on s390x Use bpf_probe_read_kernel() instead of bpf_probe_read(), which is not defined on all architectures. While at it, improve the error handling: do not hide the verifier log, and check the return values of bpf_probe_read_kernel() and bpf_copy_from_user(). Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230128000650.1516334-10-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-01-28 12:30:08 -08:00
Ilya Leoshkevich	98e13848cf	selftests/bpf: Fix decap_sanity_ns cleanup decap_sanity prints the following on the 1st run: decap_sanity: sh: 1: Syntax error: Bad fd number and the following on the 2nd run: Cannot create namespace file "/run/netns/decap_sanity_ns": File exists The problem is that the cleanup command has a typo and does nothing. Fix the typo. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230128000650.1516334-9-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-01-28 12:30:08 -08:00
Ilya Leoshkevich	804acdd251	selftests/bpf: Set errno when urand_spawn() fails The result of urand_spawn() is checked with ASSERT_OK_PTR, which treats NULL as success if errno == 0. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230128000650.1516334-8-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-01-28 12:30:08 -08:00
Ilya Leoshkevich	31da9be64a	selftests/bpf: Fix kfree_skb on s390x h_proto is big-endian; use htons() in order to make comparison work on both little- and big-endian machines. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230128000650.1516334-7-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-01-28 12:30:08 -08:00
Ilya Leoshkevich	6eab2370d1	selftests/bpf: Fix symlink creation error When building with O=, the following error occurs: ln: failed to create symbolic link 'no_alu32/bpftool': No such file or directory Adjust the code to account for $(OUTPUT). Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230128000650.1516334-6-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-01-28 12:30:08 -08:00
Ilya Leoshkevich	b14b01f281	selftests/bpf: Fix liburandom_read.so linker error When building with O=, the following linker error occurs: clang: error: no such file or directory: 'liburandom_read.so' Fix by adding $(OUTPUT) to the linker search path. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230128000650.1516334-5-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-01-28 12:30:08 -08:00
Ilya Leoshkevich	8fb9fb2f17	selftests/bpf: Query BPF_MAX_TRAMP_LINKS using BTF Do not hard-code the value, since for s390x it will be smaller than for x86. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230128000650.1516334-4-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-01-28 12:29:41 -08:00
Ilya Leoshkevich	390a07a921	bpf: Change BPF_MAX_TRAMP_LINKS to enum This way it's possible to query its value from testcases using BTF. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230128000650.1516334-3-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-01-28 12:27:12 -08:00
Ilya Leoshkevich	bf3849755a	bpf: Use ARG_CONST_SIZE_OR_ZERO for 3rd argument of bpf_tcp_raw_gen_syncookie_ipv{4,6}() These functions already check that th_len < sizeof(*th), and propagating the lower bound (th_len > 0) may be challenging in complex code, e.g. as is the case with xdp_synproxy test on s390x [1]. Switch to ARG_CONST_SIZE_OR_ZERO in order to make the verifier accept code where it cannot prove that th_len > 0. [1] https://lore.kernel.org/bpf/CAEf4Bzb3uiSHtUbgVWmkWuJ5Sw1UZd4c_iuS4QXtUkXmTTtXuQ@mail.gmail.com/ Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230128000650.1516334-2-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-01-28 12:27:12 -08:00
Randy Dunlap	1d3cab43f4	Documentation: bpf: correct spelling Correct spelling problems for Documentation/bpf/ as reported by codespell. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: bpf@vger.kernel.org Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-doc@vger.kernel.org Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com> Link: https://lore.kernel.org/r/20230128195046.13327-1-rdunlap@infradead.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-01-28 12:22:20 -08:00
David Vernet	cb4a21ea59	bpf: Build-time assert that cpumask offset is zero The first element of a struct bpf_cpumask is a cpumask_t. This is done to allow struct bpf_cpumask to be cast to a struct cpumask. If this element were ever moved to another field, any BPF program passing a struct bpf_cpumask * to a kfunc expecting a const struct cpumask * would immediately fail to load. Add a build-time assertion so this is assumption is captured and verified. Signed-off-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20230128141537.100777-1-void@manifault.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-01-28 12:21:40 -08:00
Johannes Berg	70eb3911d8	net: netlink: recommend policy range validation For large ranges (outside of s16) the documentation currently recommends open-coding the validation, but it's better to use the NLA_POLICY_FULL_RANGE() or NLA_POLICY_FULL_RANGE_SIGNED() policy validation instead; recommend that. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/20230127084506.09f280619d64.I5dece85f06efa8ab0f474ca77df9e26d3553d4ab@changeid Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-01-28 00:33:51 -08:00
Jakub Kicinski	2d104c390f	bpf-next-for-netdev -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCY9RqJgAKCRDbK58LschI gw2IAP9G5uhFO5abBzYLupp6SY3T5j97MUvPwLfFqUEt7EXmuwEA2lCUEWeW0KtR QX+QmzCa6iHxrW7WzP4DUYLue//FJQY= =yYqA -----END PGP SIGNATURE----- Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== bpf-next 2023-01-28 We've added 124 non-merge commits during the last 22 day(s) which contain a total of 124 files changed, 6386 insertions(+), 1827 deletions(-). The main changes are: 1) Implement XDP hints via kfuncs with initial support for RX hash and timestamp metadata kfuncs, from Stanislav Fomichev and Toke Høiland-Jørgensen. Measurements on overhead: https://lore.kernel.org/bpf/875yellcx6.fsf@toke.dk 2) Extend libbpf's bpf_tracing.h support for tracing arguments of kprobes/uprobes and syscall as a special case, from Andrii Nakryiko. 3) Significantly reduce the search time for module symbols by livepatch and BPF, from Jiri Olsa and Zhen Lei. 4) Enable cpumasks to be used as kptrs, which is useful for tracing programs tracking which tasks end up running on which CPUs in different time intervals, from David Vernet. 5) Fix several issues in the dynptr processing such as stack slot liveness propagation, missing checks for PTR_TO_STACK variable offset, etc, from Kumar Kartikeya Dwivedi. 6) Various performance improvements, fixes, and introduction of more than just one XDP program to XSK selftests, from Magnus Karlsson. 7) Big batch to BPF samples to reduce deprecated functionality, from Daniel T. Lee. 8) Enable struct_ops programs to be sleepable in verifier, from David Vernet. 9) Reduce pr_warn() noise on BTF mismatches when they are expected under the CONFIG_MODULE_ALLOW_BTF_MISMATCH config anyway, from Connor O'Brien. 10) Describe modulo and division by zero behavior of the BPF runtime in BPF's instruction specification document, from Dave Thaler. 11) Several improvements to libbpf API documentation in libbpf.h, from Grant Seltzer. 12) Improve resolve_btfids header dependencies related to subcmd and add proper support for HOSTCC, from Ian Rogers. 13) Add ipip6 and ip6ip decapsulation support for bpf_skb_adjust_room() helper along with BPF selftests, from Ziyang Xuan. 14) Simplify the parsing logic of structure parameters for BPF trampoline in the x86-64 JIT compiler, from Pu Lehui. 15) Get BTF working for kernels with CONFIG_RUST enabled by excluding Rust compilation units with pahole, from Martin Rodriguez Reboredo. 16) Get bpf_setsockopt() working for kTLS on top of TCP sockets, from Kui-Feng Lee. 17) Disable stack protection for BPF objects in bpftool given BPF backends don't support it, from Holger Hoffstätte. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (124 commits) selftest/bpf: Make crashes more debuggable in test_progs libbpf: Add documentation to map pinning API functions libbpf: Fix malformed documentation formatting selftests/bpf: Properly enable hwtstamp in xdp_hw_metadata selftests/bpf: Calls bpf_setsockopt() on a ktls enabled socket. bpf: Check the protocol of a sock to agree the calls to bpf_setsockopt(). bpf/selftests: Verify struct_ops prog sleepable behavior bpf: Pass const struct bpf_prog * to .check_member libbpf: Support sleepable struct_ops.s section bpf: Allow BPF_PROG_TYPE_STRUCT_OPS programs to be sleepable selftests/bpf: Fix vmtest static compilation error tools/resolve_btfids: Alter how HOSTCC is forced tools/resolve_btfids: Install subcmd headers bpf/docs: Document the nocast aliasing behavior of ___init bpf/docs: Document how nested trusted fields may be defined bpf/docs: Document cpumask kfuncs in a new file selftests/bpf: Add selftest suite for cpumask kfuncs selftests/bpf: Add nested trust selftests suite bpf: Enable cpumasks to be queried and used as kptrs bpf: Disallow NULLable pointers for trusted kfuncs ... ==================== Link: https://lore.kernel.org/r/20230128004827.21371-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-01-28 00:00:14 -08:00
Breno Leitao	d8afe2f8a9	netpoll: Remove 4s sleep during carrier detection This patch removes the msleep(4s) during netpoll_setup() if the carrier appears instantly. Here are some scenarios where this workaround is counter-productive in modern ages: Servers which have BMC communicating over NC-SI via the same NIC as gets used for netconsole. BMC will keep the PHY up, hence the carrier appearing instantly. The link is fibre, SERDES getting sync could happen within 0.1Hz, and the carrier also appears instantly. Other than that, if a driver is reporting instant carrier and then losing it, this is probably a driver bug. Reported-by: Michael van der Westhuizen <rmikey@meta.com> Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://lore.kernel.org/r/20230125185230.3574681-1-leitao@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-01-27 23:24:07 -08:00
Jakub Kicinski	b568d3072a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Conflicts: drivers/net/ethernet/intel/ice/ice_main.c `418e53401e` ("ice: move devlink port creation/deletion") `643ef23bd9` ("ice: Introduce local var for readability") https://lore.kernel.org/all/20230127124025.0dacef40@canb.auug.org.au/ https://lore.kernel.org/all/20230124005714.3996270-1-anthony.l.nguyen@intel.com/ drivers/net/ethernet/engleder/tsnep_main.c `3d53aaef43` ("tsnep: Fix TX queue stop/wake for multiple queues") `25faa6a4c5` ("tsnep: Replace TX spin_lock with __netif_tx_lock") https://lore.kernel.org/all/20230127123604.36bb3e99@canb.auug.org.au/ net/netfilter/nf_conntrack_proto_sctp.c `13bd9b31a9` ("Revert "netfilter: conntrack: add sctp DATA_SENT state"") `a44b765148` ("netfilter: conntrack: unify established states for SCTP paths") `f71cb8f45d` ("netfilter: conntrack: sctp: use nf log infrastructure for invalid packets") https://lore.kernel.org/all/20230127125052.674281f9@canb.auug.org.au/ https://lore.kernel.org/all/d36076f3-6add-a442-6d4b-ead9f7ffff86@tessares.net/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-01-27 22:56:18 -08:00
Arınç ÜNAL	ff445b8397	net: dsa: mt7530: fix tristate and help description Fix description for tristate and help sections which include inaccurate information. Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com> Link: https://lore.kernel.org/r/20230126190110.9124-1-arinc.unal@arinc9.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-01-27 22:33:49 -08:00
Jakub Kicinski	3ac77ecd9a	Merge branch 'net-xdp-execute-xdp_do_flush-before-napi_complete_done' Magnus Karlsson says: ==================== net: xdp: execute xdp_do_flush() before napi_complete_done() Make sure that xdp_do_flush() is always executed before napi_complete_done(). This is important for two reasons. First, a redirect to an XSKMAP assumes that a call to xdp_do_redirect() from napi context X on CPU Y will be followed by a xdp_do_flush() from the same napi context and CPU. This is not guaranteed if the napi_complete_done() is executed before xdp_do_flush(), as it tells the napi logic that it is fine to schedule napi context X on another CPU. Details from a production system triggering this bug using the veth driver can be found in [1]. The second reason is that the XDP_REDIRECT logic in itself relies on being inside a single NAPI instance through to the xdp_do_flush() call for RCU protection of all in-kernel data structures. Details can be found in [2]. The drivers have only been compile-tested since I do not own any of the HW below. So if you are a maintainer, it would be great if you could take a quick look to make sure I did not mess something up. Note that these were the drivers I found that violated the ordering by running a simple script and manually checking the ones that came up as potential offenders. But the script was not perfect in any way. There might still be offenders out there, since the script can generate false negatives. [1] https://lore.kernel.org/r/20221220185903.1105011-1-sbohrer@cloudflare.com [2] https://lore.kernel.org/all/20210624160609.292325-1-toke@redhat.com/ ==================== Link: https://lore.kernel.org/r/20230125074901.2737-1-magnus.karlsson@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-01-27 22:28:03 -08:00
Magnus Karlsson	a3191c4d86	dpaa2-eth: execute xdp_do_flush() before napi_complete_done() Make sure that xdp_do_flush() is always executed before napi_complete_done(). This is important for two reasons. First, a redirect to an XSKMAP assumes that a call to xdp_do_redirect() from napi context X on CPU Y will be followed by a xdp_do_flush() from the same napi context and CPU. This is not guaranteed if the napi_complete_done() is executed before xdp_do_flush(), as it tells the napi logic that it is fine to schedule napi context X on another CPU. Details from a production system triggering this bug using the veth driver can be found following the first link below. The second reason is that the XDP_REDIRECT logic in itself relies on being inside a single NAPI instance through to the xdp_do_flush() call for RCU protection of all in-kernel data structures. Details can be found in the second link below. Fixes: `d678be1dc1` ("dpaa2-eth: add XDP_REDIRECT support") Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/r/20221220185903.1105011-1-sbohrer@cloudflare.com Link: https://lore.kernel.org/all/20210624160609.292325-1-toke@redhat.com/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-01-27 22:27:59 -08:00
Magnus Karlsson	b534013798	dpaa_eth: execute xdp_do_flush() before napi_complete_done() Make sure that xdp_do_flush() is always executed before napi_complete_done(). This is important for two reasons. First, a redirect to an XSKMAP assumes that a call to xdp_do_redirect() from napi context X on CPU Y will be followed by a xdp_do_flush() from the same napi context and CPU. This is not guaranteed if the napi_complete_done() is executed before xdp_do_flush(), as it tells the napi logic that it is fine to schedule napi context X on another CPU. Details from a production system triggering this bug using the veth driver can be found following the first link below. The second reason is that the XDP_REDIRECT logic in itself relies on being inside a single NAPI instance through to the xdp_do_flush() call for RCU protection of all in-kernel data structures. Details can be found in the second link below. Fixes: `a1e031ffb4` ("dpaa_eth: add XDP_REDIRECT support") Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/r/20221220185903.1105011-1-sbohrer@cloudflare.com Link: https://lore.kernel.org/all/20210624160609.292325-1-toke@redhat.com/ Acked-by: Camelia Groza <camelia.groza@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-01-27 22:27:59 -08:00
Magnus Karlsson	ad7e615f64	virtio-net: execute xdp_do_flush() before napi_complete_done() Make sure that xdp_do_flush() is always executed before napi_complete_done(). This is important for two reasons. First, a redirect to an XSKMAP assumes that a call to xdp_do_redirect() from napi context X on CPU Y will be followed by a xdp_do_flush() from the same napi context and CPU. This is not guaranteed if the napi_complete_done() is executed before xdp_do_flush(), as it tells the napi logic that it is fine to schedule napi context X on another CPU. Details from a production system triggering this bug using the veth driver can be found following the first link below. The second reason is that the XDP_REDIRECT logic in itself relies on being inside a single NAPI instance through to the xdp_do_flush() call for RCU protection of all in-kernel data structures. Details can be found in the second link below. Fixes: `186b3c998c` ("virtio-net: support XDP_REDIRECT") Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/r/20221220185903.1105011-1-sbohrer@cloudflare.com Link: https://lore.kernel.org/all/20210624160609.292325-1-toke@redhat.com/ Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-01-27 22:27:59 -08:00
Magnus Karlsson	12b5717990	lan966x: execute xdp_do_flush() before napi_complete_done() Make sure that xdp_do_flush() is always executed before napi_complete_done(). This is important for two reasons. First, a redirect to an XSKMAP assumes that a call to xdp_do_redirect() from napi context X on CPU Y will be followed by a xdp_do_flush() from the same napi context and CPU. This is not guaranteed if the napi_complete_done() is executed before xdp_do_flush(), as it tells the napi logic that it is fine to schedule napi context X on another CPU. Details from a production system triggering this bug using the veth driver can be found following the first link below. The second reason is that the XDP_REDIRECT logic in itself relies on being inside a single NAPI instance through to the xdp_do_flush() call for RCU protection of all in-kernel data structures. Details can be found in the second link below. Fixes: `a825b611c7` ("net: lan966x: Add support for XDP_REDIRECT") Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Steen Hegelund <Steen.Hegelund@microchip.com> Link: https://lore.kernel.org/r/20221220185903.1105011-1-sbohrer@cloudflare.com Link: https://lore.kernel.org/all/20210624160609.292325-1-toke@redhat.com/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-01-27 22:27:59 -08:00
Magnus Karlsson	2ccce20d51	qede: execute xdp_do_flush() before napi_complete_done() Make sure that xdp_do_flush() is always executed before napi_complete_done(). This is important for two reasons. First, a redirect to an XSKMAP assumes that a call to xdp_do_redirect() from napi context X on CPU Y will be followed by a xdp_do_flush() from the same napi context and CPU. This is not guaranteed if the napi_complete_done() is executed before xdp_do_flush(), as it tells the napi logic that it is fine to schedule napi context X on another CPU. Details from a production system triggering this bug using the veth driver can be found following the first link below. The second reason is that the XDP_REDIRECT logic in itself relies on being inside a single NAPI instance through to the xdp_do_flush() call for RCU protection of all in-kernel data structures. Details can be found in the second link below. Fixes: `d1b25b79e1` ("qede: add .ndo_xdp_xmit() and XDP_REDIRECT support") Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/r/20221220185903.1105011-1-sbohrer@cloudflare.com Link: https://lore.kernel.org/all/20210624160609.292325-1-toke@redhat.com/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-01-27 22:27:59 -08:00
Stanislav Fomichev	16809afdcb	selftest/bpf: Make crashes more debuggable in test_progs Reset stdio before printing verbose log of the SIGSEGV'ed test. Otherwise, it's hard to understand what's going on in the cases like [0]. With the following patch applied: --- a/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c +++ b/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c @@ -392,6 +392,11 @@ void test_xdp_metadata(void) "generate freplace packet")) goto out; + + ASSERT_EQ(1, 2, "oops"); + int x = 0; + x = 1; /* die */ + while (!retries--) { if (bpf_obj2->bss->called) break; Before: #281 xdp_metadata:FAIL Caught signal #11! Stack trace: ./test_progs(crash_handler+0x1f)[0x55c919d98bcf] /lib/x86_64-linux-gnu/libc.so.6(+0x3bf90)[0x7f36aea5df90] ./test_progs(test_xdp_metadata+0x1db0)[0x55c919d8c6d0] ./test_progs(+0x23b438)[0x55c919d9a438] ./test_progs(main+0x534)[0x55c919d99454] /lib/x86_64-linux-gnu/libc.so.6(+0x2718a)[0x7f36aea4918a] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x85)[0x7f36aea49245] ./test_progs(_start+0x21)[0x55c919b82ef1] After: test_xdp_metadata:PASS:ip netns add xdp_metadata 0 nsec open_netns:PASS:malloc token 0 nsec open_netns:PASS:open /proc/self/ns/net 0 nsec open_netns:PASS:open netns fd 0 nsec open_netns:PASS:setns 0 nsec .. test_xdp_metadata:FAIL:oops unexpected oops: actual 1 != expected 2 #281 xdp_metadata:FAIL Caught signal #11! Stack trace: ./test_progs(crash_handler+0x1f)[0x562714a76bcf] /lib/x86_64-linux-gnu/libc.so.6(+0x3bf90)[0x7fa663f9cf90] ./test_progs(test_xdp_metadata+0x1db0)[0x562714a6a6d0] ./test_progs(+0x23b438)[0x562714a78438] ./test_progs(main+0x534)[0x562714a77454] /lib/x86_64-linux-gnu/libc.so.6(+0x2718a)[0x7fa663f8818a] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x85)[0x7fa663f88245] ./test_progs(_start+0x21)[0x562714860ef1] 0: https://github.com/kernel-patches/bpf/actions/runs/4019879316/jobs/6907358876 Signed-off-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20230127215705.1254316-1-sdf@google.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-01-27 15:22:00 -08:00
Grant Seltzer	e4ce876f10	libbpf: Add documentation to map pinning API functions This adds documentation for the following API functions: - bpf_map__set_pin_path() - bpf_map__pin_path() - bpf_map__is_pinned() - bpf_map__pin() - bpf_map__unpin() - bpf_object__pin_maps() - bpf_object__unpin_maps() Signed-off-by: Grant Seltzer <grantseltzer@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230126024225.520685-1-grantseltzer@gmail.com	2023-01-27 23:03:19 +01:00
Grant Seltzer	139df64d26	libbpf: Fix malformed documentation formatting This fixes the doxygen format documentation above the user_ring_buffer__* APIs. There has to be a newline before the @brief, otherwise doxygen won't render them for libbpf.readthedocs.org. Signed-off-by: Grant Seltzer <grantseltzer@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230126024749.522278-1-grantseltzer@gmail.com	2023-01-27 23:02:03 +01:00
David S. Miller	c2ea552065	Merge branch 'devlink-parama-cleanup' Jiri Pirko says: ==================== devlink: Cleanup params usage This patchset takes care of small cleanup of devlink params usage. Some of the patches (first 2/3) are cosmetic, but I would like to point couple of interesting ones: Patch 9 is the main one of this set and introduces devlink instance locking for params, similar to other devlink objects. That allows params to be registered/unregistered when devlink instance is registered. Patches 10-12 change mlx5 code to register non-driverinit params in the code they are related to, and thanks to patch 8 this might be when devlink instance is registered - for example during devlink reload. --- v1->v2: - Just small fix in the last patch ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-27 12:32:03 +00:00
Jiri Pirko	d2a651ef18	net/mlx5: Move eswitch port metadata devlink param to flow eswitch code Move the param registration and handling code into the eswitch offloads code as they are related to each other. No point in having the devlink param registration done in separate file. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-27 12:32:02 +00:00
Jiri Pirko	db492c1e5b	net/mlx5: Move flow steering devlink param to flow steering code Move the param registration and handling code into the flow steering code as they are related to each other. No point in having the devlink param registration done in separate file. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-27 12:32:02 +00:00
Jiri Pirko	c2077fbc42	net/mlx5: Move fw reset devlink param to fw reset code Move the param registration and handling code into the fw reset code as they are related to each other. No point in having the devlink param registration done in separate file. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-27 12:32:02 +00:00
Jiri Pirko	075935f0ae	devlink: protect devlink param list by instance lock Commit `1d18bb1a4d` ("devlink: allow registering parameters after the instance") as the subject implies introduced possibility to register devlink params even for already registered devlink instance. This is a bit problematic, as the consistency or params list was originally secured by the fact it is static during devlink lifetime. So in order to protect the params list, take devlink instance lock during the params operations. Introduce unlocked function variants and use them in drivers in locked context. Put lock assertions to appropriate places. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Tested-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-27 12:32:02 +00:00
Jiri Pirko	3f716a620e	devlink: put couple of WARN_ONs in devlink_param_driverinit_value_get() Put couple of WARN_ONs in devlink_param_driverinit_value_get() function to clearly indicate, that it is a driver bug if used without reload support or for non-driverinit param. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-27 12:32:02 +00:00
Jiri Pirko	85fe0b324c	devlink: make devlink_param_driverinit_value_set() return void devlink_param_driverinit_value_set() currently returns int with possible error, but no user is checking it anyway. The only reason for a fail is a driver bug. So convert the function to return void and put WARN_ONs on error paths. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-27 12:32:02 +00:00
Jiri Pirko	6fd6eda0e6	qed: remove pointless call to devlink_param_driverinit_value_set() devlink_param_driverinit_value_set() call makes sense only for " driverinit" params. However here, the param is "runtime". devlink_param_driverinit_value_set() returns -EOPNOTSUPP in such case and does not do anything. So remove the pointless call to devlink_param_driverinit_value_set() entirely. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-27 12:32:02 +00:00
Jiri Pirko	2fc631b5d7	ice: remove pointless calls to devlink_param_driverinit_value_set() devlink_param_driverinit_value_set() call makes sense only for "driverinit" params. However here, both params are "runtime". devlink_param_driverinit_value_set() returns -EOPNOTSUPP in such case and does not do anything. So remove the pointless calls to devlink_param_driverinit_value_set() entirely. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-27 12:32:02 +00:00
Jiri Pirko	bb9bb6bfd1	devlink: don't work with possible NULL pointer in devlink_param_unregister() There is a WARN_ON checking the param_item for being NULL when the param is not inserted in the list. That indicates a driver BUG. Instead of continuing to work with NULL pointer with its consequences, return. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-27 12:32:02 +00:00
Jiri Pirko	020dd127a3	devlink: make devlink_param_register/unregister static There is no user outside the devlink code, so remove the export and make the functions static. Move them before callers to avoid forward declarations. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-27 12:32:02 +00:00
Jiri Pirko	a756185ac3	net/mlx5: Covert devlink params registration to use devlink_params_register/unregister() Since mlx5 is the only user of devlink API to register/unregister a single param, convert it to use array registration function allowing to simplify the devlink API by removing the single param registration functions. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-27 12:32:02 +00:00
Jiri Pirko	c8aebff459	net/mlx5: Change devlink param register/unregister function names The functions are registering and unregistering devlink params, so change the names accordingly. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-27 12:32:02 +00:00
David S. Miller	86e99b5bf2	Merge branch 'ethtool-netlink-next' Jakub Kicinski says: ==================== ethtool: netlink: handle SET intro/outro in the common code Factor out the boilerplate code from SET handlers to common code. I volunteered to refactor the extack in GET in a conversation with Vladimir but I gave up. The handling of failures during dump in GET handlers is a bit unclear to me. Some code uses presence of info as indication of dump and tries to avoid reporting errors altogether (including extack messages). There's also the question of whether we should have a validation callback (similar to .set_validate here) for GET. It looks like .parse_request was expected to perform the validation. It takes the extack and tb directly, not via info: int (parse_request)(struct ethnl_req_info req_info, struct nlattr *tb, struct netlink_ext_ack extack); int (prepare_data)(const struct ethnl_req_info req_info, struct ethnl_reply_data reply_data, struct genl_info info); so no crashes dereferencing info possible. But .parse_request doesn't run under rtnl nor ethnl_ops_begin(). As a result some implementations defer validation until .prepare_data where all the locks are held and they can call out to the driver. All this makes me think that maybe we should refactor GET in the same direction I'm refactoring SET. Split .prepare_data, take more locks in the core, and add a validation helper which would take extack directly: - ret = ops->prepare_data(req_info, reply_data, info); + ret = ops->prepare_data_validate(req_info, reply_data, attrs, extack); + if (ret < 1) // if 0 -> skip for dump; -EOPNOTSUPP in do + goto err1; + + ret = ethnl_ops_begin(dev); + if (ret) + goto err1; + + ret = ops->prepare_data(req_info, reply_data); // no extack + ethnl_ops_complete(dev); I'll file that away as a TODO for posterity / older me. v2: - invert checks for coalescing to avoid error code changes - rebase and convert MM as well v1: https://lore.kernel.org/all/20230121054430.642280-1-kuba@kernel.org/ ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-27 12:24:32 +00:00

1 2 3 4 5 ...

1155703 Commits