linux

Author	SHA1	Message	Date
Daniel Borkmann	2893c996d8	Merge branch 'bpf-trampoline' Alexei Starovoitov says: ==================== Introduce BPF trampoline that works as a bridge between kernel functions, BPF programs and other BPF programs. The first use case is fentry/fexit BPF programs that are roughly equivalent to kprobe/kretprobe. Unlike k[ret]probe there is practically zero overhead to call a set of BPF programs before or after kernel function. The second use case is heavily influenced by pain points in XDP development. BPF trampoline allows attaching similar fentry/fexit BPF program to any networking BPF program. It's now possible to see packets on input and output of any XDP, TC, lwt, cgroup programs without disturbing them. This greatly helps BPF-based network troubleshooting. The third use case of BPF trampoline will be explored in the follow up patches. The BPF trampoline will be used to dynamicly link BPF programs. It's more generic mechanism than array and link list of programs used in tracing, networking, cgroups. In many cases it can be used as a replacement for bpf_tail_call-based program chaining. See [1] for long term design discussion. v3 -> v4: - Included Peter's "86/alternatives: Teach text_poke_bp() to emulate instructions" as a first patch. If it changes between now and merge window, I'll rebease to newer version. The patch is necessary to do s/text_poke/text_poke_bp/ in patch 3 to fix the race. - In patch 4 fixed bpf_trampoline creation race spotted by Andrii. - Added patch 15 that annotates prog->kern bpf context types. It made patches 16 and 17 cleaner and more generic. - Addressed Andrii's feedback in other patches. v2 -> v3: - Addressed Song's and Andrii's comments - Fixed few minor bugs discovered while testing - Added one more libbpf patch v1 -> v2: - Addressed Andrii's comments - Added more test for fentry/fexit to kernel functions. Including stress test for maximum number of progs per trampoline. - Fixed a race btf_resolve_helper_id() - Added a patch to compare BTF types of functions arguments with actual types. - Added support for attaching BPF program to another BPF program via trampoline - Converted to use text_poke() API. That's the only viable mechanism to implement BPF-to-BPF attach. BPF-to-kernel attach can be refactored to use register_ftrace_direct() whenever it's available. [1] https://lore.kernel.org/bpf/20191112025112.bhzmrrh2pr76ssnh@ast-mbp.dhcp.thefacebook.com/ ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-11-15 23:49:34 +01:00
Alexei Starovoitov	d6f39601ec	selftests/bpf: Add a test for attaching BPF prog to another BPF prog and subprog Add a test that attaches one FEXIT program to main sched_cls networking program and two other FEXIT programs to subprograms. All three tracing programs access return values and skb->len of networking program and subprograms. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20191114185720.1641606-21-ast@kernel.org	2019-11-15 23:46:09 +01:00
Alexei Starovoitov	4c0963243c	selftests/bpf: Extend test_pkt_access test The test_pkt_access.o is used by multiple tests. Fix its section name so that program type can be automatically detected by libbpf and make it call other subprograms with skb argument. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20191114185720.1641606-20-ast@kernel.org	2019-11-15 23:45:50 +01:00
Alexei Starovoitov	e7bf94dbb8	libbpf: Add support for attaching BPF programs to other BPF programs Extend libbpf api to pass attach_prog_fd into bpf_object__open. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20191114185720.1641606-19-ast@kernel.org	2019-11-15 23:45:37 +01:00
Alexei Starovoitov	5b92a28aae	bpf: Support attaching tracing BPF program to other BPF programs Allow FENTRY/FEXIT BPF programs to attach to other BPF programs of any type including their subprograms. This feature allows snooping on input and output packets in XDP, TC programs including their return values. In order to do that the verifier needs to track types not only of vmlinux, but types of other BPF programs as well. The verifier also needs to translate uapi/linux/bpf.h types used by networking programs into kernel internal BTF types used by FENTRY/FEXIT BPF programs. In some cases LLVM optimizations can remove arguments from BPF subprograms without adjusting BTF info that LLVM backend knows. When BTF info disagrees with actual types that the verifiers sees the BPF trampoline has to fallback to conservative and treat all arguments as u64. The FENTRY/FEXIT program can still attach to such subprograms, but it won't be able to recognize pointer types like 'struct sk_buff *' and it won't be able to pass them to bpf_skb_output() for dumping packets to user space. The FENTRY/FEXIT program would need to use bpf_probe_read_kernel() instead. The BPF_PROG_LOAD command is extended with attach_prog_fd field. When it's set to zero the attach_btf_id is one vmlinux BTF type ids. When attach_prog_fd points to previously loaded BPF program the attach_btf_id is BTF type id of main function or one of its subprograms. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20191114185720.1641606-18-ast@kernel.org	2019-11-15 23:45:24 +01:00
Alexei Starovoitov	8c1b6e69dc	bpf: Compare BTF types of functions arguments with actual types Make the verifier check that BTF types of function arguments match actual types passed into top-level BPF program and into BPF-to-BPF calls. If types match such BPF programs and sub-programs will have full support of BPF trampoline. If types mismatch the trampoline has to be conservative. It has to save/restore five program arguments and assume 64-bit scalars. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20191114185720.1641606-17-ast@kernel.org	2019-11-15 23:45:02 +01:00
Alexei Starovoitov	91cc1a9974	bpf: Annotate context types Annotate BPF program context types with program-side type and kernel-side type. This type information is used by the verifier. btf_get_prog_ctx_type() is used in the later patches to verify that BTF type of ctx in BPF program matches to kernel expected ctx type. For example, the XDP program type is: BPF_PROG_TYPE(BPF_PROG_TYPE_XDP, xdp, struct xdp_md, struct xdp_buff) That means that XDP program should be written as: int xdp_prog(struct xdp_md *ctx) { ... } Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20191114185720.1641606-16-ast@kernel.org	2019-11-15 23:44:48 +01:00
Alexei Starovoitov	9cc31b3a09	bpf: Fix race in btf_resolve_helper_id() btf_resolve_helper_id() caching logic is a bit racy, since under root the verifier can verify several programs in parallel. Fix it with READ/WRITE_ONCE. Fix the type as well, since error is also recorded. Fixes: `a7658e1a41` ("bpf: Check types of arguments passed into helpers") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20191114185720.1641606-15-ast@kernel.org	2019-11-15 23:44:20 +01:00
Alexei Starovoitov	9fd4a39dc7	bpf: Reserve space for BPF trampoline in BPF programs BPF trampoline can be made to work with existing 5 bytes of BPF program prologue, but let's add 5 bytes of NOPs to the beginning of every JITed BPF program to make BPF trampoline job easier. They can be removed in the future. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20191114185720.1641606-14-ast@kernel.org	2019-11-15 23:44:06 +01:00
Alexei Starovoitov	e76d776e9c	selftests/bpf: Add stress test for maximum number of progs Add stress test for maximum number of attached BPF programs per BPF trampoline. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20191114185720.1641606-13-ast@kernel.org	2019-11-15 23:43:53 +01:00
Alexei Starovoitov	510312882c	selftests/bpf: Add combined fentry/fexit test Add a combined fentry/fexit test. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20191114185720.1641606-12-ast@kernel.org	2019-11-15 23:43:41 +01:00
Alexei Starovoitov	d3b0856e59	selftests/bpf: Add fexit tests for BPF trampoline Add fexit tests for BPF trampoline that checks kernel functions with up to 6 arguments of different sizes and their return values. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20191114185720.1641606-11-ast@kernel.org	2019-11-15 23:43:28 +01:00
Alexei Starovoitov	11d1e2eeff	selftests/bpf: Add test for BPF trampoline Add sanity test for BPF trampoline that checks kernel functions with up to 6 arguments of different sizes. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20191114185720.1641606-10-ast@kernel.org	2019-11-15 23:43:15 +01:00
Alexei Starovoitov	faeb2dce08	bpf: Add kernel test functions for fentry testing Add few kernel functions with various number of arguments, their types and sizes for BPF trampoline testing to cover different calling conventions. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20191114185720.1641606-9-ast@kernel.org	2019-11-15 23:43:01 +01:00
Alexei Starovoitov	e41074d39d	selftest/bpf: Simple test for fentry/fexit Add simple test for fentry and fexit programs around eth_type_trans. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20191114185720.1641606-8-ast@kernel.org	2019-11-15 23:42:46 +01:00
Alexei Starovoitov	b8c54ea455	libbpf: Add support to attach to fentry/fexit tracing progs Teach libbpf to recognize tracing programs types and attach them to fentry/fexit. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20191114185720.1641606-7-ast@kernel.org	2019-11-15 23:42:31 +01:00
Alexei Starovoitov	1442e2871b	libbpf: Introduce btf__find_by_name_kind() Introduce btf__find_by_name_kind() helper to search BTF by name and kind, since name alone can be ambiguous. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20191114185720.1641606-6-ast@kernel.org	2019-11-15 23:42:14 +01:00
Alexei Starovoitov	fec56f5890	bpf: Introduce BPF trampoline Introduce BPF trampoline concept to allow kernel code to call into BPF programs with practically zero overhead. The trampoline generation logic is architecture dependent. It's converting native calling convention into BPF calling convention. BPF ISA is 64-bit (even on 32-bit architectures). The registers R1 to R5 are used to pass arguments into BPF functions. The main BPF program accepts only single argument "ctx" in R1. Whereas CPU native calling convention is different. x86-64 is passing first 6 arguments in registers and the rest on the stack. x86-32 is passing first 3 arguments in registers. sparc64 is passing first 6 in registers. And so on. The trampolines between BPF and kernel already exist. BPF_CALL_x macros in include/linux/filter.h statically compile trampolines from BPF into kernel helpers. They convert up to five u64 arguments into kernel C pointers and integers. On 64-bit architectures this BPF_to_kernel trampolines are nops. On 32-bit architecture they're meaningful. The opposite job kernel_to_BPF trampolines is done by CAST_TO_U64 macros and __bpf_trace_##call() shim functions in include/trace/bpf_probe.h. They convert kernel function arguments into array of u64s that BPF program consumes via R1=ctx pointer. This patch set is doing the same job as __bpf_trace_##call() static trampolines, but dynamically for any kernel function. There are ~22k global kernel functions that are attachable via nop at function entry. The function arguments and types are described in BTF. The job of btf_distill_func_proto() function is to extract useful information from BTF into "function model" that architecture dependent trampoline generators will use to generate assembly code to cast kernel function arguments into array of u64s. For example the kernel function eth_type_trans has two pointers. They will be casted to u64 and stored into stack of generated trampoline. The pointer to that stack space will be passed into BPF program in R1. On x86-64 such generated trampoline will consume 16 bytes of stack and two stores of %rdi and %rsi into stack. The verifier will make sure that only two u64 are accessed read-only by BPF program. The verifier will also recognize the precise type of the pointers being accessed and will not allow typecasting of the pointer to a different type within BPF program. The tracing use case in the datacenter demonstrated that certain key kernel functions have (like tcp_retransmit_skb) have 2 or more kprobes that are always active. Other functions have both kprobe and kretprobe. So it is essential to keep both kernel code and BPF programs executing at maximum speed. Hence generated BPF trampoline is re-generated every time new program is attached or detached to maintain maximum performance. To avoid the high cost of retpoline the attached BPF programs are called directly. __bpf_prog_enter/exit() are used to support per-program execution stats. In the future this logic will be optimized further by adding support for bpf_stats_enabled_key inside generated assembly code. Introduction of preemptible and sleepable BPF programs will completely remove the need to call to __bpf_prog_enter/exit(). Detach of a BPF program from the trampoline should not fail. To avoid memory allocation in detach path the half of the page is used as a reserve and flipped after each attach/detach. 2k bytes is enough to call 40+ BPF programs directly which is enough for BPF tracing use cases. This limit can be increased in the future. BPF_TRACE_FENTRY programs have access to raw kernel function arguments while BPF_TRACE_FEXIT programs have access to kernel return value as well. Often kprobe BPF program remembers function arguments in a map while kretprobe fetches arguments from a map and analyzes them together with return value. BPF_TRACE_FEXIT accelerates this typical use case. Recursion prevention for kprobe BPF programs is done via per-cpu bpf_prog_active counter. In practice that turned out to be a mistake. It caused programs to randomly skip execution. The tracing tools missed results they were looking for. Hence BPF trampoline doesn't provide builtin recursion prevention. It's a job of BPF program itself and will be addressed in the follow up patches. BPF trampoline is intended to be used beyond tracing and fentry/fexit use cases in the future. For example to remove retpoline cost from XDP programs. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20191114185720.1641606-5-ast@kernel.org	2019-11-15 23:41:51 +01:00
Alexei Starovoitov	5964b2000f	bpf: Add bpf_arch_text_poke() helper Add bpf_arch_text_poke() helper that is used by BPF trampoline logic to patch nops/calls in kernel text into calls into BPF trampoline and to patch calls/nops inside BPF programs too. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20191114185720.1641606-4-ast@kernel.org	2019-11-15 23:41:28 +01:00
Alexei Starovoitov	3b2744e665	bpf: Refactor x86 JIT into helpers Refactor x86 JITing of LDX, STX, CALL instructions into separate helper functions. No functional changes in LDX and STX helpers. There is a minor change in CALL helper. It will populate target address correctly on the first pass of JIT instead of second pass. That won't reduce total number of JIT passes though. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20191114185720.1641606-3-ast@kernel.org	2019-11-15 23:41:06 +01:00
Peter Zijlstra	c3d6324f84	x86/alternatives: Teach text_poke_bp() to emulate instructions In preparation for static_call and variable size jump_label support, teach text_poke_bp() to emulate instructions, namely: JMP32, JMP8, CALL, NOP2, NOP_ATOMIC5, INT3 The current text_poke_bp() takes a @handler argument which is used as a jump target when the temporary INT3 is hit by a different CPU. When patching CALL instructions, this doesn't work because we'd miss the PUSH of the return address. Instead, teach poke_int3_handler() to emulate an instruction, typically the instruction we're patching in. This fits almost all text_poke_bp() users, except arch_unoptimize_kprobe() which restores random text, and for that site we have to build an explicit emulate instruction. Tested-by: Alexei Starovoitov <ast@kernel.org> Tested-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Reviewed-by: Daniel Bristot de Oliveira <bristot@redhat.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20191111132457.529086974@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit 8c7eebc10687af45ac8e40ad1bac0cf7893dba9f) Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-11-15 14:07:01 -08:00
Mao Wenan	808c9f7ebf	bpf, doc: Change right arguments for JIT example code The example code for the x86_64 JIT uses the wrong arguments when calling function bar(). Signed-off-by: Mao Wenan <maowenan@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191114034351.162740-1-maowenan@huawei.com	2019-11-15 22:36:35 +01:00
Andre Guedes	b313332980	samples/bpf: Add missing option to xdpsock usage Commit `743e568c15` (samples/bpf: Add a "force" flag to XDP samples) introduced the '-F' option but missed adding it to the usage() and the 'long_option' array. Fixes: `743e568c15` (samples/bpf: Add a "force" flag to XDP samples) Signed-off-by: Andre Guedes <andre.guedes@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191114162847.221770-2-andre.guedes@intel.com	2019-11-15 22:32:10 +01:00
Andre Guedes	110b2263db	samples/bpf: Remove duplicate option from xdpsock The '-f' option is shown twice in the usage(). This patch removes the outdated version. Signed-off-by: Andre Guedes <andre.guedes@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191114162847.221770-1-andre.guedes@intel.com	2019-11-15 22:29:17 +01:00
Ilya Leoshkevich	fcf3513139	s390/bpf: Make sure JIT passes do not increase code size The upcoming s390 branch length extension patches rely on "passes do not increase code size" property in order to consistently choose between short and long branches. Currently this property does not hold between the first and the second passes for register save/restore sequences, as well as various code fragments that depend on SEEN_* flags. Generate the code during the first pass conservatively: assume register save/restore sequences have the maximum possible length, and that all SEEN_* flags are set. Also refuse to JIT if this happens anyway (e.g. due to a bug), as this might lead to verifier bypass once long branches are introduced. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191114151820.53222-1-iii@linux.ibm.com	2019-11-15 22:25:54 +01:00
Ilya Leoshkevich	b7b3fc8dd9	bpf: Support doubleword alignment in bpf_jit_binary_alloc Currently passing alignment greater than 4 to bpf_jit_binary_alloc does not work: in such cases it silently aligns only to 4 bytes. On s390, in order to load a constant from memory in a large (>512k) BPF program, one must use lgrl instruction, whose memory operand must be aligned on an 8-byte boundary. This patch makes it possible to request 8-byte alignment from bpf_jit_binary_alloc, and also makes it issue a warning when an unsupported alignment is requested. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191115123722.58462-1-iii@linux.ibm.com	2019-11-15 22:25:00 +01:00
Anders Roxell	e47a179997	bpf, testing: Add missing object file to TEST_FILES When installing kselftests to its own directory and run the test_lwt_ip_encap.sh it will complain that test_lwt_ip_encap.o can't be found. Same with the test_tc_edt.sh test it will complain that test_tc_edt.o can't be found. $ ./test_lwt_ip_encap.sh starting egress IPv4 encap test Error opening object test_lwt_ip_encap.o: No such file or directory Object hashing failed! Cannot initialize ELF context! Failed to parse eBPF program: Invalid argument Rework to add test_lwt_ip_encap.o and test_tc_edt.o to TEST_FILES so the object file gets installed when installing kselftest. Fixes: `74b5a5968f` ("selftests/bpf: Replace test_progs and test_maps w/ general rule") Signed-off-by: Anders Roxell <anders.roxell@linaro.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20191111161728.8854-1-anders.roxell@linaro.org	2019-11-11 22:35:23 +01:00
Yonghong Song	b7a0d65d80	bpf, testing: Workaround a verifier failure for test_progs With latest llvm compiler, running test_progs will have the following verifier failure for test_sysctl_loop1.o: libbpf: load bpf program failed: Permission denied libbpf: -- BEGIN DUMP LOG --- libbpf: invalid indirect read from stack var_off (0x0; 0xff)+196 size 7 ... libbpf: -- END LOG -- libbpf: failed to load program 'cgroup/sysctl' libbpf: failed to load object 'test_sysctl_loop1.o' The related bytecode looks as below: 0000000000000308 LBB0_8: 97: r4 = r10 98: r4 += -288 99: r4 += r7 100: w8 &= 255 101: r1 = r10 102: r1 += -488 103: r1 += r8 104: r2 = 7 105: r3 = 0 106: call 106 107: w1 = w0 108: w1 += -1 109: if w1 > 6 goto -24 <LBB0_5> 110: w0 += w8 111: r7 += 8 112: w8 = w0 113: if r7 != 224 goto -17 <LBB0_8> And source code: for (i = 0; i < ARRAY_SIZE(tcp_mem); ++i) { ret = bpf_strtoul(value + off, MAX_ULONG_STR_LEN, 0, tcp_mem + i); if (ret <= 0 \|\| ret > MAX_ULONG_STR_LEN) return 0; off += ret & MAX_ULONG_STR_LEN; } Current verifier is not able to conclude that register w0 before '+' at insn 110 has a range of 1 to 7 and thinks it is from 0 - 255. This leads to more conservative range for w8 at insn 112, and later verifier complaint. Let us workaround this issue until we found a compiler and/or verifier solution. The workaround in this patch is to make variable 'ret' volatile, which will force a reload and then '&' operation to ensure better value range. With this patch, I got the below byte code for the loop: 0000000000000328 LBB0_9: 101: r4 = r10 102: r4 += -288 103: r4 += r7 104: w8 &= 255 105: r1 = r10 106: r1 += -488 107: r1 += r8 108: r2 = 7 109: r3 = 0 110: call 106 111: (u32 )(r10 - 64) = r0 112: r1 = (u32 )(r10 - 64) 113: if w1 s< 1 goto -28 <LBB0_5> 114: r1 = (u32 )(r10 - 64) 115: if w1 s> 7 goto -30 <LBB0_5> 116: r1 = (u32 )(r10 - 64) 117: w1 &= 7 118: w1 += w8 119: r7 += 8 120: w8 = w1 121: if r7 != 224 goto -21 <LBB0_9> Insn 117 did the '&' operation and we got more precise value range for 'w8' at insn 120. The test is happy then: #3/17 test_sysctl_loop1.o:OK Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20191107170045.2503480-1-yhs@fb.com	2019-11-11 14:03:10 +01:00
Alexei Starovoitov	0d2ec5b51d	Merge branch 'share-umem' Magnus Karlsson says: ==================== This patch set extends libbpf and the xdpsock sample program to demonstrate the shared umem mode (XDP_SHARED_UMEM) as well as Rx-only and Tx-only sockets. This in order for users to have an example to use as a blue print and also so that these modes will be exercised more frequently. Note that the user needs to supply an XDP program with the XDP_SHARED_UMEM mode that distributes the packets over the sockets according to some policy. There is an example supplied with the xdpsock program, but there is no default one in libbpf similarly to when XDP_SHARED_UMEM is not used. The reason for this is that I felt that supplying one that would work for all users in this mode is futile. There are just tons of ways to distribute packets, so whatever I come up with and build into libbpf would be wrong in most cases. This patch has been applied against commit `30ee348c12` ("Merge branch 'bpf-libbpf-fixes'") Structure of the patch set: Patch 1: Adds shared umem support to libbpf Patch 2: Shared umem support and example XPD program added to xdpsock sample Patch 3: Adds Rx-only and Tx-only support to libbpf Patch 4: Uses Rx-only sockets for rxdrop and Tx-only sockets for txpush in the xdpsock sample Patch 5: Add documentation entries for these two features ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-11-10 19:30:51 -08:00
Magnus Karlsson	57afa8b0cf	xsk: Extend documentation for Rx\|Tx-only sockets and shared umems Add more documentation about the new Rx-only and Tx-only sockets in libbpf and also how libbpf can now support shared umems. Also found two pieces that could be improved in the text, that got fixed in this commit. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Tested-by: William Tu <u9012063@gmail.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Link: https://lore.kernel.org/bpf/1573148860-30254-6-git-send-email-magnus.karlsson@intel.com	2019-11-10 19:30:46 -08:00
Magnus Karlsson	661842c46d	samples/bpf: Use Rx-only and Tx-only sockets in xdpsock Use Rx-only sockets for the rxdrop sample and Tx-only sockets for the txpush sample in the xdpsock application. This so that we exercise and show case these socket types too. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Tested-by: William Tu <u9012063@gmail.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Link: https://lore.kernel.org/bpf/1573148860-30254-5-git-send-email-magnus.karlsson@intel.com	2019-11-10 19:30:46 -08:00
Magnus Karlsson	a68977d269	libbpf: Allow for creating Rx or Tx only AF_XDP sockets The libbpf AF_XDP code is extended to allow for the creation of Rx only or Tx only sockets. Previously it returned an error if the socket was not initialized for both Rx and Tx. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Tested-by: William Tu <u9012063@gmail.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Link: https://lore.kernel.org/bpf/1573148860-30254-4-git-send-email-magnus.karlsson@intel.com	2019-11-10 19:30:46 -08:00
Magnus Karlsson	2e5d72c15f	samples/bpf: Add XDP_SHARED_UMEM support to xdpsock Add support for the XDP_SHARED_UMEM mode to the xdpsock sample application. As libbpf does not have a built in XDP program for this mode, we use an explicitly loaded XDP program. This also serves as an example on how to write your own XDP program that can route to an AF_XDP socket. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Tested-by: William Tu <u9012063@gmail.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Link: https://lore.kernel.org/bpf/1573148860-30254-3-git-send-email-magnus.karlsson@intel.com	2019-11-10 19:30:45 -08:00
Magnus Karlsson	cbf07409d0	libbpf: Support XDP_SHARED_UMEM with external XDP program Add support in libbpf to create multiple sockets that share a single umem. Note that an external XDP program need to be supplied that routes the incoming traffic to the desired sockets. So you need to supply the libbpf_flag XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD and load your own XDP program. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Tested-by: William Tu <u9012063@gmail.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Link: https://lore.kernel.org/bpf/1573148860-30254-2-git-send-email-magnus.karlsson@intel.com	2019-11-10 19:30:45 -08:00
Alexei Starovoitov	472aeb386e	Merge branch 'map-pinning' Toke Høiland-Jørgensen says: ==================== This series fixes a few bugs in libbpf that I discovered while playing around with the new auto-pinning code, and writing the first utility in xdp-tools[0]: - If object loading fails, libbpf does not clean up the pinnings created by the auto-pinning mechanism. - EPERM is not propagated to the caller on program load - Netlink functions write error messages directly to stderr In addition, libbpf currently only has a somewhat limited getter function for XDP link info, which makes it impossible to discover whether an attached program is in SKB mode or not. So the last patch in the series adds a new getter for XDP link info which returns all the information returned via netlink (and which can be extended later). Finally, add a getter for BPF program size, which can be used by the caller to estimate the amount of locked memory needed to load a program. A selftest is added for the pinning change, while the other features were tested in the xdp-filter tool from the xdp-tools repo. The 'new-libbpf-features' branch contains the commits that make use of the new XDP getter and the corrected EPERM error code. [0] https://github.com/xdp-project/xdp-tools Changelog: v4: - Don't do any size checks on struct xdp_info, just copy (and/or zero) whatever size the caller supplied. v3: - Pass through all kernel error codes on program load (instead of just EPERM). - No new bpf_object__unload() variant, just do the loop at the caller - Don't reject struct xdp_info sizes that are bigger than what we expect. - Add a comment noting that bpf_program__size() returns the size in bytes v2: - Keep function names in libbpf.map sorted properly ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-11-10 19:26:35 -08:00
Toke Høiland-Jørgensen	1a734efe06	libbpf: Add getter for program size This adds a new getter for the BPF program size (in bytes). This is useful for a caller that is trying to predict how much memory will be locked by loading a BPF object into the kernel. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/157333185272.88376.10996937115395724683.stgit@toke.dk	2019-11-10 19:26:30 -08:00
Toke Høiland-Jørgensen	473f4e133a	libbpf: Add bpf_get_link_xdp_info() function to get more XDP information Currently, libbpf only provides a function to get a single ID for the XDP program attached to the interface. However, it can be useful to get the full set of program IDs attached, along with the attachment mode, in one go. Add a new getter function to support this, using an extendible structure to carry the information. Express the old bpf_get_link_id() function in terms of the new function. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/157333185164.88376.7520653040667637246.stgit@toke.dk	2019-11-10 19:26:30 -08:00
Toke Høiland-Jørgensen	b6e99b010e	libbpf: Use pr_warn() when printing netlink errors The netlink functions were using fprintf(stderr, ) directly to print out error messages, instead of going through the usual logging macros. This makes it impossible for the calling application to silence or redirect those error messages. Fix this by switching to pr_warn() in nlattr.c and netlink.c. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/157333185055.88376.15999360127117901443.stgit@toke.dk	2019-11-10 19:26:30 -08:00
Toke Høiland-Jørgensen	4f33ddb4e3	libbpf: Propagate EPERM to caller on program load When loading an eBPF program, libbpf overrides the return code for EPERM errors instead of returning it to the caller. This makes it hard to figure out what went wrong on load. In particular, EPERM is returned when the system rlimit is too low to lock the memory required for the BPF program. Previously, this was somewhat obscured because the rlimit error would be hit on map creation (which does return it correctly). However, since maps can now be reused, object load can proceed all the way to loading programs without hitting the error; propagating it even in this case makes it possible for the caller to react appropriately (and, e.g., attempt to raise the rlimit before retrying). Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/157333184946.88376.11768171652794234561.stgit@toke.dk	2019-11-10 19:26:30 -08:00
Toke Høiland-Jørgensen	9c4e395a1e	selftests/bpf: Add tests for automatic map unpinning on load failure This add tests for the different variations of automatic map unpinning on load failure. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/157333184838.88376.8243704248624814775.stgit@toke.dk	2019-11-10 19:26:30 -08:00
Toke Høiland-Jørgensen	ec6d5f47bf	libbpf: Unpin auto-pinned maps if loading fails Since the automatic map-pinning happens during load, it will leave pinned maps around if the load fails at a later stage. Fix this by unpinning any pinned maps on cleanup. To avoid unpinning pinned maps that were reused rather than newly pinned, add a new boolean property on struct bpf_map to keep track of whether that map was reused or not; and only unpin those maps that were not reused. Fixes: `57a00f4164` ("libbpf: Add auto-pinning of maps when loading BPF objects") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/157333184731.88376.9992935027056165873.stgit@toke.dk	2019-11-10 19:26:30 -08:00
Daniel T. Lee	451d1dc886	samples: bpf: update map definition to new syntax BTF-defined map Since, the new syntax of BTF-defined map has been introduced, the syntax for using maps under samples directory are mixed up. For example, some are already using the new syntax, and some are using existing syntax by calling them as 'legacy'. As stated at commit `abd29c9314` ("libbpf: allow specifying map definitions using BTF"), the BTF-defined map has more compatablility with extending supported map definition features. The commit doesn't replace all of the map to new BTF-defined map, because some of the samples still use bpf_load instead of libbpf, which can't properly create BTF-defined map. This will only updates the samples which uses libbpf API for loading bpf program. (ex. bpf_prog_load_xattr) Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-11-08 19:21:52 -08:00
Daniel T. Lee	afbe3c27d9	samples: bpf: Update outdated error message Currently, under samples, several methods are being used to load bpf program. Since using libbpf is preferred solution, lots of previously used 'load_bpf_file' from bpf_load are replaced with 'bpf_prog_load_xattr' from libbpf. But some of the error messages still show up as 'load_bpf_file' instead of 'bpf_prog_load_xattr'. This commit fixes outdated errror messages under samples and fixes some code style issues. Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20191107005153.31541-2-danieltimlee@gmail.com	2019-11-08 19:19:43 -08:00
Martin KaFai Lau	ed5941af3f	bpf: Add cb access in kfree_skb test Access the skb->cb[] in the kfree_skb test. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191107180905.4097871-1-kafai@fb.com	2019-11-07 10:59:08 -08:00
Martin KaFai Lau	7e3617a72d	bpf: Add array support to btf_struct_access This patch adds array support to btf_struct_access(). It supports array of int, array of struct and multidimensional array. It also allows using u8[] as a scratch space. For example, it allows access the "char cb[48]" with size larger than the array's element "char". Another potential use case is "u64 icsk_ca_priv[]" in the tcp congestion control. btf_resolve_size() is added to resolve the size of any type. It will follow the modifier if there is any. Please see the function comment for details. This patch also adds the "off < moff" check at the beginning of the for loop. It is to reject cases when "off" is pointing to a "hole" in a struct. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191107180903.4097702-1-kafai@fb.com	2019-11-07 10:59:08 -08:00
Daniel Borkmann	30ee348c12	Merge branch 'bpf-libbpf-fixes' Andrii Nakryiko says: ==================== Github's mirror of libbpf got LGTM and Coverity statis analysis running against it and spotted few real bugs and few potential issues. This patch series fixes found issues. ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-11-07 16:20:41 +01:00
Andrii Nakryiko	98e527af30	libbpf: Improve handling of corrupted ELF during map initialization If we get ELF file with "maps" section, but no symbols pointing to it, we'll end up with division by zero. Add check against this situation and exit early with error. Found by Coverity scan against Github libbpf sources. Fixes: `bf82927125` ("libbpf: refactor map initialization") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191107020855.3834758-6-andriin@fb.com	2019-11-07 16:20:38 +01:00
Andrii Nakryiko	994021a7e0	libbpf: Make btf__resolve_size logic always check size error condition Perform size check always in btf__resolve_size. Makes the logic a bit more robust against corrupted BTF and silences LGTM/Coverity complaining about always true (size < 0) check. Fixes: `69eaab04c6` ("btf: extract BTF type size calculation") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191107020855.3834758-5-andriin@fb.com	2019-11-07 16:20:38 +01:00
Andrii Nakryiko	dd3ab12637	libbpf: Fix another potential overflow issue in bpf_prog_linfo Fix few issues found by Coverity and LGTM. Fixes: `b053b439b7` ("bpf: libbpf: bpftool: Print bpf_line_info during prog dump") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191107020855.3834758-4-andriin@fb.com	2019-11-07 16:20:38 +01:00
Andrii Nakryiko	4ee1135615	libbpf: Fix potential overflow issue Fix a potential overflow issue found by LGTM analysis, based on Github libbpf source code. Fixes: `3d65014146` ("bpf: libbpf: Add btf_line_info support to libbpf") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191107020855.3834758-3-andriin@fb.com	2019-11-07 16:20:37 +01:00

1 2 3 4 5 ...

873546 Commits