Commit Graph

277 Commits

Author SHA1 Message Date
Toke Høiland-Jørgensen
dc3a2d2547 libbpf: Print hint about ulimit when getting permission denied error
Probably the single most common error newcomers to XDP are stumped by is
the 'permission denied' error they get when trying to load their program
and 'ulimit -l' is set too low. For examples, see [0], [1].

Since the error code is UAPI, we can't change that. Instead, this patch
adds a few heuristics in libbpf and outputs an additional hint if they are
met: If an EPERM is returned on map create or program load, and geteuid()
shows we are root, and the current RLIMIT_MEMLOCK is not infinity, we
output a hint about raising 'ulimit -l' as an additional log line.

[0] https://marc.info/?l=xdp-newbies&m=157043612505624&w=2
[1] https://github.com/xdp-project/xdp-tutorial/issues/86

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20191216181204.724953-1-toke@redhat.com
2019-12-16 14:52:34 -08:00
Andrii Nakryiko
1b484b301c libbpf: Support flexible arrays in CO-RE
Some data stuctures in kernel are defined with either zero-sized array or
flexible (dimensionless) array at the end of a struct. Actual data of such
array follows in memory immediately after the end of that struct, forming its
variable-sized "body" of elements. Support such access pattern in CO-RE
relocation handling.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191215070844.1014385-2-andriin@fb.com
2019-12-15 16:53:50 -08:00
Andrii Nakryiko
2ad97d473d bpftool: Generate externs datasec in BPF skeleton
Add support for generation of mmap()-ed read-only view of libbpf-provided
extern variables. As externs are not supposed to be provided by user code
(that's what .data, .bss, and .rodata is for), don't mmap() it initially. Only
after skeleton load is performed, map .extern contents as read-only memory.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191214014710.3449601-4-andriin@fb.com
2019-12-15 16:41:12 -08:00
Andrii Nakryiko
166750bc1d libbpf: Support libbpf-provided extern variables
Add support for extern variables, provided to BPF program by libbpf. Currently
the following extern variables are supported:
  - LINUX_KERNEL_VERSION; version of a kernel in which BPF program is
    executing, follows KERNEL_VERSION() macro convention, can be 4- and 8-byte
    long;
  - CONFIG_xxx values; a set of values of actual kernel config. Tristate,
    boolean, strings, and integer values are supported.

Set of possible values is determined by declared type of extern variable.
Supported types of variables are:
- Tristate values. Are represented as `enum libbpf_tristate`. Accepted values
  are **strictly** 'y', 'n', or 'm', which are represented as TRI_YES, TRI_NO,
  or TRI_MODULE, respectively.
- Boolean values. Are represented as bool (_Bool) types. Accepted values are
  'y' and 'n' only, turning into true/false values, respectively.
- Single-character values. Can be used both as a substritute for
  bool/tristate, or as a small-range integer:
  - 'y'/'n'/'m' are represented as is, as characters 'y', 'n', or 'm';
  - integers in a range [-128, 127] or [0, 255] (depending on signedness of
    char in target architecture) are recognized and represented with
    respective values of char type.
- Strings. String values are declared as fixed-length char arrays. String of
  up to that length will be accepted and put in first N bytes of char array,
  with the rest of bytes zeroed out. If config string value is longer than
  space alloted, it will be truncated and warning message emitted. Char array
  is always zero terminated. String literals in config have to be enclosed in
  double quotes, just like C-style string literals.
- Integers. 8-, 16-, 32-, and 64-bit integers are supported, both signed and
  unsigned variants. Libbpf enforces parsed config value to be in the
  supported range of corresponding integer type. Integers values in config can
  be:
  - decimal integers, with optional + and - signs;
  - hexadecimal integers, prefixed with 0x or 0X;
  - octal integers, starting with 0.

Config file itself is searched in /boot/config-$(uname -r) location with
fallback to /proc/config.gz, unless config path is specified explicitly
through bpf_object_open_opts' kernel_config_path option. Both gzipped and
plain text formats are supported. Libbpf adds explicit dependency on zlib
because of this, but this shouldn't be a problem, given libelf already depends
on zlib.

All detected extern variables, are put into a separate .extern internal map.
It, similarly to .rodata map, is marked as read-only from BPF program side, as
well as is frozen on load. This allows BPF verifier to track extern values as
constants and perform enhanced branch prediction and dead code elimination.
This can be relied upon for doing kernel version/feature detection and using
potentially unsupported field relocations or BPF helpers in a CO-RE-based BPF
program, while still having a single version of BPF program running on old and
new kernels. Selftests are validating this explicitly for unexisting BPF
helper.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191214014710.3449601-3-andriin@fb.com
2019-12-15 16:41:12 -08:00
Andrii Nakryiko
ac9d138963 libbpf: Extract internal map names into constants
Instead of duplicating string literals, keep them in one place and consistent.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191214014710.3449601-2-andriin@fb.com
2019-12-15 16:41:12 -08:00
Andrii Nakryiko
d66562fba1 libbpf: Add BPF object skeleton support
Add new set of APIs, allowing to open/load/attach BPF object through BPF
object skeleton, generated by bpftool for a specific BPF object file. All the
xxx_skeleton() APIs wrap up corresponding bpf_object_xxx() APIs, but
additionally also automate map/program lookups by name, global data
initialization and mmap()-ing, etc.  All this greatly improves and simplifies
userspace usability of working with BPF programs. See follow up patches for
examples.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-13-andriin@fb.com
2019-12-15 15:58:05 -08:00
Andrii Nakryiko
3f51935314 libbpf: Reduce log level of supported section names dump
It's quite spammy. And now that bpf_object__open() is trying to determine
program type from its section name, we are getting these verbose messages all
the time. Reduce their log level to DEBUG.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-12-andriin@fb.com
2019-12-15 15:58:05 -08:00
Andrii Nakryiko
13acb508ae libbpf: Postpone BTF ID finding for TRACING programs to load phase
Move BTF ID determination for BPF_PROG_TYPE_TRACING programs to a load phase.
Performing it at open step is inconvenient, because it prevents BPF skeleton
generation on older host kernel, which doesn't contain BTF_KIND_FUNCs
information in vmlinux BTF. This is a common set up, though, when, e.g.,
selftests are compiled on older host kernel, but the test program itself is
executed in qemu VM with bleeding edge kernel. Having this BTF searching
performed at load time allows to successfully use bpf_object__open() for
codegen and inspection of BPF object file.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-11-andriin@fb.com
2019-12-15 15:58:05 -08:00
Andrii Nakryiko
eba9c5f498 libbpf: Refactor global data map initialization
Refactor global data map initialization to use anonymous mmap()-ed memory
instead of malloc()-ed one. This allows to do a transparent re-mmap()-ing of
already existing memory address to point to BPF map's memory after
bpf_object__load() step (done in follow up patch). This choreographed setup
allows to have a nice and unsurprising way to pre-initialize read-only (and
r/w as well) maps by user and after BPF map creation keep working with
mmap()-ed contents of this map. All in a way that doesn't require user code to
update any pointers: the illusion of working with memory contents is preserved
before and after actual BPF map instantiation.

Selftests and runqslower example demonstrate this feature in follow up patches.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-10-andriin@fb.com
2019-12-15 15:58:05 -08:00
Andrii Nakryiko
01af3bf067 libbpf: Expose BPF program's function name
Add APIs to get BPF program function name, as opposed to bpf_program__title(),
which returns BPF program function's section name. Function name has a benefit
of being a valid C identifier and uniquely identifies a specific BPF program,
while section name can be duplicated across multiple independent BPF programs.

Add also bpf_object__find_program_by_name(), similar to
bpf_object__find_program_by_title(), to facilitate looking up BPF programs by
their C function names.

Convert one of selftests to new API for look up.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-9-andriin@fb.com
2019-12-15 15:58:05 -08:00
Andrii Nakryiko
d7a18ea7e8 libbpf: Add generic bpf_program__attach()
Generalize BPF program attaching and allow libbpf to auto-detect type (and
extra parameters, where applicable) and attach supported BPF program types
based on program sections. Currently this is supported for:
- kprobe/kretprobe;
- tracepoint;
- raw tracepoint;
- tracing programs (typed raw TP/fentry/fexit).

More types support can be trivially added within this framework.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-3-andriin@fb.com
2019-12-15 15:58:04 -08:00
Andrii Nakryiko
0d13bfce02 libbpf: Don't require root for bpf_object__open()
Reorganize bpf_object__open and bpf_object__load steps such that
bpf_object__open doesn't need root access. This was previously done for
feature probing and BTF sanitization. This doesn't have to happen on open,
though, so move all those steps into the load phase.

This is important, because it makes it possible for tools like bpftool, to
just open BPF object file and inspect their contents: programs, maps, BTF,
etc. For such operations it is prohibitive to require root access. On the
other hand, there is a lot of custom libbpf logic in those steps, so its best
avoided for tools to reimplement all that on their own.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-2-andriin@fb.com
2019-12-15 15:58:04 -08:00
Andrii Nakryiko
783b8f01f5 libbpf: Don't attach perf_buffer to offline/missing CPUs
It's quite common on some systems to have more CPUs enlisted as "possible",
than there are (and could ever be) present/online CPUs. In such cases,
perf_buffer creationg will fail due to inability to create perf event on
missing CPU with error like this:

libbpf: failed to open perf buffer event on cpu #16: No such device

This patch fixes the logic of perf_buffer__new() to ignore CPUs that are
missing or currently offline. In rare cases where user explicitly listed
specific CPUs to connect to, behavior is unchanged: libbpf will try to open
perf event buffer on specified CPU(s) anyways.

Fixes: fb84b82246 ("libbpf: add perf buffer API")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191212013609.1691168-1-andriin@fb.com
2019-12-13 13:00:09 -08:00
Andrii Nakryiko
6803ee25f0 libbpf: Extract and generalize CPU mask parsing logic
This logic is re-used for parsing a set of online CPUs. Having it as an
isolated piece of code working with input string makes it conveninent to test
this logic as well. While refactoring, also improve the robustness of original
implementation.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191212013548.1690564-1-andriin@fb.com
2019-12-13 12:58:51 -08:00
Jakub Sitnicki
67d69ccdf3 libbpf: Recognize SK_REUSEPORT programs from section name
Allow loading BPF object files that contain SK_REUSEPORT programs without
having to manually set the program type before load if the the section name
is set to "sk_reuseport".

Makes user-space code needed to load SK_REUSEPORT BPF program more concise.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191212102259.418536-2-jakub@cloudflare.com
2019-12-13 12:38:00 -08:00
Andrii Nakryiko
679152d3a3 libbpf: Fix printf compilation warnings on ppc64le arch
On ppc64le __u64 and __s64 are defined as long int and unsigned long int,
respectively. This causes compiler to emit warning when %lld/%llu are used to
printf 64-bit numbers. Fix this by casting to size_t/ssize_t with %zu and %zd
format specifiers, respectively.

v1->v2:
- use size_t/ssize_t instead of custom typedefs (Martin).

Fixes: 1f8e2bcb2c ("libbpf: Refactor relocation handling")
Fixes: abd29c9314 ("libbpf: allow specifying map definitions using BTF")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191212171918.638010-1-andriin@fb.com
2019-12-12 13:47:24 -08:00
Alexei Starovoitov
7c3977d1e8 libbpf: Fix sym->st_value print on 32-bit arches
The st_value field is a 64-bit value and causing this error on 32-bit arches:

In file included from libbpf.c:52:
libbpf.c: In function 'bpf_program__record_reloc':
libbpf_internal.h:59:22: error: format '%lu' expects argument of type 'long unsigned int', but argument 3 has type 'Elf64_Addr' {aka 'const long long unsigned int'} [-Werror=format=]

Fix it with (__u64) cast.

Fixes: 1f8e2bcb2c ("libbpf: Refactor relocation handling")
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-11-27 17:46:56 -08:00
Andrii Nakryiko
53f8dd434b libbpf: Fix global variable relocation
Similarly to a0d7da26ce ("libbpf: Fix call relocation offset calculation
bug"), relocations against global variables need to take into account
referenced symbol's st_value, which holds offset into a corresponding data
section (and, subsequently, offset into internal backing map). For static
variables this offset is always zero and data offset is completely described
by respective instruction's imm field.

Convert a bunch of selftests to global variables. Previously they were relying
on `static volatile` trick to ensure Clang doesn't inline static variables,
which with global variables is not necessary anymore.

Fixes: 393cdfbee8 ("libbpf: Support initialized global variables")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20191127200651.1381348-1-andriin@fb.com
2019-11-27 16:34:21 -08:00
Andrii Nakryiko
b615e5a1e0 libbpf: Fix usage of u32 in userspace code
u32 is not defined for libbpf when compiled outside of kernel sources (e.g.,
in Github projection). Use __u32 instead.

Fixes: b8c54ea455 ("libbpf: Add support to attach to fentry/fexit tracing progs")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191125212948.1163343-1-andriin@fb.com
2019-11-25 13:52:01 -08:00
Andrii Nakryiko
1aace10f41 libbpf: Fix bpf_object name determination for bpf_object__open_file()
If bpf_object__open_file() gets path like "some/dir/obj.o", it should derive
BPF object's name as "obj" (unless overriden through opts->object_name).
Instead, due to using `path` as a fallback value for opts->obj_name, path is
used as is for object name, so for above example BPF object's name will be
verbatim "some/dir/obj", which leads to all sorts of troubles, especially when
internal maps are concern (they are using up to 8 characters of object name).
Fix that by ensuring object_name stays NULL, unless overriden.

Fixes: 291ee02b5e ("libbpf: Refactor bpf_object__open APIs to use common opts")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191122003527.551556-1-andriin@fb.com
2019-11-24 16:58:46 -08:00
Andrii Nakryiko
393cdfbee8 libbpf: Support initialized global variables
Initialized global variables are no different in ELF from static variables,
and don't require any extra support from libbpf. But they are matching
semantics of global data (backed by BPF maps) more closely, preventing
LLVM/Clang from aggressively inlining constant values and not requiring
volatile incantations to prevent those. This patch enables global variables.
It still disables uninitialized variables, which will be put into special COM
(common) ELF section, because BPF doesn't allow uninitialized data to be
accessed.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191121070743.1309473-5-andriin@fb.com
2019-11-24 16:58:45 -08:00
Andrii Nakryiko
8983b731ce libbpf: Fix various errors and warning reported by checkpatch.pl
Fix a bunch of warnings and errors reported by checkpatch.pl, to make it
easier to spot new problems.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191121070743.1309473-4-andriin@fb.com
2019-11-24 16:58:45 -08:00
Andrii Nakryiko
1f8e2bcb2c libbpf: Refactor relocation handling
Relocation handling code is convoluted and unnecessarily deeply nested. Split
out per-relocation logic into separate function. Also refactor the logic to be
more a sequence of per-relocation type checks and processing steps, making it
simpler to follow control flow. This makes it easier to further extends it to
new kinds of relocations (e.g., support for extern variables).

This patch also makes relocation's section verification more robust.
Previously relocations against not yet supported externs were silently ignored
because of obj->efile.text_shndx was zero, when all BPF programs had custom
section names and there was no .text section. Also, invalid LDIMM64 relocations
against non-map sections were passed through, if they were pointing to a .text
section (or 0, which is invalid section). All these bugs are fixed within this
refactoring and checks are made more appropriate for each type of relocation.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191121070743.1309473-3-andriin@fb.com
2019-11-24 16:58:45 -08:00
Andrii Nakryiko
a0d7da26ce libbpf: Fix call relocation offset calculation bug
When relocating subprogram call, libbpf doesn't take into account
relo->text_off, which comes from symbol's value. This generally works fine for
subprograms implemented as static functions, but breaks for global functions.

Taking a simplified test_pkt_access.c as an example:

__attribute__ ((noinline))
static int test_pkt_access_subprog1(volatile struct __sk_buff *skb)
{
        return skb->len * 2;
}

__attribute__ ((noinline))
static int test_pkt_access_subprog2(int val, volatile struct __sk_buff *skb)
{
        return skb->len + val;
}

SEC("classifier/test_pkt_access")
int test_pkt_access(struct __sk_buff *skb)
{
        if (test_pkt_access_subprog1(skb) != skb->len * 2)
                return TC_ACT_SHOT;
        if (test_pkt_access_subprog2(2, skb) != skb->len + 2)
                return TC_ACT_SHOT;
        return TC_ACT_UNSPEC;
}

When compiled, we get two relocations, pointing to '.text' symbol. .text has
st_value set to 0 (it points to the beginning of .text section):

0000000000000008  000000050000000a R_BPF_64_32            0000000000000000 .text
0000000000000040  000000050000000a R_BPF_64_32            0000000000000000 .text

test_pkt_access_subprog1 and test_pkt_access_subprog2 offsets (targets of two
calls) are encoded within call instruction's imm32 part as -1 and 2,
respectively:

0000000000000000 test_pkt_access_subprog1:
       0:       61 10 00 00 00 00 00 00 r0 = *(u32 *)(r1 + 0)
       1:       64 00 00 00 01 00 00 00 w0 <<= 1
       2:       95 00 00 00 00 00 00 00 exit

0000000000000018 test_pkt_access_subprog2:
       3:       61 10 00 00 00 00 00 00 r0 = *(u32 *)(r1 + 0)
       4:       04 00 00 00 02 00 00 00 w0 += 2
       5:       95 00 00 00 00 00 00 00 exit

0000000000000000 test_pkt_access:
       0:       bf 16 00 00 00 00 00 00 r6 = r1
===>   1:       85 10 00 00 ff ff ff ff call -1
       2:       bc 01 00 00 00 00 00 00 w1 = w0
       3:       b4 00 00 00 02 00 00 00 w0 = 2
       4:       61 62 00 00 00 00 00 00 r2 = *(u32 *)(r6 + 0)
       5:       64 02 00 00 01 00 00 00 w2 <<= 1
       6:       5e 21 08 00 00 00 00 00 if w1 != w2 goto +8 <LBB0_3>
       7:       bf 61 00 00 00 00 00 00 r1 = r6
===>   8:       85 10 00 00 02 00 00 00 call 2
       9:       bc 01 00 00 00 00 00 00 w1 = w0
      10:       61 62 00 00 00 00 00 00 r2 = *(u32 *)(r6 + 0)
      11:       04 02 00 00 02 00 00 00 w2 += 2
      12:       b4 00 00 00 ff ff ff ff w0 = -1
      13:       1e 21 01 00 00 00 00 00 if w1 == w2 goto +1 <LBB0_3>
      14:       b4 00 00 00 02 00 00 00 w0 = 2
0000000000000078 LBB0_3:
      15:       95 00 00 00 00 00 00 00 exit

Now, if we compile example with global functions, the setup changes.
Relocations are now against specifically test_pkt_access_subprog1 and
test_pkt_access_subprog2 symbols, with test_pkt_access_subprog2 pointing 24
bytes into its respective section (.text), i.e., 3 instructions in:

0000000000000008  000000070000000a R_BPF_64_32            0000000000000000 test_pkt_access_subprog1
0000000000000048  000000080000000a R_BPF_64_32            0000000000000018 test_pkt_access_subprog2

Calls instructions now encode offsets relative to function symbols and are both
set ot -1:

0000000000000000 test_pkt_access_subprog1:
       0:       61 10 00 00 00 00 00 00 r0 = *(u32 *)(r1 + 0)
       1:       64 00 00 00 01 00 00 00 w0 <<= 1
       2:       95 00 00 00 00 00 00 00 exit

0000000000000018 test_pkt_access_subprog2:
       3:       61 20 00 00 00 00 00 00 r0 = *(u32 *)(r2 + 0)
       4:       0c 10 00 00 00 00 00 00 w0 += w1
       5:       95 00 00 00 00 00 00 00 exit

0000000000000000 test_pkt_access:
       0:       bf 16 00 00 00 00 00 00 r6 = r1
===>   1:       85 10 00 00 ff ff ff ff call -1
       2:       bc 01 00 00 00 00 00 00 w1 = w0
       3:       b4 00 00 00 02 00 00 00 w0 = 2
       4:       61 62 00 00 00 00 00 00 r2 = *(u32 *)(r6 + 0)
       5:       64 02 00 00 01 00 00 00 w2 <<= 1
       6:       5e 21 09 00 00 00 00 00 if w1 != w2 goto +9 <LBB2_3>
       7:       b4 01 00 00 02 00 00 00 w1 = 2
       8:       bf 62 00 00 00 00 00 00 r2 = r6
===>   9:       85 10 00 00 ff ff ff ff call -1
      10:       bc 01 00 00 00 00 00 00 w1 = w0
      11:       61 62 00 00 00 00 00 00 r2 = *(u32 *)(r6 + 0)
      12:       04 02 00 00 02 00 00 00 w2 += 2
      13:       b4 00 00 00 ff ff ff ff w0 = -1
      14:       1e 21 01 00 00 00 00 00 if w1 == w2 goto +1 <LBB2_3>
      15:       b4 00 00 00 02 00 00 00 w0 = 2
0000000000000080 LBB2_3:
      16:       95 00 00 00 00 00 00 00 exit

Thus the right formula to calculate target call offset after relocation should
take into account relocation's target symbol value (offset within section),
call instruction's imm32 offset, and (subtracting, to get relative instruction
offset) instruction index of call instruction itself. All that is shifted by
number of instructions in main program, given all sub-programs are copied over
after main program.

Convert few selftests relying on bpf-to-bpf calls to use global functions
instead of static ones.

Fixes: 48cca7e44f ("libbpf: add support for bpf_call")
Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191119224447.3781271-1-andriin@fb.com
2019-11-19 15:00:12 -08:00
Andrii Nakryiko
7fe74b4362 libbpf: Make global data internal arrays mmap()-able, if possible
Add detection of BPF_F_MMAPABLE flag support for arrays and add it as an extra
flag to internal global data maps, if supported by kernel. This allows users
to memory-map global data and use it without BPF map operations, greatly
simplifying user experience.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20191117172806.2195367-5-andriin@fb.com
2019-11-18 11:41:59 +01:00
Alexei Starovoitov
e7bf94dbb8 libbpf: Add support for attaching BPF programs to other BPF programs
Extend libbpf api to pass attach_prog_fd into bpf_object__open.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20191114185720.1641606-19-ast@kernel.org
2019-11-15 23:45:37 +01:00
Alexei Starovoitov
b8c54ea455 libbpf: Add support to attach to fentry/fexit tracing progs
Teach libbpf to recognize tracing programs types and attach them to
fentry/fexit.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20191114185720.1641606-7-ast@kernel.org
2019-11-15 23:42:31 +01:00
Toke Høiland-Jørgensen
1a734efe06 libbpf: Add getter for program size
This adds a new getter for the BPF program size (in bytes). This is useful
for a caller that is trying to predict how much memory will be locked by
loading a BPF object into the kernel.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/157333185272.88376.10996937115395724683.stgit@toke.dk
2019-11-10 19:26:30 -08:00
Toke Høiland-Jørgensen
4f33ddb4e3 libbpf: Propagate EPERM to caller on program load
When loading an eBPF program, libbpf overrides the return code for EPERM
errors instead of returning it to the caller. This makes it hard to figure
out what went wrong on load.

In particular, EPERM is returned when the system rlimit is too low to lock
the memory required for the BPF program. Previously, this was somewhat
obscured because the rlimit error would be hit on map creation (which does
return it correctly). However, since maps can now be reused, object load
can proceed all the way to loading programs without hitting the error;
propagating it even in this case makes it possible for the caller to react
appropriately (and, e.g., attempt to raise the rlimit before retrying).

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/157333184946.88376.11768171652794234561.stgit@toke.dk
2019-11-10 19:26:30 -08:00
Toke Høiland-Jørgensen
ec6d5f47bf libbpf: Unpin auto-pinned maps if loading fails
Since the automatic map-pinning happens during load, it will leave pinned
maps around if the load fails at a later stage. Fix this by unpinning any
pinned maps on cleanup. To avoid unpinning pinned maps that were reused
rather than newly pinned, add a new boolean property on struct bpf_map to
keep track of whether that map was reused or not; and only unpin those maps
that were not reused.

Fixes: 57a00f4164 ("libbpf: Add auto-pinning of maps when loading BPF objects")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/157333184731.88376.9992935027056165873.stgit@toke.dk
2019-11-10 19:26:30 -08:00
Andrii Nakryiko
98e527af30 libbpf: Improve handling of corrupted ELF during map initialization
If we get ELF file with "maps" section, but no symbols pointing to it, we'll
end up with division by zero. Add check against this situation and exit early
with error. Found by Coverity scan against Github libbpf sources.

Fixes: bf82927125 ("libbpf: refactor map initialization")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191107020855.3834758-6-andriin@fb.com
2019-11-07 16:20:38 +01:00
Andrii Nakryiko
3dc5e05982 libbpf: Fix memory leak/double free issue
Coverity scan against Github libbpf code found the issue of not freeing memory and
leaving already freed memory still referenced from bpf_program. Fix it by
re-assigning successfully reallocated memory sooner.

Fixes: 2993e0515b ("tools/bpf: add support to read .BTF.ext sections")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191107020855.3834758-2-andriin@fb.com
2019-11-07 16:20:37 +01:00
Andrii Nakryiko
94f060e984 libbpf: Add support for field size relocations
Add bpf_core_field_size() macro, capturing a relocation against field size.
Adjust bits of internal libbpf relocation logic to allow capturing size
relocations of various field types: arrays, structs/unions, enums, etc.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191101222810.1246166-4-andriin@fb.com
2019-11-04 16:06:56 +01:00
Andrii Nakryiko
ee26dade0e libbpf: Add support for relocatable bitfields
Add support for the new field relocation kinds, necessary to support
relocatable bitfield reads. Provide macro for abstracting necessary code doing
full relocatable bitfield extraction into u64 value. Two separate macros are
provided:
- BPF_CORE_READ_BITFIELD macro for direct memory read-enabled BPF programs
(e.g., typed raw tracepoints). It uses direct memory dereference to extract
bitfield backing integer value.
- BPF_CORE_READ_BITFIELD_PROBED macro for cases where bpf_probe_read() needs
to be used to extract same backing integer value.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191101222810.1246166-3-andriin@fb.com
2019-11-04 16:06:56 +01:00
Toke Høiland-Jørgensen
57a00f4164 libbpf: Add auto-pinning of maps when loading BPF objects
This adds support to libbpf for setting map pinning information as part of
the BTF map declaration, to get automatic map pinning (and reuse) on load.
The pinning type currently only supports a single PIN_BY_NAME mode, where
each map will be pinned by its name in a path that can be overridden, but
defaults to /sys/fs/bpf.

Since auto-pinning only does something if any maps actually have a
'pinning' BTF attribute set, we default the new option to enabled, on the
assumption that seamless pinning is what most callers want.

When a map has a pin_path set at load time, libbpf will compare the map
pinned at that location (if any), and if the attributes match, will re-use
that map instead of creating a new one. If no existing map is found, the
newly created map will instead be pinned at the location.

Programs wanting to customise the pinning can override the pinning paths
using bpf_map__set_pin_path() before calling bpf_object__load() (including
setting it to NULL to disable pinning of a particular map).

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/157269298092.394725.3966306029218559681.stgit@toke.dk
2019-11-02 12:35:07 -07:00
Toke Høiland-Jørgensen
196f8487f5 libbpf: Move directory creation into _pin() functions
The existing pin_*() functions all try to create the parent directory
before pinning. Move this check into the per-object _pin() functions
instead. This ensures consistent behaviour when auto-pinning is
added (which doesn't go through the top-level pin_maps() function), at the
cost of a few more calls to mkdir().

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/157269297985.394725.5882630952992598610.stgit@toke.dk
2019-11-02 12:35:07 -07:00
Toke Høiland-Jørgensen
4580b25fce libbpf: Store map pin path and status in struct bpf_map
Support storing and setting a pin path in struct bpf_map, which can be used
for automatic pinning. Also store the pin status so we can avoid attempts
to re-pin a map that has already been pinned (or reused from a previous
pinning).

The behaviour of bpf_object__{un,}pin_maps() is changed so that if it is
called with a NULL path argument (which was previously illegal), it will
(un)pin only those maps that have a pin_path set.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/157269297876.394725.14782206533681896279.stgit@toke.dk
2019-11-02 12:35:07 -07:00
Toke Høiland-Jørgensen
d1b4574a4b libbpf: Fix error handling in bpf_map__reuse_fd()
bpf_map__reuse_fd() was calling close() in the error path before returning
an error value based on errno. However, close can change errno, so that can
lead to potentially misleading error messages. Instead, explicitly store
errno in the err variable before each goto.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/157269297769.394725.12634985106772698611.stgit@toke.dk
2019-11-02 12:35:06 -07:00
Alexei Starovoitov
12a8654b2e libbpf: Add support for prog_tracing
Cleanup libbpf from expected_attach_type == attach_btf_id hack
and introduce BPF_PROG_TYPE_TRACING.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191030223212.953010-3-ast@kernel.org
2019-10-31 15:16:59 +01:00
Andrii Nakryiko
d3a3aa0c59 libbpf: Fix off-by-one error in ELF sanity check
libbpf's bpf_object__elf_collect() does simple sanity check after iterating
over all ELF sections, if checks that .strtab index is correct. Unfortunately,
due to section indices being 1-based, the check breaks for cases when .strtab
ends up being the very last section in ELF.

Fixes: 77ba9a5b48 ("tools lib bpf: Fetch map names from correct strtab")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191028233727.1286699-1-andriin@fb.com
2019-10-28 20:27:40 -07:00
KP Singh
58eeb2289a libbpf: Fix strncat bounds error in libbpf_prog_type_by_name
On compiling samples with this change, one gets an error:

 error: ‘strncat’ specified bound 118 equals destination size
  [-Werror=stringop-truncation]

    strncat(dst, name + section_names[i].len,
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     sizeof(raw_tp_btf_name) - (dst - raw_tp_btf_name));
     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

strncat requires the destination to have enough space for the
terminating null byte.

Fixes: f75a697e09 ("libbpf: Auto-detect btf_id of BTF-based raw_tracepoint")
Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191023154038.24075-1-kpsingh@chromium.org
2019-10-23 10:17:28 -07:00
Andrii Nakryiko
e00aca65e6 libbpf: Make DECLARE_LIBBPF_OPTS macro strictly a variable declaration
LIBBPF_OPTS is implemented as a mix of field declaration and memset
+ assignment. This makes it neither variable declaration nor purely
statements, which is a problem, because you can't mix it with either
other variable declarations nor other function statements, because C90
compiler mode emits warning on mixing all that together.

This patch changes LIBBPF_OPTS into a strictly declaration of variable
and solves this problem, as can be seen in case of bpftool, which
previously would emit compiler warning, if done this way (LIBBPF_OPTS as
part of function variables declaration block).

This patch also renames LIBBPF_OPTS into DECLARE_LIBBPF_OPTS to follow
kernel convention for similar macros more closely.

v1->v2:
- rename LIBBPF_OPTS into DECLARE_LIBBPF_OPTS (Jakub Sitnicki).

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20191022172100.3281465-1-andriin@fb.com
2019-10-22 21:35:03 +02:00
Andrii Nakryiko
dd4436bb83 libbpf: Teach bpf_object__open to guess program types
Teach bpf_object__open how to guess program type and expected attach
type from section names, similar to what bpf_prog_load() does. This
seems like a really useful features and an oversight to not have this
done during bpf_object_open(). To preserver backwards compatible
behavior of bpf_prog_load(), its attr->prog_type is treated as an
override of bpf_object__open() decisions, if attr->prog_type is not
UNSPECIFIED.

There is a slight difference in behavior for bpf_prog_load().
Previously, if bpf_prog_load() was loading BPF object with more than one
program, first program's guessed program type and expected attach type
would determine corresponding attributes of all the subsequent program
types, even if their sections names suggest otherwise. That seems like
a rather dubious behavior and with this change it will behave more
sanely: each program's type is determined individually, unless they are
forced to uniformity through attr->prog_type.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191021033902.3856966-5-andriin@fb.com
2019-10-21 14:49:12 +02:00
Andrii Nakryiko
32dff6db29 libbpf: Add uprobe/uretprobe and tp/raw_tp section suffixes
Map uprobe/uretprobe into KPROBE program type. tp/raw_tp are just an
alias for more verbose tracepoint/raw_tracepoint, respectively.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191021033902.3856966-4-andriin@fb.com
2019-10-21 14:49:12 +02:00
Andrii Nakryiko
f1eead9e3c libbpf: Add bpf_program__get_{type, expected_attach_type) APIs
There are bpf_program__set_type() and
bpf_program__set_expected_attach_type(), but no corresponding getters,
which seems rather incomplete. Fix this.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191021033902.3856966-3-andriin@fb.com
2019-10-21 14:49:12 +02:00
Kefeng Wang
be18010ea2 tools, bpf: Rename pr_warning to pr_warn to align with kernel logging
For kernel logging macros, pr_warning() is completely removed and
replaced by pr_warn(). By using pr_warn() in tools/lib/bpf/ for
symmetry to kernel logging macros, we could eventually drop the
use of pr_warning() in the whole kernel tree.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20191021055532.185245-1-wangkefeng.wang@huawei.com
2019-10-21 14:38:41 +02:00
John Fastabend
54b8625cd9 bpf, libbpf: Add kernel version section parsing back
With commit "libbpf: stop enforcing kern_version,..." we removed the
kernel version section parsing in favor of querying for the kernel
using uname() and populating the version using the result of the
query. After this any version sections were simply ignored.

Unfortunately, the world of kernels is not so friendly. I've found some
customized kernels where uname() does not match the in kernel version.
To fix this so programs can load in this environment this patch adds
back parsing the section and if it exists uses the user specified
kernel version to override the uname() result. However, keep most the
kernel uname() discovery bits so users are not required to insert the
version except in these odd cases.

Fixes: 5e61f27070 ("libbpf: stop enforcing kern_version, populate it for users")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/157140968634.9073.6407090804163937103.stgit@john-XPS-13-9370
2019-10-18 20:59:10 +02:00
Alexei Starovoitov
f75a697e09 libbpf: Auto-detect btf_id of BTF-based raw_tracepoints
It's a responsiblity of bpf program author to annotate the program
with SEC("tp_btf/name") where "name" is a valid raw tracepoint.
The libbpf will try to find "name" in vmlinux BTF and error out
in case vmlinux BTF is not available or "name" is not found.
If "name" is indeed a valid raw tracepoint then in-kernel BTF
will have "btf_trace_##name" typedef that points to function
prototype of that raw tracepoint. BTF description captures
exact argument the kernel C code is passing into raw tracepoint.
The kernel verifier will check the types while loading bpf program.

libbpf keeps BTF type id in expected_attach_type, but since
kernel ignores this attribute for tracing programs copy it
into attach_btf_id attribute before loading.

Later the kernel will use prog->attach_btf_id to select raw tracepoint
during bpf_raw_tracepoint_open syscall command.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191016032505.2089704-6-ast@kernel.org
2019-10-17 16:44:35 +02:00
Andrii Nakryiko
62561eb442 libbpf: Add support for field existance CO-RE relocation
Add support for BPF_FRK_EXISTS relocation kind to detect existence of
captured field in a destination BTF, allowing conditional logic to
handle incompatible differences between kernels.

Also introduce opt-in relaxed CO-RE relocation handling option, which
makes libbpf emit warning for failed relocations, but proceed with other
relocations. Instruction, for which relocation failed, is patched with
(u32)-1 value.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191015182849.3922287-4-andriin@fb.com
2019-10-15 16:06:05 -07:00
Andrii Nakryiko
291ee02b5e libbpf: Refactor bpf_object__open APIs to use common opts
Refactor all the various bpf_object__open variations to ultimately
specify common bpf_object_open_opts struct. This makes it easy to keep
extending this common struct w/ extra parameters without having to
update all the legacy APIs.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191015182849.3922287-3-andriin@fb.com
2019-10-15 16:06:05 -07:00