Merge branch 'bpf: add helpers to support BTF-based kernel'

Alan Maguire says:

====================
This series attempts to provide a simple way for BPF programs (and in
future other consumers) to utilize BPF Type Format (BTF) information
to display kernel data structures in-kernel.  The use case this
functionality is applied to here is to support a snprintf()-like
helper to copy a BTF representation of kernel data to a string,
and a BPF seq file helper to display BTF data for an iterator.

There is already support in kernel/bpf/btf.c for "show" functionality;
the changes here generalize that support from seq-file specific
verifier display to the more generic case and add another specific
use case; rather than seq_printf()ing the show data, it is copied
to a supplied string using a snprintf()-like function.  Other future
consumers of the show functionality could include a bpf_printk_btf()
function which printk()ed the data instead.  Oops messaging in
particular would be an interesting application for such functionality.
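
A minimal userspace sketch of the snprintf()-like contract referred to
above, assuming standard C snprintf() semantics.  show_int() and
was_truncated() are hypothetical names used only for illustration; they
are not kernel functions and do not reflect how btf.c is implemented.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: format "(int)<value>" into buf, snprintf()-style.
 * Returns the length the full string needs (excluding the NUL), which
 * can exceed len - 1 when the output had to be truncated.
 */
static int show_int(char *buf, size_t len, int value)
{
	return snprintf(buf, len, "(int)%d", value);
}

/* Returns 1 if a snprintf()-style return value indicates truncation. */
static int was_truncated(int ret, size_t len)
{
	return ret >= 0 && (size_t)ret >= len;
}
```

A caller compares the return value against the buffer size to decide
whether a larger buffer (for example map-backed storage) is needed.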

This potential use case suggests a reply to the reasonable
objection that such typed display should be handled by tracing
programs, where the in-kernel tracing records data and the
userspace program prints it out.  While that is certainly the
recommended approach for most cases, I believe an in-kernel
mechanism would also be valuable.  Critically, in BPF programs
it reduces debugging and tracing of such data to invoking a
simple helper.

One challenge raised in an earlier iteration of this work -
where the BTF printing was implemented as a printk() format
specifier - was that the amount of data printed per
printk() was large, and other format specifiers were far
simpler.  Here we sidestep that concern by printing
components of the BTF representation as we go for the
seq file case, and in the string case the snprintf()-like
operation is intended to be a basis for perf event or
ringbuf output.  The reasons for avoiding bpf_trace_printk
are that

1. bpf_trace_printk() strings are restricted in size and
cannot display anything beyond trivial data structures; and
2. bpf_trace_printk() is for debugging purposes only.

As Alexei suggested, a bpf_trace_puts() helper could solve
this in the future but it still would be limited by the
1000 byte limit for traced strings.

Default output for an sk_buff looks like this (zeroed fields
are omitted):

(struct sk_buff){
 .transport_header = (__u16)65535,
 .mac_header = (__u16)65535,
 .end = (sk_buff_data_t)192,
 .head = (unsigned char *)0x000000007524fd8b,
 .data = (unsigned char *)0x000000007524fd8b,
 .truesize = (unsigned int)768,
 .users = (refcount_t){
  .refs = (atomic_t){
   .counter = (int)1,
  },
 },
}
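
The layout above can be imitated with a toy userspace renderer.  This is
only a sketch: struct toy_field, toy_emit() and toy_show() are
hypothetical, the field table is hand-written rather than derived from
BTF, and nested types are not handled.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical field descriptor; real BTF type data is far richer. */
struct toy_field {
	const char *name;	/* member name */
	const char *type;	/* type name shown in the cast */
	unsigned long value;
};

/* Append formatted text at offset off; keeps counting once buf is full. */
static size_t toy_emit(char *buf, size_t len, size_t off, const char *fmt, ...)
{
	va_list ap;
	int n;

	va_start(ap, fmt);
	if (off < len)
		n = vsnprintf(buf + off, len - off, fmt, ap);
	else
		n = vsnprintf(NULL, 0, fmt, ap);
	va_end(ap);
	return off + (n > 0 ? (size_t)n : 0);
}

/*
 * Render "(struct x){ .member = (type)value, }" one member per line,
 * skipping zero-valued members unless show_zero is set.
 * Returns the snprintf()-style length of the full output.
 */
static int toy_show(char *buf, size_t len, const char *type_name,
		    const struct toy_field *f, size_t nr, int show_zero)
{
	size_t off = 0, i;

	off = toy_emit(buf, len, off, "(struct %s){\n", type_name);
	for (i = 0; i < nr; i++) {
		if (f[i].value == 0 && !show_zero)
			continue;
		off = toy_emit(buf, len, off, " .%s = (%s)%lu,\n",
			       f[i].name, f[i].type, f[i].value);
	}
	off = toy_emit(buf, len, off, "}\n");
	return (int)off;
}
```

Passing a non-zero show_zero plays the role that the zero-display flag
plays for the real helpers.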

Flags can modify aspects of output format; see patch 3
for more details.
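
As a rough userspace illustration of how such flags combine and are
validated, the sketch below reuses the BTF_F_* values and the BTF_F_ALL
mask from this series; toy_check_flags() is a hypothetical name and
only mirrors the "reject unknown bits" check, nothing else.

```c
#include <errno.h>
#include <stdint.h>

/* Flag values as defined in the UAPI header added by this series. */
enum {
	BTF_F_COMPACT	= (1ULL << 0),
	BTF_F_NONAME	= (1ULL << 1),
	BTF_F_PTR_RAW	= (1ULL << 2),
	BTF_F_ZERO	= (1ULL << 3),
};

#define BTF_F_ALL	(BTF_F_COMPACT | BTF_F_NONAME | \
			 BTF_F_PTR_RAW | BTF_F_ZERO)

/* Hypothetical check: accept any combination of known flags only. */
static int toy_check_flags(uint64_t flags)
{
	if (flags & ~(uint64_t)BTF_F_ALL)
		return -EINVAL;
	return 0;
}
```

Bit 4, which the kernel-internal header uses for BTF_SHOW_UNSAFE, falls
outside the mask and is rejected by this check.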

Changes since v6:

- Updated safe data size to 32, object name size to 80.
  This increases the number of safe copies done, but performance is
  not a key goal here. WRT name size the largest type name length
  in bpf-next according to "pahole -s" is 64 bytes, so that still gives
  room for additional type qualifiers, parens etc within the name limit
  (Alexei, patch 2)
- Removed inlines and converted as many #defines to functions as was
  possible.  In a few cases - btf_show_type_value[s]() specifically -
  I left these as macros as btf_show_type_value[s]() prepends and
  appends format strings to the format specifier (in order to include
  indentation, delimiters etc.), so a macro makes that simpler (Alexei,
  patch 2)
- Handle btf_resolve_size() error in btf_show_obj_safe() (Alexei, patch 2)
- Removed clang loop unroll in BTF snprintf test (Alexei)
- Switched to using bpf_core_type_id_kernel(type) as suggested by Andrii,
  and Alexei noted that __builtin_btf_type_id(,1) should be used (patch 4)
- Added skip logic if __builtin_btf_type_id is not available (patches 4,8)
- Bumped limits on bpf iters to support printing larger structures (Alexei,
  patch 5)
- Updated overflow bpf_iter tests to reflect new iter max size (patch 6)
- Updated seq helper to use type id only (Alexei, patch 7)
- Updated BTF task iter test to use task struct instead of struct fs_struct
  since new limits allow a task_struct to be displayed (patch 8)
- Fixed E2BIG handling in iter task (Alexei, patch 8)
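
To give a feel for the trade-off between safe data size and number of
safe copies, here is a simplified userspace sketch.  It assumes an
object is walked in fixed-size chunks, which is a simplification and
not a description of how btf.c decides when to copy; toy_safe_read()
is a hypothetical stand-in for a fault-tolerant kernel read.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a fault-tolerant read; here a plain memcpy. */
static int toy_safe_read(void *dst, const void *src, size_t size)
{
	memcpy(dst, src, size);
	return 0;
}

/*
 * Walk obj in chunks of at most chunk bytes, copying each chunk into a
 * small bounce buffer before it would be displayed.
 * Returns the number of copies made, or -1 on a bad chunk size or a
 * failed read.
 */
static long toy_count_safe_copies(const void *obj, size_t size, size_t chunk)
{
	unsigned char safe[256];
	long copies = 0;
	size_t off;

	if (chunk == 0 || chunk > sizeof(safe))
		return -1;
	for (off = 0; off < size; off += chunk) {
		size_t n = size - off < chunk ? size - off : chunk;

		if (toy_safe_read(safe, (const unsigned char *)obj + off, n))
			return -1;
		copies++;
	}
	return copies;
}
```

Under this simplified model a 224-byte object needs one copy with a
256-byte buffer and seven copies with a 32-byte buffer.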

Changes since v5:

- Moved btf print prepare into patch 3, type show seq
  with flags into patch 2 (Alexei, patches 2,3)
- Fixed build bot warnings around static declarations
  and printf attributes
- Renamed functions to snprintf_btf/seq_printf_btf
  (Alexei, patches 3-6)

Changes since v4:

- Changed approach from a BPF trace event-centric design to one
  utilizing a snprintf()-like helper and an iter helper (Alexei,
  patches 3,5)
- Added tests to verify BTF output (patch 4)
- Added support to tests for verifying BTF type_id-based display
  as well as type name via __builtin_btf_type_id (Andrii, patch 4).
- Augmented task iter tests to cover the BTF-based seq helper.
  Because a task_struct's BTF-based representation would overflow
  the PAGE_SIZE limit on iterator data, the "struct fs_struct"
  (task->fs) is displayed for each task instead (Alexei, patch 6).

Changes since v3:

- Moved to RFC since the approach is different (and bpf-next is
  closed)
- Rather than using a printk() format specifier as the means
  of invoking BTF-enabled display, a dedicated BPF helper is
  used.  This solves the issue of printk() having to output
  large amounts of data using a complex mechanism such as
  BTF traversal, but still provides a way for the display of
  such data to be achieved via BPF programs.  Future work could
  include a bpf_printk_btf() function to invoke display via
  printk() where the elements of a data structure are printk()ed
  one at a time.  Thanks to Petr Mladek, Andy Shevchenko and
  Rasmus Villemoes who took time to look at the earlier printk()
  format-specifier-focused version of this and provided feedback
  clarifying the problems with that approach.
- Added trace id to the bpf_trace_printk events as a means of
  separating output from standard bpf_trace_printk() events,
  ensuring it can be easily parsed by the reader.
- Added bpf_trace_btf() helper tests which do simple verification
  of the various display options.

Changes since v2:

- Alexei and Yonghong suggested it would be good to use
  probe_kernel_read() on to-be-shown data to ensure safety
  during operation.  Safe copy via probe_kernel_read() to a
  buffer object in "struct btf_show" is used to support
  this.  A few different approaches were explored
  including dynamic allocation and per-cpu buffers. The
  downside of dynamic allocation is that it would be done
  during BPF program execution for bpf_trace_printk()s using
  %pT format specifiers. The problem with per-cpu buffers
  is we'd have to manage preemption and since the display
  of an object occurs over an extended period and in printk
  context where we'd rather not change preemption status,
  it seemed tricky to manage buffer safety while considering
  preemption.  The approach of utilizing stack buffer space
  via the "struct btf_show" seemed like the simplest approach.
  The stack size of the associated functions which have a
  "struct btf_show" on their stack to support show operation
  (btf_type_snprintf_show() and btf_type_seq_show()) stays
  under 500 bytes. The compromise here is the safe buffer we
  use is small - 256 bytes - and as a result multiple
  probe_kernel_read()s are needed for larger objects. Most
  objects of interest are smaller than this (e.g.
  "struct sk_buff" is 224 bytes), and while task_struct is a
  notable exception at ~8K, performance is not the priority for
  BTF-based display. (Alexei and Yonghong, patch 2).
- safe buffer use is the default behaviour (and is mandatory
  for BPF) but unsafe display - meaning no safe copy is done
  and we operate on the object itself - is supported via a
  'u' option.
- pointers are prefixed with 0x for clarity (Alexei, patch 2)
- added additional comments and explanations around BTF show
  code, especially around determining whether objects are
  zeroed. Also tried to comment the safe object scheme used. (Yonghong,
  patch 2)
- added late_initcall() to initialize vmlinux BTF so that it would
  not have to be initialized during printk operation (Alexei,
  patch 5)
- removed CONFIG_BTF_PRINTF config option as it is not needed;
  CONFIG_DEBUG_INFO_BTF can be used to gate test behaviour and
  determining behaviour of type-based printk can be done via
  retrieval of BTF data; if it's not there BTF was unavailable
  or broken (Alexei, patches 4,6)
- fix bpf_trace_printk test to use vmlinux.h and globals via
  skeleton infrastructure, removing need for perf events
  (Andrii, patch 8)

Changes since v1:

- changed format to be more drgn-like, rendering indented type info
  along with type names by default (Alexei)
- zeroed values are omitted (Arnaldo) by default unless the '0'
  modifier is specified (Alexei)
- added an option to print pointer values without obfuscation.
  The reason to do this is the sysctls controlling pointer display
  are likely to be irrelevant in many if not most tracing contexts.
  Some questions on this in the outstanding questions section below...
- reworked printk format specifier so that we no longer rely on format
  %pT<type> but instead use a struct * which contains type information
  (Rasmus). This simplifies the printk parsing, makes use more dynamic
  and also allows specification by BTF id as well as name.
- removed incorrect patch which tried to fix dereferencing of resolved
  BTF info for vmlinux; instead we skip modifiers for the relevant
  case (array element type determination) (Alexei).
- fixed issues with negative snprintf format length (Rasmus)
- added test cases for various data structure formats; base types,
  typedefs, structs, etc.
- tests now iterate through all typedefs, enums, structs and unions
  defined for vmlinux BTF and render a version of the target dummy
  value which is either all zeros or all 0xff values; the idea is this
  exercises the "skip if zero" and "print everything" cases.
- added support in BPF for using the %pT format specifier in
  bpf_trace_printk()
- added BPF tests which ensure %pT format specifier use works (Alexei).
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Alexei Starovoitov 2020-09-28 18:26:59 -07:00
commit 98b972d20a
15 changed files with 1663 additions and 121 deletions


@ -1364,6 +1364,8 @@ int bpf_check(struct bpf_prog **fp, union bpf_attr *attr,
union bpf_attr __user *uattr);
void bpf_patch_call_args(struct bpf_insn *insn, u32 stack_depth);
struct btf *bpf_get_btf_vmlinux(void);
/* Map specifics */
struct xdp_buff;
struct sk_buff;
@ -1820,6 +1822,7 @@ extern const struct bpf_func_proto bpf_skc_to_tcp_timewait_sock_proto;
extern const struct bpf_func_proto bpf_skc_to_tcp_request_sock_proto;
extern const struct bpf_func_proto bpf_skc_to_udp6_sock_proto;
extern const struct bpf_func_proto bpf_copy_from_user_proto;
extern const struct bpf_func_proto bpf_snprintf_btf_proto;
const struct bpf_func_proto *bpf_tracing_func_proto(
enum bpf_func_id func_id, const struct bpf_prog *prog);


@ -6,6 +6,7 @@
#include <linux/types.h>
#include <uapi/linux/btf.h>
#include <uapi/linux/bpf.h>
#define BTF_TYPE_EMIT(type) ((void)(type *)0)
@ -13,6 +14,7 @@ struct btf;
struct btf_member;
struct btf_type;
union bpf_attr;
struct btf_show;
extern const struct file_operations btf_fops;
@ -46,8 +48,45 @@ int btf_get_info_by_fd(const struct btf *btf,
const struct btf_type *btf_type_id_size(const struct btf *btf,
u32 *type_id,
u32 *ret_size);
/*
* Options to control show behaviour.
* - BTF_SHOW_COMPACT: no formatting around type information
* - BTF_SHOW_NONAME: no struct/union member names/types
* - BTF_SHOW_PTR_RAW: show raw (unobfuscated) pointer values;
* equivalent to %px.
* - BTF_SHOW_ZERO: show zero-valued struct/union members; they
* are not displayed by default
* - BTF_SHOW_UNSAFE: skip use of bpf_probe_read() to safely read
* data before displaying it.
*/
#define BTF_SHOW_COMPACT BTF_F_COMPACT
#define BTF_SHOW_NONAME BTF_F_NONAME
#define BTF_SHOW_PTR_RAW BTF_F_PTR_RAW
#define BTF_SHOW_ZERO BTF_F_ZERO
#define BTF_SHOW_UNSAFE (1ULL << 4)
void btf_type_seq_show(const struct btf *btf, u32 type_id, void *obj,
struct seq_file *m);
int btf_type_seq_show_flags(const struct btf *btf, u32 type_id, void *obj,
struct seq_file *m, u64 flags);
/*
* Copy len bytes of string representation of obj of BTF type_id into buf.
*
* @btf: struct btf object
* @type_id: type id of type obj points to
* @obj: pointer to typed data
* @buf: buffer to write to
* @len: maximum length to write to buf
* @flags: show options (see above)
*
* Return: length that would have been/was copied as per snprintf, or
* negative error.
*/
int btf_type_snprintf_show(const struct btf *btf, u32 type_id, void *obj,
char *buf, int len, u64 flags);
int btf_get_fd_by_id(u32 id);
u32 btf_id(const struct btf *btf);
bool btf_member_is_reg_int(const struct btf *btf, const struct btf_type *s,


@ -3594,6 +3594,50 @@ union bpf_attr {
* the data in *dst*. This is a wrapper of **copy_from_user**\ ().
* Return
* 0 on success, or a negative error in case of failure.
*
* long bpf_snprintf_btf(char *str, u32 str_size, struct btf_ptr *ptr, u32 btf_ptr_size, u64 flags)
* Description
* Use BTF to store a string representation of *ptr*->ptr in *str*,
* using *ptr*->type_id. This value should specify the type
* that *ptr*->ptr points to. LLVM __builtin_btf_type_id(type, 1)
* can be used to look up vmlinux BTF type ids. Traversing the
* data structure using BTF, the type information and values are
* stored in the first *str_size* - 1 bytes of *str*. Safe copy of
* the pointer data is carried out to avoid kernel crashes during
* operation. Smaller types can use string space on the stack;
* larger programs can use map data to store the string
* representation.
*
* The string can be subsequently shared with userspace via
* bpf_perf_event_output() or ring buffer interfaces.
* bpf_trace_printk() is to be avoided as it places too small
* a limit on string size to be useful.
*
* *flags* is a combination of
*
* **BTF_F_COMPACT**
* no formatting around type information
* **BTF_F_NONAME**
* no struct/union member names/types
* **BTF_F_PTR_RAW**
* show raw (unobfuscated) pointer values;
* equivalent to printk specifier %px.
* **BTF_F_ZERO**
* show zero-valued struct/union members; they
* are not displayed by default
*
* Return
* The number of bytes that were written (or would have been
* written if output had to be truncated due to string size),
* or a negative error in cases of failure.
*
* long bpf_seq_printf_btf(struct seq_file *m, struct btf_ptr *ptr, u32 ptr_size, u64 flags)
* Description
* Use BTF to write to seq_write a string representation of
* *ptr*->ptr, using *ptr*->type_id as per bpf_snprintf_btf().
* *flags* are identical to those used for bpf_snprintf_btf.
* Return
* 0 on success or a negative error in case of failure.
*/
#define __BPF_FUNC_MAPPER(FN) \
FN(unspec), \
@ -3745,6 +3789,8 @@ union bpf_attr {
FN(inode_storage_delete), \
FN(d_path), \
FN(copy_from_user), \
FN(snprintf_btf), \
FN(seq_printf_btf), \
/* */
/* integer value in 'imm' field of BPF_CALL instruction selects which helper
@ -4853,4 +4899,34 @@ struct bpf_sk_lookup {
__u32 local_port; /* Host byte order */
};
/*
* struct btf_ptr is used for typed pointer representation; the
* type id is used to render the pointer data as the appropriate type
* via the bpf_snprintf_btf() helper described above. A flags field -
* potentially to specify additional details about the BTF pointer
* (rather than its mode of display) - is included for future use.
* Display flags - BTF_F_* - are passed to bpf_snprintf_btf separately.
*/
struct btf_ptr {
void *ptr;
__u32 type_id;
__u32 flags; /* BTF ptr flags; unused at present. */
};
/*
* Flags to control bpf_snprintf_btf() behaviour.
* - BTF_F_COMPACT: no formatting around type information
* - BTF_F_NONAME: no struct/union member names/types
* - BTF_F_PTR_RAW: show raw (unobfuscated) pointer values;
* equivalent to %px.
* - BTF_F_ZERO: show zero-valued struct/union members; they
* are not displayed by default
*/
enum {
BTF_F_COMPACT = (1ULL << 0),
BTF_F_NONAME = (1ULL << 1),
BTF_F_PTR_RAW = (1ULL << 2),
BTF_F_ZERO = (1ULL << 3),
};
#endif /* _UAPI__LINUX_BPF_H__ */


@ -88,8 +88,8 @@ static ssize_t bpf_seq_read(struct file *file, char __user *buf, size_t size,
mutex_lock(&seq->lock);
if (!seq->buf) {
-seq->size = PAGE_SIZE;
-seq->buf = kmalloc(seq->size, GFP_KERNEL);
+seq->size = PAGE_SIZE << 3;
+seq->buf = kvmalloc(seq->size, GFP_KERNEL);
if (!seq->buf) {
err = -ENOMEM;
goto done;

File diff suppressed because it is too large


@ -2216,6 +2216,8 @@ const struct bpf_func_proto bpf_get_current_cgroup_id_proto __weak;
const struct bpf_func_proto bpf_get_current_ancestor_cgroup_id_proto __weak;
const struct bpf_func_proto bpf_get_local_storage_proto __weak;
const struct bpf_func_proto bpf_get_ns_current_pid_tgid_proto __weak;
const struct bpf_func_proto bpf_snprintf_btf_proto __weak;
const struct bpf_func_proto bpf_seq_printf_btf_proto __weak;
const struct bpf_func_proto * __weak bpf_get_trace_printk_proto(void)
{


@ -683,6 +683,10 @@ bpf_base_func_proto(enum bpf_func_id func_id)
if (!perfmon_capable())
return NULL;
return bpf_get_trace_printk_proto();
case BPF_FUNC_snprintf_btf:
if (!perfmon_capable())
return NULL;
return &bpf_snprintf_btf_proto;
case BPF_FUNC_jiffies64:
return &bpf_jiffies64_proto;
default:


@ -11533,6 +11533,17 @@ static int check_attach_btf_id(struct bpf_verifier_env *env)
return 0;
}
struct btf *bpf_get_btf_vmlinux(void)
{
if (!btf_vmlinux && IS_ENABLED(CONFIG_DEBUG_INFO_BTF)) {
mutex_lock(&bpf_verifier_lock);
if (!btf_vmlinux)
btf_vmlinux = btf_parse_vmlinux();
mutex_unlock(&bpf_verifier_lock);
}
return btf_vmlinux;
}
int bpf_check(struct bpf_prog **prog, union bpf_attr *attr,
union bpf_attr __user *uattr)
{
@ -11566,12 +11577,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr,
env->ops = bpf_verifier_ops[env->prog->type];
is_priv = bpf_capable();
-if (!btf_vmlinux && IS_ENABLED(CONFIG_DEBUG_INFO_BTF)) {
-mutex_lock(&bpf_verifier_lock);
-if (!btf_vmlinux)
-btf_vmlinux = btf_parse_vmlinux();
-mutex_unlock(&bpf_verifier_lock);
-}
+bpf_get_btf_vmlinux();
/* grab the mutex to protect few globals used by verifier */
if (!is_priv)


@ -7,6 +7,7 @@
#include <linux/slab.h>
#include <linux/bpf.h>
#include <linux/bpf_perf_event.h>
#include <linux/btf.h>
#include <linux/filter.h>
#include <linux/uaccess.h>
#include <linux/ctype.h>
@ -16,6 +17,9 @@
#include <linux/error-injection.h>
#include <linux/btf_ids.h>
#include <uapi/linux/bpf.h>
#include <uapi/linux/btf.h>
#include <asm/tlb.h>
#include "trace_probe.h"
@ -67,6 +71,10 @@ static struct bpf_raw_event_map *bpf_get_raw_tracepoint_module(const char *name)
u64 bpf_get_stackid(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5);
u64 bpf_get_stack(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5);
static int bpf_btf_printf_prepare(struct btf_ptr *ptr, u32 btf_ptr_size,
u64 flags, const struct btf **btf,
s32 *btf_id);
/**
* trace_call_bpf - invoke BPF program
* @call: tracepoint event
@ -772,6 +780,31 @@ static const struct bpf_func_proto bpf_seq_write_proto = {
.arg3_type = ARG_CONST_SIZE_OR_ZERO,
};
BPF_CALL_4(bpf_seq_printf_btf, struct seq_file *, m, struct btf_ptr *, ptr,
u32, btf_ptr_size, u64, flags)
{
const struct btf *btf;
s32 btf_id;
int ret;
ret = bpf_btf_printf_prepare(ptr, btf_ptr_size, flags, &btf, &btf_id);
if (ret)
return ret;
return btf_type_seq_show_flags(btf, btf_id, ptr->ptr, m, flags);
}
static const struct bpf_func_proto bpf_seq_printf_btf_proto = {
.func = bpf_seq_printf_btf,
.gpl_only = true,
.ret_type = RET_INTEGER,
.arg1_type = ARG_PTR_TO_BTF_ID,
.arg1_btf_id = &btf_seq_file_ids[0],
.arg2_type = ARG_PTR_TO_MEM,
.arg3_type = ARG_CONST_SIZE_OR_ZERO,
.arg4_type = ARG_ANYTHING,
};
static __always_inline int
get_map_perf_counter(struct bpf_map *map, u64 flags,
u64 *value, u64 *enabled, u64 *running)
@ -1147,6 +1180,65 @@ static const struct bpf_func_proto bpf_d_path_proto = {
.allowed = bpf_d_path_allowed,
};
#define BTF_F_ALL (BTF_F_COMPACT | BTF_F_NONAME | \
BTF_F_PTR_RAW | BTF_F_ZERO)
static int bpf_btf_printf_prepare(struct btf_ptr *ptr, u32 btf_ptr_size,
u64 flags, const struct btf **btf,
s32 *btf_id)
{
const struct btf_type *t;
if (unlikely(flags & ~(BTF_F_ALL)))
return -EINVAL;
if (btf_ptr_size != sizeof(struct btf_ptr))
return -EINVAL;
*btf = bpf_get_btf_vmlinux();
if (IS_ERR_OR_NULL(*btf))
return PTR_ERR(*btf);
if (ptr->type_id > 0)
*btf_id = ptr->type_id;
else
return -EINVAL;
if (*btf_id > 0)
t = btf_type_by_id(*btf, *btf_id);
if (*btf_id <= 0 || !t)
return -ENOENT;
return 0;
}
BPF_CALL_5(bpf_snprintf_btf, char *, str, u32, str_size, struct btf_ptr *, ptr,
u32, btf_ptr_size, u64, flags)
{
const struct btf *btf;
s32 btf_id;
int ret;
ret = bpf_btf_printf_prepare(ptr, btf_ptr_size, flags, &btf, &btf_id);
if (ret)
return ret;
return btf_type_snprintf_show(btf, btf_id, ptr->ptr, str, str_size,
flags);
}
const struct bpf_func_proto bpf_snprintf_btf_proto = {
.func = bpf_snprintf_btf,
.gpl_only = false,
.ret_type = RET_INTEGER,
.arg1_type = ARG_PTR_TO_MEM,
.arg2_type = ARG_CONST_SIZE,
.arg3_type = ARG_PTR_TO_MEM,
.arg4_type = ARG_CONST_SIZE,
.arg5_type = ARG_ANYTHING,
};
const struct bpf_func_proto *
bpf_tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
{
@ -1233,6 +1325,8 @@ bpf_tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
return &bpf_get_task_stack_proto;
case BPF_FUNC_copy_from_user:
return prog->aux->sleepable ? &bpf_copy_from_user_proto : NULL;
case BPF_FUNC_snprintf_btf:
return &bpf_snprintf_btf_proto;
default:
return NULL;
}
@ -1630,6 +1724,10 @@ tracing_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
return prog->expected_attach_type == BPF_TRACE_ITER ?
&bpf_seq_write_proto :
NULL;
case BPF_FUNC_seq_printf_btf:
return prog->expected_attach_type == BPF_TRACE_ITER ?
&bpf_seq_printf_btf_proto :
NULL;
case BPF_FUNC_d_path:
return &bpf_d_path_proto;
default:


@ -433,6 +433,7 @@ class PrinterHelpers(Printer):
'struct sk_msg_md',
'struct xdp_md',
'struct path',
'struct btf_ptr',
]
known_types = {
'...',
@ -474,6 +475,7 @@ class PrinterHelpers(Printer):
'struct udp6_sock',
'struct task_struct',
'struct path',
'struct btf_ptr',
}
mapped_types = {
'u8': '__u8',


@ -3594,6 +3594,50 @@ union bpf_attr {
* the data in *dst*. This is a wrapper of **copy_from_user**\ ().
* Return
* 0 on success, or a negative error in case of failure.
*
* long bpf_snprintf_btf(char *str, u32 str_size, struct btf_ptr *ptr, u32 btf_ptr_size, u64 flags)
* Description
* Use BTF to store a string representation of *ptr*->ptr in *str*,
* using *ptr*->type_id. This value should specify the type
* that *ptr*->ptr points to. LLVM __builtin_btf_type_id(type, 1)
* can be used to look up vmlinux BTF type ids. Traversing the
* data structure using BTF, the type information and values are
* stored in the first *str_size* - 1 bytes of *str*. Safe copy of
* the pointer data is carried out to avoid kernel crashes during
* operation. Smaller types can use string space on the stack;
* larger programs can use map data to store the string
* representation.
*
* The string can be subsequently shared with userspace via
* bpf_perf_event_output() or ring buffer interfaces.
* bpf_trace_printk() is to be avoided as it places too small
* a limit on string size to be useful.
*
* *flags* is a combination of
*
* **BTF_F_COMPACT**
* no formatting around type information
* **BTF_F_NONAME**
* no struct/union member names/types
* **BTF_F_PTR_RAW**
* show raw (unobfuscated) pointer values;
* equivalent to printk specifier %px.
* **BTF_F_ZERO**
* show zero-valued struct/union members; they
* are not displayed by default
*
* Return
* The number of bytes that were written (or would have been
* written if output had to be truncated due to string size),
* or a negative error in cases of failure.
*
* long bpf_seq_printf_btf(struct seq_file *m, struct btf_ptr *ptr, u32 ptr_size, u64 flags)
* Description
* Use BTF to write to seq_write a string representation of
* *ptr*->ptr, using *ptr*->type_id as per bpf_snprintf_btf().
* *flags* are identical to those used for bpf_snprintf_btf.
* Return
* 0 on success or a negative error in case of failure.
*/
#define __BPF_FUNC_MAPPER(FN) \
FN(unspec), \
@ -3745,6 +3789,8 @@ union bpf_attr {
FN(inode_storage_delete), \
FN(d_path), \
FN(copy_from_user), \
FN(snprintf_btf), \
FN(seq_printf_btf), \
/* */
/* integer value in 'imm' field of BPF_CALL instruction selects which helper
@ -4853,4 +4899,34 @@ struct bpf_sk_lookup {
__u32 local_port; /* Host byte order */
};
/*
* struct btf_ptr is used for typed pointer representation; the
* type id is used to render the pointer data as the appropriate type
* via the bpf_snprintf_btf() helper described above. A flags field -
* potentially to specify additional details about the BTF pointer
* (rather than its mode of display) - is included for future use.
* Display flags - BTF_F_* - are passed to bpf_snprintf_btf separately.
*/
struct btf_ptr {
void *ptr;
__u32 type_id;
__u32 flags; /* BTF ptr flags; unused at present. */
};
/*
* Flags to control bpf_snprintf_btf() behaviour.
* - BTF_F_COMPACT: no formatting around type information
* - BTF_F_NONAME: no struct/union member names/types
* - BTF_F_PTR_RAW: show raw (unobfuscated) pointer values;
* equivalent to %px.
* - BTF_F_ZERO: show zero-valued struct/union members; they
* are not displayed by default
*/
enum {
BTF_F_COMPACT = (1ULL << 0),
BTF_F_NONAME = (1ULL << 1),
BTF_F_PTR_RAW = (1ULL << 2),
BTF_F_ZERO = (1ULL << 3),
};
#endif /* _UAPI__LINUX_BPF_H__ */


@ -7,6 +7,7 @@
#include "bpf_iter_task.skel.h"
#include "bpf_iter_task_stack.skel.h"
#include "bpf_iter_task_file.skel.h"
#include "bpf_iter_task_btf.skel.h"
#include "bpf_iter_tcp4.skel.h"
#include "bpf_iter_tcp6.skel.h"
#include "bpf_iter_udp4.skel.h"
@ -167,6 +168,77 @@ done:
bpf_iter_task_file__destroy(skel);
}
#define TASKBUFSZ 32768
static char taskbuf[TASKBUFSZ];
static void do_btf_read(struct bpf_iter_task_btf *skel)
{
struct bpf_program *prog = skel->progs.dump_task_struct;
struct bpf_iter_task_btf__bss *bss = skel->bss;
int iter_fd = -1, len = 0, bufleft = TASKBUFSZ;
struct bpf_link *link;
char *buf = taskbuf;
link = bpf_program__attach_iter(prog, NULL);
if (CHECK(IS_ERR(link), "attach_iter", "attach_iter failed\n"))
return;
iter_fd = bpf_iter_create(bpf_link__fd(link));
if (CHECK(iter_fd < 0, "create_iter", "create_iter failed\n"))
goto free_link;
do {
len = read(iter_fd, buf, bufleft);
if (len > 0) {
buf += len;
bufleft -= len;
}
} while (len > 0);
if (bss->skip) {
printf("%s:SKIP:no __builtin_btf_type_id\n", __func__);
test__skip();
goto free_link;
}
if (CHECK(len < 0, "read", "read failed: %s\n", strerror(errno)))
goto free_link;
CHECK(strstr(taskbuf, "(struct task_struct)") == NULL,
"check for btf representation of task_struct in iter data",
"struct task_struct not found");
free_link:
if (iter_fd > 0)
close(iter_fd);
bpf_link__destroy(link);
}
static void test_task_btf(void)
{
struct bpf_iter_task_btf__bss *bss;
struct bpf_iter_task_btf *skel;
skel = bpf_iter_task_btf__open_and_load();
if (CHECK(!skel, "bpf_iter_task_btf__open_and_load",
"skeleton open_and_load failed\n"))
return;
bss = skel->bss;
do_btf_read(skel);
if (CHECK(bss->tasks == 0, "check if iterated over tasks",
"no task iteration, did BPF program run?\n"))
goto cleanup;
CHECK(bss->seq_err != 0, "check for unexpected err",
"bpf_seq_printf_btf returned %ld", bss->seq_err);
cleanup:
bpf_iter_task_btf__destroy(skel);
}
static void test_tcp4(void)
{
struct bpf_iter_tcp4 *skel;
@ -352,7 +424,7 @@ static void test_overflow(bool test_e2big_overflow, bool ret1)
struct bpf_map_info map_info = {};
struct bpf_iter_test_kern4 *skel;
struct bpf_link *link;
-__u32 page_size;
+__u32 iter_size;
char *buf;
skel = bpf_iter_test_kern4__open();
@ -374,19 +446,19 @@ static void test_overflow(bool test_e2big_overflow, bool ret1)
"map_creation failed: %s\n", strerror(errno)))
goto free_map1;
-/* bpf_seq_printf kernel buffer is one page, so one map
+/* bpf_seq_printf kernel buffer is 8 pages, so one map
* bpf_seq_write will mostly fill it, and the other map
* will partially fill and then trigger overflow and need
* bpf_seq_read restart.
*/
-page_size = sysconf(_SC_PAGE_SIZE);
+iter_size = sysconf(_SC_PAGE_SIZE) << 3;
if (test_e2big_overflow) {
-skel->rodata->print_len = (page_size + 8) / 8;
-expected_read_len = 2 * (page_size + 8);
+skel->rodata->print_len = (iter_size + 8) / 8;
+expected_read_len = 2 * (iter_size + 8);
} else if (!ret1) {
-skel->rodata->print_len = (page_size - 8) / 8;
-expected_read_len = 2 * (page_size - 8);
+skel->rodata->print_len = (iter_size - 8) / 8;
+expected_read_len = 2 * (iter_size - 8);
} else {
skel->rodata->print_len = 1;
expected_read_len = 2 * 8;
@ -957,6 +1029,8 @@ void test_bpf_iter(void)
test_task_stack();
if (test__start_subtest("task_file"))
test_task_file();
if (test__start_subtest("task_btf"))
test_task_btf();
if (test__start_subtest("tcp4"))
test_tcp4();
if (test__start_subtest("tcp6"))


@ -0,0 +1,60 @@
// SPDX-License-Identifier: GPL-2.0
#include <test_progs.h>
#include <linux/btf.h>
#include "netif_receive_skb.skel.h"
/* Demonstrate that bpf_snprintf_btf succeeds and that various data types
* are formatted correctly.
*/
void test_snprintf_btf(void)
{
struct netif_receive_skb *skel;
struct netif_receive_skb__bss *bss;
int err, duration = 0;
skel = netif_receive_skb__open();
if (CHECK(!skel, "skel_open", "failed to open skeleton\n"))
return;
err = netif_receive_skb__load(skel);
if (CHECK(err, "skel_load", "failed to load skeleton: %d\n", err))
goto cleanup;
bss = skel->bss;
err = netif_receive_skb__attach(skel);
if (CHECK(err, "skel_attach", "skeleton attach failed: %d\n", err))
goto cleanup;
/* generate receive event */
system("ping -c 1 127.0.0.1 > /dev/null");
if (bss->skip) {
printf("%s:SKIP:no __builtin_btf_type_id\n", __func__);
test__skip();
goto cleanup;
}
/*
* Make sure netif_receive_skb program was triggered
* and it set expected return values from bpf_trace_printk()s
* and all tests ran.
*/
if (CHECK(bss->ret <= 0,
"bpf_snprintf_btf: got return value",
"ret <= 0 %ld test %d\n", bss->ret, bss->ran_subtests))
goto cleanup;
if (CHECK(bss->ran_subtests == 0, "check if subtests ran",
"no subtests ran, did BPF program run?"))
goto cleanup;
if (CHECK(bss->num_subtests != bss->ran_subtests,
"check all subtests ran",
"only ran %d of %d tests\n", bss->num_subtests,
bss->ran_subtests))
goto cleanup;
cleanup:
netif_receive_skb__destroy(skel);
}


@ -0,0 +1,50 @@
// SPDX-License-Identifier: GPL-2.0
/* Copyright (c) 2020, Oracle and/or its affiliates. */
#include "bpf_iter.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
#include <bpf/bpf_core_read.h>
#include <errno.h>
char _license[] SEC("license") = "GPL";
long tasks = 0;
long seq_err = 0;
bool skip = false;
SEC("iter/task")
int dump_task_struct(struct bpf_iter__task *ctx)
{
struct seq_file *seq = ctx->meta->seq;
struct task_struct *task = ctx->task;
static struct btf_ptr ptr = { };
long ret;
#if __has_builtin(__builtin_btf_type_id)
ptr.type_id = bpf_core_type_id_kernel(struct task_struct);
ptr.ptr = task;
if (ctx->meta->seq_num == 0)
BPF_SEQ_PRINTF(seq, "Raw BTF task\n");
ret = bpf_seq_printf_btf(seq, &ptr, sizeof(ptr), 0);
switch (ret) {
case 0:
tasks++;
break;
case -ERANGE:
/* NULL task or task->fs, don't count it as an error. */
break;
case -E2BIG:
return 1;
default:
seq_err = ret;
break;
}
#else
skip = true;
#endif
return 0;
}


@ -0,0 +1,249 @@
// SPDX-License-Identifier: GPL-2.0
/* Copyright (c) 2020, Oracle and/or its affiliates. */
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
#include <bpf/bpf_core_read.h>
#include <errno.h>
long ret = 0;
int num_subtests = 0;
int ran_subtests = 0;
bool skip = false;
#define STRSIZE 2048
#define EXPECTED_STRSIZE 256
#ifndef ARRAY_SIZE
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
#endif
struct {
	__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
	__uint(max_entries, 1);
	__type(key, __u32);
	__type(value, char[STRSIZE]);
} strdata SEC(".maps");
static int __strncmp(const void *m1, const void *m2, size_t len)
{
	const unsigned char *s1 = m1;
	const unsigned char *s2 = m2;
	int i, delta = 0;
	for (i = 0; i < len; i++) {
		delta = s1[i] - s2[i];
		if (delta || s1[i] == 0 || s2[i] == 0)
			break;
	}
	return delta;
}
#if __has_builtin(__builtin_btf_type_id)
#define TEST_BTF(_str, _type, _flags, _expected, ...) \
	do { \
		static const char _expectedval[EXPECTED_STRSIZE] = \
			_expected; \
		static const char _ptrtype[64] = #_type; \
		__u64 _hflags = _flags | BTF_F_COMPACT; \
		static _type _ptrdata = __VA_ARGS__; \
		static struct btf_ptr _ptr = { }; \
		int _cmp; \
		\
		++num_subtests; \
		if (ret < 0) \
			break; \
		++ran_subtests; \
		_ptr.ptr = &_ptrdata; \
		_ptr.type_id = bpf_core_type_id_kernel(_type); \
		if (_ptr.type_id <= 0) { \
			ret = -EINVAL; \
			break; \
		} \
		ret = bpf_snprintf_btf(_str, STRSIZE, \
				       &_ptr, sizeof(_ptr), _hflags); \
		if (ret) \
			break; \
		_cmp = __strncmp(_str, _expectedval, EXPECTED_STRSIZE); \
		if (_cmp != 0) { \
			bpf_printk("(%d) got %s", _cmp, _str); \
			bpf_printk("(%d) expected %s", _cmp, \
				   _expectedval); \
			ret = -EBADMSG; \
			break; \
		} \
	} while (0)
#endif
/* Use where expected data string matches its stringified declaration */
#define TEST_BTF_C(_str, _type, _flags, ...) \
	TEST_BTF(_str, _type, _flags, "(" #_type ")" #__VA_ARGS__, \
		 __VA_ARGS__)
/* TRACE_EVENT(netif_receive_skb,
 *	TP_PROTO(struct sk_buff *skb),
 */
SEC("tp_btf/netif_receive_skb")
int BPF_PROG(trace_netif_receive_skb, struct sk_buff *skb)
{
	static __u64 flags[] = { 0, BTF_F_COMPACT, BTF_F_ZERO, BTF_F_PTR_RAW,
				 BTF_F_NONAME, BTF_F_COMPACT | BTF_F_ZERO |
				 BTF_F_PTR_RAW | BTF_F_NONAME };
	static struct btf_ptr p = { };
	__u32 key = 0;
	int i, __ret;
	char *str;
#if __has_builtin(__builtin_btf_type_id)
	str = bpf_map_lookup_elem(&strdata, &key);
	if (!str)
		return 0;
	/* Ensure we can write skb string representation */
	p.type_id = bpf_core_type_id_kernel(struct sk_buff);
	p.ptr = skb;
	for (i = 0; i < ARRAY_SIZE(flags); i++) {
		++num_subtests;
		ret = bpf_snprintf_btf(str, STRSIZE, &p, sizeof(p), flags[i]);
		if (ret < 0)
			bpf_printk("returned %d when writing skb", ret);
		++ran_subtests;
	}
	/* Check invalid ptr value */
	p.ptr = 0;
	__ret = bpf_snprintf_btf(str, STRSIZE, &p, sizeof(p), 0);
	if (__ret >= 0) {
		bpf_printk("printing NULL should generate error, got (%d)",
			   __ret);
		ret = -ERANGE;
	}
	/* Verify type display for various types. */
	/* simple int */
	TEST_BTF_C(str, int, 0, 1234);
	TEST_BTF(str, int, BTF_F_NONAME, "1234", 1234);
	/* zero value should be printed at toplevel */
	TEST_BTF(str, int, 0, "(int)0", 0);
	TEST_BTF(str, int, BTF_F_NONAME, "0", 0);
	TEST_BTF(str, int, BTF_F_ZERO, "(int)0", 0);
	TEST_BTF(str, int, BTF_F_NONAME | BTF_F_ZERO, "0", 0);
	TEST_BTF_C(str, int, 0, -4567);
	TEST_BTF(str, int, BTF_F_NONAME, "-4567", -4567);
	/* simple char */
	TEST_BTF_C(str, char, 0, 100);
	TEST_BTF(str, char, BTF_F_NONAME, "100", 100);
	/* zero value should be printed at toplevel */
	TEST_BTF(str, char, 0, "(char)0", 0);
	TEST_BTF(str, char, BTF_F_NONAME, "0", 0);
	TEST_BTF(str, char, BTF_F_ZERO, "(char)0", 0);
	TEST_BTF(str, char, BTF_F_NONAME | BTF_F_ZERO, "0", 0);
	/* simple typedef */
	TEST_BTF_C(str, uint64_t, 0, 100);
	TEST_BTF(str, u64, BTF_F_NONAME, "1", 1);
	/* zero value should be printed at toplevel */
	TEST_BTF(str, u64, 0, "(u64)0", 0);
	TEST_BTF(str, u64, BTF_F_NONAME, "0", 0);
	TEST_BTF(str, u64, BTF_F_ZERO, "(u64)0", 0);
	TEST_BTF(str, u64, BTF_F_NONAME|BTF_F_ZERO, "0", 0);
	/* typedef struct */
	TEST_BTF_C(str, atomic_t, 0, {.counter = (int)1,});
	TEST_BTF(str, atomic_t, BTF_F_NONAME, "{1,}", {.counter = 1,});
	/* typedef with 0 value should be printed at toplevel */
	TEST_BTF(str, atomic_t, 0, "(atomic_t){}", {.counter = 0,});
	TEST_BTF(str, atomic_t, BTF_F_NONAME, "{}", {.counter = 0,});
	TEST_BTF(str, atomic_t, BTF_F_ZERO, "(atomic_t){.counter = (int)0,}",
		 {.counter = 0,});
	TEST_BTF(str, atomic_t, BTF_F_NONAME|BTF_F_ZERO,
		 "{0,}", {.counter = 0,});
	/* enum where enum value does (and does not) exist */
	TEST_BTF_C(str, enum bpf_cmd, 0, BPF_MAP_CREATE);
	TEST_BTF(str, enum bpf_cmd, 0, "(enum bpf_cmd)BPF_MAP_CREATE", 0);
	TEST_BTF(str, enum bpf_cmd, BTF_F_NONAME, "BPF_MAP_CREATE",
		 BPF_MAP_CREATE);
	TEST_BTF(str, enum bpf_cmd, BTF_F_NONAME|BTF_F_ZERO,
		 "BPF_MAP_CREATE", 0);
	TEST_BTF(str, enum bpf_cmd, BTF_F_ZERO, "(enum bpf_cmd)BPF_MAP_CREATE",
		 BPF_MAP_CREATE);
	TEST_BTF(str, enum bpf_cmd, BTF_F_NONAME|BTF_F_ZERO,
		 "BPF_MAP_CREATE", BPF_MAP_CREATE);
	TEST_BTF_C(str, enum bpf_cmd, 0, 2000);
	TEST_BTF(str, enum bpf_cmd, BTF_F_NONAME, "2000", 2000);
	/* simple struct */
	TEST_BTF_C(str, struct btf_enum, 0,
		   {.name_off = (__u32)3,.val = (__s32)-1,});
	TEST_BTF(str, struct btf_enum, BTF_F_NONAME, "{3,-1,}",
		 { .name_off = 3, .val = -1,});
	TEST_BTF(str, struct btf_enum, BTF_F_NONAME, "{-1,}",
		 { .name_off = 0, .val = -1,});
	TEST_BTF(str, struct btf_enum, BTF_F_NONAME|BTF_F_ZERO, "{0,-1,}",
		 { .name_off = 0, .val = -1,});
	/* empty struct should be printed */
	TEST_BTF(str, struct btf_enum, 0, "(struct btf_enum){}",
		 { .name_off = 0, .val = 0,});
	TEST_BTF(str, struct btf_enum, BTF_F_NONAME, "{}",
		 { .name_off = 0, .val = 0,});
	TEST_BTF(str, struct btf_enum, BTF_F_ZERO,
		 "(struct btf_enum){.name_off = (__u32)0,.val = (__s32)0,}",
		 { .name_off = 0, .val = 0,});
	/* struct with pointers */
	TEST_BTF(str, struct list_head, BTF_F_PTR_RAW,
		 "(struct list_head){.next = (struct list_head *)0x0000000000000001,}",
		 { .next = (struct list_head *)1 });
	/* NULL pointer should not be displayed */
	TEST_BTF(str, struct list_head, BTF_F_PTR_RAW,
		 "(struct list_head){}",
		 { .next = (struct list_head *)0 });
	/* struct with char array */
	TEST_BTF(str, struct bpf_prog_info, 0,
		 "(struct bpf_prog_info){.name = (char[])['f','o','o',],}",
		 { .name = "foo",});
	TEST_BTF(str, struct bpf_prog_info, BTF_F_NONAME,
		 "{['f','o','o',],}",
		 {.name = "foo",});
	/* leading null char means do not display string */
	TEST_BTF(str, struct bpf_prog_info, 0,
		 "(struct bpf_prog_info){}",
		 {.name = {'\0', 'f', 'o', 'o'}});
	/* handle non-printable characters */
	TEST_BTF(str, struct bpf_prog_info, 0,
		 "(struct bpf_prog_info){.name = (char[])[1,2,3,],}",
		 { .name = {1, 2, 3, 0}});
	/* struct with non-char array */
	TEST_BTF(str, struct __sk_buff, 0,
		 "(struct __sk_buff){.cb = (__u32[])[1,2,3,4,5,],}",
		 { .cb = {1, 2, 3, 4, 5,},});
	TEST_BTF(str, struct __sk_buff, BTF_F_NONAME,
		 "{[1,2,3,4,5,],}",
		 { .cb = { 1, 2, 3, 4, 5},});
	/* For non-char arrays, show non-zero values only */
	TEST_BTF(str, struct __sk_buff, 0,
		 "(struct __sk_buff){.cb = (__u32[])[1,],}",
		 { .cb = { 0, 0, 1, 0, 0},});
	/* struct with bitfields */
	TEST_BTF_C(str, struct bpf_insn, 0,
		   {.code = (__u8)1,.dst_reg = (__u8)0x2,.src_reg = (__u8)0x3,.off = (__s16)4,.imm = (__s32)5,});
	TEST_BTF(str, struct bpf_insn, BTF_F_NONAME, "{1,0x2,0x3,4,5,}",
		 {.code = 1, .dst_reg = 0x2, .src_reg = 0x3, .off = 4,
		  .imm = 5,});
#else
	skip = true;
#endif
	return 0;
}
char _license[] SEC("license") = "GPL";