linux/kernel/bpf/log.c
Andrii Nakryiko 1216640938 bpf: Switch BPF verifier log to be a rotating log by default
Currently, if user-supplied log buffer to collect BPF verifier log turns
out to be too small to contain full log, bpf() syscall returns -ENOSPC,
fails BPF program verification/load, and preserves first N-1 bytes of
the verifier log (where N is the size of user-supplied buffer).

This is problematic in a bunch of common scenarios, especially when
working with real-world BPF programs that tend to be pretty complex as
far as verification goes and require big log buffers. Typically, this
happens when debugging tricky cases at log level 2 (verbose). Also, when BPF program
is successfully validated, log level 2 is the only way to actually see
verifier state progression and all the important details.

Even with log level 1, it's possible to get -ENOSPC even if the final
verifier log fits in log buffer, if there is a code path that's deep
enough to fill up entire log, even if normally it would be reset later
on (there is logic to chop off successfully validated portions of BPF
verifier log).

In short, it's not always possible to pre-size log buffer. Also, what's
worse, in practice, the end of the log most often is way more important
than the beginning, but verifier stops emitting log as soon as initial
log buffer is filled up.

This patch switches BPF verifier log behavior to effectively behave as
rotating log. That is, if user-supplied log buffer turns out to be too
short, verifier will keep overwriting previously written log,
effectively treating user's log buffer as a ring buffer. -ENOSPC is
still going to be returned at the end, to notify user that log contents
were truncated, but the important last N-1 bytes of the log would be
returned, which might be all that user really needs. This consistent
-ENOSPC behavior, regardless of rotating or fixed log behavior, avoids
breaking backwards compatibility. The only user-visible
change is which portion of verifier log user ends up seeing *if buffer
is too small*. Given contents of verifier log itself is not an ABI,
there is no breakage due to this behavior change. Specialized tools that
rely on specific contents of verifier log in -ENOSPC scenario are
expected to be easily adapted to accommodate old and new behaviors.
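
As an illustration of that user-visible difference, here is a simplified
user-space model (not kernel code). The function name model_visible_log()
and its parameters are hypothetical, and the exact boundary handling of the
kernel's own truncation check is not reproduced.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical model: given the full verifier log text and a user buffer
 * of buf_size bytes, fill buf with what user space would end up seeing.
 * fixed != 0 models BPF_LOG_FIXED (keep the first N-1 bytes);
 * fixed == 0 models the rotating default (keep the last N-1 bytes).
 * Returns 1 if the log was truncated (the -ENOSPC case), 0 if it fit,
 * and -1 for an unusable zero-sized buffer.
 */
static int model_visible_log(const char *full, char *buf, size_t buf_size,
			     int fixed)
{
	size_t len = strlen(full);
	size_t keep;
	const char *src;

	if (buf_size == 0)
		return -1;

	keep = buf_size - 1;	/* one byte is reserved for '\0' */
	if (len <= keep) {
		memcpy(buf, full, len + 1);
		return 0;
	}

	/* too small: pick the head (fixed) or the tail (rotating) */
	src = fixed ? full : full + (len - keep);
	memcpy(buf, src, keep);
	buf[keep] = '\0';
	return 1;
}
```

With an 8-byte buffer and the 10-byte log "0123456789", the fixed model
yields "0123456" and the rotating model yields "3456789"; both report
truncation.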

Importantly, though, to preserve good user experience and not require
every user-space application to adapt to this new behavior, before
exiting to user-space verifier will rotate log (in place) to make it
start at the very beginning of user buffer as a continuous
zero-terminated string. The contents will be the last N-1 bytes of the
full verifier log, of course.
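
The in-place rotation can be sketched in plain user-space C with three
range reversals. This is only a sketch of the general technique on an
ordinary array; reverse_range() and rotate_left() are hypothetical names,
and the kernel version additionally has to go through copy_from_user() and
copy_to_user() in chunks.

```c
#include <stddef.h>

/* Reverse buf[start, end) in place. */
static void reverse_range(char *buf, size_t start, size_t end)
{
	while (start + 1 < end) {
		char tmp = buf[start];

		buf[start] = buf[end - 1];
		buf[end - 1] = tmp;
		start++;
		end--;
	}
}

/*
 * Rotate buf[0, len) left by shift bytes without a temporary buffer:
 * reverse everything, then reverse the two parts separately.
 */
static void rotate_left(char *buf, size_t len, size_t shift)
{
	size_t sublen;

	if (len == 0)
		return;
	shift %= len;
	if (shift == 0)
		return;

	sublen = len - shift;	/* length of the part that becomes the head */
	reverse_range(buf, 0, len);
	reverse_range(buf, 0, sublen);
	reverse_range(buf, sublen, len);
}
```

For example, rotating the 11 bytes "EFGHIJKABCD" left by 7 gives
"ABCDEFGHIJK". Each byte is touched twice, and no second buffer of the
same size is needed.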

Given beginning of log is sometimes important as well, we add
BPF_LOG_FIXED (which equals 8) flag to force old behavior, which allows
tools like veristat to request first part of verifier log, if necessary.
BPF_LOG_FIXED flag is also a simple and straightforward way to check if
BPF verifier supports rotating behavior.
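
To show why the flag doubles as a feature probe, here is a user-space
sketch of the log attribute check. Only BPF_LOG_FIXED == 8 is stated in
this message; the other flag values and the pre-patch mask
(OLD_BPF_LOG_MASK) are assumptions about include/linux/bpf_verifier.h, and
log_attr_valid() is a hypothetical stand-in for
bpf_verifier_log_attr_valid() with the mask made a parameter.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed flag values; only BPF_LOG_FIXED == 8 is given in the text. */
#define BPF_LOG_LEVEL1	1
#define BPF_LOG_LEVEL2	2
#define BPF_LOG_STATS	4
#define BPF_LOG_FIXED	8
#define BPF_LOG_LEVEL	(BPF_LOG_LEVEL1 | BPF_LOG_LEVEL2)
#define BPF_LOG_MASK	(BPF_LOG_LEVEL | BPF_LOG_STATS | BPF_LOG_FIXED)
/* Assumed mask before this patch, i.e. without BPF_LOG_FIXED. */
#define OLD_BPF_LOG_MASK	(BPF_LOG_LEVEL | BPF_LOG_STATS)

/*
 * Hypothetical model of the attribute check: a non-empty buffer of sane
 * size, a non-zero level, and no bits outside the accepted mask.
 */
static bool log_attr_valid(unsigned int level, const void *ubuf,
			   unsigned int len_total, unsigned int mask)
{
	return len_total > 0 && len_total <= UINT_MAX >> 2 &&
	       level && ubuf && !(level & ~mask);
}
```

Under these assumptions, a request with BPF_LOG_LEVEL1 | BPF_LOG_FIXED
passes the check with the new mask but fails with the old one, so a
rejected load with that flag set suggests a verifier without rotating log
support.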

On the implementation side, conceptually, it's all simple. We maintain
64-bit logical start and end positions. If we need to truncate the log,
start position will be adjusted accordingly to lag end position by
N bytes. We then use those logical positions to calculate their matching
actual positions in user buffer and handle wrap around the end of the
buffer properly. Finally, right before returning from bpf_check(), we
rotate user log buffer contents in-place as necessary, to make log
contents contiguous. See comments in relevant functions for details.
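
The position bookkeeping can be modeled in user space roughly as follows.
This is a sketch under simplifying assumptions: struct ring_log and
ring_log_append() are hypothetical names, memcpy() stands in for
copy_to_user(), and error handling is omitted.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical user-space stand-in for the relevant log fields. */
struct ring_log {
	char *buf;		/* plays the role of the user buffer */
	uint32_t len_total;	/* buffer size N, must be > 0 */
	uint64_t start_pos;	/* logical position of first kept byte */
	uint64_t end_pos;	/* logical position of terminating '\0' */
};

/* Append msg plus its '\0', overwriting the oldest bytes if needed. */
static void ring_log_append(struct ring_log *log, const char *msg)
{
	uint64_t n = strlen(msg) + 1;		/* include '\0' */
	uint64_t new_end = log->end_pos + n;	/* exclusive logical end */
	uint64_t new_n = n < log->len_total ? n : log->len_total;
	uint64_t cur_pos = new_end - new_n;	/* first byte we can keep */
	const char *src = msg + (n - new_n);
	uint32_t buf_start = cur_pos % log->len_total;
	uint32_t buf_end = new_end % log->len_total;

	/* an exclusive end of 0 means "right at the end of the buffer" */
	if (buf_end == 0)
		buf_end = log->len_total;

	if (buf_start < buf_end) {
		/* fits in one contiguous chunk */
		memcpy(log->buf + buf_start, src, buf_end - buf_start);
	} else {
		/* wraps around: tail of the buffer first, then its head */
		memcpy(log->buf + buf_start, src, log->len_total - buf_start);
		memcpy(log->buf, msg + n - buf_end, buf_end);
	}

	/* keep start_pos lagging the end by at most len_total bytes */
	if (new_end - log->start_pos >= log->len_total)
		log->start_pos = new_end - log->len_total;
	log->end_pos = new_end - 1;	/* don't count terminating '\0' */
}
```

With an 8-byte buffer, appending "abcde" and then "fghij" leaves
start_pos == 3 and end_pos == 10, and the buffer holds the bytes
'i' 'j' '\0' 'd' 'e' 'f' 'g' 'h'. Rotating it left by
start_pos % len_total == 3 would produce the contiguous string "defghij".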

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Lorenz Bauer <lmb@isovalent.com>
Link: https://lore.kernel.org/bpf/20230406234205.323208-4-andrii@kernel.org
2023-04-11 18:05:43 +02:00


// SPDX-License-Identifier: GPL-2.0-only
/* Copyright (c) 2011-2014 PLUMgrid, http://plumgrid.com
 * Copyright (c) 2016 Facebook
 * Copyright (c) 2018 Covalent IO, Inc. http://covalent.io
 */
#include <uapi/linux/btf.h>
#include <linux/kernel.h>
#include <linux/types.h>
#include <linux/bpf.h>
#include <linux/bpf_verifier.h>
#include <linux/math64.h>

bool bpf_verifier_log_attr_valid(const struct bpf_verifier_log *log)
{
	return log->len_total > 0 && log->len_total <= UINT_MAX >> 2 &&
	       log->level && log->ubuf && !(log->level & ~BPF_LOG_MASK);
}

void bpf_verifier_vlog(struct bpf_verifier_log *log, const char *fmt,
		       va_list args)
{
	unsigned int n;

	n = vscnprintf(log->kbuf, BPF_VERIFIER_TMP_LOG_SIZE, fmt, args);

	WARN_ONCE(n >= BPF_VERIFIER_TMP_LOG_SIZE - 1,
		  "verifier log line truncated - local buffer too short\n");

	if (log->level == BPF_LOG_KERNEL) {
		bool newline = n > 0 && log->kbuf[n - 1] == '\n';

		pr_err("BPF: %s%s", log->kbuf, newline ? "" : "\n");
		return;
	}

	if (log->level & BPF_LOG_FIXED) {
		n = min(log->len_total - bpf_log_used(log) - 1, n);
		log->kbuf[n] = '\0';
		n += 1;

		if (copy_to_user(log->ubuf + log->end_pos, log->kbuf, n))
			goto fail;

		log->end_pos += n - 1; /* don't count terminating '\0' */
	} else {
		u64 new_end, new_start, cur_pos;
		u32 buf_start, buf_end, new_n;

		n += 1;

		new_end = log->end_pos + n;
		if (new_end - log->start_pos >= log->len_total)
			new_start = new_end - log->len_total;
		else
			new_start = log->start_pos;
		new_n = min(n, log->len_total);
		cur_pos = new_end - new_n;

		div_u64_rem(cur_pos, log->len_total, &buf_start);
		div_u64_rem(new_end, log->len_total, &buf_end);
		/* new_end and buf_end are exclusive indices, so if buf_end is
		 * exactly zero, then it actually points right to the end of
		 * ubuf and there is no wrap around
		 */
		if (buf_end == 0)
			buf_end = log->len_total;

		/* if buf_start > buf_end, we wrapped around;
		 * if buf_start == buf_end, then we fill ubuf completely; we
		 * can't have buf_start == buf_end to mean that there is
		 * nothing to write, because we always write at least
		 * something, even if terminal '\0'
		 */
		if (buf_start < buf_end) {
			/* message fits within contiguous chunk of ubuf */
			if (copy_to_user(log->ubuf + buf_start,
					 log->kbuf + n - new_n,
					 buf_end - buf_start))
				goto fail;
		} else {
			/* message wraps around the end of ubuf, copy in two chunks */
			if (copy_to_user(log->ubuf + buf_start,
					 log->kbuf + n - new_n,
					 log->len_total - buf_start))
				goto fail;
			if (copy_to_user(log->ubuf,
					 log->kbuf + n - buf_end,
					 buf_end))
				goto fail;
		}

		log->start_pos = new_start;
		log->end_pos = new_end - 1; /* don't count terminating '\0' */
	}

	return;
fail:
	log->ubuf = NULL;
}

void bpf_vlog_reset(struct bpf_verifier_log *log, u64 new_pos)
{
	char zero = 0;
	u32 pos;

	if (WARN_ON_ONCE(new_pos > log->end_pos))
		return;

	if (!bpf_verifier_log_needed(log))
		return;

	/* if position to which we reset is beyond current log window,
	 * then we didn't preserve any useful content and should adjust
	 * start_pos to end up with an empty log (start_pos == end_pos)
	 */
	log->end_pos = new_pos;
	if (log->end_pos < log->start_pos)
		log->start_pos = log->end_pos;
	div_u64_rem(new_pos, log->len_total, &pos);
	if (put_user(zero, log->ubuf + pos))
		log->ubuf = NULL;
}

static void bpf_vlog_reverse_kbuf(char *buf, int len)
{
	int i, j;

	for (i = 0, j = len - 1; i < j; i++, j--)
		swap(buf[i], buf[j]);
}

static int bpf_vlog_reverse_ubuf(struct bpf_verifier_log *log, int start, int end)
{
	/* we split log->kbuf into two equal parts for both ends of array */
	int n = sizeof(log->kbuf) / 2, nn;
	char *lbuf = log->kbuf, *rbuf = log->kbuf + n;

	/* Read ubuf's section [start, end) two chunks at a time, from left
	 * and right side; within each chunk, swap all the bytes; after that
	 * reverse the order of lbuf and rbuf and write result back to ubuf.
	 * This way we'll end up with swapped contents of specified
	 * [start, end) ubuf segment.
	 */
	while (end - start > 1) {
		nn = min(n, (end - start) / 2);

		if (copy_from_user(lbuf, log->ubuf + start, nn))
			return -EFAULT;
		if (copy_from_user(rbuf, log->ubuf + end - nn, nn))
			return -EFAULT;

		bpf_vlog_reverse_kbuf(lbuf, nn);
		bpf_vlog_reverse_kbuf(rbuf, nn);

		/* we write lbuf to the right end of ubuf, while rbuf to the
		 * left one to end up with properly reversed overall ubuf
		 */
		if (copy_to_user(log->ubuf + start, rbuf, nn))
			return -EFAULT;
		if (copy_to_user(log->ubuf + end - nn, lbuf, nn))
			return -EFAULT;

		start += nn;
		end -= nn;
	}

	return 0;
}

bool bpf_vlog_truncated(const struct bpf_verifier_log *log)
{
	if (log->level & BPF_LOG_FIXED)
		return bpf_log_used(log) >= log->len_total - 1;
	else
		return log->start_pos > 0;
}

void bpf_vlog_finalize(struct bpf_verifier_log *log)
{
	u32 sublen;
	int err;

	if (!log || !log->level || !log->ubuf)
		return;
	if ((log->level & BPF_LOG_FIXED) || log->level == BPF_LOG_KERNEL)
		return;

	/* If we never truncated log, there is nothing to move around. */
	if (log->start_pos == 0)
		return;

	/* Otherwise we need to rotate log contents to make it start from the
	 * buffer beginning and be a continuous zero-terminated string. Note
	 * that if log->start_pos != 0 then we definitely filled up entire log
	 * buffer with no gaps, and we just need to shift buffer contents to
	 * the left by (log->start_pos % log->len_total) bytes.
	 *
	 * Unfortunately, user buffer could be huge and we don't want to
	 * allocate temporary kernel memory of the same size just to shift
	 * contents in a straightforward fashion. Instead, we'll be clever and
	 * do in-place array rotation. This is a leetcode-style problem, which
	 * could be solved by three reversals.
	 *
	 * Let's say we have log buffer that has to be shifted left by 7 bytes
	 * (spaces and vertical bar are just for demonstrative purposes):
	 *   E F G H I J K | A B C D
	 *
	 * First, we reverse entire array:
	 *   D C B A | K J I H G F E
	 *
	 * Then we reverse first 4 bytes (DCBA) and separately last 7 bytes
	 * (KJIHGFE), resulting in a properly rotated array:
	 *   A B C D | E F G H I J K
	 *
	 * We'll utilize log->kbuf to read user memory chunk by chunk, swap
	 * bytes, and write them back. Doing it byte-by-byte would be
	 * unnecessarily inefficient. Altogether we are going to read and
	 * write each byte twice, for total 4 memory copies between kernel and
	 * user space.
	 */

	/* length of the chopped off part that will be the beginning;
	 * len(ABCD) in the example above
	 */
	div_u64_rem(log->start_pos, log->len_total, &sublen);
	sublen = log->len_total - sublen;

	err = bpf_vlog_reverse_ubuf(log, 0, log->len_total);
	err = err ?: bpf_vlog_reverse_ubuf(log, 0, sublen);
	err = err ?: bpf_vlog_reverse_ubuf(log, sublen, log->len_total);
	if (err)
		log->ubuf = NULL;
}

/* log_level controls verbosity level of eBPF verifier.
 * bpf_verifier_log_write() is used to dump the verification trace to the log,
 * so the user can figure out what's wrong with the program
 */
__printf(2, 3) void bpf_verifier_log_write(struct bpf_verifier_env *env,
					   const char *fmt, ...)
{
	va_list args;

	if (!bpf_verifier_log_needed(&env->log))
		return;

	va_start(args, fmt);
	bpf_verifier_vlog(&env->log, fmt, args);
	va_end(args);
}
EXPORT_SYMBOL_GPL(bpf_verifier_log_write);

__printf(2, 3) void bpf_log(struct bpf_verifier_log *log,
			    const char *fmt, ...)
{
	va_list args;

	if (!bpf_verifier_log_needed(log))
		return;

	va_start(args, fmt);
	bpf_verifier_vlog(log, fmt, args);
	va_end(args);
}
EXPORT_SYMBOL_GPL(bpf_log);