Alexei Starovoitov says:

====================
pull-request: bpf-next 2022-03-21 v2

We've added 137 non-merge commits during the last 17 day(s) which contain
a total of 143 files changed, 7123 insertions(+), 1092 deletions(-).

The main changes are:

1) Custom SEC() handling in libbpf, from Andrii.

2) subskeleton support, from Delyan.

3) Use btf_tag to recognize __percpu pointers in the verifier, from Hao.

4) Fix net.core.bpf_jit_harden race, from Hou.

5) Fix bpf_sk_lookup remote_port on big-endian, from Jakub.

6) Introduce fprobe (multi kprobe) _without_ arch bits, from Masami.
   The arch specific bits will come later.

7) Introduce multi_kprobe bpf programs on top of fprobe, from Jiri.

8) Enable non-atomic allocations in local storage, from Joanne.

9) Various var_off ptr_to_btf_id fixes, from Kumar.

10) bpf_ima_file_hash helper, from Roberto.

11) Add "live packet" mode for XDP in BPF_PROG_RUN, from Toke.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (137 commits)
  selftests/bpf: Fix kprobe_multi test.
  Revert "rethook: x86: Add rethook x86 implementation"
  Revert "arm64: rethook: Add arm64 rethook implementation"
  Revert "powerpc: Add rethook support"
  Revert "ARM: rethook: Add rethook arm implementation"
  bpftool: Fix a bug in subskeleton code generation
  bpf: Fix bpf_prog_pack when PMU_SIZE is not defined
  bpf: Fix bpf_prog_pack for multi-node setup
  bpf: Fix warning for cast from restricted gfp_t in verifier
  bpf, arm: Fix various typos in comments
  libbpf: Close fd in bpf_object__reuse_map
  bpftool: Fix print error when show bpf map
  bpf: Fix kprobe_multi return probe backtrace
  Revert "bpf: Add support to inline bpf_get_func_ip helper on x86"
  bpf: Simplify check in btf_parse_hdr()
  selftests/bpf/test_lirc_mode2.sh: Exit with proper code
  bpf: Check for NULL return from bpf_get_btf_vmlinux
  selftests/bpf: Test skipping stacktrace
  bpf: Adjust BPF stack helper functions to accommodate skip > 0
  bpf: Select proper size for bpf_prog_pack
  ...
====================

Link: https://lore.kernel.org/r/20220322050159.5507-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
commit 0db8640df5

Documentation/bpf/bpf_prog_run.rst (new file, 117 lines)
@@ -0,0 +1,117 @@
.. SPDX-License-Identifier: GPL-2.0

===================================
Running BPF programs from userspace
===================================

This document describes the ``BPF_PROG_RUN`` facility for running BPF programs
from userspace.

.. contents::
   :local:
   :depth: 2


Overview
--------

The ``BPF_PROG_RUN`` command can be used through the ``bpf()`` syscall to
execute a BPF program in the kernel and return the results to userspace. This
can be used to unit test BPF programs against user-supplied context objects, and
as a way to explicitly execute programs in the kernel for their side effects. The
command was previously named ``BPF_PROG_TEST_RUN``, and both constants continue
to be defined in the UAPI header, aliased to the same value.

The ``BPF_PROG_RUN`` command can be used to execute BPF programs of the
following types:

- ``BPF_PROG_TYPE_SOCKET_FILTER``
- ``BPF_PROG_TYPE_SCHED_CLS``
- ``BPF_PROG_TYPE_SCHED_ACT``
- ``BPF_PROG_TYPE_XDP``
- ``BPF_PROG_TYPE_SK_LOOKUP``
- ``BPF_PROG_TYPE_CGROUP_SKB``
- ``BPF_PROG_TYPE_LWT_IN``
- ``BPF_PROG_TYPE_LWT_OUT``
- ``BPF_PROG_TYPE_LWT_XMIT``
- ``BPF_PROG_TYPE_LWT_SEG6LOCAL``
- ``BPF_PROG_TYPE_FLOW_DISSECTOR``
- ``BPF_PROG_TYPE_STRUCT_OPS``
- ``BPF_PROG_TYPE_RAW_TRACEPOINT``
- ``BPF_PROG_TYPE_SYSCALL``

When using the ``BPF_PROG_RUN`` command, userspace supplies an input context
object and (for program types operating on network packets) a buffer containing
the packet data that the BPF program will operate on. The kernel will then
execute the program and return the results to userspace. Note that programs will
not have any side effects while being run in this mode; in particular, packets
will not actually be redirected or dropped, the program return code will just be
returned to userspace. A separate mode for live execution of XDP programs is
provided, documented separately below.
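For illustration, a userspace unit test could drive this command through
libbpf's ``bpf_prog_test_run_opts()`` wrapper (one possible interface; the raw
``bpf()`` syscall attributes can be filled in equivalently), along these lines:

.. code-block:: c

    #include <bpf/bpf.h>

    /* Run an already-loaded XDP program once against a test packet and
     * return its verdict (e.g. XDP_PASS or XDP_DROP).
     */
    static int run_once(int prog_fd, void *pkt, size_t pkt_len)
    {
            char data_out[1500];
            LIBBPF_OPTS(bpf_test_run_opts, opts,
                    .data_in = pkt,
                    .data_size_in = pkt_len,
                    .data_out = data_out,
                    .data_size_out = sizeof(data_out),
                    .repeat = 1,
            );
            int err = bpf_prog_test_run_opts(prog_fd, &opts);

            if (err)
                    return err;
            return opts.retval;
    }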

Running XDP programs in "live frame mode"
-----------------------------------------

The ``BPF_PROG_RUN`` command has a separate mode for running live XDP programs,
which can be used to execute XDP programs in a way where packets will actually
be processed by the kernel after the execution of the XDP program as if they
arrived on a physical interface. This mode is activated by setting the
``BPF_F_TEST_XDP_LIVE_FRAMES`` flag when supplying an XDP program to
``BPF_PROG_RUN``.

The live packet mode is optimised for high performance execution of the supplied
XDP program many times (suitable for, e.g., running as a traffic generator),
which means the semantics are not quite as straightforward as the regular test
run mode. Specifically (a short usage sketch follows at the end of this
section):

- When executing an XDP program in live frame mode, the result of the execution
  will not be returned to userspace; instead, the kernel will perform the
  operation indicated by the program's return code (drop the packet, redirect
  it, etc). For this reason, setting the ``data_out`` or ``ctx_out`` attributes
  in the syscall parameters when running in this mode will be rejected. In
  addition, not all failures will be reported back to userspace directly;
  specifically, only fatal errors in setup or during execution (like memory
  allocation errors) will halt execution and return an error. If an error occurs
  in packet processing, like a failure to redirect to a given interface,
  execution will continue with the next repetition; these errors can be detected
  via the same trace points as for regular XDP programs.

- Userspace can supply an ifindex as part of the context object, just like in
  the regular (non-live) mode. The XDP program will be executed as though the
  packet arrived on this interface; i.e., the ``ingress_ifindex`` of the context
  object will point to that interface. Furthermore, if the XDP program returns
  ``XDP_PASS``, the packet will be injected into the kernel networking stack as
  though it arrived on that ifindex, and if it returns ``XDP_TX``, the packet
  will be transmitted *out* of that same interface. Do note, though, that
  because the program execution is not happening in driver context, an
  ``XDP_TX`` is actually turned into the same action as an ``XDP_REDIRECT`` to
  that same interface (i.e., it will only work if the driver has support for the
  ``ndo_xdp_xmit`` driver op).

- When running the program with multiple repetitions, the execution will happen
  in batches. The batch size defaults to 64 packets (which is the same as the
  maximum NAPI receive batch size), but can be specified by userspace through
  the ``batch_size`` parameter, up to a maximum of 256 packets. For each batch,
  the kernel executes the XDP program repeatedly, each invocation getting a
  separate copy of the packet data. For each repetition, if the program drops
  the packet, the data page is immediately recycled (see below). Otherwise, the
  packet is buffered until the end of the batch, at which point all packets
  buffered this way during the batch are transmitted at once.

- When setting up the test run, the kernel will initialise a pool of memory
  pages of the same size as the batch size. Each memory page will be initialised
  with the initial packet data supplied by userspace at ``BPF_PROG_RUN``
  invocation. When possible, the pages will be recycled on future program
  invocations, to improve performance. Pages will generally be recycled a full
  batch at a time, except when a packet is dropped (by return code or because
  of, say, a redirection error), in which case that page will be recycled
  immediately. If a packet ends up being passed to the regular networking stack
  (because the XDP program returns ``XDP_PASS``, or because it ends up being
  redirected to an interface that injects it into the stack), the page will be
  released and a new one will be allocated when the pool is empty.

When recycling, the page content is not rewritten; only the packet boundary
pointers (``data``, ``data_end`` and ``data_meta``) in the context object will
be reset to the original values. This means that if a program rewrites the
packet contents, it has to be prepared to see either the original content or
the modified version on subsequent invocations.
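Tying the points above together, a traffic-generator style invocation could be
sketched as follows (same libbpf wrapper as in the earlier example; the repeat
count here is arbitrary):

.. code-block:: c

    #include <bpf/bpf.h>

    /* Send many copies of one test frame through an XDP program as if they
     * had arrived on @ifindex.
     */
    static int run_live(int prog_fd, void *pkt, size_t pkt_len, int ifindex)
    {
            struct xdp_md ctx_in = { .ingress_ifindex = ifindex };
            LIBBPF_OPTS(bpf_test_run_opts, opts,
                    .data_in = pkt,
                    .data_size_in = pkt_len,
                    .ctx_in = &ctx_in,
                    .ctx_size_in = sizeof(ctx_in),
                    .flags = BPF_F_TEST_XDP_LIVE_FRAMES,
                    .repeat = 1 << 20,      /* total number of frames */
                    .batch_size = 64,       /* default; at most 256 */
            );

            /* data_out/ctx_out must be left unset in live frame mode */
            return bpf_prog_test_run_opts(prog_fd, &opts);
    }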

@@ -21,6 +21,7 @@ that goes into great technical depth about the BPF Architecture.
   helpers
   programs
   maps
   bpf_prog_run
   classic_vs_extended.rst
   bpf_licensing
   test_debug

Documentation/trace/fprobe.rst (new file, 174 lines)
@@ -0,0 +1,174 @@
.. SPDX-License-Identifier: GPL-2.0

==================================
Fprobe - Function entry/exit probe
==================================

.. Author: Masami Hiramatsu <mhiramat@kernel.org>

Introduction
============

Fprobe is a function entry/exit probe mechanism based on ftrace.
If you only want to attach callbacks on function entry and exit, similar to
kprobes and kretprobes, you can use fprobe instead of the full ftrace feature
set. Compared with kprobes and kretprobes, fprobe gives faster instrumentation
for multiple functions with a single handler. This document describes how to
use fprobe.

The usage of fprobe
===================

The fprobe is a wrapper of ftrace (+ kretprobe-like return callback) to
attach callbacks to multiple function entries and exits. Users need to set up
the `struct fprobe` and pass it to `register_fprobe()`.

Typically, the `fprobe` data structure is initialized with the `entry_handler`
and/or `exit_handler` as below.

.. code-block:: c

 struct fprobe fp = {
        .entry_handler = my_entry_callback,
        .exit_handler = my_exit_callback,
 };

To enable the fprobe, call one of register_fprobe(), register_fprobe_ips(), and
register_fprobe_syms(). These functions register the fprobe with different types
of parameters.

The register_fprobe() enables a fprobe by function-name filters.
E.g. this enables @fp on "func*()" functions except "func2()"::

 register_fprobe(&fp, "func*", "func2");

The register_fprobe_ips() enables a fprobe by ftrace-location addresses.
E.g.

.. code-block:: c

 unsigned long ips[] = { 0x.... };

 register_fprobe_ips(&fp, ips, ARRAY_SIZE(ips));

And the register_fprobe_syms() enables a fprobe by symbol names.
E.g.

.. code-block:: c

 const char *syms[] = {"func1", "func2", "func3"};

 register_fprobe_syms(&fp, syms, ARRAY_SIZE(syms));

To disable (remove from functions) this fprobe, call::

 unregister_fprobe(&fp);

You can temporarily (soft) disable the fprobe by::

 disable_fprobe(&fp);

and resume it by::

 enable_fprobe(&fp);

The above is defined by including the header::

 #include <linux/fprobe.h>

Same as ftrace, the registered callbacks will start being called some time
after register_fprobe() is called and before it returns. See
:file:`Documentation/trace/ftrace.rst`.

Also, unregister_fprobe() guarantees that both the entry and exit handlers
are no longer being called by functions after unregister_fprobe() returns,
the same as unregister_ftrace_function().

The fprobe entry/exit handler
=============================

The prototype of the entry/exit callback function is as follows:

.. code-block:: c

 void callback_func(struct fprobe *fp, unsigned long entry_ip, struct pt_regs *regs);

Note that both entry and exit callbacks have the same prototype. The @entry_ip
is saved at function entry and passed to the exit handler.

@fp
        This is the address of the `fprobe` data structure related to this handler.
        You can embed the `fprobe` in your own data structure and get it back with
        the container_of() macro from @fp. The @fp must not be NULL.

@entry_ip
        This is the ftrace address of the traced function (both entry and exit).
        Note that this may not be the actual entry address of the function but
        the address where the ftrace is instrumented.

@regs
        This is the `pt_regs` data structure at the entry and exit. Note that
        the instruction pointer of @regs may be different from the @entry_ip
        in the entry_handler. If you need the traced instruction pointer, you
        need to use @entry_ip. On the other hand, in the exit_handler, the
        instruction pointer of @regs is set to the correct return address.
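Putting the registration API and the handler prototype together, a minimal
module that traces one function family could be sketched as follows (an
illustrative sketch; the probed pattern ``kernel_clone*`` is just an example):

.. code-block:: c

 #include <linux/fprobe.h>
 #include <linux/module.h>
 #include <linux/printk.h>

 static void my_entry_callback(struct fprobe *fp, unsigned long entry_ip,
                               struct pt_regs *regs)
 {
        pr_info("entering %pS\n", (void *)entry_ip);
 }

 static void my_exit_callback(struct fprobe *fp, unsigned long entry_ip,
                              struct pt_regs *regs)
 {
        pr_info("returning from %pS\n", (void *)entry_ip);
 }

 static struct fprobe fp = {
        .entry_handler = my_entry_callback,
        .exit_handler  = my_exit_callback,
 };

 static int __init sample_init(void)
 {
        /* attach to every function matching "kernel_clone*", no exclusions */
        return register_fprobe(&fp, "kernel_clone*", NULL);
 }

 static void __exit sample_exit(void)
 {
        unregister_fprobe(&fp);
 }

 module_init(sample_init);
 module_exit(sample_exit);
 MODULE_LICENSE("GPL");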

Share the callbacks with kprobes
================================

Since the recursion safety of the fprobe (and ftrace) is a bit different
from that of kprobes, this may cause an issue if a user wants to run the same
code from the fprobe and the kprobes.

Kprobes has a per-cpu 'current_kprobe' variable which protects the kprobe
handler from recursion in all cases. On the other hand, fprobe uses
only ftrace_test_recursion_trylock(). This allows interrupt context to
call another (or the same) fprobe while the fprobe user handler is running.

This is not a problem if the common callback code has its own recursion
detection, or if it can handle the recursion in the different contexts
(normal/interrupt/NMI).
But if it relies on the 'current_kprobe' recursion lock, it has to check
kprobe_running() and use the kprobe_busy_*() APIs.

Fprobe has the FPROBE_FL_KPROBE_SHARED flag to do this. If your common callback
code will be shared with kprobes, please set FPROBE_FL_KPROBE_SHARED
*before* registering the fprobe, like:

.. code-block:: c

 fprobe.flags = FPROBE_FL_KPROBE_SHARED;

 register_fprobe(&fprobe, "func*", NULL);

This will protect your common callback from the nested call.
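For reference, a callback that relies on the 'current_kprobe' recursion lock
would follow this pattern (a sketch of the convention described above;
``common_probe_body()`` is a hypothetical placeholder):

.. code-block:: c

 /* needs <linux/kprobes.h> for kprobe_running()/kprobe_busy_*() */
 static void shared_entry_callback(struct fprobe *fp, unsigned long entry_ip,
                                   struct pt_regs *regs)
 {
        if (kprobe_running())
                return;         /* already inside a kprobe handler; skip */

        kprobe_busy_begin();
        /* code that is also called from a kprobe handler */
        common_probe_body(entry_ip, regs);
        kprobe_busy_end();
 }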

The missed counter
==================

The `fprobe` data structure has a `fprobe::nmissed` counter field, the same as
kprobes.
This counter counts up when:

- fprobe fails to take the ftrace_recursion lock. This usually means that a
  function which is traced by other ftrace users is called from the
  entry_handler.

- fprobe fails to set up the function exit because of a shortage of rethook
  (the shadow stack for hooking the function return).

The `fprobe::nmissed` field counts up in both cases. Therefore, the former
skips both the entry and exit callbacks and the latter skips only the exit
callback, but in both cases the counter increases by 1.

Note that if you set the FTRACE_OPS_FL_RECURSION and/or FTRACE_OPS_FL_RCU flags
in `fprobe::ops::flags` (ftrace_ops::flags) when registering the fprobe, this
counter may not work correctly, because ftrace skips the fprobe function which
increases the counter.


Functions and structures
========================

.. kernel-doc:: include/linux/fprobe.h
.. kernel-doc:: kernel/trace/fprobe.c

@@ -9,6 +9,7 @@ Linux Tracing Technologies
   tracepoint-analysis
   ftrace
   ftrace-uses
   fprobe
   kprobes
   kprobetrace
   uprobetracer

@@ -1864,7 +1864,7 @@ static int build_body(struct jit_ctx *ctx)
		if (ctx->target == NULL)
			ctx->offsets[i] = ctx->idx;

		/* If unsuccesfull, return with error code */
		/* If unsuccesful, return with error code */
		if (ret)
			return ret;
	}
@@ -1973,7 +1973,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
	 * for jit, although it can decrease the size of the image.
	 *
	 * As each arm instruction is of length 32bit, we are translating
	 * number of JITed intructions into the size required to store these
	 * number of JITed instructions into the size required to store these
	 * JITed code.
	 */
	image_size = sizeof(u32) * ctx.idx;

@@ -2335,7 +2335,13 @@ out_image:
				    sizeof(rw_header->size));
			bpf_jit_binary_pack_free(header, rw_header);
		}
		/* Fall back to interpreter mode */
		prog = orig_prog;
		if (extra_pass) {
			prog->bpf_func = NULL;
			prog->jited = 0;
			prog->jited_len = 0;
		}
		goto out_addrs;
	}
	if (image) {
@@ -2384,8 +2390,9 @@ out_image:
		 * Both cases are serious bugs and justify WARN_ON.
		 */
		if (WARN_ON(bpf_jit_binary_pack_finalize(prog, header, rw_header))) {
			prog = orig_prog;
			goto out_addrs;
			/* header has been freed */
			header = NULL;
			goto out_image;
		}

		bpf_tail_call_direct_fixup(prog);

@@ -433,21 +433,6 @@ static void veth_set_multicast_list(struct net_device *dev)
{
}

static struct sk_buff *veth_build_skb(void *head, int headroom, int len,
				      int buflen)
{
	struct sk_buff *skb;

	skb = build_skb(head, buflen);
	if (!skb)
		return NULL;

	skb_reserve(skb, headroom);
	skb_put(skb, len);

	return skb;
}

static int veth_select_rxq(struct net_device *dev)
{
	return smp_processor_id() % dev->real_num_rx_queues;
@@ -494,7 +479,7 @@ static int veth_xdp_xmit(struct net_device *dev, int n,
		struct xdp_frame *frame = frames[i];
		void *ptr = veth_xdp_to_ptr(frame);

		if (unlikely(frame->len > max_len ||
		if (unlikely(xdp_get_frame_len(frame) > max_len ||
			     __ptr_ring_produce(&rq->xdp_ring, ptr)))
			break;
		nxmit++;
@@ -695,16 +680,130 @@ static void veth_xdp_rcv_bulk_skb(struct veth_rq *rq, void **frames,
	}
}

static void veth_xdp_get(struct xdp_buff *xdp)
{
	struct skb_shared_info *sinfo = xdp_get_shared_info_from_buff(xdp);
	int i;

	get_page(virt_to_page(xdp->data));
	if (likely(!xdp_buff_has_frags(xdp)))
		return;

	for (i = 0; i < sinfo->nr_frags; i++)
		__skb_frag_ref(&sinfo->frags[i]);
}

static int veth_convert_skb_to_xdp_buff(struct veth_rq *rq,
					struct xdp_buff *xdp,
					struct sk_buff **pskb)
{
	struct sk_buff *skb = *pskb;
	u32 frame_sz;

	if (skb_shared(skb) || skb_head_is_locked(skb) ||
	    skb_shinfo(skb)->nr_frags) {
		u32 size, len, max_head_size, off;
		struct sk_buff *nskb;
		struct page *page;
		int i, head_off;

		/* We need a private copy of the skb and data buffers since
		 * the ebpf program can modify it. We segment the original skb
		 * into order-0 pages without linearize it.
		 *
		 * Make sure we have enough space for linear and paged area
		 */
		max_head_size = SKB_WITH_OVERHEAD(PAGE_SIZE -
						  VETH_XDP_HEADROOM);
		if (skb->len > PAGE_SIZE * MAX_SKB_FRAGS + max_head_size)
			goto drop;

		/* Allocate skb head */
		page = alloc_page(GFP_ATOMIC | __GFP_NOWARN);
		if (!page)
			goto drop;

		nskb = build_skb(page_address(page), PAGE_SIZE);
		if (!nskb) {
			put_page(page);
			goto drop;
		}

		skb_reserve(nskb, VETH_XDP_HEADROOM);
		size = min_t(u32, skb->len, max_head_size);
		if (skb_copy_bits(skb, 0, nskb->data, size)) {
			consume_skb(nskb);
			goto drop;
		}
		skb_put(nskb, size);

		skb_copy_header(nskb, skb);
		head_off = skb_headroom(nskb) - skb_headroom(skb);
		skb_headers_offset_update(nskb, head_off);

		/* Allocate paged area of new skb */
		off = size;
		len = skb->len - off;

		for (i = 0; i < MAX_SKB_FRAGS && off < skb->len; i++) {
			page = alloc_page(GFP_ATOMIC | __GFP_NOWARN);
			if (!page) {
				consume_skb(nskb);
				goto drop;
			}

			size = min_t(u32, len, PAGE_SIZE);
			skb_add_rx_frag(nskb, i, page, 0, size, PAGE_SIZE);
			if (skb_copy_bits(skb, off, page_address(page),
					  size)) {
				consume_skb(nskb);
				goto drop;
			}

			len -= size;
			off += size;
		}

		consume_skb(skb);
		skb = nskb;
	} else if (skb_headroom(skb) < XDP_PACKET_HEADROOM &&
		   pskb_expand_head(skb, VETH_XDP_HEADROOM, 0, GFP_ATOMIC)) {
		goto drop;
	}

	/* SKB "head" area always have tailroom for skb_shared_info */
	frame_sz = skb_end_pointer(skb) - skb->head;
	frame_sz += SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
	xdp_init_buff(xdp, frame_sz, &rq->xdp_rxq);
	xdp_prepare_buff(xdp, skb->head, skb_headroom(skb),
			 skb_headlen(skb), true);

	if (skb_is_nonlinear(skb)) {
		skb_shinfo(skb)->xdp_frags_size = skb->data_len;
		xdp_buff_set_frags_flag(xdp);
	} else {
		xdp_buff_clear_frags_flag(xdp);
	}
	*pskb = skb;

	return 0;
drop:
	consume_skb(skb);
	*pskb = NULL;

	return -ENOMEM;
}

static struct sk_buff *veth_xdp_rcv_skb(struct veth_rq *rq,
					struct sk_buff *skb,
					struct veth_xdp_tx_bq *bq,
					struct veth_stats *stats)
{
	u32 pktlen, headroom, act, metalen, frame_sz;
	void *orig_data, *orig_data_end;
	struct bpf_prog *xdp_prog;
	int mac_len, delta, off;
	struct xdp_buff xdp;
	u32 act, metalen;
	int off;

	skb_prepare_for_gro(skb);

@@ -715,52 +814,9 @@ static struct sk_buff *veth_xdp_rcv_skb(struct veth_rq *rq,
		goto out;
	}

	mac_len = skb->data - skb_mac_header(skb);
	pktlen = skb->len + mac_len;
	headroom = skb_headroom(skb) - mac_len;

	if (skb_shared(skb) || skb_head_is_locked(skb) ||
	    skb_is_nonlinear(skb) || headroom < XDP_PACKET_HEADROOM) {
		struct sk_buff *nskb;
		int size, head_off;
		void *head, *start;
		struct page *page;

		size = SKB_DATA_ALIGN(VETH_XDP_HEADROOM + pktlen) +
		       SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
		if (size > PAGE_SIZE)
			goto drop;

		page = alloc_page(GFP_ATOMIC | __GFP_NOWARN);
		if (!page)
			goto drop;

		head = page_address(page);
		start = head + VETH_XDP_HEADROOM;
		if (skb_copy_bits(skb, -mac_len, start, pktlen)) {
			page_frag_free(head);
			goto drop;
		}

		nskb = veth_build_skb(head, VETH_XDP_HEADROOM + mac_len,
				      skb->len, PAGE_SIZE);
		if (!nskb) {
			page_frag_free(head);
			goto drop;
		}

		skb_copy_header(nskb, skb);
		head_off = skb_headroom(nskb) - skb_headroom(skb);
		skb_headers_offset_update(nskb, head_off);
		consume_skb(skb);
		skb = nskb;
	}

	/* SKB "head" area always have tailroom for skb_shared_info */
	frame_sz = skb_end_pointer(skb) - skb->head;
	frame_sz += SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
	xdp_init_buff(&xdp, frame_sz, &rq->xdp_rxq);
	xdp_prepare_buff(&xdp, skb->head, skb->mac_header, pktlen, true);
	__skb_push(skb, skb->data - skb_mac_header(skb));
	if (veth_convert_skb_to_xdp_buff(rq, &xdp, &skb))
		goto drop;

	orig_data = xdp.data;
	orig_data_end = xdp.data_end;
@@ -771,7 +827,7 @@ static struct sk_buff *veth_xdp_rcv_skb(struct veth_rq *rq,
	case XDP_PASS:
		break;
	case XDP_TX:
		get_page(virt_to_page(xdp.data));
		veth_xdp_get(&xdp);
		consume_skb(skb);
		xdp.rxq->mem = rq->xdp_mem;
		if (unlikely(veth_xdp_tx(rq, &xdp, bq) < 0)) {
@@ -783,7 +839,7 @@ static struct sk_buff *veth_xdp_rcv_skb(struct veth_rq *rq,
		rcu_read_unlock();
		goto xdp_xmit;
	case XDP_REDIRECT:
		get_page(virt_to_page(xdp.data));
		veth_xdp_get(&xdp);
		consume_skb(skb);
		xdp.rxq->mem = rq->xdp_mem;
		if (xdp_do_redirect(rq->dev, &xdp, xdp_prog)) {
@@ -806,18 +862,27 @@ static struct sk_buff *veth_xdp_rcv_skb(struct veth_rq *rq,
	rcu_read_unlock();

	/* check if bpf_xdp_adjust_head was used */
	delta = orig_data - xdp.data;
	off = mac_len + delta;
	off = orig_data - xdp.data;
	if (off > 0)
		__skb_push(skb, off);
	else if (off < 0)
		__skb_pull(skb, -off);
	skb->mac_header -= delta;

	skb_reset_mac_header(skb);

	/* check if bpf_xdp_adjust_tail was used */
	off = xdp.data_end - orig_data_end;
	if (off != 0)
		__skb_put(skb, off); /* positive on grow, negative on shrink */

	/* XDP frag metadata (e.g. nr_frags) are updated in eBPF helpers
	 * (e.g. bpf_xdp_adjust_tail), we need to update data_len here.
	 */
	if (xdp_buff_has_frags(&xdp))
		skb->data_len = skb_shinfo(skb)->xdp_frags_size;
	else
		skb->data_len = 0;

	skb->protocol = eth_type_trans(skb, rq->dev);

	metalen = xdp.data - xdp.data_meta;
@@ -833,7 +898,7 @@ xdp_drop:
	return NULL;
err_xdp:
	rcu_read_unlock();
	page_frag_free(xdp.data);
	xdp_return_buff(&xdp);
xdp_xmit:
	return NULL;
}
@@ -855,7 +920,7 @@ static int veth_xdp_rcv(struct veth_rq *rq, int budget,
			/* ndo_xdp_xmit */
			struct xdp_frame *frame = veth_ptr_to_xdp(ptr);

			stats->xdp_bytes += frame->len;
			stats->xdp_bytes += xdp_get_frame_len(frame);
			frame = veth_xdp_rcv_one(rq, frame, bq, stats);
			if (frame) {
				/* XDP_PASS */
@@ -1463,9 +1528,14 @@ static int veth_xdp_set(struct net_device *dev, struct bpf_prog *prog,
			goto err;
		}

		max_mtu = PAGE_SIZE - VETH_XDP_HEADROOM -
			  peer->hard_header_len -
			  SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
		max_mtu = SKB_WITH_OVERHEAD(PAGE_SIZE - VETH_XDP_HEADROOM) -
			  peer->hard_header_len;
		/* Allow increasing the max_mtu if the program supports
		 * XDP fragments.
		 */
		if (prog->aux->xdp_has_frags)
			max_mtu += PAGE_SIZE * MAX_SKB_FRAGS;

		if (peer->mtu > max_mtu) {
			NL_SET_ERR_MSG_MOD(extack, "Peer MTU is too large to set XDP");
			err = -ERANGE;

@@ -334,7 +334,15 @@ enum bpf_type_flag {
	/* MEM is in user address space. */
	MEM_USER		= BIT(3 + BPF_BASE_TYPE_BITS),

	__BPF_TYPE_LAST_FLAG	= MEM_USER,
	/* MEM is a percpu memory. MEM_PERCPU tags PTR_TO_BTF_ID. When tagged
	 * with MEM_PERCPU, PTR_TO_BTF_ID _cannot_ be directly accessed. In
	 * order to drop this tag, it must be passed into bpf_per_cpu_ptr()
	 * or bpf_this_cpu_ptr(), which will return the pointer corresponding
	 * to the specified cpu.
	 */
	MEM_PERCPU		= BIT(4 + BPF_BASE_TYPE_BITS),

	__BPF_TYPE_LAST_FLAG	= MEM_PERCPU,
};

/* Max number of base types. */
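For context, on the BPF program side a pointer carrying this tag (e.g. a
__percpu ksym) has to be converted before it can be dereferenced; a sketch
along the lines of the existing selftests (vmlinux.h plus bpf_helpers.h
assumed, names illustrative):

	extern const struct rq runqueues __ksym;	/* per-cpu kernel variable */

	SEC("raw_tp/sys_enter")
	int read_cpu0_runqueue(void *ctx)
	{
		/* a direct dereference of &runqueues would be rejected; the
		 * helper strips the MEM_PERCPU tag and returns CPU 0's pointer
		 */
		struct rq *rq = bpf_per_cpu_ptr(&runqueues, 0);

		if (!rq)
			return 0;
		bpf_printk("cpu0 nr_running=%u", rq->nr_running);
		return 0;
	}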
@@ -516,7 +524,6 @@ enum bpf_reg_type {
	 */
	PTR_TO_MEM,		 /* reg points to valid memory region */
	PTR_TO_BUF,		 /* reg points to a read/write buffer */
	PTR_TO_PERCPU_BTF_ID,	 /* reg points to a percpu kernel variable */
	PTR_TO_FUNC,		 /* reg points to a bpf program function */
	__BPF_REG_TYPE_MAX,

@@ -154,16 +154,17 @@ void bpf_selem_unlink_map(struct bpf_local_storage_elem *selem);

struct bpf_local_storage_elem *
bpf_selem_alloc(struct bpf_local_storage_map *smap, void *owner, void *value,
		bool charge_mem);
		bool charge_mem, gfp_t gfp_flags);

int
bpf_local_storage_alloc(void *owner,
			struct bpf_local_storage_map *smap,
			struct bpf_local_storage_elem *first_selem);
			struct bpf_local_storage_elem *first_selem,
			gfp_t gfp_flags);

struct bpf_local_storage_data *
bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap,
			 void *value, u64 map_flags);
			 void *value, u64 map_flags, gfp_t gfp_flags);

void bpf_local_storage_free_rcu(struct rcu_head *rcu);

@@ -140,3 +140,4 @@ BPF_LINK_TYPE(BPF_LINK_TYPE_XDP, xdp)
#ifdef CONFIG_PERF_EVENTS
BPF_LINK_TYPE(BPF_LINK_TYPE_PERF_EVENT, perf)
#endif
BPF_LINK_TYPE(BPF_LINK_TYPE_KPROBE_MULTI, kprobe_multi)

@@ -521,6 +521,10 @@ bpf_prog_offload_remove_insns(struct bpf_verifier_env *env, u32 off, u32 cnt);

int check_ptr_off_reg(struct bpf_verifier_env *env,
		      const struct bpf_reg_state *reg, int regno);
int check_func_arg_reg_off(struct bpf_verifier_env *env,
			   const struct bpf_reg_state *reg, int regno,
			   enum bpf_arg_type arg_type,
			   bool is_release_func);
int check_kfunc_mem_size_reg(struct bpf_verifier_env *env, struct bpf_reg_state *reg,
			     u32 regno);
int check_mem_reg(struct bpf_verifier_env *env, struct bpf_reg_state *reg,

@@ -68,3 +68,28 @@

#define __nocfi		__attribute__((__no_sanitize__("cfi")))
#define __cficanonical	__attribute__((__cfi_canonical_jump_table__))

/*
 * Turn individual warnings and errors on and off locally, depending
 * on version.
 */
#define __diag_clang(version, severity, s) \
	__diag_clang_ ## version(__diag_clang_ ## severity s)

/* Severity used in pragma directives */
#define __diag_clang_ignore	ignored
#define __diag_clang_warn	warning
#define __diag_clang_error	error

#define __diag_str1(s)		#s
#define __diag_str(s)		__diag_str1(s)
#define __diag(s)		_Pragma(__diag_str(clang diagnostic s))

#if CONFIG_CLANG_VERSION >= 110000
#define __diag_clang_11(s)	__diag(s)
#else
#define __diag_clang_11(s)
#endif

#define __diag_ignore_all(option, comment) \
	__diag_clang(11, ignore, option)

@@ -151,6 +151,9 @@
#define __diag_GCC_8(s)
#endif

#define __diag_ignore_all(option, comment) \
	__diag_GCC(8, ignore, option)

/*
 * Prior to 9.1, -Wno-alloc-size-larger-than (and therefore the "alloc_size"
 * attribute) do not work, and must be disabled.

@@ -4,6 +4,13 @@

#ifndef __ASSEMBLY__

#if defined(CONFIG_DEBUG_INFO_BTF) && defined(CONFIG_PAHOLE_HAS_BTF_TAG) && \
	__has_attribute(btf_type_tag)
# define BTF_TYPE_TAG(value) __attribute__((btf_type_tag(#value)))
#else
# define BTF_TYPE_TAG(value) /* nothing */
#endif

#ifdef __CHECKER__
/* address spaces */
# define __kernel	__attribute__((address_space(0)))
@@ -31,14 +38,11 @@ static inline void __chk_io_ptr(const volatile void __iomem *ptr) { }
# define __kernel
# ifdef STRUCTLEAK_PLUGIN
#  define __user	__attribute__((user))
# elif defined(CONFIG_DEBUG_INFO_BTF) && defined(CONFIG_PAHOLE_HAS_BTF_TAG) && \
	__has_attribute(btf_type_tag)
#  define __user	__attribute__((btf_type_tag("user")))
# else
#  define __user
#  define __user	BTF_TYPE_TAG(user)
# endif
# define __iomem
# define __percpu
# define __percpu	BTF_TYPE_TAG(percpu)
# define __rcu
# define __chk_user_ptr(x)	(void)0
# define __chk_io_ptr(x)	(void)0
@@ -371,4 +375,8 @@ struct ftrace_likely_data {
#define __diag_error(compiler, version, option, comment) \
	__diag_ ## compiler(version, error, option)

#ifndef __diag_ignore_all
#define __diag_ignore_all(option, comment)
#endif

#endif /* __LINUX_COMPILER_TYPES_H */

@@ -566,6 +566,7 @@ struct bpf_prog {
				gpl_compatible:1, /* Is filter GPL compatible? */
				cb_access:1,	/* Is control block accessed? */
				dst_needed:1,	/* Do we need dst entry? */
				blinding_requested:1, /* needs constant blinding */
				blinded:1,	/* Was blinded */
				is_func:1,	/* program is a bpf function */
				kprobe_override:1, /* Do we override a kprobe? */
@@ -573,7 +574,7 @@ struct bpf_prog {
				enforce_expected_attach_type:1, /* Enforce expected_attach_type checking at attach time */
				call_get_stack:1, /* Do we call bpf_get_stack() or bpf_get_stackid() */
				call_get_func_ip:1, /* Do we call get_func_ip() */
				delivery_time_access:1; /* Accessed __sk_buff->delivery_time_type */
				tstamp_type_access:1; /* Accessed __sk_buff->tstamp_type */
	enum bpf_prog_type	type;		/* Type of BPF program */
	enum bpf_attach_type	expected_attach_type; /* For some prog types */
	u32			len;		/* Number of filter blocks */

include/linux/fprobe.h (new file, 105 lines)
@@ -0,0 +1,105 @@
/* SPDX-License-Identifier: GPL-2.0 */
/* Simple ftrace probe wrapper */
#ifndef _LINUX_FPROBE_H
#define _LINUX_FPROBE_H

#include <linux/compiler.h>
#include <linux/ftrace.h>
#include <linux/rethook.h>

/**
 * struct fprobe - ftrace based probe.
 * @ops: The ftrace_ops.
 * @nmissed: The counter for missing events.
 * @flags: The status flag.
 * @rethook: The rethook data structure. (internal data)
 * @entry_handler: The callback function for function entry.
 * @exit_handler: The callback function for function exit.
 */
struct fprobe {
#ifdef CONFIG_FUNCTION_TRACER
	/*
	 * If CONFIG_FUNCTION_TRACER is not set, CONFIG_FPROBE is disabled too.
	 * But user of fprobe may keep embedding the struct fprobe on their own
	 * code. To avoid build error, this will keep the fprobe data structure
	 * defined here, but remove ftrace_ops data structure.
	 */
	struct ftrace_ops	ops;
#endif
	unsigned long		nmissed;
	unsigned int		flags;
	struct rethook		*rethook;

	void (*entry_handler)(struct fprobe *fp, unsigned long entry_ip, struct pt_regs *regs);
	void (*exit_handler)(struct fprobe *fp, unsigned long entry_ip, struct pt_regs *regs);
};

/* This fprobe is soft-disabled. */
#define FPROBE_FL_DISABLED	1

/*
 * This fprobe handler will be shared with kprobes.
 * This flag must be set before registering.
 */
#define FPROBE_FL_KPROBE_SHARED	2

static inline bool fprobe_disabled(struct fprobe *fp)
{
	return (fp) ? fp->flags & FPROBE_FL_DISABLED : false;
}

static inline bool fprobe_shared_with_kprobes(struct fprobe *fp)
{
	return (fp) ? fp->flags & FPROBE_FL_KPROBE_SHARED : false;
}

#ifdef CONFIG_FPROBE
int register_fprobe(struct fprobe *fp, const char *filter, const char *notfilter);
int register_fprobe_ips(struct fprobe *fp, unsigned long *addrs, int num);
int register_fprobe_syms(struct fprobe *fp, const char **syms, int num);
int unregister_fprobe(struct fprobe *fp);
#else
static inline int register_fprobe(struct fprobe *fp, const char *filter, const char *notfilter)
{
	return -EOPNOTSUPP;
}
static inline int register_fprobe_ips(struct fprobe *fp, unsigned long *addrs, int num)
{
	return -EOPNOTSUPP;
}
static inline int register_fprobe_syms(struct fprobe *fp, const char **syms, int num)
{
	return -EOPNOTSUPP;
}
static inline int unregister_fprobe(struct fprobe *fp)
{
	return -EOPNOTSUPP;
}
#endif

/**
 * disable_fprobe() - Disable fprobe
 * @fp: The fprobe to be disabled.
 *
 * This will soft-disable @fp. Note that this doesn't remove the ftrace
 * hooks from the function entry.
 */
static inline void disable_fprobe(struct fprobe *fp)
{
	if (fp)
		fp->flags |= FPROBE_FL_DISABLED;
}

/**
 * enable_fprobe() - Enable fprobe
 * @fp: The fprobe to be enabled.
 *
 * This will soft-enable @fp.
 */
static inline void enable_fprobe(struct fprobe *fp)
{
	if (fp)
		fp->flags &= ~FPROBE_FL_DISABLED;
}

#endif

@@ -512,6 +512,8 @@ struct dyn_ftrace {

int ftrace_set_filter_ip(struct ftrace_ops *ops, unsigned long ip,
			 int remove, int reset);
int ftrace_set_filter_ips(struct ftrace_ops *ops, unsigned long *ips,
			  unsigned int cnt, int remove, int reset);
int ftrace_set_filter(struct ftrace_ops *ops, unsigned char *buf,
		      int len, int reset);
int ftrace_set_notrace(struct ftrace_ops *ops, unsigned char *buf,
@@ -802,6 +804,7 @@ static inline unsigned long ftrace_location(unsigned long ip)
#define ftrace_regex_open(ops, flag, inod, file) ({ -ENODEV; })
#define ftrace_set_early_filter(ops, buf, enable) do { } while (0)
#define ftrace_set_filter_ip(ops, ip, remove, reset) ({ -ENODEV; })
#define ftrace_set_filter_ips(ops, ips, cnt, remove, reset) ({ -ENODEV; })
#define ftrace_set_filter(ops, buf, len, reset) ({ -ENODEV; })
#define ftrace_set_notrace(ops, buf, len, reset) ({ -ENODEV; })
#define ftrace_free_filter(ops) do { } while (0)

@@ -427,6 +427,9 @@ static inline struct kprobe *kprobe_running(void)
{
	return NULL;
}
#define kprobe_busy_begin()	do {} while (0)
#define kprobe_busy_end()	do {} while (0)

static inline int register_kprobe(struct kprobe *p)
{
	return -EOPNOTSUPP;

include/linux/rethook.h (new file, 100 lines)
@@ -0,0 +1,100 @@
/* SPDX-License-Identifier: GPL-2.0 */
/*
 * Return hooking with list-based shadow stack.
 */
#ifndef _LINUX_RETHOOK_H
#define _LINUX_RETHOOK_H

#include <linux/compiler.h>
#include <linux/freelist.h>
#include <linux/kallsyms.h>
#include <linux/llist.h>
#include <linux/rcupdate.h>
#include <linux/refcount.h>

struct rethook_node;

typedef void (*rethook_handler_t) (struct rethook_node *, void *, struct pt_regs *);

/**
 * struct rethook - The rethook management data structure.
 * @data: The user-defined data storage.
 * @handler: The user-defined return hook handler.
 * @pool: The pool of struct rethook_node.
 * @ref: The reference counter.
 * @rcu: The rcu_head for deferred freeing.
 *
 * Don't embed to another data structure, because this is a self-destructive
 * data structure when all rethook_node are freed.
 */
struct rethook {
	void			*data;
	rethook_handler_t	handler;
	struct freelist_head	pool;
	refcount_t		ref;
	struct rcu_head		rcu;
};

/**
 * struct rethook_node - The rethook shadow-stack entry node.
 * @freelist: The freelist, linked to struct rethook::pool.
 * @rcu: The rcu_head for deferred freeing.
 * @llist: The llist, linked to a struct task_struct::rethooks.
 * @rethook: The pointer to the struct rethook.
 * @ret_addr: The storage for the real return address.
 * @frame: The storage for the frame pointer.
 *
 * You can embed this to your extended data structure to store any data
 * on each entry of the shadow stack.
 */
struct rethook_node {
	union {
		struct freelist_node	freelist;
		struct rcu_head		rcu;
	};
	struct llist_node	llist;
	struct rethook		*rethook;
	unsigned long		ret_addr;
	unsigned long		frame;
};

struct rethook *rethook_alloc(void *data, rethook_handler_t handler);
void rethook_free(struct rethook *rh);
void rethook_add_node(struct rethook *rh, struct rethook_node *node);
struct rethook_node *rethook_try_get(struct rethook *rh);
void rethook_recycle(struct rethook_node *node);
void rethook_hook(struct rethook_node *node, struct pt_regs *regs, bool mcount);
unsigned long rethook_find_ret_addr(struct task_struct *tsk, unsigned long frame,
				    struct llist_node **cur);

/* Arch dependent code must implement arch_* and trampoline code */
void arch_rethook_prepare(struct rethook_node *node, struct pt_regs *regs, bool mcount);
void arch_rethook_trampoline(void);

/**
 * is_rethook_trampoline() - Check whether the address is rethook trampoline
 * @addr: The address to be checked
 *
 * Return true if the @addr is the rethook trampoline address.
 */
static inline bool is_rethook_trampoline(unsigned long addr)
{
	return addr == (unsigned long)dereference_symbol_descriptor(arch_rethook_trampoline);
}

/* If the architecture needs to fixup the return address, implement it. */
void arch_rethook_fixup_return(struct pt_regs *regs,
			       unsigned long correct_ret_addr);

/* Generic trampoline handler, arch code must prepare asm stub */
unsigned long rethook_trampoline_handler(struct pt_regs *regs,
					 unsigned long frame);

#ifdef CONFIG_RETHOOK
void rethook_flush_task(struct task_struct *tk);
#else
#define rethook_flush_task(tsk)	do { } while (0)
#endif

#endif

@@ -1481,6 +1481,9 @@ struct task_struct {
#ifdef CONFIG_KRETPROBES
	struct llist_head		kretprobe_instances;
#endif
#ifdef CONFIG_RETHOOK
	struct llist_head		rethooks;
#endif

#ifdef CONFIG_ARCH_HAS_PARANOID_L1D_FLUSH
	/*

@@ -992,10 +992,10 @@ struct sk_buff {
	__u8			csum_complete_sw:1;
	__u8			csum_level:2;
	__u8			dst_pending_confirm:1;
	__u8			mono_delivery_time:1;
	__u8			mono_delivery_time:1;	/* See SKB_MONO_DELIVERY_TIME_MASK */
#ifdef CONFIG_NET_CLS_ACT
	__u8			tc_skip_classify:1;
	__u8			tc_at_ingress:1;
	__u8			tc_at_ingress:1;	/* See TC_AT_INGRESS_MASK */
#endif
#ifdef CONFIG_IPV6_NDISC_NODETYPE
	__u8			ndisc_nodetype:2;
@@ -1094,7 +1094,9 @@ struct sk_buff {
#endif
#define PKT_TYPE_OFFSET		offsetof(struct sk_buff, __pkt_type_offset)

/* if you move pkt_vlan_present around you also must adapt these constants */
/* if you move pkt_vlan_present, tc_at_ingress, or mono_delivery_time
 * around, you also must adapt these constants.
 */
#ifdef __BIG_ENDIAN_BITFIELD
#define PKT_VLAN_PRESENT_BIT	7
#define TC_AT_INGRESS_MASK		(1 << 0)
@@ -1105,8 +1107,6 @@ struct sk_buff {
#define SKB_MONO_DELIVERY_TIME_MASK	(1 << 5)
#endif
#define PKT_VLAN_PRESENT_OFFSET	offsetof(struct sk_buff, __pkt_vlan_present_offset)
#define TC_AT_INGRESS_OFFSET	offsetof(struct sk_buff, __pkt_vlan_present_offset)
#define SKB_MONO_DELIVERY_TIME_OFFSET	offsetof(struct sk_buff, __pkt_vlan_present_offset)

#ifdef __KERNEL__
/*

@@ -304,21 +304,16 @@ static inline void sock_drop(struct sock *sk, struct sk_buff *skb)
	kfree_skb(skb);
}

static inline void drop_sk_msg(struct sk_psock *psock, struct sk_msg *msg)
{
	if (msg->skb)
		sock_drop(psock->sk, msg->skb);
	kfree(msg);
}

static inline void sk_psock_queue_msg(struct sk_psock *psock,
				      struct sk_msg *msg)
{
	spin_lock_bh(&psock->ingress_lock);
	if (sk_psock_test_state(psock, SK_PSOCK_TX_ENABLED))
		list_add_tail(&msg->list, &psock->ingress_msg);
	else
		drop_sk_msg(psock, msg);
	else {
		sk_msg_free(psock->sk, msg);
		kfree(msg);
	}
	spin_unlock_bh(&psock->ingress_lock);
}

@@ -6,7 +6,7 @@

void sort_r(void *base, size_t num, size_t size,
	    cmp_r_func_t cmp_func,
	    swap_func_t swap_func,
	    swap_r_func_t swap_func,
	    const void *priv);

void sort(void *base, size_t num, size_t size,

@@ -15,6 +15,7 @@ struct array_buffer;
struct tracer;
struct dentry;
struct bpf_prog;
union bpf_attr;

const char *trace_print_flags_seq(struct trace_seq *p, const char *delim,
				  unsigned long flags,
@@ -738,6 +739,7 @@ void bpf_put_raw_tracepoint(struct bpf_raw_event_map *btp);
int bpf_get_perf_event_info(const struct perf_event *event, u32 *prog_id,
			    u32 *fd_type, const char **buf,
			    u64 *probe_offset, u64 *probe_addr);
int bpf_kprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *prog);
#else
static inline unsigned int trace_call_bpf(struct trace_event_call *call, void *ctx)
{
@@ -779,6 +781,11 @@ static inline int bpf_get_perf_event_info(const struct perf_event *event,
{
	return -EOPNOTSUPP;
}
static inline int
bpf_kprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
{
	return -EOPNOTSUPP;
}
#endif

enum {

@@ -226,6 +226,7 @@ struct callback_head {
typedef void (*rcu_callback_t)(struct rcu_head *head);
typedef void (*call_rcu_func_t)(struct rcu_head *head, rcu_callback_t func);

typedef void (*swap_r_func_t)(void *a, void *b, int size, const void *priv);
typedef void (*swap_func_t)(void *a, void *b, int size);

typedef int (*cmp_r_func_t)(const void *a, const void *b, const void *priv);

@@ -343,6 +343,20 @@ out:
	__xdp_release_frame(xdpf->data, mem);
}

static __always_inline unsigned int xdp_get_frame_len(struct xdp_frame *xdpf)
{
	struct skb_shared_info *sinfo;
	unsigned int len = xdpf->len;

	if (likely(!xdp_frame_has_frags(xdpf)))
		goto out;

	sinfo = xdp_get_shared_info_from_frame(xdpf);
	len += sinfo->xdp_frags_size;
out:
	return len;
}

int __xdp_rxq_info_reg(struct xdp_rxq_info *xdp_rxq,
		       struct net_device *dev, u32 queue_index,
		       unsigned int napi_id, u32 frag_size);

@@ -997,6 +997,7 @@ enum bpf_attach_type {
	BPF_SK_REUSEPORT_SELECT,
	BPF_SK_REUSEPORT_SELECT_OR_MIGRATE,
	BPF_PERF_EVENT,
	BPF_TRACE_KPROBE_MULTI,
	__MAX_BPF_ATTACH_TYPE
};

@@ -1011,6 +1012,7 @@ enum bpf_link_type {
	BPF_LINK_TYPE_NETNS = 5,
	BPF_LINK_TYPE_XDP = 6,
	BPF_LINK_TYPE_PERF_EVENT = 7,
	BPF_LINK_TYPE_KPROBE_MULTI = 8,

	MAX_BPF_LINK_TYPE,
};
@@ -1118,6 +1120,11 @@ enum bpf_link_type {
 */
#define BPF_F_XDP_HAS_FRAGS	(1U << 5)

/* link_create.kprobe_multi.flags used in LINK_CREATE command for
 * BPF_TRACE_KPROBE_MULTI attach type to create return probe.
 */
#define BPF_F_KPROBE_MULTI_RETURN	(1U << 0)

/* When BPF ldimm64's insn[0].src_reg != 0 then this can have
 * the following extensions:
 *
@@ -1232,6 +1239,8 @@ enum {

/* If set, run the test on the cpu specified by bpf_attr.test.cpu */
#define BPF_F_TEST_RUN_ON_CPU	(1U << 0)
/* If set, XDP frames will be transmitted after processing */
#define BPF_F_TEST_XDP_LIVE_FRAMES	(1U << 1)

/* type for BPF_ENABLE_STATS */
enum bpf_stats_type {
@@ -1393,6 +1402,7 @@ union bpf_attr {
		__aligned_u64	ctx_out;
		__u32		flags;
		__u32		cpu;
		__u32		batch_size;
	} test;

	struct { /* anonymous struct used by BPF_*_GET_*_ID */
@@ -1472,6 +1482,13 @@ union bpf_attr {
			 */
			__u64		bpf_cookie;
		} perf_event;
		struct {
			__u32		flags;
			__u32		cnt;
			__aligned_u64	syms;
			__aligned_u64	addrs;
			__aligned_u64	cookies;
		} kprobe_multi;
	};
} link_create;

@@ -2299,8 +2316,8 @@ union bpf_attr {
 *	Return
 *		The return value depends on the result of the test, and can be:
 *
 *		* 0, if current task belongs to the cgroup2.
 *		* 1, if current task does not belong to the cgroup2.
 *		* 1, if current task belongs to the cgroup2.
 *		* 0, if current task does not belong to the cgroup2.
 *		* A negative error code, if an error occurred.
 *
 * long bpf_skb_change_tail(struct sk_buff *skb, u32 len, u64 flags)
@@ -2992,8 +3009,8 @@ union bpf_attr {
 *
 *			# sysctl kernel.perf_event_max_stack=<new value>
 *	Return
 *		A non-negative value equal to or less than *size* on success,
 *		or a negative error in case of failure.
 *		The non-negative copied *buf* length equal to or less than
 *		*size* on success, or a negative error in case of failure.
 *
 * long bpf_skb_load_bytes_relative(const void *skb, u32 offset, void *to, u32 len, u32 start_header)
 *	Description
@@ -4299,8 +4316,8 @@ union bpf_attr {
 *
 *			# sysctl kernel.perf_event_max_stack=<new value>
 *	Return
 *		A non-negative value equal to or less than *size* on success,
 *		or a negative error in case of failure.
 *		The non-negative copied *buf* length equal to or less than
 *		*size* on success, or a negative error in case of failure.
 *
 * long bpf_load_hdr_opt(struct bpf_sock_ops *skops, void *searchby_res, u32 len, u64 flags)
 *	Description
@@ -5087,23 +5104,22 @@ union bpf_attr {
 *		0 on success, or a negative error in case of failure. On error
 *		*dst* buffer is zeroed out.
 *
 * long bpf_skb_set_delivery_time(struct sk_buff *skb, u64 dtime, u32 dtime_type)
 * long bpf_skb_set_tstamp(struct sk_buff *skb, u64 tstamp, u32 tstamp_type)
 *	Description
 *		Set a *dtime* (delivery time) to the __sk_buff->tstamp and also
 *		change the __sk_buff->delivery_time_type to *dtime_type*.
 *		Change the __sk_buff->tstamp_type to *tstamp_type*
 *		and set *tstamp* to the __sk_buff->tstamp together.
 *
 *		When setting a delivery time (non zero *dtime*) to
 *		__sk_buff->tstamp, only BPF_SKB_DELIVERY_TIME_MONO *dtime_type*
 *		is supported. It is the only delivery_time_type that will be
 *		kept after bpf_redirect_*().
 *
 *		If there is no need to change the __sk_buff->delivery_time_type,
 *		the delivery time can be directly written to __sk_buff->tstamp
 *		If there is no need to change the __sk_buff->tstamp_type,
 *		the tstamp value can be directly written to __sk_buff->tstamp
 *		instead.
 *
 *		*dtime* 0 and *dtime_type* BPF_SKB_DELIVERY_TIME_NONE
 *		can be used to clear any delivery time stored in
 *		__sk_buff->tstamp.
 *		BPF_SKB_TSTAMP_DELIVERY_MONO is the only tstamp that
 *		will be kept during bpf_redirect_*(). A non zero
 *		*tstamp* must be used with the BPF_SKB_TSTAMP_DELIVERY_MONO
 *		*tstamp_type*.
 *
 *		A BPF_SKB_TSTAMP_UNSPEC *tstamp_type* can only be used
 *		with a zero *tstamp*.
 *
 *		Only IPv4 and IPv6 skb->protocol are supported.
 *
@@ -5116,7 +5132,17 @@
 *	Return
 *		0 on success.
 *		**-EINVAL** for invalid input
 *		**-EOPNOTSUPP** for unsupported delivery_time_type and protocol
 *		**-EOPNOTSUPP** for unsupported protocol
 *
 * long bpf_ima_file_hash(struct file *file, void *dst, u32 size)
 *	Description
 *		Returns a calculated IMA hash of the *file*.
 *		If the hash is larger than *size*, then only *size*
 *		bytes will be copied to *dst*
 *	Return
 *		The **hash_algo** is returned on success,
 *		**-EOPNOTSUP** if the hash calculation failed or **-EINVAL** if
 *		invalid arguments are passed.
 */
#define __BPF_FUNC_MAPPER(FN)		\
	FN(unspec),			\
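As a usage note for the renamed helper above, a tc program could stamp a packet
with a mono delivery time along these lines (a sketch; assumes the usual
vmlinux.h/bpf_helpers.h setup used by the selftests):

	SEC("tc")
	int stamp_mono(struct __sk_buff *skb)
	{
		/* only the mono tstamp_type survives bpf_redirect_*() */
		__u64 now = bpf_ktime_get_ns();

		bpf_skb_set_tstamp(skb, now, BPF_SKB_TSTAMP_DELIVERY_MONO);
		return 0;	/* TC_ACT_OK */
	}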

@@ -5311,7 +5337,8 @@
	FN(xdp_load_bytes),		\
	FN(xdp_store_bytes),		\
	FN(copy_from_user_task),	\
	FN(skb_set_delivery_time),	\
	FN(skb_set_tstamp),		\
	FN(ima_file_hash),		\
	/* */

/* integer value in 'imm' field of BPF_CALL instruction selects which helper
@@ -5502,9 +5529,12 @@ union {				\
} __attribute__((aligned(8)))

enum {
	BPF_SKB_DELIVERY_TIME_NONE,
	BPF_SKB_DELIVERY_TIME_UNSPEC,
	BPF_SKB_DELIVERY_TIME_MONO,
	BPF_SKB_TSTAMP_UNSPEC,
	BPF_SKB_TSTAMP_DELIVERY_MONO,	/* tstamp has mono delivery time */
	/* For any BPF_SKB_TSTAMP_* that the bpf prog cannot handle,
	 * the bpf prog should handle it like BPF_SKB_TSTAMP_UNSPEC
	 * and try to deduce it by ingress, egress or skb->sk->sk_clockid.
	 */
};

/* user accessible mirror of in-kernel sk_buff.
@@ -5547,7 +5577,7 @@ struct __sk_buff {
	__u32 gso_segs;
	__bpf_md_ptr(struct bpf_sock *, sk);
	__u32 gso_size;
	__u8  delivery_time_type;
	__u8  tstamp_type;
	__u32 :24;		/* Padding, future use. */
	__u64 hwtstamp;
};
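Referring back to the bpf_ima_file_hash() documentation above, a sleepable LSM
program could use the helper roughly as follows (a sketch; the program name is
illustrative):

	SEC("lsm.s/file_open")		/* sleepable, as the helper may sleep */
	int BPF_PROG(measure_file_open, struct file *file)
	{
		u8 digest[32];
		long algo;

		algo = bpf_ima_file_hash(file, digest, sizeof(digest));
		if (algo < 0)
			return 0;	/* hash unavailable; allow the open */
		/* digest[] holds the IMA hash, algo identifies the algorithm */
		return 0;
	}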

@@ -30,6 +30,7 @@ config BPF_SYSCALL
	select TASKS_TRACE_RCU
	select BINARY_PRINTF
	select NET_SOCK_MSG if NET
	select PAGE_POOL if NET
	default n
	help
	  Enable the bpf() system call that allows to manipulate BPF programs

@@ -136,7 +136,7 @@ static int bpf_fd_inode_storage_update_elem(struct bpf_map *map, void *key,

	sdata = bpf_local_storage_update(f->f_inode,
					 (struct bpf_local_storage_map *)map,
					 value, map_flags);
					 value, map_flags, GFP_ATOMIC);
	fput(f);
	return PTR_ERR_OR_ZERO(sdata);
}
@@ -169,8 +169,9 @@ static int bpf_fd_inode_storage_delete_elem(struct bpf_map *map, void *key)
	return err;
}

BPF_CALL_4(bpf_inode_storage_get, struct bpf_map *, map, struct inode *, inode,
	   void *, value, u64, flags)
/* *gfp_flags* is a hidden argument provided by the verifier */
BPF_CALL_5(bpf_inode_storage_get, struct bpf_map *, map, struct inode *, inode,
	   void *, value, u64, flags, gfp_t, gfp_flags)
{
	struct bpf_local_storage_data *sdata;

@@ -196,7 +197,7 @@ BPF_CALL_4(bpf_inode_storage_get, struct bpf_map *, map, struct inode *, inode,
	if (flags & BPF_LOCAL_STORAGE_GET_F_CREATE) {
		sdata = bpf_local_storage_update(
			inode, (struct bpf_local_storage_map *)map, value,
			BPF_NOEXIST);
			BPF_NOEXIST, gfp_flags);
		return IS_ERR(sdata) ? (unsigned long)NULL :
			(unsigned long)sdata->data;
	}

@@ -63,7 +63,7 @@ static bool selem_linked_to_map(const struct bpf_local_storage_elem *selem)

struct bpf_local_storage_elem *
bpf_selem_alloc(struct bpf_local_storage_map *smap, void *owner,
		void *value, bool charge_mem)
		void *value, bool charge_mem, gfp_t gfp_flags)
{
	struct bpf_local_storage_elem *selem;

@@ -71,7 +71,7 @@ bpf_selem_alloc(struct bpf_local_storage_map *smap, void *owner,
		return NULL;

	selem = bpf_map_kzalloc(&smap->map, smap->elem_size,
				GFP_ATOMIC | __GFP_NOWARN);
				gfp_flags | __GFP_NOWARN);
	if (selem) {
		if (value)
			memcpy(SDATA(selem)->data, value, smap->map.value_size);
@@ -282,7 +282,8 @@ static int check_flags(const struct bpf_local_storage_data *old_sdata,

int bpf_local_storage_alloc(void *owner,
			    struct bpf_local_storage_map *smap,
			    struct bpf_local_storage_elem *first_selem)
			    struct bpf_local_storage_elem *first_selem,
			    gfp_t gfp_flags)
{
	struct bpf_local_storage *prev_storage, *storage;
	struct bpf_local_storage **owner_storage_ptr;
@@ -293,7 +294,7 @@ int bpf_local_storage_alloc(void *owner,
		return err;

	storage = bpf_map_kzalloc(&smap->map, sizeof(*storage),
				  GFP_ATOMIC | __GFP_NOWARN);
				  gfp_flags | __GFP_NOWARN);
	if (!storage) {
		err = -ENOMEM;
		goto uncharge;
@@ -350,10 +351,10 @@ uncharge:
 */
struct bpf_local_storage_data *
bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap,
			 void *value, u64 map_flags)
			 void *value, u64 map_flags, gfp_t gfp_flags)
{
	struct bpf_local_storage_data *old_sdata = NULL;
	struct bpf_local_storage_elem *selem;
	struct bpf_local_storage_elem *selem = NULL;
	struct bpf_local_storage *local_storage;
	unsigned long flags;
	int err;
@@ -365,6 +366,9 @@ bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap,
	     !map_value_has_spin_lock(&smap->map)))
		return ERR_PTR(-EINVAL);

	if (gfp_flags == GFP_KERNEL && (map_flags & ~BPF_F_LOCK) != BPF_NOEXIST)
		return ERR_PTR(-EINVAL);

	local_storage = rcu_dereference_check(*owner_storage(smap, owner),
					      bpf_rcu_lock_held());
	if (!local_storage || hlist_empty(&local_storage->list)) {
@@ -373,11 +377,11 @@ bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap,
		if (err)
			return ERR_PTR(err);

		selem = bpf_selem_alloc(smap, owner, value, true);
		selem = bpf_selem_alloc(smap, owner, value, true, gfp_flags);
		if (!selem)
			return ERR_PTR(-ENOMEM);

		err = bpf_local_storage_alloc(owner, smap, selem);
		err = bpf_local_storage_alloc(owner, smap, selem, gfp_flags);
		if (err) {
			kfree(selem);
			mem_uncharge(smap, owner, smap->elem_size);
@@ -404,6 +408,12 @@ bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap,
		}
	}

	if (gfp_flags == GFP_KERNEL) {
		selem = bpf_selem_alloc(smap, owner, value, true, gfp_flags);
		if (!selem)
			return ERR_PTR(-ENOMEM);
	}

	raw_spin_lock_irqsave(&local_storage->lock, flags);

	/* Recheck local_storage->list under local_storage->lock */
@@ -429,19 +439,21 @@ bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap,
		goto unlock;
	}

	/* local_storage->lock is held. Hence, we are sure
	 * we can unlink and uncharge the old_sdata successfully
	 * later. Hence, instead of charging the new selem now
	 * and then uncharge the old selem later (which may cause
	 * a potential but unnecessary charge failure), avoid taking
	 * a charge at all here (the "!old_sdata" check) and the
	 * old_sdata will not be uncharged later during
	 * bpf_selem_unlink_storage_nolock().
|
||||
*/
|
||||
selem = bpf_selem_alloc(smap, owner, value, !old_sdata);
|
||||
if (!selem) {
|
||||
err = -ENOMEM;
|
||||
goto unlock_err;
|
||||
if (gfp_flags != GFP_KERNEL) {
|
||||
/* local_storage->lock is held. Hence, we are sure
|
||||
* we can unlink and uncharge the old_sdata successfully
|
||||
* later. Hence, instead of charging the new selem now
|
||||
* and then uncharge the old selem later (which may cause
|
||||
* a potential but unnecessary charge failure), avoid taking
|
||||
* a charge at all here (the "!old_sdata" check) and the
|
||||
* old_sdata will not be uncharged later during
|
||||
* bpf_selem_unlink_storage_nolock().
|
||||
*/
|
||||
selem = bpf_selem_alloc(smap, owner, value, !old_sdata, gfp_flags);
|
||||
if (!selem) {
|
||||
err = -ENOMEM;
|
||||
goto unlock_err;
|
||||
}
|
||||
}
|
||||
|
||||
/* First, link the new selem to the map */
|
||||
@ -463,6 +475,10 @@ unlock:
|
||||
|
||||
unlock_err:
|
||||
raw_spin_unlock_irqrestore(&local_storage->lock, flags);
|
||||
if (selem) {
|
||||
mem_uncharge(smap, owner, smap->elem_size);
|
||||
kfree(selem);
|
||||
}
|
||||
return ERR_PTR(err);
|
||||
}
|
||||
|
||||
|
@ -99,6 +99,24 @@ static const struct bpf_func_proto bpf_ima_inode_hash_proto = {
|
||||
.allowed = bpf_ima_inode_hash_allowed,
|
||||
};
|
||||
|
||||
BPF_CALL_3(bpf_ima_file_hash, struct file *, file, void *, dst, u32, size)
{
	return ima_file_hash(file, dst, size);
}

BTF_ID_LIST_SINGLE(bpf_ima_file_hash_btf_ids, struct, file)

static const struct bpf_func_proto bpf_ima_file_hash_proto = {
	.func = bpf_ima_file_hash,
	.gpl_only = false,
	.ret_type = RET_INTEGER,
	.arg1_type = ARG_PTR_TO_BTF_ID,
	.arg1_btf_id = &bpf_ima_file_hash_btf_ids[0],
	.arg2_type = ARG_PTR_TO_UNINIT_MEM,
	.arg3_type = ARG_CONST_SIZE,
	.allowed = bpf_ima_inode_hash_allowed,
};
|
||||
|
||||
static const struct bpf_func_proto *
|
||||
bpf_lsm_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
|
||||
{
|
||||
@ -121,6 +139,8 @@ bpf_lsm_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
|
||||
return &bpf_bprm_opts_set_proto;
|
||||
case BPF_FUNC_ima_inode_hash:
|
||||
return prog->aux->sleepable ? &bpf_ima_inode_hash_proto : NULL;
|
||||
case BPF_FUNC_ima_file_hash:
|
||||
return prog->aux->sleepable ? &bpf_ima_file_hash_proto : NULL;
|
||||
default:
|
||||
return tracing_prog_func_proto(func_id, prog);
|
||||
}
|
||||
@ -167,6 +187,7 @@ BTF_ID(func, bpf_lsm_inode_setxattr)
|
||||
BTF_ID(func, bpf_lsm_inode_symlink)
|
||||
BTF_ID(func, bpf_lsm_inode_unlink)
|
||||
BTF_ID(func, bpf_lsm_kernel_module_request)
|
||||
BTF_ID(func, bpf_lsm_kernel_read_file)
|
||||
BTF_ID(func, bpf_lsm_kernfs_init_security)
|
||||
|
||||
#ifdef CONFIG_KEYS
|
||||
|
@ -174,7 +174,8 @@ static int bpf_pid_task_storage_update_elem(struct bpf_map *map, void *key,
|
||||
|
||||
bpf_task_storage_lock();
|
||||
sdata = bpf_local_storage_update(
|
||||
task, (struct bpf_local_storage_map *)map, value, map_flags);
|
||||
task, (struct bpf_local_storage_map *)map, value, map_flags,
|
||||
GFP_ATOMIC);
|
||||
bpf_task_storage_unlock();
|
||||
|
||||
err = PTR_ERR_OR_ZERO(sdata);
|
||||
@ -226,8 +227,9 @@ out:
|
||||
return err;
|
||||
}
|
||||
|
||||
BPF_CALL_4(bpf_task_storage_get, struct bpf_map *, map, struct task_struct *,
|
||||
task, void *, value, u64, flags)
|
||||
/* *gfp_flags* is a hidden argument provided by the verifier */
|
||||
BPF_CALL_5(bpf_task_storage_get, struct bpf_map *, map, struct task_struct *,
|
||||
task, void *, value, u64, flags, gfp_t, gfp_flags)
|
||||
{
|
||||
struct bpf_local_storage_data *sdata;
|
||||
|
||||
@ -250,7 +252,7 @@ BPF_CALL_4(bpf_task_storage_get, struct bpf_map *, map, struct task_struct *,
|
||||
(flags & BPF_LOCAL_STORAGE_GET_F_CREATE))
|
||||
sdata = bpf_local_storage_update(
|
||||
task, (struct bpf_local_storage_map *)map, value,
|
||||
BPF_NOEXIST);
|
||||
BPF_NOEXIST, gfp_flags);
|
||||
|
||||
unlock:
|
||||
bpf_task_storage_unlock();
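To show what the GFP_KERNEL path above enables, here is a hedged sketch of a sleepable LSM program creating task storage on demand; with this series the verifier passes GFP_KERNEL as the hidden gfp argument for sleepable programs, so the _F_CREATE allocation may sleep. The map name, hook and value layout are assumptions.

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

struct {
	__uint(type, BPF_MAP_TYPE_TASK_STORAGE);
	__uint(map_flags, BPF_F_NO_PREALLOC);
	__type(key, int);
	__type(value, __u64);
} open_counts SEC(".maps");

SEC("lsm.s/file_open")
int BPF_PROG(count_opens, struct file *file)
{
	__u64 *val;

	val = bpf_task_storage_get(&open_counts, bpf_get_current_task_btf(),
				   0, BPF_LOCAL_STORAGE_GET_F_CREATE);
	if (val)
		__sync_fetch_and_add(val, 1);
	return 0;
}

char LICENSE[] SEC("license") = "GPL";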
|
||||
|
166
kernel/bpf/btf.c
@ -525,6 +525,50 @@ s32 btf_find_by_name_kind(const struct btf *btf, const char *name, u8 kind)
|
||||
return -ENOENT;
|
||||
}
|
||||
|
||||
static s32 bpf_find_btf_id(const char *name, u32 kind, struct btf **btf_p)
|
||||
{
|
||||
struct btf *btf;
|
||||
s32 ret;
|
||||
int id;
|
||||
|
||||
btf = bpf_get_btf_vmlinux();
|
||||
if (IS_ERR(btf))
|
||||
return PTR_ERR(btf);
|
||||
if (!btf)
|
||||
return -EINVAL;
|
||||
|
||||
ret = btf_find_by_name_kind(btf, name, kind);
|
||||
/* ret is never zero, since btf_find_by_name_kind returns
|
||||
* positive btf_id or negative error.
|
||||
*/
|
||||
if (ret > 0) {
|
||||
btf_get(btf);
|
||||
*btf_p = btf;
|
||||
return ret;
|
||||
}
|
||||
|
||||
/* If name is not found in vmlinux's BTF then search in module's BTFs */
|
||||
spin_lock_bh(&btf_idr_lock);
|
||||
idr_for_each_entry(&btf_idr, btf, id) {
|
||||
if (!btf_is_module(btf))
|
||||
continue;
|
||||
/* linear search could be slow hence unlock/lock
|
||||
* the IDR to avoid holding it for too long
|
||||
*/
|
||||
btf_get(btf);
|
||||
spin_unlock_bh(&btf_idr_lock);
|
||||
ret = btf_find_by_name_kind(btf, name, kind);
|
||||
if (ret > 0) {
|
||||
*btf_p = btf;
|
||||
return ret;
|
||||
}
|
||||
spin_lock_bh(&btf_idr_lock);
|
||||
btf_put(btf);
|
||||
}
|
||||
spin_unlock_bh(&btf_idr_lock);
|
||||
return ret;
|
||||
}
|
||||
|
||||
const struct btf_type *btf_type_skip_modifiers(const struct btf *btf,
|
||||
u32 id, u32 *res_id)
|
||||
{
|
||||
@ -4438,8 +4482,7 @@ static int btf_parse_hdr(struct btf_verifier_env *env)
|
||||
btf = env->btf;
|
||||
btf_data_size = btf->data_size;
|
||||
|
||||
if (btf_data_size <
|
||||
offsetof(struct btf_header, hdr_len) + sizeof(hdr->hdr_len)) {
|
||||
if (btf_data_size < offsetofend(struct btf_header, hdr_len)) {
|
||||
btf_verifier_log(env, "hdr_len not found");
|
||||
return -EINVAL;
|
||||
}
|
||||
@ -5057,6 +5100,8 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type,
|
||||
tag_value = __btf_name_by_offset(btf, t->name_off);
|
||||
if (strcmp(tag_value, "user") == 0)
|
||||
info->reg_type |= MEM_USER;
|
||||
if (strcmp(tag_value, "percpu") == 0)
|
||||
info->reg_type |= MEM_PERCPU;
|
||||
}
|
||||
|
||||
/* skip modifiers */
|
||||
@ -5285,12 +5330,16 @@ error:
|
||||
return -EACCES;
|
||||
}
|
||||
|
||||
/* check __user tag */
|
||||
/* check type tag */
|
||||
t = btf_type_by_id(btf, mtype->type);
|
||||
if (btf_type_is_type_tag(t)) {
|
||||
tag_value = __btf_name_by_offset(btf, t->name_off);
|
||||
/* check __user tag */
|
||||
if (strcmp(tag_value, "user") == 0)
|
||||
tmp_flag = MEM_USER;
|
||||
/* check __percpu tag */
|
||||
if (strcmp(tag_value, "percpu") == 0)
|
||||
tmp_flag = MEM_PERCPU;
|
||||
}
|
||||
|
||||
stype = btf_type_skip_modifiers(btf, mtype->type, &id);
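Related usage, sketched with the usual caveats: after this series percpu kernel pointers are tracked as PTR_TO_BTF_ID | MEM_PERCPU (the dedicated PTR_TO_PERCPU_BTF_ID type goes away in the verifier hunks further down), so a program still has to go through bpf_this_cpu_ptr()/bpf_per_cpu_ptr() before dereferencing them. The ksym and tracepoint below are arbitrary examples.

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

extern const struct rq runqueues __ksym;	/* a familiar percpu variable */

SEC("raw_tp/sched_switch")
int read_this_rq(void *ctx)
{
	const struct rq *rq;

	/* convert the percpu pointer before any field access */
	rq = (const struct rq *)bpf_this_cpu_ptr(&runqueues);
	bpf_printk("nr_running=%u", rq->nr_running);
	return 0;
}

char LICENSE[] SEC("license") = "GPL";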
|
||||
@ -5726,7 +5775,7 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env,
|
||||
const char *func_name, *ref_tname;
|
||||
const struct btf_type *t, *ref_t;
|
||||
const struct btf_param *args;
|
||||
int ref_regno = 0;
|
||||
int ref_regno = 0, ret;
|
||||
bool rel = false;
|
||||
|
||||
t = btf_type_by_id(btf, func_id);
|
||||
@ -5753,6 +5802,10 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env,
|
||||
return -EINVAL;
|
||||
}
|
||||
|
||||
/* Only kfunc can be release func */
|
||||
if (is_kfunc)
|
||||
rel = btf_kfunc_id_set_contains(btf, resolve_prog_type(env->prog),
|
||||
BTF_KFUNC_TYPE_RELEASE, func_id);
|
||||
/* check that BTF function arguments match actual types that the
|
||||
* verifier sees.
|
||||
*/
|
||||
@ -5776,6 +5829,11 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env,
|
||||
|
||||
ref_t = btf_type_skip_modifiers(btf, t->type, &ref_id);
|
||||
ref_tname = btf_name_by_offset(btf, ref_t->name_off);
|
||||
|
||||
ret = check_func_arg_reg_off(env, reg, regno, ARG_DONTCARE, rel);
|
||||
if (ret < 0)
|
||||
return ret;
|
||||
|
||||
if (btf_get_prog_ctx_type(log, btf, t,
|
||||
env->prog->type, i)) {
|
||||
/* If function expects ctx type in BTF check that caller
|
||||
@ -5787,8 +5845,6 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env,
|
||||
i, btf_type_str(t));
|
||||
return -EINVAL;
|
||||
}
|
||||
if (check_ptr_off_reg(env, reg, regno))
|
||||
return -EINVAL;
|
||||
} else if (is_kfunc && (reg->type == PTR_TO_BTF_ID ||
|
||||
(reg2btf_ids[base_type(reg->type)] && !type_flag(reg->type)))) {
|
||||
const struct btf_type *reg_ref_t;
|
||||
@ -5806,7 +5862,11 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env,
|
||||
if (reg->type == PTR_TO_BTF_ID) {
|
||||
reg_btf = reg->btf;
|
||||
reg_ref_id = reg->btf_id;
|
||||
/* Ensure only one argument is referenced PTR_TO_BTF_ID */
|
||||
/* Ensure only one argument is referenced
|
||||
* PTR_TO_BTF_ID, check_func_arg_reg_off relies
|
||||
* on only one referenced register being allowed
|
||||
* for kfuncs.
|
||||
*/
|
||||
if (reg->ref_obj_id) {
|
||||
if (ref_obj_id) {
|
||||
bpf_log(log, "verifier internal error: more than one arg with ref_obj_id R%d %u %u\n",
|
||||
@ -5888,18 +5948,15 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env,
|
||||
|
||||
/* Either both are set, or neither */
|
||||
WARN_ON_ONCE((ref_obj_id && !ref_regno) || (!ref_obj_id && ref_regno));
|
||||
if (is_kfunc) {
|
||||
rel = btf_kfunc_id_set_contains(btf, resolve_prog_type(env->prog),
|
||||
BTF_KFUNC_TYPE_RELEASE, func_id);
|
||||
/* We already made sure ref_obj_id is set only for one argument */
|
||||
if (rel && !ref_obj_id) {
|
||||
bpf_log(log, "release kernel function %s expects refcounted PTR_TO_BTF_ID\n",
|
||||
func_name);
|
||||
return -EINVAL;
|
||||
}
|
||||
/* Allow (!rel && ref_obj_id), so that passing such referenced PTR_TO_BTF_ID to
|
||||
* other kfuncs works
|
||||
*/
|
||||
/* We already made sure ref_obj_id is set only for one argument. We do
|
||||
* allow (!rel && ref_obj_id), so that passing such referenced
|
||||
* PTR_TO_BTF_ID to other kfuncs works. Note that rel is only true when
|
||||
* is_kfunc is true.
|
||||
*/
|
||||
if (rel && !ref_obj_id) {
|
||||
bpf_log(log, "release kernel function %s expects refcounted PTR_TO_BTF_ID\n",
|
||||
func_name);
|
||||
return -EINVAL;
|
||||
}
|
||||
/* returns argument register number > 0 in case of reference release kfunc */
|
||||
return rel ? ref_regno : 0;
|
||||
@ -6516,20 +6573,23 @@ struct module *btf_try_get_module(const struct btf *btf)
|
||||
return res;
|
||||
}
|
||||
|
||||
/* Returns struct btf corresponding to the struct module
|
||||
*
|
||||
* This function can return NULL or ERR_PTR. Note that caller must
|
||||
* release reference for struct btf iff btf_is_module is true.
|
||||
/* Returns struct btf corresponding to the struct module.
|
||||
* This function can return NULL or ERR_PTR.
|
||||
*/
|
||||
static struct btf *btf_get_module_btf(const struct module *module)
|
||||
{
|
||||
struct btf *btf = NULL;
|
||||
#ifdef CONFIG_DEBUG_INFO_BTF_MODULES
|
||||
struct btf_module *btf_mod, *tmp;
|
||||
#endif
|
||||
struct btf *btf = NULL;
|
||||
|
||||
if (!module) {
|
||||
btf = bpf_get_btf_vmlinux();
|
||||
if (!IS_ERR_OR_NULL(btf))
|
||||
btf_get(btf);
|
||||
return btf;
|
||||
}
|
||||
|
||||
if (!module)
|
||||
return bpf_get_btf_vmlinux();
|
||||
#ifdef CONFIG_DEBUG_INFO_BTF_MODULES
|
||||
mutex_lock(&btf_module_mutex);
|
||||
list_for_each_entry_safe(btf_mod, tmp, &btf_modules, list) {
|
||||
@ -6548,7 +6608,8 @@ static struct btf *btf_get_module_btf(const struct module *module)
|
||||
|
||||
BPF_CALL_4(bpf_btf_find_by_name_kind, char *, name, int, name_sz, u32, kind, int, flags)
|
||||
{
|
||||
struct btf *btf;
|
||||
struct btf *btf = NULL;
|
||||
int btf_obj_fd = 0;
|
||||
long ret;
|
||||
|
||||
if (flags)
|
||||
@ -6557,44 +6618,17 @@ BPF_CALL_4(bpf_btf_find_by_name_kind, char *, name, int, name_sz, u32, kind, int
|
||||
if (name_sz <= 1 || name[name_sz - 1])
|
||||
return -EINVAL;
|
||||
|
||||
btf = bpf_get_btf_vmlinux();
|
||||
if (IS_ERR(btf))
|
||||
return PTR_ERR(btf);
|
||||
|
||||
ret = btf_find_by_name_kind(btf, name, kind);
|
||||
/* ret is never zero, since btf_find_by_name_kind returns
|
||||
* positive btf_id or negative error.
|
||||
*/
|
||||
if (ret < 0) {
|
||||
struct btf *mod_btf;
|
||||
int id;
|
||||
|
||||
/* If name is not found in vmlinux's BTF then search in module's BTFs */
|
||||
spin_lock_bh(&btf_idr_lock);
|
||||
idr_for_each_entry(&btf_idr, mod_btf, id) {
|
||||
if (!btf_is_module(mod_btf))
|
||||
continue;
|
||||
/* linear search could be slow hence unlock/lock
|
||||
* the IDR to avoiding holding it for too long
|
||||
*/
|
||||
btf_get(mod_btf);
|
||||
spin_unlock_bh(&btf_idr_lock);
|
||||
ret = btf_find_by_name_kind(mod_btf, name, kind);
|
||||
if (ret > 0) {
|
||||
int btf_obj_fd;
|
||||
|
||||
btf_obj_fd = __btf_new_fd(mod_btf);
|
||||
if (btf_obj_fd < 0) {
|
||||
btf_put(mod_btf);
|
||||
return btf_obj_fd;
|
||||
}
|
||||
return ret | (((u64)btf_obj_fd) << 32);
|
||||
}
|
||||
spin_lock_bh(&btf_idr_lock);
|
||||
btf_put(mod_btf);
|
||||
ret = bpf_find_btf_id(name, kind, &btf);
|
||||
if (ret > 0 && btf_is_module(btf)) {
|
||||
btf_obj_fd = __btf_new_fd(btf);
|
||||
if (btf_obj_fd < 0) {
|
||||
btf_put(btf);
|
||||
return btf_obj_fd;
|
||||
}
|
||||
spin_unlock_bh(&btf_idr_lock);
|
||||
return ret | (((u64)btf_obj_fd) << 32);
|
||||
}
|
||||
if (ret > 0)
|
||||
btf_put(btf);
|
||||
return ret;
|
||||
}
|
||||
|
||||
@ -6793,9 +6827,7 @@ int register_btf_kfunc_id_set(enum bpf_prog_type prog_type,
|
||||
|
||||
hook = bpf_prog_type_to_kfunc_hook(prog_type);
|
||||
ret = btf_populate_kfunc_set(btf, hook, kset);
|
||||
/* reference is only taken for module BTF */
|
||||
if (btf_is_module(btf))
|
||||
btf_put(btf);
|
||||
btf_put(btf);
|
||||
return ret;
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(register_btf_kfunc_id_set);
|
||||
@ -7149,6 +7181,8 @@ bpf_core_find_cands(struct bpf_core_ctx *ctx, u32 local_type_id)
|
||||
main_btf = bpf_get_btf_vmlinux();
|
||||
if (IS_ERR(main_btf))
|
||||
return ERR_CAST(main_btf);
|
||||
if (!main_btf)
|
||||
return ERR_PTR(-EINVAL);
|
||||
|
||||
local_type = btf_type_by_id(local_btf, local_type_id);
|
||||
if (!local_type)
|
||||
|
@ -33,6 +33,7 @@
|
||||
#include <linux/extable.h>
|
||||
#include <linux/log2.h>
|
||||
#include <linux/bpf_verifier.h>
|
||||
#include <linux/nodemask.h>
|
||||
|
||||
#include <asm/barrier.h>
|
||||
#include <asm/unaligned.h>
|
||||
@ -105,6 +106,7 @@ struct bpf_prog *bpf_prog_alloc_no_stats(unsigned int size, gfp_t gfp_extra_flag
|
||||
fp->aux = aux;
|
||||
fp->aux->prog = fp;
|
||||
fp->jit_requested = ebpf_jit_enabled();
|
||||
fp->blinding_requested = bpf_jit_blinding_enabled(fp);
|
||||
|
||||
INIT_LIST_HEAD_RCU(&fp->aux->ksym.lnode);
|
||||
mutex_init(&fp->aux->used_maps_mutex);
|
||||
@ -814,15 +816,9 @@ int bpf_jit_add_poke_descriptor(struct bpf_prog *prog,
|
||||
* allocator. The prog_pack allocator uses HPAGE_PMD_SIZE page (2MB on x86)
|
||||
* to host BPF programs.
|
||||
*/
|
||||
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
|
||||
#define BPF_PROG_PACK_SIZE HPAGE_PMD_SIZE
|
||||
#else
|
||||
#define BPF_PROG_PACK_SIZE PAGE_SIZE
|
||||
#endif
|
||||
#define BPF_PROG_CHUNK_SHIFT 6
|
||||
#define BPF_PROG_CHUNK_SIZE (1 << BPF_PROG_CHUNK_SHIFT)
|
||||
#define BPF_PROG_CHUNK_MASK (~(BPF_PROG_CHUNK_SIZE - 1))
|
||||
#define BPF_PROG_CHUNK_COUNT (BPF_PROG_PACK_SIZE / BPF_PROG_CHUNK_SIZE)
|
||||
|
||||
struct bpf_prog_pack {
|
||||
struct list_head list;
|
||||
@ -830,30 +826,72 @@ struct bpf_prog_pack {
|
||||
unsigned long bitmap[];
|
||||
};
|
||||
|
||||
#define BPF_PROG_MAX_PACK_PROG_SIZE BPF_PROG_PACK_SIZE
|
||||
#define BPF_PROG_SIZE_TO_NBITS(size) (round_up(size, BPF_PROG_CHUNK_SIZE) / BPF_PROG_CHUNK_SIZE)
|
||||
|
||||
static size_t bpf_prog_pack_size = -1;
|
||||
static size_t bpf_prog_pack_mask = -1;
|
||||
|
||||
static int bpf_prog_chunk_count(void)
|
||||
{
|
||||
WARN_ON_ONCE(bpf_prog_pack_size == -1);
|
||||
return bpf_prog_pack_size / BPF_PROG_CHUNK_SIZE;
|
||||
}
|
||||
|
||||
static DEFINE_MUTEX(pack_mutex);
|
||||
static LIST_HEAD(pack_list);
|
||||
|
||||
/* PMD_SIZE is not available in some special config, e.g. ARCH=arm with
|
||||
* CONFIG_MMU=n. Use PAGE_SIZE in these cases.
|
||||
*/
|
||||
#ifdef PMD_SIZE
|
||||
#define BPF_HPAGE_SIZE PMD_SIZE
|
||||
#define BPF_HPAGE_MASK PMD_MASK
|
||||
#else
|
||||
#define BPF_HPAGE_SIZE PAGE_SIZE
|
||||
#define BPF_HPAGE_MASK PAGE_MASK
|
||||
#endif
|
||||
|
||||
static size_t select_bpf_prog_pack_size(void)
|
||||
{
|
||||
size_t size;
|
||||
void *ptr;
|
||||
|
||||
size = BPF_HPAGE_SIZE * num_online_nodes();
|
||||
ptr = module_alloc(size);
|
||||
|
||||
/* Test whether we can get huge pages. If not just use PAGE_SIZE
|
||||
* packs.
|
||||
*/
|
||||
if (!ptr || !is_vm_area_hugepages(ptr)) {
|
||||
size = PAGE_SIZE;
|
||||
bpf_prog_pack_mask = PAGE_MASK;
|
||||
} else {
|
||||
bpf_prog_pack_mask = BPF_HPAGE_MASK;
|
||||
}
|
||||
|
||||
vfree(ptr);
|
||||
return size;
|
||||
}
|
||||
|
||||
static struct bpf_prog_pack *alloc_new_pack(void)
|
||||
{
|
||||
struct bpf_prog_pack *pack;
|
||||
|
||||
pack = kzalloc(sizeof(*pack) + BITS_TO_BYTES(BPF_PROG_CHUNK_COUNT), GFP_KERNEL);
|
||||
pack = kzalloc(struct_size(pack, bitmap, BITS_TO_LONGS(bpf_prog_chunk_count())),
|
||||
GFP_KERNEL);
|
||||
if (!pack)
|
||||
return NULL;
|
||||
pack->ptr = module_alloc(BPF_PROG_PACK_SIZE);
|
||||
pack->ptr = module_alloc(bpf_prog_pack_size);
|
||||
if (!pack->ptr) {
|
||||
kfree(pack);
|
||||
return NULL;
|
||||
}
|
||||
bitmap_zero(pack->bitmap, BPF_PROG_PACK_SIZE / BPF_PROG_CHUNK_SIZE);
|
||||
bitmap_zero(pack->bitmap, bpf_prog_pack_size / BPF_PROG_CHUNK_SIZE);
|
||||
list_add_tail(&pack->list, &pack_list);
|
||||
|
||||
set_vm_flush_reset_perms(pack->ptr);
|
||||
set_memory_ro((unsigned long)pack->ptr, BPF_PROG_PACK_SIZE / PAGE_SIZE);
|
||||
set_memory_x((unsigned long)pack->ptr, BPF_PROG_PACK_SIZE / PAGE_SIZE);
|
||||
set_memory_ro((unsigned long)pack->ptr, bpf_prog_pack_size / PAGE_SIZE);
|
||||
set_memory_x((unsigned long)pack->ptr, bpf_prog_pack_size / PAGE_SIZE);
|
||||
return pack;
|
||||
}
|
||||
|
||||
@ -864,7 +902,11 @@ static void *bpf_prog_pack_alloc(u32 size)
|
||||
unsigned long pos;
|
||||
void *ptr = NULL;
|
||||
|
||||
if (size > BPF_PROG_MAX_PACK_PROG_SIZE) {
|
||||
mutex_lock(&pack_mutex);
|
||||
if (bpf_prog_pack_size == -1)
|
||||
bpf_prog_pack_size = select_bpf_prog_pack_size();
|
||||
|
||||
if (size > bpf_prog_pack_size) {
|
||||
size = round_up(size, PAGE_SIZE);
|
||||
ptr = module_alloc(size);
|
||||
if (ptr) {
|
||||
@ -872,13 +914,12 @@ static void *bpf_prog_pack_alloc(u32 size)
|
||||
set_memory_ro((unsigned long)ptr, size / PAGE_SIZE);
|
||||
set_memory_x((unsigned long)ptr, size / PAGE_SIZE);
|
||||
}
|
||||
return ptr;
|
||||
goto out;
|
||||
}
|
||||
mutex_lock(&pack_mutex);
|
||||
list_for_each_entry(pack, &pack_list, list) {
|
||||
pos = bitmap_find_next_zero_area(pack->bitmap, BPF_PROG_CHUNK_COUNT, 0,
|
||||
pos = bitmap_find_next_zero_area(pack->bitmap, bpf_prog_chunk_count(), 0,
|
||||
nbits, 0);
|
||||
if (pos < BPF_PROG_CHUNK_COUNT)
|
||||
if (pos < bpf_prog_chunk_count())
|
||||
goto found_free_area;
|
||||
}
|
||||
|
||||
@ -904,13 +945,13 @@ static void bpf_prog_pack_free(struct bpf_binary_header *hdr)
|
||||
unsigned long pos;
|
||||
void *pack_ptr;
|
||||
|
||||
if (hdr->size > BPF_PROG_MAX_PACK_PROG_SIZE) {
|
||||
mutex_lock(&pack_mutex);
|
||||
if (hdr->size > bpf_prog_pack_size) {
|
||||
module_memfree(hdr);
|
||||
return;
|
||||
goto out;
|
||||
}
|
||||
|
||||
pack_ptr = (void *)((unsigned long)hdr & ~(BPF_PROG_PACK_SIZE - 1));
|
||||
mutex_lock(&pack_mutex);
|
||||
pack_ptr = (void *)((unsigned long)hdr & bpf_prog_pack_mask);
|
||||
|
||||
list_for_each_entry(tmp, &pack_list, list) {
|
||||
if (tmp->ptr == pack_ptr) {
|
||||
@ -926,8 +967,8 @@ static void bpf_prog_pack_free(struct bpf_binary_header *hdr)
|
||||
pos = ((unsigned long)hdr - (unsigned long)pack_ptr) >> BPF_PROG_CHUNK_SHIFT;
|
||||
|
||||
bitmap_clear(pack->bitmap, pos, nbits);
|
||||
if (bitmap_find_next_zero_area(pack->bitmap, BPF_PROG_CHUNK_COUNT, 0,
|
||||
BPF_PROG_CHUNK_COUNT, 0) == 0) {
|
||||
if (bitmap_find_next_zero_area(pack->bitmap, bpf_prog_chunk_count(), 0,
|
||||
bpf_prog_chunk_count(), 0) == 0) {
|
||||
list_del(&pack->list);
|
||||
module_memfree(pack->ptr);
|
||||
kfree(pack);
|
||||
@ -1382,7 +1423,7 @@ struct bpf_prog *bpf_jit_blind_constants(struct bpf_prog *prog)
|
||||
struct bpf_insn *insn;
|
||||
int i, rewritten;
|
||||
|
||||
if (!bpf_jit_blinding_enabled(prog) || prog->blinded)
|
||||
if (!prog->blinding_requested || prog->blinded)
|
||||
return prog;
|
||||
|
||||
clone = bpf_prog_clone_create(prog, GFP_USER);
|
||||
|
@ -225,13 +225,8 @@ BPF_CALL_2(bpf_get_current_comm, char *, buf, u32, size)
|
||||
if (unlikely(!task))
|
||||
goto err_clear;
|
||||
|
||||
strncpy(buf, task->comm, size);
|
||||
|
||||
/* Verifier guarantees that size > 0. For task->comm exceeding
|
||||
* size, guarantee that buf is %NUL-terminated. Unconditionally
|
||||
* done here to save the size test.
|
||||
*/
|
||||
buf[size - 1] = 0;
|
||||
/* Verifier guarantees that size > 0 */
|
||||
strscpy(buf, task->comm, size);
|
||||
return 0;
|
||||
err_clear:
|
||||
memset(buf, 0, size);
|
||||
|
@ -1,8 +1,7 @@
|
||||
# SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
LIBBPF_SRCS = $(srctree)/tools/lib/bpf/
|
||||
LIBBPF_INCLUDE = $(LIBBPF_SRCS)/..
|
||||
LIBBPF_INCLUDE = $(srctree)/tools/lib
|
||||
|
||||
obj-$(CONFIG_BPF_PRELOAD_UMD) += bpf_preload.o
|
||||
CFLAGS_bpf_preload_kern.o += -I $(LIBBPF_INCLUDE)
|
||||
CFLAGS_bpf_preload_kern.o += -I$(LIBBPF_INCLUDE)
|
||||
bpf_preload-objs += bpf_preload_kern.o
|
||||
|
@ -176,7 +176,7 @@ build_id_valid:
|
||||
}
|
||||
|
||||
static struct perf_callchain_entry *
|
||||
get_callchain_entry_for_task(struct task_struct *task, u32 init_nr)
|
||||
get_callchain_entry_for_task(struct task_struct *task, u32 max_depth)
|
||||
{
|
||||
#ifdef CONFIG_STACKTRACE
|
||||
struct perf_callchain_entry *entry;
|
||||
@ -187,9 +187,8 @@ get_callchain_entry_for_task(struct task_struct *task, u32 init_nr)
|
||||
if (!entry)
|
||||
return NULL;
|
||||
|
||||
entry->nr = init_nr +
|
||||
stack_trace_save_tsk(task, (unsigned long *)(entry->ip + init_nr),
|
||||
sysctl_perf_event_max_stack - init_nr, 0);
|
||||
entry->nr = stack_trace_save_tsk(task, (unsigned long *)entry->ip,
|
||||
max_depth, 0);
|
||||
|
||||
/* stack_trace_save_tsk() works on unsigned long array, while
|
||||
* perf_callchain_entry uses u64 array. For 32-bit systems, it is
|
||||
@ -201,7 +200,7 @@ get_callchain_entry_for_task(struct task_struct *task, u32 init_nr)
|
||||
int i;
|
||||
|
||||
/* copy data from the end to avoid using extra buffer */
|
||||
for (i = entry->nr - 1; i >= (int)init_nr; i--)
|
||||
for (i = entry->nr - 1; i >= 0; i--)
|
||||
to[i] = (u64)(from[i]);
|
||||
}
|
||||
|
||||
@ -218,27 +217,19 @@ static long __bpf_get_stackid(struct bpf_map *map,
|
||||
{
|
||||
struct bpf_stack_map *smap = container_of(map, struct bpf_stack_map, map);
|
||||
struct stack_map_bucket *bucket, *new_bucket, *old_bucket;
|
||||
u32 max_depth = map->value_size / stack_map_data_size(map);
|
||||
/* stack_map_alloc() checks that max_depth <= sysctl_perf_event_max_stack */
|
||||
u32 init_nr = sysctl_perf_event_max_stack - max_depth;
|
||||
u32 skip = flags & BPF_F_SKIP_FIELD_MASK;
|
||||
u32 hash, id, trace_nr, trace_len;
|
||||
bool user = flags & BPF_F_USER_STACK;
|
||||
u64 *ips;
|
||||
bool hash_matches;
|
||||
|
||||
/* get_perf_callchain() guarantees that trace->nr >= init_nr
|
||||
* and trace-nr <= sysctl_perf_event_max_stack, so trace_nr <= max_depth
|
||||
*/
|
||||
trace_nr = trace->nr - init_nr;
|
||||
|
||||
if (trace_nr <= skip)
|
||||
if (trace->nr <= skip)
|
||||
/* skipping more than usable stack trace */
|
||||
return -EFAULT;
|
||||
|
||||
trace_nr -= skip;
|
||||
trace_nr = trace->nr - skip;
|
||||
trace_len = trace_nr * sizeof(u64);
|
||||
ips = trace->ip + skip + init_nr;
|
||||
ips = trace->ip + skip;
|
||||
hash = jhash2((u32 *)ips, trace_len / sizeof(u32), 0);
|
||||
id = hash & (smap->n_buckets - 1);
|
||||
bucket = READ_ONCE(smap->buckets[id]);
|
||||
@ -295,8 +286,7 @@ BPF_CALL_3(bpf_get_stackid, struct pt_regs *, regs, struct bpf_map *, map,
|
||||
u64, flags)
|
||||
{
|
||||
u32 max_depth = map->value_size / stack_map_data_size(map);
|
||||
/* stack_map_alloc() checks that max_depth <= sysctl_perf_event_max_stack */
|
||||
u32 init_nr = sysctl_perf_event_max_stack - max_depth;
|
||||
u32 skip = flags & BPF_F_SKIP_FIELD_MASK;
|
||||
bool user = flags & BPF_F_USER_STACK;
|
||||
struct perf_callchain_entry *trace;
|
||||
bool kernel = !user;
|
||||
@ -305,8 +295,12 @@ BPF_CALL_3(bpf_get_stackid, struct pt_regs *, regs, struct bpf_map *, map,
|
||||
BPF_F_FAST_STACK_CMP | BPF_F_REUSE_STACKID)))
|
||||
return -EINVAL;
|
||||
|
||||
trace = get_perf_callchain(regs, init_nr, kernel, user,
|
||||
sysctl_perf_event_max_stack, false, false);
|
||||
max_depth += skip;
|
||||
if (max_depth > sysctl_perf_event_max_stack)
|
||||
max_depth = sysctl_perf_event_max_stack;
|
||||
|
||||
trace = get_perf_callchain(regs, 0, kernel, user, max_depth,
|
||||
false, false);
|
||||
|
||||
if (unlikely(!trace))
|
||||
/* couldn't fetch the stack trace */
|
||||
@ -397,7 +391,7 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task,
|
||||
struct perf_callchain_entry *trace_in,
|
||||
void *buf, u32 size, u64 flags)
|
||||
{
|
||||
u32 init_nr, trace_nr, copy_len, elem_size, num_elem;
|
||||
u32 trace_nr, copy_len, elem_size, num_elem, max_depth;
|
||||
bool user_build_id = flags & BPF_F_USER_BUILD_ID;
|
||||
u32 skip = flags & BPF_F_SKIP_FIELD_MASK;
|
||||
bool user = flags & BPF_F_USER_STACK;
|
||||
@ -422,30 +416,28 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task,
|
||||
goto err_fault;
|
||||
|
||||
num_elem = size / elem_size;
|
||||
if (sysctl_perf_event_max_stack < num_elem)
|
||||
init_nr = 0;
|
||||
else
|
||||
init_nr = sysctl_perf_event_max_stack - num_elem;
|
||||
max_depth = num_elem + skip;
|
||||
if (sysctl_perf_event_max_stack < max_depth)
|
||||
max_depth = sysctl_perf_event_max_stack;
|
||||
|
||||
if (trace_in)
|
||||
trace = trace_in;
|
||||
else if (kernel && task)
|
||||
trace = get_callchain_entry_for_task(task, init_nr);
|
||||
trace = get_callchain_entry_for_task(task, max_depth);
|
||||
else
|
||||
trace = get_perf_callchain(regs, init_nr, kernel, user,
|
||||
sysctl_perf_event_max_stack,
|
||||
trace = get_perf_callchain(regs, 0, kernel, user, max_depth,
|
||||
false, false);
|
||||
if (unlikely(!trace))
|
||||
goto err_fault;
|
||||
|
||||
trace_nr = trace->nr - init_nr;
|
||||
if (trace_nr < skip)
|
||||
if (trace->nr < skip)
|
||||
goto err_fault;
|
||||
|
||||
trace_nr -= skip;
|
||||
trace_nr = trace->nr - skip;
|
||||
trace_nr = (trace_nr <= num_elem) ? trace_nr : num_elem;
|
||||
copy_len = trace_nr * elem_size;
|
||||
ips = trace->ip + skip + init_nr;
|
||||
|
||||
ips = trace->ip + skip;
|
||||
if (user && user_build_id)
|
||||
stack_map_get_build_id_offset(buf, ips, trace_nr, user);
|
||||
else
|
||||
|
@ -32,6 +32,7 @@
|
||||
#include <linux/bpf-netns.h>
|
||||
#include <linux/rcupdate_trace.h>
|
||||
#include <linux/memcontrol.h>
|
||||
#include <linux/trace_events.h>
|
||||
|
||||
#define IS_FD_ARRAY(map) ((map)->map_type == BPF_MAP_TYPE_PERF_EVENT_ARRAY || \
|
||||
(map)->map_type == BPF_MAP_TYPE_CGROUP_ARRAY || \
|
||||
@ -3022,6 +3023,11 @@ out_put_file:
|
||||
fput(perf_file);
|
||||
return err;
|
||||
}
|
||||
#else
|
||||
static int bpf_perf_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
|
||||
{
|
||||
return -EOPNOTSUPP;
|
||||
}
|
||||
#endif /* CONFIG_PERF_EVENTS */
|
||||
|
||||
#define BPF_RAW_TRACEPOINT_OPEN_LAST_FIELD raw_tracepoint.prog_fd
|
||||
@ -3336,7 +3342,7 @@ static int bpf_prog_query(const union bpf_attr *attr,
|
||||
}
|
||||
}
|
||||
|
||||
#define BPF_PROG_TEST_RUN_LAST_FIELD test.cpu
|
||||
#define BPF_PROG_TEST_RUN_LAST_FIELD test.batch_size
|
||||
|
||||
static int bpf_prog_test_run(const union bpf_attr *attr,
|
||||
union bpf_attr __user *uattr)
|
||||
@ -4255,7 +4261,7 @@ static int tracing_bpf_link_attach(const union bpf_attr *attr, bpfptr_t uattr,
|
||||
return -EINVAL;
|
||||
}
|
||||
|
||||
#define BPF_LINK_CREATE_LAST_FIELD link_create.iter_info_len
|
||||
#define BPF_LINK_CREATE_LAST_FIELD link_create.kprobe_multi.cookies
|
||||
static int link_create(union bpf_attr *attr, bpfptr_t uattr)
|
||||
{
|
||||
enum bpf_prog_type ptype;
|
||||
@ -4279,7 +4285,6 @@ static int link_create(union bpf_attr *attr, bpfptr_t uattr)
|
||||
ret = tracing_bpf_link_attach(attr, uattr, prog);
|
||||
goto out;
|
||||
case BPF_PROG_TYPE_PERF_EVENT:
|
||||
case BPF_PROG_TYPE_KPROBE:
|
||||
case BPF_PROG_TYPE_TRACEPOINT:
|
||||
if (attr->link_create.attach_type != BPF_PERF_EVENT) {
|
||||
ret = -EINVAL;
|
||||
@ -4287,6 +4292,14 @@ static int link_create(union bpf_attr *attr, bpfptr_t uattr)
|
||||
}
|
||||
ptype = prog->type;
|
||||
break;
|
||||
case BPF_PROG_TYPE_KPROBE:
|
||||
if (attr->link_create.attach_type != BPF_PERF_EVENT &&
|
||||
attr->link_create.attach_type != BPF_TRACE_KPROBE_MULTI) {
|
||||
ret = -EINVAL;
|
||||
goto out;
|
||||
}
|
||||
ptype = prog->type;
|
||||
break;
|
||||
default:
|
||||
ptype = attach_type_to_prog_type(attr->link_create.attach_type);
|
||||
if (ptype == BPF_PROG_TYPE_UNSPEC || ptype != prog->type) {
|
||||
@ -4318,13 +4331,16 @@ static int link_create(union bpf_attr *attr, bpfptr_t uattr)
|
||||
ret = bpf_xdp_link_attach(attr, prog);
|
||||
break;
|
||||
#endif
|
||||
#ifdef CONFIG_PERF_EVENTS
|
||||
case BPF_PROG_TYPE_PERF_EVENT:
|
||||
case BPF_PROG_TYPE_TRACEPOINT:
|
||||
case BPF_PROG_TYPE_KPROBE:
|
||||
ret = bpf_perf_link_attach(attr, prog);
|
||||
break;
|
||||
#endif
|
||||
case BPF_PROG_TYPE_KPROBE:
|
||||
if (attr->link_create.attach_type == BPF_PERF_EVENT)
|
||||
ret = bpf_perf_link_attach(attr, prog);
|
||||
else
|
||||
ret = bpf_kprobe_multi_link_attach(attr, prog);
|
||||
break;
|
||||
default:
|
||||
ret = -EINVAL;
|
||||
}
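For the new BPF_TRACE_KPROBE_MULTI attach type handled above, userspace attachment could look roughly like this; it assumes a libbpf new enough to provide bpf_program__attach_kprobe_multi_opts() (added alongside this kernel work) and uses made-up symbol names.

#include <bpf/libbpf.h>

static const char *syms[] = {
	"vfs_read",
	"vfs_write",
};

static int attach_multi(struct bpf_program *prog)
{
	LIBBPF_OPTS(bpf_kprobe_multi_opts, opts,
		.syms = syms,
		.cnt = 2,
	);
	struct bpf_link *link;

	/* NULL pattern: the explicit syms array above is used instead */
	link = bpf_program__attach_kprobe_multi_opts(prog, NULL, &opts);
	return libbpf_get_error(link);
}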
|
||||
|
@ -554,7 +554,6 @@ static const char *reg_type_str(struct bpf_verifier_env *env,
|
||||
[PTR_TO_TP_BUFFER] = "tp_buffer",
|
||||
[PTR_TO_XDP_SOCK] = "xdp_sock",
|
||||
[PTR_TO_BTF_ID] = "ptr_",
|
||||
[PTR_TO_PERCPU_BTF_ID] = "percpu_ptr_",
|
||||
[PTR_TO_MEM] = "mem",
|
||||
[PTR_TO_BUF] = "buf",
|
||||
[PTR_TO_FUNC] = "func",
|
||||
@ -562,8 +561,7 @@ static const char *reg_type_str(struct bpf_verifier_env *env,
|
||||
};
|
||||
|
||||
if (type & PTR_MAYBE_NULL) {
|
||||
if (base_type(type) == PTR_TO_BTF_ID ||
|
||||
base_type(type) == PTR_TO_PERCPU_BTF_ID)
|
||||
if (base_type(type) == PTR_TO_BTF_ID)
|
||||
strncpy(postfix, "or_null_", 16);
|
||||
else
|
||||
strncpy(postfix, "_or_null", 16);
|
||||
@ -575,6 +573,8 @@ static const char *reg_type_str(struct bpf_verifier_env *env,
|
||||
strncpy(prefix, "alloc_", 32);
|
||||
if (type & MEM_USER)
|
||||
strncpy(prefix, "user_", 32);
|
||||
if (type & MEM_PERCPU)
|
||||
strncpy(prefix, "percpu_", 32);
|
||||
|
||||
snprintf(env->type_str_buf, TYPE_STR_BUF_LEN, "%s%s%s",
|
||||
prefix, str[base_type(type)], postfix);
|
||||
@ -697,8 +697,7 @@ static void print_verifier_state(struct bpf_verifier_env *env,
|
||||
const char *sep = "";
|
||||
|
||||
verbose(env, "%s", reg_type_str(env, t));
|
||||
if (base_type(t) == PTR_TO_BTF_ID ||
|
||||
base_type(t) == PTR_TO_PERCPU_BTF_ID)
|
||||
if (base_type(t) == PTR_TO_BTF_ID)
|
||||
verbose(env, "%s", kernel_type_name(reg->btf, reg->btf_id));
|
||||
verbose(env, "(");
|
||||
/*
|
||||
@ -2783,7 +2782,6 @@ static bool is_spillable_regtype(enum bpf_reg_type type)
|
||||
case PTR_TO_XDP_SOCK:
|
||||
case PTR_TO_BTF_ID:
|
||||
case PTR_TO_BUF:
|
||||
case PTR_TO_PERCPU_BTF_ID:
|
||||
case PTR_TO_MEM:
|
||||
case PTR_TO_FUNC:
|
||||
case PTR_TO_MAP_KEY:
|
||||
@ -3990,6 +3988,12 @@ static int __check_ptr_off_reg(struct bpf_verifier_env *env,
|
||||
* is only allowed in its original, unmodified form.
|
||||
*/
|
||||
|
||||
if (reg->off < 0) {
|
||||
verbose(env, "negative offset %s ptr R%d off=%d disallowed\n",
|
||||
reg_type_str(env, reg->type), regno, reg->off);
|
||||
return -EACCES;
|
||||
}
|
||||
|
||||
if (!fixed_off_ok && reg->off) {
|
||||
verbose(env, "dereference of modified %s ptr R%d off=%d disallowed\n",
|
||||
reg_type_str(env, reg->type), regno, reg->off);
|
||||
@ -4058,9 +4062,9 @@ static int check_buffer_access(struct bpf_verifier_env *env,
|
||||
const struct bpf_reg_state *reg,
|
||||
int regno, int off, int size,
|
||||
bool zero_size_allowed,
|
||||
const char *buf_info,
|
||||
u32 *max_access)
|
||||
{
|
||||
const char *buf_info = type_is_rdonly_mem(reg->type) ? "rdonly" : "rdwr";
|
||||
int err;
|
||||
|
||||
err = __check_buffer_access(env, buf_info, reg, regno, off, size);
|
||||
@ -4197,6 +4201,13 @@ static int check_ptr_to_btf_access(struct bpf_verifier_env *env,
|
||||
return -EACCES;
|
||||
}
|
||||
|
||||
if (reg->type & MEM_PERCPU) {
|
||||
verbose(env,
|
||||
"R%d is ptr_%s access percpu memory: off=%d\n",
|
||||
regno, tname, off);
|
||||
return -EACCES;
|
||||
}
|
||||
|
||||
if (env->ops->btf_struct_access) {
|
||||
ret = env->ops->btf_struct_access(&env->log, reg->btf, t,
|
||||
off, size, atype, &btf_id, &flag);
|
||||
@ -4556,7 +4567,8 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
|
||||
err = check_tp_buffer_access(env, reg, regno, off, size);
|
||||
if (!err && t == BPF_READ && value_regno >= 0)
|
||||
mark_reg_unknown(env, regs, value_regno);
|
||||
} else if (reg->type == PTR_TO_BTF_ID) {
|
||||
} else if (base_type(reg->type) == PTR_TO_BTF_ID &&
|
||||
!type_may_be_null(reg->type)) {
|
||||
err = check_ptr_to_btf_access(env, regs, regno, off, size, t,
|
||||
value_regno);
|
||||
} else if (reg->type == CONST_PTR_TO_MAP) {
|
||||
@ -4564,7 +4576,6 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
|
||||
value_regno);
|
||||
} else if (base_type(reg->type) == PTR_TO_BUF) {
|
||||
bool rdonly_mem = type_is_rdonly_mem(reg->type);
|
||||
const char *buf_info;
|
||||
u32 *max_access;
|
||||
|
||||
if (rdonly_mem) {
|
||||
@ -4573,15 +4584,13 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
|
||||
regno, reg_type_str(env, reg->type));
|
||||
return -EACCES;
|
||||
}
|
||||
buf_info = "rdonly";
|
||||
max_access = &env->prog->aux->max_rdonly_access;
|
||||
} else {
|
||||
buf_info = "rdwr";
|
||||
max_access = &env->prog->aux->max_rdwr_access;
|
||||
}
|
||||
|
||||
err = check_buffer_access(env, reg, regno, off, size, false,
|
||||
buf_info, max_access);
|
||||
max_access);
|
||||
|
||||
if (!err && value_regno >= 0 && (rdonly_mem || t == BPF_READ))
|
||||
mark_reg_unknown(env, regs, value_regno);
|
||||
@ -4802,7 +4811,7 @@ static int check_stack_range_initialized(
|
||||
}
|
||||
|
||||
if (is_spilled_reg(&state->stack[spi]) &&
|
||||
state->stack[spi].spilled_ptr.type == PTR_TO_BTF_ID)
|
||||
base_type(state->stack[spi].spilled_ptr.type) == PTR_TO_BTF_ID)
|
||||
goto mark;
|
||||
|
||||
if (is_spilled_reg(&state->stack[spi]) &&
|
||||
@ -4844,7 +4853,6 @@ static int check_helper_mem_access(struct bpf_verifier_env *env, int regno,
|
||||
struct bpf_call_arg_meta *meta)
|
||||
{
|
||||
struct bpf_reg_state *regs = cur_regs(env), *reg = ®s[regno];
|
||||
const char *buf_info;
|
||||
u32 *max_access;
|
||||
|
||||
switch (base_type(reg->type)) {
|
||||
@ -4871,15 +4879,13 @@ static int check_helper_mem_access(struct bpf_verifier_env *env, int regno,
|
||||
if (meta && meta->raw_mode)
|
||||
return -EACCES;
|
||||
|
||||
buf_info = "rdonly";
|
||||
max_access = &env->prog->aux->max_rdonly_access;
|
||||
} else {
|
||||
buf_info = "rdwr";
|
||||
max_access = &env->prog->aux->max_rdwr_access;
|
||||
}
|
||||
return check_buffer_access(env, reg, regno, reg->off,
|
||||
access_size, zero_size_allowed,
|
||||
buf_info, max_access);
|
||||
max_access);
|
||||
case PTR_TO_STACK:
|
||||
return check_stack_range_initialized(
|
||||
env,
|
||||
@ -5258,7 +5264,7 @@ static const struct bpf_reg_types alloc_mem_types = { .types = { PTR_TO_MEM | ME
|
||||
static const struct bpf_reg_types const_map_ptr_types = { .types = { CONST_PTR_TO_MAP } };
|
||||
static const struct bpf_reg_types btf_ptr_types = { .types = { PTR_TO_BTF_ID } };
|
||||
static const struct bpf_reg_types spin_lock_types = { .types = { PTR_TO_MAP_VALUE } };
|
||||
static const struct bpf_reg_types percpu_btf_ptr_types = { .types = { PTR_TO_PERCPU_BTF_ID } };
|
||||
static const struct bpf_reg_types percpu_btf_ptr_types = { .types = { PTR_TO_BTF_ID | MEM_PERCPU } };
|
||||
static const struct bpf_reg_types func_ptr_types = { .types = { PTR_TO_FUNC } };
|
||||
static const struct bpf_reg_types stack_ptr_types = { .types = { PTR_TO_STACK } };
|
||||
static const struct bpf_reg_types const_str_ptr_types = { .types = { PTR_TO_MAP_VALUE } };
|
||||
@ -5359,6 +5365,60 @@ found:
|
||||
return 0;
|
||||
}
|
||||
|
||||
int check_func_arg_reg_off(struct bpf_verifier_env *env,
|
||||
const struct bpf_reg_state *reg, int regno,
|
||||
enum bpf_arg_type arg_type,
|
||||
bool is_release_func)
|
||||
{
|
||||
bool fixed_off_ok = false, release_reg;
|
||||
enum bpf_reg_type type = reg->type;
|
||||
|
||||
switch ((u32)type) {
|
||||
case SCALAR_VALUE:
|
||||
/* Pointer types where reg offset is explicitly allowed: */
|
||||
case PTR_TO_PACKET:
|
||||
case PTR_TO_PACKET_META:
|
||||
case PTR_TO_MAP_KEY:
|
||||
case PTR_TO_MAP_VALUE:
|
||||
case PTR_TO_MEM:
|
||||
case PTR_TO_MEM | MEM_RDONLY:
|
||||
case PTR_TO_MEM | MEM_ALLOC:
|
||||
case PTR_TO_BUF:
|
||||
case PTR_TO_BUF | MEM_RDONLY:
|
||||
case PTR_TO_STACK:
|
||||
/* Some of the argument types nevertheless require a
|
||||
* zero register offset.
|
||||
*/
|
||||
if (arg_type != ARG_PTR_TO_ALLOC_MEM)
|
||||
return 0;
|
||||
break;
|
||||
/* All the rest must be rejected, except PTR_TO_BTF_ID which allows
|
||||
* fixed offset.
|
||||
*/
|
||||
case PTR_TO_BTF_ID:
|
||||
/* When referenced PTR_TO_BTF_ID is passed to release function,
|
||||
* its fixed offset must be 0. We rely on the property that
|
||||
* only one referenced register can be passed to BPF helpers and
|
||||
* kfuncs. In the other cases, fixed offset can be non-zero.
|
||||
*/
|
||||
release_reg = is_release_func && reg->ref_obj_id;
|
||||
if (release_reg && reg->off) {
|
||||
verbose(env, "R%d must have zero offset when passed to release func\n",
|
||||
regno);
|
||||
return -EINVAL;
|
||||
}
|
||||
/* For release_reg == true, fixed_off_ok must be false, but we
|
||||
* already checked and rejected reg->off != 0 above, so set to
|
||||
* true to allow fixed offset for all other cases.
|
||||
*/
|
||||
fixed_off_ok = true;
|
||||
break;
|
||||
default:
|
||||
break;
|
||||
}
|
||||
return __check_ptr_off_reg(env, reg, regno, fixed_off_ok);
|
||||
}
|
||||
|
||||
static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
|
||||
struct bpf_call_arg_meta *meta,
|
||||
const struct bpf_func_proto *fn)
|
||||
@ -5408,36 +5468,14 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
|
||||
if (err)
|
||||
return err;
|
||||
|
||||
switch ((u32)type) {
|
||||
case SCALAR_VALUE:
|
||||
/* Pointer types where reg offset is explicitly allowed: */
|
||||
case PTR_TO_PACKET:
|
||||
case PTR_TO_PACKET_META:
|
||||
case PTR_TO_MAP_KEY:
|
||||
case PTR_TO_MAP_VALUE:
|
||||
case PTR_TO_MEM:
|
||||
case PTR_TO_MEM | MEM_RDONLY:
|
||||
case PTR_TO_MEM | MEM_ALLOC:
|
||||
case PTR_TO_BUF:
|
||||
case PTR_TO_BUF | MEM_RDONLY:
|
||||
case PTR_TO_STACK:
|
||||
/* Some of the argument types nevertheless require a
|
||||
* zero register offset.
|
||||
*/
|
||||
if (arg_type == ARG_PTR_TO_ALLOC_MEM)
|
||||
goto force_off_check;
|
||||
break;
|
||||
/* All the rest must be rejected: */
|
||||
default:
|
||||
force_off_check:
|
||||
err = __check_ptr_off_reg(env, reg, regno,
|
||||
type == PTR_TO_BTF_ID);
|
||||
if (err < 0)
|
||||
return err;
|
||||
break;
|
||||
}
|
||||
err = check_func_arg_reg_off(env, reg, regno, arg_type, is_release_function(meta->func_id));
|
||||
if (err)
|
||||
return err;
|
||||
|
||||
skip_type_check:
|
||||
/* check_func_arg_reg_off relies on only one referenced register being
|
||||
* allowed for BPF helpers.
|
||||
*/
|
||||
if (reg->ref_obj_id) {
|
||||
if (meta->ref_obj_id) {
|
||||
verbose(env, "verifier internal error: more than one arg with ref_obj_id R%d %u %u\n",
|
||||
@ -9638,7 +9676,6 @@ static int check_ld_imm(struct bpf_verifier_env *env, struct bpf_insn *insn)
|
||||
dst_reg->mem_size = aux->btf_var.mem_size;
|
||||
break;
|
||||
case PTR_TO_BTF_ID:
|
||||
case PTR_TO_PERCPU_BTF_ID:
|
||||
dst_reg->btf = aux->btf_var.btf;
|
||||
dst_reg->btf_id = aux->btf_var.btf_id;
|
||||
break;
|
||||
@ -10363,8 +10400,7 @@ static void adjust_btf_func(struct bpf_verifier_env *env)
|
||||
aux->func_info[i].insn_off = env->subprog_info[i].start;
|
||||
}
|
||||
|
||||
#define MIN_BPF_LINEINFO_SIZE (offsetof(struct bpf_line_info, line_col) + \
|
||||
sizeof(((struct bpf_line_info *)(0))->line_col))
|
||||
#define MIN_BPF_LINEINFO_SIZE offsetofend(struct bpf_line_info, line_col)
|
||||
#define MAX_LINEINFO_REC_SIZE MAX_FUNCINFO_REC_SIZE
|
||||
|
||||
static int check_btf_line(struct bpf_verifier_env *env,
|
||||
@ -11838,7 +11874,7 @@ static int check_pseudo_btf_id(struct bpf_verifier_env *env,
|
||||
type = t->type;
|
||||
t = btf_type_skip_modifiers(btf, type, NULL);
|
||||
if (percpu) {
|
||||
aux->btf_var.reg_type = PTR_TO_PERCPU_BTF_ID;
|
||||
aux->btf_var.reg_type = PTR_TO_BTF_ID | MEM_PERCPU;
|
||||
aux->btf_var.btf = btf;
|
||||
aux->btf_var.btf_id = type;
|
||||
} else if (!btf_type_is_struct(t)) {
|
||||
@ -12987,6 +13023,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
|
||||
func[i]->aux->name[0] = 'F';
|
||||
func[i]->aux->stack_depth = env->subprog_info[i].stack_depth;
|
||||
func[i]->jit_requested = 1;
|
||||
func[i]->blinding_requested = prog->blinding_requested;
|
||||
func[i]->aux->kfunc_tab = prog->aux->kfunc_tab;
|
||||
func[i]->aux->kfunc_btf_tab = prog->aux->kfunc_btf_tab;
|
||||
func[i]->aux->linfo = prog->aux->linfo;
|
||||
@ -13110,6 +13147,7 @@ out_free:
|
||||
out_undo_insn:
|
||||
/* cleanup main prog to be interpreted */
|
||||
prog->jit_requested = 0;
|
||||
prog->blinding_requested = 0;
|
||||
for (i = 0, insn = prog->insnsi; i < prog->len; i++, insn++) {
|
||||
if (!bpf_pseudo_call(insn))
|
||||
continue;
|
||||
@ -13203,7 +13241,6 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
|
||||
{
|
||||
struct bpf_prog *prog = env->prog;
|
||||
enum bpf_attach_type eatype = prog->expected_attach_type;
|
||||
bool expect_blinding = bpf_jit_blinding_enabled(prog);
|
||||
enum bpf_prog_type prog_type = resolve_prog_type(prog);
|
||||
struct bpf_insn *insn = prog->insnsi;
|
||||
const struct bpf_func_proto *fn;
|
||||
@ -13367,7 +13404,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
|
||||
insn->code = BPF_JMP | BPF_TAIL_CALL;
|
||||
|
||||
aux = &env->insn_aux_data[i + delta];
|
||||
if (env->bpf_capable && !expect_blinding &&
|
||||
if (env->bpf_capable && !prog->blinding_requested &&
|
||||
prog->jit_requested &&
|
||||
!bpf_map_key_poisoned(aux) &&
|
||||
!bpf_map_ptr_poisoned(aux) &&
|
||||
@ -13455,6 +13492,26 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
|
||||
goto patch_call_imm;
|
||||
}
|
||||
|
||||
if (insn->imm == BPF_FUNC_task_storage_get ||
|
||||
insn->imm == BPF_FUNC_sk_storage_get ||
|
||||
insn->imm == BPF_FUNC_inode_storage_get) {
|
||||
if (env->prog->aux->sleepable)
|
||||
insn_buf[0] = BPF_MOV64_IMM(BPF_REG_5, (__force __s32)GFP_KERNEL);
|
||||
else
|
||||
insn_buf[0] = BPF_MOV64_IMM(BPF_REG_5, (__force __s32)GFP_ATOMIC);
|
||||
insn_buf[1] = *insn;
|
||||
cnt = 2;
|
||||
|
||||
new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, cnt);
|
||||
if (!new_prog)
|
||||
return -ENOMEM;
|
||||
|
||||
delta += cnt - 1;
|
||||
env->prog = prog = new_prog;
|
||||
insn = new_prog->insnsi + i + delta;
|
||||
goto patch_call_imm;
|
||||
}
|
||||
|
||||
/* BPF_EMIT_CALL() assumptions in some of the map_gen_lookup
|
||||
* and other inlining handlers are currently limited to 64 bit
|
||||
* only.
|
||||
|
@ -64,6 +64,7 @@
|
||||
#include <linux/compat.h>
|
||||
#include <linux/io_uring.h>
|
||||
#include <linux/kprobes.h>
|
||||
#include <linux/rethook.h>
|
||||
|
||||
#include <linux/uaccess.h>
|
||||
#include <asm/unistd.h>
|
||||
@ -169,6 +170,7 @@ static void delayed_put_task_struct(struct rcu_head *rhp)
|
||||
struct task_struct *tsk = container_of(rhp, struct task_struct, rcu);
|
||||
|
||||
kprobe_flush_task(tsk);
|
||||
rethook_flush_task(tsk);
|
||||
perf_event_delayed_put(tsk);
|
||||
trace_sched_process_free(tsk);
|
||||
put_task_struct(tsk);
|
||||
|
@ -2255,6 +2255,9 @@ static __latent_entropy struct task_struct *copy_process(
|
||||
#ifdef CONFIG_KRETPROBES
|
||||
p->kretprobe_instances.first = NULL;
|
||||
#endif
|
||||
#ifdef CONFIG_RETHOOK
|
||||
p->rethooks.first = NULL;
|
||||
#endif
|
||||
|
||||
/*
|
||||
* Ensure that the cgroup subsystem policies allow the new process to be
|
||||
|
@ -212,6 +212,10 @@ unsigned long kallsyms_lookup_name(const char *name)
|
||||
unsigned long i;
|
||||
unsigned int off;
|
||||
|
||||
/* Skip the search for empty string. */
|
||||
if (!*name)
|
||||
return 0;
|
||||
|
||||
for (i = 0, off = 0; i < kallsyms_num_syms; i++) {
|
||||
off = kallsyms_expand_symbol(off, namebuf, ARRAY_SIZE(namebuf));
|
||||
|
||||
|
@ -10,6 +10,17 @@ config USER_STACKTRACE_SUPPORT
|
||||
config NOP_TRACER
|
||||
bool
|
||||
|
||||
config HAVE_RETHOOK
|
||||
bool
|
||||
|
||||
config RETHOOK
|
||||
bool
|
||||
depends on HAVE_RETHOOK
|
||||
help
|
||||
Enable generic return hooking feature. This is an internal
|
||||
API, which will be used by other function-entry hooking
|
||||
features like fprobe and kprobes.
|
||||
|
||||
config HAVE_FUNCTION_TRACER
|
||||
bool
|
||||
help
|
||||
@ -236,6 +247,21 @@ config DYNAMIC_FTRACE_WITH_ARGS
|
||||
depends on DYNAMIC_FTRACE
|
||||
depends on HAVE_DYNAMIC_FTRACE_WITH_ARGS
|
||||
|
||||
config FPROBE
|
||||
bool "Kernel Function Probe (fprobe)"
|
||||
depends on FUNCTION_TRACER
|
||||
depends on DYNAMIC_FTRACE_WITH_REGS
|
||||
depends on HAVE_RETHOOK
|
||||
select RETHOOK
|
||||
default n
|
||||
help
|
||||
This option enables kernel function probe (fprobe) based on ftrace.
|
||||
The fprobe is similar to kprobes, but probes only for kernel function
|
||||
entries and exits. This also can probe multiple functions by one
|
||||
fprobe.
|
||||
|
||||
If unsure, say N.
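A hedged sketch of the in-kernel API this option turns on: one fprobe hooking the entry of several functions. The handler signature follows the fprobe series; the probed symbols and module boilerplate are just examples.

#include <linux/fprobe.h>
#include <linux/kernel.h>
#include <linux/module.h>

static void entry_handler(struct fprobe *fp, unsigned long entry_ip,
			  struct pt_regs *regs)
{
	pr_info("fprobe entry at %pS\n", (void *)entry_ip);
}

static struct fprobe my_fprobe = {
	.entry_handler = entry_handler,
};

static const char *syms[] = { "vfs_read", "vfs_write" };

static int __init fprobe_example_init(void)
{
	return register_fprobe_syms(&my_fprobe, syms, ARRAY_SIZE(syms));
}

static void __exit fprobe_example_exit(void)
{
	unregister_fprobe(&my_fprobe);
}

module_init(fprobe_example_init);
module_exit(fprobe_example_exit);
MODULE_LICENSE("GPL");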
|
||||
|
||||
config FUNCTION_PROFILER
|
||||
bool "Kernel function profiler"
|
||||
depends on FUNCTION_TRACER
|
||||
|
@ -97,6 +97,8 @@ obj-$(CONFIG_PROBE_EVENTS) += trace_probe.o
|
||||
obj-$(CONFIG_UPROBE_EVENTS) += trace_uprobe.o
|
||||
obj-$(CONFIG_BOOTTIME_TRACING) += trace_boot.o
|
||||
obj-$(CONFIG_FTRACE_RECORD_RECURSION) += trace_recursion_record.o
|
||||
obj-$(CONFIG_FPROBE) += fprobe.o
|
||||
obj-$(CONFIG_RETHOOK) += rethook.o
|
||||
|
||||
obj-$(CONFIG_TRACEPOINT_BENCHMARK) += trace_benchmark.o
|
||||
|
||||
|
@ -17,6 +17,9 @@
|
||||
#include <linux/error-injection.h>
|
||||
#include <linux/btf_ids.h>
|
||||
#include <linux/bpf_lsm.h>
|
||||
#include <linux/fprobe.h>
|
||||
#include <linux/bsearch.h>
|
||||
#include <linux/sort.h>
|
||||
|
||||
#include <net/bpf_sk_storage.h>
|
||||
|
||||
@ -77,6 +80,8 @@ u64 bpf_get_stack(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5);
|
||||
static int bpf_btf_printf_prepare(struct btf_ptr *ptr, u32 btf_ptr_size,
|
||||
u64 flags, const struct btf **btf,
|
||||
s32 *btf_id);
|
||||
static u64 bpf_kprobe_multi_cookie(struct bpf_run_ctx *ctx);
|
||||
static u64 bpf_kprobe_multi_entry_ip(struct bpf_run_ctx *ctx);
|
||||
|
||||
/**
|
||||
* trace_call_bpf - invoke BPF program
|
||||
@ -1036,6 +1041,30 @@ static const struct bpf_func_proto bpf_get_func_ip_proto_kprobe = {
|
||||
.arg1_type = ARG_PTR_TO_CTX,
|
||||
};
|
||||
|
||||
BPF_CALL_1(bpf_get_func_ip_kprobe_multi, struct pt_regs *, regs)
|
||||
{
|
||||
return bpf_kprobe_multi_entry_ip(current->bpf_ctx);
|
||||
}
|
||||
|
||||
static const struct bpf_func_proto bpf_get_func_ip_proto_kprobe_multi = {
|
||||
.func = bpf_get_func_ip_kprobe_multi,
|
||||
.gpl_only = false,
|
||||
.ret_type = RET_INTEGER,
|
||||
.arg1_type = ARG_PTR_TO_CTX,
|
||||
};
|
||||
|
||||
BPF_CALL_1(bpf_get_attach_cookie_kprobe_multi, struct pt_regs *, regs)
|
||||
{
|
||||
return bpf_kprobe_multi_cookie(current->bpf_ctx);
|
||||
}
|
||||
|
||||
static const struct bpf_func_proto bpf_get_attach_cookie_proto_kmulti = {
|
||||
.func = bpf_get_attach_cookie_kprobe_multi,
|
||||
.gpl_only = false,
|
||||
.ret_type = RET_INTEGER,
|
||||
.arg1_type = ARG_PTR_TO_CTX,
|
||||
};
|
||||
|
||||
BPF_CALL_1(bpf_get_attach_cookie_trace, void *, ctx)
|
||||
{
|
||||
struct bpf_trace_run_ctx *run_ctx;
|
||||
@ -1279,9 +1308,13 @@ kprobe_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
|
||||
return &bpf_override_return_proto;
|
||||
#endif
|
||||
case BPF_FUNC_get_func_ip:
|
||||
return &bpf_get_func_ip_proto_kprobe;
|
||||
return prog->expected_attach_type == BPF_TRACE_KPROBE_MULTI ?
|
||||
&bpf_get_func_ip_proto_kprobe_multi :
|
||||
&bpf_get_func_ip_proto_kprobe;
|
||||
case BPF_FUNC_get_attach_cookie:
|
||||
return &bpf_get_attach_cookie_proto_trace;
|
||||
return prog->expected_attach_type == BPF_TRACE_KPROBE_MULTI ?
|
||||
&bpf_get_attach_cookie_proto_kmulti :
|
||||
&bpf_get_attach_cookie_proto_trace;
|
||||
default:
|
||||
return bpf_tracing_func_proto(func_id, prog);
|
||||
}
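On the BPF side, a kprobe.multi program can combine the two protos selected above; a minimal hedged sketch follows (section name per libbpf's new kprobe.multi convention, pattern and output are assumptions).

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

SEC("kprobe.multi/vfs_*")
int trace_entry(struct pt_regs *ctx)
{
	__u64 ip = bpf_get_func_ip(ctx);		/* entry IP of the hit symbol */
	__u64 cookie = bpf_get_attach_cookie(ctx);	/* per-symbol cookie, if any */

	bpf_printk("hit %llx cookie %llu", ip, cookie);
	return 0;
}

char LICENSE[] SEC("license") = "GPL";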
|
||||
@ -2181,3 +2214,314 @@ static int __init bpf_event_init(void)
|
||||
|
||||
fs_initcall(bpf_event_init);
|
||||
#endif /* CONFIG_MODULES */
|
||||
|
||||
#ifdef CONFIG_FPROBE
|
||||
struct bpf_kprobe_multi_link {
|
||||
struct bpf_link link;
|
||||
struct fprobe fp;
|
||||
unsigned long *addrs;
|
||||
u64 *cookies;
|
||||
u32 cnt;
|
||||
};
|
||||
|
||||
struct bpf_kprobe_multi_run_ctx {
|
||||
struct bpf_run_ctx run_ctx;
|
||||
struct bpf_kprobe_multi_link *link;
|
||||
unsigned long entry_ip;
|
||||
};
|
||||
|
||||
static void bpf_kprobe_multi_link_release(struct bpf_link *link)
|
||||
{
|
||||
struct bpf_kprobe_multi_link *kmulti_link;
|
||||
|
||||
kmulti_link = container_of(link, struct bpf_kprobe_multi_link, link);
|
||||
unregister_fprobe(&kmulti_link->fp);
|
||||
}
|
||||
|
||||
static void bpf_kprobe_multi_link_dealloc(struct bpf_link *link)
|
||||
{
|
||||
struct bpf_kprobe_multi_link *kmulti_link;
|
||||
|
||||
kmulti_link = container_of(link, struct bpf_kprobe_multi_link, link);
|
||||
kvfree(kmulti_link->addrs);
|
||||
kvfree(kmulti_link->cookies);
|
||||
kfree(kmulti_link);
|
||||
}
|
||||
|
||||
static const struct bpf_link_ops bpf_kprobe_multi_link_lops = {
|
||||
.release = bpf_kprobe_multi_link_release,
|
||||
.dealloc = bpf_kprobe_multi_link_dealloc,
|
||||
};
|
||||
|
||||
static void bpf_kprobe_multi_cookie_swap(void *a, void *b, int size, const void *priv)
|
||||
{
|
||||
const struct bpf_kprobe_multi_link *link = priv;
|
||||
unsigned long *addr_a = a, *addr_b = b;
|
||||
u64 *cookie_a, *cookie_b;
|
||||
unsigned long tmp1;
|
||||
u64 tmp2;
|
||||
|
||||
cookie_a = link->cookies + (addr_a - link->addrs);
|
||||
cookie_b = link->cookies + (addr_b - link->addrs);
|
||||
|
||||
/* swap addr_a/addr_b and cookie_a/cookie_b values */
|
||||
tmp1 = *addr_a; *addr_a = *addr_b; *addr_b = tmp1;
|
||||
tmp2 = *cookie_a; *cookie_a = *cookie_b; *cookie_b = tmp2;
|
||||
}
|
||||
|
||||
static int __bpf_kprobe_multi_cookie_cmp(const void *a, const void *b)
|
||||
{
|
||||
const unsigned long *addr_a = a, *addr_b = b;
|
||||
|
||||
if (*addr_a == *addr_b)
|
||||
return 0;
|
||||
return *addr_a < *addr_b ? -1 : 1;
|
||||
}
|
||||
|
||||
static int bpf_kprobe_multi_cookie_cmp(const void *a, const void *b, const void *priv)
|
||||
{
|
||||
return __bpf_kprobe_multi_cookie_cmp(a, b);
|
||||
}
|
||||
|
||||
static u64 bpf_kprobe_multi_cookie(struct bpf_run_ctx *ctx)
|
||||
{
|
||||
struct bpf_kprobe_multi_run_ctx *run_ctx;
|
||||
struct bpf_kprobe_multi_link *link;
|
||||
u64 *cookie, entry_ip;
|
||||
unsigned long *addr;
|
||||
|
||||
if (WARN_ON_ONCE(!ctx))
|
||||
return 0;
|
||||
run_ctx = container_of(current->bpf_ctx, struct bpf_kprobe_multi_run_ctx, run_ctx);
|
||||
link = run_ctx->link;
|
||||
if (!link->cookies)
|
||||
return 0;
|
||||
entry_ip = run_ctx->entry_ip;
|
||||
addr = bsearch(&entry_ip, link->addrs, link->cnt, sizeof(entry_ip),
|
||||
__bpf_kprobe_multi_cookie_cmp);
|
||||
if (!addr)
|
||||
return 0;
|
||||
cookie = link->cookies + (addr - link->addrs);
|
||||
return *cookie;
|
||||
}
|
||||
|
||||
static u64 bpf_kprobe_multi_entry_ip(struct bpf_run_ctx *ctx)
|
||||
{
|
||||
struct bpf_kprobe_multi_run_ctx *run_ctx;
|
||||
|
||||
run_ctx = container_of(current->bpf_ctx, struct bpf_kprobe_multi_run_ctx, run_ctx);
|
||||
return run_ctx->entry_ip;
|
||||
}
|
||||
|
||||
static int
|
||||
kprobe_multi_link_prog_run(struct bpf_kprobe_multi_link *link,
|
||||
unsigned long entry_ip, struct pt_regs *regs)
|
||||
{
|
||||
struct bpf_kprobe_multi_run_ctx run_ctx = {
|
||||
.link = link,
|
||||
.entry_ip = entry_ip,
|
||||
};
|
||||
struct bpf_run_ctx *old_run_ctx;
|
||||
int err;
|
||||
|
||||
if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1)) {
|
||||
err = 0;
|
||||
goto out;
|
||||
}
|
||||
|
||||
migrate_disable();
|
||||
rcu_read_lock();
|
||||
old_run_ctx = bpf_set_run_ctx(&run_ctx.run_ctx);
|
||||
err = bpf_prog_run(link->link.prog, regs);
|
||||
bpf_reset_run_ctx(old_run_ctx);
|
||||
rcu_read_unlock();
|
||||
migrate_enable();
|
||||
|
||||
out:
|
||||
__this_cpu_dec(bpf_prog_active);
|
||||
return err;
|
||||
}
|
||||
|
||||
static void
|
||||
kprobe_multi_link_handler(struct fprobe *fp, unsigned long entry_ip,
|
||||
struct pt_regs *regs)
|
||||
{
|
||||
struct bpf_kprobe_multi_link *link;
|
||||
|
||||
link = container_of(fp, struct bpf_kprobe_multi_link, fp);
|
||||
kprobe_multi_link_prog_run(link, entry_ip, regs);
|
||||
}
|
||||
|
||||
static int
|
||||
kprobe_multi_resolve_syms(const void *usyms, u32 cnt,
|
||||
unsigned long *addrs)
|
||||
{
|
||||
unsigned long addr, size;
|
||||
const char **syms;
|
||||
int err = -ENOMEM;
|
||||
unsigned int i;
|
||||
char *func;
|
||||
|
||||
size = cnt * sizeof(*syms);
|
||||
syms = kvzalloc(size, GFP_KERNEL);
|
||||
if (!syms)
|
||||
return -ENOMEM;
|
||||
|
||||
func = kmalloc(KSYM_NAME_LEN, GFP_KERNEL);
|
||||
if (!func)
|
||||
goto error;
|
||||
|
||||
if (copy_from_user(syms, usyms, size)) {
|
||||
err = -EFAULT;
|
||||
goto error;
|
||||
}
|
||||
|
||||
for (i = 0; i < cnt; i++) {
|
||||
err = strncpy_from_user(func, syms[i], KSYM_NAME_LEN);
|
||||
if (err == KSYM_NAME_LEN)
|
||||
err = -E2BIG;
|
||||
if (err < 0)
|
||||
goto error;
|
||||
err = -EINVAL;
|
||||
addr = kallsyms_lookup_name(func);
|
||||
if (!addr)
|
||||
goto error;
|
||||
if (!kallsyms_lookup_size_offset(addr, &size, NULL))
|
||||
goto error;
|
||||
addr = ftrace_location_range(addr, addr + size - 1);
|
||||
if (!addr)
|
||||
goto error;
|
||||
addrs[i] = addr;
|
||||
}
|
||||
|
||||
err = 0;
|
||||
error:
|
||||
kvfree(syms);
|
||||
kfree(func);
|
||||
return err;
|
||||
}
|
||||
|
||||
int bpf_kprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
|
||||
{
|
||||
struct bpf_kprobe_multi_link *link = NULL;
|
||||
struct bpf_link_primer link_primer;
|
||||
void __user *ucookies;
|
||||
unsigned long *addrs;
|
||||
u32 flags, cnt, size;
|
||||
void __user *uaddrs;
|
||||
u64 *cookies = NULL;
|
||||
void __user *usyms;
|
||||
int err;
|
||||
|
||||
/* no support for 32bit archs yet */
|
||||
if (sizeof(u64) != sizeof(void *))
|
||||
return -EOPNOTSUPP;
|
||||
|
||||
if (prog->expected_attach_type != BPF_TRACE_KPROBE_MULTI)
|
||||
return -EINVAL;
|
||||
|
||||
flags = attr->link_create.kprobe_multi.flags;
|
||||
if (flags & ~BPF_F_KPROBE_MULTI_RETURN)
|
||||
return -EINVAL;
|
||||
|
||||
uaddrs = u64_to_user_ptr(attr->link_create.kprobe_multi.addrs);
|
||||
usyms = u64_to_user_ptr(attr->link_create.kprobe_multi.syms);
|
||||
if (!!uaddrs == !!usyms)
|
||||
return -EINVAL;
|
||||
|
||||
cnt = attr->link_create.kprobe_multi.cnt;
|
||||
if (!cnt)
|
||||
return -EINVAL;
|
||||
|
||||
size = cnt * sizeof(*addrs);
|
||||
addrs = kvmalloc(size, GFP_KERNEL);
|
||||
if (!addrs)
|
||||
return -ENOMEM;
|
||||
|
||||
if (uaddrs) {
|
||||
if (copy_from_user(addrs, uaddrs, size)) {
|
||||
err = -EFAULT;
|
||||
goto error;
|
||||
}
|
||||
} else {
|
||||
err = kprobe_multi_resolve_syms(usyms, cnt, addrs);
|
||||
if (err)
|
||||
goto error;
|
||||
}
|
||||
|
||||
ucookies = u64_to_user_ptr(attr->link_create.kprobe_multi.cookies);
|
||||
if (ucookies) {
|
||||
cookies = kvmalloc(size, GFP_KERNEL);
|
||||
if (!cookies) {
|
||||
err = -ENOMEM;
|
||||
goto error;
|
||||
}
|
||||
if (copy_from_user(cookies, ucookies, size)) {
|
||||
err = -EFAULT;
|
||||
goto error;
|
||||
}
|
||||
}
|
||||
|
||||
link = kzalloc(sizeof(*link), GFP_KERNEL);
|
||||
if (!link) {
|
||||
err = -ENOMEM;
|
||||
goto error;
|
||||
}
|
||||
|
||||
bpf_link_init(&link->link, BPF_LINK_TYPE_KPROBE_MULTI,
|
||||
&bpf_kprobe_multi_link_lops, prog);
|
||||
|
||||
err = bpf_link_prime(&link->link, &link_primer);
|
||||
if (err)
|
||||
goto error;
|
||||
|
||||
if (flags & BPF_F_KPROBE_MULTI_RETURN)
|
||||
link->fp.exit_handler = kprobe_multi_link_handler;
|
||||
else
|
||||
link->fp.entry_handler = kprobe_multi_link_handler;
|
||||
|
||||
link->addrs = addrs;
|
||||
link->cookies = cookies;
|
||||
link->cnt = cnt;
|
||||
|
||||
if (cookies) {
|
||||
/*
|
||||
* Sorting addresses will trigger sorting cookies as well
|
||||
* (check bpf_kprobe_multi_cookie_swap). This way we can
|
||||
* find cookie based on the address in bpf_get_attach_cookie
|
||||
* helper.
|
||||
*/
|
||||
sort_r(addrs, cnt, sizeof(*addrs),
|
||||
bpf_kprobe_multi_cookie_cmp,
|
||||
bpf_kprobe_multi_cookie_swap,
|
||||
link);
|
||||
}
|
||||
|
||||
err = register_fprobe_ips(&link->fp, addrs, cnt);
|
||||
if (err) {
|
||||
bpf_link_cleanup(&link_primer);
|
||||
return err;
|
||||
}
|
||||
|
||||
return bpf_link_settle(&link_primer);
|
||||
|
||||
error:
|
||||
kfree(link);
|
||||
kvfree(addrs);
|
||||
kvfree(cookies);
|
||||
return err;
|
||||
}
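For readers following the cookie plumbing above: addrs[] and cookies[] are deliberately kept as two parallel arrays, co-sorted by address through sort_r() with a custom swap callback, so the bsearch() in bpf_kprobe_multi_cookie() can turn an entry IP directly into a cookie index. Below is a minimal userspace C sketch of that parallel-array pattern; it is illustrative only, and the tiny insertion sort plus the sample addresses and cookies are not part of this patch.

/*
 * Illustrative userspace sketch of the parallel-array scheme used by the
 * kprobe_multi link: keep addrs[] and cookies[] in lockstep while sorting
 * by address, then bsearch() on addrs[] and reuse the hit index for cookies[].
 */
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

static int cmp_addr(const void *a, const void *b)
{
	const unsigned long *x = a, *y = b;

	return *x == *y ? 0 : (*x < *y ? -1 : 1);
}

/* Co-sort both arrays by address; the paired swap keeps them parallel. */
static void sort_pairs(unsigned long *addrs, uint64_t *cookies, size_t cnt)
{
	for (size_t i = 1; i < cnt; i++) {
		for (size_t j = i; j > 0 && addrs[j - 1] > addrs[j]; j--) {
			unsigned long ta = addrs[j];
			uint64_t tc = cookies[j];

			addrs[j] = addrs[j - 1];
			addrs[j - 1] = ta;
			cookies[j] = cookies[j - 1];
			cookies[j - 1] = tc;
		}
	}
}

int main(void)
{
	unsigned long addrs[] = { 0x3000, 0x1000, 0x2000 };	/* sample data */
	uint64_t cookies[]    = {     30,     10,     20 };
	unsigned long key = 0x2000, *hit;

	sort_pairs(addrs, cookies, 3);
	hit = bsearch(&key, addrs, 3, sizeof(*addrs), cmp_addr);
	if (hit)
		printf("cookie for %#lx is %llu\n", key,
		       (unsigned long long)cookies[hit - addrs]);
	return 0;
}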
|
||||
#else /* !CONFIG_FPROBE */
|
||||
int bpf_kprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
|
||||
{
|
||||
return -EOPNOTSUPP;
|
||||
}
|
||||
static u64 bpf_kprobe_multi_cookie(struct bpf_run_ctx *ctx)
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
static u64 bpf_kprobe_multi_entry_ip(struct bpf_run_ctx *ctx)
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
#endif
|
||||
|
332
kernel/trace/fprobe.c
Normal file
@ -0,0 +1,332 @@
|
||||
// SPDX-License-Identifier: GPL-2.0
|
||||
/*
|
||||
* fprobe - Simple ftrace probe wrapper for function entry.
|
||||
*/
|
||||
#define pr_fmt(fmt) "fprobe: " fmt
|
||||
|
||||
#include <linux/err.h>
|
||||
#include <linux/fprobe.h>
|
||||
#include <linux/kallsyms.h>
|
||||
#include <linux/kprobes.h>
|
||||
#include <linux/rethook.h>
|
||||
#include <linux/slab.h>
|
||||
#include <linux/sort.h>
|
||||
|
||||
#include "trace.h"
|
||||
|
||||
struct fprobe_rethook_node {
|
||||
struct rethook_node node;
|
||||
unsigned long entry_ip;
|
||||
};
|
||||
|
||||
static void fprobe_handler(unsigned long ip, unsigned long parent_ip,
|
||||
struct ftrace_ops *ops, struct ftrace_regs *fregs)
|
||||
{
|
||||
struct fprobe_rethook_node *fpr;
|
||||
struct rethook_node *rh;
|
||||
struct fprobe *fp;
|
||||
int bit;
|
||||
|
||||
fp = container_of(ops, struct fprobe, ops);
|
||||
if (fprobe_disabled(fp))
|
||||
return;
|
||||
|
||||
bit = ftrace_test_recursion_trylock(ip, parent_ip);
|
||||
if (bit < 0) {
|
||||
fp->nmissed++;
|
||||
return;
|
||||
}
|
||||
|
||||
if (fp->entry_handler)
|
||||
fp->entry_handler(fp, ip, ftrace_get_regs(fregs));
|
||||
|
||||
if (fp->exit_handler) {
|
||||
rh = rethook_try_get(fp->rethook);
|
||||
if (!rh) {
|
||||
fp->nmissed++;
|
||||
goto out;
|
||||
}
|
||||
fpr = container_of(rh, struct fprobe_rethook_node, node);
|
||||
fpr->entry_ip = ip;
|
||||
rethook_hook(rh, ftrace_get_regs(fregs), true);
|
||||
}
|
||||
|
||||
out:
|
||||
ftrace_test_recursion_unlock(bit);
|
||||
}
|
||||
NOKPROBE_SYMBOL(fprobe_handler);
|
||||
|
||||
static void fprobe_kprobe_handler(unsigned long ip, unsigned long parent_ip,
|
||||
struct ftrace_ops *ops, struct ftrace_regs *fregs)
|
||||
{
|
||||
struct fprobe *fp = container_of(ops, struct fprobe, ops);
|
||||
|
||||
if (unlikely(kprobe_running())) {
|
||||
fp->nmissed++;
|
||||
return;
|
||||
}
|
||||
kprobe_busy_begin();
|
||||
fprobe_handler(ip, parent_ip, ops, fregs);
|
||||
kprobe_busy_end();
|
||||
}
|
||||
|
||||
static void fprobe_exit_handler(struct rethook_node *rh, void *data,
|
||||
struct pt_regs *regs)
|
||||
{
|
||||
struct fprobe *fp = (struct fprobe *)data;
|
||||
struct fprobe_rethook_node *fpr;
|
||||
|
||||
if (!fp || fprobe_disabled(fp))
|
||||
return;
|
||||
|
||||
fpr = container_of(rh, struct fprobe_rethook_node, node);
|
||||
|
||||
fp->exit_handler(fp, fpr->entry_ip, regs);
|
||||
}
|
||||
NOKPROBE_SYMBOL(fprobe_exit_handler);
|
||||
|
||||
/* Convert symbols to ftrace location addresses */
|
||||
static unsigned long *get_ftrace_locations(const char **syms, int num)
|
||||
{
|
||||
unsigned long addr, size;
|
||||
unsigned long *addrs;
|
||||
int i;
|
||||
|
||||
/* Convert symbols to symbol address */
|
||||
addrs = kcalloc(num, sizeof(*addrs), GFP_KERNEL);
|
||||
if (!addrs)
|
||||
return ERR_PTR(-ENOMEM);
|
||||
|
||||
for (i = 0; i < num; i++) {
|
||||
addr = kallsyms_lookup_name(syms[i]);
|
||||
if (!addr) /* Maybe wrong symbol */
|
||||
goto error;
|
||||
|
||||
/* Convert symbol address to ftrace location. */
|
||||
if (!kallsyms_lookup_size_offset(addr, &size, NULL) || !size)
|
||||
goto error;
|
||||
|
||||
addr = ftrace_location_range(addr, addr + size - 1);
|
||||
if (!addr) /* No dynamic ftrace there. */
|
||||
goto error;
|
||||
|
||||
addrs[i] = addr;
|
||||
}
|
||||
|
||||
return addrs;
|
||||
|
||||
error:
|
||||
kfree(addrs);
|
||||
|
||||
return ERR_PTR(-ENOENT);
|
||||
}
|
||||
|
||||
static void fprobe_init(struct fprobe *fp)
|
||||
{
|
||||
fp->nmissed = 0;
|
||||
if (fprobe_shared_with_kprobes(fp))
|
||||
fp->ops.func = fprobe_kprobe_handler;
|
||||
else
|
||||
fp->ops.func = fprobe_handler;
|
||||
fp->ops.flags |= FTRACE_OPS_FL_SAVE_REGS;
|
||||
}
|
||||
|
||||
static int fprobe_init_rethook(struct fprobe *fp, int num)
|
||||
{
|
||||
int i, size;
|
||||
|
||||
if (num < 0)
|
||||
return -EINVAL;
|
||||
|
||||
if (!fp->exit_handler) {
|
||||
fp->rethook = NULL;
|
||||
return 0;
|
||||
}
|
||||
|
||||
/* Initialize rethook if needed */
|
||||
size = num * num_possible_cpus() * 2;
|
||||
if (size < 0)
|
||||
return -E2BIG;
|
||||
|
||||
fp->rethook = rethook_alloc((void *)fp, fprobe_exit_handler);
|
||||
for (i = 0; i < size; i++) {
|
||||
struct rethook_node *node;
|
||||
|
||||
node = kzalloc(sizeof(struct fprobe_rethook_node), GFP_KERNEL);
|
||||
if (!node) {
|
||||
rethook_free(fp->rethook);
|
||||
fp->rethook = NULL;
|
||||
return -ENOMEM;
|
||||
}
|
||||
rethook_add_node(fp->rethook, node);
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
|
||||
static void fprobe_fail_cleanup(struct fprobe *fp)
|
||||
{
|
||||
if (fp->rethook) {
|
||||
/* Don't need to cleanup rethook->handler because this is not used. */
|
||||
rethook_free(fp->rethook);
|
||||
fp->rethook = NULL;
|
||||
}
|
||||
ftrace_free_filter(&fp->ops);
|
||||
}
|
||||
|
||||
/**
|
||||
* register_fprobe() - Register fprobe to ftrace by pattern.
|
||||
* @fp: A fprobe data structure to be registered.
|
||||
* @filter: A wildcard pattern of probed symbols.
|
||||
* @notfilter: A wildcard pattern of NOT probed symbols.
|
||||
*
|
||||
* Register @fp to ftrace for enabling the probe on the symbols matched to @filter.
|
||||
* If @notfilter is not NULL, the symbols matching @notfilter are not probed.
|
||||
*
|
||||
* Return 0 if @fp is registered successfully, -errno if not.
|
||||
*/
|
||||
int register_fprobe(struct fprobe *fp, const char *filter, const char *notfilter)
|
||||
{
|
||||
struct ftrace_hash *hash;
|
||||
unsigned char *str;
|
||||
int ret, len;
|
||||
|
||||
if (!fp || !filter)
|
||||
return -EINVAL;
|
||||
|
||||
fprobe_init(fp);
|
||||
|
||||
len = strlen(filter);
|
||||
str = kstrdup(filter, GFP_KERNEL);
|
||||
ret = ftrace_set_filter(&fp->ops, str, len, 0);
|
||||
kfree(str);
|
||||
if (ret)
|
||||
return ret;
|
||||
|
||||
if (notfilter) {
|
||||
len = strlen(notfilter);
|
||||
str = kstrdup(notfilter, GFP_KERNEL);
|
||||
ret = ftrace_set_notrace(&fp->ops, str, len, 0);
|
||||
kfree(str);
|
||||
if (ret)
|
||||
goto out;
|
||||
}
|
||||
|
||||
/* TODO:
|
||||
* correctly calculate the total number of filtered symbols
|
||||
* from both filter and notfilter.
|
||||
*/
|
||||
hash = fp->ops.local_hash.filter_hash;
|
||||
if (WARN_ON_ONCE(!hash))
|
||||
goto out;
|
||||
|
||||
ret = fprobe_init_rethook(fp, (int)hash->count);
|
||||
if (!ret)
|
||||
ret = register_ftrace_function(&fp->ops);
|
||||
|
||||
out:
|
||||
if (ret)
|
||||
fprobe_fail_cleanup(fp);
|
||||
return ret;
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(register_fprobe);
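As a usage note for the API exported above, here is a hedged sketch of a minimal kernel module that attaches an fprobe by wildcard pattern. The module name, handler bodies and the "vfs_read*"/"vfs_readv*" patterns are made up for illustration; only the struct fprobe fields and the register_fprobe()/unregister_fprobe() signatures come from this series.

/* Hypothetical example module; not part of this patch. */
#include <linux/module.h>
#include <linux/fprobe.h>
#include <linux/ptrace.h>

static void sample_entry(struct fprobe *fp, unsigned long ip, struct pt_regs *regs)
{
	pr_info("entering %pS\n", (void *)ip);
}

static void sample_exit(struct fprobe *fp, unsigned long ip, struct pt_regs *regs)
{
	pr_info("returning from %pS, retval=%lx\n",
		(void *)ip, regs_return_value(regs));
}

static struct fprobe sample_fprobe = {
	.entry_handler	= sample_entry,
	.exit_handler	= sample_exit,	/* triggers rethook allocation */
};

static int __init sample_init(void)
{
	/* probe everything matching "vfs_read*" except "vfs_readv*" */
	return register_fprobe(&sample_fprobe, "vfs_read*", "vfs_readv*");
}

static void __exit sample_cleanup(void)
{
	unregister_fprobe(&sample_fprobe);
	pr_info("missed events: %lu\n", sample_fprobe.nmissed);
}

module_init(sample_init);
module_exit(sample_cleanup);
MODULE_LICENSE("GPL");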
|
||||
|
||||
/**
|
||||
* register_fprobe_ips() - Register fprobe to ftrace by address.
|
||||
* @fp: A fprobe data structure to be registered.
|
||||
* @addrs: An array of target ftrace location addresses.
|
||||
* @num: The number of entries of @addrs.
|
||||
*
|
||||
* Register @fp to ftrace for enabling the probe on the address given by @addrs.
|
||||
* The @addrs must be ftrace location addresses, which may be
|
||||
* the symbol address + arch-dependent offset.
|
||||
* If you are unsure what this means, please use the other registration functions.
|
||||
*
|
||||
* Return 0 if @fp is registered successfully, -errno if not.
|
||||
*/
|
||||
int register_fprobe_ips(struct fprobe *fp, unsigned long *addrs, int num)
|
||||
{
|
||||
int ret;
|
||||
|
||||
if (!fp || !addrs || num <= 0)
|
||||
return -EINVAL;
|
||||
|
||||
fprobe_init(fp);
|
||||
|
||||
ret = ftrace_set_filter_ips(&fp->ops, addrs, num, 0, 0);
|
||||
if (ret)
|
||||
return ret;
|
||||
|
||||
ret = fprobe_init_rethook(fp, num);
|
||||
if (!ret)
|
||||
ret = register_ftrace_function(&fp->ops);
|
||||
|
||||
if (ret)
|
||||
fprobe_fail_cleanup(fp);
|
||||
return ret;
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(register_fprobe_ips);
|
||||
|
||||
/**
|
||||
* register_fprobe_syms() - Register fprobe to ftrace by symbols.
|
||||
* @fp: A fprobe data structure to be registered.
|
||||
* @syms: An array of target symbols.
|
||||
* @num: The number of entries of @syms.
|
||||
*
|
||||
* Register @fp to the symbols given by @syms array. This will be useful if
|
||||
* you are sure the symbols exist in the kernel.
|
||||
*
|
||||
* Return 0 if @fp is registered successfully, -errno if not.
|
||||
*/
|
||||
int register_fprobe_syms(struct fprobe *fp, const char **syms, int num)
|
||||
{
|
||||
unsigned long *addrs;
|
||||
int ret;
|
||||
|
||||
if (!fp || !syms || num <= 0)
|
||||
return -EINVAL;
|
||||
|
||||
addrs = get_ftrace_locations(syms, num);
|
||||
if (IS_ERR(addrs))
|
||||
return PTR_ERR(addrs);
|
||||
|
||||
ret = register_fprobe_ips(fp, addrs, num);
|
||||
|
||||
kfree(addrs);
|
||||
|
||||
return ret;
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(register_fprobe_syms);
|
||||
|
||||
/**
|
||||
* unregister_fprobe() - Unregister fprobe from ftrace
|
||||
* @fp: A fprobe data structure to be unregistered.
|
||||
*
|
||||
* Unregister fprobe (and remove ftrace hooks from the function entries).
|
||||
*
|
||||
* Return 0 if @fp is unregistered successfully, -errno if not.
|
||||
*/
|
||||
int unregister_fprobe(struct fprobe *fp)
|
||||
{
|
||||
int ret;
|
||||
|
||||
if (!fp || fp->ops.func != fprobe_handler)
|
||||
return -EINVAL;
|
||||
|
||||
/*
|
||||
* rethook_free() starts disabling the rethook, but the rethook handlers
|
||||
* may be running on other processors at this point. To make sure that all
|
||||
* current running handlers are finished, call unregister_ftrace_function()
|
||||
* after this.
|
||||
*/
|
||||
if (fp->rethook)
|
||||
rethook_free(fp->rethook);
|
||||
|
||||
ret = unregister_ftrace_function(&fp->ops);
|
||||
if (ret < 0)
|
||||
return ret;
|
||||
|
||||
ftrace_free_filter(&fp->ops);
|
||||
|
||||
return ret;
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(unregister_fprobe);
|
@ -4958,7 +4958,7 @@ ftrace_notrace_write(struct file *file, const char __user *ubuf,
|
||||
}
|
||||
|
||||
static int
|
||||
ftrace_match_addr(struct ftrace_hash *hash, unsigned long ip, int remove)
|
||||
__ftrace_match_addr(struct ftrace_hash *hash, unsigned long ip, int remove)
|
||||
{
|
||||
struct ftrace_func_entry *entry;
|
||||
|
||||
@ -4976,9 +4976,30 @@ ftrace_match_addr(struct ftrace_hash *hash, unsigned long ip, int remove)
|
||||
return add_hash_entry(hash, ip);
|
||||
}
|
||||
|
||||
static int
|
||||
ftrace_match_addr(struct ftrace_hash *hash, unsigned long *ips,
|
||||
unsigned int cnt, int remove)
|
||||
{
|
||||
unsigned int i;
|
||||
int err;
|
||||
|
||||
for (i = 0; i < cnt; i++) {
|
||||
err = __ftrace_match_addr(hash, ips[i], remove);
|
||||
if (err) {
|
||||
/*
|
||||
* This expects @hash to be a temporary hash; if this
|
||||
* fails the caller must free the @hash.
|
||||
*/
|
||||
return err;
|
||||
}
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int
|
||||
ftrace_set_hash(struct ftrace_ops *ops, unsigned char *buf, int len,
|
||||
unsigned long ip, int remove, int reset, int enable)
|
||||
unsigned long *ips, unsigned int cnt,
|
||||
int remove, int reset, int enable)
|
||||
{
|
||||
struct ftrace_hash **orig_hash;
|
||||
struct ftrace_hash *hash;
|
||||
@ -5008,8 +5029,8 @@ ftrace_set_hash(struct ftrace_ops *ops, unsigned char *buf, int len,
|
||||
ret = -EINVAL;
|
||||
goto out_regex_unlock;
|
||||
}
|
||||
if (ip) {
|
||||
ret = ftrace_match_addr(hash, ip, remove);
|
||||
if (ips) {
|
||||
ret = ftrace_match_addr(hash, ips, cnt, remove);
|
||||
if (ret < 0)
|
||||
goto out_regex_unlock;
|
||||
}
|
||||
@ -5026,10 +5047,10 @@ ftrace_set_hash(struct ftrace_ops *ops, unsigned char *buf, int len,
|
||||
}
|
||||
|
||||
static int
|
||||
ftrace_set_addr(struct ftrace_ops *ops, unsigned long ip, int remove,
|
||||
int reset, int enable)
|
||||
ftrace_set_addr(struct ftrace_ops *ops, unsigned long *ips, unsigned int cnt,
|
||||
int remove, int reset, int enable)
|
||||
{
|
||||
return ftrace_set_hash(ops, NULL, 0, ip, remove, reset, enable);
|
||||
return ftrace_set_hash(ops, NULL, 0, ips, cnt, remove, reset, enable);
|
||||
}
|
||||
|
||||
#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
|
||||
@ -5634,10 +5655,29 @@ int ftrace_set_filter_ip(struct ftrace_ops *ops, unsigned long ip,
|
||||
int remove, int reset)
|
||||
{
|
||||
ftrace_ops_init(ops);
|
||||
return ftrace_set_addr(ops, ip, remove, reset, 1);
|
||||
return ftrace_set_addr(ops, &ip, 1, remove, reset, 1);
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(ftrace_set_filter_ip);
|
||||
|
||||
/**
|
||||
* ftrace_set_filter_ips - set functions to filter on in ftrace by addresses
|
||||
* @ops - the ops to set the filter with
|
||||
* @ips - the array of addresses to add to or remove from the filter.
|
||||
* @cnt - the number of addresses in @ips
|
||||
* @remove - non zero to remove ips from the filter
|
||||
* @reset - non zero to reset all filters before applying this filter.
|
||||
*
|
||||
* Filters denote which functions should be enabled when tracing is enabled
|
||||
* If the @ips array or any ip specified within is NULL, it fails to update the filter.
|
||||
*/
|
||||
int ftrace_set_filter_ips(struct ftrace_ops *ops, unsigned long *ips,
|
||||
unsigned int cnt, int remove, int reset)
|
||||
{
|
||||
ftrace_ops_init(ops);
|
||||
return ftrace_set_addr(ops, ips, cnt, remove, reset, 1);
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(ftrace_set_filter_ips);
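A short, hedged sketch of how a caller might use the new batch API exported above instead of looping over ftrace_set_filter_ip(); my_enable_ops() and its arguments are hypothetical, while the ftrace_set_filter_ips() and register_ftrace_function() calls are the real ones.

/* addrs[] must already contain ftrace locations (symbol address plus any
 * arch-dependent offset), e.g. resolved the same way register_fprobe_ips()
 * expects. Hypothetical helper, not part of this patch. */
static int my_enable_ops(struct ftrace_ops *ops, unsigned long *addrs,
			 unsigned int cnt)
{
	int ret;

	/* remove=0: add these ips; reset=1: drop any previous filter first */
	ret = ftrace_set_filter_ips(ops, addrs, cnt, 0, 1);
	if (ret)
		return ret;

	return register_ftrace_function(ops);
}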
|
||||
|
||||
/**
|
||||
* ftrace_ops_set_global_filter - setup ops to use global filters
|
||||
* @ops - the ops which will use the global filters
|
||||
@ -5659,7 +5699,7 @@ static int
|
||||
ftrace_set_regex(struct ftrace_ops *ops, unsigned char *buf, int len,
|
||||
int reset, int enable)
|
||||
{
|
||||
return ftrace_set_hash(ops, buf, len, 0, 0, reset, enable);
|
||||
return ftrace_set_hash(ops, buf, len, NULL, 0, 0, reset, enable);
|
||||
}
|
||||
|
||||
/**
|
||||
|
317
kernel/trace/rethook.c
Normal file
@ -0,0 +1,317 @@
|
||||
// SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
#define pr_fmt(fmt) "rethook: " fmt
|
||||
|
||||
#include <linux/bug.h>
|
||||
#include <linux/kallsyms.h>
|
||||
#include <linux/kprobes.h>
|
||||
#include <linux/preempt.h>
|
||||
#include <linux/rethook.h>
|
||||
#include <linux/slab.h>
|
||||
#include <linux/sort.h>
|
||||
|
||||
/* Return hook list (shadow stack by list) */
|
||||
|
||||
/*
|
||||
* This function is called from delayed_put_task_struct() when a task is
|
||||
* dead and cleaned up to recycle any kretprobe instances associated with
|
||||
* this task. These left over instances represent probed functions that
|
||||
* have been called but will never return.
|
||||
*/
|
||||
void rethook_flush_task(struct task_struct *tk)
|
||||
{
|
||||
struct rethook_node *rhn;
|
||||
struct llist_node *node;
|
||||
|
||||
node = __llist_del_all(&tk->rethooks);
|
||||
while (node) {
|
||||
rhn = container_of(node, struct rethook_node, llist);
|
||||
node = node->next;
|
||||
preempt_disable();
|
||||
rethook_recycle(rhn);
|
||||
preempt_enable();
|
||||
}
|
||||
}
|
||||
|
||||
static void rethook_free_rcu(struct rcu_head *head)
|
||||
{
|
||||
struct rethook *rh = container_of(head, struct rethook, rcu);
|
||||
struct rethook_node *rhn;
|
||||
struct freelist_node *node;
|
||||
int count = 1;
|
||||
|
||||
node = rh->pool.head;
|
||||
while (node) {
|
||||
rhn = container_of(node, struct rethook_node, freelist);
|
||||
node = node->next;
|
||||
kfree(rhn);
|
||||
count++;
|
||||
}
|
||||
|
||||
/* The rh->ref is the number of pooled nodes + 1 */
|
||||
if (refcount_sub_and_test(count, &rh->ref))
|
||||
kfree(rh);
|
||||
}
|
||||
|
||||
/**
|
||||
* rethook_free() - Free struct rethook.
|
||||
* @rh: the struct rethook to be freed.
|
||||
*
|
||||
* Free the rethook. Before calling this function, user must ensure the
|
||||
* @rh::data is cleaned if needed (or, the handler can access it after
|
||||
* calling this function.) This function will set the @rh to be freed
|
||||
* after all rethook_node are freed (not soon). And the caller must
|
||||
* not touch @rh after calling this.
|
||||
*/
|
||||
void rethook_free(struct rethook *rh)
|
||||
{
|
||||
rcu_assign_pointer(rh->handler, NULL);
|
||||
|
||||
call_rcu(&rh->rcu, rethook_free_rcu);
|
||||
}
|
||||
|
||||
/**
|
||||
* rethook_alloc() - Allocate struct rethook.
|
||||
* @data: a data to pass the @handler when hooking the return.
|
||||
* @handler: the return hook callback function.
|
||||
*
|
||||
* Allocate and initialize a new rethook with @data and @handler.
|
||||
* Return NULL if memory allocation fails or @handler is NULL.
|
||||
* Note that @handler == NULL means this rethook is going to be freed.
|
||||
*/
|
||||
struct rethook *rethook_alloc(void *data, rethook_handler_t handler)
|
||||
{
|
||||
struct rethook *rh = kzalloc(sizeof(struct rethook), GFP_KERNEL);
|
||||
|
||||
if (!rh || !handler)
|
||||
return NULL;
|
||||
|
||||
rh->data = data;
|
||||
rh->handler = handler;
|
||||
rh->pool.head = NULL;
|
||||
refcount_set(&rh->ref, 1);
|
||||
|
||||
return rh;
|
||||
}
|
||||
|
||||
/**
|
||||
* rethook_add_node() - Add a new node to the rethook.
|
||||
* @rh: the struct rethook.
|
||||
* @node: the struct rethook_node to be added.
|
||||
*
|
||||
* Add @node to @rh. User must allocate @node (as a part of user's
|
||||
* data structure.) The @node fields are initialized in this function.
|
||||
*/
|
||||
void rethook_add_node(struct rethook *rh, struct rethook_node *node)
|
||||
{
|
||||
node->rethook = rh;
|
||||
freelist_add(&node->freelist, &rh->pool);
|
||||
refcount_inc(&rh->ref);
|
||||
}
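To make the intended embedding pattern concrete, here is a hedged sketch of how a client of this API (fprobe_init_rethook() elsewhere in this series is the in-tree user) allocates a rethook and donates nodes to it. struct sample_ret_node, sample_ret_handler() and sample_setup_rethook() are hypothetical names; the rethook_alloc()/rethook_add_node()/rethook_free() calls and the handler signature match this file.

#include <linux/kernel.h>
#include <linux/ptrace.h>
#include <linux/rethook.h>
#include <linux/slab.h>

struct sample_ret_node {		/* caller-owned node wrapper */
	struct rethook_node node;
	unsigned long entry_ip;
};

static void sample_ret_handler(struct rethook_node *rh, void *data,
			       struct pt_regs *regs)
{
	struct sample_ret_node *n = container_of(rh, struct sample_ret_node, node);

	pr_info("returned to %lx from entry %lx\n",
		instruction_pointer(regs), n->entry_ip);
}

static struct rethook *sample_setup_rethook(void *priv, int nr_nodes)
{
	struct rethook *rh = rethook_alloc(priv, sample_ret_handler);
	int i;

	if (!rh)
		return NULL;

	for (i = 0; i < nr_nodes; i++) {
		struct sample_ret_node *n = kzalloc(sizeof(*n), GFP_KERNEL);

		if (!n) {
			rethook_free(rh);
			return NULL;
		}
		rethook_add_node(rh, &n->node);
	}
	return rh;
}

At probe time the owner then grabs a node with rethook_try_get() (with preemption disabled), records its entry state, and arms it with rethook_hook(), exactly as fprobe_handler() does earlier in this series.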
|
||||
|
||||
static void free_rethook_node_rcu(struct rcu_head *head)
|
||||
{
|
||||
struct rethook_node *node = container_of(head, struct rethook_node, rcu);
|
||||
|
||||
if (refcount_dec_and_test(&node->rethook->ref))
|
||||
kfree(node->rethook);
|
||||
kfree(node);
|
||||
}
|
||||
|
||||
/**
|
||||
* rethook_recycle() - return the node to rethook.
|
||||
* @node: The struct rethook_node to be returned.
|
||||
*
|
||||
* Return back the @node to @node::rethook. If the @node::rethook is already
|
||||
* marked as freed, this will free the @node.
|
||||
*/
|
||||
void rethook_recycle(struct rethook_node *node)
|
||||
{
|
||||
lockdep_assert_preemption_disabled();
|
||||
|
||||
if (likely(READ_ONCE(node->rethook->handler)))
|
||||
freelist_add(&node->freelist, &node->rethook->pool);
|
||||
else
|
||||
call_rcu(&node->rcu, free_rethook_node_rcu);
|
||||
}
|
||||
NOKPROBE_SYMBOL(rethook_recycle);
|
||||
|
||||
/**
|
||||
* rethook_try_get() - get an unused rethook node.
|
||||
* @rh: The struct rethook which pools the nodes.
|
||||
*
|
||||
* Get an unused rethook node from @rh. If the node pool is empty, this
|
||||
* will return NULL. Caller must disable preemption.
|
||||
*/
|
||||
struct rethook_node *rethook_try_get(struct rethook *rh)
|
||||
{
|
||||
rethook_handler_t handler = READ_ONCE(rh->handler);
|
||||
struct freelist_node *fn;
|
||||
|
||||
lockdep_assert_preemption_disabled();
|
||||
|
||||
/* Check whether @rh is going to be freed. */
|
||||
if (unlikely(!handler))
|
||||
return NULL;
|
||||
|
||||
fn = freelist_try_get(&rh->pool);
|
||||
if (!fn)
|
||||
return NULL;
|
||||
|
||||
return container_of(fn, struct rethook_node, freelist);
|
||||
}
|
||||
NOKPROBE_SYMBOL(rethook_try_get);
|
||||
|
||||
/**
|
||||
* rethook_hook() - Hook the current function return.
|
||||
* @node: The struct rethook node to hook the function return.
|
||||
* @regs: The struct pt_regs for the function entry.
|
||||
* @mcount: True if this is called from mcount(ftrace) context.
|
||||
*
|
||||
* Hook the current running function return. This must be called at the
|
||||
* function entry (or at least @regs must be the registers of the function
|
||||
* entry.) @mcount is used for identifying the context. If this is called
|
||||
* from ftrace (mcount) callback, @mcount must be set true. If this is called
|
||||
* from the real function entry (e.g. kprobes) @mcount must be set false.
|
||||
* This is because the way to hook the function return depends on the context.
|
||||
*/
|
||||
void rethook_hook(struct rethook_node *node, struct pt_regs *regs, bool mcount)
|
||||
{
|
||||
arch_rethook_prepare(node, regs, mcount);
|
||||
__llist_add(&node->llist, ¤t->rethooks);
|
||||
}
|
||||
NOKPROBE_SYMBOL(rethook_hook);
|
||||
|
||||
/* This assumes the 'tsk' is the current task or is not running. */
|
||||
static unsigned long __rethook_find_ret_addr(struct task_struct *tsk,
|
||||
struct llist_node **cur)
|
||||
{
|
||||
struct rethook_node *rh = NULL;
|
||||
struct llist_node *node = *cur;
|
||||
|
||||
if (!node)
|
||||
node = tsk->rethooks.first;
|
||||
else
|
||||
node = node->next;
|
||||
|
||||
while (node) {
|
||||
rh = container_of(node, struct rethook_node, llist);
|
||||
if (rh->ret_addr != (unsigned long)arch_rethook_trampoline) {
|
||||
*cur = node;
|
||||
return rh->ret_addr;
|
||||
}
|
||||
node = node->next;
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
NOKPROBE_SYMBOL(__rethook_find_ret_addr);
|
||||
|
||||
/**
|
||||
* rethook_find_ret_addr -- Find correct return address modified by rethook
|
||||
* @tsk: Target task
|
||||
* @frame: A frame pointer
|
||||
* @cur: a storage of the loop cursor llist_node pointer for next call
|
||||
*
|
||||
* Find the correct return address modified by a rethook on @tsk in unsigned
|
||||
* long type.
|
||||
* The @tsk must be 'current' or a task which is not running. @frame is a hint
|
||||
* to get the correct return address - which is compared with the
|
||||
* rethook::frame field. The @cur is a loop cursor for searching the
|
||||
* kretprobe return addresses on the @tsk. The '*@cur' should be NULL at the
|
||||
* first call, but '@cur' itself must NOT be NULL.
|
||||
*
|
||||
* Returns found address value or zero if not found.
|
||||
*/
|
||||
unsigned long rethook_find_ret_addr(struct task_struct *tsk, unsigned long frame,
|
||||
struct llist_node **cur)
|
||||
{
|
||||
struct rethook_node *rhn = NULL;
|
||||
unsigned long ret;
|
||||
|
||||
if (WARN_ON_ONCE(!cur))
|
||||
return 0;
|
||||
|
||||
if (WARN_ON_ONCE(tsk != current && task_is_running(tsk)))
|
||||
return 0;
|
||||
|
||||
do {
|
||||
ret = __rethook_find_ret_addr(tsk, cur);
|
||||
if (!ret)
|
||||
break;
|
||||
rhn = container_of(*cur, struct rethook_node, llist);
|
||||
} while (rhn->frame != frame);
|
||||
|
||||
return ret;
|
||||
}
|
||||
NOKPROBE_SYMBOL(rethook_find_ret_addr);
|
||||
|
||||
void __weak arch_rethook_fixup_return(struct pt_regs *regs,
|
||||
unsigned long correct_ret_addr)
|
||||
{
|
||||
/*
|
||||
* Do nothing by default. If the architecture which uses a
|
||||
* frame pointer to record real return address on the stack,
|
||||
* it should fill this function to fixup the return address
|
||||
* so that stacktrace works from the rethook handler.
|
||||
*/
|
||||
}
|
||||
|
||||
/* This function will be called from each arch-defined trampoline. */
|
||||
unsigned long rethook_trampoline_handler(struct pt_regs *regs,
|
||||
unsigned long frame)
|
||||
{
|
||||
struct llist_node *first, *node = NULL;
|
||||
unsigned long correct_ret_addr;
|
||||
rethook_handler_t handler;
|
||||
struct rethook_node *rhn;
|
||||
|
||||
correct_ret_addr = __rethook_find_ret_addr(current, &node);
|
||||
if (!correct_ret_addr) {
|
||||
pr_err("rethook: Return address not found! Maybe there is a bug in the kernel\n");
|
||||
BUG_ON(1);
|
||||
}
|
||||
|
||||
instruction_pointer_set(regs, correct_ret_addr);
|
||||
|
||||
/*
|
||||
* These loops must be protected from rethook_free_rcu() because those
|
||||
* are accessing 'rhn->rethook'.
|
||||
*/
|
||||
preempt_disable();
|
||||
|
||||
/*
|
||||
* Run the handler on the shadow stack. Do not unlink the list here because
|
||||
* stackdump inside the handlers needs to decode it.
|
||||
*/
|
||||
first = current->rethooks.first;
|
||||
while (first) {
|
||||
rhn = container_of(first, struct rethook_node, llist);
|
||||
if (WARN_ON_ONCE(rhn->frame != frame))
|
||||
break;
|
||||
handler = READ_ONCE(rhn->rethook->handler);
|
||||
if (handler)
|
||||
handler(rhn, rhn->rethook->data, regs);
|
||||
|
||||
if (first == node)
|
||||
break;
|
||||
first = first->next;
|
||||
}
|
||||
|
||||
/* Fixup registers for returning to correct address. */
|
||||
arch_rethook_fixup_return(regs, correct_ret_addr);
|
||||
|
||||
/* Unlink used shadow stack */
|
||||
first = current->rethooks.first;
|
||||
current->rethooks.first = node->next;
|
||||
node->next = NULL;
|
||||
|
||||
while (first) {
|
||||
rhn = container_of(first, struct rethook_node, llist);
|
||||
first = first->next;
|
||||
rethook_recycle(rhn);
|
||||
}
|
||||
preempt_enable();
|
||||
|
||||
return correct_ret_addr;
|
||||
}
|
||||
NOKPROBE_SYMBOL(rethook_trampoline_handler);
|
@ -2118,6 +2118,18 @@ config KPROBES_SANITY_TEST
|
||||
|
||||
Say N if you are unsure.
|
||||
|
||||
config FPROBE_SANITY_TEST
|
||||
bool "Self test for fprobe"
|
||||
depends on DEBUG_KERNEL
|
||||
depends on FPROBE
|
||||
depends on KUNIT=y
|
||||
help
|
||||
This option will enable testing the fprobe when the system boots.
|
||||
A series of tests are made to verify that the fprobe is functioning
|
||||
properly.
|
||||
|
||||
Say N if you are unsure.
|
||||
|
||||
config BACKTRACE_SELF_TEST
|
||||
tristate "Self test for the backtrace code"
|
||||
depends on DEBUG_KERNEL
|
||||
|
@ -103,6 +103,8 @@ obj-$(CONFIG_TEST_HMM) += test_hmm.o
|
||||
obj-$(CONFIG_TEST_FREE_PAGES) += test_free_pages.o
|
||||
obj-$(CONFIG_KPROBES_SANITY_TEST) += test_kprobes.o
|
||||
obj-$(CONFIG_TEST_REF_TRACKER) += test_ref_tracker.o
|
||||
CFLAGS_test_fprobe.o += $(CC_FLAGS_FTRACE)
|
||||
obj-$(CONFIG_FPROBE_SANITY_TEST) += test_fprobe.o
|
||||
#
|
||||
# CFLAGS for compiling floating point code inside the kernel. x86/Makefile turns
|
||||
# off the generation of FPU/SSE* instructions for kernel proper but FPU_FLAGS
|
||||
|
40
lib/sort.c
@ -122,16 +122,27 @@ static void swap_bytes(void *a, void *b, size_t n)
|
||||
* a pointer, but small integers make for the smallest compare
|
||||
* instructions.
|
||||
*/
|
||||
#define SWAP_WORDS_64 (swap_func_t)0
|
||||
#define SWAP_WORDS_32 (swap_func_t)1
|
||||
#define SWAP_BYTES (swap_func_t)2
|
||||
#define SWAP_WORDS_64 (swap_r_func_t)0
|
||||
#define SWAP_WORDS_32 (swap_r_func_t)1
|
||||
#define SWAP_BYTES (swap_r_func_t)2
|
||||
#define SWAP_WRAPPER (swap_r_func_t)3
|
||||
|
||||
struct wrapper {
|
||||
cmp_func_t cmp;
|
||||
swap_func_t swap;
|
||||
};
|
||||
|
||||
/*
|
||||
* The function pointer is last to make tail calls most efficient if the
|
||||
* compiler decides not to inline this function.
|
||||
*/
|
||||
static void do_swap(void *a, void *b, size_t size, swap_func_t swap_func)
|
||||
static void do_swap(void *a, void *b, size_t size, swap_r_func_t swap_func, const void *priv)
|
||||
{
|
||||
if (swap_func == SWAP_WRAPPER) {
|
||||
((const struct wrapper *)priv)->swap(a, b, (int)size);
|
||||
return;
|
||||
}
|
||||
|
||||
if (swap_func == SWAP_WORDS_64)
|
||||
swap_words_64(a, b, size);
|
||||
else if (swap_func == SWAP_WORDS_32)
|
||||
@ -139,7 +150,7 @@ static void do_swap(void *a, void *b, size_t size, swap_func_t swap_func)
|
||||
else if (swap_func == SWAP_BYTES)
|
||||
swap_bytes(a, b, size);
|
||||
else
|
||||
swap_func(a, b, (int)size);
|
||||
swap_func(a, b, (int)size, priv);
|
||||
}
|
||||
|
||||
#define _CMP_WRAPPER ((cmp_r_func_t)0L)
|
||||
@ -147,7 +158,7 @@ static void do_swap(void *a, void *b, size_t size, swap_func_t swap_func)
|
||||
static int do_cmp(const void *a, const void *b, cmp_r_func_t cmp, const void *priv)
|
||||
{
|
||||
if (cmp == _CMP_WRAPPER)
|
||||
return ((cmp_func_t)(priv))(a, b);
|
||||
return ((const struct wrapper *)priv)->cmp(a, b);
|
||||
return cmp(a, b, priv);
|
||||
}
|
||||
|
||||
@ -198,7 +209,7 @@ static size_t parent(size_t i, unsigned int lsbit, size_t size)
|
||||
*/
|
||||
void sort_r(void *base, size_t num, size_t size,
|
||||
cmp_r_func_t cmp_func,
|
||||
swap_func_t swap_func,
|
||||
swap_r_func_t swap_func,
|
||||
const void *priv)
|
||||
{
|
||||
/* pre-scale counters for performance */
|
||||
@ -208,6 +219,10 @@ void sort_r(void *base, size_t num, size_t size,
|
||||
if (!a) /* num < 2 || size == 0 */
|
||||
return;
|
||||
|
||||
/* called from 'sort' without swap function, let's pick the default */
|
||||
if (swap_func == SWAP_WRAPPER && !((struct wrapper *)priv)->swap)
|
||||
swap_func = NULL;
|
||||
|
||||
if (!swap_func) {
|
||||
if (is_aligned(base, size, 8))
|
||||
swap_func = SWAP_WORDS_64;
|
||||
@ -230,7 +245,7 @@ void sort_r(void *base, size_t num, size_t size,
|
||||
if (a) /* Building heap: sift down --a */
|
||||
a -= size;
|
||||
else if (n -= size) /* Sorting: Extract root to --n */
|
||||
do_swap(base, base + n, size, swap_func);
|
||||
do_swap(base, base + n, size, swap_func, priv);
|
||||
else /* Sort complete */
|
||||
break;
|
||||
|
||||
@ -257,7 +272,7 @@ void sort_r(void *base, size_t num, size_t size,
|
||||
c = b; /* Where "a" belongs */
|
||||
while (b != a) { /* Shift it into place */
|
||||
b = parent(b, lsbit, size);
|
||||
do_swap(base + b, base + c, size, swap_func);
|
||||
do_swap(base + b, base + c, size, swap_func, priv);
|
||||
}
|
||||
}
|
||||
}
|
||||
@ -267,6 +282,11 @@ void sort(void *base, size_t num, size_t size,
|
||||
cmp_func_t cmp_func,
|
||||
swap_func_t swap_func)
|
||||
{
|
||||
return sort_r(base, num, size, _CMP_WRAPPER, swap_func, cmp_func);
|
||||
struct wrapper w = {
|
||||
.cmp = cmp_func,
|
||||
.swap = swap_func,
|
||||
};
|
||||
|
||||
return sort_r(base, num, size, _CMP_WRAPPER, SWAP_WRAPPER, &w);
|
||||
}
|
||||
EXPORT_SYMBOL(sort);
|
||||
|
174
lib/test_fprobe.c
Normal file
@ -0,0 +1,174 @@
|
||||
// SPDX-License-Identifier: GPL-2.0-or-later
|
||||
/*
|
||||
* test_fprobe.c - simple sanity test for fprobe
|
||||
*/
|
||||
|
||||
#include <linux/kernel.h>
|
||||
#include <linux/fprobe.h>
|
||||
#include <linux/random.h>
|
||||
#include <kunit/test.h>
|
||||
|
||||
#define div_factor 3
|
||||
|
||||
static struct kunit *current_test;
|
||||
|
||||
static u32 rand1, entry_val, exit_val;
|
||||
|
||||
/* Use indirect calls to avoid inlining the target functions */
|
||||
static u32 (*target)(u32 value);
|
||||
static u32 (*target2)(u32 value);
|
||||
static unsigned long target_ip;
|
||||
static unsigned long target2_ip;
|
||||
|
||||
static noinline u32 fprobe_selftest_target(u32 value)
|
||||
{
|
||||
return (value / div_factor);
|
||||
}
|
||||
|
||||
static noinline u32 fprobe_selftest_target2(u32 value)
|
||||
{
|
||||
return (value / div_factor) + 1;
|
||||
}
|
||||
|
||||
static notrace void fp_entry_handler(struct fprobe *fp, unsigned long ip, struct pt_regs *regs)
|
||||
{
|
||||
KUNIT_EXPECT_FALSE(current_test, preemptible());
|
||||
/* This can be called on the fprobe_selftest_target and the fprobe_selftest_target2 */
|
||||
if (ip != target_ip)
|
||||
KUNIT_EXPECT_EQ(current_test, ip, target2_ip);
|
||||
entry_val = (rand1 / div_factor);
|
||||
}
|
||||
|
||||
static notrace void fp_exit_handler(struct fprobe *fp, unsigned long ip, struct pt_regs *regs)
|
||||
{
|
||||
unsigned long ret = regs_return_value(regs);
|
||||
|
||||
KUNIT_EXPECT_FALSE(current_test, preemptible());
|
||||
if (ip != target_ip) {
|
||||
KUNIT_EXPECT_EQ(current_test, ip, target2_ip);
|
||||
KUNIT_EXPECT_EQ(current_test, ret, (rand1 / div_factor) + 1);
|
||||
} else
|
||||
KUNIT_EXPECT_EQ(current_test, ret, (rand1 / div_factor));
|
||||
KUNIT_EXPECT_EQ(current_test, entry_val, (rand1 / div_factor));
|
||||
exit_val = entry_val + div_factor;
|
||||
}
|
||||
|
||||
/* Test entry only (no rethook) */
|
||||
static void test_fprobe_entry(struct kunit *test)
|
||||
{
|
||||
struct fprobe fp_entry = {
|
||||
.entry_handler = fp_entry_handler,
|
||||
};
|
||||
|
||||
current_test = test;
|
||||
|
||||
/* Before registering, unregister should fail. */
|
||||
KUNIT_EXPECT_NE(test, 0, unregister_fprobe(&fp_entry));
|
||||
KUNIT_EXPECT_EQ(test, 0, register_fprobe(&fp_entry, "fprobe_selftest_target*", NULL));
|
||||
|
||||
entry_val = 0;
|
||||
exit_val = 0;
|
||||
target(rand1);
|
||||
KUNIT_EXPECT_NE(test, 0, entry_val);
|
||||
KUNIT_EXPECT_EQ(test, 0, exit_val);
|
||||
|
||||
entry_val = 0;
|
||||
exit_val = 0;
|
||||
target2(rand1);
|
||||
KUNIT_EXPECT_NE(test, 0, entry_val);
|
||||
KUNIT_EXPECT_EQ(test, 0, exit_val);
|
||||
|
||||
KUNIT_EXPECT_EQ(test, 0, unregister_fprobe(&fp_entry));
|
||||
}
|
||||
|
||||
static void test_fprobe(struct kunit *test)
|
||||
{
|
||||
struct fprobe fp = {
|
||||
.entry_handler = fp_entry_handler,
|
||||
.exit_handler = fp_exit_handler,
|
||||
};
|
||||
|
||||
current_test = test;
|
||||
KUNIT_EXPECT_EQ(test, 0, register_fprobe(&fp, "fprobe_selftest_target*", NULL));
|
||||
|
||||
entry_val = 0;
|
||||
exit_val = 0;
|
||||
target(rand1);
|
||||
KUNIT_EXPECT_NE(test, 0, entry_val);
|
||||
KUNIT_EXPECT_EQ(test, entry_val + div_factor, exit_val);
|
||||
|
||||
entry_val = 0;
|
||||
exit_val = 0;
|
||||
target2(rand1);
|
||||
KUNIT_EXPECT_NE(test, 0, entry_val);
|
||||
KUNIT_EXPECT_EQ(test, entry_val + div_factor, exit_val);
|
||||
|
||||
KUNIT_EXPECT_EQ(test, 0, unregister_fprobe(&fp));
|
||||
}
|
||||
|
||||
static void test_fprobe_syms(struct kunit *test)
|
||||
{
|
||||
static const char *syms[] = {"fprobe_selftest_target", "fprobe_selftest_target2"};
|
||||
struct fprobe fp = {
|
||||
.entry_handler = fp_entry_handler,
|
||||
.exit_handler = fp_exit_handler,
|
||||
};
|
||||
|
||||
current_test = test;
|
||||
KUNIT_EXPECT_EQ(test, 0, register_fprobe_syms(&fp, syms, 2));
|
||||
|
||||
entry_val = 0;
|
||||
exit_val = 0;
|
||||
target(rand1);
|
||||
KUNIT_EXPECT_NE(test, 0, entry_val);
|
||||
KUNIT_EXPECT_EQ(test, entry_val + div_factor, exit_val);
|
||||
|
||||
entry_val = 0;
|
||||
exit_val = 0;
|
||||
target2(rand1);
|
||||
KUNIT_EXPECT_NE(test, 0, entry_val);
|
||||
KUNIT_EXPECT_EQ(test, entry_val + div_factor, exit_val);
|
||||
|
||||
KUNIT_EXPECT_EQ(test, 0, unregister_fprobe(&fp));
|
||||
}
|
||||
|
||||
static unsigned long get_ftrace_location(void *func)
|
||||
{
|
||||
unsigned long size, addr = (unsigned long)func;
|
||||
|
||||
if (!kallsyms_lookup_size_offset(addr, &size, NULL) || !size)
|
||||
return 0;
|
||||
|
||||
return ftrace_location_range(addr, addr + size - 1);
|
||||
}
|
||||
|
||||
static int fprobe_test_init(struct kunit *test)
|
||||
{
|
||||
do {
|
||||
rand1 = prandom_u32();
|
||||
} while (rand1 <= div_factor);
|
||||
|
||||
target = fprobe_selftest_target;
|
||||
target2 = fprobe_selftest_target2;
|
||||
target_ip = get_ftrace_location(target);
|
||||
target2_ip = get_ftrace_location(target2);
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static struct kunit_case fprobe_testcases[] = {
|
||||
KUNIT_CASE(test_fprobe_entry),
|
||||
KUNIT_CASE(test_fprobe),
|
||||
KUNIT_CASE(test_fprobe_syms),
|
||||
{}
|
||||
};
|
||||
|
||||
static struct kunit_suite fprobe_test_suite = {
|
||||
.name = "fprobe_test",
|
||||
.init = fprobe_test_init,
|
||||
.test_cases = fprobe_testcases,
|
||||
};
|
||||
|
||||
kunit_test_suites(&fprobe_test_suite);
|
||||
|
||||
MODULE_LICENSE("GPL");
|
@ -15,6 +15,7 @@
|
||||
#include <net/sock.h>
|
||||
#include <net/tcp.h>
|
||||
#include <net/net_namespace.h>
|
||||
#include <net/page_pool.h>
|
||||
#include <linux/error-injection.h>
|
||||
#include <linux/smp.h>
|
||||
#include <linux/sock_diag.h>
|
||||
@ -53,10 +54,11 @@ static void bpf_test_timer_leave(struct bpf_test_timer *t)
|
||||
rcu_read_unlock();
|
||||
}
|
||||
|
||||
static bool bpf_test_timer_continue(struct bpf_test_timer *t, u32 repeat, int *err, u32 *duration)
|
||||
static bool bpf_test_timer_continue(struct bpf_test_timer *t, int iterations,
|
||||
u32 repeat, int *err, u32 *duration)
|
||||
__must_hold(rcu)
|
||||
{
|
||||
t->i++;
|
||||
t->i += iterations;
|
||||
if (t->i >= repeat) {
|
||||
/* We're done. */
|
||||
t->time_spent += ktime_get_ns() - t->time_start;
|
||||
@ -88,6 +90,284 @@ reset:
|
||||
return false;
|
||||
}
|
||||
|
||||
/* We put this struct at the head of each page with a context and frame
|
||||
* initialised when the page is allocated, so we don't have to do this on each
|
||||
* repetition of the test run.
|
||||
*/
|
||||
struct xdp_page_head {
|
||||
struct xdp_buff orig_ctx;
|
||||
struct xdp_buff ctx;
|
||||
struct xdp_frame frm;
|
||||
u8 data[];
|
||||
};
|
||||
|
||||
struct xdp_test_data {
|
||||
struct xdp_buff *orig_ctx;
|
||||
struct xdp_rxq_info rxq;
|
||||
struct net_device *dev;
|
||||
struct page_pool *pp;
|
||||
struct xdp_frame **frames;
|
||||
struct sk_buff **skbs;
|
||||
u32 batch_size;
|
||||
u32 frame_cnt;
|
||||
};
|
||||
|
||||
#define TEST_XDP_FRAME_SIZE (PAGE_SIZE - sizeof(struct xdp_page_head))
|
||||
#define TEST_XDP_MAX_BATCH 256
|
||||
|
||||
static void xdp_test_run_init_page(struct page *page, void *arg)
|
||||
{
|
||||
struct xdp_page_head *head = phys_to_virt(page_to_phys(page));
|
||||
struct xdp_buff *new_ctx, *orig_ctx;
|
||||
u32 headroom = XDP_PACKET_HEADROOM;
|
||||
struct xdp_test_data *xdp = arg;
|
||||
size_t frm_len, meta_len;
|
||||
struct xdp_frame *frm;
|
||||
void *data;
|
||||
|
||||
orig_ctx = xdp->orig_ctx;
|
||||
frm_len = orig_ctx->data_end - orig_ctx->data_meta;
|
||||
meta_len = orig_ctx->data - orig_ctx->data_meta;
|
||||
headroom -= meta_len;
|
||||
|
||||
new_ctx = &head->ctx;
|
||||
frm = &head->frm;
|
||||
data = &head->data;
|
||||
memcpy(data + headroom, orig_ctx->data_meta, frm_len);
|
||||
|
||||
xdp_init_buff(new_ctx, TEST_XDP_FRAME_SIZE, &xdp->rxq);
|
||||
xdp_prepare_buff(new_ctx, data, headroom, frm_len, true);
|
||||
new_ctx->data = new_ctx->data_meta + meta_len;
|
||||
|
||||
xdp_update_frame_from_buff(new_ctx, frm);
|
||||
frm->mem = new_ctx->rxq->mem;
|
||||
|
||||
memcpy(&head->orig_ctx, new_ctx, sizeof(head->orig_ctx));
|
||||
}
|
||||
|
||||
static int xdp_test_run_setup(struct xdp_test_data *xdp, struct xdp_buff *orig_ctx)
|
||||
{
|
||||
struct xdp_mem_info mem = {};
|
||||
struct page_pool *pp;
|
||||
int err = -ENOMEM;
|
||||
struct page_pool_params pp_params = {
|
||||
.order = 0,
|
||||
.flags = 0,
|
||||
.pool_size = xdp->batch_size,
|
||||
.nid = NUMA_NO_NODE,
|
||||
.init_callback = xdp_test_run_init_page,
|
||||
.init_arg = xdp,
|
||||
};
|
||||
|
||||
xdp->frames = kvmalloc_array(xdp->batch_size, sizeof(void *), GFP_KERNEL);
|
||||
if (!xdp->frames)
|
||||
return -ENOMEM;
|
||||
|
||||
xdp->skbs = kvmalloc_array(xdp->batch_size, sizeof(void *), GFP_KERNEL);
|
||||
if (!xdp->skbs)
|
||||
goto err_skbs;
|
||||
|
||||
pp = page_pool_create(&pp_params);
|
||||
if (IS_ERR(pp)) {
|
||||
err = PTR_ERR(pp);
|
||||
goto err_pp;
|
||||
}
|
||||
|
||||
/* will copy 'mem.id' into pp->xdp_mem_id */
|
||||
err = xdp_reg_mem_model(&mem, MEM_TYPE_PAGE_POOL, pp);
|
||||
if (err)
|
||||
goto err_mmodel;
|
||||
|
||||
xdp->pp = pp;
|
||||
|
||||
/* We create a 'fake' RXQ referencing the original dev, but with an
|
||||
* xdp_mem_info pointing to our page_pool
|
||||
*/
|
||||
xdp_rxq_info_reg(&xdp->rxq, orig_ctx->rxq->dev, 0, 0);
|
||||
xdp->rxq.mem.type = MEM_TYPE_PAGE_POOL;
|
||||
xdp->rxq.mem.id = pp->xdp_mem_id;
|
||||
xdp->dev = orig_ctx->rxq->dev;
|
||||
xdp->orig_ctx = orig_ctx;
|
||||
|
||||
return 0;
|
||||
|
||||
err_mmodel:
|
||||
page_pool_destroy(pp);
|
||||
err_pp:
|
||||
kvfree(xdp->skbs);
|
||||
err_skbs:
|
||||
kvfree(xdp->frames);
|
||||
return err;
|
||||
}
|
||||
|
||||
static void xdp_test_run_teardown(struct xdp_test_data *xdp)
|
||||
{
|
||||
page_pool_destroy(xdp->pp);
|
||||
kvfree(xdp->frames);
|
||||
kvfree(xdp->skbs);
|
||||
}
|
||||
|
||||
static bool ctx_was_changed(struct xdp_page_head *head)
|
||||
{
|
||||
return head->orig_ctx.data != head->ctx.data ||
|
||||
head->orig_ctx.data_meta != head->ctx.data_meta ||
|
||||
head->orig_ctx.data_end != head->ctx.data_end;
|
||||
}
|
||||
|
||||
static void reset_ctx(struct xdp_page_head *head)
|
||||
{
|
||||
if (likely(!ctx_was_changed(head)))
|
||||
return;
|
||||
|
||||
head->ctx.data = head->orig_ctx.data;
|
||||
head->ctx.data_meta = head->orig_ctx.data_meta;
|
||||
head->ctx.data_end = head->orig_ctx.data_end;
|
||||
xdp_update_frame_from_buff(&head->ctx, &head->frm);
|
||||
}
|
||||
|
||||
static int xdp_recv_frames(struct xdp_frame **frames, int nframes,
|
||||
struct sk_buff **skbs,
|
||||
struct net_device *dev)
|
||||
{
|
||||
gfp_t gfp = __GFP_ZERO | GFP_ATOMIC;
|
||||
int i, n;
|
||||
LIST_HEAD(list);
|
||||
|
||||
n = kmem_cache_alloc_bulk(skbuff_head_cache, gfp, nframes, (void **)skbs);
|
||||
if (unlikely(n == 0)) {
|
||||
for (i = 0; i < nframes; i++)
|
||||
xdp_return_frame(frames[i]);
|
||||
return -ENOMEM;
|
||||
}
|
||||
|
||||
for (i = 0; i < nframes; i++) {
|
||||
struct xdp_frame *xdpf = frames[i];
|
||||
struct sk_buff *skb = skbs[i];
|
||||
|
||||
skb = __xdp_build_skb_from_frame(xdpf, skb, dev);
|
||||
if (!skb) {
|
||||
xdp_return_frame(xdpf);
|
||||
continue;
|
||||
}
|
||||
|
||||
list_add_tail(&skb->list, &list);
|
||||
}
|
||||
netif_receive_skb_list(&list);
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int xdp_test_run_batch(struct xdp_test_data *xdp, struct bpf_prog *prog,
|
||||
u32 repeat)
|
||||
{
|
||||
struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info);
|
||||
int err = 0, act, ret, i, nframes = 0, batch_sz;
|
||||
struct xdp_frame **frames = xdp->frames;
|
||||
struct xdp_page_head *head;
|
||||
struct xdp_frame *frm;
|
||||
bool redirect = false;
|
||||
struct xdp_buff *ctx;
|
||||
struct page *page;
|
||||
|
||||
batch_sz = min_t(u32, repeat, xdp->batch_size);
|
||||
|
||||
local_bh_disable();
|
||||
xdp_set_return_frame_no_direct();
|
||||
|
||||
for (i = 0; i < batch_sz; i++) {
|
||||
page = page_pool_dev_alloc_pages(xdp->pp);
|
||||
if (!page) {
|
||||
err = -ENOMEM;
|
||||
goto out;
|
||||
}
|
||||
|
||||
head = phys_to_virt(page_to_phys(page));
|
||||
reset_ctx(head);
|
||||
ctx = &head->ctx;
|
||||
frm = &head->frm;
|
||||
xdp->frame_cnt++;
|
||||
|
||||
act = bpf_prog_run_xdp(prog, ctx);
|
||||
|
||||
/* if program changed pkt bounds we need to update the xdp_frame */
|
||||
if (unlikely(ctx_was_changed(head))) {
|
||||
ret = xdp_update_frame_from_buff(ctx, frm);
|
||||
if (ret) {
|
||||
xdp_return_buff(ctx);
|
||||
continue;
|
||||
}
|
||||
}
|
||||
|
||||
switch (act) {
|
||||
case XDP_TX:
|
||||
/* we can't do a real XDP_TX since we're not in the
|
||||
* driver, so turn it into a REDIRECT back to the same
|
||||
* index
|
||||
*/
|
||||
ri->tgt_index = xdp->dev->ifindex;
|
||||
ri->map_id = INT_MAX;
|
||||
ri->map_type = BPF_MAP_TYPE_UNSPEC;
|
||||
fallthrough;
|
||||
case XDP_REDIRECT:
|
||||
redirect = true;
|
||||
ret = xdp_do_redirect_frame(xdp->dev, ctx, frm, prog);
|
||||
if (ret)
|
||||
xdp_return_buff(ctx);
|
||||
break;
|
||||
case XDP_PASS:
|
||||
frames[nframes++] = frm;
|
||||
break;
|
||||
default:
|
||||
bpf_warn_invalid_xdp_action(NULL, prog, act);
|
||||
fallthrough;
|
||||
case XDP_DROP:
|
||||
xdp_return_buff(ctx);
|
||||
break;
|
||||
}
|
||||
}
|
||||
|
||||
out:
|
||||
if (redirect)
|
||||
xdp_do_flush();
|
||||
if (nframes) {
|
||||
ret = xdp_recv_frames(frames, nframes, xdp->skbs, xdp->dev);
|
||||
if (ret)
|
||||
err = ret;
|
||||
}
|
||||
|
||||
xdp_clear_return_frame_no_direct();
|
||||
local_bh_enable();
|
||||
return err;
|
||||
}
|
||||
|
||||
static int bpf_test_run_xdp_live(struct bpf_prog *prog, struct xdp_buff *ctx,
|
||||
u32 repeat, u32 batch_size, u32 *time)
|
||||
|
||||
{
|
||||
struct xdp_test_data xdp = { .batch_size = batch_size };
|
||||
struct bpf_test_timer t = { .mode = NO_MIGRATE };
|
||||
int ret;
|
||||
|
||||
if (!repeat)
|
||||
repeat = 1;
|
||||
|
||||
ret = xdp_test_run_setup(&xdp, ctx);
|
||||
if (ret)
|
||||
return ret;
|
||||
|
||||
bpf_test_timer_enter(&t);
|
||||
do {
|
||||
xdp.frame_cnt = 0;
|
||||
ret = xdp_test_run_batch(&xdp, prog, repeat - t.i);
|
||||
if (unlikely(ret < 0))
|
||||
break;
|
||||
} while (bpf_test_timer_continue(&t, xdp.frame_cnt, repeat, &ret, time));
|
||||
bpf_test_timer_leave(&t);
|
||||
|
||||
xdp_test_run_teardown(&xdp);
|
||||
return ret;
|
||||
}
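From userspace, the runner above is reached through the existing BPF_PROG_RUN command; below is a hedged sketch using libbpf. It assumes a libbpf new enough that struct bpf_test_run_opts exposes the batch_size field added alongside this kernel change, and that BPF_F_TEST_XDP_LIVE_FRAMES is present in the installed UAPI headers; run_xdp_live() and its parameters are hypothetical.

#include <bpf/bpf.h>
#include <bpf/libbpf.h>

/* prog_fd: fd of a loaded XDP program; pkt/pkt_len: template packet */
static int run_xdp_live(int prog_fd, void *pkt, __u32 pkt_len, int npackets)
{
	LIBBPF_OPTS(bpf_test_run_opts, opts,
		.data_in = pkt,
		.data_size_in = pkt_len,
		.flags = BPF_F_TEST_XDP_LIVE_FRAMES,
		.repeat = npackets,
		.batch_size = 64,	/* 0 selects NAPI_POLL_WEIGHT; max 256 */
	);

	/* In live mode the frames are really transmitted, redirected or passed
	 * up the stack; data_out/ctx_out must not be set and no retval is
	 * reported back. */
	return bpf_prog_test_run_opts(prog_fd, &opts);
}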
|
||||
|
||||
static int bpf_test_run(struct bpf_prog *prog, void *ctx, u32 repeat,
|
||||
u32 *retval, u32 *time, bool xdp)
|
||||
{
|
||||
@ -119,7 +399,7 @@ static int bpf_test_run(struct bpf_prog *prog, void *ctx, u32 repeat,
|
||||
*retval = bpf_prog_run_xdp(prog, ctx);
|
||||
else
|
||||
*retval = bpf_prog_run(prog, ctx);
|
||||
} while (bpf_test_timer_continue(&t, repeat, &ret, time));
|
||||
} while (bpf_test_timer_continue(&t, 1, repeat, &ret, time));
|
||||
bpf_reset_run_ctx(old_ctx);
|
||||
bpf_test_timer_leave(&t);
|
||||
|
||||
@ -201,8 +481,8 @@ out:
|
||||
* future.
|
||||
*/
|
||||
__diag_push();
|
||||
__diag_ignore(GCC, 8, "-Wmissing-prototypes",
|
||||
"Global functions as their definitions will be in vmlinux BTF");
|
||||
__diag_ignore_all("-Wmissing-prototypes",
|
||||
"Global functions as their definitions will be in vmlinux BTF");
|
||||
int noinline bpf_fentry_test1(int a)
|
||||
{
|
||||
return a + 1;
|
||||
@ -270,9 +550,14 @@ struct sock * noinline bpf_kfunc_call_test3(struct sock *sk)
|
||||
return sk;
|
||||
}
|
||||
|
||||
struct prog_test_member {
|
||||
u64 c;
|
||||
};
|
||||
|
||||
struct prog_test_ref_kfunc {
|
||||
int a;
|
||||
int b;
|
||||
struct prog_test_member memb;
|
||||
struct prog_test_ref_kfunc *next;
|
||||
};
|
||||
|
||||
@ -295,6 +580,10 @@ noinline void bpf_kfunc_call_test_release(struct prog_test_ref_kfunc *p)
|
||||
{
|
||||
}
|
||||
|
||||
noinline void bpf_kfunc_call_memb_release(struct prog_test_member *p)
|
||||
{
|
||||
}
|
||||
|
||||
struct prog_test_pass1 {
|
||||
int x0;
|
||||
struct {
|
||||
@ -379,6 +668,7 @@ BTF_ID(func, bpf_kfunc_call_test2)
|
||||
BTF_ID(func, bpf_kfunc_call_test3)
|
||||
BTF_ID(func, bpf_kfunc_call_test_acquire)
|
||||
BTF_ID(func, bpf_kfunc_call_test_release)
|
||||
BTF_ID(func, bpf_kfunc_call_memb_release)
|
||||
BTF_ID(func, bpf_kfunc_call_test_pass_ctx)
|
||||
BTF_ID(func, bpf_kfunc_call_test_pass1)
|
||||
BTF_ID(func, bpf_kfunc_call_test_pass2)
|
||||
@ -396,6 +686,7 @@ BTF_SET_END(test_sk_acquire_kfunc_ids)
|
||||
|
||||
BTF_SET_START(test_sk_release_kfunc_ids)
|
||||
BTF_ID(func, bpf_kfunc_call_test_release)
|
||||
BTF_ID(func, bpf_kfunc_call_memb_release)
|
||||
BTF_SET_END(test_sk_release_kfunc_ids)
|
||||
|
||||
BTF_SET_START(test_sk_ret_null_kfunc_ids)
|
||||
@ -435,7 +726,7 @@ int bpf_prog_test_run_tracing(struct bpf_prog *prog,
|
||||
int b = 2, err = -EFAULT;
|
||||
u32 retval = 0;
|
||||
|
||||
if (kattr->test.flags || kattr->test.cpu)
|
||||
if (kattr->test.flags || kattr->test.cpu || kattr->test.batch_size)
|
||||
return -EINVAL;
|
||||
|
||||
switch (prog->expected_attach_type) {
|
||||
@ -499,7 +790,7 @@ int bpf_prog_test_run_raw_tp(struct bpf_prog *prog,
|
||||
/* doesn't support data_in/out, ctx_out, duration, or repeat */
|
||||
if (kattr->test.data_in || kattr->test.data_out ||
|
||||
kattr->test.ctx_out || kattr->test.duration ||
|
||||
kattr->test.repeat)
|
||||
kattr->test.repeat || kattr->test.batch_size)
|
||||
return -EINVAL;
|
||||
|
||||
if (ctx_size_in < prog->aux->max_ctx_offset ||
|
||||
@ -730,7 +1021,7 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr,
|
||||
void *data;
|
||||
int ret;
|
||||
|
||||
if (kattr->test.flags || kattr->test.cpu)
|
||||
if (kattr->test.flags || kattr->test.cpu || kattr->test.batch_size)
|
||||
return -EINVAL;
|
||||
|
||||
data = bpf_test_init(kattr, kattr->test.data_size_in,
|
||||
@ -911,10 +1202,12 @@ static void xdp_convert_buff_to_md(struct xdp_buff *xdp, struct xdp_md *xdp_md)
|
||||
int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
|
||||
union bpf_attr __user *uattr)
|
||||
{
|
||||
bool do_live = (kattr->test.flags & BPF_F_TEST_XDP_LIVE_FRAMES);
|
||||
u32 tailroom = SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
|
||||
u32 batch_size = kattr->test.batch_size;
|
||||
u32 retval = 0, duration, max_data_sz;
|
||||
u32 size = kattr->test.data_size_in;
|
||||
u32 headroom = XDP_PACKET_HEADROOM;
|
||||
u32 retval, duration, max_data_sz;
|
||||
u32 repeat = kattr->test.repeat;
|
||||
struct netdev_rx_queue *rxqueue;
|
||||
struct skb_shared_info *sinfo;
|
||||
@ -927,6 +1220,20 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
|
||||
prog->expected_attach_type == BPF_XDP_CPUMAP)
|
||||
return -EINVAL;
|
||||
|
||||
if (kattr->test.flags & ~BPF_F_TEST_XDP_LIVE_FRAMES)
|
||||
return -EINVAL;
|
||||
|
||||
if (do_live) {
|
||||
if (!batch_size)
|
||||
batch_size = NAPI_POLL_WEIGHT;
|
||||
else if (batch_size > TEST_XDP_MAX_BATCH)
|
||||
return -E2BIG;
|
||||
|
||||
headroom += sizeof(struct xdp_page_head);
|
||||
} else if (batch_size) {
|
||||
return -EINVAL;
|
||||
}
|
||||
|
||||
ctx = bpf_ctx_init(kattr, sizeof(struct xdp_md));
|
||||
if (IS_ERR(ctx))
|
||||
return PTR_ERR(ctx);
|
||||
@ -935,14 +1242,20 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
|
||||
/* There can't be user provided data before the meta data */
|
||||
if (ctx->data_meta || ctx->data_end != size ||
|
||||
ctx->data > ctx->data_end ||
|
||||
unlikely(xdp_metalen_invalid(ctx->data)))
|
||||
unlikely(xdp_metalen_invalid(ctx->data)) ||
|
||||
(do_live && (kattr->test.data_out || kattr->test.ctx_out)))
|
||||
goto free_ctx;
|
||||
/* Meta data is allocated from the headroom */
|
||||
headroom -= ctx->data;
|
||||
}
|
||||
|
||||
max_data_sz = 4096 - headroom - tailroom;
|
||||
size = min_t(u32, size, max_data_sz);
|
||||
if (size > max_data_sz) {
|
||||
/* disallow live data mode for jumbo frames */
|
||||
if (do_live)
|
||||
goto free_ctx;
|
||||
size = max_data_sz;
|
||||
}
|
||||
|
||||
data = bpf_test_init(kattr, size, max_data_sz, headroom, tailroom);
|
||||
if (IS_ERR(data)) {
|
||||
@ -1000,7 +1313,10 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
|
||||
if (repeat > 1)
|
||||
bpf_prog_change_xdp(NULL, prog);
|
||||
|
||||
ret = bpf_test_run(prog, &xdp, repeat, &retval, &duration, true);
|
||||
if (do_live)
|
||||
ret = bpf_test_run_xdp_live(prog, &xdp, repeat, batch_size, &duration);
|
||||
else
|
||||
ret = bpf_test_run(prog, &xdp, repeat, &retval, &duration, true);
|
||||
/* We convert the xdp_buff back to an xdp_md before checking the return
|
||||
* code so the reference count of any held netdevice will be decremented
|
||||
* even if the test run failed.
|
||||
@ -1062,7 +1378,7 @@ int bpf_prog_test_run_flow_dissector(struct bpf_prog *prog,
|
||||
if (prog->type != BPF_PROG_TYPE_FLOW_DISSECTOR)
|
||||
return -EINVAL;
|
||||
|
||||
if (kattr->test.flags || kattr->test.cpu)
|
||||
if (kattr->test.flags || kattr->test.cpu || kattr->test.batch_size)
|
||||
return -EINVAL;
|
||||
|
||||
if (size < ETH_HLEN)
|
||||
@ -1097,7 +1413,7 @@ int bpf_prog_test_run_flow_dissector(struct bpf_prog *prog,
|
||||
do {
|
||||
retval = bpf_flow_dissect(prog, &ctx, eth->h_proto, ETH_HLEN,
|
||||
size, flags);
|
||||
} while (bpf_test_timer_continue(&t, repeat, &ret, &duration));
|
||||
} while (bpf_test_timer_continue(&t, 1, repeat, &ret, &duration));
|
||||
bpf_test_timer_leave(&t);
|
||||
|
||||
if (ret < 0)
|
||||
@ -1129,7 +1445,7 @@ int bpf_prog_test_run_sk_lookup(struct bpf_prog *prog, const union bpf_attr *kat
|
||||
if (prog->type != BPF_PROG_TYPE_SK_LOOKUP)
|
||||
return -EINVAL;
|
||||
|
||||
if (kattr->test.flags || kattr->test.cpu)
|
||||
if (kattr->test.flags || kattr->test.cpu || kattr->test.batch_size)
|
||||
return -EINVAL;
|
||||
|
||||
if (kattr->test.data_in || kattr->test.data_size_in || kattr->test.data_out ||
|
||||
@ -1192,7 +1508,7 @@ int bpf_prog_test_run_sk_lookup(struct bpf_prog *prog, const union bpf_attr *kat
|
||||
do {
|
||||
ctx.selected_sk = NULL;
|
||||
retval = BPF_PROG_SK_LOOKUP_RUN_ARRAY(progs, ctx, bpf_prog_run);
|
||||
} while (bpf_test_timer_continue(&t, repeat, &ret, &duration));
|
||||
} while (bpf_test_timer_continue(&t, 1, repeat, &ret, &duration));
|
||||
bpf_test_timer_leave(&t);
|
||||
|
||||
if (ret < 0)
|
||||
@ -1231,7 +1547,8 @@ int bpf_prog_test_run_syscall(struct bpf_prog *prog,
|
||||
/* doesn't support data_in/out, ctx_out, duration, or repeat or flags */
|
||||
if (kattr->test.data_in || kattr->test.data_out ||
|
||||
kattr->test.ctx_out || kattr->test.duration ||
|
||||
kattr->test.repeat || kattr->test.flags)
|
||||
kattr->test.repeat || kattr->test.flags ||
|
||||
kattr->test.batch_size)
|
||||
return -EINVAL;
|
||||
|
||||
if (ctx_size_in < prog->aux->max_ctx_offset ||
|
||||
|
@ -141,7 +141,7 @@ static int bpf_fd_sk_storage_update_elem(struct bpf_map *map, void *key,
|
||||
if (sock) {
|
||||
sdata = bpf_local_storage_update(
|
||||
sock->sk, (struct bpf_local_storage_map *)map, value,
|
||||
map_flags);
|
||||
map_flags, GFP_ATOMIC);
|
||||
sockfd_put(sock);
|
||||
return PTR_ERR_OR_ZERO(sdata);
|
||||
}
|
||||
@ -172,7 +172,7 @@ bpf_sk_storage_clone_elem(struct sock *newsk,
|
||||
{
|
||||
struct bpf_local_storage_elem *copy_selem;
|
||||
|
||||
copy_selem = bpf_selem_alloc(smap, newsk, NULL, true);
|
||||
copy_selem = bpf_selem_alloc(smap, newsk, NULL, true, GFP_ATOMIC);
|
||||
if (!copy_selem)
|
||||
return NULL;
|
||||
|
||||
@ -230,7 +230,7 @@ int bpf_sk_storage_clone(const struct sock *sk, struct sock *newsk)
|
||||
bpf_selem_link_map(smap, copy_selem);
|
||||
bpf_selem_link_storage_nolock(new_sk_storage, copy_selem);
|
||||
} else {
|
||||
ret = bpf_local_storage_alloc(newsk, smap, copy_selem);
|
||||
ret = bpf_local_storage_alloc(newsk, smap, copy_selem, GFP_ATOMIC);
|
||||
if (ret) {
|
||||
kfree(copy_selem);
|
||||
atomic_sub(smap->elem_size,
|
||||
@ -255,8 +255,9 @@ out:
|
||||
return ret;
|
||||
}
|
||||
|
||||
BPF_CALL_4(bpf_sk_storage_get, struct bpf_map *, map, struct sock *, sk,
|
||||
void *, value, u64, flags)
|
||||
/* *gfp_flags* is a hidden argument provided by the verifier */
|
||||
BPF_CALL_5(bpf_sk_storage_get, struct bpf_map *, map, struct sock *, sk,
|
||||
void *, value, u64, flags, gfp_t, gfp_flags)
|
||||
{
|
||||
struct bpf_local_storage_data *sdata;
|
||||
|
||||
@ -277,7 +278,7 @@ BPF_CALL_4(bpf_sk_storage_get, struct bpf_map *, map, struct sock *, sk,
|
||||
refcount_inc_not_zero(&sk->sk_refcnt)) {
|
||||
sdata = bpf_local_storage_update(
|
||||
sk, (struct bpf_local_storage_map *)map, value,
|
||||
BPF_NOEXIST);
|
||||
BPF_NOEXIST, gfp_flags);
|
||||
/* sk must be a fullsock (guaranteed by verifier),
|
||||
* so sock_gen_put() is unnecessary.
|
||||
*/
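/* Hedged BPF-side sketch (not part of this diff): programs keep the familiar
 * four-argument bpf_sk_storage_get() call; the fifth gfp_flags argument is
 * appended by the verifier (GFP_KERNEL for sleepable programs, GFP_ATOMIC
 * otherwise). Map name, value type and attach point are assumptions.
 */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

struct {
	__uint(type, BPF_MAP_TYPE_SK_STORAGE);
	__uint(map_flags, BPF_F_NO_PREALLOC);
	__type(key, int);
	__type(value, __u64);
} sk_pkts SEC(".maps");

SEC("cgroup_skb/egress")
int count_pkts(struct __sk_buff *skb)
{
	struct bpf_sock *sk = skb->sk;
	__u64 *val;

	if (!sk)
		return 1;
	sk = bpf_sk_fullsock(sk);
	if (!sk)
		return 1;
	val = bpf_sk_storage_get(&sk_pkts, sk, 0, BPF_SK_STORAGE_GET_F_CREATE);
	if (val)
		__sync_fetch_and_add(val, 1);
	return 1;	/* allow the packet */
}

char LICENSE[] SEC("license") = "GPL";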
|
||||
@ -405,6 +406,8 @@ static bool bpf_sk_storage_tracing_allowed(const struct bpf_prog *prog)
|
||||
case BPF_TRACE_FENTRY:
|
||||
case BPF_TRACE_FEXIT:
|
||||
btf_vmlinux = bpf_get_btf_vmlinux();
|
||||
if (IS_ERR_OR_NULL(btf_vmlinux))
|
||||
return false;
|
||||
btf_id = prog->aux->attach_btf_id;
|
||||
t = btf_type_by_id(btf_vmlinux, btf_id);
|
||||
tname = btf_name_by_offset(btf_vmlinux, t->name_off);
|
||||
@ -417,14 +420,16 @@ static bool bpf_sk_storage_tracing_allowed(const struct bpf_prog *prog)
|
||||
return false;
|
||||
}
|
||||
|
||||
BPF_CALL_4(bpf_sk_storage_get_tracing, struct bpf_map *, map, struct sock *, sk,
|
||||
void *, value, u64, flags)
|
||||
/* *gfp_flags* is a hidden argument provided by the verifier */
|
||||
BPF_CALL_5(bpf_sk_storage_get_tracing, struct bpf_map *, map, struct sock *, sk,
|
||||
void *, value, u64, flags, gfp_t, gfp_flags)
|
||||
{
|
||||
WARN_ON_ONCE(!bpf_rcu_lock_held());
|
||||
if (in_hardirq() || in_nmi())
|
||||
return (unsigned long)NULL;
|
||||
|
||||
return (unsigned long)____bpf_sk_storage_get(map, sk, value, flags);
|
||||
return (unsigned long)____bpf_sk_storage_get(map, sk, value, flags,
|
||||
gfp_flags);
|
||||
}
|
||||
|
||||
BPF_CALL_2(bpf_sk_storage_delete_tracing, struct bpf_map *, map,
|
||||
|
@ -7388,36 +7388,36 @@ static const struct bpf_func_proto bpf_sock_ops_reserve_hdr_opt_proto = {
|
||||
.arg3_type = ARG_ANYTHING,
|
||||
};
|
||||
|
||||
BPF_CALL_3(bpf_skb_set_delivery_time, struct sk_buff *, skb,
|
||||
u64, dtime, u32, dtime_type)
|
||||
BPF_CALL_3(bpf_skb_set_tstamp, struct sk_buff *, skb,
|
||||
u64, tstamp, u32, tstamp_type)
|
||||
{
|
||||
/* skb_clear_delivery_time() is done for inet protocol */
|
||||
if (skb->protocol != htons(ETH_P_IP) &&
|
||||
skb->protocol != htons(ETH_P_IPV6))
|
||||
return -EOPNOTSUPP;
|
||||
|
||||
switch (dtime_type) {
|
||||
case BPF_SKB_DELIVERY_TIME_MONO:
|
||||
if (!dtime)
|
||||
switch (tstamp_type) {
|
||||
case BPF_SKB_TSTAMP_DELIVERY_MONO:
|
||||
if (!tstamp)
|
||||
return -EINVAL;
|
||||
skb->tstamp = dtime;
|
||||
skb->tstamp = tstamp;
|
||||
skb->mono_delivery_time = 1;
|
||||
break;
|
||||
case BPF_SKB_DELIVERY_TIME_NONE:
|
||||
if (dtime)
|
||||
case BPF_SKB_TSTAMP_UNSPEC:
|
||||
if (tstamp)
|
||||
return -EINVAL;
|
||||
skb->tstamp = 0;
|
||||
skb->mono_delivery_time = 0;
|
||||
break;
|
||||
default:
|
||||
return -EOPNOTSUPP;
|
||||
return -EINVAL;
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static const struct bpf_func_proto bpf_skb_set_delivery_time_proto = {
|
||||
.func = bpf_skb_set_delivery_time,
|
||||
static const struct bpf_func_proto bpf_skb_set_tstamp_proto = {
|
||||
.func = bpf_skb_set_tstamp,
|
||||
.gpl_only = false,
|
||||
.ret_type = RET_INTEGER,
|
||||
.arg1_type = ARG_PTR_TO_CTX,
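/* Hedged BPF-side sketch (not part of this diff): calling the renamed helper
 * from a tc egress program to set a monotonic delivery time. The 1 ms offset
 * is arbitrary; the helper and enum names come from the renamed UAPI above.
 */
#include <linux/bpf.h>
#include <linux/pkt_cls.h>
#include <bpf/bpf_helpers.h>

SEC("tc")
int set_edt(struct __sk_buff *skb)
{
	__u64 now = bpf_ktime_get_ns();

	/* returns -EOPNOTSUPP for non-inet packets, -EINVAL for a bad type/tstamp */
	bpf_skb_set_tstamp(skb, now + 1000000, BPF_SKB_TSTAMP_DELIVERY_MONO);
	return TC_ACT_OK;
}

char LICENSE[] SEC("license") = "GPL";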
|
||||
@ -7786,8 +7786,8 @@ tc_cls_act_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
|
||||
return &bpf_tcp_gen_syncookie_proto;
|
||||
case BPF_FUNC_sk_assign:
|
||||
return &bpf_sk_assign_proto;
|
||||
case BPF_FUNC_skb_set_delivery_time:
|
||||
return &bpf_skb_set_delivery_time_proto;
|
||||
case BPF_FUNC_skb_set_tstamp:
|
||||
return &bpf_skb_set_tstamp_proto;
|
||||
#endif
|
||||
default:
|
||||
return bpf_sk_base_func_proto(func_id);
|
||||
@ -8127,9 +8127,9 @@ static bool bpf_skb_is_valid_access(int off, int size, enum bpf_access_type type
|
||||
return false;
|
||||
info->reg_type = PTR_TO_SOCK_COMMON_OR_NULL;
|
||||
break;
|
||||
case offsetof(struct __sk_buff, delivery_time_type):
|
||||
case offsetof(struct __sk_buff, tstamp_type):
|
||||
return false;
|
||||
case offsetofend(struct __sk_buff, delivery_time_type) ... offsetof(struct __sk_buff, hwtstamp) - 1:
|
||||
case offsetofend(struct __sk_buff, tstamp_type) ... offsetof(struct __sk_buff, hwtstamp) - 1:
|
||||
/* Explicitly prohibit access to padding in __sk_buff. */
|
||||
return false;
|
||||
default:
|
||||
@ -8484,14 +8484,14 @@ static bool tc_cls_act_is_valid_access(int off, int size,
|
||||
break;
|
||||
case bpf_ctx_range_till(struct __sk_buff, family, local_port):
|
||||
return false;
|
||||
case offsetof(struct __sk_buff, delivery_time_type):
|
||||
case offsetof(struct __sk_buff, tstamp_type):
|
||||
/* The convert_ctx_access() on reading and writing
|
||||
* __sk_buff->tstamp depends on whether the bpf prog
|
||||
* has used __sk_buff->delivery_time_type or not.
|
||||
* Thus, we need to set prog->delivery_time_access
|
||||
* has used __sk_buff->tstamp_type or not.
|
||||
* Thus, we need to set prog->tstamp_type_access
|
||||
* earlier during is_valid_access() here.
|
||||
*/
|
||||
((struct bpf_prog *)prog)->delivery_time_access = 1;
|
||||
((struct bpf_prog *)prog)->tstamp_type_access = 1;
|
||||
return size == sizeof(__u8);
|
||||
}
|
||||
|
||||
@ -8888,42 +8888,22 @@ static u32 flow_dissector_convert_ctx_access(enum bpf_access_type type,
|
||||
return insn - insn_buf;
|
||||
}
|
||||
|
||||
static struct bpf_insn *bpf_convert_dtime_type_read(const struct bpf_insn *si,
|
||||
struct bpf_insn *insn)
|
||||
static struct bpf_insn *bpf_convert_tstamp_type_read(const struct bpf_insn *si,
|
||||
struct bpf_insn *insn)
|
||||
{
|
||||
__u8 value_reg = si->dst_reg;
|
||||
__u8 skb_reg = si->src_reg;
|
||||
/* AX is needed because src_reg and dst_reg could be the same */
|
||||
__u8 tmp_reg = BPF_REG_AX;
|
||||
|
||||
*insn++ = BPF_LDX_MEM(BPF_B, tmp_reg, skb_reg,
|
||||
SKB_MONO_DELIVERY_TIME_OFFSET);
|
||||
*insn++ = BPF_ALU32_IMM(BPF_AND, tmp_reg,
|
||||
SKB_MONO_DELIVERY_TIME_MASK);
|
||||
*insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg, 0, 2);
|
||||
/* value_reg = BPF_SKB_DELIVERY_TIME_MONO */
|
||||
*insn++ = BPF_MOV32_IMM(value_reg, BPF_SKB_DELIVERY_TIME_MONO);
|
||||
*insn++ = BPF_JMP_A(IS_ENABLED(CONFIG_NET_CLS_ACT) ? 10 : 5);
|
||||
|
||||
*insn++ = BPF_LDX_MEM(BPF_DW, tmp_reg, skb_reg,
|
||||
offsetof(struct sk_buff, tstamp));
|
||||
*insn++ = BPF_JMP_IMM(BPF_JNE, tmp_reg, 0, 2);
|
||||
/* value_reg = BPF_SKB_DELIVERY_TIME_NONE */
|
||||
*insn++ = BPF_MOV32_IMM(value_reg, BPF_SKB_DELIVERY_TIME_NONE);
|
||||
*insn++ = BPF_JMP_A(IS_ENABLED(CONFIG_NET_CLS_ACT) ? 6 : 1);
|
||||
|
||||
#ifdef CONFIG_NET_CLS_ACT
|
||||
*insn++ = BPF_LDX_MEM(BPF_B, tmp_reg, skb_reg, TC_AT_INGRESS_OFFSET);
|
||||
*insn++ = BPF_ALU32_IMM(BPF_AND, tmp_reg, TC_AT_INGRESS_MASK);
|
||||
*insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg, 0, 2);
|
||||
/* At ingress, value_reg = 0 */
|
||||
*insn++ = BPF_MOV32_IMM(value_reg, 0);
|
||||
PKT_VLAN_PRESENT_OFFSET);
|
||||
*insn++ = BPF_JMP32_IMM(BPF_JSET, tmp_reg,
|
||||
SKB_MONO_DELIVERY_TIME_MASK, 2);
|
||||
*insn++ = BPF_MOV32_IMM(value_reg, BPF_SKB_TSTAMP_UNSPEC);
|
||||
*insn++ = BPF_JMP_A(1);
|
||||
#endif
|
||||
*insn++ = BPF_MOV32_IMM(value_reg, BPF_SKB_TSTAMP_DELIVERY_MONO);
|
||||
|
||||
/* value_reg = BPF_SKB_DELIVERY_TIME_UNSPEC */
|
||||
*insn++ = BPF_MOV32_IMM(value_reg, BPF_SKB_DELIVERY_TIME_UNSPEC);
|
||||
|
||||
/* 15 insns with CONFIG_NET_CLS_ACT */
|
||||
return insn;
|
||||
}
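/* Hedged BPF-side sketch (not part of this diff): reading the renamed
 * __sk_buff->tstamp_type field. Touching tstamp_type sets tstamp_type_access,
 * so skb->tstamp is then read and written as-is without the rewrite above.
 */
#include <linux/bpf.h>
#include <linux/pkt_cls.h>
#include <bpf/bpf_helpers.h>

SEC("tc")
int log_tstamp(struct __sk_buff *skb)
{
	if (skb->tstamp_type == BPF_SKB_TSTAMP_DELIVERY_MONO)
		bpf_printk("mono delivery time %llu", skb->tstamp);
	return TC_ACT_OK;
}

char LICENSE[] SEC("license") = "GPL";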
|
||||
|
||||
@ -8956,21 +8936,22 @@ static struct bpf_insn *bpf_convert_tstamp_read(const struct bpf_prog *prog,
|
||||
__u8 skb_reg = si->src_reg;
|
||||
|
||||
#ifdef CONFIG_NET_CLS_ACT
|
||||
if (!prog->delivery_time_access) {
|
||||
/* If the tstamp_type is read,
|
||||
* the bpf prog is aware the tstamp could have delivery time.
|
||||
* Thus, read skb->tstamp as is if tstamp_type_access is true.
|
||||
*/
|
||||
if (!prog->tstamp_type_access) {
|
||||
/* AX is needed because src_reg and dst_reg could be the same */
|
||||
__u8 tmp_reg = BPF_REG_AX;
|
||||
|
||||
*insn++ = BPF_LDX_MEM(BPF_B, tmp_reg, skb_reg, TC_AT_INGRESS_OFFSET);
|
||||
*insn++ = BPF_ALU32_IMM(BPF_AND, tmp_reg, TC_AT_INGRESS_MASK);
|
||||
*insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg, 0, 5);
|
||||
/* @ingress, read __sk_buff->tstamp as the (rcv) timestamp,
|
||||
* so check the skb->mono_delivery_time.
|
||||
*/
|
||||
*insn++ = BPF_LDX_MEM(BPF_B, tmp_reg, skb_reg,
|
||||
SKB_MONO_DELIVERY_TIME_OFFSET);
|
||||
*insn++ = BPF_LDX_MEM(BPF_B, tmp_reg, skb_reg, PKT_VLAN_PRESENT_OFFSET);
|
||||
*insn++ = BPF_ALU32_IMM(BPF_AND, tmp_reg,
|
||||
SKB_MONO_DELIVERY_TIME_MASK);
|
||||
*insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg, 0, 2);
|
||||
/* skb->mono_delivery_time is set, read 0 as the (rcv) timestamp. */
|
||||
TC_AT_INGRESS_MASK | SKB_MONO_DELIVERY_TIME_MASK);
|
||||
*insn++ = BPF_JMP32_IMM(BPF_JNE, tmp_reg,
|
||||
TC_AT_INGRESS_MASK | SKB_MONO_DELIVERY_TIME_MASK, 2);
|
||||
/* skb->tc_at_ingress && skb->mono_delivery_time,
|
||||
* read 0 as the (rcv) timestamp.
|
||||
*/
|
||||
*insn++ = BPF_MOV64_IMM(value_reg, 0);
|
||||
*insn++ = BPF_JMP_A(1);
|
||||
}
|
||||
@ -8989,25 +8970,27 @@ static struct bpf_insn *bpf_convert_tstamp_write(const struct bpf_prog *prog,
|
||||
__u8 skb_reg = si->dst_reg;
|
||||
|
||||
#ifdef CONFIG_NET_CLS_ACT
|
||||
if (!prog->delivery_time_access) {
|
||||
/* If the tstamp_type is read,
|
||||
* the bpf prog is aware the tstamp could have delivery time.
|
||||
* Thus, write skb->tstamp as is if tstamp_type_access is true.
|
||||
* Otherwise, writing at ingress will have to clear the
|
||||
* mono_delivery_time bit also.
|
||||
*/
|
||||
if (!prog->tstamp_type_access) {
|
||||
__u8 tmp_reg = BPF_REG_AX;
|
||||
|
||||
*insn++ = BPF_LDX_MEM(BPF_B, tmp_reg, skb_reg, TC_AT_INGRESS_OFFSET);
|
||||
*insn++ = BPF_ALU32_IMM(BPF_AND, tmp_reg, TC_AT_INGRESS_MASK);
|
||||
*insn++ = BPF_JMP32_IMM(BPF_JEQ, tmp_reg, 0, 3);
|
||||
/* Writing __sk_buff->tstamp at ingress as the (rcv) timestamp.
|
||||
* Clear the skb->mono_delivery_time.
|
||||
*/
|
||||
*insn++ = BPF_LDX_MEM(BPF_B, tmp_reg, skb_reg,
|
||||
SKB_MONO_DELIVERY_TIME_OFFSET);
|
||||
*insn++ = BPF_ALU32_IMM(BPF_AND, tmp_reg,
|
||||
~SKB_MONO_DELIVERY_TIME_MASK);
|
||||
*insn++ = BPF_STX_MEM(BPF_B, skb_reg, tmp_reg,
|
||||
SKB_MONO_DELIVERY_TIME_OFFSET);
|
||||
*insn++ = BPF_LDX_MEM(BPF_B, tmp_reg, skb_reg, PKT_VLAN_PRESENT_OFFSET);
|
||||
/* Writing __sk_buff->tstamp as ingress, goto <clear> */
|
||||
*insn++ = BPF_JMP32_IMM(BPF_JSET, tmp_reg, TC_AT_INGRESS_MASK, 1);
|
||||
/* goto <store> */
|
||||
*insn++ = BPF_JMP_A(2);
|
||||
/* <clear>: mono_delivery_time */
|
||||
*insn++ = BPF_ALU32_IMM(BPF_AND, tmp_reg, ~SKB_MONO_DELIVERY_TIME_MASK);
|
||||
*insn++ = BPF_STX_MEM(BPF_B, skb_reg, tmp_reg, PKT_VLAN_PRESENT_OFFSET);
|
||||
}
|
||||
#endif
|
||||
|
||||
/* skb->tstamp = tstamp */
|
||||
/* <store>: skb->tstamp = tstamp */
|
||||
*insn++ = BPF_STX_MEM(BPF_DW, skb_reg, value_reg,
|
||||
offsetof(struct sk_buff, tstamp));
|
||||
return insn;
|
||||
@ -9326,8 +9309,8 @@ static u32 bpf_convert_ctx_access(enum bpf_access_type type,
|
||||
insn = bpf_convert_tstamp_read(prog, si, insn);
|
||||
break;
|
||||
|
||||
case offsetof(struct __sk_buff, delivery_time_type):
|
||||
insn = bpf_convert_dtime_type_read(si, insn);
|
||||
case offsetof(struct __sk_buff, tstamp_type):
|
||||
insn = bpf_convert_tstamp_type_read(si, insn);
|
||||
break;
|
||||
|
||||
case offsetof(struct __sk_buff, gso_segs):
|
||||
@ -11006,13 +10989,24 @@ static bool sk_lookup_is_valid_access(int off, int size,
|
||||
case bpf_ctx_range(struct bpf_sk_lookup, local_ip4):
|
||||
case bpf_ctx_range_till(struct bpf_sk_lookup, remote_ip6[0], remote_ip6[3]):
|
||||
case bpf_ctx_range_till(struct bpf_sk_lookup, local_ip6[0], local_ip6[3]):
|
||||
case offsetof(struct bpf_sk_lookup, remote_port) ...
|
||||
offsetof(struct bpf_sk_lookup, local_ip4) - 1:
|
||||
case bpf_ctx_range(struct bpf_sk_lookup, local_port):
|
||||
case bpf_ctx_range(struct bpf_sk_lookup, ingress_ifindex):
|
||||
bpf_ctx_record_field_size(info, sizeof(__u32));
|
||||
return bpf_ctx_narrow_access_ok(off, size, sizeof(__u32));
|
||||
|
||||
case bpf_ctx_range(struct bpf_sk_lookup, remote_port):
|
||||
/* Allow 4-byte access to 2-byte field for backward compatibility */
|
||||
if (size == sizeof(__u32))
|
||||
return true;
|
||||
bpf_ctx_record_field_size(info, sizeof(__be16));
|
||||
return bpf_ctx_narrow_access_ok(off, size, sizeof(__be16));
|
||||
|
||||
case offsetofend(struct bpf_sk_lookup, remote_port) ...
|
||||
offsetof(struct bpf_sk_lookup, local_ip4) - 1:
|
||||
/* Allow access to zero padding for backward compatibility */
|
||||
bpf_ctx_record_field_size(info, sizeof(__u16));
|
||||
return bpf_ctx_narrow_access_ok(off, size, sizeof(__u16));
|
||||
|
||||
default:
|
||||
return false;
|
||||
}
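/* Hedged BPF-side sketch (not part of this diff): remote_port is now a
 * 2-byte, network-order field, with 4-byte loads still accepted for backward
 * compatibility. Port 53 is an arbitrary example.
 */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

SEC("sk_lookup")
int reject_port_53(struct bpf_sk_lookup *ctx)
{
	if (ctx->remote_port == bpf_htons(53))
		return SK_DROP;
	return SK_PASS;
}

char LICENSE[] SEC("license") = "GPL";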
|
||||
@ -11094,6 +11088,11 @@ static u32 sk_lookup_convert_ctx_access(enum bpf_access_type type,
|
||||
sport, 2, target_size));
|
||||
break;
|
||||
|
||||
case offsetofend(struct bpf_sk_lookup, remote_port):
|
||||
*target_size = 2;
|
||||
*insn++ = BPF_MOV32_IMM(si->dst_reg, 0);
|
||||
break;
|
||||
|
||||
case offsetof(struct bpf_sk_lookup, local_port):
|
||||
*insn++ = BPF_LDX_MEM(BPF_H, si->dst_reg, si->src_reg,
|
||||
bpf_target_off(struct bpf_sk_lookup_kern,
|
||||
|
@ -27,6 +27,7 @@ int sk_msg_alloc(struct sock *sk, struct sk_msg *msg, int len,
|
||||
int elem_first_coalesce)
|
||||
{
|
||||
struct page_frag *pfrag = sk_page_frag(sk);
|
||||
u32 osize = msg->sg.size;
|
||||
int ret = 0;
|
||||
|
||||
len -= msg->sg.size;
|
||||
@ -35,13 +36,17 @@ int sk_msg_alloc(struct sock *sk, struct sk_msg *msg, int len,
|
||||
u32 orig_offset;
|
||||
int use, i;
|
||||
|
||||
if (!sk_page_frag_refill(sk, pfrag))
|
||||
return -ENOMEM;
|
||||
if (!sk_page_frag_refill(sk, pfrag)) {
|
||||
ret = -ENOMEM;
|
||||
goto msg_trim;
|
||||
}
|
||||
|
||||
orig_offset = pfrag->offset;
|
||||
use = min_t(int, len, pfrag->size - orig_offset);
|
||||
if (!sk_wmem_schedule(sk, use))
|
||||
return -ENOMEM;
|
||||
if (!sk_wmem_schedule(sk, use)) {
|
||||
ret = -ENOMEM;
|
||||
goto msg_trim;
|
||||
}
|
||||
|
||||
i = msg->sg.end;
|
||||
sk_msg_iter_var_prev(i);
|
||||
@ -71,6 +76,10 @@ int sk_msg_alloc(struct sock *sk, struct sk_msg *msg, int len,
|
||||
}
|
||||
|
||||
return ret;
|
||||
|
||||
msg_trim:
|
||||
sk_msg_trim(sk, msg, osize);
|
||||
return ret;
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(sk_msg_alloc);
|
||||
|
||||
|
@ -529,6 +529,7 @@ void xdp_return_buff(struct xdp_buff *xdp)
|
||||
out:
|
||||
__xdp_return(xdp->data, &xdp->rxq->mem, true, xdp);
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(xdp_return_buff);
|
||||
|
||||
/* Only called for MEM_TYPE_PAGE_POOL see xdp.h */
|
||||
void __xdp_release_frame(void *data, struct xdp_mem_info *mem)
|
||||
|
@ -138,10 +138,9 @@ int tcp_bpf_sendmsg_redir(struct sock *sk, struct sk_msg *msg,
|
||||
struct sk_psock *psock = sk_psock_get(sk);
|
||||
int ret;
|
||||
|
||||
if (unlikely(!psock)) {
|
||||
sk_msg_free(sk, msg);
|
||||
return 0;
|
||||
}
|
||||
if (unlikely(!psock))
|
||||
return -EPIPE;
|
||||
|
||||
ret = ingress ? bpf_tcp_ingress(sk, psock, msg, bytes, flags) :
|
||||
tcp_bpf_push_locked(sk, msg, bytes, flags, false);
|
||||
sk_psock_put(sk, psock);
|
||||
@ -335,7 +334,7 @@ more_data:
|
||||
cork = true;
|
||||
psock->cork = NULL;
|
||||
}
|
||||
sk_msg_return(sk, msg, tosend);
|
||||
sk_msg_return(sk, msg, msg->sg.size);
|
||||
release_sock(sk);
|
||||
|
||||
ret = tcp_bpf_sendmsg_redir(sk_redir, msg, tosend, flags);
|
||||
@ -375,8 +374,11 @@ more_data:
|
||||
}
|
||||
if (msg &&
|
||||
msg->sg.data[msg->sg.start].page_link &&
|
||||
msg->sg.data[msg->sg.start].length)
|
||||
msg->sg.data[msg->sg.start].length) {
|
||||
if (eval == __SK_REDIRECT)
|
||||
sk_mem_charge(sk, msg->sg.size);
|
||||
goto more_data;
|
||||
}
|
||||
}
|
||||
return ret;
|
||||
}
|
||||
|
@ -12,6 +12,7 @@
|
||||
#include <linux/btf_ids.h>
|
||||
#include <linux/net_namespace.h>
|
||||
#include <net/netfilter/nf_conntrack.h>
|
||||
#include <net/netfilter/nf_conntrack_bpf.h>
|
||||
#include <net/netfilter/nf_conntrack_core.h>
|
||||
|
||||
/* bpf_ct_opts - Options for CT lookup helpers
|
||||
@ -102,8 +103,8 @@ static struct nf_conn *__bpf_nf_ct_lookup(struct net *net,
|
||||
}
|
||||
|
||||
__diag_push();
|
||||
__diag_ignore(GCC, 8, "-Wmissing-prototypes",
|
||||
"Global functions as their definitions will be in nf_conntrack BTF");
|
||||
__diag_ignore_all("-Wmissing-prototypes",
|
||||
"Global functions as their definitions will be in nf_conntrack BTF");
|
||||
|
||||
/* bpf_xdp_ct_lookup - Lookup CT entry for the given tuple, and acquire a
|
||||
* reference to it
|
||||
|
@ -73,6 +73,13 @@ config SAMPLE_HW_BREAKPOINT
|
||||
help
|
||||
This builds kernel hardware breakpoint example modules.
|
||||
|
||||
config SAMPLE_FPROBE
|
||||
tristate "Build fprobe examples -- loadable modules only"
|
||||
depends on FPROBE && m
|
||||
help
|
||||
This builds a fprobe example module. This module has an option 'symbol'.
|
||||
You can specify a probed symbol or symbols separated with ','.
|
||||
|
||||
config SAMPLE_KFIFO
|
||||
tristate "Build kfifo examples -- loadable modules only"
|
||||
depends on m
|
||||
|
@ -33,3 +33,4 @@ subdir-$(CONFIG_SAMPLE_WATCHDOG) += watchdog
|
||||
subdir-$(CONFIG_SAMPLE_WATCH_QUEUE) += watch_queue
|
||||
obj-$(CONFIG_DEBUG_KMEMLEAK_TEST) += kmemleak/
|
||||
obj-$(CONFIG_SAMPLE_CORESIGHT_SYSCFG) += coresight/
|
||||
obj-$(CONFIG_SAMPLE_FPROBE) += fprobe/
|
||||
|
@ -1984,15 +1984,15 @@ int main(int argc, char **argv)
|
||||
|
||||
setlocale(LC_ALL, "");
|
||||
|
||||
prev_time = get_nsecs();
|
||||
start_time = prev_time;
|
||||
|
||||
if (!opt_quiet) {
|
||||
ret = pthread_create(&pt, NULL, poller, NULL);
|
||||
if (ret)
|
||||
exit_with_error(ret);
|
||||
}
|
||||
|
||||
prev_time = get_nsecs();
|
||||
start_time = prev_time;
|
||||
|
||||
/* Configure sched priority for better wake-up accuracy */
|
||||
memset(&schparam, 0, sizeof(schparam));
|
||||
schparam.sched_priority = opt_schprio;
|
||||
|
3
samples/fprobe/Makefile
Normal file
@ -0,0 +1,3 @@
|
||||
# SPDX-License-Identifier: GPL-2.0-only
|
||||
|
||||
obj-$(CONFIG_SAMPLE_FPROBE) += fprobe_example.o
|
120
samples/fprobe/fprobe_example.c
Normal file
@ -0,0 +1,120 @@
|
||||
// SPDX-License-Identifier: GPL-2.0-only
|
||||
/*
|
||||
* Here's a sample kernel module showing the use of fprobe to dump a
|
||||
* stack trace and selected registers when kernel_clone() is called.
|
||||
*
|
||||
* For more information on theory of operation of kprobes, see
|
||||
* Documentation/trace/kprobes.rst
|
||||
*
|
||||
* You will see the trace data in /var/log/messages and on the console
|
||||
* whenever kernel_clone() is invoked to create a new process.
|
||||
*/
|
||||
|
||||
#define pr_fmt(fmt) "%s: " fmt, __func__
|
||||
|
||||
#include <linux/kernel.h>
|
||||
#include <linux/module.h>
|
||||
#include <linux/fprobe.h>
|
||||
#include <linux/sched/debug.h>
|
||||
#include <linux/slab.h>
|
||||
|
||||
#define BACKTRACE_DEPTH 16
|
||||
#define MAX_SYMBOL_LEN 4096
|
||||
struct fprobe sample_probe;
|
||||
|
||||
static char symbol[MAX_SYMBOL_LEN] = "kernel_clone";
|
||||
module_param_string(symbol, symbol, sizeof(symbol), 0644);
|
||||
static char nosymbol[MAX_SYMBOL_LEN] = "";
|
||||
module_param_string(nosymbol, nosymbol, sizeof(nosymbol), 0644);
|
||||
static bool stackdump = true;
|
||||
module_param(stackdump, bool, 0644);
|
||||
|
||||
static void show_backtrace(void)
|
||||
{
|
||||
unsigned long stacks[BACKTRACE_DEPTH];
|
||||
unsigned int len;
|
||||
|
||||
len = stack_trace_save(stacks, BACKTRACE_DEPTH, 2);
|
||||
stack_trace_print(stacks, len, 24);
|
||||
}
|
||||
|
||||
static void sample_entry_handler(struct fprobe *fp, unsigned long ip, struct pt_regs *regs)
|
||||
{
|
||||
pr_info("Enter <%pS> ip = 0x%p\n", (void *)ip, (void *)ip);
|
||||
if (stackdump)
|
||||
show_backtrace();
|
||||
}
|
||||
|
||||
static void sample_exit_handler(struct fprobe *fp, unsigned long ip, struct pt_regs *regs)
|
||||
{
|
||||
unsigned long rip = instruction_pointer(regs);
|
||||
|
||||
pr_info("Return from <%pS> ip = 0x%p to rip = 0x%p (%pS)\n",
|
||||
(void *)ip, (void *)ip, (void *)rip, (void *)rip);
|
||||
if (stackdump)
|
||||
show_backtrace();
|
||||
}
|
||||
|
||||
static int __init fprobe_init(void)
|
||||
{
|
||||
char *p, *symbuf = NULL;
|
||||
const char **syms;
|
||||
int ret, count, i;
|
||||
|
||||
sample_probe.entry_handler = sample_entry_handler;
|
||||
sample_probe.exit_handler = sample_exit_handler;
|
||||
|
||||
if (strchr(symbol, '*')) {
|
||||
/* filter based fprobe */
|
||||
ret = register_fprobe(&sample_probe, symbol,
|
||||
nosymbol[0] == '\0' ? NULL : nosymbol);
|
||||
goto out;
|
||||
} else if (!strchr(symbol, ',')) {
|
||||
symbuf = symbol;
|
||||
ret = register_fprobe_syms(&sample_probe, (const char **)&symbuf, 1);
|
||||
goto out;
|
||||
}
|
||||
|
||||
/* Comma separated symbols */
|
||||
symbuf = kstrdup(symbol, GFP_KERNEL);
|
||||
if (!symbuf)
|
||||
return -ENOMEM;
|
||||
p = symbuf;
|
||||
count = 1;
|
||||
while ((p = strchr(++p, ',')) != NULL)
|
||||
count++;
|
||||
|
||||
pr_info("%d symbols found\n", count);
|
||||
|
||||
syms = kcalloc(count, sizeof(char *), GFP_KERNEL);
|
||||
if (!syms) {
|
||||
kfree(symbuf);
|
||||
return -ENOMEM;
|
||||
}
|
||||
|
||||
p = symbuf;
|
||||
for (i = 0; i < count; i++)
|
||||
syms[i] = strsep(&p, ",");
|
||||
|
||||
ret = register_fprobe_syms(&sample_probe, syms, count);
|
||||
kfree(syms);
|
||||
kfree(symbuf);
|
||||
out:
|
||||
if (ret < 0)
|
||||
pr_err("register_fprobe failed, returned %d\n", ret);
|
||||
else
|
||||
pr_info("Planted fprobe at %s\n", symbol);
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
static void __exit fprobe_exit(void)
|
||||
{
|
||||
unregister_fprobe(&sample_probe);
|
||||
|
||||
pr_info("fprobe at %s unregistered\n", symbol);
|
||||
}
|
||||
|
||||
module_init(fprobe_init)
|
||||
module_exit(fprobe_exit)
|
||||
MODULE_LICENSE("GPL");
|
@ -418,6 +418,7 @@ int ima_file_mmap(struct file *file, unsigned long prot)
|
||||
|
||||
/**
|
||||
* ima_file_mprotect - based on policy, limit mprotect change
|
||||
* @vma: vm_area_struct protection is set to
|
||||
* @prot: contains the protection that will be applied by the kernel.
|
||||
*
|
||||
* Files can be mmap'ed read/write and later changed to execute to circumvent
|
||||
@ -519,20 +520,38 @@ int ima_file_check(struct file *file, int mask)
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(ima_file_check);
|
||||
|
||||
static int __ima_inode_hash(struct inode *inode, char *buf, size_t buf_size)
|
||||
static int __ima_inode_hash(struct inode *inode, struct file *file, char *buf,
|
||||
size_t buf_size)
|
||||
{
|
||||
struct integrity_iint_cache *iint;
|
||||
int hash_algo;
|
||||
struct integrity_iint_cache *iint = NULL, tmp_iint;
|
||||
int rc, hash_algo;
|
||||
|
||||
if (!ima_policy_flag)
|
||||
return -EOPNOTSUPP;
|
||||
if (ima_policy_flag) {
|
||||
iint = integrity_iint_find(inode);
|
||||
if (iint)
|
||||
mutex_lock(&iint->mutex);
|
||||
}
|
||||
|
||||
if ((!iint || !(iint->flags & IMA_COLLECTED)) && file) {
|
||||
if (iint)
|
||||
mutex_unlock(&iint->mutex);
|
||||
|
||||
memset(&tmp_iint, 0, sizeof(tmp_iint));
|
||||
tmp_iint.inode = inode;
|
||||
mutex_init(&tmp_iint.mutex);
|
||||
|
||||
rc = ima_collect_measurement(&tmp_iint, file, NULL, 0,
|
||||
ima_hash_algo, NULL);
|
||||
if (rc < 0)
|
||||
return -EOPNOTSUPP;
|
||||
|
||||
iint = &tmp_iint;
|
||||
mutex_lock(&iint->mutex);
|
||||
}
|
||||
|
||||
iint = integrity_iint_find(inode);
|
||||
if (!iint)
|
||||
return -EOPNOTSUPP;
|
||||
|
||||
mutex_lock(&iint->mutex);
|
||||
|
||||
/*
|
||||
* ima_file_hash can be called when ima_collect_measurement has still
|
||||
* not been called yet, so we might not always have a hash.
|
||||
@ -551,12 +570,14 @@ static int __ima_inode_hash(struct inode *inode, char *buf, size_t buf_size)
|
||||
hash_algo = iint->ima_hash->algo;
|
||||
mutex_unlock(&iint->mutex);
|
||||
|
||||
if (iint == &tmp_iint)
|
||||
kfree(iint->ima_hash);
|
||||
|
||||
return hash_algo;
|
||||
}
|
||||
|
||||
/**
|
||||
* ima_file_hash - return the stored measurement if a file has been hashed and
|
||||
* is in the iint cache.
|
||||
* ima_file_hash - return a measurement of the file
|
||||
* @file: pointer to the file
|
||||
* @buf: buffer in which to store the hash
|
||||
* @buf_size: length of the buffer
|
||||
@ -569,7 +590,7 @@ static int __ima_inode_hash(struct inode *inode, char *buf, size_t buf_size)
|
||||
* The file hash returned is based on the entire file, including the appended
|
||||
* signature.
|
||||
*
|
||||
* If IMA is disabled or if no measurement is available, return -EOPNOTSUPP.
|
||||
* If the measurement cannot be performed, return -EOPNOTSUPP.
|
||||
* If the parameters are incorrect, return -EINVAL.
|
||||
*/
|
||||
int ima_file_hash(struct file *file, char *buf, size_t buf_size)
|
||||
@ -577,7 +598,7 @@ int ima_file_hash(struct file *file, char *buf, size_t buf_size)
|
||||
if (!file)
|
||||
return -EINVAL;
|
||||
|
||||
return __ima_inode_hash(file_inode(file), buf, buf_size);
|
||||
return __ima_inode_hash(file_inode(file), file, buf, buf_size);
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(ima_file_hash);
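/* Hedged BPF-side sketch (not part of this diff): the on-demand measurement
 * above is what backs the new bpf_ima_file_hash() helper, callable from
 * sleepable LSM programs. The hook and buffer size are assumptions.
 */
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

SEC("lsm.s/bprm_committed_creds")
int BPF_PROG(measure_exec, struct linux_binprm *bprm)
{
	u8 digest[64];		/* large enough for the biggest supported hash */
	long algo;

	algo = bpf_ima_file_hash(bprm->file, digest, sizeof(digest));
	if (algo < 0)
		return 0;	/* measurement not available */
	/* digest[] now holds the file hash; a non-negative return identifies the algorithm */
	return 0;
}

char LICENSE[] SEC("license") = "GPL";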
|
||||
|
||||
@ -604,14 +625,14 @@ int ima_inode_hash(struct inode *inode, char *buf, size_t buf_size)
|
||||
if (!inode)
|
||||
return -EINVAL;
|
||||
|
||||
return __ima_inode_hash(inode, buf, buf_size);
|
||||
return __ima_inode_hash(inode, NULL, buf, buf_size);
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(ima_inode_hash);
|
||||
|
||||
/**
|
||||
* ima_post_create_tmpfile - mark newly created tmpfile as new
|
||||
* @mnt_userns: user namespace of the mount the inode was found from
|
||||
* @file : newly created tmpfile
|
||||
* @mnt_userns: user namespace of the mount the inode was found from
|
||||
* @inode: inode of the newly created tmpfile
|
||||
*
|
||||
* No measuring, appraising or auditing of newly created tmpfiles is needed.
|
||||
* Skip calling process_measurement(), but indicate which newly created
|
||||
@ -643,7 +664,7 @@ void ima_post_create_tmpfile(struct user_namespace *mnt_userns,
|
||||
|
||||
/**
|
||||
* ima_post_path_mknod - mark as a new inode
|
||||
* @mnt_userns: user namespace of the mount the inode was found from
|
||||
* @mnt_userns: user namespace of the mount the inode was found from
|
||||
* @dentry: newly created dentry
|
||||
*
|
||||
* Mark files created via the mknodat syscall as new, so that the
|
||||
@ -814,8 +835,8 @@ int ima_load_data(enum kernel_load_data_id id, bool contents)
|
||||
* ima_post_load_data - appraise decision based on policy
|
||||
* @buf: pointer to in memory file contents
|
||||
* @size: size of in memory file contents
|
||||
* @id: kernel load data caller identifier
|
||||
* @description: @id-specific description of contents
|
||||
* @load_id: kernel load data caller identifier
|
||||
* @description: @load_id-specific description of contents
|
||||
*
|
||||
* Measure/appraise/audit in memory buffer based on policy. Policy rules
|
||||
* are written in terms of a policy identifier.
|
||||
|
@ -25,6 +25,7 @@ GEN COMMANDS
|
||||
|
||||
| **bpftool** **gen object** *OUTPUT_FILE* *INPUT_FILE* [*INPUT_FILE*...]
|
||||
| **bpftool** **gen skeleton** *FILE* [**name** *OBJECT_NAME*]
|
||||
| **bpftool** **gen subskeleton** *FILE* [**name** *OBJECT_NAME*]
|
||||
| **bpftool** **gen min_core_btf** *INPUT* *OUTPUT* *OBJECT* [*OBJECT*...]
|
||||
| **bpftool** **gen help**
|
||||
|
||||
@ -150,6 +151,30 @@ DESCRIPTION
|
||||
(non-read-only) data from userspace, with same simplicity
|
||||
as for BPF side.
|
||||
|
||||
**bpftool gen subskeleton** *FILE*
|
||||
Generate BPF subskeleton C header file for a given *FILE*.
|
||||
|
||||
Subskeletons are similar to skeletons, except they do not own
|
||||
the corresponding maps, programs, or global variables. They
|
||||
require that the object file used to generate them is already
|
||||
loaded into a *bpf_object* by some other means.
|
||||
|
||||
This functionality is useful when a library is included into a
|
||||
larger BPF program. A subskeleton for the library would have
|
||||
access to all objects and globals defined in it, without
|
||||
having to know about the larger program.
|
||||
|
||||
Consequently, there are only two functions defined
|
||||
for subskeletons:
|
||||
|
||||
- **example__open(bpf_object\*)**
|
||||
Instantiates a subskeleton from an already opened (but not
|
||||
necessarily loaded) **bpf_object**.
|
||||
|
||||
- **example__destroy()**
|
||||
Frees the storage for the subskeleton but *does not* unload
|
||||
any BPF programs or maps.
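A minimal usage sketch, assuming a library object *liblog.bpf.o*
whose header was generated with ``bpftool gen subskeleton
liblog.bpf.o > liblog.subskel.h`` (all names are illustrative)::

  #include "liblog.subskel.h"

  /* obj is the larger, already opened bpf_object that linked in liblog */
  struct liblog *log = liblog__open(obj);

  if (!log)
          return -errno;
  *log->data.log_level = 2;    /* globals are reached through pointers */
  liblog__destroy(log);        /* frees the subskeleton, not the BPF objects */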
|
||||
|
||||
**bpftool** **gen min_core_btf** *INPUT* *OUTPUT* *OBJECT* [*OBJECT*...]
|
||||
Generate a minimum BTF file as *OUTPUT*, derived from a given
|
||||
*INPUT* BTF file, containing all needed BTF types so one, or
|
||||
|
@ -20,7 +20,8 @@ SYNOPSIS
|
||||
|
||||
**bpftool** **version**
|
||||
|
||||
*OBJECT* := { **map** | **program** | **cgroup** | **perf** | **net** | **feature** }
|
||||
*OBJECT* := { **map** | **program** | **link** | **cgroup** | **perf** | **net** | **feature** |
|
||||
**btf** | **gen** | **struct_ops** | **iter** }
|
||||
|
||||
*OPTIONS* := { { **-V** | **--version** } | |COMMON_OPTIONS| }
|
||||
|
||||
@ -31,6 +32,8 @@ SYNOPSIS
|
||||
*PROG-COMMANDS* := { **show** | **list** | **dump jited** | **dump xlated** | **pin** |
|
||||
**load** | **attach** | **detach** | **help** }
|
||||
|
||||
*LINK-COMMANDS* := { **show** | **list** | **pin** | **detach** | **help** }
|
||||
|
||||
*CGROUP-COMMANDS* := { **show** | **list** | **attach** | **detach** | **help** }
|
||||
|
||||
*PERF-COMMANDS* := { **show** | **list** | **help** }
|
||||
@ -39,6 +42,14 @@ SYNOPSIS
|
||||
|
||||
*FEATURE-COMMANDS* := { **probe** | **help** }
|
||||
|
||||
*BTF-COMMANDS* := { **show** | **list** | **dump** | **help** }
|
||||
|
||||
*GEN-COMMANDS* := { **object** | **skeleton** | **min_core_btf** | **help** }
|
||||
|
||||
*STRUCT-OPS-COMMANDS* := { **show** | **list** | **dump** | **register** | **unregister** | **help** }
|
||||
|
||||
*ITER-COMMANDS* := { **pin** | **help** }
|
||||
|
||||
DESCRIPTION
|
||||
===========
|
||||
*bpftool* allows for inspection and simple modification of BPF objects
|
||||
|
@ -1003,13 +1003,25 @@ _bpftool()
|
||||
;;
|
||||
esac
|
||||
;;
|
||||
subskeleton)
|
||||
case $prev in
|
||||
$command)
|
||||
_filedir
|
||||
return 0
|
||||
;;
|
||||
*)
|
||||
_bpftool_once_attr 'name'
|
||||
return 0
|
||||
;;
|
||||
esac
|
||||
;;
|
||||
min_core_btf)
|
||||
_filedir
|
||||
return 0
|
||||
;;
|
||||
*)
|
||||
[[ $prev == $object ]] && \
|
||||
COMPREPLY=( $( compgen -W 'object skeleton help min_core_btf' -- "$cur" ) )
|
||||
COMPREPLY=( $( compgen -W 'object skeleton subskeleton help min_core_btf' -- "$cur" ) )
|
||||
;;
|
||||
esac
|
||||
;;
|
||||
|
@ -56,7 +56,6 @@ const char * const attach_type_name[__MAX_BPF_ATTACH_TYPE] = {
|
||||
[BPF_CGROUP_UDP6_RECVMSG] = "recvmsg6",
|
||||
[BPF_CGROUP_GETSOCKOPT] = "getsockopt",
|
||||
[BPF_CGROUP_SETSOCKOPT] = "setsockopt",
|
||||
|
||||
[BPF_SK_SKB_STREAM_PARSER] = "sk_skb_stream_parser",
|
||||
[BPF_SK_SKB_STREAM_VERDICT] = "sk_skb_stream_verdict",
|
||||
[BPF_SK_SKB_VERDICT] = "sk_skb_verdict",
|
||||
@ -76,6 +75,7 @@ const char * const attach_type_name[__MAX_BPF_ATTACH_TYPE] = {
|
||||
[BPF_SK_REUSEPORT_SELECT] = "sk_skb_reuseport_select",
|
||||
[BPF_SK_REUSEPORT_SELECT_OR_MIGRATE] = "sk_skb_reuseport_select_or_migrate",
|
||||
[BPF_PERF_EVENT] = "perf_event",
|
||||
[BPF_TRACE_KPROBE_MULTI] = "trace_kprobe_multi",
|
||||
};
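/* Hedged userspace sketch (not part of this diff): attaching a program of the
 * new multi-kprobe type with the libbpf API added in this series. The glob
 * pattern, the assumed skeleton handle "skel->progs.trace_tcp" and the
 * libbpf 1.0 error handling are illustrative; the BPF side would carry a
 * SEC("kprobe.multi/tcp_*") annotation.
 */
	struct bpf_link *link;
	LIBBPF_OPTS(bpf_kprobe_multi_opts, opts,
		.retprobe = false,		/* true attaches as a return probe */
	);

	link = bpf_program__attach_kprobe_multi_opts(skel->progs.trace_tcp,
						     "tcp_*", &opts);
	if (!link)
		return -errno;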
|
||||
|
||||
void p_err(const char *fmt, ...)
|
||||
|
@ -3,6 +3,7 @@
|
||||
|
||||
#include <ctype.h>
|
||||
#include <errno.h>
|
||||
#include <fcntl.h>
|
||||
#include <string.h>
|
||||
#include <unistd.h>
|
||||
#include <net/if.h>
|
||||
@ -45,6 +46,11 @@ static bool run_as_unprivileged;
|
||||
|
||||
/* Miscellaneous utility functions */
|
||||
|
||||
static bool grep(const char *buffer, const char *pattern)
|
||||
{
|
||||
return !!strstr(buffer, pattern);
|
||||
}
|
||||
|
||||
static bool check_procfs(void)
|
||||
{
|
||||
struct statfs st_fs;
|
||||
@ -135,6 +141,32 @@ static void print_end_section(void)
|
||||
|
||||
/* Probing functions */
|
||||
|
||||
static int get_vendor_id(int ifindex)
|
||||
{
|
||||
char ifname[IF_NAMESIZE], path[64], buf[8];
|
||||
ssize_t len;
|
||||
int fd;
|
||||
|
||||
if (!if_indextoname(ifindex, ifname))
|
||||
return -1;
|
||||
|
||||
snprintf(path, sizeof(path), "/sys/class/net/%s/device/vendor", ifname);
|
||||
|
||||
fd = open(path, O_RDONLY | O_CLOEXEC);
|
||||
if (fd < 0)
|
||||
return -1;
|
||||
|
||||
len = read(fd, buf, sizeof(buf));
|
||||
close(fd);
|
||||
if (len < 0)
|
||||
return -1;
|
||||
if (len >= (ssize_t)sizeof(buf))
|
||||
return -1;
|
||||
buf[len] = '\0';
|
||||
|
||||
return strtol(buf, NULL, 0);
|
||||
}
|
||||
|
||||
static int read_procfs(const char *path)
|
||||
{
|
||||
char *endptr, *line = NULL;
|
||||
@ -478,6 +510,40 @@ static bool probe_bpf_syscall(const char *define_prefix)
|
||||
return res;
|
||||
}
|
||||
|
||||
static bool
|
||||
probe_prog_load_ifindex(enum bpf_prog_type prog_type,
|
||||
const struct bpf_insn *insns, size_t insns_cnt,
|
||||
char *log_buf, size_t log_buf_sz,
|
||||
__u32 ifindex)
|
||||
{
|
||||
LIBBPF_OPTS(bpf_prog_load_opts, opts,
|
||||
.log_buf = log_buf,
|
||||
.log_size = log_buf_sz,
|
||||
.log_level = log_buf ? 1 : 0,
|
||||
.prog_ifindex = ifindex,
|
||||
);
|
||||
int fd;
|
||||
|
||||
errno = 0;
|
||||
fd = bpf_prog_load(prog_type, NULL, "GPL", insns, insns_cnt, &opts);
|
||||
if (fd >= 0)
|
||||
close(fd);
|
||||
|
||||
return fd >= 0 && errno != EINVAL && errno != EOPNOTSUPP;
|
||||
}
|
||||
|
||||
static bool probe_prog_type_ifindex(enum bpf_prog_type prog_type, __u32 ifindex)
|
||||
{
|
||||
/* nfp returns -EINVAL on exit(0) with TC offload */
|
||||
struct bpf_insn insns[2] = {
|
||||
BPF_MOV64_IMM(BPF_REG_0, 2),
|
||||
BPF_EXIT_INSN()
|
||||
};
|
||||
|
||||
return probe_prog_load_ifindex(prog_type, insns, ARRAY_SIZE(insns),
|
||||
NULL, 0, ifindex);
|
||||
}
|
||||
|
||||
static void
|
||||
probe_prog_type(enum bpf_prog_type prog_type, bool *supported_types,
|
||||
const char *define_prefix, __u32 ifindex)
|
||||
@ -488,11 +554,19 @@ probe_prog_type(enum bpf_prog_type prog_type, bool *supported_types,
|
||||
bool res;
|
||||
|
||||
if (ifindex) {
|
||||
p_info("BPF offload feature probing is not supported");
|
||||
return;
|
||||
switch (prog_type) {
|
||||
case BPF_PROG_TYPE_SCHED_CLS:
|
||||
case BPF_PROG_TYPE_XDP:
|
||||
break;
|
||||
default:
|
||||
return;
|
||||
}
|
||||
|
||||
res = probe_prog_type_ifindex(prog_type, ifindex);
|
||||
} else {
|
||||
res = libbpf_probe_bpf_prog_type(prog_type, NULL);
|
||||
}
|
||||
|
||||
res = libbpf_probe_bpf_prog_type(prog_type, NULL);
|
||||
#ifdef USE_LIBCAP
|
||||
/* Probe may succeed even if program load fails, for unprivileged users
|
||||
* check that we did not fail because of insufficient permissions
|
||||
@ -521,6 +595,26 @@ probe_prog_type(enum bpf_prog_type prog_type, bool *supported_types,
|
||||
define_prefix);
|
||||
}
|
||||
|
||||
static bool probe_map_type_ifindex(enum bpf_map_type map_type, __u32 ifindex)
|
||||
{
|
||||
LIBBPF_OPTS(bpf_map_create_opts, opts);
|
||||
int key_size, value_size, max_entries;
|
||||
int fd;
|
||||
|
||||
opts.map_ifindex = ifindex;
|
||||
|
||||
key_size = sizeof(__u32);
|
||||
value_size = sizeof(__u32);
|
||||
max_entries = 1;
|
||||
|
||||
fd = bpf_map_create(map_type, NULL, key_size, value_size, max_entries,
|
||||
&opts);
|
||||
if (fd >= 0)
|
||||
close(fd);
|
||||
|
||||
return fd >= 0;
|
||||
}
|
||||
|
||||
static void
|
||||
probe_map_type(enum bpf_map_type map_type, const char *define_prefix,
|
||||
__u32 ifindex)
|
||||
@ -531,11 +625,18 @@ probe_map_type(enum bpf_map_type map_type, const char *define_prefix,
|
||||
bool res;
|
||||
|
||||
if (ifindex) {
|
||||
p_info("BPF offload feature probing is not supported");
|
||||
return;
|
||||
}
|
||||
switch (map_type) {
|
||||
case BPF_MAP_TYPE_HASH:
|
||||
case BPF_MAP_TYPE_ARRAY:
|
||||
break;
|
||||
default:
|
||||
return;
|
||||
}
|
||||
|
||||
res = libbpf_probe_bpf_map_type(map_type, NULL);
|
||||
res = probe_map_type_ifindex(map_type, ifindex);
|
||||
} else {
|
||||
res = libbpf_probe_bpf_map_type(map_type, NULL);
|
||||
}
|
||||
|
||||
/* Probe result depends on the success of map creation, no additional
|
||||
* check required for unprivileged users
|
||||
@ -559,6 +660,33 @@ probe_map_type(enum bpf_map_type map_type, const char *define_prefix,
|
||||
define_prefix);
|
||||
}
|
||||
|
||||
static bool
|
||||
probe_helper_ifindex(enum bpf_func_id id, enum bpf_prog_type prog_type,
|
||||
__u32 ifindex)
|
||||
{
|
||||
struct bpf_insn insns[2] = {
|
||||
BPF_EMIT_CALL(id),
|
||||
BPF_EXIT_INSN()
|
||||
};
|
||||
char buf[4096] = {};
|
||||
bool res;
|
||||
|
||||
probe_prog_load_ifindex(prog_type, insns, ARRAY_SIZE(insns), buf,
|
||||
sizeof(buf), ifindex);
|
||||
res = !grep(buf, "invalid func ") && !grep(buf, "unknown func ");
|
||||
|
||||
switch (get_vendor_id(ifindex)) {
|
||||
case 0x19ee: /* Netronome specific */
|
||||
res = res && !grep(buf, "not supported by FW") &&
|
||||
!grep(buf, "unsupported function id");
|
||||
break;
|
||||
default:
|
||||
break;
|
||||
}
|
||||
|
||||
return res;
|
||||
}
|
||||
|
||||
static void
|
||||
probe_helper_for_progtype(enum bpf_prog_type prog_type, bool supported_type,
|
||||
const char *define_prefix, unsigned int id,
|
||||
@ -567,12 +695,10 @@ probe_helper_for_progtype(enum bpf_prog_type prog_type, bool supported_type,
|
||||
bool res = false;
|
||||
|
||||
if (supported_type) {
|
||||
if (ifindex) {
|
||||
p_info("BPF offload feature probing is not supported");
|
||||
return;
|
||||
}
|
||||
|
||||
res = libbpf_probe_bpf_helper(prog_type, id, NULL);
|
||||
if (ifindex)
|
||||
res = probe_helper_ifindex(id, prog_type, ifindex);
|
||||
else
|
||||
res = libbpf_probe_bpf_helper(prog_type, id, NULL);
|
||||
#ifdef USE_LIBCAP
|
||||
/* Probe may succeed even if program load fails, for
|
||||
* unprivileged users check that we did not fail because of
|
||||
|
@ -64,11 +64,11 @@ static void get_obj_name(char *name, const char *file)
|
||||
sanitize_identifier(name);
|
||||
}
|
||||
|
||||
static void get_header_guard(char *guard, const char *obj_name)
|
||||
static void get_header_guard(char *guard, const char *obj_name, const char *suffix)
|
||||
{
|
||||
int i;
|
||||
|
||||
sprintf(guard, "__%s_SKEL_H__", obj_name);
|
||||
sprintf(guard, "__%s_%s__", obj_name, suffix);
|
||||
for (i = 0; guard[i]; i++)
|
||||
guard[i] = toupper(guard[i]);
|
||||
}
|
||||
@ -231,6 +231,17 @@ static const struct btf_type *find_type_for_map(struct btf *btf, const char *map
|
||||
return NULL;
|
||||
}
|
||||
|
||||
static bool is_internal_mmapable_map(const struct bpf_map *map, char *buf, size_t sz)
|
||||
{
|
||||
if (!bpf_map__is_internal(map) || !(bpf_map__map_flags(map) & BPF_F_MMAPABLE))
|
||||
return false;
|
||||
|
||||
if (!get_map_ident(map, buf, sz))
|
||||
return false;
|
||||
|
||||
return true;
|
||||
}
|
||||
|
||||
static int codegen_datasecs(struct bpf_object *obj, const char *obj_name)
|
||||
{
|
||||
struct btf *btf = bpf_object__btf(obj);
|
||||
@ -247,12 +258,7 @@ static int codegen_datasecs(struct bpf_object *obj, const char *obj_name)
|
||||
|
||||
bpf_object__for_each_map(map, obj) {
|
||||
/* only generate definitions for memory-mapped internal maps */
|
||||
if (!bpf_map__is_internal(map))
|
||||
continue;
|
||||
if (!(bpf_map__map_flags(map) & BPF_F_MMAPABLE))
|
||||
continue;
|
||||
|
||||
if (!get_map_ident(map, map_ident, sizeof(map_ident)))
|
||||
if (!is_internal_mmapable_map(map, map_ident, sizeof(map_ident)))
|
||||
continue;
|
||||
|
||||
sec = find_type_for_map(btf, map_ident);
|
||||
@ -280,6 +286,96 @@ out:
|
||||
return err;
|
||||
}
|
||||
|
||||
static bool btf_is_ptr_to_func_proto(const struct btf *btf,
|
||||
const struct btf_type *v)
|
||||
{
|
||||
return btf_is_ptr(v) && btf_is_func_proto(btf__type_by_id(btf, v->type));
|
||||
}
|
||||
|
||||
static int codegen_subskel_datasecs(struct bpf_object *obj, const char *obj_name)
|
||||
{
|
||||
struct btf *btf = bpf_object__btf(obj);
|
||||
struct btf_dump *d;
|
||||
struct bpf_map *map;
|
||||
const struct btf_type *sec, *var;
|
||||
const struct btf_var_secinfo *sec_var;
|
||||
int i, err = 0, vlen;
|
||||
char map_ident[256], sec_ident[256];
|
||||
bool strip_mods = false, needs_typeof = false;
|
||||
const char *sec_name, *var_name;
|
||||
__u32 var_type_id;
|
||||
|
||||
d = btf_dump__new(btf, codegen_btf_dump_printf, NULL, NULL);
|
||||
if (!d)
|
||||
return -errno;
|
||||
|
||||
bpf_object__for_each_map(map, obj) {
|
||||
/* only generate definitions for memory-mapped internal maps */
|
||||
if (!is_internal_mmapable_map(map, map_ident, sizeof(map_ident)))
|
||||
continue;
|
||||
|
||||
sec = find_type_for_map(btf, map_ident);
|
||||
if (!sec)
|
||||
continue;
|
||||
|
||||
sec_name = btf__name_by_offset(btf, sec->name_off);
|
||||
if (!get_datasec_ident(sec_name, sec_ident, sizeof(sec_ident)))
|
||||
continue;
|
||||
|
||||
strip_mods = strcmp(sec_name, ".kconfig") != 0;
|
||||
printf(" struct %s__%s {\n", obj_name, sec_ident);
|
||||
|
||||
sec_var = btf_var_secinfos(sec);
|
||||
vlen = btf_vlen(sec);
|
||||
for (i = 0; i < vlen; i++, sec_var++) {
|
||||
DECLARE_LIBBPF_OPTS(btf_dump_emit_type_decl_opts, opts,
|
||||
.indent_level = 2,
|
||||
.strip_mods = strip_mods,
|
||||
/* we'll print the name separately */
|
||||
.field_name = "",
|
||||
);
|
||||
|
||||
var = btf__type_by_id(btf, sec_var->type);
|
||||
var_name = btf__name_by_offset(btf, var->name_off);
|
||||
var_type_id = var->type;
|
||||
|
||||
/* static variables are not exposed through BPF skeleton */
|
||||
if (btf_var(var)->linkage == BTF_VAR_STATIC)
|
||||
continue;
|
||||
|
||||
/* The datasec member has KIND_VAR but we want the
|
||||
* underlying type of the variable (e.g. KIND_INT).
|
||||
*/
|
||||
var = skip_mods_and_typedefs(btf, var->type, NULL);
|
||||
|
||||
printf("\t\t");
|
||||
/* Func and array members require special handling.
|
||||
* Instead of producing `typename *var`, they produce
|
||||
* `typeof(typename) *var`. This allows us to keep a
|
||||
* similar syntax where the identifier is just prefixed
|
||||
* by *, allowing us to ignore C declaration minutiae.
|
||||
*/
|
||||
needs_typeof = btf_is_array(var) || btf_is_ptr_to_func_proto(btf, var);
|
||||
if (needs_typeof)
|
||||
printf("typeof(");
|
||||
|
||||
err = btf_dump__emit_type_decl(d, var_type_id, &opts);
|
||||
if (err)
|
||||
goto out;
|
||||
|
||||
if (needs_typeof)
|
||||
printf(")");
|
||||
|
||||
printf(" *%s;\n", var_name);
|
||||
}
|
||||
printf(" } %s;\n", sec_ident);
|
||||
}
|
||||
|
||||
out:
|
||||
btf_dump__free(d);
|
||||
return err;
|
||||
}
|
||||
|
||||
static void codegen(const char *template, ...)
|
||||
{
|
||||
const char *src, *end;
|
||||
@ -389,11 +485,7 @@ static void codegen_asserts(struct bpf_object *obj, const char *obj_name)
|
||||
", obj_name);
|
||||
|
||||
bpf_object__for_each_map(map, obj) {
|
||||
if (!bpf_map__is_internal(map))
|
||||
continue;
|
||||
if (!(bpf_map__map_flags(map) & BPF_F_MMAPABLE))
|
||||
continue;
|
||||
if (!get_map_ident(map, map_ident, sizeof(map_ident)))
|
||||
if (!is_internal_mmapable_map(map, map_ident, sizeof(map_ident)))
|
||||
continue;
|
||||
|
||||
sec = find_type_for_map(btf, map_ident);
|
||||
@ -608,11 +700,7 @@ static int gen_trace(struct bpf_object *obj, const char *obj_name, const char *h
|
||||
const void *mmap_data = NULL;
|
||||
size_t mmap_size = 0;
|
||||
|
||||
if (!get_map_ident(map, ident, sizeof(ident)))
|
||||
continue;
|
||||
|
||||
if (!bpf_map__is_internal(map) ||
|
||||
!(bpf_map__map_flags(map) & BPF_F_MMAPABLE))
|
||||
if (!is_internal_mmapable_map(map, ident, sizeof(ident)))
|
||||
continue;
|
||||
|
||||
codegen("\
|
||||
@ -671,11 +759,7 @@ static int gen_trace(struct bpf_object *obj, const char *obj_name, const char *h
|
||||
bpf_object__for_each_map(map, obj) {
|
||||
const char *mmap_flags;
|
||||
|
||||
if (!get_map_ident(map, ident, sizeof(ident)))
|
||||
continue;
|
||||
|
||||
if (!bpf_map__is_internal(map) ||
|
||||
!(bpf_map__map_flags(map) & BPF_F_MMAPABLE))
|
||||
if (!is_internal_mmapable_map(map, ident, sizeof(ident)))
|
||||
continue;
|
||||
|
||||
if (bpf_map__map_flags(map) & BPF_F_RDONLY_PROG)
|
||||
@ -727,10 +811,95 @@ out:
|
||||
return err;
|
||||
}
|
||||
|
||||
static void
|
||||
codegen_maps_skeleton(struct bpf_object *obj, size_t map_cnt, bool mmaped)
|
||||
{
|
||||
struct bpf_map *map;
|
||||
char ident[256];
|
||||
size_t i;
|
||||
|
||||
if (!map_cnt)
|
||||
return;
|
||||
|
||||
codegen("\
|
||||
\n\
|
||||
\n\
|
||||
/* maps */ \n\
|
||||
s->map_cnt = %zu; \n\
|
||||
s->map_skel_sz = sizeof(*s->maps); \n\
|
||||
s->maps = (struct bpf_map_skeleton *)calloc(s->map_cnt, s->map_skel_sz);\n\
|
||||
if (!s->maps) \n\
|
||||
goto err; \n\
|
||||
",
|
||||
map_cnt
|
||||
);
|
||||
i = 0;
|
||||
bpf_object__for_each_map(map, obj) {
|
||||
if (!get_map_ident(map, ident, sizeof(ident)))
|
||||
continue;
|
||||
|
||||
codegen("\
|
||||
\n\
|
||||
\n\
|
||||
s->maps[%zu].name = \"%s\"; \n\
|
||||
s->maps[%zu].map = &obj->maps.%s; \n\
|
||||
",
|
||||
i, bpf_map__name(map), i, ident);
|
||||
/* memory-mapped internal maps */
|
||||
if (mmaped && is_internal_mmapable_map(map, ident, sizeof(ident))) {
|
||||
printf("\ts->maps[%zu].mmaped = (void **)&obj->%s;\n",
|
||||
i, ident);
|
||||
}
|
||||
i++;
|
||||
}
|
||||
}
|
||||
|
||||
static void
|
||||
codegen_progs_skeleton(struct bpf_object *obj, size_t prog_cnt, bool populate_links)
|
||||
{
|
||||
struct bpf_program *prog;
|
||||
int i;
|
||||
|
||||
if (!prog_cnt)
|
||||
return;
|
||||
|
||||
codegen("\
|
||||
\n\
|
||||
\n\
|
||||
/* programs */ \n\
|
||||
s->prog_cnt = %zu; \n\
|
||||
s->prog_skel_sz = sizeof(*s->progs); \n\
|
||||
s->progs = (struct bpf_prog_skeleton *)calloc(s->prog_cnt, s->prog_skel_sz);\n\
|
||||
if (!s->progs) \n\
|
||||
goto err; \n\
|
||||
",
|
||||
prog_cnt
|
||||
);
|
||||
i = 0;
|
||||
bpf_object__for_each_program(prog, obj) {
|
||||
codegen("\
|
||||
\n\
|
||||
\n\
|
||||
s->progs[%1$zu].name = \"%2$s\"; \n\
|
||||
s->progs[%1$zu].prog = &obj->progs.%2$s;\n\
|
||||
",
|
||||
i, bpf_program__name(prog));
|
||||
|
||||
if (populate_links) {
|
||||
codegen("\
|
||||
\n\
|
||||
s->progs[%1$zu].link = &obj->links.%2$s;\n\
|
||||
",
|
||||
i, bpf_program__name(prog));
|
||||
}
|
||||
i++;
|
||||
}
|
||||
}
|
||||
|
||||
static int do_skeleton(int argc, char **argv)
|
||||
{
|
||||
char header_guard[MAX_OBJ_NAME_LEN + sizeof("__SKEL_H__")];
|
||||
size_t i, map_cnt = 0, prog_cnt = 0, file_sz, mmap_sz;
|
||||
size_t map_cnt = 0, prog_cnt = 0, file_sz, mmap_sz;
|
||||
DECLARE_LIBBPF_OPTS(bpf_object_open_opts, opts);
|
||||
char obj_name[MAX_OBJ_NAME_LEN] = "", *obj_data;
|
||||
struct bpf_object *obj = NULL;
|
||||
@ -821,7 +990,7 @@ static int do_skeleton(int argc, char **argv)
|
||||
prog_cnt++;
|
||||
}
|
||||
|
||||
get_header_guard(header_guard, obj_name);
|
||||
get_header_guard(header_guard, obj_name, "SKEL_H");
|
||||
if (use_loader) {
|
||||
codegen("\
|
||||
\n\
|
||||
@ -1024,66 +1193,10 @@ static int do_skeleton(int argc, char **argv)
|
||||
",
|
||||
obj_name
|
||||
);
|
||||
if (map_cnt) {
|
||||
codegen("\
|
||||
\n\
|
||||
\n\
|
||||
/* maps */ \n\
|
||||
s->map_cnt = %zu; \n\
|
||||
s->map_skel_sz = sizeof(*s->maps); \n\
|
||||
s->maps = (struct bpf_map_skeleton *)calloc(s->map_cnt, s->map_skel_sz);\n\
|
||||
if (!s->maps) \n\
|
||||
goto err; \n\
|
||||
",
|
||||
map_cnt
|
||||
);
|
||||
i = 0;
|
||||
bpf_object__for_each_map(map, obj) {
|
||||
if (!get_map_ident(map, ident, sizeof(ident)))
|
||||
continue;
|
||||
|
||||
codegen("\
|
||||
\n\
|
||||
\n\
|
||||
s->maps[%zu].name = \"%s\"; \n\
|
||||
s->maps[%zu].map = &obj->maps.%s; \n\
|
||||
",
|
||||
i, bpf_map__name(map), i, ident);
|
||||
/* memory-mapped internal maps */
|
||||
if (bpf_map__is_internal(map) &&
|
||||
(bpf_map__map_flags(map) & BPF_F_MMAPABLE)) {
|
||||
printf("\ts->maps[%zu].mmaped = (void **)&obj->%s;\n",
|
||||
i, ident);
|
||||
}
|
||||
i++;
|
||||
}
|
||||
}
|
||||
if (prog_cnt) {
|
||||
codegen("\
|
||||
\n\
|
||||
\n\
|
||||
/* programs */ \n\
|
||||
s->prog_cnt = %zu; \n\
|
||||
s->prog_skel_sz = sizeof(*s->progs); \n\
|
||||
s->progs = (struct bpf_prog_skeleton *)calloc(s->prog_cnt, s->prog_skel_sz);\n\
|
||||
if (!s->progs) \n\
|
||||
goto err; \n\
|
||||
",
|
||||
prog_cnt
|
||||
);
|
||||
i = 0;
|
||||
bpf_object__for_each_program(prog, obj) {
|
||||
codegen("\
|
||||
\n\
|
||||
\n\
|
||||
s->progs[%1$zu].name = \"%2$s\"; \n\
|
||||
s->progs[%1$zu].prog = &obj->progs.%2$s;\n\
|
||||
s->progs[%1$zu].link = &obj->links.%2$s;\n\
|
||||
",
|
||||
i, bpf_program__name(prog));
|
||||
i++;
|
||||
}
|
||||
}
|
||||
codegen_maps_skeleton(obj, map_cnt, true /*mmaped*/);
|
||||
codegen_progs_skeleton(obj, prog_cnt, true /*populate_links*/);
|
||||
|
||||
codegen("\
|
||||
\n\
|
||||
\n\
|
||||
@ -1141,6 +1254,310 @@ out:
|
||||
return err;
|
||||
}
|
||||
|
||||
/* Subskeletons are like skeletons, except they don't own the bpf_object,
|
||||
* associated maps, links, etc. Instead, they know about the existence of
|
||||
* variables, maps, programs and are able to find their locations
|
||||
* _at runtime_ from an already loaded bpf_object.
|
||||
*
|
||||
* This allows for library-like BPF objects to have userspace counterparts
|
||||
* with access to their own items without having to know anything about the
|
||||
* final BPF object that the library was linked into.
|
||||
*/
|
||||
static int do_subskeleton(int argc, char **argv)
|
||||
{
|
||||
char header_guard[MAX_OBJ_NAME_LEN + sizeof("__SUBSKEL_H__")];
|
||||
size_t i, len, file_sz, map_cnt = 0, prog_cnt = 0, mmap_sz, var_cnt = 0, var_idx = 0;
|
||||
DECLARE_LIBBPF_OPTS(bpf_object_open_opts, opts);
|
||||
char obj_name[MAX_OBJ_NAME_LEN] = "", *obj_data;
|
||||
struct bpf_object *obj = NULL;
|
||||
const char *file, *var_name;
|
||||
char ident[256];
|
||||
int fd, err = -1, map_type_id;
|
||||
const struct bpf_map *map;
|
||||
struct bpf_program *prog;
|
||||
struct btf *btf;
|
||||
const struct btf_type *map_type, *var_type;
|
||||
const struct btf_var_secinfo *var;
|
||||
struct stat st;
|
||||
|
||||
if (!REQ_ARGS(1)) {
|
||||
usage();
|
||||
return -1;
|
||||
}
|
||||
file = GET_ARG();
|
||||
|
||||
while (argc) {
|
||||
if (!REQ_ARGS(2))
|
||||
return -1;
|
||||
|
||||
if (is_prefix(*argv, "name")) {
|
||||
NEXT_ARG();
|
||||
|
||||
if (obj_name[0] != '\0') {
|
||||
p_err("object name already specified");
|
||||
return -1;
|
||||
}
|
||||
|
||||
strncpy(obj_name, *argv, MAX_OBJ_NAME_LEN - 1);
|
||||
obj_name[MAX_OBJ_NAME_LEN - 1] = '\0';
|
||||
} else {
|
||||
p_err("unknown arg %s", *argv);
|
||||
return -1;
|
||||
}
|
||||
|
||||
NEXT_ARG();
|
||||
}
|
||||
|
||||
if (argc) {
|
||||
p_err("extra unknown arguments");
|
||||
return -1;
|
||||
}
|
||||
|
||||
if (use_loader) {
|
||||
p_err("cannot use loader for subskeletons");
|
||||
return -1;
|
||||
}
|
||||
|
||||
if (stat(file, &st)) {
|
||||
p_err("failed to stat() %s: %s", file, strerror(errno));
|
||||
return -1;
|
||||
}
|
||||
file_sz = st.st_size;
|
||||
mmap_sz = roundup(file_sz, sysconf(_SC_PAGE_SIZE));
|
||||
fd = open(file, O_RDONLY);
|
||||
if (fd < 0) {
|
||||
p_err("failed to open() %s: %s", file, strerror(errno));
|
||||
return -1;
|
||||
}
|
||||
obj_data = mmap(NULL, mmap_sz, PROT_READ, MAP_PRIVATE, fd, 0);
|
||||
if (obj_data == MAP_FAILED) {
|
||||
obj_data = NULL;
|
||||
p_err("failed to mmap() %s: %s", file, strerror(errno));
|
||||
goto out;
|
||||
}
|
||||
if (obj_name[0] == '\0')
|
||||
get_obj_name(obj_name, file);
|
||||
|
||||
/* The empty object name allows us to use bpf_map__name and produce
|
||||
* ELF section names out of it. (".data" instead of "obj.data")
|
||||
*/
|
||||
opts.object_name = "";
|
||||
obj = bpf_object__open_mem(obj_data, file_sz, &opts);
|
||||
if (!obj) {
|
||||
char err_buf[256];
|
||||
|
||||
libbpf_strerror(errno, err_buf, sizeof(err_buf));
|
||||
p_err("failed to open BPF object file: %s", err_buf);
|
||||
obj = NULL;
|
||||
goto out;
|
||||
}
|
||||
|
||||
btf = bpf_object__btf(obj);
|
||||
if (!btf) {
|
||||
err = -1;
|
||||
p_err("need btf type information for %s", obj_name);
|
||||
goto out;
|
||||
}
|
||||
|
||||
bpf_object__for_each_program(prog, obj) {
|
||||
prog_cnt++;
|
||||
}
|
||||
|
||||
/* First, count how many variables we have to find.
|
||||
* We need this in advance so the subskel can allocate the right
|
||||
* amount of storage.
|
||||
*/
|
||||
bpf_object__for_each_map(map, obj) {
|
||||
if (!get_map_ident(map, ident, sizeof(ident)))
|
||||
continue;
|
||||
|
||||
/* Also count all maps that have a name */
|
||||
map_cnt++;
|
||||
|
||||
if (!is_internal_mmapable_map(map, ident, sizeof(ident)))
|
||||
continue;
|
||||
|
||||
map_type_id = bpf_map__btf_value_type_id(map);
|
||||
if (map_type_id <= 0) {
|
||||
err = map_type_id;
|
||||
goto out;
|
||||
}
|
||||
map_type = btf__type_by_id(btf, map_type_id);
|
||||
|
||||
var = btf_var_secinfos(map_type);
|
||||
len = btf_vlen(map_type);
|
||||
for (i = 0; i < len; i++, var++) {
|
||||
var_type = btf__type_by_id(btf, var->type);
|
||||
|
||||
if (btf_var(var_type)->linkage == BTF_VAR_STATIC)
|
||||
continue;
|
||||
|
||||
var_cnt++;
|
||||
}
|
||||
}
|
||||
|
||||
get_header_guard(header_guard, obj_name, "SUBSKEL_H");
|
||||
codegen("\
|
||||
\n\
|
||||
/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */ \n\
|
||||
\n\
|
||||
/* THIS FILE IS AUTOGENERATED! */ \n\
|
||||
#ifndef %2$s \n\
|
||||
#define %2$s \n\
|
||||
\n\
|
||||
#include <errno.h> \n\
|
||||
#include <stdlib.h> \n\
|
||||
#include <bpf/libbpf.h> \n\
|
||||
\n\
|
||||
struct %1$s { \n\
|
||||
struct bpf_object *obj; \n\
|
||||
struct bpf_object_subskeleton *subskel; \n\
|
||||
", obj_name, header_guard);
|
||||
|
||||
if (map_cnt) {
|
||||
printf("\tstruct {\n");
|
||||
bpf_object__for_each_map(map, obj) {
|
||||
if (!get_map_ident(map, ident, sizeof(ident)))
|
||||
continue;
|
||||
printf("\t\tstruct bpf_map *%s;\n", ident);
|
||||
}
|
||||
printf("\t} maps;\n");
|
||||
}
|
||||
|
||||
if (prog_cnt) {
|
||||
printf("\tstruct {\n");
|
||||
bpf_object__for_each_program(prog, obj) {
|
||||
printf("\t\tstruct bpf_program *%s;\n",
|
||||
bpf_program__name(prog));
|
||||
}
|
||||
printf("\t} progs;\n");
|
||||
}
|
||||
|
||||
err = codegen_subskel_datasecs(obj, obj_name);
|
||||
if (err)
|
||||
goto out;
|
||||
|
||||
/* emit code that will allocate enough storage for all symbols */
|
||||
codegen("\
|
||||
\n\
|
||||
\n\
|
||||
#ifdef __cplusplus \n\
|
||||
static inline struct %1$s *open(const struct bpf_object *src);\n\
|
||||
static inline void destroy(struct %1$s *skel); \n\
|
||||
#endif /* __cplusplus */ \n\
|
||||
}; \n\
|
||||
\n\
|
||||
static inline void \n\
|
||||
%1$s__destroy(struct %1$s *skel) \n\
|
||||
{ \n\
|
||||
if (!skel) \n\
|
||||
return; \n\
|
||||
if (skel->subskel) \n\
|
||||
bpf_object__destroy_subskeleton(skel->subskel);\n\
|
||||
free(skel); \n\
|
||||
} \n\
|
||||
\n\
|
||||
static inline struct %1$s * \n\
|
||||
%1$s__open(const struct bpf_object *src) \n\
|
||||
{ \n\
|
||||
struct %1$s *obj; \n\
|
||||
struct bpf_object_subskeleton *s; \n\
|
||||
int err; \n\
|
||||
\n\
|
||||
obj = (struct %1$s *)calloc(1, sizeof(*obj)); \n\
|
||||
if (!obj) { \n\
|
||||
errno = ENOMEM; \n\
|
||||
goto err; \n\
|
||||
} \n\
|
||||
s = (struct bpf_object_subskeleton *)calloc(1, sizeof(*s));\n\
|
||||
if (!s) { \n\
|
||||
errno = ENOMEM; \n\
|
||||
goto err; \n\
|
||||
} \n\
|
||||
s->sz = sizeof(*s); \n\
|
||||
s->obj = src; \n\
|
||||
s->var_skel_sz = sizeof(*s->vars); \n\
|
||||
obj->subskel = s; \n\
|
||||
\n\
|
||||
/* vars */ \n\
|
||||
s->var_cnt = %2$d; \n\
|
||||
s->vars = (struct bpf_var_skeleton *)calloc(%2$d, sizeof(*s->vars));\n\
|
||||
if (!s->vars) { \n\
|
||||
errno = ENOMEM; \n\
|
||||
goto err; \n\
|
||||
} \n\
|
||||
",
|
||||
obj_name, var_cnt
|
||||
);
|
||||
|
||||
/* walk through each symbol and emit the runtime representation */
|
||||
bpf_object__for_each_map(map, obj) {
|
||||
if (!is_internal_mmapable_map(map, ident, sizeof(ident)))
|
||||
continue;
|
||||
|
||||
map_type_id = bpf_map__btf_value_type_id(map);
|
||||
if (map_type_id <= 0)
|
||||
/* skip over internal maps with no type*/
|
||||
continue;
|
||||
|
||||
map_type = btf__type_by_id(btf, map_type_id);
|
||||
var = btf_var_secinfos(map_type);
|
||||
len = btf_vlen(map_type);
|
||||
for (i = 0; i < len; i++, var++) {
|
||||
var_type = btf__type_by_id(btf, var->type);
|
||||
var_name = btf__name_by_offset(btf, var_type->name_off);
|
||||
|
||||
if (btf_var(var_type)->linkage == BTF_VAR_STATIC)
|
||||
continue;
|
||||
|
||||
/* Note that we use the dot prefix in .data as the
|
||||
* field access operator i.e. maps%s becomes maps.data
|
||||
*/
|
||||
codegen("\
|
||||
\n\
|
||||
\n\
|
||||
s->vars[%3$d].name = \"%1$s\"; \n\
|
||||
s->vars[%3$d].map = &obj->maps.%2$s; \n\
|
||||
s->vars[%3$d].addr = (void **) &obj->%2$s.%1$s;\n\
|
||||
", var_name, ident, var_idx);
|
||||
|
||||
var_idx++;
|
||||
}
|
||||
}
|
||||
|
||||
codegen_maps_skeleton(obj, map_cnt, false /*mmaped*/);
|
||||
codegen_progs_skeleton(obj, prog_cnt, false /*links*/);
|
||||
|
||||
codegen("\
|
||||
\n\
|
||||
\n\
|
||||
err = bpf_object__open_subskeleton(s); \n\
|
||||
if (err) \n\
|
||||
goto err; \n\
|
||||
\n\
|
||||
return obj; \n\
|
||||
err: \n\
|
||||
%1$s__destroy(obj); \n\
|
||||
return NULL; \n\
|
||||
} \n\
|
||||
\n\
|
||||
#ifdef __cplusplus \n\
|
||||
struct %1$s *%1$s::open(const struct bpf_object *src) { return %1$s__open(src); }\n\
|
||||
void %1$s::destroy(struct %1$s *skel) { %1$s__destroy(skel); }\n\
|
||||
#endif /* __cplusplus */ \n\
|
||||
\n\
|
||||
#endif /* %2$s */ \n\
|
||||
",
|
||||
obj_name, header_guard);
|
||||
err = 0;
|
||||
out:
|
||||
bpf_object__close(obj);
|
||||
if (obj_data)
|
||||
munmap(obj_data, mmap_sz);
|
||||
close(fd);
|
||||
return err;
|
||||
}
|
||||
|
||||
static int do_object(int argc, char **argv)
{
struct bpf_linker *linker;
@@ -1192,6 +1609,7 @@ static int do_help(int argc, char **argv)
fprintf(stderr,
"Usage: %1$s %2$s object OUTPUT_FILE INPUT_FILE [INPUT_FILE...]\n"
" %1$s %2$s skeleton FILE [name OBJECT_NAME]\n"
" %1$s %2$s subskeleton FILE [name OBJECT_NAME]\n"
" %1$s %2$s min_core_btf INPUT OUTPUT OBJECT [OBJECT...]\n"
" %1$s %2$s help\n"
"\n"
@@ -1788,6 +2206,7 @@ static int do_min_core_btf(int argc, char **argv)
static const struct cmd cmds[] = {
{ "object", do_object },
{ "skeleton", do_skeleton },
{ "subskeleton", do_subskeleton },
{ "min_core_btf", do_min_core_btf},
{ "help", do_help },
{ 0 }
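The header emitted by the new "gen subskeleton" command pairs an <name>__open() with <name>__destroy(), as the codegen above shows. Below is a minimal, hedged sketch of consuming such a header; the object name "lib" and the global ".bss" variable "counter" are illustrative and not from this patch, and the field layout (an embedded per-datasec struct of pointers, so "sub->bss.counter") is inferred from the "s->vars[i].addr = (void **) &obj-><datasec>.<var>" assignments generated above.

#include <errno.h>
#include <bpf/libbpf.h>
#include "lib.subskel.h"	/* hypothetical output of "bpftool gen subskeleton ... name lib" */

/* Read a global variable owned by the shared object out of an already-opened
 * bpf_object that statically linked lib.bpf.o in.
 */
static int read_lib_counter(const struct bpf_object *obj, int *out)
{
	struct lib *sub;

	sub = lib__open(obj);	/* generated wrapper around bpf_object__open_subskeleton() */
	if (!sub)
		return -errno;

	/* Each datasec becomes an embedded struct whose members are pointers
	 * into the loaded object's memory, so variables are read directly.
	 */
	*out = *sub->bss.counter;

	lib__destroy(sub);
	return 0;
}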
@ -113,7 +113,9 @@ struct obj_ref {
|
||||
|
||||
struct obj_refs {
|
||||
int ref_cnt;
|
||||
bool has_bpf_cookie;
|
||||
struct obj_ref *refs;
|
||||
__u64 bpf_cookie;
|
||||
};
|
||||
|
||||
struct btf;
|
||||
|
@ -504,7 +504,7 @@ static int show_map_close_json(int fd, struct bpf_map_info *info)
|
||||
jsonw_uint_field(json_wtr, "max_entries", info->max_entries);
|
||||
|
||||
if (memlock)
|
||||
jsonw_int_field(json_wtr, "bytes_memlock", atoi(memlock));
|
||||
jsonw_int_field(json_wtr, "bytes_memlock", atoll(memlock));
|
||||
free(memlock);
|
||||
|
||||
if (info->type == BPF_MAP_TYPE_PROG_ARRAY) {
|
||||
@ -620,17 +620,14 @@ static int show_map_close_plain(int fd, struct bpf_map_info *info)
|
||||
u32_as_hash_field(info->id))
|
||||
printf("\n\tpinned %s", (char *)entry->value);
|
||||
}
|
||||
printf("\n");
|
||||
|
||||
if (frozen_str) {
|
||||
frozen = atoi(frozen_str);
|
||||
free(frozen_str);
|
||||
}
|
||||
|
||||
if (!info->btf_id && !frozen)
|
||||
return 0;
|
||||
|
||||
printf("\t");
|
||||
if (info->btf_id || frozen)
|
||||
printf("\n\t");
|
||||
|
||||
if (info->btf_id)
|
||||
printf("btf_id %d", info->btf_id);
|
||||
|
@ -78,6 +78,8 @@ static void add_ref(struct hashmap *map, struct pid_iter_entry *e)
|
||||
ref->pid = e->pid;
|
||||
memcpy(ref->comm, e->comm, sizeof(ref->comm));
|
||||
refs->ref_cnt = 1;
|
||||
refs->has_bpf_cookie = e->has_bpf_cookie;
|
||||
refs->bpf_cookie = e->bpf_cookie;
|
||||
|
||||
err = hashmap__append(map, u32_as_hash_field(e->id), refs);
|
||||
if (err)
|
||||
@ -205,6 +207,9 @@ void emit_obj_refs_json(struct hashmap *map, __u32 id,
|
||||
if (refs->ref_cnt == 0)
|
||||
break;
|
||||
|
||||
if (refs->has_bpf_cookie)
|
||||
jsonw_lluint_field(json_writer, "bpf_cookie", refs->bpf_cookie);
|
||||
|
||||
jsonw_name(json_writer, "pids");
|
||||
jsonw_start_array(json_writer);
|
||||
for (i = 0; i < refs->ref_cnt; i++) {
|
||||
@ -234,6 +239,9 @@ void emit_obj_refs_plain(struct hashmap *map, __u32 id, const char *prefix)
|
||||
if (refs->ref_cnt == 0)
|
||||
break;
|
||||
|
||||
if (refs->has_bpf_cookie)
|
||||
printf("\n\tbpf_cookie %llu", (unsigned long long) refs->bpf_cookie);
|
||||
|
||||
printf("%s", prefix);
|
||||
for (i = 0; i < refs->ref_cnt; i++) {
|
||||
struct obj_ref *ref = &refs->refs[i];
|
||||
|
@ -485,7 +485,7 @@ static void print_prog_json(struct bpf_prog_info *info, int fd)
|
||||
|
||||
memlock = get_fdinfo(fd, "memlock");
|
||||
if (memlock)
|
||||
jsonw_int_field(json_wtr, "bytes_memlock", atoi(memlock));
|
||||
jsonw_int_field(json_wtr, "bytes_memlock", atoll(memlock));
|
||||
free(memlock);
|
||||
|
||||
if (info->nr_map_ids)
|
||||
|
@ -38,6 +38,17 @@ static __always_inline __u32 get_obj_id(void *ent, enum bpf_obj_type type)
|
||||
}
|
||||
}
|
||||
|
||||
/* could be used only with BPF_LINK_TYPE_PERF_EVENT links */
|
||||
static __u64 get_bpf_cookie(struct bpf_link *link)
|
||||
{
|
||||
struct bpf_perf_link *perf_link;
|
||||
struct perf_event *event;
|
||||
|
||||
perf_link = container_of(link, struct bpf_perf_link, link);
|
||||
event = BPF_CORE_READ(perf_link, perf_file, private_data);
|
||||
return BPF_CORE_READ(event, bpf_cookie);
|
||||
}
|
||||
|
||||
SEC("iter/task_file")
|
||||
int iter(struct bpf_iter__task_file *ctx)
|
||||
{
|
||||
@ -69,8 +80,19 @@ int iter(struct bpf_iter__task_file *ctx)
|
||||
if (file->f_op != fops)
|
||||
return 0;
|
||||
|
||||
__builtin_memset(&e, 0, sizeof(e));
|
||||
e.pid = task->tgid;
|
||||
e.id = get_obj_id(file->private_data, obj_type);
|
||||
|
||||
if (obj_type == BPF_OBJ_LINK) {
|
||||
struct bpf_link *link = (struct bpf_link *) file->private_data;
|
||||
|
||||
if (BPF_CORE_READ(link, type) == BPF_LINK_TYPE_PERF_EVENT) {
|
||||
e.has_bpf_cookie = true;
|
||||
e.bpf_cookie = get_bpf_cookie(link);
|
||||
}
|
||||
}
|
||||
|
||||
bpf_probe_read_kernel_str(&e.comm, sizeof(e.comm),
|
||||
task->group_leader->comm);
|
||||
bpf_seq_write(ctx->meta->seq, &e, sizeof(e));
|
||||
|
@ -6,6 +6,8 @@
|
||||
struct pid_iter_entry {
|
||||
__u32 id;
|
||||
int pid;
|
||||
__u64 bpf_cookie;
|
||||
bool has_bpf_cookie;
|
||||
char comm[16];
|
||||
};
|
||||
|
||||
|
@ -997,6 +997,7 @@ enum bpf_attach_type {
|
||||
BPF_SK_REUSEPORT_SELECT,
|
||||
BPF_SK_REUSEPORT_SELECT_OR_MIGRATE,
|
||||
BPF_PERF_EVENT,
|
||||
BPF_TRACE_KPROBE_MULTI,
|
||||
__MAX_BPF_ATTACH_TYPE
|
||||
};
|
||||
|
||||
@ -1011,6 +1012,7 @@ enum bpf_link_type {
|
||||
BPF_LINK_TYPE_NETNS = 5,
|
||||
BPF_LINK_TYPE_XDP = 6,
|
||||
BPF_LINK_TYPE_PERF_EVENT = 7,
|
||||
BPF_LINK_TYPE_KPROBE_MULTI = 8,
|
||||
|
||||
MAX_BPF_LINK_TYPE,
|
||||
};
|
||||
@ -1118,6 +1120,11 @@ enum bpf_link_type {
|
||||
*/
|
||||
#define BPF_F_XDP_HAS_FRAGS (1U << 5)
|
||||
|
||||
/* link_create.kprobe_multi.flags used in LINK_CREATE command for
|
||||
* BPF_TRACE_KPROBE_MULTI attach type to create return probe.
|
||||
*/
|
||||
#define BPF_F_KPROBE_MULTI_RETURN (1U << 0)
|
||||
|
||||
/* When BPF ldimm64's insn[0].src_reg != 0 then this can have
|
||||
* the following extensions:
|
||||
*
|
||||
@ -1232,6 +1239,8 @@ enum {
|
||||
|
||||
/* If set, run the test on the cpu specified by bpf_attr.test.cpu */
|
||||
#define BPF_F_TEST_RUN_ON_CPU (1U << 0)
|
||||
/* If set, XDP frames will be transmitted after processing */
|
||||
#define BPF_F_TEST_XDP_LIVE_FRAMES (1U << 1)
|
||||
|
||||
/* type for BPF_ENABLE_STATS */
|
||||
enum bpf_stats_type {
|
||||
@ -1393,6 +1402,7 @@ union bpf_attr {
|
||||
__aligned_u64 ctx_out;
|
||||
__u32 flags;
|
||||
__u32 cpu;
|
||||
__u32 batch_size;
|
||||
} test;
|
||||
|
||||
struct { /* anonymous struct used by BPF_*_GET_*_ID */
|
||||
@ -1472,6 +1482,13 @@ union bpf_attr {
|
||||
*/
|
||||
__u64 bpf_cookie;
|
||||
} perf_event;
|
||||
struct {
|
||||
__u32 flags;
|
||||
__u32 cnt;
|
||||
__aligned_u64 syms;
|
||||
__aligned_u64 addrs;
|
||||
__aligned_u64 cookies;
|
||||
} kprobe_multi;
|
||||
};
|
||||
} link_create;
|
||||
|
||||
@ -2299,8 +2316,8 @@ union bpf_attr {
|
||||
* Return
|
||||
* The return value depends on the result of the test, and can be:
|
||||
*
|
||||
* * 0, if current task belongs to the cgroup2.
|
||||
* * 1, if current task does not belong to the cgroup2.
|
||||
* * 1, if current task belongs to the cgroup2.
|
||||
* * 0, if current task does not belong to the cgroup2.
|
||||
* * A negative error code, if an error occurred.
|
||||
*
|
||||
* long bpf_skb_change_tail(struct sk_buff *skb, u32 len, u64 flags)
|
||||
@ -5087,23 +5104,22 @@ union bpf_attr {
|
||||
* 0 on success, or a negative error in case of failure. On error
|
||||
* *dst* buffer is zeroed out.
|
||||
*
|
||||
* long bpf_skb_set_delivery_time(struct sk_buff *skb, u64 dtime, u32 dtime_type)
|
||||
* long bpf_skb_set_tstamp(struct sk_buff *skb, u64 tstamp, u32 tstamp_type)
|
||||
* Description
|
||||
* Set a *dtime* (delivery time) to the __sk_buff->tstamp and also
|
||||
* change the __sk_buff->delivery_time_type to *dtime_type*.
|
||||
* Change the __sk_buff->tstamp_type to *tstamp_type*
|
||||
* and set *tstamp* to the __sk_buff->tstamp together.
|
||||
*
|
||||
* When setting a delivery time (non zero *dtime*) to
|
||||
* __sk_buff->tstamp, only BPF_SKB_DELIVERY_TIME_MONO *dtime_type*
|
||||
* is supported. It is the only delivery_time_type that will be
|
||||
* kept after bpf_redirect_*().
|
||||
*
|
||||
* If there is no need to change the __sk_buff->delivery_time_type,
|
||||
* the delivery time can be directly written to __sk_buff->tstamp
|
||||
* If there is no need to change the __sk_buff->tstamp_type,
|
||||
* the tstamp value can be directly written to __sk_buff->tstamp
|
||||
* instead.
|
||||
*
|
||||
* *dtime* 0 and *dtime_type* BPF_SKB_DELIVERY_TIME_NONE
|
||||
* can be used to clear any delivery time stored in
|
||||
* __sk_buff->tstamp.
|
||||
* BPF_SKB_TSTAMP_DELIVERY_MONO is the only tstamp that
|
||||
* will be kept during bpf_redirect_*(). A non zero
|
||||
* *tstamp* must be used with the BPF_SKB_TSTAMP_DELIVERY_MONO
|
||||
* *tstamp_type*.
|
||||
*
|
||||
* A BPF_SKB_TSTAMP_UNSPEC *tstamp_type* can only be used
|
||||
* with a zero *tstamp*.
|
||||
*
|
||||
* Only IPv4 and IPv6 skb->protocol are supported.
|
||||
*
|
||||
@ -5116,7 +5132,17 @@ union bpf_attr {
|
||||
* Return
|
||||
* 0 on success.
|
||||
* **-EINVAL** for invalid input
|
||||
* **-EOPNOTSUPP** for unsupported delivery_time_type and protocol
|
||||
* **-EOPNOTSUPP** for unsupported protocol
|
||||
*
|
||||
* long bpf_ima_file_hash(struct file *file, void *dst, u32 size)
|
||||
* Description
|
||||
* Returns a calculated IMA hash of the *file*.
|
||||
* If the hash is larger than *size*, then only *size*
|
||||
* bytes will be copied to *dst*
|
||||
* Return
|
||||
* The **hash_algo** is returned on success,
|
||||
* **-EOPNOTSUP** if the hash calculation failed or **-EINVAL** if
|
||||
* invalid arguments are passed.
|
||||
*/
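For the renamed bpf_skb_set_tstamp() helper documented above, here is a minimal BPF-side sketch, assuming UAPI headers and libbpf helper definitions that already carry this change; the section name, delay constant, and drop-on-failure policy are illustrative.

// SPDX-License-Identifier: GPL-2.0
#include <linux/bpf.h>
#include <linux/pkt_cls.h>
#include <bpf/bpf_helpers.h>

#define DELAY_NS (1000ULL * 1000)	/* illustrative 1ms delivery delay */

SEC("tc")
int stamp_delivery_time(struct __sk_buff *skb)
{
	__u64 now = bpf_ktime_get_ns();	/* CLOCK_MONOTONIC base */

	/* A non-zero tstamp must use BPF_SKB_TSTAMP_DELIVERY_MONO; per the
	 * doc above it is also the only tstamp_type kept across bpf_redirect_*().
	 */
	if (bpf_skb_set_tstamp(skb, now + DELAY_NS, BPF_SKB_TSTAMP_DELIVERY_MONO))
		return TC_ACT_SHOT;

	return TC_ACT_OK;
}

char _license[] SEC("license") = "GPL";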
|
||||
#define __BPF_FUNC_MAPPER(FN) \
|
||||
FN(unspec), \
|
||||
@ -5311,7 +5337,8 @@ union bpf_attr {
|
||||
FN(xdp_load_bytes), \
|
||||
FN(xdp_store_bytes), \
|
||||
FN(copy_from_user_task), \
|
||||
FN(skb_set_delivery_time), \
|
||||
FN(skb_set_tstamp), \
|
||||
FN(ima_file_hash), \
|
||||
/* */
|
||||
|
||||
/* integer value in 'imm' field of BPF_CALL instruction selects which helper
|
||||
@ -5502,9 +5529,12 @@ union { \
|
||||
} __attribute__((aligned(8)))
|
||||
|
||||
enum {
|
||||
BPF_SKB_DELIVERY_TIME_NONE,
|
||||
BPF_SKB_DELIVERY_TIME_UNSPEC,
|
||||
BPF_SKB_DELIVERY_TIME_MONO,
|
||||
BPF_SKB_TSTAMP_UNSPEC,
|
||||
BPF_SKB_TSTAMP_DELIVERY_MONO, /* tstamp has mono delivery time */
|
||||
/* For any BPF_SKB_TSTAMP_* that the bpf prog cannot handle,
|
||||
* the bpf prog should handle it like BPF_SKB_TSTAMP_UNSPEC
|
||||
* and try to deduce it by ingress, egress or skb->sk->sk_clockid.
|
||||
*/
|
||||
};
|
||||
|
||||
/* user accessible mirror of in-kernel sk_buff.
|
||||
@ -5547,7 +5577,7 @@ struct __sk_buff {
|
||||
__u32 gso_segs;
|
||||
__bpf_md_ptr(struct bpf_sock *, sk);
|
||||
__u32 gso_size;
|
||||
__u8 delivery_time_type;
|
||||
__u8 tstamp_type;
|
||||
__u32 :24; /* Padding, future use. */
|
||||
__u64 hwtstamp;
|
||||
};
|
||||
|
@ -29,6 +29,7 @@
|
||||
#include <errno.h>
|
||||
#include <linux/bpf.h>
|
||||
#include <linux/filter.h>
|
||||
#include <linux/kernel.h>
|
||||
#include <limits.h>
|
||||
#include <sys/resource.h>
|
||||
#include "bpf.h"
|
||||
@ -111,7 +112,7 @@ int probe_memcg_account(void)
|
||||
BPF_EMIT_CALL(BPF_FUNC_ktime_get_coarse_ns),
|
||||
BPF_EXIT_INSN(),
|
||||
};
|
||||
size_t insn_cnt = sizeof(insns) / sizeof(insns[0]);
|
||||
size_t insn_cnt = ARRAY_SIZE(insns);
|
||||
union bpf_attr attr;
|
||||
int prog_fd;
|
||||
|
||||
@ -853,6 +854,15 @@ int bpf_link_create(int prog_fd, int target_fd,
|
||||
if (!OPTS_ZEROED(opts, perf_event))
|
||||
return libbpf_err(-EINVAL);
|
||||
break;
|
||||
case BPF_TRACE_KPROBE_MULTI:
|
||||
attr.link_create.kprobe_multi.flags = OPTS_GET(opts, kprobe_multi.flags, 0);
|
||||
attr.link_create.kprobe_multi.cnt = OPTS_GET(opts, kprobe_multi.cnt, 0);
|
||||
attr.link_create.kprobe_multi.syms = ptr_to_u64(OPTS_GET(opts, kprobe_multi.syms, 0));
|
||||
attr.link_create.kprobe_multi.addrs = ptr_to_u64(OPTS_GET(opts, kprobe_multi.addrs, 0));
|
||||
attr.link_create.kprobe_multi.cookies = ptr_to_u64(OPTS_GET(opts, kprobe_multi.cookies, 0));
|
||||
if (!OPTS_ZEROED(opts, kprobe_multi))
|
||||
return libbpf_err(-EINVAL);
|
||||
break;
|
||||
default:
|
||||
if (!OPTS_ZEROED(opts, flags))
|
||||
return libbpf_err(-EINVAL);
|
||||
@ -994,6 +1004,7 @@ int bpf_prog_test_run_opts(int prog_fd, struct bpf_test_run_opts *opts)
|
||||
|
||||
memset(&attr, 0, sizeof(attr));
|
||||
attr.test.prog_fd = prog_fd;
|
||||
attr.test.batch_size = OPTS_GET(opts, batch_size, 0);
|
||||
attr.test.cpu = OPTS_GET(opts, cpu, 0);
|
||||
attr.test.flags = OPTS_GET(opts, flags, 0);
|
||||
attr.test.repeat = OPTS_GET(opts, repeat, 0);
|
||||
|
@@ -413,10 +413,17 @@ struct bpf_link_create_opts {
struct {
__u64 bpf_cookie;
} perf_event;
struct {
__u32 flags;
__u32 cnt;
const char **syms;
const unsigned long *addrs;
const __u64 *cookies;
} kprobe_multi;
};
size_t :0;
};
#define bpf_link_create_opts__last_field perf_event
#define bpf_link_create_opts__last_field kprobe_multi.cookies

LIBBPF_API int bpf_link_create(int prog_fd, int target_fd,
enum bpf_attach_type attach_type,
@@ -512,8 +519,9 @@ struct bpf_test_run_opts {
__u32 duration; /* out: average per repetition in ns */
__u32 flags;
__u32 cpu;
__u32 batch_size;
};
#define bpf_test_run_opts__last_field cpu
#define bpf_test_run_opts__last_field batch_size

LIBBPF_API int bpf_prog_test_run_opts(int prog_fd,
struct bpf_test_run_opts *opts);
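The new batch_size option pairs with BPF_F_TEST_XDP_LIVE_FRAMES (see the UAPI hunk above, where the flag means XDP frames are transmitted after processing). A hedged userspace sketch; prog_fd and the packet buffer come from the caller, and the repeat/batch values are illustrative.

#include <bpf/bpf.h>
#include <bpf/libbpf.h>

static int run_xdp_live(int prog_fd, void *pkt, __u32 pkt_len)
{
	LIBBPF_OPTS(bpf_test_run_opts, opts,
		.data_in = pkt,
		.data_size_in = pkt_len,
		.flags = BPF_F_TEST_XDP_LIVE_FRAMES,
		.repeat = 64,		/* number of times to run the program */
		.batch_size = 32,	/* frames handled per batch (the new field) */
	);

	return bpf_prog_test_run_opts(prog_fd, &opts);
}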
File diff suppressed because it is too large
@@ -425,6 +425,29 @@ bpf_program__attach_kprobe_opts(const struct bpf_program *prog,
const char *func_name,
const struct bpf_kprobe_opts *opts);

struct bpf_kprobe_multi_opts {
/* size of this struct, for forward/backward compatibility */
size_t sz;
/* array of function symbols to attach */
const char **syms;
/* array of function addresses to attach */
const unsigned long *addrs;
/* array of user-provided values fetchable through bpf_get_attach_cookie */
const __u64 *cookies;
/* number of elements in syms/addrs/cookies arrays */
size_t cnt;
/* create return kprobes */
bool retprobe;
size_t :0;
};

#define bpf_kprobe_multi_opts__last_field retprobe

LIBBPF_API struct bpf_link *
bpf_program__attach_kprobe_multi_opts(const struct bpf_program *prog,
const char *pattern,
const struct bpf_kprobe_multi_opts *opts);
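A hedged sketch of the new attach API: attach one program as a return probe on a fixed set of kernel functions by symbol. The symbol list is illustrative and error handling is left to the caller.

#include <bpf/libbpf.h>

static struct bpf_link *attach_fentry_tests(struct bpf_program *prog)
{
	const char *syms[] = { "bpf_fentry_test1", "bpf_fentry_test2" };
	LIBBPF_OPTS(bpf_kprobe_multi_opts, opts,
		.syms = syms,
		.cnt = sizeof(syms) / sizeof(syms[0]),
		.retprobe = true,	/* create return kprobes */
	);

	/* A NULL pattern means the syms/addrs arrays above are used instead
	 * of a wildcard match over kernel symbols.
	 */
	return bpf_program__attach_kprobe_multi_opts(prog, NULL, &opts);
}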
|
||||
|
||||
struct bpf_uprobe_opts {
|
||||
/* size of this struct, for forward/backward compatiblity */
|
||||
size_t sz;
|
||||
@ -1289,6 +1312,35 @@ LIBBPF_API int bpf_object__attach_skeleton(struct bpf_object_skeleton *s);
|
||||
LIBBPF_API void bpf_object__detach_skeleton(struct bpf_object_skeleton *s);
|
||||
LIBBPF_API void bpf_object__destroy_skeleton(struct bpf_object_skeleton *s);
|
||||
|
||||
struct bpf_var_skeleton {
|
||||
const char *name;
|
||||
struct bpf_map **map;
|
||||
void **addr;
|
||||
};
|
||||
|
||||
struct bpf_object_subskeleton {
|
||||
size_t sz; /* size of this struct, for forward/backward compatibility */
|
||||
|
||||
const struct bpf_object *obj;
|
||||
|
||||
int map_cnt;
|
||||
int map_skel_sz; /* sizeof(struct bpf_map_skeleton) */
|
||||
struct bpf_map_skeleton *maps;
|
||||
|
||||
int prog_cnt;
|
||||
int prog_skel_sz; /* sizeof(struct bpf_prog_skeleton) */
|
||||
struct bpf_prog_skeleton *progs;
|
||||
|
||||
int var_cnt;
|
||||
int var_skel_sz; /* sizeof(struct bpf_var_skeleton) */
|
||||
struct bpf_var_skeleton *vars;
|
||||
};
|
||||
|
||||
LIBBPF_API int
|
||||
bpf_object__open_subskeleton(struct bpf_object_subskeleton *s);
|
||||
LIBBPF_API void
|
||||
bpf_object__destroy_subskeleton(struct bpf_object_subskeleton *s);
|
||||
|
||||
struct gen_loader_opts {
|
||||
size_t sz; /* size of this struct, for forward/backward compatiblity */
|
||||
const char *data;
|
||||
@ -1328,6 +1380,115 @@ LIBBPF_API int bpf_linker__add_file(struct bpf_linker *linker,
|
||||
LIBBPF_API int bpf_linker__finalize(struct bpf_linker *linker);
|
||||
LIBBPF_API void bpf_linker__free(struct bpf_linker *linker);
|
||||
|
||||
/*
|
||||
* Custom handling of BPF program's SEC() definitions
|
||||
*/
|
||||
|
||||
struct bpf_prog_load_opts; /* defined in bpf.h */
|
||||
|
||||
/* Called during bpf_object__open() for each recognized BPF program. Callback
|
||||
* can use various bpf_program__set_*() setters to adjust whatever properties
|
||||
* are necessary.
|
||||
*/
|
||||
typedef int (*libbpf_prog_setup_fn_t)(struct bpf_program *prog, long cookie);
|
||||
|
||||
/* Called right before libbpf performs bpf_prog_load() to load BPF program
|
||||
* into the kernel. Callback can adjust opts as necessary.
|
||||
*/
|
||||
typedef int (*libbpf_prog_prepare_load_fn_t)(struct bpf_program *prog,
|
||||
struct bpf_prog_load_opts *opts, long cookie);
|
||||
|
||||
/* Called during skeleton attach or through bpf_program__attach(). If
|
||||
* auto-attach is not supported, callback should return 0 and set link to
|
||||
* NULL (it's not considered an error during skeleton attach, but it will be
|
||||
* an error for bpf_program__attach() calls). On error, error should be
|
||||
* returned directly and link set to NULL. On success, return 0 and set link
|
||||
* to a valid struct bpf_link.
|
||||
*/
|
||||
typedef int (*libbpf_prog_attach_fn_t)(const struct bpf_program *prog, long cookie,
|
||||
struct bpf_link **link);
|
||||
|
||||
struct libbpf_prog_handler_opts {
|
||||
/* size of this struct, for forward/backward compatiblity */
|
||||
size_t sz;
|
||||
/* User-provided value that is passed to prog_setup_fn,
|
||||
* prog_prepare_load_fn, and prog_attach_fn callbacks. Allows user to
|
||||
* register one set of callbacks for multiple SEC() definitions and
|
||||
* still be able to distinguish them, if necessary. For example,
|
||||
* libbpf itself is using this to pass necessary flags (e.g.,
|
||||
* sleepable flag) to a common internal SEC() handler.
|
||||
*/
|
||||
long cookie;
|
||||
/* BPF program initialization callback (see libbpf_prog_setup_fn_t).
|
||||
* Callback is optional, pass NULL if it's not necessary.
|
||||
*/
|
||||
libbpf_prog_setup_fn_t prog_setup_fn;
|
||||
/* BPF program loading callback (see libbpf_prog_prepare_load_fn_t).
|
||||
* Callback is optional, pass NULL if it's not necessary.
|
||||
*/
|
||||
libbpf_prog_prepare_load_fn_t prog_prepare_load_fn;
|
||||
/* BPF program attach callback (see libbpf_prog_attach_fn_t).
|
||||
* Callback is optional, pass NULL if it's not necessary.
|
||||
*/
|
||||
libbpf_prog_attach_fn_t prog_attach_fn;
|
||||
};
|
||||
#define libbpf_prog_handler_opts__last_field prog_attach_fn
|
||||
|
||||
/**
|
||||
* @brief **libbpf_register_prog_handler()** registers a custom BPF program
|
||||
* SEC() handler.
|
||||
* @param sec section prefix for which custom handler is registered
|
||||
* @param prog_type BPF program type associated with specified section
|
||||
* @param exp_attach_type Expected BPF attach type associated with specified section
|
||||
* @param opts optional cookie, callbacks, and other extra options
|
||||
* @return Non-negative handler ID is returned on success. This handler ID has
|
||||
* to be passed to *libbpf_unregister_prog_handler()* to unregister such
|
||||
* custom handler. Negative error code is returned on error.
|
||||
*
|
||||
* *sec* defines which SEC() definitions are handled by this custom handler
|
||||
* registration. *sec* can have few different forms:
|
||||
* - if *sec* is just a plain string (e.g., "abc"), it will match only
|
||||
* SEC("abc"). If BPF program specifies SEC("abc/whatever") it will result
|
||||
* in an error;
|
||||
* - if *sec* is of the form "abc/", proper SEC() form is
|
||||
* SEC("abc/something"), where acceptable "something" should be checked by
|
||||
* *prog_init_fn* callback, if there are additional restrictions;
|
||||
* - if *sec* is of the form "abc+", it will successfully match both
|
||||
* SEC("abc") and SEC("abc/whatever") forms;
|
||||
* - if *sec* is NULL, custom handler is registered for any BPF program that
|
||||
* doesn't match any of the registered (custom or libbpf's own) SEC()
|
||||
* handlers. There could be only one such generic custom handler registered
|
||||
* at any given time.
|
||||
*
|
||||
* All custom handlers (except the one with *sec* == NULL) are processed
|
||||
* before libbpf's own SEC() handlers. It is allowed to "override" libbpf's
|
||||
* SEC() handlers by registering custom ones for the same section prefix
|
||||
* (i.e., it's possible to have custom SEC("perf_event/LLC-load-misses")
|
||||
* handler).
|
||||
*
|
||||
* Note, like much of global libbpf APIs (e.g., libbpf_set_print(),
|
||||
* libbpf_set_strict_mode(), etc)) these APIs are not thread-safe. User needs
|
||||
* to ensure synchronization if there is a risk of running this API from
|
||||
* multiple threads simultaneously.
|
||||
*/
|
||||
LIBBPF_API int libbpf_register_prog_handler(const char *sec,
enum bpf_prog_type prog_type,
enum bpf_attach_type exp_attach_type,
const struct libbpf_prog_handler_opts *opts);
/**
* @brief *libbpf_unregister_prog_handler()* unregisters previously registered
* custom BPF program SEC() handler.
* @param handler_id handler ID returned by *libbpf_register_prog_handler()*
* after successful registration
* @return 0 on success, negative error code if handler isn't found
*
* Note, like much of global libbpf APIs (e.g., libbpf_set_print(),
* libbpf_set_strict_mode(), etc)) these APIs are not thread-safe. User needs
* to ensure synchronization if there is a risk of running this API from
* multiple threads simultaneously.
*/
LIBBPF_API int libbpf_unregister_prog_handler(int handler_id);
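A hedged sketch of registering a custom handler, assuming an application-defined SEC("myapp/<kernel function>") convention; the "myapp/" prefix, the choice of kprobe program type, and the attach logic are illustrative, not part of libbpf.

#include <errno.h>
#include <bpf/libbpf.h>

static int myapp_attach(const struct bpf_program *prog, long cookie,
			struct bpf_link **link)
{
	/* Attach a kprobe to the kernel function named by the SEC() suffix. */
	const char *func = bpf_program__section_name(prog) + sizeof("myapp/") - 1;

	*link = bpf_program__attach_kprobe(prog, false /* !retprobe */, func);
	return *link ? 0 : -errno;
}

static int register_myapp_handler(void)
{
	LIBBPF_OPTS(libbpf_prog_handler_opts, opts,
		.prog_attach_fn = myapp_attach,
	);

	/* "myapp/" form: matches SEC("myapp/something") only. Kprobe programs
	 * do not use an expected attach type, hence 0.
	 */
	return libbpf_register_prog_handler("myapp/", BPF_PROG_TYPE_KPROBE, 0, &opts);
}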
|
||||
|
||||
#ifdef __cplusplus
|
||||
} /* extern "C" */
|
||||
#endif
|
||||
|
@ -439,3 +439,12 @@ LIBBPF_0.7.0 {
|
||||
libbpf_probe_bpf_prog_type;
|
||||
libbpf_set_memlock_rlim_max;
|
||||
} LIBBPF_0.6.0;
|
||||
|
||||
LIBBPF_0.8.0 {
|
||||
global:
|
||||
bpf_object__destroy_subskeleton;
|
||||
bpf_object__open_subskeleton;
|
||||
libbpf_register_prog_handler;
|
||||
libbpf_unregister_prog_handler;
|
||||
bpf_program__attach_kprobe_multi_opts;
|
||||
} LIBBPF_0.7.0;
|
||||
|
@ -449,6 +449,11 @@ __s32 btf__find_by_name_kind_own(const struct btf *btf, const char *type_name,
|
||||
|
||||
extern enum libbpf_strict_mode libbpf_mode;
|
||||
|
||||
typedef int (*kallsyms_cb_t)(unsigned long long sym_addr, char sym_type,
|
||||
const char *sym_name, void *ctx);
|
||||
|
||||
int libbpf_kallsyms_parse(kallsyms_cb_t cb, void *arg);
|
||||
|
||||
/* handle direct returned errors */
|
||||
static inline int libbpf_err(int ret)
|
||||
{
|
||||
|
@ -54,6 +54,10 @@ enum libbpf_strict_mode {
|
||||
*
|
||||
* Note, in this mode the program pin path will be based on the
|
||||
* function name instead of section name.
|
||||
*
|
||||
* Additionally, routines in the .text section are always considered
|
||||
* sub-programs. Legacy behavior allows for a single routine in .text
|
||||
* to be a program.
|
||||
*/
|
||||
LIBBPF_STRICT_SEC_NAME = 0x04,
|
||||
/*
|
||||
|
@ -4,6 +4,6 @@
|
||||
#define __LIBBPF_VERSION_H
|
||||
|
||||
#define LIBBPF_MAJOR_VERSION 0
|
||||
#define LIBBPF_MINOR_VERSION 7
|
||||
#define LIBBPF_MINOR_VERSION 8
|
||||
|
||||
#endif /* __LIBBPF_VERSION_H */
|
||||
|
@ -481,8 +481,8 @@ static int xsk_load_xdp_prog(struct xsk_socket *xsk)
|
||||
BPF_EMIT_CALL(BPF_FUNC_redirect_map),
|
||||
BPF_EXIT_INSN(),
|
||||
};
|
||||
size_t insns_cnt[] = {sizeof(prog) / sizeof(struct bpf_insn),
|
||||
sizeof(prog_redirect_flags) / sizeof(struct bpf_insn),
|
||||
size_t insns_cnt[] = {ARRAY_SIZE(prog),
|
||||
ARRAY_SIZE(prog_redirect_flags),
|
||||
};
|
||||
struct bpf_insn *progs[] = {prog, prog_redirect_flags};
|
||||
enum xsk_prog option = get_xsk_prog();
|
||||
@ -1193,12 +1193,23 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,
|
||||
|
||||
int xsk_umem__delete(struct xsk_umem *umem)
|
||||
{
|
||||
struct xdp_mmap_offsets off;
|
||||
int err;
|
||||
|
||||
if (!umem)
|
||||
return 0;
|
||||
|
||||
if (umem->refcount)
|
||||
return -EBUSY;
|
||||
|
||||
err = xsk_get_mmap_offsets(umem->fd, &off);
|
||||
if (!err && umem->fill_save && umem->comp_save) {
|
||||
munmap(umem->fill_save->ring - off.fr.desc,
|
||||
off.fr.desc + umem->config.fill_size * sizeof(__u64));
|
||||
munmap(umem->comp_save->ring - off.cr.desc,
|
||||
off.cr.desc + umem->config.comp_size * sizeof(__u64));
|
||||
}
|
||||
|
||||
close(umem->fd);
|
||||
free(umem);
|
||||
|
||||
|
@ -89,6 +89,9 @@ ifeq ($(CC_NO_CLANG), 1)
|
||||
EXTRA_WARNINGS += -Wstrict-aliasing=3
|
||||
|
||||
else ifneq ($(CROSS_COMPILE),)
|
||||
# Allow userspace to override CLANG_CROSS_FLAGS to specify their own
|
||||
# sysroots and flags or to avoid the GCC call in pure Clang builds.
|
||||
ifeq ($(CLANG_CROSS_FLAGS),)
|
||||
CLANG_CROSS_FLAGS := --target=$(notdir $(CROSS_COMPILE:%-=%))
|
||||
GCC_TOOLCHAIN_DIR := $(dir $(shell which $(CROSS_COMPILE)gcc 2>/dev/null))
|
||||
ifneq ($(GCC_TOOLCHAIN_DIR),)
|
||||
@ -96,6 +99,7 @@ CLANG_CROSS_FLAGS += --prefix=$(GCC_TOOLCHAIN_DIR)$(notdir $(CROSS_COMPILE))
|
||||
CLANG_CROSS_FLAGS += --sysroot=$(shell $(CROSS_COMPILE)gcc -print-sysroot)
|
||||
CLANG_CROSS_FLAGS += --gcc-toolchain=$(realpath $(GCC_TOOLCHAIN_DIR)/..)
|
||||
endif # GCC_TOOLCHAIN_DIR
|
||||
endif # CLANG_CROSS_FLAGS
|
||||
CFLAGS += $(CLANG_CROSS_FLAGS)
|
||||
AFLAGS += $(CLANG_CROSS_FLAGS)
|
||||
endif # CROSS_COMPILE
|
||||
|
tools/testing/selftests/bpf/.gitignore
vendored
@ -31,6 +31,7 @@ test_tcp_check_syncookie_user
|
||||
test_sysctl
|
||||
xdping
|
||||
test_cpp
|
||||
*.subskel.h
|
||||
*.skel.h
|
||||
*.lskel.h
|
||||
/no_alu32
|
||||
|
@ -25,7 +25,7 @@ CFLAGS += -g -O0 -rdynamic -Wall -Werror $(GENFLAGS) $(SAN_CFLAGS) \
|
||||
-I$(CURDIR) -I$(INCLUDE_DIR) -I$(GENDIR) -I$(LIBDIR) \
|
||||
-I$(TOOLSINCDIR) -I$(APIDIR) -I$(OUTPUT)
|
||||
LDFLAGS += $(SAN_CFLAGS)
|
||||
LDLIBS += -lcap -lelf -lz -lrt -lpthread
|
||||
LDLIBS += -lelf -lz -lrt -lpthread
|
||||
|
||||
# Silence some warnings when compiled with clang
|
||||
ifneq ($(LLVM),)
|
||||
@ -195,6 +195,7 @@ $(TEST_GEN_PROGS) $(TEST_GEN_PROGS_EXTENDED): $(BPFOBJ)
|
||||
CGROUP_HELPERS := $(OUTPUT)/cgroup_helpers.o
|
||||
TESTING_HELPERS := $(OUTPUT)/testing_helpers.o
|
||||
TRACE_HELPERS := $(OUTPUT)/trace_helpers.o
|
||||
CAP_HELPERS := $(OUTPUT)/cap_helpers.o
|
||||
|
||||
$(OUTPUT)/test_dev_cgroup: $(CGROUP_HELPERS) $(TESTING_HELPERS)
|
||||
$(OUTPUT)/test_skb_cgroup_id_user: $(CGROUP_HELPERS) $(TESTING_HELPERS)
|
||||
@ -211,7 +212,7 @@ $(OUTPUT)/test_lirc_mode2_user: $(TESTING_HELPERS)
|
||||
$(OUTPUT)/xdping: $(TESTING_HELPERS)
|
||||
$(OUTPUT)/flow_dissector_load: $(TESTING_HELPERS)
|
||||
$(OUTPUT)/test_maps: $(TESTING_HELPERS)
|
||||
$(OUTPUT)/test_verifier: $(TESTING_HELPERS)
|
||||
$(OUTPUT)/test_verifier: $(TESTING_HELPERS) $(CAP_HELPERS)
|
||||
|
||||
BPFTOOL ?= $(DEFAULT_BPFTOOL)
|
||||
$(DEFAULT_BPFTOOL): $(wildcard $(BPFTOOLDIR)/*.[ch] $(BPFTOOLDIR)/Makefile) \
|
||||
@ -326,7 +327,13 @@ endef
|
||||
SKEL_BLACKLIST := btf__% test_pinning_invalid.c test_sk_assign.c
|
||||
|
||||
LINKED_SKELS := test_static_linked.skel.h linked_funcs.skel.h \
|
||||
linked_vars.skel.h linked_maps.skel.h
|
||||
linked_vars.skel.h linked_maps.skel.h \
|
||||
test_subskeleton.skel.h test_subskeleton_lib.skel.h
|
||||
|
||||
# In the subskeleton case, we want the test_subskeleton_lib.subskel.h file
|
||||
# but that's created as a side-effect of the skel.h generation.
|
||||
test_subskeleton.skel.h-deps := test_subskeleton_lib2.o test_subskeleton_lib.o test_subskeleton.o
|
||||
test_subskeleton_lib.skel.h-deps := test_subskeleton_lib2.o test_subskeleton_lib.o
|
||||
|
||||
LSKELS := kfunc_call_test.c fentry_test.c fexit_test.c fexit_sleep.c \
|
||||
test_ringbuf.c atomics.c trace_printk.c trace_vprintk.c \
|
||||
@ -404,6 +411,7 @@ $(TRUNNER_BPF_SKELS): %.skel.h: %.o $(BPFTOOL) | $(TRUNNER_OUTPUT)
|
||||
$(Q)$$(BPFTOOL) gen object $$(<:.o=.linked3.o) $$(<:.o=.linked2.o)
|
||||
$(Q)diff $$(<:.o=.linked2.o) $$(<:.o=.linked3.o)
|
||||
$(Q)$$(BPFTOOL) gen skeleton $$(<:.o=.linked3.o) name $$(notdir $$(<:.o=)) > $$@
|
||||
$(Q)$$(BPFTOOL) gen subskeleton $$(<:.o=.linked3.o) name $$(notdir $$(<:.o=)) > $$(@:.skel.h=.subskel.h)
|
||||
|
||||
$(TRUNNER_BPF_LSKELS): %.lskel.h: %.o $(BPFTOOL) | $(TRUNNER_OUTPUT)
|
||||
$$(call msg,GEN-SKEL,$(TRUNNER_BINARY),$$@)
|
||||
@ -421,6 +429,7 @@ $(TRUNNER_BPF_SKELS_LINKED): $(TRUNNER_BPF_OBJS) $(BPFTOOL) | $(TRUNNER_OUTPUT)
|
||||
$(Q)diff $$(@:.skel.h=.linked2.o) $$(@:.skel.h=.linked3.o)
|
||||
$$(call msg,GEN-SKEL,$(TRUNNER_BINARY),$$@)
|
||||
$(Q)$$(BPFTOOL) gen skeleton $$(@:.skel.h=.linked3.o) name $$(notdir $$(@:.skel.h=)) > $$@
|
||||
$(Q)$$(BPFTOOL) gen subskeleton $$(@:.skel.h=.linked3.o) name $$(notdir $$(@:.skel.h=)) > $$(@:.skel.h=.subskel.h)
|
||||
endif
|
||||
|
||||
# ensure we set up tests.h header generation rule just once
|
||||
@ -479,7 +488,8 @@ TRUNNER_TESTS_DIR := prog_tests
|
||||
TRUNNER_BPF_PROGS_DIR := progs
|
||||
TRUNNER_EXTRA_SOURCES := test_progs.c cgroup_helpers.c trace_helpers.c \
|
||||
network_helpers.c testing_helpers.c \
|
||||
btf_helpers.c flow_dissector_load.h
|
||||
btf_helpers.c flow_dissector_load.h \
|
||||
cap_helpers.c
|
||||
TRUNNER_EXTRA_FILES := $(OUTPUT)/urandom_read $(OUTPUT)/bpf_testmod.ko \
|
||||
ima_setup.sh \
|
||||
$(wildcard progs/btf_dump_test_case_*.c)
|
||||
@ -557,6 +567,6 @@ $(OUTPUT)/bench: $(OUTPUT)/bench.o \
|
||||
EXTRA_CLEAN := $(TEST_CUSTOM_PROGS) $(SCRATCH_DIR) $(HOST_SCRATCH_DIR) \
|
||||
prog_tests/tests.h map_tests/tests.h verifier/tests.h \
|
||||
feature bpftool \
|
||||
$(addprefix $(OUTPUT)/,*.o *.skel.h *.lskel.h no_alu32 bpf_gcc bpf_testmod.ko)
|
||||
$(addprefix $(OUTPUT)/,*.o *.skel.h *.lskel.h *.subskel.h no_alu32 bpf_gcc bpf_testmod.ko)
|
||||
|
||||
.PHONY: docs docs-clean
|
||||
|
@ -32,11 +32,19 @@ For more information on about using the script, run:
|
||||
|
||||
$ tools/testing/selftests/bpf/vmtest.sh -h
|
||||
|
||||
In case of linker errors when running selftests, try using static linking:
|
||||
|
||||
.. code-block:: console
|
||||
|
||||
$ LDLIBS=-static vmtest.sh
|
||||
|
||||
.. note:: Some distros may not support static linking.
|
||||
|
||||
.. note:: The script uses pahole and clang based on host environment setting.
|
||||
If you want to change pahole and llvm, you can change `PATH` environment
|
||||
variable in the beginning of script.
|
||||
|
||||
.. note:: The script currently only supports x86_64.
|
||||
.. note:: The script currently only supports x86_64 and s390x architectures.
|
||||
|
||||
Additional information about selftest failures are
|
||||
documented here.
|
||||
|
@ -33,6 +33,10 @@ struct bpf_testmod_btf_type_tag_2 {
|
||||
struct bpf_testmod_btf_type_tag_1 __user *p;
|
||||
};
|
||||
|
||||
struct bpf_testmod_btf_type_tag_3 {
|
||||
struct bpf_testmod_btf_type_tag_1 __percpu *p;
|
||||
};
|
||||
|
||||
noinline int
|
||||
bpf_testmod_test_btf_type_tag_user_1(struct bpf_testmod_btf_type_tag_1 __user *arg) {
|
||||
BTF_TYPE_EMIT(func_proto_typedef);
|
||||
@ -46,6 +50,16 @@ bpf_testmod_test_btf_type_tag_user_2(struct bpf_testmod_btf_type_tag_2 *arg) {
|
||||
return arg->p->a;
|
||||
}
|
||||
|
||||
noinline int
|
||||
bpf_testmod_test_btf_type_tag_percpu_1(struct bpf_testmod_btf_type_tag_1 __percpu *arg) {
|
||||
return arg->a;
|
||||
}
|
||||
|
||||
noinline int
|
||||
bpf_testmod_test_btf_type_tag_percpu_2(struct bpf_testmod_btf_type_tag_3 *arg) {
|
||||
return arg->p->a;
|
||||
}
|
||||
|
||||
noinline int bpf_testmod_loop_test(int n)
|
||||
{
|
||||
int i, sum = 0;
|
||||
|
tools/testing/selftests/bpf/cap_helpers.c
Normal file
@ -0,0 +1,67 @@
|
||||
// SPDX-License-Identifier: GPL-2.0
|
||||
#include "cap_helpers.h"
|
||||
|
||||
/* Avoid including <sys/capability.h> from the libcap-devel package,
|
||||
* so directly declare them here and use them from glibc.
|
||||
*/
|
||||
int capget(cap_user_header_t header, cap_user_data_t data);
|
||||
int capset(cap_user_header_t header, const cap_user_data_t data);
|
||||
|
||||
int cap_enable_effective(__u64 caps, __u64 *old_caps)
|
||||
{
|
||||
struct __user_cap_data_struct data[_LINUX_CAPABILITY_U32S_3];
|
||||
struct __user_cap_header_struct hdr = {
|
||||
.version = _LINUX_CAPABILITY_VERSION_3,
|
||||
};
|
||||
__u32 cap0 = caps;
|
||||
__u32 cap1 = caps >> 32;
|
||||
int err;
|
||||
|
||||
err = capget(&hdr, data);
|
||||
if (err)
|
||||
return err;
|
||||
|
||||
if (old_caps)
|
||||
*old_caps = (__u64)(data[1].effective) << 32 | data[0].effective;
|
||||
|
||||
if ((data[0].effective & cap0) == cap0 &&
|
||||
(data[1].effective & cap1) == cap1)
|
||||
return 0;
|
||||
|
||||
data[0].effective |= cap0;
|
||||
data[1].effective |= cap1;
|
||||
err = capset(&hdr, data);
|
||||
if (err)
|
||||
return err;
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
int cap_disable_effective(__u64 caps, __u64 *old_caps)
|
||||
{
|
||||
struct __user_cap_data_struct data[_LINUX_CAPABILITY_U32S_3];
|
||||
struct __user_cap_header_struct hdr = {
|
||||
.version = _LINUX_CAPABILITY_VERSION_3,
|
||||
};
|
||||
__u32 cap0 = caps;
|
||||
__u32 cap1 = caps >> 32;
|
||||
int err;
|
||||
|
||||
err = capget(&hdr, data);
|
||||
if (err)
|
||||
return err;
|
||||
|
||||
if (old_caps)
|
||||
*old_caps = (__u64)(data[1].effective) << 32 | data[0].effective;
|
||||
|
||||
if (!(data[0].effective & cap0) && !(data[1].effective & cap1))
|
||||
return 0;
|
||||
|
||||
data[0].effective &= ~cap0;
|
||||
data[1].effective &= ~cap1;
|
||||
err = capset(&hdr, data);
|
||||
if (err)
|
||||
return err;
|
||||
|
||||
return 0;
|
||||
}
|
tools/testing/selftests/bpf/cap_helpers.h
Normal file
@@ -0,0 +1,19 @@
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef __CAP_HELPERS_H
#define __CAP_HELPERS_H

#include <linux/types.h>
#include <linux/capability.h>

#ifndef CAP_PERFMON
#define CAP_PERFMON 38
#endif

#ifndef CAP_BPF
#define CAP_BPF 39
#endif

int cap_enable_effective(__u64 caps, __u64 *old_caps);
int cap_disable_effective(__u64 caps, __u64 *old_caps);

#endif
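A hedged usage sketch for these helpers: drop CAP_NET_BIND_SERVICE around an unprivileged check and only restore what was effective before, mirroring how the bind_perm test below uses them. do_unprivileged_work() is a hypothetical test body.

#include "cap_helpers.h"

extern void do_unprivileged_work(void);	/* hypothetical test body */

static int with_cap_dropped(void)
{
	const __u64 caps = 1ULL << CAP_NET_BIND_SERVICE;
	__u64 old_caps = 0;
	int err;

	err = cap_disable_effective(caps, &old_caps);
	if (err)
		return err;

	do_unprivileged_work();

	/* Re-enable only if the capability was effective to begin with */
	if (old_caps & caps)
		err = cap_enable_effective(caps, NULL);
	return err;
}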
|
@ -12,7 +12,7 @@ LOG_FILE="$(mktemp /tmp/ima_setup.XXXX.log)"
|
||||
|
||||
usage()
|
||||
{
|
||||
echo "Usage: $0 <setup|cleanup|run> <existing_tmp_dir>"
|
||||
echo "Usage: $0 <setup|cleanup|run|modify-bin|restore-bin|load-policy> <existing_tmp_dir>"
|
||||
exit 1
|
||||
}
|
||||
|
||||
@ -51,6 +51,7 @@ setup()
|
||||
|
||||
ensure_mount_securityfs
|
||||
echo "measure func=BPRM_CHECK fsuuid=${mount_uuid}" > ${IMA_POLICY_FILE}
|
||||
echo "measure func=BPRM_CHECK fsuuid=${mount_uuid}" > ${mount_dir}/policy_test
|
||||
}
|
||||
|
||||
cleanup() {
|
||||
@ -77,6 +78,32 @@ run()
|
||||
exec "${copied_bin_path}"
|
||||
}
|
||||
|
||||
modify_bin()
|
||||
{
|
||||
local tmp_dir="$1"
|
||||
local mount_dir="${tmp_dir}/mnt"
|
||||
local copied_bin_path="${mount_dir}/$(basename ${TEST_BINARY})"
|
||||
|
||||
echo "mod" >> "${copied_bin_path}"
|
||||
}
|
||||
|
||||
restore_bin()
|
||||
{
|
||||
local tmp_dir="$1"
|
||||
local mount_dir="${tmp_dir}/mnt"
|
||||
local copied_bin_path="${mount_dir}/$(basename ${TEST_BINARY})"
|
||||
|
||||
truncate -s -4 "${copied_bin_path}"
|
||||
}
|
||||
|
||||
load_policy()
|
||||
{
|
||||
local tmp_dir="$1"
|
||||
local mount_dir="${tmp_dir}/mnt"
|
||||
|
||||
echo ${mount_dir}/policy_test > ${IMA_POLICY_FILE} 2> /dev/null
|
||||
}
|
||||
|
||||
catch()
|
||||
{
|
||||
local exit_code="$1"
|
||||
@ -105,6 +132,12 @@ main()
|
||||
cleanup "${tmp_dir}"
|
||||
elif [[ "${action}" == "run" ]]; then
|
||||
run "${tmp_dir}"
|
||||
elif [[ "${action}" == "modify-bin" ]]; then
|
||||
modify_bin "${tmp_dir}"
|
||||
elif [[ "${action}" == "restore-bin" ]]; then
|
||||
restore_bin "${tmp_dir}"
|
||||
elif [[ "${action}" == "load-policy" ]]; then
|
||||
load_policy "${tmp_dir}"
|
||||
else
|
||||
echo "Unknown action: ${action}"
|
||||
exit 1
|
||||
|
@ -1,18 +1,25 @@
|
||||
// SPDX-License-Identifier: GPL-2.0-only
|
||||
#define _GNU_SOURCE
|
||||
|
||||
#include <errno.h>
|
||||
#include <stdbool.h>
|
||||
#include <stdio.h>
|
||||
#include <string.h>
|
||||
#include <unistd.h>
|
||||
#include <sched.h>
|
||||
|
||||
#include <arpa/inet.h>
|
||||
#include <sys/mount.h>
|
||||
#include <sys/stat.h>
|
||||
|
||||
#include <linux/err.h>
|
||||
#include <linux/in.h>
|
||||
#include <linux/in6.h>
|
||||
#include <linux/limits.h>
|
||||
|
||||
#include "bpf_util.h"
|
||||
#include "network_helpers.h"
|
||||
#include "test_progs.h"
|
||||
|
||||
#define clean_errno() (errno == 0 ? "None" : strerror(errno))
|
||||
#define log_err(MSG, ...) ({ \
|
||||
@ -356,3 +363,82 @@ char *ping_command(int family)
|
||||
}
|
||||
return "ping";
|
||||
}
|
||||
|
||||
struct nstoken {
|
||||
int orig_netns_fd;
|
||||
};
|
||||
|
||||
static int setns_by_fd(int nsfd)
|
||||
{
|
||||
int err;
|
||||
|
||||
err = setns(nsfd, CLONE_NEWNET);
|
||||
close(nsfd);
|
||||
|
||||
if (!ASSERT_OK(err, "setns"))
|
||||
return err;
|
||||
|
||||
/* Switch /sys to the new namespace so that e.g. /sys/class/net
|
||||
* reflects the devices in the new namespace.
|
||||
*/
|
||||
err = unshare(CLONE_NEWNS);
|
||||
if (!ASSERT_OK(err, "unshare"))
|
||||
return err;
|
||||
|
||||
/* Make our /sys mount private, so the following umount won't
|
||||
* trigger the global umount in case it's shared.
|
||||
*/
|
||||
err = mount("none", "/sys", NULL, MS_PRIVATE, NULL);
|
||||
if (!ASSERT_OK(err, "remount private /sys"))
|
||||
return err;
|
||||
|
||||
err = umount2("/sys", MNT_DETACH);
|
||||
if (!ASSERT_OK(err, "umount2 /sys"))
|
||||
return err;
|
||||
|
||||
err = mount("sysfs", "/sys", "sysfs", 0, NULL);
|
||||
if (!ASSERT_OK(err, "mount /sys"))
|
||||
return err;
|
||||
|
||||
err = mount("bpffs", "/sys/fs/bpf", "bpf", 0, NULL);
|
||||
if (!ASSERT_OK(err, "mount /sys/fs/bpf"))
|
||||
return err;
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
struct nstoken *open_netns(const char *name)
|
||||
{
|
||||
int nsfd;
|
||||
char nspath[PATH_MAX];
|
||||
int err;
|
||||
struct nstoken *token;
|
||||
|
||||
token = malloc(sizeof(struct nstoken));
|
||||
if (!ASSERT_OK_PTR(token, "malloc token"))
|
||||
return NULL;
|
||||
|
||||
token->orig_netns_fd = open("/proc/self/ns/net", O_RDONLY);
|
||||
if (!ASSERT_GE(token->orig_netns_fd, 0, "open /proc/self/ns/net"))
|
||||
goto fail;
|
||||
|
||||
snprintf(nspath, sizeof(nspath), "%s/%s", "/var/run/netns", name);
|
||||
nsfd = open(nspath, O_RDONLY | O_CLOEXEC);
|
||||
if (!ASSERT_GE(nsfd, 0, "open netns fd"))
|
||||
goto fail;
|
||||
|
||||
err = setns_by_fd(nsfd);
|
||||
if (!ASSERT_OK(err, "setns_by_fd"))
|
||||
goto fail;
|
||||
|
||||
return token;
|
||||
fail:
|
||||
free(token);
|
||||
return NULL;
|
||||
}
|
||||
|
||||
void close_netns(struct nstoken *token)
|
||||
{
|
||||
ASSERT_OK(setns_by_fd(token->orig_netns_fd), "setns_by_fd");
|
||||
free(token);
|
||||
}
|
||||
|
@@ -55,4 +55,13 @@ int make_sockaddr(int family, const char *addr_str, __u16 port,
struct sockaddr_storage *addr, socklen_t *len);
char *ping_command(int family);

struct nstoken;
/**
* open_netns() - Switch to specified network namespace by name.
*
* Returns token with which to restore the original namespace
* using close_netns().
*/
struct nstoken *open_netns(const char *name);
void close_netns(struct nstoken *token);
#endif
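A hedged usage sketch for the new namespace helpers: enter a named netns, run a check, and switch back. The namespace name and the body are illustrative.

#include "network_helpers.h"

static void check_in_ns(void)
{
	struct nstoken *tok;

	tok = open_netns("ns1");	/* expects /var/run/netns/ns1 to exist */
	if (!tok)
		return;

	/* ... exercise sockets or devices that only exist inside "ns1" ... */

	close_netns(tok);
}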
|
||||
|
@ -4,9 +4,9 @@
|
||||
#include <stdlib.h>
|
||||
#include <sys/types.h>
|
||||
#include <sys/socket.h>
|
||||
#include <sys/capability.h>
|
||||
|
||||
#include "test_progs.h"
|
||||
#include "cap_helpers.h"
|
||||
#include "bind_perm.skel.h"
|
||||
|
||||
static int duration;
|
||||
@ -49,41 +49,11 @@ close_socket:
|
||||
close(fd);
|
||||
}
|
||||
|
||||
bool cap_net_bind_service(cap_flag_value_t flag)
|
||||
{
|
||||
const cap_value_t cap_net_bind_service = CAP_NET_BIND_SERVICE;
|
||||
cap_flag_value_t original_value;
|
||||
bool was_effective = false;
|
||||
cap_t caps;
|
||||
|
||||
caps = cap_get_proc();
|
||||
if (CHECK(!caps, "cap_get_proc", "errno %d", errno))
|
||||
goto free_caps;
|
||||
|
||||
if (CHECK(cap_get_flag(caps, CAP_NET_BIND_SERVICE, CAP_EFFECTIVE,
|
||||
&original_value),
|
||||
"cap_get_flag", "errno %d", errno))
|
||||
goto free_caps;
|
||||
|
||||
was_effective = (original_value == CAP_SET);
|
||||
|
||||
if (CHECK(cap_set_flag(caps, CAP_EFFECTIVE, 1, &cap_net_bind_service,
|
||||
flag),
|
||||
"cap_set_flag", "errno %d", errno))
|
||||
goto free_caps;
|
||||
|
||||
if (CHECK(cap_set_proc(caps), "cap_set_proc", "errno %d", errno))
|
||||
goto free_caps;
|
||||
|
||||
free_caps:
|
||||
CHECK(cap_free(caps), "cap_free", "errno %d", errno);
|
||||
return was_effective;
|
||||
}
|
||||
|
||||
void test_bind_perm(void)
|
||||
{
|
||||
bool cap_was_effective;
|
||||
const __u64 net_bind_svc_cap = 1ULL << CAP_NET_BIND_SERVICE;
|
||||
struct bind_perm *skel;
|
||||
__u64 old_caps = 0;
|
||||
int cgroup_fd;
|
||||
|
||||
if (create_netns())
|
||||
@ -105,7 +75,8 @@ void test_bind_perm(void)
|
||||
if (!ASSERT_OK_PTR(skel, "bind_v6_prog"))
|
||||
goto close_skeleton;
|
||||
|
||||
cap_was_effective = cap_net_bind_service(CAP_CLEAR);
|
||||
ASSERT_OK(cap_disable_effective(net_bind_svc_cap, &old_caps),
|
||||
"cap_disable_effective");
|
||||
|
||||
try_bind(AF_INET, 110, EACCES);
|
||||
try_bind(AF_INET6, 110, EACCES);
|
||||
@ -113,8 +84,9 @@ void test_bind_perm(void)
|
||||
try_bind(AF_INET, 111, 0);
|
||||
try_bind(AF_INET6, 111, 0);
|
||||
|
||||
if (cap_was_effective)
|
||||
cap_net_bind_service(CAP_SET);
|
||||
if (old_caps & net_bind_svc_cap)
|
||||
ASSERT_OK(cap_enable_effective(net_bind_svc_cap, NULL),
|
||||
"cap_enable_effective");
|
||||
|
||||
close_skeleton:
|
||||
bind_perm__destroy(skel);
|
||||
|
@ -7,6 +7,7 @@
|
||||
#include <unistd.h>
|
||||
#include <test_progs.h>
|
||||
#include "test_bpf_cookie.skel.h"
|
||||
#include "kprobe_multi.skel.h"
|
||||
|
||||
/* uprobe attach point */
|
||||
static void trigger_func(void)
|
||||
@ -63,6 +64,178 @@ cleanup:
|
||||
bpf_link__destroy(retlink2);
|
||||
}
|
||||
|
||||
static void kprobe_multi_test_run(struct kprobe_multi *skel)
|
||||
{
|
||||
LIBBPF_OPTS(bpf_test_run_opts, topts);
|
||||
int err, prog_fd;
|
||||
|
||||
prog_fd = bpf_program__fd(skel->progs.trigger);
|
||||
err = bpf_prog_test_run_opts(prog_fd, &topts);
|
||||
ASSERT_OK(err, "test_run");
|
||||
ASSERT_EQ(topts.retval, 0, "test_run");
|
||||
|
||||
ASSERT_EQ(skel->bss->kprobe_test1_result, 1, "kprobe_test1_result");
|
||||
ASSERT_EQ(skel->bss->kprobe_test2_result, 1, "kprobe_test2_result");
|
||||
ASSERT_EQ(skel->bss->kprobe_test3_result, 1, "kprobe_test3_result");
|
||||
ASSERT_EQ(skel->bss->kprobe_test4_result, 1, "kprobe_test4_result");
|
||||
ASSERT_EQ(skel->bss->kprobe_test5_result, 1, "kprobe_test5_result");
|
||||
ASSERT_EQ(skel->bss->kprobe_test6_result, 1, "kprobe_test6_result");
|
||||
ASSERT_EQ(skel->bss->kprobe_test7_result, 1, "kprobe_test7_result");
|
||||
ASSERT_EQ(skel->bss->kprobe_test8_result, 1, "kprobe_test8_result");
|
||||
|
||||
ASSERT_EQ(skel->bss->kretprobe_test1_result, 1, "kretprobe_test1_result");
|
||||
ASSERT_EQ(skel->bss->kretprobe_test2_result, 1, "kretprobe_test2_result");
|
||||
ASSERT_EQ(skel->bss->kretprobe_test3_result, 1, "kretprobe_test3_result");
|
||||
ASSERT_EQ(skel->bss->kretprobe_test4_result, 1, "kretprobe_test4_result");
|
||||
ASSERT_EQ(skel->bss->kretprobe_test5_result, 1, "kretprobe_test5_result");
|
||||
ASSERT_EQ(skel->bss->kretprobe_test6_result, 1, "kretprobe_test6_result");
|
||||
ASSERT_EQ(skel->bss->kretprobe_test7_result, 1, "kretprobe_test7_result");
|
||||
ASSERT_EQ(skel->bss->kretprobe_test8_result, 1, "kretprobe_test8_result");
|
||||
}
|
||||
|
||||
static void kprobe_multi_link_api_subtest(void)
|
||||
{
|
||||
int prog_fd, link1_fd = -1, link2_fd = -1;
|
||||
struct kprobe_multi *skel = NULL;
|
||||
LIBBPF_OPTS(bpf_link_create_opts, opts);
|
||||
unsigned long long addrs[8];
|
||||
__u64 cookies[8];
|
||||
|
||||
if (!ASSERT_OK(load_kallsyms(), "load_kallsyms"))
|
||||
goto cleanup;
|
||||
|
||||
skel = kprobe_multi__open_and_load();
|
||||
if (!ASSERT_OK_PTR(skel, "fentry_raw_skel_load"))
|
||||
goto cleanup;
|
||||
|
||||
skel->bss->pid = getpid();
|
||||
skel->bss->test_cookie = true;
|
||||
|
||||
#define GET_ADDR(__sym, __addr) ({ \
|
||||
__addr = ksym_get_addr(__sym); \
|
||||
if (!ASSERT_NEQ(__addr, 0, "ksym_get_addr " #__sym)) \
|
||||
goto cleanup; \
|
||||
})
|
||||
|
||||
GET_ADDR("bpf_fentry_test1", addrs[0]);
|
||||
GET_ADDR("bpf_fentry_test2", addrs[1]);
|
||||
GET_ADDR("bpf_fentry_test3", addrs[2]);
|
||||
GET_ADDR("bpf_fentry_test4", addrs[3]);
|
||||
GET_ADDR("bpf_fentry_test5", addrs[4]);
|
||||
GET_ADDR("bpf_fentry_test6", addrs[5]);
|
||||
GET_ADDR("bpf_fentry_test7", addrs[6]);
|
||||
GET_ADDR("bpf_fentry_test8", addrs[7]);
|
||||
|
||||
#undef GET_ADDR
|
||||
|
||||
cookies[0] = 1;
|
||||
cookies[1] = 2;
|
||||
cookies[2] = 3;
|
||||
cookies[3] = 4;
|
||||
cookies[4] = 5;
|
||||
cookies[5] = 6;
|
||||
cookies[6] = 7;
|
||||
cookies[7] = 8;
|
||||
|
||||
opts.kprobe_multi.addrs = (const unsigned long *) &addrs;
|
||||
opts.kprobe_multi.cnt = ARRAY_SIZE(addrs);
|
||||
opts.kprobe_multi.cookies = (const __u64 *) &cookies;
|
||||
prog_fd = bpf_program__fd(skel->progs.test_kprobe);
|
||||
|
||||
link1_fd = bpf_link_create(prog_fd, 0, BPF_TRACE_KPROBE_MULTI, &opts);
|
||||
if (!ASSERT_GE(link1_fd, 0, "link1_fd"))
|
||||
goto cleanup;
|
||||
|
||||
cookies[0] = 8;
|
||||
cookies[1] = 7;
|
||||
cookies[2] = 6;
|
||||
cookies[3] = 5;
|
||||
cookies[4] = 4;
|
||||
cookies[5] = 3;
|
||||
cookies[6] = 2;
|
||||
cookies[7] = 1;
|
||||
|
||||
opts.kprobe_multi.flags = BPF_F_KPROBE_MULTI_RETURN;
|
||||
prog_fd = bpf_program__fd(skel->progs.test_kretprobe);
|
||||
|
||||
link2_fd = bpf_link_create(prog_fd, 0, BPF_TRACE_KPROBE_MULTI, &opts);
|
||||
if (!ASSERT_GE(link2_fd, 0, "link2_fd"))
|
||||
goto cleanup;
|
||||
|
||||
kprobe_multi_test_run(skel);
|
||||
|
||||
cleanup:
|
||||
close(link1_fd);
|
||||
close(link2_fd);
|
||||
kprobe_multi__destroy(skel);
|
||||
}
|
||||
|
||||
static void kprobe_multi_attach_api_subtest(void)
|
||||
{
|
||||
struct bpf_link *link1 = NULL, *link2 = NULL;
|
||||
LIBBPF_OPTS(bpf_kprobe_multi_opts, opts);
|
||||
LIBBPF_OPTS(bpf_test_run_opts, topts);
|
||||
struct kprobe_multi *skel = NULL;
|
||||
const char *syms[8] = {
|
||||
"bpf_fentry_test1",
|
||||
"bpf_fentry_test2",
|
||||
"bpf_fentry_test3",
|
||||
"bpf_fentry_test4",
|
||||
"bpf_fentry_test5",
|
||||
"bpf_fentry_test6",
|
||||
"bpf_fentry_test7",
|
||||
"bpf_fentry_test8",
|
||||
};
|
||||
__u64 cookies[8];
|
||||
|
||||
skel = kprobe_multi__open_and_load();
|
||||
if (!ASSERT_OK_PTR(skel, "fentry_raw_skel_load"))
|
||||
goto cleanup;
|
||||
|
||||
skel->bss->pid = getpid();
|
||||
skel->bss->test_cookie = true;
|
||||
|
||||
cookies[0] = 1;
|
||||
cookies[1] = 2;
|
||||
cookies[2] = 3;
|
||||
cookies[3] = 4;
|
||||
cookies[4] = 5;
|
||||
cookies[5] = 6;
|
||||
cookies[6] = 7;
|
||||
cookies[7] = 8;
|
||||
|
||||
opts.syms = syms;
|
||||
opts.cnt = ARRAY_SIZE(syms);
|
||||
opts.cookies = cookies;
|
||||
|
||||
link1 = bpf_program__attach_kprobe_multi_opts(skel->progs.test_kprobe,
|
||||
NULL, &opts);
|
||||
if (!ASSERT_OK_PTR(link1, "bpf_program__attach_kprobe_multi_opts"))
|
||||
goto cleanup;
|
||||
|
||||
cookies[0] = 8;
|
||||
cookies[1] = 7;
|
||||
cookies[2] = 6;
|
||||
cookies[3] = 5;
|
||||
cookies[4] = 4;
|
||||
cookies[5] = 3;
|
||||
cookies[6] = 2;
|
||||
cookies[7] = 1;
|
||||
|
||||
opts.retprobe = true;
|
||||
|
||||
link2 = bpf_program__attach_kprobe_multi_opts(skel->progs.test_kretprobe,
|
||||
NULL, &opts);
|
||||
if (!ASSERT_OK_PTR(link2, "bpf_program__attach_kprobe_multi_opts"))
|
||||
goto cleanup;
|
||||
|
||||
kprobe_multi_test_run(skel);
|
||||
|
||||
cleanup:
|
||||
bpf_link__destroy(link2);
|
||||
bpf_link__destroy(link1);
|
||||
kprobe_multi__destroy(skel);
|
||||
}
|
||||
static void uprobe_subtest(struct test_bpf_cookie *skel)
|
||||
{
|
||||
DECLARE_LIBBPF_OPTS(bpf_uprobe_opts, opts);
|
||||
@ -199,7 +372,7 @@ static void pe_subtest(struct test_bpf_cookie *skel)
|
||||
attr.type = PERF_TYPE_SOFTWARE;
|
||||
attr.config = PERF_COUNT_SW_CPU_CLOCK;
|
||||
attr.freq = 1;
|
||||
attr.sample_freq = 4000;
|
||||
attr.sample_freq = 1000;
|
||||
pfd = syscall(__NR_perf_event_open, &attr, -1, 0, -1, PERF_FLAG_FD_CLOEXEC);
|
||||
if (!ASSERT_GE(pfd, 0, "perf_fd"))
|
||||
goto cleanup;
|
||||
@ -249,6 +422,10 @@ void test_bpf_cookie(void)
|
||||
|
||||
if (test__start_subtest("kprobe"))
|
||||
kprobe_subtest(skel);
|
||||
if (test__start_subtest("multi_kprobe_link_api"))
|
||||
kprobe_multi_link_api_subtest();
|
||||
if (test__start_subtest("multi_kprobe_attach_api"))
|
||||
kprobe_multi_attach_api_subtest();
|
||||
if (test__start_subtest("uprobe"))
|
||||
uprobe_subtest(skel);
|
||||
if (test__start_subtest("tracepoint"))
|
||||
|
@ -10,6 +10,7 @@ struct btf_type_tag_test {
|
||||
};
|
||||
#include "btf_type_tag.skel.h"
|
||||
#include "btf_type_tag_user.skel.h"
|
||||
#include "btf_type_tag_percpu.skel.h"
|
||||
|
||||
static void test_btf_decl_tag(void)
|
||||
{
|
||||
@@ -43,38 +44,81 @@ static void test_btf_type_tag(void)
 	btf_type_tag__destroy(skel);
 }
 
-static void test_btf_type_tag_mod_user(bool load_test_user1)
+/* loads vmlinux_btf as well as module_btf. If the caller passes NULL as
+ * module_btf, it will not load module btf.
+ *
+ * Returns 0 on success.
+ * Return -1 On error. In case of error, the loaded btf will be freed and the
+ * input parameters will be set to pointing to NULL.
+ */
+static int load_btfs(struct btf **vmlinux_btf, struct btf **module_btf,
+		     bool needs_vmlinux_tag)
 {
 	const char *module_name = "bpf_testmod";
-	struct btf *vmlinux_btf, *module_btf;
-	struct btf_type_tag_user *skel;
 	__s32 type_id;
-	int err;
 
 	if (!env.has_testmod) {
 		test__skip();
-		return;
+		return -1;
 	}
 
-	/* skip the test if the module does not have __user tags */
-	vmlinux_btf = btf__load_vmlinux_btf();
-	if (!ASSERT_OK_PTR(vmlinux_btf, "could not load vmlinux BTF"))
-		return;
+	*vmlinux_btf = btf__load_vmlinux_btf();
+	if (!ASSERT_OK_PTR(*vmlinux_btf, "could not load vmlinux BTF"))
+		return -1;
 
-	module_btf = btf__load_module_btf(module_name, vmlinux_btf);
-	if (!ASSERT_OK_PTR(module_btf, "could not load module BTF"))
+	if (!needs_vmlinux_tag)
+		goto load_module_btf;
+
+	/* skip the test if the vmlinux does not have __user tags */
+	type_id = btf__find_by_name_kind(*vmlinux_btf, "user", BTF_KIND_TYPE_TAG);
+	if (type_id <= 0) {
+		printf("%s:SKIP: btf_type_tag attribute not in vmlinux btf", __func__);
+		test__skip();
+		goto free_vmlinux_btf;
+	}
+
+load_module_btf:
+	/* skip loading module_btf, if not requested by caller */
+	if (!module_btf)
+		return 0;
+
+	*module_btf = btf__load_module_btf(module_name, *vmlinux_btf);
+	if (!ASSERT_OK_PTR(*module_btf, "could not load module BTF"))
 		goto free_vmlinux_btf;
 
-	type_id = btf__find_by_name_kind(module_btf, "user", BTF_KIND_TYPE_TAG);
+	/* skip the test if the module does not have __user tags */
+	type_id = btf__find_by_name_kind(*module_btf, "user", BTF_KIND_TYPE_TAG);
 	if (type_id <= 0) {
 		printf("%s:SKIP: btf_type_tag attribute not in %s", __func__, module_name);
 		test__skip();
 		goto free_module_btf;
 	}
 
+	return 0;
+
+free_module_btf:
+	btf__free(*module_btf);
+free_vmlinux_btf:
+	btf__free(*vmlinux_btf);
+
+	*vmlinux_btf = NULL;
+	if (module_btf)
+		*module_btf = NULL;
+	return -1;
+}
+
+static void test_btf_type_tag_mod_user(bool load_test_user1)
+{
+	struct btf *vmlinux_btf = NULL, *module_btf = NULL;
+	struct btf_type_tag_user *skel;
+	int err;
+
+	if (load_btfs(&vmlinux_btf, &module_btf, /*needs_vmlinux_tag=*/false))
+		return;
+
 	skel = btf_type_tag_user__open();
 	if (!ASSERT_OK_PTR(skel, "btf_type_tag_user"))
-		goto free_module_btf;
+		goto cleanup;
 
 	bpf_program__set_autoload(skel->progs.test_sys_getsockname, false);
 	if (load_test_user1)
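The comment above load_btfs() captures its contract: vmlinux BTF is always loaded, module BTF only when a non-NULL module_btf pointer is passed, and on failure everything already loaded is freed and the caller's pointers are reset to NULL. A small caller sketch under those assumptions (the function name is illustrative; load_btfs and btf__free are the ones in the diff):

static void vmlinux_only_caller_sketch(void)
{
	struct btf *vmlinux_btf = NULL;

	/* Pass NULL for module_btf: only vmlinux BTF is loaded, and the
	 * "user" type tag must be present (needs_vmlinux_tag = true).
	 */
	if (load_btfs(&vmlinux_btf, NULL, /*needs_vmlinux_tag=*/true))
		return;			/* on error nothing is left to free */

	/* ... open a skeleton, toggle autoload, load, assert ... */

	btf__free(vmlinux_btf);		/* the caller owns the loaded BTF */
}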
@@ -87,34 +131,23 @@ static void test_btf_type_tag_mod_user(bool load_test_user1)
 
 	btf_type_tag_user__destroy(skel);
 
-free_module_btf:
+cleanup:
 	btf__free(module_btf);
-free_vmlinux_btf:
 	btf__free(vmlinux_btf);
 }
 
 static void test_btf_type_tag_vmlinux_user(void)
 {
 	struct btf_type_tag_user *skel;
-	struct btf *vmlinux_btf;
-	__s32 type_id;
+	struct btf *vmlinux_btf = NULL;
 	int err;
 
-	/* skip the test if the vmlinux does not have __user tags */
-	vmlinux_btf = btf__load_vmlinux_btf();
-	if (!ASSERT_OK_PTR(vmlinux_btf, "could not load vmlinux BTF"))
+	if (load_btfs(&vmlinux_btf, NULL, /*needs_vmlinux_tag=*/true))
 		return;
 
-	type_id = btf__find_by_name_kind(vmlinux_btf, "user", BTF_KIND_TYPE_TAG);
-	if (type_id <= 0) {
-		printf("%s:SKIP: btf_type_tag attribute not in vmlinux btf", __func__);
-		test__skip();
-		goto free_vmlinux_btf;
-	}
-
 	skel = btf_type_tag_user__open();
 	if (!ASSERT_OK_PTR(skel, "btf_type_tag_user"))
-		goto free_vmlinux_btf;
+		goto cleanup;
 
 	bpf_program__set_autoload(skel->progs.test_user2, false);
 	bpf_program__set_autoload(skel->progs.test_user1, false);
@@ -124,7 +157,70 @@ static void test_btf_type_tag_vmlinux_user(void)
 
 	btf_type_tag_user__destroy(skel);
 
-free_vmlinux_btf:
+cleanup:
 	btf__free(vmlinux_btf);
 }
 
+static void test_btf_type_tag_mod_percpu(bool load_test_percpu1)
+{
+	struct btf *vmlinux_btf, *module_btf;
+	struct btf_type_tag_percpu *skel;
+	int err;
+
+	if (load_btfs(&vmlinux_btf, &module_btf, /*needs_vmlinux_tag=*/false))
+		return;
+
+	skel = btf_type_tag_percpu__open();
+	if (!ASSERT_OK_PTR(skel, "btf_type_tag_percpu"))
+		goto cleanup;
+
+	bpf_program__set_autoload(skel->progs.test_percpu_load, false);
+	bpf_program__set_autoload(skel->progs.test_percpu_helper, false);
+	if (load_test_percpu1)
+		bpf_program__set_autoload(skel->progs.test_percpu2, false);
+	else
+		bpf_program__set_autoload(skel->progs.test_percpu1, false);
+
+	err = btf_type_tag_percpu__load(skel);
+	ASSERT_ERR(err, "btf_type_tag_percpu");
+
+	btf_type_tag_percpu__destroy(skel);
+
+cleanup:
+	btf__free(module_btf);
+	btf__free(vmlinux_btf);
+}
+
+static void test_btf_type_tag_vmlinux_percpu(bool load_test)
+{
+	struct btf_type_tag_percpu *skel;
+	struct btf *vmlinux_btf = NULL;
+	int err;
+
+	if (load_btfs(&vmlinux_btf, NULL, /*needs_vmlinux_tag=*/true))
+		return;
+
+	skel = btf_type_tag_percpu__open();
+	if (!ASSERT_OK_PTR(skel, "btf_type_tag_percpu"))
+		goto cleanup;
+
+	bpf_program__set_autoload(skel->progs.test_percpu2, false);
+	bpf_program__set_autoload(skel->progs.test_percpu1, false);
+	if (load_test) {
+		bpf_program__set_autoload(skel->progs.test_percpu_helper, false);
+
+		err = btf_type_tag_percpu__load(skel);
+		ASSERT_ERR(err, "btf_type_tag_percpu_load");
+	} else {
+		bpf_program__set_autoload(skel->progs.test_percpu_load, false);
+
+		err = btf_type_tag_percpu__load(skel);
+		ASSERT_OK(err, "btf_type_tag_percpu_helper");
+	}
+
+	btf_type_tag_percpu__destroy(skel);
+
+cleanup:
+	btf__free(vmlinux_btf);
+}
+
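The test_percpu* programs loaded here live in the BPF object behind btf_type_tag_percpu.skel.h, which this hunk does not show. The behaviour under test is that a pointer tagged "percpu" in BTF cannot be dereferenced directly and has to be converted with bpf_per_cpu_ptr() or bpf_this_cpu_ptr() first. The following is an illustrative sketch of that pattern only, not the actual selftest program; the tracepoint and the cgrp->rstat_cpu field are assumptions chosen as a convenient __percpu pointer to demonstrate on:

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

char _license[] SEC("license") = "GPL";

SEC("tp_btf/cgroup_mkdir")
int BPF_PROG(percpu_tag_sketch, struct cgroup *cgrp, const char *path)
{
	struct cgroup_rstat_cpu *rstat;

	/* cgrp->rstat_cpu is tagged __percpu in vmlinux BTF, so it cannot be
	 * dereferenced directly; convert it to a plain pointer first.
	 */
	rstat = bpf_per_cpu_ptr(cgrp->rstat_cpu, 0);
	if (!rstat)
		return 0;

	/* rstat now points at CPU 0's copy and may be read normally. */
	return 0;
}

A variant that dereferenced cgrp->rstat_cpu directly would be expected to fail verification, which is what the ASSERT_ERR() cases above check.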
@@ -134,10 +230,20 @@ void test_btf_tag(void)
 		test_btf_decl_tag();
 	if (test__start_subtest("btf_type_tag"))
 		test_btf_type_tag();
+
 	if (test__start_subtest("btf_type_tag_user_mod1"))
 		test_btf_type_tag_mod_user(true);
 	if (test__start_subtest("btf_type_tag_user_mod2"))
 		test_btf_type_tag_mod_user(false);
 	if (test__start_subtest("btf_type_tag_sys_user_vmlinux"))
 		test_btf_type_tag_vmlinux_user();
+
+	if (test__start_subtest("btf_type_tag_percpu_mod1"))
+		test_btf_type_tag_mod_percpu(true);
+	if (test__start_subtest("btf_type_tag_percpu_mod2"))
+		test_btf_type_tag_mod_percpu(false);
+	if (test__start_subtest("btf_type_tag_percpu_vmlinux_load"))
+		test_btf_type_tag_vmlinux_percpu(true);
+	if (test__start_subtest("btf_type_tag_percpu_vmlinux_helper"))
+		test_btf_type_tag_vmlinux_percpu(false);
 }
Some files were not shown because too many files have changed in this diff.