mirror of
https://github.com/torvalds/linux.git
synced 2024-12-04 01:51:34 +00:00
16b7c970cc
Add docs on extended 64-bit immediate instructions, including six instructions previously undocumented. Include a brief description of maps and variables, as used by those instructions. V1 -> V2: rebased on top of latest master V2 -> V3: addressed comments from Alexei V3 -> V4: addressed comments from David Vernet Signed-off-by: Dave Thaler <dthaler@microsoft.com> Link: https://lore.kernel.org/r/20230326054946.2331-1-dthaler1968@googlemail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
84 lines
3.2 KiB
ReStructuredText
84 lines
3.2 KiB
ReStructuredText
.. contents::
|
|
.. sectnum::
|
|
|
|
==========================
|
|
Linux implementation notes
|
|
==========================
|
|
|
|
This document provides more details specific to the Linux kernel implementation of the eBPF instruction set.
|
|
|
|
Byte swap instructions
|
|
======================
|
|
|
|
``BPF_FROM_LE`` and ``BPF_FROM_BE`` exist as aliases for ``BPF_TO_LE`` and ``BPF_TO_BE`` respectively.
|
|
|
|
Jump instructions
|
|
=================
|
|
|
|
``BPF_CALL | BPF_X | BPF_JMP`` (0x8d), where the helper function
|
|
integer would be read from a specified register, is not currently supported
|
|
by the verifier. Any programs with this instruction will fail to load
|
|
until such support is added.
|
|
|
|
Maps
|
|
====
|
|
|
|
Linux only supports the 'map_val(map)' operation on array maps with a single element.
|
|
|
|
Linux uses an fd_array to store maps associated with a BPF program. Thus,
|
|
map_by_idx(imm) uses the fd at that index in the array.
|
|
|
|
Variables
|
|
=========
|
|
|
|
The following 64-bit immediate instruction specifies that a variable address,
|
|
which corresponds to some integer stored in the 'imm' field, should be loaded:
|
|
|
|
========================= ====== === ========================================= =========== ==============
|
|
opcode construction opcode src pseudocode imm type dst type
|
|
========================= ====== === ========================================= =========== ==============
|
|
BPF_IMM | BPF_DW | BPF_LD 0x18 0x3 dst = var_addr(imm) variable id data pointer
|
|
========================= ====== === ========================================= =========== ==============
|
|
|
|
On Linux, this integer is a BTF ID.
|
|
|
|
Legacy BPF Packet access instructions
|
|
=====================================
|
|
|
|
As mentioned in the `ISA standard documentation <instruction-set.rst#legacy-bpf-packet-access-instructions>`_,
|
|
Linux has special eBPF instructions for access to packet data that have been
|
|
carried over from classic BPF to retain the performance of legacy socket
|
|
filters running in the eBPF interpreter.
|
|
|
|
The instructions come in two forms: ``BPF_ABS | <size> | BPF_LD`` and
|
|
``BPF_IND | <size> | BPF_LD``.
|
|
|
|
These instructions are used to access packet data and can only be used when
|
|
the program context is a pointer to a networking packet. ``BPF_ABS``
|
|
accesses packet data at an absolute offset specified by the immediate data
|
|
and ``BPF_IND`` access packet data at an offset that includes the value of
|
|
a register in addition to the immediate data.
|
|
|
|
These instructions have seven implicit operands:
|
|
|
|
* Register R6 is an implicit input that must contain a pointer to a
|
|
struct sk_buff.
|
|
* Register R0 is an implicit output which contains the data fetched from
|
|
the packet.
|
|
* Registers R1-R5 are scratch registers that are clobbered by the
|
|
instruction.
|
|
|
|
These instructions have an implicit program exit condition as well. If an
|
|
eBPF program attempts access data beyond the packet boundary, the
|
|
program execution will be aborted.
|
|
|
|
``BPF_ABS | BPF_W | BPF_LD`` (0x20) means::
|
|
|
|
R0 = ntohl(*(u32 *) ((struct sk_buff *) R6->data + imm))
|
|
|
|
where ``ntohl()`` converts a 32-bit value from network byte order to host byte order.
|
|
|
|
``BPF_IND | BPF_W | BPF_LD`` (0x40) means::
|
|
|
|
R0 = ntohl(*(u32 *) ((struct sk_buff *) R6->data + src + imm))
|