flow_dissector: rst'ify documentation
Rename bpf_flow_dissector.txt to bpf_flow_dissector.rst and fix
formatting. Also, link it from the Documentation/networking/index.rst.
Tested with 'make htmldocs' to make sure it looks reasonable.
Fixes: ae82899bbe ("flow_dissector: document BPF flow dissector environment")
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
parent
a090dbf25c
commit
5eed789862
126
Documentation/networking/bpf_flow_dissector.rst
Normal file
@@ -0,0 +1,126 @@
.. SPDX-License-Identifier: GPL-2.0

==================
BPF Flow Dissector
==================

Overview
========

Flow dissector is a routine that parses metadata out of packets. It's
used in various places in the networking subsystem (RFS, flow hash, etc.).

BPF flow dissector is an attempt to reimplement C-based flow dissector logic
in BPF to gain all the benefits of the BPF verifier (namely, limits on the
number of instructions and tail calls).

API
===

BPF flow dissector programs operate on an ``__sk_buff``. However, only a
limited set of fields is allowed: ``data``, ``data_end`` and ``flow_keys``.
``flow_keys`` is ``struct bpf_flow_keys`` and contains flow dissector input
and output arguments.

The inputs are:
  * ``nhoff`` - initial offset of the networking header
  * ``thoff`` - initial offset of the transport header, initialized to nhoff
  * ``n_proto`` - L3 protocol type, parsed out of L2 header

Flow dissector BPF program should fill out the rest of the ``struct
bpf_flow_keys`` fields. Input arguments ``nhoff/thoff/n_proto`` should
also be adjusted accordingly.

The return code of the BPF program is either BPF_OK to indicate successful
dissection, or BPF_DROP to indicate parsing error.
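
The input/output contract above can be illustrated with a plain userspace
sketch. This is not BPF code and not the kernel's API: ``struct
flow_keys_sketch``, ``SKETCH_OK``, ``SKETCH_DROP`` and
``dissect_ipv4_sketch()`` are hypothetical names, ``n_proto`` is kept in host
byte order for simplicity, and only plain IPv4 is handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for BPF_OK / BPF_DROP; values are illustrative. */
enum sketch_ret { SKETCH_OK = 0, SKETCH_DROP = 2 };

/* Hypothetical subset of the fields described above (host byte order). */
struct flow_keys_sketch {
	uint16_t nhoff;    /* in: offset of the networking header */
	uint16_t thoff;    /* in: equals nhoff; out: transport header offset */
	uint16_t n_proto;  /* in: L3 protocol type taken from the L2 header */
	uint8_t  ip_proto; /* out: L4 protocol number */
};

/* Dissect an IPv4 header located at data + keys->nhoff. */
static int dissect_ipv4_sketch(const uint8_t *data, const uint8_t *data_end,
			       struct flow_keys_sketch *keys)
{
	const uint8_t *iph;
	size_t len, ihl;

	assert(data && data_end && data <= data_end && keys);
	len = (size_t)(data_end - data);

	if (keys->n_proto != 0x0800)            /* only IPv4 in this sketch */
		return SKETCH_DROP;
	if ((size_t)keys->nhoff + 20 > len)     /* minimal header must fit */
		return SKETCH_DROP;

	iph = data + keys->nhoff;
	ihl = (size_t)(iph[0] & 0x0f) * 4;      /* header length in bytes */
	if ((iph[0] >> 4) != 4 || ihl < 20 || (size_t)keys->nhoff + ihl > len)
		return SKETCH_DROP;

	keys->ip_proto = iph[9];                /* output: L4 protocol */
	keys->thoff = (uint16_t)(keys->nhoff + ihl); /* adjust thoff past L3 */
	return SKETCH_OK;
}
```

Every access is bounds-checked against ``data_end`` before the header is
read, which mirrors the kind of check the BPF verifier insists on; a
truncated packet yields the drop code rather than a partial result.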

__sk_buff->data
===============

In the VLAN-less case, this is what the initial state of the BPF flow
dissector looks like::

  +------+------+------------+-----------+
  | DMAC | SMAC | ETHER_TYPE | L3_HEADER |
  +------+------+------------+-----------+
                              ^
                              |
                              +-- flow dissector starts here


.. code:: c

  skb->data + flow_keys->nhoff point to the first byte of L3_HEADER
  flow_keys->thoff = nhoff
  flow_keys->n_proto = ETHER_TYPE
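
As a rough userspace illustration of this initial state, the sketch below
derives the three inputs from a raw frame. It assumes, purely for
illustration, that the buffer begins at DMAC; in the kernel the numeric value
of ``nhoff`` depends on where the data pointer starts. ``struct
initial_state_sketch`` and ``init_vlanless_state()`` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HDR_LEN 14 /* DMAC (6) + SMAC (6) + ETHER_TYPE (2) */

/* Hypothetical holder for the three inputs (host byte order). */
struct initial_state_sketch {
	uint16_t nhoff;
	uint16_t thoff;
	uint16_t n_proto;
};

/* Returns 0 on success, -1 if the frame is shorter than an Ethernet header. */
static int init_vlanless_state(const uint8_t *frame, size_t len,
			       struct initial_state_sketch *st)
{
	assert(frame && st);

	if (len < ETH_HDR_LEN)
		return -1;

	st->nhoff = ETH_HDR_LEN;   /* first byte of L3_HEADER */
	st->thoff = st->nhoff;     /* thoff starts out equal to nhoff */
	st->n_proto = (uint16_t)((frame[12] << 8) | frame[13]); /* ETHER_TYPE */
	return 0;
}
```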

In case of VLAN, flow dissector can be called in two different states.

Pre-VLAN parsing::

  +------+------+------+-----+-----------+-----------+
  | DMAC | SMAC | TPID | TCI |ETHER_TYPE | L3_HEADER |
  +------+------+------+-----+-----------+-----------+
                        ^
                        |
                        +-- flow dissector starts here

.. code:: c

  skb->data + flow_keys->nhoff point to the first byte of TCI
  flow_keys->thoff = nhoff
  flow_keys->n_proto = TPID

Please note that TPID can be 802.1AD and, hence, BPF program would
have to parse VLAN information twice for double tagged packets.
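
A minimal userspace sketch of the pre-VLAN case follows. It starts with
``nhoff`` at the first byte of TCI and ``n_proto`` set to the TPID, then skips
at most two tags. ``struct vlan_state_sketch`` and ``skip_vlan_tags()`` are
hypothetical names, values are in host byte order, and the small bounded loop
is a userspace convenience that a BPF program may have to unroll.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TPID_8021Q    0x8100
#define TPID_8021AD   0x88a8
#define VLAN_TAG_REST 4 /* TCI (2) + encapsulated EtherType (2) */

/* Hypothetical holder for the three inputs (host byte order). */
struct vlan_state_sketch {
	uint16_t nhoff;
	uint16_t thoff;
	uint16_t n_proto;
};

static int is_vlan_tpid(uint16_t proto)
{
	return proto == TPID_8021Q || proto == TPID_8021AD;
}

/* Returns 0 on success, -1 on a truncated frame or more than two tags. */
static int skip_vlan_tags(const uint8_t *data, size_t len,
			  struct vlan_state_sketch *st)
{
	int i;

	assert(data && st);

	/* At most two passes: outer 802.1AD tag, then inner 802.1Q tag. */
	for (i = 0; i < 2 && is_vlan_tpid(st->n_proto); i++) {
		if ((size_t)st->nhoff + VLAN_TAG_REST > len)
			return -1;
		/* The EtherType that follows TCI becomes the new n_proto. */
		st->n_proto = (uint16_t)((data[st->nhoff + 2] << 8) |
					 data[st->nhoff + 3]);
		st->nhoff += VLAN_TAG_REST;
	}

	if (is_vlan_tpid(st->n_proto))
		return -1;

	st->thoff = st->nhoff; /* now pointing at L3_HEADER */
	return 0;
}
```

When ``n_proto`` is not a VLAN TPID on entry, the loop body never runs and the
state is left as it was, so the same routine also covers the VLAN-less and
post-VLAN states.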


Post-VLAN parsing::

  +------+------+------+-----+-----------+-----------+
  | DMAC | SMAC | TPID | TCI |ETHER_TYPE | L3_HEADER |
  +------+------+------+-----+-----------+-----------+
                                          ^
                                          |
                                          +-- flow dissector starts here

.. code:: c

  skb->data + flow_keys->nhoff point to the first byte of L3_HEADER
  flow_keys->thoff = nhoff
  flow_keys->n_proto = ETHER_TYPE

In this case VLAN information has been processed before the flow dissector
and BPF flow dissector is not required to handle it.


The takeaway here is as follows: BPF flow dissector program can be called with
the optional VLAN header and should gracefully handle both cases: when single
or double VLAN is present and when it is not present. The same program
can be called for both cases and would have to be written carefully to
handle both cases.


Reference Implementation
========================

See ``tools/testing/selftests/bpf/progs/bpf_flow.c`` for the reference
implementation and ``tools/testing/selftests/bpf/flow_dissector_load.[hc]``
for the loader. bpftool can be used to load BPF flow dissector program as well.

The reference implementation is organized as follows:
  * ``jmp_table`` map that contains sub-programs for each supported L3 protocol
  * ``_dissect`` routine - entry point; it does input ``n_proto`` parsing and
    does ``bpf_tail_call`` to the appropriate L3 handler

Since BPF at this point doesn't support looping (or any jumping back),
jmp_table is used instead to handle multiple levels of encapsulation (and
IPv6 options).
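
The dispatch pattern can be approximated in userspace with a table of function
pointers. The sketch below is not the reference implementation: all names
(``jmp_table_sketch``, ``dissect_sketch()``, ``MAX_CALLS_SKETCH`` and so on)
are hypothetical, the driver loop merely stands in for the tail-call
mechanism, the call bound is an arbitrary illustrative value, and IPv6
extension headers are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CALLS_SKETCH 8 /* illustrative bound, not the kernel's limit */

enum next_step { NEXT_IPV4 = 0, NEXT_IPV6 = 1, NEXT_DONE = 2, NEXT_DROP = 3 };

struct dissect_ctx {
	const uint8_t *data;
	size_t len;
	size_t nhoff;     /* start of the innermost network header seen */
	size_t thoff;     /* running cursor; ends at the transport header */
	uint8_t ip_proto;
};

typedef enum next_step (*handler_fn)(struct dissect_ctx *ctx);

/* Decide which handler comes next for IP-in-IP style encapsulation. */
static enum next_step next_for_ip_proto(uint8_t proto)
{
	if (proto == 4)   /* IPv4 encapsulated in IP */
		return NEXT_IPV4;
	if (proto == 41)  /* IPv6 encapsulated in IP */
		return NEXT_IPV6;
	return NEXT_DONE;
}

static enum next_step handle_ipv4(struct dissect_ctx *ctx)
{
	const uint8_t *iph;
	size_t ihl;

	if (ctx->thoff + 20 > ctx->len)
		return NEXT_DROP;
	iph = ctx->data + ctx->thoff;
	ihl = (size_t)(iph[0] & 0x0f) * 4;
	if (ihl < 20 || ctx->thoff + ihl > ctx->len)
		return NEXT_DROP;
	ctx->nhoff = ctx->thoff;
	ctx->ip_proto = iph[9];
	ctx->thoff += ihl;
	return next_for_ip_proto(ctx->ip_proto);
}

static enum next_step handle_ipv6(struct dissect_ctx *ctx)
{
	if (ctx->thoff + 40 > ctx->len)
		return NEXT_DROP;
	ctx->nhoff = ctx->thoff;
	ctx->ip_proto = ctx->data[ctx->thoff + 6]; /* next header field */
	ctx->thoff += 40;                          /* fixed IPv6 header */
	return next_for_ip_proto(ctx->ip_proto);
}

/* One slot per supported L3 protocol, indexed by enum next_step. */
static const handler_fn jmp_table_sketch[] = { handle_ipv4, handle_ipv6 };

/* Returns 0 on successful dissection, -1 on error or too many hops. */
static int dissect_sketch(struct dissect_ctx *ctx, enum next_step first)
{
	enum next_step step = first;
	int calls;

	assert(ctx && ctx->data);

	for (calls = 0; calls < MAX_CALLS_SKETCH; calls++) {
		if (step == NEXT_DONE)
			return 0;
		if (step == NEXT_DROP)
			return -1;
		step = jmp_table_sketch[step](ctx);
	}
	return step == NEXT_DONE ? 0 : -1;
}
```

Each handler parses exactly one header and names its successor, so nested
encapsulation is handled by repeated dispatch, and the call bound guarantees
termination on malformed input.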


Current Limitations
===================
BPF flow dissector doesn't support exporting all the metadata that in-kernel
C-based implementation can export. Notable example is single VLAN (802.1Q)
and double VLAN (802.1AD) tags. Please refer to the ``struct bpf_flow_keys``
for the set of information that can currently be exported from the BPF context.
@@ -1,115 +0,0 @@
==================
BPF Flow Dissector
==================

Overview
========

Flow dissector is a routine that parses metadata out of the packets. It's
used in the various places in the networking subsystem (RFS, flow hash, etc).

BPF flow dissector is an attempt to reimplement C-based flow dissector logic
in BPF to gain all the benefits of BPF verifier (namely, limits on the
number of instructions and tail calls).

API
===

BPF flow dissector programs operate on an __sk_buff. However, only the
limited set of fields is allowed: data, data_end and flow_keys. flow_keys
is 'struct bpf_flow_keys' and contains flow dissector input and
output arguments.

The inputs are:
  * nhoff - initial offset of the networking header
  * thoff - initial offset of the transport header, initialized to nhoff
  * n_proto - L3 protocol type, parsed out of L2 header

Flow dissector BPF program should fill out the rest of the 'struct
bpf_flow_keys' fields. Input arguments nhoff/thoff/n_proto should be also
adjusted accordingly.

The return code of the BPF program is either BPF_OK to indicate successful
dissection, or BPF_DROP to indicate parsing error.

__sk_buff->data
===============

In the VLAN-less case, this is what the initial state of the BPF flow
dissector looks like:
+------+------+------------+-----------+
| DMAC | SMAC | ETHER_TYPE | L3_HEADER |
+------+------+------------+-----------+
                            ^
                            |
                            +-- flow dissector starts here

skb->data + flow_keys->nhoff point to the first byte of L3_HEADER.
flow_keys->thoff = nhoff
flow_keys->n_proto = ETHER_TYPE


In case of VLAN, flow dissector can be called with the two different states.

Pre-VLAN parsing:
+------+------+------+-----+-----------+-----------+
| DMAC | SMAC | TPID | TCI |ETHER_TYPE | L3_HEADER |
+------+------+------+-----+-----------+-----------+
                      ^
                      |
                      +-- flow dissector starts here

skb->data + flow_keys->nhoff point the to first byte of TCI.
flow_keys->thoff = nhoff
flow_keys->n_proto = TPID

Please note that TPID can be 802.1AD and, hence, BPF program would
have to parse VLAN information twice for double tagged packets.


Post-VLAN parsing:
+------+------+------+-----+-----------+-----------+
| DMAC | SMAC | TPID | TCI |ETHER_TYPE | L3_HEADER |
+------+------+------+-----+-----------+-----------+
                                        ^
                                        |
                                        +-- flow dissector starts here

skb->data + flow_keys->nhoff point the to first byte of L3_HEADER.
flow_keys->thoff = nhoff
flow_keys->n_proto = ETHER_TYPE

In this case VLAN information has been processed before the flow dissector
and BPF flow dissector is not required to handle it.


The takeaway here is as follows: BPF flow dissector program can be called with
the optional VLAN header and should gracefully handle both cases: when single
or double VLAN is present and when it is not present. The same program
can be called for both cases and would have to be written carefully to
handle both cases.


Reference Implementation
========================

See tools/testing/selftests/bpf/progs/bpf_flow.c for the reference
implementation and tools/testing/selftests/bpf/flow_dissector_load.[hc] for
the loader. bpftool can be used to load BPF flow dissector program as well.

The reference implementation is organized as follows:
  * jmp_table map that contains sub-programs for each supported L3 protocol
  * _dissect routine - entry point; it does input n_proto parsing and does
    bpf_tail_call to the appropriate L3 handler

Since BPF at this point doesn't support looping (or any jumping back),
jmp_table is used instead to handle multiple levels of encapsulation (and
IPv6 options).


Current Limitations
===================
BPF flow dissector doesn't support exporting all the metadata that in-kernel
C-based implementation can export. Notable example is single VLAN (802.1Q)
and double VLAN (802.1AD) tags. Please refer to the 'struct bpf_flow_keys'
for a set of information that's currently can be exported from the BPF context.
@@ -9,6 +9,7 @@ Contents:
    netdev-FAQ
    af_xdp
    batman-adv
+   bpf_flow_dissector
    can
    can_ucan_protocol
    device_drivers/freescale/dpaa2/index