linux/Documentation
David S. Miller 94810bd365 mlx5-updates-2019-09-01 (Software steering support)
Abstract:
 --------
 Mellanox ConnetX devices supports packet matching, packet modification and
 redirection. These functionalities are also referred to as flow-steering.
 To configure a steering rule, the rule is written to the device owned
 memory, this memory is accessed and cached by the device when processing
 a packet.
 Steering rules are constructed from multiple steering entries (STE).
 
 Rules are configured using the Firmware command interface. The Firmware
 processes the given driver command and translates them to STEs, then
 writes them to the device memory in the current steering tables.
 This process is slow due to the architecture of the command interface and
 the processing complexity of each rule.
 
 The highlight of this patchset is to cut the middle man (The firmware) and
 do steering rules programming into device directly from the driver, with
 no firmware intervention whatsoever.
 
 Motivation:
 -----------
 Software (driver managed) steering allows for high rule insertion rates
 compared to the FW steering described above, this is achieved by using
 internal RDMA writes to the device owned memory instead of the slow
 command interface to program steering rules.
 
 Software (driver managed) steering, doesn't depend on new FW
 for new steering functionality, new implementations can be done in the
 driver skipping the FW layer.
 
 Performance:
 ------------
 The insertion rate on a single core using the new approach allows
 programming ~300K rules per sec. (Done via direct raw test to the new mlx5
 sw steering layer, without any kernel layer involved).
 
 Test: TC L2 rules
 33K/s with Software steering (this patchset).
 5K/s  with FW and current driver.
 This will improve OVS based solution performance.
 
 Architecture and implementation details:
 ----------------------------------------
 Software steering will be dynamically selected via devlink device
 parameter. Example:
 $ devlink dev param show pci/0000:06:00.0 name flow_steering_mode
           pci/0000:06:00.0:
           name flow_steering_mode type driver-specific
           values:
              cmode runtime value smfs
 
 mlx5 software steering module a.k.a (DR - Direct Rule) is implemented
 and contained in mlx5/core/steering directory and controlled by
 MLX5_SW_STEERING kconfig flag.
 
 mlx5 core steering layer (fs_core) already provides a shim layer for
 implementing different steering mechanisms, software steering will
 leverage that as seen at the end of this series.
 
 When Software Steering for a specific steering domain
 (NIC/RDMA/Vport/ESwitch, etc ..) is supported, it will cause rules
 targeting this domain to be created using  SW steering instead of FW.
 
 The implementation includes:
 Domain - The steering domain is the object that all other object resides
     in. It holds the memory allocator, send engine, locks and other shared
     data needed by lower objects such as table, matcher, rule, action.
     Each domain can contain multiple tables. Domain is equivalent to
     namespaces e.g (NIC/RDMA/Vport/ESwitch, etc ..) as implemented
     currently in mlx5_core fs_core (flow steering core).
 
 Table - Table objects are used for holding multiple matchers, each table
     has a level used to prevent processing loops. Packets are being
     directed to this table once it is set as the root table, this is done
     by fs_core using a FW command. A packet is being processed inside the
     table matcher by matcher until a successful hit, otherwise the packet
     will perform the default action.
 
 Matcher - Matchers objects are used to specify the fields mask for
     matching when processing a packet. A matcher belongs to a table, each
     matcher can hold multiple rules, each rule with different matching
     values corresponding to the matcher mask. Each matcher has a priority
     used for rule processing order inside the table.
 
 Action - Action objects are created to specify different steering actions
     such as count, reformat (encapsulate, decapsulate, ...), modify
     header, forward to table and many other actions. When creating a rule
     a sequence of actions can be provided to be executed on a successful
     match.
 
 Rule - Rule objects are used to specify a specific match on packets as
     well as the actions that should be executed. A rule belongs to a
     matcher.
 
 STE - This layer is used to hold the specific STE format for the device
     and to convert the requested rule to STEs. Each rule is constructed of
     an STE chain, Multiple rules construct a steering graph. Each node in
     the graph is a hash table containing multiple STEs. The index of each
     STE in the hash table is being calculated using a CRC32 hash function.
 
 Memory pool - Used for managing and caching device owned memory for rule
     insertion. The memory is being allocated using DM (device memory) API.
 
 Communication with device - layer for standard RDMA operation using  RC QP
     to configure the device steering.
 
 Command utility - This module holds all of the FW commands that are
     required for SW steering to function.
 
 Patch planning and files:
 -------------------------
 1) First patch, adds the support to Add flow steering actions to fs_cmd
 shim layer.
 
 2) Next 12 patch will add a file per each Software steering
 functionality/module as described above. (See patches with title: DR, *)
 
 3) Add CONFIG_MLX5_SW_STEERING for software steering support and enable
 build with the new files
 
 4) Next two patches will add the support for software steering in mlx5
 steering shim layer
 net/mlx5: Add API to set the namespace steering mode
 net/mlx5: Add direct rule fs_cmd implementation
 
 5) Last two patches will add the new devlink parameter to select mlx5
 steering mode, will be valid only for switchdev mode for now.
 Two modes are supported:
     1. DMFS - Device managed flow steering
     2. SMFS - Software/Driver managed flow steering.
 
     In the DMFS mode, the HW steering entities are created through the
     FW. In the SMFS mode this entities are created though the driver
     directly.
 
     The driver will use the devlink steering mode only if the steering
     domain supports it, for now SMFS will manages only the switchdev
     eswitch steering domain.
 
     User command examples:
     - Set SMFS flow steering mode::
 
         $ devlink dev param set pci/0000:06:00.0 name flow_steering_mode value "smfs" cmode runtime
 
     - Read device flow steering mode::
 
         $ devlink dev param show pci/0000:06:00.0 name flow_steering_mode
           pci/0000:06:00.0:
           name flow_steering_mode type driver-specific
           values:
              cmode runtime value smfs
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAl1uxPAACgkQSD+KveBX
 +j5AkggAymoYqG2G+s8cLa4vQFySaD1Td3VzzWglp7PlpDBE3UcSoMAMg/gIDU1D
 8F04PeCsJ6snt1ICk56vPNyAEHWfWeBUd56+QK5lEJBuwozyFvBh6HP81Bnr6T/n
 n6uTx45ljAFQPTHJjEOLBPSzEXecLu07+mvpzSoW0F3ehfGbELhL1IkVobr/RELx
 z4xZW9uM2vm5ylheWvjf4V1S/SvokgJazW9+4fh//rl8tfXgun5IfPoS0hqKie1/
 h5sjcMSYkYR4gLVqrhKmBYHmHVl/h0TYROckW8iC/+XX7ailSo9uPG7lPa6cm+GE
 7Bajlbz4oD/K5RWoByo+q+dmyjeVhQ==
 =M9bS
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-updates-2019-09-01-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2019-09-01  (Software steering support)

Abstract:
--------
Mellanox ConnetX devices supports packet matching, packet modification and
redirection. These functionalities are also referred to as flow-steering.
To configure a steering rule, the rule is written to the device owned
memory, this memory is accessed and cached by the device when processing
a packet.
Steering rules are constructed from multiple steering entries (STE).

Rules are configured using the Firmware command interface. The Firmware
processes the given driver command and translates them to STEs, then
writes them to the device memory in the current steering tables.
This process is slow due to the architecture of the command interface and
the processing complexity of each rule.

The highlight of this patchset is to cut the middle man (The firmware) and
do steering rules programming into device directly from the driver, with
no firmware intervention whatsoever.

Motivation:
-----------
Software (driver managed) steering allows for high rule insertion rates
compared to the FW steering described above, this is achieved by using
internal RDMA writes to the device owned memory instead of the slow
command interface to program steering rules.

Software (driver managed) steering, doesn't depend on new FW
for new steering functionality, new implementations can be done in the
driver skipping the FW layer.

Performance:
------------
The insertion rate on a single core using the new approach allows
programming ~300K rules per sec. (Done via direct raw test to the new mlx5
sw steering layer, without any kernel layer involved).

Test: TC L2 rules
33K/s with Software steering (this patchset).
5K/s  with FW and current driver.
This will improve OVS based solution performance.

Architecture and implementation details:
----------------------------------------
Software steering will be dynamically selected via devlink device
parameter. Example:
$ devlink dev param show pci/0000:06:00.0 name flow_steering_mode
          pci/0000:06:00.0:
          name flow_steering_mode type driver-specific
          values:
             cmode runtime value smfs

mlx5 software steering module a.k.a (DR - Direct Rule) is implemented
and contained in mlx5/core/steering directory and controlled by
MLX5_SW_STEERING kconfig flag.

mlx5 core steering layer (fs_core) already provides a shim layer for
implementing different steering mechanisms, software steering will
leverage that as seen at the end of this series.

When Software Steering for a specific steering domain
(NIC/RDMA/Vport/ESwitch, etc ..) is supported, it will cause rules
targeting this domain to be created using  SW steering instead of FW.

The implementation includes:
Domain - The steering domain is the object that all other object resides
    in. It holds the memory allocator, send engine, locks and other shared
    data needed by lower objects such as table, matcher, rule, action.
    Each domain can contain multiple tables. Domain is equivalent to
    namespaces e.g (NIC/RDMA/Vport/ESwitch, etc ..) as implemented
    currently in mlx5_core fs_core (flow steering core).

Table - Table objects are used for holding multiple matchers, each table
    has a level used to prevent processing loops. Packets are being
    directed to this table once it is set as the root table, this is done
    by fs_core using a FW command. A packet is being processed inside the
    table matcher by matcher until a successful hit, otherwise the packet
    will perform the default action.

Matcher - Matchers objects are used to specify the fields mask for
    matching when processing a packet. A matcher belongs to a table, each
    matcher can hold multiple rules, each rule with different matching
    values corresponding to the matcher mask. Each matcher has a priority
    used for rule processing order inside the table.

Action - Action objects are created to specify different steering actions
    such as count, reformat (encapsulate, decapsulate, ...), modify
    header, forward to table and many other actions. When creating a rule
    a sequence of actions can be provided to be executed on a successful
    match.

Rule - Rule objects are used to specify a specific match on packets as
    well as the actions that should be executed. A rule belongs to a
    matcher.

STE - This layer is used to hold the specific STE format for the device
    and to convert the requested rule to STEs. Each rule is constructed of
    an STE chain, Multiple rules construct a steering graph. Each node in
    the graph is a hash table containing multiple STEs. The index of each
    STE in the hash table is being calculated using a CRC32 hash function.

Memory pool - Used for managing and caching device owned memory for rule
    insertion. The memory is being allocated using DM (device memory) API.

Communication with device - layer for standard RDMA operation using  RC QP
    to configure the device steering.

Command utility - This module holds all of the FW commands that are
    required for SW steering to function.

Patch planning and files:
-------------------------
1) First patch, adds the support to Add flow steering actions to fs_cmd
shim layer.

2) Next 12 patch will add a file per each Software steering
functionality/module as described above. (See patches with title: DR, *)

3) Add CONFIG_MLX5_SW_STEERING for software steering support and enable
build with the new files

4) Next two patches will add the support for software steering in mlx5
steering shim layer
net/mlx5: Add API to set the namespace steering mode
net/mlx5: Add direct rule fs_cmd implementation

5) Last two patches will add the new devlink parameter to select mlx5
steering mode, will be valid only for switchdev mode for now.
Two modes are supported:
    1. DMFS - Device managed flow steering
    2. SMFS - Software/Driver managed flow steering.

    In the DMFS mode, the HW steering entities are created through the
    FW. In the SMFS mode this entities are created though the driver
    directly.

    The driver will use the devlink steering mode only if the steering
    domain supports it, for now SMFS will manages only the switchdev
    eswitch steering domain.

    User command examples:
    - Set SMFS flow steering mode::

        $ devlink dev param set pci/0000:06:00.0 name flow_steering_mode value "smfs" cmode runtime

    - Read device flow steering mode::

        $ devlink dev param show pci/0000:06:00.0 name flow_steering_mode
          pci/0000:06:00.0:
          name flow_steering_mode type driver-specific
          values:
             cmode runtime value smfs
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-03 21:46:13 -07:00
..
ABI btf: rename /sys/kernel/btf/kernel into /sys/kernel/btf/vmlinux 2019-08-13 23:19:42 +02:00
accounting docs: add some documentation dirs to the driver-api book 2019-07-15 11:03:02 -03:00
acpi/dsd
admin-guide Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-08-25 10:10:15 -07:00
arm docs: arm: fix a breakage with pdf output 2019-07-15 11:03:04 -03:00
arm64 docs: add arch doc directories to the index 2019-07-15 11:03:01 -03:00
auxdisplay docs: admin-guide: add a series of orphaned documents 2019-07-15 11:03:02 -03:00
block docs conversion for v5.3-rc1 2019-07-16 12:21:41 -07:00
bpf bpf/flow_dissector: document flags 2019-07-25 18:00:41 -07:00
cdrom docs: add some directories to the main documentation index 2019-07-15 11:03:03 -03:00
core-api docs: remove extra conf.py files 2019-07-17 06:57:52 -03:00
cpu-freq docs: power: convert docs to ReST and rename to *.rst 2019-06-14 16:08:36 -05:00
crypto docs: remove extra conf.py files 2019-07-17 06:57:52 -03:00
dev-tools Merge branch 'pdf_fixes_v1' of https://git.linuxtv.org/mchehab/experimental into mauro 2019-07-22 13:51:20 -06:00
devicetree Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2019-09-02 11:20:17 -07:00
doc-guide docs: remove extra conf.py files 2019-07-17 06:57:52 -03:00
driver-api docs: phy: Drop duplicate 'be made' 2019-07-26 08:15:26 -06:00
EDID docs: driver-api: add a series of orphaned documents 2019-07-15 11:03:02 -03:00
fault-injection docs: add some directories to the main documentation index 2019-07-15 11:03:03 -03:00
fb docs conversion for v5.3-rc1 2019-07-16 12:21:41 -07:00
features Documentation/stackprotector: powerpc supports stack protector 2019-06-14 14:44:43 -06:00
filesystems smb3: update TODO list of missing features 2019-08-05 22:50:38 -05:00
firmware_class
firmware-guide docs: gpio: add sysfs interface to the admin-guide 2019-07-15 11:03:03 -03:00
fpga docs: add some directories to the main documentation index 2019-07-15 11:03:03 -03:00
gpu docs: remove extra conf.py files 2019-07-17 06:57:52 -03:00
hid docs: add some documentation dirs to the driver-api book 2019-07-15 11:03:02 -03:00
hwmon hwmon: (k8temp) documentation: update URL of datasheet 2019-07-21 19:18:45 -07:00
i2c Merge branch 'i2c/for-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux 2019-07-15 21:10:39 -07:00
ia64 docs: add SPDX tags to new index files 2019-07-15 11:03:03 -03:00
ide docs: add some directories to the main documentation index 2019-07-15 11:03:03 -03:00
iio docs: add some documentation dirs to the driver-api book 2019-07-15 11:03:02 -03:00
infiniband docs: infiniband: add it to the driver-api bookset 2019-07-08 14:22:56 -03:00
input docs: remove extra conf.py files 2019-07-17 06:57:52 -03:00
ioctl docs: ioctl: add it to the uAPI guide 2019-07-15 09:20:28 -03:00
isdn
kbuild Kbuild updates for v5.3 (2nd) 2019-07-20 09:34:55 -07:00
kernel-hacking docs: remove extra conf.py files 2019-07-17 06:57:52 -03:00
leds docs: leds: add it to the driver-api book 2019-07-15 09:20:28 -03:00
livepatch docs: add some directories to the main documentation index 2019-07-15 11:03:03 -03:00
locking docs: fix broken doc references due to renames 2019-07-17 06:57:51 -03:00
m68k docs: add arch doc directories to the index 2019-07-15 11:03:01 -03:00
maintainer docs: remove extra conf.py files 2019-07-17 06:57:52 -03:00
media docs: remove extra conf.py files 2019-07-17 06:57:52 -03:00
mic docs: driver-api: add remaining converted dirs to it 2019-07-15 11:03:03 -03:00
mips
misc-devices
netlabel docs: add some directories to the main documentation index 2019-07-15 11:03:03 -03:00
networking net/mlx5: Add devlink flow_steering_mode parameter 2019-09-03 12:54:24 -07:00
nios2
openrisc
parisc
PCI Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2019-08-27 14:23:31 -07:00
pcmcia docs: add some directories to the main documentation index 2019-07-15 11:03:03 -03:00
power Merge branch 'pdf_fixes_v1' of https://git.linuxtv.org/mchehab/experimental into mauro 2019-07-22 13:51:20 -06:00
powerpc docs: powerpc: convert docs to ReST and rename to *.rst 2019-07-17 06:57:51 -03:00
process Documentation/process: Embargoed hardware security issues 2019-08-28 22:36:07 +02:00
RCU docs: fix broken doc references due to renames 2019-07-17 06:57:51 -03:00
riscv RISC-V updates for v5.3 2019-07-18 12:26:59 -07:00
s390 Fixes in vfio-ccw for older and newer issues. 2019-07-23 10:44:28 +02:00
scheduler docs conversion for v5.3-rc1 2019-07-16 12:21:41 -07:00
scsi scsi: ufs: Documentation: Announce ufs-tool v1.0 2019-06-26 22:47:51 -04:00
security docs: security: move some books to it and update 2019-07-15 11:03:01 -03:00
sh docs: remove extra conf.py files 2019-07-17 06:57:52 -03:00
sound docs: remove extra conf.py files 2019-07-17 06:57:52 -03:00
sparc docs: add arch doc directories to the index 2019-07-15 11:03:01 -03:00
sphinx docs: load_config.py: ensure subdirs end with "/" 2019-07-19 08:49:27 -03:00
sphinx-static
spi
target docs: add some directories to the main documentation index 2019-07-15 11:03:03 -03:00
thermal docs: thermal: convert to ReST 2019-06-27 21:22:15 +08:00
timers docs: add some directories to the main documentation index 2019-07-15 11:03:03 -03:00
trace The main changes in this release include: 2019-07-18 11:51:00 -07:00
translations doc:it_IT: translations in process/ 2019-07-22 14:47:02 -06:00
usb docs: usb: rename files to .rst and add them to drivers-api 2019-06-20 14:28:36 +02:00
userspace-api docs: remove extra conf.py files 2019-07-17 06:57:52 -03:00
virt This is mostly a set of follow-on fixes from Mauro fixing various fallout 2019-07-26 11:29:24 -07:00
vm HMM patches for 5.3-rc 2019-07-30 12:54:44 -07:00
w1 docs: driver-api: add a series of orphaned documents 2019-07-15 11:03:02 -03:00
watchdog Merge branch 'pdf_fixes_v1' of https://git.linuxtv.org/mchehab/experimental into mauro 2019-07-22 13:51:20 -06:00
wimax
x86 docs: remove extra conf.py files 2019-07-17 06:57:52 -03:00
xtensa docs: add arch doc directories to the index 2019-07-15 11:03:01 -03:00
.gitignore
atomic_bitops.txt
atomic_t.txt Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-07-08 16:12:03 -07:00
bus-virt-phys-mapping.txt
Changes
CodingStyle
conf.py docs: conf.py: only use CJK if the font is available 2019-07-17 06:57:51 -03:00
COPYING-logo docs: logo.txt: rename it to COPYING-logo 2019-07-15 09:20:27 -03:00
crc32.txt
debugging-modules.txt
debugging-via-ohci1394.txt
digsig.txt
DMA-API-HOWTO.txt docs: DMA-API-HOWTO.txt: fix an unmarked code block 2019-07-15 09:20:24 -03:00
DMA-API.txt
DMA-attributes.txt
DMA-ISA-LPC.txt
docutils.conf doc-rst: Add missing newline at end of file 2019-06-20 14:16:56 -06:00
dontdiff kbuild: create *.mod with full directory path and remove MODVERDIR 2019-07-18 02:19:31 +09:00
futex-requeue-pi.txt
hwspinlock.txt hwspinlock: add the 'in_atomic' API 2019-06-29 21:08:14 -07:00
index.rst docs: virtual: add it to the documentation body 2019-07-17 06:57:52 -03:00
io_ordering.txt
io-mapping.txt
IPMI.txt
IRQ-affinity.txt
IRQ-domain.txt
IRQ.txt
irqflags-tracing.txt
Kconfig
kobject.txt
kprobes.txt
kref.txt
logo.gif
lzo.txt
mailbox.txt
Makefile
memory-barriers.txt docs: fix broken doc references due to renames 2019-07-17 06:57:51 -03:00
nommu-mmap.txt
packing.txt
padata.txt
percpu-rw-semaphore.txt
pi-futex.txt docs: locking: convert docs to ReST and rename to *.rst 2019-07-15 08:53:27 -03:00
preempt-locking.txt
rbtree.txt docs: rbtree.txt: fix Sphinx build warnings 2019-07-15 09:20:24 -03:00
remoteproc.txt remoteproc: add vendor resources handling 2019-06-29 12:02:17 -07:00
robust-futex-ABI.txt
robust-futexes.txt
rpmsg.txt
speculation.txt
static-keys.txt
SubmittingPatches
tee.txt
this_cpu_ops.txt
unaligned-memory-access.txt
xz.txt