Add new trap ACL which reports flow action cookie in a metadata. Allow
used to setup the cookie using debugfs file.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
nsim_dev_take_snapshot_write() uses nsim_dev and nsim_dev->dummy_region.
So, during this function, these data shouldn't be removed.
But there is no protecting stuff in this function.
There are two similar cases.
1. reload case
reload could be called during nsim_dev_take_snapshot_write().
When reload is being executed, nsim_dev_reload_down() is called and it
calls nsim_dev_reload_destroy(). nsim_dev_reload_destroy() calls
devlink_region_destroy() to destroy nsim_dev->dummy_region.
So, during nsim_dev_take_snapshot_write(), nsim_dev->dummy_region()
would be removed.
At this point, snapshot_write() would access freed pointer.
In order to fix this case, take_snapshot file will be removed before
devlink_region_destroy().
The take_snapshot file will be re-created by ->reload_up().
2. del_device_store case
del_device_store() also could call nsim_dev_reload_destroy()
during nsim_dev_take_snapshot_write(). If so, panic would occur.
This problem is actually the same problem with the first case.
So, this problem will be fixed by the first case's solution.
Test commands:
modprobe netdevsim
while :
do
echo 1 > /sys/bus/netdevsim/new_device &
echo 1 > /sys/bus/netdevsim/del_device &
devlink dev reload netdevsim/netdevsim1 &
echo 1 > /sys/kernel/debug/netdevsim/netdevsim1/take_snapshot &
done
Splat looks like:
[ 45.564513][ T975] general protection fault, probably for non-canonical address 0xdffffc000000003a: 0000 [#1] SMP DEI
[ 45.566131][ T975] KASAN: null-ptr-deref in range [0x00000000000001d0-0x00000000000001d7]
[ 45.566135][ T975] CPU: 1 PID: 975 Comm: bash Not tainted 5.5.0+ #322
[ 45.569020][ T975] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 45.569026][ T975] RIP: 0010:__mutex_lock+0x10a/0x14b0
[ 45.570518][ T975] Code: 08 84 d2 0f 85 7f 12 00 00 44 8b 0d 10 23 65 02 45 85 c9 75 29 49 8d 7f 68 48 b8 00 00 00 0f
[ 45.570522][ T975] RSP: 0018:ffff888046ccfbf0 EFLAGS: 00010206
[ 45.572305][ T975] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 45.572308][ T975] RDX: 000000000000003a RSI: ffffffffac926440 RDI: 00000000000001d0
[ 45.576843][ T975] RBP: ffff888046ccfd70 R08: ffffffffab610645 R09: 0000000000000000
[ 45.576847][ T975] R10: ffff888046ccfd90 R11: ffffed100d6360ad R12: 0000000000000000
[ 45.578471][ T975] R13: dffffc0000000000 R14: ffffffffae1976c0 R15: 0000000000000168
[ 45.578475][ T975] FS: 00007f614d6e7740(0000) GS:ffff88806c400000(0000) knlGS:0000000000000000
[ 45.581492][ T975] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 45.582942][ T975] CR2: 00005618677d1cf0 CR3: 000000005fb9c002 CR4: 00000000000606e0
[ 45.584543][ T975] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 45.586633][ T975] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 45.589889][ T975] Call Trace:
[ 45.591445][ T975] ? devlink_region_snapshot_create+0x55/0x4a0
[ 45.601250][ T975] ? mutex_lock_io_nested+0x1380/0x1380
[ 45.602817][ T975] ? mutex_lock_io_nested+0x1380/0x1380
[ 45.603875][ T975] ? mark_held_locks+0xa5/0xe0
[ 45.604769][ T975] ? _raw_spin_unlock_irqrestore+0x2d/0x50
[ 45.606147][ T975] ? __mutex_unlock_slowpath+0xd0/0x670
[ 45.607723][ T975] ? crng_backtrack_protect+0x80/0x80
[ 45.613530][ T975] ? wait_for_completion+0x390/0x390
[ 45.615152][ T975] ? devlink_region_snapshot_create+0x55/0x4a0
[ 45.616834][ T975] devlink_region_snapshot_create+0x55/0x4a0
[ ... ]
Fixes: 4418f862d6 ("netdevsim: implement support for devlink region and snapshots")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Implement "empty" and "dummy" reporters. The first one is really simple
and does nothing. The other one has debugfs files to trigger breakage
and it is able to do recovery. The ops also implement dummy fmsg
content.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add flag to disallow reload and another one that causes reload to
always fail.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When user does create new netdevsim instance using sysfs bus file,
create the devlink instance and related netdev instance in the namespace
of the caller.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Register newly created port netdevice into net namespace
that the parent device belongs to.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
During devlink reload, all driver objects should be reinstantiated with
the exception of devlink instance and devlink resources and params.
Move existing devlink_resource_size_get() calls into fib_create() just
before fib notifier is registered. Also, make sure that extack is
propagated down to fib_notifier_register() call.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, the accounting is done per-namespace. However, devlink
instance is always in init_net namespace for now, so only the accounting
related to init_net is used. Limitations set using devlink resources
are only considered for init_net. nsim_devlink_net() always
returns init_net always.
Make the accounting per-device. This brings no functional change.
Per-device accounting has the same values as per-net.
For a single netdevsim instance, the behaviour is exactly the same
as before. When multiple netdevsim instances are created, each
can have different limits.
This is in prepare to implement proper devlink netns support. After
that, the devlink instance which would exist in particular netns would
account and limit that netns.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Have netdevsim register its trap groups and traps with devlink during
initialization and periodically report trapped packets to devlink core.
Since netdevsim is not a real device, the trapped packets are emulated
using a workqueue that periodically reports a UDP packet with a random
5-tuple from each active packet trap and from each running netdev.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement dummy region of size 32K and allow user to create snapshots
or random data using debugfs file trigger.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Register couple of devlink params, one generic, one driver-specific.
Make the values available over debugfs.
Example:
$ echo "111" > /sys/bus/netdevsim/new_device
$ devlink dev param
netdevsim/netdevsim111:
name max_macs type generic
values:
cmode driverinit value 32
name test1 type driver-specific
values:
cmode driverinit value true
$ cat /sys/kernel/debug/netdevsim/netdevsim111/max_macs
32
$ cat /sys/kernel/debug/netdevsim/netdevsim111/test1
Y
$ devlink dev param set netdevsim/netdevsim111 name max_macs cmode driverinit value 16
$ devlink dev param set netdevsim/netdevsim111 name test1 cmode driverinit value false
$ devlink dev reload netdevsim/netdevsim111
$ cat /sys/kernel/debug/netdevsim/netdevsim111/max_macs
16
$ cat /sys/kernel/debug/netdevsim/netdevsim111/test1
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Prior to the commit in the fixes tag, the resource controller in netdevsim
tracked fib entries and rules per network namespace. Restore that behavior.
Fixes: 5fc494225c ("netdevsim: create devlink instance per netdevsim instance")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove the existing way to create netdevsim over rtnetlink and move the
netdev creation/destruction to dev probe, so for every probed port,
a netdevsim-netdev instance is created.
Adjust selftests to work with new interface.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order to test flows in core, it is beneficial to maintain previously
supported possibility to add and delete ports during netdevsim lifetime.
Do it by extending device sysfs attrs by "new_port" and "del_port".
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement netdevsim bus probing of netdevsim devices. For every probed
device create a devlink instance. According to the user-passed value,
create a number of ports represented by devlink port instances.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With the model where dev is represented by devlink and ports are
represented by devlink ports, make debugfs file names independent
on netdev names. Change the topology to the one illustrated
by the following example:
$ ls /sys/kernel/debug/netdevsim/
netdevsim1
$ ls /sys/kernel/debug/netdevsim/netdevsim1/
bpf_bind_accept bpf_bind_verifier_delay bpf_bound_progs ports
$ ls /sys/kernel/debug/netdevsim/netdevsim1/ports/
0 1
$ ls /sys/kernel/debug/netdevsim/netdevsim1/ports/0/
bpf_map_accept bpf_offloaded_id bpf_tc_accept bpf_tc_non_bound_accept bpf_xdpdrv_accept bpf_xdpoffload_accept dev ipsec
$ ls /sys/kernel/debug/netdevsim/netdevsim1/ports/0/dev -l
lrwxrwxrwx 1 root root 0 Apr 13 15:58 /sys/kernel/debug/netdevsim/netdevsim1/ports/0/dev -> ../../../netdevsim1
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current implementation of parent_id/switch_id does not follow the
original idea of being unique. The values are "0", "1", etc. Instead of
that, generate 32 random bytes.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As previously introduce dev which is mapped 1:1 to a bus device covers
the purpose of the original shared device, merge the sdev code into dev.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
These functions are going to be called from bus probe/release(),
therefore make them independent on ns struct and rename accordingly.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a way to add new netdevsim device on netdevsim bus and also to
delete existing netdevsim device from the bus. Track the bus devices
in using a list.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move netdevsim device registration into bus.c and alongside with that
the related sysfs attributes. Introduce new struct nsim_bus_dev to
represent a netdevsim device on netdevsim bus.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As the code related to netdevsim bus is going to get bigger, move the
existing code to a separate file.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The existing devlink.c code is going to be extended to represent asic
device on a bus. As this is about more than just devlink,
rename the file. Do appropriate prefix renaming alongside with that.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently there is one devlink instance created per network namespace.
That is quite odd considering the fact that devlink instance should
represent an ASIC. The following patches are going to move the devlink
instance even more down to a bus device, but until then, have one
devlink instance per netdevsim instance. Struct nsim_devlink is
introduced to hold fib setting.
The changes in the fib code are only related to holding the
configuration per devlink instance instead of network namespace.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some netdevsim bpf debugfs files are per-sdev, yet they are defined per
netdevsim instance. Move them under sdev directory.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To make code easier to read, move shared dev bits into a separate file.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since commit f6b19b354d ("net: devlink: select NET_DEVLINK
from drivers") adds implicit select of NET_DEVLINK for
netdevsim, the code does not have to deal with the case
when CONFIG_NET_DEVLINK is not enabled. So remove the ifcase.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The root directories of netdevsim should only be used by the core
to create per-device subdirectories, so limit their visibility to
the core file.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Create a higher-level entity to represent a device/ASIC to allow
programs and maps to be shared between device ports. The extra
work is required to make sure we don't destroy BPF objects as
soon as the netdev for which they were loaded gets destroyed,
as other ports may still be using them. When netdev goes away
all of its BPF objects will be moved to other netdevs of the
device, and only destroyed when last netdev is unregistered.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Move bound program information from netdevsim to shared sub-object,
as programs will soon be shared between netdevs of the same ASIC.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Factor out sharable netdevsim sub-object and use IFLA_LINK to link
netdevsims together at creation time. Sharable object will have
its own DebugFS directory.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Grouping netdevsim devices into "ASICs" will soon be supported.
Add switch_id attribute to all netdevsims. For now each netdevsim
will have its switch_id matching the device id.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Allow netdevsim to accept driver and offload attachment of XDP
BPF programs at the same time.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Basic operations drivers perform during xdp setup and query can
be moved to helpers in the core. Encapsulate program and flags
into a structure and add helpers. Note that the structure is
intended as the "main" program information source in the driver.
Most drivers will additionally place the program pointer in their
fast path or ring structures.
The helpers don't have a huge impact now, but they will
decrease the code duplication when programs can be installed
in HW and driver at the same time. Encapsulating the basic
operations in helpers will hopefully also reduce the number
of changes to drivers which adopt them.
Helpers could really be static inline, but they depend on
definition of struct netdev_bpf which means they'd have
to be placed in netdevice.h, an already 4500 line header.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Implement the IPsec/XFRM offload API for testing.
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
devlink reset command can fail if a FIB resource limit is set to a value
lower than the current occupancy. Return a proper message indicating the
reason for the failure.
$ devlink resource sh netdevsim/netdevsim0
netdevsim/netdevsim0:
name IPv4 size unlimited unit entry size_min 0 size_max unlimited size_gran 1 dpipe_tables none
resources:
name fib size unlimited occ 43 unit entry size_min 0 size_max unlimited size_gran 1 dpipe_tables none
name fib-rules size unlimited occ 4 unit entry size_min 0 size_max unlimited size_gran 1 dpipe_tables none
name IPv6 size unlimited unit entry size_min 0 size_max unlimited size_gran 1 dpipe_tables none
resources:
name fib size unlimited occ 54 unit entry size_min 0 size_max unlimited size_gran 1 dpipe_tables none
name fib-rules size unlimited occ 3 unit entry size_min 0 size_max unlimited size_gran 1 dpipe_tables none
$ devlink resource set netdevsim/netdevsim0 path /IPv4/fib size 40
$ devlink dev reload netdevsim/netdevsim0
Error: netdevsim: New size is less than current occupancy.
devlink answers: Invalid argument
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change nsim_devlink_setup to return any error back to the caller and
update nsim_init to handle it.
Requested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add devlink support to netdevsim and use it to implement a simple,
profile based resource controller. Only one controller is needed
per namespace, so the first netdevsim netdevice in a namespace
registers with devlink. If that device is deleted, the resource
settings are deleted.
The resource controller allows a user to limit the number of IPv4 and
IPv6 FIB entries and FIB rules. The resource paths are:
/IPv4
/IPv4/fib
/IPv4/fib-rules
/IPv6
/IPv6/fib
/IPv6/fib-rules
The IPv4 and IPv6 top level resources are unlimited in size and can not
be changed. From there, the number of FIB entries and FIB rule entries
are unlimited by default. A user can specify a limit for the fib and
fib-rules resources:
$ devlink resource set netdevsim/netdevsim0 path /IPv4/fib size 96
$ devlink resource set netdevsim/netdevsim0 path /IPv4/fib-rules size 16
$ devlink resource set netdevsim/netdevsim0 path /IPv6/fib size 64
$ devlink resource set netdevsim/netdevsim0 path /IPv6/fib-rules size 16
$ devlink dev reload netdevsim/netdevsim0
such that the number of rules or routes is limited (96 ipv4 routes in the
example above):
$ for n in $(seq 1 32); do ip ro add 10.99.$n.0/24 dev eth1; done
Error: netdevsim: Exceeded number of supported fib entries.
$ devlink resource show netdevsim/netdevsim0
netdevsim/netdevsim0:
name IPv4 size unlimited unit entry size_min 0 size_max unlimited size_gran 1 dpipe_tables non
resources:
name fib size 96 occ 96 unit entry size_min 0 size_max unlimited size_gran 1 dpipe_tables
...
With this template in place for resource management, it is fairly trivial
to extend and shows one way to implement a simple counter based resource
controller typical of network profiles.
Currently, devlink only supports initial namespace. Code is in place to
adapt netdevsim to a per namespace controller once the network namespace
issues are resolved.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We should not compile netdevsim/bpf.c if BPF syscall is not
enabled. Otherwise bpf core would have to provide wrappers
for all functions offload drivers may call, even though
system will never see a BPF object.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add to netdevsim ability to pretend it's offloading BPF maps.
We only allow allocation of tiny 2 entry maps, to keep things
simple. Mutex lock may seem heavy for the operations we
perform, but we want to make sure callbacks can sleep.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
dummy driver was extended with VF-related netdev APIs for testing
SR-IOV-related software. netdevsim did not exist back then.
Implement SR-IOV functionality in netdevsim. Notable difference
is that since netdevsim has no module parameters, we will actually
create a device with sriov_numvfs attribute for each netdev.
The zero MAC address is accepted as some HW use it to mean any
address is allowed. Link state is also now validated.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Add support for loading programs for netdevsim devices and
expose the related information via DebugFS. Both offload
of XDP and cls_bpf programs is supported.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
To be able to run selftests without any hardware required we
need a software model. The model can also serve as an example
implementation for those implementing actual HW offloads.
The dummy driver have previously been extended to test SR-IOV,
but the general consensus seems to be against adding further
features to it.
Add a new driver for purposes of software modelling only.
eBPF and SR-IOV will be added here shortly, others are invited
to further extend the driver with their offload models.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>