In preparation for the trap code move, use tracepoint_enabled() helper
instead of trace_devlink_trap_report_enabled() which would not be
defined in that scope.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20230828061657.300667-10-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Since both dpipe and resource code is using this helper, in preparation
for code split to separate files, move
devlink_dpipe_send_and_alloc_skb() helper into netlink.c. Rename it on
the way.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20230828061657.300667-5-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In preparations of leftover.c split to individual files, avoid need to
have object structures exposed in devl_internal.h and allow to have them
maintained in object files.
The register/unregister notifications need to know the structures
to iterate lists. To avoid the need, introduce per-object
register/unregister notification helpers and use them.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20230828061657.300667-2-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Expose port function commands to enable / disable IPsec packet offloads,
this is used to control the port IPsec capabilities.
When IPsec packet is disabled for a function of the port (default),
function cannot offload IPsec packet operations (encapsulation and XFRM
policy offload). When enabled, IPsec packet operations can be offloaded
by the function of the port, which includes crypto operation
(Encrypt/Decrypt), IPsec encapsulation and XFRM state and policy
offload.
Example of a PCI VF port which supports IPsec packet offloads:
$ devlink port show pci/0000:06:00.0/1
pci/0000:06:00.0/1: type eth netdev enp6s0pf0vf0 flavour pcivf pfnum 0 vfnum 0
function:
hw_addr 00:00:00:00:00:00 roce enable ipsec_packet disable
$ devlink port function set pci/0000:06:00.0/1 ipsec_packet enable
$ devlink port show pci/0000:06:00.0/1
pci/0000:06:00.0/1: type eth netdev enp6s0pf0vf0 flavour pcivf pfnum 0 vfnum 0
function:
hw_addr 00:00:00:00:00:00 roce enable ipsec_packet enable
Signed-off-by: Dima Chumak <dchumak@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20230825062836.103744-3-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Expose port function commands to enable / disable IPsec crypto offloads,
this is used to control the port IPsec capabilities.
When IPsec crypto is disabled for a function of the port (default),
function cannot offload any IPsec crypto operations (Encrypt/Decrypt and
XFRM state offloading). When enabled, IPsec crypto operations can be
offloaded by the function of the port.
Example of a PCI VF port which supports IPsec crypto offloads:
$ devlink port show pci/0000:06:00.0/1
pci/0000:06:00.0/1: type eth netdev enp6s0pf0vf0 flavour pcivf pfnum 0 vfnum 0
function:
hw_addr 00:00:00:00:00:00 roce enable ipsec_crypto disable
$ devlink port function set pci/0000:06:00.0/1 ipsec_crypto enable
$ devlink port show pci/0000:06:00.0/1
pci/0000:06:00.0/1: type eth netdev enp6s0pf0vf0 flavour pcivf pfnum 0 vfnum 0
function:
hw_addr 00:00:00:00:00:00 roce enable ipsec_crypto enable
Signed-off-by: Dima Chumak <dchumak@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20230825062836.103744-2-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Cited fixes commit introduced linecard notifications for register,
however it didn't add them for unregister. Fix that by adding them.
Fixes: c246f9b5fd ("devlink: add support to create line card and expose to user")
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20230817125240.2144794-1-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Since dumps carry struct genl_info now, use the attrs pointer
from genl_info and remove the one in struct genl_dumpit_info.
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20230814214723.2924989-6-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Allow user to pass port index for health reporter dump request.
Re-generate the related code.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230811155714.1736405-14-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Extend per-instance dump command definitions to accept instance
attributes. Allow parsing of devlink handle attributes so they could
be used for instance selection.
Re-generate the related code.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230811155714.1736405-12-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
For SFs, one devlink instance per SF is created. There might be
thousands of these on a single host. When a user needs to know port
handle for specific SF, he needs to dump all devlink ports on the host
which does not scale good.
Allow user to pass devlink handle attributes alongside the dump command
and dump only objects which are under selected devlink instance.
Example:
$ devlink port show
auxiliary/mlx5_core.eth.0/65535: type eth netdev eth2 flavour physical port 0 splittable false
auxiliary/mlx5_core.eth.1/131071: type eth netdev eth3 flavour physical port 1 splittable false
$ devlink port show auxiliary/mlx5_core.eth.0
auxiliary/mlx5_core.eth.0/65535: type eth netdev eth2 flavour physical port 0 splittable false
$ devlink port show auxiliary/mlx5_core.eth.1
auxiliary/mlx5_core.eth.1/131071: type eth netdev eth3 flavour physical port 1 splittable false
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230811155714.1736405-11-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
As the commands are already defined in split ops, remove them
from small ops.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230811155714.1736405-10-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Remove the duplicate temporary netlink callback prototype as the
generated ones are already in place.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230811155714.1736405-9-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add the definitions for the commands that do per-instance dump
and re-generate the related code.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230811155714.1736405-8-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In order to easily set NLM_F_DUMP_FILTERED for partial dumps, pass the
flags as an arg of dump_one() callback. Currently, it is always
NLM_F_MULTI.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230811155714.1736405-7-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Introduce dumpit callbacks for generated split ops. Have them
as a thin wrapper around iteration function and allow to pass dump_one()
function pointer directly without need to store in devlink_cmd structs.
Note that the function prototypes are temporary until the generated ones
will replace them in a follow-up patch.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230811155714.1736405-6-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Rename netlink doit callback functions for the commands that do
implement per-instance dump to match the generated names that are going
to be introduce in the follow-up patch.
Note that the function prototypes are temporary until the generated ones
will replace them in a follow-up patch.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230811155714.1736405-5-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Define port handling helpers what don't rely on internal_flags.
Have __devlink_nl_pre_doit() to accept the flags as a function arg and
make devlink_nl_pre_doit() a wrapper helper function calling it.
Introduce new helpers devlink_nl_pre_doit_port() and
devlink_nl_pre_doit_port_optional() to be used by split ops in follow-up
patch.
Note that the function prototypes are temporary until the generated ones
will replace them in a follow-up patch.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230811155714.1736405-4-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
No need to give the rate any special treatment in netlink attributes
parsing, as unlike for ports, there is only a couple of commands
benefiting from that.
Remove DEVLINK_NL_FLAG_NEED_RATE*, make pre_doit() callback simpler
by moving the rate attributes parsing to rate_*_doit() ops.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230811155714.1736405-3-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
No need to give the linecards any special treatment in netlink attribute
parsing, as unlike for ports, there is only a couple of commands
benefiting from that.
Remove DEVLINK_NL_FLAG_NEED_LINECARD, make pre_doit() callback simpler
by moving the linecard attribute parsing to linecard_[gs]et_doit() ops.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230811155714.1736405-2-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Do the switch and use generated split ops for get and info_get commands.
Remove those from small ops array.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230803111340.1074067-13-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Put the newly added generated header to the include list. Remove the
duplicated temporary function prototypes.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230803111340.1074067-12-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Improve the existing devlink spec in order to serve as a source for
generation of valid devlink split ops for the existing commands.
Add the generated sources.
Node that the policies are narrowed down only to the attributes that
are actually parsed. The dont-validate-strict parsing policy makes sure
that other possibly passed garbage attributes from userspace are
ignored during validation.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230803111340.1074067-11-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
To be prepared for the follow-up generated split ops addition,
make the functions devlink_nl_pre_doit() and devlink_nl_post_doit()
usable outside of netlink.c. Introduce temporary prototypes which are
going to be removed once the generated header will be included.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230803111340.1074067-9-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Introduce couple of dumpit callbacks for generated split ops. Have them
as a thin wrapper around iteration function and allow to pass dump_one()
function pointer directly without need to store in devlink_cmd structs.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230803111340.1074067-8-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The generated names of the doit netlink callback are missing "cmd" in
their names. Change names to be ready to switch to generated split ops
header.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230803111340.1074067-7-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In order to avoid name collision with the generated split ops array
which is going to be introduced as a follow-up patch, rename
the existing ops array to devlink_nl_small_ops.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230803111340.1074067-6-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The checks in question were introduced by:
commit 6b4db2e528 ("devlink: Fix use-after-free after a failed reload").
That fixed an issue of reload with mlxsw driver.
Back then, that was a valid fix, because there was a limitation
in place that prevented drivers from registering/unregistering params
when devlink instance was registered.
It was possible to do the fix differently by changing drivers to
register/unregister params in appropriate places making sure the ops
operate only on memory which is allocated and initialized. But that,
as a dependency, would require to remove the limitation mentioned above.
Eventually, this limitation was lifted by:
commit 1d18bb1a4d ("devlink: allow registering parameters after the instance")
Also, the alternative fix (which also fixed another issue) was done by:
commit 74cbc3c03c ("mlxsw: spectrum_acl_tcam: Move devlink param to TCAM code").
Therefore, the checks are no longer relevant. Each driver should make
sure to have the params registered only when the memory the ops
are working with is allocated and initialized.
So remove the checks.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the offending fixes commit I mistakenly removed the reply message of
the port new command. I was under impression it is a new port
notification, partly due to the "notify" in the name of the helper
function. Bring the code sending reply with new port message back, this
time putting it directly to devlink_nl_cmd_port_new_doit()
Fixes: c496daeb86 ("devlink: remove duplicate port notification")
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230531142025.2605001-1-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Devlink health is involved in error recovery. Machines in bad
state tend to be fairly unreliable, and occasionally get stuck
in error loops. Even with a reasonable grace period devlink health
may get a thousand reports in an hour.
In case of reporting on an unregistered devlink instance
the subsequent reports don't add much value. Switch to
WARN_ON_ONCE() to avoid flooding dmesg and fleet monitoring
dashboards.
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20230531015523.48961-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Now when the original ops variable is removed, introduce it again
but this time for devlink_port_ops.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Move port_fn_hw_addr_get/set() from devlink_ops into newly introduced
devlink_port_ops.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In devlink, some of the objects have separate ops registered alongside
with the object itself. Port however have ops in devlink_ops structure.
For drivers what register multiple kinds of ports with different ops
this is not convenient. Introduce devlink_port_ops and a set
of functions that allow drivers to pass ops pointer during
port registration.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Historically there was a reason why port_dev() along with for example
port_split() did get port_index instead of the devlink_port pointer.
With the locking changes that were done which ensured devlink instance
mutex is hold for every command, the port ops could get devlink_port
pointer directly. Change the forgotten port_dev() op to be as others
and pass devlink_port pointer instead of port_index.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The notification about created port is send from devl_port_register()
function called from ops->port_new(). No need to send it again here,
so remove the call and the helper function.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
'__net_initdata' becomes a no-op with CONFIG_NET_NS=y, but when this
option is disabled it becomes '__initdata', which means the data can be
freed after the initialization phase. This annotation is obviously
incorrect for the devlink net device notifier block which is still
registered after the initialization phase [1].
Fix this crash by removing the '__net_initdata' annotation.
[1]
general protection fault, probably for non-canonical address 0xcccccccccccccccc: 0000 [#1] PREEMPT SMP
CPU: 3 PID: 117 Comm: (udev-worker) Not tainted 6.4.0-rc1-custom-gdf0acdc59b09 #64
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc37 04/01/2014
RIP: 0010:notifier_call_chain+0x58/0xc0
[...]
Call Trace:
<TASK>
dev_set_mac_address+0x85/0x120
dev_set_mac_address_user+0x30/0x50
do_setlink+0x219/0x1270
rtnl_setlink+0xf7/0x1a0
rtnetlink_rcv_msg+0x142/0x390
netlink_rcv_skb+0x58/0x100
netlink_unicast+0x188/0x270
netlink_sendmsg+0x214/0x470
__sys_sendto+0x12f/0x1a0
__x64_sys_sendto+0x24/0x30
do_syscall_64+0x38/0x80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
Fixes: e93c9378e3 ("devlink: change per-devlink netdev notifier to static one")
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Closes: https://lore.kernel.org/netdev/600ddf9e-589a-2aa0-7b69-a438f833ca10@samsung.com/
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230515162925.1144416-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The commit 565b4824c3 ("devlink: change port event netdev notifier
from per-net to global") changed original per-net notifier to be
per-devlink instance. That fixed the issue of non-receiving events
of netdev uninit if that moved to a different namespace.
That worked fine in -net tree.
However, later on when commit ee75f1fc44 ("net/mlx5e: Create
separate devlink instance for ethernet auxiliary device") and
commit 72ed5d5624 ("net/mlx5: Suspend auxiliary devices only in
case of PCI device suspend") were merged, a deadlock was introduced
when removing a namespace with devlink instance with another nested
instance.
Here there is the bad flow example resulting in deadlock with mlx5:
net_cleanup_work -> cleanup_net (takes down_read(&pernet_ops_rwsem) ->
devlink_pernet_pre_exit() -> devlink_reload() ->
mlx5_devlink_reload_down() -> mlx5_unload_one_devl_locked() ->
mlx5_detach_device() -> del_adev() -> mlx5e_remove() ->
mlx5e_destroy_devlink() -> devlink_free() ->
unregister_netdevice_notifier() (takes down_write(&pernet_ops_rwsem)
Steps to reproduce:
$ modprobe mlx5_core
$ ip netns add ns1
$ devlink dev reload pci/0000:08:00.0 netns ns1
$ ip netns del ns1
Resolve this by converting the notifier from per-devlink instance to
a static one registered during init phase and leaving it registered
forever. Use this notifier for all devlink port instances created
later on.
Note what a tree needs this fix only in case all of the cited fixes
commits are present.
Reported-by: Moshe Shemesh <moshe@nvidia.com>
Fixes: 565b4824c3 ("devlink: change port event netdev notifier from per-net to global")
Fixes: ee75f1fc44 ("net/mlx5e: Create separate devlink instance for ethernet auxiliary device")
Fixes: 72ed5d5624 ("net/mlx5: Suspend auxiliary devices only in case of PCI device suspend")
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230510144621.932017-1-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The recent merge from net left-over some unused code in
leftover.c - nomen omen.
Just drop the unused bits.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that all devlink health callbacks and related code are in file
health.c move common health functions and devlink_health_reporter struct
to be local in health.c file.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Move devlink health report test callback from leftover.c to health.c. No
functional change in this patch.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Move devlink health report dump callbacks and related code from
leftover.c to health.c. No functional change in this patch.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Devlink fmsg (formatted message) is used by devlink health diagnose,
dump and drivers which support these devlink health callbacks.
Therefore, move devlink fmsg helpers and related code to file health.c.
Move devlink health diagnose to file health.c. No functional change in
this patch.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Move devlink health report helper and recover callback and related code
from leftover.c to health.c. No functional change in this patch.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Move devlink health get and set callbacks and related code from
leftover.c to health.c. No functional change in this patch.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
devlink_nl_health_reporter_fill() error flow calls nla_nest_end(). Fix
it to call nla_nest_cancel() instead.
Note the bug is harmless as genlmsg_cancel() cancel the entire message,
so no fixes tag added.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Move devlink health reporter create/destroy and related dev code to new
file health.c. This file shall include all callbacks and functionality
that are related to devlink health.
In addition, fix kdoc indentation and make reporter create/destroy kdoc
more clear. No functional change in this patch.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The change on network namespace only makes sense during re-init reload
action. For FW activation it is not applicable. So check if user passed
an ATTR indicating network namespace change request and forbid it.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230213115836.3404039-1-jiri@resnulli.us
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Driver calling devl_param_driverinit_value_set() has to hold devlink
instance lock while doing that. Put an assertion there.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the driver maintains following basic sane behavior, the
devl_param_driverinit_value_get() function could be called without
holding instance lock:
1) Driver ensures a call to devl_param_driverinit_value_get() cannot
race with registering/unregistering the parameter with
the same parameter ID.
2) Driver ensures a call to devl_param_driverinit_value_get() cannot
race with devl_param_driverinit_value_set() call with
the same parameter ID.
3) Driver ensures a call to devl_param_driverinit_value_get() cannot
race with reload operation.
By the nature of params usage, these requirements should be
trivially achievable. If the driver for some off reason
is not able to comply, it has to take the devlink->lock while
calling devl_param_driverinit_value_get().
Remove the lock assertion and add comment describing
the locking requirements.
This fixes a splat in mlx5 driver introduced by the commit
referenced in the "Fixes" tag.
Lore: https://lore.kernel.org/netdev/719de4f0-76ac-e8b9-38a9-167ae239efc7@amd.com/
Reported-by: Kim Phillips <kim.phillips@amd.com>
Fixes: 075935f0ae ("devlink: protect devlink param list by instance lock")
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Loose the linked list for params and use xarray instead.
Note that this is required to be eventually possible to call
devl_param_driverinit_value_get() without holding instance lock.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As xarray has an iterator helper that allows to start from specified
index, use this directly and avoid repeated iteration from 0.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Probably due to copy-paste error, the name of the arg is "init_val"
which is misleading, as the pointer is used to point to struct where to
store the current value. Rename it to "val" and change the arg comment
a bit on the way.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driverinit param purpose is to serve the driver during init/reload
time to provide a value, either default or set by user.
Make sure that driver does not read value updated by user before the
reload is performed. Hold the new value in a separate struct and switch
it during reload.
Note that this is required to be eventually possible to call
devl_param_driverinit_value_get() without holding instance lock.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
No need to treat string params any different comparing to other types.
Rely on the struct assign to copy the whole struct, including the
string.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
NL_SET_ERR_MSG_MOD inserts the KBUILD_MODNAME and a ':' before the actual
extended error message. The devlink feature hasn't been able to be compiled
as a module since commit f4b6bcc700 ("net: devlink: turn devlink into a
built-in").
Stop using NL_SET_ERR_MSG_MOD, and just use the base NL_SET_ERR_MSG. This
aligns the extended error messages better with the NL_SET_ERR_MSG_ATTR
messages as well.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The callback devlink_nl_cmd_health_reporter_diagnose_doit() miss
devlink_fmsg_free(), which leads to memory leak.
Fix it by adding devlink_fmsg_free().
Fixes: e994a75fb7 ("devlink: remove reporter reference counting")
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/1675698976-45993-1-git-send-email-moshe@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Move devlink dev selftest callbacks and related code from leftover.c to
file dev.c. No functional change in this patch.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
As all users of the struct devlink_info_req are already in dev.c, move
this struct from devl_internal.c to be local in dev.c.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Move devlink dev flash callbacks, helpers and other related code from
leftover.c to dev.c. No functional change in this patch.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Move devlink dev info callbacks, related drivers helpers functions and
other related code from leftover.c to dev.c. No functional change in
this patch.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Move devlink dev eswitch callbacks and related code from leftover.c to
file dev.c. No functional change in this patch.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Move devlink dev reload callback and related code from leftover.c to
file dev.c. No functional change in this patch.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Move devlink dev get and dump callbacks and related dev code to new file
dev.c. This file shall include all callbacks that are specific on
devlink dev object.
No functional change in this patch.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In order to maintain naming consistency, rename and reorder all usages
of struct struct devlink_cmd in the following way:
1) Remove "gen" and replace it with "cmd" to match the struct name
2) Order devl_cmds[] and the header file to match the order
of enum devlink_command
3) Move devl_cmd_rate_get among the peers
4) Remove "inst" for DEVLINK_CMD_GET
5) Add "_get" suffix to all to match DEVLINK_CMD_*_GET (only rate had it
done correctly)
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
No need to have "gen" inside name of the structure for devlink commands.
Remove it.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
To have the name of the function consistent with the struct cb name,
rename devlink_nl_instance_iter_dump() to
devlink_nl_instance_iter_dumpit().
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Devlink features were introduced to disallow devlink reload calls of
userspace before the devlink was fully initialized. The reason for this
workaround was the fact that devlink reload was originally called
without devlink instance lock held.
However, with recent changes that converted devlink reload to be
performed under devlink instance lock, this is redundant so remove
devlink features entirely.
Note that mlx5 used this to enable devlink reload conditionally only
when device didn't act as multi port slave. Move the multi port check
into mlx5_devlink_reload_down() callback alongside with the other
checks preventing the device from reload in certain states.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, the notifications are only sent for params. People who
introduced other objects forgot to add the reload notifications here.
To make sure all notifications happen according to existing comment,
benefit from existence of devlink_notify_register/unregister() helpers
and use them in reload code.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This effectively reverts commit 05a7f4a8df ("devlink: Break parameter
notification sequence to be before/after unload/load driver").
Cited commit resolved a problem in mlx5 params implementation,
when param notification code accessed memory previously freed
during reload.
Now, when the params can be registered and unregistered when devlink
instance is registered, mlx5 code unregisters the problematic param
during devlink reload. The fix is therefore no longer needed.
Current behavior is a it problematic, as it sends DEL notifications even
in potential case when reload_down() call fails which might confuse
userspace notifications listener.
So move the reload notifications back where they were originally in
between reload_down() and reload_up() calls.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 1d18bb1a4d ("devlink: allow registering parameters after
the instance") as the subject implies introduced possibility to register
devlink params even for already registered devlink instance. This is a
bit problematic, as the consistency or params list was originally
secured by the fact it is static during devlink lifetime. So in order to
protect the params list, take devlink instance lock during the params
operations. Introduce unlocked function variants and use them in drivers
in locked context. Put lock assertions to appropriate places.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Tested-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Put couple of WARN_ONs in devlink_param_driverinit_value_get() function
to clearly indicate, that it is a driver bug if used without reload
support or for non-driverinit param.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
devlink_param_driverinit_value_set() currently returns int with possible
error, but no user is checking it anyway. The only reason for a fail is
a driver bug. So convert the function to return void and put WARN_ONs
on error paths.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a WARN_ON checking the param_item for being NULL when the param
is not inserted in the list. That indicates a driver BUG. Instead of
continuing to work with NULL pointer with its consequences, return.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no user outside the devlink code, so remove the export and make
the functions static. Move them before callers to avoid forward
declarations.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Build bot detects that err may be returned uninitialized in
devlink_fmsg_prepare_skb(). This is not really true because
all fmsgs users should create at least one outer nest, and
therefore fmsg can't be completely empty.
That said the assumption is not trivial to confirm, so let's
follow the bots advice, anyway.
This code does not seem to have changed since its inception in
commit 1db64e8733 ("devlink: Add devlink formatted message (fmsg) API")
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20230124035231.787381-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Commit 2c7bc10d0f ("netlink: add macro for checking dump ctx size")
misspelled the name of the assert as asset, missing an R.
Reported-by: Ido Schimmel <idosch@idosch.org>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/20230123222224.732338-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
After region and linecard lock removals, this helper is always supposed
to be called with instance lock held. So put the assertion here and
remove the comment which is no longer accurate.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
devlink_dump_for_each_instance_get() is currently called from
a single place in netlink.c. As there is no need to use
this helper anywhere else in the future, remove it and
call devlinks_xa_find_get() directly from while loop
in devlink_nl_instance_iter_dump(). Also remove redundant
idx clear on loop end as it is already done
in devlink_nl_instance_iter_dump().
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>