In sctp_get_port_local(), when binding to IP and PORT, it should
also check sk_bound_dev_if to match listening sk if it's set by
SO_BINDTOIFINDEX, so that multiple sockets with the same IP and
PORT, but different sk_bound_dev_if can be listened at the same
time.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When binding to an ipv6 address, it calls ipv6_chk_addr() to check if
this address is on any dev. If a socket binds to a l3mdev but no dev
is passed to do this check, all l3mdev and slaves will be skipped and
the check will fail.
This patch is to pass the bound_dev to make sure the devices under the
same l3mdev can be returned in ipv6_chk_addr(). When the bound_dev is
not a l3mdev or l3slave, l3mdev_master_dev_rcu() will return NULL in
__ipv6_chk_addr_and_flags(), it will keep compitable with before when
NULL dev was passed.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After binding to a l3mdev, it should use the route table from the
corresponding VRF to verify the addr when binding to an address.
Note ipv6 doesn't need it, as binding to ipv6 address does not
verify the addr with route lookup.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix duplicate check on polling timeout.
Fixes: 1efa9bfe58 ("net: libwx: Implement interaction with firmware")
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Following 1b16b3fdf6 ("net: phy: mscc: macsec: clear encryption keys when freeing a flow"),
go one step further and instead of calling memzero_explicit on the key
when freeing a flow, simply not copy the key in the first place as it's
only used when a new flow is set up.
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder says:
====================
net: ipa: change GSI firmware load specification
Currently, GSI firmware must be loaded for IPA before it can be
used--either by the modem, or by the AP. New hardware supports a
third option, with the bootloader taking responsibility for loading
GSI firmware. In that case, neither the AP nor the modem needs to
do that.
The first patch in this series deprecates the "modem-init" Device
Tree property in the IPA binding, using a new "qcom,gsi-loader"
property instead. The second and third implement logic in the code
to support either the "old" or the "new" way of specifying how GSI
firmware is loaded.
The last two patches implement a new value for the "qcom,gsi-loader"
property. If the value is "skip", neither the AP nor modem needs to
load the GSI firmware. The first of these patches implements the
change in the IPA binding; the second implements it in the code.
====================
Link: https://lore.kernel.org/r/20221116073257.34010-1-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Define a new value "skip" for the "qcom,gsi-loader" Device Tree
property. If used, it indicates that neither the AP nor the modem
need to load GSI firmware (because it has already been loaded--for
example by the boot loader).
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add a new enumerated value to those defined for the qcom,gsi-loader
property. If the qcom,gsi-loader is "skip", the GSI firmware will
already be loaded, so neither the AP nor modem is required to load
GSI firmware.
Signed-off-by: Alex Elder <elder@linaro.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Introduce a new way of specifying how the GSI firmware gets loaded
for IPA. Currently, this is indicated by the presence or absence of
the Boolean "modem-init" Device Tree property. The new property
must have a value--either "self" or "modem"--which indicates whether
the AP or modem is the GSI firmware loader, respectively.
For legacy systems, the new property will not exist, and the
"modem-init" property will be used. For newer systems, the
"qcom,gsi-loader" property *must* exist, and must have one of the
two prescribed values. It is an error to have both properties
defined, and it is an error for the new property to have an
unrecognized value.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The GSI layer used for IPA requires firmware to be loaded.
Currently either the AP or the modem loads the firmware,
distinguished by whether the "modem-init" Device Tree
property is defined.
Some newer systems implement a third option. In preparation for
that, encapsulate the code that determines how the GSI firmware
gets loaded in a new function, ipa_firmware_loader().
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
GSI firmware for IPA must be loaded during initialization, either by
the AP or by the modem. The loader is currently specified based on
whether the Boolean modem-init property is present.
Instead, use a new property with an enumerated value to indicate
explicitly how GSI firmware gets loaded. With this in place, a
third approach can be added in an upcoming patch.
The new qcom,gsi-loader property has two defined values:
- self: The AP loads GSI firmware
- modem: The modem loads GSI firmware
The modem-init property must still be supported, but is now marked
deprecated.
Update the example so it represents the SC7180 SoC, and provide
examples for the qcom,gsi-loader, memory-region, and firmware-name
properties.
Signed-off-by: Alex Elder <elder@linaro.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Move these two macros from net/sctp/sctp.h to linux/sctp.h, so that
it will be enough to include only linux/sctp.h in nft_exthdr.c and
xt_sctp.c. It should not include "net/sctp/sctp.h" if a module does
not have a dependence on SCTP module.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Saeed Mahameed <saeed@kernel.org>
Link: https://lore.kernel.org/r/ef6468a687f36da06f575c2131cd4612f6b7be88.1668526821.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Currently "net/sctp/checksum.h" including "net/sctp/sctp.h" is
included in quite some places in netfilter and openswitch and
net/sched. It's not necessary to include "net/sctp/sctp.h" if
a module does not have dependence on SCTP, "linux/sctp.h" is
the right one to include.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Saeed Mahameed <saeed@kernel.org>
Link: https://lore.kernel.org/r/ca7ea96d62a26732f0491153c3979dc1c0d8d34a.1668526793.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Michal Wilczynski says:
====================
Implement devlink-rate API and extend it
This patch series implements devlink-rate for ice driver. Unfortunately
current API isn't flexible enough for our use case, so there is a need to
extend it. Some functions have been introduced to enable the driver to
export current Tx scheduling configuration.
Pasting justification for this series from commit implementing devlink-rate
in ice driver(that is a part of this series):
There is a need to support modification of Tx scheduler tree, in the
ice driver. This will allow user to control Tx settings of each node in
the internal hierarchy of nodes. As a result user will be able to use
Hierarchy QoS implemented entirely in the hardware.
This patch implemenents devlink-rate API. It also exports initial
default hierarchy. It's mostly dictated by the fact that the tree
can't be removed entirely, all we can do is enable the user to modify
it. For example root node shouldn't ever be removed, also nodes that
have children are off-limits.
Example initial tree with 2 VF's:
[root@fedora ~]# devlink port function rate show
pci/0000:4b:00.0/node_27: type node parent node_26
pci/0000:4b:00.0/node_26: type node parent node_0
pci/0000:4b:00.0/node_34: type node parent node_33
pci/0000:4b:00.0/node_33: type node parent node_32
pci/0000:4b:00.0/node_32: type node parent node_16
pci/0000:4b:00.0/node_19: type node parent node_18
pci/0000:4b:00.0/node_18: type node parent node_17
pci/0000:4b:00.0/node_17: type node parent node_16
pci/0000:4b:00.0/node_21: type node parent node_20
pci/0000:4b:00.0/node_20: type node parent node_3
pci/0000:4b:00.0/node_14: type node parent node_5
pci/0000:4b:00.0/node_5: type node parent node_3
pci/0000:4b:00.0/node_13: type node parent node_4
pci/0000:4b:00.0/node_12: type node parent node_4
pci/0000:4b:00.0/node_11: type node parent node_4
pci/0000:4b:00.0/node_10: type node parent node_4
pci/0000:4b:00.0/node_9: type node parent node_4
pci/0000:4b:00.0/node_8: type node parent node_4
pci/0000:4b:00.0/node_7: type node parent node_4
pci/0000:4b:00.0/node_6: type node parent node_4
pci/0000:4b:00.0/node_4: type node parent node_3
pci/0000:4b:00.0/node_3: type node parent node_16
pci/0000:4b:00.0/node_16: type node parent node_15
pci/0000:4b:00.0/node_15: type node parent node_0
pci/0000:4b:00.0/node_2: type node parent node_1
pci/0000:4b:00.0/node_1: type node parent node_0
pci/0000:4b:00.0/node_0: type node
pci/0000:4b:00.0/1: type leaf parent node_27
pci/0000:4b:00.0/2: type leaf parent node_27
Let me visualize part of the tree:
+---------+
| node_0 |
+---------+
|
+----v----+
| node_26 |
+----+----+
|
+----v----+
| node_27 |
+----+----+
|
|-----------------|
+----v----+ +----v----+
| VF 1 | | VF 2 |
+----+----+ +----+----+
So at this point there is a couple things that can be done.
For example we could only assign parameters to VF's.
[root@fedora ~]# devlink port function rate set pci/0000:4b:00.0/1 \
tx_max 5Gbps
This would cap the VF 1 BW to 5Gbps.
But let's say you would like to create a completely new branch.
This can be done like this:
[root@fedora ~]# devlink port function rate add \
pci/0000:4b:00.0/node_custom parent node_0
[root@fedora ~]# devlink port function rate add \
pci/0000:4b:00.0/node_custom_1 parent node_custom
[root@fedora ~]# devlink port function rate set \
pci/0000:4b:00.0/1 parent node_custom_1
This creates a completely new branch and reassigns VF 1 to it.
A number of parameters is supported per each node: tx_max, tx_share,
tx_priority and tx_weight.
====================
Link: https://lore.kernel.org/r/20221115104825.172668-1-michal.wilczynski@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Provide documentation for newly introduced netlink attributes for
devlink-rate: tx_priority and tx_weight.
Mention the possibility to export tree from the driver.
Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add documentation to a newly added devlink-rate feature. Provide some
examples on how to use the commands, which netlink attributes are
supported and descriptions of the attributes.
Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
ADQ, DCB might interfere with Custom Tx Scheduler changes that user
might introduce using devlink-rate API.
Check if ADQ, DCB is active, when user tries to change any setting
in exported Tx scheduler tree. If any of those are active block the user
from doing so, and log an appropriate message.
Remove the exported hierarchy if user enable ADQ or DCB.
Prevent ADQ or DCB from getting configured if user already made some
changes using devlink-rate API.
Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
There is a need to support modification of Tx scheduler tree, in the
ice driver. This will allow user to control Tx settings of each node in
the internal hierarchy of nodes. As a result user will be able to use
Hierarchy QoS implemented entirely in the hardware.
This patch implemenents devlink-rate API. It also exports initial
default hierarchy. It's mostly dictated by the fact that the tree
can't be removed entirely, all we can do is enable the user to modify
it. For example root node shouldn't ever be removed, also nodes that
have children are off-limits.
Example initial tree with 2 VF's:
[root@fedora ~]# devlink port function rate show
pci/0000:4b:00.0/node_27: type node parent node_26
pci/0000:4b:00.0/node_26: type node parent node_0
pci/0000:4b:00.0/node_34: type node parent node_33
pci/0000:4b:00.0/node_33: type node parent node_32
pci/0000:4b:00.0/node_32: type node parent node_16
pci/0000:4b:00.0/node_19: type node parent node_18
pci/0000:4b:00.0/node_18: type node parent node_17
pci/0000:4b:00.0/node_17: type node parent node_16
pci/0000:4b:00.0/node_21: type node parent node_20
pci/0000:4b:00.0/node_20: type node parent node_3
pci/0000:4b:00.0/node_14: type node parent node_5
pci/0000:4b:00.0/node_5: type node parent node_3
pci/0000:4b:00.0/node_13: type node parent node_4
pci/0000:4b:00.0/node_12: type node parent node_4
pci/0000:4b:00.0/node_11: type node parent node_4
pci/0000:4b:00.0/node_10: type node parent node_4
pci/0000:4b:00.0/node_9: type node parent node_4
pci/0000:4b:00.0/node_8: type node parent node_4
pci/0000:4b:00.0/node_7: type node parent node_4
pci/0000:4b:00.0/node_6: type node parent node_4
pci/0000:4b:00.0/node_4: type node parent node_3
pci/0000:4b:00.0/node_3: type node parent node_16
pci/0000:4b:00.0/node_16: type node parent node_15
pci/0000:4b:00.0/node_15: type node parent node_0
pci/0000:4b:00.0/node_2: type node parent node_1
pci/0000:4b:00.0/node_1: type node parent node_0
pci/0000:4b:00.0/node_0: type node
pci/0000:4b:00.0/1: type leaf parent node_27
pci/0000:4b:00.0/2: type leaf parent node_27
Let me visualize part of the tree:
+---------+
| node_0 |
+---------+
|
+----v----+
| node_26 |
+----+----+
|
+----v----+
| node_27 |
+----+----+
|
|-----------------|
+----v----+ +----v----+
| VF 1 | | VF 2 |
+----+----+ +----+----+
So at this point there is a couple things that can be done.
For example we could only assign parameters to VF's.
[root@fedora ~]# devlink port function rate set pci/0000:4b:00.0/1 \
tx_max 5Gbps
This would cap the VF 1 BW to 5Gbps.
But let's say you would like to create a completely new branch.
This can be done like this:
[root@fedora ~]# devlink port function rate add \
pci/0000:4b:00.0/node_custom parent node_0
[root@fedora ~]# devlink port function rate add \
pci/0000:4b:00.0/node_custom_1 parent node_custom
[root@fedora ~]# devlink port function rate set \
pci/0000:4b:00.0/1 parent node_custom_1
This creates a completely new branch and reassigns VF 1 to it.
A number of parameters is supported per each node: tx_max, tx_share,
tx_priority and tx_weight.
Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
devlink-rate API requires a priv object to be allocated when node still
doesn't have a parent. This is problematic, because ice_sched_node can't
be currently created without a parent.
Add an option to pre-allocate memory for ice_sched_node struct. Add
new arguments to ice_sched_add() and ice_sched_add_elems() that allow
for pre-allocation of memory for ice_sched_node struct.
Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
To support new devlink-rate API ice_sched_node struct needs to store
a number of additional parameters. This includes tx_max, tx_share,
tx_weight, and tx_priority.
Add new fields to ice_sched_node struct. Add new functions to configure
the hardware with new parameters. Introduce new xarray to identify
nodes uniquely.
Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Currently the driver is able to create leaf nodes for the devlink-rate,
but is unable to set parent for them. This wasn't as issue before the
possibility to export hierarchy from the driver. After adding the export
feature, in order for the driver to supply correct hierarchy, it's
necessary for it to be able to supply a parent name to
devl_rate_leaf_create().
Introduce a new parameter 'parent_name' in devl_rate_leaf_create().
Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Currently it's not possible to reassign the parent of the node using one
command. As the previous commit introduced a way to export entire
hierarchy from the driver, being able to modify and reassign parents
become important. This way user might easily change QoS settings without
interrupting traffic.
Example command:
devlink port function rate set pci/0000:4b:00.0/1 parent node_custom_1
This reassigns leaf node parent to node_custom_1.
Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Intel 100G card internal firmware hierarchy for Hierarchicial QoS is very
rigid and can't be easily removed. This requires an ability to export
default hierarchy to allow user to modify it. Currently the driver is
only able to create the 'leaf' nodes, which usually represent the vport.
This is not enough for HQoS implemented in Intel hardware.
Introduce new function devl_rate_node_create() that allows for creation
of the devlink-rate nodes from the driver.
Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
To fully utilize offload capabilities of Intel 100G card QoS capabilities
new attribute 'tx_weight' needs to be introduced. This attribute allows
for usage of Weighted Fair Queuing arbitration scheme among siblings.
This arbitration scheme can be used simultaneously with the strict
priority.
Introduce new attribute in devlink-rate that will allow for configuration
of Weighted Fair Queueing. New attribute is optional.
Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
To fully utilize offload capabilities of Intel 100G card QoS capabilities
new attribute 'tx_priority' needs to be introduced. This attribute allows
for usage of strict priority arbiter among siblings. This arbitration
scheme attempts to schedule nodes based on their priority as long as the
nodes remain within their bandwidth limit.
Introduce new attribute in devlink-rate that will allow for configuration
of strict priority. New attribute is optional.
Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vladimir Oltean says:
====================
Autoload DSA tagging driver when dynamically changing protocol
This patch set solves the issue reported by Michael and Heiko here:
https://lore.kernel.org/lkml/20221027113248.420216-1-michael@walle.cc/
making full use of Michael's suggestion of having two modaliases: one
gets used for loading the tagging protocol when it's the default one
reported by the switch driver, the other gets loaded at user's request,
by name.
# modinfo tag_ocelot
filename: /lib/modules/6.1.0-rc4+/kernel/net/dsa/tag_ocelot.ko
license: GPL v2
alias: dsa_tag:seville
alias: dsa_tag:id-21
alias: dsa_tag:ocelot
alias: dsa_tag:id-15
depends: dsa_core
intree: Y
name: tag_ocelot
vermagic: 6.1.0-rc4+ SMP preempt mod_unload modversions aarch64
Tested on NXP LS1028A-RDB with the following device tree addition:
&mscc_felix_port4 {
dsa-tag-protocol = "ocelot-8021q";
};
&mscc_felix_port5 {
dsa-tag-protocol = "ocelot-8021q";
};
CONFIG_NET_DSA and everything that depends on it is built as module.
Everything auto-loads, and "cat /sys/class/net/eno2/dsa/tagging" shows
"ocelot-8021q". Traffic works as well. Furthermore, "echo ocelot-8021q"
into the aforementioned sysfs file now auto-loads the driver for it.
====================
Link: https://lore.kernel.org/r/20221115011847.2843127-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Issue a request_module() call when an attempt to change the tagging
protocol is made, either by sysfs or by device tree. In the case of
ocelot (the only driver for which the default and the alternative
tagging protocol are compiled as different modules), the user is now no
longer required to insert tag_ocelot_8021q.ko manually.
In the particular case of ocelot, this solves a problem where
tag_ocelot_8021q.ko is built as module, and this is present in the
device tree:
&mscc_felix_port4 {
dsa-tag-protocol = "ocelot-8021q";
};
&mscc_felix_port5 {
dsa-tag-protocol = "ocelot-8021q";
};
Because no one attempts to load the module into the kernel at boot time,
the switch driver will fail to probe (actually forever defer) until
someone manually inserts tag_ocelot_8021q.ko. This is now no longer
necessary and happens automatically.
Rename dsa_find_tagger_by_name() to denote the change in functionality:
there is now feature parity with dsa_tag_driver_get_by_id(), i.o.w. we
also load the module if it's missing.
Link: https://lore.kernel.org/lkml/20221027113248.420216-1-michael@walle.cc/
Suggested-by: Michael Walle <michael@walle.cc>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Michael Walle <michael@walle.cc> # on kontron-sl28 w/ ocelot_8021q
Tested-by: Michael Walle <michael@walle.cc>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
A future patch will introduce one more way of getting a reference on a
tagging protocl driver (by name). Rename the current method to "by_id".
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Michael Walle <michael@walle.cc>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Currently, dsa_find_tagger_by_name() uses sysfs_streq() which works both
with strings that contain \n at the end (echo ocelot > .../dsa/tagging)
and with strings that don't (printf ocelot > .../dsa/tagging).
There will be a problem once we'll want to construct the modalias string
based on which we auto-load the protocol kernel module. If the sysfs
buffer ends in a newline, we need to strip it first. This is a
preparatory patch specifically for that.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Michael Walle <michael@walle.cc>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Currently, tagging protocol drivers have a modalias of
"dsa_tag:id-<number>", where the number is one of DSA_TAG_PROTO_*_VALUE.
This modalias makes it possible for the request_module() call in
dsa_tag_driver_get() to work, given the input it has - an integer
returned by ds->ops->get_tag_protocol().
It is also possible to change tagging protocols at (pseudo-)runtime, via
sysfs or via device tree, and this works via the name string of the
tagging protocol rather than via its id (DSA_TAG_PROTO_*_VALUE).
In the latter case, there is no request_module() call, because there is
no association that the DSA core has between the string name and the ID,
to construct the modalias. The module is simply assumed to have been
inserted. This is actually slightly problematic when the tagging
protocol change should take place at probe time, since it's expected
that the dependency module should get autoloaded.
For this purpose, let's introduce a second modalias, so that the DSA
core can call request_module() by name. There is no reason to make the
modalias by name optional, so just modify the MODULE_ALIAS_DSA_TAG_DRIVER()
macro to take both the ID and the name as arguments, and generate two
modaliases behind the scenes.
Suggested-by: Michael Walle <michael@walle.cc>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Michael Walle <michael@walle.cc> # on kontron-sl28 w/ ocelot_8021q
Tested-by: Michael Walle <michael@walle.cc>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
It's autumn cleanup time, and today's target are modaliases.
Michael says that for users of modinfo, "dsa_tag-20" is not the most
suggestive name, and recommends a change to "dsa_tag-id-20".
Andrew points out that other modaliases have a prefix delimited by
colons, so he recommends "dsa_tag:20" instead of "dsa_tag-20".
To satisfy both proposals, Florian recommends "dsa_tag:id-20".
The modaliases are not stable ABI, and the essential information
(protocol ID) is still conveyed in the new string, which
request_module() must be adapted to form.
Link: 20221027210830.3577793-1-vladimir.oltean@nxp.com
Suggested-by: Andrew Lunn <andrew@lunn.ch>
Suggested-by: Michael Walle <michael@walle.cc>
Suggested-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Michael Walle <michael@walle.cc>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The DSA tagging protocol driver macros are in the public include/net/dsa.h
probably because that's also where the DSA_TAG_PROTO_*_VALUE macros are
(MODULE_ALIAS_DSA_TAG_DRIVER hinges on those macro definitions).
But there is no reason to expose these helpers to <net/dsa.h>. That
header is shared between switch drivers (drivers/net/dsa/), tagging
protocol drivers (net/dsa/tag_*.c), the DSA core (net/dsa/ sans tag_*.c),
and the rest of the world (DSA master drivers, network stack, etc).
Too much exposure.
On the other hand, net/dsa/dsa_priv.h is included only by the DSA core
and by DSA tagging protocol drivers (or IOW, "friend" modules). Also a
bit too much exposure - I've contemplated creating a new header which is
only included by tagging protocol drivers, but completely separating a
new dsa_tag_proto.h from dsa_priv.h is not immediately trivial - for
example dsa_slave_to_port() is used both from the fast path and from the
control path.
So for now, move these definitions to dsa_priv.h which at least hides
them from the world.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Michael Walle <michael@walle.cc>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
IPQ5018, IPQ6018 and IPQ8074 require clock-names to be set as driver is
requesting the clock based on it and not index, so document that and make
it required for the listed SoC-s.
Signed-off-by: Robert Marko <robimarko@gmail.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20221114194734.3287854-4-robimarko@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Now that we can match the platforms requiring clocks by compatible start
using those to allow clocks per compatible and make them required.
Signed-off-by: Robert Marko <robimarko@gmail.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20221114194734.3287854-3-robimarko@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Allow using IPQ8074 specific compatible along with the fallback IPQ4019
one in order to be able to specify which compatibles require clocks to
be able to validate them via schema.
Signed-off-by: Robert Marko <robimarko@gmail.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20221114194734.3287854-2-robimarko@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Document IPQ6018 compatible that is already being used in the DTS along
with the fallback IPQ4019 compatible as driver itself only gets probed
on IPQ4019 and IPQ5018 compatibles.
This is also required in order to specify which platform require clock to
be defined and validate it in schema.
Signed-off-by: Robert Marko <robimarko@gmail.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20221114194734.3287854-1-robimarko@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Rasmus Villemoes says:
====================
net: dsa: use more appropriate NET_NAME_* constants for user ports
The intention of commit 685343fc3b ("net: add name_assign_type
netdev attribute") was clearly that drivers be switched over one by
one to select appropriate NET_NAME_* constants instead of
NET_NAME_UNKNOWN. This small series attempts to do that for DSA user
ports.
This is obviously and intentionally user-visible changes, so there's a
small chance that it could lead to a regression. To make it easy to
revert either of the "label in DT" and "fallback to eth%d" changes,
this is done as a refactoring which shouldn't introduce any functional
change (but by itself adds code which looks a little odd, with the two
identical assignments in the two branches), followed by changing the
constant used in each case in two different patches.
====================
Link: https://lore.kernel.org/r/20221116105205.1127843-1-linux@rasmusvillemoes.dk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When a user port does not have a label in device tree, and we thus
fall back to the eth%d scheme, the proper constant to use is
NET_NAME_ENUM. See also commit e9f656b7a2 ("net: ethernet: set
default assignment identifier to NET_NAME_ENUM"), which in turn quoted
commit 685343fc3b ("net: add name_assign_type netdev attribute"):
... when the kernel has given the interface a name using global
device enumeration based on order of discovery (ethX, wlanY, etc)
... are labelled NET_NAME_ENUM.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.faineli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When a user port has a label in device tree, the corresponding
netdevice is, to quote include/uapi/linux/netdevice.h, "predictably
named by the kernel". This is also explicitly one of the intended use
cases for NET_NAME_PREDICTABLE, quoting 685343fc3b ("net: add
name_assign_type netdev attribute"):
NET_NAME_PREDICTABLE:
The ifname has been assigned by the kernel in a predictable way
[...] Examples include [...] and names deduced from hardware
properties (including being given explicitly by the firmware).
Expose that information properly for the benefit of userspace tools
that make decisions based on the name_assign_type attribute,
e.g. a systemd-udev rule with "kernel" in NamePolicy.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.faineli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The following two patches each have a (small) chance of causing
regressions for userspace and will in that case of course need to be
reverted.
In order to prepare for that and make those two patches independent
and individually revertable, refactor the code which sets the names
for user ports by moving the "fall back to eth%d if no label is given
in device tree" to dsa_slave_create().
No functional change (at least none intended).
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.faineli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Many of the drivers which implement ethtool_ops::get_drvinfo() will
prints the .driver, .version or .bus_info of struct ethtool_drvinfo.
To have a glance of current state, do:
$ git grep -W "get_drvinfo(struct"
Printing in those three fields is useless because:
- since [1], the driver version should be the kernel version (at
least for upstream drivers). Arguably, out of tree drivers might
still want to set a custom version, but out of tree is not our
focus.
- since [2], the core is able to provide default values for .driver
and .bus_info.
In summary, drivers may provide .fw_version and .erom_version, the
rest is expected to be done by the core.
In struct ethtool_ops doc from linux/ethtool: rephrase field
get_drvinfo() doc to discourage developers from implementing this
callback.
In struct ethtool_drvinfo doc from uapi/linux/ethtool.h: remove the
paragraph mentioning what drivers should do. Rationale: no need to
repeat what is already written in struct ethtool_ops doc. But add a
note that .fw_version and .erom_version are driver defined.
Also update the dummy driver and simply remove the callback in order
not to confuse the newcomers: most of the drivers will not need this
callback function any more.
[1] commit 6a7e25c7fb ("net/core: Replace driver version to be
kernel version")
Link: https://git.kernel.org/torvalds/linux/c/6a7e25c7fb48
[2] commit edaf5df22c ("ethtool: ethtool_get_drvinfo: populate
drvinfo fields even if callback exits")
Link: https://git.kernel.org/netdev/net-next/c/edaf5df22cb8
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Link: https://lore.kernel.org/r/20221116171828.4093-1-mailhol.vincent@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Current release - regressions:
- tls: fix memory leak in tls_enc_skb() and tls_sw_fallback_init()
Previous releases - regressions:
- bridge: fix memory leaks when changing VLAN protocol
- dsa: make dsa_master_ioctl() see through port_hwtstamp_get() shims
- dsa: don't leak tagger-owned storage on switch driver unbind
- eth: mlxsw: avoid warnings when not offloaded FDB entry with IPv6 is removed
- eth: stmmac: ensure tx function is not running in stmmac_xdp_release()
- eth: hns3: fix return value check bug of rx copybreak
Previous releases - always broken:
- kcm: close race conditions on sk_receive_queue
- bpf: fix alignment problem in bpf_prog_test_run_skb()
- bpf: fix writing offset in case of fault in strncpy_from_kernel_nofault
- eth: macvlan: use built-in RCU list checking
- eth: marvell: add sleep time after enabling the loopback bit
- eth: octeon_ep: fix potential memory leak in octep_device_setup()
Misc:
- tcp: configurable source port perturb table size
- bpf: Convert BPF_DISPATCHER to use static_call() (not ftrace)
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmN2FlMSHHBhYmVuaUBy
ZWRoYXQuY29tAAoJECkkeY3MjxOkWAwQAJcV7XEB7bEssgabFkEmC4uvS/sFlyHC
uSwFRn5ojaB2c56T1CnNYmitg9Wr4arC6Vca28iai6BgqB6t4qLRI/WWTsZiEPhi
mt/pjNN2u9JMyaafHFHYfXnbSDWRF7kPMpNw4l3uL0vkGyjSI7LGAOP4Qh8C1h/d
tNVSDZnj4Laj/3JtDf7Rk6ydCqPYnNdWxFfoZ/SQkjYZKD3Ze9tml7WJykAzCTLp
yUiPC6TvHOnWIZYbB04sVVOQD4V+95TSOgEhB6wzs/CXB7iBEY+N+oCedjP9Xrfw
n3ea4anBoTleDnJXJI57LhdJBkyoXncfbpbYLwXljyIgosr7XVTALvOG8XUhg/DW
FzN5DWQ54jzTsx2eXFJzjQQcDIgyxazk9EdoHdqF8byCasP+fofq1JvzyqtvNSyh
h8Ps6jdMZrWpXuFDVApXUhP32A/+9q+dFSYHJO681m6mf4CIaUXdm4aB1dkxDAvg
PSlk797U94RQCzJgqxhrgsq1PGQPBb+qadZrAiD3aQi26g0NWCTg7uFpCeCEK2ZF
fLwc2XxrwLQm1q7xQVoEg4UxPIIf0mUesvOD9sLDYop0rFIw8x0v7jdYM4kyhN3o
6FWAXKxBe3LJ9jTTsVTbZbfHYpTnS8Q2KSclBN+/dZNHwwsUPHjy17Z2Ct3o3Jlm
lNbiiD30BgsD
=vVJk
-----END PGP SIGNATURE-----
Merge tag 'net-6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from bpf.
Current release - regressions:
- tls: fix memory leak in tls_enc_skb() and tls_sw_fallback_init()
Previous releases - regressions:
- bridge: fix memory leaks when changing VLAN protocol
- dsa: make dsa_master_ioctl() see through port_hwtstamp_get() shims
- dsa: don't leak tagger-owned storage on switch driver unbind
- eth: mlxsw: avoid warnings when not offloaded FDB entry with IPv6
is removed
- eth: stmmac: ensure tx function is not running in
stmmac_xdp_release()
- eth: hns3: fix return value check bug of rx copybreak
Previous releases - always broken:
- kcm: close race conditions on sk_receive_queue
- bpf: fix alignment problem in bpf_prog_test_run_skb()
- bpf: fix writing offset in case of fault in
strncpy_from_kernel_nofault
- eth: macvlan: use built-in RCU list checking
- eth: marvell: add sleep time after enabling the loopback bit
- eth: octeon_ep: fix potential memory leak in octep_device_setup()
Misc:
- tcp: configurable source port perturb table size
- bpf: Convert BPF_DISPATCHER to use static_call() (not ftrace)"
* tag 'net-6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (51 commits)
net: use struct_group to copy ip/ipv6 header addresses
net: usb: smsc95xx: fix external PHY reset
net: usb: qmi_wwan: add Telit 0x103a composition
netdevsim: Fix memory leak of nsim_dev->fa_cookie
tcp: configurable source port perturb table size
l2tp: Serialize access to sk_user_data with sk_callback_lock
net: thunderbolt: Fix error handling in tbnet_init()
net: microchip: sparx5: Fix potential null-ptr-deref in sparx_stats_init() and sparx5_start()
net: lan966x: Fix potential null-ptr-deref in lan966x_stats_init()
net: dsa: don't leak tagger-owned storage on switch driver unbind
net/x25: Fix skb leak in x25_lapb_receive_frame()
net: ag71xx: call phylink_disconnect_phy if ag71xx_hw_enable() fail in ag71xx_open()
bridge: switchdev: Fix memory leaks when changing VLAN protocol
net: hns3: fix setting incorrect phy link ksettings for firmware in resetting process
net: hns3: fix return value check bug of rx copybreak
net: hns3: fix incorrect hw rss hash type of rx packet
net: phy: marvell: add sleep time after enabling the loopback bit
net: ena: Fix error handling in ena_init()
kcm: close race conditions on sk_receive_queue
net: ionic: Fix error handling in ionic_init_module()
...
The rswitch_etha_wait_link_verification() is supposed to return zero
on success or negative error codes. Unfortunately it is declared as a
bool so the caller treats everything as success.
Fixes: 3590918b5d ("net: ethernet: renesas: Add support for "Ethernet Switch"")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Link: https://lore.kernel.org/r/Y3OPo6AOL6PTvXFU@kili
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
* iwlmei fixes
* Debug mechanism update for new devices (BZ)
* Checksum offload fix for the new devices (BZ)
* A few rate scale fixes and cleanups
* A fix for iwlwifi debug mechanism
* Start of MLO preparations - supporting new key API
-----BEGIN PGP SIGNATURE-----
iQJPBAABCAA5FiEE9cg2NujikJ5EMZusCDCCYA5zdzwFAmNs5fsbHGdyZWdvcnku
Z3JlZW5tYW5AaW50ZWwuY29tAAoJEAgwgmAOc3c8HycP/29xLeQi5PDCuU4Y1oDq
vp6ddZUwvfgNaeGn+hv4nbob/gpXzaWgXxum3QrE2AwB95U6UNjNLjCAKh+2HS3V
tzClp5kNU6TWIwgUnfWfuGWQjqC5X9YOG+LLjKpDXZgm4StfMn5vg7U3temxIpye
OA1Mdk992eJ+vp38zJmYIofh5GiZ/TXgxDoGMm5B2CeHyRt6WlUZh+9lFLXZ5maw
JLfLqoAhQHs0WGGDSNjCCImmspCDiueiX07O3W9EO9A9Hmv6iSb1PDsxOcvcuxml
YoeqgALtz3i6LIjeEtMi50dinlSr3THbxdBWQZF5QTpWZqvXvH5Zq9zq6ymU7f0Q
tN35uIDgbY5W0ByXf021rTmifhHMwhiSp0WHdEgo/tKt9g1Zkr4FtoRL+JUbmRd9
CCGErUb/7nCAtSi/FKqvF1xUouFws38i4xbvBI1Vg9lfAwqx6MEfkxwrREW78jur
byl5HYtBfASAPt0oWaSUMBCNQwPGdj0rmgQNAkR/bMgS+9YZdicG4uuSVJQp77/P
unE0eOyBCIcZBQji9l3V8kuZuvm7mZrAJp5POlRyvhtd4szCboq/OF7ke7LKYJqo
tsuO/ouvkFjIJXtYUtNOxU51lBWoACPYJBTXe/4qc3O2cSl3KDXMQ7HKU8eGK42V
sE3yKbbwuIZAm+VYTgnY0Uy4
=ggBO
-----END PGP SIGNATURE-----
Merge tag 'iwlwifi-next-for-kalle-2022-11-06-v2' of http://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-next
iwlwifi patches intended for v6.2
* iwlmei fixes
* Debug mechanism update for new devices (BZ)
* Checksum offload fix for the new devices (BZ)
* A few rate scale fixes and cleanups
* A fix for iwlwifi debug mechanism
* Start of MLO preparations - supporting new key API
The current virtual nci driver is great for testing and fuzzing.
But it allows to create at most one "global" device which does not allow
to run parallel tests and harms fuzzing isolation and reproducibility.
Restructure the driver to allow creation of multiple independent devices.
This should be backwards compatible for existing tests.
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Bongsu Jeon <bongsu.jeon@samsung.com>
Cc: Bongsu Jeon <bongsu.jeon@samsung.com>
Cc: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: netdev@vger.kernel.org
Link: https://lore.kernel.org/r/20221115100017.787929-1-dvyukov@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Variable cnt is just being incremented and it's never used
anywhere else. The variable and the increment are redundant so
remove it.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20221115093137.144002-1-colin.i.king@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
kernel test robot reported warnings when build bonding module with
make W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash drivers/net/bonding/:
from ../drivers/net/bonding/bond_main.c:35:
In function ‘fortify_memcpy_chk’,
inlined from ‘iph_to_flow_copy_v4addrs’ at ../include/net/ip.h:566:2,
inlined from ‘bond_flow_ip’ at ../drivers/net/bonding/bond_main.c:3984:3:
../include/linux/fortify-string.h:413:25: warning: call to ‘__read_overflow2_field’ declared with attribute warning: detected read beyond size of f
ield (2nd parameter); maybe use struct_group()? [-Wattribute-warning]
413 | __read_overflow2_field(q_size_field, size);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In function ‘fortify_memcpy_chk’,
inlined from ‘iph_to_flow_copy_v6addrs’ at ../include/net/ipv6.h:900:2,
inlined from ‘bond_flow_ip’ at ../drivers/net/bonding/bond_main.c:3994:3:
../include/linux/fortify-string.h:413:25: warning: call to ‘__read_overflow2_field’ declared with attribute warning: detected read beyond size of f
ield (2nd parameter); maybe use struct_group()? [-Wattribute-warning]
413 | __read_overflow2_field(q_size_field, size);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This is because we try to copy the whole ip/ip6 address to the flow_key,
while we only point the to ip/ip6 saddr. Note that since these are UAPI
headers, __struct_group() is used to avoid the compiler warnings.
Reported-by: kernel test robot <lkp@intel.com>
Fixes: c3f8324188 ("net: Add full IPv6 addresses to flow_keys")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20221115142400.1204786-1-liuhangbin@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
An external PHY needs settling time after power up or reset.
In the bind() function an mdio bus is registered. If at this point
the external PHY is still initialising, no valid PHY ID will be
read and on phy_find_first() the bind() function will fail.
If an external PHY is present, wait the maximum time specified
in 802.3 45.2.7.1.1.
Fixes: 05b35e7eb9 ("smsc95xx: add phylib support")
Signed-off-by: Alexandru Tachici <alexandru.tachici@analog.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20221115114434.9991-2-alexandru.tachici@analog.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>