Commit Graph

1428 Commits

Author SHA1 Message Date
Vladimir Oltean
0064b863ab net: dsa: tag_sja1105: absorb entire sja1105_vlan_rcv() into dsa_8021q_rcv()
tag_sja1105 has a wrapper over dsa_8021q_rcv(): sja1105_vlan_rcv(),
which determines whether the packet came from a bridge with
vlan_filtering=1 (the case resolved via
dsa_find_designated_bridge_port_by_vid()), or if it contains a tag_8021q
header.

Looking at a new tagger implementation for vsc73xx, based also on
tag_8021q, it is becoming clear that the logic is needed there as well.
So instead of forcing each tagger to wrap around dsa_8021q_rcv(), let's
merge the logic into the core.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Tested-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com>
Link: https://patch.msgid.link/20240713211620.1125910-5-paweldembicki@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-15 06:55:15 -07:00
Vladimir Oltean
dcfe767378 net: dsa: tag_sja1105: absorb logic for not overwriting precise info into dsa_8021q_rcv()
In both sja1105_rcv() and sja1110_rcv(), we may have precise source port
information coming from parallel hardware mechanisms, in addition to the
tag_8021q header.

Only sja1105_rcv() has extra logic to not overwrite that precise info
with what's present in the VLAN tag. This is because sja1110_rcv() gets
by, by having a reversed set of checks when assigning skb->dev. When the
source port is imprecise (vbid >=1), source_port and switch_id will be
set to zeroes by dsa_8021q_rcv(), which might be problematic. But by
checking for vbid >= 1 first, sja1110_rcv() fends that off.

We would like to make more code common between sja1105_rcv() and
sja1110_rcv(), and for that, we need to make sure that sja1110_rcv()
also goes through the precise source port preservation logic.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Tested-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://patch.msgid.link/20240713211620.1125910-4-paweldembicki@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-15 06:55:15 -07:00
Chris Packham
983e44f0ee net: dsa: Fix typo in NET_DSA_TAG_RTL4_A Kconfig
Fix a minor typo in the help text for the NET_DSA_TAG_RTL4_A config
option.

Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Luiz Angelo Daros de Luca <luizluca@gmail.com>
Link: https://lore.kernel.org/r/20240607020843.1380735-1-chris.packham@alliedtelesis.co.nz
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-11 19:24:11 -07:00
Marek Behún
eef8e906ae net: dsa: update the unicast MAC address when changing conduit
When changing DSA user interface conduit while the user interface is up,
DSA exhibits different behavior in comparison to when the interface is
down. This different behavior concerns the primary unicast MAC address
stored in the port standalone FDB and in the conduit device UC database.

If we put a switch port down while changing the conduit with
  ip link set sw0p0 down
  ip link set sw0p0 type dsa conduit conduit1
  ip link set sw0p0 up
we delete the address in dsa_user_close() and install the (possibly
different) address in dsa_user_open().

But when changing the conduit on the fly, the old address is not
deleted and the new one is not installed.

Since we explicitly want to support live-changing the conduit, uninstall
the old address before calling dsa_port_assign_conduit() and install the
(possibly different) new address after the call.

Because conduit change might also trigger address change (the user
interface is supposed to inherit the conduit interface MAC address if no
address is defined in hardware (dp->mac is a zero address)), move the
eth_hw_addr_inherit() call from dsa_user_change_conduit() to
dsa_port_change_conduit(), just before installing the new address.

Although this is in theory a flaw in DSA core, it needs not be
backported, since there is currently no DSA driver that can be affected
by this. The only DSA driver that supports changing conduit is felix,
and, as explained by Vladimir Oltean [1]:

  There are 2 reasons why with felix the bug does not manifest itself.

  First is because both the 'ocelot' and the alternate 'ocelot-8021q'
  tagging protocols have the 'promisc_on_conduit = true' flag. So the
  unicast address doesn't have to be in the conduit's RX filter -
  neither the old or the new conduit.

  Second, dsa_user_host_uc_install() theoretically leaves behind host
  FDB entries installed towards the wrong (old) CPU port. But in
  felix_fdb_add(), we treat any FDB entry requested towards any CPU port
  as if it was a multicast FDB entry programmed towards _all_ CPU ports.
  For that reason, it is installed towards the port mask of the PGID_CPU
  port group ID:

	if (dsa_port_is_cpu(dp))
		port = PGID_CPU;

Therefore no Fixes tag for this change.

[1] https://lore.kernel.org/netdev/20240507201827.47suw4fwcjrbungy@skbuf/
Signed-off-by: Marek Behún <kabel@kernel.org>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Tested-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-10 13:48:06 +01:00
Marek Behún
77f7541248 net: dsa: deduplicate code adding / deleting the port address to fdb
The sequence
  if (dsa_switch_supports_uc_filtering(ds))
    dsa_port_standalone_host_fdb_add(dp, addr, 0);
  if (!ether_addr_equal(addr, conduit->dev_addr))
    dev_uc_add(conduit, addr);
is executed both in dsa_user_open() and dsa_user_set_mac_addr().

Its reverse is executed both in dsa_user_close() and
dsa_user_set_mac_addr().

Refactor these sequences into new functions dsa_user_host_uc_install()
and dsa_user_host_uc_uninstall().

Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-10 13:48:06 +01:00
Russell King (Oracle)
bbb31b7ae1 net: dsa: remove mac_prepare()/mac_finish() shims
No DSA driver makes use of the mac_prepare()/mac_finish() shimmed
operations anymore, so we can remove these.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/E1sByNx-00ELW1-Vp@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-29 18:41:15 -07:00
Steven Rostedt (Google)
2c92ca849f tracing/treewide: Remove second parameter of __assign_str()
With the rework of how the __string() handles dynamic strings where it
saves off the source string in field in the helper structure[1], the
assignment of that value to the trace event field is stored in the helper
value and does not need to be passed in again.

This means that with:

  __string(field, mystring)

Which use to be assigned with __assign_str(field, mystring), no longer
needs the second parameter and it is unused. With this, __assign_str()
will now only get a single parameter.

There's over 700 users of __assign_str() and because coccinelle does not
handle the TRACE_EVENT() macro I ended up using the following sed script:

  git grep -l __assign_str | while read a ; do
      sed -e 's/\(__assign_str([^,]*[^ ,]\) *,[^;]*/\1)/' $a > /tmp/test-file;
      mv /tmp/test-file $a;
  done

I then searched for __assign_str() that did not end with ';' as those
were multi line assignments that the sed script above would fail to catch.

Note, the same updates will need to be done for:

  __assign_str_len()
  __assign_rel_str()
  __assign_rel_str_len()

I tested this with both an allmodconfig and an allyesconfig (build only for both).

[1] https://lore.kernel.org/linux-trace-kernel/20240222211442.634192653@goodmis.org/

Link: https://lore.kernel.org/linux-trace-kernel/20240516133454.681ba6a0@rorschach.local.home

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Christian König <christian.koenig@amd.com> for the amdgpu parts.
Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> #for
Acked-by: Rafael J. Wysocki <rafael@kernel.org> # for thermal
Acked-by: Takashi Iwai <tiwai@suse.de>
Acked-by: Darrick J. Wong <djwong@kernel.org>	# xfs
Tested-by: Guenter Roeck <linux@roeck-us.net>
2024-05-22 20:14:47 -04:00
Oleksij Rempel
5f5109af47 net: dsa: add support switches global DSCP priority mapping
Some switches like Microchip KSZ variants do not support per port DSCP
priority configuration. Instead there is a global DSCP mapping table.

To handle it, we will accept set/del request to any of user ports to
make global configuration and update dcb app entries for all other
ports.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-05-08 10:35:10 +01:00
Oleksij Rempel
96c6f33795 net: dsa: add support for DCB get/set apptrust configuration
Add DCB support to get/set trust configuration for different packet
priority information sources. Some switch allow to chose different
source of packet priority classification. For example on KSZ switches it
is possible to configure VLAN PCP and/or DSCP sources.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-05-08 10:35:09 +01:00
Eric Dumazet
1eb2cded45 net: annotate writes on dev->mtu from ndo_change_mtu()
Simon reported that ndo_change_mtu() methods were never
updated to use WRITE_ONCE(dev->mtu, new_mtu) as hinted
in commit 501a90c945 ("inet: protect against too small
mtu values.")

We read dev->mtu without holding RTNL in many places,
with READ_ONCE() annotations.

It is time to take care of ndo_change_mtu() methods
to use corresponding WRITE_ONCE()

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Simon Horman <horms@kernel.org>
Closes: https://lore.kernel.org/netdev/20240505144608.GB67882@kernel.org/
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Shannon Nelson <shannon.nelson@amd.com>
Link: https://lore.kernel.org/r/20240506102812.3025432-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-07 16:19:14 -07:00
Florian Fainelli
8a021a863a net: dsa: Remove adjust_link paths
Now that we no longer any drivers using PHYLIB's adjust_link callback,
remove all paths that made use of adjust_link as well as the associated
functions.

Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/20240430164816.2400606-3-florian.fainelli@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-01 19:06:32 -07:00
Mateusz Polchlopek
5625ca5640 devlink: extend devlink_param *set pointer
Extend devlink_param *set function pointer to take extack as a param.
Sometimes it is needed to pass information to the end user from set
function. It is more proper to use for that netlink instead of passing
message to dmesg.

Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-04-22 13:05:19 -07:00
Russell King (Oracle)
a788fafff5 net: dsa: convert dsa_user_phylink_fixed_state() to use dsa_phylink_to_port()
Convert dsa_user_phylink_fixed_state() to use the newly introduced
dsa_phylink_to_port() helper.

Suggested-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-15 10:48:41 +01:00
Russell King (Oracle)
cae425cb43 net: dsa: allow DSA switch drivers to provide their own phylink mac ops
Rather than having a shim for each and every phylink MAC operation,
allow DSA switch drivers to provide their own ops structure. When a
DSA driver provides the phylink MAC operations, the shimmed ops must
not be provided, so fail an attempt to register a switch with both
the phylink_mac_ops in struct dsa_switch and the phylink_mac_*
operations populated in dsa_switch_ops populated.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/E1rudqF-006K9H-Cc@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-11 20:01:14 -07:00
Russell King (Oracle)
dd0c9855b4 net: dsa: introduce dsa_phylink_to_port()
We convert from a phylink_config struct to a dsa_port struct in many
places, let's provide a helper for this.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/E1rudqA-006K9B-85@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-11 20:01:14 -07:00
Florian Fainelli
4f6473ad60 net: dsa: Leverage core stats allocator
With commit 34d21de99c ("net: Move {l,t,d}stats allocation to core
and convert veth & vrf"), stats allocation could be done on net core
instead of in this driver.

With this new approach, the driver doesn't have to bother with error
handling (allocation failure checking, making sure free happens in the
right spot, etc). This is core responsibility now.

Remove the allocation in the DSA user network device code and leverage
the network core allocation instead.

Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Breno Leitao <leitao@debian.org>
Link: https://lore.kernel.org/r/20240306200416.2973179-1-florian.fainelli@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-07 20:37:13 -08:00
Eric Dumazet
e353ea9ce4 rtnetlink: prepare nla_put_iflink() to run under RCU
We want to be able to run rtnl_fill_ifinfo() under RCU protection
instead of RTNL in the future.

This patch prepares dev_get_iflink() and nla_put_iflink()
to run either with RTNL or RCU held.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-26 11:46:12 +00:00
Ricardo B. Marliere
7e0acba3b4 net: dsa: constify the struct device_type usage
Since commit aed65af1cc ("drivers: make device_type const"), the driver
core can properly handle constant struct device_type. Move the dsa_type
variable to be a constant structure as well, placing it into read-only
memory which can not be modified at runtime.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-21 09:45:22 +00:00
Arınç ÜNAL
ae94dc25fd net: dsa: remove OF-based MDIO bus registration from DSA core
The code block under the "!ds->user_mii_bus && ds->ops->phy_read" check
under dsa_switch_setup() populates ds->user_mii_bus. The use of
ds->user_mii_bus is inappropriate when the MDIO bus of the switch is
described on the device tree [1].

For this reason, use this code block only for switches [with MDIO bus]
probed on platform_data, and OF which the switch MDIO bus isn't described
on the device tree. Therefore, remove OF-based MDIO bus registration as
it's useless for these cases.

These subdrivers which control switches [with MDIO bus] probed on OF, will
lose the ability to register the MDIO bus OF-based:

drivers/net/dsa/b53/b53_common.c
drivers/net/dsa/lan9303-core.c
drivers/net/dsa/vitesse-vsc73xx-core.c

These subdrivers let the DSA core driver register the bus:
- ds->ops->phy_read() and ds->ops->phy_write() are present.
- ds->user_mii_bus is not populated.

The commit fe7324b932 ("net: dsa: OF-ware slave_mii_bus") which brought
OF-based MDIO bus registration on the DSA core driver is reasonably recent
and, in this time frame, there have been no device trees in the Linux
repository that started describing the MDIO bus, or dt-bindings defining
the MDIO bus for the switches these subdrivers control. So I don't expect
any devices to be affected.

The logic we encourage is that all subdrivers should register the switch
MDIO bus on their own [2]. And, for subdrivers which control switches [with
MDIO bus] probed on OF, this logic must be followed to support all cases
properly:

No switch MDIO bus defined: Populate ds->user_mii_bus, register the MDIO
bus, set the interrupts for PHYs if "interrupt-controller" is defined at
the switch node. This case should only be covered for the switches which
their dt-bindings documentation didn't document the MDIO bus from the
start. This is to keep supporting the device trees that do not describe the
MDIO bus on the device tree but the MDIO bus is being used nonetheless.

Switch MDIO bus defined: Don't populate ds->user_mii_bus, register the MDIO
bus, set the interrupts for PHYs if ["interrupt-controller" is defined at
the switch node and "interrupts" is defined at the PHY nodes under the
switch MDIO bus node].

Switch MDIO bus defined but explicitly disabled: If the device tree says
status = "disabled" for the MDIO bus, we shouldn't need an MDIO bus at all.
Instead, just exit as early as possible and do not call any MDIO API.

After all subdrivers that control switches with MDIO buses are made to
register the MDIO buses on their own, we will be able to get rid of
dsa_switch_ops :: phy_read() and :: phy_write(), and the code block for
registering the MDIO bus on the DSA core driver.

Link: https://lore.kernel.org/netdev/20231213120656.x46fyad6ls7sqyzv@skbuf/ [1]
Link: https://lore.kernel.org/netdev/20240103184459.dcbh57wdnlox6w7d@skbuf/ [2]
Suggested-by: Luiz Angelo Daros de Luca <luizluca@gmail.com>
Acked-by: Luiz Angelo Daros de Luca <luizluca@gmail.com>
Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/20240213-for-netnext-dsa-mdio-bus-v2-1-0ff6f4823a9e@arinc9.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-15 21:30:02 -08:00
Vladimir Oltean
36f75f74dc net: dsa: tag_sja1105: remove "inline" keyword
The convention is to not use the "inline" keyword for functions in C
files, but to let the compiler choose.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240206112927.4134375-2-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-08 19:03:57 -08:00
Vladimir Oltean
83acbb9d07 net: dsa: remove "inline" from dsa_user_netpoll_send_skb()
The convention is to not use "inline" functions in C files, and let the
compiler decide whether to inline or not.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240206112927.4134375-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-08 19:03:57 -08:00
Vladimir Oltean
0cd216d769 net: dsa: reindent arguments of dsa_user_vlan_for_each()
These got misaligned after commit 6ca80638b9 ("net: dsa: Use conduit
and user terms").

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-05 12:37:06 +00:00
Heiner Kallweit
d80a523353 ethtool: replace struct ethtool_eee with a new struct ethtool_keee on kernel side
In order to pass EEE link modes beyond bit 32 to userspace we have to
complement the 32 bit bitmaps in struct ethtool_eee with linkmode
bitmaps. Therefore, similar to ethtool_link_settings and
ethtool_link_ksettings, add a struct ethtool_keee. In a first step
it's an identical copy of ethtool_eee. This patch simply does a
s/ethtool_eee/ethtool_keee/g for all users.
No functional change intended.

Suggested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-31 12:30:47 +00:00
Vladimir Oltean
844f104790 net: dsa: fix netdev_priv() dereference before check on non-DSA netdevice events
After the blamed commit, we started doing this dereference for every
NETDEV_CHANGEUPPER and NETDEV_PRECHANGEUPPER event in the system.

static inline struct dsa_port *dsa_user_to_port(const struct net_device *dev)
{
	struct dsa_user_priv *p = netdev_priv(dev);

	return p->dp;
}

Which is obviously bogus, because not all net_devices have a netdev_priv()
of type struct dsa_user_priv. But struct dsa_user_priv is fairly small,
and p->dp means dereferencing 8 bytes starting with offset 16. Most
drivers allocate that much private memory anyway, making our access not
fault, and we discard the bogus data quickly afterwards, so this wasn't
caught.

But the dummy interface is somewhat special in that it calls
alloc_netdev() with a priv size of 0. So every netdev_priv() dereference
is invalid, and we get this when we emit a NETDEV_PRECHANGEUPPER event
with a VLAN as its new upper:

$ ip link add dummy1 type dummy
$ ip link add link dummy1 name dummy1.100 type vlan id 100
[   43.309174] ==================================================================
[   43.316456] BUG: KASAN: slab-out-of-bounds in dsa_user_prechangeupper+0x30/0xe8
[   43.323835] Read of size 8 at addr ffff3f86481d2990 by task ip/374
[   43.330058]
[   43.342436] Call trace:
[   43.366542]  dsa_user_prechangeupper+0x30/0xe8
[   43.371024]  dsa_user_netdevice_event+0xb38/0xee8
[   43.375768]  notifier_call_chain+0xa4/0x210
[   43.379985]  raw_notifier_call_chain+0x24/0x38
[   43.384464]  __netdev_upper_dev_link+0x3ec/0x5d8
[   43.389120]  netdev_upper_dev_link+0x70/0xa8
[   43.393424]  register_vlan_dev+0x1bc/0x310
[   43.397554]  vlan_newlink+0x210/0x248
[   43.401247]  rtnl_newlink+0x9fc/0xe30
[   43.404942]  rtnetlink_rcv_msg+0x378/0x580

Avoid the kernel oops by dereferencing after the type check, as customary.

Fixes: 4c3f80d22b ("net: dsa: walk through all changeupper notifier functions")
Reported-and-tested-by: syzbot+d81bcd883824180500c8@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/0000000000001d4255060e87545c@google.com/
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20240110003354.2796778-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-11 16:33:52 -08:00
Jakub Kicinski
0ed6e95255 net: fill in MODULE_DESCRIPTION()s for DSA tags
W=1 builds now warn if module is built without a MODULE_DESCRIPTION().
Add descriptions to all the DSA tag modules.

The descriptions are copy/pasted Kconfig names, with s/^Tag/DSA tag/.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Acked-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Acked-by: Kurt Kanzenbach <kurt@linutronix.de>
Link: https://lore.kernel.org/r/20240104143759.1318137-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-05 08:06:19 -08:00
Radu Pirea (NXP OSS)
90abde49ea net: rename dsa_realloc_skb to skb_ensure_writable_head_tail
Rename dsa_realloc_skb to skb_ensure_writable_head_tail and move it to
skbuff.c to use it as helper.

Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-27 13:08:09 +00:00
Florian Fainelli
8fedaaca40 net: dsa: tag_rtl4_a: Use existing ETH_P_REALTEK constant
No functional change, uses the existing ETH_P_REALTEK constant already
defined in if_ether.h.

Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20231113165030.2440083-1-florian.fainelli@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-14 19:45:35 -08:00
Florian Fainelli
87cd83714f net: dsa: Rename IFLA_DSA_MASTER to IFLA_DSA_CONDUIT
This preserves the existing IFLA_DSA_MASTER which is part of the uAPI
and creates an alias named IFLA_DSA_CONDUIT.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20231023181729.1191071-3-florian.fainelli@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-24 13:08:14 -07:00
Florian Fainelli
6ca80638b9 net: dsa: Use conduit and user terms
Use more inclusive terms throughout the DSA subsystem by moving away
from "master" which is replaced by "conduit" and "slave" which is
replaced by "user". No functional changes.

Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20231023181729.1191071-2-florian.fainelli@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-24 13:08:14 -07:00
Russell King (Oracle)
63b9f7a19f net: dsa: remove dsa_port_phylink_validate()
As all drivers now provide phylink capabilities (including MAC), the
if() condition in dsa_port_phylink_validate() will always be true. We
will always use the generic validator, which phylink will call itself
if the .validate method isn't populated. Thus, there is now no need to
implement the .validate method, so this implementation can be removed.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-11 10:06:05 +01:00
Lukasz Majewski
5e5db71a92 net: dsa: tag_ksz: Extend ksz9477_xmit() for HSR frame duplication
The KSZ9477 has support for HSR (High-Availability Seamless Redundancy).
One of its offloading (i.e. performed in the switch IC hardware) features
is to duplicate received frame to both HSR aware switch ports.

To achieve this goal - the tail TAG needs to be modified. To be more
specific, both ports must be marked as destination (egress) ones.

The NETIF_F_HW_HSR_DUP flag indicates that the device supports HSR and
assures (in HSR core code) that frame is sent only once from HOST to
switch with tail tag indicating both ports.

Signed-off-by: Lukasz Majewski <lukma@denx.de>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-03 13:51:02 +02:00
Vladimir Oltean
6715042cd1 net: dsa: notify drivers of MAC address changes on user ports
In some cases, drivers may need to veto the changing of a MAC address on
a user port. Such is the case with KSZ9477 when it offloads a HSR device,
because it programs the MAC address of multiple ports to a shared
hardware register. Those ports need to have equal MAC addresses for the
lifetime of the HSR offload.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Lukasz Majewski <lukma@denx.de>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-03 13:51:02 +02:00
Vladimir Oltean
fefe5dc4af net: dsa: propagate extack to ds->ops->port_hsr_join()
Drivers can provide meaningful error messages which state a reason why
they can't perform an offload, and dsa_slave_changeupper() already has
the infrastructure to propagate these over netlink rather than printing
to the kernel log. So pass the extack argument and modify the xrs700x
driver's port_hsr_join() prototype.

Also take the opportunity and use the extack for the 2 -EOPNOTSUPP cases
from xrs700x_hsr_join().

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Lukasz Majewski <lukma@denx.de>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-03 13:51:02 +02:00
Russell King (Oracle)
145622771d net: dsa: mark parsed interface mode for legacy switch drivers
If we successfully parsed an interface mode with a legacy switch
driver, populate that mode into phylink's supported interfaces rather
than defaulting to the internal and gmii interfaces.

This hasn't caused an issue so far, because when the interface doesn't
match a supported one, phylink_validate() doesn't clear the supported
mask, but instead returns -EINVAL. phylink_parse_fixedlink() doesn't
check this return value, and merely relies on the supported ethtool
link modes mask being cleared. Therefore, the fixed link settings end
up being allowed despite validation failing.

Before this causes a problem, arrange for DSA to more accurately
populate phylink's supported interfaces mask so validation can
correctly succeed.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/E1qTKdM-003Cpx-Eh@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-09 13:08:09 -07:00
Jakub Kicinski
35b1b1fd96 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

net/dsa/port.c
  9945c1fb03 ("net: dsa: fix older DSA drivers using phylink")
  a88dd75384 ("net: dsa: remove legacy_pre_march2020 detection")
https://lore.kernel.org/all/20230731102254.2c9868ca@canb.auug.org.au/

net/xdp/xsk.c
  3c5b4d69c3 ("net: annotate data-races around sk->sk_mark")
  b7f72a30e9 ("xsk: introduce wrappers and helpers for supporting multi-buffer in Tx path")
https://lore.kernel.org/all/20230731102631.39988412@canb.auug.org.au/

drivers/net/ethernet/broadcom/bnxt/bnxt.c
  37b61cda9c ("bnxt: don't handle XDP in netpoll")
  2b56b3d992 ("eth: bnxt: handle invalid Tx completions more gracefully")
https://lore.kernel.org/all/20230801101708.1dc7faac@canb.auug.org.au/

Adjacent changes:

drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c
  62da08331f ("net/mlx5e: Set proper IPsec source port in L4 selector")
  fbd517549c ("net/mlx5e: Add function to get IPsec offload namespace")

drivers/net/ethernet/sfc/selftest.c
  55c1528f9b ("sfc: fix field-spanning memcpy in selftest")
  ae9d445cd4 ("sfc: Miscellaneous comment removals")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-03 14:34:37 -07:00
Christian Marangi
de9db136dc net: dsa: tag_qca: return early if dev is not found
Currently checksum is recalculated and dsa tag stripped even if we later
don't find the dev.

To improve code, exit early if we don't find the dev and skip additional
operation on the skb since it will be freed anyway.

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20230730074113.21889-1-ansuelsmth@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-08-01 12:02:42 +02:00
Russell King (Oracle)
9945c1fb03 net: dsa: fix older DSA drivers using phylink
Older DSA drivers that do not provide an dsa_ops adjust_link method end
up using phylink. Unfortunately, a recent phylink change that requires
its supported_interfaces bitmap to be filled breaks these drivers
because the bitmap remains empty.

Rather than fixing each driver individually, fix it in the core code so
we have a sensible set of defaults.

Reported-by: Sergei Antonov <saproj@gmail.com>
Fixes: de5c9bf40c ("net: phylink: require supported_interfaces to be filled")
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Tested-by: Vladimir Oltean <olteanv@gmail.com> # dsa_loop
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/E1qOflM-001AEz-D3@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-07-27 17:19:46 -07:00
justinstitt@google.com
5c9f7b04aa net: dsa: remove deprecated strncpy
`strncpy` is deprecated for use on NUL-terminated destination strings [1].

Even call sites utilizing length-bounded destination buffers should
switch over to using `strtomem` or `strtomem_pad`. In this case,
however, the compiler is unable to determine the size of the `data`
buffer which renders `strtomem` unusable. Due to this, `strscpy`
should be used.

It should be noted that most call sites already zero-initialize the
destination buffer. However, I've opted to use `strscpy_pad` to maintain
the same exact behavior that `strncpy` produced (zero-padded tail up to
`len`).

Also see [3].

[1]: www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings
[2]: elixir.bootlin.com/linux/v6.3/source/net/ethtool/ioctl.c#L1944
[3]: manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html

Link: https://github.com/KSPP/linux/issues/90
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Justin Stitt <justinstitt@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-07-23 11:45:46 +01:00
Russell King (Oracle)
a88dd75384 net: dsa: remove legacy_pre_march2020 detection
All drivers are now updated for the March 2020 changes, and no longer
make use of the mac_pcs_get_state() or mac_an_restart() operations,
which are now NULL across all DSA drivers. All DSA drivers don't look
at speed, duplex, pause or advertisement in their phylink_mac_config()
method either.

Remove support for these operations from DSA, and stop marking DSA as
a legacy driver by default.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-07-18 09:47:08 +02:00
Vladimir Oltean
a372d66af4 net: dsa: sja1105: always enable the send_meta options
incl_srcpt has the limitation, mentioned in commit b4638af888 ("net:
dsa: sja1105: always enable the INCL_SRCPT option"), that frames with a
MAC DA of 01:80:c2:xx:yy:zz will be received as 01:80:c2:00:00:zz unless
PTP RX timestamping is enabled.

The incl_srcpt option was initially unconditionally enabled, then that
changed with commit 42824463d3 ("net: dsa: sja1105: Limit use of
incl_srcpt to bridge+vlan mode"), then again with b4638af888 ("net:
dsa: sja1105: always enable the INCL_SRCPT option"). Bottom line is that
it now needs to be always enabled, otherwise the driver does not have a
reliable source of information regarding source_port and switch_id for
link-local traffic (tag_8021q VLANs may be imprecise since now they
identify an entire bridging domain when ports are not standalone).

If we accept that PTP RX timestamping (and therefore, meta frame
generation) is always enabled in hardware, then that limitation could be
avoided and packets with any MAC DA can be properly received, because
meta frames do contain the original bytes from the MAC DA of their
associated link-local packet.

This change enables meta frame generation unconditionally, which also
has the nice side effects of simplifying the switch control path
(a switch reset is no longer required on hwtstamping settings change)
and the tagger data path (it no longer needs to be informed whether to
expect meta frames or not - it always does).

Fixes: 227d07a07e ("net: dsa: sja1105: Add support for traffic through standalone ports")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-07-04 19:42:27 +01:00
Vladimir Oltean
1dcf6efd5f net: dsa: tag_sja1105: fix MAC DA patching from meta frames
The SJA1105 manual says that at offset 4 into the meta frame payload we
have "MAC destination byte 2" and at offset 5 we have "MAC destination
byte 1". These are counted from the LSB, so byte 1 is h_dest[ETH_HLEN-2]
aka h_dest[4] and byte 2 is h_dest[ETH_HLEN-3] aka h_dest[3].

The sja1105_meta_unpack() function decodes these the other way around,
so a frame with MAC DA 01:80:c2:11:22:33 is received by the network
stack as having 01:80:c2:22:11:33.

Fixes: e53e18a6fe ("net: dsa: sja1105: Receive and decode meta frames")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-07-04 19:42:27 +01:00
Vladimir Oltean
a398b9ea0c net: dsa: tag_sja1105: fix source port decoding in vlan_filtering=0 bridge mode
There was a regression introduced by the blamed commit, where pinging to
a VLAN-unaware bridge would fail with the repeated message "Couldn't
decode source port" coming from the tagging protocol driver.

When receiving packets with a bridge_vid as determined by
dsa_tag_8021q_bridge_join(), dsa_8021q_rcv() will decode:
- source_port = 0 (which isn't really valid, more like "don't know")
- switch_id = 0 (which isn't really valid, more like "don't know")
- vbid = value in range 1-7

Since the blamed patch has reversed the order of the checks, we are now
going to believe that source_port != -1 and switch_id != -1, so they're
valid, but they aren't.

The minimal solution to the problem is to only populate source_port and
switch_id with what dsa_8021q_rcv() came up with, if the vbid is zero,
i.e. the source port information is trustworthy.

Fixes: c1ae02d876 ("net: dsa: tag_sja1105: always prefer source port information from INCL_SRCPT")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-07-03 09:12:30 +01:00
Vladimir Oltean
c1ae02d876 net: dsa: tag_sja1105: always prefer source port information from INCL_SRCPT
Currently the sja1105 tagging protocol prefers using the source port
information from the VLAN header if that is available, falling back to
the INCL_SRCPT option if it isn't. The VLAN header is available for all
frames except for META frames initiated by the switch (containing RX
timestamps), and thus, the "if (is_link_local)" branch is practically
dead.

The tag_8021q source port identification has become more loose
("imprecise") and will report a plausible rather than exact bridge port,
when under a bridge (be it VLAN-aware or VLAN-unaware). But link-local
traffic always needs to know the precise source port. With incorrect
source port reporting, for example PTP traffic over 2 bridged ports will
all be seen on sockets opened on the first such port, which is incorrect.

Now that the tagging protocol has been changed to make link-local frames
always contain source port information, we can reverse the order of the
checks so that we always give precedence to that information (which is
always precise) in lieu of the tag_8021q VID which is only precise for a
standalone port.

Fixes: d7f9787a76 ("net: dsa: tag_8021q: add support for imprecise RX based on the VBID")
Fixes: 91495f21fc ("net: dsa: tag_8021q: replace the SVL bridging with VLAN-unaware IVL bridging")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-29 14:41:26 +02:00
Jakub Kicinski
3674fbf045 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Merge in late fixes to prepare for the 6.5 net-next PR.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-27 09:45:22 -07:00
Vladimir Oltean
d06f925f13 net: dsa: avoid suspicious RCU usage for synced VLAN-aware MAC addresses
When using the felix driver (the only one which supports UC filtering
and MC filtering) as a DSA master for a random other DSA switch, one can
see the following stack trace when the downstream switch ports join a
VLAN-aware bridge:

=============================
WARNING: suspicious RCU usage
-----------------------------
net/8021q/vlan_core.c:238 suspicious rcu_dereference_protected() usage!

stack backtrace:
Workqueue: dsa_ordered dsa_slave_switchdev_event_work
Call trace:
 lockdep_rcu_suspicious+0x170/0x210
 vlan_for_each+0x8c/0x188
 dsa_slave_sync_uc+0x128/0x178
 __hw_addr_sync_dev+0x138/0x158
 dsa_slave_set_rx_mode+0x58/0x70
 __dev_set_rx_mode+0x88/0xa8
 dev_uc_add+0x74/0xa0
 dsa_port_bridge_host_fdb_add+0xec/0x180
 dsa_slave_switchdev_event_work+0x7c/0x1c8
 process_one_work+0x290/0x568

What it's saying is that vlan_for_each() expects rtnl_lock() context and
it's not getting it, when it's called from the DSA master's ndo_set_rx_mode().

The caller of that - dsa_slave_set_rx_mode() - is the slave DSA
interface's dsa_port_bridge_host_fdb_add() which comes from the deferred
dsa_slave_switchdev_event_work().

We went to great lengths to avoid the rtnl_lock() context in that call
path in commit 0faf890fc5 ("net: dsa: drop rtnl_lock from
dsa_slave_switchdev_event_work"), and calling rtnl_lock() is simply not
an option due to the possibility of deadlocking when calling
dsa_flush_workqueue() from the call paths that do hold rtnl_lock() -
basically all of them.

So, when the DSA master calls vlan_for_each() from its ndo_set_rx_mode(),
the state of the 8021q driver on this device is really not protected
from concurrent access by anything.

Looking at net/8021q/, I don't think that vlan_info->vid_list was
particularly designed with RCU traversal in mind, so introducing an RCU
read-side form of vlan_for_each() - vlan_for_each_rcu() - won't be so
easy, and it also wouldn't be exactly what we need anyway.

In general I believe that the solution isn't in net/8021q/ anyway;
vlan_for_each() is not cut out for this task. DSA doesn't need rtnl_lock()
to be held per se - since it's not a netdev state change that we're
blocking, but rather, just concurrent additions/removals to a VLAN list.
We don't even need sleepable context - the callback of vlan_for_each()
just schedules deferred work.

The proposed escape is to remove the dependency on vlan_for_each() and
to open-code a non-sleepable, rtnl-free alternative to that, based on
copies of the VLAN list modified from .ndo_vlan_rx_add_vid() and
.ndo_vlan_rx_kill_vid().

Fixes: 64fdc5f341 ("net: dsa: sync unicast and multicast addresses for VLAN filters too")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20230626154402.3154454-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-27 09:37:41 -07:00
Jakub Kicinski
a7384f3918 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

tools/testing/selftests/net/fcnal-test.sh
  d7a2fc1437 ("selftests: net: fcnal-test: check if FIPS mode is enabled")
  dd017c72dd ("selftests: fcnal: Test SO_DONTROUTE on TCP sockets.")
https://lore.kernel.org/all/5007b52c-dd16-dbf6-8d64-b9701bfa498b@tessares.net/
https://lore.kernel.org/all/20230619105427.4a0df9b3@canb.auug.org.au/

No adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-22 18:40:38 -07:00
Vladimir Oltean
b79d7c14f4 net: dsa: introduce preferred_default_local_cpu_port and use on MT7530
Since the introduction of the OF bindings, DSA has always had a policy that
in case multiple CPU ports are present in the device tree, the numerically
smallest one is always chosen.

The MT7530 switch family, except the switch on the MT7988 SoC, has 2 CPU
ports, 5 and 6, where port 6 is preferable on the MT7531BE switch because
it has higher bandwidth.

The MT7530 driver developers had 3 options:
- to modify DSA when the MT7531 switch support was introduced, such as to
  prefer the better port
- to declare both CPU ports in device trees as CPU ports, and live with the
  sub-optimal performance resulting from not preferring the better port
- to declare just port 6 in the device tree as a CPU port

Of course they chose the path of least resistance (3rd option), kicking the
can down the road. The hardware description in the device tree is supposed
to be stable - developers are not supposed to adopt the strategy of
piecemeal hardware description, where the device tree is updated in
lockstep with the features that the kernel currently supports.

Now, as a result of the fact that they did that, any attempts to modify the
device tree and describe both CPU ports as CPU ports would make DSA change
its default selection from port 6 to 5, effectively resulting in a
performance degradation visible to users with the MT7531BE switch as can be
seen below.

Without preferring port 6:

[ ID][Role] Interval           Transfer     Bitrate         Retr
[  5][TX-C]   0.00-20.00  sec   374 MBytes   157 Mbits/sec  734    sender
[  5][TX-C]   0.00-20.00  sec   373 MBytes   156 Mbits/sec    receiver
[  7][RX-C]   0.00-20.00  sec  1.81 GBytes   778 Mbits/sec    0    sender
[  7][RX-C]   0.00-20.00  sec  1.81 GBytes   777 Mbits/sec    receiver

With preferring port 6:

[ ID][Role] Interval           Transfer     Bitrate         Retr
[  5][TX-C]   0.00-20.00  sec  1.99 GBytes   856 Mbits/sec  273    sender
[  5][TX-C]   0.00-20.00  sec  1.99 GBytes   855 Mbits/sec    receiver
[  7][RX-C]   0.00-20.00  sec  1.72 GBytes   737 Mbits/sec   15    sender
[  7][RX-C]   0.00-20.00  sec  1.71 GBytes   736 Mbits/sec    receiver

Using one port for WAN and the other ports for LAN is a very popular use
case which is what this test emulates.

As such, this change proposes that we retroactively modify stable kernels
(which don't support the modification of the CPU port assignments, so as to
let user space fix the problem and restore the throughput) to keep the
mt7530 driver preferring port 6 even with device trees where the hardware
is more fully described.

Fixes: c288575f78 ("net: dsa: mt7530: Add the support of MT7531 switch")
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-20 09:40:26 +01:00
Russell King (Oracle)
dd805cf3e8 net: dsa: add support for mac_prepare() and mac_finish() calls
Add DSA support for the phylink mac_prepare() and mac_finish() calls.
These were introduced as part of the PCS support to allow MACs to
perform preparatory steps prior to configuration, and finalisation
steps after the MAC and PCS has been configured.

Introducing phylink_pcs support to the mv88e6xxx DSA driver needs some
code moved out of its mac_config() stage into the mac_prepare() and
mac_finish() stages, and this commit facilitates such code in DSA
drivers.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-26 10:39:40 +01:00
Vladimir Oltean
0bcf2e4aca net: dsa: tag_ocelot: call only the relevant portion of __skb_vlan_pop() on TX
ocelot_xmit_get_vlan_info() calls __skb_vlan_pop() as the most
appropriate helper I could find which strips away a VLAN header.
That's all I need it to do, but __skb_vlan_pop() has more logic, which
will become incompatible with the future revert of commit 6d1ccff627
("net: reset mac header in dev_start_xmit()").

Namely, it performs a sanity check on skb_mac_header(), which will stop
being set after the above revert, so it will return an error instead of
removing the VLAN tag.

ocelot_xmit_get_vlan_info() gets called in 2 circumstances:

(1) the port is under a VLAN-aware bridge and the bridge sends
    VLAN-tagged packets

(2) the port is under a VLAN-aware bridge and somebody else (an 8021q
    upper) sends VLAN-tagged packets (using a VID that isn't in the
    bridge vlan tables)

In case (1), there is actually no bug to defend against, because
br_dev_xmit() calls skb_reset_mac_header() and things continue to work.

However, in case (2), illustrated using the commands below, it can be
seen that our intervention is needed, since __skb_vlan_pop() complains:

$ ip link add br0 type bridge vlan_filtering 1 && ip link set br0 up
$ ip link set $eth master br0 && ip link set $eth up
$ ip link add link $eth name $eth.100 type vlan id 100 && ip link set $eth.100 up
$ ip addr add 192.168.100.1/24 dev $eth.100

I could fend off the checks in __skb_vlan_pop() with some
skb_mac_header_was_set() calls, but seeing how few callers of
__skb_vlan_pop() there are from TX paths, that seems rather
unproductive.

As an alternative solution, extract the bare minimum logic to strip a
VLAN header, and move it to a new helper named vlan_remove_tag(), close
to the definition of vlan_insert_tag(). Document it appropriately and
make ocelot_xmit_get_vlan_info() call this smaller helper instead.

Seeing that it doesn't appear illegal to test skb->protocol in the TX
path, I guess it would be a good for vlan_remove_tag() to also absorb
the vlan_set_encap_proto() function call.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-23 14:16:45 +01:00
Vladimir Oltean
f0a9d56306 net: dsa: update TX path comments to not mention skb_mac_header()
Once commit 6d1ccff627 ("net: reset mac header in dev_start_xmit()")
will be reverted, it will no longer be true that skb->data points at
skb_mac_header(skb) - since the skb->mac_header will not be set - so
stop saying that, and just say that it points to the MAC header.

I've reviewed vlan_insert_tag() and it does not *actually* depend on
skb_mac_header(), so reword that to avoid the confusion.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-23 14:16:45 +01:00