Commit Graph

2446 Commits

Author SHA1 Message Date
Vladimir Oltean
b6ad86e6ad net: dsa: sja1105: add bridge TX data plane offload based on tag_8021q
The main desire for having this feature in sja1105 is to support network
stack termination for traffic coming from a VLAN-aware bridge.

For sja1105, offloading the bridge data plane means sending packets
as-is, with the proper VLAN tag, to the chip. The chip will look up its
FDB and forward them to the correct destination port.

But we support bridge data plane offload even for VLAN-unaware bridges,
and the implementation there is different. In fact, VLAN-unaware
bridging is governed by tag_8021q, so it makes sense to have the
.bridge_fwd_offload_add() implementation fully within tag_8021q.
The key difference is that we only support 1 VLAN-aware bridge, but we
support multiple VLAN-unaware bridges. So we need to make sure that the
forwarding domain is not crossed by packets injected from the stack.

For this, we introduce the concept of a tag_8021q TX VLAN for bridge
forwarding offload. As opposed to the regular TX VLANs which contain
only 2 ports (the user port and the CPU port), a bridge data plane TX
VLAN is "multicast" (or "imprecise"): it contains all the ports that are
part of a certain bridge, and the hardware will select where the packet
goes within this "imprecise" forwarding domain.

Each VLAN-unaware bridge has its own "imprecise" TX VLAN, so we make use
of the unique "bridge_num" provided by DSA for the data plane offload.
We use the same 3 bits from the tag_8021q VLAN ID format to encode this
bridge number.

Note that these 3 bit positions have been used before for sub-VLANs in
best-effort VLAN filtering mode. The difference is that for best-effort,
the sub-VLANs were only valid on RX (and it was documented that the
sub-VLAN field needed to be transmitted as zero). Whereas for the bridge
data plane offload, these 3 bits are only valid on TX.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-26 22:35:22 +01:00
Vladimir Oltean
884be12f85 net: dsa: sja1105: add support for imprecise RX
This is already common knowledge by now, but the sja1105 does not have
hardware support for DSA tagging for data plane packets, and tag_8021q
sets up a unique pvid per port, transmitted as VLAN-tagged towards the
CPU, for the source port to be decoded nonetheless.

When the port is part of a VLAN-aware bridge, the pvid committed to
hardware is taken from the bridge and not from tag_8021q, so we need to
work with that the best we can.

Configure the switches to send all packets to the CPU as VLAN-tagged
(even ones that were originally untagged on the wire) and make use of
dsa_untag_bridge_pvid() to get rid of it before we send those packets up
the network stack.

With the classified VLAN used by hardware known to the tagger, we first
peek at the VID in an attempt to figure out if the packet was received
from a VLAN-unaware port (standalone or under a VLAN-unaware bridge),
case in which we can continue to call dsa_8021q_rcv(). If that is not
the case, the packet probably came from a VLAN-aware bridge. So we call
the DSA helper that finds for us a "designated bridge port" - one that
is a member of the VLAN ID from the packet, and is in the proper STP
state - basically these are all checks performed by br_handle_frame() in
the software RX data path.

The bridge will accept the packet as valid even if the source port was
maybe wrong. So it will maybe learn the MAC SA of the packet on the
wrong port, and its software FDB will be out of sync with the hardware
FDB. So replies towards this same MAC DA will not work, because the
bridge will send towards a different netdev.

This is where the bridge data plane offload ("imprecise TX") added by
the next patch comes in handy. The software FDB is wrong, true, but the
hardware FDB isn't, and by offloading the bridge forwarding plane we
have a chance to right a wrong, and have the hardware look up the FDB
for us for the reply packet. So it all cancels out.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-26 22:35:22 +01:00
Vladimir Oltean
19fa937a39 net: dsa: sja1105: deny more than one VLAN-aware bridge
With tag_sja1105.c's only ability being to perform an imprecise RX
procedure and identify whether a packet comes from a VLAN-aware bridge
or not, we have no way to determine whether a packet with VLAN ID 5
comes from, say, br0 or br1. Actually we could, but it would mean that
we need to restrict all VLANs from br0 to be different from all VLANs
from br1, and this includes the default_pvid, which makes a setup with 2
VLAN-aware bridges highly imprectical.

The fact of the matter is that this isn't even that big of a practical
limitation, since even with a single VLAN-aware bridge we can pretty
much enforce forwarding isolation based on the VLAN port membership.

So in the end, tell the user that they need to model their setup using a
single VLAN-aware bridge.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-26 22:35:22 +01:00
Vladimir Oltean
4fbc08bd36 net: dsa: sja1105: deny 8021q uppers on ports
Now that best-effort VLAN filtering is gone and we are left with the
imprecise RX and imprecise TX based in VLAN-aware mode, where the tagger
just guesses the source port based on plausibility of the VLAN ID, 8021q
uppers installed on top of a standalone port, while other ports of that
switch are under a VLAN-aware bridge don't quite "just work".

In fact it could be possible to restrict the VLAN IDs used by the 8021q
uppers to not be shared with VLAN IDs used by that VLAN-aware bridge,
but then the tagger needs to be patched to search for 8021q uppers too,
not just for the "designated bridge port" which will be introduced in a
later patch.

I haven't given a possible implementation full thought, it seems maybe
possible but not worth the effort right now. The only certain thing is
that currently the tagger won't be able to figure out the source port
for these packets because they will come with the VLAN ID of the 8021q
upper and are no longer retagged to a tag_8021q sub-VLAN like the best
effort VLAN filtering code used to do. So just deny these for the
moment.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-26 22:35:22 +01:00
Vladimir Oltean
6dfd23d35e net: dsa: sja1105: delete vlan delta save/restore logic
With the best_effort_vlan_filtering mode now gone, the driver does not
have 3 operating modes anymore (VLAN-unaware, VLAN-aware and best effort),
but only 2.

The idea is that we will gain support for network stack I/O through a
VLAN-aware bridge, using the data plane offload framework (imprecise RX,
imprecise TX). So the VLAN-aware use case will be more functional.

But standalone ports that are part of the same switch when some other
ports are under a VLAN-aware bridge should work too. Termination on
those should work through the tag_8021q RX VLAN and TX VLAN.

This was not possible using the old logic, because:
- in VLAN-unaware mode, only the tag_8021q VLANs were committed to hw
- in VLAN-aware mode, only the bridge VLANs were committed to hw
- in best-effort VLAN mode, both the tag_8021q and bridge VLANs were
  committed to hw

The strategy for the new VLAN-aware mode is to allow the bridge and the
tag_8021q VLANs to coexist in the VLAN table at the same time.

[ yes, we need to make sure that the bridge cannot install a tag_8021q
  VLAN, but ]

This means that the save/restore logic introduced by commit ec5ae61076
("net: dsa: sja1105: save/restore VLANs using a delta commit method")
does not serve a purpose any longer. We can delete it and restore the
old code that simply adds a VLAN to the VLAN table and calls it a day.

Note that we keep the sja1105_commit_pvid() function from those days,
but adapt it slightly. Ports that are under a VLAN-aware bridge use the
bridge's pvid, ports that are standalone or under a VLAN-unaware bridge
use the tag_8021q pvid, for local termination or VLAN-unaware forwarding.

Now, when the vlan_filtering property is toggled for the bridge, the
pvid of the ports beneath it is the only thing that's changing, we no
longer delete some VLANs and restore others.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-26 22:35:22 +01:00
Colin Ian King
d63f8877c4 net: dsa: sja1105: remove redundant re-assignment of pointer table
The pointer table is being re-assigned with a value that is never
read. The assignment is redundant and can be removed.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-26 22:35:22 +01:00
Vladimir Oltean
c92c74131a net: dsa: mv88e6xxx: silently accept the deletion of VID 0 too
The blamed commit modified the driver to accept the addition of VID 0
without doing anything, but deleting that VID still fails:

[   32.080780] mv88e6085 d0032004.mdio-mii:10 lan8: failed to kill vid 0081/0

Modify mv88e6xxx_port_vlan_leave() to do the same thing as the addition.

Fixes: b8b79c414e ("net: dsa: mv88e6xxx: Fix adding vlan 0")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-23 17:13:02 +01:00
Vladimir Oltean
ce5df6894a net: dsa: mv88e6xxx: map virtual bridges with forwarding offload in the PVT
The mv88e6xxx switches have the ability to receive FORWARD (data plane)
frames from the CPU port and route them according to the FDB. We can use
this to offload the forwarding process of packets sent by the software
bridge.

Because DSA supports bridge domain isolation between user ports, just
sending FORWARD frames is not enough, as they might leak the intended
broadcast domain of the bridge on behalf of which the packets are sent.

It should be noted that FORWARD frames are also (and typically) used to
forward data plane packets on DSA links in cross-chip topologies. The
FORWARD frame header contains the source port and switch ID, and
switches receiving this frame header forward the packet according to
their cross-chip port-based VLAN table (PVT).

To address the bridging domain isolation in the context of offloading
the forwarding on TX, the idea is that we can reuse the parts of the PVT
that don't have any physical switch mapped to them, one entry for each
software bridge. The switches will therefore think that behind their
upstream port lie many switches, all in fact backed up by software
bridges through tag_dsa.c, which constructs FORWARD packets with the
right switch ID corresponding to each bridge.

The mapping we use is absolutely trivial: DSA gives us a unique bridge
number, and we add the number of the physical switches in the DSA switch
tree to that, to obtain a unique virtual bridge device number to use in
the PVT.

Co-developed-by: Tobias Waldekranz <tobias@waldekranz.com>
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-23 16:32:37 +01:00
David S. Miller
5af84df962 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Conflicts are simple overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-23 16:13:06 +01:00
Vladimir Oltean
e40cba9490 net: dsa: sja1105: make VID 4095 a bridge VLAN too
This simple series of commands:

ip link add br0 type bridge vlan_filtering 1
ip link set swp0 master br0

fails on sja1105 with the following error:
[   33.439103] sja1105 spi0.1: vlan-lookup-table needs to have at least the default untagged VLAN
[   33.447710] sja1105 spi0.1: Invalid config, cannot upload
Warning: sja1105: Failed to change VLAN Ethertype.

For context, sja1105 has 3 operating modes:
- SJA1105_VLAN_UNAWARE: the dsa_8021q_vlans are committed to hardware
- SJA1105_VLAN_FILTERING_FULL: the bridge_vlans are committed to hardware
- SJA1105_VLAN_FILTERING_BEST_EFFORT: both the dsa_8021q_vlans and the
  bridge_vlans are committed to hardware

Swapping out a VLAN list and another in happens in
sja1105_build_vlan_table(), which performs a delta update procedure.
That function is called from a few places, notably from
sja1105_vlan_filtering() which is called from the
SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING handler.

The above set of 2 commands fails when run on a kernel pre-commit
8841f6e63f ("net: dsa: sja1105: make devlink property
best_effort_vlan_filtering true by default"). So the priv->vlan_state
transition that takes place is between VLAN-unaware and full VLAN
filtering. So the dsa_8021q_vlans are swapped out and the bridge_vlans
are swapped in.

So why does it fail?

Well, the bridge driver, through nbp_vlan_init(), first sets up the
SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING attribute, and only then
proceeds to call nbp_vlan_add for the default_pvid.

So when we swap out the dsa_8021q_vlans and swap in the bridge_vlans in
the SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING handler, there are no bridge
VLANs (yet). So we have wiped the VLAN table clean, and the low-level
static config checker complains of an invalid configuration. We _will_
add the bridge VLANs using the dynamic config interface, albeit later,
when nbp_vlan_add() calls us. So it is natural that it fails.

So why did it ever work?

Surprisingly, it looks like I only tested this configuration with 2
things set up in a particular way:
- a network manager that brings all ports up
- a kernel with CONFIG_VLAN_8021Q=y

It is widely known that commit ad1afb0039 ("vlan_dev: VLAN 0 should be
treated as "no vlan tag" (802.1p packet)") installs VID 0 to every net
device that comes up. DSA treats these VLANs as bridge VLANs, and
therefore, in my testing, the list of bridge_vlans was never empty.

However, if CONFIG_VLAN_8021Q is not enabled, or the port is not up when
it joins a VLAN-aware bridge, the bridge_vlans list will be temporarily
empty, and the sja1105_static_config_reload() call from
sja1105_vlan_filtering() will fail.

To fix this, the simplest thing is to keep VID 4095, the one used for
CPU-injected control packets since commit ed040abca4 ("net: dsa:
sja1105: use 4095 as the private VLAN for untagged traffic"), in the
list of bridge VLANs too, not just the list of tag_8021q VLANs. This
ensures that the list of bridge VLANs will never be empty.

Fixes: ec5ae61076 ("net: dsa: sja1105: save/restore VLANs using a delta commit method")
Reported-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-21 22:53:43 -07:00
Eric Woudstra
7e77702178 mt7530 mt7530_fdb_write only set ivl bit vid larger than 1
Fixes my earlier patch which broke vlan unaware bridges.

The IVL bit now only gets set for vid's larger than 1.

Fixes: 11d8d98cbe ("mt7530 fix mt7530_fdb_write vid missing ivl bit")
Signed-off-by: Eric Woudstra <ericwouds@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-20 07:01:14 -07:00
Vladimir Oltean
c64b9c0504 net: dsa: tag_8021q: add proper cross-chip notifier support
The big problem which mandates cross-chip notifiers for tag_8021q is
this:

                                             |
    sw0p0     sw0p1     sw0p2     sw0p3     sw0p4
 [  user ] [  user ] [  user ] [  dsa  ] [  cpu  ]
                                   |
                                   +---------+
                                             |
    sw1p0     sw1p1     sw1p2     sw1p3     sw1p4
 [  user ] [  user ] [  user ] [  dsa  ] [  dsa  ]
                                   |
                                   +---------+
                                             |
    sw2p0     sw2p1     sw2p2     sw2p3     sw2p4
 [  user ] [  user ] [  user ] [  dsa  ] [  dsa  ]

When the user runs:

ip link add br0 type bridge
ip link set sw0p0 master br0
ip link set sw2p0 master br0

It doesn't work.

This is because dsa_8021q_crosschip_bridge_join() assumes that "ds" and
"other_ds" are at most 1 hop away from each other, so it is sufficient
to add the RX VLAN of {ds, port} into {other_ds, other_port} and vice
versa and presto, the cross-chip link works. When there is another
switch in the middle, such as in this case switch 1 with its DSA links
sw1p3 and sw1p4, somebody needs to tell it about these VLANs too.

Which is exactly why the problem is quadratic: when a port joins a
bridge, for each port in the tree that's already in that same bridge we
notify a tag_8021q VLAN addition of that port's RX VLAN to the entire
tree. It is a very complicated web of VLANs.

It must be mentioned that currently we install tag_8021q VLANs on too
many ports (DSA links - to be precise, on all of them). For example,
when sw2p0 joins br0, and assuming sw1p0 was part of br0 too, we add the
RX VLAN of sw2p0 on the DSA links of switch 0 too, even though there
isn't any port of switch 0 that is a member of br0 (at least yet).
In theory we could notify only the switches which sit in between the
port joining the bridge and the port reacting to that bridge_join event.
But in practice that is impossible, because of the way 'link' properties
are described in the device tree. The DSA bindings require DT writers to
list out not only the real/physical DSA links, but in fact the entire
routing table, like for example switch 0 above will have:

	sw0p3: port@3 {
		link = <&sw1p4 &sw2p4>;
	};

This was done because:

/* TODO: ideally DSA ports would have a single dp->link_dp member,
 * and no dst->rtable nor this struct dsa_link would be needed,
 * but this would require some more complex tree walking,
 * so keep it stupid at the moment and list them all.
 */

but it is a perfect example of a situation where too much information is
actively detrimential, because we are now in the position where we
cannot distinguish a real DSA link from one that is put there to avoid
the 'complex tree walking'. And because DT is ABI, there is not much we
can change.

And because we do not know which DSA links are real and which ones
aren't, we can't really know if DSA switch A is in the data path between
switches B and C, in the general case.

So this is why tag_8021q RX VLANs are added on all DSA links, and
probably why it will never change.

On the other hand, at least the number of additions/deletions is well
balanced, and this means that once we implement reference counting at
the cross-chip notifier level a la fdb/mdb, there is absolutely zero
need for a struct dsa_8021q_crosschip_link, it's all self-managing.

In fact, with the tag_8021q notifiers emitted from the bridge join
notifiers, it becomes so generic that sja1105 does not need to do
anything anymore, we can just delete its implementation of the
.crosschip_bridge_{join,leave} methods.

Among other things we can simply delete is the home-grown implementation
of sja1105_notify_crosschip_switches(). The reason why that is wrong is
because it is not quadratic - it only covers remote switches to which we
have a cross-chip bridging link and that does not cover in-between
switches. This deletion is part of the same patch because sja1105 used
to poke deep inside the guts of the tag_8021q context in order to do
that. Because the cross-chip links went away, so needs the sja1105 code.

Last but not least, dsa_8021q_setup_port() is simplified (and also
renamed). Because our TAG_8021Q_VLAN_ADD notifier is designed to react
on the CPU port too, the four dsa_8021q_vid_apply() calls:
- 1 for RX VLAN on user port
- 1 for the user port's RX VLAN on the CPU port
- 1 for TX VLAN on user port
- 1 for the user port's TX VLAN on the CPU port

now get squashed into only 2 notifier calls via
dsa_port_tag_8021q_vlan_add.

And because the notifiers to add and to delete a tag_8021q VLAN are
distinct, now we finally break up the port setup and teardown into
separate functions instead of relying on a "bool enabled" flag which
tells us what to do. Arguably it should have been this way from the
get go.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-20 06:36:42 -07:00
Vladimir Oltean
328621f613 net: dsa: tag_8021q: absorb dsa_8021q_setup into dsa_tag_8021q_{,un}register
Right now, setting up tag_8021q is a 2-step operation for a driver,
first the context structure needs to be created, then the VLANs need to
be installed on the ports. A similar thing is true for teardown.

Merge the 2 steps into the register/unregister methods, to be as
transparent as possible for the driver as to what tag_8021q does behind
the scenes. This also gets rid of the funny "bool setup == true means
setup, == false means teardown" API that tag_8021q used to expose.

Note that dsa_tag_8021q_register() must be called at least in the
.setup() driver method and never earlier (like in the driver probe
function). This is because the DSA switch tree is not initialized at
probe time, and the cross-chip notifiers will not work.

For symmetry with .setup(), the unregister method should be put in
.teardown().

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-20 06:36:42 -07:00
Vladimir Oltean
5da11eb407 net: dsa: make tag_8021q operations part of the core
Make tag_8021q a more central element of DSA and move the 2 driver
specific operations outside of struct dsa_8021q_context (which is
supposed to hold dynamic data and not really constant function
pointers).

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-20 06:36:42 -07:00
Vladimir Oltean
d7b1fd520d net: dsa: let the core manage the tag_8021q context
The basic problem description is as follows:

Be there 3 switches in a daisy chain topology:

                                             |
    sw0p0     sw0p1     sw0p2     sw0p3     sw0p4
 [  user ] [  user ] [  user ] [  dsa  ] [  cpu  ]
                                   |
                                   +---------+
                                             |
    sw1p0     sw1p1     sw1p2     sw1p3     sw1p4
 [  user ] [  user ] [  user ] [  dsa  ] [  dsa  ]
                                   |
                                   +---------+
                                             |
    sw2p0     sw2p1     sw2p2     sw2p3     sw2p4
 [  user ] [  user ] [  user ] [  user ] [  dsa  ]

The CPU will not be able to ping through the user ports of the
bottom-most switch (like for example sw2p0), simply because tag_8021q
was not coded up for this scenario - it has always assumed DSA switch
trees with a single switch.

To add support for the topology above, we must admit that the RX VLAN of
sw2p0 must be added on some ports of switches 0 and 1 as well. This is
in fact a textbook example of thing that can use the cross-chip notifier
framework that DSA has set up in switch.c.

There is only one problem: core DSA (switch.c) is not able right now to
make the connection between a struct dsa_switch *ds and a struct
dsa_8021q_context *ctx. Right now, it is drivers who call into
tag_8021q.c and always provide a struct dsa_8021q_context *ctx pointer,
and tag_8021q.c calls them back with the .tag_8021q_vlan_{add,del}
methods.

But with cross-chip notifiers, it is possible for tag_8021q to call
drivers without drivers having ever asked for anything. A good example
is right above: when sw2p0 wants to set itself up for tag_8021q,
the .tag_8021q_vlan_add method needs to be called for switches 1 and 0,
so that they transport sw2p0's VLANs towards the CPU without dropping
them.

So instead of letting drivers manage the tag_8021q context, add a
tag_8021q_ctx pointer inside of struct dsa_switch, which will be
populated when dsa_tag_8021q_register() returns success.

The patch is fairly long-winded because we are partly reverting commit
5899ee367a ("net: dsa: tag_8021q: add a context structure") which made
the driver-facing tag_8021q API use "ctx" instead of "ds". Now that we
can access "ctx" directly from "ds", this is no longer needed.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-20 06:36:42 -07:00
Vladimir Oltean
cedf467064 net: dsa: tag_8021q: create dsa_tag_8021q_{register,unregister} helpers
In preparation of moving tag_8021q to core DSA, move all initialization
and teardown related to tag_8021q which is currently done by drivers in
2 functions called "register" and "unregister". These will gather more
functionality in future patches, which will better justify the chosen
naming scheme.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-20 06:36:42 -07:00
Vladimir Oltean
0fac6aa098 net: dsa: sja1105: delete the best_effort_vlan_filtering mode
Simply put, the best-effort VLAN filtering mode relied on VLAN retagging
from a bridge VLAN towards a tag_8021q sub-VLAN in order to be able to
decode the source port in the tagger, but the VLAN retagging
implementation inside the sja1105 chips is not the best and we were
relying on marginal operating conditions.

The most notable limitation of the best-effort VLAN filtering mode is
its incapacity to treat this case properly:

ip link add br0 type bridge vlan_filtering 1
ip link set swp2 master br0
ip link set swp4 master br0
bridge vlan del dev swp4 vid 1
bridge vlan add dev swp4 vid 1 pvid

When sending an untagged packet through swp2, the expectation is for it
to be forwarded to swp4 as egress-tagged (so it will contain VLAN ID 1
on egress). But the switch will send it as egress-untagged.

There was an attempt to fix this here:
https://patchwork.kernel.org/project/netdevbpf/patch/20210407201452.1703261-2-olteanv@gmail.com/

but it failed miserably because it broke PTP RX timestamping, in a way
that cannot be corrected due to hardware issues related to VLAN
retagging.

So with either PTP broken or pushing VLAN headers on egress for untagged
packets being broken, the sad reality is that the best-effort VLAN
filtering code is broken. Delete it.

Note that this means there will be a temporary loss of functionality in
this driver until it is replaced with something better (network stack
RX/TX capability for "mode 2" as described in
Documentation/networking/dsa/sja1105.rst, the "port under VLAN-aware
bridge" case). We simply cannot keep this code until that driver rework
is done, it is super bloated and tangled with tag_8021q.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-20 06:36:42 -07:00
Eric Woudstra
11d8d98cbe mt7530 fix mt7530_fdb_write vid missing ivl bit
According to reference guides mt7530 (mt7620) and mt7531:

NOTE: When IVL is reset, MAC[47:0] and FID[2:0] will be used to
read/write the address table. When IVL is set, MAC[47:0] and CVID[11:0]
will be used to read/write the address table.

Since the function only fills in CVID and no FID, we need to set the
IVL bit. The existing code does not set it.

This is a fix for the issue I dropped here earlier:

http://lists.infradead.org/pipermail/linux-mediatek/2021-June/025697.html

With this patch, it is now possible to delete the 'self' fdb entry
manually. However, wifi roaming still has the same issue, the entry
does not get deleted automatically. Wifi roaming also needs a fix
somewhere else to function correctly in combination with vlan.

Signed-off-by: Eric Woudstra <ericwouds@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-16 13:24:33 -07:00
Geert Uytterhoeven
99bb2ebab9 net: dsa: mv88e6xxx: NET_DSA_MV88E6XXX_PTP should depend on NET_DSA_MV88E6XXX
Making global2 support mandatory removed the Kconfig symbol
NET_DSA_MV88E6XXX_GLOBAL2.  This symbol also served as an intermediate
symbol to make NET_DSA_MV88E6XXX_PTP depend on NET_DSA_MV88E6XXX.  With
the symbol removed, the user is always asked about PTP support for
Marvell 88E6xxx switches, even if the latter support is not enabled.

Fix this by reinstating the dependency.

Fixes: 63368a7416 ("net: dsa: mv88e6xxx: Make global2 support mandatory")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-15 10:04:43 -07:00
Vladimir Oltean
b0b33b048d net: dsa: sja1105: fix address learning getting disabled on the CPU port
In May 2019 when commit 640f763f98 ("net: dsa: sja1105: Add support
for Spanning Tree Protocol") was introduced, the comment that "STP does
not get called for the CPU port" was true. This changed after commit
0394a63acf ("net: dsa: enable and disable all ports") in August 2019
and went largely unnoticed, because the sja1105_bridge_stp_state_set()
method did nothing different compared to the static setup done by
sja1105_init_mac_settings().

With the ability to turn address learning off introduced by the blamed
commit, there is a new priv->learn_ena port mask in the driver. When
sja1105_bridge_stp_state_set() gets called and we are in
BR_STATE_LEARNING or later, address learning is enabled or not depending
on priv->learn_ena & BIT(port).

So what happens is that priv->learn_ena is not being set from anywhere
for the CPU port, and the static configuration done by
sja1105_init_mac_settings() is being overwritten.

To solve this, acknowledge that the static configuration of STP state is
no longer necessary because the STP state is being set by the DSA core
now, but what is necessary is to set priv->learn_ena for the CPU port.

Fixes: 4d94235495 ("net: dsa: sja1105: offload bridge port flags to device")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-13 09:32:41 -07:00
kernel test robot
84f7e0bb48 dsa: fix for_each_child.cocci warnings
For_each_available_child_of_node should have of_node_put() before
return around line 423.

Generated by: scripts/coccinelle/iterators/for_each_child.cocci

CC: Alexander Lobakin <alobakin@pm.me>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: kernel test robot <lkp@intel.com>
Signed-off-by: Julia Lawall <julia.lawall@inria.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-11 10:01:55 -07:00
Marek Behún
953b0dcbe2 net: dsa: mv88e6xxx: enable SerDes PCS register dump via ethtool -d on Topaz
Commit bf3504cea7 ("net: dsa: mv88e6xxx: Add 6390 family PCS
registers to ethtool -d") added support for dumping SerDes PCS registers
via ethtool -d for Peridot.

The same implementation is also valid for Topaz, but was not
enabled at the time.

Signed-off-by: Marek Behún <kabel@kernel.org>
Fixes: bf3504cea7 ("net: dsa: mv88e6xxx: Add 6390 family PCS registers to ethtool -d")
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-01 11:51:36 -07:00
Marek Behún
a03b98d683 net: dsa: mv88e6xxx: enable SerDes RX stats for Topaz
Commit 0df9528736 ("mv88e6xxx: Add serdes Rx statistics") added
support for RX statistics on SerDes ports for Peridot.

This same implementation is also valid for Topaz, but was not enabled
at the time.

We need to use the generic .serdes_get_lane() method instead of the
Peridot specific one in the stats methods so that on Topaz the proper
one is used.

Signed-off-by: Marek Behún <kabel@kernel.org>
Fixes: 0df9528736 ("mv88e6xxx: Add serdes Rx statistics")
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-01 11:51:36 -07:00
Marek Behún
c07fff3492 net: dsa: mv88e6xxx: enable devlink ATU hash param for Topaz
Commit 23e8b470c7 ("net: dsa: mv88e6xxx: Add devlink param for ATU
hash algorithm.") introduced ATU hash algorithm access via devlink, but
did not enable it for Topaz.

Enable this feature also for Topaz.

Signed-off-by: Marek Behún <kabel@kernel.org>
Fixes: 23e8b470c7 ("net: dsa: mv88e6xxx: Add devlink param for ATU hash algorithm.")
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-01 11:51:36 -07:00
Marek Behún
3709488790 net: dsa: mv88e6xxx: enable .rmu_disable() on Topaz
Commit 9e5baf9b36 ("net: dsa: mv88e6xxx: add RMU disable op")
introduced .rmu_disable() method with implementation for several models,
but forgot to add Topaz, which can use the Peridot implementation.

Use the Peridot implementation of .rmu_disable() on Topaz.

Signed-off-by: Marek Behún <kabel@kernel.org>
Fixes: 9e5baf9b36 ("net: dsa: mv88e6xxx: add RMU disable op")
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-01 11:51:36 -07:00
Marek Behún
11527f3c47 net: dsa: mv88e6xxx: use correct .stats_set_histogram() on Topaz
Commit 40cff8fca9 ("net: dsa: mv88e6xxx: Fix stats histogram mode")
introduced wrong .stats_set_histogram() method for Topaz family.

The Peridot method should be used instead.

Signed-off-by: Marek Behún <kabel@kernel.org>
Fixes: 40cff8fca9 ("net: dsa: mv88e6xxx: Fix stats histogram mode")
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-01 11:51:36 -07:00
Marek Behún
7da467d82d net: dsa: mv88e6xxx: enable .port_set_policy() on Topaz
Commit f3a2cd326e ("net: dsa: mv88e6xxx: introduce .port_set_policy")
introduced .port_set_policy() method with implementation for several
models, but forgot to add Topaz, which can use the 6352 implementation.

Use the 6352 implementation of .port_set_policy() on Topaz.

Signed-off-by: Marek Behún <kabel@kernel.org>
Fixes: f3a2cd326e ("net: dsa: mv88e6xxx: introduce .port_set_policy")
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-01 11:51:36 -07:00
Jakub Kicinski
b6df00789e Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Trivial conflict in net/netfilter/nf_tables_api.c.

Duplicate fix in tools/testing/selftests/net/devlink_port_split.py
- take the net-next version.

skmsg, and L4 bpf - keep the bpf code but remove the flags
and err params.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-06-29 15:45:27 -07:00
Vladimir Oltean
74e7feff0e net: dsa: sja1105: fix dynamic access to L2 Address Lookup table for SJA1110
The SJA1105P/Q/R/S and SJA1110 may have the same layout for the command
to read/write/search for L2 Address Lookup entries, but as explained in
the comments at the beginning of the sja1105_dynamic_config.c file, the
command portion of the buffer is at the end, and we need to obtain a
pointer to it by adding the length of the entry to the buffer.

Alas, the length of an L2 Address Lookup entry is larger in SJA1110 than
it is for SJA1105P/Q/R/S, so we need to create a common helper to access
the command buffer, and this receives as argument the length of the
entry buffer.

Fixes: 3e77e59bf8 ("net: dsa: sja1105: add support for the SJA1110 switch family")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28 15:49:05 -07:00
Vladimir Oltean
be7f62eeba net: dsa: sja1105: fix NULL pointer dereference in sja1105_reload_cbs()
priv->cbs is an array of priv->info->num_cbs_shapers elements of type
struct sja1105_cbs_entry which only get allocated if CONFIG_NET_SCH_CBS
is enabled.

However, sja1105_reload_cbs() is called from sja1105_static_config_reload()
which in turn is called for any of the items in sja1105_reset_reasons,
therefore during the normal runtime of the driver and not just from a
code path which can be triggered by the tc-cbs offload.

The sja1105_reload_cbs() function does not contain a check whether the
priv->cbs array is NULL or not, it just assumes it isn't and proceeds to
iterate through the credit-based shaper elements. This leads to a NULL
pointer dereference.

The solution is to return success if the priv->cbs array has not been
allocated, since sja1105_reload_cbs() has nothing to do.

Fixes: 4d7525085a ("net: dsa: sja1105: offload the Credit-Based Shaper qdisc")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24 15:46:51 -07:00
Vladimir Oltean
75e994709f net: dsa: sja1105: document the SJA1110 in the Kconfig
Mention support for the SJA1110 in menuconfig.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24 12:55:57 -07:00
Florian Fainelli
64a81b2448 net: dsa: b53: Create default VLAN entry explicitly
In case CONFIG_VLAN_8021Q is not set, there will be no call down to the
b53 driver to ensure that the default PVID VLAN entry will be configured
with the appropriate untagged attribute towards the CPU port. We were
implicitly relying on dsa_slave_vlan_rx_add_vid() to do that for us,
instead make it explicit.

Reported-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-22 10:19:39 -07:00
Eldar Gasanov
b8b79c414e net: dsa: mv88e6xxx: Fix adding vlan 0
8021q module adds vlan 0 to all interfaces when it starts.
When 8021q module is loaded it isn't possible to create bond
with mv88e6xxx interfaces, bonding module dipslay error
"Couldn't add bond vlan ids", because it tries to add vlan 0
to slave interfaces.

There is unexpected behavior in the switch. When a PVID
is assigned to a port the switch changes VID to PVID
in ingress frames with VID 0 on the port. Expected
that the switch doesn't assign PVID to tagged frames
with VID 0. But there isn't a way to change this behavior
in the switch.

Fixes: 57e661aae6 ("net: dsa: mv88e6xxx: Link aggregation support")
Signed-off-by: Eldar Gasanov <eldargasanov2@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-21 14:45:42 -07:00
Vladimir Oltean
61c77533b8 net: dsa: sja1105: completely error out in sja1105_static_config_reload if something fails
If reloading the static config fails for whatever reason, for example if
sja1105_static_config_check_valid() fails, then we "goto out_unlock_ptp"
but we print anyway that "Reset switch and programmed static config.",
which is confusing because we didn't. We also do a bunch of other stuff
like reprogram the XPCS and reload the credit-based shapers, as if a
switch reset took place, which didn't.

So just unlock the PTP lock and goto out, skipping all of that.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-18 12:26:17 -07:00
Vladimir Oltean
1303e7f9b6 net: dsa: sja1105: allow the TTEthernet configuration in the static config for SJA1110
Currently sja1105_static_config_check_valid() is coded up to detect
whether TTEthernet is supported based on device ID, and this check was
not updated to cover SJA1110.

However, it is desirable to have as few checks for the device ID as
possible, so the driver core is more generic. So what we can do is look
at the static config table operations implemented by that specific
switch family (populated by sja1105_static_config_init) whether the
schedule table has a non-zero maximum entry count (meaning that it is
supported) or not.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-18 12:26:17 -07:00
Vladimir Oltean
cb5a82d2b9 net: dsa: sja1105: properly power down the microcontroller clock for SJA1110
It turns out that powering down the BASE_TIMER_CLK does not turn off the
microcontroller, just its timers, including the one for the watchdog.
So the embedded microcontroller is still running, and potentially still
doing things.

To prevent unwanted interference, we should power down the BASE_MCSS_CLK
as well (MCSS = microcontroller subsystem).

The trouble is that currently we turn off the BASE_TIMER_CLK for SJA1110
from the .clocking_setup() method, mostly because this is a Clock
Generation Unit (CGU) setting which was traditionally configured in that
method for SJA1105. But in SJA1105, the CGU was used for bringing up the
port clocks at the proper speeds, and in SJA1110 it's not (but rather
for initial configuration), so it's best that we rebrand the
sja1110_clocking_setup() method into what it really is - an implementation
of the .disable_microcontroller() method.

Since disabling the microcontroller only needs to be done once, at probe
time, we can choose the best place to do that as being in sja1105_setup(),
before we upload the static config to the device. This guarantees that
the static config being used by the switch afterwards is really ours.

Note that the procedure to upload a static config necessarily resets the
switch. This already did not reset the microcontroller, only the switch
core, so since the .disable_microcontroller() method is guaranteed to be
called by that point, if it's disabled, it remains disabled. Add a
comment to make that clear.

With the code movement for SJA1110 from .clocking_setup() to
.disable_microcontroller(), both methods are optional and are guarded by
"if" conditions.

Tested by enabling in the device tree the rev-mii switch port 0 that
goes towards the microcontroller, and flashing a firmware that would
have networking. Without this patch, the microcontroller can be pinged,
with this patch it cannot.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-18 12:26:17 -07:00
George McCollister
a4fc566543 net: dsa: xrs700x: forward HSR supervision frames
Forward supervision frames between redunant HSR ports. This was broken
in the last commit.

Fixes: 1a42624aec ("net: dsa: xrs700x: allow HSR/PRP supervision dupes for node_table")
Signed-off-by: George McCollister <george.mccollister@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-16 12:17:03 -07:00
Colin Ian King
11b57faf95 net: dsa: b53: remove redundant null check on dev
The pointer dev can never be null, the null check is redundant
and can be removed. Cleans up a static analysis warning that
pointer priv is dereferencing dev before dev is being null
checked.

Addresses-Coverity: ("Dereference before null check")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-15 11:28:01 -07:00
Vladimir Oltean
3009e8aa85 net: dsa: sja1105: constify the sja1105_regs structures
The struct sja1105_regs tables are not modified during the runtime of
the driver, so they can be made constant. In fact, struct sja1105_info
already holds a const pointer to these.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-14 13:14:24 -07:00
Oleksij Rempel
49011e0c15 net: phy: micrel: ksz886x/ksz8081: add cabletest support
This patch support for cable test for the ksz886x switches and the
ksz8081 PHY.

The patch was tested on a KSZ8873RLL switch with following results:

- port 1:
  - provides invalid values, thus return -ENOTSUPP
    (Errata: DS80000830A: "LinkMD does not work on Port 1",
     http://ww1.microchip.com/downloads/en/DeviceDoc/KSZ8873-Errata-DS80000830A.pdf)

- port 2:
  - can detect distance
  - can detect open on each wire of pair A (wire 1 and 2)
  - can detect open only on one wire of pair B (only wire 3)
  - can detect short between wires of a pair (wires 1 + 2 or 3 + 6)
  - short between pairs is detected as open.
    For example short between wires 2 + 3 is detected as open.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-14 12:54:43 -07:00
Oleksij Rempel
36838050c4 net: dsa: microchip: ksz8795: add LINK_MD register support
Add mapping for LINK_MD register to enable cable testing functionality.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-14 12:54:43 -07:00
Oleksij Rempel
52939393bd net: phy/dsa micrel/ksz886x add MDI-X support
Add support for MDI-X status and configuration

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-14 12:54:43 -07:00
Michael Grzeschik
2c709e0bda net: dsa: microchip: ksz8795: add phylink support
This patch adds the phylink support to the ksz8795 driver to provide
configuration exceptions on quirky KSZ8863 and KSZ8873 ports.

Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-14 12:54:43 -07:00
Michael Grzeschik
ec4b94f9b3 net: phy: micrel: move phy reg offsets to common header
Some micrel devices share the same PHY register defines. This patch
moves them to one common header so other drivers can reuse them.
And reuse generic MII_* defines where possible.

Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-14 12:54:43 -07:00
Vladimir Oltean
56b6346633 net: dsa: sja1105: plug in support for 2500base-x
The MAC treats 2500base-x same as SGMII (yay for that) except that it
must be set to a different speed.

Extend all places that check for SGMII to also check for 2500base-x.

Also add the missing 2500base-x compatibility matrix entry for SJA1110D.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11 13:43:56 -07:00
Vladimir Oltean
ece578bc3e net: dsa: sja1105: SGMII and 2500base-x on the SJA1110 are 'special'
For the xMII Mode Parameters Table to be properly configured for SGMII
mode on SJA1110, we need to set the "special" bit, since SGMII is
officially bitwise coded as 0b0011 in SJA1105 (decimal 3, equal to
XMII_MODE_SGMII), and as 0b1011 in SJA1110 (decimal 11).

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11 13:43:56 -07:00
Vladimir Oltean
27871359bd net: dsa: sja1105: register the PCS MDIO bus for SJA1110
On the SJA1110, the PCS of each SERDES-capable port is accessed through
a different memory window which is 0x100 bytes in size, denoted by
"pcs_base".

In each PCS register access window, the XPCS MMDs are accessed in an
indirect way: in pages/banks of up to 0x100 addresses each. Changing the
page/bank is done by writing to a special register at the end of the
access window.

The MDIO register map accessed indirectly through the indirect banked
method described above is similar to what SJA1105 has: upper 5 bits are
the MMD, lower 16 bits are the MDIO address within that MMD.

Since the PHY ID reported by the XPCS inside SJA1110 is also all zeroes
(like SJA1105), we need to trap those reads and return a fake PHY ID so
that the xpcs driver can apply some specific fixups for our integration.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11 13:43:56 -07:00
Vladimir Oltean
3ad1d17154 net: dsa: sja1105: migrate to xpcs for SGMII
There is a desire to use the generic driver for the Synopsys XPCS
located in drivers/net/pcs, and to achieve that, the sja1105 driver must
expose an MDIO bus for the SGMII PCS, because the XPCS probes as an
mdio_device.

In preparation of the SJA1110 which in fact has a different access
procedure for the SJA1105, we register this PCS MDIO bus once in the
common code, but we implement function pointers for the read and write
methods. In this patch there is a single implementation for them.

There is exactly one MDIO bus for the PCS, this will contain all PCSes
at MDIO addresses equal to the port number.

We delete a bunch of hardware support code because the xpcs driver
already does what we need.

We need to hack up the MDIO reads for the PHY ID, since our XPCS
instantiation returns zeroes and there are some specific fixups which
need to be applied by the xpcs driver.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11 13:43:56 -07:00
Vladimir Oltean
566b18c8b7 net: dsa: sja1105: implement TX timestamping for SJA1110
The TX timestamping procedure for SJA1105 is a bit unconventional
because the transmit procedure itself is unconventional.

Control packets (and therefore PTP as well) are transmitted to a
specific port in SJA1105 using "management routes" which must be written
over SPI to the switch. These are one-shot rules that match by
destination MAC address on traffic coming from the CPU port, and select
the precise destination port for that packet. So to transmit a packet
from NET_TX softirq context, we actually need to defer to a process
context so that we can perform that SPI write before we send the packet.
The DSA master dev_queue_xmit() runs in process context, and we poll
until the switch confirms it took the TX timestamp, then we annotate the
skb clone with that TX timestamp. This is why the sja1105 driver does
not need an skb queue for TX timestamping.

But the SJA1110 is a bit (not much!) more conventional, and you can
request 2-step TX timestamping through the DSA header, as well as give
the switch a cookie (timestamp ID) which it will give back to you when
it has the timestamp. So now we do need a queue for keeping the skb
clones until their TX timestamps become available.

The interesting part is that the metadata frames from SJA1105 haven't
disappeared completely. On SJA1105 they were used as follow-ups which
contained RX timestamps, but on SJA1110 they are actually TX completion
packets, which contain a variable (up to 32) array of timestamps.
Why an array? Because:
- not only is the TX timestamp on the egress port being communicated,
  but also the RX timestamp on the CPU port. Nice, but we don't care
  about that, so we ignore it.
- because a packet could be multicast to multiple egress ports, each
  port takes its own timestamp, and the TX completion packet contains
  the individual timestamps on each port.

This is unconventional because switches typically have a timestamping
FIFO and raise an interrupt, but this one doesn't. So the tagger needs
to detect and parse meta frames, and call into the main switch driver,
which pairs the timestamps with the skbs in the TX timestamping queue
which are waiting for one.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11 12:45:38 -07:00
Vladimir Oltean
30b73242e6 net: dsa: sja1105: add the RX timestamping procedure for SJA1110
This is really easy, since the full RX timestamp is in the DSA trailer
and the tagger code transfers it to SJA1105_SKB_CB(skb)->tstamp, we just
need to move it to the skb shared info region. This is as opposed to
SJA1105, where the RX timestamp was received in a meta frame (so there
needed to be a state machine to pair the 2 packets) and the timestamp
was partial (so the packet, once matched with its timestamp, needed to
be added to an RX timestamping queue where the PTP aux worker would
reconstruct that timestamp).

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11 12:45:38 -07:00
Vladimir Oltean
4913b8ebf8 net: dsa: add support for the SJA1110 native tagging protocol
The SJA1110 has improved a few things compared to SJA1105:

- To send a control packet from the host port with SJA1105, one needed
  to program a one-shot "management route" over SPI. This is no longer
  true with SJA1110, you can actually send "in-band control extensions"
  in the packets sent by DSA, these are in fact DSA tags which contain
  the destination port and switch ID.

- When receiving a control packet from the switch with SJA1105, the
  source port and switch ID were written in bytes 3 and 4 of the
  destination MAC address of the frame (which was a very poor shot at a
  DSA header). If the control packet also had an RX timestamp, that
  timestamp was sent in an actual follow-up packet, so there were
  reordering concerns on multi-core/multi-queue DSA masters, where the
  metadata frame with the RX timestamp might get processed before the
  actual packet to which that timestamp belonged (there is no way to
  pair a packet to its timestamp other than the order in which they were
  received). On SJA1110, this is no longer true, control packets have
  the source port, switch ID and timestamp all in the DSA tags.

- Timestamps from the switch were partial: to get a 64-bit timestamp as
  required by PTP stacks, one would need to take the partial 24-bit or
  32-bit timestamp from the packet, then read the current PTP time very
  quickly, and then patch in the high bits of the current PTP time into
  the captured partial timestamp, to reconstruct what the full 64-bit
  timestamp must have been. That is awful because packet processing is
  done in NAPI context, but reading the current PTP time is done over
  SPI and therefore needs sleepable context.

But it also aggravated a few things:

- Not only is there a DSA header in SJA1110, but there is a DSA trailer
  in fact, too. So DSA needs to be extended to support taggers which
  have both a header and a trailer. Very unconventional - my understanding
  is that the trailer exists because the timestamps couldn't be prepared
  in time for putting them in the header area.

- Like SJA1105, not all packets sent to the CPU have the DSA tag added
  to them, only control packets do:

  * the ones which match the destination MAC filters/traps in
    MAC_FLTRES1 and MAC_FLTRES0
  * the ones which match FDB entries which have TRAP or TAKETS bits set

  So we could in theory hack something up to request the switch to take
  timestamps for all packets that reach the CPU, and those would be
  DSA-tagged and contain the source port / switch ID by virtue of the
  fact that there needs to be a timestamp trailer provided. BUT:

- The SJA1110 does not parse its own DSA tags in a way that is useful
  for routing in cross-chip topologies, a la Marvell. And the sja1105
  driver already supports cross-chip bridging from the SJA1105 days.
  It does that by automatically setting up the DSA links as VLAN trunks
  which contain all the necessary tag_8021q RX VLANs that must be
  communicated between the switches that span the same bridge. So when
  using tag_8021q on sja1105, it is possible to have 2 switches with
  ports sw0p0, sw0p1, sw1p0, sw1p1, and 2 VLAN-unaware bridges br0 and
  br1, and br0 can take sw0p0 and sw1p0, and br1 can take sw0p1 and
  sw1p1, and forwarding will happen according to the expected rules of
  the Linux bridge.
  We like that, and we don't want that to go away, so as a matter of
  fact, the SJA1110 tagger still needs to support tag_8021q.

So the sja1110 tagger is a hybrid between tag_8021q for data packets,
and the native hardware support for control packets.

On RX, packets have a 13-byte trailer if they contain an RX timestamp.
That trailer is padded in such a way that its byte 8 (the start of the
"residence time" field - not parsed by Linux because we don't care) is
aligned on a 16 byte boundary. So the padding has a variable length
between 0 and 15 bytes. The DSA header contains the offset of the
beginning of the padding relative to the beginning of the frame (and the
end of the padding is obviously the end of the packet minus 13 bytes,
the length of the trailer). So we discard it.

Packets which don't have a trailer contain the source port and switch ID
information in the header (they are "trap-to-host" packets). Packets
which have a trailer contain the source port and switch ID in the trailer.

On TX, the destination port mask and switch ID is always in the trailer,
so we always need to say in the header that a trailer is present.

The header needs a custom EtherType and this was chosen as 0xdadc, after
0xdada which is for Marvell and 0xdadb which is for VLANs in
VLAN-unaware mode on SJA1105 (and SJA1110 in fact too).

Because we use tag_8021q in concert with the native tagging protocol,
control packets will have 2 DSA tags.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11 12:45:38 -07:00
Vladimir Oltean
617ef8d937 net: dsa: sja1105: make SJA1105_SKB_CB fit a full timestamp
In SJA1105, RX timestamps for packets sent to the CPU are transmitted in
separate follow-up packets (metadata frames). These contain partial
timestamps (24 or 32 bits) which are kept in SJA1105_SKB_CB(skb)->meta_tstamp.

Thankfully, SJA1110 improved that, and the RX timestamps are now
transmitted in-band with the actual packet, in the timestamp trailer.
The RX timestamps are now full-width 64 bits.

Because we process the RX DSA tags in the rcv() method in the tagger,
but we would like to preserve the DSA code structure in that we populate
the skb timestamp in the port_rxtstamp() call which only happens later,
the implication is that we must somehow pass the 64-bit timestamp from
the rcv() method all the way to port_rxtstamp(). We can use the skb->cb
for that.

Rename the meta_tstamp from struct sja1105_skb_cb from "meta_tstamp" to
"tstamp", and increase its size to 64 bits.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11 12:45:38 -07:00
Vladimir Oltean
6c0de59b3d net: dsa: sja1105: allow RX timestamps to be taken on all ports for SJA1110
On SJA1105, there is support for a cascade port which is presumably
connected to a downstream SJA1105 switch. The upstream one does not take
PTP timestamps for packets received on this port, presumably because the
downstream switch already did (and for PTP, it only makes sense for the
leaf nodes in a DSA switch tree to do that).

I haven't been able to validate that feature in a fully assembled setup,
so I am disabling the feature by setting the cascade port to an unused
port value (ds->num_ports).

In SJA1110, multiple cascade ports are supported, and CASC_PORT became
a bit mask from a port number. So when CASC_PORT is set to ds->num_ports
(which is 11 on SJA1110), it is actually set to 0b1011, so ports 3, 1
and 0 are configured as cascade ports and we cannot take RX timestamps
on them.

So we need to introduce a check for SJA1110 and set things differently
(to zero there), so that the cascading feature is properly disabled and
RX timestamps can be taken on all ports.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11 12:45:38 -07:00
Vladimir Oltean
29305260d2 net: dsa: sja1105: enable the TTEthernet engine on SJA1110
As opposed to SJA1105 where there are parts with TTEthernet and parts
without, in SJA1110 all parts support it, but it must be enabled in the
static config. So enable it unconditionally. We use it for the tc-taprio
and tc-gate offload.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11 12:45:38 -07:00
Colin Ian King
ab324d8dfd net: dsa: sja1105: Fix assigned yet unused return code rc
The return code variable rc is being set to return error values in two
places in sja1105_mdiobus_base_tx_register and yet it is not being
returned, the function always returns 0 instead. Fix this by replacing
the return 0 with the return code rc.

Addresses-Coverity: ("Unused value")
Fixes: 5a8f09748e ("net: dsa: sja1105: register the MDIO buses for 100base-T1 and 100base-TX")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-09 15:46:30 -07:00
Dan Carpenter
3d0167f2a6 net: dsa: qca8k: check the correct variable in qca8k_set_mac_eee()
This code check "reg" but "ret" was intended so the error handling will
never trigger.

Fixes: 7c9896e378 ("net: dsa: qca8k: check return value of read functions correctly")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-09 14:10:38 -07:00
Dan Carpenter
aa3d020b22 net: dsa: qca8k: fix an endian bug in qca8k_get_ethtool_stats()
The "hi" variable is a u64 but the qca8k_read() writes to the top 32
bits of it.  That will work on little endian systems but it's a bit
subtle.  It's cleaner to make declare "hi" as a u32.  We will still need
to cast it when we shift it later on in the function but that's fine.

Fixes: 7c9896e378 ("net: dsa: qca8k: check return value of read functions correctly")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-09 14:10:38 -07:00
Florian Fainelli
2c32a3d3c2 net: dsa: b53: Do not force CPU to be always tagged
Commit ca89319483 ("net: dsa: b53: Keep CPU port as tagged in all
VLANs") forced the CPU port to be always tagged in any VLAN membership.
This was necessary back then because we did not support Broadcom tags
for all configurations so the only way to differentiate tagged and
untagged traffic while DSA_TAG_PROTO_NONE was used was to force the CPU
port into being always tagged.

With most configurations enabling Broadcom tags, especially after
8fab459e69 ("net: dsa: b53: Enable Broadcom tags for 531x5/539x
families") we do not need to apply this unconditional force tagging of
the CPU port in all VLANs.

A helper function is introduced to faciliate the encapsulation of the
specific condition requiring the CPU port to be tagged in all VLANs and
the dsa_switch_ops::untag_bridge_pvid boolean is moved to when
dsa_switch_ops::setup is called when we have already determined the
tagging protocol we will be using.

Reported-by: Matthew Hagan <mnhagan88@gmail.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Tested-by: Matthew Hagan <mnhagan88@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-09 13:49:20 -07:00
Vladimir Oltean
de274be32c net: dsa: felix: set TX flow control according to the phylink_mac_link_up resolution
Instead of relying on the static initialization done by ocelot_init_port()
which enables flow control unconditionally, set SYS_PAUSE_CFG_PAUSE_ENA
according to the parameters negotiated by the PHY.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 16:35:14 -07:00
Vladimir Oltean
5a8f09748e net: dsa: sja1105: register the MDIO buses for 100base-T1 and 100base-TX
The SJA1110 contains two types of integrated PHYs: one 100base-TX PHY
and multiple 100base-T1 PHYs.

The access procedure for the 100base-T1 PHYs is also different than it
is for the 100base-TX one. So we register 2 MDIO buses, one for the
base-TX and the other for the base-T1. Each bus has an OF node which is
a child of the "mdio" subnode of the switch, and they are recognized by
compatible string.

Cc: Russell King <linux@armlinux.org.uk>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: devicetree@vger.kernel.org
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 14:37:16 -07:00
Vladimir Oltean
ceec8bc098 net: dsa: sja1105: make sure the retagging port is enabled for SJA1110
The SJA1110 has an extra configuration in the General Parameters Table
through which the user can select the buffer reservation config.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 14:37:16 -07:00
Vladimir Oltean
3e77e59bf8 net: dsa: sja1105: add support for the SJA1110 switch family
The SJA1110 is basically an SJA1105 with more ports, some integrated
PHYs (100base-T1 and 100base-TX) and an embedded microcontroller which
can be disabled, and the switch core can be controlled by a host running
Linux, over SPI.

This patch contains:
- the static and dynamic config packing functions, for the tables that
  are common with SJA1105
- one more static config tables which is "unique" to the SJA1110
  (actually it is a rehash of stuff that was placed somewhere else in
  SJA1105): the PCP Remapping Table
- a reset and clock configuration procedure for the SJA1110 switch.
  This resets just the switch subsystem, and gates off the clock which
  powers on the embedded microcontroller.
- an RGMII delay configuration procedure for SJA1110, which is very
  similar to SJA1105, but different enough for us to be unable to reuse
  it (this is a pattern that repeats itself)
- some adaptations to dynamic config table entries which are no longer
  programmed in the same way. For example, to delete a VLAN, you used to
  write an entry through the dynamic reconfiguration interface with the
  desired VLAN ID, and with the VALIDENT bit set to false. Now, the VLAN
  table entries contain a TYPE_ENTRY field, which must be set to zero
  (in a backwards-incompatible way) in order for the entry to be deleted,
  or to some other entry for the VLAN to match "inner tagged" or "outer
  tagged" packets.
- a similar thing for the static config: the xMII Mode Parameters Table
  encoding for SGMII and MII (the latter just when attached to a
  100base-TX PHY) just isn't what it used to be in SJA1105. They are
  identical, except there is an extra "special" bit which needs to be
  set. Set it.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 14:37:16 -07:00
Yang Yingliang
f1fe19c2cb net: mscc: ocelot: check return value after calling platform_get_resource()
It will cause null-ptr-deref if platform_get_resource() returns NULL,
we need check the return value.

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-07 14:02:25 -07:00
Zou Wei
3f07ce8e52 net: dsa: hellcreek: Use is_zero_ether_addr() instead of memcmp()
Using is_zero_ether_addr() instead of directly use
memcmp() to determine if the ethernet address is all
zeros.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zou Wei <zou_wei@huawei.com>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-07 13:16:36 -07:00
Vladimir Oltean
5d645df99a net: dsa: sja1105: determine PHY/MAC role from PHY interface type
Now that both RevMII as well as RevRMII exist, we can deprecate the
sja1105,role-mac and sja1105,role-phy properties and simply let the user
select that a port operates in MII PHY role by using
	phy-mode = "rev-mii";
or in RMII PHY role by using
	phy-mode = "rev-rmii";

There are no fixed-link MII or RMII properties in mainline device trees,
and the setup itself is fairly uncommon, so there shouldn't be risks of
breaking compatibility.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-07 12:20:18 -07:00
Vladimir Oltean
29afb83ac9 net: dsa: sja1105: apply RGMII delays based on the fixed-link property
The sja1105 driver has an intermediate way of determining whether the
RGMII delays should be applied by the PHY or by itself: by looking at
the port role (PHY or MAC). The port can be put in the PHY role either
explicitly (sja1105,role-phy) or implicitly (fixed-link).

We want to deprecate the sja1105,role-phy property, so all that remains
is the fixed-link property. Introduce a "fixed_link" array of booleans
in the driver, and use that to determine whether RGMII delays must be
applied or not.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-07 12:20:18 -07:00
George McCollister
1a42624aec net: dsa: xrs700x: allow HSR/PRP supervision dupes for node_table
Add an inbound policy filter which matches the HSR/PRP supervision
MAC range and forwards to the CPU port without discarding duplicates.
This is required to correctly populate time_in[A] and time_in[B] in the
HSR/PRP node_table. Leave the policy disabled by default and
enable/disable it when joining/leaving hsr.

Signed-off-by: George McCollister <george.mccollister@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-04 14:49:28 -07:00
Vladimir Oltean
96c85f51f1 net: dsa: sja1105: some table entries are always present when read dynamically
The SJA1105 has a static configuration comprised of a number of tables
with entries. Some of these can be read and modified at runtime as well,
through the dynamic configuration interface.

As a careful reader can notice from the comments in this file, the
software interface for accessing a table entry through the dynamic
reconfiguration is a bit of a no man's land, and varies wildly across
switch generations and even from one kind of table to another.

I have tried my best to come up with a software representation of a
'common denominator' SPI command to access a table entry through the
dynamic configuration interface:

struct sja1105_dyn_cmd {
	bool search;
	u64 valid; /* must be set to 1 */
	u64 rdwrset; /* 0 to read, 1 to write */
	u64 errors;
	u64 valident; /* 0 if entry is invalid, 1 if valid */
	u64 index;
};

Relevant to this patch is the VALIDENT bit, which for READ commands is
populated by the switch and lets us know if we're looking at junk or at
a real table entry.

In SJA1105, the dynamic reconfiguration interface for management routes
has notably not implemented the VALIDENT bit, leading to a workaround to
ignore this field in sja1105_dynamic_config_read(), as it will be set to
zero, but the data is valid nonetheless.

In SJA1110, this pattern has sadly been abused to death, and while there
are many more tables which can be read back over the dynamic config
interface compared to SJA1105, their handling isn't in any way more
uniform. Generally speaking, if there is a single possible entry in a
given table, and loading that table in the static config is mandatory as
per the documentation, then the VALIDENT bit is deemed as redundant and
more than likely not implemented.

So it is time to make the workaround more official, and add a bit to the
flags implemented by dynamic config tables. It will be used by more
tables when SJA1110 support arrives.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-31 22:40:26 -07:00
Vladimir Oltean
f41fad3cb8 net: dsa: sja1105: always keep RGMII ports in the MAC role
In SJA1105, the xMII Mode Parameters Table field called PHY_MAC denotes
the 'role' of the port, be it a PHY or a MAC. This makes a difference in
the MII and RMII protocols, but RGMII is symmetric, so either PHY or MAC
settings result in the same hardware behavior.

The SJA1110 is different, and the RGMII ports only work when configured
in MAC mode, so keep the port roles in MAC mode unconditionally.

Why we had an RGMII port in the PHY role in the first place was because
we wanted to have a way in the driver to denote whether RGMII delays
should be applied based on the phy-mode property or not. This is already
done in sja1105_parse_rgmii_delays() based on an intermediary
struct sja1105_dt_port (which contains the port role). So it is a
logical fallacy to use the hardware configuration as a scratchpad for
driver data, it isn't necessary.

We can also remove the gating condition for applying RGMII delays only
for ports in the PHY role. The .setup_rgmii_delay() method looks at
the priv->rgmii_rx_delay[port] and priv->rgmii_tx_delay[port] properties
which are already populated properly (in the case of a port in the MAC
role they are false). Removing this condition generates a few more SPI
writes for these ports (clearing the RGMII delays) which are perhaps
useless for SJA1105P/Q/R/S, where we know that the delays are disabled
by default. But for SJA1110, the firmware on the embedded microcontroller
might have done something funny, so it's always a good idea to clear the
RGMII delays if that's what Linux expects.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-31 22:40:26 -07:00
Vladimir Oltean
41fed17fdb net: dsa: sja1105: add a translation table for port speeds
In order to support the new speed of 2500Mbps, the SJA1110 has achieved
the great performance of changing the encoding in the MAC Configuration
Table for the port speeds of 10, 100, 1000 compared to SJA1105.

Because this is a common driver, we need a layer of indirection in order
to program the hardware with the right values irrespective of switch
generation.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-31 22:40:26 -07:00
Vladimir Oltean
91a050782c net: dsa: sja1105: add a PHY interface type compatibility matrix
On the SJA1105, all ports support the parallel "xMII" protocols (MII,
RMII, RGMII) except for port 4 on SJA1105R/S which supports only SGMII.
This was relatively easy to model, by special-casing the SGMII port.

On the SJA1110, certain ports can be pinmuxed between SGMII and xMII, or
between SGMII and an internal 100base-TX PHY. This creates problems,
because the driver's assumption so far was that if a port supports
SGMII, it uses SGMII.

We allow the device tree to tell us how the port pinmuxing is done, and
check that against a PHY interface type compatibility matrix for
plausibility.

The other big change is that instead of doing SGMII configuration based
on what the port supports, we do it based on what is the configured
phy_mode of the port.

The 2500base-x support added in this patch is not complete.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-31 22:40:26 -07:00
Vladimir Oltean
bf4edf4afb net: dsa: sja1105: cache the phy-mode port property
So far we've succeeded in operating without keeping a copy of the
phy-mode in the driver, since we already have the static config and we
can look at the xMII Mode Parameters Table which already holds that
information.

But with the SJA1110, we cannot make the distinction between sgmii and
2500base-x, because to the hardware's static config, it's all SGMII.
So add a phy_mode property per port inside struct sja1105_private.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-31 22:40:25 -07:00
Vladimir Oltean
4c7ee010cf net: dsa: sja1105: the 0x1F0000 SGMII "base address" is actually MDIO_MMD_VEND2
Looking at the SGMII PCS from SJA1110, which is accessed indirectly
through a different base address as can be seen in the next patch, it
appears odd that the address accessed through indirection still
references the base address from the SJA1105S register map (first MDIO
register is at 0x1f0000), when it could index the SGMII registers
starting from zero.

Except that the 0x1f0000 is not a base address at all, it seems. It is
0x1f << 16 | 0x0000, and 0x1f is coding for the vendor-specific MMD2.
So, it turns out, the Synopsys PCS implements all its registers inside
the vendor-specific MMDs 1 and 2 (0x1e and 0x1f). This explains why the
PCS has no overlaps (for the other MMDs) with other register regions of
the switch (because no other MMDs are implemented).

Change the code to remove the SGMII "base address" and explicitly encode
the MMD for reads/writes. This will become necessary for SJA1110 support.

Cc: Russell King <linux@armlinux.org.uk>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-31 22:40:25 -07:00
Vladimir Oltean
84db00f2c0 net: dsa: sja1105: allow SGMII PCS configuration to be per port
The SJA1105 R and S switches have 1 SGMII port (port 4). Because there
is only one such port, there is no "port" parameter in the configuration
code for the SGMII PCS.

However, the SJA1110 can have up to 4 SGMII ports, each with its own
SGMII register map. So we need to generalize the logic.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-31 22:40:25 -07:00
Vladimir Oltean
15074a361f net: dsa: sja1105: be compatible with "ethernet-ports" OF node name
Since commit f2f3e09396be ("net: dsa: sja1105: be compatible with
"ethernet-ports" OF node name"), DSA supports the "ethernet-ports" name
for the container node of the ports, but the sja1105 driver doesn't,
because it handles some device tree parsing of its own.

Add the second node name as a fallback.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-31 22:40:25 -07:00
Yang Yingliang
9fe99de014 net: dsa: qca8k: add missing check return value in qca8k_phylink_mac_config()
Now we can check qca8k_read() return value correctly, so if
it fails, we need return directly.

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-30 14:22:31 -07:00
Yang Yingliang
7c9896e378 net: dsa: qca8k: check return value of read functions correctly
Current return type of qca8k_mii_read32() and qca8k_read() are
unsigned, it can't be negative, so the return value check is
unuseful. For check the return value correctly, change return
type of the read functions and add a output parameter to store
the read value.

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-30 14:22:31 -07:00
Jakub Kicinski
5ada57a9a6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
cdc-wdm: s/kill_urbs/poison_urbs/ to fix build

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-27 09:55:10 -07:00
George McCollister
8c42a49738 net: dsa: microchip: enable phy errata workaround on 9567
Also enable phy errata workaround on 9567 since has the same errata as
the 9477 according to the manufacture's documentation.

Signed-off-by: George McCollister <george.mccollister@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-24 14:27:18 -07:00
Vladimir Oltean
1bf658eefe net: dsa: sja1105: allow the frame buffer size to be customized
The shared frame buffer of the SJA1110 is larger than that of SJA1105,
which is natural due to the fact that there are more ports.

Introduce yet another property in struct sja1105_info which encodes the
maximum number of 128 byte blocks that can be used for frame buffers.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-24 13:59:04 -07:00
Vladimir Oltean
38fbe91f22 net: dsa: sja1105: configure the multicast policers, if present
The SJA1110 policer array is similar in layout with SJA1105, except it
contains one multicast policer per port at the end.

Detect the presence of multicast policers based on the maximum number of
supported L2 Policing Table entries, and make those policers have a
shared index equal to the port's default policer. Letting the user
configure these policers is not supported at the moment.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-24 13:59:03 -07:00
Vladimir Oltean
f78a2517cf net: dsa: sja1105: use sja1105_xfer_u32 for the reset procedure
Using sja1105_xfer_buf results in a higher overhead and is harder to
read.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-24 13:59:03 -07:00
Vladimir Oltean
fd6f2c257b net: dsa: sja1105: dynamically choose the number of static config table entries
Due to the fact that the port count is different, some static config
tables have a different number of elements in SJA1105 compared to
SJA1110. Such an example is the L2 Policing table, which has 45 entries
in SJA1105 (one per port x traffic class, and one broadcast policer per
port) and 110 entries in SJA1110 (one per port x traffic class, one
broadcast and one multicast policer per port).

Similarly, the MAC Configuration Table, the L2 Forwarding table, all
have a different number of elements simply because the port count is
different, and although this can be accounted for by looking at
ds->ports, the policing table can't because of the presence of the extra
multicast policers.

The common denominator for the static config initializers for these
tables is that they must set up all the entries within that table.
So the simplest way to account for these differences in a uniform manner
is to look at struct sja1105_table_ops::max_entry_count. For the sake of
uniformity, this patch makes that change also for tables whose number of
elements did not change in SJA1110, like the xMII Mode Parameters, the
L2 Lookup Parameters, General Parameters, AVB Parameters (all of these
are singleton tables with a single entry).

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-24 13:59:03 -07:00
Vladimir Oltean
c50376783f net: dsa: sja1105: skip CGU configuration if it's unnecessary
There are two distinct code paths which enter sja1105_clocking.c, one
through sja1105_clocking_setup() and the other through
sja1105_clocking_setup_port():

sja1105_static_config_reload      sja1105_setup
              |                         |
              |      +------------------+
              |      |
              v      v
   sja1105_clocking_setup               sja1105_adjust_port_config
                 |                                   |
                 v                                   |
      sja1105_clocking_setup_port <------------------+

As opposed to SJA1105, the SJA1110 does not need any configuration of
the Clock Generation Unit in order for xMII ports to work. Just RGMII
internal delays need to be configured, and that is done inside
sja1105_clocking_setup_port for the RGMII ports.

So this patch introduces the concept of a "reserved address", which the
CGU configuration functions from sja1105_clocking.c must check before
proceeding to do anything. The SJA1110 will have reserved addresses for
the CGU PLLs for MII/RMII/RGMII.

Additionally, make sja1105_clocking_setup() a function pointer so it can
be overridden by the SJA1110. Even though nothing port-related needs to
be done in the CGU, there are some operations such as disabling the
watchdog clock which are unique to the SJA1110.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-24 13:59:03 -07:00
Vladimir Oltean
df2a81a35e net: dsa: sja1105: don't assign the host port using dsa_upstream_port()
If @port is unused, then dsa_upstream_port(ds, port) returns @port,
which means we cannot assume the CPU port can be retrieved this way.

The sja1105 switches support a single CPU port, so just iterate over the
switch ports and stop at the first CPU port we see.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-24 13:59:03 -07:00
Vladimir Oltean
82760d7f2e net: dsa: sja1105: dimension the data structures for a larger port count
Introduce a SJA1105_MAX_NUM_PORTS macro which at the moment is equal to
SJA1105_NUM_PORTS (5). With the introduction of SJA1110, these
structures will need to hold information for up to 11 ports.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-24 13:59:03 -07:00
Vladimir Oltean
f238fef1b3 net: dsa: sja1105: avoid some work for unused ports
Do not put unused ports in the forwarding domain, and do not allocate
FDB entries for dynamic address learning for them.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-24 13:59:03 -07:00
Vladimir Oltean
542043e91d net: dsa: sja1105: parameterize the number of ports
The sja1105 driver will gain support for the next-gen SJA1110 switch,
which is very similar except for the fact it has more than 5 ports.

So we need to replace the hardcoded SJA1105_NUM_PORTS in this driver
with ds->num_ports. This patch is as mechanical as possible (save for
the fact that ds->num_ports is not an integer constant expression).

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-24 13:59:03 -07:00
Vladimir Oltean
b38e659de9 net: dsa: sja1105: update existing VLANs from the bridge VLAN list
When running this sequence of operations:

ip link add br0 type bridge vlan_filtering 1
ip link set swp4 master br0
bridge vlan add dev swp4 vid 1

We observe the traffic sent on swp4 is still untagged, even though the
bridge has overwritten the existing VLAN entry:

port    vlan ids
swp4     1 PVID

br0      1 PVID Egress Untagged

This happens because we didn't consider that the 'bridge vlan add'
command just overwrites VLANs like it's nothing. We treat the 'vid 1
pvid untagged' and the 'vid 1' as two separate VLANs, and the first
still has precedence when calling sja1105_build_vlan_table. Obviously
there is a disagreement regarding semantics, and we end up doing
something unexpected from the PoV of the bridge.

Let's actually consider an "existing VLAN" to be one which is on the
same port, and has the same VLAN ID, as one we already have, and update
it if it has different flags than we do.

The first blamed commit is the one introducing the bug, the second one
is the latest on top of which the bugfix still applies.

Fixes: ec5ae61076 ("net: dsa: sja1105: save/restore VLANs using a delta commit method")
Fixes: 5899ee367a ("net: dsa: tag_8021q: add a context structure")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-24 13:20:24 -07:00
Vladimir Oltean
ed040abca4 net: dsa: sja1105: use 4095 as the private VLAN for untagged traffic
One thing became visible when writing the blamed commit, and that was
that STP and PTP frames injected by net/dsa/tag_sja1105.c using the
deferred xmit mechanism are always classified to the pvid of the CPU
port, regardless of whatever VLAN there might be in these packets.

So a decision needed to be taken regarding the mechanism through which
we should ensure that delivery of STP and PTP traffic is possible when
we are in a VLAN awareness mode that involves tag_8021q. This is because
tag_8021q is not concerned with managing the pvid of the CPU port, since
as far as tag_8021q is concerned, no traffic should be sent as untagged
from the CPU port. So we end up not actually having a pvid on the CPU
port if we only listen to tag_8021q, and unless we do something about it.

The decision taken at the time was to keep VLAN 1 in the list of
priv->dsa_8021q_vlans, and make it a pvid of the CPU port. This ensures
that STP and PTP frames can always be sent to the outside world.

However there is a problem. If we do the following while we are in
the best_effort_vlan_filtering=true mode:

ip link add br0 type bridge vlan_filtering 1
ip link set swp2 master br0
bridge vlan del dev swp2 vid 1

Then untagged and pvid-tagged frames should be dropped. But we observe
that they aren't, and this is because of the precaution we took that VID
1 is always installed on all ports.

So clearly VLAN 1 is not good for this purpose. What about VLAN 0?
Well, VLAN 0 is managed by the 8021q module, and that module wants to
ensure that 802.1p tagged frames are always received by a port, and are
always transmitted as VLAN-tagged (with VLAN ID 0). Whereas we want our
STP and PTP frames to be untagged if the stack sent them as untagged -
we don't want the driver to just decide out of the blue that it adds
VID 0 to some packets.

So what to do?

Well, there is one other VLAN that is reserved, and that is 4095:
$ ip link add link swp2 name swp2.4095 type vlan id 4095
Error: 8021q: Invalid VLAN id.
$ bridge vlan add dev swp2 vid 4095
Error: bridge: Vlan id is invalid.

After we made this change, VLAN 1 is indeed forwarded and/or dropped
according to the bridge VLAN table, there are no further alterations
done by the sja1105 driver.

Fixes: ec5ae61076 ("net: dsa: sja1105: save/restore VLANs using a delta commit method")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-24 13:20:24 -07:00
Vladimir Oltean
6729188d26 net: dsa: sja1105: error out on unsupported PHY mode
The driver continues probing when a port is configured for an
unsupported PHY interface type, instead it should stop.

Fixes: 8aa9ebccae ("net: dsa: Introduce driver for NXP SJA1105 5-port L2 switch")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-24 13:20:24 -07:00
Vladimir Oltean
cec279a898 net: dsa: sja1105: add error handling in sja1105_setup()
If any of sja1105_static_config_load(), sja1105_clocking_setup() or
sja1105_devlink_setup() fails, we can't just return in the middle of
sja1105_setup() or memory will leak. Add a cleanup path.

Fixes: 0a7bdbc23d ("net: dsa: sja1105: move devlink param code to sja1105_devlink.c")
Fixes: 8aa9ebccae ("net: dsa: Introduce driver for NXP SJA1105 5-port L2 switch")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-24 13:20:24 -07:00
Vladimir Oltean
dc596e3fe6 net: dsa: sja1105: call dsa_unregister_switch when allocating memory fails
Unlike other drivers which pretty much end their .probe() execution with
dsa_register_switch(), the sja1105 does some extra stuff. When that
fails with -ENOMEM, the driver is quick to return that, forgetting to
call dsa_unregister_switch(). Not critical, but a bug nonetheless.

Fixes: 4d7525085a ("net: dsa: sja1105: offload the Credit-Based Shaper qdisc")
Fixes: a68578c20a ("net: dsa: Make deferred_xmit private to sja1105")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-24 13:20:24 -07:00
Vladimir Oltean
ba61cf167c net: dsa: sja1105: fix VL lookup command packing for P/Q/R/S
At the beginning of the sja1105_dynamic_config.c file there is a diagram
of the dynamic config interface layout:

 packed_buf

 |
 V
 +-----------------------------------------+------------------+
 |              ENTRY BUFFER               |  COMMAND BUFFER  |
 +-----------------------------------------+------------------+

 <----------------------- packed_size ------------------------>

So in order to pack/unpack the command bits into the buffer,
sja1105_vl_lookup_cmd_packing must first advance the buffer pointer by
the length of the entry. This is similar to what the other *cmd_packing
functions do.

This bug exists because the command packing function for P/Q/R/S was
copied from the E/T generation, and on E/T, the command was actually
embedded within the entry buffer itself.

Fixes: 94f94d4acf ("net: dsa: sja1105: add static tables for virtual links")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-24 13:20:24 -07:00
DENG Qingfang
474a2ddaa1 net: dsa: mt7530: fix VLAN traffic leaks
PCR_MATRIX field was set to all 1's when VLAN filtering is enabled, but
was not reset when it is disabled, which may cause traffic leaks:

	ip link add br0 type bridge vlan_filtering 1
	ip link add br1 type bridge vlan_filtering 1
	ip link set swp0 master br0
	ip link set swp1 master br1
	ip link set br0 type bridge vlan_filtering 0
	ip link set br1 type bridge vlan_filtering 0
	# traffic in br0 and br1 will start leaking to each other

As port_bridge_{add,del} have set up PCR_MATRIX properly, remove the
PCR_MATRIX write from mt7530_port_set_vlan_aware.

Fixes: 83163f7dca ("net: dsa: mediatek: add VLAN support for MT7530")
Signed-off-by: DENG Qingfang <dqfext@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-23 17:24:53 -07:00
Florian Fainelli
fc516d3a6a net: dsa: bcm_sf2: Fix bcm_sf2_reg_rgmii_cntrl() call for non-RGMII port
We cannot call bcm_sf2_reg_rgmii_cntrl() for a port that is not RGMII,
yet we do that in bcm_sf2_sw_mac_link_up() irrespective of the port's
interface. Move that read until we have properly qualified the PHY
interface mode. This avoids triggering a warning on 7278 platforms that
have GMII ports.

Fixes: 55cfeb3969 ("net: dsa: bcm_sf2: add function finding RGMII register")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Rafał Miłecki <rafal@milecki.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-21 14:18:34 -07:00
Vladimir Oltean
039b167d68 net: dsa: sja1105: don't use burst SPI reads for port statistics
The current internal sja1105 driver API is optimized for retrieving many
statistics counters at once. But the switch does not do atomic snapshotting
for them anyway.

In case we start reporting the hardware port counters through
ndo_get_stats64 as well, not just ethtool, it would be good to be able
to read individual port counters and not all of them.

Additionally, since Arnd Bergmann's commit ae1804de93 ("dsa: sja1105:
dynamically allocate stats structure"), sja1105_get_ethtool_stats
allocates memory dynamically, since struct sja1105_port_status was
deemed to consume too much stack memory. That is not ideal.
The large structure is only needed because of the burst read.
If we read statistics one by one, we can consume less memory, and
we can avoid dynamic allocation.

Additionally, latency-sensitive interfaces such as PTP operations (for
phc2sys) might suffer if the SPI mutex is being held for too long, which
happens in the case of SPI burst reads. By reading counters one by one,
we give a chance for higher priority processes to preempt and take the
SPI bus mutex for accessing the PTP clock.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-21 14:01:41 -07:00
Vladimir Oltean
30a2e9c0f5 net: dsa: sja1105: stop reporting the queue levels in ethtool port counters
The queue levels are not counters, but instead they represent the
occupancy of the MAC TX queues. Having these in ethtool port counters is
not helpful, so remove them.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-21 14:01:41 -07:00
Vladimir Oltean
718bad0e4d net: dsa: sja1105: adapt to a SPI controller with a limited max transfer size
The static config of the sja1105 switch is a long stream of bytes which
is programmed to the hardware in chunks (portions with the chip select
continuously asserted) of max 256 bytes each. Each chunk is a
spi_message composed of 2 spi_transfers: the buffer with the data and a
preceding buffer with the SPI access header.

Only that certain SPI controllers, such as the spi-sc18is602 I2C-to-SPI
bridge, cannot keep the chip select asserted for that long.
The spi_max_transfer_size() and spi_max_message_size() functions are how
the controller can impose its hardware limitations upon the SPI
peripheral driver.

For the sja1105 driver to work with these controllers, both buffers must
be smaller than the transfer limit, and their sum must be smaller than
the message limit.

Regression-tested on a switch connected to a controller with no
limitations (spi-fsl-dspi) as well as with one with caps for both
max_transfer_size and max_message_size (spi-sc18is602).

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-21 13:23:29 -07:00
Vladimir Oltean
ca021f0dd8 net: dsa: sja1105: send multiple spi_messages instead of using cs_change
The sja1105 driver has been described by Mark Brown as "not using the
[ SPI ] API at all idiomatically" due to the use of cs_change:
https://patchwork.kernel.org/project/netdevbpf/patch/20210520135031.2969183-1-olteanv@gmail.com/

According to include/linux/spi/spi.h, the chip select is supposed to be
asserted for the entire length of a SPI message, as long as cs_change is
false for all member transfers. The cs_change flag changes the following:

(i) When a non-final SPI transfer has cs_change = true, the chip select
    should temporarily deassert and then reassert starting with the next
    transfer.
(ii) When a final SPI transfer has cs_change = true, the chip select
     should remain asserted until the following SPI message.

The sja1105 driver only uses cs_change for its first property, to form a
single SPI message whose layout can be seen below:

                                             this is an entire, single spi_message
           _______________________________________________________________________________________________
          /                                                                                               \
          +-------------+---------------+-------------+---------------+ ... +-------------+---------------+
          | hdr_xfer[0] | chunk_xfer[0] | hdr_xfer[1] | chunk_xfer[1] |     | hdr_xfer[n] | chunk_xfer[n] |
          +-------------+---------------+-------------+---------------+ ... +-------------+---------------+
cs_change      false          true           false           true                false          false

           ____________________________  _____________________________       _____________________________
CS line __/                            \/                             \ ... /                             \__

The fact of the matter is that spi_max_message_size() has an ambiguous
meaning if any non-final transfer has cs_change = true.

If the SPI master has a limitation in that it cannot keep the chip
select asserted for more than, say, 200 bytes (like the spi-sc18is602),
the normal thing for it to do is to implement .max_transfer_size and
.max_message_size, and limit both to 200: in the "worst case" where
cs_change is always false, then the controller can, indeed, not send
messages larger than 200 bytes.

But the fact that the SPI controller's max_message_size does not
necessarily mean that we cannot send messages larger than that.
Notably, if the SPI master special-cases the transfers with cs_change
and treats every chip select toggling as an entirely new transaction,
then a SPI message can easily exceed that limit. So there is a
temptation to ignore the controller's reported max_message_size when
using cs_change = true in non-final transfers.

But that can lead to false conclusions. As Mark points out, the SPI
controller might have a different kind of limitation with the max
message size, that has nothing at all to do with how long it can keep
the chip select asserted.
For example, that might be the case if the device is able to offload the
chip select changes to the hardware as part of the data stream, and it
packs the entire stream of commands+data (corresponding to a SPI
message) into a single DMA transfer that is itself limited in size.

So the only thing we can do is avoid ambiguity by not using cs_change at
all. Instead of sending a single spi_message, we now send multiple SPI
messages as follows:

                  spi_message 0                 spi_message 1                       spi_message n
           ____________________________   ___________________________        _____________________________
          /                            \ /                           \      /                             \
          +-------------+---------------+-------------+---------------+ ... +-------------+---------------+
          | hdr_xfer[0] | chunk_xfer[0] | hdr_xfer[1] | chunk_xfer[1] |     | hdr_xfer[n] | chunk_xfer[n] |
          +-------------+---------------+-------------+---------------+ ... +-------------+---------------+
cs_change      false          true           false           true                false          false

           ____________________________  _____________________________       _____________________________
CS line __/                            \/                             \ ... /                             \__

which is clearer because the max_message_size limit is now easier to
enforce. What is transmitted on the wire stays, of course, the same.

Additionally, because we send no more than 2 transfers at a time, we now
avoid dynamic memory allocation too, which might be seen as an
improvement by some.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-21 13:23:29 -07:00
DENG Qingfang
ba751e28d4 net: dsa: mt7530: add interrupt support
Add support for MT7530 interrupt controller to handle internal PHYs.
In order to assign an IRQ number to each PHY, the registration of MDIO bus
is also done in this driver.

Signed-off-by: DENG Qingfang <dqfext@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-19 13:27:42 -07:00
Wei Yongjun
0d56e5c191 net: dsa: qca8k: fix missing unlock on error in qca8k_vlan_(add|del)
Add the missing unlock before return from function qca8k_vlan_add()
and qca8k_vlan_del() in the error handling case.

Fixes: 028f5f8ef4 ("net: dsa: qca8k: handle error with qca8k_read operation")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-18 13:39:32 -07:00
Ansuel Smith
a46aec02bc net: dsa: qca8k: pass switch_revision info to phy dev_flags
Define get_phy_flags to pass switch_Revision needed to tweak the
internal PHY with debug values based on the revision.

Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-14 15:30:23 -07:00
Ansuel Smith
b7ebac354d net: dsa: qca8k: improve internal mdio read/write bus access
Improve the internal mdio read/write bus access by caching the value
without accessing it for every read/write.

Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-14 15:30:23 -07:00
Ansuel Smith
759bafb8a3 net: dsa: qca8k: add support for internal phy and internal mdio
Add support to setup_mdio_bus for internal phy declaration. Introduce a
flag to use the legacy port phy mapping by default and use the direct
mapping if a mdio node is detected in the switch node. Register a
dedicated mdio internal mdio bus to address the different mapping
between port and phy if the mdio node is detected.

Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-14 15:30:22 -07:00
Ansuel Smith
617960d72e net: dsa: qca8k: enlarge mdio delay and timeout
The witch require some extra delay after setting page or the next
read/write can use still use the old page. Add a delay after the
set_page function to address this as it's done in QSDK legacy driver.
Some timeouts were notice with VLAN and phy function, enlarge the
mdio busy wait timeout to fix these problems.

Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-14 15:30:22 -07:00
Ansuel Smith
60df02b6ea net: dsa: qca8k: dsa: qca8k: protect MASTER busy_wait with mdio mutex
MDIO_MASTER operation have a dedicated busy wait that is not protected
by the mdio mutex. This can cause situation where the MASTER operation
is done and a normal operation is executed between the MASTER read/write
and the MASTER busy_wait. Rework the qca8k_mdio_read/write function to
address this issue by binding the lock for the whole MASTER operation
and not only the mdio read/write common operation.

Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-14 15:30:22 -07:00
Ansuel Smith
63c33bbfeb net: dsa: qca8k: clear MASTER_EN after phy read/write
Clear MDIO_MASTER_EN bit from MDIO_MASTER_CTRL after read/write
operation. The MDIO_MASTER_EN bit is not reset after read/write
operation and the next operation can be wrongly interpreted by the
switch as a mdio operation. This cause a production of wrong/garbage
data from the switch and underfined bheavior. (random port drop,
unplugged port flagged with link up, wrong port speed)
Also on driver remove the MASTER_CTRL can be left set and cause the
malfunction of any next driver using the mdio device.

Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-14 15:30:22 -07:00
Ansuel Smith
e4b9977cee net: dsa: qca8k: make rgmii delay configurable
The legacy qsdk code used a different delay instead of the max value.
Qsdk use 1 ns for rx and 2 ns for tx. Make these values configurable
using the standard rx/tx-internal-delay-ps ethernet binding and apply
qsdk values by default. The connected gmac doesn't add any delay so no
additional delay is added to tx/rx.
On this switch the delay is actually in ns so value should be in the
1000 order. Any value converted from ps to ns by dividing it by 1000
as the switch max value for delay is 3ns.

Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-14 15:30:22 -07:00
Ansuel Smith
1ee0591a10 net: dsa: qca8k: add ethernet-ports fallback to setup_mdio_bus
Dsa now also supports ethernet-ports. Add this new binding as a fallback
if the ports node can't be found.

Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-14 15:30:22 -07:00
Ansuel Smith
95ffeaf18b net: dsa: qca8k: add support for switch rev
qca8k internal phy driver require some special debug value to be set
based on the switch revision. Rework the switch id read function to
also read the chip revision.

Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-14 15:30:22 -07:00
Ansuel Smith
0fc57e4b5e net: dsa: qca8k: add GLOBAL_FC settings needed for qca8327
Switch qca8327 needs special settings for the GLOBAL_FC_THRES regs.

Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-14 15:30:22 -07:00
Ansuel Smith
5bf9ff3b9f net: dsa: qca8k: limit port5 delay to qca8337
Limit port5 rx delay to qca8337. This is taken from the legacy QSDK code
that limits the rx delay on port5 to only this particular switch version,
on other switch only the tx and rx delay for port0 are needed.

Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-14 15:30:22 -07:00
Ansuel Smith
83a3ceb39b net: dsa: qca8k: add priority tweak to qca8337 switch
The port 5 of the qca8337 have some problem in flood condition. The
original legacy driver had some specific buffer and priority settings
for the different port suggested by the QCA switch team. Add this
missing settings to improve switch stability under load condition.
The packet priority tweak is only needed for the qca8337 switch and
other qca8k switch are not affected.

Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-14 15:30:22 -07:00
Ansuel Smith
6e82a457e0 net: dsa: qca8k: add support for qca8327 switch
qca8327 switch is a low tier version of the more recent qca8337.
It does share the same regs used by the qca8k driver and can be
supported with minimal change.

Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-14 15:30:22 -07:00
Ansuel Smith
b7c818d194 net: dsa: qca8k: handle error from qca8k_busy_wait
Propagate errors from qca8k_busy_wait instead of hardcoding return
value.

Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-14 15:30:22 -07:00
Ansuel Smith
aaf421425c net: dsa: qca8k: handle error with qca8k_rmw operation
qca8k_rmw can fail. Rework any user to handle error values and
correctly return. Change qca8k_rmw to return the error code or 0 instead
of the reg value. The reg returned by qca8k_rmw wasn't used anywhere,
so this doesn't cause any functional change.

Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-14 15:30:22 -07:00
Ansuel Smith
d7805757c7 net: dsa: qca8k: handle error with qca8k_write operation
qca8k_write can fail. Rework any user to handle error values and
correctly return.

Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-14 15:30:22 -07:00
Ansuel Smith
028f5f8ef4 net: dsa: qca8k: handle error with qca8k_read operation
qca8k_read can fail. Rework any user to handle error values and
correctly return.

Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-14 15:30:22 -07:00
Ansuel Smith
ba5707ec58 net: dsa: qca8k: handle qca8k_set_page errors
With a remote possibility, the set_page function can fail. Since this is
a critical part of the write/read qca8k regs, propagate the error and
terminate any read/write operation.

Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-14 15:30:22 -07:00
Ansuel Smith
504bf65931 net: dsa: qca8k: improve qca8k read/write/rmw bus access
Put bus in local variable to improve faster access to the mdio bus.

Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-14 15:30:22 -07:00
Ansuel Smith
2ad255f2fa net: dsa: qca8k: use iopoll macro for qca8k_busy_wait
Use iopoll macro instead of while loop.

Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-14 15:30:21 -07:00
Ansuel Smith
5d9e068402 net: dsa: qca8k: change simple print to dev variant
Change pr_err and pr_warn to dev variant.

Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-14 15:30:21 -07:00
Michael Walle
297c4de6f7 net: dsa: felix: re-enable TAS guard band mode
Commit 316bcffe44 ("net: dsa: felix: disable always guard band bit for
TAS config") disabled the guard band and broke 802.3Qbv compliance.

There are two issues here:
 (1) Without the guard band the end of the scheduling window could be
     overrun by a frame in transit.
 (2) Frames that don't fit into a configured window will still be sent.

The reason for both issues is that the switch will schedule the _start_
of a frame transmission inside the predefined window without taking the
length of the frame into account. Thus, we'll need the guard band which
will close the gate early, so that a complete frame can still be sent.
Revert the commit and add a note.

For a lengthy discussion see [1].

[1] https://lore.kernel.org/netdev/c7618025da6723418c56a54fe4683bd7@walle.cc/

Fixes: 316bcffe44 ("net: dsa: felix: disable always guard band bit for TAS config")
Signed-off-by: Michael Walle <michael@walle.cc>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-10 14:48:55 -07:00
Oleksij Rempel
d4eecfb28b net: dsa: ksz: ksz8863_smi_probe: set proper return value for ksz_switch_alloc()
ksz_switch_alloc() will return NULL only if allocation is failed. So,
the proper return value is -ENOMEM.

Fixes: 60a3647600 ("net: dsa: microchip: Add Microchip KSZ8863 SMI based driver support")
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-29 15:54:35 -07:00
Oleksij Rempel
ba46b576a7 net: dsa: ksz: ksz8795_spi_probe: fix possible NULL pointer dereference
Fix possible NULL pointer dereference in case devm_kzalloc() failed to
allocate memory

Fixes: cc13e52c3a ("net: dsa: microchip: Add Microchip KSZ8863 SPI based driver support")
Reported-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-29 15:54:35 -07:00
Oleksij Rempel
d27f0201b9 net: dsa: ksz: ksz8863_smi_probe: fix possible NULL pointer dereference
Fix possible NULL pointer dereference in case devm_kzalloc() failed to
allocate memory.

Fixes: 60a3647600 ("net: dsa: microchip: Add Microchip KSZ8863 SMI based driver support")
Reported-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-29 15:54:35 -07:00
Colin Ian King
12c2bb96c3 net: dsa: ksz: Make reg_mib_cnt a u8 as it never exceeds 255
Currently the for-loop in ksz8_port_init_cnt is causing a static
analysis infinite loop warning with the comparison of
mib->cnt_ptr < dev->reg_mib_cnt. This occurs because mib->cnt_ptr
is a u8 and dev->reg_mib_cnt is an int and the analyzer determines
that mib->cnt_ptr potentially can wrap around to zero if the value
in dev->reg_mib_cnt is > 255. However, this value is never this
large, it is always less than 256 so make reg_mib_cnt a u8.

Addresses-Coverity: ("Infinite loop")
Fixes: e66f840c08 ("net: dsa: ksz: Add Microchip KSZ8795 DSA driver")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20210428120010.337959-1-colin.king@canonical.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-04-28 13:56:27 -07:00
Michael Grzeschik
60a3647600 net: dsa: microchip: Add Microchip KSZ8863 SMI based driver support
Add KSZ88X3 driver support. We add support for the KXZ88X3 three port
switches using the Microchip SMI Interface. They are supported using the
MDIO-Bitbang Interface.

Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-27 14:13:24 -07:00
Michael Grzeschik
cc13e52c3a net: dsa: microchip: Add Microchip KSZ8863 SPI based driver support
Add KSZ88X3 driver support. We add support for the KXZ88X3 three port
switches using the SPI Interface.

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-27 14:13:24 -07:00
Oleksij Rempel
4b20a07e10 net: dsa: microchip: ksz8795: add support for ksz88xx chips
We add support for the ksz8863 and ksz8873 chips which are
using the same register patterns but other offsets as the
ksz8795.

Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-27 14:13:24 -07:00
Michael Grzeschik
9f73e11250 net: dsa: microchip: ksz8795: move register offsets and shifts to separate struct
In order to get this driver used with other switches the functions need
to use different offsets and register shifts. This patch changes the
direct use of the register defines to register description structures,
which can be set depending on the chips register layout.

Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-27 14:13:23 -07:00
Michael Grzeschik
c2ac4d2ac5 net: dsa: microchip: ksz8795: move cpu_select_interface to extra function
This patch moves the cpu interface selection code to a individual
function specific for ksz8795. It will make it simpler to customize the
code path for different switches supported by this driver.

Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-27 14:13:23 -07:00
Michael Grzeschik
4b5baca040 net: dsa: microchip: ksz8795: change drivers prefix to be generic
The driver can be used on other chips of this type. To reflect
this we rename the drivers prefix from ksz8795 to ksz8.

Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-27 14:13:23 -07:00
Yangbo Lu
682eaad93e net: mscc: ocelot: convert to ocelot_port_txtstamp_request()
Convert to a common ocelot_port_txtstamp_request() for TX timestamp
request handling.

Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-27 14:10:15 -07:00
Yangbo Lu
c4b364ce12 net: dsa: free skb->cb usage in core driver
Free skb->cb usage in core driver and let device drivers decide to
use or not. The reason having a DSA_SKB_CB(skb)->clone was because
dsa_skb_tx_timestamp() which may set the clone pointer was called
before p->xmit() which would use the clone if any, and the device
driver has no way to initialize the clone pointer.

This patch just put memset(skb->cb, 0, sizeof(skb->cb)) at beginning
of dsa_slave_xmit(). Some new features in the future, like one-step
timestamp may need more bytes of skb->cb to use in
dsa_skb_tx_timestamp(), and p->xmit().

Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-27 14:10:15 -07:00
Yangbo Lu
5c5416f5d4 net: dsa: no longer clone skb in core driver
It was a waste to clone skb directly in dsa_skb_tx_timestamp().
For one-step timestamping, a clone was not needed. For any failure of
port_txtstamp (this may usually happen), the skb clone had to be freed.

So this patch moves skb cloning for tx timestamp out of dsa core, and
let drivers clone skb in port_txtstamp if they really need.

Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Tested-by: Kurt Kanzenbach <kurt@linutronix.de>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-27 14:10:15 -07:00
Yangbo Lu
cf536ea3c7 net: dsa: no longer identify PTP packet in core driver
Move ptp_classify_raw out of dsa core driver for handling tx
timestamp request. Let device drivers do this if they want.
Not all drivers want to limit tx timestamping for only PTP
packet.

Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Tested-by: Kurt Kanzenbach <kurt@linutronix.de>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-27 14:10:15 -07:00
Yangbo Lu
cfd12c06cd net: dsa: check tx timestamp request in core driver
Check tx timestamp request in core driver at very beginning of
dsa_skb_tx_timestamp(), so that most skbs not requiring tx
timestamp just return. And drop such checking in device drivers.

Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Tested-by: Kurt Kanzenbach <kurt@linutronix.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-27 14:10:15 -07:00
Tobias Waldekranz
6066234aa3 net: dsa: mv88e6xxx: Fix 6095/6097/6185 ports in non-SERDES CMODE
The .serdes_get_lane op used the magic value 0xff to indicate a valid
SERDES lane and 0 signaled that a non-SERDES mode was set on the port.

Unfortunately, "0" is also a valid lane ID, so even when these ports
where configured to e.g. RGMII the driver would set them up as SERDES
ports.

- Replace 0xff with 0 to indicate a valid lane ID. The number is on
  the one hand just as arbitrary, but it is at least the first valid one
  and therefore less of a surprise.

- Follow the other .serdes_get_lane implementations and return -ENODEV
  in the case where no SERDES is assigned to the port.

Fixes: f5be107c33 ("net: dsa: mv88e6xxx: Support serdes ports on MV88E6097/6095/6185")
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-27 14:06:19 -07:00
Tobias Waldekranz
836021a2d0 net: dsa: mv88e6xxx: Export cross-chip PVT as devlink region
Export the raw PVT data in a devlink region so that it can be
inspected from userspace and compared to the current bridge
configuration.

Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-21 10:25:09 -07:00
Tobias Waldekranz
281140a0a2 net: dsa: mv88e6xxx: Fix off-by-one in VTU devlink region size
In the unlikely event of the VTU being loaded to the brim with 4k
entries, the last one was placed in the buffer, but the size reported
to devlink was off-by-one. Make sure that the final entry is available
to the caller.

Fixes: ca4d632aef ("net: dsa: mv88e6xxx: Export VTU as devlink region")
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-21 10:25:09 -07:00
Tobias Waldekranz
78e70dbcfd net: dsa: mv88e6xxx: Correct spelling of define "ADRR" -> "ADDR"
Because ADRR is not a thing.

Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-21 10:25:09 -07:00
Tobias Waldekranz
9a99bef5f8 net: dsa: mv88e6xxx: Allow dynamic reconfiguration of tag protocol
For devices that supports both regular and Ethertyped DSA tags, allow
the user to change the protocol.

Additionally, because there are ethernet controllers that do not
handle regular DSA tags in all cases, also allow the protocol to be
changed on devices with undocumented support for EDSA. But, in those
cases, make sure to log the fact that an undocumented feature has been
enabled.

Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-20 16:51:19 -07:00
Tobias Waldekranz
670bb80f81 net: dsa: mv88e6xxx: Mark chips with undocumented EDSA tag support
All devices are capable of using regular DSA tags. Support for
Ethertyped DSA tags sort into three categories:

1. No support. Older chips fall into this category.

2. Full support. Datasheet explicitly supports configuring the CPU
   port to receive FORWARDs with a DSA tag.

3. Undocumented support. Datasheet lists the configuration from
   category 2 as "reserved for future use", but does empirically
   behave like a category 2 device.

So, instead of listing the one true protocol that should be used by a
particular chip, specify the level of support for EDSA (support for
regular DSA is implicit on all chips). As before, we use EDSA for all
chips that fully supports it.

In upcoming changes, we will use this information to support
dynamically changing the tag protocol.

Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-20 16:51:19 -07:00
Marek Behún
c5d015b0e0 net: dsa: mv88e6xxx: simulate Amethyst PHY model number
Amethyst internal PHYs also report empty model number in MII_PHYSID2.

Fill in switch product number, as is done for Topaz and Peridot.

Signed-off-by: Marek Behún <kabel@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-20 16:27:54 -07:00
Xiaoliang Yang
316bcffe44 net: dsa: felix: disable always guard band bit for TAS config
ALWAYS_GUARD_BAND_SCH_Q bit in TAS config register is descripted as
this:
	0: Guard band is implemented for nonschedule queues to schedule
	   queues transition.
	1: Guard band is implemented for any queue to schedule queue
	   transition.

The driver set guard band be implemented for any queue to schedule queue
transition before, which will make each GCL time slot reserve a guard
band time that can pass the max SDU frame. Because guard band time could
not be set in tc-taprio now, it will use about 12000ns to pass 1500B max
SDU. This limits each GCL time interval to be more than 12000ns.

This patch change the guard band to be only implemented for nonschedule
queues to schedule queues transition, so that there is no need to reserve
guard band on each GCL. Users can manually add guard band time for each
schedule queues in their configuration if they want.

Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-20 16:09:42 -07:00
Jakub Kicinski
8203c7ce4e Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
 - keep the ZC code, drop the code related to reinit
net/bridge/netfilter/ebtables.c
 - fix build after move to net_generic

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-04-17 11:08:07 -07:00
René van Dorst
40b5d2f15c net: dsa: mt7530: Add support for EEE features
This patch adds EEE support.

Signed-off-by: René van Dorst <opensource@vdorst.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-13 14:28:08 -07:00
Pali Rohár
1fe976d308 net: phy: marvell: fix detection of PHY on Topaz switches
Since commit fee2d54641 ("net: phy: marvell: mv88e6390 temperature
sensor reading"), Linux reports the temperature of Topaz hwmon as
constant -75°C.

This is because switches from the Topaz family (88E6141 / 88E6341) have
the address of the temperature sensor register different from Peridot.

This address is instead compatible with 88E1510 PHYs, as was used for
Topaz before the above mentioned commit.

Create a new mapping table between switch family and PHY ID for families
which don't have a model number. And define PHY IDs for Topaz and Peridot
families.

Create a new PHY ID and a new PHY driver for Topaz's internal PHY.
The only difference from Peridot's PHY driver is the HWMON probing
method.

Prior this change Topaz's internal PHY is detected by kernel as:

  PHY [...] driver [Marvell 88E6390] (irq=63)

And afterwards as:

  PHY [...] driver [Marvell 88E6341 Family] (irq=63)

Signed-off-by: Pali Rohár <pali@kernel.org>
BugLink: https://github.com/globalscaletechnologies/linux/issues/1
Fixes: fee2d54641 ("net: phy: marvell: mv88e6390 temperature sensor reading")
Reviewed-by: Marek Behún <kabel@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-12 14:20:19 -07:00
Jakub Kicinski
8859a44ea0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Conflicts:

MAINTAINERS
 - keep Chandrasekar
drivers/net/ethernet/mellanox/mlx5/core/en_main.c
 - simple fix + trust the code re-added to param.c in -next is fine
include/linux/bpf.h
 - trivial
include/linux/ethtool.h
 - trivial, fix kdoc while at it
include/linux/skmsg.h
 - move to relevant place in tcp.c, comment re-wrapped
net/core/skmsg.c
 - add the sk = sk // sk = NULL around calls
net/tipc/crypto.c
 - trivial

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-04-09 20:48:35 -07:00
Martin Blumenstingl
4b5923249b net: dsa: lantiq_gswip: Configure all remaining GSWIP_MII_CFG bits
There are a few more bits in the GSWIP_MII_CFG register for which we
did rely on the boot-loader (or the hardware defaults) to set them up
properly.

For some external RMII PHYs we need to select the GSWIP_MII_CFG_RMII_CLK
bit and also we should un-set it for non-RMII PHYs. The
GSWIP_MII_CFG_RMII_CLK bit is ignored for other PHY connection modes.

The GSWIP IP also supports in-band auto-negotiation for RGMII PHYs when
the GSWIP_MII_CFG_RGMII_IBS bit is set. Clear this bit always as there's
no known hardware which uses this (so it is not tested yet).

Clear the xMII isolation bit when set at initialization time if it was
previously set by the bootloader. Not doing so could lead to no traffic
(neither RX nor TX) on a port with this bit set.

While here, also add the GSWIP_MII_CFG_RESET bit. We don't need to
manage it because this bit is self-clearning when set. We still add it
here to get a better overview of the GSWIP_MII_CFG register.

Fixes: 14fceff477 ("net: dsa: Add Lantiq / Intel DSA driver for vrx200")
Cc: stable@vger.kernel.org
Suggested-by: Hauke Mehrtens <hauke@hauke-m.de>
Acked-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-08 16:38:23 -07:00
Martin Blumenstingl
3e9005be87 net: dsa: lantiq_gswip: Don't use PHY auto polling
PHY auto polling on the GSWIP hardware can be used so link changes
(speed, link up/down, etc.) can be detected automatically. Internally
GSWIP reads the PHY's registers for this functionality. Based on this
automatic detection GSWIP can also automatically re-configure it's port
settings. Unfortunately this auto polling (and configuration) mechanism
seems to cause various issues observed by different people on different
devices:
- FritzBox 7360v2: the two Gbit/s ports (connected to the two internal
  PHY11G instances) are working fine but the two Fast Ethernet ports
  (using an AR8030 RMII PHY) are completely dead (neither RX nor TX are
  received). It turns out that the AR8030 PHY sets the BMSR_ESTATEN bit
  as well as the ESTATUS_1000_TFULL and ESTATUS_1000_XFULL bits. This
  makes the PHY auto polling state machine (rightfully?) think that the
  established link speed (when the other side is Gbit/s capable) is
  1Gbit/s.
- None of the Ethernet ports on the Zyxel P-2812HNU-F1 (two are
  connected to the internal PHY11G GPHYs while the other three are
  external RGMII PHYs) are working. Neither RX nor TX traffic was
  observed. It is not clear which part of the PHY auto polling state-
  machine caused this.
- FritzBox 7412 (only one LAN port which is connected to one of the
  internal GPHYs running in PHY22F / Fast Ethernet mode) was seeing
  random disconnects (link down events could be seen). Sometimes all
  traffic would stop after such disconnect. It is not clear which part
  of the PHY auto polling state-machine cauased this.
- TP-Link TD-W9980 (two ports are connected to the internal GPHYs
  running in PHY11G / Gbit/s mode, the other two are external RGMII
  PHYs) was affected by similar issues as the FritzBox 7412 just without
  the "link down" events

Switch to software based configuration instead of PHY auto polling (and
letting the GSWIP hardware configure the ports automatically) for the
following link parameters:
- link up/down
- link speed
- full/half duplex
- flow control (RX / TX pause)

After a big round of manual testing by various people (who helped test
this on OpenWrt) it turns out that this fixes all reported issues.

Additionally it can be considered more future proof because any
"quirk" which is implemented for a PHY on the driver side can now be
used with the GSWIP hardware as well because Linux is in control of the
link parameters.

As a nice side-effect this also solves a problem where fixed-links were
not supported previously because we were relying on the PHY auto polling
mechanism, which cannot work for fixed-links as there's no PHY from
where it can read the registers. Configuring the link settings on the
GSWIP ports means that we now use the settings from device-tree also for
ports with fixed-links.

Fixes: 14fceff477 ("net: dsa: Add Lantiq / Intel DSA driver for vrx200")
Fixes: 3e6fdeb28f ("net: dsa: lantiq_gswip: Let GSWIP automatically set the xMII clock")
Cc: stable@vger.kernel.org
Acked-by: Hauke Mehrtens <hauke@hauke-m.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-08 16:38:23 -07:00
Guobin Huang
a180be79db net: mscc: ocelot: remove redundant dev_err call in vsc9959_mdio_bus_alloc()
There is a error message within devm_ioremap_resource
already, so remove the dev_err call to avoid redundant
error message.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Guobin Huang <huangguobin4@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-29 13:16:44 -07:00
Guobin Huang
656151aaa6 net: dsa: hellcreek: Remove redundant dev_err call in hellcreek_probe()
There is a error message within devm_ioremap_resource
already, so remove the dev_err call to avoid redundant
error message.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Guobin Huang <huangguobin4@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-28 18:05:31 -07:00
Ilya Lipnitskiy
4732315ca9 net: dsa: mt7530: clean up core and TRGMII clock setup
Three minor changes:

- When disabling PLL, there is no need to call core_write_mmd_indirect
  directly, use the core_write wrapper instead like the rest of the code
  in the function does. This change helps with consistency and
  readability. Move the comment to the definition of
  core_read_mmd_indirect where it belongs.

- Disable both core and TRGMII Tx clocks prior to reconfiguring.
  Previously, only the core clock was disabled, but not TRGMII Tx clock.
  So disable both, then configure them, then re-enable both, for
  consistency.

- The core clock enable bit (REG_GSWCK_EN) is written redundantly three
  times. Simplify the code and only write the register only once at the
  end of clock reconfiguration to enable both core and TRGMII Tx clocks.

Tested on Ubiquiti ER-X running the GMAC0 and MT7530 in TRGMII mode.

Signed-off-by: Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-28 17:55:12 -07:00
Qinglang Miao
866f1577ba net: dsa: b53: spi: add missing MODULE_DEVICE_TABLE
This patch adds missing MODULE_DEVICE_TABLE definition which generates
correct modalias for automatic loading of this driver when it is built
as an external module.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-25 17:11:22 -07:00
Martin Blumenstingl
3e6fdeb28f net: dsa: lantiq_gswip: Let GSWIP automatically set the xMII clock
The xMII interface clock depends on the PHY interface (MII, RMII, RGMII)
as well as the current link speed. Explicitly configure the GSWIP to
automatically select the appropriate xMII interface clock.

This fixes an issue seen by some users where ports using an external
RMII or RGMII PHY were deaf (no RX or TX traffic could be seen). Most
likely this is due to an "invalid" xMII clock being selected either by
the bootloader or hardware-defaults.

Fixes: 14fceff477 ("net: dsa: Add Lantiq / Intel DSA driver for vrx200")
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-25 16:53:38 -07:00
David S. Miller
efd13b71a3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-25 15:31:22 -07:00
Vladimir Oltean
e4bd44e89d net: ocelot: replay switchdev events when joining bridge
The premise of this change is that the switchdev port attributes and
objects offloaded by ocelot might have been missed when we are joining
an already existing bridge port, such as a bonding interface.

The patch pulls these switchdev attributes and objects from the bridge,
on behalf of the 'bridge port' net device which might be either the
ocelot switch interface, or the bonding upper interface.

The ocelot_net.c belongs strictly to the switchdev ocelot driver, while
ocelot.c is part of a library shared with the DSA felix driver.
The ocelot_port_bridge_leave function (part of the common library) used
to call ocelot_port_vlan_filtering(false), something which is not
necessary for DSA, since the framework deals with that already there.
So we move this function to ocelot_switchdev_unsync, which is specific
to the switchdev driver.

The code movement described above makes ocelot_port_bridge_leave no
longer return an error code, so we change its type from int to void.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-23 14:49:06 -07:00
Kurt Kanzenbach
1ab568e92b net: dsa: hellcreek: Report switch name and ID
Report the driver name, ASIC ID and the switch name via devlink. This is a
useful information for user space tooling.

Signed-off-by: Kurt Kanzenbach <kurt@kmk-computers.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-22 18:02:10 -07:00
Aleksander Jan Bajkowski
204c761473 net: dsa: lantiq: verify compatible strings against hardware
Verify compatible string against hardware.

Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Acked-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-22 16:33:39 -07:00
Aleksander Jan Bajkowski
a09d042b08 net: dsa: lantiq: allow to use all GPHYs on xRX300 and xRX330
This patch allows to use all PHYs on GRX300 and GRX330. The ARX300
has 3 and the GRX330 has 4 integrated PHYs connected to different
ports compared to VRX200. Each integrated PHY can work as single
Gigabit Ethernet PHY (GMII) or as double Fast Ethernet PHY (MII).

Allowed port configurations:

xRX200:
GMAC0: RGMII, MII, REVMII or RMII port
GMAC1: RGMII, MII, REVMII or RMII port
GMAC2: GPHY0 (GMII)
GMAC3: GPHY0 (MII)
GMAC4: GPHY1 (GMII)
GMAC5: GPHY1 (MII) or RGMII port

xRX300:
GMAC0: RGMII port
GMAC1: GPHY2 (GMII)
GMAC2: GPHY0 (GMII)
GMAC3: GPHY0 (MII)
GMAC4: GPHY1 (GMII)
GMAC5: GPHY1 (MII) or RGMII port

xRX330:
GMAC0: RGMII, GMII or RMII port
GMAC1: GPHY2 (GMII)
GMAC2: GPHY0 (GMII)
GMAC3: GPHY0 (MII) or GPHY3 (GMII)
GMAC4: GPHY1 (GMII)
GMAC5: GPHY1 (MII), RGMII or RMII port

Tested on D-Link DWR966 (xRX330) with OpenWRT.

Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Acked-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-22 16:33:39 -07:00
Vladimir Oltean
3de43dc986 net: dsa: mv88e6xxx: fix up kerneldoc some more
Commit 0b5294483c ("net: dsa: mv88e6xxx: scratch: Fixup kerneldoc")
has addressed some but not all kerneldoc warnings for the Global 2
Scratch register accessors. Namely, we have some mismatches between
the function names in the kerneldoc and the ones in the actual code.
Let's adjust the comments so that they match the functions they're
sitting next to.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-22 13:09:02 -07:00
Alexander Lobakin
227d72063f dsa: simplify Kconfig symbols and dependencies
1. Remove CONFIG_HAVE_NET_DSA.

CONFIG_HAVE_NET_DSA is a legacy leftover from the times when drivers
should have selected CONFIG_NET_DSA manually.
Currently, all drivers has explicit 'depends on NET_DSA', so this is
no more needed.

2. CONFIG_HAVE_NET_DSA dependencies became CONFIG_NET_DSA's ones.

 - dropped !S390 dependency which was introduced to be sure NET_DSA
   can select CONFIG_PHYLIB. DSA migrated to Phylink almost 3 years
   ago and the PHY library itself doesn't depend on !S390 since
   commit 870a2b5e4f ("phylib: remove !S390 dependeny from Kconfig");
 - INET dependency is kept to be sure we can select NET_SWITCHDEV;
 - NETDEVICES dependency is kept to be sure we can select PHYLINK.

3. DSA drivers menu now depends on NET_DSA.

Instead on 'depends on NET_DSA' on every single driver, the entire
menu now depends on it. This eliminates a lot of duplicated lines
from Kconfig with no loss (when CONFIG_NET_DSA=m, drivers also can
be only m or n).
This also has a nice side effect that there's no more empty menu on
configurations without DSA.

4. Kbuild will now descend into 'drivers/net/dsa' only when
   CONFIG_NET_DSA is y or m.

This is safe since no objects inside this folder can be built without
DSA core, as well as when CONFIG_NET_DSA=m, no objects can be
built-in.

Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-22 12:15:37 -07:00
Vladimir Oltean
a1e6f641e3 Revert "net: dsa: sja1105: Clear VLAN filtering offload netdev feature"
This reverts commit e9bf96943b.

The topic of the reverted patch is the support for switches with global
VLAN filtering, added by commit 061f6a505a ("net: dsa: Add
ndo_vlan_rx_{add, kill}_vid implementation"). Be there a switch with 4
ports swp0 -> swp3, and the following setup:

ip link add br0 type bridge vlan_filtering 1
ip link set swp0 master br0
ip link set swp1 master br0

What would happen with VLAN-tagged traffic received on standalone ports
swp2 and swp3? Well, it would get dropped, were it not for the
.ndo_vlan_rx_add_vid and .ndo_vlan_rx_kill_vid implementations (called
from vlan_vid_add and vlan_vid_del respectively). Basically, for DSA
switches where VLAN filtering is a global attribute, we enforce the
standalone ports to have 'rx-vlan-filter: off [fixed]' in their ethtool
features, which lets the user know that all VLAN-tagged packets that are
not explicitly added in the RX filtering list are dropped.

As for the sja1105 driver, at the time of the reverted patch, it was
operating in a pretty handicapped mode when it had ports under a bridge
with vlan_filtering=1. Specifically, it was unable to terminate traffic
through the CPU port (for further explanation see "Traffic support" in
Documentation/networking/dsa/sja1105.rst).

However, since then, the sja1105 driver has made considerable progress,
and that limitation is no longer as severe now. Specifically, since
commit 2cafa72e51 ("net: dsa: sja1105: add a new
best_effort_vlan_filtering devlink parameter"), the driver is able to
perform CPU termination even when some ports are under bridges with
vlan_filtering=1. Then, since commit 8841f6e63f ("net: dsa: sja1105:
make devlink property best_effort_vlan_filtering true by default"), this
even became the default operating mode.

So we can now take advantage of the logic in the DSA core.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-20 19:17:05 -07:00
Tobias Waldekranz
8d1d8298eb net: dsa: mv88e6xxx: Offload bridge broadcast flooding flag
These switches have two modes of classifying broadcast:

1. Broadcast is multicast.
2. Broadcast is its own unique thing that is always flooded
   everywhere.

This driver uses the first option, making sure to load the broadcast
address into all active databases. Because of this, we can support
per-port broadcast flooding by (1) making sure to only set the subset
of ports that have it enabled whenever joining a new bridge or VLAN,
and (2) by updating all active databases whenever the setting is
changed on a port.

Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-18 16:24:06 -07:00
Tobias Waldekranz
041bd545e1 net: dsa: mv88e6xxx: Offload bridge learning flag
Allow a user to control automatic learning per port.

Many chips have an explicit "LearningDisable"-bit that can be used for
this, but we opt for setting/clearing the PAV instead, as it works on
all devices at least as far back as 6083.

Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-18 16:24:06 -07:00
Tobias Waldekranz
7b9f16fe40 net: dsa: mv88e6xxx: Flood all traffic classes on standalone ports
In accordance with the comment in dsa_port_bridge_leave, standalone
ports shall be configured to flood all types of traffic. This change
aligns the mv88e6xxx driver with that policy.

Previously a standalone port would initially not egress any unknown
traffic, but after joining and then leaving a bridge, it would.

This does not matter that much since we only ever send FROM_CPUs on
standalone ports, but it seems prudent to make sure that the initial
values match those that are applied after a bridging/unbridging cycle.

Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-18 16:24:06 -07:00
Tobias Waldekranz
0806dd4654 net: dsa: mv88e6xxx: Use standard helper for broadcast address
Use the conventional declaration style of a MAC address in the
kernel (u8 addr[ETH_ALEN]) for the broadcast address, then set it
using the existing helper.

Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-18 16:24:06 -07:00
Tobias Waldekranz
34065c5830 net: dsa: mv88e6xxx: Remove some bureaucracy around querying the VTU
The hardware has a somewhat quirky protocol for reading out the VTU
entry for a particular VID. But there is no reason why we cannot
create a better API for ourselves in the driver.

Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-18 16:24:06 -07:00
Tobias Waldekranz
d89ef4b8b3 net: dsa: mv88e6xxx: Provide generic VTU iterator
Move the intricacies of correctly iterating over the VTU to a common
implementation.

Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-18 16:24:06 -07:00
Tobias Waldekranz
ffcec3f257 net: dsa: mv88e6xxx: Avoid useless attempts to fast-age LAGs
When a port is a part of a LAG, the ATU will create dynamic entries
belonging to the LAG ID when learning is enabled. So trying to
fast-age those out using the constituent port will have no
effect. Unfortunately the hardware does not support move operations on
LAGs so there is no obvious way to transform the request to target the
LAG instead.

Instead we document this known limitation and at least avoid wasting
any time on it.

Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-18 16:24:06 -07:00
Rafał Miłecki
6859d91549 net: dsa: bcm_sf2: fix BCM4908 RGMII reg(s)
BCM4908 has only 1 RGMII reg for controlling port 7.

Fixes: 73b7a60479 ("net: dsa: bcm_sf2: support BCM4908's integrated switch")
Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-18 14:44:05 -07:00
Rafał Miłecki
55cfeb3969 net: dsa: bcm_sf2: add function finding RGMII register
Simple macro like REG_RGMII_CNTRL_P() is insufficient as:
1. It doesn't validate port argument
2. It doesn't support chipsets with non-lineral RGMII regs layout

Missing port validation could result in getting register offset from out
of array. Random memory -> random offset -> random reads/writes. It
affected e.g. BCM4908 for REG_RGMII_CNTRL_P(7).

Fixes: a78e86ed58 ("net: dsa: bcm_sf2: Prepare for different register layouts")
Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-18 14:44:05 -07:00
Álvaro Fernández Rojas
a5538a777b net: dsa: b53: mmap: Add device tree support
Add device tree support to b53_mmap.c while keeping platform devices support.

Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-18 14:37:44 -07:00
Marek Behún
6584b26020 net: dsa: mv88e6xxx: implement .port_set_policy for Amethyst
The 16-bit Port Policy CTL register from older chips is on 6393x changed
to Port Policy MGMT CTL, which can access more data, but indirectly and
via 8-bit registers.

The original 16-bit value is divided into first two 8-bit register in
the Port Policy MGMT CTL.

We can therefore use the previous code to compute the mask and shift,
and then
- if 0 <= shift < 8, we access register 0 in Port Policy MGMT CTL
- if 8 <= shift < 16, we access register 1 in Port Policy MGMT CTL

There are in fact other possible policy settings for Amethyst which
could be added here, but this can be done in the future.

Signed-off-by: Marek Behún <kabel@kernel.org>
Reviewed-by: Pavana Sharma <pavana.sharma@digi.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-17 14:44:19 -07:00
Pavana Sharma
de776d0d31 net: dsa: mv88e6xxx: add support for mv88e6393x family
The Marvell 88E6393X device is a single-chip integration of a 11-port
Ethernet switch with eight integrated Gigabit Ethernet (GbE)
transceivers and three 10-Gigabit interfaces.

This patch adds functionalities specific to mv88e6393x family (88E6393X,
88E6193X and 88E6191X).

The main differences between previous devices and this one are:
- port 0 can be a SERDES port
- all SERDESes are one-lane, eg. no XAUI nor RXAUI
- on the other hand the SERDESes can do USXGMII, 10GBASER and 5GBASER
  (on 6191X only one SERDES is capable of more than 1g; USXGMII is not
  yet supported with this change)
- Port Policy CTL register is changed to Port Policy MGMT CTL register,
  via which several more registers can be accessed indirectly
- egress monitor port is configured differently
- ingress monitor/CPU/mirror ports are configured differently and can be
  configured per port (ie. each port can have different ingress monitor
  port, for example)
- port speed AltBit works differently than previously
- PHY registers can be also accessed via MDIO address 0x18 and 0x19
  (on previous devices they could be accessed only via Global 2 offsets
   0x18 and 0x19, which means two indirections; this feature is not yet
   leveraged with thiis commit)

Co-developed-by: Ashkan Boldaji <ashkan.boldaji@digi.com>
Signed-off-by: Ashkan Boldaji <ashkan.boldaji@digi.com>
Signed-off-by: Pavana Sharma <pavana.sharma@digi.com>
Co-developed-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-17 14:44:18 -07:00
Marek Behún
2fda45f019 net: dsa: mv88e6xxx: wrap .set_egress_port method
There are two implementations of the .set_egress_port method, and both
of them, if successful, set chip->*gress_dest_port variable.

To avoid code repetition, wrap this method into
mv88e6xxx_set_egress_port.

Signed-off-by: Marek Behún <kabel@kernel.org>
Reviewed-by: Pavana Sharma <pavana.sharma@digi.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-17 14:44:18 -07:00
Pavana Sharma
193c5b2698 net: dsa: mv88e6xxx: change serdes lane parameter type from u8 type to int
Returning 0 is no more an error case with MV88E6393 family
which has serdes lane numbers 0, 9 or 10.
So with this change .serdes_get_lane will return lane number
or -errno (-ENODEV or -EOPNOTSUPP).

Signed-off-by: Pavana Sharma <pavana.sharma@digi.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-17 14:44:18 -07:00
Álvaro Fernández Rojas
46c5176c58 net: dsa: b53: support legacy tags
These tags are used on BCM5325, BCM5365 and BCM63xx switches.

Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-17 12:24:36 -07:00
Álvaro Fernández Rojas
ad426d7d96 net: dsa: b53: relax is63xx() condition
BCM63xx switches are present on bcm63xx and bmips devices.

Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-17 12:05:46 -07:00
DENG Qingfang
5a30833b9a net: dsa: mt7530: support MDB and bridge flag operations
Support port MDB and bridge flag operations.

As the hardware can manage multicast forwarding itself, offload_fwd_mark
can be unconditionally set to true.

Signed-off-by: DENG Qingfang <dqfext@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-16 11:54:41 -07:00
Wei Yongjun
6f0d32509a net: dsa: sja1105: fix error return code in sja1105_cls_flower_add()
The return value 'rc' maybe overwrite to 0 in the flow_action_for_each
loop, the error code from the offload not support error handling will
not set. This commit fix it to return -EOPNOTSUPP.

Fixes: 6a56e19902 ("flow_offload: reject configuration of packet-per-second policing in offload drivers")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-16 11:14:59 -07:00
Álvaro Fernández Rojas
6d16eadab6 net: dsa: b53: spi: allow device tree probing
Add missing of_match_table to allow device tree probing.

Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-16 11:11:24 -07:00
Kurt Kanzenbach
db7284a6cc net: dsa: hellcreek: Offload bridge port flags
The switch implements unicast and multicast filtering per port.
Add support for it. By default filtering is disabled.

Signed-off-by: Kurt Kanzenbach <kurt@kmk-computers.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-15 12:32:12 -07:00
Florian Fainelli
f4e6d7cdbf net: dsa: bcm_sf2: Fill in BCM4908 CFP entries
The BCM4908 switch has 256 CFP entrie, update that setting so CFP can be
used.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-14 14:34:46 -07:00
Kurt Kanzenbach
292cd449fe net: dsa: hellcreek: Add devlink FDB region
Allow to dump the FDB table via devlink. This is a useful debugging feature.

Signed-off-by: Kurt Kanzenbach <kurt@kmk-computers.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-13 14:30:48 -08:00
Kurt Kanzenbach
eb5f3d3141 net: dsa: hellcreek: Move common code to helper
There are two functions which need to populate fdb entries. Move that to a
helper function.

Signed-off-by: Kurt Kanzenbach <kurt@kmk-computers.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-13 14:30:48 -08:00
Kurt Kanzenbach
e81813fb56 net: dsa: hellcreek: Use boolean value
hellcreek_select_vlan() takes a boolean instead of an integer.
So, use false accordingly.

Signed-off-by: Kurt Kanzenbach <kurt@kmk-computers.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-13 14:30:48 -08:00
Kurt Kanzenbach
ba2d1c2888 net: dsa: hellcreek: Add devlink VLAN region
Allow to dump the VLAN table via devlink. This especially useful, because the
driver internally leverages VLANs for the port separation. These are not visible
via the bridge utility.

Signed-off-by: Kurt Kanzenbach <kurt@kmk-computers.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-13 14:30:48 -08:00
Baowen Zheng
6a56e19902 flow_offload: reject configuration of packet-per-second policing in offload drivers
A follow-up patch will allow users to configures packet-per-second policing
in the software datapath. In preparation for this, teach all drivers that
support offload of the policer action to reject such configuration as
currently none of them support it.

Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Louis Peens <louis.peens@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-13 14:18:09 -08:00
Rafał Miłecki
a9349f08ec net: dsa: bcm_sf2: setup BCM4908 internal crossbar
On some SoCs (e.g. BCM4908, BCM631[345]8) SF2 has an integrated
crossbar. It allows connecting its selected external ports to internal
ports. It's used by vendors to handle custom Ethernet setups.

BCM4908 has following 3x2 crossbar. On Asus GT-AC5300 rgmii is used for
connecting external BCM53134S switch. GPHY4 is usually used for WAN
port. More fancy devices use SerDes for 2.5 Gbps Ethernet.

              ┌──────────┐
SerDes ─── 0 ─┤          │
              │   3x2    ├─ 0 ─── switch port 7
 GPHY4 ─── 1 ─┤          │
              │ crossbar ├─ 1 ─── runner (accelerator)
 rgmii ─── 2 ─┤          │
              └──────────┘

Use setup data based on DT info to configure BCM4908's switch port 7.
Right now only GPHY and rgmii variants are supported. Handling SerDes
can be implemented later.

Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12 17:06:37 -08:00
Rafał Miłecki
01488a0ccd net: dsa: bcm_sf2: store PHY interface/mode in port structure
It's needed later for proper switch / crossbar setup.

Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12 17:06:37 -08:00
Ilya Lipnitskiy
c3b8e07909 net: dsa: mt7530: setup core clock even in TRGMII mode
A recent change to MIPS ralink reset logic made it so mt7530 actually
resets the switch on platforms such as mt7621 (where bit 2 is the reset
line for the switch). That exposed an issue where the switch would not
function properly in TRGMII mode after a reset.

Reconfigure core clock in TRGMII mode to fix the issue.

Tested on Ubiquiti ER-X (MT7621) with TRGMII mode enabled.

Fixes: 3f9ef7785a ("MIPS: ralink: manage low reset lines")
Signed-off-by: Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12 16:58:36 -08:00
Florian Fainelli
ee47ed08d7 net: dsa: b53: Add debug prints in b53_vlan_enable()
Having dynamic debug prints in b53_vlan_enable() has been helpful to
uncover a recent but update the function to indicate the port being
configured (or -1 for initial setup) and include the global VLAN enabled
and VLAN filtering enable status.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-11 12:33:30 -08:00
Florian Fainelli
47142ed6c3 net: dsa: bcm_sf2: Qualify phydev->dev_flags based on port
Similar to commit 92696286f3 ("net:
bcmgenet: Set phydev->dev_flags only for internal PHYs") we need to
qualify the phydev->dev_flags based on whether the port is connected to
an internal or external PHY otherwise we risk having a flags collision
with a completely different interpretation depending on the driver.

Fixes: aa9aef77c7 ("net: dsa: bcm_sf2: communicate integrated PHY revision to PHY driver")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-10 16:05:20 -08:00
Florian Fainelli
d45c36bafb net: dsa: b53: VLAN filtering is global to all users
The bcm_sf2 driver uses the b53 driver as a library but does not make
usre of the b53_setup() function, this made it fail to inherit the
vlan_filtering_is_global attribute. Fix this by moving the assignment to
b53_switch_alloc() which is used by bcm_sf2.

Fixes: 7228b23e68 ("net: dsa: b53: Let DSA handle mismatched VLAN filtering settings")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-10 15:48:53 -08:00
Rafał Miłecki
8373a0fe9c net: dsa: bcm_sf2: use 2 Gbps IMP port link on BCM4908
BCM4908 uses 2 Gbps link between switch and the Ethernet interface.
Without this BCM4908 devices were able to achieve only 2 x ~895 Mb/s.
This allows handling e.g. NAT traffic with 940 Mb/s.

Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-10 15:22:27 -08:00
George McCollister
286a8624d7 net: dsa: xrs700x: check if partner is same as port in hsr join
Don't assign dp to partner if it's the same port that xrs700x_hsr_join
was called with. The partner port is supposed to be the other port in
the HSR/PRP redundant pair not the same port. This fixes an issue
observed in testing where forwarding between redundant HSR ports on this
switch didn't work depending on the order the ports were added to the
hsr device.

Fixes: bd62e6f5e6 ("net: dsa: xrs700x: add HSR offloading support")
Signed-off-by: George McCollister <george.mccollister@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-09 16:03:37 -08:00
Philipp Zabel
bf9279cd63 net: dsa: bcm_sf2: simplify optional reset handling
As of commit bb475230b8 ("reset: make optional functions really
optional"), the reset framework API calls use NULL pointers to describe
optional, non-present reset controls.

This allows to unconditionally return errors from
devm_reset_control_get_optional_exclusive.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-08 11:51:36 -08:00
Vladimir Oltean
6a5166e07c net: dsa: sja1105: fix ucast/bcast flooding always remaining enabled
In the blamed patch I managed to introduce a bug while moving code
around: the same logic is applied to the ucast_egress_floods and
bcast_egress_floods variables both on the "if" and the "else" branches.

This is clearly an unintended change compared to how the code used to be
prior to that bugfix, so restore it.

Fixes: 7f7ccdea8c ("net: dsa: sja1105: fix leakage of flooded frames outside bridging domain")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-04 14:19:01 -08:00
Vladimir Oltean
053d8ad10d net: dsa: sja1105: fix SGMII PCS being forced to SPEED_UNKNOWN instead of SPEED_10
When using MLO_AN_PHY or MLO_AN_FIXED, the MII_BMCR of the SGMII PCS is
read before resetting the switch so it can be reprogrammed afterwards.
This works for the speeds of 1Gbps and 100Mbps, but not for 10Mbps,
because SPEED_10 is actually 0, so AND-ing anything with 0 is false,
therefore that last branch is dead code.

Do what others do (genphy_read_status_fixed, phy_mii_ioctl) and just
remove the check for SPEED_10, let it fall into the default case.

Fixes: ffe10e679c ("net: dsa: sja1105: Add support for the SGMII port")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-04 14:19:01 -08:00
DENG Qingfang
63c75c053b net: dsa: mt7530: don't build GPIO support if !GPIOLIB
The new GPIO support may be optional at runtime, but it requires
building against gpiolib:

ERROR: modpost: "gpiochip_get_data" [drivers/net/dsa/mt7530.ko]
undefined!
ERROR: modpost: "devm_gpiochip_add_data_with_key"
[drivers/net/dsa/mt7530.ko] undefined!

Add #ifdef to exclude GPIO support if GPIOLIB is not enabled.

Fixes: 429a0edeef ("net: dsa: mt7530: MT7530 optional GPIO support")
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: DENG Qingfang <dqfext@gmail.com>
Link: https://lore.kernel.org/r/20210226063226.8474-1-dqfext@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-26 15:31:28 -08:00
Geert Uytterhoeven
fcd4ba3bcb net: dsa: sja1105: Remove unneeded cast in sja1105_crc32()
sja1105_unpack() takes a "const void *buf" as its first parameter, so
there is no need to cast away the "const" of the "buf" variable before
calling it.

Drop the cast, as it prevents the compiler performing some checks.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/20210223112003.2223332-1-geert+renesas@glider.be
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-25 09:41:32 -08:00
Florian Fainelli
f9b3827ee6 net: dsa: b53: Support setting learning on port
Add support for being able to set the learning attribute on port, and
make sure that the standalone ports start up with learning disabled.

We can remove the code in bcm_sf2 that configured the ports learning
attribute because we want the standalone ports to have learning disabled
by default and port 7 cannot be bridged, so its learning attribute will
not change past its initial configuration.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-23 12:23:00 -08:00
Florian Fainelli
e6dd86ed27 net: dsa: bcm_sf2: Wire-up br_flags_pre, br_flags and set_mrouter
Because bcm_sf2 implements its own dsa_switch_ops we need to export the
b53_br_flags_pre(), b53_br_flags() and b53_set_mrouter so we can wire-up
them up like they used to be with the former b53_br_egress_floods().

Fixes: a8b659e7ff ("net: dsa: act as passthrough for bridge port flags")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-23 12:23:00 -08:00
Horatiu Vultur
a026c50b59 net: dsa: felix: Add support for MRP
Implement functions 'port_mrp_add', 'port_mrp_del',
'port_mrp_add_ring_role' and 'port_mrp_del_ring_role' to call the mrp
functions from ocelot.

Also all MRP frames that arrive to CPU on queue number OCELOT_MRP_CPUQ
will be forward by the SW.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16 14:47:46 -08:00
Vladimir Oltean
7f7ccdea8c net: dsa: sja1105: fix leakage of flooded frames outside bridging domain
Quite embarrasingly, I managed to fool myself into thinking that the
flooding domain of sja1105 source ports is restricted by the forwarding
domain, which it isn't. Frames which match an FDB entry are forwarded
towards that entry's DESTPORTS restricted by REACH_PORT[SRC_PORT], while
frames that don't match any FDB entry are forwarded towards
FL_DOMAIN[SRC_PORT] or BC_DOMAIN[SRC_PORT].

This means we can't get away with doing the simple thing, and we must
manage the flooding domain ourselves such that it is restricted by the
forwarding domain. This new function must be called from the
.port_bridge_join and .port_bridge_leave methods too, not just from
.port_bridge_flags as we did before.

Fixes: 4d94235495 ("net: dsa: sja1105: offload bridge port flags to device")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16 14:02:46 -08:00
Vladimir Oltean
4c44fc5e94 net: dsa: sja1105: fix configuration of source address learning
Due to a mistake, the driver always sets the address learning flag to
the previously stored value, and not to the currently configured one.
The bug is visible only in standalone ports mode, because when the port
is bridged, the issue is masked by .port_stp_state_set which overwrites
the address learning state to the proper value.

Fixes: 4d94235495 ("net: dsa: sja1105: offload bridge port flags to device")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16 14:02:46 -08:00
Vladimir Oltean
6b73b7c96a net: dsa: felix: perform teardown on error in felix_setup
If the driver fails to probe, it would be nice to not leak memory.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16 13:52:57 -08:00
Vladimir Oltean
42b5adbbac net: dsa: felix: don't deinitialize unused ports
ocelot_init_port is called only if dsa_is_unused_port == false, however
ocelot_deinit_port is called unconditionally. This causes a warning in
the skb_queue_purge inside ocelot_deinit_port saying that the spin lock
protecting ocelot_port->tx_skbs was not initialized.

Fixes: e5fb512d81 ("net: mscc: ocelot: deinitialize only initialized ports")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16 13:52:14 -08:00
Vladimir Oltean
8841f6e63f net: dsa: sja1105: make devlink property best_effort_vlan_filtering true by default
The sja1105 driver has a limitation, extensively described under
Documentation/networking/dsa/sja1105.rst and
Documentation/networking/devlink/sja1105.rst, which says that when the
ports are under a bridge with vlan_filtering=1, traffic to and from
the network stack is not possible, unless the driver-specific
best_effort_vlan_filtering devlink parameter is enabled.

For users, this creates a 'wtf' moment. They need to go to the
documentation and find about the existence of this property, then maybe
install devlink and set it to true.

Having best_effort_vlan_filtering enabled by the kernel by default
delays that 'wtf' moment (maybe up to the point that it never even
happens). The user doesn't need to care that the driver supports
addressing the ports individually by retagging VLAN IDs until he/she
needs to use more than 32 VLAN IDs (since there can be at most 32
retagging rules). Only then do they need to think whether they need the
full VLAN table, at the expense of no individual port addressing, or
not.

But the odds that an sja1105 user will need more than 32 VLANs
terminated by the CPU is probably low. And, if we were to follow the
principle that more advanced use cases should require more advanced
preparation steps, then it makes more sense for ping to 'just work'
while CPU termination of > 32 VLAN IDs to require a bit more forethought
and possibly a driver-specific devlink param.

So we should be able to safely change the default here, and make this
driver act just a little bit more sanely out of the box.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-15 13:23:57 -08:00
Vladimir Oltean
89153ed6eb net: dsa: propagate extack to .port_vlan_filtering
Some drivers can't dynamically change the VLAN filtering option, or
impose some restrictions, it would be nice to propagate this info
through netlink instead of printing it to a kernel log that might never
be read. Also netlink extack includes the module that emitted the
message, which means that it's easier to figure out which ones are
driver-generated errors as opposed to command misuse.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-14 17:38:12 -08:00
Vladimir Oltean
31046a5fd9 net: dsa: propagate extack to .port_vlan_add
Allow drivers to communicate their restrictions to user space directly,
instead of printing to the kernel log. Where the conversion would have
been lossy and things like VLAN ID could no longer be conveyed (due to
the lack of support for printf format specifier in netlink extack), I
chose to keep the messages in full form to the kernel log only, and
leave it up to individual driver maintainers to move more messages to
extack.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-14 17:38:11 -08:00
Vladimir Oltean
0a6f17c6ae net: dsa: tag_ocelot_8021q: add support for PTP timestamping
For TX timestamping, we use the felix_txtstamp method which is common
with the regular (non-8021q) ocelot tagger. This method says that skb
deferral is needed, prepares a timestamp request ID, and puts a clone of
the skb in a queue waiting for the timestamp IRQ.

felix_txtstamp is called by dsa_skb_tx_timestamp() just before the
tagger's xmit method. In the tagger xmit, we divert the packets
classified by dsa_skb_tx_timestamp() as PTP towards the MMIO-based
injection registers, and we declare them as dead towards dsa_slave_xmit.
If not PTP, we proceed with normal tag_8021q stuff.

Then the timestamp IRQ fires, the clone queued up from felix_txtstamp is
matched to the TX timestamp retrieved from the switch's FIFO based on
the timestamp request ID, and the clone is delivered to the stack.

On RX, thanks to the VCAP IS2 rule that redirects the frames with an
EtherType for 1588 towards two destinations:
- the CPU port module (for MMIO based extraction) and
- if the "no XTR IRQ" workaround is in place, the dsa_8021q CPU port
the relevant data path processing starts in the ptp_classify_raw BPF
classifier installed by DSA in the RX data path (post tagger, which is
completely unaware that it saw a PTP packet).

This time we can't reuse the same implementation of .port_rxtstamp that
also works with the default ocelot tagger. That is because felix_rxtstamp
is given an skb with a freshly stripped DSA header, and it says "I don't
need deferral for its RX timestamp, it's right in it, let me show you";
and it just points to the header right behind skb->data, from where it
unpacks the timestamp and annotates the skb with it.

The same thing cannot happen with tag_ocelot_8021q, because for one
thing, the skb did not have an extraction frame header in the first
place, but a VLAN tag with no timestamp information. So the code paths
in felix_rxtstamp for the regular and 8021q tagger are completely
independent. With tag_8021q, the timestamp must come from the packet's
duplicate delivered to the CPU port module, but there is potentially
complex logic to be handled [ and prone to reordering ] if we were to
just start reading packets from the CPU port module, and try to match
them to the one we received over Ethernet and which needs an RX
timestamp. So we do something simple: we tell DSA "give me some time to
think" (we request skb deferral by returning false from .port_rxtstamp)
and we just drop the frame we got over Ethernet with no attempt to match
it to anything - we just treat it as a notification that there's data to
be processed from the CPU port module's queues. Then we proceed to read
the packets from those, one by one, which we deliver up the stack,
timestamped, using netif_rx - the same function that any driver would
use anyway if it needed RX timestamp deferral. So the assumption is that
we'll come across the PTP packet that triggered the CPU extraction
notification eventually, but we don't know when exactly. Thanks to the
VCAP IS2 trap/redirect rule and the exclusion of the CPU port module
from the flooding replicators, only PTP frames should be present in the
CPU port module's RX queues anyway.

There is just one conflict between the VCAP IS2 trapping rule and the
semantics of the BPF classifier. Namely, ptp_classify_raw() deems
general messages as non-timestampable, but still, those are trapped to
the CPU port module since they have an EtherType of ETH_P_1588. So, if
the "no XTR IRQ" workaround is in place, we need to run another BPF
classifier on the frames extracted over MMIO, to avoid duplicates being
sent to the stack (once over Ethernet, once over MMIO). It doesn't look
like it's possible to install VCAP IS2 rules based on keys extracted
from the 1588 frame headers.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-14 17:31:44 -08:00
Vladimir Oltean
c8c0ba4fe2 net: dsa: felix: setup MMIO filtering rules for PTP when using tag_8021q
Since the tag_8021q tagger is software-defined, it has no means by
itself for retrieving hardware timestamps of PTP event messages.

Because we do want to support PTP on ocelot even with tag_8021q, we need
to use the CPU port module for that. The RX timestamp is present in the
Extraction Frame Header. And because we can't use NPI mode which redirects
the CPU queues to an "external CPU" (meaning the ARM CPU running Linux),
then we need to poll the CPU port module through the MMIO registers to
retrieve TX and RX timestamps.

Sadly, on NXP LS1028A, the Felix switch was integrated into the SoC
without wiring the extraction IRQ line to the ARM GIC. So, if we want to
be notified of any PTP packets received on the CPU port module, we have
a problem.

There is a possible workaround, which is to use the Ethernet CPU port as
a notification channel that packets are available on the CPU port module
as well. When a PTP packet is received by the DSA tagger (without timestamp,
of course), we go to the CPU extraction queues, poll for it there, then
we drop the original Ethernet packet and masquerade the packet retrieved
over MMIO (plus the timestamp) as the original when we inject it up the
stack.

Create a quirk in struct felix is selected by the Felix driver (but not
by Seville, since that doesn't support PTP at all). We want to do this
such that the workaround is minimally invasive for future switches that
don't require this workaround.

The only traffic for which we need timestamps is PTP traffic, so add a
redirection rule to the CPU port module for this. Currently we only have
the need for PTP over L2, so redirection rules for UDP ports 319 and 320
are TBD for now.

Note that for the workaround of matching of PTP-over-Ethernet-port with
PTP-over-MMIO queues to work properly, both channels need to be
absolutely lossless. There are two parts to achieving that:
- We keep flow control enabled on the tag_8021q CPU port
- We put the DSA master interface in promiscuous mode, so it will never
  drop a PTP frame (for the profiles we are interested in, these are
  sent to the multicast MAC addresses of 01-80-c2-00-00-0e and
  01-1b-19-00-00-00).

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-14 17:31:44 -08:00
Vladimir Oltean
7c4bb540e9 net: dsa: tag_ocelot: create separate tagger for Seville
The ocelot tagger is a hot mess currently, it relies on memory
initialized by the attached driver for basic frame transmission.
This is against all that DSA tagging protocols stand for, which is that
the transmission and reception of a DSA-tagged frame, the data path,
should be independent from the switch control path, because the tag
protocol is in principle hot-pluggable and reusable across switches
(even if in practice it wasn't until very recently). But if another
driver like dsa_loop wants to make use of tag_ocelot, it couldn't.

This was done to have common code between Felix and Ocelot, which have
one bit difference in the frame header format. Quoting from commit
67c2404922 ("net: dsa: felix: create a template for the DSA tags on
xmit"):

    Other alternatives have been analyzed, such as:
    - Create a separate tag_seville.c: too much code duplication for just 1
      bit field difference.
    - Create a separate DSA_TAG_PROTO_SEVILLE under tag_ocelot.c, just like
      tag_brcm.c, which would have a separate .xmit function. Again, too
      much code duplication for just 1 bit field difference.
    - Allocate the template from the init function of the tag_ocelot.c
      module, instead of from the driver: couldn't figure out a method of
      accessing the correct port template corresponding to the correct
      tagger in the .xmit function.

The really interesting part is that Seville should have had its own
tagging protocol defined - it is not compatible on the wire with Ocelot,
even for that single bit. In principle, a packet generated by
DSA_TAG_PROTO_OCELOT when booted on NXP LS1028A would look in a certain
way, but when booted on NXP T1040 it would look differently. The reverse
is also true: a packet generated by a Seville switch would be
interpreted incorrectly by Wireshark if it was told it was generated by
an Ocelot switch.

Actually things are a bit more nuanced. If we concentrate only on the
DSA tag, what I said above is true, but Ocelot/Seville also support an
optional DSA tag prefix, which can be short or long, and it is possible
to distinguish the two taggers based on an integer constant put in that
prefix. Nonetheless, creating a separate tagger is still justified,
since the tag prefix is optional, and without it, there is again no way
to distinguish.

Claiming backwards binary compatibility is a bit more tough, since I've
already changed the format of tag_ocelot once, in commit 5124197ce5
("net: dsa: tag_ocelot: use a short prefix on both ingress and egress").
Therefore I am not very concerned with treating this as a bugfix and
backporting it to stable kernels (which would be another mess due to the
fact that there would be lots of conflicts with the other DSA_TAG_PROTO*
definitions). It's just simpler to say that the string values of the
taggers have ABI value starting with kernel 5.12, which will be when the
changing of tag protocol via /sys/class/net/<dsa-master>/dsa/tagging
goes live.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-14 17:31:44 -08:00
Vladimir Oltean
40d3f295b5 net: mscc: ocelot: use common tag parsing code with DSA
The Injection Frame Header and Extraction Frame Header that the switch
prepends to frames over the NPI port is also prepended to frames
delivered over the CPU port module's queues.

Let's unify the handling of the frame headers by making the ocelot
driver call some helpers exported by the DSA tagger. Among other things,
this allows us to get rid of the strange cpu_to_be32 when transmitting
the Injection Frame Header on ocelot, since the packing API uses
network byte order natively (when "quirks" is 0).

The comments above ocelot_gen_ifh talk about setting pop_cnt to 3, and
the cpu extraction queue mask to something, but the code doesn't do it,
so we don't do it either.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-14 17:31:44 -08:00
Vladimir Oltean
4d94235495 net: dsa: sja1105: offload bridge port flags to device
The chip can configure unicast flooding, broadcast flooding and learning.
Learning is per port, while flooding is per {ingress, egress} port pair
and we need to configure the same value for all possible ingress ports
towards the requested one.

While multicast flooding is not officially supported, we can hack it by
using a feature of the second generation (P/Q/R/S) devices, which is that
FDB entries are maskable, and multicast addresses always have an odd
first octet. So by putting a match-all for 00:01:00:00:00:00 addr and
00:01:00:00:00:00 mask at the end of the FDB, we make sure that it is
always checked last, and does not take precedence in front of any other
MDB. So it behaves effectively as an unknown multicast entry.

For the first generation switches, this feature is not available, so
unknown multicast will always be treated the same as unknown unicast.
So the only thing we can do is request the user to offload the settings
for these 2 flags in tandem, i.e.

ip link set swp2 type bridge_slave flood off
Error: sja1105: This chip cannot configure multicast flooding independently of unicast.
ip link set swp2 type bridge_slave flood off mcast_flood off
ip link set swp2 type bridge_slave mcast_flood on
Error: sja1105: This chip cannot configure multicast flooding independently of unicast.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-12 17:08:05 -08:00
Vladimir Oltean
421741ea56 net: mscc: ocelot: offload bridge port flags to device
We should not be unconditionally enabling address learning, since doing
that is actively detrimential when a port is standalone and not offloading
a bridge. Namely, if a port in the switch is standalone and others are
offloading the bridge, then we could enter a situation where we learn an
address towards the standalone port, but the bridged ports could not
forward the packet there, because the CPU is the only path between the
standalone and the bridged ports. The solution of course is to not
enable address learning unless the bridge asks for it.

We need to set up the initial port flags for no learning and flooding
everything, and also when the port joins and leaves the bridge.
The flood configuration was already configured ok for standalone mode
in ocelot_init, we just need to disable learning in ocelot_init_port.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-12 17:08:05 -08:00
Vladimir Oltean
b360d94f1b net: mscc: ocelot: use separate flooding PGID for broadcast
In preparation of offloading the bridge port flags which have
independent settings for unknown multicast and for broadcast, we should
also start reserving one destination Port Group ID for the flooding of
broadcast packets, to allow configuring it individually.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-12 17:08:05 -08:00
Vladimir Oltean
6edb9e8d45 net: dsa: felix: restore multicast flood to CPU when NPI tagger reinitializes
ocelot_init sets up PGID_MC to include the CPU port module, and that is
fine, but the ocelot-8021q tagger removes the CPU port module from the
unknown multicast replicator. So after a transition from the default
ocelot tagger towards ocelot-8021q and then again towards ocelot,
multicast flooding towards the CPU port module will be disabled.

Fixes: e21268efbe ("net: dsa: felix: perform switch setup for tag_8021q")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-12 17:08:05 -08:00
Vladimir Oltean
a8b659e7ff net: dsa: act as passthrough for bridge port flags
There are multiple ways in which a PORT_BRIDGE_FLAGS attribute can be
expressed by the bridge through switchdev, and not all of them can be
emulated by DSA mid-layer API at the same time.

One possible configuration is when the bridge offloads the port flags
using a mask that has a single bit set - therefore only one feature
should change. However, DSA currently groups together unicast and
multicast flooding in the .port_egress_floods method, which limits our
options when we try to add support for turning off broadcast flooding:
do we extend .port_egress_floods with a third parameter which b53 and
mv88e6xxx will ignore? But that means that the DSA layer, which
currently implements the PRE_BRIDGE_FLAGS attribute all by itself, will
see that .port_egress_floods is implemented, and will report that all 3
types of flooding are supported - not necessarily true.

Another configuration is when the user specifies more than one flag at
the same time, in the same netlink message. If we were to create one
individual function per offloadable bridge port flag, we would limit the
expressiveness of the switch driver of refusing certain combinations of
flag values. For example, a switch may not have an explicit knob for
flooding of unknown multicast, just for flooding in general. In that
case, the only correct thing to do is to allow changes to BR_FLOOD and
BR_MCAST_FLOOD in tandem, and never allow mismatched values. But having
a separate .port_set_unicast_flood and .port_set_multicast_flood would
not allow the driver to possibly reject that.

Also, DSA doesn't consider it necessary to inform the driver that a
SWITCHDEV_ATTR_ID_BRIDGE_MROUTER attribute was offloaded, because it
just calls .port_egress_floods for the CPU port. When we'll add support
for the plain SWITCHDEV_ATTR_ID_PORT_MROUTER, that will become a real
problem because the flood settings will need to be held statefully in
the DSA middle layer, otherwise changing the mrouter port attribute will
impact the flooding attribute. And that's _assuming_ that the underlying
hardware doesn't have anything else to do when a multicast router
attaches to a port than flood unknown traffic to it.  If it does, there
will need to be a dedicated .port_set_mrouter anyway.

So we need to let the DSA drivers see the exact form that the bridge
passes this switchdev attribute in, otherwise we are standing in the
way. Therefore we also need to use this form of language when
communicating to the driver that it needs to configure its initial
(before bridge join) and final (after bridge leave) port flags.

The b53 and mv88e6xxx drivers are converted to the passthrough API and
their implementation of .port_egress_floods is split into two: a
function that configures unicast flooding and another for multicast.
The mv88e6xxx implementation is quite hairy, and it turns out that
the implementations of unknown unicast flooding are actually the same
for 6185 and for 6352:

behind the confusing names actually lie two individual bits:
NO_UNKNOWN_MC -> FLOOD_UC = 0x4 = BIT(2)
NO_UNKNOWN_UC -> FLOOD_MC = 0x8 = BIT(3)

so there was no reason to entangle them in the first place.

Whereas the 6185 writes to MV88E6185_PORT_CTL0_FORWARD_UNKNOWN of
PORT_CTL0, which has the exact same bit index. I have left the
implementations separate though, for the only reason that the names are
different enough to confuse me, since I am not able to double-check with
a user manual. The multicast flooding setting for 6185 is in a different
register than for 6352 though.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-12 17:08:04 -08:00
George McCollister
bd62e6f5e6 net: dsa: xrs700x: add HSR offloading support
Add offloading for HSR/PRP (IEC 62439-3) tag insertion, tag removal
forwarding and duplication supported by the xrs7000 series switches.

Only HSR v1 and PRP v1 are supported by the xrs7000 series switches (HSR
v0 is not).

Signed-off-by: George McCollister <george.mccollister@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-11 13:24:45 -08:00
George McCollister
f8a7e0145d net: dsa: xrs700x: use of_match_ptr() on xrs700x_mdio_dt_ids
Use of_match_ptr() on xrs700x_mdio_dt_ids so that NULL is substituted
when CONFIG_OF isn't defined. This will prevent unnecessary use of
xrs700x_mdio_dt_ids when CONFIG_OF isn't defined.

Signed-off-by: George McCollister <george.mccollister@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-11 13:17:04 -08:00
George McCollister
3e0103a35a net: dsa: xrs700x: fix unused warning for of_device_id
Fix unused variable warning that occurs when CONFIG_OF isn't defined by
adding __maybe_unused.

>> drivers/net/dsa/xrs700x/xrs700x_i2c.c:127:34: warning: unused
variable 'xrs700x_i2c_dt_ids' [-Wunused-const-variable]
   static const struct of_device_id xrs700x_i2c_dt_ids[] = {

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: George McCollister <george.mccollister@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-11 13:17:04 -08:00
David S. Miller
dc9d87581d Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-02-10 13:30:12 -08:00
Vladimir Oltean
eb4733d7cf net: dsa: felix: implement port flushing on .phylink_mac_link_down
There are several issues which may be seen when the link goes down while
forwarding traffic, all of which can be attributed to the fact that the
port flushing procedure from the reference manual was not closely
followed.

With flow control enabled on both the ingress port and the egress port,
it may happen when a link goes down that Ethernet packets are in flight.
In flow control mode, frames are held back and not dropped. When there
is enough traffic in flight (example: iperf3 TCP), then the ingress port
might enter congestion and never exit that state. This is a problem,
because it is the egress port's link that went down, and that has caused
the inability of the ingress port to send packets to any other port.
This is solved by flushing the egress port's queues when it goes down.

There is also a problem when performing stream splitting for
IEEE 802.1CB traffic (not yet upstream, but a sort of multicast,
basically). There, if one port from the destination ports mask goes
down, splitting the stream towards the other destinations will no longer
be performed. This can be traced down to this line:

	ocelot_port_writel(ocelot_port, 0, DEV_MAC_ENA_CFG);

which should have been instead, as per the reference manual:

	ocelot_port_rmwl(ocelot_port, 0, DEV_MAC_ENA_CFG_RX_ENA,
			 DEV_MAC_ENA_CFG);

Basically only DEV_MAC_ENA_CFG_RX_ENA should be disabled, but not
DEV_MAC_ENA_CFG_TX_ENA - I don't have further insight into why that is
the case, but apparently multicasting to several ports will cause issues
if at least one of them doesn't have DEV_MAC_ENA_CFG_TX_ENA set.

I am not sure what the state of the Ocelot VSC7514 driver is, but
probably not as bad as Felix/Seville, since VSC7514 uses phylib and has
the following in ocelot_adjust_link:

	if (!phydev->link)
		return;

therefore the port is not really put down when the link is lost, unlike
the DSA drivers which use .phylink_mac_link_down for that.

Nonetheless, I put ocelot_port_flush() in the common ocelot.c because it
needs to access some registers from drivers/net/ethernet/mscc/ocelot_rew.h
which are not exported in include/soc/mscc/ and a bugfix patch should
probably not move headers around.

Fixes: bdeced75b1 ("net: dsa: felix: Add PCS operations for PHYLINK")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-09 11:41:11 -08:00
Vladimir Oltean
8fe6832e96 net: dsa: felix: propagate the LAG offload ops towards the ocelot lib
The ocelot switch has been supporting LAG offload since its initial
commit, however felix could not make use of that, due to lack of a LAG
abstraction in DSA. Now that we have that, let's forward DSA's calls
towards the ocelot library, who will deal with setting up the bonding.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-06 14:51:51 -08:00
Tobias Waldekranz
add285bce3 net: dsa: xrs700x: Correctly address device over I2C
On read, master should send 31 MSB of the register (only even values
are ever used), followed by a 1 to indicate read. Then, reading two
bytes, the device will output the register's value.

On write, master sends 31 MSB of the register, followed by a 0 to
indicate write, followed by two bytes containing the register value.

Flexibilis' documentation (version 1.3) specifies the opposite
polarity (#read/write), but the scope indicates that it is, in fact,
read/#write.

Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: George McCollister <george.mccollister@gmail.com>
Link: https://lore.kernel.org/r/20210202191645.439-1-tobias@waldekranz.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-04 19:09:47 -08:00
Vladimir Oltean
b53014f079 net: dsa: bcm_sf2: Check egress tagging of CFP rule with proper accessor
The flow steering struct ethtool_flow_ext::data field is __be32, so when
the CFP code needs to check the VLAN egress tagging attribute in bit 0,
it does this in CPU native endianness. So logically, the endianness
conversion is set up the other way around, although in practice the same
result is produced.

Gets rid of build warning:

warning: cast from restricted __be32
warning: incorrect type in argument 1 (different base types)
   expected unsigned int [usertype] val
   got restricted __be32
warning: cast from restricted __be32
warning: cast from restricted __be32
warning: cast from restricted __be32
warning: cast from restricted __be32
warning: restricted __be32 degrades to integer

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20210203193918.2236994-1-olteanv@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-04 19:08:52 -08:00
Jakub Kicinski
d1e1355aef Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-02 14:21:31 -08:00
Kurt Kanzenbach
8486e83fe1 net: dsa: hellcreek: Report FDB table occupancy
Report the FDB table size and occupancy via devlink. The actual size depends on
the used Hellcreek version:

|root@tsn:~# devlink resource show platform/ff240000.switch
|platform/ff240000.switch:
|  name VLAN size 4096 occ 2 unit entry dpipe_tables none
|  name FDB size 256 occ 6 unit entry dpipe_tables none

Suggested-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Kurt Kanzenbach <kurt@kmk-computers.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-01 18:28:34 -08:00
Kurt Kanzenbach
7f976d5cf1 net: dsa: hellcreek: Report VLAN table occupancy
The VLAN membership configuration is cached in software already. So, it can be
reported via devlink. Add support for it:

|root@tsn:~# devlink resource show platform/ff240000.switch
|platform/ff240000.switch:
|  name VLAN size 4096 occ 4 unit entry dpipe_tables none

Signed-off-by: Kurt Kanzenbach <kurt@kmk-computers.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-01 18:28:33 -08:00
DENG Qingfang
f72f2fb8fb net: dsa: mv88e6xxx: override existent unicast portvec in port_fdb_add
Having multiple destination ports for a unicast address does not make
sense.
Make port_db_load_purge override existent unicast portvec instead of
adding a new port bit.

Fixes: 8847293992 ("net: dsa: mv88e6xxx: handle multiple ports in ATU")
Signed-off-by: DENG Qingfang <dqfext@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/20210130134334.10243-1-dqfext@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-01 18:24:49 -08:00
Vladimir Oltean
e21268efbe net: dsa: felix: perform switch setup for tag_8021q
Unlike sja1105, the only other user of the software-defined tag_8021q.c
tagger format, the implementation we choose for the Felix DSA switch
driver preserves full functionality under a vlan_filtering bridge
(i.e. IP termination works through the DSA user ports under all
circumstances).

The tag_8021q protocol just wants:
- Identifying the ingress switch port based on the RX VLAN ID, as seen
  by the CPU. We achieve this by using the TCAM engines (which are also
  used for tc-flower offload) to push the RX VLAN as a second, outer
  tag, on egress towards the CPU port.
- Steering traffic injected into the switch from the network stack
  towards the correct front port based on the TX VLAN, and consuming
  (popping) that header on the switch's egress.

A tc-flower pseudocode of the static configuration done by the driver
would look like this:

$ tc qdisc add dev <cpu-port> clsact
$ for eth in swp0 swp1 swp2 swp3; do \
	tc filter add dev <cpu-port> egress flower indev ${eth} \
		action vlan push id <rxvlan> protocol 802.1ad; \
	tc filter add dev <cpu-port> ingress protocol 802.1Q flower
		vlan_id <txvlan> action vlan pop \
		action mirred egress redirect dev ${eth}; \
done

but of course since DSA does not register network interfaces for the CPU
port, this configuration would be impossible for the user to do. Also,
due to the same reason, it is impossible for the user to inadvertently
delete these rules using tc. These rules do not collide in any way with
tc-flower, they just consume some TCAM space, which is something we can
live with.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-29 21:25:27 -08:00
Vladimir Oltean
7c83a7c539 net: dsa: add a second tagger for Ocelot switches based on tag_8021q
There are use cases for which the existing tagger, based on the NPI
(Node Processor Interface) functionality, is insufficient.

Namely:
- Frames injected through the NPI port bypass the frame analyzer, so no
  source address learning is performed, no TSN stream classification,
  etc.
- Flow control is not functional over an NPI port (PAUSE frames are
  encapsulated in the same Extraction Frame Header as all other frames)
- There can be at most one NPI port configured for an Ocelot switch. But
  in NXP LS1028A and T1040 there are two Ethernet CPU ports. The non-NPI
  port is currently either disabled, or operated as a plain user port
  (albeit an internally-facing one). Having the ability to configure the
  two CPU ports symmetrically could pave the way for e.g. creating a LAG
  between them, to increase bandwidth seamlessly for the system.

So there is a desire to have an alternative to the NPI mode. This change
keeps the default tagger for the Seville and Felix switches as "ocelot",
but it can be changed via the following device attribute:

echo ocelot-8021q > /sys/class/<dsa-master>/dsa/tagging

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-29 21:25:27 -08:00
Vladimir Oltean
adb3dccf09 net: dsa: felix: convert to the new .change_tag_protocol DSA API
In expectation of the new tag_ocelot_8021q tagger implementation, we
need to be able to do runtime switchover between one tagger and another.
So we must structure the existing code for the current NPI-based tagger
in a certain way.

We move the felix_npi_port_init function in expectation of the future
driver configuration necessary for tag_ocelot_8021q: we would like to
not have the NPI-related bits interspersed with the tag_8021q bits.

The conversion from this:

	ocelot_write_rix(ocelot,
			 ANA_PGID_PGID_PGID(GENMASK(ocelot->num_phys_ports, 0)),
			 ANA_PGID_PGID, PGID_UC);

to this:

	cpu_flood = ANA_PGID_PGID_PGID(BIT(ocelot->num_phys_ports));
	ocelot_rmw_rix(ocelot, cpu_flood, cpu_flood, ANA_PGID_PGID, PGID_UC);

is perhaps non-trivial, but is nonetheless non-functional. The PGID_UC
(replicator for unknown unicast) is already configured out of hardware
reset to flood to all ports except ocelot->num_phys_ports (the CPU port
module). All we change is that we use a read-modify-write to only add
the CPU port module to the unknown unicast replicator, as opposed to
doing a full write to the register.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-29 21:25:27 -08:00
Vladimir Oltean
cacea62fcd net: mscc: ocelot: don't use NPI tag prefix for the CPU port module
Context: Ocelot switches put the injection/extraction frame header in
front of the Ethernet header. When used in NPI mode, a DSA master would
see junk instead of the destination MAC address, and it would most
likely drop the packets. So the Ocelot frame header can have an optional
prefix, which is just "ff:ff:ff:ff:ff:fe > ff:ff:ff:ff:ff:ff" padding
put before the actual tag (still before the real Ethernet header) such
that the DSA master thinks it's looking at a broadcast frame with a
strange EtherType.

Unfortunately, a lesson learned in commit 69df578c5f ("net: mscc:
ocelot: eliminate confusion between CPU and NPI port") seems to have
been forgotten in the meanwhile.

The CPU port module and the NPI port have independent settings for the
length of the tag prefix. However, the driver is using the same variable
to program both of them.

There is no reason really to use any tag prefix with the CPU port
module, since that is not connected to any Ethernet port. So this patch
makes the inj_prefix and xtr_prefix variables apply only to the NPI
port (which the switchdev ocelot_vsc7514 driver does not use).

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-29 21:24:30 -08:00
Kurt Kanzenbach
6c13d75bee net: dsa: hellcreek: Add missing TAPRIO dependency
Add missing dependency to TAPRIO to avoid build failures such as:

|ERROR: modpost: "taprio_offload_get" [drivers/net/dsa/hirschmann/hellcreek_sw.ko] undefined!
|ERROR: modpost: "taprio_offload_free" [drivers/net/dsa/hirschmann/hellcreek_sw.ko] undefined!

Fixes: 24dfc6eb39 ("net: dsa: hellcreek: Add TAPRIO offloading support")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
Link: https://lore.kernel.org/r/20210128163338.22665-1-kurt@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-29 21:02:31 -08:00
Jakub Kicinski
c358f95205 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
drivers/net/can/dev.c
  b552766c87 ("can: dev: prevent potential information leak in can_fill_info()")
  3e77f70e73 ("can: dev: move driver related infrastructure into separate subdir")
  0a042c6ec9 ("can: dev: move netlink related code into seperate file")

  Code move.

drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
  57ac4a31c4 ("net/mlx5e: Correctly handle changing the number of queues when the interface is down")
  214baf2287 ("net/mlx5e: Support HTB offload")

  Adjacent code changes

net/switchdev/switchdev.c
  20776b465c ("net: switchdev: don't set port_obj_info->handled true when -EOPNOTSUPP")
  ffb68fc58e ("net: switchdev: remove the transaction structure from port object notifiers")
  bae33f2b5a ("net: switchdev: remove the transaction structure from port attributes")

  Transaction parameter gets dropped otherwise keep the fix.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-28 17:09:31 -08:00
Lorenzo Carletti
d1f3bdd4ea net: dsa: rtl8366rb: standardize init jam tables
In the rtl8366rb driver there are some jam tables which contain
undocumented values.
While trying to understand what these tables actually do,
I noticed a discrepancy in how one of those was treated.
Most of them were plain u16 arrays, while the ethernet one was
an u16 matrix.
By looking at the vendor's droplets of source code these tables came from,
I found out that they were all originally u16 matrixes.

This commit standardizes the jam tables, turning them all into
jam_tbl_entry arrays. Each entry contains 2 u16 values.
This change makes it easier to understand how the jam tables are used
and also makes it possible for a single function to handle all of them,
removing some duplicated code.

Signed-off-by: Lorenzo Carletti <lorenzo.carletti98@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-27 20:21:20 -08:00
Andrew Lunn
63368a7416 net: dsa: mv88e6xxx: Make global2 support mandatory
Early generations of the mv88e6xxx did not have the global 2
registers. In order to keep the driver slim, it was decided to make
the code for these registers optional. Over time, more generations of
switches have been added, always supporting global 2 and adding more
and more registers. No effort has been made to keep these additional
registers also optional to slim the driver down when used for older
generations. Optional global 2 now just gives additional development
and maintenance burden for no real gain.

Make global 2 support always compiled in.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Tested-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/20210127003210.663173-1-andrew@lunn.ch
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-27 19:28:16 -08:00
Rasmus Villemoes
b28f3f3c3f net: dsa: mv88e6xxx: use mv88e6185_g1_vtu_loadpurge() for the 6250
Apart from the mask used to get the high bits of the fid,
mv88e6185_g1_vtu_loadpurge() and mv88e6250_g1_vtu_loadpurge() are
identical. Since the entry->fid passed in should never exceed the
number of databases, we can simply use the former as-is as replacement
for the latter.

Suggested-by: Tobias Waldekranz <tobias@waldekranz.com>
Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-26 17:58:28 -08:00
Rasmus Villemoes
67c9ed1c88 net: dsa: mv88e6xxx: use mv88e6185_g1_vtu_getnext() for the 6250
mv88e6250_g1_vtu_getnext is almost identical to
mv88e6185_g1_vtu_getnext, except for the 6250 only having 64 databases
instead of 256. We can reduce code duplication by simply masking off
the extra two garbage bits when assembling the fid from VTU op [3:0]
and [11:8].

Reviewed-by: Tobias Waldekranz <tobias@waldekranz.com>
Tested-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-26 17:58:28 -08:00
DENG Qingfang
429a0edeef net: dsa: mt7530: MT7530 optional GPIO support
MT7530's LED controller can drive up to 15 LED/GPIOs.

Add support for GPIO control and allow users to use its GPIOs by
setting gpio-controller property in device tree.

Signed-off-by: DENG Qingfang <dqfext@gmail.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-25 18:29:04 -08:00
Kurt Kanzenbach
24dfc6eb39 net: dsa: hellcreek: Add TAPRIO offloading support
The switch has support for the 802.1Qbv Time Aware Shaper (TAS). Traffic
schedules may be configured individually on each front port. Each port has eight
egress queues. The traffic is mapped to a traffic class respectively via the PCP
field of a VLAN tagged frame.

The TAPRIO Qdisc already implements that. Therefore, this interface can simply
be reused. Add .port_setup_tc() accordingly.

The activation of a schedule on a port is split into two parts:

 * Programming the necessary gate control list (GCL)
 * Setup delayed work for starting the schedule

The hardware supports starting a schedule up to eight seconds in the future. The
TAPRIO interface provides an absolute base time. Therefore, periodic delayed
work is leveraged to check whether a schedule may be started or not.

Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-23 21:25:16 -08:00
Richard Cochran
57ba00774b net: dsa: mv88e6xxx: Remove bogus Kconfig dependency.
The mv88e6xxx is a DSA driver, and it implements DSA style time
stamping of PTP frames.  It has no need of the expensive option to
enable PHY time stamping.  Remove the bogus dependency.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Acked-by: Brandon Streiff <brandon.streiff@ni.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-23 13:23:31 -08:00
Pan Bian
cf3c46631e net: dsa: bcm_sf2: put device node before return
Put the device node dn before return error code on failure path.

Fixes: 461cd1b03e ("net: dsa: bcm_sf2: Register our slave MDIO bus")
Signed-off-by: Pan Bian <bianpan2016@163.com>
Link: https://lore.kernel.org/r/20210121123343.26330-1-bianpan2016@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-23 13:17:08 -08:00
Marek Vasut
1c45ba93d3 net: dsa: microchip: Adjust reset release timing to match reference reset circuit
KSZ8794CNX datasheet section 8.0 RESET CIRCUIT describes recommended
circuit for interfacing with CPU/FPGA reset consisting of 10k pullup
resistor and 10uF capacitor to ground. This circuit takes ~100 ms to
rise enough to release the reset.

For maximum supply voltage VDDIO=3.3V VIH=2.0V R=10kR C=10uF that is
                    VDDIO - VIH
  t = R * C * -ln( ------------- ) = 10000*0.00001*-(-0.93)=0.093 s
                       VDDIO
so we need ~95 ms for the reset to really de-assert, and then the
original 100us for the switch itself to come out of reset. Simply
msleep() for 100 ms which fits the constraint with a bit of extra
space.

Fixes: 5b79798090 ("net: dsa: microchip: Implement recommended reset timing")
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Michael Grzeschik <m.grzeschik@pengutronix.de>
Reviewed-by: Paul Barker <pbarker@konsulko.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20210120030502.617185-1-marex@denx.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-20 20:52:28 -08:00
Marek Vasut
c369d7fc8f net: dsa: microchip: ksz8795: Fix KSZ8794 port map again
The KSZ8795 switch has 4 external ports {0,1,2,3} and 1 CPU port {4}, so
does the KSZ8765. The KSZ8794 seems to be repackaged KSZ8795 with different
ID and port 3 not routed out, however the port 3 registers are present in
the silicon, so the KSZ8794 switch has 3 external ports {0,1,2} and 1 CPU
port {4}. Currently the driver always uses the last port as CPU port, on
KSZ8795/KSZ8765 that is port 4 and that is OK, but on KSZ8794 that is port
3 and that is not OK, as it must also be port 4.

This patch adjusts the driver such that it always registers a switch with
5 ports total (4 external ports, 1 CPU port), always sets the CPU port to
switch port 4, and then configures the external port mask according to
the switch model -- 3 ports for KSZ8794 and 4 for KSZ8795/KSZ8765.

Fixes: 68a1b676db ("net: dsa: microchip: ksz8795: remove superfluous port_cnt assignment")
Fixes: 4ce2a984ab ("net: dsa: microchip: ksz8795: use phy_port_cnt where possible")
Fixes: 241ed719bc ("net: dsa: microchip: ksz8795: use port_cnt instead of TOTOAL_PORT_NUM")
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Michael Grzeschik <m.grzeschik@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20210120001045.488506-1-marex@denx.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-20 20:50:49 -08:00
Dan Carpenter
646188c955 net: dsa: Fix off by one in dsa_loop_port_vlan_add()
The > comparison is intended to be >= to prevent reading beyond the
end of the ps->vlans[] array.  It doesn't affect run time though because
the ps->vlans[] array has VLAN_N_VID (4096) elements and the vlan->vid
cannot be > 4094 because it is checked earlier.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/YAbyb5kBJQlpYCs2@mwanda
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-20 17:10:04 -08:00
Jakub Kicinski
0fe2f273ab Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Conflicts:

drivers/net/can/dev.c
  commit 03f16c5075 ("can: dev: can_restart: fix use after free bug")
  commit 3e77f70e73 ("can: dev: move driver related infrastructure into separate subdir")

  Code move.

drivers/net/dsa/b53/b53_common.c
 commit 8e4052c32d ("net: dsa: b53: fix an off by one in checking "vlan->vid"")
 commit b7a9e0da2d ("net: switchdev: remove vid_begin -> vid_end range from VLAN objects")

 Field rename.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-20 12:16:11 -08:00
Dan Carpenter
8e4052c32d net: dsa: b53: fix an off by one in checking "vlan->vid"
The > comparison should be >= to prevent accessing one element beyond
the end of the dev->vlans[] array in the caller function, b53_vlan_add().
The "dev->vlans" array is allocated in the b53_switch_init() function
and it has "dev->num_vlans" elements.

Fixes: a2482d2ce3 ("net: dsa: b53: Plug in VLAN support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/YAbxI97Dl/pmBy5V@mwanda
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-19 19:34:49 -08:00
Rasmus Villemoes
87fe04367d net: dsa: mv88e6xxx: also read STU state in mv88e6250_g1_vtu_getnext
mv88e6xxx_port_vlan_join checks whether the VTU already contains an
entry for the given vid (via mv88e6xxx_vtu_getnext), and if so, merely
changes the relevant .member[] element and loads the updated entry
into the VTU.

However, at least for the mv88e6250, the on-stack struct
mv88e6xxx_vtu_entry vlan never has its .state[] array explicitly
initialized, neither in mv88e6xxx_port_vlan_join() nor inside the
getnext implementation. So the new entry has random garbage for the
STU bits, breaking VLAN filtering.

When the VTU entry is initially created, those bits are all zero, and
we should make sure to keep them that way when the entry is updated.

Fixes: 92307069a9 (net: dsa: mv88e6xxx: Avoid VTU corruption on 6097)
Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Tobias Waldekranz <tobias@waldekranz.com>
Tested-by: Tobias Waldekranz <tobias@waldekranz.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-18 13:04:28 -08:00
Vladimir Oltean
f59fd9cab7 net: mscc: ocelot: configure watermarks using devlink-sb
Using devlink-sb, we can configure 12/16 (the important 75%) of the
switch's controlling watermarks for congestion drops, and we can monitor
50% of the watermark occupancies (we can monitor the reservation
watermarks, but not the sharing watermarks, which are exposed as pool
sizes).

The following definitions can be made:

SB_BUF=0 # The devlink-sb for frame buffers
SB_REF=1 # The devlink-sb for frame references
POOL_ING=0 # The pool for ingress traffic. Both devlink-sb instances
           # have one of these.
POOL_EGR=1 # The pool for egress traffic. Both devlink-sb instances
           # have one of these.

Editing the hardware watermarks is done in the following way:
BUF_xxxx_I is accessed when sb=$SB_BUF and pool=$POOL_ING
REF_xxxx_I is accessed when sb=$SB_REF and pool=$POOL_ING
BUF_xxxx_E is accessed when sb=$SB_BUF and pool=$POOL_EGR
REF_xxxx_E is accessed when sb=$SB_REF and pool=$POOL_EGR

Configuring the sharing watermarks for COL_SHR(dp=0) is done implicitly
by modifying the corresponding pool size. By default, the pool size has
maximum size, so this can be skipped.

devlink sb pool set pci/0000:00:00.5 sb $SB_BUF pool $POOL_ING \
	size 129840 thtype static

Since by default there is no buffer reservation, the above command has
maxed out BUF_COL_SHR_I(dp=0).

Configuring the per-port reservation watermark (P_RSRV) is done in the
following way:

devlink sb port pool set pci/0000:00:00.5/0 sb $SB_BUF \
	pool $POOL_ING th 1000

The above command sets BUF_P_RSRV_I(port 0) to 1000 bytes. After this
command, the sharing watermarks are internally reconfigured with 1000
bytes less, i.e. from 129840 bytes to 128840 bytes.

Configuring the per-port-tc reservation watermarks (Q_RSRV) is done in
the following way:

for tc in {0..7}; do
	devlink sb tc bind set pci/0000:00:00.5/0 sb 0 tc $tc \
		type ingress pool $POOL_ING \
		th 3000
done

The above command sets BUF_Q_RSRV_I(port 0, tc 0..7) to 3000 bytes.
The sharing watermarks are again reconfigured with 24000 bytes less.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-15 20:02:35 -08:00
Vladimir Oltean
70d39a6e62 net: mscc: ocelot: export NUM_TC constant from felix to common switch lib
We should be moving anything that isn't DSA-specific or SoC-specific out
of the felix DSA driver, and into the common mscc_ocelot switch library.

The number of traffic classes is one of the aspects that is common
between all ocelot switches, so it belongs in the library.

This patch also makes seville use 8 TX queues, and therefore enables
prioritization via the QOS_CLASS field in the NPI injection header.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-15 20:02:34 -08:00
Vladimir Oltean
d19741b0f5 net: dsa: felix: perform teardown in reverse order of setup
In general it is desirable that cleanup is the reverse process of setup.
In this case I am not seeing any particular issue, but with the
introduction of devlink-sb for felix, a non-obvious decision had to be
made as to where to put its cleanup method. When there's a convention in
place, that decision becomes obvious.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-15 20:02:34 -08:00
Vladimir Oltean
a7096915e4 net: dsa: felix: reindent struct dsa_switch_ops
The devlink function pointer names are super long, and they would break
the alignment. So reindent the existing ops now by adding one tab.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-15 20:02:34 -08:00
Vladimir Oltean
703b762190 net: mscc: ocelot: add ops for decoding watermark threshold and occupancy
We'll need to read back the watermark thresholds and occupancy from
hardware (for devlink-sb integration), not only to write them as we did
so far in ocelot_port_set_maxlen. So introduce 2 new functions in struct
ocelot_ops, similar to wm_enc, and implement them for the 3 supported
mscc_ocelot switches.

Remove the INUSE and MAXUSE unpacking helpers for the QSYS_RES_STAT
register, because that doesn't scale with the number of switches that
mscc_ocelot supports now. They have different bit widths for the
watermarks, and we need function pointers to abstract that difference
away.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-15 20:02:34 -08:00
Vladimir Oltean
f6fe01d6fa net: mscc: ocelot: auto-detect packet buffer size and number of frame references
Instead of reading these values from the reference manual and writing
them down into the driver, it appears that the hardware gives us the
option of detecting them dynamically.

The number of frame references corresponds to what the reference manual
notes, however it seems that the frame buffers are reported as slightly
less than the books would indicate. On VSC9959 (Felix), the books say it
should have 128KB of packet buffer, but the registers indicate only
129840 bytes (126.79 KB). Also, the unit of measurement for FREECNT from
the documentation of all these devices is incorrect (taken from an older
generation). This was confirmed by Younes Leroul from Microchip support.

Not having anything better to do with these values at the moment* (this
will change soon), let's just print them.

*The frame buffer size is, in fact, used to calculate the tail dropping
watermarks.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-15 20:02:33 -08:00
Vladimir Oltean
0ee2af4ebb net: dsa: set configure_vlan_while_not_filtering to true by default
As explained in commit 54a0ed0df4 ("net: dsa: provide an option for
drivers to always receive bridge VLANs"), DSA has historically been
skipping VLAN switchdev operations when the bridge wasn't in
vlan_filtering mode, but the reason why it was doing that has never been
clear. So the configure_vlan_while_not_filtering option is there merely
to preserve functionality for existing drivers. It isn't some behavior
that drivers should opt into. Ideally, when all drivers leave this flag
set, we can delete the dsa_port_skip_vlan_configuration() function.

New drivers always seem to omit setting this flag, for some reason. So
let's reverse the logic: the DSA core sets it by default to true before
the .setup() callback, and legacy drivers can turn it off. This way, new
drivers get the new behavior by default, unless they explicitly set the
flag to false, which is more obvious during review.

Remove the assignment from drivers which were setting it to true, and
add the assignment to false for the drivers that didn't previously have
it. This way, it should be easier to see how many we have left.

The following drivers: lan9303, mv88e6060 were skipped from setting this
flag to false, because they didn't have any VLAN offload ops in the
first place.

The Broadcom Starfighter 2 driver calls the common b53_switch_alloc and
therefore also inherits the configure_vlan_while_not_filtering=true
behavior.

Also, print a message through netlink extack every time a VLAN has been
skipped. This is mildly annoying on purpose, so that (a) it is at least
clear that VLANs are being skipped - the legacy behavior in itself is
confusing, and the extack should be much more difficult to miss, unlike
kernel logs - and (b) people have one more incentive to convert to the
new behavior.

No behavior change except for the added prints is intended at this time.

$ ip link add br0 type bridge vlan_filtering 0
$ ip link set sw0p2 master br0
[   60.315148] br0: port 1(sw0p2) entered blocking state
[   60.320350] br0: port 1(sw0p2) entered disabled state
[   60.327839] device sw0p2 entered promiscuous mode
[   60.334905] br0: port 1(sw0p2) entered blocking state
[   60.340142] br0: port 1(sw0p2) entered forwarding state
Warning: dsa_core: skipping configuration of VLAN. # This was the pvid
$ bridge vlan add dev sw0p2 vid 100
Warning: dsa_core: skipping configuration of VLAN.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20210115231919.43834-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-15 17:29:40 -08:00
Tobias Waldekranz
b80dc51b72 net: dsa: mv88e6xxx: Only allow LAG offload on supported hardware
There are chips that do have Global 2 registers, and therefore trunk
mapping/mask tables are not available. Refuse the offload as early as
possible on those devices.

Fixes: 57e661aae6 ("net: dsa: mv88e6xxx: Link aggregation support")
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-15 16:08:51 -08:00
Tobias Waldekranz
d38001d30d net: dsa: mv88e6xxx: Provide dummy implementations for trunk setters
Support for Global 2 registers is build-time optional. In the case
where it was not enabled the build would fail as no "dummy"
implementation of these functions was available.

Fixes: 57e661aae6 ("net: dsa: mv88e6xxx: Link aggregation support")
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Tested-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-15 16:08:51 -08:00
George McCollister
ee00b24f32 net: dsa: add Arrow SpeedChips XRS700x driver
Add a driver with initial support for the Arrow SpeedChips XRS7000
series of gigabit Ethernet switch chips which are typically used in
critical networking applications.

The switches have up to three RGMII ports and one RMII port.
Management to the switches can be performed over i2c or mdio.

Support for advanced features such as PTP and
HSR/PRP (IEC 62439-3 Clause 5 & 4) is not included in this patch and
may be added at a later date.

Signed-off-by: George McCollister <george.mccollister@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-15 15:37:37 -08:00
Tobias Waldekranz
57e661aae6 net: dsa: mv88e6xxx: Link aggregation support
Support offloading of LAGs to hardware. LAGs may be attached to a
bridge in which case VLANs, multicast groups, etc. are also offloaded
as usual.

Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-14 17:11:56 -08:00
Oleksij Rempel
bf9ce38593 net: dsa: qca: ar9331: export stats64
Add stats support for the ar9331 switch.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-12 20:17:10 -08:00
Vladimir Oltean
537e2b8822 net: dsa: felix: the switch does not support DMA
The code that sets the DMA mask to 64 bits is bogus, it is taken from
the enetc driver together with the rest of the PCI probing boilerplate.

Since this patch is touching the error path to delete err_dma, let's
also change the err_alloc_felix label which was incorrect. The kzalloc
failure does not need a kfree, but it doesn't hurt either, since kfree
works with NULL pointers.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20210109203415.2120142-1-olteanv@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-11 16:16:35 -08:00
Vladimir Oltean
1958d5815c net: dsa: remove the transactional logic from VLAN objects
It should be the driver's business to logically separate its VLAN
offloading into a preparation and a commit phase, and some drivers don't
need / can't do this.

So remove the transactional shim from DSA and let drivers propagate
errors directly from the .port_vlan_add callback.

It would appear that the code has worse error handling now than it had
before. DSA is the only in-kernel user of switchdev that offloads one
switchdev object to more than one port: for every VLAN object offloaded
to a user port, that VLAN is also offloaded to the CPU port. So the
"prepare for user port -> check for errors -> prepare for CPU port ->
check for errors -> commit for user port -> commit for CPU port"
sequence appears to make more sense than the one we are using now:
"offload to user port -> check for errors -> offload to CPU port ->
check for errors", but it is really a compromise. In the new way, we can
catch errors from the commit phase that we previously had to ignore.
But we have our hands tied and cannot do any rollback now: if we add a
VLAN on the CPU port and it fails, we can't do the rollback by simply
deleting it from the user port, because the switchdev API is not so nice
with us: it could have simply been there already, even with the same
flags. So we don't even attempt to rollback anything on addition error,
just leave whatever VLANs managed to get offloaded right where they are.
This should not be a problem at all in practice.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-11 16:00:57 -08:00
Vladimir Oltean
a52b2da778 net: dsa: remove the transactional logic from MDB entries
For many drivers, the .port_mdb_prepare callback was not a good opportunity
to avoid any error condition, and they would suppress errors found during
the actual commit phase.

Where a logical separation between the prepare and the commit phase
existed, the function that used to implement the .port_mdb_prepare
callback still exists, but now it is called directly from .port_mdb_add,
which was modified to return an int code.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> # hellcreek
Reviewed-by: Linus Wallei <linus.walleij@linaro.org> # RTL8366
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-11 16:00:57 -08:00
Vladimir Oltean
bae33f2b5a net: switchdev: remove the transaction structure from port attributes
Since the introduction of the switchdev API, port attributes were
transmitted to drivers for offloading using a two-step transactional
model, with a prepare phase that was supposed to catch all errors, and a
commit phase that was supposed to never fail.

Some classes of failures can never be avoided, like hardware access, or
memory allocation. In the latter case, merely attempting to move the
memory allocation to the preparation phase makes it impossible to avoid
memory leaks, since commit 91cf8eceff ("switchdev: Remove unused
transaction item queue") which has removed the unused mechanism of
passing on the allocated memory between one phase and another.

It is time we admit that separating the preparation from the commit
phase is something that is best left for the driver to decide, and not
something that should be baked into the API, especially since there are
no switchdev callers that depend on this.

This patch removes the struct switchdev_trans member from switchdev port
attribute notifier structures, and converts drivers to not look at this
member.

In part, this patch contains a revert of my previous commit 2e554a7a5d
("net: dsa: propagate switchdev vlan_filtering prepare phase to
drivers").

For the most part, the conversion was trivial except for:
- Rocker's world implementation based on Broadcom OF-DPA had an odd
  implementation of ofdpa_port_attr_bridge_flags_set. The conversion was
  done mechanically, by pasting the implementation twice, then only
  keeping the code that would get executed during prepare phase on top,
  then only keeping the code that gets executed during the commit phase
  on bottom, then simplifying the resulting code until this was obtained.
- DSA's offloading of STP state, bridge flags, VLAN filtering and
  multicast router could be converted right away. But the ageing time
  could not, so a shim was introduced and this was left for a further
  commit.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> # hellcreek
Reviewed-by: Linus Walleij <linus.walleij@linaro.org> # RTL8366RB
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-11 16:00:57 -08:00
Vladimir Oltean
3e85f580e3 net: dsa: mv88e6xxx: deny vid 0 on the CPU port and DSA links too
mv88e6xxx apparently has a problem offloading VID 0, which the 8021q
module tries to install as part of commit ad1afb0039 ("vlan_dev: VLAN
0 should be treated as "no vlan tag" (802.1p packet)"). That mv88e6xxx
restriction seems to have been introduced by the "VTU GetNext VID-1
trick to retrieve a single entry" - see commit 2fb5ef09de ("net: dsa:
mv88e6xxx: extract single VLAN retrieval").

There is one more problem. The mv88e6xxx CPU port and DSA links do not
report properly in the prepare phase what are the VLANs that they can
offload. They'll say they can offload everything:

mv88e6xxx_port_vlan_prepare
-> mv88e6xxx_port_check_hw_vlan:

	/* DSA and CPU ports have to be members of multiple vlans */
	if (dsa_is_dsa_port(ds, port) || dsa_is_cpu_port(ds, port))
		return 0;

Except that if you actually try to commit to it, they'll error out and
print this message:

[   32.802438] mv88e6085 d0032004.mdio-mii:12: p9: failed to add VLAN 0t

which comes from:

mv88e6xxx_port_vlan_add
-> mv88e6xxx_port_vlan_join:

	if (!vid)
		return -EOPNOTSUPP;

What prevents this condition from triggering in real life? The fact that
when a DSA_NOTIFIER_VLAN_ADD is emitted, it never targets a DSA link
directly. Instead, the notifier will always target either a user port or
a CPU port. DSA links just happen to get dragged in by:

static bool dsa_switch_vlan_match(struct dsa_switch *ds, int port,
				  struct dsa_notifier_vlan_info *info)
{
	...
	if (dsa_is_dsa_port(ds, port))
		return true;
	...
}

So for every DSA VLAN notifier, during the prepare phase, it will just
so happen that there will be somebody to say "no, don't do that".

This will become a problem when the switchdev prepare/commit transactional
model goes away. Every port needs to think on its own. DSA links can no
longer bluff and rely on the fact that the prepare phase will not go
through to the end, because there will be no prepare phase any longer.

Fix this issue before it becomes a problem, by having the "vid == 0"
check earlier than the check whether we are a CPU port / DSA link or not.
Also, the "vid == 0" check becomes unnecessary in the .port_vlan_add
callback, so we can remove it.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-11 16:00:56 -08:00
Vladimir Oltean
b7a9e0da2d net: switchdev: remove vid_begin -> vid_end range from VLAN objects
The call path of a switchdev VLAN addition to the bridge looks something
like this today:

        nbp_vlan_init
        |  __br_vlan_set_default_pvid
        |  |                       |
        |  |    br_afspec          |
        |  |        |              |
        |  |        v              |
        |  | br_process_vlan_info  |
        |  |        |              |
        |  |        v              |
        |  |   br_vlan_info        |
        |  |       / \            /
        |  |      /   \          /
        |  |     /     \        /
        |  |    /       \      /
        v  v   v         v    v
      nbp_vlan_add   br_vlan_add ------+
       |              ^      ^ |       |
       |             /       | |       |
       |            /       /  /       |
       \ br_vlan_get_master/  /        v
        \        ^        /  /  br_vlan_add_existing
         \       |       /  /          |
          \      |      /  /          /
           \     |     /  /          /
            \    |    /  /          /
             \   |   /  /          /
              v  |   | v          /
              __vlan_add         /
                 / |            /
                /  |           /
               v   |          /
   __vlan_vid_add  |         /
               \   |        /
                v  v        v
      br_switchdev_port_vlan_add

The ranges UAPI was introduced to the bridge in commit bdced7ef78
("bridge: support for multiple vlans and vlan ranges in setlink and
dellink requests") (Jan 10 2015). But the VLAN ranges (parsed in br_afspec)
have always been passed one by one, through struct bridge_vlan_info
tmp_vinfo, to br_vlan_info. So the range never went too far in depth.

Then Scott Feldman introduced the switchdev_port_bridge_setlink function
in commit 47f8328bb1 ("switchdev: add new switchdev bridge setlink").
That marked the introduction of the SWITCHDEV_OBJ_PORT_VLAN, which made
full use of the range. But switchdev_port_bridge_setlink was called like
this:

br_setlink
-> br_afspec
-> switchdev_port_bridge_setlink

Basically, the switchdev and the bridge code were not tightly integrated.
Then commit 41c498b935 ("bridge: restore br_setlink back to original")
came, and switchdev drivers were required to implement
.ndo_bridge_setlink = switchdev_port_bridge_setlink for a while.

In the meantime, commits such as 0944d6b5a2 ("bridge: try switchdev op
first in __vlan_vid_add/del") finally made switchdev penetrate the
br_vlan_info() barrier and start to develop the call path we have today.
But remember, br_vlan_info() still receives VLANs one by one.

Then Arkadi Sharshevsky refactored the switchdev API in 2017 in commit
29ab586c3d ("net: switchdev: Remove bridge bypass support from
switchdev") so that drivers would not implement .ndo_bridge_setlink any
longer. The switchdev_port_bridge_setlink also got deleted.
This refactoring removed the parallel bridge_setlink implementation from
switchdev, and left the only switchdev VLAN objects to be the ones
offloaded from __vlan_vid_add (basically RX filtering) and  __vlan_add
(the latter coming from commit 9c86ce2c1a ("net: bridge: Notify about
bridge VLANs")).

That is to say, today the switchdev VLAN object ranges are not used in
the kernel. Refactoring the above call path is a bit complicated, when
the bridge VLAN call path is already a bit complicated.

Let's go off and finish the job of commit 29ab586c3d by deleting the
bogus iteration through the VLAN ranges from the drivers. Some aspects
of this feature never made too much sense in the first place. For
example, what is a range of VLANs all having the BRIDGE_VLAN_INFO_PVID
flag supposed to mean, when a port can obviously have a single pvid?
This particular configuration _is_ denied as of commit 6623c60dc2
("bridge: vlan: enforce no pvid flag in vlan ranges"), but from an API
perspective, the driver still has to play pretend, and only offload the
vlan->vid_end as pvid. And the addition of a switchdev VLAN object can
modify the flags of another, completely unrelated, switchdev VLAN
object! (a VLAN that is PVID will invalidate the PVID flag from whatever
other VLAN had previously been offloaded with switchdev and had that
flag. Yet switchdev never notifies about that change, drivers are
supposed to guess).

Nonetheless, having a VLAN range in the API makes error handling look
scarier than it really is - unwinding on errors and all of that.
When in reality, no one really calls this API with more than one VLAN.
It is all unnecessary complexity.

And despite appearing pretentious (two-phase transactional model and
all), the switchdev API is really sloppy because the VLAN addition and
removal operations are not paired with one another (you can add a VLAN
100 times and delete it just once). The bridge notifies through
switchdev of a VLAN addition not only when the flags of an existing VLAN
change, but also when nothing changes. There are switchdev drivers out
there who don't like adding a VLAN that has already been added, and
those checks don't really belong at driver level. But the fact that the
API contains ranges is yet another factor that prevents this from being
addressed in the future.

Of the existing switchdev pieces of hardware, it appears that only
Mellanox Spectrum supports offloading more than one VLAN at a time,
through mlxsw_sp_port_vlan_set. I have kept that code internal to the
driver, because there is some more bookkeeping that makes use of it, but
I deleted it from the switchdev API. But since the switchdev support for
ranges has already been de facto deleted by a Mellanox employee and
nobody noticed for 4 years, I'm going to assume it's not a biggie.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com> # switchdev and mlxsw
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> # hellcreek
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-11 16:00:56 -08:00
Rafał Miłecki
73b7a60479 net: dsa: bcm_sf2: support BCM4908's integrated switch
BCM4908 family SoCs come with integrated Starfighter 2 switch. Its
registers layout it a mix of BCM7278 and BCM7445. It has 5 integrated
PHYs and 8 ports. It also supports RGMII and SerDes.

Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20210106213202.17459-3-zajec5@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-09 19:18:10 -08:00
Jakub Kicinski
833d22f2f9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Trivial conflict in CAN on file rename.

Conflicts:
	drivers/net/can/m_can/tcan4x5x-core.c

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-08 13:28:00 -08:00
Aleksander Jan Bajkowski
3545454c78 net: dsa: lantiq_gswip: Exclude RMII from modes that report 1 GbE
Exclude RMII from modes that report 1 GbE support. Reduced MII supports
up to 100 MbE.

Fixes: 14fceff477 ("net: dsa: Add Lantiq / Intel DSA driver for vrx200")
Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20210107195818.3878-1-olek2@wp.pl
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07 19:00:11 -08:00
Vladimir Oltean
c54913c1d4 net: dsa: ocelot: request DSA to fix up lack of address learning on CPU port
Given the following setup:

ip link add br0 type bridge
ip link set eno0 master br0
ip link set swp0 master br0
ip link set swp1 master br0
ip link set swp2 master br0
ip link set swp3 master br0

Currently, packets received on a DSA slave interface (such as swp0)
which should be routed by the software bridge towards a non-switch port
(such as eno0) are also flooded towards the other switch ports (swp1,
swp2, swp3) because the destination is unknown to the hardware switch.

This patch addresses the issue by monitoring the addresses learnt by the
software bridge on eno0, and adding/deleting them as static FDB entries
on the CPU port accordingly.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07 15:34:46 -08:00
Randy Dunlap
7f847db304 net: dsa: fix led_classdev build errors
Fix build errors when LEDS_CLASS=m and NET_DSA_HIRSCHMANN_HELLCREEK=y.
This limits the latter to =m when LEDS_CLASS=m.

microblaze-linux-ld: drivers/net/dsa/hirschmann/hellcreek_ptp.o: in function `hellcreek_ptp_setup':
(.text+0xf80): undefined reference to `led_classdev_register_ext'
microblaze-linux-ld: (.text+0xf94): undefined reference to `led_classdev_register_ext'
microblaze-linux-ld: drivers/net/dsa/hirschmann/hellcreek_ptp.o: in function `hellcreek_ptp_free':
(.text+0x1018): undefined reference to `led_classdev_unregister'
microblaze-linux-ld: (.text+0x1024): undefined reference to `led_classdev_unregister'

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Link: lore.kernel.org/r/202101060655.iUvMJqS2-lkp@intel.com
Cc: Kurt Kanzenbach <kurt@linutronix.de>
Link: https://lore.kernel.org/r/20210106021815.31796-1-rdunlap@infradead.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-06 16:17:14 -08:00
Zheng Yongjun
c75857b055 net: dsa: sja1105: Use kzalloc for allocating only one thing
Use kzalloc rather than kcalloc(1,...)

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
@@

- kcalloc(1,
+ kzalloc(
          ...)
// </smpl>

Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-01-05 15:43:41 -08:00
Martin Blumenstingl
709a3c9dff net: dsa: lantiq_gswip: Fix GSWIP_MII_CFG(p) register access
There is one GSWIP_MII_CFG register for each switch-port except the CPU
port. The register offset for the first port is 0x0, 0x02 for the
second, 0x04 for the third and so on.

Update the driver to not only restrict the GSWIP_MII_CFG registers to
ports 0, 1 and 5. Handle ports 0..5 instead but skip the CPU port. This
means we are not overwriting the configuration for the third port (port
two since we start counting from zero) with the settings for the sixth
port (with number five) anymore.

The GSWIP_MII_PCDU(p) registers are not updated because there's really
only three (one for each of the following ports: 0, 1, 5).

Fixes: 14fceff477 ("net: dsa: Add Lantiq / Intel DSA driver for vrx200")
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Acked-by: Hauke Mehrtens <hauke@hauke-m.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-04 13:47:15 -08:00
Martin Blumenstingl
c1a9ec7e5d net: dsa: lantiq_gswip: Enable GSWIP_MII_CFG_EN also for internal PHYs
Enable GSWIP_MII_CFG_EN also for internal PHYs to make traffic flow.
Without this the PHY link is detected properly and ethtool statistics
for TX are increasing but there's no RX traffic coming in.

Fixes: 14fceff477 ("net: dsa: Add Lantiq / Intel DSA driver for vrx200")
Suggested-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Acked-by: Hauke Mehrtens <hauke@hauke-m.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-04 13:47:15 -08:00
Oleksij Rempel
3e47495fc4 net: dsa: qca: ar9331: fix sleeping function called from invalid context bug
With lockdep enabled, we will get following warning:

 ar9331_switch ethernet.1:10 lan0 (uninitialized): PHY [!ahb!ethernet@1a000000!mdio!switch@10:00] driver [Qualcomm Atheros AR9331 built-in PHY] (irq=13)
 BUG: sleeping function called from invalid context at kernel/locking/mutex.c:935
 in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 18, name: kworker/0:1
 INFO: lockdep is turned off.
 irq event stamp: 602
 hardirqs last  enabled at (601): [<8073fde0>] _raw_spin_unlock_irq+0x3c/0x80
 hardirqs last disabled at (602): [<8073a4f4>] __schedule+0x184/0x800
 softirqs last  enabled at (0): [<80080f60>] copy_process+0x578/0x14c8
 softirqs last disabled at (0): [<00000000>] 0x0
 CPU: 0 PID: 18 Comm: kworker/0:1 Not tainted 5.10.0-rc3-ar9331-00734-g7d644991df0c #31
 Workqueue: events deferred_probe_work_func
 Stack : 80980000 80980000 8089ef70 80890000 804b5414 80980000 00000002 80b53728
         00000000 800d1268 804b5414 ffffffde 00000017 800afe08 81943860 0f5bfc32
         00000000 00000000 8089ef70 819436c0 ffffffea 00000000 00000000 00000000
         8194390c 808e353c 0000000f 66657272 80980000 00000000 00000000 80890000
         804b5414 80980000 00000002 80b53728 00000000 00000000 00000000 80d40000
         ...
 Call Trace:
 [<80069ce0>] show_stack+0x9c/0x140
 [<800afe08>] ___might_sleep+0x220/0x244
 [<8073bfb0>] __mutex_lock+0x70/0x374
 [<8073c2e0>] mutex_lock_nested+0x2c/0x38
 [<804b5414>] regmap_update_bits_base+0x38/0x8c
 [<804ee584>] regmap_update_bits+0x1c/0x28
 [<804ee714>] ar9331_sw_unmask_irq+0x34/0x60
 [<800d91f0>] unmask_irq+0x48/0x70
 [<800d93d4>] irq_startup+0x114/0x11c
 [<800d65b4>] __setup_irq+0x4f4/0x6d0
 [<800d68a0>] request_threaded_irq+0x110/0x190
 [<804e3ef0>] phy_request_interrupt+0x4c/0xe4
 [<804df508>] phylink_bringup_phy+0x2c0/0x37c
 [<804df7bc>] phylink_of_phy_connect+0x118/0x130
 [<806c1a64>] dsa_slave_create+0x3d0/0x578
 [<806bc4ec>] dsa_register_switch+0x934/0xa20
 [<804eef98>] ar9331_sw_probe+0x34c/0x364
 [<804eb48c>] mdio_probe+0x44/0x70
 [<8049e3b4>] really_probe+0x30c/0x4f4
 [<8049ea10>] driver_probe_device+0x264/0x26c
 [<8049bc10>] bus_for_each_drv+0xb4/0xd8
 [<8049e684>] __device_attach+0xe8/0x18c
 [<8049ce58>] bus_probe_device+0x48/0xc4
 [<8049db70>] deferred_probe_work_func+0xdc/0xf8
 [<8009ff64>] process_one_work+0x2e4/0x4a0
 [<800a0770>] worker_thread+0x2a8/0x354
 [<800a774c>] kthread+0x16c/0x174
 [<8006306c>] ret_from_kernel_thread+0x14/0x1c

 ar9331_switch ethernet.1:10 lan1 (uninitialized): PHY [!ahb!ethernet@1a000000!mdio!switch@10:02] driver [Qualcomm Atheros AR9331 built-in PHY] (irq=13)
 DSA: tree 0 setup

To fix it, it is better to move access to MDIO register to the .irq_bus_sync_unlock
call back.

Fixes: ec6698c272 ("net: dsa: add support for Atheros AR9331 built-in switch")
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/20201211110317.17061-1-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-16 10:57:35 -08:00
Rasmus Villemoes
49506a9ba0 net: dsa: mv88e6xxx: don't set non-existing learn2all bit for 6220/6250
The 6220 and 6250 switches do not have a learn2all bit in global1, ATU
control register; bit 3 is reserverd.

On the switches that do have that bit, it is used to control whether
learning frames are sent out the ports that have the message_port bit
set. So rather than adding yet another chip method, use the existence
of the ->port_setup_message_port method as a proxy for determining
whether the learn2all bit exists (and should be set).

Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
Link: https://lore.kernel.org/r/20201210110645.27765-1-rasmus.villemoes@prevas.dk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-14 17:25:03 -08:00
DENG Qingfang
771c890156 net: dsa: mt7530: enable MTU normalization
MT7530 has a global RX length register, so we are actually changing its
MRU.
Enable MTU normalization for this reason.

Signed-off-by: DENG Qingfang <dqfext@gmail.com>
Acked-by: Landen Chao <landen.chao@mediatek.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/20201210170322.3433-1-dqfext@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-12 15:00:56 -08:00
Jakub Kicinski
46d5e62dd3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
xdp_return_frame_bulk() needs to pass a xdp_buff
to __xdp_return().

strlcpy got converted to strscpy but here it makes no
functional difference, so just keep the right code.

Conflicts:
	net/netfilter/nf_tables_api.c

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-11 22:29:38 -08:00
Zheng Yongjun
965b8b2bad net: dsa: simplify the return rtl8366_vlan_prepare()
Simplify the return expression.

Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09 17:05:37 -08:00
Zheng Yongjun
59d4c93d31 net: mv88e6xxx: convert comma to semicolon
Replace a comma between expression statements by a semicolon.

Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09 16:23:08 -08:00
DENG Qingfang
ea6d5c924e net: dsa: mt7530: support setting ageing time
MT7530 has a global address age control register, so use it to set
ageing time.

The applied timer is (AGE_CNT + 1) * (AGE_UNIT + 1) seconds

Signed-off-by: DENG Qingfang <dqfext@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-08 16:18:26 -08:00
Vladimir Oltean
edd2410b16 net: mscc: ocelot: fix dropping of unknown IPv4 multicast on Seville
The current assumption is that the felix DSA driver has flooding knobs
per traffic class, while ocelot switchdev has a single flooding knob.
This was correct for felix VSC9959 and ocelot VSC7514, but with the
introduction of seville VSC9953, we see a switch driven by felix.c which
has a single flooding knob.

So it is clear that we must do what should have been done from the
beginning, which is not to overwrite the configuration done by ocelot.c
in felix, but instead to teach the common ocelot library about the
differences in our switches, and set up the flooding PGIDs centrally.

The effect that the bogus iteration through FELIX_NUM_TC has upon
seville is quite dramatic. ANA_FLOODING is located at 0x00b548, and
ANA_FLOODING_IPMC is located at 0x00b54c. So the bogus iteration will
actually overwrite ANA_FLOODING_IPMC when attempting to write
ANA_FLOODING[1]. There is no ANA_FLOODING[1] in sevile, just ANA_FLOODING.

And when ANA_FLOODING_IPMC is overwritten with a bogus value, the effect
is that ANA_FLOODING_IPMC gets the value of 0x0003CF7D:
	MC6_DATA = 61,
	MC6_CTRL = 61,
	MC4_DATA = 60,
	MC4_CTRL = 0.
Because MC4_CTRL is zero, this means that IPv4 multicast control packets
are not flooded, but dropped. An invalid configuration, and this is how
the issue was actually spotted.

Reported-by: Eldar Gasanov <eldargasanov2@gmail.com>
Reported-by: Maxim Kochetkov <fido_max@inbox.ru>
Tested-by: Eldar Gasanov <eldargasanov2@gmail.com>
Fixes: 84705fc165 ("net: dsa: felix: introduce support for Seville VSC9953 switch")
Fixes: 3c7b51bd39 ("net: dsa: felix: allow flooding for all traffic classes")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20201204175416.1445937-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-05 15:41:34 -08:00
Michael Grzeschik
02ffbb0270 net: dsa: microchip: ksz8795: use num_vlans where possible
The value of the define VLAN_TABLE_ENTRIES can be derived from
num_vlans. This patch is using the variable num_vlans instead and
removes the extra define.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-02 17:16:29 -08:00
Michael Grzeschik
241ed719bc net: dsa: microchip: ksz8795: use port_cnt instead of TOTOAL_PORT_NUM
To get the driver working with other chips using different port counts
the dyn_mac_table should be flushed depending on the amount of available
ports. This patch remove the extra define TOTOAL_PORT_NUM and is
making use of the dynamic port_cnt variable instead.

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-02 17:16:29 -08:00
Michael Grzeschik
c9f4633b93 net: dsa: microchip: remove usage of mib_port_count
The variable mib_port_cnt has the same meaning as port_cnt.
This driver removes the extra variable and is using port_cnt
everywhere instead.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-02 17:16:29 -08:00
Michael Grzeschik
94374dd162 net: dsa: microchip: ksz8795: align port_cnt usage with other microchip drivers
The ksz8795 driver is using port_cnt differently to the other microchip
DSA drivers. It sets it to the external physical port count, than the
whole port count (including the cpu port). This patch is aligning the
variables purpose with the other microchip drivers.

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-02 17:16:29 -08:00
Michael Grzeschik
557d1a1fba net: dsa: microchip: remove superfluous num_ports assignment
The variable num_ports is already assigned in the init function. The
drivers individual init function already handles the different purpose
of port_cnt vs port_cnt + 1. This patch removes the extra assignment of
the variable.

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-02 17:16:29 -08:00
Michael Grzeschik
4ce2a984ab net: dsa: microchip: ksz8795: use phy_port_cnt where possible
The driver is currently hard coded to SWITCH_PORT_NUM being
(TOTAL_PORT_NUM - 1) which is limited to 4 user ports for the ksz8795.
The phy_port_cnt is referring to its user ports. The patch removes the
extra define and use the assigned variable phy_port_cnt instead so the
driver can be used on different switches.

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-02 17:16:29 -08:00
Michael Grzeschik
65fe1acf07 net: dsa: microchip: ksz8795: use mib_cnt where possible
The variable mib_cnt is assigned with TOTAL_SWITCH_COUNTER_NUM. This
value can also be derived from the array size of mib_names. This patch
uses this calculated value instead, removes the extra define and uses
mib_cnt everywhere possible instead of the static define
TOTAL_SWITCH_COUNTER_NUM. Keeping it in a separate variable instead of
using ARRAY_SIZE everywhere instead makes the driver more flexible for
future use of devices with different amount of counters.

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-02 17:16:29 -08:00
Michael Grzeschik
31b62c78c1 net: dsa: microchip: ksz8795: use reg_mib_cnt where possible
The extra define SWITCH_COUNTER_NUM is a copy of the KSZ8795_COUNTER_NUM
define. This patch initializes reg_mib_cnt with KSZ8795_COUNTER_NUM,
makes use of reg_mib_cnt everywhere instead and removes the extra
define.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-02 17:16:29 -08:00
Michael Grzeschik
7fc32b41fe net: dsa: microchip: ksz8795: move variable assignments from detect to init
This patch moves all variable assignments to the init function. It
leaves the detect function for its single purpose to detect which chip
version is found.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-02 17:16:29 -08:00
Michael Grzeschik
68a1b676db net: dsa: microchip: ksz8795: remove superfluous port_cnt assignment
The port_cnt assignment will be done again in the init function.
This patch removes the previous assignment in the detect function.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-02 17:16:28 -08:00
Michael Grzeschik
453aa4cd7e net: dsa: microchip: ksz8795: remove unused last_port variable
The variable last_port is not used anywhere, this patch removes it.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-02 17:16:28 -08:00
Chris Packham
0fd5d79efa net: dsa: mv88e6xxx: Handle error in serdes_get_regs
If the underlying read operation failed we would end up writing stale
data to the supplied buffer. This would end up with the last
successfully read value repeating. Fix this by only writing the data
when we know the read was good. This will mean that failed values will
return 0xffff.

Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 17:58:06 -08:00
Chris Packham
5c19bc8b57 net: dsa: mv88e6xxx: Add serdes interrupt support for MV88E6097
The MV88E6097 presents the serdes interrupts for ports 8 and 9 via the
Switch Global 2 registers. There is no additional layer of
enablinh/disabling the serdes interrupts like other mv88e6xxx switches.
Even though most of the serdes behaviour is the same as the MV88E6185
that chip does not provide interrupts for serdes events so unlike
earlier commits the functions added here are specific to the MV88E6097.

Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 17:58:06 -08:00
Chris Packham
f5be107c33 net: dsa: mv88e6xxx: Support serdes ports on MV88E6097/6095/6185
Implement serdes_power, serdes_get_lane and serdes_pcs_get_state ops for
the MV88E6097/6095/6185 so that ports 8 & 9 can be supported as serdes
ports and directly connected to other network interfaces or to SFPs
without a PHY.

Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 17:58:06 -08:00
Chris Packham
4efe766290 net: dsa: mv88e6xxx: Don't force link when using in-band-status
When a port is configured with 'managed = "in-band-status"' switch chips
like the 88E6390 need to propagate the SERDES link state to the MAC
because the link state is not correctly detected. This causes problems
on the 88E6185/88E6097 where the link partner won't see link state
changes because we're forcing the link.

To address this introduce a new device specific op port_sync_link() and
push the logic from mv88e6xxx_mac_link_up() into that. Provide an
implementation for the 88E6185 like devices which doesn't force the
link.

Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 17:58:06 -08:00
Christian Eggers
8c4599f498 net: dsa: microchip: ksz8795: setup SPI mode
This should be done in the device driver instead of the device tree.

Signed-off-by: Christian Eggers <ceggers@arri.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 17:53:45 -08:00
Christian Eggers
9ed602bac9 net: dsa: microchip: ksz9477: setup SPI mode
This should be done in the device driver instead of the device tree.

Signed-off-by: Christian Eggers <ceggers@arri.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 17:53:45 -08:00
Christian Eggers
44e53c8882 net: dsa: microchip: support for "ethernet-ports" node
The dsa.yaml device tree binding allows "ethernet-ports" (preferred) and
"ports".

Signed-off-by: Christian Eggers <ceggers@arri.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 17:53:45 -08:00
Kurt Kanzenbach
ed5ef9fb20 net: dsa: hellcreek: Don't print error message on defer
When DSA is not loaded when the driver is probed an error message is
printed. But, that's not really an error, just a defer. Use dev_err_probe()
instead.

Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-23 16:57:21 -08:00
Jakub Kicinski
56495a2442 Merge https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-19 19:08:46 -08:00
Andrew Lunn
a3dcb3e7e7 net: dsa: mv88e6xxx: Wait for EEPROM done after HW reset
When the switch is hardware reset, it reads the contents of the
EEPROM. This can contain instructions for programming values into
registers and to perform waits between such programming. Reading the
EEPROM can take longer than the 100ms mv88e6xxx_hardware_reset() waits
after deasserting the reset GPIO. So poll the EEPROM done bit to
ensure it is complete.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Ruslan Sushko <rus@sushko.dev>
Link: https://lore.kernel.org/r/20201116164301.977661-1-rus@sushko.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-18 11:24:44 -08:00
Martin Blumenstingl
2a1828e378 net: lantiq: Wait for the GPHY firmware to be ready
A user reports (slightly shortened from the original message):
  libphy: lantiq,xrx200-mdio: probed
  mdio_bus 1e108000.switch-mii: MDIO device at address 17 is missing.
  gswip 1e108000.switch lan: no phy at 2
  gswip 1e108000.switch lan: failed to connect to port 2: -19
  lantiq,xrx200-net 1e10b308.eth eth0: error -19 setting up slave phy

This is a single-port board using the internal Fast Ethernet PHY. The
user reports that switching to PHY scanning instead of configuring the
PHY within device-tree works around this issue.

The documentation for the standalone variant of the PHY11G (which is
probably very similar to what is used inside the xRX200 SoCs but having
the firmware burnt onto that standalone chip in the factory) states that
the PHY needs 300ms to be ready for MDIO communication after releasing
the reset.

Add a 300ms delay after initializing all GPHYs to ensure that the GPHY
firmware had enough time to initialize and to appear on the MDIO bus.
Unfortunately there is no (known) documentation on what the minimum time
to wait after releasing the reset on an internal PHY so play safe and
take the one for the external variant. Only wait after the last GPHY
firmware is loaded to not slow down the initialization too much (
xRX200 has two GPHYs but newer SoCs have at least three GPHYs).

Fixes: 14fceff477 ("net: dsa: Add Lantiq / Intel DSA driver for vrx200")
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Acked-by: Hauke Mehrtens <hauke@hauke-m.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20201115165757.552641-1-martin.blumenstingl@googlemail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-16 13:38:18 -08:00
Tobias Waldekranz
92307069a9 net: dsa: mv88e6xxx: Avoid VTU corruption on 6097
As soon as you add the second port to a VLAN, all other port
membership configuration is overwritten with zeroes. The HW interprets
this as all ports being "unmodified members" of the VLAN.

In the simple case when all ports belong to the same VLAN, switching
will still work. But using multiple VLANs or trying to set multiple
ports as tagged members will not work.

On the 6352, doing a VTU GetNext op, followed by an STU GetNext op
will leave you with both the member- and state- data in the VTU/STU
data registers. But on the 6097 (which uses the same implementation),
the STU GetNext will override the information gathered from the VTU
GetNext.

Separate the two stages, parsing the result of the VTU GetNext before
doing the STU GetNext.

We opt to update the existing implementation for all applicable chips,
as opposed to creating a separate callback for 6097, because although
the previous implementation did work for (at least) 6352, the
datasheet does not mention the masking behavior.

Fixes: ef6fcea37f ("net: dsa: mv88e6xxx: get STU entry on VTU GetNext")
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Link: https://lore.kernel.org/r/20201112114335.27371-1-tobias@waldekranz.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-14 11:29:29 -08:00
Jakub Kicinski
e1d9d7b913 Merge https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 16:54:48 -08:00
Tobias Waldekranz
e545f86573 net: dsa: mv88e6xxx: Add helper to get a chip's max_vid
Most of the other chip info constants have helpers to get at them; add
one for max_vid to keep things consistent.

Suggested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20201110185720.18228-1-tobias@waldekranz.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-11 18:04:23 -08:00
zhangxiaoxu
2bae900b94 net: dsa: mv88e6xxx: Fix memleak in mv88e6xxx_region_atu_snapshot
When mv88e6xxx_fid_map return error, we lost free the table.

Fix it.

Fixes: bfb2554289 ("net: dsa: mv88e6xxx: Add devlink regions")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhangxiaoxu <zhangxiaoxu5@huawei.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20201109144416.1540867-1-zhangxiaoxu5@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-10 17:49:06 -08:00
Colin Ian King
2776d2320a net: dsa: fix unintended sign extension on a u16 left shift
The left shift of u16 variable high is promoted to the type int and
then sign extended to a 64 bit u64 value.  If the top bit of high is
set then the upper 32 bits of the result end up being set by the
sign extension. Fix this by explicitly casting the value in high to
a u64 before left shifting by 16 places.

Also, remove the initialisation of variable value to 0 at the start
of each loop iteration as the value is never read and hence the
assignment it is redundant.

Addresses-Coverity: ("Unintended sign extension")
Fixes: e4b27ebc78 ("net: dsa: Add DSA driver for Hirschmann Hellcreek switches")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Link: https://lore.kernel.org/r/20201109124008.2079873-1-colin.king@canonical.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-10 17:46:20 -08:00
Tobias Waldekranz
ca4d632aef net: dsa: mv88e6xxx: Export VTU as devlink region
Export the raw VTU data and related registers in a devlink region so
that it can be inspected from userspace and compared to the current
bridge configuration.

Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20201109082927.8684-1-tobias@waldekranz.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-09 17:43:22 -08:00
Jakub Kicinski
ae0d0bb29b Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-06 17:33:38 -08:00
Kurt Kanzenbach
7d9ee2e8ff net: dsa: hellcreek: Add PTP status LEDs
The switch has two controllable I/Os which are usually connected to LEDs. This
is useful to immediately visually see the PTP status.

These provide two signals:

 * is_gm

   This LED can be activated if the current device is the grand master in that
   PTP domain.

 * sync_good

   This LED can be activated if the current device is in sync with the network
   time.

Expose these via the LED framework to be controlled via user space
e.g. linuxptp.

Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-05 14:04:50 -08:00
Kamil Alkhouri
f0d4ba9eff net: dsa: hellcreek: Add support for hardware timestamping
The switch has the ability to take hardware generated time stamps per port for
PTPv2 event messages in Rx and Tx direction. That is useful for achieving needed
time synchronization precision for TSN devices/switches. So add support for it.

There are two directions:

 * RX

   The switch has a single register per port to capture a timestamp. That
   mechanism is not used due to correlation problems. If the software processing
   is too slow and a PTPv2 event message is received before the previous one has
   been processed, false timestamps will be captured. Therefore, the switch can
   do "inline" timestamping which means it can insert the nanoseconds part of
   the timestamp directly into the PTPv2 event message. The reserved field (4
   bytes) is leveraged for that. This might not be in accordance with (older)
   PTP standards, but is the only way to get reliable results.

 * TX

   In Tx direction there is no correlation problem, because the software and the
   driver has to ensure that only one event message is "on the fly". However,
   the switch provides also a mechanism to check whether a timestamp is
   lost. That can only happen when a timestamp is read and at this point another
   message is timestamped. So, that lost bit is checked just in case to indicate
   to the user that the driver or the software is somewhat buggy.

Signed-off-by: Kamil Alkhouri <kamil.alkhouri@hs-offenburg.de>
Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-05 14:04:49 -08:00
Kamil Alkhouri
ddd56dfe52 net: dsa: hellcreek: Add PTP clock support
The switch has internal PTP hardware clocks. Add support for it. There are three
clocks:

 * Synchronized
 * Syntonized
 * Free running

Currently the synchronized clock is exported to user space which is a good
default for the beginning. The free running clock might be exported later
e.g. for implementing 802.1AS-2011/2020 Time Aware Bridges (TAB). The switch
also supports cross time stamping for that purpose.

The implementation adds support setting/getting the time as well as offset and
frequency adjustments. However, the clock only holds a partial timeofday
timestamp. This is why we track the seconds completely in software (see overflow
work and last_ts).

Furthermore, add the PTP multicast addresses into the FDB to forward that
packages only to the CPU port where they are processed by a PTP program.

Signed-off-by: Kamil Alkhouri <kamil.alkhouri@hs-offenburg.de>
Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-05 14:04:49 -08:00
Kurt Kanzenbach
e4b27ebc78 net: dsa: Add DSA driver for Hirschmann Hellcreek switches
Add a basic DSA driver for Hirschmann Hellcreek switches. Those switches are
implementing features needed for Time Sensitive Networking (TSN) such as support
for the Time Precision Protocol and various shapers like the Time Aware Shaper.

This driver includes basic support for networking:

 * VLAN handling
 * FDB handling
 * Port statistics
 * STP
 * Phylink

Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-05 14:04:49 -08:00
DENG Qingfang
9470174e75 net: dsa: mt7530: support setting MTU
MT7530/7531 has a global RX packet length register, which can be used
to set MTU.

Supported packet length values are 1522 (1518 if untagged), 1536,
1552, and multiple of 1024 (from 2048 to 15360).

Signed-off-by: DENG Qingfang <dqfext@gmail.com>
Link: https://lore.kernel.org/r/20201103050618.11419-1-dqfext@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-04 16:48:11 -08:00
Tom Rix
0e8c266c59 net: dsa: mt7530: remove unneeded semicolon
A semicolon is not needed after a switch statement.

Signed-off-by: Tom Rix <trix@redhat.com>
Link: https://lore.kernel.org/r/20201031153047.2147341-1-trix@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-02 17:51:28 -08:00
Vladimir Oltean
9a72068080 net: dsa: felix: improve the workaround for multiple native VLANs on NPI port
After the good discussion with Florian from here:
https://lore.kernel.org/netdev/20200911000337.htwr366ng3nc3a7d@skbuf/

I realized that the VLAN settings on the NPI port (the hardware "CPU port",
in DSA parlance) don't actually make any difference, because that port
is hardcoded in hardware to use what mv88e6xxx would call "unmodified"
egress policy for VLANs.

So earlier patch 183be6f967 ("net: dsa: felix: send VLANs on CPU port
as egress-tagged") was incorrect in the sense that it didn't actually
make the VLANs be sent on the NPI port as egress-tagged. It only made
ocelot_port_set_native_vlan shut up.

Now that we have moved the check from ocelot_port_set_native_vlan to
ocelot_vlan_prepare, we can simply shunt ocelot_vlan_prepare from DSA,
and avoid calling it. This is the correct way to deal with things,
because the NPI port configuration is DSA-specific, so the ocelot switch
library should not have the check for multiple native VLANs refined in
any way, it is correct the way it is.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-02 17:09:07 -08:00
Vladimir Oltean
2f0402fedf net: mscc: ocelot: deny changing the native VLAN from the prepare phase
Put the preparation phase of switchdev VLAN objects to some good use,
and move the check we already had, for preventing the existence of more
than one egress-untagged VLAN per port, to the preparation phase of the
addition.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-02 17:09:07 -08:00
Jonathan McDowell
99cab7107d net: dsa: qca8k: Fix port MTU setting
The qca8k only supports a switch-wide MTU setting, and the code to take
the max of all ports was only looking at the port currently being set.
Fix to examine all ports.

Reported-by: DENG Qingfang <dqfext@gmail.com>
Fixes: f58d2598cf ("net: dsa: qca8k: implement the port MTU callbacks")
Signed-off-by: Jonathan McDowell <noodles@earth.li>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20201030183315.GA6736@earth.li
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-02 15:14:59 -08:00
Russell King
1fb7419198 net: dsa: mv88e6xxx: fix vlan setup
DSA assumes that a bridge which has vlan filtering disabled is not
vlan aware, and ignores all vlan configuration. However, the kernel
software bridge code allows configuration in this state.

This causes the kernel's idea of the bridge vlan state and the
hardware state to disagree, so "bridge vlan show" indicates a correct
configuration but the hardware lacks all configuration. Even worse,
enabling vlan filtering on a DSA bridge immediately blocks all traffic
which, given the output of "bridge vlan show", is very confusing.

Allow the VLAN configuration to be updated on Marvell DSA bridges,
otherwise we end up cutting all traffic when enabling vlan filtering.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/E1kYAU3-00071C-1G@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-30 14:31:00 -07:00
Colin Ian King
d978d6d008 net: dsa: bcm_sf2: make const array static, makes object smaller
Don't populate the const array rate_table on the stack but instead it
static. Makes the object code smaller by 46 bytes.

Before:
   text	   data	    bss	    dec	    hex	filename
  29812	   3824	    192	  33828	   8424	drivers/net/dsa/bcm_sf2.o

After:
   text	   data	    bss	    dec	    hex	filename
  29670	   3920	    192	  33782	   83f6	drivers/net/dsa/bcm_sf2.o

(gcc version 10.2.0)

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20201020165029.56383-1-colin.king@canonical.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-20 20:57:57 -07:00
Maxim Kochetkov
a15a6afb3b net: dsa: seville: the packet buffer is 2 megabits, not megabytes
The VSC9953 Seville switch has 2 megabits of buffer split into 4360
words of 60 bytes each. 2048 * 1024 is 2 megabytes instead of 2 megabits.
2 megabits is (2048 / 8) * 1024 = 256 * 1024.

Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Fixes: a63ed92d21 ("net: dsa: seville: fix buffer size of the queue system")
Link: https://lore.kernel.org/r/20201019050625.21533-1-fido_max@inbox.ru
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-19 18:03:42 -07:00
Jakub Kicinski
2295cddf99 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Minor conflicts in net/mptcp/protocol.h and
tools/testing/selftests/net/Makefile.

In both cases code was added on both sides in the same place
so just keep both.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-15 12:43:21 -07:00
Christian Eggers
8098bd69bc net: dsa: microchip: fix race condition
Between queuing the delayed work and finishing the setup of the dsa
ports, the process may sleep in request_module() (via
phy_device_create()) and the queued work may be executed prior to the
switch net devices being registered. In ksz_mib_read_work(), a NULL
dereference will happen within netof_carrier_ok(dp->slave).

Not queuing the delayed work in ksz_init_mib_timer() makes things even
worse because the work will now be queued for immediate execution
(instead of 2000 ms) in ksz_mac_link_down() via
dsa_port_link_register_of().

Call tree:
ksz9477_i2c_probe()
\--ksz9477_switch_register()
   \--ksz_switch_register()
      +--dsa_register_switch()
      |  \--dsa_switch_probe()
      |     \--dsa_tree_setup()
      |        \--dsa_tree_setup_switches()
      |           +--dsa_switch_setup()
      |           |  +--ksz9477_setup()
      |           |  |  \--ksz_init_mib_timer()
      |           |  |     |--/* Start the timer 2 seconds later. */
      |           |  |     \--schedule_delayed_work(&dev->mib_read, msecs_to_jiffies(2000));
      |           |  \--__mdiobus_register()
      |           |     \--mdiobus_scan()
      |           |        \--get_phy_device()
      |           |           +--get_phy_id()
      |           |           \--phy_device_create()
      |           |              |--/* sleeping, ksz_mib_read_work() can be called meanwhile */
      |           |              \--request_module()
      |           |
      |           \--dsa_port_setup()
      |              +--/* Called for non-CPU ports */
      |              +--dsa_slave_create()
      |              |  +--/* Too late, ksz_mib_read_work() may be called beforehand */
      |              |  \--port->slave = ...
      |             ...
      |              +--Called for CPU port */
      |              \--dsa_port_link_register_of()
      |                 \--ksz_mac_link_down()
      |                    +--/* mib_read must be initialized here */
      |                    +--/* work is already scheduled, so it will be executed after 2000 ms */
      |                    \--schedule_delayed_work(&dev->mib_read, 0);
      \-- /* here port->slave is setup properly, scheduling the delayed work should be safe */

Solution:
1. Do not queue (only initialize) delayed work in ksz_init_mib_timer().
2. Only queue delayed work in ksz_mac_link_down() if init is completed.
3. Queue work once in ksz_switch_register(), after dsa_register_switch()
has completed.

Fixes: 7c6ff470aa ("net: dsa: microchip: add MIB counter reading support")
Signed-off-by: Christian Eggers <ceggers@arri.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-12 10:00:24 -07:00
Linus Walleij
e0b2e0d8e6 net: dsa: rtl8366rb: Roof MTU for switch
The MTU setting for this DSA switch is global so we need
to keep track of the MTU set for each port, then as soon
as any MTU changes, roof the MTU to the biggest common
denominator and poke that into the switch MTU setting.

To achieve this we need a per-chip-variant state container
for the RTL8366RB to use for the RTL8366RB-specific
stuff. Other SMI switches does seem to have per-port
MTU setting capabilities.

Fixes: 5f4a8ef384 ("net: dsa: rtl8366rb: Support setting MTU")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-10 11:25:05 -07:00
Christian Eggers
5d3b8ec99a net: dsa: microchip: add ksz9563 to ksz9477 I2C driver
Add support for the KSZ9563 3-Port Gigabit Ethernet Switch to the
ksz9477 driver. The KSZ9563 supports both SPI (already in) and I2C. The
ksz9563 is already in the device tree binding documentation.

Signed-off-by: Christian Eggers <ceggers@arri.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-09 13:05:56 -07:00
Jakub Kicinski
9d49aea13f Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Small conflict around locking in rxrpc_process_event() -
channel_lock moved to bundle in next, while state lock
needs _bh() from net.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-08 15:44:50 -07:00
Vladimir Oltean
0132649366 net: mscc: ocelot: warn when encoding an out-of-bounds watermark value
There is an upper bound to the value that a watermark may hold. That
upper bound is not immediately obvious during configuration, and it
might be possible to have accidental truncation.

Actually this has happened already, add a warning to prevent it from
happening again.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-06 06:05:47 -07:00
David S. Miller
8b0308fe31 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Rejecting non-native endian BTF overlapped with the addition
of support for it.

The rest were more simple overlapping changes, except the
renesas ravb binding update, which had to follow a file
move as well as a YAML conversion.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-05 18:40:01 -07:00
Vladimir Oltean
2e554a7a5d net: dsa: propagate switchdev vlan_filtering prepare phase to drivers
A driver may refuse to enable VLAN filtering for any reason beyond what
the DSA framework cares about, such as:
- having tc-flower rules that rely on the switch being VLAN-aware
- the particular switch does not support VLAN, even if the driver does
  (the DSA framework just checks for the presence of the .port_vlan_add
  and .port_vlan_del pointers)
- simply not supporting this configuration to be toggled at runtime

Currently, when a driver rejects a configuration it cannot support, it
does this from the commit phase, which triggers various warnings in
switchdev.

So propagate the prepare phase to drivers, to give them the ability to
refuse invalid configurations cleanly and avoid the warnings.

Since we need to modify all function prototypes and check for the
prepare phase from within the drivers, take that opportunity and move
the existing driver restrictions within the prepare phase where that is
possible and easy.

Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Cc: Hauke Mehrtens <hauke@hauke-m.de>
Cc: Woojung Huh <woojung.huh@microchip.com>
Cc: Microchip Linux Driver Support <UNGLinuxDriver@microchip.com>
Cc: Sean Wang <sean.wang@mediatek.com>
Cc: Landen Chao <Landen.Chao@mediatek.com>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Vivien Didelot <vivien.didelot@gmail.com>
Cc: Jonathan McDowell <noodles@earth.li>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
Cc: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-05 05:56:48 -07:00
Andrew Lunn
b71a8d6025 net: dsa: mv88e6xxx: Add per port devlink regions
Add a devlink region to return the per port registers.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-04 14:38:53 -07:00
Vladimir Oltean
2b7fea0d20 net: dsa: sja1105: remove duplicate prefix for VL Lookup dynamic config
This is a strictly cosmetic change that renames some macros in
sja1105_dynamic_config.c. They were copy-pasted in haste and this has
resulted in them having the driver prefix twice.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-03 17:34:42 -07:00
Xiaoliang Yang
75944fda1d net: mscc: ocelot: offload ingress skbedit and vlan actions to VCAP IS1
VCAP IS1 is a VCAP module which can filter on the most common L2/L3/L4
Ethernet keys, and modify the results of the basic QoS classification
and VLAN classification based on those flow keys.

There are 3 VCAP IS1 lookups, mapped over chains 10000, 11000 and 12000.
Currently the driver is hardcoded to use IS1_ACTION_TYPE_NORMAL half
keys.

Note that the VLAN_MANGLE has been omitted for now. In hardware, the
VCAP_IS1_ACT_VID_REPLACE_ENA field replaces the classified VLAN
(metadata associated with the frame) and not the VLAN from the header
itself. There are currently some issues which need to be addressed when
operating in standalone, or in bridge with vlan_filtering=0 modes,
because in those cases the switch ports have VLAN awareness disabled,
and changing the classified VLAN to anything other than the pvid causes
the packets to be dropped. Another issue is that on egress, we expect
port tagging to push the classified VLAN, but port tagging is disabled
in the modes mentioned above, so although the classified VLAN is
replaced, it is not visible in the packet transmitted by the switch.

Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-02 15:40:30 -07:00
Vladimir Oltean
319e4dd11a net: mscc: ocelot: introduce conversion helpers between port and netdev
Since the mscc_ocelot_switch_lib is common between a pure switchdev and
a DSA driver, the procedure of retrieving a net_device for a certain
port index differs, as those are registered by their individual
front-ends.

Up to now that has been dealt with by always passing the port index to
the switch library, but now, we're going to need to work with net_device
pointers from the tc-flower offload, for things like indev, or mirred.
It is not desirable to refactor that, so let's make sure that the flower
offload core has the ability to translate between a net_device and a
port index properly.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-02 15:40:30 -07:00
Florian Fainelli
1c5ad5a940 net: dsa: b53: Set untag_bridge_pvid
Indicate to the DSA receive path that we need to untage the bridge PVID,
this allows us to remove the dsa_untag_bridge_pvid() calls from
net/dsa/tag_brcm.c.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-02 13:36:07 -07:00
Vladimir Oltean
d732e9cef0 net: mscc: ocelot: remove unneeded VCAP parameters for IS2
Now that we are deriving these from the constants exposed by the
hardware, we can delete the static info we're keeping in the driver.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-29 18:26:24 -07:00
Vladimir Oltean
2096805497 net: mscc: ocelot: automatically detect VCAP constants
The numbers in struct vcap_props are not intuitive to derive, because
they are not a straightforward copy-and-paste from the reference manual
but instead rely on a fairly detailed level of understanding of the
layout of an entry in the TCAM and in the action RAM. For this reason,
bugs are very easy to introduce here.

Ease the work of hardware porters and read from hardware the constants
that were exported for this particular purpose. Note that this implies
that struct vcap_props can no longer be const.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-29 18:26:24 -07:00
Vladimir Oltean
e3aea296d8 net: mscc: ocelot: add definitions for VCAP ES0 keys, actions and target
As a preparation step for the offloading to ES0, let's create the
infrastructure for talking with this hardware block.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-29 18:26:12 -07:00
Vladimir Oltean
a61e365d7c net: mscc: ocelot: add definitions for VCAP IS1 keys, actions and target
As a preparation step for the offloading to IS1, let's create the
infrastructure for talking with this hardware block.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-29 18:26:12 -07:00
Vladimir Oltean
c1c3993edb net: mscc: ocelot: generalize existing code for VCAP
In the Ocelot switches there are 3 TCAMs: VCAP ES0, IS1 and IS2, which
have the same configuration interface, but different sets of keys and
actions. The driver currently only supports VCAP IS2.

In preparation of VCAP IS1 and ES0 support, the existing code must be
generalized to work with any VCAP.

In that direction, we should move the structures that depend upon VCAP
instantiation, like vcap_is2_keys and vcap_is2_actions, out of struct
ocelot and into struct vcap_props .keys and .actions, a structure that
is replicated 3 times, once per VCAP. We'll pass that structure as an
argument to each function that does the key and action packing - only
the control logic needs to distinguish between ocelot->vcap[VCAP_IS2]
or IS1 or ES0.

Another change is to make use of the newly introduced ocelot_target_read
and ocelot_target_write API, since the 3 VCAPs have the same registers
but put at different addresses.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-29 18:26:12 -07:00
Vladimir Oltean
eaa0355c66 net: dsa: seville: fix VCAP IS2 action width
Since the actions are packed together in the action RAM, an incorrect
action width means that no action except the first one would behave
correctly.

The tc-flower offload has probably not been tested on this hardware
since its introduction.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-29 13:24:17 -07:00
Vladimir Oltean
460e985ea0 net: dsa: felix: fix incorrect action offsets for VCAP IS2
The port mask width was larger than the actual number of ports, and
therefore, all fields following this one were also shifted by the number
of excess bits. But the driver doesn't use the REW_OP, SMAC_REPLACE_ENA
or ACL_ID bits from the action vector, so the bug was inconsequential.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-29 13:24:17 -07:00
Vladimir Oltean
5124197ce5 net: dsa: tag_ocelot: use a short prefix on both ingress and egress
There are 2 goals that we follow:

- Reduce the header size
- Make the header size equal between RX and TX

The issue that required long prefix on RX was the fact that the ocelot
DSA tag, being put before Ethernet as it is, would overlap with the area
that a DSA master uses for RX filtering (destination MAC address
mainly).

Now that we can ask DSA to put the master in promiscuous mode, in theory
we could remove the prefix altogether and call it a day, but it looks
like we can't. Using no prefix on ingress, some packets (such as ICMP)
would be received, while others (such as PTP) would not be received.
This is because the DSA master we use (enetc) triggers parse errors
("MAC rx frame errors") presumably because it sees Ethernet frames with
a bad length. And indeed, when using no prefix, the EtherType (bytes
12-13 of the frame, bits 96-111) falls over the REW_VAL field from the
extraction header, aka the PTP timestamp.

When turning the short (32-bit) prefix on, the EtherType overlaps with
bits 64-79 of the extraction header, which are a reserved area
transmitted as zero by the switch. The packets are not dropped by the
DSA master with a short prefix. Actually, the frames look like this in
tcpdump (below is a PTP frame, with an extra dsa_8021q tag - dadb 0482 -
added by a downstream sja1105).

89:0c:a9:f2:01:00 > 88:80:00:0a:00:1d, 802.3, length 0: LLC, \
	dsap Unknown (0x10) Individual, ssap ProWay NM (0x0e) Response, \
	ctrl 0x0004: Information, send seq 2, rcv seq 0, \
	Flags [Response], length 78

0x0000:  8880 000a 001d 890c a9f2 0100 0000 100f  ................
0x0010:  0400 0000 0180 c200 000e 001f 7b63 0248  ............{c.H
0x0020:  dadb 0482 88f7 1202 0036 0000 0000 0000  .........6......
0x0030:  0000 0000 0000 0000 0000 001f 7bff fe63  ............{..c
0x0040:  0248 0001 1f81 0500 0000 0000 0000 0000  .H..............
0x0050:  0000 0000 0000 0000 0000 0000            ............

So the short prefix is our new default: we've shortened our RX frames by
12 octets, increased TX by 4, and headers are now equal between RX and
TX. Note that we still need promiscuous mode for the DSA master to not
drop it.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-26 14:17:58 -07:00
Vladimir Oltean
2d44b097bb net: mscc: ocelot: move NPI port configuration to DSA
Remove the ocelot_configure_cpu() function, which was in fact bringing
up 2 ports: the CPU port module, which both switchdev and DSA have, and
the NPI port, which only DSA has.

The (non-Ethernet) CPU port module is at a fixed index in the analyzer,
whereas the NPI port is selected through the "ethernet" property in the
device tree.

Therefore, the function to set up an NPI port is DSA-specific, so we
move it there, simplifying the ocelot switch library a little bit.

Cc: Horatiu Vultur <horatiu.vultur@microchip.com>
Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
Cc: UNGLinuxDriver <UNGLinuxDriver@microchip.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-26 14:17:58 -07:00
Vladimir Oltean
ff4cf8eae0 net: dsa: sja1105: implement .devlink_info_get
Return the driver name and ASIC ID so that generic user space
application are able to know they're looking at sja1105 devlink regions
when pretty-printing them.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-25 16:35:27 -07:00
Vladimir Oltean
bf425b8205 net: dsa: sja1105: expose static config as devlink region
As explained in Documentation/networking/dsa/sja1105.rst, this switch
has a static config held in the driver's memory and re-uploaded from
time to time into the device (after any major change).

The format of this static config is in fact described in UM10944.pdf and
it contains all the switch's settings (it also contains device ID, table
CRCs, etc, just like in the manual). So it is a useful and universal
devlink region to expose to user space, for debugging purposes.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-25 16:35:27 -07:00
Vladimir Oltean
0a7bdbc23d net: dsa: sja1105: move devlink param code to sja1105_devlink.c
We'll have more devlink code soon. Group it together in a separate
translation object.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-25 16:35:27 -07:00
Helmut Grohne
912aae27c6 net: dsa: microchip: really look for phy-mode in port nodes
The previous implementation failed to account for the "ports" node. The
actual port nodes are not child nodes of the switch node, but a "ports"
node sits in between.

Fixes: edecfa98f6 ("net: dsa: microchip: look for phy-mode in port nodes")
Signed-off-by: Helmut Grohne <helmut.grohne@intenta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-24 20:08:48 -07:00
Xiaoliang Yang
dba1e4660a net: dsa: felix: convert TAS link speed based on phylink speed
state->speed holds a value of 10, 100, 1000 or 2500, but
QSYS_TAG_CONFIG_LINK_SPEED expects a value of 0, 1, 2, 3. So convert the
speed to a proper value.

Fixes: de143c0e27 ("net: dsa: felix: Configure Time-Aware Scheduler via taprio offload")
Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-24 20:00:40 -07:00
Vladimir Oltean
e2f9a8fe73 net: mscc: ocelot: always pass skb clone to ocelot_port_add_txtstamp_skb
Currently, ocelot switchdev passes the skb directly to the function that
enqueues it to the list of skb's awaiting a TX timestamp. Whereas the
felix DSA driver first clones the skb, then passes the clone to this
queue.

This matters because in the case of felix, the common IRQ handler, which
is ocelot_get_txtstamp(), currently clones the clone, and frees the
original clone. This is useless and can be simplified by using
skb_complete_tx_timestamp() instead of skb_tstamp_tx().

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-24 19:47:56 -07:00
Florian Fainelli
ed409f3bba net: dsa: b53: Configure VLANs while not filtering
Update the B53 driver to support VLANs while not filtering. This
requires us to enable VLAN globally within the switch upon driver
initial configuration (dev->vlan_enabled).

We also need to remove the code that dealt with PVID re-configuration in
b53_vlan_filtering() since that function worked under the assumption
that it would only be called to make a bridge VLAN filtering, or not
filtering, and we would attempt to move the port's PVID accordingly.

Now that VLANs are programmed all the time, even in the case of a
non-VLAN filtering bridge, we would be programming a default_pvid for
the bridged switch ports.

We need the DSA receive path to pop the VLAN tag if it is the bridge's
default_pvid because the CPU port is always programmed tagged in the
programmed VLANs. In order to do so we utilize the
dsa_untag_bridge_pvid() helper introduced in the commit before within
net/dsa/tag_brcm.c.

Acked-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23 18:13:45 -07:00
Florian Fainelli
0fa45ee3c1 net: dsa: bcm_sf2: Include address 0 for MDIO diversion
We need to include MDIO address 0, which is how our Device Tree blobs
indicate where to find the external BCM53125 switches.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23 17:51:16 -07:00
Florian Fainelli
8c28044097 net: dsa: bcm_sf2: Disallow port 5 to be a DSA CPU port
While the switch driver is written such that port 5 or 8 could be CPU
ports, the use case on Broadcom STB chips is to use port 8 exclusively.
The platform firmware does make port 5 comply to a proper DSA CPU port
binding by specifiying an "ethernet" phandle. This is undesirable for
now until we have an user-space configuration mechanism (such as
devlink) which could support dynamically changing the port flavor at
run time.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23 17:51:15 -07:00
David S. Miller
3ab0a7a0c3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Two minor conflicts:

1) net/ipv4/route.c, adding a new local variable while
   moving another local variable and removing it's
   initial assignment.

2) drivers/net/dsa/microchip/ksz9477.c, overlapping changes.
   One pretty prints the port mode differently, whilst another
   changes the driver to try and obtain the port mode from
   the port node rather than the switch node.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-22 16:45:34 -07:00
Vladimir Oltean
7a0230759e net: dsa: seville: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries
Since these were copied from the Felix VCAP IS2 code, and only the
offsets were adjusted, the order of the bit fields is still wrong.
Fix it.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-21 17:40:52 -07:00
Xiaoliang Yang
8b9e03cd08 net: dsa: felix: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries
Some of the IS2 IP4_TCP_UDP keys are not correct, like L4_DPORT,
L4_SPORT and other L4 keys. This prevents offloaded tc-flower rules from
matching on src_port and dst_port for TCP and UDP packets.

Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-21 17:40:52 -07:00
Linus Walleij
a7920efdd8 net: dsa: rtl8366rb: Support all 4096 VLANs
There is an off-by-one error in rtl8366rb_is_vlan_valid()
making VLANs 0..4094 valid while it should be 1..4095.
Fix it.

Fixes: d8652956cf ("net: dsa: realtek-smi: Add Realtek SMI driver")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-21 14:49:14 -07:00
Alex Dewar
0ce0c3cd22 net: dsa: mt7530: Add some return-value checks
In mt7531_cpu_port_config(), if the variable port is neither 5 nor 6,
then variable interface will be used uninitialised. Change the function
to return -EINVAL in this case.

As the return value of mt7531_cpu_port_config() is never checked
(even though it returns an int) add a check in the correct place so that
the error can be passed up the call stack. Now that we correctly handle
errors thrown in this function, also check the return value of
mt7531_mac_config() in case an error occurs here. Also add misisng
checks to mt7530_setup() and mt7531_setup(), which are another level
further up the call stack.

Fixes: c288575f78 ("net: dsa: mt7530: Add the support of MT7531 switch")
Addresses-Coverity: 1496993 ("Uninitialized variables")
Signed-off-by: Alex Dewar <alex.dewar90@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-21 14:43:07 -07:00
Vladimir Oltean
bbed0bbddd net: dsa: tag_8021q: add VLANs to the master interface too
The whole purpose of tag_8021q is to send VLAN-tagged traffic to the
CPU, from which the driver can decode the source port and switch id.

Currently this only works if the VLAN filtering on the master is
disabled. Change that by explicitly adding code to tag_8021q.c to add
the VLANs corresponding to the tags to the filter of the master
interface.

Because we now need to call vlan_vid_add, then we also need to hold the
RTNL mutex. Propagate that requirement to the callers of dsa_8021q_setup
and modify the existing call sites as appropriate. Note that one call
path, sja1105_best_effort_vlan_filtering_set -> sja1105_vlan_filtering
-> sja1105_setup_8021q_tagging, was already holding this lock.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-20 19:01:34 -07:00
Linus Walleij
3dfe8dde09 net: dsa: rtl8366: Skip PVID setting if not requested
We go to lengths to determine whether the PVID should be set
for this port or not, and then fail to take it into account.
Fix this oversight.

Fixes: d8652956cf ("net: dsa: realtek-smi: Add Realtek SMI driver")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-20 14:10:58 -07:00
Andrew Lunn
93157307f7 net: dsa: mv88e6xxx: Implement devlink info get callback
Return the driver name and the asic.id with the switch name.

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-18 18:18:30 -07:00
Andrew Lunn
bfb2554289 net: dsa: mv88e6xxx: Add devlink regions
Allow the global registers, and the ATU to be snapshot via devlink
regions. It is later planned to add support for the port registers.

v2:
Remove left over debug prints
Comment ATU format is generic for mv88e6xxx, not wider

v3:
Make use of ops structure passed to snapshot function
Remove port regions

v4:
Make use of enum mv88e6xxx_region_id
Fix global2/global1 read typ0

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-18 18:18:30 -07:00
Andrew Lunn
90b6dbdf41 net: dsa: mv88e6xxx: Create helper for FIDs in use
Refactor the code in mv88e6xxx_atu_new() which builds a bitmaps of
FIDs in use into a helper function. This will be reused by the devlink
code when dumping the ATU.

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-18 18:17:45 -07:00
Andrew Lunn
9dd43aa211 net: dsa: mv88e6xxx: Move devlink code into its own file
There will soon be more devlink code. Move the existing code into a
file of its own, before we start adding this new code.

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-18 18:17:45 -07:00
Vladimir Oltean
d60bc62de4 net: dsa: seville: build as separate module
Seville does not need to depend on PCI or on the ENETC MDIO controller.
There will also be other compile-time differences in the future.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-18 17:52:51 -07:00
Vladimir Oltean
2ac7c6c5b6 net: dsa: felix: move the PTP clock structure to felix_vsc9959.c
Not only does Sevile not have a PTP clock, but with separate modules,
this structure cannot even live in felix.c, due to the .owner =
THIS_MODULE assignment causing this link time error:

drivers/net/dsa/ocelot/felix.o:(.data+0x0): undefined reference to `__this_module'

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-18 17:52:51 -07:00
Vladimir Oltean
ccfdbab568 net: dsa: seville: duplicate vsc9959_mdio_bus_free
While we don't plan on making any changes to this function, currently
this is the only remaining dependency between felix and seville, after
the PCS has been refactored out into pcs-lynx.c.

Duplicate this function in seville to break the dependency completely.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-18 17:52:40 -07:00
Vladimir Oltean
f8320ec14d net: dsa: felix: replace tabs with spaces
Over the time, some patches have introduced structures aligned with
spaces, near structures aligned with tabs. Fix the inconsistencies.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-18 17:52:40 -07:00
Vladimir Oltean
123d231a16 net: dsa: seville: reindent defines for MDIO controller
Reindent these definitions to be in line with the rest of the driver.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-18 17:52:40 -07:00
Vladimir Oltean
9ef9e0d282 net: dsa: seville: remove unused defines for the mdio controller
Some definitions were likely copied from
drivers/net/mdio/mdio-mscc-miim.c.

They are not necessary, remove them.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-18 17:52:40 -07:00
Vladimir Oltean
c129fc55fe net: dsa: ocelot: document why reset procedure is different for felix/seville
The overall idea (issue soft reset, enable memories, initialize
memories, enable core) is the same, so it would make sense that an
attempt is made to unify the procedures.

It is not immediately obvious that the fields are not part of the same
register targets, though. So add a comment.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-18 17:52:40 -07:00
Vladimir Oltean
9a73f0b580 net: dsa: seville: first enable memories, then initialize them
As per documentation, proper startup sequence is:
* Enable memories
* Initialize memories
* Enable core

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-18 17:52:40 -07:00
Vladimir Oltean
6b6d804f08 net: dsa: seville: don't write to MEM_ENA twice
There is another one of these right above the readx_poll_status.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-18 17:52:40 -07:00
Vladimir Oltean
75cea9cb94 net: dsa: felix: use ocelot_field_{read,write} helpers consistently
Since these helpers for regmap fields are available, use them.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-18 17:52:40 -07:00
Vladimir Oltean
e5fb512d81 net: mscc: ocelot: deinitialize only initialized ports
Currently mscc_ocelot_init_ports() will skip initializing a port when it
doesn't have a phy-handle, so the ocelot->ports[port] pointer will be
NULL. Take this into consideration when tearing down the driver, and add
a new function ocelot_deinit_port() to the switch library, mirror of
ocelot_init_port(), which needs to be called by the driver for all ports
it has initialized.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Reviewed-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-18 13:52:34 -07:00
Vladimir Oltean
d1cc0e9320 net: mscc: ocelot: error checking when calling ocelot_init()
ocelot_init() allocates memory, resets the switch and polls for a status
register, things which can fail. Stop probing the driver in that case,
and propagate the error result.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Reviewed-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-18 13:52:34 -07:00
Vladimir Oltean
a63ed92d21 net: dsa: seville: fix buffer size of the queue system
The VSC9953 Seville switch has 2 megabits of buffer split into 4360
words of 60 bytes each.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Reviewed-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-18 13:52:33 -07:00
Matthias Schiffer
fd944dc243 net: dsa: microchip: ksz8795: really set the correct number of ports
The KSZ9477 and KSZ8795 use the port_cnt field differently: For the
KSZ9477, it includes the CPU port(s), while for the KSZ8795, it doesn't.

It would be a good cleanup to make the handling of both drivers match,
but as a first step, fix the recently broken assignment of num_ports in
the KSZ8795 driver (which completely broke probing, as the CPU port
index was always failing the num_ports check).

Fixes: af199a1a9c ("net: dsa: microchip: set the correct number of ports")
Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com>
Reviewed-by: Codrin Ciubotariu <codrin.ciubotariu@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-16 17:46:09 -07:00
Landen Chao
c288575f78 net: dsa: mt7530: Add the support of MT7531 switch
Add new support for MT7531:

MT7531 is the next generation of MT7530. It is also a 7-ports switch with
5 giga embedded phys, 2 cpu ports, and the same MAC logic of MT7530. Cpu
port 6 only supports SGMII interface. Cpu port 5 supports either RGMII
or SGMII in different HW sku, but cannot be muxed to PHY of port 0/4 like
mt7530. Due to SGMII interface support, pll, and pad setting are different
from MT7530. This patch adds different initial setting, and SGMII phylink
handlers of MT7531.

MT7531 SGMII interface can be configured in following mode:
- 'SGMII AN mode' with in-band negotiation capability
    which is compatible with PHY_INTERFACE_MODE_SGMII.
- 'SGMII force mode' without in-band negotiation
    which is compatible with 10B/8B encoding of
    PHY_INTERFACE_MODE_1000BASEX with fixed full-duplex and fixed pause.
- 2.5 times faster clocked 'SGMII force mode' without in-band negotiation
    which is compatible with 10B/8B encoding of
    PHY_INTERFACE_MODE_2500BASEX with fixed full-duplex and fixed pause.

Signed-off-by: Landen Chao <landen.chao@mediatek.com>
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-14 16:30:39 -07:00
Landen Chao
88bdef8be9 net: dsa: mt7530: Extend device data ready for adding a new hardware
Add a structure holding required operations for each device such as device
initialization, PHY port read or write, a checker whether PHY interface is
supported on a certain port, MAC port setup for either bus pad or a
specific PHY interface.

The patch is done for ready adding a new hardware MT7531, and keep the
same setup logic of existing hardware.

Signed-off-by: Landen Chao <landen.chao@mediatek.com>
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-14 16:30:38 -07:00
Landen Chao
dc8ef938c9 net: dsa: mt7530: Refine message in Kconfig
Refine message in Kconfig with fixing typo and an explicit MT7621 support.

Signed-off-by: Landen Chao <landen.chao@mediatek.com>
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-14 16:30:38 -07:00
Vladimir Oltean
5899ee367a net: dsa: tag_8021q: add a context structure
While working on another tag_8021q driver implementation, some things
became apparent:

- It is not mandatory for a DSA driver to offload the tag_8021q VLANs by
  using the VLAN table per se. For example, it can add custom TCAM rules
  that simply encapsulate RX traffic, and redirect & decapsulate rules
  for TX traffic. For such a driver, it makes no sense to receive the
  tag_8021q configuration through the same callback as it receives the
  VLAN configuration from the bridge and the 8021q modules.

- Currently, sja1105 (the only tag_8021q user) sets a
  priv->expect_dsa_8021q variable to distinguish between the bridge
  calling, and tag_8021q calling. That can be improved, to say the
  least.

- The crosschip bridging operations are, in fact, stateful already. The
  list of crosschip_links must be kept by the caller and passed to the
  relevant tag_8021q functions.

So it would be nice if the tag_8021q configuration was more
self-contained. This patch attempts to do that.

Create a struct dsa_8021q_context which encapsulates a struct
dsa_switch, and has 2 function pointers for adding and deleting a VLAN.
These will replace the previous channel to the driver, which was through
the .port_vlan_add and .port_vlan_del callbacks of dsa_switch_ops.

Also put the list of crosschip_links into this dsa_8021q_context.
Drivers that don't support cross-chip bridging can simply omit to
initialize this list, as long as they dont call any cross-chip function.

The sja1105_vlan_add and sja1105_vlan_del functions are refactored into
a smaller sja1105_vlan_add_one, which now has 2 entry points:
- sja1105_vlan_add, from struct dsa_switch_ops
- sja1105_dsa_8021q_vlan_add, from the tag_8021q ops
But even this change is fairly trivial. It just reflects the fact that
for sja1105, the VLANs from these 2 channels end up in the same hardware
table. However that is not necessarily true in the general sense (and
that's the reason for making this change).

The rest of the patch is mostly plain refactoring of "ds" -> "ctx". The
dsa_8021q_context structure needs to be propagated because adding a VLAN
is now done through the ops function pointers inside of it.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-11 17:30:43 -07:00
Vladimir Oltean
7e092af2f3 net: dsa: tag_8021q: setup tagging via a single function call
There is no point in calling dsa_port_setup_8021q_tagging for each
individual port. Additionally, it will become more difficult to do that
when we'll have a context structure to tag_8021q (next patch). So
refactor this now.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-11 17:30:43 -07:00
Helmut Grohne
edecfa98f6 net: dsa: microchip: look for phy-mode in port nodes
Documentation/devicetree/bindings/net/dsa/dsa.txt says that the phy-mode
property should be specified on port nodes. However, the microchip
drivers read it from the switch node.

Let the driver use the per-port property and fall back to the old
location with a warning.

Fix in-tree users.

Signed-off-by: Helmut Grohne <helmut.grohne@intenta.de>
Link: https://lore.kernel.org/netdev/20200617082235.GA1523@laureti-dev/
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-10 12:32:37 -07:00
Florian Fainelli
4f6a5caf18 net: dsa: b53: Report VLAN table occupancy via devlink
We already maintain an array of VLANs used by the switch so we can
simply iterate over it to report the occupancy via devlink.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-09 14:29:02 -07:00
Paul Barker
5b79798090 net: dsa: microchip: Implement recommended reset timing
The datasheet for the ksz9893 and ksz9477 switches recommend waiting at
least 100us after the de-assertion of reset before trying to program the
device through any interface.

Also switch the existing msleep() call to usleep_range() as recommended
in Documentation/timers/timers-howto.rst. The 2ms range used here is
somewhat arbitrary, as long as the reset is asserted for at least 10ms
we should be ok.

Signed-off-by: Paul Barker <pbarker@konsulko.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-09 11:26:32 -07:00
Paul Barker
ade64eb5be net: dsa: microchip: Disable RGMII in-band status on KSZ9893
We can't assume that the link partner supports the in-band status
reporting which is enabled by default on the KSZ9893 when using RGMII
for the upstream port.

Signed-off-by: Paul Barker <pbarker@konsulko.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-09 11:26:32 -07:00
Paul Barker
805a7e6f53 net: dsa: microchip: Improve phy mode message
Always print the selected phy mode for the CPU port when using the
ksz9477 driver. If the phy mode was changed, also print the previous
mode to aid in debugging.

To make the message more clear, prefix it with the port number which it
applies to and improve the language a little.

Signed-off-by: Paul Barker <pbarker@konsulko.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-09 11:26:32 -07:00
Paul Barker
3c85f77515 net: dsa: microchip: Make switch detection more informative
To make switch detection more informative print the result of the
ksz9477/ksz9893 compatibility check. With debug output enabled also
print the contents of the Chip ID registers as a 40-bit hex string.

As this detection is the first communication with the switch performed
by the driver, making it easy to see any errors here will help identify
issues with SPI data corruption or reset sequencing.

Signed-off-by: Paul Barker <pbarker@konsulko.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-09 11:26:32 -07:00
Linus Walleij
bb1416adb8 net: dsa: rtl8366rb: Switch to phylink
This switches the RTL8366RB over to using phylink callbacks
instead of .adjust_link(). This is a pretty template
switchover. All we adjust is the CPU port so that is why
the code only inspects this port.

We enhance by adding proper error messages, also disabling
the CPU port on the way down and moving dev_info() to
dev_dbg().

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-07 12:38:27 -07:00
Linus Walleij
4ddcaf1ebb net: dsa: rtl8366: Properly clear member config
When removing a port from a VLAN we are just erasing the
member config for the VLAN, which is wrong: other ports
can be using it.

Just mask off the port and only zero out the rest of the
member config once ports using of the VLAN are removed
from it.

Reported-by: Florian Fainelli <f.fainelli@gmail.com>
Fixes: d8652956cf ("net: dsa: realtek-smi: Add Realtek SMI driver")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-06 12:32:07 -07:00
Linus Walleij
5f4a8ef384 net: dsa: rtl8366rb: Support setting MTU
This implements the missing MTU setting for the RTL8366RB
switch.

Apart from supporting jumboframes, this rids us of annoying
boot messages like this:
realtek-smi switch: nonfatal error -95 setting MTU on port 0

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-06 10:28:07 -07:00
Florian Fainelli
771089c2a4 net: dsa: bcm_sf2: Ensure that MDIO diversion is used
Registering our slave MDIO bus outside of the OF infrastructure is
necessary in order to avoid creating double references of the same
Device Tree nodes, however it is not sufficient to guarantee that the
MDIO bus diversion is used because of_phy_connect() will still resolve
to a valid PHY phandle and it will connect to the PHY using its parent
MDIO bus which is still the SF2 master MDIO bus. The reason for that is
because BCM7445 systems were already shipped with a Device Tree blob
looking like this (irrelevant parts omitted for simplicity):

	ports {
		#address-cells = <1>;
		#size-cells = <0>;

		port@1 {
			phy-mode = "rgmii-txid";
			phy-handle = <&phy0>;
                        reg = <1>;
			label = "rgmii_1";
		};
	...

	mdio@403c0 {
		...

		phy0: ethernet-phy@0 {
			broken-turn-around;
			device_type = "ethernet-phy";
			max-speed = <0x3e8>;
			reg = <0>;
			compatible = "brcm,bcm53125", "ethernet-phy-ieee802.3-c22";
		};
	};

There is a hardware issue with chip revisions (Dx) that lead to the
development of the following commits:

461cd1b03e ("net: dsa: bcm_sf2: Register our slave MDIO bus")
536fab5bf5 ("net: dsa: bcm_sf2: Do not register slave MDIO bus with OF")
b8c6cd1d31 ("net: dsa: bcm_sf2: do not use indirect reads and writes for 7445E0")

There should have been an internal MDIO bus node created for the chip
revision (Dx) that suffers from this problem, but it did not happen back
then.

Had that happen, that we should have correctly parented phy@0 (bcm53125
below) as child node of the internal MDIO bus, but the production Device
Tree blob that was shipped with the firmware targeted the fixed version
of the chip, despite both the affected and corrected chips being shipped
into production.

The problem is that of_phy_connect() for port@1 will happily resolve the
'phy-handle' from the mdio@403c0 node, which bypasses the diversion
completely. This results in this double programming that the diversion
refers to and aims to avoid. In order to force of_phy_connect() to fail,
and have DSA call to dsa_slave_phy_connect(), we must deactivate
ethernet-phy@0 from mdio@403c0, and the best way to do that is by
removing the phandle property completely.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-05 13:09:03 -07:00
Jakub Kicinski
44a8c4f33c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
We got slightly different patches removing a double word
in a comment in net/ipv4/raw.c - picked the version from net.

Simple conflict in drivers/net/ethernet/ibm/ibmvnic.c. Use cached
values instead of VNIC login response buffer (following what
commit 507ebe6444 ("ibmvnic: Fix use-after-free of VNIC login
response buffer") did).

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-04 21:28:59 -07:00
Linus Torvalds
3e8d3bdc2a Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:

 1) Use netif_rx_ni() when necessary in batman-adv stack, from Jussi
    Kivilinna.

 2) Fix loss of RTT samples in rxrpc, from David Howells.

 3) Memory leak in hns_nic_dev_probe(), from Dignhao Liu.

 4) ravb module cannot be unloaded, fix from Yuusuke Ashizuka.

 5) We disable BH for too lokng in sctp_get_port_local(), add a
    cond_resched() here as well, from Xin Long.

 6) Fix memory leak in st95hf_in_send_cmd, from Dinghao Liu.

 7) Out of bound access in bpf_raw_tp_link_fill_link_info(), from
    Yonghong Song.

 8) Missing of_node_put() in mt7530 DSA driver, from Sumera
    Priyadarsini.

 9) Fix crash in bnxt_fw_reset_task(), from Michael Chan.

10) Fix geneve tunnel checksumming bug in hns3, from Yi Li.

11) Memory leak in rxkad_verify_response, from Dinghao Liu.

12) In tipc, don't use smp_processor_id() in preemptible context. From
    Tuong Lien.

13) Fix signedness issue in mlx4 memory allocation, from Shung-Hsi Yu.

14) Missing clk_disable_prepare() in gemini driver, from Dan Carpenter.

15) Fix ABI mismatch between driver and firmware in nfp, from Louis
    Peens.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (110 commits)
  net/smc: fix sock refcounting in case of termination
  net/smc: reset sndbuf_desc if freed
  net/smc: set rx_off for SMCR explicitly
  net/smc: fix toleration of fake add_link messages
  tg3: Fix soft lockup when tg3_reset_task() fails.
  doc: net: dsa: Fix typo in config code sample
  net: dp83867: Fix WoL SecureOn password
  nfp: flower: fix ABI mismatch between driver and firmware
  tipc: fix shutdown() of connectionless socket
  ipv6: Fix sysctl max for fib_multipath_hash_policy
  drivers/net/wan/hdlc: Change the default of hard_header_len to 0
  net: gemini: Fix another missing clk_disable_unprepare() in probe
  net: bcmgenet: fix mask check in bcmgenet_validate_flow()
  amd-xgbe: Add support for new port mode
  net: usb: dm9601: Add USB ID of Keenetic Plus DSL
  vhost: fix typo in error message
  net: ethernet: mlx4: Fix memory allocation in mlx4_buddy_init()
  pktgen: fix error message with wrong function name
  net: ethernet: ti: am65-cpsw: fix rmii 100Mbit link mode
  cxgb4: fix thermal zone device registration
  ...
2020-09-03 18:50:48 -07:00
Florian Fainelli
2ee3adc4ae net: dsa: bcm_sf2: recalculate switch clock rate based on ports
Whenever a port gets enabled/disabled, recalcultate the required switch
clock rate to make sure it always gets set to the expected rate
targeting our switch use case. This is only done for the BCM7445 switch
as there is no clocking profile available for BCM7278.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-03 15:08:03 -07:00
Florian Fainelli
e9ec5c3bd2 net: dsa: bcm_sf2: request and handle clocks
Fetch the corresponding clock resource and enable/disable it during
suspend/resume if and only if we have no ports defined for Wake-on-LAN.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-03 15:08:03 -07:00
Paul Barker
434d2312cd net: dsa: b53: Print err message on SW_RST timeout
This allows us to differentiate between the possible failure modes of
b53_switch_reset() by looking at the dmesg output.

Signed-off-by: Paul Barker <pbarker@konsulko.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-03 12:07:29 -07:00
Paul Barker
3b33438c52 net: dsa: b53: Use dev_{err,info} instead of pr_*
This change allows us to see which device the err or info messages are
referring to if we have multiple b53 compatible devices on a board.

As this removes the only pr_*() calls in this file we can drop the
definition of pr_fmt().

Signed-off-by: Paul Barker <pbarker@konsulko.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-03 12:07:29 -07:00
Linus Walleij
7e1301ed18 net: dsa: rtl8366: Refactor VLAN/PVID init
The VLANs and PVIDs on the RTL8366 utilizes a "member
configuration" (MC) which is largely unexplained in the
code.

This set-up requires a special ordering: rtl8366_set_pvid()
must be called first, followed by rtl8366_set_vlan(),
else the MC will not be properly allocated. Relax this
by factoring out the code obtaining an MC and reuse
the helper in both rtl8366_set_pvid() and
rtl8366_set_vlan() so we remove this strict ordering
requirement.

In the process, add some better comments and debug prints
so people who read the code understand what is going on.

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-02 14:13:35 -07:00
Linus Walleij
6641a2c42b net: dsa: rtl8366: Check validity of passed VLANs
The rtl8366_set_vlan() and rtl8366_set_pvid() get invalid
VLANs tossed at it, especially VLAN0, something the hardware
and driver cannot handle. Check validity and bail out like
we do in the other callbacks.

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-02 14:13:35 -07:00
Andrew Lunn
ceb96fae39 net: dsa: mv88e6xxx: Fix W=1 warning with !CONFIG_OF
When building on platforms without device tree, e.g. amd64, W=1 gives
a warning about mv88e6xxx_mdio_external_match being unused. Replace
of_match_node() with of_device_is_compatible() to prevent this
warning.

Suggested-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-01 15:33:57 -07:00
Ioana Ciornei
588d05504d net: dsa: ocelot: use the Lynx PCS helpers in Felix and Seville
Use the helper functions introduced by the newly added
Lynx PCS MDIO module in the Felix VSC9959 and Seville VSC9953.

Instead of representing the PCS as a phy_device, a mdio_device structure
will be passed to the Lynx module which is now actually implementing all
the PCS configuration and status reporting.

All code previously used for PCS monitoring and runtime configuration
is removed and replaced will calls to the Lynx PCS operations.

Tested on the following SERDES protocols of LS1028A: 0x7777
(2500Base-X), 0x85bb (QSGMII), 0x9999 (SGMII) and 0x13bb (USXGMII).

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-31 12:52:33 -07:00
Landen Chao
f272285f6a net: dsa: mt7530: fix advertising unsupported 1000baseT_Half
Remove 1000baseT_Half to advertise correct hardware capability in
phylink_validate() callback function.

Fixes: 38f790a805 ("net: dsa: mt7530: Add support for port 5")
Signed-off-by: Landen Chao <landen.chao@mediatek.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-28 06:54:06 -07:00
Sumera Priyadarsini
8e4efd4706 net: dsa: mt7530: Add of_node_put() before break and return statements
Every iteration of for_each_child_of_node() decrements
the reference count of the previous node, however when control
is transferred from the middle of the loop, as in the case of
a return or break or goto, there is no decrement thus ultimately
resulting in a memory leak.

Fix a potential memory leak in mt7530.c by inserting of_node_put()
before the break and return statements.

Issue found with Coccinelle.

Signed-off-by: Sumera Priyadarsini <sylphrenadin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-25 07:44:41 -07:00
Sumera Priyadarsini
59ebb4305c net: ocelot: Add of_node_put() before return statement
Every iteration of for_each_available_child_of_node() decrements
the reference count of the previous node, however when control
is transferred from the middle of the loop, as in the case of
a return or break or goto, there is no decrement thus ultimately
resulting in a memory leak.

Fix a potential memory leak in felix.c by inserting of_node_put()
before the return statement.

Issue found with Coccinelle.

Signed-off-by: Sumera Priyadarsini <sylphrenadin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-24 18:04:09 -07:00
Nathan Chancellor
5978fac03e net: dsa: sja1105: Do not use address of compatible member in sja1105_check_device_id
Clang warns:

drivers/net/dsa/sja1105/sja1105_main.c:3418:38: warning: address of
array 'match->compatible' will always evaluate to 'true'
[-Wpointer-bool-conversion]
        for (match = sja1105_dt_ids; match->compatible; match++) {
        ~~~                          ~~~~~~~^~~~~~~~~~
1 warning generated.

We should check the value of the first character in compatible to see if
it is empty or not. This matches how the rest of the tree iterates over
IDs.

Fixes: 0b0e299720 ("net: dsa: sja1105: use detected device id instead of DT one on mismatch")
Link: https://github.com/ClangBuiltLinux/linux/issues/1139
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-24 16:13:25 -07:00
Gustavo A. R. Silva
df561f6688 treewide: Use fallthrough pseudo-keyword
Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-08-23 17:36:59 -05:00
David S. Miller
7611cbb900 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-08-23 11:48:27 -07:00
Tom Rix
774d977abf net: dsa: b53: check for timeout
clang static analysis reports this problem

b53_common.c:1583:13: warning: The left expression of the compound
  assignment is an uninitialized value. The computed value will
  also be garbage
        ent.port &= ~BIT(port);
        ~~~~~~~~ ^

ent is set by a successful call to b53_arl_read().  Unsuccessful
calls are caught by an switch statement handling specific returns.
b32_arl_read() calls b53_arl_op_wait() which fails with the
unhandled -ETIMEDOUT.

So add -ETIMEDOUT to the switch statement.  Because
b53_arl_op_wait() already prints out a message, do not add another
one.

Fixes: 1da6df85c6 ("net: dsa: b53: Implement ARL add/del/dump operations")
Signed-off-by: Tom Rix <trix@redhat.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-21 11:46:12 -07:00
Kurt Kanzenbach
28fba67ff9 net: dsa: mv88e6xxx: Use generic helper function
In order to reduce code duplication between ptp drivers, generic helper
functions were introduced. Use them.

Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Tested-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-19 16:07:49 -07:00
Florian Fainelli
142061eba3 net: dsa: loop: Return VLAN table size through devlink
We return the VLAN table size through devlink as a simple parameter, we
do not support altering it at runtime:

devlink resource show mdio_bus/fixed-0:1f
mdio_bus/fixed-0:1f:
  name VTU size 4096 occ 0 unit entry dpipe_tables none

and after configure a bridge with VLAN filtering:

devlink resource show mdio_bus/fixed-0:1f
mdio_bus/fixed-0:1f:
  name VTU size 4096 occ 1 unit entry dpipe_tables none

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-19 12:01:42 -07:00
Florian Fainelli
f0408ca45a net: dsa: loop: Configure VLANs while not filtering
Since this is a mock-up driver with no real data path for now, but we
will have one at some point, enable VLANs while not filtering.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-19 12:01:42 -07:00
Colin Ian King
17340552ce net: mscc: ocelot: remove duplicate "the the" phrase in Kconfig text
The Kconfig help text contains the phrase "the the" in the help
text. Fix this.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-18 16:02:03 -07:00
Vladimir Oltean
0b0e299720 net: dsa: sja1105: use detected device id instead of DT one on mismatch
Although we can detect the chip revision 100% at runtime, it is useful
to specify it in the device tree compatible string too, because
otherwise there would be no way to assess the correctness of device tree
bindings statically, without booting a board (only some switch versions
have internal RGMII delays and/or an SGMII port).

But for testing the P/Q/R/S support, what I have is a reworked board
with the SJA1105T replaced by a pin-compatible SJA1105Q, and I don't
want to keep a separate device tree blob just for this one-off board.
Since just the chip has been replaced, its RGMII delay setup is
inherently the same (meaning: delays added by the PHY on the slave
ports, and by PCB traces on the fixed-link CPU port).

For this board, I'd rather have the driver shout at me, but go ahead and
use what it found even if it doesn't match what it's been told is there.

[    2.970826] sja1105 spi0.1: Device tree specifies chip SJA1105T but found SJA1105Q, please fix it!
[    2.980010] sja1105 spi0.1: Probed switch chip: SJA1105Q
[    3.005082] sja1105 spi0.1: Enabled switch tagging

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-05 12:20:55 -07:00
Florian Fainelli
947b6ef9f7 net: dsa: loop: Set correct number of ports
We only support DSA_LOOP_NUM_PORTS in the switch, do not tell the DSA
core to allocate up to DSA_MAX_PORTS which is nearly the double (6 vs.
11).

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:19:23 -07:00
Florian Fainelli
c99194eded net: dsa: loop: Wire-up MTU callbacks
For now we simply store the port MTU into a per-port member.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:19:23 -07:00
Florian Fainelli
6c84a58997 net: dsa: loop: Move data structures to header
In preparation for adding support for a mockup data path, move the
driver data structures to include/linux/dsa/loop.h such that we can
share them between net/dsa/ and drivers/net/dsa/ later on.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:19:23 -07:00
Florian Fainelli
916a8d168e net: dsa: loop: Support 4K VLANs
Allocate a 4K array of VLANs instead of limiting ourselves to just 5
which is arbitrary.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:19:22 -07:00
Florian Fainelli
81d4e8e073 net: dsa: loop: PVID should be per-port
The PVID should be per-port, this is a preliminary change to support a
802.1Q data path in the driver.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:19:22 -07:00
Vladimir Oltean
af9fdd2bf8 net: dsa: sja1105: poll for extts events from a timer
The current poll interval is enough to ensure that rising and falling
edge events are not lost for a 1 PPS signal with 50% duty cycle.

But when we deliver the events to user space, it will try to infer if
they were corresponding to a rising or to a falling edge (the kernel
driver doesn't know that either). User space will try to make that
inference based on the time at which the PPS master had emitted the
pulse (i.e. if it's a .0 time, it's rising edge, if it's .5 time, it's
falling edge).

But there is no in-kernel API for retrieving the precise timestamp
corresponding to a PPS master (aka perout) pulse. So user space has to
guess even that. It will read the PTP time on the PPS master right after
we've delivered the extts event, and declare that the PPS master time
was just the closest integer second, based on 2 thresholds (lower than
.25, or higher than .75, and ignore anything else).

Except that, if we poll for extts events (and our hardware doesn't
really help us, by not providing an interrupt), then there is a risk
that the poll period (and therefore the time at which the event is
delivered) might confuse user space.

Because we are always scheduling the next extts poll at
SJA1105_EXTTS_INTERVAL "from now" (that's the only thing that the
schedule_delayed_work() API gives us), it means that the start time of
the next delayed workqueue will always be shifted to the right a little
bit (shifted with the SPI access duration of this workqueue run).
In turn, because user space sees extts events that are non-periodic
compared to the PPS master's time, this means that it might start making
wrong guesses about rising/falling edge.

To understand the effect, here is the output of ts2phc currently. Notice
the 'src' timestamps of the 'SKIP extts' events, and how they have a
large wander. They keep increasing until the upper limit for the ignore
threshold (.75 seconds), after which the application starts ignoring the
_other_ edge.

ts2phc[26.624]: /dev/ptp3 SKIP extts index 0 at 21.449898912 src 21.657784518
ts2phc[27.133]: adding tstamp 21.949894240 to clock /dev/ptp3
ts2phc[27.133]: adding tstamp 22.000000000 to clock /dev/ptp1
ts2phc[27.133]: /dev/ptp3 offset        640 s2 freq   +5112
ts2phc[27.636]: /dev/ptp3 SKIP extts index 0 at 22.449889360 src 22.669398022
ts2phc[28.140]: adding tstamp 22.949884376 to clock /dev/ptp3
ts2phc[28.140]: adding tstamp 23.000000000 to clock /dev/ptp1
ts2phc[28.140]: /dev/ptp3 offset         96 s2 freq   +4760
ts2phc[28.644]: /dev/ptp3 SKIP extts index 0 at 23.449879504 src 23.677420422
ts2phc[29.153]: adding tstamp 23.949874704 to clock /dev/ptp3
ts2phc[29.153]: adding tstamp 24.000000000 to clock /dev/ptp1
ts2phc[29.153]: /dev/ptp3 offset       -264 s2 freq   +4429
ts2phc[29.656]: /dev/ptp3 SKIP extts index 0 at 24.449870008 src 24.689407238
ts2phc[30.160]: adding tstamp 24.949865376 to clock /dev/ptp3
ts2phc[30.160]: adding tstamp 25.000000000 to clock /dev/ptp1
ts2phc[30.160]: /dev/ptp3 offset       -280 s2 freq   +4334
ts2phc[30.664]: /dev/ptp3 SKIP extts index 0 at 25.449860760 src 25.697449926
ts2phc[31.168]: adding tstamp 25.949856176 to clock /dev/ptp3
ts2phc[31.168]: adding tstamp 26.000000000 to clock /dev/ptp1
ts2phc[31.168]: /dev/ptp3 offset       -176 s2 freq   +4354
ts2phc[31.672]: /dev/ptp3 SKIP extts index 0 at 26.449851584 src 26.705433606
ts2phc[32.180]: adding tstamp 26.949846992 to clock /dev/ptp3
ts2phc[32.180]: adding tstamp 27.000000000 to clock /dev/ptp1
ts2phc[32.180]: /dev/ptp3 offset        -80 s2 freq   +4397
ts2phc[32.684]: /dev/ptp3 SKIP extts index 0 at 27.449842384 src 27.717415110
ts2phc[33.192]: adding tstamp 27.949837768 to clock /dev/ptp3
ts2phc[33.192]: adding tstamp 28.000000000 to clock /dev/ptp1
ts2phc[33.192]: /dev/ptp3 offset          0 s2 freq   +4453
ts2phc[33.696]: /dev/ptp3 SKIP extts index 0 at 28.449833128 src 28.729412902
ts2phc[34.200]: adding tstamp 28.949828472 to clock /dev/ptp3
ts2phc[34.200]: adding tstamp 29.000000000 to clock /dev/ptp1
ts2phc[34.200]: /dev/ptp3 offset          8 s2 freq   +4461
ts2phc[34.704]: /dev/ptp3 SKIP extts index 0 at 29.449823816 src 29.737416038
ts2phc[35.208]: adding tstamp 29.949819152 to clock /dev/ptp3
ts2phc[35.208]: adding tstamp 30.000000000 to clock /dev/ptp1
ts2phc[35.208]: /dev/ptp3 offset         -8 s2 freq   +4447
ts2phc[35.712]: /dev/ptp3 SKIP extts index 0 at 30.449814496 src 30.745554982
ts2phc[36.216]: adding tstamp 30.949809840 to clock /dev/ptp3
ts2phc[36.216]: adding tstamp 31.000000000 to clock /dev/ptp1
ts2phc[36.216]: /dev/ptp3 offset         -8 s2 freq   +4445
ts2phc[36.468]: /dev/ptp3 SKIP extts index 0 at 31.449805184 src 31.501109446
ts2phc[36.972]: adding tstamp 31.949800536 to clock /dev/ptp3
ts2phc[36.972]: adding tstamp 32.000000000 to clock /dev/ptp1
ts2phc[36.972]: /dev/ptp3 offset         -8 s2 freq   +4442
ts2phc[37.480]: /dev/ptp3 SKIP extts index 0 at 32.449795896 src 32.513320070
ts2phc[37.984]: adding tstamp 32.949791248 to clock /dev/ptp3
ts2phc[37.984]: adding tstamp 33.000000000 to clock /dev/ptp1
ts2phc[37.984]: /dev/ptp3 offset          0 s2 freq   +4448

Fix that by taking the following measures:
- Schedule the poll from a timer. Because we are really scheduling the
  timer periodically, the extts events delivered to user space are
  periodic too, and don't suffer from the "shift-to-the-right" effect.
- Increase the poll period to 6 times a second. This imposes a smaller
  upper bound to the shift that can occur to the delivery time of extts
  events, and makes user space (ts2phc) to always interpret correctly
  which events should be skipped and which shouldn't.
- Move the SPI readout itself to the main PTP kernel thread, instead of
  the generic workqueue. This is because the timer runs in atomic
  context, but is also better than before, because if needed, we can
  chrt & taskset this kernel thread, to ensure it gets enough priority
  under load.

After this patch, one can notice that the wander is greatly reduced, and
that the latencies of one extts poll are not propagated to the next. The
'src' timestamp that is skipped is never larger than .65 seconds (which
means .15 seconds larger than the time at which the real event occurred
at, and .10 seconds smaller than the .75 upper threshold for ignoring
the falling edge):

ts2phc[40.076]: adding tstamp 34.949261296 to clock /dev/ptp3
ts2phc[40.076]: adding tstamp 35.000000000 to clock /dev/ptp1
ts2phc[40.076]: /dev/ptp3 offset         48 s2 freq   +4631
ts2phc[40.568]: /dev/ptp3 SKIP extts index 0 at 35.449256496 src 35.595791078
ts2phc[41.064]: adding tstamp 35.949251744 to clock /dev/ptp3
ts2phc[41.064]: adding tstamp 36.000000000 to clock /dev/ptp1
ts2phc[41.064]: /dev/ptp3 offset       -224 s2 freq   +4374
ts2phc[41.552]: /dev/ptp3 SKIP extts index 0 at 36.449247088 src 36.579825574
ts2phc[42.044]: adding tstamp 36.949242456 to clock /dev/ptp3
ts2phc[42.044]: adding tstamp 37.000000000 to clock /dev/ptp1
ts2phc[42.044]: /dev/ptp3 offset       -240 s2 freq   +4290
ts2phc[42.536]: /dev/ptp3 SKIP extts index 0 at 37.449237848 src 37.563828774
ts2phc[43.028]: adding tstamp 37.949233264 to clock /dev/ptp3
ts2phc[43.028]: adding tstamp 38.000000000 to clock /dev/ptp1
ts2phc[43.028]: /dev/ptp3 offset       -144 s2 freq   +4314
ts2phc[43.520]: /dev/ptp3 SKIP extts index 0 at 38.449228656 src 38.547823238
ts2phc[44.012]: adding tstamp 38.949224048 to clock /dev/ptp3
ts2phc[44.012]: adding tstamp 39.000000000 to clock /dev/ptp1
ts2phc[44.012]: /dev/ptp3 offset        -80 s2 freq   +4335
ts2phc[44.508]: /dev/ptp3 SKIP extts index 0 at 39.449219432 src 39.535846118
ts2phc[44.996]: adding tstamp 39.949214816 to clock /dev/ptp3
ts2phc[44.996]: adding tstamp 40.000000000 to clock /dev/ptp1
ts2phc[44.996]: /dev/ptp3 offset        -32 s2 freq   +4359
ts2phc[45.488]: /dev/ptp3 SKIP extts index 0 at 40.449210192 src 40.515824678
ts2phc[45.980]: adding tstamp 40.949205568 to clock /dev/ptp3
ts2phc[45.980]: adding tstamp 41.000000000 to clock /dev/ptp1
ts2phc[45.980]: /dev/ptp3 offset          8 s2 freq   +4390
ts2phc[46.636]: /dev/ptp3 SKIP extts index 0 at 41.449200928 src 41.664176902
ts2phc[47.132]: adding tstamp 41.949196288 to clock /dev/ptp3
ts2phc[47.132]: adding tstamp 42.000000000 to clock /dev/ptp1
ts2phc[47.132]: /dev/ptp3 offset          0 s2 freq   +4384
ts2phc[47.620]: /dev/ptp3 SKIP extts index 0 at 42.449191656 src 42.648117190
ts2phc[48.112]: adding tstamp 42.949187016 to clock /dev/ptp3
ts2phc[48.112]: adding tstamp 43.000000000 to clock /dev/ptp1
ts2phc[48.112]: /dev/ptp3 offset          0 s2 freq   +4384
ts2phc[48.604]: /dev/ptp3 SKIP extts index 0 at 43.449182384 src 43.632112582
ts2phc[49.100]: adding tstamp 43.949177736 to clock /dev/ptp3
ts2phc[49.100]: adding tstamp 44.000000000 to clock /dev/ptp1
ts2phc[49.100]: /dev/ptp3 offset         -8 s2 freq   +4376
ts2phc[49.588]: /dev/ptp3 SKIP extts index 0 at 44.449173096 src 44.616136774
ts2phc[50.080]: adding tstamp 44.949168464 to clock /dev/ptp3
ts2phc[50.080]: adding tstamp 45.000000000 to clock /dev/ptp1
ts2phc[50.080]: /dev/ptp3 offset          8 s2 freq   +4390
ts2phc[50.572]: /dev/ptp3 SKIP extts index 0 at 45.449163816 src 45.600134662
ts2phc[51.064]: adding tstamp 45.949159160 to clock /dev/ptp3
ts2phc[51.064]: adding tstamp 46.000000000 to clock /dev/ptp1
ts2phc[51.064]: /dev/ptp3 offset         -8 s2 freq   +4376
ts2phc[51.556]: /dev/ptp3 SKIP extts index 0 at 46.449154528 src 46.584588550
ts2phc[52.048]: adding tstamp 46.949149896 to clock /dev/ptp3
ts2phc[52.048]: adding tstamp 47.000000000 to clock /dev/ptp1
ts2phc[52.048]: /dev/ptp3 offset          0 s2 freq   +4382
ts2phc[52.540]: /dev/ptp3 SKIP extts index 0 at 47.449145256 src 47.568132198
ts2phc[53.032]: adding tstamp 47.949140616 to clock /dev/ptp3
ts2phc[53.032]: adding tstamp 48.000000000 to clock /dev/ptp1
ts2phc[53.032]: /dev/ptp3 offset          0 s2 freq   +4382
ts2phc[53.524]: /dev/ptp3 SKIP extts index 0 at 48.449135968 src 48.552121446
ts2phc[54.016]: adding tstamp 48.949131320 to clock /dev/ptp3
ts2phc[54.016]: adding tstamp 49.000000000 to clock /dev/ptp1
ts2phc[54.016]: /dev/ptp3 offset          0 s2 freq   +4382
ts2phc[54.512]: /dev/ptp3 SKIP extts index 0 at 49.449126680 src 49.540147014
ts2phc[55.000]: adding tstamp 49.949122040 to clock /dev/ptp3
ts2phc[55.000]: adding tstamp 50.000000000 to clock /dev/ptp1
ts2phc[55.000]: /dev/ptp3 offset          0 s2 freq   +4382
ts2phc[55.492]: /dev/ptp3 SKIP extts index 0 at 50.449117400 src 50.520119078
ts2phc[55.988]: adding tstamp 50.949112768 to clock /dev/ptp3
ts2phc[55.988]: adding tstamp 51.000000000 to clock /dev/ptp1
ts2phc[55.988]: /dev/ptp3 offset          8 s2 freq   +4390
ts2phc[56.476]: /dev/ptp3 SKIP extts index 0 at 51.449108120 src 51.504175910
ts2phc[57.132]: adding tstamp 51.949103480 to clock /dev/ptp3
ts2phc[57.132]: adding tstamp 52.000000000 to clock /dev/ptp1
ts2phc[57.132]: /dev/ptp3 offset          0 s2 freq   +4384
ts2phc[57.624]: /dev/ptp3 SKIP extts index 0 at 52.449098840 src 52.651833574
ts2phc[58.116]: adding tstamp 52.949094200 to clock /dev/ptp3
ts2phc[58.116]: adding tstamp 53.000000000 to clock /dev/ptp1
ts2phc[58.116]: /dev/ptp3 offset          8 s2 freq   +4392
ts2phc[58.612]: /dev/ptp3 SKIP extts index 0 at 53.449089560 src 53.639826918
ts2phc[59.100]: adding tstamp 53.949084920 to clock /dev/ptp3
ts2phc[59.100]: adding tstamp 54.000000000 to clock /dev/ptp1
ts2phc[59.100]: /dev/ptp3 offset          8 s2 freq   +4394
ts2phc[59.592]: /dev/ptp3 SKIP extts index 0 at 54.449080272 src 54.619842278
ts2phc[60.084]: adding tstamp 54.949075624 to clock /dev/ptp3
ts2phc[60.084]: adding tstamp 55.000000000 to clock /dev/ptp1
ts2phc[60.084]: /dev/ptp3 offset          8 s2 freq   +4397
ts2phc[60.576]: /dev/ptp3 SKIP extts index 0 at 55.449070968 src 55.603885542
ts2phc[61.068]: adding tstamp 55.949066312 to clock /dev/ptp3
ts2phc[61.068]: adding tstamp 56.000000000 to clock /dev/ptp1
ts2phc[61.068]: /dev/ptp3 offset          0 s2 freq   +4391
ts2phc[61.560]: /dev/ptp3 SKIP extts index 0 at 56.449061680 src 56.587885798
ts2phc[62.052]: adding tstamp 56.949057032 to clock /dev/ptp3
ts2phc[62.052]: adding tstamp 57.000000000 to clock /dev/ptp1
ts2phc[62.052]: /dev/ptp3 offset         -8 s2 freq   +4383

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:16:02 -07:00
Jonathan McDowell
69462fe6a3 net: dsa: qca8k: Add 802.1q VLAN support
This adds full 802.1q VLAN support to the qca8k, allowing the use of
vlan_filtering and more complicated bridging setups than allowed by
basic port VLAN support.

Tested with a number of untagged ports with separate VLANs and then a
trunk port with all the VLANs tagged on it.

v3:
- Pull QCA8K_PORT_VID_DEF changes into separate cleanup patch
- Reverse Christmas tree notation for variable definitions
- Use untagged instead of tagged for consistency
v2:
- Return sensible errnos on failure rather than -1 (rmk)
- Style cleanups based on Florian's feedback
- Silently allow VLAN 0 as device correctly treats this as no tag

Signed-off-by: Jonathan McDowell <noodles@earth.li>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 15:45:39 -07:00
Jonathan McDowell
e9d204fde5 net: dsa: qca8k: Add define for port VID
Rather than using a magic value of 1 when configuring the port VIDs add
a QCA8K_PORT_VID_DEF define and use that instead. Also fix up the
bitmask in the process; the top 4 bits are reserved so this wasn't a
problem, but only masking 12 bits is the correct approach.

Signed-off-by: Jonathan McDowell <noodles@earth.li>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 15:45:39 -07:00
Linus Walleij
788abc6d9d net: dsa: rtl8366: Fix VLAN set-up
Alter the rtl8366_vlan_add() to call rtl8366_set_vlan()
inside the loop that goes over all VIDs since we now
properly support calling that function more than once.
Augment the loop to postincrement as this is more
intuitive.

The loop moved past the last VID but called
rtl8366_set_vlan() with the port number instead of
the VID, assuming a 1-to-1 correspondence between
ports and VIDs. This was also a bug.

Cc: DENG Qingfang <dqfext@gmail.com>
Cc: Mauri Sandberg <sandberg@mailfence.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Fixes: d8652956cf ("net: dsa: realtek-smi: Add Realtek SMI driver")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28 17:44:23 -07:00
Linus Walleij
15ab7906cc net: dsa: rtl8366: Fix VLAN semantics
The RTL8366 would not handle adding new members (ports) to
a VLAN: the code assumed that ->port_vlan_add() was only
called once for a single port. When intializing the
switch with .configure_vlan_while_not_filtering set to
true, the function is called numerous times for adding
all ports to VLAN1, which was something the code could
not handle.

Alter rtl8366_set_vlan() to just |= new members and
untagged flags to 4k and MC VLAN table entries alike.
This makes it possible to just add new ports to a
VLAN.

Put in some helpful debug code that can be used to find
any further bugs here.

Cc: DENG Qingfang <dqfext@gmail.com>
Cc: Mauri Sandberg <sandberg@mailfence.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Fixes: d8652956cf ("net: dsa: realtek-smi: Add Realtek SMI driver")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28 17:44:23 -07:00
David S. Miller
a57066b1a0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
The UDP reuseport conflict was a little bit tricky.

The net-next code, via bpf-next, extracted the reuseport handling
into a helper so that the BPF sk lookup code could invoke it.

At the same time, the logic for reuseport handling of unconnected
sockets changed via commit efc6b6f6c3
which changed the logic to carry on the reuseport result into the
rest of the lookup loop if we do not return immediately.

This requires moving the reuseport_has_conns() logic into the callers.

While we are here, get rid of inline directives as they do not belong
in foo.c files.

The other changes were cases of more straightforward overlapping
modifications.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-25 17:49:04 -07:00
Chris Packham
1baf0fac10 net: dsa: mv88e6xxx: Use chip-wide max frame size for MTU
Some of the chips in the mv88e6xxx family don't support jumbo
configuration per port. But they do have a chip-wide max frame size that
can be used. Use this to approximate the behaviour of configuring a port
based MTU.

Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24 20:03:27 -07:00
Chris Packham
e8b34c67d6 net: dsa: mv88e6xxx: Support jumbo configuration on 6190/6190X
The MV88E6190 and MV88E6190X both support per port jumbo configuration
just like the other GE switches. Install the appropriate ops.

Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24 20:03:27 -07:00
Chris Packham
0f3c66a3c7 net: dsa: mv88e6xxx: MV88E6097 does not support jumbo configuration
The MV88E6097 chip does not support configuring jumbo frames. Prior to
commit 5f4366660d only the 6352, 6351, 6165 and 6320 chips configured
jumbo mode. The refactor accidentally added the function for the 6097.
Remove the erroneous function pointer assignment.

Fixes: 5f4366660d ("net: dsa: mv88e6xxx: Refactor setting of jumbo frames")
Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24 20:03:27 -07:00
Helmut Grohne
3506b2f42d net: dsa: microchip: call phy_remove_link_mode during probe
When doing "ip link set dev ... up" for a ksz9477 backed link,
ksz9477_phy_setup is called and it calls phy_remove_link_mode to remove
1000baseT HDX. During phy_remove_link_mode, phy_advertise_supported is
called. Doing so reverts any previous change to advertised link modes
e.g. using a udevd .link file.

phy_remove_link_mode is not meant to be used while opening a link and
should be called during phy probe when the link is not yet available to
userspace.

Therefore move the phy_remove_link_mode calls into
ksz9477_switch_register. It indirectly calls dsa_register_switch, which
creates the relevant struct phy_devices and we update the link modes
right after that. At that time dev->features is already initialized by
ksz9477_switch_detect.

Remove phy_setup from ksz_dev_ops as no users remain.

Link: https://lore.kernel.org/netdev/20200715192722.GD1256692@lunn.ch/
Fixes: 42fc6a4c61 ("net: dsa: microchip: prepare PHY for proper advertisement")
Signed-off-by: Helmut Grohne <helmut.grohne@intenta.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-21 15:50:02 -07:00
Jonathan McDowell
f58d2598cf net: dsa: qca8k: implement the port MTU callbacks
This switch has a single max frame size configuration register, so we
track the requested MTU for each port and apply the largest.

v2:
- Address review feedback from Vladimir Oltean

Signed-off-by: Jonathan McDowell <noodles@earth.li>
Acked-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 18:34:16 -07:00
Russell King
fad58190c0 net: dsa: mv88e6xxx: fix in-band AN link establishment
If in-band negotiation or fixed-link modes are specified for a DSA
port, the DSA code will force the link down during initialisation. For
fixed-link mode, this is fine, as phylink will manage the link state.
However, for in-band mode, phylink expects the PCS to detect link,
which will not happen if the link is forced down.

There is a related issue that in in-band mode, the link could come up
while we are making configuration changes, so we should force the link
down prior to reconfiguring the interface mode.

This patch addresses both issues.

Fixes: 3be98b2d5f ("net: dsa: Down cpu/dsa ports phylink will control")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-19 18:08:54 -07:00
Michael Walle
16659b811a net: dsa: felix: (re)use already existing constants
Now that there are USXGMII constants available, drop the old definitions
and reuse the generic ones.

Signed-off-by: Michael Walle <michael@walle.cc>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-19 18:05:49 -07:00
Vladimir Oltean
89e35f66d5 net: mscc: ocelot: rethink Kconfig dependencies again
Having the users of MSCC_OCELOT_SWITCH_LIB depend on REGMAP_MMIO was a
bad idea, since that symbol is not user-selectable. So we should have
kept a 'select REGMAP_MMIO'.

When we do that, we run into 2 more problems:

- By depending on GENERIC_PHY, we are causing a recursive dependency.
  But it looks like GENERIC_PHY has no other dependencies, and other
  drivers select it, so we can select it too:

drivers/of/Kconfig:69:error: recursive dependency detected!
drivers/of/Kconfig:69:  symbol OF_IRQ depends on IRQ_DOMAIN
kernel/irq/Kconfig:68:  symbol IRQ_DOMAIN is selected by REGMAP
drivers/base/regmap/Kconfig:7:  symbol REGMAP default is visible depending on REGMAP_MMIO
drivers/base/regmap/Kconfig:39: symbol REGMAP_MMIO is selected by MSCC_OCELOT_SWITCH_LIB
drivers/net/ethernet/mscc/Kconfig:15:   symbol MSCC_OCELOT_SWITCH_LIB is selected by MSCC_OCELOT_SWITCH
drivers/net/ethernet/mscc/Kconfig:22:   symbol MSCC_OCELOT_SWITCH depends on GENERIC_PHY
drivers/phy/Kconfig:8:  symbol GENERIC_PHY is selected by PHY_BCM_NS_USB3
drivers/phy/broadcom/Kconfig:41:        symbol PHY_BCM_NS_USB3 depends on MDIO_BUS
drivers/net/phy/Kconfig:13:     symbol MDIO_BUS depends on MDIO_DEVICE
drivers/net/phy/Kconfig:6:      symbol MDIO_DEVICE is selected by PHYLIB
drivers/net/phy/Kconfig:254:    symbol PHYLIB is selected by ARC_EMAC_CORE
drivers/net/ethernet/arc/Kconfig:19:    symbol ARC_EMAC_CORE is selected by ARC_EMAC
drivers/net/ethernet/arc/Kconfig:25:    symbol ARC_EMAC depends on OF_IRQ

- By depending on PHYLIB, we are causing a recursive dependency. PHYLIB
  only has a single dependency, "depends on NETDEVICES", which we are
  already depending on, so we can again hack our way into conformance by
  turning the PHYLIB dependency into a select.

drivers/of/Kconfig:69:error: recursive dependency detected!
drivers/of/Kconfig:69:  symbol OF_IRQ depends on IRQ_DOMAIN
kernel/irq/Kconfig:68:  symbol IRQ_DOMAIN is selected by REGMAP
drivers/base/regmap/Kconfig:7:  symbol REGMAP default is visible depending on REGMAP_MMIO
drivers/base/regmap/Kconfig:39: symbol REGMAP_MMIO is selected by MSCC_OCELOT_SWITCH_LIB
drivers/net/ethernet/mscc/Kconfig:15:   symbol MSCC_OCELOT_SWITCH_LIB is selected by MSCC_OCELOT_SWITCH
drivers/net/ethernet/mscc/Kconfig:22:   symbol MSCC_OCELOT_SWITCH depends on PHYLIB
drivers/net/phy/Kconfig:254:    symbol PHYLIB is selected by ARC_EMAC_CORE
drivers/net/ethernet/arc/Kconfig:19:    symbol ARC_EMAC_CORE is selected by ARC_EMAC
drivers/net/ethernet/arc/Kconfig:25:    symbol ARC_EMAC depends on OF_IRQ

Fixes: f4d0323bae ("net: mscc: ocelot: convert MSCC_OCELOT_SWITCH into a library")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-07-16 12:46:00 -07:00
Maxim Kochetkov
84705fc165 net: dsa: felix: introduce support for Seville VSC9953 switch
This is another switch from Vitesse / Microsemi / Microchip, that has
10 ports (8 external, 2 internal) and is integrated into the Freescale /
NXP T1040 PowerPC SoC. It is very similar to Felix from NXP LS1028A,
except that this is a platform device and Felix is a PCI device, and it
doesn't support IEEE 1588 and TSN.

Like Felix, this driver configures its own PCS on the internal MDIO bus
using a phy_device abstraction for it (yes, it will be refactored to use
a raw mdio_device, like other phylink drivers do, but let's keep it like
that for now). But unlike Felix, the MDIO bus and the PCS are not from
the same vendor. The PCS is the same QorIQ/Layerscape PCS as found in
Felix/ENETC/DPAA*, but the internal MDIO bus that is used to access it
is actually an instantiation of drivers/net/phy/mdio-mscc-miim.c. But it
would be difficult to reuse that driver (it doesn't even use regmap, and
it's less than 200 lines of code), so we hand-roll here some internal
MDIO bus accessors within seville_vsc9953.c, which serves the purpose of
driving the PCS absolutely fine.

Also, same as Felix, the PCS doesn't support dynamic reconfiguration of
SerDes protocol, so we need to do pre-validation of PHY mode from device
tree and not let phylink change it.

Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-13 17:40:02 -07:00
Vladimir Oltean
375e131429 net: dsa: felix: move probing to felix_vsc9959.c
Felix is not actually meant to be a DSA driver only for the switch
inside NXP LS1028A, but an umbrella for all Vitesse / Microsemi /
Microchip switches that are register-compatible with Ocelot and that are
using in DSA mode (with an NPI Ethernet port).

For the dsa_switch_ops exported by the felix driver to be generic enough
to be used by other non-PCI switches, we need to move the PCI-specific
probing to the low-level translation module felix_vsc9959.c. This way,
other switches can have their own probing functions, as platform devices
or otherwise.

This patch also removes the "Felix instance table", which did not stand
the test of time and is unnecessary at this point.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-13 17:40:02 -07:00
Maxim Kochetkov
aa92d836d5 net: mscc: ocelot: extend watermark encoding function
The ocelot_wm_encode function deals with setting thresholds for pause
frame start and stop. In Ocelot and Felix the register layout is the
same, but for Seville, it isn't. The easiest way to accommodate Seville
hardware configuration is to introduce a function pointer for setting
this up.

Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-13 17:40:02 -07:00
Maxim Kochetkov
541132f096 net: mscc: ocelot: convert SYS_PAUSE_CFG register access to regfield
Seville has a different bitwise layout than Ocelot and Felix.

Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-13 17:40:02 -07:00
Vladimir Oltean
67c2404922 net: dsa: felix: create a template for the DSA tags on xmit
With this patch we try to kill 2 birds with 1 stone.

First of all, some switches that use tag_ocelot.c don't have the exact
same bitfield layout for the DSA tags. The destination ports field is
different for Seville VSC9953 for example. So the choices are to either
duplicate tag_ocelot.c into a new tag_seville.c (sub-optimal) or somehow
take into account a supposed ocelot->dest_ports_offset when packing this
field into the DSA injection header (again not ideal).

Secondly, tag_ocelot.c already needs to memset a 128-bit area to zero
and call some packing() functions of dubious performance in the
fastpath. And most of the values it needs to pack are pretty much
constant (BYPASS=1, SRC_PORT=CPU, DEST=port index). So it would be good
if we could improve that.

The proposed solution is to allocate a memory area per port at probe
time, initialize that with the statically defined bits as per chip
hardware revision, and just perform a simpler memcpy in the fastpath.

Other alternatives have been analyzed, such as:
- Create a separate tag_seville.c: too much code duplication for just 1
  bit field difference.
- Create a separate DSA_TAG_PROTO_SEVILLE under tag_ocelot.c, just like
  tag_brcm.c, which would have a separate .xmit function. Again, too
  much code duplication for just 1 bit field difference.
- Allocate the template from the init function of the tag_ocelot.c
  module, instead of from the driver: couldn't figure out a method of
  accessing the correct port template corresponding to the correct
  tagger in the .xmit function.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-13 17:40:01 -07:00
Vladimir Oltean
886e1387c7 net: mscc: ocelot: convert QSYS_SWITCH_PORT_MODE and SYS_PORT_MODE to regfields
Currently Felix and Ocelot share the same bit layout in these per-port
registers, but Seville does not. So we need reg_fields for that.

Actually since these are per-port registers, we need to also specify the
number of ports, and register size per port, and use the regmap API for
multiple ports.

There's a more subtle point to be made about the other 2 register
fields:
- QSYS_SWITCH_PORT_MODE_SCH_NEXT_CFG
- QSYS_SWITCH_PORT_MODE_INGRESS_DROP_MODE
which we are not writing any longer, for 2 reasons:
- Using the previous API (ocelot_write_rix), we were only writing 1 for
  Felix and Ocelot, which was their hardware-default value, and which
  there wasn't any intention in changing.
- In the case of SCH_NEXT_CFG, in fact Seville does not have this
  register field at all, and therefore, if we want to have common code
  we would be required to not write to it.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-13 17:40:01 -07:00
Maxim Kochetkov
2789658fa3 soc: mscc: ocelot: add MII registers description
Add the register definitions for the MSCC MIIM MDIO controller in
preparation for seville_vsc9959.c to create its accessors for the
internal MDIO bus.

Since we've introduced elements to ocelot_regfields that are not
instantiated by felix and ocelot, we need to define the size of the
regfields arrays explicitly, otherwise ocelot_regfields_init, which
iterates up to REGFIELD_MAX, will fault on the undefined regfield
entries (if we're lucky).

Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-13 17:40:01 -07:00