Commit Graph

321 Commits

Author SHA1 Message Date
Vladimir Oltean
6dfd23d35e net: dsa: sja1105: delete vlan delta save/restore logic
With the best_effort_vlan_filtering mode now gone, the driver does not
have 3 operating modes anymore (VLAN-unaware, VLAN-aware and best effort),
but only 2.

The idea is that we will gain support for network stack I/O through a
VLAN-aware bridge, using the data plane offload framework (imprecise RX,
imprecise TX). So the VLAN-aware use case will be more functional.

But standalone ports that are part of the same switch when some other
ports are under a VLAN-aware bridge should work too. Termination on
those should work through the tag_8021q RX VLAN and TX VLAN.

This was not possible using the old logic, because:
- in VLAN-unaware mode, only the tag_8021q VLANs were committed to hw
- in VLAN-aware mode, only the bridge VLANs were committed to hw
- in best-effort VLAN mode, both the tag_8021q and bridge VLANs were
  committed to hw

The strategy for the new VLAN-aware mode is to allow the bridge and the
tag_8021q VLANs to coexist in the VLAN table at the same time.

[ yes, we need to make sure that the bridge cannot install a tag_8021q
  VLAN, but ]

This means that the save/restore logic introduced by commit ec5ae61076
("net: dsa: sja1105: save/restore VLANs using a delta commit method")
does not serve a purpose any longer. We can delete it and restore the
old code that simply adds a VLAN to the VLAN table and calls it a day.

Note that we keep the sja1105_commit_pvid() function from those days,
but adapt it slightly. Ports that are under a VLAN-aware bridge use the
bridge's pvid, ports that are standalone or under a VLAN-unaware bridge
use the tag_8021q pvid, for local termination or VLAN-unaware forwarding.

Now, when the vlan_filtering property is toggled for the bridge, the
pvid of the ports beneath it is the only thing that's changing, we no
longer delete some VLANs and restore others.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-26 22:35:22 +01:00
Colin Ian King
d63f8877c4 net: dsa: sja1105: remove redundant re-assignment of pointer table
The pointer table is being re-assigned with a value that is never
read. The assignment is redundant and can be removed.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-26 22:35:22 +01:00
David S. Miller
5af84df962 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Conflicts are simple overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-23 16:13:06 +01:00
Vladimir Oltean
e40cba9490 net: dsa: sja1105: make VID 4095 a bridge VLAN too
This simple series of commands:

ip link add br0 type bridge vlan_filtering 1
ip link set swp0 master br0

fails on sja1105 with the following error:
[   33.439103] sja1105 spi0.1: vlan-lookup-table needs to have at least the default untagged VLAN
[   33.447710] sja1105 spi0.1: Invalid config, cannot upload
Warning: sja1105: Failed to change VLAN Ethertype.

For context, sja1105 has 3 operating modes:
- SJA1105_VLAN_UNAWARE: the dsa_8021q_vlans are committed to hardware
- SJA1105_VLAN_FILTERING_FULL: the bridge_vlans are committed to hardware
- SJA1105_VLAN_FILTERING_BEST_EFFORT: both the dsa_8021q_vlans and the
  bridge_vlans are committed to hardware

Swapping out a VLAN list and another in happens in
sja1105_build_vlan_table(), which performs a delta update procedure.
That function is called from a few places, notably from
sja1105_vlan_filtering() which is called from the
SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING handler.

The above set of 2 commands fails when run on a kernel pre-commit
8841f6e63f ("net: dsa: sja1105: make devlink property
best_effort_vlan_filtering true by default"). So the priv->vlan_state
transition that takes place is between VLAN-unaware and full VLAN
filtering. So the dsa_8021q_vlans are swapped out and the bridge_vlans
are swapped in.

So why does it fail?

Well, the bridge driver, through nbp_vlan_init(), first sets up the
SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING attribute, and only then
proceeds to call nbp_vlan_add for the default_pvid.

So when we swap out the dsa_8021q_vlans and swap in the bridge_vlans in
the SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING handler, there are no bridge
VLANs (yet). So we have wiped the VLAN table clean, and the low-level
static config checker complains of an invalid configuration. We _will_
add the bridge VLANs using the dynamic config interface, albeit later,
when nbp_vlan_add() calls us. So it is natural that it fails.

So why did it ever work?

Surprisingly, it looks like I only tested this configuration with 2
things set up in a particular way:
- a network manager that brings all ports up
- a kernel with CONFIG_VLAN_8021Q=y

It is widely known that commit ad1afb0039 ("vlan_dev: VLAN 0 should be
treated as "no vlan tag" (802.1p packet)") installs VID 0 to every net
device that comes up. DSA treats these VLANs as bridge VLANs, and
therefore, in my testing, the list of bridge_vlans was never empty.

However, if CONFIG_VLAN_8021Q is not enabled, or the port is not up when
it joins a VLAN-aware bridge, the bridge_vlans list will be temporarily
empty, and the sja1105_static_config_reload() call from
sja1105_vlan_filtering() will fail.

To fix this, the simplest thing is to keep VID 4095, the one used for
CPU-injected control packets since commit ed040abca4 ("net: dsa:
sja1105: use 4095 as the private VLAN for untagged traffic"), in the
list of bridge VLANs too, not just the list of tag_8021q VLANs. This
ensures that the list of bridge VLANs will never be empty.

Fixes: ec5ae61076 ("net: dsa: sja1105: save/restore VLANs using a delta commit method")
Reported-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-21 22:53:43 -07:00
Vladimir Oltean
c64b9c0504 net: dsa: tag_8021q: add proper cross-chip notifier support
The big problem which mandates cross-chip notifiers for tag_8021q is
this:

                                             |
    sw0p0     sw0p1     sw0p2     sw0p3     sw0p4
 [  user ] [  user ] [  user ] [  dsa  ] [  cpu  ]
                                   |
                                   +---------+
                                             |
    sw1p0     sw1p1     sw1p2     sw1p3     sw1p4
 [  user ] [  user ] [  user ] [  dsa  ] [  dsa  ]
                                   |
                                   +---------+
                                             |
    sw2p0     sw2p1     sw2p2     sw2p3     sw2p4
 [  user ] [  user ] [  user ] [  dsa  ] [  dsa  ]

When the user runs:

ip link add br0 type bridge
ip link set sw0p0 master br0
ip link set sw2p0 master br0

It doesn't work.

This is because dsa_8021q_crosschip_bridge_join() assumes that "ds" and
"other_ds" are at most 1 hop away from each other, so it is sufficient
to add the RX VLAN of {ds, port} into {other_ds, other_port} and vice
versa and presto, the cross-chip link works. When there is another
switch in the middle, such as in this case switch 1 with its DSA links
sw1p3 and sw1p4, somebody needs to tell it about these VLANs too.

Which is exactly why the problem is quadratic: when a port joins a
bridge, for each port in the tree that's already in that same bridge we
notify a tag_8021q VLAN addition of that port's RX VLAN to the entire
tree. It is a very complicated web of VLANs.

It must be mentioned that currently we install tag_8021q VLANs on too
many ports (DSA links - to be precise, on all of them). For example,
when sw2p0 joins br0, and assuming sw1p0 was part of br0 too, we add the
RX VLAN of sw2p0 on the DSA links of switch 0 too, even though there
isn't any port of switch 0 that is a member of br0 (at least yet).
In theory we could notify only the switches which sit in between the
port joining the bridge and the port reacting to that bridge_join event.
But in practice that is impossible, because of the way 'link' properties
are described in the device tree. The DSA bindings require DT writers to
list out not only the real/physical DSA links, but in fact the entire
routing table, like for example switch 0 above will have:

	sw0p3: port@3 {
		link = <&sw1p4 &sw2p4>;
	};

This was done because:

/* TODO: ideally DSA ports would have a single dp->link_dp member,
 * and no dst->rtable nor this struct dsa_link would be needed,
 * but this would require some more complex tree walking,
 * so keep it stupid at the moment and list them all.
 */

but it is a perfect example of a situation where too much information is
actively detrimential, because we are now in the position where we
cannot distinguish a real DSA link from one that is put there to avoid
the 'complex tree walking'. And because DT is ABI, there is not much we
can change.

And because we do not know which DSA links are real and which ones
aren't, we can't really know if DSA switch A is in the data path between
switches B and C, in the general case.

So this is why tag_8021q RX VLANs are added on all DSA links, and
probably why it will never change.

On the other hand, at least the number of additions/deletions is well
balanced, and this means that once we implement reference counting at
the cross-chip notifier level a la fdb/mdb, there is absolutely zero
need for a struct dsa_8021q_crosschip_link, it's all self-managing.

In fact, with the tag_8021q notifiers emitted from the bridge join
notifiers, it becomes so generic that sja1105 does not need to do
anything anymore, we can just delete its implementation of the
.crosschip_bridge_{join,leave} methods.

Among other things we can simply delete is the home-grown implementation
of sja1105_notify_crosschip_switches(). The reason why that is wrong is
because it is not quadratic - it only covers remote switches to which we
have a cross-chip bridging link and that does not cover in-between
switches. This deletion is part of the same patch because sja1105 used
to poke deep inside the guts of the tag_8021q context in order to do
that. Because the cross-chip links went away, so needs the sja1105 code.

Last but not least, dsa_8021q_setup_port() is simplified (and also
renamed). Because our TAG_8021Q_VLAN_ADD notifier is designed to react
on the CPU port too, the four dsa_8021q_vid_apply() calls:
- 1 for RX VLAN on user port
- 1 for the user port's RX VLAN on the CPU port
- 1 for TX VLAN on user port
- 1 for the user port's TX VLAN on the CPU port

now get squashed into only 2 notifier calls via
dsa_port_tag_8021q_vlan_add.

And because the notifiers to add and to delete a tag_8021q VLAN are
distinct, now we finally break up the port setup and teardown into
separate functions instead of relying on a "bool enabled" flag which
tells us what to do. Arguably it should have been this way from the
get go.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-20 06:36:42 -07:00
Vladimir Oltean
328621f613 net: dsa: tag_8021q: absorb dsa_8021q_setup into dsa_tag_8021q_{,un}register
Right now, setting up tag_8021q is a 2-step operation for a driver,
first the context structure needs to be created, then the VLANs need to
be installed on the ports. A similar thing is true for teardown.

Merge the 2 steps into the register/unregister methods, to be as
transparent as possible for the driver as to what tag_8021q does behind
the scenes. This also gets rid of the funny "bool setup == true means
setup, == false means teardown" API that tag_8021q used to expose.

Note that dsa_tag_8021q_register() must be called at least in the
.setup() driver method and never earlier (like in the driver probe
function). This is because the DSA switch tree is not initialized at
probe time, and the cross-chip notifiers will not work.

For symmetry with .setup(), the unregister method should be put in
.teardown().

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-20 06:36:42 -07:00
Vladimir Oltean
5da11eb407 net: dsa: make tag_8021q operations part of the core
Make tag_8021q a more central element of DSA and move the 2 driver
specific operations outside of struct dsa_8021q_context (which is
supposed to hold dynamic data and not really constant function
pointers).

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-20 06:36:42 -07:00
Vladimir Oltean
d7b1fd520d net: dsa: let the core manage the tag_8021q context
The basic problem description is as follows:

Be there 3 switches in a daisy chain topology:

                                             |
    sw0p0     sw0p1     sw0p2     sw0p3     sw0p4
 [  user ] [  user ] [  user ] [  dsa  ] [  cpu  ]
                                   |
                                   +---------+
                                             |
    sw1p0     sw1p1     sw1p2     sw1p3     sw1p4
 [  user ] [  user ] [  user ] [  dsa  ] [  dsa  ]
                                   |
                                   +---------+
                                             |
    sw2p0     sw2p1     sw2p2     sw2p3     sw2p4
 [  user ] [  user ] [  user ] [  user ] [  dsa  ]

The CPU will not be able to ping through the user ports of the
bottom-most switch (like for example sw2p0), simply because tag_8021q
was not coded up for this scenario - it has always assumed DSA switch
trees with a single switch.

To add support for the topology above, we must admit that the RX VLAN of
sw2p0 must be added on some ports of switches 0 and 1 as well. This is
in fact a textbook example of thing that can use the cross-chip notifier
framework that DSA has set up in switch.c.

There is only one problem: core DSA (switch.c) is not able right now to
make the connection between a struct dsa_switch *ds and a struct
dsa_8021q_context *ctx. Right now, it is drivers who call into
tag_8021q.c and always provide a struct dsa_8021q_context *ctx pointer,
and tag_8021q.c calls them back with the .tag_8021q_vlan_{add,del}
methods.

But with cross-chip notifiers, it is possible for tag_8021q to call
drivers without drivers having ever asked for anything. A good example
is right above: when sw2p0 wants to set itself up for tag_8021q,
the .tag_8021q_vlan_add method needs to be called for switches 1 and 0,
so that they transport sw2p0's VLANs towards the CPU without dropping
them.

So instead of letting drivers manage the tag_8021q context, add a
tag_8021q_ctx pointer inside of struct dsa_switch, which will be
populated when dsa_tag_8021q_register() returns success.

The patch is fairly long-winded because we are partly reverting commit
5899ee367a ("net: dsa: tag_8021q: add a context structure") which made
the driver-facing tag_8021q API use "ctx" instead of "ds". Now that we
can access "ctx" directly from "ds", this is no longer needed.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-20 06:36:42 -07:00
Vladimir Oltean
cedf467064 net: dsa: tag_8021q: create dsa_tag_8021q_{register,unregister} helpers
In preparation of moving tag_8021q to core DSA, move all initialization
and teardown related to tag_8021q which is currently done by drivers in
2 functions called "register" and "unregister". These will gather more
functionality in future patches, which will better justify the chosen
naming scheme.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-20 06:36:42 -07:00
Vladimir Oltean
0fac6aa098 net: dsa: sja1105: delete the best_effort_vlan_filtering mode
Simply put, the best-effort VLAN filtering mode relied on VLAN retagging
from a bridge VLAN towards a tag_8021q sub-VLAN in order to be able to
decode the source port in the tagger, but the VLAN retagging
implementation inside the sja1105 chips is not the best and we were
relying on marginal operating conditions.

The most notable limitation of the best-effort VLAN filtering mode is
its incapacity to treat this case properly:

ip link add br0 type bridge vlan_filtering 1
ip link set swp2 master br0
ip link set swp4 master br0
bridge vlan del dev swp4 vid 1
bridge vlan add dev swp4 vid 1 pvid

When sending an untagged packet through swp2, the expectation is for it
to be forwarded to swp4 as egress-tagged (so it will contain VLAN ID 1
on egress). But the switch will send it as egress-untagged.

There was an attempt to fix this here:
https://patchwork.kernel.org/project/netdevbpf/patch/20210407201452.1703261-2-olteanv@gmail.com/

but it failed miserably because it broke PTP RX timestamping, in a way
that cannot be corrected due to hardware issues related to VLAN
retagging.

So with either PTP broken or pushing VLAN headers on egress for untagged
packets being broken, the sad reality is that the best-effort VLAN
filtering code is broken. Delete it.

Note that this means there will be a temporary loss of functionality in
this driver until it is replaced with something better (network stack
RX/TX capability for "mode 2" as described in
Documentation/networking/dsa/sja1105.rst, the "port under VLAN-aware
bridge" case). We simply cannot keep this code until that driver rework
is done, it is super bloated and tangled with tag_8021q.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-20 06:36:42 -07:00
Vladimir Oltean
b0b33b048d net: dsa: sja1105: fix address learning getting disabled on the CPU port
In May 2019 when commit 640f763f98 ("net: dsa: sja1105: Add support
for Spanning Tree Protocol") was introduced, the comment that "STP does
not get called for the CPU port" was true. This changed after commit
0394a63acf ("net: dsa: enable and disable all ports") in August 2019
and went largely unnoticed, because the sja1105_bridge_stp_state_set()
method did nothing different compared to the static setup done by
sja1105_init_mac_settings().

With the ability to turn address learning off introduced by the blamed
commit, there is a new priv->learn_ena port mask in the driver. When
sja1105_bridge_stp_state_set() gets called and we are in
BR_STATE_LEARNING or later, address learning is enabled or not depending
on priv->learn_ena & BIT(port).

So what happens is that priv->learn_ena is not being set from anywhere
for the CPU port, and the static configuration done by
sja1105_init_mac_settings() is being overwritten.

To solve this, acknowledge that the static configuration of STP state is
no longer necessary because the STP state is being set by the DSA core
now, but what is necessary is to set priv->learn_ena for the CPU port.

Fixes: 4d94235495 ("net: dsa: sja1105: offload bridge port flags to device")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-13 09:32:41 -07:00
Jakub Kicinski
b6df00789e Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Trivial conflict in net/netfilter/nf_tables_api.c.

Duplicate fix in tools/testing/selftests/net/devlink_port_split.py
- take the net-next version.

skmsg, and L4 bpf - keep the bpf code but remove the flags
and err params.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-06-29 15:45:27 -07:00
Vladimir Oltean
74e7feff0e net: dsa: sja1105: fix dynamic access to L2 Address Lookup table for SJA1110
The SJA1105P/Q/R/S and SJA1110 may have the same layout for the command
to read/write/search for L2 Address Lookup entries, but as explained in
the comments at the beginning of the sja1105_dynamic_config.c file, the
command portion of the buffer is at the end, and we need to obtain a
pointer to it by adding the length of the entry to the buffer.

Alas, the length of an L2 Address Lookup entry is larger in SJA1110 than
it is for SJA1105P/Q/R/S, so we need to create a common helper to access
the command buffer, and this receives as argument the length of the
entry buffer.

Fixes: 3e77e59bf8 ("net: dsa: sja1105: add support for the SJA1110 switch family")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28 15:49:05 -07:00
Vladimir Oltean
be7f62eeba net: dsa: sja1105: fix NULL pointer dereference in sja1105_reload_cbs()
priv->cbs is an array of priv->info->num_cbs_shapers elements of type
struct sja1105_cbs_entry which only get allocated if CONFIG_NET_SCH_CBS
is enabled.

However, sja1105_reload_cbs() is called from sja1105_static_config_reload()
which in turn is called for any of the items in sja1105_reset_reasons,
therefore during the normal runtime of the driver and not just from a
code path which can be triggered by the tc-cbs offload.

The sja1105_reload_cbs() function does not contain a check whether the
priv->cbs array is NULL or not, it just assumes it isn't and proceeds to
iterate through the credit-based shaper elements. This leads to a NULL
pointer dereference.

The solution is to return success if the priv->cbs array has not been
allocated, since sja1105_reload_cbs() has nothing to do.

Fixes: 4d7525085a ("net: dsa: sja1105: offload the Credit-Based Shaper qdisc")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24 15:46:51 -07:00
Vladimir Oltean
75e994709f net: dsa: sja1105: document the SJA1110 in the Kconfig
Mention support for the SJA1110 in menuconfig.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24 12:55:57 -07:00
Vladimir Oltean
61c77533b8 net: dsa: sja1105: completely error out in sja1105_static_config_reload if something fails
If reloading the static config fails for whatever reason, for example if
sja1105_static_config_check_valid() fails, then we "goto out_unlock_ptp"
but we print anyway that "Reset switch and programmed static config.",
which is confusing because we didn't. We also do a bunch of other stuff
like reprogram the XPCS and reload the credit-based shapers, as if a
switch reset took place, which didn't.

So just unlock the PTP lock and goto out, skipping all of that.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-18 12:26:17 -07:00
Vladimir Oltean
1303e7f9b6 net: dsa: sja1105: allow the TTEthernet configuration in the static config for SJA1110
Currently sja1105_static_config_check_valid() is coded up to detect
whether TTEthernet is supported based on device ID, and this check was
not updated to cover SJA1110.

However, it is desirable to have as few checks for the device ID as
possible, so the driver core is more generic. So what we can do is look
at the static config table operations implemented by that specific
switch family (populated by sja1105_static_config_init) whether the
schedule table has a non-zero maximum entry count (meaning that it is
supported) or not.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-18 12:26:17 -07:00
Vladimir Oltean
cb5a82d2b9 net: dsa: sja1105: properly power down the microcontroller clock for SJA1110
It turns out that powering down the BASE_TIMER_CLK does not turn off the
microcontroller, just its timers, including the one for the watchdog.
So the embedded microcontroller is still running, and potentially still
doing things.

To prevent unwanted interference, we should power down the BASE_MCSS_CLK
as well (MCSS = microcontroller subsystem).

The trouble is that currently we turn off the BASE_TIMER_CLK for SJA1110
from the .clocking_setup() method, mostly because this is a Clock
Generation Unit (CGU) setting which was traditionally configured in that
method for SJA1105. But in SJA1105, the CGU was used for bringing up the
port clocks at the proper speeds, and in SJA1110 it's not (but rather
for initial configuration), so it's best that we rebrand the
sja1110_clocking_setup() method into what it really is - an implementation
of the .disable_microcontroller() method.

Since disabling the microcontroller only needs to be done once, at probe
time, we can choose the best place to do that as being in sja1105_setup(),
before we upload the static config to the device. This guarantees that
the static config being used by the switch afterwards is really ours.

Note that the procedure to upload a static config necessarily resets the
switch. This already did not reset the microcontroller, only the switch
core, so since the .disable_microcontroller() method is guaranteed to be
called by that point, if it's disabled, it remains disabled. Add a
comment to make that clear.

With the code movement for SJA1110 from .clocking_setup() to
.disable_microcontroller(), both methods are optional and are guarded by
"if" conditions.

Tested by enabling in the device tree the rev-mii switch port 0 that
goes towards the microcontroller, and flashing a firmware that would
have networking. Without this patch, the microcontroller can be pinged,
with this patch it cannot.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-18 12:26:17 -07:00
Vladimir Oltean
3009e8aa85 net: dsa: sja1105: constify the sja1105_regs structures
The struct sja1105_regs tables are not modified during the runtime of
the driver, so they can be made constant. In fact, struct sja1105_info
already holds a const pointer to these.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-14 13:14:24 -07:00
Vladimir Oltean
56b6346633 net: dsa: sja1105: plug in support for 2500base-x
The MAC treats 2500base-x same as SGMII (yay for that) except that it
must be set to a different speed.

Extend all places that check for SGMII to also check for 2500base-x.

Also add the missing 2500base-x compatibility matrix entry for SJA1110D.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11 13:43:56 -07:00
Vladimir Oltean
ece578bc3e net: dsa: sja1105: SGMII and 2500base-x on the SJA1110 are 'special'
For the xMII Mode Parameters Table to be properly configured for SGMII
mode on SJA1110, we need to set the "special" bit, since SGMII is
officially bitwise coded as 0b0011 in SJA1105 (decimal 3, equal to
XMII_MODE_SGMII), and as 0b1011 in SJA1110 (decimal 11).

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11 13:43:56 -07:00
Vladimir Oltean
27871359bd net: dsa: sja1105: register the PCS MDIO bus for SJA1110
On the SJA1110, the PCS of each SERDES-capable port is accessed through
a different memory window which is 0x100 bytes in size, denoted by
"pcs_base".

In each PCS register access window, the XPCS MMDs are accessed in an
indirect way: in pages/banks of up to 0x100 addresses each. Changing the
page/bank is done by writing to a special register at the end of the
access window.

The MDIO register map accessed indirectly through the indirect banked
method described above is similar to what SJA1105 has: upper 5 bits are
the MMD, lower 16 bits are the MDIO address within that MMD.

Since the PHY ID reported by the XPCS inside SJA1110 is also all zeroes
(like SJA1105), we need to trap those reads and return a fake PHY ID so
that the xpcs driver can apply some specific fixups for our integration.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11 13:43:56 -07:00
Vladimir Oltean
3ad1d17154 net: dsa: sja1105: migrate to xpcs for SGMII
There is a desire to use the generic driver for the Synopsys XPCS
located in drivers/net/pcs, and to achieve that, the sja1105 driver must
expose an MDIO bus for the SGMII PCS, because the XPCS probes as an
mdio_device.

In preparation of the SJA1110 which in fact has a different access
procedure for the SJA1105, we register this PCS MDIO bus once in the
common code, but we implement function pointers for the read and write
methods. In this patch there is a single implementation for them.

There is exactly one MDIO bus for the PCS, this will contain all PCSes
at MDIO addresses equal to the port number.

We delete a bunch of hardware support code because the xpcs driver
already does what we need.

We need to hack up the MDIO reads for the PHY ID, since our XPCS
instantiation returns zeroes and there are some specific fixups which
need to be applied by the xpcs driver.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11 13:43:56 -07:00
Vladimir Oltean
566b18c8b7 net: dsa: sja1105: implement TX timestamping for SJA1110
The TX timestamping procedure for SJA1105 is a bit unconventional
because the transmit procedure itself is unconventional.

Control packets (and therefore PTP as well) are transmitted to a
specific port in SJA1105 using "management routes" which must be written
over SPI to the switch. These are one-shot rules that match by
destination MAC address on traffic coming from the CPU port, and select
the precise destination port for that packet. So to transmit a packet
from NET_TX softirq context, we actually need to defer to a process
context so that we can perform that SPI write before we send the packet.
The DSA master dev_queue_xmit() runs in process context, and we poll
until the switch confirms it took the TX timestamp, then we annotate the
skb clone with that TX timestamp. This is why the sja1105 driver does
not need an skb queue for TX timestamping.

But the SJA1110 is a bit (not much!) more conventional, and you can
request 2-step TX timestamping through the DSA header, as well as give
the switch a cookie (timestamp ID) which it will give back to you when
it has the timestamp. So now we do need a queue for keeping the skb
clones until their TX timestamps become available.

The interesting part is that the metadata frames from SJA1105 haven't
disappeared completely. On SJA1105 they were used as follow-ups which
contained RX timestamps, but on SJA1110 they are actually TX completion
packets, which contain a variable (up to 32) array of timestamps.
Why an array? Because:
- not only is the TX timestamp on the egress port being communicated,
  but also the RX timestamp on the CPU port. Nice, but we don't care
  about that, so we ignore it.
- because a packet could be multicast to multiple egress ports, each
  port takes its own timestamp, and the TX completion packet contains
  the individual timestamps on each port.

This is unconventional because switches typically have a timestamping
FIFO and raise an interrupt, but this one doesn't. So the tagger needs
to detect and parse meta frames, and call into the main switch driver,
which pairs the timestamps with the skbs in the TX timestamping queue
which are waiting for one.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11 12:45:38 -07:00
Vladimir Oltean
30b73242e6 net: dsa: sja1105: add the RX timestamping procedure for SJA1110
This is really easy, since the full RX timestamp is in the DSA trailer
and the tagger code transfers it to SJA1105_SKB_CB(skb)->tstamp, we just
need to move it to the skb shared info region. This is as opposed to
SJA1105, where the RX timestamp was received in a meta frame (so there
needed to be a state machine to pair the 2 packets) and the timestamp
was partial (so the packet, once matched with its timestamp, needed to
be added to an RX timestamping queue where the PTP aux worker would
reconstruct that timestamp).

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11 12:45:38 -07:00
Vladimir Oltean
4913b8ebf8 net: dsa: add support for the SJA1110 native tagging protocol
The SJA1110 has improved a few things compared to SJA1105:

- To send a control packet from the host port with SJA1105, one needed
  to program a one-shot "management route" over SPI. This is no longer
  true with SJA1110, you can actually send "in-band control extensions"
  in the packets sent by DSA, these are in fact DSA tags which contain
  the destination port and switch ID.

- When receiving a control packet from the switch with SJA1105, the
  source port and switch ID were written in bytes 3 and 4 of the
  destination MAC address of the frame (which was a very poor shot at a
  DSA header). If the control packet also had an RX timestamp, that
  timestamp was sent in an actual follow-up packet, so there were
  reordering concerns on multi-core/multi-queue DSA masters, where the
  metadata frame with the RX timestamp might get processed before the
  actual packet to which that timestamp belonged (there is no way to
  pair a packet to its timestamp other than the order in which they were
  received). On SJA1110, this is no longer true, control packets have
  the source port, switch ID and timestamp all in the DSA tags.

- Timestamps from the switch were partial: to get a 64-bit timestamp as
  required by PTP stacks, one would need to take the partial 24-bit or
  32-bit timestamp from the packet, then read the current PTP time very
  quickly, and then patch in the high bits of the current PTP time into
  the captured partial timestamp, to reconstruct what the full 64-bit
  timestamp must have been. That is awful because packet processing is
  done in NAPI context, but reading the current PTP time is done over
  SPI and therefore needs sleepable context.

But it also aggravated a few things:

- Not only is there a DSA header in SJA1110, but there is a DSA trailer
  in fact, too. So DSA needs to be extended to support taggers which
  have both a header and a trailer. Very unconventional - my understanding
  is that the trailer exists because the timestamps couldn't be prepared
  in time for putting them in the header area.

- Like SJA1105, not all packets sent to the CPU have the DSA tag added
  to them, only control packets do:

  * the ones which match the destination MAC filters/traps in
    MAC_FLTRES1 and MAC_FLTRES0
  * the ones which match FDB entries which have TRAP or TAKETS bits set

  So we could in theory hack something up to request the switch to take
  timestamps for all packets that reach the CPU, and those would be
  DSA-tagged and contain the source port / switch ID by virtue of the
  fact that there needs to be a timestamp trailer provided. BUT:

- The SJA1110 does not parse its own DSA tags in a way that is useful
  for routing in cross-chip topologies, a la Marvell. And the sja1105
  driver already supports cross-chip bridging from the SJA1105 days.
  It does that by automatically setting up the DSA links as VLAN trunks
  which contain all the necessary tag_8021q RX VLANs that must be
  communicated between the switches that span the same bridge. So when
  using tag_8021q on sja1105, it is possible to have 2 switches with
  ports sw0p0, sw0p1, sw1p0, sw1p1, and 2 VLAN-unaware bridges br0 and
  br1, and br0 can take sw0p0 and sw1p0, and br1 can take sw0p1 and
  sw1p1, and forwarding will happen according to the expected rules of
  the Linux bridge.
  We like that, and we don't want that to go away, so as a matter of
  fact, the SJA1110 tagger still needs to support tag_8021q.

So the sja1110 tagger is a hybrid between tag_8021q for data packets,
and the native hardware support for control packets.

On RX, packets have a 13-byte trailer if they contain an RX timestamp.
That trailer is padded in such a way that its byte 8 (the start of the
"residence time" field - not parsed by Linux because we don't care) is
aligned on a 16 byte boundary. So the padding has a variable length
between 0 and 15 bytes. The DSA header contains the offset of the
beginning of the padding relative to the beginning of the frame (and the
end of the padding is obviously the end of the packet minus 13 bytes,
the length of the trailer). So we discard it.

Packets which don't have a trailer contain the source port and switch ID
information in the header (they are "trap-to-host" packets). Packets
which have a trailer contain the source port and switch ID in the trailer.

On TX, the destination port mask and switch ID is always in the trailer,
so we always need to say in the header that a trailer is present.

The header needs a custom EtherType and this was chosen as 0xdadc, after
0xdada which is for Marvell and 0xdadb which is for VLANs in
VLAN-unaware mode on SJA1105 (and SJA1110 in fact too).

Because we use tag_8021q in concert with the native tagging protocol,
control packets will have 2 DSA tags.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11 12:45:38 -07:00
Vladimir Oltean
617ef8d937 net: dsa: sja1105: make SJA1105_SKB_CB fit a full timestamp
In SJA1105, RX timestamps for packets sent to the CPU are transmitted in
separate follow-up packets (metadata frames). These contain partial
timestamps (24 or 32 bits) which are kept in SJA1105_SKB_CB(skb)->meta_tstamp.

Thankfully, SJA1110 improved that, and the RX timestamps are now
transmitted in-band with the actual packet, in the timestamp trailer.
The RX timestamps are now full-width 64 bits.

Because we process the RX DSA tags in the rcv() method in the tagger,
but we would like to preserve the DSA code structure in that we populate
the skb timestamp in the port_rxtstamp() call which only happens later,
the implication is that we must somehow pass the 64-bit timestamp from
the rcv() method all the way to port_rxtstamp(). We can use the skb->cb
for that.

Rename the meta_tstamp from struct sja1105_skb_cb from "meta_tstamp" to
"tstamp", and increase its size to 64 bits.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11 12:45:38 -07:00
Vladimir Oltean
6c0de59b3d net: dsa: sja1105: allow RX timestamps to be taken on all ports for SJA1110
On SJA1105, there is support for a cascade port which is presumably
connected to a downstream SJA1105 switch. The upstream one does not take
PTP timestamps for packets received on this port, presumably because the
downstream switch already did (and for PTP, it only makes sense for the
leaf nodes in a DSA switch tree to do that).

I haven't been able to validate that feature in a fully assembled setup,
so I am disabling the feature by setting the cascade port to an unused
port value (ds->num_ports).

In SJA1110, multiple cascade ports are supported, and CASC_PORT became
a bit mask from a port number. So when CASC_PORT is set to ds->num_ports
(which is 11 on SJA1110), it is actually set to 0b1011, so ports 3, 1
and 0 are configured as cascade ports and we cannot take RX timestamps
on them.

So we need to introduce a check for SJA1110 and set things differently
(to zero there), so that the cascading feature is properly disabled and
RX timestamps can be taken on all ports.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11 12:45:38 -07:00
Vladimir Oltean
29305260d2 net: dsa: sja1105: enable the TTEthernet engine on SJA1110
As opposed to SJA1105 where there are parts with TTEthernet and parts
without, in SJA1110 all parts support it, but it must be enabled in the
static config. So enable it unconditionally. We use it for the tc-taprio
and tc-gate offload.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11 12:45:38 -07:00
Colin Ian King
ab324d8dfd net: dsa: sja1105: Fix assigned yet unused return code rc
The return code variable rc is being set to return error values in two
places in sja1105_mdiobus_base_tx_register and yet it is not being
returned, the function always returns 0 instead. Fix this by replacing
the return 0 with the return code rc.

Addresses-Coverity: ("Unused value")
Fixes: 5a8f09748e ("net: dsa: sja1105: register the MDIO buses for 100base-T1 and 100base-TX")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-09 15:46:30 -07:00
Vladimir Oltean
5a8f09748e net: dsa: sja1105: register the MDIO buses for 100base-T1 and 100base-TX
The SJA1110 contains two types of integrated PHYs: one 100base-TX PHY
and multiple 100base-T1 PHYs.

The access procedure for the 100base-T1 PHYs is also different than it
is for the 100base-TX one. So we register 2 MDIO buses, one for the
base-TX and the other for the base-T1. Each bus has an OF node which is
a child of the "mdio" subnode of the switch, and they are recognized by
compatible string.

Cc: Russell King <linux@armlinux.org.uk>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: devicetree@vger.kernel.org
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 14:37:16 -07:00
Vladimir Oltean
ceec8bc098 net: dsa: sja1105: make sure the retagging port is enabled for SJA1110
The SJA1110 has an extra configuration in the General Parameters Table
through which the user can select the buffer reservation config.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 14:37:16 -07:00
Vladimir Oltean
3e77e59bf8 net: dsa: sja1105: add support for the SJA1110 switch family
The SJA1110 is basically an SJA1105 with more ports, some integrated
PHYs (100base-T1 and 100base-TX) and an embedded microcontroller which
can be disabled, and the switch core can be controlled by a host running
Linux, over SPI.

This patch contains:
- the static and dynamic config packing functions, for the tables that
  are common with SJA1105
- one more static config tables which is "unique" to the SJA1110
  (actually it is a rehash of stuff that was placed somewhere else in
  SJA1105): the PCP Remapping Table
- a reset and clock configuration procedure for the SJA1110 switch.
  This resets just the switch subsystem, and gates off the clock which
  powers on the embedded microcontroller.
- an RGMII delay configuration procedure for SJA1110, which is very
  similar to SJA1105, but different enough for us to be unable to reuse
  it (this is a pattern that repeats itself)
- some adaptations to dynamic config table entries which are no longer
  programmed in the same way. For example, to delete a VLAN, you used to
  write an entry through the dynamic reconfiguration interface with the
  desired VLAN ID, and with the VALIDENT bit set to false. Now, the VLAN
  table entries contain a TYPE_ENTRY field, which must be set to zero
  (in a backwards-incompatible way) in order for the entry to be deleted,
  or to some other entry for the VLAN to match "inner tagged" or "outer
  tagged" packets.
- a similar thing for the static config: the xMII Mode Parameters Table
  encoding for SGMII and MII (the latter just when attached to a
  100base-TX PHY) just isn't what it used to be in SJA1105. They are
  identical, except there is an extra "special" bit which needs to be
  set. Set it.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 14:37:16 -07:00
Vladimir Oltean
5d645df99a net: dsa: sja1105: determine PHY/MAC role from PHY interface type
Now that both RevMII as well as RevRMII exist, we can deprecate the
sja1105,role-mac and sja1105,role-phy properties and simply let the user
select that a port operates in MII PHY role by using
	phy-mode = "rev-mii";
or in RMII PHY role by using
	phy-mode = "rev-rmii";

There are no fixed-link MII or RMII properties in mainline device trees,
and the setup itself is fairly uncommon, so there shouldn't be risks of
breaking compatibility.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-07 12:20:18 -07:00
Vladimir Oltean
29afb83ac9 net: dsa: sja1105: apply RGMII delays based on the fixed-link property
The sja1105 driver has an intermediate way of determining whether the
RGMII delays should be applied by the PHY or by itself: by looking at
the port role (PHY or MAC). The port can be put in the PHY role either
explicitly (sja1105,role-phy) or implicitly (fixed-link).

We want to deprecate the sja1105,role-phy property, so all that remains
is the fixed-link property. Introduce a "fixed_link" array of booleans
in the driver, and use that to determine whether RGMII delays must be
applied or not.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-07 12:20:18 -07:00
Vladimir Oltean
96c85f51f1 net: dsa: sja1105: some table entries are always present when read dynamically
The SJA1105 has a static configuration comprised of a number of tables
with entries. Some of these can be read and modified at runtime as well,
through the dynamic configuration interface.

As a careful reader can notice from the comments in this file, the
software interface for accessing a table entry through the dynamic
reconfiguration is a bit of a no man's land, and varies wildly across
switch generations and even from one kind of table to another.

I have tried my best to come up with a software representation of a
'common denominator' SPI command to access a table entry through the
dynamic configuration interface:

struct sja1105_dyn_cmd {
	bool search;
	u64 valid; /* must be set to 1 */
	u64 rdwrset; /* 0 to read, 1 to write */
	u64 errors;
	u64 valident; /* 0 if entry is invalid, 1 if valid */
	u64 index;
};

Relevant to this patch is the VALIDENT bit, which for READ commands is
populated by the switch and lets us know if we're looking at junk or at
a real table entry.

In SJA1105, the dynamic reconfiguration interface for management routes
has notably not implemented the VALIDENT bit, leading to a workaround to
ignore this field in sja1105_dynamic_config_read(), as it will be set to
zero, but the data is valid nonetheless.

In SJA1110, this pattern has sadly been abused to death, and while there
are many more tables which can be read back over the dynamic config
interface compared to SJA1105, their handling isn't in any way more
uniform. Generally speaking, if there is a single possible entry in a
given table, and loading that table in the static config is mandatory as
per the documentation, then the VALIDENT bit is deemed as redundant and
more than likely not implemented.

So it is time to make the workaround more official, and add a bit to the
flags implemented by dynamic config tables. It will be used by more
tables when SJA1110 support arrives.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-31 22:40:26 -07:00
Vladimir Oltean
f41fad3cb8 net: dsa: sja1105: always keep RGMII ports in the MAC role
In SJA1105, the xMII Mode Parameters Table field called PHY_MAC denotes
the 'role' of the port, be it a PHY or a MAC. This makes a difference in
the MII and RMII protocols, but RGMII is symmetric, so either PHY or MAC
settings result in the same hardware behavior.

The SJA1110 is different, and the RGMII ports only work when configured
in MAC mode, so keep the port roles in MAC mode unconditionally.

Why we had an RGMII port in the PHY role in the first place was because
we wanted to have a way in the driver to denote whether RGMII delays
should be applied based on the phy-mode property or not. This is already
done in sja1105_parse_rgmii_delays() based on an intermediary
struct sja1105_dt_port (which contains the port role). So it is a
logical fallacy to use the hardware configuration as a scratchpad for
driver data, it isn't necessary.

We can also remove the gating condition for applying RGMII delays only
for ports in the PHY role. The .setup_rgmii_delay() method looks at
the priv->rgmii_rx_delay[port] and priv->rgmii_tx_delay[port] properties
which are already populated properly (in the case of a port in the MAC
role they are false). Removing this condition generates a few more SPI
writes for these ports (clearing the RGMII delays) which are perhaps
useless for SJA1105P/Q/R/S, where we know that the delays are disabled
by default. But for SJA1110, the firmware on the embedded microcontroller
might have done something funny, so it's always a good idea to clear the
RGMII delays if that's what Linux expects.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-31 22:40:26 -07:00
Vladimir Oltean
41fed17fdb net: dsa: sja1105: add a translation table for port speeds
In order to support the new speed of 2500Mbps, the SJA1110 has achieved
the great performance of changing the encoding in the MAC Configuration
Table for the port speeds of 10, 100, 1000 compared to SJA1105.

Because this is a common driver, we need a layer of indirection in order
to program the hardware with the right values irrespective of switch
generation.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-31 22:40:26 -07:00
Vladimir Oltean
91a050782c net: dsa: sja1105: add a PHY interface type compatibility matrix
On the SJA1105, all ports support the parallel "xMII" protocols (MII,
RMII, RGMII) except for port 4 on SJA1105R/S which supports only SGMII.
This was relatively easy to model, by special-casing the SGMII port.

On the SJA1110, certain ports can be pinmuxed between SGMII and xMII, or
between SGMII and an internal 100base-TX PHY. This creates problems,
because the driver's assumption so far was that if a port supports
SGMII, it uses SGMII.

We allow the device tree to tell us how the port pinmuxing is done, and
check that against a PHY interface type compatibility matrix for
plausibility.

The other big change is that instead of doing SGMII configuration based
on what the port supports, we do it based on what is the configured
phy_mode of the port.

The 2500base-x support added in this patch is not complete.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-31 22:40:26 -07:00
Vladimir Oltean
bf4edf4afb net: dsa: sja1105: cache the phy-mode port property
So far we've succeeded in operating without keeping a copy of the
phy-mode in the driver, since we already have the static config and we
can look at the xMII Mode Parameters Table which already holds that
information.

But with the SJA1110, we cannot make the distinction between sgmii and
2500base-x, because to the hardware's static config, it's all SGMII.
So add a phy_mode property per port inside struct sja1105_private.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-31 22:40:25 -07:00
Vladimir Oltean
4c7ee010cf net: dsa: sja1105: the 0x1F0000 SGMII "base address" is actually MDIO_MMD_VEND2
Looking at the SGMII PCS from SJA1110, which is accessed indirectly
through a different base address as can be seen in the next patch, it
appears odd that the address accessed through indirection still
references the base address from the SJA1105S register map (first MDIO
register is at 0x1f0000), when it could index the SGMII registers
starting from zero.

Except that the 0x1f0000 is not a base address at all, it seems. It is
0x1f << 16 | 0x0000, and 0x1f is coding for the vendor-specific MMD2.
So, it turns out, the Synopsys PCS implements all its registers inside
the vendor-specific MMDs 1 and 2 (0x1e and 0x1f). This explains why the
PCS has no overlaps (for the other MMDs) with other register regions of
the switch (because no other MMDs are implemented).

Change the code to remove the SGMII "base address" and explicitly encode
the MMD for reads/writes. This will become necessary for SJA1110 support.

Cc: Russell King <linux@armlinux.org.uk>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-31 22:40:25 -07:00
Vladimir Oltean
84db00f2c0 net: dsa: sja1105: allow SGMII PCS configuration to be per port
The SJA1105 R and S switches have 1 SGMII port (port 4). Because there
is only one such port, there is no "port" parameter in the configuration
code for the SGMII PCS.

However, the SJA1110 can have up to 4 SGMII ports, each with its own
SGMII register map. So we need to generalize the logic.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-31 22:40:25 -07:00
Vladimir Oltean
15074a361f net: dsa: sja1105: be compatible with "ethernet-ports" OF node name
Since commit f2f3e09396be ("net: dsa: sja1105: be compatible with
"ethernet-ports" OF node name"), DSA supports the "ethernet-ports" name
for the container node of the ports, but the sja1105 driver doesn't,
because it handles some device tree parsing of its own.

Add the second node name as a fallback.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-31 22:40:25 -07:00
Jakub Kicinski
5ada57a9a6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
cdc-wdm: s/kill_urbs/poison_urbs/ to fix build

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-27 09:55:10 -07:00
Vladimir Oltean
1bf658eefe net: dsa: sja1105: allow the frame buffer size to be customized
The shared frame buffer of the SJA1110 is larger than that of SJA1105,
which is natural due to the fact that there are more ports.

Introduce yet another property in struct sja1105_info which encodes the
maximum number of 128 byte blocks that can be used for frame buffers.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-24 13:59:04 -07:00
Vladimir Oltean
38fbe91f22 net: dsa: sja1105: configure the multicast policers, if present
The SJA1110 policer array is similar in layout with SJA1105, except it
contains one multicast policer per port at the end.

Detect the presence of multicast policers based on the maximum number of
supported L2 Policing Table entries, and make those policers have a
shared index equal to the port's default policer. Letting the user
configure these policers is not supported at the moment.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-24 13:59:03 -07:00
Vladimir Oltean
f78a2517cf net: dsa: sja1105: use sja1105_xfer_u32 for the reset procedure
Using sja1105_xfer_buf results in a higher overhead and is harder to
read.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-24 13:59:03 -07:00
Vladimir Oltean
fd6f2c257b net: dsa: sja1105: dynamically choose the number of static config table entries
Due to the fact that the port count is different, some static config
tables have a different number of elements in SJA1105 compared to
SJA1110. Such an example is the L2 Policing table, which has 45 entries
in SJA1105 (one per port x traffic class, and one broadcast policer per
port) and 110 entries in SJA1110 (one per port x traffic class, one
broadcast and one multicast policer per port).

Similarly, the MAC Configuration Table, the L2 Forwarding table, all
have a different number of elements simply because the port count is
different, and although this can be accounted for by looking at
ds->ports, the policing table can't because of the presence of the extra
multicast policers.

The common denominator for the static config initializers for these
tables is that they must set up all the entries within that table.
So the simplest way to account for these differences in a uniform manner
is to look at struct sja1105_table_ops::max_entry_count. For the sake of
uniformity, this patch makes that change also for tables whose number of
elements did not change in SJA1110, like the xMII Mode Parameters, the
L2 Lookup Parameters, General Parameters, AVB Parameters (all of these
are singleton tables with a single entry).

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-24 13:59:03 -07:00
Vladimir Oltean
c50376783f net: dsa: sja1105: skip CGU configuration if it's unnecessary
There are two distinct code paths which enter sja1105_clocking.c, one
through sja1105_clocking_setup() and the other through
sja1105_clocking_setup_port():

sja1105_static_config_reload      sja1105_setup
              |                         |
              |      +------------------+
              |      |
              v      v
   sja1105_clocking_setup               sja1105_adjust_port_config
                 |                                   |
                 v                                   |
      sja1105_clocking_setup_port <------------------+

As opposed to SJA1105, the SJA1110 does not need any configuration of
the Clock Generation Unit in order for xMII ports to work. Just RGMII
internal delays need to be configured, and that is done inside
sja1105_clocking_setup_port for the RGMII ports.

So this patch introduces the concept of a "reserved address", which the
CGU configuration functions from sja1105_clocking.c must check before
proceeding to do anything. The SJA1110 will have reserved addresses for
the CGU PLLs for MII/RMII/RGMII.

Additionally, make sja1105_clocking_setup() a function pointer so it can
be overridden by the SJA1110. Even though nothing port-related needs to
be done in the CGU, there are some operations such as disabling the
watchdog clock which are unique to the SJA1110.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-24 13:59:03 -07:00
Vladimir Oltean
df2a81a35e net: dsa: sja1105: don't assign the host port using dsa_upstream_port()
If @port is unused, then dsa_upstream_port(ds, port) returns @port,
which means we cannot assume the CPU port can be retrieved this way.

The sja1105 switches support a single CPU port, so just iterate over the
switch ports and stop at the first CPU port we see.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-24 13:59:03 -07:00
Vladimir Oltean
82760d7f2e net: dsa: sja1105: dimension the data structures for a larger port count
Introduce a SJA1105_MAX_NUM_PORTS macro which at the moment is equal to
SJA1105_NUM_PORTS (5). With the introduction of SJA1110, these
structures will need to hold information for up to 11 ports.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-24 13:59:03 -07:00
Vladimir Oltean
f238fef1b3 net: dsa: sja1105: avoid some work for unused ports
Do not put unused ports in the forwarding domain, and do not allocate
FDB entries for dynamic address learning for them.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-24 13:59:03 -07:00
Vladimir Oltean
542043e91d net: dsa: sja1105: parameterize the number of ports
The sja1105 driver will gain support for the next-gen SJA1110 switch,
which is very similar except for the fact it has more than 5 ports.

So we need to replace the hardcoded SJA1105_NUM_PORTS in this driver
with ds->num_ports. This patch is as mechanical as possible (save for
the fact that ds->num_ports is not an integer constant expression).

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-24 13:59:03 -07:00
Vladimir Oltean
b38e659de9 net: dsa: sja1105: update existing VLANs from the bridge VLAN list
When running this sequence of operations:

ip link add br0 type bridge vlan_filtering 1
ip link set swp4 master br0
bridge vlan add dev swp4 vid 1

We observe the traffic sent on swp4 is still untagged, even though the
bridge has overwritten the existing VLAN entry:

port    vlan ids
swp4     1 PVID

br0      1 PVID Egress Untagged

This happens because we didn't consider that the 'bridge vlan add'
command just overwrites VLANs like it's nothing. We treat the 'vid 1
pvid untagged' and the 'vid 1' as two separate VLANs, and the first
still has precedence when calling sja1105_build_vlan_table. Obviously
there is a disagreement regarding semantics, and we end up doing
something unexpected from the PoV of the bridge.

Let's actually consider an "existing VLAN" to be one which is on the
same port, and has the same VLAN ID, as one we already have, and update
it if it has different flags than we do.

The first blamed commit is the one introducing the bug, the second one
is the latest on top of which the bugfix still applies.

Fixes: ec5ae61076 ("net: dsa: sja1105: save/restore VLANs using a delta commit method")
Fixes: 5899ee367a ("net: dsa: tag_8021q: add a context structure")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-24 13:20:24 -07:00
Vladimir Oltean
ed040abca4 net: dsa: sja1105: use 4095 as the private VLAN for untagged traffic
One thing became visible when writing the blamed commit, and that was
that STP and PTP frames injected by net/dsa/tag_sja1105.c using the
deferred xmit mechanism are always classified to the pvid of the CPU
port, regardless of whatever VLAN there might be in these packets.

So a decision needed to be taken regarding the mechanism through which
we should ensure that delivery of STP and PTP traffic is possible when
we are in a VLAN awareness mode that involves tag_8021q. This is because
tag_8021q is not concerned with managing the pvid of the CPU port, since
as far as tag_8021q is concerned, no traffic should be sent as untagged
from the CPU port. So we end up not actually having a pvid on the CPU
port if we only listen to tag_8021q, and unless we do something about it.

The decision taken at the time was to keep VLAN 1 in the list of
priv->dsa_8021q_vlans, and make it a pvid of the CPU port. This ensures
that STP and PTP frames can always be sent to the outside world.

However there is a problem. If we do the following while we are in
the best_effort_vlan_filtering=true mode:

ip link add br0 type bridge vlan_filtering 1
ip link set swp2 master br0
bridge vlan del dev swp2 vid 1

Then untagged and pvid-tagged frames should be dropped. But we observe
that they aren't, and this is because of the precaution we took that VID
1 is always installed on all ports.

So clearly VLAN 1 is not good for this purpose. What about VLAN 0?
Well, VLAN 0 is managed by the 8021q module, and that module wants to
ensure that 802.1p tagged frames are always received by a port, and are
always transmitted as VLAN-tagged (with VLAN ID 0). Whereas we want our
STP and PTP frames to be untagged if the stack sent them as untagged -
we don't want the driver to just decide out of the blue that it adds
VID 0 to some packets.

So what to do?

Well, there is one other VLAN that is reserved, and that is 4095:
$ ip link add link swp2 name swp2.4095 type vlan id 4095
Error: 8021q: Invalid VLAN id.
$ bridge vlan add dev swp2 vid 4095
Error: bridge: Vlan id is invalid.

After we made this change, VLAN 1 is indeed forwarded and/or dropped
according to the bridge VLAN table, there are no further alterations
done by the sja1105 driver.

Fixes: ec5ae61076 ("net: dsa: sja1105: save/restore VLANs using a delta commit method")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-24 13:20:24 -07:00
Vladimir Oltean
6729188d26 net: dsa: sja1105: error out on unsupported PHY mode
The driver continues probing when a port is configured for an
unsupported PHY interface type, instead it should stop.

Fixes: 8aa9ebccae ("net: dsa: Introduce driver for NXP SJA1105 5-port L2 switch")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-24 13:20:24 -07:00
Vladimir Oltean
cec279a898 net: dsa: sja1105: add error handling in sja1105_setup()
If any of sja1105_static_config_load(), sja1105_clocking_setup() or
sja1105_devlink_setup() fails, we can't just return in the middle of
sja1105_setup() or memory will leak. Add a cleanup path.

Fixes: 0a7bdbc23d ("net: dsa: sja1105: move devlink param code to sja1105_devlink.c")
Fixes: 8aa9ebccae ("net: dsa: Introduce driver for NXP SJA1105 5-port L2 switch")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-24 13:20:24 -07:00
Vladimir Oltean
dc596e3fe6 net: dsa: sja1105: call dsa_unregister_switch when allocating memory fails
Unlike other drivers which pretty much end their .probe() execution with
dsa_register_switch(), the sja1105 does some extra stuff. When that
fails with -ENOMEM, the driver is quick to return that, forgetting to
call dsa_unregister_switch(). Not critical, but a bug nonetheless.

Fixes: 4d7525085a ("net: dsa: sja1105: offload the Credit-Based Shaper qdisc")
Fixes: a68578c20a ("net: dsa: Make deferred_xmit private to sja1105")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-24 13:20:24 -07:00
Vladimir Oltean
ba61cf167c net: dsa: sja1105: fix VL lookup command packing for P/Q/R/S
At the beginning of the sja1105_dynamic_config.c file there is a diagram
of the dynamic config interface layout:

 packed_buf

 |
 V
 +-----------------------------------------+------------------+
 |              ENTRY BUFFER               |  COMMAND BUFFER  |
 +-----------------------------------------+------------------+

 <----------------------- packed_size ------------------------>

So in order to pack/unpack the command bits into the buffer,
sja1105_vl_lookup_cmd_packing must first advance the buffer pointer by
the length of the entry. This is similar to what the other *cmd_packing
functions do.

This bug exists because the command packing function for P/Q/R/S was
copied from the E/T generation, and on E/T, the command was actually
embedded within the entry buffer itself.

Fixes: 94f94d4acf ("net: dsa: sja1105: add static tables for virtual links")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-24 13:20:24 -07:00
Vladimir Oltean
039b167d68 net: dsa: sja1105: don't use burst SPI reads for port statistics
The current internal sja1105 driver API is optimized for retrieving many
statistics counters at once. But the switch does not do atomic snapshotting
for them anyway.

In case we start reporting the hardware port counters through
ndo_get_stats64 as well, not just ethtool, it would be good to be able
to read individual port counters and not all of them.

Additionally, since Arnd Bergmann's commit ae1804de93 ("dsa: sja1105:
dynamically allocate stats structure"), sja1105_get_ethtool_stats
allocates memory dynamically, since struct sja1105_port_status was
deemed to consume too much stack memory. That is not ideal.
The large structure is only needed because of the burst read.
If we read statistics one by one, we can consume less memory, and
we can avoid dynamic allocation.

Additionally, latency-sensitive interfaces such as PTP operations (for
phc2sys) might suffer if the SPI mutex is being held for too long, which
happens in the case of SPI burst reads. By reading counters one by one,
we give a chance for higher priority processes to preempt and take the
SPI bus mutex for accessing the PTP clock.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-21 14:01:41 -07:00
Vladimir Oltean
30a2e9c0f5 net: dsa: sja1105: stop reporting the queue levels in ethtool port counters
The queue levels are not counters, but instead they represent the
occupancy of the MAC TX queues. Having these in ethtool port counters is
not helpful, so remove them.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-21 14:01:41 -07:00
Vladimir Oltean
718bad0e4d net: dsa: sja1105: adapt to a SPI controller with a limited max transfer size
The static config of the sja1105 switch is a long stream of bytes which
is programmed to the hardware in chunks (portions with the chip select
continuously asserted) of max 256 bytes each. Each chunk is a
spi_message composed of 2 spi_transfers: the buffer with the data and a
preceding buffer with the SPI access header.

Only that certain SPI controllers, such as the spi-sc18is602 I2C-to-SPI
bridge, cannot keep the chip select asserted for that long.
The spi_max_transfer_size() and spi_max_message_size() functions are how
the controller can impose its hardware limitations upon the SPI
peripheral driver.

For the sja1105 driver to work with these controllers, both buffers must
be smaller than the transfer limit, and their sum must be smaller than
the message limit.

Regression-tested on a switch connected to a controller with no
limitations (spi-fsl-dspi) as well as with one with caps for both
max_transfer_size and max_message_size (spi-sc18is602).

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-21 13:23:29 -07:00
Vladimir Oltean
ca021f0dd8 net: dsa: sja1105: send multiple spi_messages instead of using cs_change
The sja1105 driver has been described by Mark Brown as "not using the
[ SPI ] API at all idiomatically" due to the use of cs_change:
https://patchwork.kernel.org/project/netdevbpf/patch/20210520135031.2969183-1-olteanv@gmail.com/

According to include/linux/spi/spi.h, the chip select is supposed to be
asserted for the entire length of a SPI message, as long as cs_change is
false for all member transfers. The cs_change flag changes the following:

(i) When a non-final SPI transfer has cs_change = true, the chip select
    should temporarily deassert and then reassert starting with the next
    transfer.
(ii) When a final SPI transfer has cs_change = true, the chip select
     should remain asserted until the following SPI message.

The sja1105 driver only uses cs_change for its first property, to form a
single SPI message whose layout can be seen below:

                                             this is an entire, single spi_message
           _______________________________________________________________________________________________
          /                                                                                               \
          +-------------+---------------+-------------+---------------+ ... +-------------+---------------+
          | hdr_xfer[0] | chunk_xfer[0] | hdr_xfer[1] | chunk_xfer[1] |     | hdr_xfer[n] | chunk_xfer[n] |
          +-------------+---------------+-------------+---------------+ ... +-------------+---------------+
cs_change      false          true           false           true                false          false

           ____________________________  _____________________________       _____________________________
CS line __/                            \/                             \ ... /                             \__

The fact of the matter is that spi_max_message_size() has an ambiguous
meaning if any non-final transfer has cs_change = true.

If the SPI master has a limitation in that it cannot keep the chip
select asserted for more than, say, 200 bytes (like the spi-sc18is602),
the normal thing for it to do is to implement .max_transfer_size and
.max_message_size, and limit both to 200: in the "worst case" where
cs_change is always false, then the controller can, indeed, not send
messages larger than 200 bytes.

But the fact that the SPI controller's max_message_size does not
necessarily mean that we cannot send messages larger than that.
Notably, if the SPI master special-cases the transfers with cs_change
and treats every chip select toggling as an entirely new transaction,
then a SPI message can easily exceed that limit. So there is a
temptation to ignore the controller's reported max_message_size when
using cs_change = true in non-final transfers.

But that can lead to false conclusions. As Mark points out, the SPI
controller might have a different kind of limitation with the max
message size, that has nothing at all to do with how long it can keep
the chip select asserted.
For example, that might be the case if the device is able to offload the
chip select changes to the hardware as part of the data stream, and it
packs the entire stream of commands+data (corresponding to a SPI
message) into a single DMA transfer that is itself limited in size.

So the only thing we can do is avoid ambiguity by not using cs_change at
all. Instead of sending a single spi_message, we now send multiple SPI
messages as follows:

                  spi_message 0                 spi_message 1                       spi_message n
           ____________________________   ___________________________        _____________________________
          /                            \ /                           \      /                             \
          +-------------+---------------+-------------+---------------+ ... +-------------+---------------+
          | hdr_xfer[0] | chunk_xfer[0] | hdr_xfer[1] | chunk_xfer[1] |     | hdr_xfer[n] | chunk_xfer[n] |
          +-------------+---------------+-------------+---------------+ ... +-------------+---------------+
cs_change      false          true           false           true                false          false

           ____________________________  _____________________________       _____________________________
CS line __/                            \/                             \ ... /                             \__

which is clearer because the max_message_size limit is now easier to
enforce. What is transmitted on the wire stays, of course, the same.

Additionally, because we send no more than 2 transfers at a time, we now
avoid dynamic memory allocation too, which might be seen as an
improvement by some.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-21 13:23:29 -07:00
Yangbo Lu
c4b364ce12 net: dsa: free skb->cb usage in core driver
Free skb->cb usage in core driver and let device drivers decide to
use or not. The reason having a DSA_SKB_CB(skb)->clone was because
dsa_skb_tx_timestamp() which may set the clone pointer was called
before p->xmit() which would use the clone if any, and the device
driver has no way to initialize the clone pointer.

This patch just put memset(skb->cb, 0, sizeof(skb->cb)) at beginning
of dsa_slave_xmit(). Some new features in the future, like one-step
timestamp may need more bytes of skb->cb to use in
dsa_skb_tx_timestamp(), and p->xmit().

Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-27 14:10:15 -07:00
Yangbo Lu
5c5416f5d4 net: dsa: no longer clone skb in core driver
It was a waste to clone skb directly in dsa_skb_tx_timestamp().
For one-step timestamping, a clone was not needed. For any failure of
port_txtstamp (this may usually happen), the skb clone had to be freed.

So this patch moves skb cloning for tx timestamp out of dsa core, and
let drivers clone skb in port_txtstamp if they really need.

Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Tested-by: Kurt Kanzenbach <kurt@linutronix.de>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-27 14:10:15 -07:00
Yangbo Lu
cf536ea3c7 net: dsa: no longer identify PTP packet in core driver
Move ptp_classify_raw out of dsa core driver for handling tx
timestamp request. Let device drivers do this if they want.
Not all drivers want to limit tx timestamping for only PTP
packet.

Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Tested-by: Kurt Kanzenbach <kurt@linutronix.de>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-27 14:10:15 -07:00
Vladimir Oltean
a1e6f641e3 Revert "net: dsa: sja1105: Clear VLAN filtering offload netdev feature"
This reverts commit e9bf96943b.

The topic of the reverted patch is the support for switches with global
VLAN filtering, added by commit 061f6a505a ("net: dsa: Add
ndo_vlan_rx_{add, kill}_vid implementation"). Be there a switch with 4
ports swp0 -> swp3, and the following setup:

ip link add br0 type bridge vlan_filtering 1
ip link set swp0 master br0
ip link set swp1 master br0

What would happen with VLAN-tagged traffic received on standalone ports
swp2 and swp3? Well, it would get dropped, were it not for the
.ndo_vlan_rx_add_vid and .ndo_vlan_rx_kill_vid implementations (called
from vlan_vid_add and vlan_vid_del respectively). Basically, for DSA
switches where VLAN filtering is a global attribute, we enforce the
standalone ports to have 'rx-vlan-filter: off [fixed]' in their ethtool
features, which lets the user know that all VLAN-tagged packets that are
not explicitly added in the RX filtering list are dropped.

As for the sja1105 driver, at the time of the reverted patch, it was
operating in a pretty handicapped mode when it had ports under a bridge
with vlan_filtering=1. Specifically, it was unable to terminate traffic
through the CPU port (for further explanation see "Traffic support" in
Documentation/networking/dsa/sja1105.rst).

However, since then, the sja1105 driver has made considerable progress,
and that limitation is no longer as severe now. Specifically, since
commit 2cafa72e51 ("net: dsa: sja1105: add a new
best_effort_vlan_filtering devlink parameter"), the driver is able to
perform CPU termination even when some ports are under bridges with
vlan_filtering=1. Then, since commit 8841f6e63f ("net: dsa: sja1105:
make devlink property best_effort_vlan_filtering true by default"), this
even became the default operating mode.

So we can now take advantage of the logic in the DSA core.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-20 19:17:05 -07:00
Wei Yongjun
6f0d32509a net: dsa: sja1105: fix error return code in sja1105_cls_flower_add()
The return value 'rc' maybe overwrite to 0 in the flow_action_for_each
loop, the error code from the offload not support error handling will
not set. This commit fix it to return -EOPNOTSUPP.

Fixes: 6a56e19902 ("flow_offload: reject configuration of packet-per-second policing in offload drivers")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-16 11:14:59 -07:00
Baowen Zheng
6a56e19902 flow_offload: reject configuration of packet-per-second policing in offload drivers
A follow-up patch will allow users to configures packet-per-second policing
in the software datapath. In preparation for this, teach all drivers that
support offload of the policer action to reject such configuration as
currently none of them support it.

Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Louis Peens <louis.peens@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-13 14:18:09 -08:00
Vladimir Oltean
6a5166e07c net: dsa: sja1105: fix ucast/bcast flooding always remaining enabled
In the blamed patch I managed to introduce a bug while moving code
around: the same logic is applied to the ucast_egress_floods and
bcast_egress_floods variables both on the "if" and the "else" branches.

This is clearly an unintended change compared to how the code used to be
prior to that bugfix, so restore it.

Fixes: 7f7ccdea8c ("net: dsa: sja1105: fix leakage of flooded frames outside bridging domain")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-04 14:19:01 -08:00
Vladimir Oltean
053d8ad10d net: dsa: sja1105: fix SGMII PCS being forced to SPEED_UNKNOWN instead of SPEED_10
When using MLO_AN_PHY or MLO_AN_FIXED, the MII_BMCR of the SGMII PCS is
read before resetting the switch so it can be reprogrammed afterwards.
This works for the speeds of 1Gbps and 100Mbps, but not for 10Mbps,
because SPEED_10 is actually 0, so AND-ing anything with 0 is false,
therefore that last branch is dead code.

Do what others do (genphy_read_status_fixed, phy_mii_ioctl) and just
remove the check for SPEED_10, let it fall into the default case.

Fixes: ffe10e679c ("net: dsa: sja1105: Add support for the SGMII port")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-04 14:19:01 -08:00
Geert Uytterhoeven
fcd4ba3bcb net: dsa: sja1105: Remove unneeded cast in sja1105_crc32()
sja1105_unpack() takes a "const void *buf" as its first parameter, so
there is no need to cast away the "const" of the "buf" variable before
calling it.

Drop the cast, as it prevents the compiler performing some checks.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/20210223112003.2223332-1-geert+renesas@glider.be
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-25 09:41:32 -08:00
Vladimir Oltean
7f7ccdea8c net: dsa: sja1105: fix leakage of flooded frames outside bridging domain
Quite embarrasingly, I managed to fool myself into thinking that the
flooding domain of sja1105 source ports is restricted by the forwarding
domain, which it isn't. Frames which match an FDB entry are forwarded
towards that entry's DESTPORTS restricted by REACH_PORT[SRC_PORT], while
frames that don't match any FDB entry are forwarded towards
FL_DOMAIN[SRC_PORT] or BC_DOMAIN[SRC_PORT].

This means we can't get away with doing the simple thing, and we must
manage the flooding domain ourselves such that it is restricted by the
forwarding domain. This new function must be called from the
.port_bridge_join and .port_bridge_leave methods too, not just from
.port_bridge_flags as we did before.

Fixes: 4d94235495 ("net: dsa: sja1105: offload bridge port flags to device")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16 14:02:46 -08:00
Vladimir Oltean
4c44fc5e94 net: dsa: sja1105: fix configuration of source address learning
Due to a mistake, the driver always sets the address learning flag to
the previously stored value, and not to the currently configured one.
The bug is visible only in standalone ports mode, because when the port
is bridged, the issue is masked by .port_stp_state_set which overwrites
the address learning state to the proper value.

Fixes: 4d94235495 ("net: dsa: sja1105: offload bridge port flags to device")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16 14:02:46 -08:00
Vladimir Oltean
8841f6e63f net: dsa: sja1105: make devlink property best_effort_vlan_filtering true by default
The sja1105 driver has a limitation, extensively described under
Documentation/networking/dsa/sja1105.rst and
Documentation/networking/devlink/sja1105.rst, which says that when the
ports are under a bridge with vlan_filtering=1, traffic to and from
the network stack is not possible, unless the driver-specific
best_effort_vlan_filtering devlink parameter is enabled.

For users, this creates a 'wtf' moment. They need to go to the
documentation and find about the existence of this property, then maybe
install devlink and set it to true.

Having best_effort_vlan_filtering enabled by the kernel by default
delays that 'wtf' moment (maybe up to the point that it never even
happens). The user doesn't need to care that the driver supports
addressing the ports individually by retagging VLAN IDs until he/she
needs to use more than 32 VLAN IDs (since there can be at most 32
retagging rules). Only then do they need to think whether they need the
full VLAN table, at the expense of no individual port addressing, or
not.

But the odds that an sja1105 user will need more than 32 VLANs
terminated by the CPU is probably low. And, if we were to follow the
principle that more advanced use cases should require more advanced
preparation steps, then it makes more sense for ping to 'just work'
while CPU termination of > 32 VLAN IDs to require a bit more forethought
and possibly a driver-specific devlink param.

So we should be able to safely change the default here, and make this
driver act just a little bit more sanely out of the box.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-15 13:23:57 -08:00
Vladimir Oltean
89153ed6eb net: dsa: propagate extack to .port_vlan_filtering
Some drivers can't dynamically change the VLAN filtering option, or
impose some restrictions, it would be nice to propagate this info
through netlink instead of printing it to a kernel log that might never
be read. Also netlink extack includes the module that emitted the
message, which means that it's easier to figure out which ones are
driver-generated errors as opposed to command misuse.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-14 17:38:12 -08:00
Vladimir Oltean
31046a5fd9 net: dsa: propagate extack to .port_vlan_add
Allow drivers to communicate their restrictions to user space directly,
instead of printing to the kernel log. Where the conversion would have
been lossy and things like VLAN ID could no longer be conveyed (due to
the lack of support for printf format specifier in netlink extack), I
chose to keep the messages in full form to the kernel log only, and
leave it up to individual driver maintainers to move more messages to
extack.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-14 17:38:11 -08:00
Vladimir Oltean
4d94235495 net: dsa: sja1105: offload bridge port flags to device
The chip can configure unicast flooding, broadcast flooding and learning.
Learning is per port, while flooding is per {ingress, egress} port pair
and we need to configure the same value for all possible ingress ports
towards the requested one.

While multicast flooding is not officially supported, we can hack it by
using a feature of the second generation (P/Q/R/S) devices, which is that
FDB entries are maskable, and multicast addresses always have an odd
first octet. So by putting a match-all for 00:01:00:00:00:00 addr and
00:01:00:00:00:00 mask at the end of the FDB, we make sure that it is
always checked last, and does not take precedence in front of any other
MDB. So it behaves effectively as an unknown multicast entry.

For the first generation switches, this feature is not available, so
unknown multicast will always be treated the same as unknown unicast.
So the only thing we can do is request the user to offload the settings
for these 2 flags in tandem, i.e.

ip link set swp2 type bridge_slave flood off
Error: sja1105: This chip cannot configure multicast flooding independently of unicast.
ip link set swp2 type bridge_slave flood off mcast_flood off
ip link set swp2 type bridge_slave mcast_flood on
Error: sja1105: This chip cannot configure multicast flooding independently of unicast.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-12 17:08:05 -08:00
Vladimir Oltean
0ee2af4ebb net: dsa: set configure_vlan_while_not_filtering to true by default
As explained in commit 54a0ed0df4 ("net: dsa: provide an option for
drivers to always receive bridge VLANs"), DSA has historically been
skipping VLAN switchdev operations when the bridge wasn't in
vlan_filtering mode, but the reason why it was doing that has never been
clear. So the configure_vlan_while_not_filtering option is there merely
to preserve functionality for existing drivers. It isn't some behavior
that drivers should opt into. Ideally, when all drivers leave this flag
set, we can delete the dsa_port_skip_vlan_configuration() function.

New drivers always seem to omit setting this flag, for some reason. So
let's reverse the logic: the DSA core sets it by default to true before
the .setup() callback, and legacy drivers can turn it off. This way, new
drivers get the new behavior by default, unless they explicitly set the
flag to false, which is more obvious during review.

Remove the assignment from drivers which were setting it to true, and
add the assignment to false for the drivers that didn't previously have
it. This way, it should be easier to see how many we have left.

The following drivers: lan9303, mv88e6060 were skipped from setting this
flag to false, because they didn't have any VLAN offload ops in the
first place.

The Broadcom Starfighter 2 driver calls the common b53_switch_alloc and
therefore also inherits the configure_vlan_while_not_filtering=true
behavior.

Also, print a message through netlink extack every time a VLAN has been
skipped. This is mildly annoying on purpose, so that (a) it is at least
clear that VLANs are being skipped - the legacy behavior in itself is
confusing, and the extack should be much more difficult to miss, unlike
kernel logs - and (b) people have one more incentive to convert to the
new behavior.

No behavior change except for the added prints is intended at this time.

$ ip link add br0 type bridge vlan_filtering 0
$ ip link set sw0p2 master br0
[   60.315148] br0: port 1(sw0p2) entered blocking state
[   60.320350] br0: port 1(sw0p2) entered disabled state
[   60.327839] device sw0p2 entered promiscuous mode
[   60.334905] br0: port 1(sw0p2) entered blocking state
[   60.340142] br0: port 1(sw0p2) entered forwarding state
Warning: dsa_core: skipping configuration of VLAN. # This was the pvid
$ bridge vlan add dev sw0p2 vid 100
Warning: dsa_core: skipping configuration of VLAN.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20210115231919.43834-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-15 17:29:40 -08:00
Vladimir Oltean
1958d5815c net: dsa: remove the transactional logic from VLAN objects
It should be the driver's business to logically separate its VLAN
offloading into a preparation and a commit phase, and some drivers don't
need / can't do this.

So remove the transactional shim from DSA and let drivers propagate
errors directly from the .port_vlan_add callback.

It would appear that the code has worse error handling now than it had
before. DSA is the only in-kernel user of switchdev that offloads one
switchdev object to more than one port: for every VLAN object offloaded
to a user port, that VLAN is also offloaded to the CPU port. So the
"prepare for user port -> check for errors -> prepare for CPU port ->
check for errors -> commit for user port -> commit for CPU port"
sequence appears to make more sense than the one we are using now:
"offload to user port -> check for errors -> offload to CPU port ->
check for errors", but it is really a compromise. In the new way, we can
catch errors from the commit phase that we previously had to ignore.
But we have our hands tied and cannot do any rollback now: if we add a
VLAN on the CPU port and it fails, we can't do the rollback by simply
deleting it from the user port, because the switchdev API is not so nice
with us: it could have simply been there already, even with the same
flags. So we don't even attempt to rollback anything on addition error,
just leave whatever VLANs managed to get offloaded right where they are.
This should not be a problem at all in practice.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-11 16:00:57 -08:00
Vladimir Oltean
a52b2da778 net: dsa: remove the transactional logic from MDB entries
For many drivers, the .port_mdb_prepare callback was not a good opportunity
to avoid any error condition, and they would suppress errors found during
the actual commit phase.

Where a logical separation between the prepare and the commit phase
existed, the function that used to implement the .port_mdb_prepare
callback still exists, but now it is called directly from .port_mdb_add,
which was modified to return an int code.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> # hellcreek
Reviewed-by: Linus Wallei <linus.walleij@linaro.org> # RTL8366
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-11 16:00:57 -08:00
Vladimir Oltean
bae33f2b5a net: switchdev: remove the transaction structure from port attributes
Since the introduction of the switchdev API, port attributes were
transmitted to drivers for offloading using a two-step transactional
model, with a prepare phase that was supposed to catch all errors, and a
commit phase that was supposed to never fail.

Some classes of failures can never be avoided, like hardware access, or
memory allocation. In the latter case, merely attempting to move the
memory allocation to the preparation phase makes it impossible to avoid
memory leaks, since commit 91cf8eceff ("switchdev: Remove unused
transaction item queue") which has removed the unused mechanism of
passing on the allocated memory between one phase and another.

It is time we admit that separating the preparation from the commit
phase is something that is best left for the driver to decide, and not
something that should be baked into the API, especially since there are
no switchdev callers that depend on this.

This patch removes the struct switchdev_trans member from switchdev port
attribute notifier structures, and converts drivers to not look at this
member.

In part, this patch contains a revert of my previous commit 2e554a7a5d
("net: dsa: propagate switchdev vlan_filtering prepare phase to
drivers").

For the most part, the conversion was trivial except for:
- Rocker's world implementation based on Broadcom OF-DPA had an odd
  implementation of ofdpa_port_attr_bridge_flags_set. The conversion was
  done mechanically, by pasting the implementation twice, then only
  keeping the code that would get executed during prepare phase on top,
  then only keeping the code that gets executed during the commit phase
  on bottom, then simplifying the resulting code until this was obtained.
- DSA's offloading of STP state, bridge flags, VLAN filtering and
  multicast router could be converted right away. But the ageing time
  could not, so a shim was introduced and this was left for a further
  commit.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> # hellcreek
Reviewed-by: Linus Walleij <linus.walleij@linaro.org> # RTL8366RB
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-11 16:00:57 -08:00
Vladimir Oltean
b7a9e0da2d net: switchdev: remove vid_begin -> vid_end range from VLAN objects
The call path of a switchdev VLAN addition to the bridge looks something
like this today:

        nbp_vlan_init
        |  __br_vlan_set_default_pvid
        |  |                       |
        |  |    br_afspec          |
        |  |        |              |
        |  |        v              |
        |  | br_process_vlan_info  |
        |  |        |              |
        |  |        v              |
        |  |   br_vlan_info        |
        |  |       / \            /
        |  |      /   \          /
        |  |     /     \        /
        |  |    /       \      /
        v  v   v         v    v
      nbp_vlan_add   br_vlan_add ------+
       |              ^      ^ |       |
       |             /       | |       |
       |            /       /  /       |
       \ br_vlan_get_master/  /        v
        \        ^        /  /  br_vlan_add_existing
         \       |       /  /          |
          \      |      /  /          /
           \     |     /  /          /
            \    |    /  /          /
             \   |   /  /          /
              v  |   | v          /
              __vlan_add         /
                 / |            /
                /  |           /
               v   |          /
   __vlan_vid_add  |         /
               \   |        /
                v  v        v
      br_switchdev_port_vlan_add

The ranges UAPI was introduced to the bridge in commit bdced7ef78
("bridge: support for multiple vlans and vlan ranges in setlink and
dellink requests") (Jan 10 2015). But the VLAN ranges (parsed in br_afspec)
have always been passed one by one, through struct bridge_vlan_info
tmp_vinfo, to br_vlan_info. So the range never went too far in depth.

Then Scott Feldman introduced the switchdev_port_bridge_setlink function
in commit 47f8328bb1 ("switchdev: add new switchdev bridge setlink").
That marked the introduction of the SWITCHDEV_OBJ_PORT_VLAN, which made
full use of the range. But switchdev_port_bridge_setlink was called like
this:

br_setlink
-> br_afspec
-> switchdev_port_bridge_setlink

Basically, the switchdev and the bridge code were not tightly integrated.
Then commit 41c498b935 ("bridge: restore br_setlink back to original")
came, and switchdev drivers were required to implement
.ndo_bridge_setlink = switchdev_port_bridge_setlink for a while.

In the meantime, commits such as 0944d6b5a2 ("bridge: try switchdev op
first in __vlan_vid_add/del") finally made switchdev penetrate the
br_vlan_info() barrier and start to develop the call path we have today.
But remember, br_vlan_info() still receives VLANs one by one.

Then Arkadi Sharshevsky refactored the switchdev API in 2017 in commit
29ab586c3d ("net: switchdev: Remove bridge bypass support from
switchdev") so that drivers would not implement .ndo_bridge_setlink any
longer. The switchdev_port_bridge_setlink also got deleted.
This refactoring removed the parallel bridge_setlink implementation from
switchdev, and left the only switchdev VLAN objects to be the ones
offloaded from __vlan_vid_add (basically RX filtering) and  __vlan_add
(the latter coming from commit 9c86ce2c1a ("net: bridge: Notify about
bridge VLANs")).

That is to say, today the switchdev VLAN object ranges are not used in
the kernel. Refactoring the above call path is a bit complicated, when
the bridge VLAN call path is already a bit complicated.

Let's go off and finish the job of commit 29ab586c3d by deleting the
bogus iteration through the VLAN ranges from the drivers. Some aspects
of this feature never made too much sense in the first place. For
example, what is a range of VLANs all having the BRIDGE_VLAN_INFO_PVID
flag supposed to mean, when a port can obviously have a single pvid?
This particular configuration _is_ denied as of commit 6623c60dc2
("bridge: vlan: enforce no pvid flag in vlan ranges"), but from an API
perspective, the driver still has to play pretend, and only offload the
vlan->vid_end as pvid. And the addition of a switchdev VLAN object can
modify the flags of another, completely unrelated, switchdev VLAN
object! (a VLAN that is PVID will invalidate the PVID flag from whatever
other VLAN had previously been offloaded with switchdev and had that
flag. Yet switchdev never notifies about that change, drivers are
supposed to guess).

Nonetheless, having a VLAN range in the API makes error handling look
scarier than it really is - unwinding on errors and all of that.
When in reality, no one really calls this API with more than one VLAN.
It is all unnecessary complexity.

And despite appearing pretentious (two-phase transactional model and
all), the switchdev API is really sloppy because the VLAN addition and
removal operations are not paired with one another (you can add a VLAN
100 times and delete it just once). The bridge notifies through
switchdev of a VLAN addition not only when the flags of an existing VLAN
change, but also when nothing changes. There are switchdev drivers out
there who don't like adding a VLAN that has already been added, and
those checks don't really belong at driver level. But the fact that the
API contains ranges is yet another factor that prevents this from being
addressed in the future.

Of the existing switchdev pieces of hardware, it appears that only
Mellanox Spectrum supports offloading more than one VLAN at a time,
through mlxsw_sp_port_vlan_set. I have kept that code internal to the
driver, because there is some more bookkeeping that makes use of it, but
I deleted it from the switchdev API. But since the switchdev support for
ranges has already been de facto deleted by a Mellanox employee and
nobody noticed for 4 years, I'm going to assume it's not a biggie.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com> # switchdev and mlxsw
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> # hellcreek
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-11 16:00:56 -08:00
Zheng Yongjun
c75857b055 net: dsa: sja1105: Use kzalloc for allocating only one thing
Use kzalloc rather than kcalloc(1,...)

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
@@

- kcalloc(1,
+ kzalloc(
          ...)
// </smpl>

Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-01-05 15:43:41 -08:00
Vladimir Oltean
2e554a7a5d net: dsa: propagate switchdev vlan_filtering prepare phase to drivers
A driver may refuse to enable VLAN filtering for any reason beyond what
the DSA framework cares about, such as:
- having tc-flower rules that rely on the switch being VLAN-aware
- the particular switch does not support VLAN, even if the driver does
  (the DSA framework just checks for the presence of the .port_vlan_add
  and .port_vlan_del pointers)
- simply not supporting this configuration to be toggled at runtime

Currently, when a driver rejects a configuration it cannot support, it
does this from the commit phase, which triggers various warnings in
switchdev.

So propagate the prepare phase to drivers, to give them the ability to
refuse invalid configurations cleanly and avoid the warnings.

Since we need to modify all function prototypes and check for the
prepare phase from within the drivers, take that opportunity and move
the existing driver restrictions within the prepare phase where that is
possible and easy.

Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Cc: Hauke Mehrtens <hauke@hauke-m.de>
Cc: Woojung Huh <woojung.huh@microchip.com>
Cc: Microchip Linux Driver Support <UNGLinuxDriver@microchip.com>
Cc: Sean Wang <sean.wang@mediatek.com>
Cc: Landen Chao <Landen.Chao@mediatek.com>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Vivien Didelot <vivien.didelot@gmail.com>
Cc: Jonathan McDowell <noodles@earth.li>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
Cc: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-05 05:56:48 -07:00
Vladimir Oltean
2b7fea0d20 net: dsa: sja1105: remove duplicate prefix for VL Lookup dynamic config
This is a strictly cosmetic change that renames some macros in
sja1105_dynamic_config.c. They were copy-pasted in haste and this has
resulted in them having the driver prefix twice.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-03 17:34:42 -07:00
Vladimir Oltean
ff4cf8eae0 net: dsa: sja1105: implement .devlink_info_get
Return the driver name and ASIC ID so that generic user space
application are able to know they're looking at sja1105 devlink regions
when pretty-printing them.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-25 16:35:27 -07:00
Vladimir Oltean
bf425b8205 net: dsa: sja1105: expose static config as devlink region
As explained in Documentation/networking/dsa/sja1105.rst, this switch
has a static config held in the driver's memory and re-uploaded from
time to time into the device (after any major change).

The format of this static config is in fact described in UM10944.pdf and
it contains all the switch's settings (it also contains device ID, table
CRCs, etc, just like in the manual). So it is a useful and universal
devlink region to expose to user space, for debugging purposes.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-25 16:35:27 -07:00
Vladimir Oltean
0a7bdbc23d net: dsa: sja1105: move devlink param code to sja1105_devlink.c
We'll have more devlink code soon. Group it together in a separate
translation object.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-25 16:35:27 -07:00
Vladimir Oltean
bbed0bbddd net: dsa: tag_8021q: add VLANs to the master interface too
The whole purpose of tag_8021q is to send VLAN-tagged traffic to the
CPU, from which the driver can decode the source port and switch id.

Currently this only works if the VLAN filtering on the master is
disabled. Change that by explicitly adding code to tag_8021q.c to add
the VLANs corresponding to the tags to the filter of the master
interface.

Because we now need to call vlan_vid_add, then we also need to hold the
RTNL mutex. Propagate that requirement to the callers of dsa_8021q_setup
and modify the existing call sites as appropriate. Note that one call
path, sja1105_best_effort_vlan_filtering_set -> sja1105_vlan_filtering
-> sja1105_setup_8021q_tagging, was already holding this lock.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-20 19:01:34 -07:00
Vladimir Oltean
5899ee367a net: dsa: tag_8021q: add a context structure
While working on another tag_8021q driver implementation, some things
became apparent:

- It is not mandatory for a DSA driver to offload the tag_8021q VLANs by
  using the VLAN table per se. For example, it can add custom TCAM rules
  that simply encapsulate RX traffic, and redirect & decapsulate rules
  for TX traffic. For such a driver, it makes no sense to receive the
  tag_8021q configuration through the same callback as it receives the
  VLAN configuration from the bridge and the 8021q modules.

- Currently, sja1105 (the only tag_8021q user) sets a
  priv->expect_dsa_8021q variable to distinguish between the bridge
  calling, and tag_8021q calling. That can be improved, to say the
  least.

- The crosschip bridging operations are, in fact, stateful already. The
  list of crosschip_links must be kept by the caller and passed to the
  relevant tag_8021q functions.

So it would be nice if the tag_8021q configuration was more
self-contained. This patch attempts to do that.

Create a struct dsa_8021q_context which encapsulates a struct
dsa_switch, and has 2 function pointers for adding and deleting a VLAN.
These will replace the previous channel to the driver, which was through
the .port_vlan_add and .port_vlan_del callbacks of dsa_switch_ops.

Also put the list of crosschip_links into this dsa_8021q_context.
Drivers that don't support cross-chip bridging can simply omit to
initialize this list, as long as they dont call any cross-chip function.

The sja1105_vlan_add and sja1105_vlan_del functions are refactored into
a smaller sja1105_vlan_add_one, which now has 2 entry points:
- sja1105_vlan_add, from struct dsa_switch_ops
- sja1105_dsa_8021q_vlan_add, from the tag_8021q ops
But even this change is fairly trivial. It just reflects the fact that
for sja1105, the VLANs from these 2 channels end up in the same hardware
table. However that is not necessarily true in the general sense (and
that's the reason for making this change).

The rest of the patch is mostly plain refactoring of "ds" -> "ctx". The
dsa_8021q_context structure needs to be propagated because adding a VLAN
is now done through the ops function pointers inside of it.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-11 17:30:43 -07:00
Vladimir Oltean
7e092af2f3 net: dsa: tag_8021q: setup tagging via a single function call
There is no point in calling dsa_port_setup_8021q_tagging for each
individual port. Additionally, it will become more difficult to do that
when we'll have a context structure to tag_8021q (next patch). So
refactor this now.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-11 17:30:43 -07:00
Nathan Chancellor
5978fac03e net: dsa: sja1105: Do not use address of compatible member in sja1105_check_device_id
Clang warns:

drivers/net/dsa/sja1105/sja1105_main.c:3418:38: warning: address of
array 'match->compatible' will always evaluate to 'true'
[-Wpointer-bool-conversion]
        for (match = sja1105_dt_ids; match->compatible; match++) {
        ~~~                          ~~~~~~~^~~~~~~~~~
1 warning generated.

We should check the value of the first character in compatible to see if
it is empty or not. This matches how the rest of the tree iterates over
IDs.

Fixes: 0b0e299720 ("net: dsa: sja1105: use detected device id instead of DT one on mismatch")
Link: https://github.com/ClangBuiltLinux/linux/issues/1139
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-24 16:13:25 -07:00
Vladimir Oltean
0b0e299720 net: dsa: sja1105: use detected device id instead of DT one on mismatch
Although we can detect the chip revision 100% at runtime, it is useful
to specify it in the device tree compatible string too, because
otherwise there would be no way to assess the correctness of device tree
bindings statically, without booting a board (only some switch versions
have internal RGMII delays and/or an SGMII port).

But for testing the P/Q/R/S support, what I have is a reworked board
with the SJA1105T replaced by a pin-compatible SJA1105Q, and I don't
want to keep a separate device tree blob just for this one-off board.
Since just the chip has been replaced, its RGMII delay setup is
inherently the same (meaning: delays added by the PHY on the slave
ports, and by PCB traces on the fixed-link CPU port).

For this board, I'd rather have the driver shout at me, but go ahead and
use what it found even if it doesn't match what it's been told is there.

[    2.970826] sja1105 spi0.1: Device tree specifies chip SJA1105T but found SJA1105Q, please fix it!
[    2.980010] sja1105 spi0.1: Probed switch chip: SJA1105Q
[    3.005082] sja1105 spi0.1: Enabled switch tagging

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-05 12:20:55 -07:00
Vladimir Oltean
af9fdd2bf8 net: dsa: sja1105: poll for extts events from a timer
The current poll interval is enough to ensure that rising and falling
edge events are not lost for a 1 PPS signal with 50% duty cycle.

But when we deliver the events to user space, it will try to infer if
they were corresponding to a rising or to a falling edge (the kernel
driver doesn't know that either). User space will try to make that
inference based on the time at which the PPS master had emitted the
pulse (i.e. if it's a .0 time, it's rising edge, if it's .5 time, it's
falling edge).

But there is no in-kernel API for retrieving the precise timestamp
corresponding to a PPS master (aka perout) pulse. So user space has to
guess even that. It will read the PTP time on the PPS master right after
we've delivered the extts event, and declare that the PPS master time
was just the closest integer second, based on 2 thresholds (lower than
.25, or higher than .75, and ignore anything else).

Except that, if we poll for extts events (and our hardware doesn't
really help us, by not providing an interrupt), then there is a risk
that the poll period (and therefore the time at which the event is
delivered) might confuse user space.

Because we are always scheduling the next extts poll at
SJA1105_EXTTS_INTERVAL "from now" (that's the only thing that the
schedule_delayed_work() API gives us), it means that the start time of
the next delayed workqueue will always be shifted to the right a little
bit (shifted with the SPI access duration of this workqueue run).
In turn, because user space sees extts events that are non-periodic
compared to the PPS master's time, this means that it might start making
wrong guesses about rising/falling edge.

To understand the effect, here is the output of ts2phc currently. Notice
the 'src' timestamps of the 'SKIP extts' events, and how they have a
large wander. They keep increasing until the upper limit for the ignore
threshold (.75 seconds), after which the application starts ignoring the
_other_ edge.

ts2phc[26.624]: /dev/ptp3 SKIP extts index 0 at 21.449898912 src 21.657784518
ts2phc[27.133]: adding tstamp 21.949894240 to clock /dev/ptp3
ts2phc[27.133]: adding tstamp 22.000000000 to clock /dev/ptp1
ts2phc[27.133]: /dev/ptp3 offset        640 s2 freq   +5112
ts2phc[27.636]: /dev/ptp3 SKIP extts index 0 at 22.449889360 src 22.669398022
ts2phc[28.140]: adding tstamp 22.949884376 to clock /dev/ptp3
ts2phc[28.140]: adding tstamp 23.000000000 to clock /dev/ptp1
ts2phc[28.140]: /dev/ptp3 offset         96 s2 freq   +4760
ts2phc[28.644]: /dev/ptp3 SKIP extts index 0 at 23.449879504 src 23.677420422
ts2phc[29.153]: adding tstamp 23.949874704 to clock /dev/ptp3
ts2phc[29.153]: adding tstamp 24.000000000 to clock /dev/ptp1
ts2phc[29.153]: /dev/ptp3 offset       -264 s2 freq   +4429
ts2phc[29.656]: /dev/ptp3 SKIP extts index 0 at 24.449870008 src 24.689407238
ts2phc[30.160]: adding tstamp 24.949865376 to clock /dev/ptp3
ts2phc[30.160]: adding tstamp 25.000000000 to clock /dev/ptp1
ts2phc[30.160]: /dev/ptp3 offset       -280 s2 freq   +4334
ts2phc[30.664]: /dev/ptp3 SKIP extts index 0 at 25.449860760 src 25.697449926
ts2phc[31.168]: adding tstamp 25.949856176 to clock /dev/ptp3
ts2phc[31.168]: adding tstamp 26.000000000 to clock /dev/ptp1
ts2phc[31.168]: /dev/ptp3 offset       -176 s2 freq   +4354
ts2phc[31.672]: /dev/ptp3 SKIP extts index 0 at 26.449851584 src 26.705433606
ts2phc[32.180]: adding tstamp 26.949846992 to clock /dev/ptp3
ts2phc[32.180]: adding tstamp 27.000000000 to clock /dev/ptp1
ts2phc[32.180]: /dev/ptp3 offset        -80 s2 freq   +4397
ts2phc[32.684]: /dev/ptp3 SKIP extts index 0 at 27.449842384 src 27.717415110
ts2phc[33.192]: adding tstamp 27.949837768 to clock /dev/ptp3
ts2phc[33.192]: adding tstamp 28.000000000 to clock /dev/ptp1
ts2phc[33.192]: /dev/ptp3 offset          0 s2 freq   +4453
ts2phc[33.696]: /dev/ptp3 SKIP extts index 0 at 28.449833128 src 28.729412902
ts2phc[34.200]: adding tstamp 28.949828472 to clock /dev/ptp3
ts2phc[34.200]: adding tstamp 29.000000000 to clock /dev/ptp1
ts2phc[34.200]: /dev/ptp3 offset          8 s2 freq   +4461
ts2phc[34.704]: /dev/ptp3 SKIP extts index 0 at 29.449823816 src 29.737416038
ts2phc[35.208]: adding tstamp 29.949819152 to clock /dev/ptp3
ts2phc[35.208]: adding tstamp 30.000000000 to clock /dev/ptp1
ts2phc[35.208]: /dev/ptp3 offset         -8 s2 freq   +4447
ts2phc[35.712]: /dev/ptp3 SKIP extts index 0 at 30.449814496 src 30.745554982
ts2phc[36.216]: adding tstamp 30.949809840 to clock /dev/ptp3
ts2phc[36.216]: adding tstamp 31.000000000 to clock /dev/ptp1
ts2phc[36.216]: /dev/ptp3 offset         -8 s2 freq   +4445
ts2phc[36.468]: /dev/ptp3 SKIP extts index 0 at 31.449805184 src 31.501109446
ts2phc[36.972]: adding tstamp 31.949800536 to clock /dev/ptp3
ts2phc[36.972]: adding tstamp 32.000000000 to clock /dev/ptp1
ts2phc[36.972]: /dev/ptp3 offset         -8 s2 freq   +4442
ts2phc[37.480]: /dev/ptp3 SKIP extts index 0 at 32.449795896 src 32.513320070
ts2phc[37.984]: adding tstamp 32.949791248 to clock /dev/ptp3
ts2phc[37.984]: adding tstamp 33.000000000 to clock /dev/ptp1
ts2phc[37.984]: /dev/ptp3 offset          0 s2 freq   +4448

Fix that by taking the following measures:
- Schedule the poll from a timer. Because we are really scheduling the
  timer periodically, the extts events delivered to user space are
  periodic too, and don't suffer from the "shift-to-the-right" effect.
- Increase the poll period to 6 times a second. This imposes a smaller
  upper bound to the shift that can occur to the delivery time of extts
  events, and makes user space (ts2phc) to always interpret correctly
  which events should be skipped and which shouldn't.
- Move the SPI readout itself to the main PTP kernel thread, instead of
  the generic workqueue. This is because the timer runs in atomic
  context, but is also better than before, because if needed, we can
  chrt & taskset this kernel thread, to ensure it gets enough priority
  under load.

After this patch, one can notice that the wander is greatly reduced, and
that the latencies of one extts poll are not propagated to the next. The
'src' timestamp that is skipped is never larger than .65 seconds (which
means .15 seconds larger than the time at which the real event occurred
at, and .10 seconds smaller than the .75 upper threshold for ignoring
the falling edge):

ts2phc[40.076]: adding tstamp 34.949261296 to clock /dev/ptp3
ts2phc[40.076]: adding tstamp 35.000000000 to clock /dev/ptp1
ts2phc[40.076]: /dev/ptp3 offset         48 s2 freq   +4631
ts2phc[40.568]: /dev/ptp3 SKIP extts index 0 at 35.449256496 src 35.595791078
ts2phc[41.064]: adding tstamp 35.949251744 to clock /dev/ptp3
ts2phc[41.064]: adding tstamp 36.000000000 to clock /dev/ptp1
ts2phc[41.064]: /dev/ptp3 offset       -224 s2 freq   +4374
ts2phc[41.552]: /dev/ptp3 SKIP extts index 0 at 36.449247088 src 36.579825574
ts2phc[42.044]: adding tstamp 36.949242456 to clock /dev/ptp3
ts2phc[42.044]: adding tstamp 37.000000000 to clock /dev/ptp1
ts2phc[42.044]: /dev/ptp3 offset       -240 s2 freq   +4290
ts2phc[42.536]: /dev/ptp3 SKIP extts index 0 at 37.449237848 src 37.563828774
ts2phc[43.028]: adding tstamp 37.949233264 to clock /dev/ptp3
ts2phc[43.028]: adding tstamp 38.000000000 to clock /dev/ptp1
ts2phc[43.028]: /dev/ptp3 offset       -144 s2 freq   +4314
ts2phc[43.520]: /dev/ptp3 SKIP extts index 0 at 38.449228656 src 38.547823238
ts2phc[44.012]: adding tstamp 38.949224048 to clock /dev/ptp3
ts2phc[44.012]: adding tstamp 39.000000000 to clock /dev/ptp1
ts2phc[44.012]: /dev/ptp3 offset        -80 s2 freq   +4335
ts2phc[44.508]: /dev/ptp3 SKIP extts index 0 at 39.449219432 src 39.535846118
ts2phc[44.996]: adding tstamp 39.949214816 to clock /dev/ptp3
ts2phc[44.996]: adding tstamp 40.000000000 to clock /dev/ptp1
ts2phc[44.996]: /dev/ptp3 offset        -32 s2 freq   +4359
ts2phc[45.488]: /dev/ptp3 SKIP extts index 0 at 40.449210192 src 40.515824678
ts2phc[45.980]: adding tstamp 40.949205568 to clock /dev/ptp3
ts2phc[45.980]: adding tstamp 41.000000000 to clock /dev/ptp1
ts2phc[45.980]: /dev/ptp3 offset          8 s2 freq   +4390
ts2phc[46.636]: /dev/ptp3 SKIP extts index 0 at 41.449200928 src 41.664176902
ts2phc[47.132]: adding tstamp 41.949196288 to clock /dev/ptp3
ts2phc[47.132]: adding tstamp 42.000000000 to clock /dev/ptp1
ts2phc[47.132]: /dev/ptp3 offset          0 s2 freq   +4384
ts2phc[47.620]: /dev/ptp3 SKIP extts index 0 at 42.449191656 src 42.648117190
ts2phc[48.112]: adding tstamp 42.949187016 to clock /dev/ptp3
ts2phc[48.112]: adding tstamp 43.000000000 to clock /dev/ptp1
ts2phc[48.112]: /dev/ptp3 offset          0 s2 freq   +4384
ts2phc[48.604]: /dev/ptp3 SKIP extts index 0 at 43.449182384 src 43.632112582
ts2phc[49.100]: adding tstamp 43.949177736 to clock /dev/ptp3
ts2phc[49.100]: adding tstamp 44.000000000 to clock /dev/ptp1
ts2phc[49.100]: /dev/ptp3 offset         -8 s2 freq   +4376
ts2phc[49.588]: /dev/ptp3 SKIP extts index 0 at 44.449173096 src 44.616136774
ts2phc[50.080]: adding tstamp 44.949168464 to clock /dev/ptp3
ts2phc[50.080]: adding tstamp 45.000000000 to clock /dev/ptp1
ts2phc[50.080]: /dev/ptp3 offset          8 s2 freq   +4390
ts2phc[50.572]: /dev/ptp3 SKIP extts index 0 at 45.449163816 src 45.600134662
ts2phc[51.064]: adding tstamp 45.949159160 to clock /dev/ptp3
ts2phc[51.064]: adding tstamp 46.000000000 to clock /dev/ptp1
ts2phc[51.064]: /dev/ptp3 offset         -8 s2 freq   +4376
ts2phc[51.556]: /dev/ptp3 SKIP extts index 0 at 46.449154528 src 46.584588550
ts2phc[52.048]: adding tstamp 46.949149896 to clock /dev/ptp3
ts2phc[52.048]: adding tstamp 47.000000000 to clock /dev/ptp1
ts2phc[52.048]: /dev/ptp3 offset          0 s2 freq   +4382
ts2phc[52.540]: /dev/ptp3 SKIP extts index 0 at 47.449145256 src 47.568132198
ts2phc[53.032]: adding tstamp 47.949140616 to clock /dev/ptp3
ts2phc[53.032]: adding tstamp 48.000000000 to clock /dev/ptp1
ts2phc[53.032]: /dev/ptp3 offset          0 s2 freq   +4382
ts2phc[53.524]: /dev/ptp3 SKIP extts index 0 at 48.449135968 src 48.552121446
ts2phc[54.016]: adding tstamp 48.949131320 to clock /dev/ptp3
ts2phc[54.016]: adding tstamp 49.000000000 to clock /dev/ptp1
ts2phc[54.016]: /dev/ptp3 offset          0 s2 freq   +4382
ts2phc[54.512]: /dev/ptp3 SKIP extts index 0 at 49.449126680 src 49.540147014
ts2phc[55.000]: adding tstamp 49.949122040 to clock /dev/ptp3
ts2phc[55.000]: adding tstamp 50.000000000 to clock /dev/ptp1
ts2phc[55.000]: /dev/ptp3 offset          0 s2 freq   +4382
ts2phc[55.492]: /dev/ptp3 SKIP extts index 0 at 50.449117400 src 50.520119078
ts2phc[55.988]: adding tstamp 50.949112768 to clock /dev/ptp3
ts2phc[55.988]: adding tstamp 51.000000000 to clock /dev/ptp1
ts2phc[55.988]: /dev/ptp3 offset          8 s2 freq   +4390
ts2phc[56.476]: /dev/ptp3 SKIP extts index 0 at 51.449108120 src 51.504175910
ts2phc[57.132]: adding tstamp 51.949103480 to clock /dev/ptp3
ts2phc[57.132]: adding tstamp 52.000000000 to clock /dev/ptp1
ts2phc[57.132]: /dev/ptp3 offset          0 s2 freq   +4384
ts2phc[57.624]: /dev/ptp3 SKIP extts index 0 at 52.449098840 src 52.651833574
ts2phc[58.116]: adding tstamp 52.949094200 to clock /dev/ptp3
ts2phc[58.116]: adding tstamp 53.000000000 to clock /dev/ptp1
ts2phc[58.116]: /dev/ptp3 offset          8 s2 freq   +4392
ts2phc[58.612]: /dev/ptp3 SKIP extts index 0 at 53.449089560 src 53.639826918
ts2phc[59.100]: adding tstamp 53.949084920 to clock /dev/ptp3
ts2phc[59.100]: adding tstamp 54.000000000 to clock /dev/ptp1
ts2phc[59.100]: /dev/ptp3 offset          8 s2 freq   +4394
ts2phc[59.592]: /dev/ptp3 SKIP extts index 0 at 54.449080272 src 54.619842278
ts2phc[60.084]: adding tstamp 54.949075624 to clock /dev/ptp3
ts2phc[60.084]: adding tstamp 55.000000000 to clock /dev/ptp1
ts2phc[60.084]: /dev/ptp3 offset          8 s2 freq   +4397
ts2phc[60.576]: /dev/ptp3 SKIP extts index 0 at 55.449070968 src 55.603885542
ts2phc[61.068]: adding tstamp 55.949066312 to clock /dev/ptp3
ts2phc[61.068]: adding tstamp 56.000000000 to clock /dev/ptp1
ts2phc[61.068]: /dev/ptp3 offset          0 s2 freq   +4391
ts2phc[61.560]: /dev/ptp3 SKIP extts index 0 at 56.449061680 src 56.587885798
ts2phc[62.052]: adding tstamp 56.949057032 to clock /dev/ptp3
ts2phc[62.052]: adding tstamp 57.000000000 to clock /dev/ptp1
ts2phc[62.052]: /dev/ptp3 offset         -8 s2 freq   +4383

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:16:02 -07:00
Po Liu
5f035af76e net:qos: police action offloading parameter 'burst' change to the original value
Since 'tcfp_burst' with TICK factor, driver side always need to recover
it to the original value, this patch moves the generic calculation and
recover to the 'burst' original value before offloading to device driver.

Signed-off-by: Po Liu <po.liu@nxp.com>
Acked-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-29 17:33:42 -07:00
David S. Miller
7bed145516 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Minor overlapping changes in xfrm_device.c, between the double
ESP trailing bug fix setting the XFRM_INIT flag and the changes
in net-next preparing for bonding encryption support.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-25 19:29:51 -07:00
Vladimir Oltean
43ce887c50 net: dsa: sja1105: fix tc-gate schedule with single element
The sja1105_gating_cfg_time_to_interval function does this, as per the
comments:

/* The gate entries contain absolute times in their e->interval field. Convert
 * that to proper intervals (i.e. "0, 5, 10, 15" to "5, 5, 5, 5").
 */

To perform that task, it iterates over gating_cfg->entries, at each step
updating the interval of the _previous_ entry. So one interval remains
to be updated at the end of the loop: the last one (since it isn't
"prev" for anyone else).

But there was an erroneous check, that the last element's interval
should not be updated if it's also the only element. I'm not quite sure
why that check was there, but it's clearly incorrect, as a tc-gate
schedule with a single element would get an e->interval of zero,
regardless of the duration requested by the user. The switch wouldn't
even consider this configuration as valid: it will just drop all traffic
that matches the rule.

Fixes: 834f8933d5 ("net: dsa: sja1105: implement tc-gate using time-triggered virtual links")
Reported-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-25 16:06:56 -07:00
Vladimir Oltean
82f6896a25 net: dsa: sja1105: recalculate gating subschedule after deleting tc-gate rules
Currently, tas_data->enabled would remain true even after deleting all
tc-gate rules from the switch ports, which would cause the
sja1105_tas_state_machine to get unnecessarily scheduled.

Also, if there were any errors which would prevent the hardware from
enabling the gating schedule, the sja1105_tas_state_machine would
continuously detect and print that, spamming the kernel log, even if the
rules were subsequently deleted.

The rules themselves are _not_ active, because sja1105_init_scheduling
does enough of a job to not install the gating schedule in the static
config. But the virtual link rules themselves are still present.

So call the functions that remove the tc-gate configuration from
priv->tas_data.gating_cfg, so that tas_data->enabled can be set to
false, and sja1105_tas_state_machine will stop from being scheduled.

Fixes: 834f8933d5 ("net: dsa: sja1105: implement tc-gate using time-triggered virtual links")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-25 16:06:56 -07:00
Vladimir Oltean
026bdb2b96 net: dsa: sja1105: unconditionally free old gating config
Currently sja1105_compose_gating_subschedule is not prepared to be
called for the case where we want to recompute the global tc-gate
configuration after we've deleted those actions on a port.

After deleting the tc-gate actions on the last port, max_cycle_time
would become zero, and that would incorrectly prevent
sja1105_free_gating_config from getting called.

So move the freeing function above the check for the need to apply a new
configuration.

Fixes: 834f8933d5 ("net: dsa: sja1105: implement tc-gate using time-triggered virtual links")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-25 16:06:56 -07:00