linux

Author	SHA1	Message	Date
David S. Miller	b42a738e40	Merge branch 'dsa-fdb-isolation' Vladimir Oltean says: ==================== DSA FDB isolation There are use cases which need FDB isolation between standalone ports and bridged ports, as well as isolation between ports of different bridges. Most of these use cases are a result of the fact that packets can now be partially forwarded by the software bridge, so one port might need to send a packet to the CPU but its FDB lookup will see that it can forward it directly to a bridge port where that packet was autonomously learned. So the source port will attempt to shortcircuit the CPU and forward autonomously, which it can't due to the forwarding isolation we have in place. So we will have packet drops instead of proper operation. Additionally, before DSA can implement IFF_UNICAST_FLT for standalone ports, we must have control over which database we install FDB entries corresponding to port MAC addresses in. We don't want to hinder the operation of the bridging layer. DSA does not have a driver API that encourages FDB isolation, so this needs to be created. The basis for this is a new struct dsa_db which annotates each FDB and MDB entry with the database it belongs to. The sja1105 and felix drivers are modified to observe the dsa_db argument, and therefore, enforce the FDB isolation. Compared to the previous RFC patch series from August: https://patchwork.kernel.org/project/netdevbpf/cover/20210818120150.892647-1-vladimir.oltean@nxp.com/ what is different is that I stopped trying to make SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE blocking, instead I'm making use of the fact that DSA waits for switchdev FDB work items to finish before a port leaves the bridge. This is possible since: https://patchwork.kernel.org/project/netdevbpf/patch/20211024171757.3753288-7-vladimir.oltean@nxp.com/ Additionally, v2 is also rebased over the DSA LAG FDB work. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-27 11:06:14 +00:00
Vladimir Oltean	54c3198460	net: mscc: ocelot: enforce FDB isolation when VLAN-unaware Currently ocelot uses a pvid of 0 for standalone ports and ports under a VLAN-unaware bridge, and the pvid of the bridge for ports under a VLAN-aware bridge. Standalone ports do not perform learning, but packets received on them are still subject to FDB lookups. So if the MAC DA that a standalone port receives has been also learned on a VLAN-unaware bridge port, ocelot will attempt to forward to that port, even though it can't, so it will drop packets. So there is a desire to avoid that, and isolate the FDBs of different bridges from one another, and from standalone ports. The ocelot switch library has two distinct entry points: the felix DSA driver and the ocelot switchdev driver. We need to code up a minimal bridge_num allocation in the ocelot switchdev driver too, this is copied from DSA with the exception that ocelot does not care about DSA trees, cross-chip bridging etc. So it only looks at its own ports that are already in the same bridge. The ocelot switchdev driver uses the bridge_num it has allocated itself, while the felix driver uses the bridge_num allocated by DSA. They are both stored inside ocelot_port->bridge_num by the common function ocelot_port_bridge_join() which receives the bridge_num passed by value. Once we have a bridge_num, we can only use it to enforce isolation between VLAN-unaware bridges. As far as I can see, ocelot does not have anything like a FID that further makes VLAN 100 from a port be different to VLAN 100 from another port with regard to FDB lookup. So we simply deny multiple VLAN-aware bridges. For VLAN-unaware bridges, we crop the 4000-4095 VLAN region and we allocate a VLAN for each bridge_num. This will be used as the pvid of each port that is under that VLAN-unaware bridge, for as long as that bridge is VLAN-unaware. VID 0 remains only for standalone ports. 
It is okay if all standalone ports use the same VID 0, since they perform no address learning, the FDB will contain no entry in VLAN 0, so the packets will always be flooded to the only possible destination, the CPU port. The CPU port module doesn't need to be member of the VLANs to receive packets, but if we use the DSA tag_8021q protocol, those packets are part of the data plane as far as ocelot is concerned, so there it needs to. Just ensure that the DSA tag_8021q CPU port is a member of all reserved VLANs when it is created, and is removed when it is deleted. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-27 11:06:14 +00:00
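The pvid scheme described in this commit can be modelled in a few lines. This is a minimal Python sketch, not the driver's C code. The function name and the exact mapping from bridge_num to a VID (counting down from 4095) are assumptions made for illustration; the message itself only states that the 4000-4095 region is reserved, that each bridge_num gets one VLAN, and that VID 0 remains for standalone ports.

```python
from typing import Optional

RSV_VLAN_START = 4000   # first reserved VID, per the commit message
RSV_VLAN_END = 4095     # last reserved VID, per the commit message


def vlan_unaware_pvid(bridge_num: Optional[int]) -> int:
    """Return the pvid for a port that is not under a VLAN-aware bridge.

    bridge_num is None for a standalone port. Mapping bridge_num to a
    reserved VID by counting down from 4095 is a hypothetical choice.
    """
    if bridge_num is None:
        return 0  # standalone ports share VID 0 and perform no learning
    vid = RSV_VLAN_END - bridge_num
    if bridge_num < 0 or vid < RSV_VLAN_START:
        raise ValueError("no reserved VLAN left for this bridge_num")
    return vid
```

Because every VLAN-unaware bridge gets its own VID, an FDB entry learned in one bridge cannot match a lookup performed in another bridge or on a standalone port.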
Vladimir Oltean	219827ef92	net: dsa: sja1105: enforce FDB isolation For sja1105, to enforce FDB isolation simply means to turn on Independent VLAN Learning unconditionally, and to remap VLAN-unaware FDB and MDB entries towards the private VLAN allocated by tag_8021q for each bridge. Standalone ports each have their own standalone tag_8021q VLAN. No learning happens in that VLAN due to: - learning being disabled on standalone user ports - learning being disabled on the CPU port (we use assisted_learning_on_cpu_port which only installs bridge FDBs) VLAN-aware ports learn FDB entries with the bridge VLANs. VLAN-unaware bridge ports learn with the tag_8021q VLAN for bridging. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-27 11:06:14 +00:00
Vladimir Oltean	06b9cce426	net: dsa: pass extack to .port_bridge_join driver methods As FDB isolation cannot be enforced between VLAN-aware bridges in lack of hardware assistance like extra FID bits, it seems plausible that many DSA switches cannot do it. Therefore, they need to reject configurations with multiple VLAN-aware bridges from the two code paths that can transition towards that state: - joining a VLAN-aware bridge - toggling VLAN awareness on an existing bridge The .port_vlan_filtering method already propagates the netlink extack to the driver, let's propagate it from .port_bridge_join too, to make sure that the driver can use the same function for both. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-27 11:06:14 +00:00
Vladimir Oltean	c26933639b	net: dsa: request drivers to perform FDB isolation For DSA, to encourage drivers to perform FDB isolation simply means to track which bridge does each FDB and MDB entry belong to. It then becomes the driver responsibility to use something that makes the FDB entry from one bridge not match the FDB lookup of ports from other bridges. The top-level functions where the bridge is determined are: - dsa_port_fdb_{add,del} - dsa_port_host_fdb_{add,del} - dsa_port_mdb_{add,del} - dsa_port_host_mdb_{add,del} aka the pre-crosschip-notifier functions. Changing the API to pass a reference to a bridge is not superfluous, and looking at the passed bridge argument is not the same as having the driver look at dsa_to_port(ds, port)->bridge from the ->port_fdb_add() method. DSA installs FDB and MDB entries on shared (CPU and DSA) ports as well, and those do not have any dp->bridge information to retrieve, because they are not in any bridge - they are merely the pipes that serve the user ports that are in one or multiple bridges. The struct dsa_bridge associated with each FDB/MDB entry is encapsulated in a larger "struct dsa_db" database. Although only databases associated to bridges are notified for now, this API will be the starting point for implementing IFF_UNICAST_FLT in DSA. There, the idea is to install FDB entries on the CPU port which belong to the corresponding user port's port database. These are supposed to match only when the port is standalone. It is better to introduce the API in its expected final form than to introduce it for bridges first, then to have to change drivers which may have made one or more assumptions. Drivers can use the provided bridge.num, but they can also use a different numbering scheme that is more convenient. DSA must perform refcounting on the CPU and DSA ports by also taking into account the bridge number. So if two bridges request the same local address, DSA must notify the driver twice, once for each bridge. 
In fact, if the driver supports FDB isolation, DSA must perform refcounting per bridge, but if the driver doesn't, DSA must refcount host addresses across all bridges, otherwise it would be telling the driver to delete an FDB entry for a bridge and the driver would delete it for all bridges. So introduce a bool fdb_isolation in drivers which would make all bridge databases passed to the cross-chip notifier have the same number (0). This makes dsa_mac_addr_find() -> dsa_db_equal() say that all bridge databases are the same database - which is essentially the legacy behavior. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-27 11:06:14 +00:00
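The refcounting rule in this commit can be illustrated with a small model. The sketch below is Python, not kernel code, and every class and method name in it is hypothetical. It only mirrors the behaviour described in the message: with fdb_isolation, host addresses are refcounted per bridge; without it, all bridge databases collapse to number 0 and compare equal.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class BridgeDb:
    """Stand-in for a bridge-type dsa_db; only the bridge number matters here."""
    bridge_num: int


class HostFdb:
    """Refcounts host FDB entries and records what the driver is told."""

    def __init__(self, fdb_isolation: bool):
        self.fdb_isolation = fdb_isolation
        self.refs = Counter()
        self.driver_calls = []  # (op, addr, vid, bridge_num) tuples

    def _key(self, addr, vid, db):
        # Without isolation every bridge database collapses to number 0,
        # so entries from different bridges compare equal.
        num = db.bridge_num if self.fdb_isolation else 0
        return (addr, vid, num)

    def add(self, addr, vid, db):
        key = self._key(addr, vid, db)
        self.refs[key] += 1
        if self.refs[key] == 1:
            self.driver_calls.append(("add",) + key)

    def delete(self, addr, vid, db):
        key = self._key(addr, vid, db)
        if self.refs[key] == 0:
            return  # nothing installed for this database
        self.refs[key] -= 1
        if self.refs[key] == 0:
            del self.refs[key]
            self.driver_calls.append(("del",) + key)
```

With isolation enabled, two bridges requesting the same address produce two driver notifications. Without it, the second request only bumps the refcount, and the entry is deleted once the last bridge releases it.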
Vladimir Oltean	b6362bdf75	net: dsa: tag_8021q: rename dsa_8021q_bridge_tx_fwd_offload_vid The dsa_8021q_bridge_tx_fwd_offload_vid is no longer used just for bridge TX forwarding offload, it is the private VLAN reserved for VLAN-unaware bridging in a way that is compatible with FDB isolation. So just rename it dsa_tag_8021q_bridge_vid. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-27 11:06:14 +00:00
Vladimir Oltean	04b67e18ce	net: dsa: tag_8021q: merge RX and TX VLANs In the old Shared VLAN Learning mode of operation that tag_8021q previously used for forwarding, we needed to have distinct concepts for an RX and a TX VLAN. An RX VLAN could be installed on all ports that were members of a given bridge, so that autonomous forwarding could still work, while a TX VLAN was dedicated for precise packet steering, so it just contained the CPU port and one egress port. Now that tag_8021q uses Independent VLAN Learning and imprecise RX/TX all over, those lines have been blurred and we no longer have the need to do precise TX towards a port that is in a bridge. As for standalone ports, it is fine to use the same VLAN ID for both RX and TX. This patch changes the tag_8021q format by shifting the VLAN range it reserves, and halving it. Previously, our DIR bits were encoding the VLAN direction (RX/TX) and were set to either 1 or 2. This meant that tag_8021q reserved 2K VLANs, or 50% of the available range. Change the DIR bits to a hardcoded value of 3 now, which makes tag_8021q reserve only 1K VLANs, and a different range now (the last 1K). This is done so that we leave the old format in place in case we need to return to it. In terms of code, the vid_is_dsa_8021q_rxvlan and vid_is_dsa_8021q_txvlan functions go away. Any vid_is_dsa_8021q is both a TX and an RX VLAN, and they are no longer distinct. For example, felix which did different things for different VLAN types, now needs to handle the RX and the TX logic for the same VLAN. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-27 11:06:14 +00:00
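The VLAN counts quoted in this commit follow directly from the DIR field. The Python sketch below checks the arithmetic. It assumes the DIR bits are the two most significant bits of the 12-bit VID, which is consistent with the 2K and 1K figures in the message; the helper name is hypothetical.

```python
DIR_SHIFT = 10                 # assumed: DIR is the top two bits of the VID
DIR_MASK = 0x3 << DIR_SHIFT


def reserved_vids(dir_values):
    """Return the sorted VIDs whose DIR field is one of dir_values."""
    return [vid for vid in range(4096)
            if (vid & DIR_MASK) >> DIR_SHIFT in dir_values]


old = reserved_vids({1, 2})    # old format: DIR was either 1 or 2
new = reserved_vids({3})       # new format: DIR hardcoded to 3
```

Under this assumption the old format covers VIDs 1024 through 3071 (2048 VLANs, half of the range) and the new format covers VIDs 3072 through 4095 (the last 1024 VLANs), so the two ranges do not overlap.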
Vladimir Oltean	08f44db3ab	net: dsa: felix: delete workarounds present due to SVL tag_8021q bridging The felix driver, which also has a tagging protocol implementation based on tag_8021q, does not care about adding the RX VLAN that is pvid on one port on the other ports that are in the same bridge with it. It simply doesn't need that, because in its implementation, the RX VLAN that is pvid of a port is only used to install a TCAM rule that pushes that VLAN ID towards the CPU port. Now that tag_8021q no longer performs Shared VLAN Learning based forwarding, the RX VLANs are actually segregated into two types: standalone VLANs and VLAN-unaware bridging VLANs. Since you actually have to call dsa_tag_8021q_bridge_join() to get a bridging VLAN from tag_8021q, and felix does not do that because it doesn't need it, it means that it only gets standalone port VLANs from tag_8021q. Which is perfect because this means it can drop its workarounds that avoid the VLANs it does not need. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-27 11:06:14 +00:00
Vladimir Oltean	d27656d02d	docs: net: dsa: sja1105: document limitations of tc-flower rule VLAN awareness After change "net: dsa: tag_8021q: replace the SVL bridging with VLAN-unaware IVL bridging", tag_8021q enforces two different pvids on a port, depending on whether it is standalone or in a VLAN-unaware bridge. Up until now, there was a single pvid, represented by dsa_tag_8021q_rx_vid(), and that was used as the VLAN for VLAN-unaware virtual link rules, regardless of whether the port was bridged or standalone. To keep VLAN-unaware virtual links working, we need to follow whether the port is in a bridge or not, and update the VLAN ID from those rules. In fact we can't fully do that. Depending on whether the switch is VLAN-aware or not, we can accept Virtual Link rules with just the MAC DA, or with a MAC DA and a VID. So we already deny changes to the VLAN awareness of the switch. But the VLAN awareness may also change as a result of joining or leaving a bridge. One might say we could just allow the following: a port may leave a VLAN-unaware bridge while it has VLAN-unaware VL (tc-flower) rules, and the driver will update those with the new tag_8021q pvid for standalone mode, but the driver won't accept joining a bridge at all while VL rules were installed in standalone mode. This is sort of a compromise made because leaving a bridge is an operation that cannot be vetoed. But this sort of setup change is not fully supported, either: as mentioned, VLAN filtering changes can also be triggered by leaving a bridge, therefore, the existing veto we have in place for turning VLAN filtering off with VLAN-aware VL rules active still isn't fully effective. I really don't know how to deal with this in a way that produces predictable behavior for user space. 
Since at the moment, keeping this feature fully functional on constellation changes (not changing the tag_8021q port pvid when joining a bridge) is blocking progress for the DSA FDB isolation, I'd rather document it as a (potentially temporary) limitation and go on without it. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-27 11:06:13 +00:00
Vladimir Oltean	d7f9787a76	net: dsa: tag_8021q: add support for imprecise RX based on the VBID The sja1105 switch can't populate the PORT field of the tag_8021q header when sending a frame to the CPU with a non-zero VBID. Similar to dsa_find_designated_bridge_port_by_vid() which performs imprecise RX for VLAN-aware bridges, let's introduce a helper in tag_8021q for performing imprecise RX based on the VLAN that it has allocated for a VLAN-unaware bridge. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-27 11:06:13 +00:00
Vladimir Oltean	91495f21fc	net: dsa: tag_8021q: replace the SVL bridging with VLAN-unaware IVL bridging For VLAN-unaware bridging, tag_8021q uses something perhaps a bit too tied with the sja1105 switch: each port uses the same pvid which is also used for standalone operation (a unique one from which the source port and device ID can be retrieved when packets from that port are forwarded to the CPU). Since each port has a unique pvid when performing autonomous forwarding, the switch must be configured for Shared VLAN Learning (SVL) such that the VLAN ID itself is ignored when performing FDB lookups. Without SVL, packets would always be flooded, since FDB lookup in the source port's VLAN would never find any entry. First of all, to make tag_8021q more palatable to switches which might not support Shared VLAN Learning, let's just use a common VLAN for all ports that are under the same bridge. Secondly, using Shared VLAN Learning means that FDB isolation can never be enforced. But if all ports under the same VLAN-unaware bridge share the same VLAN ID, it can. The disadvantage is that the CPU port can no longer perform precise source port identification for these packets. But at least we have a mechanism which has proven to be adequate for that situation: imprecise RX (dsa_find_designated_bridge_port_by_vid), which is what we use for termination on VLAN-aware bridges. The VLAN ID that VLAN-unaware bridges will use with tag_8021q is the same one as we were previously using for imprecise TX (bridge TX forwarding offload). It is already allocated, it is just a matter of using it. Note that because now all ports under the same bridge share the same VLAN, the complexity of performing a tag_8021q bridge join decreases dramatically. We no longer have to install the RX VLAN of a newly joining port into the port membership of the existing bridge ports. 
The newly joining port just becomes a member of the VLAN corresponding to that bridge, and the other ports are already members of it from when they joined the bridge themselves. So forwarding works properly. This means that we can unhook dsa_tag_8021q_bridge_{join,leave} from the cross-chip notifier level dsa_switch_bridge_{join,leave}. We can put these calls directly into the sja1105 driver. With this new mode of operation, a port controlled by tag_8021q can have two pvids whereas before it could only have one. The pvid for standalone operation is different from the pvid used for VLAN-unaware bridging. This is done, again, so that FDB isolation can be enforced. Let tag_8021q manage this by deleting the standalone pvid when a port joins a bridge, and restoring it when it leaves it. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-27 11:06:13 +00:00
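The pvid handling described at the end of this commit amounts to a small state machine per port. The following Python sketch is only a model: the class, its methods, and the VID values used with it are hypothetical and do not correspond to the tag_8021q API.

```python
class Tag8021qPort:
    """Models the two pvids a tag_8021q port can have."""

    def __init__(self, standalone_vid: int):
        self.standalone_vid = standalone_vid  # unique per port
        self.pvid = standalone_vid            # active pvid

    def bridge_join(self, bridge_vid: int) -> None:
        # The standalone pvid is removed and the VLAN shared by all
        # ports of the VLAN-unaware bridge becomes the pvid.
        self.pvid = bridge_vid

    def bridge_leave(self) -> None:
        # The standalone pvid is restored.
        self.pvid = self.standalone_vid
```

Two ports that join the same bridge end up with the same pvid, so forwarding between them works without installing each port's VLAN on the others, while a port that leaves goes back to its own isolated VLAN.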
David S. Miller	1bb1c5bc54	Merge branch 'FFungible-ethernet-driver' Dimitris Michailidis says: ==================== new Fungible Ethernet driver This patch series contains a new network driver for the Ethernet functionality of Fungible cards. It contains two modules. The first one in patch 2 is a library module that implements some of the device setup, queue management, and support for operating an admin queue. These are placed in a separate module because the cards provide a number of PCI functions handled by different types of drivers and all use the same common means to interact with the device. Each of the drivers will be relying on this library module for them. The remaining patches provide the Ethernet driver for the cards. v2: - Fix set_pauseparam, remove get_wol, remove module param (Andrew Lunn) - Fix a register poll loop (Andrew) - Replace constants defined with 'static const' - make W=1 C=1 is clean - Remove devlink FW update (Jakub) - Remove duplicate ethtool stats covered by structured API (Jakub) v3: - Make TLS stats unconditional (Andrew) - Remove inline from .c (Andrew) - Replace some ifdef with IS_ENABLED (Andrew) - Fix build failure on 32b arches (build robot) - Fix build issue with make O= (Jakub) v4: - Fix for newer bpf_warn_invalid_xdp_action() (Jakub) - Remove 32b dma_set_mask_and_coherent() v5: - Make XDP enter/exit non-disruptive to active traffic - Remove dormant port state - Style fixes, unused stuff removal (Jakub) v6: - When changing queue depth or numbers allocate the new queues before shutting down the existing ones (Jakub) v7: - Convert IRQ bookkeeping to use XArray. - Changes to the numbers of Tx/Rx queues are now incremental and do not disrupt ongoing traffic. - Implement .ndo_eth_ioctl instead of .ndo_do_ioctl. - Replace deprecated irq_set_affinity_hint. - Remove TLS 1.3 support (Jakub) - Remove hwtstamp_config.flags check (Jakub) - Add locking in SR-IOV enable/disable.
(Jakub) v8: - Remove dropping of <33B packets and the associated counter (Jakub) - Report CQE size. - Show last MAC stats when the netdev isn't running (Andrew) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-27 10:51:24 +00:00
Dimitris Michailidis	749efb1e6d	net/fungible: Kconfig, Makefiles, and MAINTAINERS Hook up the new driver to configuration and build. Signed-off-by: Dimitris Michailidis <dmichail@fungible.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-27 10:51:23 +00:00
Dimitris Michailidis	a3662007a1	net/funeth: add kTLS TX control part This provides the control pieces for kTLS Tx offload, implementing the offload operations. Signed-off-by: Dimitris Michailidis <dmichail@fungible.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-27 10:51:23 +00:00
Dimitris Michailidis	db37bc177d	net/funeth: add the data path Add the driver's data path. Tx handles skbs, XDP, and kTLS, Rx has skbs and XDP. Also included are Rx and Tx queue creation/tear-down and tracing. Signed-off-by: Dimitris Michailidis <dmichail@fungible.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-27 10:51:23 +00:00
Dimitris Michailidis	d1d899f244	net/funeth: devlink support The devlink part, which is minimal at this time giving just the driver name. Signed-off-by: Dimitris Michailidis <dmichail@fungible.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-27 10:51:23 +00:00
Dimitris Michailidis	21c5ea95da	net/funeth: ethtool operations Add ethtool operations, primarily related to queues and ports, as well as device statistics. Signed-off-by: Dimitris Michailidis <dmichail@fungible.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-27 10:51:23 +00:00
Dimitris Michailidis	ee6373ddf3	net/funeth: probing and netdev ops This is the first part of the Fungible ethernet driver. It deals with device probing, net_device creation, and netdev ops. Signed-off-by: Dimitris Michailidis <dmichail@fungible.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-27 10:51:23 +00:00
Dimitris Michailidis	e1ffcc6681	net/fungible: Add service module for Fungible drivers Fungible cards have a number of different PCI functions and thus different drivers, all of which use a common method to initialize and interact with the device. This commit adds a library module that collects these common mechanisms. They mainly deal with device initialization, setting up and destroying queues, and operating an admin queue. A subset of the FW interface is also included here. Signed-off-by: Dimitris Michailidis <dmichail@fungible.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-27 10:51:23 +00:00
Dimitris Michailidis	e8eb9e3299	PCI: Add Fungible Vendor ID to pci_ids.h Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: linux-pci@vger.kernel.org Signed-off-by: Dimitris Michailidis <dmichail@fungible.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-27 10:51:23 +00:00
David S. Miller	4aaa489538	Merge branch 'ip-neigh-skb-reason' Menglong Dong says: ==================== net: use kfree_skb_reason() for ip/neighbour In the series "net: use kfree_skb_reason() for ip/udp packet receive", reasons for skb drops were added to the packet receive path of the IP layer. Link: https://lore.kernel.org/netdev/20220205074739.543606-1-imagedong@tencent.com/ In the first patch of this series, skb drop reasons are added to the packet egress path of the IP layer. As kfree_skb() is not used frequently, I committed these changes at once and did not create a patch for every function involved. The following functions are handled: __ip_queue_xmit() ip_finish_output() ip_mc_finish_output() ip6_output() ip6_finish_output() ip6_finish_output2() The following new drop reasons are introduced (their meaning is described in their documentation): SKB_DROP_REASON_IP_OUTNOROUTES SKB_DROP_REASON_BPF_CGROUP_EGRESS SKB_DROP_REASON_IPV6DISABLED SKB_DROP_REASON_NEIGH_CREATEFAIL In the 2nd and 3rd patches, kfree_skb_reason() is used in the neighbour subsystem instead of kfree_skb(). __neigh_event_send() and arp_error_report() are involved, and the following new drop reasons are introduced: SKB_DROP_REASON_NEIGH_FAILED SKB_DROP_REASON_NEIGH_QUEUEFULL SKB_DROP_REASON_NEIGH_DEAD Changes since v2: - fix typo in the 1st patch of 'SKB_DROP_REASON_IPV6DSIABLED' reported by Roman Changes since v1: - introduce SKB_DROP_REASON_NEIGH_CREATEFAIL for some path in the 1st patch - introduce SKB_DROP_REASON_NEIGH_DEAD in the 2nd patch - simplify the document for the new drop reasons, as David Ahern suggested ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-26 12:53:59 +00:00
Menglong Dong	56d4b4e48a	net: neigh: add skb drop reasons to arp_error_report() When a neighbour becomes invalid or is destroyed, neigh_invalidate() will be called. neigh->ops->error_report() will be called if the neighbour's state is NUD_FAILED, and this seems to be the only use of error_report(). So we can tell that the reason for skb drops in arp_error_report() is SKB_DROP_REASON_NEIGH_FAILED. Replace kfree_skb() used in arp_error_report() with kfree_skb_reason(). Reviewed-by: Mengen Sun <mengensun@tencent.com> Reviewed-by: Hao Peng <flyingpeng@tencent.com> Signed-off-by: Menglong Dong <imagedong@tencent.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-26 12:53:59 +00:00
Menglong Dong	a5736edda1	net: neigh: use kfree_skb_reason() for __neigh_event_send() Replace kfree_skb() used in __neigh_event_send() with kfree_skb_reason(). Following drop reasons are added: SKB_DROP_REASON_NEIGH_FAILED SKB_DROP_REASON_NEIGH_QUEUEFULL SKB_DROP_REASON_NEIGH_DEAD The first two reasons above should be the hot path that skb drops in neighbour subsystem. Reviewed-by: Mengen Sun <mengensun@tencent.com> Reviewed-by: Hao Peng <flyingpeng@tencent.com> Signed-off-by: Menglong Dong <imagedong@tencent.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-26 12:53:59 +00:00
Menglong Dong	5e187189ec	net: ip: add skb drop reasons for ip egress path Replace kfree_skb() which is used in the packet egress path of IP layer with kfree_skb_reason(). Functions that are involved include: __ip_queue_xmit() ip_finish_output() ip_mc_finish_output() ip6_output() ip6_finish_output() ip6_finish_output2() Following new drop reasons are introduced: SKB_DROP_REASON_IP_OUTNOROUTES SKB_DROP_REASON_BPF_CGROUP_EGRESS SKB_DROP_REASON_IPV6DISABLED SKB_DROP_REASON_NEIGH_CREATEFAIL Reviewed-by: Mengen Sun <mengensun@tencent.com> Reviewed-by: Hao Peng <flyingpeng@tencent.com> Signed-off-by: Menglong Dong <imagedong@tencent.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-26 12:53:58 +00:00
David S. Miller	0cc70c6eec	Merge branch 'dsa-ocelot-phylink-updates' Russell King says: ==================== net: dsa: ocelot: phylink updates This series updates the Ocelot DSA driver for some of the recent phylink changes. Specifically, we fill in the supported_interfaces fields, convert to mac_select_pcs and mark the driver as non-legacy. We do not convert to phylink_generic_validate() as Ocelot has special support for its rate adapting PCS which makes the generic validate method unsuitable for this driver. The three changes mentioned above are implemented in their own separate patches with one additional cleanup: 1) Populate the supported_interfaces bitmap 2) Remove the now unnecessary interface checks in the validate methods 3) Convert from phylink_set_pcs() to .mac_select_pcs. 4) Mark the driver as non-legacy Thanks. RFC -> non-RFC: add reviewed-by/tested-by's, update patch 1 to set the supported_interfaces bitmap in felix.c rather than the sub-drivers as requested by Vladimir. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-26 12:44:29 +00:00
Russell King (Oracle)	f6f04c0204	net: dsa: ocelot: mark as non-legacy The ocelot DSA driver does not make use of the speed, duplex, pause or advertisement in its phylink_mac_config() implementation, so it can be marked as a non-legacy driver. Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-26 12:44:29 +00:00
Russell King (Oracle)	864ba485ac	net: dsa: ocelot: convert to mac_select_pcs() Convert the PCS selection to use mac_select_pcs, which allows the PCS to perform any validation it needs, and removes the need to set the PCS in the mac_config() callback, delving into the higher DSA levels to do so. Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-26 12:44:28 +00:00
Russell King (Oracle)	e57a15401e	net: dsa: ocelot: remove interface checks When the supported interfaces bitmap is populated, phylink will itself check that the interface mode is present in this bitmap. Drivers no longer need to perform this check themselves. Remove these checks. Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-26 12:44:28 +00:00
Russell King (Oracle)	79fda660bd	net: dsa: ocelot: populate supported_interfaces Populate the supported interfaces bitmap for the Ocelot DSA switches. Since all sub-drivers only support a single interface mode, defined by ocelot_port->phy_mode, we can handle this in the main driver code without reference to the sub-driver. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-26 12:44:28 +00:00
Jakub Kicinski	3e120e4580	Merge branch 'small-fixes-for-mctp' Matt Johnston says: ==================== Small fixes for MCTP This series has 3 fixes for MCTP. ==================== Link: https://lore.kernel.org/r/20220225053938.643605-1-matt@codeconstruct.com.au Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-25 22:23:37 -08:00
Matt Johnston	33f5d1a9d9	mctp i2c: Fix hard head TX bounds length check We should be testing the length before fitting into the u8 byte_count. This is just a sanity check, the MCTP stack should have limited to MTU which is checked, and we check consistency later in mctp_i2c_xmit(). Found by Smatch mctp_i2c_header_create() warn: impossible condition '(hdr->byte_count > 255) => (0-255 > 255)' Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-25 22:23:33 -08:00
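The Smatch warning quoted in this commit is about checking a value after it has already been narrowed to 8 bits. The Python sketch below models that ordering problem with hypothetical function names. Masking with 0xFF stands in for storing into a u8 field, and the code is not taken from mctp_i2c_header_create().

```python
from typing import Optional


def byte_count_checked_late(length: int) -> Optional[int]:
    """Problematic order: narrow to 8 bits first, then check the bound."""
    byte_count = length & 0xFF   # models assignment to a u8 field
    if byte_count > 255:         # impossible condition, never true
        return None
    return byte_count


def byte_count_checked_early(length: int) -> Optional[int]:
    """Fixed order: test the full-width length before narrowing."""
    if length > 255:
        return None              # reject instead of silently wrapping
    return length & 0xFF
```

For an oversized length such as 300, the late check lets a wrapped value through, while the early check rejects it.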
Matt Johnston	06bf1ce69d	mctp i2c: Fix potential use-after-free The skb is handed off to netif_rx() which may free it. Found by Smatch. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-25 22:23:29 -08:00
Matt Johnston	f62457df5c	mctp: Avoid warning if unregister notifies twice Previously if an unregister notify handler ran twice (waiting for netdev to be released) it would print a warning in mctp_unregister() every subsequent time the unregister notify occurred. Instead we only need to worry about the case where a mctp_ptr is set on an unknown device type. Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-25 22:23:23 -08:00
Wong Vee Khee	23d7433011	stmmac: intel: Enable 2.5Gbps for Intel AlderLake-S The Intel AlderLake-S platform is capable of running at a 2.5Gbps link speed. This patch enables 2.5Gbps link speed on the AlderLake-S platform. Signed-off-by: Wong Vee Khee <vee.khee.wong@linux.intel.com> Link: https://lore.kernel.org/r/20220225023325.474242-1-vee.khee.wong@linux.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-25 22:22:09 -08:00
Colin Ian King	38455fbcc8	net: dsa: qca8k: return with -EINVAL on invalid port Currently an invalid port throws a WARN_ON warning however invalid uninitialized values in reg and cpu_port_index are being used later on. Fix this by returning -EINVAL for an invalid port value. Addresses clang-scan warnings: drivers/net/dsa/qca8k.c:1981:3: warning: 2nd function call argument is an uninitialized value [core.CallAndMessage] drivers/net/dsa/qca8k.c:1999:9: warning: 2nd function call argument is an uninitialized value [core.CallAndMessage] Fixes: `7544b3ff74` ("net: dsa: qca8k: move pcs configuration") Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Link: https://lore.kernel.org/r/20220224220557.147075-1-colin.i.king@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-25 22:21:24 -08:00
David S. Miller	5ebaaa69bd	Merge branch 'sja1105-phylink-updates' Russell King says: ==================== net: dsa: sja1105: phylink updates This series updates the phylink implementation in sja1105 to use the supported_interfaces bitmap, convert to the mac_select_pcs() interface, mark as non-legacy, and get rid of the validation method. As a final step, enable switching between SGMII and 2500BASE-X as it is a feature that Vladimir desires. Specifically, the patches in this series: 1. Populates the supported_interfaces bitmap. 2. As a result of the supported_interfaces bitmap being populated, sja1105 no longer needs to check the interface mode as phylink will do this. 3. Switch away from using phylink_set_pcs(), using the mac_select_pcs() method instead. 4. Mark the driver as not-legacy 5. Fill in mac_capabilities using _exactly_ the same conditions as is currently used to decide which link modes to support, and convert to use phylink_generic_validate() 6. Add brand new support to permit switching between SGMII and 2500BASE-X modes of operation as per Vladimir's single patch that performs steps 1, 2, 5 and 6 in one go. There are some additional changes in Vladimir's single patch that I have not included: * validation of priv->phy_mode[] in sja1105_phylink_get_caps(). The driver has already validated the phy_mode for each port in sja1105_init_mii_settings(), and a failure here will prevent the driver reaching sja1105_phylink_get_caps(). * Changing the decisions on which mac_capabilities to set. Vladimir's patch always sets MAC_10FD | MAC_100FD | MAC_1000FD despite the current code clearly making the 1G speed conditional on the xmii_mode for the port. The change in decision making may be visible when in PHY_INTERFACE_MODE_INTERNAL mode, for which the phylink_generic_validate() will pass through all the MAC capabilities as ethtool link modes. 
Hence, if we have PHY_INTERFACE_MODE_INTERNAL but supports_rgmii[] or supports_sgmii[] is non-zero, currently we do not get 1G speeds. With Vladimir's additional change, we will get 1G speeds. While it is not clear whether that can happen, I feel changing the decision making should be a separate patch. * The decision for MAC_2500FD is made differently - sja1105_init_mii_settings() allows PHY_INTERFACE_MODE_2500BASEX when supports_2500basex[] is non-zero, and is not based on any other condition such as supports_sgmii[] or supports_rgmii[]. Vladimir's patch makes it additionally conditional on those supports_.gmii[] settings, which is a functional change that should be made in a separate patch - and if desired, then sja1105_init_mii_settings() should also be updated at the same time. Consequently, I believe that my previous objections to Vladimir's single patch approach are well founded and justified, even though Vladimir is the maintainer of this driver. I have no objection to the additional changes, I just don't think they should all be wrapped up into a single patch that converts the way validation is done _and_ also makes a bunch of other functional changes. RFC->non-RFC: added Vladimir's Reviewed-by's, fixed the typo in the commit message of patch 6, and removed the phrase at the end of a comment as requested. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-25 12:47:20 +00:00
Russell King (Oracle)	83dc4c2af6	net: dsa: sja1105: support switching between SGMII and 2500BASE-X Vladimir Oltean suggests that sja1105 can support switching between SGMII and 2500BASE-X modes. Augment sja1105_phylink_get_caps() to fill in both interface modes if they can be supported. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-25 12:47:20 +00:00
Russell King (Oracle)	9c318be13c	net: dsa: sja1105: convert to phylink_generic_validate() Populate the MAC capabilities for the SJA1105 DSA switch using the same decision making which sja1105_phylink_validate() uses. Remove the now obsolete sja1105_phylink_validate() implementation to allow DSA to use phylink_generic_validate() for this switch driver. As noted by Vladimir, this fixes an inconsequential bug which allowed gigabit and lower interface modes to be indicated when operating in 2500base-X mode. Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-25 12:47:19 +00:00
Russell King (Oracle)	2d1d548ec1	net: dsa: sja1105: mark as non-legacy The sja1105 DSA driver does not have a phylink_mac_config() method implementation, it is safe to mark this as a non-legacy driver. Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-25 12:47:19 +00:00
Russell King (Oracle)	827b4ef277	net: dsa: sja1105: use .mac_select_pcs() interface Convert the PCS selection to use mac_select_pcs, which allows the PCS to perform any validation it needs, and removes the need to set the PCS in the mac_config() callback, delving into the higher DSA levels to do so. Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-25 12:47:19 +00:00
Russell King (Oracle)	c2b8e1e3d8	net: dsa: sja1105: remove interface checks When the supported interfaces bitmap is populated, phylink will itself check that the interface mode is present in this bitmap. Drivers no longer need to perform this check themselves. Remove these checks. Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-25 12:47:19 +00:00
Russell King (Oracle)	a420b757ac	net: dsa: sja1105: populate supported_interfaces Populate the supported interfaces bitmap for the SJA1105 DSA switch. This switch only supports a static model of configuration, so we restrict the interface modes to the configured setting. Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-25 12:47:19 +00:00
Toms Atteka	28a3f06017	net: openvswitch: IPv6: Add IPv6 extension header support This change adds a new OpenFlow field OFPXMT_OFB_IPV6_EXTHDR and packets can be filtered using ipv6_ext flag. Signed-off-by: Toms Atteka <cpp.code.lv@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-25 10:32:55 +00:00
Jakub Kicinski	a46e3d5eb7	Merge branch 'nfp-flow-independent-tc-action-hardware-offload' Simon Horman says: ==================== nfp: flow-independent tc action hardware offload Baowen Zheng says: Allow nfp NIC to offload tc actions independent of flows. The motivation for this work is to offload tc actions independent of flows for nfp NIC. We allow nfp driver to provide hardware offload of OVS metering feature - which calls for policers that may be used by multiple flows and whose lifecycle is independent of any flows that use them. When nfp driver tries to offload a flow table using the independent action, the driver will search if the action is already offloaded to the hardware. If not, the flow table offload will fail. When the nfp NIC succeeds in offloading an action, the user can check in_hw_count when dumping the tc action. Tc cli command to offload and dump an action: # tc actions add action police rate 100mbit burst 10000k index 200 skip_sw # tc -s -d actions list action police total acts 1 action order 0: police 0xc8 rate 100Mbit burst 10000Kb mtu 2Kb action reclassify overhead 0b linklayer ethernet ref 1 bind 0 installed 142 sec used 0 sec Action statistics: Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 skip_sw in_hw in_hw_count 1 used_hw_stats delayed ==================== Link: https://lore.kernel.org/r/20220223162302.97609-1-simon.horman@corigine.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-24 21:51:11 -08:00
Baowen Zheng	5e98743cfa	nfp: add NFP_FL_FEATS_QOS_METER to host features to enable meter offload Add NFP_FL_FEATS_QOS_METER to host features to enable meter offload in driver. Before adding this feature, we will not offload any police action since we will check the host features before offloading any police action. Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com> Signed-off-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-24 21:51:08 -08:00
Baowen Zheng	147747ec66	nfp: add support to offload police action from flower table Offload flow table if the action is already offloaded to hardware when flow table uses this action. Change meter id to type of u32 to support all the action index. Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com> Signed-off-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-24 21:51:08 -08:00
Baowen Zheng	776178a5cc	nfp: add process to get action stats from hardware Add a process to update action stats from hardware. This stats data will be updated to tc action when dumping actions or filters. Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com> Signed-off-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-24 21:51:07 -08:00
Baowen Zheng	26ff98d7dd	nfp: add hash table to store meter table Add a hash table to store meter table. This meter table will also be used by flower action. Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com> Signed-off-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-24 21:51:07 -08:00
Baowen Zheng	59080da090	nfp: add support to offload tc action to hardware Add process to offload tc action to hardware. Currently we only support offloading the police action. Add meter capability to check if firmware supports meter offload. Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com> Signed-off-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-24 21:51:07 -08:00
Baowen Zheng	bbab5f9332	nfp: refactor policer config to support ingress/egress meter Add a policer API to support ingress/egress meter. Change ingress police to be compatible with the new API. Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com> Signed-off-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-24 21:51:07 -08:00
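
The "mctp i2c: Fix hard head TX bounds length check" entry (33f5d1a9d9) describes testing a length before it is stored in a u8 byte_count, since a check made after narrowing can never fail. The following is a minimal user-space sketch of that pattern only, not the driver code: struct demo_hdr, DEMO_HDR_OVERHEAD and demo_hdr_fill() are hypothetical names invented for illustration, and the one-byte overhead is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header with a one-byte length field (not the MCTP layout). */
struct demo_hdr {
	uint8_t byte_count;
};

/* Assumed fixed overhead counted in byte_count, chosen for illustration. */
#define DEMO_HDR_OVERHEAD 1u

/*
 * Validate the full-width length first, then narrow it.
 * Checking "hdr->byte_count > 255" after the assignment would be an
 * impossible condition, because a uint8_t can only hold 0..255.
 */
static int demo_hdr_fill(struct demo_hdr *hdr, size_t payload_len)
{
	assert(hdr != NULL);

	/* Reject anything that would not fit once the overhead is added. */
	if (payload_len > UINT8_MAX - DEMO_HDR_OVERHEAD)
		return -EINVAL;

	hdr->byte_count = (uint8_t)(payload_len + DEMO_HDR_OVERHEAD);
	return 0;
}
```

With these assumed values, a payload of 254 bytes is the largest accepted; anything longer is rejected before the narrowing assignment takes place.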


1075135 Commits