linux/include/soc/mscc/ocelot.h

/* SPDX-License-Identifier: (GPL-2.0 OR MIT) */
/* Copyright (c) 2017 Microsemi Corporation
 */
#ifndef _SOC_MSCC_OCELOT_H
#define _SOC_MSCC_OCELOT_H
#include <linux/ptp_clock_kernel.h>
#include <linux/net_tstamp.h>
#include <linux/if_vlan.h>
#include <linux/regmap.h>
#include <net/dsa.h>
/* Port Group IDs (PGID) are masks of destination ports.
 *
 * For L2 forwarding, the switch performs 3 lookups in the PGID table for each
 * frame, and forwards the frame to the ports that are present in the logical
 * AND of all 3 PGIDs.
 *
 * These PGID lookups are:
 * - In one of PGID[0-63]: for the destination masks. There are 2 paths by
 *   which the switch selects a destination PGID:
 *     - The {DMAC, VID} is present in the MAC table. In that case, the
 *       destination PGID is given by the DEST_IDX field of the MAC table entry
 *       that matched.
 *     - The {DMAC, VID} is not present in the MAC table (it is unknown). The
 *       frame is classified as being either unicast, multicast or broadcast,
 *       and according to that, the destination PGID is chosen as being the
 *       value contained by ANA_FLOODING_FLD_UNICAST,
 *       ANA_FLOODING_FLD_MULTICAST or ANA_FLOODING_FLD_BROADCAST.
 *   The destination PGID can be a unicast set: the first PGIDs, 0 to
 *   ocelot->num_phys_ports - 1, or a multicast set: the PGIDs from
 *   ocelot->num_phys_ports to 63. By convention, a unicast PGID corresponds to
 *   a physical port and has a single bit set in the destination ports mask:
 *   that corresponding to the port number itself. In contrast, a multicast
 *   PGID will have potentially more than one bit set in the destination
 *   ports mask.
 * - In one of PGID[64-79]: for the aggregation mask. The switch classifier
 *   dissects each frame and generates a 4-bit Link Aggregation Code which is
 *   used for this second PGID table lookup. The goal of link aggregation is to
 *   hash multiple flows within the same LAG on to different destination ports.
 *   The first lookup will result in a PGID with all the LAG members present in
 *   the destination ports mask, and the second lookup, by Link Aggregation
 *   Code, will ensure that each flow gets forwarded only to a single port out
 *   of that mask (there are no duplicates).
 * - In one of PGID[80-90]: for the source mask. The third time, the PGID table
 *   is indexed with the ingress port (plus 80). These PGIDs answer the
 *   question "is port i allowed to forward traffic to port j?" If yes, then
 *   BIT(j) of PGID 80+i will be found set. The third PGID lookup can be used
 *   to enforce the L2 forwarding matrix imposed by e.g. a Linux bridge.
 */
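/* Illustration only, not part of the driver: a minimal sketch of the three
 * PGID lookups described above, modelled on a plain array instead of the
 * hardware table. Every sketch_* / SKETCH_* name is hypothetical, and the
 * table size of 91 entries is an assumption derived from the PGID[80-90]
 * range mentioned in the comment.
 */

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PGID_AGGR	64	/* first aggregation PGID */
#define SKETCH_PGID_SRC		80	/* first source PGID */
#define SKETCH_NUM_PGIDS	91	/* assumed: PGID[0-90] */

/* Return the set of egress ports for one frame: the logical AND of the
 * destination mask, the aggregation mask and the source mask.
 */
static uint32_t sketch_fwd_mask(const uint32_t pgid[SKETCH_NUM_PGIDS],
				int dest_idx, int aggr_code, int src_port)
{
	assert(dest_idx >= 0 && dest_idx < SKETCH_PGID_AGGR);
	assert(aggr_code >= 0 && aggr_code < 16);	/* 4-bit code */
	assert(src_port >= 0 &&
	       SKETCH_PGID_SRC + src_port < SKETCH_NUM_PGIDS);

	return pgid[dest_idx] &
	       pgid[SKETCH_PGID_AGGR + aggr_code] &
	       pgid[SKETCH_PGID_SRC + src_port];
}
```

/* Example: if the destination PGID holds ports {2, 3} (a LAG), the
 * aggregation PGID for this flow holds ports {0, 2}, and the source PGID of
 * the ingress port holds ports {1, 2, 3}, the frame leaves on port 2 only.
 */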
/* Reserve some destination PGIDs at the end of the range:
 * PGID_CPU: used for whitelisting certain MAC addresses, such as the addresses
 *           of the switch port net devices, towards the CPU port module.
 * PGID_UC: the flooding destinations for unknown unicast traffic.
 * PGID_MC: the flooding destinations for broadcast and non-IP multicast
 *          traffic.
 * PGID_MCIPV4: the flooding destinations for IPv4 multicast traffic.
 * PGID_MCIPV6: the flooding destinations for IPv6 multicast traffic.
 */
#define PGID_CPU 59
#define PGID_UC 60
#define PGID_MC 61
#define PGID_MCIPV4 62
#define PGID_MCIPV6 63
#define for_each_unicast_dest_pgid(ocelot, pgid)		\
	for ((pgid) = 0;					\
	     (pgid) < (ocelot)->num_phys_ports;			\
	     (pgid)++)

#define for_each_nonreserved_multicast_dest_pgid(ocelot, pgid)	\
	for ((pgid) = (ocelot)->num_phys_ports + 1;		\
	     (pgid) < PGID_CPU;					\
	     (pgid)++)

#define for_each_aggr_pgid(ocelot, pgid)			\
	for ((pgid) = PGID_AGGR;				\
	     (pgid) < PGID_SRC;					\
	     (pgid)++)
/* Aggregation PGIDs, one per Link Aggregation Code */
#define PGID_AGGR 64
/* Source PGIDs, one per physical port */
#define PGID_SRC 80
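/* Illustration only: a sketch of how a PGID index falls into the ranges
 * walked by the for_each_*_pgid() helpers above. All sketch_* / SKETCH_*
 * names are hypothetical. Destination PGIDs that none of the helpers visit
 * (index num_phys_ports, and the reserved PGID_CPU..PGID_MCIPV6 block) are
 * reported as "other"; the upper bound of the source range is not checked.
 */

```c
#include <assert.h>

#define SKETCH_PGID_CPU		59
#define SKETCH_PGID_AGGR	64
#define SKETCH_PGID_SRC		80

struct sketch_switch {
	int num_phys_ports;
};

enum sketch_pgid_kind {
	SKETCH_PGID_UNICAST_DEST,	/* for_each_unicast_dest_pgid */
	SKETCH_PGID_MULTICAST_DEST,	/* for_each_nonreserved_multicast_dest_pgid */
	SKETCH_PGID_OTHER_DEST,		/* destination PGID not visited above */
	SKETCH_PGID_AGGR_KIND,		/* for_each_aggr_pgid */
	SKETCH_PGID_SRC_KIND,		/* source PGIDs, from 80 upwards */
};

static enum sketch_pgid_kind
sketch_classify_pgid(const struct sketch_switch *sw, int pgid)
{
	assert(pgid >= 0);

	if (pgid < sw->num_phys_ports)
		return SKETCH_PGID_UNICAST_DEST;
	if (pgid > sw->num_phys_ports && pgid < SKETCH_PGID_CPU)
		return SKETCH_PGID_MULTICAST_DEST;
	if (pgid < SKETCH_PGID_AGGR)
		return SKETCH_PGID_OTHER_DEST;
	if (pgid < SKETCH_PGID_SRC)
		return SKETCH_PGID_AGGR_KIND;
	return SKETCH_PGID_SRC_KIND;
}
```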
#define IFH_INJ_BYPASS BIT(31)
#define IFH_INJ_POP_CNT_DISABLE (3 << 28)
#define IFH_TAG_TYPE_C 0
#define IFH_TAG_TYPE_S 1
#define IFH_REW_OP_NOOP 0x0
#define IFH_REW_OP_DSCP 0x1
#define IFH_REW_OP_ONE_STEP_PTP 0x2
#define IFH_REW_OP_TWO_STEP_PTP 0x3
#define IFH_REW_OP_ORIGIN_PTP 0x5
#define OCELOT_NUM_TC 8
#define OCELOT_TAG_LEN 16
#define OCELOT_SHORT_PREFIX_LEN 4
#define OCELOT_LONG_PREFIX_LEN 16
#define OCELOT_TOTAL_TAG_LEN (OCELOT_SHORT_PREFIX_LEN + OCELOT_TAG_LEN)
#define OCELOT_SPEED_2500 0
#define OCELOT_SPEED_1000 1
#define OCELOT_SPEED_100 2
#define OCELOT_SPEED_10 3
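/* Illustration only: a sketch of translating a link speed in Mbps to the
 * OCELOT_SPEED_* encoding above and back. The helper names are hypothetical,
 * and this header does not say which register field consumes these values,
 * so the sketch only covers the numeric mapping.
 */

```c
#include <assert.h>

#define SKETCH_SPEED_2500	0
#define SKETCH_SPEED_1000	1
#define SKETCH_SPEED_100	2
#define SKETCH_SPEED_10		3

/* Return the encoded speed, or -1 if the speed has no encoding. */
static int sketch_speed_to_hw(int mbps)
{
	switch (mbps) {
	case 2500:
		return SKETCH_SPEED_2500;
	case 1000:
		return SKETCH_SPEED_1000;
	case 100:
		return SKETCH_SPEED_100;
	case 10:
		return SKETCH_SPEED_10;
	default:
		return -1;
	}
}

/* Return the link speed in Mbps for a valid encoded value. */
static int sketch_hw_to_speed(int code)
{
	static const int mbps[] = { 2500, 1000, 100, 10 };

	assert(code >= SKETCH_SPEED_2500 && code <= SKETCH_SPEED_10);
	return mbps[code];
}
```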
#define OCELOT_PTP_PINS_NUM 4
#define TARGET_OFFSET 24
#define REG_MASK GENMASK(TARGET_OFFSET - 1, 0)
#define REG(reg, offset) [reg & REG_MASK] = offset
#define REG_RESERVED_ADDR 0xffffffff
#define REG_RESERVED(reg) REG(reg, REG_RESERVED_ADDR)
enum ocelot_target {
ANA = 1,
QS,
QSYS,
REW,
SYS,
S0,
S1,
S2,
HSIO,
PTP,
GCB,
DEV_GMII,
TARGET_MAX,
};
enum ocelot_reg {
ANA_ADVLEARN = ANA << TARGET_OFFSET,
ANA_VLANMASK,
ANA_PORT_B_DOMAIN,
ANA_ANAGEFIL,
ANA_ANEVENTS,
ANA_STORMLIMIT_BURST,
ANA_STORMLIMIT_CFG,
ANA_ISOLATED_PORTS,
ANA_COMMUNITY_PORTS,
ANA_AUTOAGE,
ANA_MACTOPTIONS,
ANA_LEARNDISC,
ANA_AGENCTRL,
ANA_MIRRORPORTS,
ANA_EMIRRORPORTS,
ANA_FLOODING,
ANA_FLOODING_IPMC,
ANA_SFLOW_CFG,
ANA_PORT_MODE,
ANA_CUT_THRU_CFG,
ANA_PGID_PGID,
ANA_TABLES_ANMOVED,
ANA_TABLES_MACHDATA,
ANA_TABLES_MACLDATA,
ANA_TABLES_STREAMDATA,
ANA_TABLES_MACACCESS,
ANA_TABLES_MACTINDX,
ANA_TABLES_VLANACCESS,
ANA_TABLES_VLANTIDX,
ANA_TABLES_ISDXACCESS,
ANA_TABLES_ISDXTIDX,
ANA_TABLES_ENTRYLIM,
ANA_TABLES_PTP_ID_HIGH,
ANA_TABLES_PTP_ID_LOW,
ANA_TABLES_STREAMACCESS,
ANA_TABLES_STREAMTIDX,
ANA_TABLES_SEQ_HISTORY,
ANA_TABLES_SEQ_MASK,
ANA_TABLES_SFID_MASK,
ANA_TABLES_SFIDACCESS,
ANA_TABLES_SFIDTIDX,
ANA_MSTI_STATE,
ANA_OAM_UPM_LM_CNT,
ANA_SG_ACCESS_CTRL,
ANA_SG_CONFIG_REG_1,
ANA_SG_CONFIG_REG_2,
ANA_SG_CONFIG_REG_3,
ANA_SG_CONFIG_REG_4,
ANA_SG_CONFIG_REG_5,
ANA_SG_GCL_GS_CONFIG,
ANA_SG_GCL_TI_CONFIG,
ANA_SG_STATUS_REG_1,
ANA_SG_STATUS_REG_2,
ANA_SG_STATUS_REG_3,
ANA_PORT_VLAN_CFG,
ANA_PORT_DROP_CFG,
ANA_PORT_QOS_CFG,
ANA_PORT_VCAP_CFG,
ANA_PORT_VCAP_S1_KEY_CFG,
ANA_PORT_VCAP_S2_CFG,
ANA_PORT_PCP_DEI_MAP,
ANA_PORT_CPU_FWD_CFG,
ANA_PORT_CPU_FWD_BPDU_CFG,
ANA_PORT_CPU_FWD_GARP_CFG,
ANA_PORT_CPU_FWD_CCM_CFG,
ANA_PORT_PORT_CFG,
ANA_PORT_POL_CFG,
ANA_PORT_PTP_CFG,
ANA_PORT_PTP_DLY1_CFG,
ANA_PORT_PTP_DLY2_CFG,
ANA_PORT_SFID_CFG,
ANA_PFC_PFC_CFG,
ANA_PFC_PFC_TIMER,
ANA_IPT_OAM_MEP_CFG,
ANA_IPT_IPT,
ANA_PPT_PPT,
ANA_FID_MAP_FID_MAP,
ANA_AGGR_CFG,
ANA_CPUQ_CFG,
ANA_CPUQ_CFG2,
ANA_CPUQ_8021_CFG,
ANA_DSCP_CFG,
ANA_DSCP_REWR_CFG,
ANA_VCAP_RNG_TYPE_CFG,
ANA_VCAP_RNG_VAL_CFG,
ANA_VRAP_CFG,
ANA_VRAP_HDR_DATA,
ANA_VRAP_HDR_MASK,
ANA_DISCARD_CFG,
ANA_FID_CFG,
ANA_POL_PIR_CFG,
ANA_POL_CIR_CFG,
ANA_POL_MODE_CFG,
ANA_POL_PIR_STATE,
ANA_POL_CIR_STATE,
ANA_POL_STATE,
ANA_POL_FLOWC,
ANA_POL_HYST,
ANA_POL_MISC_CFG,
QS_XTR_GRP_CFG = QS << TARGET_OFFSET,
QS_XTR_RD,
QS_XTR_FRM_PRUNING,
QS_XTR_FLUSH,
QS_XTR_DATA_PRESENT,
QS_XTR_CFG,
QS_INJ_GRP_CFG,
QS_INJ_WR,
QS_INJ_CTRL,
QS_INJ_STATUS,
QS_INJ_ERR,
QS_INH_DBG,
QSYS_PORT_MODE = QSYS << TARGET_OFFSET,
QSYS_SWITCH_PORT_MODE,
QSYS_STAT_CNT_CFG,
QSYS_EEE_CFG,
QSYS_EEE_THRES,
QSYS_IGR_NO_SHARING,
QSYS_EGR_NO_SHARING,
QSYS_SW_STATUS,
QSYS_EXT_CPU_CFG,
QSYS_PAD_CFG,
QSYS_CPU_GROUP_MAP,
QSYS_QMAP,
QSYS_ISDX_SGRP,
QSYS_TIMED_FRAME_ENTRY,
QSYS_TFRM_MISC,
QSYS_TFRM_PORT_DLY,
QSYS_TFRM_TIMER_CFG_1,
QSYS_TFRM_TIMER_CFG_2,
QSYS_TFRM_TIMER_CFG_3,
QSYS_TFRM_TIMER_CFG_4,
QSYS_TFRM_TIMER_CFG_5,
QSYS_TFRM_TIMER_CFG_6,
QSYS_TFRM_TIMER_CFG_7,
QSYS_TFRM_TIMER_CFG_8,
QSYS_RED_PROFILE,
QSYS_RES_QOS_MODE,
QSYS_RES_CFG,
QSYS_RES_STAT,
QSYS_EGR_DROP_MODE,
QSYS_EQ_CTRL,
QSYS_EVENTS_CORE,
QSYS_QMAXSDU_CFG_0,
QSYS_QMAXSDU_CFG_1,
QSYS_QMAXSDU_CFG_2,
QSYS_QMAXSDU_CFG_3,
QSYS_QMAXSDU_CFG_4,
QSYS_QMAXSDU_CFG_5,
QSYS_QMAXSDU_CFG_6,
QSYS_QMAXSDU_CFG_7,
QSYS_PREEMPTION_CFG,
QSYS_CIR_CFG,
QSYS_EIR_CFG,
QSYS_SE_CFG,
QSYS_SE_DWRR_CFG,
QSYS_SE_CONNECT,
QSYS_SE_DLB_SENSE,
QSYS_CIR_STATE,
QSYS_EIR_STATE,
QSYS_SE_STATE,
QSYS_HSCH_MISC_CFG,
QSYS_TAG_CONFIG,
QSYS_TAS_PARAM_CFG_CTRL,
QSYS_PORT_MAX_SDU,
QSYS_PARAM_CFG_REG_1,
QSYS_PARAM_CFG_REG_2,
QSYS_PARAM_CFG_REG_3,
QSYS_PARAM_CFG_REG_4,
QSYS_PARAM_CFG_REG_5,
QSYS_GCL_CFG_REG_1,
QSYS_GCL_CFG_REG_2,
QSYS_PARAM_STATUS_REG_1,
QSYS_PARAM_STATUS_REG_2,
QSYS_PARAM_STATUS_REG_3,
QSYS_PARAM_STATUS_REG_4,
QSYS_PARAM_STATUS_REG_5,
QSYS_PARAM_STATUS_REG_6,
QSYS_PARAM_STATUS_REG_7,
QSYS_PARAM_STATUS_REG_8,
QSYS_PARAM_STATUS_REG_9,
QSYS_GCL_STATUS_REG_1,
QSYS_GCL_STATUS_REG_2,
REW_PORT_VLAN_CFG = REW << TARGET_OFFSET,
REW_TAG_CFG,
REW_PORT_CFG,
REW_DSCP_CFG,
REW_PCP_DEI_QOS_MAP_CFG,
REW_PTP_CFG,
REW_PTP_DLY1_CFG,
REW_RED_TAG_CFG,
REW_DSCP_REMAP_DP1_CFG,
REW_DSCP_REMAP_CFG,
REW_STAT_CFG,
REW_REW_STICKY,
REW_PPT,
SYS_COUNT_RX_OCTETS = SYS << TARGET_OFFSET,
SYS_COUNT_RX_UNICAST,
SYS_COUNT_RX_MULTICAST,
SYS_COUNT_RX_BROADCAST,
SYS_COUNT_RX_SHORTS,
SYS_COUNT_RX_FRAGMENTS,
SYS_COUNT_RX_JABBERS,
SYS_COUNT_RX_CRC_ALIGN_ERRS,
SYS_COUNT_RX_SYM_ERRS,
SYS_COUNT_RX_64,
SYS_COUNT_RX_65_127,
SYS_COUNT_RX_128_255,
SYS_COUNT_RX_256_1023,
SYS_COUNT_RX_1024_1526,
SYS_COUNT_RX_1527_MAX,
SYS_COUNT_RX_PAUSE,
SYS_COUNT_RX_CONTROL,
SYS_COUNT_RX_LONGS,
SYS_COUNT_RX_CLASSIFIED_DROPS,
SYS_COUNT_TX_OCTETS,
SYS_COUNT_TX_UNICAST,
SYS_COUNT_TX_MULTICAST,
SYS_COUNT_TX_BROADCAST,
SYS_COUNT_TX_COLLISION,
SYS_COUNT_TX_DROPS,
SYS_COUNT_TX_PAUSE,
SYS_COUNT_TX_64,
SYS_COUNT_TX_65_127,
SYS_COUNT_TX_128_511,
SYS_COUNT_TX_512_1023,
SYS_COUNT_TX_1024_1526,
SYS_COUNT_TX_1527_MAX,
SYS_COUNT_TX_AGING,
SYS_RESET_CFG,
SYS_SR_ETYPE_CFG,
SYS_VLAN_ETYPE_CFG,
SYS_PORT_MODE,
SYS_FRONT_PORT_MODE,
SYS_FRM_AGING,
SYS_STAT_CFG,
SYS_SW_STATUS,
SYS_MISC_CFG,
SYS_REW_MAC_HIGH_CFG,
SYS_REW_MAC_LOW_CFG,
SYS_TIMESTAMP_OFFSET,
SYS_CMID,
SYS_PAUSE_CFG,
SYS_PAUSE_TOT_CFG,
SYS_ATOP,
SYS_ATOP_TOT_CFG,
SYS_MAC_FC_CFG,
SYS_MMGT,
SYS_MMGT_FAST,
SYS_EVENTS_DIF,
SYS_EVENTS_CORE,
SYS_CNT,
SYS_PTP_STATUS,
SYS_PTP_TXSTAMP,
SYS_PTP_NXT,
SYS_PTP_CFG,
SYS_RAM_INIT,
SYS_CM_ADDR,
SYS_CM_DATA_WR,
SYS_CM_DATA_RD,
SYS_CM_OP,
SYS_CM_DATA,
PTP_PIN_CFG = PTP << TARGET_OFFSET,
PTP_PIN_TOD_SEC_MSB,
PTP_PIN_TOD_SEC_LSB,
PTP_PIN_TOD_NSEC,
PTP_PIN_WF_HIGH_PERIOD,
PTP_PIN_WF_LOW_PERIOD,
PTP_CFG_MISC,
PTP_CLK_CFG_ADJ_CFG,
PTP_CLK_CFG_ADJ_FREQ,
GCB_SOFT_RST = GCB << TARGET_OFFSET,
GCB_MIIM_MII_STATUS,
GCB_MIIM_MII_CMD,
GCB_MIIM_MII_DATA,
DEV_CLOCK_CFG = DEV_GMII << TARGET_OFFSET,
DEV_PORT_MISC,
DEV_EVENTS,
DEV_EEE_CFG,
DEV_RX_PATH_DELAY,
DEV_TX_PATH_DELAY,
DEV_PTP_PREDICT_CFG,
DEV_MAC_ENA_CFG,
DEV_MAC_MODE_CFG,
DEV_MAC_MAXLEN_CFG,
DEV_MAC_TAGS_CFG,
DEV_MAC_ADV_CHK_CFG,
DEV_MAC_IFG_CFG,
DEV_MAC_HDX_CFG,
DEV_MAC_DBG_CFG,
DEV_MAC_FC_MAC_LOW_CFG,
DEV_MAC_FC_MAC_HIGH_CFG,
DEV_MAC_STICKY,
PCS1G_CFG,
PCS1G_MODE_CFG,
PCS1G_SD_CFG,
PCS1G_ANEG_CFG,
PCS1G_ANEG_NP_CFG,
PCS1G_LB_CFG,
PCS1G_DBG_CFG,
PCS1G_CDET_CFG,
PCS1G_ANEG_STATUS,
PCS1G_ANEG_NP_STATUS,
PCS1G_LINK_STATUS,
PCS1G_LINK_DOWN_CNT,
PCS1G_STICKY,
PCS1G_DEBUG_STATUS,
PCS1G_LPI_CFG,
PCS1G_LPI_WAKE_ERROR_CNT,
PCS1G_LPI_STATUS,
PCS1G_TSTPAT_MODE_CFG,
PCS1G_TSTPAT_STATUS,
DEV_PCS_FX100_CFG,
DEV_PCS_FX100_STATUS,
};
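/* Illustration only: a sketch of the register ID encoding used by
 * enum ocelot_reg together with TARGET_OFFSET, REG_MASK and REG(). The
 * target number sits in the bits above TARGET_OFFSET and the per-target
 * register index in the low 24 bits, so REG() can fill a per-target offset
 * table with designated initializers. All sketch_* / SKETCH_* names are
 * hypothetical and the offsets in the table are placeholder values.
 */

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_TARGET_OFFSET	24
#define SKETCH_REG_MASK		((1u << SKETCH_TARGET_OFFSET) - 1)
#define SKETCH_REG(reg, offset)	[(reg) & SKETCH_REG_MASK] = (offset)

enum sketch_target {
	SKETCH_ANA = 1,
	SKETCH_QS,
};

enum sketch_reg {
	SKETCH_ANA_ADVLEARN = SKETCH_ANA << SKETCH_TARGET_OFFSET,
	SKETCH_ANA_VLANMASK,
	SKETCH_QS_XTR_GRP_CFG = SKETCH_QS << SKETCH_TARGET_OFFSET,
	SKETCH_QS_XTR_RD,
};

/* Per-target table, indexed by the low 24 bits of the register ID. */
static const uint32_t sketch_ana_regmap[] = {
	SKETCH_REG(SKETCH_ANA_ADVLEARN, 0x009000),	/* placeholder offset */
	SKETCH_REG(SKETCH_ANA_VLANMASK, 0x009004),	/* placeholder offset */
};

static uint32_t sketch_reg_target(uint32_t reg)
{
	return reg >> SKETCH_TARGET_OFFSET;
}

static uint32_t sketch_reg_index(uint32_t reg)
{
	return reg & SKETCH_REG_MASK;
}

/* Look up the address offset of a register belonging to the ANA target. */
static uint32_t sketch_ana_offset(uint32_t reg)
{
	size_t n = sizeof(sketch_ana_regmap) / sizeof(sketch_ana_regmap[0]);

	assert(sketch_reg_target(reg) == SKETCH_ANA);
	assert(sketch_reg_index(reg) < n);
	return sketch_ana_regmap[sketch_reg_index(reg)];
}
```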
enum ocelot_regfield {
ANA_ADVLEARN_VLAN_CHK,
ANA_ADVLEARN_LEARN_MIRROR,
ANA_ANEVENTS_FLOOD_DISCARD,
ANA_ANEVENTS_MSTI_DROP,
ANA_ANEVENTS_ACLKILL,
ANA_ANEVENTS_ACLUSED,
ANA_ANEVENTS_AUTOAGE,
ANA_ANEVENTS_VS2TTL1,
ANA_ANEVENTS_STORM_DROP,
ANA_ANEVENTS_LEARN_DROP,
ANA_ANEVENTS_AGED_ENTRY,
ANA_ANEVENTS_CPU_LEARN_FAILED,
ANA_ANEVENTS_AUTO_LEARN_FAILED,
ANA_ANEVENTS_LEARN_REMOVE,
ANA_ANEVENTS_AUTO_LEARNED,
ANA_ANEVENTS_AUTO_MOVED,
ANA_ANEVENTS_DROPPED,
ANA_ANEVENTS_CLASSIFIED_DROP,
ANA_ANEVENTS_CLASSIFIED_COPY,
ANA_ANEVENTS_VLAN_DISCARD,
ANA_ANEVENTS_FWD_DISCARD,
ANA_ANEVENTS_MULTICAST_FLOOD,
ANA_ANEVENTS_UNICAST_FLOOD,
ANA_ANEVENTS_DEST_KNOWN,
ANA_ANEVENTS_BUCKET3_MATCH,
ANA_ANEVENTS_BUCKET2_MATCH,
ANA_ANEVENTS_BUCKET1_MATCH,
ANA_ANEVENTS_BUCKET0_MATCH,
ANA_ANEVENTS_CPU_OPERATION,
ANA_ANEVENTS_DMAC_LOOKUP,
ANA_ANEVENTS_SMAC_LOOKUP,
ANA_ANEVENTS_SEQ_GEN_ERR_0,
ANA_ANEVENTS_SEQ_GEN_ERR_1,
ANA_TABLES_MACACCESS_B_DOM,
ANA_TABLES_MACTINDX_BUCKET,
ANA_TABLES_MACTINDX_M_INDEX,
QSYS_SWITCH_PORT_MODE_PORT_ENA,
QSYS_SWITCH_PORT_MODE_SCH_NEXT_CFG,
QSYS_SWITCH_PORT_MODE_YEL_RSRVD,
QSYS_SWITCH_PORT_MODE_INGRESS_DROP_MODE,
QSYS_SWITCH_PORT_MODE_TX_PFC_ENA,
QSYS_SWITCH_PORT_MODE_TX_PFC_MODE,
QSYS_TIMED_FRAME_ENTRY_TFRM_VLD,
QSYS_TIMED_FRAME_ENTRY_TFRM_FP,
QSYS_TIMED_FRAME_ENTRY_TFRM_PORTNO,
QSYS_TIMED_FRAME_ENTRY_TFRM_TM_SEL,
QSYS_TIMED_FRAME_ENTRY_TFRM_TM_T,
SYS_PORT_MODE_DATA_WO_TS,
SYS_PORT_MODE_INCL_INJ_HDR,
SYS_PORT_MODE_INCL_XTR_HDR,
SYS_PORT_MODE_INCL_HDR_ERR,
SYS_RESET_CFG_CORE_ENA,
SYS_RESET_CFG_MEM_ENA,
SYS_RESET_CFG_MEM_INIT,
GCB_SOFT_RST_SWC_RST,
GCB_MIIM_MII_STATUS_PENDING,
GCB_MIIM_MII_STATUS_BUSY,
SYS_PAUSE_CFG_PAUSE_START,
SYS_PAUSE_CFG_PAUSE_STOP,
SYS_PAUSE_CFG_PAUSE_ENA,
REGFIELD_MAX
};
enum {
/* VCAP_CORE_CFG */
VCAP_CORE_UPDATE_CTRL,
VCAP_CORE_MV_CFG,
/* VCAP_CORE_CACHE */
VCAP_CACHE_ENTRY_DAT,
VCAP_CACHE_MASK_DAT,
VCAP_CACHE_ACTION_DAT,
VCAP_CACHE_CNT_DAT,
VCAP_CACHE_TG_DAT,
/* VCAP_CONST */
VCAP_CONST_VCAP_VER,
VCAP_CONST_ENTRY_WIDTH,
VCAP_CONST_ENTRY_CNT,
VCAP_CONST_ENTRY_SWCNT,
VCAP_CONST_ENTRY_TG_WIDTH,
VCAP_CONST_ACTION_DEF_CNT,
VCAP_CONST_ACTION_WIDTH,
VCAP_CONST_CNT_WIDTH,
VCAP_CONST_CORE_CNT,
VCAP_CONST_IF_CNT,
};
enum ocelot_ptp_pins {
PTP_PIN_0,
PTP_PIN_1,
PTP_PIN_2,
PTP_PIN_3,
TOD_ACC_PIN
};
struct ocelot_stat_layout {
u32 offset;
char name[ETH_GSTRING_LEN];
};
enum ocelot_tag_prefix {
OCELOT_TAG_PREFIX_DISABLED = 0,
OCELOT_TAG_PREFIX_NONE,
OCELOT_TAG_PREFIX_SHORT,
OCELOT_TAG_PREFIX_LONG,
};
struct ocelot;
struct ocelot_ops {
struct net_device *(*port_to_netdev)(struct ocelot *ocelot, int port);
int (*netdev_to_port)(struct net_device *dev);
int (*reset)(struct ocelot *ocelot);
u16 (*wm_enc)(u16 value);
u16 (*wm_dec)(u16 value);
void (*wm_stat)(u32 val, u32 *inuse, u32 *maxuse);
};
struct ocelot_vcap_block {
struct list_head rules;
int count;
int pol_lpr;
};
struct ocelot_vlan {
bool valid;
u16 vid;
};
enum ocelot_sb {
OCELOT_SB_BUF,
OCELOT_SB_REF,
OCELOT_SB_NUM,
};
enum ocelot_sb_pool {
OCELOT_SB_POOL_ING,
OCELOT_SB_POOL_EGR,
OCELOT_SB_POOL_NUM,
};
struct ocelot_port {
struct ocelot *ocelot;
struct regmap *target;
	bool vlan_aware;
	/* VLAN that untagged frames are classified to, on ingress */
	struct ocelot_vlan pvid_vlan;
	/* The VLAN ID that will be transmitted as untagged, on egress */
	struct ocelot_vlan native_vlan;
	u8 ptp_cmd;
	struct sk_buff_head tx_skbs;
	u8 ts_id;
	spinlock_t ts_id_lock;
	phy_interface_t phy_mode;
	u8 *xmit_template;
	bool is_dsa_8021q_cpu;
	struct net_device *bond;
};

struct ocelot {
	struct device *dev;
	struct devlink *devlink;
	struct devlink_port *devlink_ports;
	const struct ocelot_ops *ops;
	struct regmap *targets[TARGET_MAX];
	struct regmap_field *regfields[REGFIELD_MAX];
	const u32 *const *map;
	const struct ocelot_stat_layout *stats_layout;
	unsigned int num_stats;
	u32 pool_size[OCELOT_SB_NUM][OCELOT_SB_POOL_NUM];
	int packet_buffer_size;
	int num_frame_refs;
	int num_mact_rows;
	struct net_device *hw_bridge_dev;
	u16 bridge_mask;
	u16 bridge_fwd_mask;
	struct ocelot_port **ports;
	u8 base_mac[ETH_ALEN];
	/* Keep track of the vlan port masks */
	u32 vlan_mask[VLAN_N_VID];
	/* Switches like VSC9959 have flooding per traffic class */
	int num_flooding_pgids;
	/* In tables like ANA:PORT and the ANA:PGID:PGID mask,
	 * the CPU is located after the physical ports (at the
	 * num_phys_ports index).
	 */
	u8 num_phys_ports;
	int npi;
	enum ocelot_tag_prefix npi_inj_prefix;
	enum ocelot_tag_prefix npi_xtr_prefix;
	struct list_head multicast;
	struct list_head pgids;
	struct list_head dummy_rules;
	struct ocelot_vcap_block block[3];
	struct vcap_props *vcap;
	/* Workqueue to check statistics for overflow with its lock */
	struct mutex stats_lock;
	u64 *stats;
	struct delayed_work stats_work;
	struct workqueue_struct *stats_queue;
	struct workqueue_struct *owq;
	u8 ptp:1;
	struct ptp_clock *ptp_clock;
	struct ptp_clock_info ptp_info;
	struct hwtstamp_config hwtstamp_config;
	/* Protects the PTP interface state */
	struct mutex ptp_lock;
	/* Protects the PTP clock */
	spinlock_t ptp_clock_lock;
	struct ptp_pin_desc ptp_pins[OCELOT_PTP_PINS_NUM];
};

struct ocelot_policer {
	u32 rate; /* kilobit per second */
	u32 burst; /* bytes */
};
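
/* Usage sketch (illustrative, not part of the original header): a caller
 * limiting a port to 100 Mbit/s with a 16 KiB burst might fill the policer
 * as below and hand it to ocelot_port_policer_add(), declared further down.
 * The values and the local variable names are assumptions:
 *
 *	struct ocelot_policer pol = {
 *		.rate = 100000,
 *		.burst = 16384,
 *	};
 *	int err = ocelot_port_policer_add(ocelot, port, &pol);
 */
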
#define ocelot_read_ix(ocelot, reg, gi, ri) __ocelot_read_ix(ocelot, reg, reg##_GSZ * (gi) + reg##_RSZ * (ri))
#define ocelot_read_gix(ocelot, reg, gi) __ocelot_read_ix(ocelot, reg, reg##_GSZ * (gi))
#define ocelot_read_rix(ocelot, reg, ri) __ocelot_read_ix(ocelot, reg, reg##_RSZ * (ri))
#define ocelot_read(ocelot, reg) __ocelot_read_ix(ocelot, reg, 0)
#define ocelot_write_ix(ocelot, val, reg, gi, ri) __ocelot_write_ix(ocelot, val, reg, reg##_GSZ * (gi) + reg##_RSZ * (ri))
#define ocelot_write_gix(ocelot, val, reg, gi) __ocelot_write_ix(ocelot, val, reg, reg##_GSZ * (gi))
#define ocelot_write_rix(ocelot, val, reg, ri) __ocelot_write_ix(ocelot, val, reg, reg##_RSZ * (ri))
#define ocelot_write(ocelot, val, reg) __ocelot_write_ix(ocelot, val, reg, 0)
#define ocelot_rmw_ix(ocelot, val, m, reg, gi, ri) __ocelot_rmw_ix(ocelot, val, m, reg, reg##_GSZ * (gi) + reg##_RSZ * (ri))
#define ocelot_rmw_gix(ocelot, val, m, reg, gi) __ocelot_rmw_ix(ocelot, val, m, reg, reg##_GSZ * (gi))
#define ocelot_rmw_rix(ocelot, val, m, reg, ri) __ocelot_rmw_ix(ocelot, val, m, reg, reg##_RSZ * (ri))
#define ocelot_rmw(ocelot, val, m, reg) __ocelot_rmw_ix(ocelot, val, m, reg, 0)
#define ocelot_field_write(ocelot, reg, val) regmap_field_write((ocelot)->regfields[(reg)], (val))
#define ocelot_field_read(ocelot, reg, val) regmap_field_read((ocelot)->regfields[(reg)], (val))
#define ocelot_fields_write(ocelot, id, reg, val) regmap_fields_write((ocelot)->regfields[(reg)], (id), (val))
#define ocelot_fields_read(ocelot, id, reg, val) regmap_fields_read((ocelot)->regfields[(reg)], (id), (val))
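
/* Usage sketch (illustrative, not part of the original header): the _gix and
 * _rix variants multiply the index by reg##_GSZ or reg##_RSZ (which read as
 * the per-group and per-register strides), so callers pass an index rather
 * than a byte offset. FOO_CFG is a hypothetical register for which
 * FOO_CFG_GSZ is defined, and FOO_ENA a hypothetical bit within it:
 *
 *	u32 val = ocelot_read_gix(ocelot, FOO_CFG, port);
 *
 *	ocelot_rmw_gix(ocelot, FOO_ENA, FOO_ENA, FOO_CFG, port);
 */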

#define ocelot_target_read_ix(ocelot, target, reg, gi, ri) \
	__ocelot_target_read_ix(ocelot, target, reg, reg##_GSZ * (gi) + reg##_RSZ * (ri))
#define ocelot_target_read_gix(ocelot, target, reg, gi) \
	__ocelot_target_read_ix(ocelot, target, reg, reg##_GSZ * (gi))
#define ocelot_target_read_rix(ocelot, target, reg, ri) \
	__ocelot_target_read_ix(ocelot, target, reg, reg##_RSZ * (ri))
#define ocelot_target_read(ocelot, target, reg) \
	__ocelot_target_read_ix(ocelot, target, reg, 0)
#define ocelot_target_write_ix(ocelot, target, val, reg, gi, ri) \
	__ocelot_target_write_ix(ocelot, target, val, reg, reg##_GSZ * (gi) + reg##_RSZ * (ri))
#define ocelot_target_write_gix(ocelot, target, val, reg, gi) \
	__ocelot_target_write_ix(ocelot, target, val, reg, reg##_GSZ * (gi))
#define ocelot_target_write_rix(ocelot, target, val, reg, ri) \
	__ocelot_target_write_ix(ocelot, target, val, reg, reg##_RSZ * (ri))
#define ocelot_target_write(ocelot, target, val, reg) \
	__ocelot_target_write_ix(ocelot, target, val, reg, 0)
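
/* Usage sketch (illustrative, not part of the original header): unlike
 * ocelot_read(), which derives the target from the register enum, these
 * accessors take the target explicitly, so one register layout can serve a
 * block that is instantiated more than once (e.g. the VCAP IS1, IS2 and ES0
 * blocks). FOO_CACHE_DAT is a hypothetical register offset within the
 * target's map, and S2 is assumed to be a valid enum ocelot_target value:
 *
 *	u32 val = ocelot_target_read(ocelot, S2, FOO_CACHE_DAT);
 *
 *	ocelot_target_write(ocelot, S2, val, FOO_CACHE_DAT);
 */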

/* I/O */
u32 ocelot_port_readl(struct ocelot_port *port, u32 reg);
void ocelot_port_writel(struct ocelot_port *port, u32 val, u32 reg);
u32 __ocelot_read_ix(struct ocelot *ocelot, u32 reg, u32 offset);
void __ocelot_write_ix(struct ocelot *ocelot, u32 val, u32 reg, u32 offset);
void __ocelot_rmw_ix(struct ocelot *ocelot, u32 val, u32 mask, u32 reg,
		     u32 offset);
u32 __ocelot_target_read_ix(struct ocelot *ocelot, enum ocelot_target target,
			    u32 reg, u32 offset);
void __ocelot_target_write_ix(struct ocelot *ocelot, enum ocelot_target target,
			      u32 val, u32 reg, u32 offset);

/* Hardware initialization */
int ocelot_regfields_init(struct ocelot *ocelot,
			  const struct reg_field *const regfields);
struct regmap *ocelot_regmap_init(struct ocelot *ocelot, struct resource *res);
int ocelot_init(struct ocelot *ocelot);
void ocelot_deinit(struct ocelot *ocelot);
void ocelot_init_port(struct ocelot *ocelot, int port);
void ocelot_deinit_port(struct ocelot *ocelot, int port);

/* DSA callbacks */
void ocelot_port_enable(struct ocelot *ocelot, int port,
			struct phy_device *phy);
void ocelot_port_disable(struct ocelot *ocelot, int port);
void ocelot_get_strings(struct ocelot *ocelot, int port, u32 sset, u8 *data);
void ocelot_get_ethtool_stats(struct ocelot *ocelot, int port, u64 *data);
int ocelot_get_sset_count(struct ocelot *ocelot, int port, int sset);
int ocelot_get_ts_info(struct ocelot *ocelot, int port,
		       struct ethtool_ts_info *info);
void ocelot_set_ageing_time(struct ocelot *ocelot, unsigned int msecs);
void ocelot_adjust_link(struct ocelot *ocelot, int port,
			struct phy_device *phydev);
int ocelot_port_vlan_filtering(struct ocelot *ocelot, int port, bool enabled);
void ocelot_bridge_stp_state_set(struct ocelot *ocelot, int port, u8 state);
void ocelot_apply_bridge_fwd_mask(struct ocelot *ocelot);
int ocelot_port_bridge_join(struct ocelot *ocelot, int port,
			    struct net_device *bridge);
int ocelot_port_bridge_leave(struct ocelot *ocelot, int port,
			     struct net_device *bridge);
int ocelot_fdb_dump(struct ocelot *ocelot, int port,
		    dsa_fdb_dump_cb_t *cb, void *data);
int ocelot_fdb_add(struct ocelot *ocelot, int port,
		   const unsigned char *addr, u16 vid);
int ocelot_fdb_del(struct ocelot *ocelot, int port,
		   const unsigned char *addr, u16 vid);
int ocelot_vlan_prepare(struct ocelot *ocelot, int port, u16 vid, bool pvid,
			bool untagged);
int ocelot_vlan_add(struct ocelot *ocelot, int port, u16 vid, bool pvid,
		    bool untagged);
int ocelot_vlan_del(struct ocelot *ocelot, int port, u16 vid);
int ocelot_hwstamp_get(struct ocelot *ocelot, int port, struct ifreq *ifr);
int ocelot_hwstamp_set(struct ocelot *ocelot, int port, struct ifreq *ifr);
void ocelot_port_add_txtstamp_skb(struct ocelot *ocelot, int port,
				  struct sk_buff *clone);
void ocelot_get_txtstamp(struct ocelot *ocelot);
void ocelot_port_set_maxlen(struct ocelot *ocelot, int port, size_t sdu);
int ocelot_get_max_mtu(struct ocelot *ocelot, int port);
int ocelot_port_policer_add(struct ocelot *ocelot, int port,
			    struct ocelot_policer *pol);
int ocelot_port_policer_del(struct ocelot *ocelot, int port);
int ocelot_cls_flower_replace(struct ocelot *ocelot, int port,
			      struct flow_cls_offload *f, bool ingress);
int ocelot_cls_flower_destroy(struct ocelot *ocelot, int port,
			      struct flow_cls_offload *f, bool ingress);
int ocelot_cls_flower_stats(struct ocelot *ocelot, int port,
			    struct flow_cls_offload *f, bool ingress);
int ocelot_port_mdb_add(struct ocelot *ocelot, int port,
			const struct switchdev_obj_port_mdb *mdb);
int ocelot_port_mdb_del(struct ocelot *ocelot, int port,
			const struct switchdev_obj_port_mdb *mdb);
int ocelot_devlink_sb_register(struct ocelot *ocelot);
void ocelot_devlink_sb_unregister(struct ocelot *ocelot);
int ocelot_sb_pool_get(struct ocelot *ocelot, unsigned int sb_index,
		       u16 pool_index,
		       struct devlink_sb_pool_info *pool_info);
int ocelot_sb_pool_set(struct ocelot *ocelot, unsigned int sb_index,
		       u16 pool_index, u32 size,
		       enum devlink_sb_threshold_type threshold_type,
		       struct netlink_ext_ack *extack);
int ocelot_sb_port_pool_get(struct ocelot *ocelot, int port,
			    unsigned int sb_index, u16 pool_index,
			    u32 *p_threshold);
int ocelot_sb_port_pool_set(struct ocelot *ocelot, int port,
			    unsigned int sb_index, u16 pool_index,
			    u32 threshold, struct netlink_ext_ack *extack);
int ocelot_sb_tc_pool_bind_get(struct ocelot *ocelot, int port,
			       unsigned int sb_index, u16 tc_index,
			       enum devlink_sb_pool_type pool_type,
			       u16 *p_pool_index, u32 *p_threshold);
int ocelot_sb_tc_pool_bind_set(struct ocelot *ocelot, int port,
			       unsigned int sb_index, u16 tc_index,
			       enum devlink_sb_pool_type pool_type,
			       u16 pool_index, u32 threshold,
			       struct netlink_ext_ack *extack);
int ocelot_sb_occ_snapshot(struct ocelot *ocelot, unsigned int sb_index);
int ocelot_sb_occ_max_clear(struct ocelot *ocelot, unsigned int sb_index);
int ocelot_sb_occ_port_pool_get(struct ocelot *ocelot, int port,
				unsigned int sb_index, u16 pool_index,
				u32 *p_cur, u32 *p_max);
int ocelot_sb_occ_tc_port_bind_get(struct ocelot *ocelot, int port,
				   unsigned int sb_index, u16 tc_index,
				   enum devlink_sb_pool_type pool_type,
				   u32 *p_cur, u32 *p_max);

#endif