linux

Author	SHA1	Message	Date
Mohamad Haj Yahia	c3b7c5c950	net/mlx5e: start/stop all tx queues upon open/close netdev Start all tx queues (including inactive ones) when opening the netdev. Stop all tx queues (including inactive ones) when closing the netdev. This is a workaround for the tx timeout watchdog false alarm issue in which the netdev watchdog is polling all the tx queues which may include inactive queues and thus once lowering the real tx queues number (ethtool -L) it will generate tx timeout watchdog false alarms. Fixes: `3947ca1859` ('net/mlx5e: Implement ndo_tx_timeout callback') Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-13 11:38:16 -07:00
Daniel Jurgens	2c1ccc9937	net/mlx5e: Fix TX Timeout to detect queues stuck on BQL Change netif_tx_queue_stopped to netif_xmit_stopped. This will show when queues are stopped due to byte queue limits. Fixes: `3947ca1859` ('net/mlx5e: Implement ndo_tx_timeout callback') Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-13 11:38:16 -07:00
Jiri Pirko	b38a75d2d3	mlxsw: core: Trace EMAD messages Trace EMAD messages going down to HW and up from HW. Devlink needs to be registered before EMAD init so the trace function can be called with valid devlink handle. Signed-off-by: Jiri Pirko <jiri@mellanox.com> v1->v2: - Use trace_devlink_hwmsg directly Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-12 14:20:18 -07:00
David S. Miller	30d0844bdc	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/mellanox/mlx5/core/en.h drivers/net/ethernet/mellanox/mlx5/core/en_main.c drivers/net/usb/r8152.c All three conflicts were overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-06 10:35:22 -07:00
Or Gerlitz	eae033c1b8	net/mlx5: Avoid setting unused var when modifying vport node GUID GCC complains on unused-but-set-variable, clean this up. Fixes: `23898c763f` ('net/mlx5: E-Switch, Modify node guid on vf set MAC') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-05 11:52:42 -07:00
Yotam Gigi	0b2361d9d9	mlxsw: Add the unresolved next-hops probes Now, the driver sends arp probes for all unresolved neighbours that are currently a nexthop for some route on the system. The job is set periodically every 5 seconds. Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-05 09:06:31 -07:00
Yotam Gigi	b2157149b0	mlxsw: spectrum_router: Add the nexthop neigh activity update For nexthop neighbours we need to make kernel to think there is a traffic flowing to them preventing it from going to stale state. Otherwise kernel would stale it and eventually the neigh would be removed from HW and nexthop as well. That would reduce ECMP group in HW. Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-05 09:06:30 -07:00
Jiri Pirko	a7ff87acd9	mlxsw: spectrum_router: Implement next-hop routing Implement next-hop routing offload including ECMP. To make it possible, introduce next-hop group entity. This entity keeps track of resolved neighbours and updates HW adjacency table accordingly. Note that HW next-hops are stored in this adjacency table, in form of MAC. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-05 09:06:30 -07:00
Jiri Pirko	a59f0b312a	mlxsw: reg: Add Router Algorithmic LPM ECMP Update Register The RALEU register is used to mass update remote action adjacency index and ecmp size. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-05 09:06:30 -07:00
Yotam Gigi	089f981683	mlxsw: reg: Add Router Adjacency Table register The RATR register is used to configure the Router Adjacency (next-hop) Table. Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-05 09:06:30 -07:00
Jiri Pirko	b090ef0686	mlxsw: Introduce simplistic KVD linear area manager This is a very simple manager for KVD linear area. Currently, the allocator will either allocate a single entry from pre-defined sub-area, or in case more than one entry is needed, it will allocate 32-entry chunk in other pre-defined sub-area. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-05 09:06:30 -07:00
Jiri Pirko	c602242761	mlxsw: spectrum: Define sizes of KVD areas Override the defaults and define the area sizes ourselves. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-05 09:06:30 -07:00
Jiri Pirko	489107bda1	mlxsw: Add KVD sizes configuration into profile Up until now we only used hash-based tables in the device, but we are going to use the linear table for remote routes adjacency lists. Add the configuration fields that control the size of the linear table. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-05 09:06:29 -07:00
Yotam Gigi	a6bf9e933d	mlxsw: spectrum_router: Offload neighbours based on NUD state change Listen to any NEIGH_UPDATE events sent and program the device accordingly. If NUD state is VALID and neighbour isn't yet offloaded, then program it into the device's table. Otherwise, just edit its parameters. If NUD state machine transitioned neighbour out of VALID state and it's present in the device's table, then remove it. Note that the device is programmed in delayed work, as the netevent notification chain is atomic and prevents us from going to sleep. Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-05 09:06:29 -07:00
Yotam Gigi	c723c735fa	mlxsw: spectrum_router: Periodically update the kernel's neigh table As previously explained, the driver should periodically poll the device for neighbours activity according to the configured DELAY_PROBE_TIME. This will prevent active neighbours from staying in STALE state for long periods of time. During init configure the polling interval according to the DELAY_PROBE_TIME used in the default table. In addition, register a netevent notification block, so that the interval is updated whenever DELAY_PROBE_TIME changes. Using the computed interval schedule a delayed work, which will update the kernel via neigh_event_send() on any active neighbour since the last delayed work. Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-05 09:06:29 -07:00
Yotam Gigi	7cf2c205d7	mlxsw: reg: Add Router Algorithmic LPM Unicast Host Table Dump register The RAUHTD register allows dumping entries from the Router Unicast Host Table. Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-05 09:06:29 -07:00
Yotam Gigi	4457b3df3f	mlxsw: reg: Add Router Algorithmic LPM Unicast Host Table register The RAUHT register is used to configure and query the Unicast Host Table in devices that implement the Algorithmic LPM. In other words, it is used to configure neighbour entries in the device. Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-05 09:06:28 -07:00
Jiri Pirko	6cf3c971dc	mlxsw: spectrum_router: Add private neigh table We need to hold some private data for every neigh entry. It would be possible to do it using neigh_priv_len/ndo_neigh_construct/ ndo_neigh_destroy however only for the port device itself. That would not work for stacked devices like bridge/team/bond. So introduce a private neigh table. Hook onto ndos neigh_construct/destroy and add/remove table entry according to that. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-05 09:06:28 -07:00
Gal Pressman	e989d5a532	net/mlx5e: Expose flow control counters to ethtool Just like per prio counters, the global flow counters are queried from per priority counters register. Global flow control counters are stored in priority 0 PFC counters. Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-05 00:06:03 -07:00
Gal Pressman	fe6b9bd9eb	net/mlx5e: Expose RDMA VPort counters to ethtool Add the needed descriptors to expose RoCE RDMA counters. Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-05 00:06:03 -07:00
Maor Gottlieb	f913a72aa0	net/mlx5e: Add support to get ethtool flow rules Enhance the existing get_rxnfc callback: 1. Get flow rule of specific ID. 2. Get all flow rules. 3. Get number of rules. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-05 00:06:03 -07:00
Maor Gottlieb	1174fce8d1	net/mlx5e: Support l3/l4 flow type specs in ethtool flow steering Add support to add flow steering rules with ethtool of L3/L4 flow types (ip4/tcp4/udp4). Those rules will be in higher priority than l2 flow rules, in order to prefer more specific rules. Mask is not supported for l3/l4 flow types. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-05 00:06:02 -07:00
Maor Gottlieb	6dc6071cfc	net/mlx5e: Add ethtool flow steering support Implement etrhtool set_rxnfc callback to support ethtool flow spec direct steering. This patch adds only the support of ether flow type spec. L3/L4 flow specs support will be added in downstream patches. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-05 00:06:02 -07:00
Maor Gottlieb	0da2d66666	net/mlx5: Properly remove all steering objects Instead of explicitly cleaning up the well known parts of the steering tree, we use the generic tree structure to traverse for cleanup. No functional changes. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-05 00:06:02 -07:00
Maor Gottlieb	fba53f7b57	net/mlx5: Introduce mlx5_flow_steering structure Instead of having all steering private name spaces and steering module fields flat in mlx5_core_priv, we wrap them in mlx5_flow_steering for better modularity and API exposure. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-05 00:06:02 -07:00
Maor Gottlieb	c5bb17302e	net/mlx5: Refactor mlx5_add_flow_rule Reduce the set of arguments passed to mlx5_add_flow_rule by introducing flow_spec structure. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-05 00:06:02 -07:00
Ido Schimmel	99f44bb352	mlxsw: spectrum: Enable L3 interfaces on top of bridge devices As with the previously introduced L3 interfaces, listen to 'inetaddr' notifications sent for bridges devices configured on top of the port netdevs and create / destroy router interfaces (RIFs) accordingly. This also includes VLAN devices configured on top of the VLAN-aware bridge. The RIFs will be destroyed either when the last IP address is removed or when the underlying FID is is destroyed. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-04 18:25:16 -07:00
Ido Schimmel	701b186ebf	mlxsw: spectrum: Configure FIDs based on bridge events Before introducing support for L3 interfaces on top of the VLAN-aware bridge we need to add some missing infrastructure. Such an interface can either be the bridge device itself or a VLAN device on top of it. In the first case the router interface (RIF) is associated with FID 1, which is created whenever the first port netdev joins the bridge. We currently assume the default PVID is 1 and that it's already created, as it seems reasonable. This can be extended in the future. However, in the second case it's entirely possible we've yet to create a matching FID. This can happen if the VLAN device was configured before making any bridge port member in the VLAN. Prevent such ordering problems by using the VLAN device's CHANGEUPPER event to configure the FID. Make the VLAN device hold a reference to the FID and prevent it from being destroyed even if none of the port netdevs is using it. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-04 18:25:16 -07:00
Ido Schimmel	3ba2ebf4a2	mlxsw: spectrum: Unsplit the vFID range Previous commit deprecated the vFIDs used to get traffic to the CPU ('port_vfids'). Thus, we now use the vFIDs as god intended and the artificial split is no longer needed. Rename functions and variables to reflect that. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-04 18:25:15 -07:00
Ido Schimmel	99724c18fc	mlxsw: spectrum: Introduce support for router interfaces Up until now we only supported bridged interfaces. Packets ingressing through the switch ports were either classified to FIDs (in the case of the VLAN-aware bridge) or vFIDs (in the case of VLAN-unaware bridges). The packets were then forwarded according to the FDB. Routing was done entirely in slowpath, by splitting the vFID range in two and using the lower 0.5K vFIDs as dummy bridges that simply flooded all incoming traffic to the CPU. Instead, allow packets to be routed in the device by creating router interfaces (RIFs) that will direct them to the router block. Specifically, the RIFs introduced here are Sub-port RIFs used for VLAN devices and port netdevs. Packets ingressing from the {Port / LAG ID, VID} with which the RIF was programmed with will be assigned to a special kind of FIDs called rFIDs and from there directed to the router. Create a RIF whenever the first IPv4 address was programmed on a VLAN / LAG / port netdev. Destroy it upon removal of the last IPv4 address. Receive these notifications by registering for the 'inetaddr' notification chain. A non-zero (10) priority is used for the notification block, so that RIFs will be created before routes are offloaded via FIB code. Note that another trigger for RIF destruction are CHANGEUPPER notifications causing the underlying FID's reference count to go down to zero. This can happen, for example, when a VLAN netdev with an IP address is put under bridge. While this configuration doesn't make sense it does cause the device and the kernel to get out of sync when the netdev is unbridged. We intend to address this in the future, hopefully in current cycle. Finally, Remove the lower 0.5K vFIDs, as they are deprecated by the RIFs, which will trap packets according to their DIP. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-04 18:25:15 -07:00
Ido Schimmel	6e095fd4eb	mlxsw: spectrum: Edit RIF properties based on netdev events We are just about to introduce router interfaces (RIFs), but before that we need to be able update the device with the correct RIF attributes whenever they change for the netdev the RIF is backing. Two such attributes are MTU and MAC. The MAC is used both to set the source MAC of packets egressing from the RIF and also to program an FDB rule that will direct packets to the router block. Use the existing netdevice notification block and respond to CHANGEADDR and CHANGEMTU accordingly. Store both attributes in the RIF struct in case we need to revert to old attributes following a failed update. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-04 18:25:15 -07:00
Jiri Pirko	7ce856aaaf	mlxsw: spectrum: Add couple of lower device helper functions Add functions that iterate over lower devices and find port device. As a dependency add netdev_for_each_all_lower_dev and netdev_for_each_all_lower_dev_rcu macro with netdev_all_lower_get_next and netdev_all_lower_get_next_rcu shelpers. Also, add functions to return mlxsw struct according to lower device found and mlxsw_port struct with a reference to lower device. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-04 18:25:15 -07:00
Jiri Pirko	61c503f976	mlxsw: spectrum_router: Implement fib4 add/del switchdev obj ops Implement ipv4 FIB entries addition and removal. Initially, we support local and broadcast routes using "ip2me" trap action. Also, unicast routes without nexthop are supported using "local" action. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-04 18:25:15 -07:00
Jiri Pirko	d5a1c749d2	mlxsw: reg: Add Router Algorithmic LPM Unicast Entry Register definition Serves for adding, updating and removing fib entries. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-04 18:25:15 -07:00
Jiri Pirko	6b75c4807d	mlxsw: spectrum_router: Add virtual router management Virtual router is a construct used inside HW. In this implementation we map kernel tables to virtual routers one to one. Introduce management logic to create virtual routers when needed and destroy in case they are no longer in use. According to that, call into LPM tree management. Each virtual router is always bound to one LPM tree. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-04 18:25:14 -07:00
Jiri Pirko	53342023ee	mlxsw: spectrum_router: Implement LPM trees management Introduce basic LPM tree management allowing to share the trees in between tables if the used prefixes in the tables are the same. Build the tree structure according to the used prefixes. Although it is not optimal for many use cases, this initial implementation does only simple linear left-tree. More advanced structures will be introduced later on, possibly including mechanisms to change trees on the fly. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-04 18:25:14 -07:00
Jiri Pirko	20ae4053e9	mlxsw: reg: Add Router Algorithmic LPM Tree Binding Register definition This register is used to bind virtual router and protocol to an allocated LPM tree. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-04 18:25:14 -07:00
Jiri Pirko	a9823359c6	mlxsw: reg: Add Router Algorithmic LPM Structure Tree Register definition Serves to build LPM tree structure. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-04 18:25:14 -07:00
Jiri Pirko	6f9fc3cee4	mlxsw: reg: Add Router Algorithmic LPM Tree Allocation Register definition Register serves for allocation and deallocation of LPM search tree. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-04 18:25:14 -07:00
Jiri Pirko	5e9c16cc83	mlxsw: spectrum_router: Implement private fib Shadow FIB is needed in order to hold additional information for FIB entries and keep track of used prefixes. That is needed for the LPM tree construction to be introduced later on in this set. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-04 18:25:14 -07:00
Christophe Jaillet	5d4de16c6d	net/mlx4: Fix some indent inconsistancy Silent a few smatch warnings about indentation. This include the removal of a 'return' statement in 'resource_tracker.c'. This 'return' will still be performed when breaking out of the corresponding 'switch' block. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-04 15:22:33 -07:00
Jiri Pirko	7b27ce7bb9	mlxsw: spectrum: Add traps needed for router implementation ip2me: To instruct HW to send trapped ip2me traffic to kernel, we have to add this trap. Selection ip2me traffic is introduced later on in this set. ARPs: We are going to stop flooding to CPU port when netdev isn't bridged and only get packets destined to the netdev's IP address and certain control packets. Add traps for ARP request (broadcast) and response (unicast) in order to get these to the CPU and resolve neighbours. host miss: If a packet is routed through a directly connected route and its destination IP is not in the device's neighbour table, then we need to trap it to CPU. This will cause the host to resolve the MAC of the neighbour, which will be eventually programmed to the device's table. router ingress: In order to trap packets in router part. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-02 15:21:18 -04:00
Ido Schimmel	10f00aa1c4	mlxsw: spectrum: Use action 'discard' when removing traps When removing packet traps we should use action 'discard' instead of 'forward', as some trap IDs we'll add cannot be configured with the later. However, result is the same, as packets are not trapped to the CPU. In the future we will be able to reverse the operation properly by detaching the trap group from the CPU. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-02 15:21:18 -04:00
Ido Schimmel	3dc266896d	mlxsw: reg: Add Router Interface Table Register Add the Router Interface Table Register (RITR), which allows us to create and configure router interfaces (RIFs). Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-02 15:21:18 -04:00
Ido Schimmel	d82d8c060f	mlxsw: reg: Add FDB action to forward to router Incoming packets are directed to the router when they match an FDB entry with action forward to IP router. Add this action, which was mistakenly named "TRAP". Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-02 15:21:17 -04:00
Ido Schimmel	fa3054f5a8	mlxsw: spectrum: Add router interface struct When enabling the router in the device we will represent L3 netdevs using router interfaces (RIFs). These will be specified whenever programming routes or neighbours on the netdev. Introduce the basic RIF infrastructure which allows one to lookup a RIF by its netdev. Later patches in the series will extend this, but the basic routines are needed now in order to direct traffic to CPU. Pointers to the RIF structs are stored in an array indexed by the RIF's number. This will allow us to efficiently update the kernel's neighbour table when regularly dumping the device's table. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-02 15:21:17 -04:00
Ido Schimmel	464dce1884	mlxsw: spectrum_router: Add basic ipv4 router initialization Create a skeleton router file and do basic HW initialization of router. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-02 15:21:17 -04:00
Ido Schimmel	bbf2a4757b	mlxsw: spectrum: Initialize ports at the end of init sequence During ports initialization a net device is registered for each available port, which implies the port is usable. However, a port is only usable after the different parts of the device (e.g. flooding, buffers) are initialized. This is especially important now, when we must initialize the router before the ports, as otherwise the device can't be initialized. Solve that by initializing the switch ports at the end of init sequence. Also, remove an unnecessary warning about port up/down events, which would otherwise be invoked whenever removing the driver, as ports are removed before unregistering the listener for these events. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-02 15:21:17 -04:00
Ido Schimmel	69c407aaf9	mlxsw: reg: Add Router General Configuration Register Add the Router General Configuration Register (RGCR), which allows us to enable the router in the device and configure its various parameters. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-02 15:21:17 -04:00
Ido Schimmel	11943ff442	mlxsw: spectrum: Remove RIF from PVID vPort when joining / leaving LAG We are going to assign router interfaces (RIFs) to netdevs if an IPv4 address was assigned to them. If one was assigned to a port netdev, this will translate to the PVID vPort being member in a RIF. While it's possible for a LAG slave to have an IP address, we can't have a vPort being member in two FIDs (assuming the LAG device will be put in bridge / assigned an IP address). Solve that by making the PVID vPort leave any FID it might be a member in when joining / leaving LAG. Note that the PVID vPort is the only vPort that can be present on the port when it's put under LAG. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-02 15:21:17 -04:00
Ido Schimmel	86bf95b334	mlxsw: spectrum: Sync PVID vPort LAG status When VLAN devices are created on top of LAG, their underlying vPorts are configured correctly with LAG membership. However, the PVID vPort is implicit and already present when the port netdev is put under LAG, so its LAG membership is never set. Set it correctly when joining / leaving LAG. This didn't matter until now, but we are going to introduce support for router interfaces (RIFs), which need to take into account LAG membership. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-02 15:21:17 -04:00
Ido Schimmel	32d863fb93	mlxsw: spectrum: Remove VLANs configuration via SELF flag When port isn't bridged it is still possible to invoke switchdev ops and configure the device's VLAN filters. However, this will require us to use different Router InterFaces (RIFs) for the same netdev, instead of one per-netdev as with any other configuration. Taking the above into account and the fact that this functionality is questionable with regards to the device's normal use-case, remove it and instead return an error. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-02 15:21:17 -04:00
Ido Schimmel	52697a9ede	mlxsw: spectrum: Send untagged packets through a port netdev Port netdevs (e.g. swXpY) that are not bridged are represented in the device using a vPort with VID=PVID=1 (the PVID vPort), as untagged packets entering the switch are internally tagged with the PVID VLAN. When these packets are routed through a different port netdev they should egress untagged. This wasn't a problem until now, as non-bridged traffic only originated from the CPU, which transmits packets out of the port as-is. When a vPort is created with VID 1 mark it as egress untagged. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-02 15:21:17 -04:00
Hadar Hen Zion	cb67b83292	net/mlx5e: Introduce SRIOV VF representors Implement the relevant profile functions to create mlx5e driver instance serving as VF representor. When SRIOV offloads mode is enabled, each VF will have a representor netdevice instance on the host. To do that, we also export set of shared service functions from en_main.c, such that they can be used by both NIC and repsresentors netdevs. The newly created representor netdevice has a basic set of net_device_ops which are the same ndo functions as the NIC netdevice and an ndo of it's own for phys port name. The profiling infrastructure allow sharing code between the NIC and the vport representor even though the representor has only a subset of the NIC functionality. The VF reps and the PF which is used in that mode to represent the uplink, expose switchdev ops. Currently the only op supposed is attr get for the port parent ID which here serves to identify net-devices belonging to the same HW E-Switch. Other than that, no offloading is implemented and hence switching functionality is achieved if one sets SW switching rules, e.g using tc, bridge or ovs. Port phys name (ndo_get_phys_port_name) is implemented to allow exporting to user-space the VF vport number and along with the switchdev port parent id (phys_switch_id) enable a udev base consistent naming scheme: SUBSYSTEM=="net", ACTION=="add", ATTR{phys_switch_id}=="<phys_switch_id>", \ ATTR{phys_port_name}!="", NAME="$PF_NIC$attr{phys_port_name}" where phys_switch_id is exposed by the PF (and VF reps) and $PF_NIC is the name of the PF netdevice. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-02 14:40:41 -04:00
Hadar Hen Zion	127ea380ac	net/mlx5: Add Representors registration API Introduce E-Switch registration/unregister representors functions. Those functions are called by the mlx5e driver when the PF NIC is created upon pci probe action regardless of the E-Switch mode (NONE, LEGACY or OFFLOADS). Adding basic E-Switch database that will hold the vport represntors upon creation. This patch doesn't add any new functionality. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-02 14:40:41 -04:00
Hadar Hen Zion	6bfd390ba5	net/mlx5e: Add support for multiple profiles To allow support in representor netdevices where we create more than one netdevice per NIC, add profiles to the mlx5e driver. The profiling allows for creation of mlx5e instances with different characteristics. Each profile implements its own behavior using set of function pointers defined in struct mlx5e_profile. This is done to allow for avoiding complex per profix branching in the code. Currently only the profile for the conventional NIC is implemented, which is of use when a netdev is created upon pci probe. This patch doesn't add any new functionality. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-02 14:40:41 -04:00
Hadar Hen Zion	398f33511e	net/mlx5e: Mark enabled RQTs instances explicitly In the current driver implementation two types of receive queue tables (RQTs) are in use - direct and indirect. Change the driver to mark each new created RQT (direct or indirect) as "enabled". This behaviour is needed for introducing new mlx5e instances which serve to represent SRIOV VFs. The VF representors will have only one type of RQTs (direct). An "enabled" flag is added to each RQT to allow better handling and code sharing between the representors and the nic netdevices. This patch doesn't add any new functionality. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-02 14:40:40 -04:00
Hadar Hen Zion	724b2aa151	net/mlx5e: TIRs management refactoring The current refresh tirs self loopback mechanism, refreshes all the tirs belonging to the same mlx5e instance to prevent self loopback by packets sent over any ring of that instance. This mechanism relies on all the tirs/tises of an instance to be created with the same transport domain number (tdn). Change the driver to refresh all the tirs created under the same tdn regardless of which mlx5e netdev instance they belong to. This behaviour is needed for introducing new mlx5e instances which serve to represent SRIOV VFs. The representors and the PF share vport used for E-Switch management, and we want to avoid NIC level HW loopback between them, e.g when sending broadcast packets. To achieve that, both the representors and the PF NIC will share the tdn. This patch doesn't add any new functionality. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-02 14:40:40 -04:00
Hadar Hen Zion	b50d292b43	net/mlx5e: Create NIC global resources only once To allow creating more than one netdev over the same PCI function, we change the driver such that global NIC resources are created once and later be shared amongst all the mlx5e netdevs running over that port. Move the CQ UAR, PD (pdn), Transport Domain (tdn), MKey resources from being kept in the mlx5e priv part to a new resources structure (mlx5e_resources) placed under the mlx5_core device. This patch doesn't add any new functionality. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-02 14:40:40 -04:00
Or Gerlitz	c930a3ad74	net/mlx5e: Add devlink based SRIOV mode changes Implement handlers for the devlink commands to get and set the SRIOV E-Switch mode. When turning to the switchdev/offloads mode, we disable the e-switch and enable it again in the new mode, create the NIC offloads table and create VF reps. When turning to legacy mode, we remove the VF reps and the offloads table, and re-initiate the e-switch in it's legacy mode. The actual creation/removal of the VF reps is done in downstream patches. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-02 14:40:40 -04:00
Or Gerlitz	feae908744	net/mlx5: Add devlink interface The devlink interface is initially used to set/get the mode of the SRIOV e-switch. Currently, these are only stubs for get/set, down-stream patch will actually fill them out. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-02 14:40:40 -04:00
Or Gerlitz	fed9ce22bf	net/mlx5: E-Switch, Add API to create vport rx rules Add the API to create vport rx rules of the form packet meta-data :: vport == $VPORT --> $TIR where the TIR is opened by this VF representor. This logic will by used for packets that didn't match any rule in the e-switch datapath and should be received into the host OS through the netdevice that represents the VF they were sent from. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-02 14:40:40 -04:00
Or Gerlitz	c116c6eec6	net/mlx5: E-Switch, Add offloads table Belongs to the NIC offloads name-space, and to be used as part of the SRIOV offloads logic to steer packets that hit the e-switch miss rule to the TIR of the relevant VF representor. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-02 14:40:40 -04:00
Or Gerlitz	acbc2004d7	net/mlx5: Introduce offloads steering namespace Add a new namespace (MLX5_FLOW_NAMESPACE_OFFLOADS) to be populated with flow steering rules that deal with rules that have have to be executed before the EN NIC steering rules are matched. The namespace is located after the bypass name-space and before the kernel name-space. Therefore, it precedes the HW processing done for rules set for the kernel NIC name-space. Under SRIOV, it would allow us to match on e-switch missed packet and forward them to the relevant VF representor TIR. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Amir Vadai <amir@vadai.me> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-02 14:40:39 -04:00
Or Gerlitz	ab22be9ba3	net/mlx5: E-Switch, Add API to create send-to-vport rules Add the API to create send-to-vport e-switch rules of the form packet meta-data :: send-queue-number == $SQN and source-vport == 0 --> $VPORT These rules are to be used for a send-to-vport logic which conceptually bypasses the "normal" steering rules currently present at the e-switch datapath. Such rule should apply only for packets that originate in the e-switch manager vport (0) and are sent for a given SQN which is used by a given VF representor device, and hence the matching logic. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-02 14:40:39 -04:00
Or Gerlitz	3aa335724f	net/mlx5: E-Switch, Add miss rule for offloads mode In the sriov offloads mode, packets that are not matched by any other rule should be sent towards the e-switch manager for further processing. Add such "miss" rule which matches ANY packet as the last rule in the e-switch FDB and programs the HW to send the packet to vport 0 where the e-switch manager runs. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-02 14:40:39 -04:00
Or Gerlitz	69697b6e20	net/mlx5: E-Switch, Add support for the sriov offloads mode Unlike the legacy mode, here, forwarding rules are not learned by the driver per events on macs set by VFs/VMs into their vports, but rather should be programmed by higher-level SW entities. Saying that, still, in the offloads mode (SRIOV_OFFLOADS), two flow groups are created by the driver for management (slow path) purposes: The first group will be used for sending packets over e-switch vports from the host OS where the e-switch management code runs, to be received by VFs. The second group will be used by a miss rule which forwards packets toward the e-switch manager. Further logic will trap these packets such that the receiving net-device as seen by the networking stack is the representor of the vport that sent the packet over the e-switch data-path. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-02 14:40:39 -04:00
Or Gerlitz	6ab36e35f1	net/mlx5: E-Switch, Add operational mode to the SRIOV e-Switch Define three modes for the SRIOV e-switch operation, none (SRIOV_NONE, none of the VF vports are enabled), legacy (SRIOV_LEGACY, the current mode) and sriov offloads (SRIOV_OFFLOADS). Currently, when in SRIOV, only the legacy mode is supported, where steering rules are of the form: destination mac --> VF vport This patch does not change any functionality. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-02 14:40:39 -04:00
Dave Airlie	542d972221	Linux 4.7-rc5 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJXcHi9AAoJEHm+PkMAQRiGSJ0H/2o4t9VWYmhyPC1sdIHoCExJ P4tBrcZYBmKcsOmIfnJDa5g/+IdhouEUM0v0fHPogS2UUWT9eRuJWYD3sY+HpEQ+ heKTli8X73gsFB25odeIbIt0jAoSiiMYWDrWqLNsuUV1tjEYVA8rH0SM94FiOC/5 7WVWXLTuH+Rm7JHP18BnKxmMMbzrTFmwisLMqFKyfZRRSlS+/ix7iLUNO9AFa39B YHxNPihLrZ0oONyCOAQoHTIXXrw0cQbxV2utg3vnMcCZdme2xOn+iXMntTSKfZ39 iC9/T0vsO3R6OrRo2aDZAnCPUAniXnMEIhrKG37WMyXpj6cucZ/2QiNXcXviGV4= =iLte -----END PGP SIGNATURE----- Back-merge tag 'v4.7-rc5' into drm-next Linux 4.7-rc5 The fsl-dcu pull needs -rc3 so go to -rc5 for now.	2016-07-02 15:56:01 +10:00
Shaker Daibes	87424ad52d	net/mlx5e: Log link state changes Add Link UP/Down prints to kernel log when link state changes Signed-off-by: Shaker Daibes <shakerd@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-01 06:12:04 -04:00
Rana Shahout	cdcf11212b	net/mlx5e: Validate BW weight values of ETS Valid weight assigned to ETS TClass values are 1-100 Fixes: `08fb1dacdd` ('net/mlx5e: Support DCBNL IEEE ETS') Signed-off-by: Rana Shahout <ranas@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-01 06:12:04 -04:00
Rana Shahout	7ccdd0841b	net/mlx5e: Fix select queue callback The default fallback function used by mlx5e select queue can return any TX queues in range [0..dev->num_real_tx_queues). The current implementation assumes that the fallback function returns a number in the range [0.. number of channels). Actually dev->num_real_tx_queues = (number of channels) * dev->num_tc; which is more than the expected range if num_tc is configured and could lead to crashes. To fix this we test if num_tc is not configured we can safely return the fallback suggestion, if not we will reciprocal_scale the fallback result and normalize it to the desired range. Fixes: `08fb1dacdd` ('net/mlx5e: Support DCBNL IEEE ETS') Signed-off-by: Rana Shahout <ranas@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reported-by: Doug Ledford <dledford@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-01 06:12:04 -04:00
Matthew Finlay	e3a19b53cb	net/mlx5e: Copy all L2 headers into inline segment ConnectX4-Lx uses an inline wqe mode that currently defaults to requiring the entire L2 header be included in the wqe. This patch fixes mlx5e_get_inline_hdr_size() to account for all L2 headers (VLAN, QinQ, etc) using skb_network_offset(skb). Fixes: `e586b3b0ba` ("net/mlx5: Ethernet Datapath files") Signed-off-by: Matthew Finlay <matt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-01 06:12:04 -04:00
Daniel Jurgens	6cd392a082	net/mlx5e: Handle RQ flush in error cases Add a timeout to avoid an infinite loop waiting for RQ's to flush. This occurs during AER/EEH and will also happen if the device stops posting completions due to internal error or reset, or if moving the RQ to the error state fails. Also cleanup posted receive resources when closing the RQ. Fixes: `f62b8bb8f2` ('net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet functionality') Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-01 06:12:03 -04:00
Daniel Jurgens	3947ca1859	net/mlx5e: Implement ndo_tx_timeout callback Add callback to handle TX timeouts. Fixes: `f62b8bb8f2` ('net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet functionality') Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-01 06:12:03 -04:00
Daniel Jurgens	29429f3300	net/mlx5e: Timeout if SQ doesn't flush during close Avoid an infinite loop by timing out waiting for the SQ to flush. Also clean up the TX descriptors if that happens. Fixes: `f62b8bb8f2` ('net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet functionality') Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-01 06:12:03 -04:00
Mohamad Haj Yahia	65ee670845	net/mlx5: Add timeout handle to commands with callback The current implementation does not handle timeout in case of command with callback request, and this can lead to deadlock if the command doesn't get fw response. Add delayed callback timeout work before posting the command to fw. In case of real fw command completion we will cancel the delayed work. In case of fw command timeout the callback timeout handler will be called and it will simulate fw completion with timeout error. Fixes: `e126ba97db` ('mlx5: Add driver for Mellanox Connect-IB adapters') Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-01 06:12:03 -04:00
Mohamad Haj Yahia	9cba4ebcf3	net/mlx5: Fix potential deadlock in command mode change Call command completion handler in case of timeout when working in interrupts mode. Avoid flushing the commands workqueue after acquiring the semaphores to prevent a potential deadlock. Fixes: `e126ba97db` ('mlx5: Add driver for Mellanox Connect-IB adapters') Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-01 06:12:03 -04:00
Daniel Jurgens	d57847dc41	net/mlx5: Fix wait_vital for VFs and remove fixed sleep The device ID for VFs is in a different location than PFs. This results in the poll always timing out for VFs. There's no good way to read the VF device ID without using the PF's configuration space. Switch to waiting for the health poll to start incrementing. Also remove the 1s sleep at the beginning. fixes: `89d44f0a6c` ('net/mlx5_core: Add pci error handlers to mlx5_core driver') Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-01 06:12:03 -04:00
Daniel Jurgens	5adff6a088	net/mlx5: Fix incorrect page count when in internal error Change page cleanup flow when in internal error to properly decrement the page counts when reclaiming pages. The prevents timing out waiting for extra pages that were actually cleaned up previously. fixes: `89d44f0a6c` ('net/mlx5_core: Add pci error handlers to mlx5_core driver') Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-01 06:12:03 -04:00
Mohamad Haj Yahia	c1d4d2e92a	net/mlx5: Avoid calling sleeping function by the health poll thread In internal error state the health poll thread will eventually call synchronize_irq() (to safely trigger command completions) which might sleep, so we are calling sleeping function from atomic context which is invalid. Here we move trigger_cmd_completions(dev) to enter error state which is the earliest stage in error state handling. This way we won't need to wait for next health poll to trigger command completions and will solve the scheduling while atomic issue. mlx5_enter_error_state can be called from two contexts, protect it with dev->intf_state_lock Fixes: `89d44f0a6c` ('net/mlx5_core: Add pci error handlers to mlx5_core driver') Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-01 06:12:03 -04:00
Mohamad Haj Yahia	0d834442cc	net/mlx5: Fix teardown errors that happen in pci error handler In case of internal error state we will simulate the commands status through the return value translation function, but we need to simulate all the teardown fw commands as successful so we will not have fw command failure prints. This also fix memory leaks that happen because we skip teardown stages due to failed fw commands. Fixes: `89d44f0a6c` ('net/mlx5_core: Add pci error handlers to mlx5_core driver') Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-01 06:12:02 -04:00
David S. Miller	ee58b57100	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Several cases of overlapping changes, except the packet scheduler conflicts which deal with the addition of the free list parameter to qdisc_enqueue(). Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-30 05:03:36 -04:00
Gal Pressman	bfe6d8d1d4	net/mlx5e: Reorganize ethtool statistics Categorize and reorganize ethtool statistics counters by renaming to "rx_" and "tx_" and removing redundant and duplicated counters, this way they are easier to grasp and more user friendly. Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-29 04:28:47 -04:00
Gal Pressman	ed80ec4c17	net/mlx5e: Fix number of PFC counters reported to ethtool Number of PFC counters used to count only number of priorities with PFC enabled, but each priority has more than one counter, hence the need to multiply it by the number of PFC counters per priority. Fixes: `cf678570d5` ('net/mlx5e: Add per priority group to PPort counters') Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-29 04:28:46 -04:00
Matthew Finlay	9ceec359e4	net/mlx5e: Prevent adding the same vxlan port Do not allow the same vxlan udp port to be added to the device more than once. Fixes: `b3f63c3d5e` ("net/mlx5e: Add netdev support for VXLAN tunneling") Signed-off-by: Matthew Finlay <matt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-29 04:28:46 -04:00
Gal Pressman	fd4782c213	net/mlx5e: Check for BlueFlame capability before allocating SQ uar Previous to this patch mapping was always set to write combining without checking whether BlueFlame is supported in the device. Fixes: `0ba422410b` ('net/mlx5: Fix global UAR mapping') Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-29 04:28:46 -04:00
Eli Cohen	e0f46eb9f6	net/mlx5e: Change enum to better reflect usage Change MLX5E_STATE_ASYNC_EVENTS_ENABLE to MLX5E_STATE_ASYNC_EVENTS_ENABLED since it represent a state and not an operation. Fixes: `acff797cd1` ('net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet functionality') Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-29 04:28:46 -04:00
Majd Dibbiny	7092fe8669	net/mlx5: Add ConnectX-5 PCIe 4.0 to list of supported devices Add the upcoming ConnectX-5 PCIe 4.0 device to the list of supported devices by the mlx5 driver. Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-29 04:28:46 -04:00
Eli Cohen	5be1ea899d	net/mlx5: Update command strings Add command string for MODIFY_FLOW_TABLE which is used by the driver. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-29 04:28:46 -04:00
Wang Sheng-Hui	f299a02d5f	net/mlx5: use mlx5_buf_alloc_node instead of mlx5_buf_alloc in mlx5_wq_ll_create Commit `311c7c71c9` ("net/mlx5e: Allocate DMA coherent memory on reader NUMA node") introduced mlx5__alloc_node() but missed changing some calling and warn messages. This patch introduces 2 changes: Use mlx5_buf_alloc_node() instead of mlx5_buf_alloc() in mlx5_wq_ll_create() * Update the failure warn messages with _node postfix for mlx5_*_alloc function names Fixes: `311c7c71c9` ("net/mlx5e: Allocate DMA coherent memory on reader NUMA node") Signed-off-by: Wang Sheng-Hui <shhuiw@foxmail.com> Acked-By: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-28 05:17:38 -04:00
Gal Pressman	52244d9607	net/mlx5e: Report correct auto negotiation and allow toggling Previous to this patch auto negotiation was reported off although it was on by default in hardware. This patch reports the correct information to ethtool and allows the user to toggle it on/off. Added another parameter to set port proto function in order to pass the auto negotiation field to the hardware. Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-27 04:10:41 -04:00
Gal Pressman	665bc53969	net/mlx5e: Use new ethtool get/set link ksettings API Use new get/set link ksettings and remove get/set settings legacy callbacks. This allows us to use bitmasks longer than 32 bit for supported and advertised link modes and use modes that were previously not supported. Signed-off-by: Gal Pressman <galp@mellanox.com> CC: Ben Hutchings <bwh@kernel.org> CC: David Decotigny <decot@googlers.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-27 04:10:41 -04:00
Gal Pressman	4a50e35b04	net/mlx5e: Add missing 50G baseSR2 link mode Add MLX5E_50GBASE_SR2 as ETHTOOL_LINK_MODE_50000baseSR2_Full_BIT. Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Cc: Ben Hutchings <bwh@kernel.org> Cc: David Decotigny <decot@googlers.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-27 04:10:41 -04:00
Gal Pressman	667daedaec	net/mlx5e: Toggle link only after modifying port parameters Add a dedicated function to toggle port link. It should be called only after setting a port register. Toggle will set port link to down and bring it back up in case that it's admin status was up. Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-27 04:10:41 -04:00
Gil Rockah	cb3c7fd4f8	net/mlx5e: Support adaptive RX coalescing Striving for high message rate and low interrupt rate. Usage: ethtool -C <interface> adaptive-rx on/off Signed-off-by: Gil Rockah <gilr@mellanox.com> Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> CC: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-27 04:10:41 -04:00
Tariq Toukan	9908aa2929	net/mlx5e: CQE based moderation In this mode the moderation timer will restart upon new completion (CQE) generation rather than upon interrupt generation. The outcome is that for bursty traffic the period timer will never expire and thus only the moderation frames counter will dictate interrupt generation, thus the interrupt rate will be relative to the incoming packets size. If the burst seizes for "moderation period" time then an interrupt will be issued immediately. CQE based moderation is off by default and can be controlled via ethtool set_priv_flags. Performance tested on ConnectX4-Lx 50G. Less packet loss in netperf UDP and TCP tests, with no bw degradation, for both single and multi streams, with message sizes of 64, 1024, 1472 and 32768 byte. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Gil Rockah <gilr@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-27 04:10:41 -04:00
Gal Pressman	4e59e28881	net/mlx5e: Introduce net device priv flags infrastructure Introduce an infrastructure for getting/setting private net device flags. Currently a 'nop' priv flag is added, following patches will override the flag will actual feature specific flags. Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-27 04:10:40 -04:00
Yevgeny Petrilin	507f0c817f	net/mlx5e: Add TXQ set max rate support Implement set_maxrate ndo. Use the rate index from the hardware table to attach to channel SQ/TXQ. In case of failure to configure new rate, the queue remains with unlimited rate. We save the configuration on priv structure and apply it each time Send Queues are being reinitialized (after open/close) operations. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-27 04:10:40 -04:00
Yevgeny Petrilin	1466cc5b23	net/mlx5: Rate limit tables support Configuring and managing HW rate limit tables. The HW holds a table of rate limits, each rate is associated with an index in that table. Later a Send Queue uses this index to set the rate limit. Multiple Send Queues can have the same rate limit, which is represented by a single entry in this table. Even though a rate can be shared, each queue is being rate limited independently of others. The SW shadow of this table holds the rate itself, the index in the HW table and the refcount (number of queues) working with this rate. The exported functions are mlx5_rl_add_rate and mlx5_rl_remove_rate. Number of different rates and their values are derived from HW capabilities. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-27 04:10:40 -04:00
Rana Shahout	af7d518526	net/mlx4_en: Add DCB PFC support through CEE netlink commands This patch adds support for reading and updating priority flow control (PFC) attributes in the driver via netlink. Signed-off-by: Rana Shahout <ranas@mellanox.com> Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-23 15:18:50 -04:00
Artemy Kovalyov	af1ba291c5	{net, IB}/mlx5: Refactor internal SRQ API Currently, the SRQ API uses the obsolete mlx5_*_srq_mbox_{in,out} structs which limit the ability to pass the SRQ attributes between net and IB parts of the driver. This patch changes the SRQ API so as to use auto-generated structs and provides a better way to pass attributes which will be in use by coming features. Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-06-23 11:20:07 -04:00
Yishai Hadas	16bd020147	net/mlx5: Export required core functions to support RSS In order to support RSS QPs, we need to create Ethernet based objects. This is done by create_rq, destroy_rq, create_rqt and destroy_rqt mlx5_core functions. We export these functions. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-06-23 11:02:43 -04:00
Eran Ben Elisha	9d76931180	net/mlx4_en: Avoid unregister_netdev at shutdown flow This allows a clean shutdown, even if some netdev clients do not release their reference from this netdev. It is enough to release the HW resources only as the kernel is shutting down. Fixes: `2ba5fbd62b` ('net/mlx4_core: Handle AER flow properly') Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-22 16:38:11 -04:00
Kamal Heib	93c098af09	net/mlx4_en: Fix the return value of a failure in VLAN VID add/kill Modify mlx4_en_vlan_rx_[add/kill]_vid to return error value in case of failure. Fixes: `8e586137e6` ('net: make vlan ndo_vlan_rx_[add/kill]_vid return error value') Signed-off-by: Kamal Heib <kamalh@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-22 16:38:11 -04:00
Ido Schimmel	223053783b	mlxsw: spectrum: Add debug prints For debug purposes, it's useful to know the order in which the driver responds to changes in the topology of its upper devices. Add debug prints to signal these events. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-21 05:02:51 -04:00
Ido Schimmel	1c80075907	mlxsw: spectrum: Free resources upon vPort destruction There are situations in which a vPort is destroyed while still holding references to device's resources such as FIDs and FDB records. This can happen, for example, when a VLAN device is deleted while still being bridged. Instead of trying to make sure vPort destruction is invoked when it no longer uses device's resources, just free them upon destruction. This simplifies the code, as we no longer need to take different situations into account when events are received - cleanup is taken care of in one place. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-21 05:02:51 -04:00
Ido Schimmel	fe3f6d144a	mlxsw: spectrum: Refactor FDB flushing logic FDB entries are learned using {Port / LAG ID, FID} and therefore should be flushed whenever a port (vPort) leaves its FID (vFID). However, when the bridge port is a LAG device (or a VLAN device on top), then FDB flushing is conditional. Ports removed from such LAG configurations must not trigger flushing, as other ports might still be members in the LAG and therefore the bridge port is still active. The decision whether to flush or not was previously computed in the netdevice notification block, but in order to flush the entries when a port leaves its FID this decision should be computed there. Strip the notification block from this logic and instead move it to one FDB flushing function that is invoked from both the FID / vFID leave functions. When port isn't member in LAG, FDB flushing should always occur. Otherwise, it should occur only when the last port (vPort) member in the LAG leaves the FID (vFID). This will allow us - in the next patch - to simplify the cleanup code paths that are hit whenever the topology above the port netdevs changes. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-21 05:02:51 -04:00
Ido Schimmel	56918b6b0a	mlxsw: spectrum: Don't count on FID being present Not all vPorts will have FIDs assigned to them, so make sure functions first test for FID presence. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-21 05:02:50 -04:00
Ido Schimmel	41b996cc94	mlxsw: spectrum: Add FID get / set functions As previously explained, not all vPorts will be assigned FIDs, so instead of returning the FID index of a vPort, return a pointer to its FID struct. This will allow us to know whether it's legal to access the vPort's FID parameters such as index and device. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-21 05:02:50 -04:00
Ido Schimmel	6381b3a85f	mlxsw: spectrum: Check if port is vPort using its VID When L3 interfaces will be introduced a vPort won't necessarily have a FID assigned to it. This can happen if it's not member in a bridge (in which case it's assigned a vFID) or doesn't have an IP address (in which case it's assigned an rFID). Therefore, instead check the VID parameter to test whether a port is a vPort or not. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-21 05:02:50 -04:00
Ido Schimmel	14d39461b3	mlxsw: spectrum: Use per-FID struct for the VLAN-aware bridge In a very similar way to the vFIDs, make the first 4K FIDs - used in the VLAN-aware bridge - use the new FID struct. Upon first use of the FID by any of the ports do the following: 1) Create the FID 2) Setup a matching flooding entry 3) Create a mapping for the FID Unlike vFIDs, upon creation of a FID we always create a global VID-to-FID mapping, so that ports without upper vPorts can use it instead of creating an explicit {Port, VID} to FID mapping. When a port leaves a FID the reverse is performed. Whenever the FID's reference count reaches zero the FID is deleted along with the global mapping. The per-FID struct will later allow us to configure L3 interfaces on top of the VLAN-aware bridge. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-21 05:02:50 -04:00
Ido Schimmel	37286d2571	mlxsw: spectrum: Remove unused function argument Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-21 05:02:50 -04:00
Ido Schimmel	0355b59fbb	mlxsw: spectrum: Use join / leave functions for vFID operations When a vPort is created or when it joins a bridge we always do the same set of operations: 1) Create the vFID, if not already created 2) Setup flooding for the vFID 3) Map the {Port, VID} to the vFID When a vPort is destroyed or when it leaves a bridge the reverse is performed. Encapsulate the above in join / leave functions and simplify the code. FIDs and rFIDs will use a similar set of functions. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-21 05:02:50 -04:00
Ido Schimmel	d0ec875a2f	mlxsw: spectrum: Make vFID struct generic Up until now we had a dedicated struct only for vFIDs, but before introducing support for L3 interfaces we need to make it generic and use it for all three types of FIDs: 1) FIDs - 0..4K-1, used for the VLAN-aware bridge 2) vFIDs - 4K..15K-1, used for VLAN-unaware bridges 3) rFIDs - 15K..16K-1, used to direct traffic to / from the router in the device. Will be introduced later in the series. The three types of L3 interfaces - Router InterFaces, RIFs - that will be introduced correspond to the three types of FIDs and are configured using them. Therefore, we'll need to store the links between them as well as a reference count on the underlying FID, so that the corresponding RIF will be destroyed when it reaches zero. Note that the lower 0.5K vFIDs are currently used for for non-bridged netdevs, so that traffic could be flooded to the CPU port. However, when rFIDs will be introduced we'll no longer need these and they too will be used for VLAN-unaware bridges. Make the vFID struct generic by renaming it and some of its fields. FIDs will be converted to use it later in the series. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-21 05:02:50 -04:00
Ido Schimmel	e606002721	mlxsw: spectrum: Use FID instead of vFID to setup flooding Use a FID index instead of vFID and ease the transition towards a generic FID struct. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-21 05:02:50 -04:00
Ido Schimmel	9c4d442314	mlxsw: spectrum: Create a function to map vPort's FID A FID used by a vPort (vFID, but also rFID later in the series) is always mapped using {Port, VID} and not only VID as with the 4K FIDs of the VLAN-aware bridge. Instead of specifying all the arguments each time, just wrap this operation using a dedicated function and simplify the code. As before, the function takes FID as its argument in preparation for a generic FID struct. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-21 05:02:49 -04:00
Ido Schimmel	c7e920b5be	mlxsw: spectrum: Use only one function to create vFIDs Simplify the code and use only one function for vFID creation / destruction. Unlike before, the function receives a FID index as its argument and not a vFID index. Instead of passing 0, now one would need to pass 4K, which is the first vFID. This is the first step in creating a generic FID struct that will be used for all three types of FIDs. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-21 05:02:49 -04:00
Ido Schimmel	47a0a9e6c3	mlxsw: spectrum: Remove redundant function argument In all call sites 'only_uc' is set to false, so strip it. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-21 05:02:49 -04:00
Ido Schimmel	d8651fd886	mlxsw: spectrum: Use DECLARE_BITMAP() macro There is a macro to do this kind of declarations, so use it. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-21 05:02:49 -04:00
Ido Schimmel	7117a570b9	mlxsw: spectrum: Centralize VLAN-aware bridge ref counting We hold a reference count on the number of ports member in the VLAN-aware bridge, as we only support one. Instead of always incrementing / decrementing the reference count after joining / leaving the bridge, simply do this accounting in the join / leave functions. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-21 05:02:49 -04:00
Ido Schimmel	279438952b	mlxsw: spectrum: Remove unnecessary function argument The argument 'br_dev' is never used, so remove it. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-21 05:02:49 -04:00
Ido Schimmel	82e6db034b	mlxsw: spectrum: Make unlinking functions return void When responding to unlinking CHANGEUPPER notifications we shouldn't return any value, as it's not checked by upper layers. In addition, there's nothing the driver can do in case of failure, so it should simply continue and try to free as much resources as possible and not stop on first error. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-21 05:02:49 -04:00
Ido Schimmel	423b937e7d	mlxsw: spectrum: Use WARN_ON() return value Instead of checking for a condition and then issue the warning, just do it in one go and simplify the code. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-21 05:02:49 -04:00
Ido Schimmel	ddbe993dbe	mlxsw: spectrum: Remove unnecessary checks from event processing When upper device of a VLAN device changes we already made sure it's a bridge device in PRECHANGEUPPER, so no need to check it's a master device in CHANGEUPPER. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-21 05:02:49 -04:00
Ido Schimmel	6ec439043b	mlxsw: spectrum: Forbid LAG slave from having VLAN uppers When a port netdev is put under LAG it cannot have VLAN upper devices, so forbid that. The LAG device itself can have VLAN upper devices. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-21 05:02:48 -04:00
Ido Schimmel	59fe9b3f84	mlxsw: spectrum: Sanitize port netdev upper devices We currently only support the following upper devices for port netdevs: 1) Bridge 2) LAG (bond / team) 3) VLAN Any other device is forbidden, so return an error. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-21 05:02:48 -04:00
Ido Schimmel	80bedf1a62	mlxsw: spectrum: Use notifier_from_errno() in notifier block Instead of checking the error value and returning NOTIFY_BAD, just use notifier_from_errno() and simplify the code. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-21 05:02:48 -04:00
Nogah Frankel	4e239fac7c	mlxsw: switchx2: Don't count internal TX header bytes to stats Stop the SW TX counter from counting the TX header bytes since they are not being sent out. Fixes: `e577516b9d` ("mlxsw: Fix use-after-free bug in mlxsw_sx_port_xmit") Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-17 21:57:53 -07:00
Nogah Frankel	63dcdd35c1	mlxsw: spectrum: Don't count internal TX header bytes to stats Stop the SW TX counter from counting the TX header bytes since they are not being sent out. Fixes: `56ade8fe3f` ("mlxsw: spectrum: Add initial support for Spectrum ASIC") Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-17 21:57:53 -07:00
Alexander Duyck	974c3f3000	mlx5_en: Replace ndo_add/del_vxlan_port with ndo_add/del_udp_enc_port This change replaces the network device operations for adding or removing a VXLAN port with operations that are more generically defined to be used for any UDP offload port but provide a type. As such by just adding a line to verify that the offload type is VXLAN we can maintain the same functionality. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-17 20:23:31 -07:00
Alexander Duyck	a831274a13	mlx4_en: Replace ndo_add/del_vxlan_port with ndo_add/del_udp_enc_port This change replaces the network device operations for adding or removing a VXLAN port with operations that are more generically defined to be used for any UDP offload port but provide a type. As such by just adding a line to verify that the offload type is VXLAN we can maintain the same functionality. In addition I updated the socket address family check so that instead of excluding IPv6 we instead abort of type is not IPv4. This makes much more sense as we should only be supporting IPv4 outer addresses on this hardware. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-17 20:23:31 -07:00
Alexander Duyck	a547224dce	mlx4e: Do not attempt to offload VXLAN ports that are unrecognized The mlx4e driver does not support more than one port for VXLAN offload. As such expecting the hardware to offload other ports is invalid since it appears the parsing logic is used to perform Tx checksum and segmentation offloads. Use the vxlan_port number to determine in which cases we can apply the offload and in which cases we can not. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-16 14:24:59 -07:00
Eric Dumazet	0c5ddb51e8	net/mlx4_en: initialize cmd.context_lock spinlock earlier Maciej Żenczykowski reported lockdep warning a spinlock was not registered before being held in mlx4_cmd_wake_completions() cmd.context_lock initialization is not at the right place. 1) mlx4_cmd_use_events() can be called multiple times. Calling spin_lock_init() on a live spinlock can lead to hangs. 2) mlx4_cmd_wake_completions() can be called while lock has not been initialized. Lockdep complains, and current logic is not race prone. It seems better to move the initialization earlier in mlx4_load_one() Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Maciej Żenczykowski <maze@google.com> Cc: Eugenia Emantayev <eugenia@mellanox.com> Cc: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-15 12:16:30 -07:00
David S. Miller	1578b0a5e9	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: net/sched/act_police.c net/sched/sch_drr.c net/sched/sch_hfsc.c net/sched/sch_prio.c net/sched/sch_red.c net/sched/sch_tbf.c In net-next the drop methods of the packet schedulers got removed, so the bug fixes to them in 'net' are irrelevant. A packet action unload crash fix conflicts with the addition of the new firstuse timestamp. Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-10 11:52:24 -07:00
Bhaktipriya Shridhar	3d5479e920	mlxsw: core: Remove deprecated create_workqueue alloc_workqueue replaces deprecated create_workqueue(). A dedicated workqueue has been used since the workqueue mlxsw_wq is used for FDB notif. processing with workitems that are involved in normal device operation && because it's a network device which can be depended upon during memory reclaim. Workitems &trans->timeout_dw and &mlxsw_sp->fdb_notify.dw, map to mlxsw_sp_fdb_notify_work (processes FDB notifications from the underlying device and resolves the netdev to which the entry points to and notifies the bridge using the switchdev notifier) and mlxsw_emad_trans_timeout_work (provides async EMAD register access) respectively. They require forward progress under memory pressure and hence, WQ_MEM_RECLAIM has been set. Since there are only a fixed number of work items, explicit concurrency limit is unnecessary here. Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com> Tested-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-09 23:49:43 -07:00
Eric Dumazet	f7d3c1cbe3	net/mlx4_en: fix ethtool -x mlx4 RSS is limited to spread incoming packets to a power of two number of queues. An uniformly distibuted traffic would be split on queues 0 to N-1, N being a power of two, each queue having a 1/N weight. If number of RX queues is not a power of two, upper RX queues do not receive traffic. ethtool -x is lying, because it pretends some queues have higher weight. Before patch: lpaa24:~# ethtool -L eth1 rx 24 lpaa24:~# ethtool -x eth1 RX flow hash indirection table for eth1 with 24 RX ring(s): 0: 0 1 2 3 4 5 6 7 8: 8 9 10 11 12 13 14 15 16: 0 1 2 3 4 5 6 7 RSS hash key: e0:7c:3a:89:07:55:b6:58:69:cc:f4:e5:24:62:e3:25:88:6c:42:5b:d2:cb:9a:d2:e0:06:e1:dc:f9:09:a1:89:0f:a0:30:43:73:6f:0c:b6 If this information was correct, user space tools could expect queues 0 to 7 to receive twice more traffic than queues 8 to 15 After patch : lpaa24:~# ethtool -L eth1 rx 24 lpaa24:~# ethtool -x eth1 RX flow hash indirection table for eth1 with 24 RX ring(s): 0: 0 1 2 3 4 5 6 7 8: 8 9 10 11 12 13 14 15 RSS hash key: da:7b:09:60:f1:ac:67:b4:d0:72:d4:ec:a2:e5:80:0a:ad:50:22:1a:f8:f9:66:54:5f:22:45:c3:88:f4:57:82:c1:c1:90:ed:70:cb:40:ce lpaa24:~# ethtool -X eth1 equal 8 lpaa24:~# ethtool -x eth1 RX flow hash indirection table for eth1 with 24 RX ring(s): 0: 0 1 2 3 4 5 6 7 8: 0 1 2 3 4 5 6 7 RSS hash key: da:7b:09:60:f1:ac:67:b4:d0:72:d4:ec:a2:e5:80:0a:ad:50:22:1a:f8:f9:66:54:5f:22:45:c3:88:f4:57:82:c1:c1:90:ed:70:cb:40:ce Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Maciej Żenczykowski <maze@google.com> Cc: Eugenia Emantayev <eugenia@mellanox.com> Cc: Wei Wang <weiwan@google.com> Cc: Willem de Bruijn <willemb@google.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-09 23:39:46 -07:00
Eric Dumazet	7d71e994cd	net/mlx4_en: mlx4_en_netpoll() should schedule TX, not RX I am not sure mlx4_en_netpoll() is doing anything useful right now. mlx4 has different NAPI structures for RX and TX, and netpoll only wants to drain TX queues. Lets schedule NAPI polls on TX, not RX. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Maciej Żenczykowski <maze@google.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-09 22:24:16 -07:00
Eli Cohen	0ca00fc1f8	net/mlx5e: Fix blue flame quota logic Blue flame is a latency enhancement feature that allows the driver to write the packet data directly to the NIC's registers thus making the read of the packet data from host memory redundant. We maintain a quota for the blue flame which is reloaded whenever we identify that the hardware is processing send requests and processes them fast enough so by the time we post the next send request it was able to process all the pending ones. This indicates that the hardware is capable of processing more blue flame requests efficiently. The blue flame quota is decremented whenever we send using blue flame. The current code erroneously clears the budget if we did not use blue flame for the current post send operation and we fix it here. Fixes: `88a85f99e5` ('net/mlx5e: TX latency optimization to save DMA reads') Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-09 22:06:27 -07:00
Eran Ben Elisha	811afeaa37	net/mlx5e: Use ndo_stop explicitly at shutdown flow The current implementation copies the flow of ndo_stop instead of calling it explicitly, Fixed it. Fixes: `5fc7197d3a` ("net/mlx5: Add pci shutdown callback") Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-09 22:06:27 -07:00
Mohamad Haj Yahia	62e3c24ac4	net/mlx5: E-Switch, always set mc_promisc for allmulti vports Set the mc_promisc flag also in the case of adding new mc address to existing allmulti vport. Fixes: `a35f71f27a` ('net/mlx5: E-Switch, Implement promiscuous rx modes vf request handling') Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-09 22:06:26 -07:00
Noa Osherovich	23898c763f	net/mlx5: E-Switch, Modify node guid on vf set MAC In RoCE, the RDMA-CM needs the node guid to establish connection between nodes. Today, the node guid exposed to mlx5 Ethernet VFs is zero, therefore RDMA-CM on the VF is broken. Whenever the administrator sets a MAC for a VF, derive the node guid from it and set it as well in the following way: MAC: e4:1d:2d:b3:f4:01 -> node_guid: e4:1d:2d:ff:fe:b3:f4:01 Fixes: `77256579c6` ('net/mlx5: E-Switch, Introduce Vport...') Signed-off-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-09 22:06:26 -07:00
Mohamad Haj Yahia	25fff58cb2	net/mlx5: E-Switch, Fix vport enable flow Reorder vport enable flow to mark the vport as enabled before calling the vport change handler which was modified to handle the case for when vport is not enabled. This fixes the case for when the PF netdev is open before sriov is enabled, once sriov is enabled at esw_enable_vport, esw_vport_change_handle_locked didn't read the PF context since it thought the PF vport was not enabled. When we enable the vport, arming for events is not required anymore, since it's done on the vport change handle Fixes: `586cfa7f1d` ('net/mlx5: E-Switch, Use vport event handler for vport cleanup') Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-09 22:06:26 -07:00
Or Gerlitz	3f42ac6648	net/mlx5: E-Switch, Use the correct error check on returned pointers The mlx5 flow-steering API (mlx5_create_flow_table/group/rule) never returns null pointer on error. Even if it was doing that, checking for IS_ERR_OR_NULL(p) and then returning PTR_ERR(p) would have cause bugs, since PTR_ERR(NULL) --> success, crash. To make things more robust and protect against related future bugs, convert all IS_ERR_OR_NULL checks on returned values to IS_ERR. Fixes: `5742df0f7d` ('net/mlx5: E-Switch, Introduce VST vport ingress/egress ACLs') Fixes: `86d722ad2c` ('net/mlx5: Use flow steering infrastructure for mlx5_en') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reported-by: Ilya Lesokhin <ilyal@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-09 22:06:26 -07:00
Or Gerlitz	3fe3d819d5	net/mlx5: E-Switch, Use the correct free() function We must use kvfree() for something that could have been allocated with vzalloc(), do that. Fixes: `5742df0f7d` ('net/mlx5: E-Switch, Introduce VST vport ingress/egress ACLs') Fixes: `86d722ad2c` ('net/mlx5: Use flow steering infrastructure for mlx5_en') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reported-by: Ilya Lesokhin <ilyal@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-09 22:06:26 -07:00
Maor Gottlieb	bd02ef8eec	net/mlx5: Fix E-Switch flow steering capabilities check Add missing capabilities check for E-Switch FDB and ACLs flow tables before creating their namespace in flow steering. Fixes: `efdc810ba3` ('net/mlx5: Flow steering, Add vport ACL support') Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-09 22:06:26 -07:00
Maor Gottlieb	876d634d19	net/mlx5: Fix flow steering NIC capabilities check Flow steering infrastructure is currently used only on link layer ethernet, therefore the driver should initialize the flow steering when the device link layer is ethernet. In addition, add missing capability check before initializing the namespace of NIC RX flow tables. Fixes: `2530236303` ('net/mlx5_core: Flow steering tree initialization') Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-09 22:06:25 -07:00
Maor Gottlieb	2fee37a47c	net/mlx5: Fix root flow table update When we destroy the last flow table we need to update the root_ft to NULL. It fixes an issue for when the last flow table is destroyed and recreated again, root_ft pointer will not be updated, as a result traffic will be dropped. Fixes: `2cc43b494a` ('net/mlx5_core: Managing root flow table') Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-09 22:06:25 -07:00
Majd Dibbiny	9cd3411c42	net/mlx5: Fix masking of reserved bits in XRCD number Mask the reserved bits when reading the number of newly created XRCD. Fixes: `e126ba97db` ('mlx5: Add driver for Mellanox Connect-IB adapters') Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-09 22:06:25 -07:00
Ido Schimmel	d664b41e2a	mlxsw: spectrum: Don't sleep during ndo_get_phys_port_name() When rtnl_fill_ifinfo() is called for a certain netdevice it queries its various parameters such as switch id and physical port name. The function might get called in an atomic context, which means the underlying driver must not sleep during the query operation. Don't query the device and sleep during ndo_get_phys_port_name(), but instead store the needed parameters in port creation time. Fixes: `2bf9a58675` ("mlxsw: spectrum: Add support for physical port names") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-09 11:20:05 -07:00

1 2 3 4 5 ...

1802 Commits