linux

Author	SHA1	Message	Date
Eric Dumazet	bb07fafaec	net/mlx4_en: remove napi_hash_del() call There is no need calling napi_hash_del()+synchronize_rcu() before calling netif_napi_del() netif_napi_del() does this already. Using napi_hash_del() in a driver is useful only when dealing with a batch of NAPI structures, so that a single synchronize_rcu() can be used. mlx4_en_deactivate_cq() is deactivating a single NAPI. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-17 11:17:05 -05:00
Vadim Pasternak	d556e92916	mlxsw: minimal: Add I2C support for Mellanox ASICs Add I2C access support for Mellanox ASICs: - Virtual Protocol Interconnect switches SwitchX, SwitchX2, providing InfiniBand, Ethernet and Fibre Channel connectivity; - Infiniband switches SwitchIB, SwitchIB2: - Ethernet switch Spectrum. Example of probing activation: echo mlxsw_minimal 0x48 > /sys/bus/i2c/devices/i2c-2/new_device Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-16 23:29:04 -05:00
Vadim Pasternak	4ebd00bc5f	mlxsw: Invoke driver's init/fini methods only if defined We are going to add a minimal driver on top of the mlxsw core infrastructure, which will be mainly used for hardware monitoring in Baseboard management controller (BMC) installations. Unlike the switch drivers (e.g., spectrum, switchx2), this driver does not initialize the ASIC and therefore doesn't need to implement the init() and fini() methods in its 'mlxsw_driver' struct. Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-16 23:29:04 -05:00
Vadim Pasternak	6882b0aee1	mlxsw: Introduce support for I2C bus Add I2C bus implementation for Mellanox Technologies Switch ASICs. This includes command interface implementation using input / out mailboxes, whose location is retrieved from the firmware during probe time. Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-16 23:29:04 -05:00
Vadim Pasternak	c711e27a18	mlxsw: Add bus capability flag The mlxsw core infrastructure currently assumes that communication with the ASIC is always possible using Ethernet management datagrams (EMADs), but this is only possible when the PCI bus is used. The bus capability flag is added to indicate EMAD support and make core initialize EMAD communication only when it's set. Otherwise, register access is done using command interface. Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-16 23:29:04 -05:00
Ido Schimmel	d331d30360	mlxsw: spectrum_router: Adjust placement of FIB abort warning The recent merge commit `bb598c1b8c` ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net") would cause the FIB abort warning to fire whenever we flush the FIB tables - either during module removal or actual abort. Move it back to its rightful location in the FIB abort function. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-16 15:18:01 -05:00
Eric Dumazet	2e71328375	net/mlx4_en: use napi_complete_done() return value Do not rearm interrupts if we are busy polling. mlx4 uses separate CQ for TX and RX, so number of TX interrupts does not change, unfortunately. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Cc: Adam Belay <abelay@google.com> Cc: Tariq Toukan <tariqt@mellanox.com> Cc: Yuval Mintz <Yuval.Mintz@cavium.com> Cc: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-16 13:40:58 -05:00
Nicolas Pitre	d1cbfd771c	ptp_clock: Allow for it to be optional In order to break the hard dependency between the PTP clock subsystem and ethernet drivers capable of being clock providers, this patch provides simple PTP stub functions to allow linkage of those drivers into the kernel even when the PTP subsystem is configured out. Drivers must be ready to accept NULL from ptp_clock_register() in that case. And to make it possible for PTP to be configured out, the select statement in those driver's Kconfig menu entries is converted to the new "imply" statement. This way the PTP subsystem may have Kconfig dependencies of its own, such as POSIX_TIMERS, without having to make those ethernet drivers unavailable if POSIX timers are cconfigured out. And when support for POSIX timers is selected again then the default config option for PTP clock support will automatically be adjusted accordingly. The pch_gbe driver is a bit special as it relies on extra code in drivers/ptp/ptp_pch.c. Therefore we let the make process descend into drivers/ptp/ even if PTP_1588_CLOCK is unselected. Signed-off-by: Nicolas Pitre <nico@linaro.org> Acked-by: Richard Cochran <richardcochran@gmail.com> Acked-by: Edward Cree <ecree@solarflare.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: John Stultz <john.stultz@linaro.org> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Cc: Paul Bolle <pebolle@tiscali.nl> Cc: linux-kbuild@vger.kernel.org Cc: netdev@vger.kernel.org Cc: Michal Marek <mmarek@suse.com> Link: http://lkml.kernel.org/r/1478841010-28605-4-git-send-email-nicolas.pitre@linaro.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2016-11-16 09:26:34 +01:00
David S. Miller	bb598c1b8c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Several cases of bug fixes in 'net' overlapping other changes in 'net-next-. Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-15 10:54:36 -05:00
Ido Schimmel	ac571de999	mlxsw: spectrum_router: Flush FIB tables during fini Since commit `b45f64d16d` ("mlxsw: spectrum_router: Use FIB notifications instead of switchdev calls") we reflect to the device the entire FIB table and not only FIBs that point to netdevs created by the driver. During module removal, FIBs of the second type are removed following NETDEV_UNREGISTER events sent. The other FIBs are still present in both the driver's cache and the device's table. Fix this by iterating over all the FIB tables in the device and flush them. There's no need to take locks, as we're the only writer. Fixes: `b45f64d16d` ("mlxsw: spectrum_router: Use FIB notifications instead of switchdev calls") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-14 16:45:16 -05:00
Jiri Pirko	8d419324ef	mlxsw: spectrum_router: Add FIB abort warning Add a warning that the abort mechanism was triggered for device. Also avoid going through the procedure if abort was already done. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-13 22:36:59 -05:00
Jiri Pirko	f7ad3d4b83	mlxsw: reg: Fix pwm_frequency field size in MFCR register The field is 7bit long. Fix it. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-13 12:52:17 -05:00
Arkadi Sharshevsky	42cdb338f4	mlxsw: spectrum_router: Correctly dump neighbour activity The device's neighbour table is periodically dumped in order to update the kernel about active neighbours. A single dump session may span multiple queries, until the response carries less records than requested or when a record (can contain up to four neighbour entries) is not full. Current code stops the session when the number of returned records is zero, which can result in infinite loop in case of high packet rate. Fix this by stopping the session according to the above logic. Fixes: `c723c735fa` ("mlxsw: spectrum_router: Periodically update the kernel's neigh table") Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-13 12:51:00 -05:00
Yotam Gigi	2d644d4c75	mlxsw: spectrum: Fix refcount bug on span entries When binding port to a newly created span entry, its refcount is initialized to zero even though it has a bound port. That leads to unexpected behaviour when the user tries to delete that port from the span entry. Fix this by initializing the reference count to 1. Also add a warning to put function. Fixes: `763b4b70af` ("mlxsw: spectrum: Add support in matchall mirror TC offloading") Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-13 12:51:00 -05:00
Daniel Borkmann	c540594f86	bpf, mlx4: fix prog refcount in mlx4_en_try_alloc_resources error path Commit `67f8b1dcb9` ("net/mlx4_en: Refactor the XDP forwarding rings scheme") added a bug in that the prog's reference count is not dropped in the error path when mlx4_en_try_alloc_resources() is failing from mlx4_xdp_set(). We previously took bpf_prog_add(prog, priv->rx_ring_num - 1), that we need to release again. Earlier in the call path, dev_change_xdp_fd() itself holds a reference to the prog as well (hence the '- 1' in the bpf_prog_add()), so a simple atomic_sub() is safe to use here. When an error is propagated, then bpf_prog_put() is called eventually from dev_change_xdp_fd() Fixes: `67f8b1dcb9` ("net/mlx4_en: Refactor the XDP forwarding rings scheme") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-12 23:31:57 -05:00
Jiri Pirko	0e3715c9c2	mlxsw: spectrum_router: Ignore FIB notification events for non-init namespaces Since now, the table with same id in multiple netnamespaces were squashed to a single virtual router. That is not only incorrect, it also causes error messages when trying to use RALUE register to do double remove of FIB entries, like this one: mlxsw_spectrum 0000:03:00.0: EMAD reg access failed (tid=facb831c00007b20,reg_id=8013(ralue),type=write,status=7(bad parameter)) Since we don't allow ports to change namespaces (NETIF_F_NETNS_LOCAL), and the infrastructure is not yet prepared to handle netnamespaces, just ignore FIB notification events for non-init namespaces. That is clear to do since we don't need to offload them. Fixes: `b45f64d16d` ("mlxsw: spectrum_router: Use FIB notifications instead of switchdev calls") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-10 13:02:15 -05:00
Jiri Pirko	33b1341cd1	mlxsw: spectrum_router: Fix handling of neighbour structure __neigh_create function works in a different way than assumed. It passes "n" as a parameter to ndo_neigh_construct. But this "n" might be destroyed right away before __neigh_create() returns in case there is already another neighbour struct in the hashtable with the same dev and primary key. That is not expected by mlxsw_sp_router_neigh_construct() and the stored "n" points to freed memory, eventually leading to crash. Fix this by doing tight 1:1 coupling between neighbour struct and internal driver neigh_entry. That allows to narrow down the key in internal driver hashtable to do lookups by "n" only. Fixes: `6cf3c971dc` ("mlxsw: spectrum_router: Add private neigh table") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-10 13:02:15 -05:00
Hadar Hen Zion	a54e20b4fc	net/mlx5e: Add basic TC tunnel set action for SRIOV offloads In mlx5 HW, encapsulation is offloaded by the steering rule having index into an encapsulation table containing the entire set of headers to be added by the HW. The driver sets these headers in a buffer when we are offloading the action. The code maintains mlx5_encap_entry for each encap header it has encountered when attempted to offload TC tunnel set action. This entry maintains a linked list of all the flows sharing the same encap header, when the last flow is removed from the list the encap entry is removed. The actual encap_header is allocated by the driver in the hardware only if we have layer two neighbour info when the encap entry is created. While the flow is in the driver, the driver holds a reference on the neighbour. When a new flow with encap action is inserted, the code first checks if the required encap entry exists according to the tunnel set parameters. If it does the encap is shared, otherwise a new mlx5_encap_entry is created. TC action parsing implementation in the driver assumes that tunnel set action is provided in the same order set by the user, e.g before the mirred_redirect action. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-09 13:41:57 -05:00
Hadar Hen Zion	4a25730eb2	net/mlx5e: Add ndo_udp_tunnel_add to VF representors By implementing this ndo, the host stack will set the vxlan udp port also to VF representor netdevices. This will allow the TC offload code in the driver when it gets a tunnel key set action to identify the UDP port as vxlan, and hence the rule will be a candidate for offloading. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-09 13:41:56 -05:00
Hadar Hen Zion	bbd00f7e23	net/mlx5e: Add TC tunnel release action for SRIOV offloads Enhance the parsing of offloaded TC rules to set HW matching on outer (encapsulation) headers. Parse TC tunnel release action and set it as mlx5 decap action when the required capabilities are supported. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-09 13:41:56 -05:00
Hadar Hen Zion	66958ed906	net/mlx5: Support encap id when setting new steering entry In order to support steering rules which add encapsulation headers, encap_id parameter is needed. Add new mlx5_flow_act struct which holds action related parameter: action, flow_tag and encap_id. Use mlx5_flow_act struct when adding a new steering rule. This patch doesn't change any functionality. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-09 13:41:56 -05:00
Hadar Hen Zion	c9f1b073d0	net/mlx5: Add creation flags when adding new flow table When creating flow tables, allow the caller to specify creation flags. Currently no flags are used and as such this patch doesn't add any new functionality. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-09 13:41:56 -05:00
Hadar Hen Zion	43f93839e3	net/mlx5: Check max encap header size capability Instead of comparing to a const value, check the value of max encap header size capability as reported by the Firmware. Fixes: `575ddf5888` ('net/mlx5: Introduce alloc_encap and dealloc_encap commands') Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-09 13:41:55 -05:00
Hadar Hen Zion	ae9f83ac24	net/mlx5: Move alloc/dealloc encap commands declarations to common header file The alloc and dealloc encap commands will be used in the mlx5e driver, as such, declare them in a common header file. Also, rename the functions: mlx5_cmd_{de}alloc_encap is replaced with mlx5_encap_{de}alloc. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-09 13:41:55 -05:00
Tariq Toukan	f91d718156	Revert "net/mlx4_en: Fix panic during reboot" This reverts commit `9d2afba058`. The original issue would possibly exist if an external module tried calling our "ethtool_ops" without checking if it still exists. The right way of solving it is by simply doing the check in the caller side. Currently, no action is required as there's no such use case. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-09 13:29:32 -05:00
Wei Yongjun	b6e417977d	mlxsw: Remove unused including <generated/utsrelease.h> Remove including <generated/utsrelease.h> that don't need it. Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-05 16:22:57 -04:00
Huy Nguyen	0e97a34083	net/mlx5: Fix invalid pointer reference when prof_sel parameter is invalid When prof_sel is invalid, mlx5_core_warn is called but the mlx5_core_dev is not initialized yet. Solution is moving the prof_sel code after dev->pdev assignment Fixes: `2974ab6e8b` ('net/mlx5: Improve driver log messages') Signed-off-by: Huy Nguyen <huyn@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-04 14:59:37 -04:00
Or Gerlitz	ee39fbc444	net/mlx5: E-Switch, Set the actions for offloaded rules properly As for the current generation of the mlx5 HW (CX4/CX4-Lx) per flow vlan push/pop actions are emulated, we must not program them to the firmware. Fixes: `f5f8247609` ('net/mlx5: E-Switch, Support VLAN actions in the offloads mode') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reported-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-04 14:59:37 -04:00
Or Gerlitz	358d79a47e	net/mlx5e: Handle matching on vlan priority for offloaded TC rules We ignored the vlan priority in offloaded TC rules matching part, fix that. Fixes: `095b6cfd69` ('net/mlx5e: Add TC vlan match parsing') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reported-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-04 14:59:36 -04:00
Or Gerlitz	abd3277287	net/mlx5e: Disallow changing name-space for VF representors VF reps should be altogether on the same NS as they were created. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-04 14:59:36 -04:00
Saeed Mahameed	d7a0ecab38	net/mlx5e: Re-arrange XDP SQ/CQ creation In mlx5e_open_channel CQs must be created before napi is enabled. Here we move the XDP CQ creation to satisfy that fact. mlx5e_close_channel is already working according to the right order. Fixes: `b5503b994e` ("net/mlx5e: XDP TX forwarding support") Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reported-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-04 14:59:35 -04:00
Saeed Mahameed	87dc02551c	net/mlx5e: Fix XDP error path of mlx5e_open_channel() In case of mlx5e_open_rq fails the error handling will jump to label err_close_xdp_sq and will try to close the xdp_sq unconditionally. xdp_sq is valid only in case of XDP use cases, i.e priv->xdp_prog is not null. To fix this in this patch we test xdp_sq validity prior to closing it. In addition we now close the xdp_sq.cq as well. Fixes: `b5503b994e` ("net/mlx5e: XDP TX forwarding support") Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reported-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-04 14:59:35 -04:00
Elad Raz	5e5f89e70b	mlxsw: pci: Fix the FW ready mask length The system-status register is actually 16-bit wide and not 8 bit-wide. Fixes: `233fa44bd6` ("mlxsw: pci: Implement reset done check") Signed-off-by: Elad Raz <eladr@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-03 16:21:27 -04:00
Tariq Toukan	15fca2c8eb	net/mlx4_en: Add ethtool statistics for XDP cases XDP statistics are reported in ethtool, in total and per ring, as follows: - xdp_drop: the number of packets dropped by xdp. - xdp_tx: the number of packets forwarded by xdp. - xdp_tx_full: the number of times an xdp forward failed due to a full tx xdp ring. In addition, all packets that are dropped/forwarded by XDP are no longer accounted in rx_packets/rx_bytes of the ring, so that they count traffic that is passed to the stack. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-02 15:07:11 -04:00
Tariq Toukan	67f8b1dcb9	net/mlx4_en: Refactor the XDP forwarding rings scheme Separately manage the two types of TX rings: regular ones, and XDP. Upon an XDP set, do not borrow regular TX rings and convert them into XDP ones, but allocate new ones, unless we hit the max number of rings. Which means that in systems with smaller #cores we will not consume the current TX rings for XDP, while we are still in the num TX limit. XDP TX rings counters are not shown in ethtool statistics. Instead, XDP counters will be added to the respective RX rings in a downstream patch. This has no performance implications. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-02 15:07:11 -04:00
Tariq Toukan	ccc109b8ed	net/mlx4_en: Add TX_XDP for CQ types Support XDP CQ type, and refactor the CQ type enum. Rename the is_tx field to match the change. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-02 15:07:11 -04:00
Christophe Jaillet	42fb18fd58	net/mlx5: Simplify a test 'create_root_ns()' does not return an error pointer, so the test can be simplified to be more consistent. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-01 12:16:40 -04:00
Ido Schimmel	46d0847cdd	mlxsw: spectrum: Fix incorrect reuse of MID entries In the device, a MID entry represents a group of local ports, which can later be bound to a MDB entry. The lookup of an existing MID entry is currently done using the provided MC MAC address and VID, from the Linux bridge. However, this can result in an incorrect reuse of the same MID index in different VLAN-unaware bridges (same IP MC group and VID 0). Fix this by performing the lookup based on FID instead of VID, which is unique across different bridges. Fixes: `3a49b4fde2` ("mlxsw: Adding layer 2 multicast support") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Elad Raz <eladr@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-31 15:34:43 -04:00
David S. Miller	0a6ce1e3c1	Mellanox ConnectX-4/Connect-IB shared code (IB & ETH part) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJYFfxWAAoJEORje4g2clinr4MQAMUMoO4akLhppyUuLiT2ZZHc KshFwUF5RnZq6c2qXSxTVhfajyG6q71EwQNGaAMeGsjoCXM+u99Adgp/lYkXxbzY e4W3gCvjGWIik5d3JRVq07XutVpeJIc/Qvvc5zc1lUYNR5f71iBw538eG9ic4PXi 4/CpRvcsa8Z9sbtKPcjHwQQRd4ewx/KAD6QyOsVz9GgkBeNMYag3SO731DYSkjRC MYK85arNC1JUE/MHQKIfYQvjiJVfEyt2FvC8v9tW+bhzP6dAzxRY0yd8ZFJtCiYH GFGy8vdeCA/0dFRD5cYKPKBiFwUbRC8bt2lLC5ZoUic2nZ23LlO67uDkaPDRYckt oyhErFRX6Q/goqKFCI4tLUoSBF1bhy9EnbWyOWmcW7qpXRD3VCclS0Ctr++yJnv2 bhhlID56f+dX+rnW/OAERrk8MdVHo5xBUzQ8ZAAF3WDP9LqW+qlYVrEvrFqFIeFM OCGUbW2xsZaHMZRyx0K6068hy8O4EujjgC9PARi65rrZAAwxlDm4ElJEYvzXZVXA YoMeXiZGrhoj7+h8OorV0TyB+7mUgxFNlq0tCoi193QS+zIuQqf3XYNmuQGCUflF YdzZs7/9LpANN4e5yDTCq3CIr8yYv9sRrdnSX0iShvbFBAKClLxpjj2xkpayHFVR 8CRvlb5O1v1fIwSn2z/I =D5Ix -----END PGP SIGNATURE----- Merge tag 'shared-for-4.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma Saeed Mahameed says: ==================== Mellanox mlx5 core driver updates 2016-10-25 This series contains some updates and fixes of mlx5 core and IB drivers with the addition of two features that demand new low level commands and infrastructure updates. - SRIOV VF max rate limit support - mlx5e tc support for FWD rules with counter. Needed for both net and rdma subsystems. Updates and Fixes: From Saeed Mahameed (2): - mlx5 IB: Skip handling unknown mlx5 events - Add ConnectX-5 PCIe 4.0 VF device ID From Artemy Kovalyov (2): - Update struct mlx5_ifc_xrqc_bits - Ensure SRQ physical address structure endianness From Eugenia Emantayev (1): - Fix length of async_event_mask New Features: From Mohamad Haj Yahia (3): mlx5 SRIOV VF max rate limit support - Introduce TSAR manipulation firmware commands - Introduce E-switch QoS management - Add SRIOV VF max rate configuration support From Mark Bloch (7): mlx5e Tc support for FWD rule with counter - Don't unlock fte while still using it - Use fte status to decide on firmware command - Refactor find_flow_rule - Group similar rules under the same fte - Add multi dest support - Add option to add fwd rule with counter - mlx5e tc support for FWD rule with counter Mark here fixed two trivial issues with the flow steering core, and did some refactoring in the flow steering API to support adding mulit destination rules to the same hardware flow table entry at once. In the last two patches added the ability to populate a flow rule with a flow counter to the same flow entry. V2: Dropped some patches that added new structures without adding any usage of them. Added SRIOV VF max rate configuration support patch that introduces the usage of the TSAR infrastructure. Added flow steering fixes and refactoring in addition to mlx5 tc support for forward rule with counter. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-30 17:31:12 -04:00
Elad Raz	d1ba526384	mlxsw: switchib: Introduce SwitchIB and SwitchIB silicon driver SwitchIB and SwitchIB-2 are Infiniband switches with up to 36 ports. This driver initialize the hardware and Firmware which implements the IB management and connection with the SM. Signed-off-by: Elad Raz <eladr@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-30 16:50:17 -04:00
Elad Raz	64b92b0114	mlxsw: switchx2: Add IB port support SwitchX-2 is IB capable device. This patch add a support to change the port type between Ethernet and Infiniband. When the port is set to IB, the FW implements the Subnet Management Agent (SMA) manage the port. All port attributes can be control remotely by the SM. Usage: $ devlink port show pci/0000:03:00.0/1: type eth netdev eth0 pci/0000:03:00.0/3: type eth netdev eth1 pci/0000:03:00.0/5: type eth netdev eth2 pci/0000:03:00.0/6: type eth netdev eth3 pci/0000:03:00.0/8: type eth netdev eth4 $ devlink port set pci/0000:03:00.0/1 type ib $ devlink port show pci/0000:03:00.0/1: type ib Signed-off-by: Elad Raz <eladr@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-30 16:50:17 -04:00
Elad Raz	03ddb787c7	mlxsw: switchx2: Add eth prefix to port create and remove Since we are about to add Infiniband port remove and create we will add "eth" prefix to port create and remove APIs. Signed-off-by: Elad Raz <eladr@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-30 16:50:17 -04:00
Elad Raz	0c81ea5db2	mlxsw: core: Add port type (Eth/IB) set API Add "port_type_set" API to mlxsw core. The core layer send the change type callback to the port along with it's private information. Signed-off-by: Elad Raz <eladr@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-30 16:50:17 -04:00
Elad Raz	d808c7e490	mlxsw: core: Add "eth" prefix to mlxsw_core_port_set Since we are about to introduce IB port APIs, we will add prefixes to existing APIs. Signed-off-by: Elad Raz <eladr@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-30 16:50:17 -04:00
Elad Raz	c68b71cb2f	mlxsw: switchx2: Add Infiniband switch partition In order to put a port in Infiniband fabric it should be assigned to separate swid (Switch partition) that initialized as IB swid. Signed-off-by: Elad Raz <eladr@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-30 16:50:17 -04:00
Jiri Pirko	67963a33b4	mlxsw: Make devlink port instances independent of spectrum/switchx2 port instances Currently, devlink register/unregister is done directly from spectrum/switchx2 port create/remove functions. With a need to introduce a port type change, the devlink port instances have to be persistent across type changes, therefore across port create/remove function calls. So do a bit of reshuffling to achieve that. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Elad Raz <eladr@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-30 16:50:17 -04:00
Elad Raz	7136793e4a	mlxsw: reg: Add local-port to Infiniband port mapping In order to change a port type to Infiniband port we should change his mapping from local-port to Infiniband. Adding the PLIB (Port Local to InfiniBand) allows this mapping. Signed-off-by: Elad Raz <eladr@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-30 16:50:17 -04:00
Elad Raz	794177027b	mlxsw: reg: Add Infiniband support to PTYS In order to support Infiniband fabric, we need to introduce IB speeds and capabilities to PTYS emads. Signed-off-by: Elad Raz <eladr@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-30 16:50:17 -04:00
Elad Raz	401c8b4e3c	mlxsw: reg: Add eth prefix to PTYS pack and unpack We want to add Infiniband support to PTYS. In order to maintain proper conventions, we will change pack and unpack prefix to eth. Signed-off-by: Elad Raz <eladr@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-30 16:50:17 -04:00
Elad Raz	10dbf8f8ec	mlxsw: switchx2: Fix port speed configuration In SwitchX-2 we configure the port speed to negotiate with 40G link only. Add support for all other supported speeds. Fixes: `31557f0f97` ("mlxsw: Introduce Mellanox SwitchX-2 ASIC support") Signed-off-by: Elad Raz <eladr@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-30 16:50:17 -04:00
Elad Raz	e50e789ce1	mlxsw: switchx2: Add support for physical port names Export to userspace the front panel name of the port, so that udev can rename the ports accordingly. The convention suggested by switchdev documentation is used: pX Signed-off-by: Elad Raz <eladr@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-30 16:50:17 -04:00
Jiri Pirko	f83e21027b	mlxsw: spectrum: Move port used check outside port remove function Be symmentrical with create and do the check outside the remove function. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Elad Raz <eladr@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-30 16:50:17 -04:00
Jiri Pirko	adc4e04a4b	mlxsw: switchx2: Move port used check outside port remove function Be symmentrical with create and do the check outside the remove function. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Elad Raz <eladr@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-30 16:50:17 -04:00
Jiri Pirko	abc1de256c	mlxsw: switchx2: Check if port is usable before calling port create Do it in a same way we do it in spectrum. Check if port is usable first and only in that case create a port instance. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Elad Raz <eladr@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-30 16:50:17 -04:00
Elad Raz	5b0907407e	mlxsw: core: Zero payload buffers for couple of registers We recently discovered a bug in the firmware in which a field's length in one of the registers was incorrectly set. This caused the firmware to access garbage data that wasn't initialized by the driver and therefore emit error messages. While the bug is already fixed and the driver usually zeros the buffers passed to the firmware, there are a handful of cases where this isn't done. Zero the buffer in these cases and prevent similar bugs from recurring, as they tend to be hard to debug. Fixes: `52581961d8` ("mlxsw: core: Implement fan control using hwmon") Signed-off-by: Elad Raz <eladr@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-30 16:50:17 -04:00
David S. Miller	27058af401	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Mostly simple overlapping changes. For example, David Ahern's adjacency list revamp in 'net-next' conflicted with an adjacency list traversal bug fix in 'net'. Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-30 12:42:58 -04:00
Mark Bloch	e37a79e5d4	net/mlx5e: Add tc support for FWD rule with counter When creating a FWD rule using tc create also a HW counter for this rule. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2016-10-30 15:43:18 +02:00
Mark Bloch	ae05831424	net/mlx5: Add option to add fwd rule with counter Currently the code supports only drop rules to possess counters, add that ability also for fwd rules. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2016-10-30 15:43:17 +02:00
Mark Bloch	74491de937	net/mlx5: Add multi dest support Currently when calling mlx5_add_flow_rule we accept only one flow destination, this commit allows to pass multiple destinations. This change forces us to change the return structure to a more flexible one. We introduce a flow handle (struct mlx5_flow_handle), it holds internally the number for rules created and holds an array where each cell points the to a flow rule. From the consumers (of mlx5_add_flow_rule) point of view this change is only cosmetic and requires only to change the type of the returned value they store. From the core point of view, we now need to use a loop when allocating and deleting rules (e.g given to us a flow handler). Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2016-10-30 15:43:17 +02:00
Mark Bloch	a622498577	net/mlx5: Group similer rules under the same fte When adding a new rule, if we can match it with compare_match_value and flow tag we might be able to insert the rule to the same fte. In order to do that, there must be an overlap between the actions of the fte and the new rule. When updating the action of an existing fte, we must tell the firmware we are doing so. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2016-10-30 15:43:16 +02:00
Mark Bloch	814fb87541	net/mlx5: Refactor find_flow_rule The way we compare between two dests will need to be used in other places in the future, so we factor out the comparison logic between two dests into a separate function. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2016-10-30 15:43:15 +02:00
Mark Bloch	0501fc477c	net/mlx5: Use fte status to decide on firmware command An fte status becomes FS_FTE_STATUS_EXISTING only after it was created in HW. We can use this in order to simplify the logic on what firmware command to use. If the status isn't FS_FTE_STATUS_EXISTING we need to create the fte, otherwise we need only to update it. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2016-10-30 15:43:15 +02:00
Mark Bloch	0fd758d611	net/mlx5: Don't unlock fte while still using it When adding a new rule to an fte, we need to hold the fte lock until we add that rule to the fte and increase the fte ref count. Fixes: `0c56b97503` ("net/mlx5_core: Introduce flow steering API") Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2016-10-30 15:43:14 +02:00
Mohamad Haj Yahia	bd77bf1cb5	net/mlx5: Add SRIOV VF max rate configuration support Implement the vf set rate ndo by modifying the TSAR vport rate limit. Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2016-10-30 15:43:13 +02:00
Mohamad Haj Yahia	1bd27b11c1	net/mlx5: Introduce E-switch QoS management Add TSAR to the eswitch which will act as the vports rate limiter. Create/Destroy TSAR on Enable/Dsiable SRIOV. Attach/Detach vport to eswitch TSAR on Enable/Disable vport. Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2016-10-30 15:43:13 +02:00
Mohamad Haj Yahia	813f854053	net/mlx5: Introduce TSAR manipulation firmware commands TSAR (stands for Transmit Scheduling ARbiter) is a hardware component that is responsible for selecting the next entity to serve on the transmit path. The arbitration defines the QoS policy between the agents connected to the TSAR. The TSAR is a consist two main features: 1) BW Allocation between agents: The TSAR implements a defecit weighted round robin between the agents. Each agent attached to the TSAR is assigned with a weight and it is awarded transmission tokens according to this weight. 2) Rate limer per agent: Each agent attached to the TSAR is (optionally) assigned with a rate limit. TSAR will not allow scheduling for an agent exceeding its defined rate limit. In this patch we implement the API of manipulating the TSAR. Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2016-10-30 15:43:12 +02:00
Saeed Mahameed	86490d9a59	net/mlx5: Add ConnectX-5 PCIe 4.0 VF device ID For the mlx5 driver to support ConnectX-5 PCIe 4.0 VFs, we add the device ID "0x101a" to mlx5_core_pci_table. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2016-10-30 15:43:12 +02:00
Eugenia Emantayev	6887a825dc	net/mlx5: Fix length of async_event_mask According to PRM async_event_mask have to be 64 bits long. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2016-10-30 15:43:11 +02:00
Tariq Toukan	eb4b678825	net/mlx4_en: Save slave ethtool stats command Following the previous patch, as an optimization, the slave will not even bother sending the DUMP_ETH_STATS command over the comm channel. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-29 16:23:48 -04:00
Jack Morgenstein	d2582a0393	net/mlx4_en: Fix potential deadlock in port statistics flow mlx4_en_DUMP_ETH_STATS took the counter mutex and then called the FW command, with WRAPPED attribute. As a result, the fw command is wrapped on the Hypervisor when it calls mlx4_en_DUMP_ETH_STATS. The FW command wrapper flow on the hypervisor takes the slave_cmd_mutex during processing. At the same time, a VF could be in the process of coming up, and could call mlx4_QUERY_FUNC_CAP. On the hypervisor, the command flow takes the slave_cmd_mutex, then executes mlx4_QUERY_FUNC_CAP_wrapper. mlx4_QUERY_FUNC_CAP wrapper calls mlx4_get_default_counter_index(), which takes the counter mutex. DEADLOCK. The fix is that the DUMP_ETH_STATS fw command should be called with the NATIVE attribute, so that on the hypervisor, this command does not enter the wrapper flow. Since the Hypervisor no longer goes through the wrapper code, we also simply return 0 in mlx4_DUMP_ETH_STATS_wrapper (i.e.the function succeeds, but the returned data will be all zeroes). No need to test if it is the Hypervisor going through the wrapper. Fixes: `f9baff509f` ("mlx4_core: Add "native" argument to mlx4_cmd ...") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-29 16:23:48 -04:00
Eugenia Emantayev	6f2e0d2c3b	net/mlx4: Fix firmware command timeout during interrupt test Currently interrupt test that is part of ethtool selftest runs the check over all interrupt vectors of the device. In mlx4_en package part of interrupt vectors are uninitialized since mlx4_ib doesn't exist. This causes NOP FW command to time out. Change logic to test current port interrupt vectors only. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-29 16:23:48 -04:00
Jack Morgenstein	81d184199e	net/mlx4_core: Do not access comm channel if it has not yet been initialized In the Hypervisor, there are several FW commands which are invoked before the comm channel is initialized (in mlx4_multi_func_init). These include MOD_STAT_CONFIG, QUERY_DEV_CAP, INIT_HCA, and others. If any of these commands fails, say with a timeout, the Hypervisor driver enters the internal error reset flow. In this flow, the driver attempts to notify all slaves via the comm channel that an internal error has occurred. Since the comm channel has not yet been initialized (i.e., mapped via ioremap), this will cause dereferencing a NULL pointer. To fix this, do not access the comm channel in the internal error flow if it has not yet been initialized. Fixes: `55ad359225` ("net/mlx4_core: Enable device recovery flow with SRIOV") Fixes: `ab9c17a009` ("mlx4_core: Modify driver initialization flow to accommodate SRIOV for Ethernet") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-29 16:23:48 -04:00
Eugenia Emantayev	9d2afba058	net/mlx4_en: Fix panic during reboot Fix a kernel panic that occurs as a result of an asynchronous event handled in roce_gid_mgmt: mlx4_en_get_drvinfo is called and accesses freed resources. This happens in a shutdown flow only, since pci device is destroyed while netdevice is still alive. Fixes: `c27a02cd94` ("mlx4_en: Add driver for Mellanox ConnectX 10GbE NIC") Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-29 16:23:48 -04:00
Erez Shitrit	8d59de8f7b	net/mlx4_en: Process all completions in RX rings after port goes up Currently there is a race between incoming traffic and initialization flow. HW is able to receive the packets after INIT_PORT is done and unicast steering is configured. Before we set priv->port_up NAPI is not scheduled and receive queues become full. Therefore we never get new interrupts about the completions. This issue could happen if running heavy traffic during bringing port up. The resolution is to schedule NAPI once port_up is set. If receive queues were full this will process all cqes and release them. Fixes: `c27a02cd94` ("mlx4_en: Add driver for Mellanox ConnectX 10GbE NIC") Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-29 16:23:48 -04:00
Eugenia Emantayev	4850cf4581	net/mlx4_en: Resolve dividing by zero in 32-bit system When doing roundup_pow_of_two for large enough number with bit 31, an overflow will occur and a value equal to 1 will be returned. In this case 1 will be subtracted from the return value and division by zero will be reached. Fixes: `31c128b66e` ("net/mlx4_en: Choose time-stamping shift value according to HW frequency") Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-29 16:23:48 -04:00
Moshe Lazer	72da2e911f	net/mlx4_core: Change the default value of enable_qos Change the default status of quality of service back to disabled, as it hurts performance in some cases. Fixes: `38438f7c7e` ("net/mlx4: Set enhanced QoS support by default when ...") Signed-off-by: Moshe Lazer <moshel@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-29 16:23:48 -04:00
Maor Gottlieb	33a1f8b196	net/mlx4_core: Avoid setting ports to auto when only one port type is supported When only one port type is supported, it should be read only. We reject changing requests, even to the auto sense mode. Fixes: `27bf91d6a0` ("mlx4_core: Add link type autosensing") Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-29 16:23:48 -04:00
Jack Morgenstein	aa0c08feae	net/mlx4_core: Fix the resource-type enum in res tracker to conform to FW spec The resource type enum in the resource tracker was incorrect. RES_EQ was put in the position of RES_NPORT_ID (a FC resource). Since the remaining resources maintain their current values, and RES_EQ is not passed from slaves to the hypervisor in any FW command, this change affects only the hypervisor. Therefore, there is no backwards-compatibility issue. Fixes: `623ed84b1f` ("mlx4_core: initial header-file changes for SRIOV support") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-29 16:23:48 -04:00
Arnd Bergmann	a4256bc9ec	IB/mlx4: avoid a -Wmaybe-uninitialize warning There is an old warning about mlx4_SW2HW_EQ_wrapper on x86: ethernet/mellanox/mlx4/resource_tracker.c: In function ‘mlx4_SW2HW_EQ_wrapper’: ethernet/mellanox/mlx4/resource_tracker.c:3071:10: error: ‘eq’ may be used uninitialized in this function [-Werror=maybe-uninitialized] The problem here is that gcc won't track the state of the variable across a spin_unlock. Moving the assignment out of the lock is safe here and avoids the warning. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-29 14:53:48 -04:00
Noa Osherovich	6b276190c5	net/mlx5: Avoid passing dma address 0 to firmware Currently the firmware can't work with a page with dma address 0. Passing such an address to the firmware will cause the give_pages command to fail. To avoid this, in case we get a 0 dma address of a page from the dma engine, we avoid passing it to FW by remapping to get an address other than 0. Fixes: `bf0bf77f65` ('mlx5: Support communicating arbitrary host...') Signed-off-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-29 12:00:39 -04:00
Mohamad Haj Yahia	04c0c1ab38	net/mlx5: PCI error recovery health care simulation In case that the kernel PCI error handlers are not called, we will trigger our own recovery flow. The health work will give priority to the kernel pci error handlers to recover the PCI by waiting for a small period, if the pci error handlers are not triggered the manual recovery flow will be executed. We don't save pci state in case of manual recovery because it will ruin the pci configuration space and we will lose dma sync. Fixes: `89d44f0a6c` ('net/mlx5_core: Add pci error handlers to mlx5_core driver') Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-29 12:00:39 -04:00
Mohamad Haj Yahia	05ac2c0b74	net/mlx5: Fix race between PCI error handlers and health work Currently there is a race between the health care work and the kernel pci error handlers because both of them detect the error, the first one to be called will do the error handling. There is a chance that health care will disable the pci after resuming pci slot. Also create a separate WQ because now we will have two types of health works, one for the error detection and one for the recovery. Fixes: `89d44f0a6c` ('net/mlx5_core: Add pci error handlers to mlx5_core driver') Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-29 12:00:39 -04:00
Mohamad Haj Yahia	2241007b3d	net/mlx5: Clear health sick bit when starting health poll The health sick status should be cleared when we start the health poll. This is crucial for driver reload (unload + load) in order to behave right in case of health issue. Fixes: `fd76ee4da5` ('net/mlx5_core: Fix internal error detection conditions') Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-29 12:00:39 -04:00
Mohamad Haj Yahia	247f139cda	net/mlx5: Change the acl enable prototype to return status The Ingress/Egress ACL enable function may fail and it should return status to its caller to avoid NULL pointer dereference. Fixes: `f942380c12` ('net/mlx5: E-Switch, Vport ingress/egress ACLs rules for spoofchk') Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-29 12:00:39 -04:00
Mohamad Haj Yahia	5e1e93c704	net/mlx5e: Unregister netdev before detaching it Detaching the netdev before unregistering it cause some netdev cleanup ndos to fail because they check presence of the netdev, so we need to unregister the netdev first. Fixes: `26e59d8077` ('net/mlx5e: Implement mlx5e interface attach/detach callbacks') Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-29 12:00:39 -04:00
Saeed Mahameed	2b02955666	net/mlx5e: Choose best nearest LRO timeout Instead of predicting the index of the wanted LRO timeout value from hardware capabilities, look for the nearest LRO timeout value. Fixes: `5c50368f38` ('net/mlx5e: Light-weight netdev open/stop') Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-29 12:00:39 -04:00
Paul Blakey	e83d6955fe	net/mlx5: Correctly initialize last use of flow counters Currently, last use timestamp is initialized to zero. This is not the expected value by higher layers such as when we do TC action offloading. To fix that, set it to the current time, e.g when the counter/rule is offloaded. This is the same behaviour of non-offloaded TC actions. Fixes: `43a335e055` ('mlx5_core: Flow counters infrastructure') Signed-off-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-29 12:00:39 -04:00
Paul Blakey	32dba76a7a	net/mlx5: Fix autogroups groups num not decreasing Autogroups groups num is increased when creating a new flow group, but is never decreased. Now decreasing it when deleting a flow group. Fixes: `f0d22d1874` ('net/mlx5_core: Introduce flow steering autogrouped flow table') Signed-off-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-29 12:00:39 -04:00
Paul Blakey	eccec8da3b	net/mlx5: Keep autogroups list ordered Finding a new autogroup range is done by going over a group list sorted by each group start index. The search is stopped after finding the first free range. Adding the newly created group to the list is wrongly added to the end of the list regardless of its start index as the parameter of where to insert it is ignored. This commit makes sure to use that unused parameter to insert it where requested. Fixes: `f0d22d1874` ('net/mlx5_core: Introduce flow steering autogrouped flow table') Signed-off-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-29 12:00:39 -04:00
Daniel Jurgens	bba1574c2f	net/mlx5: Always Query HCA caps after setting them Always query the HCA caps after setting them to update the capablities data structures. Not doing so results in incorrect capabilities being reported including max_dc, max_qp and several others. Fixes: `59211bd3b6` ("net/mlx5: Split the load/unload flow into hardware and software flows") Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-29 12:00:39 -04:00
Daniel Jurgens	b47bd6ea40	{net, ib}/mlx5: Make cache line size determination at runtime. ARM 64B cache line systems have L1_CACHE_BYTES set to 128. cache_line_size() will return the correct size. Fixes: cf50b5efa2fe('net/mlx5_core/ib: New device capabilities handling.') Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-29 12:00:39 -04:00
Jiri Pirko	71fac305fb	mlxsw: switchx2: Set physical device for port netdevice Do this so the sysfs has "device" link correctly set. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-28 13:45:52 -04:00
Jiri Pirko	f20a91f1cc	mlxsw: spectrum: Set physical device for port netdevice Do this so the sysfs has "device" link correctly set. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-28 13:45:52 -04:00
Jiri Pirko	1d20d23c59	mlxsw: Move PCI id table definitions into driver modules So far, mlxsw_pci.ko is the module that registers PCI table for all drivers (spectrum and switchx2). That is problematic for example with dracut. Since mlxsw_spectrum.ko and mlxsw_switchx2.ko are loaded dynamically from within mlxsw_core.ko, dracut does not have track of them and avoids them from being included in initramfs. So make this in an ordinary way and define the PCI tables in individual driver modules, so it can be properly loaded and included in dracut initramfs image. As a side effect, this patch could remove no longer necessary driver "kind" strings which were used to link PCI ids with individual mlxsw drivers. Suggested-by: Ivan Vecera <ivecera@redhat.com> Tested-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-28 13:45:52 -04:00
Jiri Pirko	62e86f9e82	mlxsw: pci: Rename header with HW definitions pci.h needs to be used for inner function declarations. So move the original one to more appropriate name, pci_hw.h. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-28 13:45:51 -04:00
Ido Schimmel	8c9583a81c	mlxsw: spectrum: Remove extra whitespace Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-28 13:45:51 -04:00
Jiri Pirko	8b99becdc8	mlxsw: spectrum_router: Compare only trees which are in use during tree get Only trees which are in use should be compared to requested prefix usage. Fixes: `53342023ee` ("mlxsw: spectrum_router: Implement LPM trees management") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-28 13:43:56 -04:00
Jiri Pirko	2083d36790	mlxsw: spectrum_router: Save requested prefix bitlist when creating tree Currently, the prefix bitlist is not saved for LPM trees, causing the compare to always fail which causes the tree to be destroyed and created for every inserted and removed FIB entry. So fix this by saving the bitlist as it should have been done from the very beginning. Fixes: `53342023ee` ("mlxsw: spectrum_router: Implement LPM trees management") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-28 13:43:56 -04:00
Jiri Pirko	c1a3831121	mlxsw: Convert resources into array Since the number of resources is going to get much bigger, ease up the addition by simly defining IDs. Convert the existing structure members to a set array, one for validity, one for values. Introduce a set of getters and setters for easy access. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-23 17:21:29 -04:00
Jiri Pirko	f38a2314e5	mlxsw: cmd: Push resource query defines to cmd.h Push cmd resource query related defines to cmd.h where they belong. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-23 17:21:29 -04:00
Jiri Pirko	8e9658d567	mlxsw: reg: Generare register names automatically Extend the MLXSW_REG_DEFINE macro to store register name in string form. Use this string later on instead of hard coded string values. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-23 17:21:29 -04:00
Jiri Pirko	21978dcfc8	mlxsw: reg: Use helper macro to define registers Save some code and also prepare to easily carry name in string form. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-23 17:21:29 -04:00
Jiri Pirko	412791df39	mlxsw: item: Make char *buf arg constant for getters Enforce const for getter buf args. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-23 17:21:28 -04:00
Jiri Pirko	fe0612dc90	mlxsw: item: Make struct mlxsw_item args const These should be const, so enforce it. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-23 17:21:28 -04:00
Jarod Wilson	d894be57ca	ethernet: use net core MTU range checking in more drivers Somehow, I missed a healthy number of ethernet drivers in the last pass. Most of these drivers either were in need of an updated max_mtu to make jumbo frames possible to enable again. In a few cases, also setting a different min_mtu to match previous lower bounds. There are also a few drivers that had no upper bounds checking, so they're getting a brand new ETH_MAX_MTU that is identical to IP_MAX_MTU, but accessible by includes all ethernet and ethernet-like drivers all have already. acenic: - min_mtu = 0, max_mtu = 9000 amazon/ena: - min_mtu = 128, max_mtu = adapter->max_mtu amd/xgbe: - min_mtu = 0, max_mtu = 9000 sb1250: - min_mtu = 0, max_mtu = 1518 cxgb3: - min_mtu = 81, max_mtu = 65535 cxgb4: - min_mtu = 81, max_mtu = 9600 cxgb4vf: - min_mtu = 81, max_mtu = 65535 benet: - min_mtu = 256, max_mtu = 9000 ibmveth: - min_mtu = 68, max_mtu = 65535 ibmvnic: - min_mtu = adapter->min_mtu, max_mtu = adapter->max_mtu - remove now redundant ibmvnic_change_mtu jme: - min_mtu = 1280, max_mtu = 9202 mv643xx_eth: - min_mtu = 64, max_mtu = 9500 mlxsw: - min_mtu = 0, max_mtu = 65535 - Basically bypassing the core checks, and instead relying on dynamic checks in the respective switch drivers' ndo_change_mtu functions ns83820: - min_mtu = 0 - remove redundant ns83820_change_mtu, only checked for mtu > 1500 netxen: - min_mtu = 0, max_mtu = 8000 (P2), max_mtu = 9600 (P3) qlge: - min_mtu = 1500, max_mtu = 9000 - driver only supports setting mtu to 1500 or 9000, so the core check only rules out < 1500 and > 9000, qlge_change_mtu still needs to check that the value is 1500 or 9000 qualcomm/emac: - min_mtu = 46, max_mtu = 9194 xilinx_axienet: - min_mtu = 64, max_mtu = 9000 Fixes: `61e84623ac` ("net: centralize net_device min/max MTU checking") CC: netdev@vger.kernel.org CC: Jes Sorensen <jes@trained-monkey.org> CC: Netanel Belgazal <netanel@annapurnalabs.com> CC: Tom Lendacky <thomas.lendacky@amd.com> CC: Santosh Raspatur <santosh@chelsio.com> CC: Hariprasad S <hariprasad@chelsio.com> CC: Sathya Perla <sathya.perla@broadcom.com> CC: Ajit Khaparde <ajit.khaparde@broadcom.com> CC: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com> CC: Somnath Kotur <somnath.kotur@broadcom.com> CC: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> CC: John Allen <jallen@linux.vnet.ibm.com> CC: Guo-Fu Tseng <cooldavid@cooldavid.org> CC: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com> CC: Jiri Pirko <jiri@mellanox.com> CC: Ido Schimmel <idosch@mellanox.com> CC: Manish Chopra <manish.chopra@qlogic.com> CC: Sony Chacko <sony.chacko@qlogic.com> CC: Rajesh Borundia <rajesh.borundia@qlogic.com> CC: Timur Tabi <timur@codeaurora.org> CC: Anirudha Sarangi <anirudh@xilinx.com> CC: John Linn <John.Linn@xilinx.com> Signed-off-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-20 14:51:08 -04:00
Jiri Pirko	164c971d1c	mlxsw: pci: Fix reset wait for SwitchX2 SwitchX2 firmware does not implement reset done yet. Moreover, when busy-polled for ready magic, that slows down firmware and reset takes longer than the defined timeout, causing initialization to fail. So restore the previous behaviour and just sleep in this case. Fixes: `233fa44bd6` ("mlxsw: pci: Implement reset done check") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-20 11:11:11 -04:00
Elad Raz	7fb6a36bab	mlxsw: switchx2: Fix ethernet port initialization When creating an ethernet port fails, we must move the port to disable, otherwise putting the port in switch partition 0 (ETH) or 1 (IB) will always fails. Fixes: `31557f0f97` ("mlxsw: Introduce Mellanox SwitchX-2 ASIC support") Signed-off-by: Elad Raz <eladr@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-20 11:11:11 -04:00
Jiri Pirko	37956d78b8	mlxsw: spectrum_router: Make mlxsw_sp_router_fib4_del return void and remove warn The function return value is not checked anywhere. Also, the warning causes huge slowdown when removing large number of FIB entries which were not offloaded, because of ordering issue. Ido's preparing a patchset to fix the ordering issue, but that is definitelly not net tree material. Fixes: `b45f64d16d` ("mlxsw: spectrum_router: Use FIB notifications instead of switchdev calls") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-20 11:11:11 -04:00
Jiri Pirko	19271c1a08	mlxsw: spectrum_router: Use correct tree index for binding By a mistake, there is tree index 0 passed to RALTB. Should be MLXSW_SP_LPM_TREE_MIN. Fixes: `b45f64d16d` ("mlxsw: spectrum_router: Use FIB notifications instead of switchdev calls") Reported-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-20 11:11:10 -04:00
David Ahern	dd82364c3a	mlxsw: Flip to the new dev walk API Convert mlxsw users to new dev walk API. This is just a code conversion; no functional change is intended. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-18 11:44:59 -04:00
Jarod Wilson	b80f71f581	ethernet/mellanox: use core min/max MTU checking mlx4: min_mtu 46, max_mtu depends on hardware mlx5: min_mtu 68, max_mtu depends on hardware CC: netdev@vger.kernel.org CC: Tariq Toukan <tariqt@mellanox.com> CC: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-18 11:34:19 -04:00
Brenden Blanco	958b3d396d	net/mlx4_en: fixup xdp tx irq to match rx In cases where the number of tx rings is not a multiple of the number of rx rings, the tx completion event will be handled on a different core from the transmit and population of the ring. Races on the ring will lead to a double-free of the page, and possibly other corruption. The rings are initialized by default with a valid multiple of rings, based on the number of cpus, therefore an invalid configuration requires ethtool to change the ring layout. For instance 'ethtool -L eth0 rx 9 tx 8' will cause packets received on rx0, and XDP_TX'd to tx48, to be completed on cpu3 (48 % 9 == 3). Resolve this discrepancy by shifting the irq for the xdp tx queues to start again from 0, modulo rx_ring_num. Fixes: `9ecc2d8617` ("net/mlx4_en: add xdp forwarding and data write support") Reported-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Brenden Blanco <bblanco@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-14 11:13:00 -04:00
Shmulik Ladkani	5724b8b569	net/sched: tc_mirred: Rename public predicates 'is_tcf_mirred_redirect' and 'is_tcf_mirred_mirror' These accessors are used in various drivers that support tc offloading, to detect properties of a given 'tc_action'. 'is_tcf_mirred_redirect' tests that the action is TCA_EGRESS_REDIR. 'is_tcf_mirred_mirror' tests that the action is TCA_EGRESS_MIRROR. As a prep towards supporting INGRESS redir/mirror, rename these predicates to reflect their true meaning: s/is_tcf_mirred_redirect/is_tcf_mirred_egress_redirect/ s/is_tcf_mirred_mirror/is_tcf_mirred_egress_mirror/ Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com> Cc: Hariprasad S <hariprasad@chelsio.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Cc: Jiri Pirko <jiri@mellanox.com> Cc: Ido Schimmel <idosch@mellanox.com> Cc: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-14 10:23:06 -04:00
Tom Herbert	b8a4ddb2e8	net/mlx5: Add MLX5_ARRAY_SET64 to fix BUILD_BUG_ON I am hitting this in mlx5: drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c: In function reclaim_pages_cmd.clone.0: drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c:346: error: call to __compiletime_assert_346 declared with attribute error: BUILD_BUG_ON failed: __mlx5_bit_off(manage_pages_out, pas[i]) % 64 drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c: In function give_pages: drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c:291: error: call to __compiletime_assert_291 declared with attribute error: BUILD_BUG_ON failed: __mlx5_bit_off(manage_pages_in, pas[i]) % 64 Problem is that this is doing a BUILD_BUG_ON on a non-constant expression because of trying to take offset of pas[i] in the structure. Fix is to create MLX5_ARRAY_SET64 that takes an additional argument that is the field index to separate between BUILD_BUG_ON on the array constant field and the indexed field to assign the value to. There are two callers of MLX5_SET64 that are trying to get a variable offset, change those to call MLX5_ARRAY_SET64 passing 'pas' and 'i' as the arguments to use in the offset check and the indexed value assignment. Fixes: `a533ed5e17` ("net/mlx5: Pages management commands via mlx5 ifc") Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-13 10:13:24 -04:00
Linus Torvalds	b9044ac829	Merge of primary rdma-core code for 4.9 - Updates to mlx5 - Updates to mlx4 (two conflicts, both minor and easily resolved) - Updates to iw_cxgb4 (one conflict, not so obvious to resolve, proper resolution is to keep the code in cxgb4_main.c as it is in Linus' tree as attach_uld was refactored and moved into cxgb4_uld.c) - Improvements to uAPI (moved vendor specific API elements to uAPI area) - Add hns-roce driver and hns and hns-roce ACPI reset support - Conversion of all rdma code away from deprecated create_singlethread_workqueue - Security improvement: remove unsafe ib_get_dma_mr (breaks lustre in staging) -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJX+AwSAAoJELgmozMOVy/d0WkQAKxPzVccMWwHv28iZI4ey13u JwE+VoCNpCAZAVuEgzK5zzFdNHPvAk2jU93H4apA7dfXJBXPatVuj9Lnk+ieEEnW tbFwJjBpbQ3Zol3+SPfAHnsVMbtax+xmd6WDKExPXXEDl1L6rutwL3KKfmgWEitg ysX7XOJCiSdyM0hcg4T6UPB9a3jGPff9NLu0oGamV+yoUk5Y0WGoVFxHZ4MKcw8t OkFBYIxGz4SGwq2tulStuH03HteURX594KngtrA8dyq6l1R2GlGRv+bkJAUEIWUv aA0ow3VWusOM6fT+jLXPCv8iUwIXM8tR/U6F7X+cmORUUtWvCl+uCUVid113j/aN BK+Af2nJnfoJ5cDBPsD+bC76l5gQycNZO/Qh8op2kmgJtD+6OpGM3cBXsHx53+kk 0wloJ2lKCGShWxNj+ig8n8rR/rhhs/x3vV3ouCVWNMbOUgOSN3eYHxmK3wGFW4nd Qx+WYCjj9Yi/J6nmUDcfEQ4NWPR22Q2+0ENAabfhLhV6mDloAO5ILHd4GDqC3IA9 UtxlVjf4ZonaiLnTQQzCnDMGVVk6tT8FJ9D42s0ScwjbdYwjyCW9/rs/g2EhcprR Cc+AmjqLviCWGtzBSFO0SijqQon8lcQOwdLw61CdFFvPa/mlLdf1rbx9ArIyNVKn JSrbr3CGyoqyYj6qaEO5 =LC+S -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull main rdma updates from Doug Ledford: "This is the main pull request for the rdma stack this release. The code has been through 0day and I had it tagged for linux-next testing for a couple days. Summary: - updates to mlx5 - updates to mlx4 (two conflicts, both minor and easily resolved) - updates to iw_cxgb4 (one conflict, not so obvious to resolve, proper resolution is to keep the code in cxgb4_main.c as it is in Linus' tree as attach_uld was refactored and moved into cxgb4_uld.c) - improvements to uAPI (moved vendor specific API elements to uAPI area) - add hns-roce driver and hns and hns-roce ACPI reset support - conversion of all rdma code away from deprecated create_singlethread_workqueue - security improvement: remove unsafe ib_get_dma_mr (breaks lustre in staging)" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (75 commits) staging/lustre: Disable InfiniBand support iw_cxgb4: add fast-path for small REG_MR operations cxgb4: advertise support for FR_NSMR_TPTE_WR IB/core: correctly handle rdma_rw_init_mrs() failure IB/srp: Fix infinite loop when FMR sg[0].offset != 0 IB/srp: Remove an unused argument IB/core: Improve ib_map_mr_sg() documentation IB/mlx4: Fix possible vl/sl field mismatch in LRH header in QP1 packets IB/mthca: Move user vendor structures IB/nes: Move user vendor structures IB/ocrdma: Move user vendor structures IB/mlx4: Move user vendor structures IB/cxgb4: Move user vendor structures IB/cxgb3: Move user vendor structures IB/mlx5: Move and decouple user vendor structures IB/{core,hw}: Add constant for node_desc ipoib: Make ipoib_warn ratelimited IB/mlx4/alias_GUID: Remove deprecated create_singlethread_workqueue IB/ipoib_verbs: Remove deprecated create_singlethread_workqueue IB/ipoib: Remove deprecated create_singlethread_workqueue ...	2016-10-09 17:04:33 -07:00
Jack Morgenstein	fd10ed8e6f	IB/mlx4: Fix possible vl/sl field mismatch in LRH header in QP1 packets In MLX qp packets, the LRH (built by the driver) has both a VL field and an SL field. When building a QP1 packet, the VL field should reflect the SLtoVL mapping and not arbitrarily contain zero (as is done now). This bug causes credit problems in IB switches at high rates of QP1 packets. The fix is to cache the SL to VL mapping in the driver, and look up the VL mapped to the SL provided in the send request when sending QP1 packets. For FW versions which support generating a port_management_config_change event with subtype sl-to-vl-table-change, the driver uses that event to update its sl-to-vl mapping cache. Otherwise, the driver snoops incoming SMP mads to update the cache. There remains the case where the FW is running in secure-host mode (so no QP0 packets are delivered to the driver), and the FW does not generate the sl2vl mapping change event. To support this case, the driver updates (via querying the FW) its sl2vl mapping cache when running in secure-host mode when it receives either a Port Up event or a client-reregister event (where the port is still up, but there may have been an opensm failover). OpenSM modifies the sl2vl mapping before Port Up and Client-reregister events occur, so if there is a mapping change the driver's cache will be properly updated. Fixes: `225c7b1fee` ("IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-10-07 16:54:38 -04:00
Yotam Gigi	251d41c58b	mlxsw: switchx2: Fix misuse of hard_header_len In order to specify that the mlxsw switchx2 driver needs additional headroom for packets, there have been use of the hard_header_len field of the netdevice struct. This commit changes that to use needed_headroom instead, as this is the correct way to do that. Fixes: `31557f0f97` ("mlxsw: Introduce Mellanox SwitchX-2 ASIC support") Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Acked-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-04 20:27:54 -04:00
Yotam Gigi	feb7d387a6	mlxsw: spectrum: Fix misuse of hard_header_len In order to specify that the mlxsw spectrum driver needs additional headroom for packets, there have been use of the hard_header_len field of the netdevice struct. This commit changes that to use needed_headroom instead, as this is the correct way to do that. Fixes: `56ade8fe3f` ("mlxsw: spectrum: Add initial support for Spectrum ASIC") Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Acked-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-04 20:27:54 -04:00
Arnd Bergmann	ab58070569	mlxsw: spectrum_router: avoid potential uninitialized data usage If fi->fib_nhs is zero, the router interface pointer is uninitialized, as shown by this warning: drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c: In function 'mlxsw_sp_router_fib_event': drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:1674:21: error: 'r' may be used uninitialized in this function [-Werror=maybe-uninitialized] drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:1643:23: note: 'r' was declared here This changes the loop so we handle the case the same way as finding no router interface pointer attached to one of the nexthops to ensure we always trap here instead of using uninitialized data. Fixes: `b45f64d16d` ("mlxsw: spectrum_router: Use FIB notifications instead of switchdev calls") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-03 01:55:18 -04:00
Arnd Bergmann	d0debb76df	net/mlx5e: shut up maybe-uninitialized warning Build-testing this driver with -Wmaybe-uninitialized gives a new false-positive warning that I can't really explain: drivers/net/ethernet/mellanox/mlx5/core/en_tc.c: In function 'mlx5e_configure_flower': drivers/net/ethernet/mellanox/mlx5/core/en_tc.c:509:3: error: 'old_attr' may be used uninitialized in this function [-Werror=maybe-uninitialized] It's obvious from the code that 'old_attr' is initialized whenever 'old' is non-NULL here. The warning appears with all versions I tested from gcc-4.7 through gcc-6.1, and I could not come up with a way to rewrite the function in a more readable way that avoids the warning, so I'm adding another initialization to shut it up. Fixes: `8b32580df1` ("net/mlx5e: Add TC vlan action for SRIOV offloads") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-03 01:55:18 -04:00
Calvin Owens	803783849f	mlx5: Add ndo_poll_controller() implementation This implements ndo_poll_controller in net_device_ops callbacks for mlx5, which is necessary to use netconsole with this driver. Acked-By: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Calvin Owens <calvinowens@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-30 02:11:16 -04:00
David Decotigny	5038056e6b	mlx4: remove unused fields This also can address following UBSAN warnings: [ 36.640343] ================================================================================ [ 36.648772] UBSAN: Undefined behaviour in drivers/net/ethernet/mellanox/mlx4/fw.c:857:26 [ 36.656853] shift exponent 64 is too large for 32-bit type 'int' [ 36.663348] ================================================================================ [ 36.671783] ================================================================================ [ 36.680213] UBSAN: Undefined behaviour in drivers/net/ethernet/mellanox/mlx4/fw.c:861:27 [ 36.688297] shift exponent 35 is too large for 32-bit type 'int' [ 36.694702] ================================================================================ Tested: reboot with UBSAN, no warning. Signed-off-by: David Decotigny <decot@googlers.com> Acked-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-30 01:56:41 -04:00
Jiri Pirko	b45f64d16d	mlxsw: spectrum_router: Use FIB notifications instead of switchdev calls Until now, in order to offload a FIB entry to HW we use switchdev op. However that has limits. Mainly in case we need to make the HW aware of all route prefixes configured in kernel. HW needs to know those in order to properly trap appropriate packets and pass the to kernel to do the forwarding. Abort mechanism is now handled within the mlxsw driver. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-28 04:48:00 -04:00
Colin Ian King	faac0ff0a5	mlxsw: spectrum: remove redundant check if err is zero There is an earlier check and return if err is non-zero, so the check to see if it is zero is redundant in every iteration of the loop and hence the check can be removed. Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-24 08:28:23 -04:00
Moshe Shemesh	b42959dc35	net/mlx4: Add VF vlan protocol 802.1ad support Move the vf to VST 802.1ad mode (mlx4 VST QinQ mode) by setting vf vlan protocol to 802.1ad. VST 802.1ad mode in mlx4, is used for STAG strip/insertion by PF, while the CTAG is set by the VF. Read current vlan protocol as part of the vf configuration state. Upon setting vf vlan protocol to 802.1ad, we use a mechanism of handshake to verify that both the vf and the pf driver version support it. The handshake uses the command QUERY_FUNC_CAP: - The vf sets a pre-defined support bit in input modifier. - A pf that supports the feature sends the request to the vf through a pre-defined field in the output mailbox. - In case vf does not support the feature, the pf will fail the control command (in this case, IP link tool command to set the vf vlan protocol to 802.1ad). No change in VST 802.1Q mode. Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-24 08:01:27 -04:00
Moshe Shemesh	79aab093a0	net: Update API for VF vlan protocol 802.1ad support Introduce new rtnl UAPI that exposes a list of vlans per VF, giving the ability for user-space application to specify it for the VF, as an option to support 802.1ad. We adjusted IP Link tool to support this option. For future use cases, the new UAPI supports multiple vlans. For now we limit the list size to a single vlan in kernel. Add IFLA_VF_VLAN_LIST in addition to IFLA_VF_VLAN to keep backward compatibility with older versions of IP Link tool. Add a vlan protocol parameter to the ndo_set_vf_vlan callback. We kept 802.1Q as the drivers' default vlan protocol. Suitable ip link tool command examples: Set vf vlan protocol 802.1ad: ip link set eth0 vf 1 vlan 100 proto 802.1ad Set vf to VST (802.1Q) mode: ip link set eth0 vf 1 vlan 100 proto 802.1Q Or by omitting the new parameter ip link set eth0 vf 1 vlan 100 Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-24 08:01:26 -04:00
Moshe Shemesh	0815fe3a86	net/mlx4_en: Disable vlan HW acceleration when in VF vlan protocol 802.1ad mode In Ethernet VF, disable vlan HW acceleration on VF while it is set to VF vlan protocol 802.1ad mode. Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-24 08:01:26 -04:00
Moshe Shemesh	7c3d21c815	net/mlx4_core: Preparation for VF vlan protocol 802.1ad Check device capability to support VF vlan protocol 802.1ad mode. Add vport attribute vlan protocol. Init vport vlan protocol by default to 802.1Q. Add update QP support for VF vlan protocol 802.1ad. Add func capability vlan_offload_disable to disable all vlan HW acceleration on VF while the VF is set to VF vlan protocol 802.1ad mode. No change in VF vlan protocol 802.1Q (VST) mode. Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-24 08:01:26 -04:00
Moshe Shemesh	c9cc599a96	net/mlx4_core: Fix QUERY FUNC CAP flags Separate QUERY_FUNC_CAP flags0 from QUERY_FUNC_CAP flags, as 'flags' is already used for another set of flags in FUNC CAP, while phv bit should be part of a different set of flags. Remove QUERY_FUNC_CAP port_flags field, as it is not in use. Fixes: `77fc29c4bb` ('net/mlx4_core: Preparations for 802.1ad VLAN support') Fixes: `5cc914f108` ('mlx4_core: Added FW commands and their wrappers for supporting SRIOV') Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-24 08:01:26 -04:00
Or Gerlitz	095b6cfd69	net/mlx5e: Add TC vlan match parsing Enhance the parsing of offloaded TC rules matches to handle vlans. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-23 07:22:12 -04:00
Or Gerlitz	8b32580df1	net/mlx5e: Add TC vlan action for SRIOV offloads Parse TC vlan actions and set the required elements to allow offloading. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-23 07:22:12 -04:00
Or Gerlitz	f5f8247609	net/mlx5: E-Switch, Support VLAN actions in the offloads mode Many virtualization systems use a policy under which a vlan tag is pushed to packets sent by guests, and popped before the packet is forwarded to the VM. The current generation of the mlx5 HW doesn't fully support that on a per flow level. As such, we are addressing the above common use case with the SRIOV e-Switch abilities to push vlan into packets sent by VFs and pop vlan from packets forwarded to VFs. The HW can match on the correct vlan being present in packets forwarded to VFs (eSwitch steering is done before stripping the tag), so this part is offloaded as is. A common practice for vlans is to avoid both push vlan and pop vlan for inter-host VM/VM (east-west) communication because in this case, push on egress cancels out with pop on ingress. For supporting that, we use a global eswitch vlan pop policy, hence allowing guest A to communicate with both remote VM B and local VM C. This works since the HW pops the vlan only if it exists (e.g for C --> A packets but not for B --> A packets). On the slow path, when a VF vport has an offloaded flow which involves pushing vlans, wheres another flow is not currently offloaded, the packets from the 2nd flow seen by the VF representor on the host have vlan. The VF rep driver removes such vlan before calling into the host networking stack. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-23 07:22:12 -04:00
Or Gerlitz	8515c581df	net/mlx5e: Refactor retrival of skb from rx completion element (cqe) Factor the relevant code into a static inline helper (skb_from_cqe) doing that. Move the call to napi_gro_receive to be carried out just after mlx5e_complete_rx_cqe returns. Both changes are to be used for the VF representor as well in the next commit. This patch doesn't change any functionality. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-23 07:22:12 -04:00
Or Gerlitz	776b12b674	net/mlx5: Put elements related to offloaded TC rule in one struct Put the representors related to the source and dest vports and the action in struct mlx5_esw_flow_attr which is used while setting the FDB rule. This patch doesn't change any functionality. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-23 07:22:12 -04:00
Or Gerlitz	e33dfe316c	net/mlx5: E-Switch, Allow fine tuning of eswitch vport push/pop vlan The HW can be programmed to push vlan, pop vlan or both. A factorization step towards using the push/pop capabilties in the eswitch offloads mode. This patch doesn't add new functionality. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-23 07:22:12 -04:00
Or Gerlitz	bac9b6aa1d	net/mlx5: E-Switch, Set vport representor fields explicitly on registration The structure we use for the eswitch vport representor (mlx5_eswitch_rep) has some fields which are set from upper layers in the driver when they register the rep. Use explicit setting on registration time for them and avoid global memcpy. This patch doesn't add new functionality. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-23 07:22:12 -04:00
Or Gerlitz	9deb2241f1	net/mlx5: E-Switch, Set the vport when registering the uplink rep Set the vport value in the PF entry to be that of the uplink so we can use it blindly over the tc / eswitch offload code without translating it each time we deal with the uplink representor. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-23 07:22:12 -04:00
David S. Miller	d6989d4bbe	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2016-09-23 06:46:57 -04:00
Saeed Mahameed	35b510e257	net/mlx5e: XDP TX xmit more Previously we rang XDP SQ doorbell on every forwarded XDP packet. Here we introduce a xmit more like mechanism that will queue up more than one packet into SQ (up to RX napi budget) w/o notifying the hardware. Once RX napi budget is consumed and we exit napi RX loop, we will flush (doorbell) all XDP looped packets in case there are such. XDP forward packet rate: Comparing XDP with and w/o xmit more (bulk transmit): RX Cores XDP TX XDP TX (xmit more) --------------------------------------------------- 1 6.5Mpps 12.4Mpps 2 13.2Mpps 24.2Mpps 4 25.2Mpps 36.3Mpps* 8 36.3Mpps* 36.3Mpps* *My xmitter was limited to 36.3Mpps, so it is the bottleneck. It seems that receive side can handle more. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-22 02:51:41 -04:00
Saeed Mahameed	b5503b994e	net/mlx5e: XDP TX forwarding support Adding support for XDP_TX forwarding from xdp program. Using XDP, now user can loop packets out of the same port. We create a dedicated TX SQ for each channel that will serve XDP programs that return XDP_TX action to loop packets back to the wire directly from the channel RQ RX path. For that RX pages will now need to be mapped bi-directionally, and on XDP_TX action we will sync the page back to device then queue it into SQ for transmission. The XDP xmit frame function will report back to the RX path if the page was consumed (transmitted), if so, RX path will forget about that page as if it were released to the stack. Later on, on XDP TX completion, the page will be released back to the page cache. For simplicity this patch will hit a doorbell on every XDP TX packet. Next patch will introduce a xmit more like mechanism that will queue up more than one packet into SQ w/o notifying the hardware, once RX napi loop is done we will hit doorbell once for all XDP TX packets form the previous loop. This should drastically improve XDP TX performance. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-22 02:51:41 -04:00
Saeed Mahameed	f10b7cc770	net/mlx5e: Have a clear separation between different SQ types Make a clear separate between Regular SQ (TXQ) and ICO SQ creation, destruction and union their mutual information structures. Don't allocate redundant TXQ skb/wqe_info/dma_fifo arrays for ICO SQ. And have a different SQ edge for ICO SQ than TXQ SQ, to be more accurate. In preparation for XDP TX support. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-22 02:51:41 -04:00
Rana Shahout	86994156c7	net/mlx5e: XDP fast RX drop bpf programs support Add support for the BPF_PROG_TYPE_PHYS_DEV hook in mlx5e driver. When XDP is on we make sure to change channels RQs type to MLX5_WQ_TYPE_LINKED_LIST rather than "striding RQ" type to ensure "page per packet". On XDP set, we fail if HW LRO is set and request from user to turn it off. Since on ConnectX4-LX HW LRO is always on by default, this will be annoying, but we prefer not to enforce LRO off from XDP set function. Full channels reset (close/open) is required only when setting XDP on/off. When XDP set is called just to exchange programs, we will update each RQ xdp program on the fly and for synchronization with current data path RX activity of that RQ, we temporally disable that RQ and ensure RX path is not running, quickly update and re-enable that RQ, for that we do: - rq.state = disabled - napi_synnchronize - xchg(rq->xdp_prg) - rq.state = enabled - napi_schedule // Just in case we've missed an IRQ Packet rate performance testing was done with pktgen 64B packets and on TX side and, TC drop action on RX side compared to XDP fast drop. CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz Comparison is done between: 1. Baseline, Before this patch with TC drop action 2. This patch with TC drop action 3. This patch with XDP RX fast drop RX Cores Baseline(TC drop) TC drop XDP fast Drop -------------------------------------------------------------- 1 5.3Mpps 5.3Mpps 16.5Mpps 2 10.2Mpps 10.2Mpps 31.3Mpps 4 20.5Mpps 19.9Mpps 36.3Mpps* *My xmitter was limited to 36.3Mpps, so it is the bottleneck. It seems that receive side can handle more. Signed-off-by: Rana Shahout <ranas@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-22 02:51:41 -04:00
Saeed Mahameed	2fc4bfb725	net/mlx5e: Dynamic RQ type infrastructure Add two helper functions to allow dynamic changes of RQ type. mlx5e_set_rq_priv_params and mlx5e_set_rq_type_params will be used on netdev creation to determine the default RQ type. This will be needed later for downstream patches of XDP support. When enabling XDP we will dynamically move from striding RQ to linked list RQ type. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-22 02:51:40 -04:00
Saeed Mahameed	e4b8550807	net/mlx5e: Slightly reduce hardware LRO size Before this patch LRO size was 64K, now with build_skb requires extra room, headroom + sizeof(skb_shared_info) added to the data buffer will make wqe size or page_frag_size slightly larger than 64K which will demand order 5 page instead of order 4 in 4K page systems. We take those extra bytes from hardware LRO data size in order to not increase the required page order for when hardware LRO is enabled. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-22 02:51:40 -04:00
Saeed Mahameed	21c59685dd	net/mlx5e: Union RQ RX info per RQ type We have two types of RX RQs, and they use two separate sets of info arrays and structures in RX data path function. Today those structures are mutually exclusive per RQ type, hence one kind is allocated on RQ creation according to the RQ type. For better cache locality and to minimalize the sizeof(struct mlx5e_rq), in this patch we define them as a union. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-22 02:51:40 -04:00
Saeed Mahameed	1bfecfca56	net/mlx5e: Build RX SKB on demand For non-striding RQ configuration before this patch we had a ring with pre-allocated SKBs and mapped the SKB->data buffers for device. For robustness and better RX data buffers management, we allocate a page per packet and build_skb around it. This patch (which is a prerequisite for XDP) will actually reduce performance for normal stack usage, because we are now hitting a bottleneck in the page allocator. We use the page-cache to restore or even improve performance in comparison to the old RX scheme. Packet rate performance testing was done with pktgen 64B packets on xmit side and TC ingress dropping action on RX side. CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz Comparison is done between: 1.Baseline, before 'net/mlx5e: Build RX SKB on demand' 2.Build SKB with RX page cache (This patch) RX Cores Baseline Build SKB+page-cache Improvement ----------------------------------------------------------- 1 4.16Mpps 5.33Mpps 28% 2 7.16Mpps 10.24Mpps 43% 4 13.61Mpps 20.51Mpps 51% 8 25.32Mpps 32.00Mpps 26% All respective cores were 100% utilized. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-22 02:51:40 -04:00
Nicolas Pitre	efee95f42b	ptp_clock: future-proofing drivers against PTP subsystem becoming optional Drivers must be ready to accept NULL from ptp_clock_register() if the PTP clock subsystem is configured out. This patch documents that and ensures that all drivers cope well with a NULL return. Signed-off-by: Nicolas Pitre <nico@linaro.org> Reviewed-by: Eugenia Emantayev <eugenia@mellanox.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Acked-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-22 02:18:33 -04:00
Kamal Heib	fba1296624	net/mlx4_core: Fix to clean devlink resources This patch cleans devlink resources by calling devlink_port_unregister() to avoid the following issues: - Kernel panic when triggering reset flow. - Memory leak due to unfreed resources in mlx4_init_port_info(). Fixes: `09d4d087cd` ("mlx4: Implement devlink interface") Signed-off-by: Kamal Heib <kamalh@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-22 01:41:27 -04:00
Jack Morgenstein	a7e1f04905	net/mlx4_core: Fix deadlock when switching between polling and event fw commands When switching from polling-based fw commands to event-based fw commands, there is a race condition which could cause a fw command in another task to hang: that task will keep waiting for the polling sempahore, but may never be able to acquire it. This is due to mlx4_cmd_use_events, which "down"s the sempahore back to 0. During driver initialization, this is not a problem, since no other tasks which invoke FW commands are active. However, there is a problem if the driver switches to polling mode and then back to event mode during normal operation. The "test_interrupts" feature does exactly that. Running "ethtool -t <eth device> offline" causes the PF driver to temporarily switch to polling mode, and then back to event mode. (Note that for VF drivers, such switching is not performed). Fix this by adding a read-write semaphore for protection when switching between modes. Fixes: `225c7b1fee` ("IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-21 21:52:43 -04:00
Leon Romanovsky	30353bfc43	net/mlx4_core: Use RCU to perform radix tree lookup for SRQ Radix tree lookup can be performed without locking. Fixes: `225c7b1fee` ("IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters") Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Suggested-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-21 21:52:43 -04:00
Kamal Heib	57c970c2e8	net/mlx4_en: Fix wrong indentation Use tabs instead of spaces before if statement, no functional change. Fixes: `e7c1c2c462` ("mlx4_en: Added self diagnostics test implementation") Signed-off-by: Kamal Heib <kamalh@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-21 21:52:43 -04:00
Tariq Toukan	de3d6fa81e	net/mlx4_en: Add branch prediction hints in RX data-path Add likely/unlikely hints to improve branch predictions in the RX data-path. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-21 21:52:43 -04:00
Nogah Frankel	8f8a62d462	mlxsw: spectrum: Implement max rif resource Replace max rif const with using the result from resource query. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-21 01:00:59 -04:00
Nogah Frankel	274df7fb77	mlxsw: pci: Add max router interface resource Add the max number of rif (router interfaces) to resource query. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-21 01:00:59 -04:00
Nogah Frankel	e44d49cbbc	mlxsw: pci: Add some miscellaneous resources Add max system ports, max regions and max vlan groups to resource query. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-21 01:00:59 -04:00
Nogah Frankel	9497c042bf	mlxsw: spectrum: Implement max virtual routers resource Replace max virtual routers const with the result from the resource query. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-21 01:00:59 -04:00
Nogah Frankel	b8a09f0a09	mlxsw: pci: Add max virtual routers resource Add the max number of virtual routers to resource query. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-21 01:00:59 -04:00
Nogah Frankel	403547d38d	mlxsw: profile: Add KVD resources to profile config Use resources from resource query to determine values for the profile configuration. Add KVD determined section sizes to the resources struct. Change the profile struct and value to match this changes. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-21 01:00:58 -04:00
Nogah Frankel	2acd10c51b	mlxsw: pci: Add KVD size relate resources Add KVD size, and minimum sizes for the single and double sections resources to resources query. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-21 01:00:58 -04:00
Nogah Frankel	ce0bd2b0c5	mlxsw: spectrum: lag resources- use resources data instead of consts Use max lag and max ports in lag resources as the result of resource query instead of using const to save them. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-21 01:00:58 -04:00
Nogah Frankel	9f7f797c1d	mlxsw: pci: Add lag related resources to resources query Add max lag and max ports in lag resources to resources query. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-21 01:00:58 -04:00
Or Gerlitz	4bdcc6ca21	mlxsw: spectrum: Make offloads stats functions static The offloads stats functions are local to this file, make them static. Fixes: `fc1bbb0f18` ('mlxsw: spectrum: Implement offload stats ndo [..]') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-21 00:45:59 -04:00
Jesper Dangaard Brouer	5737f6c926	mlx4: add missed recycle opportunity for XDP_TX on TX failure Correct drop handling for XDP_TX on TX failure, were recently added in commit `95357907ae` ("mlx4: fix XDP_TX is acting like XDP_PASS on TX ring full"). The change missed an opportunity for recycling the RX page, instead of going through the page allocator, like the regular XDP_DROP action does. This patch cease the opportunity, by going through the XDP_DROP case. Fixes: `95357907ae` ("mlx4: fix XDP_TX is acting like XDP_PASS on TX ring full") Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-20 04:58:17 -04:00
Ido Schimmel	1a9234e66e	mlxsw: spectrum: Fix sparse warnings drivers/net/ethernet/mellanox/mlxsw//spectrum.c:251:28: warning: symbol 'mlxsw_sp_span_entry_find' was not declared. Should it be static? drivers/net/ethernet/mellanox/mlxsw//spectrum.c:265:28: warning: symbol 'mlxsw_sp_span_entry_get' was not declared. Should it be static? drivers/net/ethernet/mellanox/mlxsw//spectrum.c:367:56: warning: mixing different enum types drivers/net/ethernet/mellanox/mlxsw//spectrum.c:367:56: int enum mlxsw_sp_span_type versus drivers/net/ethernet/mellanox/mlxsw//spectrum.c:367:56: int enum mlxsw_reg_mpar_i_e ... drivers/net/ethernet/mellanox/mlxsw//spectrum_buffers.c:598:32: warning: mixing different enum types drivers/net/ethernet/mellanox/mlxsw//spectrum_buffers.c:598:32: int enum mlxsw_reg_sbxx_dir versus drivers/net/ethernet/mellanox/mlxsw//spectrum_buffers.c:598:32: int enum devlink_sb_pool_type drivers/net/ethernet/mellanox/mlxsw//spectrum_buffers.c:600:39: warning: mixing different enum types drivers/net/ethernet/mellanox/mlxsw//spectrum_buffers.c:600:39: int enum mlxsw_reg_sbpr_mode versus drivers/net/ethernet/mellanox/mlxsw//spectrum_buffers.c:600:39: int enum devlink_sb_threshold_type ... drivers/net/ethernet/mellanox/mlxsw//spectrum_router.c:255:54: warning: mixing different enum types drivers/net/ethernet/mellanox/mlxsw//spectrum_router.c:255:54: int enum mlxsw_sp_l3proto versus drivers/net/ethernet/mellanox/mlxsw//spectrum_router.c:255:54: int enum mlxsw_reg_ralxx_protocol ... drivers/net/ethernet/mellanox/mlxsw//spectrum_router.c:1749:6: warning: symbol 'mlxsw_sp_fib_entry_put' was not declared. Should it be static? Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-20 04:32:50 -04:00
Elad Raz	18c2d2c113	mlxsw: Change the RX LAG hash function from XOR to CRC Change the RX hash function from XOR to CRC in order to have better distribution of the traffic. Signed-off-by: Elad Raz <eladr@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-20 04:32:50 -04:00
Or Gerlitz	6c419ba8e2	net/mlx5: E-Switch, Handle mode change failures E-switch mode changes involve creating HW tables, potentially allocating netdevices, etc, and things can fail. Add an attempt to rollback to the existing mode when changing to the new mode fails. Only if rollback fails, getting proper SRIOV functionality requires module unload or sriov disablement/enablement. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-19 22:10:16 -04:00
Or Gerlitz	4eea37d7b9	net/mlx5: E-Switch, Fix error flow in the SRIOV e-switch init code When enablement of the SRIOV e-switch in certain mode (switchdev or legacy) fails, we must set the mode to none. Otherwise, we'll run into double free based crashes when further attempting to deal with the e-switch (such as when disabling sriov or unloading the driver). Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-19 22:10:15 -04:00
Roi Dayan	babd6134a5	net/mlx5: Fix flow counter bulk command out mailbox allocation The FW command output length should be only the length of struct mlx5_cmd_fc_bulk out field. Failing to do so will cause the memcpy call which is invoked later in the driver to write over wrong memory address and corrupt kernel memory which results in random crashes. This bug was found using the kernel address sanitizer (kasan). Fixes: `a351a1b03b` ('net/mlx5: Introduce bulk reading of flow counters') Signed-off-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-19 22:10:15 -04:00
Baoyou Xie	766a0e978f	net/mlx5: clean function declarations in eswitch.c up We get 2 warnings when building kernel with W=1: drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c:463:5: warning: no previous prototype for 'esw_offloads_init' [-Wmissing-prototypes] drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c:521:6: warning: no previous prototype for 'esw_offloads_cleanup' [-Wmissing-prototypes] In fact, both functions are declared in drivers/net/ethernet/mellanox/mlx5/core/eswitch.c,but should be declared in a header file, thus can be recognized in other file. So this patch moves the declarations into drivers/net/ethernet/mellanox/mlx5/core/eswitch.h Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-19 21:52:10 -04:00
Jesper Dangaard Brouer	95357907ae	mlx4: fix XDP_TX is acting like XDP_PASS on TX ring full The XDP_TX action can fail transmitting the frame in case the TX ring is full or port is down. In case of TX failure it should drop the frame, and not as now call 'break' which is the same as XDP_PASS. Fixes: `9ecc2d8617` ("net/mlx4_en: add xdp forwarding and data write support") Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Reviewed-by: Brenden Blanco <bblanco@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-19 01:32:19 -04:00
Nogah Frankel	fc1bbb0f18	mlxsw: spectrum: Implement offload stats ndo and expose HW stats by default Change the default statistics ndo to return HW statistics (like the one returned by ethtool_ops). The HW stats are collected to a cache by delayed work every 1 sec. Implement the offload stat ndo. Add a function to get SW statistics, to be called from this function. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-18 22:33:42 -04:00
Tariq Toukan	4415a0319f	net/mlx5e: Implement RX mapped page cache for page recycle Instead of reallocating and mapping pages for RX data-path, recycle already used pages in a per ring cache. Performance tests: The following results were measured on a freshly booted system, giving optimal baseline performance, as high-order pages are yet to be fragmented and depleted. We ran pktgen single-stream benchmarks, with iptables-raw-drop: Single stride, 64 bytes: * 4,739,057 - baseline * 4,749,550 - order0 no cache * 4,786,899 - order0 with cache 1% gain Larger packets, no page cross, 1024 bytes: * 3,982,361 - baseline * 3,845,682 - order0 no cache * 4,127,852 - order0 with cache 3.7% gain Larger packets, every 3rd packet crosses a page, 1500 bytes: * 3,731,189 - baseline * 3,579,414 - order0 no cache * 3,931,708 - order0 with cache 5.4% gain Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-17 09:51:40 -04:00
Tariq Toukan	a5a0c59016	net/mlx5e: Introduce API for RX mapped pages Manage the allocation and deallocation of mapped RX pages only through dedicated API functions. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-17 09:51:40 -04:00
Tariq Toukan	7e42667170	net/mlx5e: Single flow order-0 pages for Striding RQ To improve the memory consumption scheme, we omit the flow that demands and splits high-order pages in Striding RQ, and stay with a single Striding RQ flow that uses order-0 pages. Moving to fragmented memory allows the use of larger MPWQEs, which reduces the number of UMR posts and filler CQEs. Moving to a single flow allows several optimizations that improve performance, especially in production servers where we would anyway fallback to order-0 allocations: - inline functions that were called via function pointers. - improve the UMR post process. This patch alone is expected to give a slight performance reduction. However, the new memory scheme gives the possibility to use a page-cache of a fair size, that doesn't inflate the memory footprint, which will dramatically fix the reduction and even give a performance gain. Performance tests: The following results were measured on a freshly booted system, giving optimal baseline performance, as high-order pages are yet to be fragmented and depleted. We ran pktgen single-stream benchmarks, with iptables-raw-drop: Single stride, 64 bytes: * 4,739,057 - baseline * 4,749,550 - this patch no reduction Larger packets, no page cross, 1024 bytes: * 3,982,361 - baseline * 3,845,682 - this patch 3.5% reduction Larger packets, every 3rd packet crosses a page, 1500 bytes: * 3,731,189 - baseline * 3,579,414 - this patch 4% reduction Fixes: `461017cb00` ("net/mlx5e: Support RX multi-packet WQE (Striding RQ)") Fixes: `bc77b240b3` ("net/mlx5e: Add fragmented memory support for RX multi packet WQE") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-17 09:51:40 -04:00
Sebastian Ott	2a292822f0	net/mlx4_en: fix off by one in error handling If an error occurs in mlx4_init_eq_table the index used in the err_out_unmap label is one too big which results in a panic in mlx4_free_eq. This patch fixes the index in the error path. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-16 04:16:22 -04:00
Ido Schimmel	b9d66a36aa	mlxsw: spectrum: Add support for new ethtool API Remove the deprecated {get,set}_settings callbacks and instead add {get,set}_link_ksettings along with support for newly available speeds. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-13 12:16:34 -04:00
Ido Schimmel	91bdc7a43a	mlxsw: spectrum: Indicate support of multiple port types The device can support multiple port types, so don't return on first match. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-13 12:16:33 -04:00
Ido Schimmel	0213424ada	mlxsw: spectrum: Report port type according to operational speed In case port isn't operational we shouldn't report the port type, but instead return PORT_OTHER. This is consistent with most other drivers that return PORT_OTHER when media type can't be determined. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-13 12:16:33 -04:00
Ido Schimmel	4149b97f72	mlxsw: spectrum: Report link partner's advertised speeds If autonegotiation was performed successfully, then we should report the link partner's advertised speeds instead of the operational speed of the port. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-13 12:16:33 -04:00
Ido Schimmel	0c83f88c02	mlxsw: spectrum: Correctly report autonegotiation Up until now the device always reported autonegotiation to be off although it was on by default. Allow the user to disable / enable autonegotiation and report its status correctly. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-13 12:16:33 -04:00
David S. Miller	b20b378d49	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/mediatek/mtk_eth_soc.c drivers/net/ethernet/qlogic/qed/qed_dcbx.c drivers/net/phy/Kconfig All conflicts were cases of overlapping commits. Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-12 15:52:44 -07:00
Moshe Shemesh	7a61fc86af	net/mlx4_en: Fix panic on xmit while port is down When port is down, tx drop counter update is not needed. Updating the counter in this case can cause a kernel panic as when the port is down, ring can be NULL. Fixes: `63a664b7e9` ("net/mlx4_en: fix tx_dropped bug") Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-11 19:40:26 -07:00
Tariq Toukan	564ed9b187	net/mlx4_en: Fixes for DCBX This patch adds a capability check before enabling DCBX. In addition, it re-organizes the relevant data structures, and fixes a typo in a define. Fixes: `af7d518526` ("net/mlx4_en: Add DCB PFC support through CEE netlink commands") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-11 19:40:26 -07:00
Kamal Heib	c677071741	net/mlx4_en: Fix the return value of mlx4_en_dcbnl_set_state() mlx4_en_dcbnl_set_state() returns u8, the return value from mlx4_en_setup_tc() could be negative in case of failure, so fix that. Fixes: `af7d518526` ("net/mlx4_en: Add DCB PFC support through CEE netlink commands") Signed-off-by: Kamal Heib <kamalh@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-11 19:40:25 -07:00
Kamal Heib	74a9e90544	net/mlx4_en: Fix the return value of mlx4_en_dcbnl_set_all() mlx4_en_dcbnl_set_all() returns u8, so return value can't be negative in case of failure. Fixes: `af7d518526` ("net/mlx4_en: Add DCB PFC support through CEE netlink commands") Signed-off-by: Kamal Heib <kamalh@mellanox.com> Signed-off-by: Rana Shahout <ranas@mellanox.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-11 19:40:25 -07:00
Mohamad Haj Yahia	f1ee87fe55	net/mlx5: Organize device list API in one place Hide the exposed (external) mlx5_dev_list and mlx5_intf_mutex and expose an organized modular API to manage and manipulate mlx5 devices list. Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-10 21:21:50 -07:00
Mohamad Haj Yahia	9df30601c8	net/mlx5e: Restore vlan filter after seamless reset When detaching the mlx5e interface clear all the vlans rules from the vlan flow table. When attaching it back restore all the active vlans rules to the HW. Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-10 21:21:50 -07:00
Mohamad Haj Yahia	26e59d8077	net/mlx5e: Implement mlx5e interface attach/detach callbacks Needed to support seamless and lightweight PCI/Internal error recovery. Implement the attach/detach interface callbacks. In attach callback we only allocate HW resources. In detach callback we only deallocate HW resources. All SW/kernel objects initialzing/destroying is kept in add/remove callbacks. Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-10 21:21:50 -07:00
Mohamad Haj Yahia	1ab2068a4c	net/mlx5: Implement vports admin state backup/restore Save the user configuration in the vport sturct. Restore saved old configuration upon vport enable. Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-10 21:21:50 -07:00
Mohamad Haj Yahia	c2d6e31a00	net/mlx5: Align sriov/eswitch modules with the new load/unload flow. Init/cleanup sriov/eswitch in the core software context init/cleanup flows. Attach/detach sriov/eswitch in the core load/unload flows. Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-10 21:21:50 -07:00
Mohamad Haj Yahia	62a9b90ad8	net/mlx5: Implement eswitch attach/detach flows Needed for lightweight and modular internal/pci error handling. Implement eswitch attach function which allocates/starts hw related resources. Implement eswitch detach function which releases/stops hw related resources. Init/cleanup function only handle eswitch software context allocation and destruction. Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-10 21:21:49 -07:00
Mohamad Haj Yahia	acab721b5d	net/mlx5: Implement SRIOV attach/detach flows Needed for lightweight and modular internal/pci error handling. Implement sriov attach function which enables pre-saved number of vfs on the device side. Implement sriov detach function which disable the current vfs on the device side. Init/cleanup function only handles sriov software context allocation and destruction. Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-10 21:21:49 -07:00
Mohamad Haj Yahia	59211bd3b6	net/mlx5: Split the load/unload flow into hardware and software flows Gather all software context creating/destroying in one function and call it once in the first load and in the last unload. load/unload functions will now receive indication if we need to create/destroy the software contexts. In internal/pci error do the unload/load flows without releasing the software objects. In this way we perserve the sw core state and it help us restoring old driver state after PCI error/shutdown. Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-10 21:21:49 -07:00
Mohamad Haj Yahia	737a234bb6	net/mlx5: Introduce attach/detach to interface API Add attach/detach callbacks to interface API. This is crucial for implementing seamless reset flow which releases the hardware and it's resources upon detach while keeping software structures and state (e.g netdev) then reset and reallocate the hardware needed resources upon attach. Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-10 21:21:49 -07:00
Mohamad Haj Yahia	6b6adee3da	net/mlx5: SRIOV core code refactoring Simplify the code and makes it look modular and symmetric. Split sriov enable/disable to two levels: device level and pci level. When user enable/disable sriov (via sriov_configure driver callback) we will enable/disable both device and pci sriov. When driver load/unload we will enable/disable (on demand) only device sriov while keeping the PCI sriov enabled for next driver load. On internal/pci error, VFs will be kept enabled on PCI and the reset is done only in device level. Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-10 21:21:49 -07:00
Mohamad Haj Yahia	d62292e850	net/mlx5: Skip waiting for vf pages in internal error In case of device in internal error state there is no need to wait for vf pages since they will be reclaimed manually later in the unload flow. Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-10 21:21:48 -07:00
Ido Schimmel	3247ff2b31	mlxsw: spectrum: Set port type before setting its address During port init, we currently set the port's type to Ethernet after setting its MAC address. However, the hardware documentation states this should be the other way around. Align the driver with the hardware documentation and set the port's MAC address after setting its type. Fixes: `56ade8fe3f` ("mlxsw: spectrum: Add initial support for Spectrum ASIC") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-09 16:56:53 -07:00
Jiri Pirko	40d2590455	mlxsw: spectrum_router: Fix error path in mlxsw_sp_router_init When neigh_init fails, we have to do proper cleanup including router_fini call. Fixes: `6cf3c971dc` ("mlxsw: spectrum_router: Add private neigh table") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-09 16:56:53 -07:00
Gal Pressman	cd17d230dd	net/mlx5e: Fix parsing of vlan packets when updating lro header Currently vlan tagged packets were not parsed correctly and assumed to be regular IPv4/IPv6 packets. We should check for 802.1Q/802.1ad tags and update the lro header accordingly. This fixes the use case where LRO is on and rxvlan is off (vlan stripping is off). Fixes: `e586b3b0ba` ('net/mlx5: Ethernet Datapath files') Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-08 16:15:29 -07:00
Gal Pressman	4e39883d9c	net/mlx5e: Fix global PFC counters replication Currently when reading global PFC statistics we left the counter iterator out of the equation and we ended up reading the same counter over and over again. Instead of reading the counter at index 0 on every iteration we now read the counter at index (i). Fixes: `e989d5a532` ('net/mlx5e: Expose flow control counters to ethtool') Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-08 16:15:29 -07:00
Gal Pressman	7abc211077	net/mlx5e: Prevent casting overflow On 64 bits architectures unsigned long is longer than u32, casting to unsigned long will result in overflow. We need to first allocate an unsigned long variable, then assign the wanted value. Fixes: `665bc53969` ('net/mlx5e: Use new ethtool get/set link ksettings API') Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-08 16:15:29 -07:00
Tariq Toukan	0dbf657c39	net/mlx5e: Fix xmit_more counter race issue Update the xmit_more counter before notifying the HW, to prevent a possible use-after-free of the skb. Fixes: `c8cf78fe10` ("net/mlx5e: Add ethtool counter for TX xmit_more") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-08 16:15:28 -07:00
Brenden Blanco	326fe02d1e	net/mlx4_en: protect ring->xdp_prog with rcu_read_lock Depending on the preempt mode, the bpf_prog stored in xdp_prog may be freed despite the use of call_rcu inside bpf_prog_put. The situation is possible when running in PREEMPT_RCU=y mode, for instance, since the rcu callback for destroying the bpf prog can run even during the bh handling in the mlx4 rx path. Several options were considered before this patch was settled on: Add a napi_synchronize loop in mlx4_xdp_set, which would occur after all of the rings are updated with the new program. This approach has the disadvantage that as the number of rings increases, the speed of update will slow down significantly due to napi_synchronize's msleep(1). Add a new rcu_head in bpf_prog_aux, to be used by a new bpf_prog_put_bh. The action of the bpf_prog_put_bh would be to then call bpf_prog_put later. Those drivers that consume a bpf prog in a bh context (like mlx4) would then use the bpf_prog_put_bh instead when the ring is up. This has the problem of complexity, in maintaining proper refcnts and rcu lists, and would likely be harder to review. In addition, this approach to freeing must be exclusive with other frees of the bpf prog, for instance a _bh prog must not be referenced from a prog array that is consumed by a non-_bh prog. The placement of rcu_read_lock in this patch is functionally the same as putting an rcu_read_lock in napi_poll. Actually doing so could be a potentially controversial change, but would bring the implementation in line with sk_busy_loop (though of course the nature of those two paths is substantially different), and would also avoid future copy/paste problems with future supporters of XDP. Still, this patch does not take that opinionated option. Testing was done with kernels in either PREEMPT_RCU=y or CONFIG_PREEMPT_VOLUNTARY=y+PREEMPT_RCU=n modes, with neither exhibiting any drawback. With PREEMPT_RCU=n, the extra call to rcu_read_lock did not show up in the perf report whatsoever, and with PREEMPT_RCU=y the overhead of rcu_read_lock (according to perf) was the same before/after. In the rx path, rcu_read_lock is eventually called for every packet from netif_receive_skb_internal, so the napi poll call's rcu_read_lock is easily amortized. v2: Remove extra rcu_read_lock in mlx4_en_process_rx_cq body Annotate xdp_prog with __rcu, and convert all usages to rcu_assign or rcu_dereference[_protected] as appropriate. Add explicit mutex lock around rcu_assign instead of xchg loop. Fixes: `d576acf0a2` ("net/mlx4_en: add page recycle to prepare rx ring for tx support") Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <alexei.starovoitov@gmail.com> Signed-off-by: Brenden Blanco <bblanco@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-06 13:39:33 -07:00
Ido Schimmel	aad8b6bae7	mlxsw: spectrum: Use existing flood setup when adding VLANs When a VLAN is added on a bridge port we should use the existing unicast flood configuration of the port instead of assuming it's enabled. Fixes: `0293038e0c` ("mlxsw: spectrum: Add support for flood control") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-01 09:44:56 -07:00
Ido Schimmel	f1de7a28d5	mlxsw: spectrum: Don't take multiple references on a FID In commit `14d39461b3` ("mlxsw: spectrum: Use per-FID struct for the VLAN-aware bridge") I added a per-FID struct, which member ports can take a reference on upon VLAN membership configuration. However, sometimes only the VLAN flags (e.g. egress untagged) are toggled without changing the VLAN membership. In these cases we shouldn't take another reference on the FID. Fixes: `14d39461b3` ("mlxsw: spectrum: Use per-FID struct for the VLAN-aware bridge") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-01 09:44:56 -07:00
Jiri Pirko	e732263849	mlxsw: spectrum_router: Fix netevent notifier registration Currently the notifier is registered for every asic instance, however the same block. Fix this by moving the registration to module init. Fixes: `c723c735fa` ("mlxsw: spectrum_router: Periodically update the kernel's neigh table") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-01 09:44:56 -07:00
Jiri Pirko	de7d62952b	mlxsw: spectrum: Fix error path in mlxsw_sp_module_init Add forgotten notifier unregister. Fixes: `99724c18fc` ("mlxsw: spectrum: Introduce support for router interfaces") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-01 09:44:56 -07:00
Jiri Pirko	7146da3181	mlxsw: spectrum_router: Fix fib entry update path Originally, I expected that there would be needed to call update operation in case RALUE record action is changed. However, that is not needed since write operation takes care of that nicely. Remove prepared construct and always call the write operation. Fixes: `61c503f976` ("mlxsw: spectrum_router: Implement fib4 add/del switchdev obj ops") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-01 09:44:56 -07:00
Jiri Pirko	5b004412e2	mlxsw: spectrum_router: Fix failure caused by double fib removal from HW In mlxsw we squash tables 254 and 255 together into HW. Kernel adds/dels /32 ip to/from both 254 and 255. On del path, that causes the same prefix being removed twice. Fix this by introducing reference counting for private mlxsw fib entries. That required a bit of code reshuffle. Also put dev into fib entry key so the same prefix could be represented once per every router interface. Fixes: `61c503f976` ("mlxsw: spectrum_router: Implement fib4 add/del switchdev obj ops") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-01 09:44:55 -07:00
David S. Miller	6abdd5f593	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net All three conflicts were cases of simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-30 00:54:02 -04:00
Maor Gottlieb	e5835f2833	net/mlx5: Increase number of ethtool steering priorities Ethtool has 11 flow tables, each flow table has its own priority. Increase the number of priorities to be aligned with the number of flow tables. Fixes: `1174fce8d1` ('net/mlx5e: Support l3/l4 flow type specs in ethtool flow steering') Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-28 23:24:16 -04:00
Eran Ben Elisha	1722b9694e	net/mlx5: Add error prints when validate ETS failed Upon set ETS failure due to user invalid input, add error prints to specify the exact error to the user. Fixes: `cdcf11212b` ('net/mlx5e: Validate BW weight values of ETS') Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-28 23:24:16 -04:00
Kamal Heib	bf50082c15	net/mlx5e: Fix memory leak if refreshing TIRs fails Free 'in' command object also when mlx5_core_modify_tir fails. Fixes: `724b2aa151` ("net/mlx5e: TIRs management refactoring") Signed-off-by: Kamal Heib <kamalh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-28 23:24:15 -04:00
Tariq Toukan	c8cf78fe10	net/mlx5e: Add ethtool counter for TX xmit_more Add a counter in ethtool for the number of times that TX xmit_more was used. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-28 23:24:15 -04:00
Eran Ben Elisha	cc8e9ebf95	net/mlx5e: Fix ethtool -g/G rx ring parameter report with striding RQ The driver RQ has two possible configurations: striding RQ and non-striding RQ. Until this patch, the driver always reported the number of hardware WQEs (ring descriptors). For non striding RQ configuration, this was OK since we have one WQE per pending packet For striding RQ, multiple packets can fit into one WQE. For better user experience we normalize the rx_pending parameter (size of wqe/mtu) as the average ring size in case of striding RQ. Fixes: `461017cb00` ('net/mlx5e: Support RX multi-packet WQE ...') Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-28 23:24:15 -04:00
Saeed Mahameed	6e8dd6d6f4	net/mlx5e: Don't wait for SQ completions on close Instead of asking the firmware to flush the SQ (Send Queue) via asynchronous completions when moved to error, we handle SQ flush manually (mlx5e_free_tx_descs) same as we did when SQ flush got timed out or on tx_timeout. This will reduce SQs flush time and speedup interface down procedure. Moved mlx5e_free_tx_descs to the end of en_tx.c for tx critical code locality. Fixes: `29429f3300` ('net/mlx5e: Timeout if SQ doesn't flush during close') Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-28 23:24:15 -04:00
Saeed Mahameed	8484f9ed13	net/mlx5e: Don't post fragmented MPWQE when RQ is disabled ICO (Internal control operations) SQ (Send Queue) is closed/disabled after RQ (Receive Queue). After RQ is closed an ICO SQ completion might post a fragmented MPWQE (Multi Packet Work Queue Element) into that RQ. As on regular RQ post, check if we are allowed to post to that RQ (RQ is enabled). Cleanup in-progress UMR MPWQE on mlx5e_free_rx_descs if needed. Fixes: `bc77b240b3` ('net/mlx5e: Add fragmented memory support for RX multi packet WQE') Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-28 23:24:15 -04:00
Saeed Mahameed	f2fde18c52	net/mlx5e: Don't wait for RQ completions on close This will significantly reduce receive queue flush time on interface down. Instead of asking the firmware to flush the RQ (Receive Queue) via asynchronous completions when moved to error, we handle RQ flush manually (mlx5e_free_rx_descs) same as we did when RQ flush got timed out. This will reduce RQs flush time and speedup interface down procedure (ifconfig down) from 6 sec to 0.3 sec on a 48 cores system. Moved mlx5e_free_rx_descs en_main.c where it is needed, to keep en_rx.c free form non critical data path code for better code locality. Fixes: `6cd392a082` ('net/mlx5e: Handle RQ flush in error cases') Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-28 23:24:15 -04:00
Saeed Mahameed	fe4c988bdd	net/mlx5e: Limit UMR length to the device's limitation ConnectX-4 UMR (User Memory Region) MTT translation table offset in WQE is limited to U16_MAX, before this patch we ignored that limitation and requested the maximum possible UMR translation length that the netdev might need (MAX channels * MAX pages per channel). In case of a system with #cores > 32 and when linear WQE allocation fails, falling back to using UMR WQEs will cause the RQ (Receive Queue) to get stuck. Here we limit UMR length to min(U16_MAX, max required pages) (while considering the required alignments) on driver load, by default U16_MAX is sufficient since the default RX rings value guarantees that we are in range, dynamically (on set_ringparam/set_channels) we will check if the new required UMR length (num mtts) is still in range, if not, fail the request. Fixes: `bc77b240b3` ('net/mlx5e: Add fragmented memory support for RX multi packet WQE') Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-28 23:24:15 -04:00
Ido Schimmel	1c6c6d221e	mlxsw: spectrum: Mirror certain packets to CPU Instead of trapping certain packets to the CPU and then relying on it to flood them we can instead make the device mirror them. The following packet types are mirrored: * DHCP: Broadcast packets that should be flooded by the device, but also trapped in case CPU is running the DHCP server. * IGMP query: Multicast packets that need to be forwarded to other bridge ports, but also trapped so that receiving netdev will be marked as a router port by the bridge driver. * ARP request: Broadcast packets that should be forwarded to other bridge ports, but also trapped in case requested IP is of the local machine. * ARP response: Unicast packets that should be forwarded by the bridge but also trapped in case response is directed at us. Set the trap action of such packets to mirror and mark them using 'offload_fwd_mark' to prevent the bridge driver from forwarding them itself. Note that OSPF packets are also marked despite their action being trap. The reason for this is that the device traps such packets in the pipeline after they were already flooded. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-26 13:13:37 -07:00
Ido Schimmel	63a811417d	mlxsw: spectrum: Allow different traps to have different actions Up until now we only trapped packets to CPU, but we are going to allow some packets to be mirrored (trap & forward) to CPU. Extend the Rx listener with 'action' member. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-26 13:13:36 -07:00
Ido Schimmel	93393b339d	mlxsw: spectrum: Simplify traps definition Instead of copying & pasting the same struct initialization for every Rx listener, just use a macro. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-26 13:13:36 -07:00
Doug Ledford	0c41284c83	Mellanox ConnectX-4/Connect-IB shared code (SW part) * net/mlx5: Add sniffer namespaces * net/mlx5: Introduce sniffer steering hardware capabilities * net/mlx5: Configure IB devices according to LAG state * net/mlx5: Vport LAG creation support * net/mlx5: Add LAG flow steering namespace * net/mlx5: LAG demux flow table support * net/mlx5: LAG and SRIOV cannot be used together * net/mlx5e: Avoid port remapping of mlx5e netdev TISes * net/mlx5: Get RoCE netdev * net/mlx5: Implement RoCE LAG feature * net/mlx5: Add HW interfaces used by LAG * net/mlx5: Separate query_port_proto_oper for IB and EN * net/mlx5: Expose mlx5e_link_mode * net/mlx5: Update struct mlx5_ifc_xrqc_bits * net/mlx5: Modify RQ bitmask from mlx5 ifc -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJXu20hAAoJEORje4g2cliniO8P/0nMxLemOxY63u7P6DqT+UZQ +LN62W+/iLicNayKkt8mtcjnDm768YcF3ADvx73vRvKEeUyyEqT5ChMA59eicf70 rrumfNXB/kfBOaPh5rFWf4Tn8WWpKW+0559drm80NslFZF9jjF9pwv5QGg7xISb7 fYLcDQWn+5fYDuZzYsSu8zZKUEyGN0AugdjfxT5OHfh4rw+6oqGDb2fhH6LdkD8q j3Qx1cPmdQQnjJ5veXJFJT5qHFDqJlNmy85s4l99ItdWD/bcU29ue3Q3vNf7+lHp XoJB4ZRWG7sf98yXYXnOUt3iGUMdSJzpLfZqh/Nx9U1LZpdJ8lmBf7pRuR1hpPIN yDitcz+CMcFVr2WxvwWaUPhRE7SJsZxxr6tQISgRicYcFVyy9e7mLjABMtkh9vEn CXXqiDGUb/27HqTi9ha5qRiLoeT8yFpOCkINL4omV2FJKoUEbC+Jbq5P0mjnPpS1 ZdzTOzWCtkDQGtLbi+nCIF5SVTv7CCDU+6VpGZPmk6M4/ednwajhxGPsbw6bRpna ck5SglGO8dFAaUv1UVRq04PIt7Lj2FRakP7sHWx3tc9XEP8syLX0OEiVB+ZN3yRn y2TlpsREk7AqDdRulwM4qfuNd4AxaDklXyS3C79RiJtenYO4GUGrJ6J6ryesLg8u tGKVV3fXEr2Hve6cTkpu =+m21 -----END PGP SIGNATURE----- Merge tag 'shared-for-4.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma into mlx5-shared Mellanox ConnectX-4/Connect-IB shared code (SW part) * net/mlx5: Add sniffer namespaces * net/mlx5: Introduce sniffer steering hardware capabilities * net/mlx5: Configure IB devices according to LAG state * net/mlx5: Vport LAG creation support * net/mlx5: Add LAG flow steering namespace * net/mlx5: LAG demux flow table support * net/mlx5: LAG and SRIOV cannot be used together * net/mlx5e: Avoid port remapping of mlx5e netdev TISes * net/mlx5: Get RoCE netdev * net/mlx5: Implement RoCE LAG feature * net/mlx5: Add HW interfaces used by LAG * net/mlx5: Separate query_port_proto_oper for IB and EN * net/mlx5: Expose mlx5e_link_mode * net/mlx5: Update struct mlx5_ifc_xrqc_bits * net/mlx5: Modify RQ bitmask from mlx5 ifc	2016-08-25 10:01:23 -04:00
Ido Schimmel	0f7a4d8a9d	mlxsw: spectrum: Don't set learning when creating vPorts Before commit `99724c18fc` ("mlxsw: spectrum: Introduce support for router interfaces") we used to assign vFIDs to the created vPorts. Since these vPorts were used for slow path traffic we had to disable learning for them, as it doesn't make sense to have it enabled. This is no longer the case and now vPorts are either used for router interfaces (for which learning is disabled by the firmware) or bridge ports (for which learning is explicitly enabled by the driver). Therefore, we can remove the learning configuration upon vPort creation. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-24 09:41:13 -07:00
Ido Schimmel	81f77bc006	mlxsw: spectrum: Remove unnecessary check in FDB processing We now offload the learning configuration to the device and don't rely on the driver to decide whether to learn the FDB record, so remove the check. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-24 09:41:12 -07:00
Ido Schimmel	89b548f0cf	mlxsw: spectrum: Offload learning to the switch ASIC Up until now we simply stored the learning configuration of a bridge port in the driver and decided whether to learn a new FDB record based on this value. However, this is sub-optimal in cases where learning is disabled on the bridge port, as the device repeatedly generates learning notifications for the same record. Instead, offload the learning configuration to the device, thereby preventing it from generating notifications when learning is disabled. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-24 09:41:12 -07:00
Ido Schimmel	584d73df06	mlxsw: spectrum: Configure learning for VLAN-aware bridge port We are going to prevent the device from generating learning notifications for a port that was configured with learning disabled. Since learning configuration is done per {Port, VID} we need to apply the port's learning configuration for any VID that is added to the bridge port's VLAN filter list. When a VID is added to the VLAN filter list of a VLAN-aware bridge port, configure the {Port, VID} learning status according to the port's configuration. When the VID is removed, disable learning for the {Port, VID}. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-24 09:41:12 -07:00
Ido Schimmel	640be7b717	mlxsw: spectrum: Don't abort on first error when removing VLANs When removing VLANs from the VLAN-aware bridge we shouldn't abort on the first error, as we'll otherwise have resources that will never be freed. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-24 09:41:12 -07:00
Ido Schimmel	f7a8f6cec3	mlxsw: spectrum: Make VLAN deletion function symmetric Commit `05978481e7` ("mlxsw: spectrum: Create PVID vPort before registering netdevice") removed __mlxsw_sp_port_vlans_del() from the init sequence of the driver, which forced it to be non-symmetric with regards to __mlxsw_sp_port_vlans_add(). Make both functions symmetric as the constraint no longer exists. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-24 09:41:12 -07:00
Ido Schimmel	1803e0fb7e	mlxsw: spectrum: Limit number of FDB records per learning session Up until now a learning session ended whenever the number of queried records was zero. This turned out to be problematic in situations where a large number of MACs (48K) had to be processed by the switch driver, as RTNL mutex is held during the learning session. Instead, limit the number of FDB records that can be processed in a session to 64. This means that every time the device is queried for learning notifications (currently, every 100ms), up to 64 records will be processed by the switch driver. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-24 09:41:11 -07:00
Yotam Gigi	51af96b534	mlxsw: router: Enable neighbors to be created on stacked devices Make the function mlxsw_router_neigh_construct search the rif according to the neighbour dev other than the dev that was passed to the ndo, thus allowing creating neigbhours upon stacked devices. Fixes: `6cf3c971dc` ("mlxsw: spectrum_router: Add private neigh table") Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-24 09:39:04 -07:00
Ido Schimmel	f888f58795	mlxsw: spectrum: Add missing flood to router port In case we have a layer 3 interface on top of a bridge (VLAN / FID RIF), then we should flood the following packet types to the router: * Broadcast: If DIP is the broadcast address of the interface, then we need to be able to get it to CPU by trapping it following route lookup. * Reserved IP multicast (224.0.0.X): Some control packets (e.g. OSPF) use this range and are trapped in the router block. Fixes: `99f44bb352` ("mlxsw: spectrum: Enable L3 interfaces on top of bridge devices") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-24 09:39:03 -07:00
David S. Miller	fff84d2a39	Mellanox ConnectX-4/Connect-IB shared code (SW part) * net/mlx5: Add sniffer namespaces * net/mlx5: Introduce sniffer steering hardware capabilities * net/mlx5: Configure IB devices according to LAG state * net/mlx5: Vport LAG creation support * net/mlx5: Add LAG flow steering namespace * net/mlx5: LAG demux flow table support * net/mlx5: LAG and SRIOV cannot be used together * net/mlx5e: Avoid port remapping of mlx5e netdev TISes * net/mlx5: Get RoCE netdev * net/mlx5: Implement RoCE LAG feature * net/mlx5: Add HW interfaces used by LAG * net/mlx5: Separate query_port_proto_oper for IB and EN * net/mlx5: Expose mlx5e_link_mode * net/mlx5: Update struct mlx5_ifc_xrqc_bits * net/mlx5: Modify RQ bitmask from mlx5 ifc -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJXu20hAAoJEORje4g2cliniO8P/0nMxLemOxY63u7P6DqT+UZQ +LN62W+/iLicNayKkt8mtcjnDm768YcF3ADvx73vRvKEeUyyEqT5ChMA59eicf70 rrumfNXB/kfBOaPh5rFWf4Tn8WWpKW+0559drm80NslFZF9jjF9pwv5QGg7xISb7 fYLcDQWn+5fYDuZzYsSu8zZKUEyGN0AugdjfxT5OHfh4rw+6oqGDb2fhH6LdkD8q j3Qx1cPmdQQnjJ5veXJFJT5qHFDqJlNmy85s4l99ItdWD/bcU29ue3Q3vNf7+lHp XoJB4ZRWG7sf98yXYXnOUt3iGUMdSJzpLfZqh/Nx9U1LZpdJ8lmBf7pRuR1hpPIN yDitcz+CMcFVr2WxvwWaUPhRE7SJsZxxr6tQISgRicYcFVyy9e7mLjABMtkh9vEn CXXqiDGUb/27HqTi9ha5qRiLoeT8yFpOCkINL4omV2FJKoUEbC+Jbq5P0mjnPpS1 ZdzTOzWCtkDQGtLbi+nCIF5SVTv7CCDU+6VpGZPmk6M4/ednwajhxGPsbw6bRpna ck5SglGO8dFAaUv1UVRq04PIt7Lj2FRakP7sHWx3tc9XEP8syLX0OEiVB+ZN3yRn y2TlpsREk7AqDdRulwM4qfuNd4AxaDklXyS3C79RiJtenYO4GUGrJ6J6ryesLg8u tGKVV3fXEr2Hve6cTkpu =+m21 -----END PGP SIGNATURE----- Merge tag 'shared-for-4.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma Saeed Mahameed says: ==================== Mellanox mlx5 core driver updates 2016-08-24 This series contains some low level and API updates for mlx5 core driver interface and mlx5_ifc.h, plus mlx5 LAG core driver support, to be shared as base code for net-next and rdma mlx5 4.9 submissions. From Alex and Artemy, Update mlx5_ifc for modify RQ and XRC bits. From Noa, Expose mlx5 link modes so they can be used in RDMA tree for rdma tools. From Aviv, LAG support needed for RDMA. - Add needed hardware structures, layouts and interface - mlx5 core driver LAG implementation - Introduce mlx5 core driver LAG API for mlx5_ib From Maor, add two low level patches for mlx5 hardware sniffer QP infrastructure bits and capabilities, plus added the namespace for sniffer steering tables. Needed for RDMA subtree. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-24 09:35:35 -07:00
David S. Miller	01555e6449	Mellanox ConnectX-4/Connect-IB shared code (HW part) * net/mlx5: Introduce alloc_encap and dealloc_encap commands * net/mlx5: Update mlx5_ifc.h for vxlan encap/decap * net/mlx5: Enable setting minimum inline header mode for VFs * net/mlx5: Improve driver log messages * net/mlx5: Unify and improve command interface * {net,IB}/mlx5: Modify QP commands via mlx5 ifc * {net,IB}/mlx5: QP/XRCD commands via mlx5 ifc * {net,IB}/mlx5: MKey/PSV commands via mlx5 ifc * {net,IB}/mlx5: CQ commands via mlx5 ifc * net/mlx5: EQ commands via mlx5 ifc * net/mlx5: Pages management commands via mlx5 ifc * net/mlx5: MCG commands via mlx5 ifc * net/mlx5: PD and UAR commands via mlx5 ifc * net/mlx5: Access register and MAD IFC commands via mlx5 ifc * net/mlx5: Init/Teardown hca commands via mlx5 ifc -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJXu2zqAAoJEORje4g2clin0dQP/3SF9+4lxVaWRDnhutwIdaxd GDWDCEcp1x8oC1ylKEfQW57tTG8mk6pEFD5xEZSAJMGjGm5zR8QnaIS9eiPTdDkf QIReMP9XJUUVDqXZ8F207PwVbgB4IkHB2VPyl2Sar1HULe6Mn3nAS40A1QfYpVzs cYC3SFOPuLsTDZkIVQrZzKvX4WVHjcyj0tAkXkutWQ+K8cPXmpx49+ngrzVm6xnw j6THx3kOAEwozW5NxMC7V6DOD7KfLWzPi96BLZ2h4eQynpgJnSLOCar3zyBPH5g3 KAk99tVjD1kp+HreSNzCd+oP8Zqrw+RBt3WlrGX2GvQ0V7XIJrpkRbLDgWhbBjej O1ln/xr5pqLSKgxz41LsFlrLWbOgG7r4N212iMNv3rArb9e11tqZCAbR0OzX5vZ6 fl2W7moYRB2273Y+MnB/e1e8xf7PEIppWnyvyPrzCz1lSdzw1BzLqz5tWz2nc1dB yQWosVTf4xTa3OQHhUqw6CbhpRpywQZx1ZhmAzZ7+hQ90Z4hwPWWXIx7MNa4g2sJ toUamuonbnib3wBLQzzW2ktTbdJUx8OTF5aiVNC06QG8KAvXeUAP2Ho95Am3JpLJ XZ14ZP0NxOFaGgOSDRxEVKuhnUnXuIG57NSgQpMD5rjSieMl+msasrydP8X4+qny HlwA4nwt2bHf9k7Cg1iM =QIAc -----END PGP SIGNATURE----- Merge tag 'shared-for-4.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma Saeed Mahameed says: ==================== Mellanox mlx5 core driver updates 2016-08-20 This series contains several low level and API updates for mlx5 core commands interface and mlx5_ifc.h to be shared as base code for net-next and rdma mlx5 4.9 submissions. From Saeed, ten patches that refactors old layouts of firmware commands which were manually generated before we introduced the mlx5_ifc, now all of the firmware commands inbox/outbox layouts moved to use mlx5_ifc and we remove the old manually generated structures. Plus to those ten patches, we add two patches that unifies mlx5 commands execution interface and improve the driver log messages in that area. From Hadar and Ilya, added the needed hardware bits and infrastructure for minimum inline headers setting and encap/decap commands and capabilities, needed for E-Switch offloads. This series applies on top latest net-next and rdma/master, and smoothly merges with the latest "Mellanox 100G mlx5 fixes 2016-08-16" series already applied into net branch. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-23 11:08:23 -07:00
Doug Ledford	124c13439b	Mellanox ConnectX-4/Connect-IB shared code (HW part) * net/mlx5: Introduce alloc_encap and dealloc_encap commands * net/mlx5: Update mlx5_ifc.h for vxlan encap/decap * net/mlx5: Enable setting minimum inline header mode for VFs * net/mlx5: Improve driver log messages * net/mlx5: Unify and improve command interface * {net,IB}/mlx5: Modify QP commands via mlx5 ifc * {net,IB}/mlx5: QP/XRCD commands via mlx5 ifc * {net,IB}/mlx5: MKey/PSV commands via mlx5 ifc * {net,IB}/mlx5: CQ commands via mlx5 ifc * net/mlx5: EQ commands via mlx5 ifc * net/mlx5: Pages management commands via mlx5 ifc * net/mlx5: MCG commands via mlx5 ifc * net/mlx5: PD and UAR commands via mlx5 ifc * net/mlx5: Access register and MAD IFC commands via mlx5 ifc * net/mlx5: Init/Teardown hca commands via mlx5 ifc -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJXu2zqAAoJEORje4g2clin0dQP/3SF9+4lxVaWRDnhutwIdaxd GDWDCEcp1x8oC1ylKEfQW57tTG8mk6pEFD5xEZSAJMGjGm5zR8QnaIS9eiPTdDkf QIReMP9XJUUVDqXZ8F207PwVbgB4IkHB2VPyl2Sar1HULe6Mn3nAS40A1QfYpVzs cYC3SFOPuLsTDZkIVQrZzKvX4WVHjcyj0tAkXkutWQ+K8cPXmpx49+ngrzVm6xnw j6THx3kOAEwozW5NxMC7V6DOD7KfLWzPi96BLZ2h4eQynpgJnSLOCar3zyBPH5g3 KAk99tVjD1kp+HreSNzCd+oP8Zqrw+RBt3WlrGX2GvQ0V7XIJrpkRbLDgWhbBjej O1ln/xr5pqLSKgxz41LsFlrLWbOgG7r4N212iMNv3rArb9e11tqZCAbR0OzX5vZ6 fl2W7moYRB2273Y+MnB/e1e8xf7PEIppWnyvyPrzCz1lSdzw1BzLqz5tWz2nc1dB yQWosVTf4xTa3OQHhUqw6CbhpRpywQZx1ZhmAzZ7+hQ90Z4hwPWWXIx7MNa4g2sJ toUamuonbnib3wBLQzzW2ktTbdJUx8OTF5aiVNC06QG8KAvXeUAP2Ho95Am3JpLJ XZ14ZP0NxOFaGgOSDRxEVKuhnUnXuIG57NSgQpMD5rjSieMl+msasrydP8X4+qny HlwA4nwt2bHf9k7Cg1iM =QIAc -----END PGP SIGNATURE----- Merge tag 'shared-for-4.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma into mlx5-shared Mellanox ConnectX-4/Connect-IB shared code (HW part) * net/mlx5: Introduce alloc_encap and dealloc_encap commands * net/mlx5: Update mlx5_ifc.h for vxlan encap/decap * net/mlx5: Enable setting minimum inline header mode for VFs * net/mlx5: Improve driver log messages * net/mlx5: Unify and improve command interface * {net,IB}/mlx5: Modify QP commands via mlx5 ifc * {net,IB}/mlx5: QP/XRCD commands via mlx5 ifc * {net,IB}/mlx5: MKey/PSV commands via mlx5 ifc * {net,IB}/mlx5: CQ commands via mlx5 ifc * net/mlx5: EQ commands via mlx5 ifc * net/mlx5: Pages management commands via mlx5 ifc * net/mlx5: MCG commands via mlx5 ifc * net/mlx5: PD and UAR commands via mlx5 ifc * net/mlx5: Access register and MAD IFC commands via mlx5 ifc * net/mlx5: Init/Teardown hca commands via mlx5 ifc	2016-08-23 11:52:02 -04:00
Markus Elfring	6f0b826da4	mlx5/core: Use memdup_user() rather than duplicating its implementation * Reuse existing functionality from memdup_user() instead of keeping duplicate source code. This issue was detected by using the Coccinelle software. * Return directly if this copy operation failed. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-22 17:05:38 -07:00
Jiri Pirko	8912862f06	mlxsw: spectrum_buffers: Fix pool value handling in mlxsw_sp_sb_tc_pool_bind_set Pool index has to be converted by get_pool helper to work correctly for egress pool. In mlxsw the egress pool index starts from 0. Fixes: `0f433fa0ec` ("mlxsw: spectrum_buffers: Implement shared buffer configuration") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-19 18:01:56 -07:00
Or Gerlitz	f96750f8d6	net/mlx5: E-Switch, Avoid ACLs in the offloads mode When we are in the switchdev/offloads mode, HW matching is done as dictated by the offloaded rules and hence we don't need to enable the ACLs mechanism used by the legacy mode. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-19 16:09:56 -07:00
Or Gerlitz	1a8ee6f25b	net/mlx5: E-Switch, Set the send-to-vport rules in the correct table While adding actual offloading support to the new switchdev mode, we didn't change the setup of the send-to-vport rules to put them in the slow path table, fix that. Fixes: `1033665e63` ('net/mlx5: E-Switch, Use two priorities for SRIOV offloads mode') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-19 16:09:56 -07:00
Or Gerlitz	ef78618b9d	net/mlx5: E-Switch, Return the correct devlink e-switch mode Since mlx5 has also the NONE e-switch mode, we must translate from mlx5 mode to devlink mode on the devlink eswitch mode get call, do that. While here, remove the mlx5_ prefix from the static function helpers that deal with the mode to comply with the rest of the code. Fixes: `c930a3ad74` ('net/mlx5e: Add devlink based SRIOV mode change') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-19 16:09:56 -07:00
Hadar Hen Zion	dbe413e3bb	net/mlx5e: Retrieve the switchdev id from the firmware only once Avoid firmware command execution each time the switchdev HW ID attr get call is made. We do that by reading the ID (PF NIC MAC) only once at load time and store it on the representor structure. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-19 16:09:56 -07:00
Hadar Hen Zion	1dbd0d373a	net/mlx5e: Use correct flow dissector key on flower offloading The wrong key is used when extracting the address type field set by the flower offload code. We have to use the control key and not the basic key, fix that. Fixes: `e3a2b7ed01` ('net/mlx5e: Support offload cls_flower with drop action') Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-19 16:09:56 -07:00
Amir Vadai	6c3b4f9086	net/mlx5: Update last-use statistics for flow rules Set lastuse statistic, when number of packets is changed compared to last query. This was wrongly dropped when bulk counter reading was added. Fixes: `a351a1b03b` ('net/mlx5: Introduce bulk reading of flow counters') Signed-off-by: Amir Vadai <amirva@mellanox.com> Reported-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-19 16:09:55 -07:00
Paul Blakey	2c0f8ce1b5	net/mlx5: Added missing check of msg length in verifying its signature Set and verify signature calculates the signature for each of the mailbox nodes, even for those that are unused (from cache). Added a missing length check to set and verify only those which are used. While here, also moved the setting of msg's nodes token to where we already go over them. This saves a pass because checksum is disabled, and the only useful thing remaining that set signature does is setting the token. Fixes: `e126ba97db` ('mlx5: Add driver for Mellanox Connect-IB adapters') Signed-off-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-19 16:09:55 -07:00
Mohamad Haj Yahia	1061c90f52	net/mlx5: Fix pci error recovery flow When PCI error is detected we should save the state of the pci prior to disabling it. Also when receiving pci slot reset call we need to verify that the device is responsive. Fixes: `89d44f0a6c` ('net/mlx5_core: Add pci error handlers to mlx5_core driver') Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-19 16:09:55 -07:00
Tariq Toukan	506753b0b4	net/mlx5e: Optimization for MTU change Avoid unnecessary interface down/up operations upon an MTU change when it does not affect the rings configuration. Fixes: `461017cb00` ("net/mlx5e: Support RX multi-packet WQE (Striding RQ)") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-19 16:09:55 -07:00
Saeed Mahameed	13f9bba7cd	net/mlx5e: Set port MTU on netdev creation rather on open Port mtu shouldn't be written to hardware on every single interface open. Here we set it only when needed, on change_mtu and netdevice creation. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-19 16:09:55 -07:00
Maor Gottlieb	87d22483ce	net/mlx5: Add sniffer namespaces Add sniffer TX and RX namespaces to receive ingoing and outgoing traffic. Each outgoing/incoming packet is duplicated and steered to the sniffer TX/RX namespace in addition to the regular flow. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2016-08-18 18:49:59 +03:00
Aviv Heller	917b41aab7	net/mlx5: Configure IB devices according to LAG state When mlx5_ib is loaded, we would like each card's IB devices to be added according to its LAG state (one IB device, instead of two, is to be added if LAG is active). Signed-off-by: Aviv Heller <avivh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2016-08-18 18:49:58 +03:00
Aviv Heller	3bc34f3bcb	net/mlx5: Vport LAG creation support Add interfaces for issuing CREATE_VPORT_LAG and DESTROY_VPORT_LAG commands. Used for receiving PF1's eth traffic on PF0's root ft. Signed-off-by: Aviv Heller <avivh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2016-08-18 18:49:57 +03:00

... 3 4 5 6 7 ...

2236 Commits