linux

mirror of https://github.com/torvalds/linux.git synced 2024-12-15 23:51:46 +00:00

Author	SHA1	Message	Date
Brian Welty	1198fcea8a	IB/hfi1, rdmavt: Move SGE state helper routines into rdmavt To improve code reuse, add small SGE state helper routines to rdmavt_mr.h. Leverage these in hfi1, including refactoring of hfi1_copy_sge. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Brian Welty <brian.welty@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-19 09:18:41 -05:00
Brian Welty	0128fceaf9	IB/hfi1, rdmavt: Update copy_sge to use boolean arguments Convert copy_sge and related SGE state functions to use boolean. For determining if QP is in user mode, add helper function in rdmavt_qp.h. This is used to determine if QP needs the last byte ordering. While here, change rvt_pd.user to a boolean. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Brian Welty <brian.welty@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-19 09:18:41 -05:00
Venkata Sandeep Dhanalakota	b4238e7057	IB/qib: Use new rdmavt timers Reduce qib code footprint by using the rdmavt timers. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Brian Welty <brian.welty@intel.com> Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-19 09:18:40 -05:00
Venkata Sandeep Dhanalakota	56acbbfb46	IB/hfi1: Use new rdmavt timers Reduce hfi1 code footprint by using the rdmavt timers. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Brian Welty <brian.welty@intel.com> Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-19 09:18:39 -05:00
Venkata Sandeep Dhanalakota	11a10d4bc7	IB/rdmavt: Adding timer logic to rdmavt To move common code across target to rdmavt for code reuse. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Brian Welty <brian.welty@intel.com> Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-19 09:18:39 -05:00
Brian Welty	696513e8cf	IB/hfi1, qib, rdmavt: Move AETH credit functions into rdmavt Add rvt_compute_aeth() and rvt_get_credit() as shared functions in rdmavt, moved from hfi1/qib logic. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Brian Welty <brian.welty@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-19 09:18:38 -05:00
Brian Welty	beb5a04267	IB/hfi1, qib, rdmavt: Move two IB event functions into rdmavt Add rvt_rc_error() and rvt_comm_est() as shared functions in rdmavt, moved from hfi1/qib logic. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Brian Welty <brian.welty@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-19 09:18:38 -05:00
Sebastian Sanchez	c03c08d50b	IB/hfi1: Check upper-case EFI variables The EFI variable that provides board ID is named by the PCI address of the device, which is published in upper-case, while the HFI1 driver reads the EFI variable in lower-case. This prevents returning the correct board id when queried through sysfs. Read EFI variables in upper-case if the lower-case read fails. Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-19 09:18:37 -05:00
Sebastian Sanchez	76327627be	IB/hfi1: Reduce oversized fields in struct hfi1_packet Some fields in struct hfi1_packet are oversized. Reduce them. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-19 09:18:37 -05:00
Mike Marciniszyn	d7c76e91aa	IB/hfi1: Add additional fields to qp_stats The r_psn and s_rnr_retry are missing. Add with this patch. Reviewed-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-19 09:18:36 -05:00
Sebastian Sanchez	b448bf9a0d	IB/hfi1: Allocate context data on memory node There are some memory allocation calls in hfi1_create_ctxtdata() that do not use the numa function parameter. This can cause cache lines to be filled over QPI. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-19 09:18:36 -05:00
Sebastian Sanchez	338adfdddf	IB/rdmavt: Use per-CPU reference count for MRs Having per-CPU reference count for each MR prevents cache-line bouncing across the system. Thus, it prevents bottlenecks. Use per-CPU reference counts per MR. The per-CPU reference count for FMRs is used in atomic mode to allow accurate testing of the busy state. Other MR types run in per-CPU mode MR until they're freed. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-19 09:18:35 -05:00
Sebastian Sanchez	f3e862cb68	IB/hfi1: Access hfi1_ibport through rcd pointer Receive code paths use the QP's device and port number to access the struct hfi1_ibport. When an instance of struct hfi1_ctxtdata is present, it can be used to access struct hfi1_ibport through a pointer. This makes struct hfi1_ibport lookup time faster as an array doesn't have to be indexed and access fields in other cache-lines. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-19 09:18:35 -05:00
Mike Marciniszyn	a8715b97d6	IB/hfi1: Correct error calldown locking The resource specific wait locking missed correcting the lock for the notify_error_qp() calldown. The code is fixed to correctly use the iowait lock field to protect the head that is protected by that lock. Fixes: Commit `4e045572e2` ("IB/hfi1: Add unique txwait_lock for txreq events") Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-19 09:18:34 -05:00
Easwar Hariharan	39e2afa8d0	IB/hfi1: Use static CTLE with Preset 6 for integrated HFIs After extended testing, it was found that the previous PCIe Gen 3 recipe, which used adaptive CTLE with Preset 4, could cause an NMI/Surprise Link Down in about 1 in 100 to 1 in 1000 power cycles on some platforms. New EV data combined with extensive empirical data indicates that the new recipe should use static CTLE with Preset 6 for all integrated silicon SKUs. Fixes: `c3f8de0b33` ("IB/hfi1: Add static PCIe Gen3 CTLE tuning") Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-19 09:18:34 -05:00
Mike Marciniszyn	eb04ff09d8	IB/hfi1: Ensure read of producer s_head is correct The read of s_head in hfi1_make_rc_req() and qib_make_rc_req() lacks the necessary barrier instructions. Correct other ACCESS_ONCE() warnings in the same file. Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-19 09:18:33 -05:00
Mike Marciniszyn	a82a7fcd1f	IB/hfi1: Process qp wait list in IRQ thread periodically In the event that the IRQ thread is extremely busy, the processing of an rcd wait list can be delayed by quite a bit until the IRQ thread completes its work. The QP reset reference count wait can then appear to be stuck, thus causing a QP destroy to emit the hung task diagnostic. Fix by processing the qp wait list periodically from the thread. The interval is a multiple (currently 4) of the MAX_PKT_RECV. Also, reduce some of the excessive inlining. The guideline is that per-packet code is OK to inline; otherwise the choice is based on likelihood of execution. Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-19 09:18:32 -05:00
Mike Marciniszyn	4fcf1de5a7	IB/hfi1: Correct defered count after processing qp_wait_list The qp_wait_list processing leaves the deferred ack count at its prior value. This can result in a premature send of an ack. Fixed by unconditionally resetting the deferred ack count in hfi1_send_rc_ack(). Fixes: Commit `7c091e5c06` ("staging/rdma/hfi1: add ACK coalescing logic") Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-19 09:18:32 -05:00
Wei Yongjun	8d8a473380	IB/rxe: use setup_timer to simplify the code Use setup_timer function instead of initializing timer with the function and data fields. Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-19 09:18:31 -05:00
Max Gurtovoy	32f8e839ed	IB/iser: Protect completion context active_qps update As iser connections can share completion contexts, we need to protect the active_qps update each time we set it. Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Acked-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-19 09:18:31 -05:00
Ganesh Goudar	192539f4ce	iw_cxgb4: clean up send_connect() Clean up send_connect() and make use of t6 specific active open request struct. Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Bharat Teja <bharat@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-19 09:18:30 -05:00
Doug Ledford	6dd7abae71	Merge branch 'k.o/for-4.10-rc' into HEAD	2017-02-19 09:18:21 -05:00
Moni Shoua	6df6b4a9ce	IB/cma: Destination and source addr families must match The destination address in a listening rdma_id does not have an address family. Since address family in both sides of a connection must be the same in rdma_bind_addr() we set the address family of the destination to the address family of the source. This patch serves the logic in cma_port_is_unique() which requires to know if destination address that is associated with a rdma_id is any address (cma_zero_addr() and cma_loopback_addr()). This can happen when port reuse is checked for a port number that is being listened to. Fixes: `19b752a19d` ("IB/cma: Allow port reuse for rdma_id") Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-15 09:51:33 -05:00
Majd Dibbiny	89052d784b	IB/cma: Add default RoCE TOS to CMA configfs Add new entry to the RDMA-CM configfs that allows users to select default TOS for RDMA-CM QPs. This is useful for users that want to control the TOS for legacy applications without changing their code. Application that sets the TOS explicitly using the rdma_set_option API will continue to work as expected, meaning overriding the configfs value. CC: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Majd Dibbiny <majd@mellanox.com> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-15 09:51:28 -05:00
Parav Pandit	5903960840	IB/core: Remove pointer casting from void to net_device This patch avoids unnecessary type casting from void to net_device. CC: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-15 09:51:28 -05:00
Erez Shitrit	2b0841766a	IB/IPoIB: Add destination address when re-queue packet When sending a packet to a destination that was not yet resolved via path query, the driver keeps the skb and tries to re-send it when the path is resolved. But when re-sending via dev_queue_xmit the kernel doesn't call dev_hard_header, so IPoIB needs to keep 20 bytes in the skb and put the destination address inside them. That way the dev_start_xmit will have the correct destination, and the driver won't take the destination from skb->data, where nothing exists, which causes the packet to be dropped. The test flow is: 1. Run the SM on remote node, 2. Restart the driver, 3. Ping some destination, 4. Observe that first ICMP request will be dropped. Fixes: `fc791b6335` ("IB/ipoib: move back IB LL address into the hard header") Cc: <stable@vger.kernel.org> # v4.8+ Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Tested-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-15 09:51:28 -05:00
Eli Cohen	cdbe33d0f8	IB/mlx5: Fix configuration of port capabilities When the "ib_virt" cap is set, configuration of port capabilities need to be done through mlx5_core_modify_hca_vport_context. Since modify_hca_vport_context accepts mask and value, there is no need to read the port capabilities and calculate the new cap values so we avoid the mutex when ib_virt is set. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-15 09:29:37 -05:00
Talat Batheesh	a748d60df3	IB/mlx4: Take source GID by index from HW GID table Previously, we used the HW GID index in order to search the source GID in the software GID cached table. In some cases, for example when the MAC Address of the network interface is changed, the GID cached table saves the old-IPv6-link-local GID at the end of the table. When returning the old MAC address, the software GID cached table tries to add the new IPv6-link-local GID, and when it identifies that the GID already exists, the software GID cached does not add it. Thus a mismatch occurs between the HW and the SW GID tables. It resulted with sending traffic with the wrong source GID. This commit fixes the issue by taking both from the HW table. The problem can be reproduced with the following scenario: Client: # ifconfig ens6 2.2.2.5 # ifconfig ens6 inet6 add 2001:0db8:0:f101::5/64 # ifconfig ens6 hw ether f4:52:14:61:a0:71 # ifconfig ens6 inet6 del 2001:0db8:0:f101::5/64 # ifconfig ens6 inet6 add 2001:0db8:0:f101::5/64 # ucmatose -f ipv6 -b 2001:0db8:0:f101::5 -s 2001:0db8:0:f101::6 -p 20156 Server: # ucmatose -f ipv6 -b 2001:0db8:0:f101::6 -p 20156 Fixes: `4c3eb3ca13` ('IB/mlx4: Add VLAN support for IBoE') Signed-off-by: Talat Batheesh <talatb@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-14 11:44:42 -05:00
Eli Cohen	d8030b0de0	IB/mlx5: Fix blue flame buffer size calculation A blue flame register is comprised of two buffers of equal size. Fixes: `5fe9dec0d0` ("IB/mlx5: Use blue flame register allocator in mlx5_ib") Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Noa Osherovich <noaos@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-14 11:44:42 -05:00
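Sketch for d8030b0de0: the message states only that a blue flame register is comprised of two buffers of equal size, so each buffer is half the register. The minimal C sketch below illustrates that arithmetic. The function name and the assumption that the register size is supplied as a log2 value are illustrative and are not taken from the commit.

```c
#include <stdint.h>

/*
 * Size in bytes of one blue flame buffer, assuming the register size is
 * given as log2 (hypothetical parameter) and the register is split into
 * two equal buffers. The caller must pass a value below 32.
 */
static uint32_t bf_buf_size(unsigned int log_bf_reg_size)
{
    uint32_t reg_size = UINT32_C(1) << log_bf_reg_size;

    return reg_size / 2;
}
```

For example, under these assumptions a 512-byte register (log2 value 9) yields two 256-byte buffers.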
Leon Romanovsky	850b741514	IB/mlx4: Remove unused variable from function declaration Remove unused netw_view parameter from eth_link_query_port() function. Reported-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-14 11:44:42 -05:00
Or Gerlitz	c4550c63b3	IB: Query ports via the core instead of direct into the driver Change the drivers to call ib_query_port in their get port immutable handler instead of their own query port handler. Doing this required to set the core cap flags of this device before the ib_query_port call is made, since the IB core might need these caps to serve the port query. Drivers are ensured by the IB core that the port attributes passed to the port query verb implementation are zero, and hence we removed the zeroing from the drivers. This patch doesn't add any new functionality. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Acked-by: Adit Ranadive <aditr@vmware.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-14 11:41:22 -05:00
Or Gerlitz	ce1e055fb9	IB: Add protocol for USNIC Add protocol definition for the proprietary USNIC driver. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Christian Benvenuti <benve@cisco.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-14 11:41:21 -05:00
Or Gerlitz	bc63f9d558	IB/mlx4: Support raw packet protocol Mark support for the new raw packet protocol on Eth ports. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-14 11:41:21 -05:00
Or Gerlitz	72cd57178f	IB/mlx5: Support raw packet protocol Mark support for the new raw packet protocol on Eth ports. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-14 11:41:20 -05:00
Artemy Kovalyov	81713d3788	IB/mlx5: Add implicit MR support Add implicit MR, covering entire user address space. The MR is implemented as an indirect KSM MR consisting of 1GB direct MRs. Pages and direct MRs are added/removed to MR by ODP. Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-14 11:41:19 -05:00
Artemy Kovalyov	49780d42df	IB/mlx5: Expose MR cache for mlx5_ib Allow other parts of mlx5_ib to use MR cache mechanism. * Add new functions mlx5_mr_cache_alloc and mlx5_mr_cache_free * Traditional MTT MKey buckets are limited by MAX_UMR_CACHE_ENTRY. Additional buckets may be added above. Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-14 11:41:18 -05:00
Artemy Kovalyov	94990b4989	IB/mlx5: Add null_mkey access Add mlx5_cmd_null_mkey() function to access null_mkey information from firmware. Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-14 11:41:18 -05:00
Artemy Kovalyov	d9d0674c0f	IB/umem: Indicate that process is being terminated When a process is killed while a pagefault operation is still in progress, the function will fail. In this specific case we don't want any warnings in dmesg, to avoid false alerts from log analyzers. So we need a distinct error code for this case. Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-14 11:41:17 -05:00
Artemy Kovalyov	d07d1d70ce	IB/umem: Update on demand page (ODP) support Currently ODP MR may explicitly register virtual address space area of limited length. This change allows MR to cover entire process virtual address space, dynamically adding/removing translation entries to device MTT. Add following changes to support implicit MR: * Allow umem to be zero size to back-up implicit MR. * Add new function ib_alloc_odp_umem() to add virtual memory regions to implicit MR dynamically on demand. * Add new function rbt_ib_umem_lookup() to find dynamically added virtual memory regions. * Expose function rbt_ib_umem_for_each_in_range() to other modules and make it safe Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-14 11:41:17 -05:00
Noa Osherovich	4be6da1e5b	IB/mlx5: Support creation of a WQ with scatter FCS offload Add support for creation of a WQ with scatter FCS capability, if this capability is supported by the hardware. Signed-off-by: Noa Osherovich <noaos@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-14 11:41:15 -05:00
Noa Osherovich	e4cc4fa7cc	IB/mlx5: Enable QP creation with cvlan offload Enable creating a RAW Ethernet QP with cvlan stripping offload when it's supported by the hardware. Signed-off-by: Noa Osherovich <noaos@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-14 11:41:15 -05:00
Noa Osherovich	b1f74a8437	IB/mlx5: Enable WQ creation and modification with cvlan offload Allow creating a WQ with cvlan stripping considering device's capabilities. The default value was fixed to disable vlan stripping until it is explicitly requested. In addition, allow modification of a WQ to turn on/off this property. Signed-off-by: Noa Osherovich <noaos@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-14 11:41:14 -05:00
Noa Osherovich	e816133440	IB/mlx5: Expose vlan offloads capabilities Check device's capabilities and report which raw packet capabilities are supported. Signed-off-by: Noa Osherovich <noaos@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-14 11:41:14 -05:00
Noa Osherovich	9e1b161f3b	IB/uverbs: Enable QP creation with cvlan offload Enable user applications to create a QP with cvlan stripping offload. Signed-off-by: Noa Osherovich <noaos@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-14 11:41:13 -05:00
Noa Osherovich	af1cb95d2e	IB/uverbs: Enable WQ creation and modification with cvlan offload Enable user space application via WQ creation and modification to turn on and off cvlan offload. Signed-off-by: Noa Osherovich <noaos@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-14 11:41:12 -05:00
Noa Osherovich	5f23d4265f	IB/uverbs: Expose vlan offloads capabilities Expose raw packet capabilities to user space as part of query device. Signed-off-by: Noa Osherovich <noaos@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-14 11:41:12 -05:00
Majd Dibbiny	23a6964e3a	IB/mlx5: Add port counter support for Receive WQs Counters weren't updated due to Receive WQs' traffic since the counter-id was not associated with the RQ. Added support for associating the q-counter-id with the Receive WQ. The attachment is done only when changing WQ's state from RESET to READY in modify-WQ command. FW support is required for the above, without this support Receive WQ counters will not count. Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-14 11:41:09 -05:00
Kamal Heib	7c16f47779	IB/mlx5: Expose Q counters groups only if they are supported by FW This patch modifies the Q counters implementation, so each one of the three Q counters groups will be exposed by the driver only if it is supported by the firmware. Signed-off-by: Kamal Heib <kamalh@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-14 11:40:56 -05:00
Leon Romanovsky	1ffd3a26f8	IB/mlx5: Replace ENOTSUPP usage with EOPNOTSUPP Flow steering is supposed to return EOPNOTSUPP error for unsupported fields and not ENOTSUPP error. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-14 10:21:01 -05:00
Moses Reuben	2ac693f995	IB/mlx5: Add flow tag support Set flow tag in flow table entry, when IB_FLOW_SPEC_ACTION_TAG is part of the flow specifications. Flow tag doesn't support multicast flows, so it's passing to hardware only when used. Signed-off-by: Moses Reuben <mosesr@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-14 10:21:01 -05:00
Moses Reuben	94e03f11ad	IB/uverbs: Add support for flow tag The struct ib_uverbs_flow_spec_action_tag associates a tag_id with the flow defined by any number of other flow_spec entries which can reference L2, L3, and L4 packet contents. Use of ib_uverbs_flow_spec_action_tag allows the consumer to identify the set of rules which were matched by the packet by examining the tag_id in the CQE. Signed-off-by: Moses Reuben <mosesr@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-14 10:21:01 -05:00
Leon Romanovsky	5abb0da9cd	IB/mlx5: Remove deprecated module parameter Commit `9603b61de1` ("mlx5: Move pci device handling from mlx5_ib to mlx5_core") moved prof_sel module parameter from mlx5_ib to mlx5_core and marked it as deprecated in 2014. Three years after deprecation, it is time to remove the deprecated module parameter. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jack Morgenstein <jackm@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-14 10:14:25 -05:00
Majd Dibbiny	ed88451e1f	IB/mlx5: Assign DSCP for R-RoCE QPs Address Path For Routable RoCE QPs, the DSCP should be set in the QP's address path. The DSCP's value is derived from the traffic class. Fixes: `2811ba51b0` ("IB/mlx5: Add RoCE fields to Address Vector") Cc: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Majd Dibbiny <majd@mellanox.com> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-14 10:14:25 -05:00
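Sketch for ed88451e1f: the message says the DSCP value is derived from the traffic class but does not show how. The C sketch below assumes the standard IP header layout (RFC 2474, RFC 3168), where the upper six bits of the 8-bit traffic class carry the DSCP and the lower two carry ECN. The function names are hypothetical, and the driver's actual code may differ.

```c
#include <stdint.h>

/* Upper six bits of the traffic class byte hold the DSCP. */
static uint8_t dscp_from_tclass(uint8_t tclass)
{
    return tclass >> 2;
}

/* Lower two bits of the traffic class byte hold the ECN field. */
static uint8_t ecn_from_tclass(uint8_t tclass)
{
    return tclass & 0x3;
}
```

For example, a traffic class of 0xB8 corresponds to DSCP 46 (Expedited Forwarding) with ECN 0.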
Maor Gottlieb	1e0e50b617	IB/mlx5: Avoid SMP MADs from VFs According to the device specification, we need to check that the has_smi bit is set in vport context before allowing send SMP MADs from VF. Fixes: `e126ba97db` ('mlx5: Add driver for Mellanox Connect-IB adapters') Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-14 10:14:25 -05:00
Maor Gottlieb	c43f1112c0	IB/mlx5: Add additional checks before processing MADs Check the has_smi bit in vport context and class version of MADs before allowing MADs processing to take place. MAD_IFC SMI commands can be executed only if smi bit is set. Fixes: `e126ba97db` ('mlx5: Add driver for Mellanox Connect-IB adapters') Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Parvi Kaustubhi <parvik@mellanox.com> Reviewed-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-14 10:14:25 -05:00
Kamal Heib	45bded2c21	IB/mlx5: Verify that Q counters are supported Make sure that the Q counters are supported by the FW before trying to allocate/deallocate them; this avoids driver load failure when they aren't supported by the FW. Fixes: `0837e86a7a` ('IB/mlx5: Add per port counters') Cc: <stable@vger.kernel.org> # v4.7+ Signed-off-by: Kamal Heib <kamalh@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-14 10:14:25 -05:00
Leon Romanovsky	12bbf1ea7e	IB/mlx5: Return error for unsupported signature type In case of unsupported signature, we returned positive value, while the better approach is to return -EINVAL. In addition, in this change, the error print is enriched to provide an actual supplied signature type. Fixes: `e6631814fb` ("IB/mlx5: Support IB_WR_REG_SIG_MR") Cc: Sagi Grimberg <sagi@grimberg.me> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-14 10:14:25 -05:00
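Sketch for 12bbf1ea7e: the pattern described above, returning a negative errno for an unsupported type and printing the value that was supplied, might look roughly like the C sketch below. The enum, its members, and the function name are hypothetical stand-ins, and the log does not show the driver's real types or message text.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical signature types; the driver's real enum is not shown in the log. */
enum sig_type {
    SIG_TYPE_NONE,
    SIG_TYPE_T10_DIF,
};

/* Returns 0 for a supported type and a negative errno otherwise. */
static int check_sig_type(int type)
{
    switch (type) {
    case SIG_TYPE_NONE:
    case SIG_TYPE_T10_DIF:
        return 0;
    default:
        /* Report the value that was actually supplied. */
        fprintf(stderr, "bad signature type (%d) given\n", type);
        return -EINVAL;
    }
}
```

Returning a negative errno lets callers that test for `< 0` detect the failure, which a positive return value would not.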
Leon Romanovsky	0fd27a88c2	IB/mlx5: Fix out-of-bound access When we initialize buffer to create SRQ in kernel, the number of pages was less than actually used in following mlx5_fill_page_array(). Fixes: `e126ba97db` ("mlx5: Add driver for Mellanox Connect-IB adapters") Cc: <stable@vger.kernel.org> # v3.10+ Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-14 10:14:25 -05:00
Selvin Xavier	592e8b3226	RDMA/bnxt_re: Add bnxt_re driver build support Makefile and Kconfig changes for enabling bnxt_re compilation Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-14 09:51:28 -05:00
Selvin Xavier	1ac5a40479	RDMA/bnxt_re: Add bnxt_re RoCE driver This patch introduces the RoCE driver for the Broadcom NetXtreme-E 10/25/40/50G RoCE HCAs. The RoCE driver is a two part driver that relies on the parent bnxt_en NIC driver to operate. The changes needed in the bnxt_en driver have already been incorporated via Dave Miller's net tree into the mainline kernel. The vendor official git repository for this driver is available on github as: https://github.com/Broadcom/linux-rdma-nxt/ Signed-off-by: Eddie Wai <eddie.wai@broadcom.com> Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-14 09:51:15 -05:00
David S. Miller	35eeacf182	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2017-02-11 02:31:11 -05:00
Eyal Itkin	647bf3d8a8	IB/rxe: Fix mem_check_range integer overflow Update the range check to avoid integer-overflow in edge case. Resolves CVE 2016-8636. Signed-off-by: Eyal Itkin <eyal.itkin@gmail.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-08 12:28:30 -05:00
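The overflow class that the range-check fix above addresses can be illustrated with a small user-space sketch. This is not the rxe patch itself: `struct region`, `in_range_naive()` and `in_range_safe()` are hypothetical names chosen for illustration, and the sketch assumes 64-bit unsigned addresses and lengths.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical memory-region descriptor; not the rxe driver's structure. */
struct region {
    uint64_t base;   /* first valid address */
    uint64_t length; /* number of valid bytes */
};

/*
 * Naive check: addr + len can wrap around 2^64, so a huge len may
 * produce a small sum that passes the comparison.
 */
static bool in_range_naive(const struct region *r, uint64_t addr, uint64_t len)
{
    return addr >= r->base && addr + len <= r->base + r->length;
}

/*
 * Overflow-safe check: every subtraction is guarded by an earlier
 * comparison, so no intermediate value can wrap.
 */
static bool in_range_safe(const struct region *r, uint64_t addr, uint64_t len)
{
    if (addr < r->base)
        return false;
    if (len > r->length)
        return false;
    return addr - r->base <= r->length - len;
}
```

With a region of base `0x1000` and length `0x1000`, a request at `0x1800` with a length close to `UINT64_MAX` wraps in the naive version and is accepted, while the safe version rejects it because the length alone exceeds the region.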
Eyal Itkin	628f07d33c	IB/rxe: Fix resid update Update the response's resid field when larger than MTU, instead of only updating the local resid variable. Fixes: `8700e3e7c4` ("Soft RoCE driver") Signed-off-by: Eyal Itkin <eyal.itkin@gmail.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-08 12:28:30 -05:00
David S. Miller	501ec18757	Merge tag 'mlx5-updates-2017-01-31' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2017-01-31 This series includes some updates to mlx5 core and ethernet driver. We got one patch from Or to fix some static checker warnings. The second patch, from Dan, adds support for 128B cache lines in the HCA; it configures the hardware to use 128B alignment only on systems with 128B cache lines, otherwise the current default of 64B is kept. From me three patches to support no inline copy on TX on ConnectX-5 and later HCAs. Starting with two small infrastructure changes and refactoring patches followed by two patches to add the actual support for both xmit ndo and XDP xmit routines. Last patch is a simple fix to return a mistakenly removed pointer from the SQ structure, which was removed in the previous submission of mlx5 4K UAR. Saeed. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-07 13:44:08 -05:00
Christoph Hellwig	b6a05c823f	scsi: remove eh_timed_out methods in the transport template Instead define the timeout behavior purely based on the host_template eh_timed_out method and wire up the existing transport implementations in the host templates. This also clears up the confusion that the transport template method overrides the host template one, so some drivers have to re-override the transport template one. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-02-06 19:10:03 -05:00
Parav Pandit	d0d7b10b05	net-next: treewide use is_vlan_dev() helper function. This patch makes use of is_vlan_dev() function instead of flag comparison which is exactly done by is_vlan_dev() helper function. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Jon Maxwell <jmaxwell37@gmail.com> Acked-by: Johannes Thumshirn <jth@kernel.org> Acked-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-06 16:33:29 -05:00
Saeed Mahameed	2b31f7ae5f	net/mlx5: TX WQE update Add new TX WQE fields for Connect-X5 vlan insertion support, type and vlan_tci, when type = MLX5_ETH_WQE_INSERT_VLAN the HW will insert the vlan and prio fields (vlan_tci) to the packet. Those bits and the inline header fields are mutually exclusive, and valid only when: MLX5_CAP_ETH(mdev, wqe_inline_mode) == MLX5_CAP_INLINE_MODE_NOT_REQUIRED and MLX5_CAP_ETH(mdev, wqe_vlan_insert), who will be set in ConnectX-5 and later HW generations. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com>	2017-02-06 18:20:16 +02:00
David S. Miller	4e8f2fc1a5	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Two trivial overlapping changes conflicts in MPLS and mlx5. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-28 10:33:06 -05:00
Yuval Shaia	24dc831b77	IB/core: Add inline function to validate port Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-27 14:33:59 -05:00
Bart Van Assche	2bce1a6d22	IB/srpt: Accept GUIDs as port names Port and ACL information must be configured before an initiator logs in. Make it possible to configure this information before a subnet prefix has been assigned to a port by not only accepting GIDs as target port and initiator port names but by also accepting port GUIDs. Add a 'priv' member to struct se_wwn to allow target drivers to associate their own data with struct se_wwn. Reported-by: Doug Ledford <dledford@redhat.com> References: http://www.spinics.net/lists/linux-rdma/msg39505.html Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-27 14:31:50 -05:00
Christophe Jaillet	a3dd3a48a5	IB/cma: Fix reversed test This test looks reversed. We should log an error message only if 'ib_attach_mcast()' fails. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-27 14:29:20 -05:00
Jack Morgenstein	b4cfe3971f	RDMA/cma: Fix unknown symbol when CONFIG_IPV6 is not enabled If IPV6 has not been enabled in the underlying kernel, we must avoid calling IPV6 procedures in rdma_cm.ko. This requires using "IS_ENABLED(CONFIG_IPV6)" in "if" statements surrounding any code which calls external IPV6 procedures. In the instance fixed here, procedure cma_bind_addr() called ipv6_addr_type() -- which resulted in calling external procedure __ipv6_addr_type(). Fixes: `6c26a77124` ("RDMA/cma: fix IPv6 address resolution") Cc: <stable@vger.kernel.org> # v4.2+ Cc: Spencer Baugh <sbaugh@catern.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-27 14:29:04 -05:00
Zhu Yanjun	5c37077fd0	IB/ipoib: Remove the unnecessary error check The function ipoib_mcast_start_thread/ipoib_ib_dev_up always return zero. As such, in the function ipoib_open, err_stop will never be reached. So remove this err_stop and change the return type of the function ipoib_mcast_start_thread/ipoib_ib_dev_up to void. Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 16:22:24 -05:00
Shiraz Saleem	3f9fade5e7	i40iw: Set maj_err and min_err in i40iw_sc_cqp_create Set maj_err and min_err in i40iw_sc_cqp_create so that it returns correct values for all return cases. This also addresses an uninitialized variable warning for maj_err and min_err in i40iw_create_cqp. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Reported-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 16:20:37 -05:00
Leon Romanovsky	564649b4ea	IB/qib: Remove empty function Commit `f06267104d` ("RDMA: Update workqueue usage") removed content of qib_qsfp_deinit(...) and left it empty. This patch deletes all leftovers of that function. Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 16:20:37 -05:00
Jack Wang	21d6454a39	RDMA/core: create struct ib_port_cache As Jason suggested, we have 4 elements for per port arrays, it's better to have a separate structure to represent them. It simplifies code a bit, ~ 30 lines of code less :) Signed-off-by: Jack Wang <jinpu.wang@profitbricks.com> Reviewed-by: Michael Wang <yun.wang@profitbricks.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 16:20:37 -05:00
Zhu Yanjun	dfc0e55506	IB/ipoib: function interface change The ipoib_ib_dev_down/ipoib_ib_dev_stop return zero unconditionally and the callers never check the returned values, change the return type to void and remove the redundant return values. Reviewed-by: Shan Hai <shan.hai@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 16:20:37 -05:00
Dan Carpenter	820cd30ac2	i40iw: fix some indenting in i40iw_sc_vsi_init() The debug printk was indented more than it should have been and we can remove an unnecessary line break. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Shiraz Saleem <shiraz.saleem@intel.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 16:20:37 -05:00
Moni Shoua	19b752a19d	IB/cma: Allow port reuse for rdma_id When allocating a port number for binding to a rdma_id, assuming the allocation is not for a specific port, the rule is to allow only ports that were not in use before by any other rdma_id. This condition is too strong to achieve the goal of a unique 5 tuple rdma_id. Instead, we can compare current rdma_id with other rdma_id for difference in one of destination port, source address and destination address to allow port reuse. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 16:20:37 -05:00
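The port-reuse rule described above can be sketched as a plain comparison between two identifiers. This is a simplified illustration and not the rdma_cm code: `struct conn_id` and `may_share_src_port()` are hypothetical, and addresses are reduced to 32-bit values for brevity.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the addressing part of an rdma_id. */
struct conn_id {
    uint32_t src_addr;
    uint32_t dst_addr;
    uint16_t src_port;
    uint16_t dst_port;
};

/*
 * Two ids may share a source port as long as they differ in at least
 * one of destination port, source address or destination address,
 * because the resulting tuples are still distinct.
 */
static bool may_share_src_port(const struct conn_id *a, const struct conn_id *b)
{
    return a->dst_port != b->dst_port ||
           a->src_addr != b->src_addr ||
           a->dst_addr != b->dst_addr;
}
```

Under this rule only an id that matches an existing one in all three fields is refused the same source port, which is weaker than the older "port never used before" rule but still keeps each tuple unique.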
Moni Shoua	498683c6a7	IB/cma: Add debug messages to error flows Print debug messages to the kernel log to add more information about RDMA_CM events that indicate an error. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 16:20:37 -05:00
Zhu Yanjun	f7534f45dc	IB/ipoib: Remove unnecessary returned value check In the function ipoib_set_dev_features, the returned value is always 0. As such, it is not necessary to check the returned value. This is not a bug. It is a trivial problem. Reviewed-by: Guanglei Li <guanglei.li@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 16:20:37 -05:00
Colin Ian King	506f71d181	IB/isert: fix spelling mistake: "teminating" -> "terminating" Trivial fix to spelling mistake in isert_warn message Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 16:20:37 -05:00
Yonatan Cohen	2d4b21e0a2	IB/rxe: Prevent from completer to operate on non valid QP On UD QP completer tasklet is scheduled for each packet sent. If it is followed by a destroy_qp(), the kernel panic will happen as the completer tries to operate on a destroyed QP. Fixes: `8700e3e7c4` ("Soft RoCE driver") Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 16:17:32 -05:00
Maor Gottlieb	f39f775218	IB/rxe: Fix rxe dev insertion to rxe_dev_list The first argument of list_add_tail is the new item and the second is the head of the list. Fix the code to pass arguments in the right order, otherwise not all the rxe devices will be removed during teardown. Fixes: `8700e3e7c4` ('Soft RoCE driver') Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 16:17:25 -05:00
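The effect of swapping the two arguments can be reproduced with a minimal user-space imitation of a circular doubly linked list. The helpers below only mirror the argument order of the kernel's `list_add_tail(new, head)`; `struct list_node`, `list_count()` and `entries_reachable()` are hypothetical names, and the nodes are initialised up front so that the swapped call stays well defined.

```c
#include <stddef.h>

/* Minimal circular doubly linked list node (user-space imitation). */
struct list_node {
    struct list_node *prev, *next;
};

static void list_init(struct list_node *n)
{
    n->prev = n;
    n->next = n;
}

/* Insert 'item' immediately before 'head', i.e. at the tail of the list. */
static void list_add_tail(struct list_node *item, struct list_node *head)
{
    item->prev = head->prev;
    item->next = head;
    head->prev->next = item;
    head->prev = item;
}

/* Count the entries reachable by walking forward from 'head'. */
static size_t list_count(struct list_node *head)
{
    size_t n = 0;

    for (struct list_node *p = head->next; p != head; p = p->next)
        n++;
    return n;
}

/* Add two entries and report how many a walk from the head reaches. */
static size_t entries_reachable(int swap_args)
{
    struct list_node head, a, b;

    list_init(&head);
    list_init(&a);
    list_init(&b);
    if (swap_args) {
        list_add_tail(&head, &a); /* wrong: head is passed as the new item */
        list_add_tail(&head, &b);
    } else {
        list_add_tail(&a, &head); /* right: new item first, then the head */
        list_add_tail(&b, &head);
    }
    return list_count(&head);
}
```

In this sketch the correct order leaves both entries reachable from the head, while the swapped order re-links the head onto the most recent entry and orphans the earlier one, which is consistent with the teardown symptom the commit describes.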
Kenneth Lee	828f6fa65c	IB/umem: Release pid in error and ODP flow 1. Release pid before enter odp flow 2. Release pid when fail to allocate memory Fixes: `87773dd56d` ("IB: ib_umem_release() should decrement mm->pinned_vm from ib_umem_get") Fixes: `8ada2c1c0c` ("IB/core: Add support for on demand paging regions") Signed-off-by: Kenneth Lee <liguozhu@hisilicon.com> Reviewed-by: Haggai Eran <haggaie@mellanox.com> Reviewed-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 15:44:31 -05:00
Ram Amrani	f449c7a2d8	RDMA/qedr: Dispatch port active event from qedr_add Relying on qede to trigger qedr on startup is problematic. When probing both if qedr loads slowly then qede can assume qedr is missing and not trigger it. This patch adds a triggering from qedr and protects against a race via an atomic bit. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 15:35:08 -05:00
Ram Amrani	9c1e0228ab	RDMA/qedr: Fix and simplify memory leak in PD alloc Free the PD if no internal resources were available. Move userspace code under the relevant 'if'. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 15:35:07 -05:00
Ram Amrani	af2b14b8b8	RDMA/qedr: Fix RDMA CM loopback The loopback logic in RDMA CM packets compares Ethernet addresses and was accidentally inverted. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 15:35:02 -05:00
Ram Amrani	1a59075197	RDMA/qedr: Fix formatting Remove standalone ';'. List function's parameters in a single line. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 15:35:01 -05:00
Ram Amrani	27a4b1a6d6	RDMA/qedr: Mark three functions as static mark qedr_get_state_from_ibqp(), __qedr_alloc_mr() and __qedr_post_send() as static since they are only used in the same file. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 15:34:56 -05:00
Ram Amrani	933e6dcaa0	RDMA/qedr: Don't reset QP when queues aren't flushed Fail QP state transition from error to reset if SQ/RQ are not empty and still in the process of flushing out the queued work entries. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 15:34:55 -05:00
Ram Amrani	c78c314961	RDMA/qedr: Don't spam dmesg if QP is in error state It is normal to flush CQEs if the QP is in error state. Hence there's no use in printing a message per CQE to dmesg. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 15:34:54 -05:00
Ram Amrani	91bff997db	RDMA/qedr: Remove CQ spinlock from CM completion handlers There is only a single event queue that triggers the completion events for the RDMA CM and it is being processed serially. This means that inherently there can be no parallelism of CQ completion handler callbacks, hence the lock is redundant. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 15:34:43 -05:00
Ram Amrani	59e8970b37	RDMA/qedr: Return max inline data in QP query result Return the maximum supported amount of inline data, not the qp's current configured inline data size, when filling out the results of a query qp call. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 15:34:37 -05:00
Ram Amrani	865cea40b6	RDMA/qedr: Return success when not changing QP state If the user is requesting us to change the QP state to the same state that it is already in, return success instead of failure. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 15:34:36 -05:00
Amrani, Ram	097b615965	RDMA/qedr: Fix MTU returned from QP query MTU value returned from QP query should include overhead. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 15:34:30 -05:00
Amrani, Ram	d3f4aadd61	RDMA/core: Add the function ib_mtu_int_to_enum As the functionality to convert the MTU from a number to enum_ib_mtu is ubiquitous, define a dedicated function and remove the duplicated code. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 15:34:22 -05:00
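A number-to-enum MTU conversion of the kind described above can be sketched as follows. The enum values 1 to 5 follow the InfiniBand MTU encoding; the names `enum mtu_code` and `mtu_int_to_code()` are hypothetical, and rounding down to the largest encoded MTU that fits is an assumption of this sketch rather than a statement about the kernel helper.

```c
/* InfiniBand MTU encoding: 1 = 256 bytes ... 5 = 4096 bytes. */
enum mtu_code {
    MTU_256 = 1,
    MTU_512 = 2,
    MTU_1024 = 3,
    MTU_2048 = 4,
    MTU_4096 = 5,
};

/*
 * Map a byte count to the largest encoded MTU that does not exceed it.
 * Values below 512 (including nonsensical ones) fall back to MTU_256.
 */
static enum mtu_code mtu_int_to_code(int mtu)
{
    if (mtu >= 4096)
        return MTU_4096;
    if (mtu >= 2048)
        return MTU_2048;
    if (mtu >= 1024)
        return MTU_1024;
    if (mtu >= 512)
        return MTU_512;
    return MTU_256;
}
```

For example, a 1500-byte Ethernet MTU maps to `MTU_1024` in this sketch, since 2048 would not fit.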
Ganesh Goudar	bab572f1d4	iw_cxgb4: Guard against null cm_id in dump_ep/qp Endpoints that are aborting can have already dereferenced the cm_id and set ep->com.cm_id to NULL. So guard against that in dump_ep() and dump_qp(). Also create a common function for setting up ip address pointers since the same logic is needed in several places. Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 14:44:01 -05:00
Yuval Shaia	f57e8ca50e	IB/mad: Add port_num to error message Print the invalid port number to ease troubleshooting. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 14:20:42 -05:00
Yuval Shaia	1dd70ea360	IB/vmw_pvrdma: Remove unused qp_type Remove the unused qp_type parameter from function's args Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Acked-by: Adit Ranadive <aditr@vmware.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 14:20:42 -05:00
Yuval Shaia	6c6e51a617	IB/core: Fix typo in comment Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 14:19:48 -05:00
Adit Ranadive	ff89b070b7	IB/vmw_pvrdma: Fix incorrect cleanup on pvrdma_pci_probe error path If the interrupt allocation failed we should start freeing the CQ rings rather than unregistering the netdev notifier. Fixes: `29c8d9eba5` ("IB: Add vmw_pvrdma driver") Signed-off-by: Adit Ranadive <aditr@vmware.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 14:15:28 -05:00
Adit Ranadive	7d211c81e9	IB/vmw_pvrdma: Don't leak info from alloc_ucontext Clear out the user response struct correctly. Fixes: `29c8d9eba5` ("IB: Add vmw_pvrdma driver") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Adit Ranadive <aditr@vmware.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 14:15:28 -05:00
Or Gerlitz	7898489880	IB/mlx5: Enable Eth VFs to query their min-inline value for user-space For some mlx5 HW models (CX4, CX4Lx), the VF driver needs to put part of the packet headers on the TX descriptor so the e-switch can do proper matching and steering. This is called "min-inline"; it is advertised to the VF by the FW and also enforced on them by the HW, such that if they don't obey, their packets are dropped. SRIOV VF libmlx5 instances should take into account the min-inline value of their vports. To that end, we provide this value through the vendor response part of the init_ucontext command. The min-inline value is reported in a way which will let newer libmlx5 instances realize that they are running over an older kernel and act accordingly (e.g. apply some educated guess). Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-01-24 21:14:06 +02:00
Bart Van Assche	0bbb3b7496	IB/rxe, IB/rdmavt: Use dma_virt_ops instead of duplicating it Make the rxe and rdmavt drivers use dma_virt_ops. Update the comments that refer to the source files removed by this patch. Remove struct ib_dma_mapping_ops. Remove ib_device.dma_ops. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Andrew Boyer <andrew.boyer@dell.com> Cc: Dennis Dalessandro <dennis.dalessandro@intel.com> Cc: Jonathan Toppins <jtoppins@redhat.com> Cc: Alex Estrin <alex.estrin@intel.com> Cc: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:31:32 -05:00
Bart Van Assche	99db949403	IB/core: Remove ib_device.dma_device Add code in ib_register_device() for copying the DMA masks. Use &ib_device.dev in DMA mapping operations instead of dma_device. Remove ib_device.dma_device because due to this and previous patches it is no longer used. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:26:17 -05:00
Bart Van Assche	e3dfa60c0a	IB/srpt: Modify a debug statement Since a later patch will remove ib_device.dma_device and since knowing the value of that pointer is not too important, remove dma_device from the debug output. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:26:17 -05:00
Bart Van Assche	dee2b82a5f	IB/srp: Switch from dma_device to dev.parent Prepare for removal of ib_device.dma_device. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:26:17 -05:00
Bart Van Assche	61118cecf2	IB/iser: Switch from dma_device to dev.parent Prepare for removal of ib_device.dma_device. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:26:17 -05:00
Bart Van Assche	db97ed0a2e	IB/IPoIB: Switch from dma_device to dev.parent Prepare for removal of ib_device.dma_device. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:26:17 -05:00
Bart Van Assche	85e9f1dbbd	IB/rxe: Switch from dma_device to dev.parent Prepare for removal of ib_device.dma_device. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:26:17 -05:00
Bart Van Assche	a62ef9a7d2	IB/vmw_pvrdma: Switch from dma_device to dev.parent Prepare for removal of ib_device.dma_device. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Adit Ranadive <aditr@vmware.com> Cc: VMware PV-Drivers <pv-drivers@vmware.com> Acked-by: Adit Ranadive <aditr@vmware.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:23:35 -05:00
Bart Van Assche	6b06d52dbe	IB/usnic: Switch from dma_device to dev.parent Prepare for removal of ib_device.dma_device. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Christian Benvenuti <benve@cisco.com> Cc: Dave Goodell <dgoodell@cisco.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:23:35 -05:00
Bart Van Assche	989ab358f7	IB/qib: Switch from dma_device to dev.parent Prepare for removal of ib_device.dma_device. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Latif <faisal.latif@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:23:35 -05:00
Bart Van Assche	69117101f9	IB/qedr: Switch from dma_device to dev.parent Prepare for removal of ib_device.dma_device. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Ram Amrani <Ram.Amrani@cavium.com> Cc: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:23:35 -05:00
Bart Van Assche	e6a73f2672	IB/ocrdma: Switch from dma_device to dev.parent Prepare for removal of ib_device.dma_device. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Selvin Xavier <selvin.xavier@avagotech.com> Cc: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:23:35 -05:00
Bart Van Assche	a487a0bff3	IB/nes: Remove a superfluous assignment statement Prepare for removal of ib_device.dma_device. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:23:35 -05:00
Bart Van Assche	26e372705f	IB/mthca: Switch from dma_device to dev.parent Prepare for removal of ib_device.dma_device. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:23:35 -05:00
Bart Van Assche	9b0c289ec4	IB/mlx5: Switch from dma_device to dev.parent Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Matan Barak <matanb@mellanox.com> Cc: Leon Romanovsky <leonro@mellanox.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:23:35 -05:00
Bart Van Assche	d66c88a8fc	IB/mlx4: Switch from dma_device to dev.parent Prepare for removal of ib_device.dma_device. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:23:35 -05:00
Bart Van Assche	f2296adccf	IB/i40iw: Remove a superfluous assignment statement Due to a previous patch initializing ib_device.dev.parent is sufficient and initializing dma_device is no longer needed. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Faisal Latif <faisal.latif@intel.com> Cc: Shiraz Saleem <shiraz.saleem@intel.com> Acked-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:23:35 -05:00
Bart Van Assche	fecd02eb2c	IB/hns: Switch from dma_device to dev.parent Prepare for removal of ib_device.dma_device. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Lijun Ou <oulijun@huawei.com> Cc: Wei Hu(Xavier) <xavier.huwei@huawei.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:23:35 -05:00
Bart Van Assche	3067771c51	IB/hfi1: Switch from dma_device to dev.parent Prepare for removal of ib_device.dma_device. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Mike Marciniszyn <mike.marciniszyn@intel.com> Cc: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:23:35 -05:00
Bart Van Assche	d08868a15a	IB/cxgb4: Set dev.parent instead of dma_device Prepare for removal of ib_device.dma_device. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Hariprasad S <hariprasad@chelsio.com> Acked-by: Steve Wise <swise@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:23:35 -05:00
Bart Van Assche	91f734b4f3	IB/cxgb3: Set dev.parent instead of dma_device Prepare for removal of ib_device.dma_device. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Steve Wise <swise@chelsio.com> Acked-by: Steve Wise <swise@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:23:35 -05:00
Bart Van Assche	1e35a0880f	IB/core: Use dev.parent instead of dma_device Prepare for removal of ib_device.dma_device. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:23:35 -05:00
Bart Van Assche	97a9ea8480	IB/core: Initialize ib_device.dev.parent earlier Move the ib_device.dev.parent initialization code from ib_device_register_sysfs() to ib_register_device(). Additionally, allow HBA drivers to set ib_device.dev.parent without setting ib_device.dma_device. This is the first step towards removing ib_device.dma_device. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:23:35 -05:00
Bart Van Assche	5f0cb80134	IB/qib: Remove DMA mapping code The qib DMA mapping code is no longer built since commit `eb636ac0e4` ("IB/qib: Remove dma.c and use rdmavt version of dma functions"). Hence remove it. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Mike Marciniszyn <mike.marciniszyn@intel.com> Cc: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:23:35 -05:00
Bart Van Assche	e6d356d3cd	IB/hf1: Remove DMA mapping code The hfi1 DMA mapping code has never been built in any upstream kernel. Hence remove it. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Dennis Dalessandro <dennis.dalessandro@intel.com> Cc: Dean Luick <dean.luick@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:23:35 -05:00
Bart Van Assche	5657933dbb	treewide: Move dma_ops from struct dev_archdata into struct device Some but not all architectures provide set_dma_ops(). Move dma_ops from struct dev_archdata into struct device such that it becomes possible on all architectures to configure dma_ops per device. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Juergen Gross <jgross@suse.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: linux-arch@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: Russell King <linux@armlinux.org.uk> Cc: x86@kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:23:35 -05:00
Max Gurtovoy	83236f0157	IB/iser: remove unused variable from iser_conn struct max_sectors calculation was fixed in commit: `9c674815d3` ("IB/iser: Fix max_sectors calculation"). Thus, iser_conn variable scsi_max_sectors is not needed anymore. Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Tested-by: Raju Rangoju <rajur@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 11:37:45 -05:00
Max Gurtovoy	1e5db6c31a	IB/iser: Fix sg_tablesize calculation For devices that can register a page list that is bigger than USHRT_MAX, we actually take the wrong value for sg_tablesize. E.g. for CX4, max_fast_reg_page_list_len is 65536 (bigger than USHRT_MAX), so we set sg_tablesize to 0 by mistake. Therefore, each IO that is bigger than 4k is split into "< 4k" chunks, which causes performance degradation. Remove the wrong sg_tablesize assignment, and use the value that was set during the address resolution handler with the needed casting. Cc: <stable@vger.kernel.org> # v4.5+ Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 11:37:45 -05:00
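The truncation behind the sg_tablesize problem above is easy to reproduce in isolation. The sketch below assumes a 16-bit destination type (`uint16_t` stands in for `unsigned short`); `truncate_to_u16()` and `clamp_to_u16()` are hypothetical helpers, and clamping is shown only as one generic way to avoid the wrap, not as what the patch does.

```c
#include <stdint.h>

/* Plain narrowing conversion: only the low 16 bits survive. */
static uint16_t truncate_to_u16(uint32_t value)
{
    return (uint16_t)value;
}

/* Saturating conversion: values that do not fit become UINT16_MAX. */
static uint16_t clamp_to_u16(uint32_t value)
{
    return value > UINT16_MAX ? UINT16_MAX : (uint16_t)value;
}
```

Passing 65536 through the plain conversion yields 0, which matches the value the commit message reports for sg_tablesize, whereas the saturating variant yields 65535.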
Israel Rukshin	0a475ef422	IB/srp: fix invalid indirect_sg_entries parameter value After setting indirect_sg_entries module_param to a huge value (e.g. 500,000), srp_alloc_req_data() fails to allocate indirect descriptors for the request ring (kmalloc fails). This commit enforces the maximum value of indirect_sg_entries to be SG_MAX_SEGMENTS as signified in module param description. Fixes: `65e8617fba` (scsi: rename SCSI_MAX_{SG, SG_CHAIN}_SEGMENTS) Fixes: `c07d424d61` (IB/srp: add support for indirect tables that don't fit in SRP_CMD) Cc: stable@vger.kernel.org # 4.7+ Signed-off-by: Israel Rukshin <israelr@mellanox.com> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Laurence Oberman <loberman@redhat.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 11:30:14 -05:00
Israel Rukshin	ad8e66b4a8	IB/srp: fix mr allocation when the device supports sg gaps If the device supports arbitrary sg list mapping (device cap IB_DEVICE_SG_GAPS_REG set) we allocate the memory regions with IB_MR_TYPE_SG_GAPS. Fixes: `509c5f33f4` ("IB/srp: Prevent mapping failures") Cc: <stable@vger.kernel.org> # 4.7+ Signed-off-by: Israel Rukshin <israelr@mellanox.com> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 11:03:17 -05:00
Mohamad Haj Yahia	105433659d	net/mlx5: Add support to s-tag in mlx5 firmware interface Add svlan_tag and rename vlan_tag to cvlan_tag in flow table entry match param. Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>	2017-01-19 23:19:55 +02:00
Peter Zijlstra	2c935bc572	locking/atomic, kref: Add kref_read() Since we need to change the implementation, stop exposing internals. Provide kref_read() to read the current reference count; typically used for debug messages. Kills two anti-patterns: atomic_read(&kref->refcount) kref->refcount.counter Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-01-14 11:37:18 +01:00
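The accessor idea behind kref_read() can be sketched in userspace C with C11 atomics. Everything here (struct demo_kref and the demo_kref_* functions) is a hypothetical stand-in, not the kernel's kref implementation; it only shows callers reading the count through a function instead of reaching into the structure.

```c
#include <stdatomic.h>

/* Hypothetical userspace analogue of a reference counter. */
struct demo_kref {
	atomic_int refcount;
};

/* A new object starts with one reference. */
void demo_kref_init(struct demo_kref *k)
{
	atomic_init(&k->refcount, 1);
}

/* Take an additional reference. */
void demo_kref_get(struct demo_kref *k)
{
	atomic_fetch_add(&k->refcount, 1);
}

/* Read-only accessor: callers such as debug messages use this rather
 * than touching k->refcount directly, so the internal representation
 * can change without breaking them. */
int demo_kref_read(struct demo_kref *k)
{
	return atomic_load(&k->refcount);
}
```

A debug print would then use demo_kref_read(&obj) rather than naming the counter field.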
Jack Wang	102c5ce082	RDMA/cma: use cached port state when bind loopback Signed-off-by: Jack Wang <jinpu.wang@profitbricks.com> Reviewed-by: Michael Wang <yun.wang@profitbricks.com> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-12 23:00:04 -05:00
Jack Wang	93b1f29de7	RDMA/cma: resolve to first active ib port When we try to resolve a dest addr, if we don't give src addr, cma core will try to resolve to our source ib device automatically. The current logic only checks if a given port has the same subnet_prefix as our dest, which is not enough if we use default well known subnet_prefix on our active port, as it will be the same as the subnet_prefix on inactive ports and we might match against an inactive port by accident. To resolve this, we should also check if port is active before we resolve it as a suitable src address for a given dest. Signed-off-by: Jack Wang <jinpu.wang@profitbricks.com> Reviewed-by: Michael Wang <yun.wang@profitbricks.com> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-12 23:00:04 -05:00
Jack Wang	9e2c3f1c7f	RDMA/core: export ib_get_cached_port_state Export function for rdma_cm, patch for rdma_cm to follow. Signed-off-by: Jack Wang <jinpu.wang@profitbricks.com> Reviewed-by: Michael Wang <yun.wang@profitbricks.com> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-12 23:00:00 -05:00
Jack Wang	aaaca121c7	RDMA/core: add port state cache We need a port state cache in ib_core, later we will use in rdma_cm. Signed-off-by: Jack Wang <jinpu.wang@profitbricks.com> Reviewed-by: Michael Wang <yun.wang@profitbricks.com> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-12 22:59:55 -05:00
Feras Daoud	27d41d29c7	IB/ipoib: Change list_del to list_del_init in the tx object Since ipoib_cm_tx_start function and ipoib_cm_tx_reap function belong to different work queues, they can run in parallel. In this case if ipoib_cm_tx_reap calls list_del and releases the lock, ipoib_cm_tx_start may acquire it and call list_del_init on the already deleted object. Changing list_del to list_del_init in ipoib_cm_tx_reap fixes the problem. Fixes: `839fcaba35` ("IPoIB: Connected mode experimental support") Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-12 14:01:06 -05:00
Feras Daoud	c586071d1d	IB/ipoib: Replace list_del of the neigh->list with list_del_init In order to resolve a situation where a few processes delete the same list element in sequence and cause a panic, list_del is replaced with list_del_init. In this case if the first process that calls list_del releases the lock before acquiring it again, other processes that can acquire the lock will call list_del_init. Fixes: `b63b70d877` ("IPoIB: Use a private hash table for path lookup") Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-12 14:01:05 -05:00
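Why a second deletion is harmless after list_del_init can be shown with a small userspace list. This is a sketch that assumes a circular doubly linked list shaped like the kernel's; struct list_node and the node_* helpers are hypothetical names, not the kernel API.

```c
/* Hypothetical circular doubly linked list node. */
struct list_node {
	struct list_node *next;
	struct list_node *prev;
};

/* An initialised (or unlinked) node points at itself. */
void node_init(struct list_node *n)
{
	n->next = n;
	n->prev = n;
}

/* Insert n right after head. */
void node_add(struct list_node *n, struct list_node *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* Unlink n and re-initialise it. Because n then points at itself,
 * calling this again only rewrites n's own pointers and leaves the
 * rest of the list untouched. A plain delete that left stale or
 * poisoned pointers in n would corrupt the list or fault instead. */
void node_del_init(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	node_init(n);
}
```

Deleting the same node twice with node_del_init leaves the remaining list intact, which is the property the two IPoIB fixes above rely on.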
Feras Daoud	13ee429a02	IB/ipoib: Use debug prints instead of warnings in RNR WC status If a receive request has not been posted to the work queue, the incoming message is rejected and the peer will receive a receiver-not-ready (RNR) error. In IPoIB, IB_WC_RNR_RETRY_EXC_ERR error is part of the life cycle therefore ipoib_cm_handle_tx_wc function will print to debug instead of warnings. Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-12 14:01:05 -05:00
Feras Daoud	d32b9a81d7	IB/ipoib: Add detailed error message to dev_queue_xmit call Add a detailed return code to dev_queue_xmit function when calling to requeue packet via __skb_dequeue. Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-12 14:01:04 -05:00
Feras Daoud	89a3987ab7	IB/ipoib: rtnl_unlock can not come after free_netdev The ipoib_vlan_add function calls rtnl_unlock after free_netdev, rtnl_unlock not only releases the lock, but also calls netdev_run_todo. The latter function browses the net_todo_list array and completes the unregistration of all its net_device instances. If we call free_netdev before rtnl_unlock, then netdev_run_todo call over the freed device causes panic. To fix, move rtnl_unlock call before free_netdev call. Fixes: `9baa0b0364` ("IB/ipoib: Add rtnl_link_ops support") Cc: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-12 14:01:04 -05:00
Feras Daoud	0a0007f283	IB/ipoib: Fix deadlock between rmmod and set_mode When calling set_mode from sys/fs, the call flow locks the sys/fs lock first and then tries to lock rtnl_lock (when calling ipoib_set_mode). On the other hand, the rmmod call flow takes the rtnl_lock first (when calling unregister_netdev) and then tries to take the sys/fs lock. Deadlock a->b, b->a. The problem starts when ipoib_set_mode frees its rtnl_lock and tries to get it after that. set_mode: [<ffffffff8104f2bd>] ? check_preempt_curr+0x6d/0x90 [<ffffffff814fee8e>] __mutex_lock_slowpath+0x13e/0x180 [<ffffffff81448655>] ? __rtnl_unlock+0x15/0x20 [<ffffffff814fed2b>] mutex_lock+0x2b/0x50 [<ffffffff81448675>] rtnl_lock+0x15/0x20 [<ffffffffa02ad807>] ipoib_set_mode+0x97/0x160 [ib_ipoib] [<ffffffffa02b5f5b>] set_mode+0x3b/0x80 [ib_ipoib] [<ffffffff8134b840>] dev_attr_store+0x20/0x30 [<ffffffff811f0fe5>] sysfs_write_file+0xe5/0x170 [<ffffffff8117b068>] vfs_write+0xb8/0x1a0 [<ffffffff8117ba81>] sys_write+0x51/0x90 [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b rmmod: [<ffffffff81279ffc>] ? put_dec+0x10c/0x110 [<ffffffff8127a2ee>] ? number+0x2ee/0x320 [<ffffffff814fe6a5>] schedule_timeout+0x215/0x2e0 [<ffffffff8127cc04>] ? vsnprintf+0x484/0x5f0 [<ffffffff8127b550>] ? string+0x40/0x100 [<ffffffff814fe323>] wait_for_common+0x123/0x180 [<ffffffff81060250>] ? default_wake_function+0x0/0x20 [<ffffffff8119661e>] ? ifind_fast+0x5e/0xb0 [<ffffffff814fe43d>] wait_for_completion+0x1d/0x20 [<ffffffff811f2e68>] sysfs_addrm_finish+0x228/0x270 [<ffffffff811f2fb3>] sysfs_remove_dir+0xa3/0xf0 [<ffffffff81273f66>] kobject_del+0x16/0x40 [<ffffffff8134cd14>] device_del+0x184/0x1e0 [<ffffffff8144e59b>] netdev_unregister_kobject+0xab/0xc0 [<ffffffff8143c05e>] rollback_registered+0xae/0x130 [<ffffffff8143c102>] unregister_netdevice+0x22/0x70 [<ffffffff8143c16e>] unregister_netdev+0x1e/0x30 [<ffffffffa02a91b0>] ipoib_remove_one+0xe0/0x120 [ib_ipoib] [<ffffffffa01ed95f>] ib_unregister_device+0x4f/0x100 [ib_core] [<ffffffffa021f5e1>] mlx4_ib_remove+0x41/0x180 [mlx4_ib] [<ffffffffa01ab771>] mlx4_remove_device+0x71/0x90 [mlx4_core] Fixes: `862096a8bb` ("IB/ipoib: Add more rtnl_link_ops callbacks") Cc: <stable@vger.kernel.org> # v3.6+ Cc: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-12 14:01:03 -05:00
Feras Daoud	1c3098cdb0	IB/ipoib: Fix deadlock over vlan_mutex This patch fixes a deadlock while executing ipoib_vlan_delete. The function takes the vlan_rwsem semaphore and calls unregister_netdevice. The latter function calls ipoib_mcast_stop_thread, which causes a workqueue flush. When the queue has one of the ipoib_ib_dev_flush_xxx events, a deadlock occurs because these events also try to take the same vlan_rwsem semaphore. To fix, unregister_netdevice should be called after releasing the semaphore. Fixes: `cbbe1efa49` ("IPoIB: Fix deadlock between ipoib_open() and child interface create") Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-12 14:01:02 -05:00
Feras Daoud	80b5b35aba	IB/ipoib: Set device connection mode only when needed When changing the connection mode, the ipoib_set_mode function did not check if the previous connection mode equals the new one. This commit adds the required check and returns 0 if the new mode equals the previous one. Fixes: `839fcaba35` ("IPoIB: Connected mode experimental support") Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-12 14:01:02 -05:00
Feras Daoud	29da686dff	IB/ipoib: When given an invalid UD MTU, give debug msg In datagram mode, the IB UD (Unreliable Datagram) transport is used so the MTU of the interface is equal to the IB L2 MTU minus the IPoIB encapsulation header. Any request to change the MTU value above the maximum range will change the MTU to the max allowed, but will not show any warning message. An ipoib_warn is issued in such cases, letting the user know that even though the value is legal, it can't be currently applied. Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-12 13:59:56 -05:00
ssh10	db287ec5cb	RDMA/ocrdma: Replace BUG() with BUG_ON() Replace BUG() with BUG_ON() using coccinelle Signed-off-by: Shyam Saini <mayhs11saini@gmail.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-12 12:21:52 -05:00
ssh10	b462b06eb6	RDMA/cxgb4: Use AF_INET for sin_family field Elsewhere the sin_family field holds a value with a name of the form AF_..., so it seems reasonable to do so here as well. Also the values of PF_INET and AF_INET are the same. The semantic patch that makes this change is as follows: //</smpl> @@ struct sockaddr_in sip; @@ ( sip.sin_family == - PF_INET + AF_INET \| sip.sin_family != - PF_INET + AF_INET \| sip.sin_family = - PF_INET + AF_INET ) //</smpl> Signed-off-by: Shyam Saini <mayhs11saini@gmail.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-12 12:21:52 -05:00
Amrani, Ram	df15856132	RDMA/qedr: restructure functions that create/destroy QPs Simplify function and sub-function flow of QP creation and destruction. This also serves as a preparation for SRQ and iWARP support. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Reviewed-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-12 12:21:41 -05:00
Geliang Tang	bb75f33cf0	RDMA/qib: use rb_entry() To make the code clearer, use rb_entry() instead of container_of() to deal with rbtree. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Acked-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-12 11:38:41 -05:00
Cao jin	e8f4eb3bfa	RDMA/hfi1: drop pci_link_reset() In AER recovery, pci_error_handlers.link_reset() is never called, drop it now. Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-12 11:38:41 -05:00
Cao jin	850d08721a	RDMA/qib: drop qib_pci_link_reset() In AER recovery, pci_error_handlers.link_reset() is never called, drop it now. Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-12 11:38:41 -05:00
Kees Cook	7f6856b789	RDMA/i40iw: use designated initializers Prepare to mark sensitive kernel structures for randomization by making sure they're using designated initializers. These were identified during allyesconfig builds of x86, arm, and arm64, with most initializer fixes extracted from grsecurity. Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-12 11:38:41 -05:00
Kees Cook	6554c9f7f7	RDMA/nes: use designated initializers Prepare to mark sensitive kernel structures for randomization by making sure they're using designated initializers. These were identified during allyesconfig builds of x86, arm, and arm64, with most initializer fixes extracted from grsecurity. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-12 11:38:41 -05:00
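The difference the two designated-initializer commits above care about can be sketched as follows. struct demo_ops and its callbacks are hypothetical and stand in for a kernel structure of function pointers; the point is only that positional initialization depends on field order while designated initialization does not.

```c
/* Hypothetical ops table standing in for a kernel callback structure. */
struct demo_ops {
	int (*open)(void);
	int (*close)(void);
};

int demo_open(void)
{
	return 1;
}

int demo_close(void)
{
	return 2;
}

/* Positional: correct only while the fields stay in the order
 * open, close. If the layout were reordered, the callbacks would
 * silently end up in the wrong slots. */
const struct demo_ops positional_ops = { demo_open, demo_close };

/* Designated: each callback is bound by field name, so the order
 * written here (and the order of fields in the struct) is irrelevant. */
const struct demo_ops designated_ops = {
	.close = demo_close,
	.open  = demo_open,
};
```

Both tables behave the same today; only the designated form keeps working if the structure layout is randomized or otherwise reordered.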
Bart Van Assche	c5540a0195	IB/rxe: Fix an skb leak Additionally, make it easier to detect skb leaks by issuing a warning if a leak occurs. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Cc: Andrew Boyer <andrew.boyer@dell.com> Cc: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-10 16:52:47 -05:00
Bart Van Assche	839f5ac0d8	IB/rxe: Remove a pointless indirection layer Neither rxe->ifc_ops nor any of the function pointers in struct rxe_ifc_ops ever change. Hence remove the rxe->ifc_ops indirection mechanism. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Andrew Boyer <andrew.boyer@dell.com> Cc: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-10 16:52:47 -05:00
Bart Van Assche	ab17654476	IB/rxe: Fix reference leaks in memory key invalidation code Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Andrew Boyer <andrew.boyer@dell.com> Cc: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-10 16:52:47 -05:00
Bart Van Assche	b3a4599610	IB/rxe: Fix a MR reference leak in check_rkey() Avoid that calling check_rkey() for mem->state == RXE_MEM_STATE_FREE triggers an MR reference leak. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Andrew Boyer <andrew.boyer@dell.com> Cc: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-10 16:52:47 -05:00
Bart Van Assche	18d3451c0d	IB/rxe: Generate a completion for all failed work requests Change do_complete() such that an error completion is not only generated if a QP is in the error state but also if a work request failed. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Andrew Boyer <andrew.boyer@dell.com> Cc: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-10 16:52:47 -05:00
Bart Van Assche	723ec9ae2a	IB/rxe: Introduce functions for queue draining This change makes the code easier to read and avoids that code is duplicated. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Andrew Boyer <andrew.boyer@dell.com> Cc: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-10 16:52:47 -05:00
Bart Van Assche	642c7cbcaf	IB/rxe: Add a runtime check in alloc_index() Since index values equal to or above 'range' can trigger memory corruption, complain if index >= range. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Andrew Boyer <andrew.boyer@dell.com> Cc: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-10 16:52:47 -05:00
Bart Van Assche	43553b47c3	IB/rxe: Issue warnings once It is strongly recommended to report kernel warnings once instead of every time a condition is hit. Hence change WARN_ON() into WARN_ON_ONCE() / BUILD_BUG_ON() as appropriate. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Andrew Boyer <andrew.boyer@dell.com> Cc: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-10 16:52:47 -05:00
Bart Van Assche	32404fb764	IB/rxe: Let the compiler check the type of the cleanup functions Change the argument type of these functions from void * into struct rxe_pool_entry *. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Andrew Boyer <andrew.boyer@dell.com> Cc: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-10 16:52:47 -05:00
Bart Van Assche	046ef24d25	IB/rxe: Enable type checking on SKB_TO_PKT() and PKT_TO_SKB() arguments Let the compiler check the type of the arguments passed to SKB_TO_PKT() and PKT_TO_SKB(). Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Andrew Boyer <andrew.boyer@dell.com> Cc: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-10 16:52:47 -05:00
Bart Van Assche	967335ab90	IB/rxe: Remove superfluous casts Casting a pointer to 'void *' explicitly is not necessary in C code. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Andrew Boyer <andrew.boyer@dell.com> Cc: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-10 16:52:47 -05:00
Bart Van Assche	175f1244c1	IB/rxe: Remove an unused variable and an unused argument The variable 'av' is not used so remove it. Since that change removes the last user of the 'wqe' argument, remove that argument too. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Andrew Boyer <andrew.boyer@dell.com> Cc: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-10 16:52:47 -05:00
Bart Van Assche	c8b82182cb	IB/rxe: Remove an unused function Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Andrew Boyer <andrew.boyer@dell.com> Cc: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-10 16:52:47 -05:00
Bart Van Assche	2bec3baded	IB/rxe: Constify the pool name Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Andrew Boyer <andrew.boyer@dell.com> Cc: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-10 16:52:47 -05:00
Bart Van Assche	8d8f083720	IB/rxe: Suppress sparse warnings Avoid that sparse complains about using 0 as a pointer, about missing function declarations and also avoid that sparse complains about endianness. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Andrew Boyer <andrew.boyer@dell.com> Cc: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-10 16:52:47 -05:00
Selvin Xavier	69ae543969	RDMA: Adding ethertype ETH_P_IBOE Update the if_ether.h with the ethertype for Infiniband over Ethernet packets. Also, remove the occurrences of 0x8915 from infiniband vendor drivers. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-10 14:05:11 -05:00
Steve Wise	3bcf96e018	iw_cxgb4: do not send RX_DATA_ACK CPLs after close/abort Function rx_data(), which handles ingress CPL_RX_DATA messages, was always sending an RX_DATA_ACK with the goal of updating the credits. However, if the RDMA connection is moved out of FPDU mode abruptly, then it is possible for iw_cxgb4 to process queued RX_DATA CPLs after HW has aborted the connection. These CPLs should not trigger RX_DATA_ACKS. If they do, HW can see a READ after DELETE of the DB_LE hash entry for the tid and post a LE_DB HashTblMemCrcError. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-10 14:01:38 -05:00
Steve Wise	c12a67fec8	iw_cxgb4: free EQ queue memory on last deref Commit `ad61a4c7a9` ("iw_cxgb4: don't block in destroy_qp awaiting the last deref") introduced a bug where the RDMA QP EQ queue memory (and QIDs) are possibly freed before the underlying connection has been fully shutdown. The result being a possible DMA read issued by HW after the queue memory has been unmapped and freed. This results in possible WR corruption in the worst case, system bus errors if an IOMMU is in use, and SGE "bad WR" errors reported in the very least. The fix is to defer unmap/free of queue memory and QID resources until the QP struct has been fully dereferenced. To do this, the c4iw_ucontext must also be kept around until the last QP that references it is fully freed. In addition, since the last QP deref can happen in an IRQ disabled context, we need a new workqueue thread to do the final unmap/free of the EQ queue memory. Fixes: `ad61a4c7a9` ("iw_cxgb4: don't block in destroy_qp awaiting the last deref") Cc: stable@vger.kernel.org Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-10 14:01:38 -05:00
Steve Wise	4fe7c2962e	iw_cxgb4: refactor sq/rq drain logic With the addition of the IB/Core drain API, iw_cxgb4 supported drain by watching the CQs when the QP was out of RTS and signalling "drain complete" when the last CQE is polled. This, however, doesn't fully support the drain semantics. Namely, the drain logic is supposed to signal "drain complete" only when the application has _processed_ the last CQE, not just removed them from the CQ. Thus a small timing hole exists that can cause touch after free type bugs in applications using the drain API (nvmf, iSER, for example). So iw_cxgb4 needs a better solution. The iWARP Verbs spec mandates that "_at some point_ after the QP is moved to ERROR", the iWARP driver MUST synchronously fail post_send and post_recv calls. iw_cxgb4 was currently not allowing any posts once the QP is in ERROR. This was in part due to the fact that the HW queues for the QP in ERROR state are disabled at this point, so there wasn't much else to do but fail the post operation synchronously. This restriction is what drove the first drain implementation in iw_cxgb4 that has the above mentioned flaw. This patch changes iw_cxgb4 to allow post_send and post_recv WRs after the QP is moved to ERROR state for kernel mode users, thus still adhering to the Verbs spec for user mode users, but allowing flush WRs for kernel users. Since the HW queues are disabled, we just synthesize a CQE for this post, queue it to the SW CQ, and then call the CQ event handler. This enables proper drain operations for the various storage applications. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-10 14:01:38 -05:00
Parav Pandit	43579b5f2c	IB/core: added support to use rdma cgroup controller Added support APIs for IB core to register/unregister every IB/RDMA device with rdma cgroup for tracking rdma resources. IB core registers with rdma cgroup controller. Added support APIs for uverbs layer to make use of rdma controller. Added uverbs layer to perform resource charge/uncharge functionality. Added support during query_device uverb operation to ensure it returns resource limits by honoring rdma cgroup configured limits. Signed-off-by: Parav Pandit <pandit.parav@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2017-01-10 11:14:27 -05:00
David S. Miller	bda65b4255	Merge tag 'mlx5-4kuar-for-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Saeed Mahameed says: ==================== mlx5 4K UAR The following series of patches optimizes the usage of the UAR area which is contained within the BAR 0-1. Previous versions of the firmware and the driver assumed each system page contains a single UAR. This patch set will query the firmware for a new capability that if published, means that the firmware can support UARs of fixed 4K regardless of system page size. In the case of powerpc, where page size equals 64KB, this means we can utilize 16 UARs per system page. Since user space processes by default consume eight UARs per context this means that with this change a process will need a single system page to fulfill that requirement and in fact make use of more UARs which is better in terms of performance. In addition to optimizing user-space processes, we introduce an allocator that can be used by kernel consumers to allocate blue flame registers (which are areas within a UAR that are used to write doorbells). This provides further optimization on using the UAR area since the Ethernet driver makes use of a single blue flame register per system page and now it will use two blue flame registers per 4K. The series also makes changes to naming conventions and now the terms used in the driver code match the terms used in the PRM (programmers reference manual). Thus, what used to be called UUAR (micro UAR) is now called BFREG (blue flame register). In order to support compatibility between different versions of library/driver/firmware, the library has now means to notify the kernel driver that it supports the new scheme and the kernel can notify the library if it supports this extension. So mixed versions of libraries can run concurrently without any issues. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-09 17:09:31 -05:00
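The page arithmetic in the mlx5 4K UAR description above can be checked with a few lines of C. This is a sketch under the numbers stated in the message (fixed 4K UARs, 64KB pages on powerpc, eight UARs per user context); the macro and function names are hypothetical and do not come from the driver.

```c
/* Fixed UAR size from the series description (4K). */
#define DEMO_UAR_SIZE 4096u

/* Number of fixed-size UARs that fit in one system page. */
unsigned int demo_uars_per_page(unsigned int page_size)
{
	return page_size / DEMO_UAR_SIZE;
}

/* System pages needed to hold uars_needed UARs, rounded up. */
unsigned int demo_pages_for_context(unsigned int uars_needed,
				    unsigned int page_size)
{
	unsigned int per_page = demo_uars_per_page(page_size);

	return (uars_needed + per_page - 1) / per_page;
}
```

With a 64KB page this gives 16 UARs per page, so the eight UARs a context consumes by default fit in a single system page, matching the figures in the message.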
Eli Cohen	30aa60b3bd	IB/mlx5: Support 4k UAR for libmlx5 Add fields to structs to convey to kernel an indication whether the library supports multi UARs per page and return to the library the size of a UAR based on the queried value. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-01-09 20:25:09 +02:00
Eli Cohen	b037c29a80	IB/mlx5: Allow future extension of libmlx5 input data Current check requests that new fields in struct mlx5_ib_alloc_ucontext_req_v2 that are not known to the driver be zero. This was introduced so new libraries passing additional information to the kernel through struct mlx5_ib_alloc_ucontext_req_v2 will be notified by old kernels that do not support their request by failing the operation. This scheme is problematic since it requires libmlx5 to issue the requests with descending input size for struct mlx5_ib_alloc_ucontext_req_v2. To avoid this, we require that new features obey the following rules: If the feature requires one or more fields in the response and at least one of the fields can be encoded such that a zero value means the kernel ignored the request then this field will provide the indication to the library. If no response is required or if zero is a valid response, a new field should be added that indicates to the library whether its request was processed. Fixes: `b368d7cb8c` ('IB/mlx5: Add hca_core_clock_offset to udata in init_ucontext') Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-01-09 20:25:09 +02:00
Eli Cohen	5fe9dec0d0	IB/mlx5: Use blue flame register allocator in mlx5_ib Make use of the blue flame registers allocator at mlx5_ib. Since blue flame was not really supported we remove all the code that is related to blue flame and we let all consumers use the same blue flame register. Once blue flame is supported we will add the code. As part of this patch we also move the definition of struct mlx5_bf to mlx5_ib.h as it is only used by mlx5_ib. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-01-09 20:25:08 +02:00
Eli Cohen	0b80c14f00	IB/mlx5: Fix retrieval of index to first hi class bfreg Fix the function retrieving the index of the first hi latency class blue flame register. High latency class bfregs are located right above medium latency class bfregs. Fixes: `c1be5232d2` ('IB/mlx5: Fix micro UAR allocator') Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-01-08 11:21:26 +02:00
Eli Cohen	2f5ff26478	mlx5: Fix naming convention with respect to UARs This establishes a solid naming convention for UARs. A UAR (User Access Region) can have size identical to a system page or can be fixed 4KB depending on a value queried by firmware. Each UAR always has 4 blue flame registers, which are used to post doorbells to the send queue. In addition, a UAR has a section used for posting doorbells to CQs or EQs. In this patch we change names to reflect these conventions. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-01-08 11:21:26 +02:00
Eli Cohen	f4044dac63	IB/mlx5: Fix error handling order in create_kernel_qp Make sure order of cleanup is exactly the opposite of initialization. Fixes: `9603b61de1` ('mlx5: Move pci device handling from mlx5_ib to mlx5_core') Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-01-08 11:21:26 +02:00
Eli Cohen	de8d6e02ef	IB/mlx5: Fix kernel to user leak prevention logic The logic was broken as it failed to update the response length for architectures with PAGE_SIZE larger than 4kB. As a result further extension of the ucontext response struct would fail. Fixes: `d69e3bcf79` ('IB/mlx5: Mmap the HCA's core clock register to user-space') Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-01-08 11:21:26 +02:00
David S. Miller	76eb75be79	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2017-01-05 11:03:07 -05:00
Artemy Kovalyov	aa8e08d2f5	IB/mlx5: Improve MR check Add "type" field to mlx5_core MKEY struct. Check whether page fault happens on MKEY corresponding to MR. Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-02 15:51:20 -05:00
Artemy Kovalyov	17d2f88f92	IB/mlx5: Add ODP atomics support Handle ODP atomic operations. When the initiator of an RDMA atomic operation uses an ODP MR to provide source data, handle the page fault properly. Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-02 15:51:20 -05:00
Artemy Kovalyov	d9aaed8387	{net,IB}/mlx5: Refactor page fault handling * Update page fault event according to last specification. * Separate code path for page fault EQ, completion EQ and async EQ. * Move page fault handling work queue from mlx5_ib static variable into mlx5_core page fault EQ. * Allocate memory to store ODP event dynamically as the events arrive, since in atomic context - use mempool. * Make mlx5_ib page fault handler run in process context. Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-02 15:51:20 -05:00
Artemy Kovalyov	7d0cc6edcc	IB/mlx5: Add MR cache for large UMR regions In this change we turn mlx5_ib_update_mtt() into generic mlx5_ib_update_xlt() to perform HCA translation table modifications supporting both atomic and process contexts and not limited by number of modified entries. Using this function we increase preallocated MRs up to 16GB. Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-02 15:51:20 -05:00
Artemy Kovalyov	c438fde1c2	IB/mlx5: Add support for big MRs Make use of extended UMR translation offset. Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-02 15:51:20 -05:00
Artemy Kovalyov	3161625589	IB/mlx5: Refactor UMR post send format * Update struct mlx5_wqe_umr_ctrl_seg. * Currently UMR send_flags target only certain use cases: enabling/disabling a cached MR and modifying XLT for ODP. Making the flags independent makes UMR more flexible, allowing arbitrary manipulations. * Since different UMR formats have different entry sizes, a UMR request should receive the exact size of the translation table update instead of the number of entries. Rename field npages to xlt_size in struct mlx5_umr_wr and update relevant code accordingly. * Add support of length64 bit. Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-02 15:51:20 -05:00
Binoy Jayan	d5ea2df9ce	IB/mlx5: Add helper mlx5_ib_post_send_wait Clean up the following common code (to post a list of work requests to the send queue of the specified QP) at various places and add a helper function 'mlx5_ib_post_send_wait' to implement the same. - Initialize 'mlx5_ib_umr_context' on stack - Assign "mlx5_umr_wr:wr:wr_cqe" to umr_context.cqe - Acquire the semaphore - Call ib_post_send with a single ib_send_wr - wait_for_completion() - Check for umr_context.status - Release the semaphore Signed-off-by: Binoy Jayan <binoy.jayan@linaro.org> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-02 15:51:20 -05:00
Leon Romanovsky	9f885201f2	IB/mlx5: Reorder code in query device command The order of features exposed by private mlx5-abi.h file is CQE zipping, packet pacing and multi-packet WQE. The internal order implemented in mlx5_ib_query_device() is multi-packet WQE, CQE zipping and packet pacing. Such difference hurts code readability, so let's sync, while mlx5-abi.h (exposed to userspace) is the primary order. This commit doesn't change any functionality. Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-02 15:51:20 -05:00
Jack Morgenstein	10b1c04e92	net/mlx4_core: Fix raw qp flow steering rules under SRIOV Demoting simple flow steering rule priority (for DPDK) was achieved by wrapping FW commands MLX4_QP_FLOW_STEERING_ATTACH/DETACH for the PF as well, and forcing the priority to MLX4_DOMAIN_NIC in the wrapper function for the PF and all VFs. In function mlx4_ib_create_flow(), this change caused the main rule creation for the PF to be wrapped, while it left the associated tunnel steering rule creation unwrapped for the PF. This mismatch caused rule deletion failures in mlx4_ib_destroy_flow() for the PF when the detach wrapper function did not find the associated tunnel-steering rule (since creation of that rule for the PF did not go through the wrapper function). Fix this by setting MLX4_QP_FLOW_STEERING_ATTACH/DETACH to be "native" (so that the PF invocation does not go through the wrapper), and perform the required priority demotion for the PF in the mlx4_ib_create_flow() code path. Fixes: `48564135cb` ("net/mlx4_core: Demote simple multicast and broadcast flow steering rules") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-29 14:17:40 -05:00
Linus Torvalds	7c0f6ba682	Replace <asm/uaccess.h> with <linux/uaccess.h> globally This was entirely automated, using the script by Al: PATT='^[[:blank:]]#[[:blank:]]include[[:blank:]]*<asm/uaccess.h>' sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \ $(git grep -l "$PATT"\|grep -v ^include/linux/uaccess.h) to do the replacement at the end of the merge window. Requested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-12-24 11:46:01 -08:00
Linus Torvalds	296915912d	First round of -rc fixes for 4.10 kernel - Series of qedr fixes - Series of rxe fixes - One isolated i40iw fix - One isolated cma fix - One isolated cxgb4 fix Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull rdma fixes from Doug Ledford: "First round of -rc fixes for 4.10 kernel: - a series of qedr fixes - a series of rxe fixes - one i40iw fix - one cma fix - one cxgb4 fix" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: IB/rxe: Don't check for null ptr in send() IB/rxe: Drop future atomic/read packets rather than retrying IB/rxe: Use BTH_PSN_MASK when ACKing duplicate sends qedr: Always notify the verb consumer of flushed CQEs qedr: clear the vendor error field in the work completion qedr: post_send/recv according to QP state qedr: ignore inline flag in read verbs qedr: modify QP state to error when destroying it qedr: return correct value on modify qp qedr: return error if destroy CQ failed qedr: configure the number of CQEs on CQ creation i40iw: Set 128B as the only supported RQ WQE size IB/cma: Fix a race condition in iboe_addr_get_sgid() IB/rxe: Fix a memory leak in rxe_qp_cleanup() iw_cxgb4: set correct FetchBurstMax for QPs	2016-12-23 10:38:48 -08:00
Andrew Boyer	5cc8fabc5e	IB/rxe: Don't check for null ptr in send() pkt->qp was already dereferenced earlier in the function. Fixes Smatch complaint: drivers/infiniband/sw/rxe/rxe_net.c:458 send() warn: variable dereferenced before check 'pkt->qp' (see line 441) Signed-off-by: Andrew Boyer <andrew.boyer@dell.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-22 11:36:12 -05:00
Andrew Boyer	cbf1f9a46c	IB/rxe: Drop future atomic/read packets rather than retrying If the completer is in the middle of a large read operation, one lost packet can cause havoc. Going to COMPST_ERROR_RETRY will cause the requester to resend the request. After that, any packet from the first attempt still in the receive queue will be interpreted as an error, restarting the error/retry sequence. The transfer will quickly exhaust its retries. This behavior is very noticeable when doing 512KB reads on a QEMU system configured with 1500B MTU. Also, a resent request here will prompt the responder on the other side to immediately start resending, but the resent packets will get stuck in the already-loaded receive queue and will never be processed. Rather than erroring out every time an unexpected future packet arrives, just drop it. Eventually the retry timer will send a duplicate request; the completer will be able to make progress since the queue will start relatively empty. Signed-off-by: Andrew Boyer <andrew.boyer@dell.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-22 11:36:12 -05:00
Andrew Boyer	37b3619394	IB/rxe: Use BTH_PSN_MASK when ACKing duplicate sends Signed-off-by: Andrew Boyer <andrew.boyer@dell.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-22 11:36:12 -05:00
Amrani, Ram	74c3875c3d	qedr: Always notify the verb consumer of flushed CQEs Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Reviewed-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-22 11:36:12 -05:00
Amrani, Ram	27035a1b37	qedr: clear the vendor error field in the work completion We clear the vendor error field in the work completion so that if a work completion is erroneous the field won't confuse the caller. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Reviewed-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-22 11:36:12 -05:00
Amrani, Ram	922d9a40d3	qedr: post_send/recv according to QP state Enable posting to SQ only in RTS, ERR and SQD QP state. Enable posting to RQ in ERR QP state. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Reviewed-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-22 11:36:12 -05:00
Amrani, Ram	8b0cabc650	qedr: ignore inline flag in read verbs In the current implementation a read verb with IB_SEND_INLINE may be illegally configured. In this fix we ignore the inline bit in the case of a read verb. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Reviewed-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-22 11:36:12 -05:00
Amrani, Ram	b4c2cc48aa	qedr: modify QP state to error when destroying it Current code didn't modify the QP state to error because it queried the QP state as a bitmap while it isn't. So the code never got executed. This patch fixes this and queries for each QP state respectively and not at once via a bitmask. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Reviewed-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-22 11:36:12 -05:00
Amrani, Ram	d6ebbf29c3	qedr: return correct value on modify qp Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Reviewed-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-22 11:36:12 -05:00
Amrani, Ram	a121135973	qedr: return error if destroy CQ failed Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Reviewed-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-22 11:36:12 -05:00
Amrani, Ram	c7eb3bced7	qedr: configure the number of CQEs on CQ creation Configure ibcq->cqe when a CQ is created. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Reviewed-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-22 11:36:12 -05:00
Chien Tin Tung	61f51b7b20	i40iw: Set 128B as the only supported RQ WQE size RQ WQE size other than 128B is not supported. Correct RQ size calculation to use 128B only. Since this breaks ABI, add additional code to provide compatibility with v4 user provider, libi40iw. Signed-off-by: Chien Tin Tung <chien.tin.tung@intel.com> Signed-off-by: Henry Orosco <henry.orosco@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-22 11:36:12 -05:00
Bart Van Assche	e259934d4d	IB/rxe: Fix a memory leak in rxe_qp_cleanup() A socket is associated with every QP by the rxe driver but sock_release() is never called. Add a call to sock_release() in rxe_qp_cleanup(). Fixes: commit 8700e3e7c48A5 ("Add Soft RoCE driver") Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Moni Shoua <monis@mellanox.com> Cc: Kamal Heib <kamalh@mellanox.com> Cc: Amir Vadai <amirv@mellanox.com> Cc: Haggai Eran <haggaie@mellanox.com> Cc: <stable@vger.kernel.org> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-18 13:35:19 -05:00
Steve Wise	b414fa01c3	iw_cxgb4: set correct FetchBurstMax for QPs The current QP FetchBurstMax value is 256B, which is incorrect since a WR can exceed that value. The result being a partial WR fetched by hardware, and a fatal "bad WR" error posted by the SGE. So bump the FetchBurstMax to 512B. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-18 13:35:19 -05:00
Linus Torvalds	d3ea547853	rdma: fix buggy code that the compiler warns about Get rid of this warning: drivers/infiniband/sw/rdmavt/cq.c: In function ‘rvt_cq_exit’: drivers/infiniband/sw/rdmavt/cq.c:542:2: warning: ‘worker’ may be used uninitialized in this function [-Wmaybe-uninitialized] kthread_destroy_worker(worker); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ by fixing the function to actually work. Fixes: `6efaf10f16` ("IB/rdmavt: Avoid queuing work into a destroyed cq kthread worker") Cc: Petr Mladek <pmladek@suse.com> Cc: Doug Ledford <dledford@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-12-15 12:18:42 -08:00
Linus Torvalds	4d5b57e05a	Updates for 4.10 kernel merge window - Shared mlx5 updates with net stack (will drop out on merge if Dave's tree has already been merged) - Driver updates: cxgb4, hfi1, hns-roce, i40iw, mlx4, mlx5, qedr, rxe - Debug cleanups - New connection rejection helpers - SRP updates - Various misc fixes - New paravirt driver from vmware Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull rdma updates from Doug Ledford: "This is the complete update for the rdma stack for this release cycle. Most of it is typical driver and core updates, but there is the entirely new VMWare pvrdma driver. You may have noticed that there were changes in DaveM's pull request to the bnxt Ethernet driver to support a RoCE RDMA driver. The bnxt_re driver was tentatively set to be pulled in this release cycle, but it simply wasn't ready in time and was dropped (a few review comments still to address, and some multi-arch build issues like prefetch() not working across all arches). 
Summary: - shared mlx5 updates with net stack (will drop out on merge if Dave's tree has already been merged) - driver updates: cxgb4, hfi1, hns-roce, i40iw, mlx4, mlx5, qedr, rxe - debug cleanups - new connection rejection helpers - SRP updates - various misc fixes - new paravirt driver from vmware" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (210 commits) IB: Add vmw_pvrdma driver IB/mlx4: fix improper return value IB/ocrdma: fix bad initialization infiniband: nes: return value of skb_linearize should be handled MAINTAINERS: Update Intel RDMA RNIC driver maintainers MAINTAINERS: Remove Mitesh Ahuja from emulex maintainers IB/core: fix unmap_sg argument qede: fix general protection fault may occur on probe IB/mthca: Replace pci_pool_alloc by pci_pool_zalloc mlx5, calc_sq_size(): Make a debug message more informative mlx5: Remove a set-but-not-used variable mlx5: Use { } instead of { 0 } to init struct IB/srp: Make writing the add_target sysfs attr interruptible IB/srp: Make mapping failures easier to debug IB/srp: Make login failures easier to debug IB/srp: Introduce a local variable in srp_add_one() IB/srp: Fix CONFIG_DYNAMIC_DEBUG=n build IB/multicast: Check ib_find_pkey() return value IPoIB: Avoid reading an uninitialized member variable IB/mad: Fix an array index check ...	2016-12-15 12:03:32 -08:00
Lorenzo Stoakes	5b56d49fc3	mm: add locked parameter to get_user_pages_remote() Patch series "mm: unexport __get_user_pages_unlocked()". This patch series continues the cleanup of get_user_pages() functions taking advantage of the fact we can now pass gup_flags as we please. It firstly adds an additional 'locked' parameter to get_user_pages_remote() to allow for its callers to utilise VM_FAULT_RETRY functionality. This is necessary as the invocation of __get_user_pages_unlocked() in process_vm_rw_single_vec() makes use of this and no other existing higher level function would allow it to do so. Secondly existing callers of __get_user_pages_unlocked() are replaced with the appropriate higher-level replacement - get_user_pages_unlocked() if the current task and memory descriptor are referenced, or get_user_pages_remote() if other task/memory descriptors are referenced (having acquired mmap_sem.) This patch (of 2): Add an int locked parameter to get_user_pages_remote() to allow VM_FAULT_RETRY faulting behaviour similar to get_user_pages_[un]locked(). Taking into account the previous adjustments to get_user_pages*() functions allowing for the passing of gup_flags, we are now in a position where __get_user_pages_unlocked() need only be exported for its ability to allow VM_FAULT_RETRY behaviour, this adjustment allows us to subsequently unexport __get_user_pages_unlocked() as well as allowing for future flexibility in the use of get_user_pages_remote(). 
[sfr@canb.auug.org.au: merge fix for get_user_pages_remote API change] Link: http://lkml.kernel.org/r/20161122210511.024ec341@canb.auug.org.au Link: http://lkml.kernel.org/r/20161027095141.2569-2-lstoakes@gmail.com Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Jan Kara <jack@suse.cz> Cc: Hugh Dickins <hughd@google.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krcmar <rkrcmar@redhat.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-12-14 16:04:08 -08:00
Doug Ledford	6f94ba2079	Merge branch 'vmw_pvrdma' into merge-test	2016-12-14 14:56:21 -05:00
Adit Ranadive	29c8d9eba5	IB: Add vmw_pvrdma driver This patch series adds a driver for a paravirtual RDMA device. The device is developed for VMware's Virtual Machines and allows existing RDMA applications to continue to use existing Verbs API when deployed in VMs on ESXi. We recently did a presentation in the OFA Workshop [1] regarding this device. Description and RDMA Support ============================ The virtual device is exposed as a dual function PCIe device. One part is a virtual network device (VMXNet3) which provides networking properties like MAC, IP addresses to the RDMA part of the device. The networking properties are used to register GIDs required by RDMA applications to communicate. These patches add support and the all required infrastructure for letting applications use such a device. We support the mandatory Verbs API as well as the base memory management extensions (Local Inv, Send with Inv and Fast Register Work Requests). We currently support both Reliable Connected and Unreliable Datagram QPs but do not support Shared Receive Queues (SRQs). Also, we support the following types of Work Requests: o Send/Receive (with or without Immediate Data) o RDMA Write (with or without Immediate Data) o RDMA Read o Local Invalidate o Send with Invalidate o Fast Register Work Requests This version only adds support for version 1 of RoCE. We will add RoCEv2 support in a future patch. We do support registration of both MAC-based and IP-based GIDs. I have also created a git tree for our user-level driver [2]. Testing ======= We have tested this internally for various types of Guest OS - Red Hat, Centos, Ubuntu 12.04/14.04/16.04, Oracle Enterprise Linux, SLES 12 using backported versions of this driver. The tests included several runs of the performance tests (included with OFED), Intel MPI PingPong benchmark on OpenMPI, krping for FRWRs. 
Mellanox has been kind enough to test the backported version of the driver internally on their hardware using a VMware provided ESX build. I have also applied and tested this with Doug's k.o/for-4.9 branch (commit 5603910b). Note, that this patch series should be applied all together. I split out the commits so that it may be easier to review. PVRDMA Resources ================ [1] OFA Workshop Presentation - https://openfabrics.org/images/eventpresos/2016presentations/102parardma.pdf [2] Libpvrdma User-level library - http://git.openfabrics.org/?p=~aditr/libpvrdma.git;a=summary Reviewed-by: Jorgen Hansen <jhansen@vmware.com> Reviewed-by: George Zhang <georgezhang@vmware.com> Reviewed-by: Aditya Sarwade <asarwade@vmware.com> Reviewed-by: Bryan Tan <bryantan@vmware.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Adit Ranadive <aditr@vmware.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 14:55:10 -05:00
Doug Ledford	9032ad78bb	Merge branches 'misc', 'qedr', 'reject-helpers', 'rxe' and 'srp' into merge-test	2016-12-14 14:44:47 -05:00
Doug Ledford	86ef0beaa0	Merge branch 'mlx' into merge-test	2016-12-14 14:44:25 -05:00
Doug Ledford	253f8b22e0	Merge branch 'hfi1' into merge-test	2016-12-14 14:44:08 -05:00
Doug Ledford	884fa4f304	Merge branches 'chelsio', 'debug-cleanup', 'hns' and 'i40iw' into merge-test	2016-12-14 14:43:14 -05:00
Pan Bian	46d0703fac	IB/mlx4: fix improper return value If uhw->inlen is non-zero, the value of variable err is 0 if the copy succeeds. Then, if kzalloc() or kmalloc() returns a NULL pointer, it will return 0 to the callers. As a result, the callers cannot detect the errors. This patch fixes the bug by assigning "-ENOMEM" to err before the NULL pointer checks and removing the initialization of err at the beginning. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=189031 Signed-off-by: Pan Bian <bianpan2016@163.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 14:35:23 -05:00
Pan Bian	5b4c9cd7e4	IB/ocrdma: fix bad initialization Function ocrdma_mbx_create_ah_tbl() returns the value of status on errors. However, because status is initialized with 0, 0 will be returned even on error paths. This patch initializes status with "-ENOMEM". Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188831 Signed-off-by: Pan Bian <bianpan2016@163.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 14:33:48 -05:00
Zhouyi Zhou	6a3a1056d6	infiniband: nes: return value of skb_linearize should be handled Return value of skb_linearize should be handled in function nes_netdev_start_xmit. Compiled in x86_64 Signed-off-by: Zhouyi Zhou <yizhouzhou@ict.ac.cn> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 14:26:49 -05:00
Sebastian Ott	17069d32a3	IB/core: fix unmap_sg argument __ib_umem_release calls dma_unmap_sg with a different number of sg_entries than ib_umem_get uses for dma_map_sg. This might cause trouble for implementations that merge sglist entries and results in the following dma debug complaint: DMA-API: device driver frees DMA sg list with different entry count [map count=2] [unmap count=1] Fix it by using the correct value. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 14:21:26 -05:00
Souptick Joarder	7ceb740c54	IB/mthca: Replace pci_pool_alloc by pci_pool_zalloc In mthca_create_ah(), pci_pool_alloc() followed by memset will be replaced by pci_pool_zalloc() Signed-off-by: Souptick joarder <jrdr.linux@gmail.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 13:58:39 -05:00
Bart Van Assche	1974ab9d9d	mlx5, calc_sq_size(): Make a debug message more informative Make it clear that qp->sq.wqe_cnt is not the number of WQEs. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Eli Cohen <eli@mellanox.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 13:45:38 -05:00
Bart Van Assche	3d6bdf1625	mlx5: Remove a set-but-not-used variable This has been detected by building the mlx5 driver with W=1. Fixes: `1a412fb1ca` ('{net,IB}/mlx5: Modify QP commands via mlx5 ifc') Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Eli Cohen <eli@mellanox.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 13:45:10 -05:00
Bart Van Assche	626bc02d4d	mlx5: Use { } instead of { 0 } to init struct Detected by sparse. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Eli Cohen <eli@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 13:42:32 -05:00
Bart Van Assche	4fa354c9db	IB/srp: Make writing the add_target sysfs attr interruptible Avoid that shutdown of srp_daemon is delayed if add_target_mutex is held by another process. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 13:31:47 -05:00
Bart Van Assche	290081b453	IB/srp: Make mapping failures easier to debug Make it easier to figure out what is going on if memory mapping fails because more memory regions than mr_per_cmd are needed. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 13:31:37 -05:00
Bart Van Assche	3787d9908c	IB/srp: Make login failures easier to debug If login fails because memory region allocation failed it can be hard to figure out what happened. Make it easier to figure out why login failed by logging a message if ib_alloc_mr() fails. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 13:31:37 -05:00
Bart Van Assche	042dd765bd	IB/srp: Introduce a local variable in srp_add_one() This patch makes the srp_add_one() code more compact and does not change any functionality. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 13:31:37 -05:00
Bart Van Assche	1a1faf7a8a	IB/srp: Fix CONFIG_DYNAMIC_DEBUG=n build Avoid that the kernel build fails as follows if dynamic debug support is disabled: drivers/infiniband/ulp/srp/ib_srp.c:2272:3: error: implicit declaration of function 'DEFINE_DYNAMIC_DEBUG_METADATA' drivers/infiniband/ulp/srp/ib_srp.c:2272:33: error: 'ddm' undeclared (first use in this function) drivers/infiniband/ulp/srp/ib_srp.c:2275:39: error: '_DPRINTK_FLAGS_PRINT' undeclared (first use in this function) Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 13:31:37 -05:00
Bart Van Assche	d3a2418ee3	IB/multicast: Check ib_find_pkey() return value This patch avoids that Coverity complains about not checking the ib_find_pkey() return value. Fixes: commit `547af76521` ("IB/multicast: Report errors on multicast groups if P_key changes") Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Sean Hefty <sean.hefty@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 13:27:34 -05:00
Bart Van Assche	11b642b84e	IPoIB: Avoid reading an uninitialized member variable This patch avoids that Coverity reports the following: Using uninitialized value port_attr.state when calling printk Fixes: commit `94232d9ce8` ("IPoIB: Start multicast join process only on active ports") Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Erez Shitrit <erezsh@mellanox.com> Cc: <stable@vger.kernel.org> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 13:27:34 -05:00
Bart Van Assche	2fe2f378dd	IB/mad: Fix an array index check The array ib_mad_mgmt_class_table.method_table has MAX_MGMT_CLASS (80) elements. Hence compare the array index with that value instead of with IB_MGMT_MAX_METHODS (128). This patch avoids that Coverity reports the following: Overrunning array class->method_table of 80 8-byte elements at element index 127 (byte offset 1016) using index convert_mgmt_class(mad_hdr->mgmt_class) (which evaluates to 127). Fixes: commit `b7ab0b19a8` ("IB/mad: Verify mgmt class in received MADs") Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Sean Hefty <sean.hefty@intel.com> Cc: <stable@vger.kernel.org> Reviewed-by: Hal Rosenstock <hal@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 13:27:34 -05:00
Bart Van Assche	b42dde478b	IB/mlx4: Rework special QP creation error path The special QP creation error path relies on offset_of(struct mlx4_ib_sqp, qp) == 0. Remove this assumption because that makes the QP creation code easier to understand. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Laurence Oberman <loberman@redhat.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 13:01:11 -05:00
Bart Van Assche	0d38c240f9	IB/srpt: Report login failures only once Report the following message only once if no ACL has been configured yet for an initiator port: "Rejected login because no ACL has been configured yet for initiator %s.\n" Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Nicholas Bellinger <nab@linux-iscsi.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Sagi Grimberg <sagig@grimberg.me> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 12:58:30 -05:00
Julia Lawall	5f4c7e4eb5	IB/usnic: simplify IS_ERR_OR_NULL to IS_ERR The function usnic_ib_qp_grp_get_chunk only returns an ERR_PTR value or a valid pointer, never NULL. The same is true of get_qp_res_chunk, which just returns the result of calling usnic_ib_qp_grp_get_chunk. Simplify IS_ERR_OR_NULL to IS_ERR in both cases. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression t,e; @@ t = \(usnic_ib_qp_grp_get_chunk(...)\|get_qp_res_chunk(...)\) ... when != t=e - IS_ERR_OR_NULL(t) + IS_ERR(t) @@ expression t,e,e1; @@ t = \(usnic_ib_qp_grp_get_chunk(...)\|get_qp_res_chunk(...)\) ... when != t=e ?- t ? PTR_ERR(t) : e1 + PTR_ERR(t) ... when any // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 12:57:54 -05:00
Hans Westgaard Ry	9315bc9a13	IB/core: Issue DREQ when receiving REQ/REP for stale QP from "InfiniBand Architecture Specifications Volume 1": A QP is said to have a stale connection when only one side has connection information. A stale connection may result if the remote CM had dropped the connection and sent a DREQ but the DREQ was never received by the local CM. Alternatively the remote CM may have lost all record of past connections because its node crashed and rebooted, while the local CM did not become aware of the remote node's reboot and therefore did not clean up stale connections. and: A local CM may receive a REQ/REP for a stale connection. It shall abort the connection issuing REJ to the REQ/REP. It shall then issue DREQ with "DREQ:remote QPN" set to the remote QPN from the REQ/REP. This patch solves a problem with reuse of QPN. Current codebase, that is IPoIB, relies on a REAP-mechanism to do cleanup of the structures in CM. A problem with this is the time constants governing this mechanism; they are up to 768 seconds and the interface may look unresponsive in that period. Issuing a DREQ (and receiving a DREP) does the necessary cleanup and the interface comes up. Signed-off-by: Hans Westgaard Ry <hans.westgaard.ry@oracle.com> Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 12:56:24 -05:00
Philippe Reynes	24dc08c3c9	IB/nes: use new api ethtool_{get|set}_link_ksettings The ethtool api {get|set}_settings is deprecated. We move this driver to new api {get|set}_link_ksettings. Signed-off-by: Philippe Reynes <tremyfr@gmail.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 12:52:25 -05:00
Alexey Khoroshilov	def4a6ffc9	IB/isert: do not ignore errors in dma_map_single() There are several places, where errors in dma_map_single() are ignored. The patch fixes them. Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Acked-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 12:51:31 -05:00
Jim Foraker	22dccc5454	IB/rdmavt: Only put mmap_info ref if it exists rvt_create_qp() creates qp->ip only when a qp creation request comes from userspace (udata is not NULL). If we exceed the number of available queue pairs however, the error path always attempts to put a kref to this structure. If the requestor is inside the kernel, this leads to a crash. We fix this by checking that qp->ip is not NULL before calling kref_put(). Signed-off-by: Jim Foraker <foraker1@llnl.gov> Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Acked-by: Jonathan Toppins <jtoppins@redhat.com> Acked-by: Alex Estrin <alex.estrin@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 12:16:11 -05:00
Petr Mladek	f5eabf5e51	IB/rdmavt: Handle the kthread worker using the new API Use the new API to create and destroy the cq kthread worker. The API hides some implementation details. In particular, kthread_create_worker() allocates and initializes struct kthread_worker. It runs the kthread the right way and stores task_struct into the worker structure. In addition, the *on_cpu() variant binds the kthread to the given cpu and the related memory node. kthread_destroy_worker() flushes all pending works, stops the kthread and frees the structure. This patch does not change the existing behavior. Note that we must use the on_cpu() variant because the function starts the kthread and it must bind it to the right CPU before waking. The numa node is associated for given CPU as well. Signed-off-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 12:16:11 -05:00
Petr Mladek	6efaf10f16	IB/rdmavt: Avoid queuing work into a destroyed cq kthread worker The memory barrier is not enough to protect queuing works into a destroyed cq kthread. Just imagine the following situation: CPU1 CPU2 rvt_cq_enter() worker = cq->rdi->worker; rvt_cq_exit() rdi->worker = NULL; smp_wmb(); kthread_flush_worker(worker); kthread_stop(worker->task); kfree(worker); // nothing queued yet => // nothing flushed and // happily stopped and freed if (likely(worker)) { // true => read before CPU2 acted cq->notify = RVT_CQ_NONE; cq->triggered++; kthread_queue_work(worker, &cq->comptask); BANG: worker has been flushed/stopped/freed in the meantime. This patch solves this by protecting the critical sections by rdi->n_cqs_lock. It seems that this lock is not much contended and looks reasonable for this purpose. One catch is that rvt_cq_enter() might be called from IRQ context. Therefore we must always take the lock with IRQs disabled to avoid a possible deadlock. Signed-off-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 12:16:11 -05:00
Arnd Bergmann	14ab8896f5	IB/mlx5: avoid bogus -Wmaybe-uninitialized warning We get a false-positive warning in linux-next for the mlx5 driver: infiniband/hw/mlx5/mr.c: In function ‘mlx5_ib_reg_user_mr’: infiniband/hw/mlx5/mr.c:1172:5: error: ‘order’ may be used uninitialized in this function [-Werror=maybe-uninitialized] infiniband/hw/mlx5/mr.c:1161:6: note: ‘order’ was declared here infiniband/hw/mlx5/mr.c:1173:6: error: ‘ncont’ may be used uninitialized in this function [-Werror=maybe-uninitialized] infiniband/hw/mlx5/mr.c:1160:6: note: ‘ncont’ was declared here infiniband/hw/mlx5/mr.c:1173:6: error: ‘page_shift’ may be used uninitialized in this function [-Werror=maybe-uninitialized] infiniband/hw/mlx5/mr.c:1158:6: note: ‘page_shift’ was declared here infiniband/hw/mlx5/mr.c:1143:13: error: ‘npages’ may be used uninitialized in this function [-Werror=maybe-uninitialized] infiniband/hw/mlx5/mr.c:1159:6: note: ‘npages’ was declared here I had a trivial workaround for gcc-5 or higher, but that didn't work on gcc-4.9 unfortunately. The only way I found to avoid the warnings for gcc-4.9, short of initializing each of the arguments first, was to change the calling conventions to separate the error code from the umem pointer. This avoids casting the error codes from one pointer to another incompatible pointer, and lets gcc figure out that the data is actually valid whenever we return successfully. Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 12:12:53 -05:00
Steve Wise	1e38a366ee	ib_isert: log the connection reject message Acked-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 11:38:28 -05:00
Steve Wise	97540bb90a	ib_iser: log the connection reject message Acked-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 11:38:28 -05:00
Steve Wise	5f24410408	rdma_cm: add rdma_consumer_reject_data helper function rdma_consumer_reject_data() will return the private data pointer and length if any is available. Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 11:38:28 -05:00
Steve Wise	5042a73d3e	rdma_cm: add rdma_is_consumer_reject() helper function Return true if the peer consumer application rejected the connection attempt. Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-14 11:38:28 -05:00

6647 Commits