linux

Author	SHA1	Message	Date
Doug Ledford	520a07bff6	Merge branches 'i40iw', 'sriov' and 'hfi1' into k.o/for-4.6	2016-03-21 17:32:23 -04:00
Eli Cohen	68996a6e76	IB/ipoib: Allow mcast packets from other VFs With SRIOV enabled, two VFs on the same HCA which have the same port LID and may have the same QP number. To enable receiving multicasts from such VFs, further qualify the check: ignore the receive only if, in addition, the packet source gid equals the receiving VF's source gid. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-21 17:13:14 -04:00
Eli Cohen	eff901d30e	IB/mlx5: Implement callbacks for manipulating VFs Implement the IB defined callbacks used to manipulate the policy for the link state, set GUIDs or get statistics information. This functionality is added into a new file that will be used to add any SRIOV related functionality to the mlx5 IB layer. The following callbacks have been added: mlx5_ib_get_vf_config mlx5_ib_set_vf_link_state mlx5_ib_get_vf_stats mlx5_ib_set_vf_guid In addition, publish whether this device is based on a virtual function. In mlx5 supported devices, virtual functions are implemented as vHCAs. vHCAs have their own QP number space so it is possible that two vHCAs will use a QP with the same number at the same time. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-21 17:13:14 -04:00
Eli Cohen	1f324bff9b	net/mlx5_core: Implement modify HCA vport command Implement the modify HCA vport commands used to modify the parameters of virtual HCA's ports. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-21 17:13:14 -04:00
Eli Cohen	2a4826fe74	net/mlx5_core: Add VF param when querying vport counter Add a vf parameter to mlx5_core_query_vport_counter so we can call it to query counters of virtual functions. Also update current users of the API. PFs may call mlx5_core_query_vport_counter with other_vport set to indicate that they are querying a virtual function. The virtual function to be queried is given by the vf parameter. Virtual function numbering is zero based so the first VF is 0 and so on. When a PF queries its own function, the other_vport parameter is cleared. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-21 17:13:14 -04:00
Eli Cohen	9c3c5f8e1f	IB/ipoib: Add ndo operations for configuring VFs Add ndo operations to the network driver that enables configuring the following operations: ipoib_set_vf_link_state - configure the VF link policy ipoib_get_vf_config - get link state configuration ipoib_set_vf_guid - set a VF port or node GUID ipoib_get_vf_stats - get statistics of a VF Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-21 17:13:14 -04:00
Eli Cohen	50174a7f2c	IB/core: Add interfaces to control VF attributes Following the practice exercised for network devices which allow the PF net device to configure attributes of its virtual functions, we introduce the following functions to be used by IPoIB which is the network driver implementation for IB devices. ib_set_vf_link_state - set the policy for a VF link. More below. ib_get_vf_config - read configuration information of a VF ib_get_vf_stats - read VF statistics ib_set_vf_guid - set the node or port GUID of a VF Also add an indication in the device cap flags that indicates that this IB devices is based on a virtual function. A VF shares the physical port with the PF and other VFs. When setting the link state we have three options: 1. Auto - in this mode, the virtual port follows the state of the physical port and becomes active only if the physical port's state is active. In all other cases it remains in a Down state. 2. Down - sets the state of the virtual port to Down 3. Up - causes the virtual port to transition into Initialize state if it was not already in this state. A virtualization aware subnet manager can then bring the state of the port into the Active state. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-21 17:13:14 -04:00
Eli Cohen	a0c1b2a350	IB/core: Support accessing SA in virtualized environment Per the ongoing standardisation process, when virtual HCAs are present in a network, traffic is routed based on a destination GID. In order to access the SA we use the well known SA GID. We also add a GRH required boolean field to the port attributes which is used to report to the verbs consumer whether this port is connected to a virtual network. We use this field to realize whether we need to create an address vector with GRH to access the subnet administrator. We clear the port attributes struct before calling the hardware driver to make sure the default remains that GRH is not required. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-21 16:34:06 -04:00
Eli Cohen	fad61ad4e7	IB/core: Add subnet prefix to port info The subnet prefix is a part of the port_info MAD returned and should be available at the ib_port_attr struct. We define it here and provide a default implementation in case the hardware driver does not provide one. The subnet prefix is required when creating the address vector to access the SA in networks where GRH must be used. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-21 16:34:06 -04:00
Eli Cohen	d603c809ef	IB/mlx5: Fix decision on using MAD_IFC Fix the condition that dictates when MAD_IFC should be used. According to firmware specifications, MAD_IFC commands must be used only if the ib_virt capability is off. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-21 16:34:06 -04:00
Eli Cohen	cc8e27cc97	net/core: Add support for configuring VF GUIDs Add two new NLAs to support configuration of Infiniband node or port GUIDs. New applications can choose to use this interface to configure GUIDs with iproute2 with commands such as: ip link set dev ib0 vf 0 node_guid 00:02:c9:03:00:21:6e:70 ip link set dev ib0 vf 0 port_guid 00:02:c9:03:00:21:6e:78 A new ndo, ndo_sef_vf_guid is introduced to notify the net device of the request to change the GUID. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-21 16:34:06 -04:00
Leon Romanovsky	fb532d6a79	IB/{core, ulp} Support above 32 possible device capability flags The old bitwise device_cap_flags variable was limited to u32 which has all bits already defined. In order to overcome it, we converted device_cap_flags variable to be u64 type. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-21 16:32:59 -04:00
Leon Romanovsky	2953f42513	IB/core: Replace setting the zero values in ib_uverbs_ex_query_device The setting to zero during variable initialization eliminates the need to explicitly set to zero variables and structures. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-21 16:32:36 -04:00
Sagi Grimberg	3f0393a575	net/mlx5_core: Introduce offload arithmetic hardware capabilities Define the necessary hardware structures for the offload arithmetic capabilities and read/cache them on driver load. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-21 16:32:35 -04:00
Leon Romanovsky	b06e7de8a9	net/mlx5_core: Refactor device capability function Device capability function was called similar in all places. It was called twice for every queried parameter, while the difference between calls was in HCA capability mode only. The change proposed unify these calls into one function. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-21 16:29:07 -04:00
Leon Romanovsky	91d9ed8443	net/mlx5_core: Fix caching ATOMIC endian mode capability Add caching of maximum device capability of ATOMIC endian mode. Fixes: `f91e6d8941` ('net/mlx5_core: Add setting ATOMIC endian mode') Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-21 16:29:07 -04:00
Dan Carpenter	5658600e7f	ib_srpt: fix a WARN_ON() message The first argument of WARN_ON() is a condition, so it means the warning message here will just be the name without the ->qp_num information. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-21 16:06:27 -04:00
Tatyana Nikolova	34abf9ed73	i40iw: Replace the obsolete crypto hash interface with shash This patch replaces the obsolete crypto hash interface with shash and resolves a build failure after merge of the rdma tree which is caused by the removal of crypto hash interface Removing CRYPTO_ALG_ASYNC from crypto_alloc_shash(), because it is by definition sync only Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-21 16:02:24 -04:00
Mitko Haralanov	5511d78107	IB/hfi1: Add SDMA cache eviction algorithm This commit adds a cache eviction algorithm for the SDMA user buffer cache. Besides the interval RB tree used for node lookup, the cache nodes are also arranged in a doubly-linked list. When a node is used, it is put at the beginning of the list. Less frequently used nodes naturally move to the tail of the list. When the cache limit is reached, the eviction code starts traversing the linked list in reverse, freeing buffers until enough space has been freed to fit the new user buffer. This guarantees that only the least used cache nodes will be removed from the cache. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-21 15:55:25 -04:00
Mitko Haralanov	a7922f7ddf	IB/hfi1: Switch to using the pin query function Use the new function to query whether the expected receive user buffer can be pinned successfully. This requires that a new variable be added to the hfi1_filedata structure used to hold the number of pages pinned by the expected receive code. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-21 15:55:25 -04:00
Mitko Haralanov	bd3a8947de	IB/hfi1: Specify mm when releasing pages This change adds a pointer to the process mm_struct when calling hfi1_release_user_pages(). Previously, the function used the mm_struct of the current process to adjust the number of pinned pages. However, is some cases, namely when unpinning pages due to a MMU notifier call, we want to drop into that code block as it will cause a deadlock (the MMU notifiers take the process' mmap_sem prior to calling the callbacks). By allowing to caller to specify the pointer to the mm_struct, the caller has finer control over that part of hfi1_release_user_pages(). Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-21 15:55:25 -04:00
Mitko Haralanov	2c97ce4f3c	IB/hfi1: Add pin query function System administrators can use the locked memory ulimit setting to set the maximum amount of memory a user can lock/pin. However, this setting alone is not enough to guarantee good operation of the hfi1 driver due to the fact that the setting does not have fine enough granularity to account for the limit being used by multiple user processes and caches. Therefore, a better limiting algorithm is needed. This is where the new hfi1_can_pin_pages() function and the cache_size module parameter come in. The function works by looking at the ulimit and cache_size value to compute a cache size. The algorithm examines the ulimit value and, if it is not "unlimited", computes a per-cache limit based on the number of configured user contexts. After that, the lower of the two - cache_size and computed per-cache limit - is used. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-21 15:55:24 -04:00
Mitko Haralanov	5cd3a88d7f	IB/hfi1: Implement SDMA-side buffer caching Add support for caching of user buffers used for SDMA transfers. This change improves performance by avoiding repeatedly pinning the pages of buffers, which are being re-used by the application. While the cost of the pinning operation has been made heavier by adding the extra code to search the cache tree, re-allocate pages arrays, and future cache evictions, that cost will be amortized against the savings when the same buffer is re-used. It is also worth noting that in most cases, the cost of pinning should be much lower due to the buffer already being in the cache. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-21 15:55:24 -04:00
Mitko Haralanov	a489876010	IB/hfi1: Adjust last address values for intervals Last address values for intervals in the interval RB tree nodes should be non-inclusive in order to avoid confusing ranges. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-21 15:55:23 -04:00
Mitko Haralanov	0f310a00e0	IB/hfi1: Add filter callback This commit adds a filter callback, which can be used to filter out interval RB nodes matching a certain interval down to a single one. This is needed for the upcoming SDMA-side caching where buffers will need to be filtered by their virtual address. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-21 15:55:23 -04:00
Mitko Haralanov	b8718e2e2e	IB/hfi1: Remove compare callback Interval RB trees provide their own searching function, which also takes care of determining the path through the tree that should be taken. This make the compare callback unnecessary. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-21 15:55:23 -04:00
Mitko Haralanov	353b71c7c0	IB/hfi1: Add MMU tracing Add a new tracepoint type for the MMU functions and calls to that tracepoint to allow tracing of MMU functionality. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-21 15:55:22 -04:00
Mitko Haralanov	df5a00f81d	IB/hfi1: Use interval RB trees The interval RB trees can handle RB nodes which hold ranged information. This is exactly the usage for the buffer cache implemented in the expected receive code path. Convert the MMU/RB functions to use the interval RB tree API. This will help with future users of the caching API, as well. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-21 15:55:22 -04:00
Mitko Haralanov	909e2cd004	IB/hfi1: Notify remove MMU/RB callback of calling context Tell the remove MMU/RB callback if it's being called as part of a memory invalidation or not. This can be important in preventing a deadlock if the remove callback attempts to take the map_sem semaphore because the kernel's MMU invalidation functions have already taken it. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-21 15:55:22 -04:00
Mitko Haralanov	368f2b59d0	IB/hfi1: Remove the use of add/remove RB function pointers The usage of function pointers for RB node insertion and removal in the expected receive code path was meant to be a small performance optimization. However, maintaining it, especially with the new MMU API, would become more troublesome as the API is extended. Since the performance optimization is minor, remove the function pointers and replace with direct calls. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-21 15:55:21 -04:00
Mitko Haralanov	eef9c896a9	IB/hfi1: Allow remove MMU callbacks to free nodes In order to allow the remove MMU callbacks to free the RB nodes, it is necessary to prevent any references to the nodes after the remove callback has been called. Therefore, remove the node from the tree prior to calling the callback. In other words, the MMU/RB API now guarantees that all RB node operations it performs will be done prior to calling the remove callback and that the RB node will not be touched afterwards. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-21 15:55:21 -04:00
Mitko Haralanov	4b00d9490f	IB/hfi1: Prevent NULL pointer dereference Prevent a potential NULL pointer dereference (found by code inspection) when unregistering an MMU handler. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-21 15:55:20 -04:00
Mitko Haralanov	c81e1f6452	IB/hfi1: Allow MMU function execution in IRQ context Future users of the MMU/RB functions might be searching or manipulating the MMU RB trees in interrupt context. Therefore, the MMU/RB functions need to be able to run in interrupt context. This requires that we use the IRQ-aware API for spin locks. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-21 15:55:20 -04:00
Mitko Haralanov	06e0ffa693	IB/hfi1: Re-factor MMU notification code The MMU notification code added to the expected receive side has been re-factored and split into it's own file. This was done in order to make the code more general and, therefore, usable by other parts of the driver. The caching behavior remains the same. However, the handling of the RB tree (insertion, deletions, and searching) as well as the MMU invalidation processing is now handled by functions in the mmu_rb.[ch] files. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-21 15:54:59 -04:00
Alex Estrin	000a830efd	IB/rdmavt: Post receive for QP in ERR state Accordingly IB Spec post WR to receive queue must complete with error if QP is in Error state. Please refer to C10-42, C10-97.2.1 Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Alex Estrin <alex.estrin@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-17 15:55:22 -04:00
Mike Marciniszyn	d0e859c328	IB/hfi1: Enable adaptive pio by default Set the piothreshold to the agreed upon default of 256B. Reviewed-by: Jubin John <jubin.john@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-17 15:55:21 -04:00
Mike Marciniszyn	47177f1bac	IB/hfi1: Fix adaptive pio packet corruption The adaptive pio heuristic missed a case that causes a corrupted packet on the wire. The case is if SDMA egress had been chosen for a pio-able packet and then encountered a ring space wait, the packet is queued. The sge cursor had been incremented as part of the packet build out for SDMA. After the send engine restart, the heuristic might now chose pio based on the sdma count being zero and start the mmio copy using the already incremented sge cursor. Fix this by forcing SDMA egress when the SDMA descriptor has already been built. Additionally, the code to wait for a QPs pio count to zero when switching to SDMA was missing. Add it. There is also an issue with UD QPs, in that the different SLs can pick a different egress send context. For now, just insure the UD/GSI always go through SDMA. Reviewed-by: Vennila Megavannan <vennila.megavannan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-17 15:55:21 -04:00
Mike Marciniszyn	cef504c5c0	IB/hfi1: Fix panic in adaptive pio The following panic occurs while running ib_send_bw -a with adaptive pio turned on: [ 8551.143596] BUG: unable to handle kernel NULL pointer dereference at (null) [ 8551.152986] IP: [<ffffffffa0902a94>] pio_wait.isra.21+0x34/0x190 [hfi1] [ 8551.160926] PGD 80db21067 PUD 80bb45067 PMD 0 [ 8551.166431] Oops: 0000 [#1] SMP [ 8551.276725] task: ffff880816bf15c0 ti: ffff880812ac0000 task.ti: ffff880812ac0000 [ 8551.285705] RIP: 0010:[<ffffffffa0902a94>] pio_wait.isra.21+0x34/0x190 [hfi1] [ 8551.296462] RSP: 0018:ffff880812ac3b58 EFLAGS: 00010282 [ 8551.303029] RAX: 000000000000002d RBX: 0000000000000000 RCX: 0000000000000800 [ 8551.311633] RDX: ffff880812ac3c08 RSI: 0000000000000000 RDI: ffff8800b6665e40 [ 8551.320228] RBP: ffff880812ac3ba0 R08: 0000000000001000 R09: ffffffffa09039a0 [ 8551.328820] R10: ffff880817a0c000 R11: 0000000000000000 R12: ffff8800b6665e40 [ 8551.337406] R13: ffff880817a0c000 R14: ffff8800b6665800 R15: ffff8800b6665e40 [ 8551.355640] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 8551.362674] CR2: 0000000000000000 CR3: 000000080abe8000 CR4: 00000000001406e0 [ 8551.371262] Stack: [ 8551.374119] ffff880812ac3bf0 ffff88080cf54010 ffff880800000800 ffff880812ac3c08 [ 8551.383036] ffff8800b6665800 ffff8800b6665e40 0000000000000202 ffffffffa08e7b80 [ 8551.391941] 00000001007de431 ffff880812ac3bc8 ffffffffa0904645 ffff8800b6665800 [ 8551.400859] Call Trace: [ 8551.404214] [<ffffffffa08e7b80>] ? hfi1_del_timers_sync+0x30/0x30 [hfi1] [ 8551.412417] [<ffffffffa0904645>] hfi1_verbs_send+0x215/0x330 [hfi1] [ 8551.420154] [<ffffffffa08ec126>] hfi1_do_send+0x166/0x350 [hfi1] [ 8551.427618] [<ffffffffa055a533>] rvt_post_send+0x533/0x6a0 [rdmavt] [ 8551.435367] [<ffffffffa050760f>] ib_uverbs_post_send+0x30f/0x530 [ib_uverbs] [ 8551.443999] [<ffffffffa0501367>] ib_uverbs_write+0x117/0x380 [ib_uverbs] [ 8551.452269] [<ffffffff815810ab>] ? sock_recvmsg+0x3b/0x50 [ 8551.459071] [<ffffffff81581152>] ? sock_read_iter+0x92/0xe0 [ 8551.466068] [<ffffffff81212857>] __vfs_write+0x37/0x100 [ 8551.472692] [<ffffffff81213532>] ? rw_verify_area+0x52/0xd0 [ 8551.479682] [<ffffffff81213782>] vfs_write+0xa2/0x1a0 [ 8551.486089] [<ffffffff81003176>] ? do_audit_syscall_entry+0x66/0x70 [ 8551.493891] [<ffffffff812146c5>] SyS_write+0x55/0xc0 [ 8551.500220] [<ffffffff816ae0ee>] entry_SYSCALL_64_fastpath+0x12/0x71 [ 8551.531284] RIP [<ffffffffa0902a94>] pio_wait.isra.21+0x34/0x190 [hfi1] [ 8551.539508] RSP <ffff880812ac3b58> [ 8551.544110] CR2: 0000000000000000 The priv s_sendcontext pointer was not setup properly. Fix with this patch by using the s_sendcontext and eliminating its send engine use. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-17 15:55:21 -04:00
Mike Marciniszyn	60df29581f	IB/hfi1: Fix PIO wakeup timing hole There is a timing hole if there had been greater than PIO_WAIT_BATCH_SIZE waiters. This code will dispatch the first batch but leave the others in the queue. If the restarted waiters don't in turn wait on a buffer, there is a hang. Fix by forcing a return when the QP queue is non-empty. Reviewed-by: Vennila Megavannan <vennila.megavannan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-17 15:55:20 -04:00
Mike Marciniszyn	5326dfbf00	IB/hfi1: Fix ordering of trace for accuracy The postitioning of the sdma ibhdr trace was causing an extra trace message when the tx send returned -EBUSY. Move the trace to just before the return and handle negative return values to avoid any trace. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-17 15:55:20 -04:00
Mike Marciniszyn	1db78eeebe	IB/hfi1: Add unique trace point for pio and sdma send This allows for separately enabling pio and sdma tracepoints to cut the volume of trace information. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-17 15:55:19 -04:00
Mike Marciniszyn	ef6d8c4ec8	IB/hfi1: Fix issues with qp_stats print The changes are to aid in coorelating trace information with QPs between the trace and qp_stats information Such changes include adds a space after QP and clarifying that the second QP is actually the remote QP. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-17 15:55:19 -04:00
Mike Marciniszyn	ef086c0d5d	IB/hfi1: Report pid in qp_stats to aid debug Tracking user/QP ownership is needed to debug issues with user ULPs like OpenMPI. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-17 15:55:19 -04:00
Easwar Hariharan	2243472e9d	IB/hfi1: Improve LED beaconing The current LED beaconing code is unclear and uses the timer handler to turn off the timer. This patch simplifies the code by removing the special semantics of timeon = timeoff = 0 being interpreted as a request to turn off the beaconing. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-17 15:55:18 -04:00
Kaike Wan	831464ce4b	IB/hfi1: Don't call cond_resched in atomic mode when sending packets This patch fixed the problem where the driver might reschedule in atomic mode when sending packets. This is due to the fact that the call to cond_resched() in hfi1_do_send() might occur in atomic mode and a check is required to avoid the warning message: "kernel: BUG: scheduling while atomic: swapper/2/0/0x10000100." Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-17 15:55:18 -04:00
Dean Luick	528ee9fbf0	IB/hfi1: Add adaptive cacheless verbs copy The kernel memcpy is faster than a cacheless copy. However, if too much of the L3 cache is overwritten by one-time copies then overall bandwidth suffers. Implement an adaptive scheme where full page copies are tracked and if the number of unique entries are larger than a threshold, verbs will use a cacheless copy. Tracked entries are gradually cleaned, allowing memcpy to resume once the larger copies have stopped. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-17 15:55:17 -04:00
Jubin John	8fefef125e	IB/hfi1: Handle host handshake timeout Host handshake timeout can occur during the verify capability state. This is a LNI related failure and should be handled in the same way as other LNI failures. Reviewed-by: Dean Luick <dean.luick@intel.com> Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-17 15:55:17 -04:00
Dean Luick	c9c8ea3d47	IB/hfi1: Add ASIC flag view/clear Different OSes using parts of the same hardware may leave cross-device flags set. Export a debugfs file to view and clear these flags if needed. Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com> Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-17 15:55:17 -04:00
Dean Luick	ae993e7fba	IB/hfi1: Hold i2c resource across debugfs open/close External i2c firmware updates are done in multiple steps and cannot have other things done in between. For debugfs files, acquire the resource on open and release it on close. Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com> Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-17 15:55:16 -04:00
Dean Luick	b0506f4c56	IB/hfi1: Reduce hardware mutex timeout The hardware mutex is now held only long enough to set or clear flags. Reduce the timeout to something more reasonable. Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com> Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-03-17 15:55:16 -04:00

1 2 3 4 5 ...

575565 Commits