Commit Graph

1095 Commits

Author SHA1 Message Date
Ira Weiny
da2dfaa3a3 IB/mad: Support alternate Base Versions when creating MADs
In preparation to support the new OPA MAD Base version, add a base version
parameter to ib_create_send_mad and set it to IB_MGMT_BASE_VERSION for current
users.

Definition of the new base version and it's processing will occur in later
patches.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-06-12 14:49:17 -04:00
Matan Barak
8e37210b38 IB/core: Change ib_create_cq to use struct ib_cq_init_attr
Currently, ib_create_cq uses cqe and comp_vecotr instead
of the extendible ib_cq_init_attr struct.

Earlier patches already changed the vendors to work with
ib_cq_init_attr. This patch changes the consumers too.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-06-12 14:49:10 -04:00
Doug Ledford
b806ef3bbe Merge branch 'for-4.2-misc' into k.o/for-4.2 2015-06-02 09:33:22 -04:00
Bart Van Assche
5237496781 IB/ipoib: Fix RCU annotations in ipoib_neigh_hash_init()
Avoid that sparse complains about ipoib_neigh_hash_init(). This
patch does not change any functionality. See also patch "IPoIB:
Fix memory leak in the neigh table deletion flow" (commit ID
66172c0993).

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Shlomo Pongratz <shlomop@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-06-02 09:22:31 -04:00
Doug Ledford
175e8efe69 Merge branches 'bart-srp', 'generic-errors', 'ira-cleanups' and 'mwang-v8' into k.o/for-4.2 2015-05-20 16:12:40 -04:00
Sagi Grimberg
ea8a1616a7 iser-target: Align to generic logging helpers
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-18 13:44:22 -04:00
Sagi Grimberg
871e00afa4 IB/iser: Align to generic logging helpers
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-18 13:44:22 -04:00
Sagi Grimberg
57363d98cf IB/srp: Align to generic logging helpers
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-18 13:44:22 -04:00
Bart Van Assche
985aa49556 IB/srp: Add 64-bit LUN support
The SCSI standard defines 64-bit values for LUNs. Large arrays
employing large or hierarchical LUN numbers become more and more
common. So update the SRP initiator to use 64-bit LUN numbers.
See also Hannes Reinecke, commit 9cb78c16f5 ("scsi: use 64-bit LUNs"),
June 2014.

The largest LUN number that has been tested is 0xd2003fff00000000.

Checked the following structure sizes with gdb:
* sizeof(struct srp_cmd) = 48
* sizeof(struct srp_tsk_mgmt) = 48
* sizeof(struct srp_aer_req) = 36

The ibmvscsi changes have been compile tested only (on a PPC system).

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Yann Droneaud <ydroneaud@opteya.com>
Cc: Sebastian Parschauer <sebastian.riemer@profitbricks.com>
Cc: Brian King <brking@linux.vnet.ibm.com>
Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Cc: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-18 13:35:56 -04:00
Bart Van Assche
bbac5ccff4 IB/srp: Remove !ch->target tests from the reconnect code
Remove the !ch->target tests from the reconnect code. These
tests are not needed: upon entry of srp_rport_reconnect()
it is guaranteed that all ch->target pointers are non-NULL.
None of the functions srp_new_cm_id(), srp_finish_req(),
srp_create_ch_ib() nor srp_connect_ch() modifies this pointer.
srp_free_ch_ib() is never called concurrently with
srp_rport_reconnect().

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Cc: Sebastian Parschauer <sebastian.riemer@profitbricks.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-18 13:35:56 -04:00
Bart Van Assche
47513cf4f4 IB/srp: Remove a superfluous check from srp_free_req_data()
The function srp_free_req_data() does not use ch->target.
Hence remove the ch->target != NULL check.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: Sebastian Parschauer <sebastian.riemer@profitbricks.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-18 13:35:56 -04:00
Bart Van Assche
33ab3e5ba2 IB/srp: Rearrange module description
Move the module version and release date into separate fields.
This makes the modinfo output easier to read.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Cc: Sebastian Parschauer <sebastian.riemer@profitbricks.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-18 13:35:56 -04:00
Bart Van Assche
45c37cad40 IB/srp: Remove superfluous casts
A long time ago the data type int64_t was declared as long long
on x86 systems and as long on PPC systems. Today that data type
is declared as long long on all Linux architectures. This means
that the casts from uint64_t into unsigned long long are
superfluous. Remove these superfluous casts.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Cc: Sebastian Parschauer <sebastian.riemer@profitbricks.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-18 13:35:55 -04:00
Bart Van Assche
a44074f14b IB/srp: Fix reconnection failure handling
Although it is possible to let SRP I/O continue if a reconnect
results in a reduction of the number of channels, the current
code does not handle this scenario correctly. Instead of making
the reconnect code more complex, consider this as a reconnection
failure.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: Sebastian Parschauer <sebastian.riemer@profitbricks.com>
Cc: <stable@vger.kernel.org> #v3.19
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-18 13:35:55 -04:00
Bart Van Assche
c014c8cd31 IB/srp: Fix connection state tracking
Reception of a DREQ message only causes the state of a single
channel to change. Hence move the 'connected' member variable
from the target to the channel data structure. This patch
avoids that following false positive warning can be reported
by srp_destroy_qp():

WARNING: at drivers/infiniband/ulp/srp/ib_srp.c:617 srp_destroy_qp+0xa6/0x120 [ib_srp]()
Call Trace:
[<ffffffff8106e10f>] warn_slowpath_common+0x7f/0xc0
[<ffffffff8106e16a>] warn_slowpath_null+0x1a/0x20
[<ffffffffa0440226>] srp_destroy_qp+0xa6/0x120 [ib_srp]
[<ffffffffa0440322>] srp_free_ch_ib+0x82/0x1e0 [ib_srp]
[<ffffffffa044408b>] srp_create_target+0x7ab/0x998 [ib_srp]
[<ffffffff81346f60>] dev_attr_store+0x20/0x30
[<ffffffff811dd90f>] sysfs_write_file+0xef/0x170
[<ffffffff8116d248>] vfs_write+0xc8/0x190
[<ffffffff8116d411>] sys_write+0x51/0x90

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: Sebastian Parschauer <sebastian.riemer@profitbricks.com>
Cc: <stable@vger.kernel.org> #v3.19
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-18 13:35:55 -04:00
Bart Van Assche
8de9fe3a1d IB/srp: Fix a connection setup race
Avoid that receiving a DREQ while RDMA channels are being
established causes target->qp_in_error to be reset.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: Sebastian Parschauer <sebastian.riemer@profitbricks.com>
Cc: <stable@vger.kernel.org> #v3.19
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-18 13:35:55 -04:00
Bart Van Assche
fb49c8bbaa IB/srp: Remove an extraneous scsi_host_put() from an error path
Fix a scsi_get_host() / scsi_host_put() imbalance in the error
path of srp_create_target(). See also patch "IB/srp: Avoid that
I/O hangs due to a cable pull during LUN scanning" (commit ID
34aa654ecb).

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Cc: Sebastian Parschauer <sebastian.riemer@profitbricks.com>
Cc: <stable@vger.kernel.org> #v3.19
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-18 13:35:55 -04:00
Michael Wang
8e37ab68fe IB/Verbs: Reform IB-ulp ipoib
Use raw management helpers to reform IB-ulp ipoib.

Signed-off-by: Michael Wang <yun.wang@profitbricks.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Tested-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Tested-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-18 13:35:04 -04:00
Bart Van Assche
8f71c1a27b IPoIB/CM: Fix indentation level
See also patch "IPoIB/cm: Add connected mode support for devices
without SRQs" (commit ID 68e995a295). Detected by smatch.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-05 13:21:27 -04:00
Linus Torvalds
c6668726d2 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target updates from Nicholas Bellinger:
 "Lots of activity in target land the last months.

  The highlights include:

   - Convert fabric drivers tree-wide to target_register_template() (hch
     + bart)

   - iser-target hardening fixes + v1.0 improvements (sagi)

   - Convert iscsi_thread_set usage to kthread.h + kill
     iscsi_target_tq.c (sagi + nab)

   - Add support for T10-PI WRITE_STRIP + READ_INSERT operation (mkp +
     sagi + nab)

   - DIF fixes for CONFIG_DEBUG_SG=y + UNMAP file emulation (akinobu +
     sagi + mkp)

   - Extended TCMU ABI v2 for future BIDI + DIF support (andy + ilias)

   - Fix COMPARE_AND_WRITE handling for NO_ALLLOC drivers (hch + nab)

  Thanks to everyone who contributed this round with new features,
  bug-reports, fixes, cleanups and improvements.

  Looking forward, it's currently shaping up to be a busy v4.2 as well"

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (69 commits)
  target: Put TCMU under a new config option
  target: Version 2 of TCMU ABI
  target: fix tcm_mod_builder.py
  target/file: Fix UNMAP with DIF protection support
  target/file: Fix SG table for prot_buf initialization
  target/file: Fix BUG() when CONFIG_DEBUG_SG=y and DIF protection enabled
  target: Make core_tmr_abort_task() skip TMFs
  target/sbc: Update sbc_dif_generate pr_debug output
  target/sbc: Make internal DIF emulation honor ->prot_checks
  target/sbc: Return INVALID_CDB_FIELD if DIF + sess_prot_type disabled
  target: Ensure sess_prot_type is saved across session restart
  target/rd: Don't pass incomplete scatterlist entries to sbc_dif_verify_*
  target: Remove the unused flag SCF_ACK_KREF
  target: Fix two sparse warnings
  target: Fix COMPARE_AND_WRITE with SG_TO_MEM_NOALLOC handling
  target: simplify the target template registration API
  target: simplify target_xcopy_init_pt_lun
  target: remove the unused SCF_CMD_XCOPY_PASSTHROUGH flag
  target/rd: reduce code duplication in rd_execute_rw()
  tcm_loop: fixup tpgt string to integer conversion
  ...
2015-04-24 10:22:09 -07:00
Linus Torvalds
7c034dfd58 InfiniBand/RDMA updates for 4.1:
- IPoIB fixes from Doug Ledford and Erez Shitrit
  - iSER updates from Sagi Grimberg
  - mlx4 GUID handling changes from Yishai Hadas
  - other misc fixes
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCAAGBQJVN9SzAAoJEENa44ZhAt0hWq4QAJRFrwoe9ubTextSHeTU0FkY
 CydiQtGWrhyAHTX/KtdB1Uv9FzGHc6gqkAOXImouacYTM9ffypMF6Oj4xIYIMQtz
 MvNlNm07KOtQYlubiaZWcP5BjdLfMZjQxb03/9smygLTBjm80dAEt5X1znx7YrqI
 ZfE+ibPdvRqVEvFZKfT2U0kGU6oEVKrbJEiUCoJPwwcghDZQl18YmGOxt5qdI2uO
 V+71ozwozT8utSIl7S2YTJZBdkJ7tLrqrX2D/D2jUAmh1rqHIDrsXXiZ44UJj82i
 oXuwqmHXfq1LfuC9kxCX5JJpGeLE7E3OoxM1zIev31710zPA0v57rNKKweCi2Tj6
 Z36B0SIRV4ipWr/sBhVDr1Ffc/uap3DOIEU9Z+t8rwhELCEVuxmNaNb0K1e5nPiy
 YOQYp/ctC0NslM4mqQJLhGMVl6H8PjodbM1whnYZLsF1+8clNvdtLYzy/cA5fGbO
 tngUGXu0YZGdwvfuQhi5FB45XLaErJaPcMH0QRI5G0JgtjvbzXiMlqWtekTUBi7W
 DJNQlVRI4S1RYRBYkq709ymXiWwTeh3rhH+ZJpM+aY8b0NR/lx+dNyesNG+7GBJH
 y5UOOUck0w+JbQzZo264I6a5e8pXq3kMi3BH8pF4Jbo5WvxSF6uriXb6Q1JzfH20
 Jn0J6W9ghCSfrhMI1zgQ
 =v1jB
 -----END PGP SIGNATURE-----

Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

Pull InfiniBand/RDMA updates from Roland Dreier:

 - IPoIB fixes from Doug Ledford and Erez Shitrit

 - iSER updates from Sagi Grimberg

 - mlx4 GUID handling changes from Yishai Hadas

 - other misc fixes

* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (51 commits)
  mlx5: wrong page mask if CONFIG_ARCH_DMA_ADDR_T_64BIT enabled for 32Bit architectures
  IB/iser: Rewrite bounce buffer code path
  IB/iser: Bump version to 1.6
  IB/iser: Remove code duplication for a single DMA entry
  IB/iser: Pass struct iser_mem_reg to iser_fast_reg_mr and iser_reg_sig_mr
  IB/iser: Modify struct iser_mem_reg members
  IB/iser: Make fastreg pool cache friendly
  IB/iser: Move PI context alloc/free to routines
  IB/iser: Move fastreg descriptor pool get/put to helper functions
  IB/iser: Merge build page-vec into register page-vec
  IB/iser: Get rid of struct iser_rdma_regd
  IB/iser: Remove redundant assignments in iser_reg_page_vec
  IB/iser: Move memory reg/dereg routines to iser_memory.c
  IB/iser: Don't pass ib_device to fall_to_bounce_buff routine
  IB/iser: Remove a redundant struct iser_data_buf
  IB/iser: Remove redundant cmd_data_len calculation
  IB/iser: Fix wrong calculation of protection buffer length
  IB/iser: Handle fastreg/local_inv completion errors
  IB/iser: Fix unload during ep_poll wrong dereference
  ib_srpt: convert printk's to pr_* functions
  ...
2015-04-22 11:50:05 -07:00
Erez Shitrit
2c15395974 IB/ipoib: Fix ndo_get_iflink
Currently, iflink of the parent interface was always accessed, even
when interface didn't have a parent and hence we crashed there.

Handle the interface types properly: for a child interface, return
the ifindex of the parent, for parent interface, return its ifindex.

For child devices, make sure to set the parent pointer prior to
invoking register_netdevice(), this allows the new ndo to be called
by the stack immediately after the child device is registered.

Fixes: 5aa7add8f1 ('infiniband/ipoib: implement ndo_get_iflink')
Reported-by: Honggang Li <honli@redhat.com>
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Honggang Li <honli@redhat.com>
Reviewed-By: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>+
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-17 15:21:04 -04:00
Doug Ledford
c1c2fef6cf Merge branches 'cve-fixup', 'ipoib', 'iser', 'misc-4.1', 'or-mlx4' and 'srp' into for-4.1 2015-04-15 16:24:49 -04:00
Sagi Grimberg
ba943fb237 IB/iser: Rewrite bounce buffer code path
In some rare cases, IO operations may be not aligned to page
boundaries. This prevents iser from performing fast memory
registration. In order to overcome that iser uses a bounce
buffer to carry the transaction. We basically allocate a buffer
in the size of the transaction and perform a copy.

The buffer allocation using kmalloc is too restrictive since it
requires higher order (atomic) allocations for large transactions
(which may result in memory exhaustion fairly fast for some workloads).
We rewrite the bounce buffer code path to allocate scattered pages
and perform a copy between the transaction sg and the bounce sg.

Reported-by: Alex Lyakas <alex@zadarastorage.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 16:07:14 -04:00
Sagi Grimberg
4fcd1470a0 IB/iser: Bump version to 1.6
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 16:07:13 -04:00
Sagi Grimberg
ad1e567242 IB/iser: Remove code duplication for a single DMA entry
In singleton scatterlists, DMA memory registration code
is taken both for Fastreg and FMR code paths. Move it to
a function.

This patch does not change any functionality.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Adir Lev <adirl@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 16:07:13 -04:00
Sagi Grimberg
6ef8bb837d IB/iser: Pass struct iser_mem_reg to iser_fast_reg_mr and iser_reg_sig_mr
Instead of passing ib_sge as output variable, we pass the mem_reg
pointer to have the routines fill the rkey as well. This reduces
code duplication and extra assignments. This is a preparation step
to unify some registration logics together. Also, pass iser_fast_reg_mr
the fastreg descriptor directly.

This patch does not change any functionality.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Adir Lev <adirl@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 16:07:13 -04:00
Sagi Grimberg
90a6684c30 IB/iser: Modify struct iser_mem_reg members
No need to keep lkey, va, len variables, we can keep
them as struct ib_sge. This will help when we change the
memory registration logic.

This patch does not change any functionality.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Adir Lev <adirl@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 16:07:13 -04:00
Sagi Grimberg
8b95aa2c1b IB/iser: Make fastreg pool cache friendly
Memory regions are resources that are saved
in the device caches. Increase the probability for
a cache hit by adding the MRU descriptor to pool
head.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 16:07:13 -04:00
Sagi Grimberg
4dec2a27e3 IB/iser: Move PI context alloc/free to routines
Make iser_[create|destroy]_fastreg_desc shorter, more
readable and easily extendable.

This patch does not change any functionality.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Adir Lev <adirl@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 16:07:13 -04:00
Sagi Grimberg
bd8b944eee IB/iser: Move fastreg descriptor pool get/put to helper functions
Instead of open-coding connection fastreg pool get/put,
we introduce iser_reg_desc[get|put] helpers.

We aren't setting these static as this will be a per-device
routine later on. Also, cleanup iser_unreg_rdma_mem_fastreg
a bit.

This patch does not change any functionality.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Adir Lev <adirl@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 16:07:13 -04:00
Sagi Grimberg
f0e35c27a5 IB/iser: Merge build page-vec into register page-vec
No need for these two separate. Keep it in a single routine
like in the fastreg case. This will also make iser_reg_page_vec
closer to iser_fast_reg_mr arguments. This is a preparation
step for registration flow refactor.

This patch does not change any functionality.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Adir Lev <adirl@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 16:07:13 -04:00
Sagi Grimberg
b130ededff IB/iser: Get rid of struct iser_rdma_regd
This struct members other than struct iser_mem_reg are unused,
so remove it altogether.

This patch does not change any functionality.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Adir Lev <adirl@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 16:07:13 -04:00
Sagi Grimberg
6847fdeb0b IB/iser: Remove redundant assignments in iser_reg_page_vec
Buffer length was assigned twice, and no reason to set va to
io_addr and then add the offset, just set va to io_addr + offset.

This patch does not change any functionality.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Adir Lev <adirl@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 16:07:13 -04:00
Sagi Grimberg
d03e61d036 IB/iser: Move memory reg/dereg routines to iser_memory.c
As memory registration/de-registration methods, lets
move them to their natural location. While we're at it,
make iser_reg_page_vec routine static.

This patch does not change any functionality.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 16:07:12 -04:00
Sagi Grimberg
5640832590 IB/iser: Don't pass ib_device to fall_to_bounce_buff routine
No need to pass that, we can take it from the task.
In a later stage, this function will be invoked
according to a device capability.

This patch does not change any functionality.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Adir Lev <adirl@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 16:07:12 -04:00
Sagi Grimberg
e3784bd1d9 IB/iser: Remove a redundant struct iser_data_buf
No need to keep two iser_data_buf structures just in case we use
mem copy. We can avoid that just by adding a pointer to the original
sg. So keep only two iser_data_buf per command (data and protection)
and pass the relevant data_buf to bounce buffer routine.

This patch does not change any functionality.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Adir Lev <adirl@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 16:07:12 -04:00
Sagi Grimberg
ecc3993a2a IB/iser: Remove redundant cmd_data_len calculation
This code was added before we had protection data length
calculation (in iser_send_command), so we needed to calc
the sg data length from the sg itself. This is not needed
anymore.

This patch does not change any functionality.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Adir Lev <adirl@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 16:07:12 -04:00
Sagi Grimberg
a065fe6aa2 IB/iser: Fix wrong calculation of protection buffer length
This length miss-calculation may cause a silent data corruption
in the DIX case and cause the device to reference unmapped area.

Fixes: d77e65350f ('libiscsi, iser: Adjust data_length to include protection information')
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 16:07:12 -04:00
Sagi Grimberg
30bf1d58ae IB/iser: Handle fastreg/local_inv completion errors
Fast registration and local invalidate work requests can
also fail. We should call error completion handler for them.

Reported-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 16:07:12 -04:00
Sagi Grimberg
c4de4663e0 IB/iser: Fix unload during ep_poll wrong dereference
In case the user unloaded ib_iser while ep_connect is in
progress, we need to destroy the endpoint although ep_disconnect
wasn't invoked (we detect this by the iser conn state != DOWN).
However, if we got an REJECTED/UNREACHABLE CM event we move the
connection state to DOWN which will prevent us from destroying
the endpoint in the module unload stage. Fix this by setting the
connection state to TERMINATING in iser_conn_error so we can still
destroy the endpoint at unload stage.

Reported-by: Ariel Nahum <arieln@mellanox.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 16:07:12 -04:00
Doug Ledford
9f5d32af09 ib_srpt: convert printk's to pr_* functions
The driver already defined the pr_format, it just hadn't
been converted to use pr_info, pr_warn, and pr_err instead
of the equivalent printks.  Convert so that messages from
the driver are now properly tagged with their driver name
and can be more easily debugged.

In addition, a number of these printk's were not newline
terminated, so fix that at the same time.

Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 16:06:54 -04:00
Bart Van Assche
56b5390caf IB/srp: Use P_Key cache for P_Key lookups
This change slightly reduces the time needed to log in.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: David Dillow <dave@thedillows.org>
Cc: Sebastian Parschauer <sebastian.riemer@profitbricks.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 16:06:54 -04:00
Erez Shitrit
0e5544d9bf IB/ipoib: Remove IPOIB_MCAST_RUN bit
After Doug Ledford's changes there is no need in that bit, it's
semantic becomes subset of the IPOIB_FLAG_OPER_UP bit.

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 16:06:19 -04:00
Erez Shitrit
1e85b806f9 IB/ipoib: Save only IPOIB_MAX_PATH_REC_QUEUE skb's
Whenever there is no path->ah to the destination, keep only defined
number of skb's. Otherwise there are cases that the driver can keep
infinite list of skb's.

For example, when one device want to send unicast arp to the destination,
and from some reason the SM doesn't respond, the driver currently keeps
all the skb's. If that unicast arp traffic stopped, all  these skb's
are kept by the path object till the interface is down.

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 16:06:19 -04:00
Erez Shitrit
2c01073095 IB/ipoib: Handle QP in SQE state
As the result of a completion error the QP can moved to SQE state by
the hardware. Since it's not the Error state, there are no flushes
and hence the driver doesn't know about that.

The fix creates a task that after completion with error which is not a
flush tracks the QP state and if it is in SQE state moves it back to RTS.

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 16:06:19 -04:00
Erez Shitrit
3fd0605caa IB/ipoib: Update broadcast record values after each successful join request
Update the cached broadcast record in the priv object after every new
join of this broadcast domain group.

These values are needed for the port configuration (MTU size) and to
all the new multicast (non-broadcast) join requests initial parameters.

For example, SM starts with 2K MTU for all the fabric, and after that it
restarts (or handover to new SM) with new port configuration of 4K MTU.
Without using the new values, the driver will keep its old configuration
of 2K and will not apply the new configuration of 4K.

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 16:06:18 -04:00
Erez Shitrit
a44878d100 IB/ipoib: Use one linear skb in RX flow
The current code in the RX flow uses two sg entries for each incoming
packet, the first one was for the IB headers and the second for the rest
of the data, that causes two  dma map/unmap and two allocations, and few
more actions that were done at the data path.

Use only one linear skb on each incoming packet, for the data (IB
headers and payload), that reduces the packet processing in the
data-path (only one skb, no frags, the first frag was not used anyway,
less memory allocations) and the dma handling (only one dma map/unmap
over each incoming packet instead of two map/unmap per each incoming packet).

After commit 73d3fe6d1c ("gro: fix aggregation for skb using frag_list") from
Eric Dumazet, we will get full aggregation for large packets.

When running bandwidth tests before and after the (over the card's numa node),
using "netperf -H 1.1.1.3 -T -t TCP_STREAM", the results before are ~12Gbs before
and after ~16Gbs on my setup (Mellanox's ConnectX3).

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 16:06:18 -04:00
Doug Ledford
1c0453d64a IB/ipoib: drop mcast_mutex usage
We needed the mcast_mutex when we had to prevent the join completion
callback from having the value it stored in mcast->mc overwritten
by a delayed return from ib_sa_join_multicast.  By storing the return
of ib_sa_join_multicast in an intermediate variable, we prevent a
delayed return from ib_sa_join_multicast overwriting the valid
contents of mcast->mc, and we no longer need a mutex to force the
join callback to run after the return of ib_sa_join_multicast.  This
allows us to do away with the mutex entirely and protect our critical
sections with a just a spinlock instead.  This is highly desirable
as there were some places where we couldn't use a mutex because the
code was not allowed to sleep, and so we were currently using a mix
of mutex and spinlock to protect what we needed to protect.  Now we
only have a spin lock and the locking complexity is greatly reduced.

Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 16:06:18 -04:00
Doug Ledford
d2fe937ce6 IB/ipoib: deserialize multicast joins
Allow the ipoib layer to attempt to join all outstanding multicast
groups at once.  The ib_sa layer will serialize multiple attempts to
join the same group, but will process attempts to join different groups
in parallel.  Take advantage of that.

In order to make this happen, change the mcast_join_thread to loop
through all needed joins, sending a join request for each one that we
still need to join.  There are a few special cases we handle though:

1) Don't attempt to join anything but the broadcast group until the join
of the broadcast group has succeeded.
2) No longer restart the join task at the end of completion handling.
If we completed successfully, we are done.  The join task now needs kicked
either by mcast_send or mcast_restart_task or mcast_start_thread, but
should not need started anytime else except when scheduling a backoff
attempt to rejoin.
3) No longer use separate join/completion routines for regular and
sendonly joins, pass them all through the same routine and just do the
right thing based on the SENDONLY join flag.
4) Only try to join a SENDONLY join twice, then drop the packets and
quit trying.  We leave the mcast group in the list so that if we get a
new packet, all that we have to do is queue up the packet and restart
the join task and it will automatically try to join twice and then
either send or flush the queue again.

Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 16:06:18 -04:00