linux

Author	SHA1	Message	Date
Linus Torvalds	4e4adb2f46	NFS client updates for Linux 4.3 Highlights include: Stable patches: - Fix atomicity of pNFS commit list updates - Fix NFSv4 handling of open(O_CREAT\|O_EXCL\|O_RDONLY) - nfs_set_pgio_error sometimes misses errors - Fix a thinko in xs_connect() - Fix borkage in _same_data_server_addrs_locked() - Fix a NULL pointer dereference of migration recovery ops for v4.2 client - Don't let the ctime override attribute barriers. - Revert "NFSv4: Remove incorrect check in can_open_delegated()" - Ensure flexfiles pNFS driver updates the inode after write finishes - flexfiles must not pollute the attribute cache with attrbutes from the DS - Fix a protocol error in layoutreturn - Fix a protocol issue with NFSv4.1 CLOSE stateids Bugfixes + cleanups - pNFS blocks bugfixes from Christoph - Various cleanups from Anna - More fixes for delegation corner cases - Don't fsync twice for O_SYNC/IS_SYNC files - Fix pNFS and flexfiles layoutstats bugs - pnfs/flexfiles: avoid duplicate tracking of mirror data - pnfs: Fix layoutget/layoutreturn/return-on-close serialisation issues. - pnfs/flexfiles: error handling retries a layoutget before fallback to MDS Features: - Full support for the OPEN NFS4_CREATE_EXCLUSIVE4_1 mode from Kinglong - More RDMA client transport improvements from Chuck - Removal of the deprecated ib_reg_phys_mr() and ib_rereg_phys_mr() verbs from the SUNRPC, Lustre and core infiniband tree. - Optimise away the close-to-open getattr if there is no cached data -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJV7chgAAoJEGcL54qWCgDyqJQP/3kto9VXnXcatC382jF9Pfj5 F55XeSnviOXH7CyiKA4nSBhnxg/sLuWOTpbkVI/4Y+VyWhLby9h+mtcKURHOlBnj d5BFoPwaBVDnUiKlHFQDkRjIyxjj2Sb6/uEb2V/u3v+3znR5AZZ4lzFx4cD85oaz mcru7yGiSxaQCIH6lHExcCEKXaDP5YdvS9YFsyQfv2976JSaQHM9ZG04E0v6MzTo E5wwC4CLMKmhuX9kmQMj85jzs1ASAKZ3N7b4cApTIo6F8DCDH0vKQphq/nEQC497 ECjEs5/fpxtNJUpSBu0gT7G4LCiW3PzE7pHa+8bhbaAn9OzxIR5+qWujKsfGYQhO Oomp3K9zO6omshAc5w4MkknPpbImjoZjGAj/q/6DbtrDpnD7DzOTirwYY2yX0CA8 qcL81uJUb8+j4jJj4RTO+lTUBItrM1XTqTSd/3eSMr5DDRVZj+ERZxh17TaxRBZL YrbrLHxCHrcbdSbPlovyvY+BwjJUUFJRcOxGQXLmNYR9u92fF59rb53bzVyzcRRO wBozzrNRCFL+fPgfNPLEapIb6VtExdM3rl2HYsJGckHj4DPQdnoB3ytIT9iEFZEN +/rE14XEZte7kuH3OP4el2UsP/hVsm7A49mtwrkdbd7rxMWD6XfQUp8DAggWUEtI 1H6T7RC1Y6wsu0X1fnVz =knJA -----END PGP SIGNATURE----- Merge tag 'nfs-for-4.3-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull NFS client updates from Trond Myklebust: "Highlights include: Stable patches: - Fix atomicity of pNFS commit list updates - Fix NFSv4 handling of open(O_CREAT\|O_EXCL\|O_RDONLY) - nfs_set_pgio_error sometimes misses errors - Fix a thinko in xs_connect() - Fix borkage in _same_data_server_addrs_locked() - Fix a NULL pointer dereference of migration recovery ops for v4.2 client - Don't let the ctime override attribute barriers. - Revert "NFSv4: Remove incorrect check in can_open_delegated()" - Ensure flexfiles pNFS driver updates the inode after write finishes - flexfiles must not pollute the attribute cache with attrbutes from the DS - Fix a protocol error in layoutreturn - Fix a protocol issue with NFSv4.1 CLOSE stateids Bugfixes + cleanups - pNFS blocks bugfixes from Christoph - Various cleanups from Anna - More fixes for delegation corner cases - Don't fsync twice for O_SYNC/IS_SYNC files - Fix pNFS and flexfiles layoutstats bugs - pnfs/flexfiles: avoid duplicate tracking of mirror data - pnfs: Fix layoutget/layoutreturn/return-on-close serialisation issues - pnfs/flexfiles: error handling retries a layoutget before fallback to MDS Features: - Full support for the OPEN NFS4_CREATE_EXCLUSIVE4_1 mode from Kinglong - More RDMA client transport improvements from Chuck - Removal of the deprecated ib_reg_phys_mr() and ib_rereg_phys_mr() verbs from the SUNRPC, Lustre and core infiniband tree. - Optimise away the close-to-open getattr if there is no cached data" * tag 'nfs-for-4.3-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (108 commits) NFSv4: Respect the server imposed limit on how many changes we may cache NFSv4: Express delegation limit in units of pages Revert "NFS: Make close(2) asynchronous when closing NFS O_DIRECT files" NFS: Optimise away the close-to-open getattr if there is no cached data NFSv4.1/flexfiles: Clean up ff_layout_write_done_cb/ff_layout_commit_done_cb NFSv4.1/flexfiles: Mark the layout for return in ff_layout_io_track_ds_error() nfs: Remove unneeded checking of the return value from scnprintf nfs: Fix truncated client owner id without proto type NFSv4.1/flexfiles: Mark layout for return if the mirrors are invalid NFSv4.1/flexfiles: RW layouts are valid only if all mirrors are valid NFSv4.1/flexfiles: Fix incorrect usage of pnfs_generic_mark_devid_invalid() NFSv4.1/flexfiles: Fix freeing of mirrors NFSv4.1/pNFS: Don't request a minimal read layout beyond the end of file NFSv4.1/pnfs: Handle LAYOUTGET return values correctly NFSv4.1/pnfs: Don't ask for a read layout for an empty file. NFSv4.1: Fix a protocol issue with CLOSE stateids NFSv4.1/flexfiles: Don't mark the entire deviceid as bad for file errors SUNRPC: Prevent SYN+SYNACK+RST storms SUNRPC: xs_reset_transport must mark the connection as disconnected NFSv4.1/pnfs: Ensure layoutreturn reserves space for the opaque payload ...	2015-09-07 14:02:24 -07:00
Jason Gunthorpe	d1178cbcdc	IB/ipoib: Suppress warning for send only join failures We expect send only joins to fail, it just means there are no listeners for the group. The correct thing to do is silently drop the packet at source. Eg avahi will full join 224.0.0.251 which causes a send only IGMP packet to 224.0.0.22, and then a warning level kmessage like this: ib0: sendonly multicast join failed for ff12:401b:ffff:0000:0000:0000:0000:0016, status -22 If there is no IP router listening to IGMP. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-09-03 17:11:05 -04:00
Doug Ledford	c3acdc06a9	IB/ipoib: Clean up send-only multicast joins Even though we don't expect the group to be created by the SM we sill need to provide all the parameters to force the SM to validate they are correct. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-09-03 17:05:58 -04:00
Sagi Grimberg	7fbc67df2c	IB/srp: Fix possible protection fault srp_destroy_qp is designed to indicate we are safe to continue with freeing the channel resources by modifying the qp error state, posting a dummy wr on the queue-pair and waiting for it to flush. This also holds for the channel registration pool as we are unmapping the memory region when handling a scsi response. Destroying the channel registration pool before we make sure we processed all the inflight IO might introduce a use-after-free of the registration pool. This use-after-free is demonstrated in the stack trace below where srp is trying to unmap a used FMR after the fmr_pool was already destroyed. general protection fault: 0000 [#1] SMP RIP: 0010:[<ffffffff8151121b>] [<ffffffff8151121b>] _raw_spin_lock_irqsave+0x1b/0x50 Call Trace: [<ffffffffa055d88a>] ib_fmr_pool_unmap+0x1a/0xb0 [ib_core] [<ffffffffa06c00ed>] srp_unmap_data.isra.28+0x17d/0x250 [ib_srp] [<ffffffffa06c01eb>] srp_free_req+0x2b/0x60 [ib_srp] [<ffffffffa06c0c94>] srp_recv_completion+0x174/0x580 [ib_srp] [<ffffffffa04580fe>] mlx4_eq_int+0x4de/0xe50 [mlx4_core] [<ffffffffa0458b00>] mlx4_msi_x_interrupt+0x10/0x20 [mlx4_core] [<ffffffff810abc45>] handle_irq_event_percpu+0x35/0x1b0 [<ffffffff810abdf2>] handle_irq_event+0x32/0x50 [<ffffffff810ae5cf>] handle_edge_irq+0x6f/0x120 [<ffffffff8100455a>] handle_irq+0x1a/0x30 [<ffffffff8151b475>] do_IRQ+0x45/0xb0 [<ffffffff8151162d>] common_interrupt+0x6d/0x6d [<ffffffff813e4d2f>] cpuidle_enter_state+0x4f/0xc0 [<ffffffff813e4e6c>] cpuidle_idle_call+0xcc/0x210 [<ffffffff8100b9ea>] arch_cpu_idle+0xa/0x30 [<ffffffff810ab1e1>] cpu_startup_entry+0xe1/0x270 [<ffffffff81030b3a>] start_secondary+0x21a/0x2c0 Reported-by: Eliott Kespi <eliottk@mellanox.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-09-03 15:59:48 -04:00
Ira Weiny	0629cb06cd	IB/core: Move SM class defines from ib_mad.h to ib_smi.h When the hfi1 driver was added these definitions were moved from the qib driver to ib_mad.h to be used by both qib and hfi1. They should have been moved to ib_smi.h instead. Fixes: `d4ab347005` ("IB/core: Add core header changes needed for OPA") Reviewed-by: Hal Rosenstock <hal@mellanox.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-09-03 15:50:32 -04:00
Sagi Grimberg	b636401f0e	mlx5: Fix incorrect wc pkey_index assignment for GSI messages Since patch series "Demux IB CM requests in the rdma_cm module" the P_Key index is taken from the work completion rather than the message itself. The HCA provides us with the message P_Key. In order to provide the P_Key index, we need to look it up. Given that this is relevant only for GSI messages (session establishments) which is less performance critical, micro-optimize against the GSI (is_qp1) branch. Fixes: `4c21b5bcef` ("IB/cma: Add net_dev and private data checks to RDMA CM") Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-09-03 14:50:06 -04:00
Haggai Eran	11d748045c	IB/mlx5: avoid destroying a NULL mr in reg_user_mr error flow The mlx5_ib_reg_user_mr() function will attempt to call clean_mr() in its error flow even though there is never a case where the error flow occurs with a valid MR pointer to destroy. Remove the clean_mr() call and the incorrect comment above it. Fixes: `b4cfe447d4` ("IB/mlx5: Implement on demand paging by adding support for MMU notifiers") Cc: Eli Cohen <eli@mellanox.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-09-03 14:42:54 -04:00
Christoph Hellwig	b632ffa7ce	IB/uverbs: reject invalid or unknown opcodes We have many WR opcodes that are only supported in kernel space and/or require optional information to be copied into the WR structure. Reject all those not explicitly handled so that we can't pass invalid information to drivers. Cc: stable@vger.kernel.org Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-09-03 14:25:24 -04:00
Nicholas Krause	54b9a96f10	IB/cxgb4: Fix if statement in pick_local_ip6adddrs This fixes an if statement checking the return value of the function get_lladdr for success in the function pick_local_ip6addrs to instead of directly checking the return value of this call check the opposite as get_lladdr returns zero for success which would incorrectly make this if statement block not execute with the current if statement check. Signed-off-by: Nicholas Krause <xerofoify@gmail.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-09-03 14:03:59 -04:00
Kaike Wan	ba13b5f8f8	IB/sa: Fix rdma netlink message flags The flags to ibnl_put_msg should be NLM_F_REQUEST instead of GFP_KERNEL. Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: John Fleck <john.fleck@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-09-02 13:58:54 -04:00
Yishai Hadas	e1c30298cc	IB/ucma: HW Device hot-removal support Currently, IB/cma remove_one flow blocks until all user descriptor managed by IB/ucma are released. This prevents hot-removal of IB devices. This patch allows IB/cma to remove devices regardless of user space activity. Upon getting the RDMA_CM_EVENT_DEVICE_REMOVAL event we close all the underlying HW resources for the given ucontext. The ucontext itself is still alive till its explicit destroying by its creator. Running applications at that time will have some zombie device, further operations may fail. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Shachar Raindel <raindel@mellanox.com> Reviewed-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:41 -04:00
Yishai Hadas	ae184ddeca	IB/mlx4_ib: Disassociate support Implements the IB core disassociate_ucontext API. The driver detaches the HW resources for a given user context to prevent a dependency between application termination and device disconnecting. This is done by managing the VMAs that were mapped to the HW bars such as door bell and blueflame. When need to detach remap them to an arbitrary kernel page returned by the zap API. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:40 -04:00
Yishai Hadas	036b106357	IB/uverbs: Enable device removal when there are active user space applications Enables the uverbs_remove_one to succeed despite the fact that there are running IB applications working with the given ib device. This functionality enables a HW device to be unbind/reset despite the fact that there are running user space applications using it. It exposes a new IB kernel API named 'disassociate_ucontext' which lets a driver detaching its HW resources from a given user context without crashing/terminating the application. In case a driver implemented the above API and registered with ib_uverb there will be no dependency between its device to its uverbs_device. Upon calling remove_one of ib_uverbs the call should return after disassociating the open HW resources without waiting to clients disconnecting. In case driver didn't implement this API there will be no change to current behaviour and uverbs_remove_one will return only when last client has disconnected and reference count on uverbs device became 0. In case the lower driver device was removed any application will continue working over some zombie HCA, further calls will ended with an immediate error. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Shachar Raindel <raindel@mellanox.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:40 -04:00
Yishai Hadas	057aec0d23	IB/uverbs: Explicitly pass ib_dev to uverbs commands Done in preparation for deploying RCU for the device removal flow. Allows isolating the RCU handling to the uverb_main layer and keeping the uverbs_cmd code as is. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Shachar Raindel <raindel@mellanox.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:40 -04:00
Yishai Hadas	35d4a0b63d	IB/uverbs: Fix race between ib_uverbs_open and remove_one Fixes: `2a72f21226` ("IB/uverbs: Remove dev_table") Before this commit there was a device look-up table that was protected by a spin_lock used by ib_uverbs_open and by ib_uverbs_remove_one. When it was dropped and container_of was used instead, it enabled the race with remove_one as dev might be freed just after: dev = container_of(inode->i_cdev, struct ib_uverbs_device, cdev) but before the kref_get. In addition, this buggy patch added some dead code as container_of(x,y,z) can never be NULL and so dev can never be NULL. As a result the comment above ib_uverbs_open saying "the open method will either immediately run -ENXIO" is wrong as it can never happen. The solution follows Jason Gunthorpe suggestion from below URL: https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg25692.html cdev will hold a kref on the parent (the containing structure, ib_uverbs_device) and only when that kref is released it is guaranteed that open will never be called again. In addition, fixes the active count scheme to use an atomic not a kref to prevent WARN_ON as pointed by above comment from Jason. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Shachar Raindel <raindel@mellanox.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:40 -04:00
Yishai Hadas	03c40442a0	IB/uverbs: Fix reference counting usage of event files Fix the reference counting usage to be handled in the event file creation/destruction function, instead of being done by the caller. This is done for both async/non-async event files. Based on Jason Gunthorpe report at https://www.mail-archive.com/ linux-rdma@vger.kernel.org/msg24680.html: "The existing code for this is broken, in ib_uverbs_get_context all the error paths between ib_uverbs_alloc_event_file and the kref_get(file->ref) are wrong - this will result in fput() which will call ib_uverbs_event_close, which will try to do kref_put and ib_unregister_event_handler - which are no longer paired." Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Shachar Raindel <raindel@mellanox.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:39 -04:00
Jason Gunthorpe	7dd78647a2	IB/core: Make ib_dealloc_pd return void The majority of callers never check the return value, and even if they did, they can't do anything about a failure. All possible failure cases represent a bug in the caller, so just WARN_ON inside the function instead. This fixes a few random errors: net/rd/iw.c infinite loops while it fails. (racing with EBUSY?) This also lays the ground work to get rid of error return from the drivers. Most drivers do not error, the few that do are broken since it cannot be handled. Since uverbs can legitimately make use of EBUSY, open code the check. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:39 -04:00
Bart Van Assche	03f6fb93fd	IB/srp: Create an insecure all physical rkey only if needed The SRP initiator only needs this if the insecure register_always=N performance optimization is enabled, or if FRWR/FMR is not supported in the driver. Do not create an all physical MR unless it is needed to support either of those modes. Default register_always to true so the out of the box configuration does not create an insecure all physical MR. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> [bvanassche: reworked and rebased this patch] Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:39 -04:00
Bart Van Assche	330179f2fa	IB/srp: Register the indirect data buffer descriptor Instead of always using the global rkey for the indirect data buffer descriptor, register that descriptor with the HCA if the kernel module parameter register_always has been set to Y. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:38 -04:00
Bart Van Assche	002f15674c	IB/srp: Introduce srp_device.use_fmr Introduce the variable srp_device.use_fmr. Leave out the dev->has_fr / dev->has_fmr and ch->fr_pool / ch->fmr_pool checks since these are redundant. This patch does not change any functionality but makes the source code easier to read. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:38 -04:00
Bart Van Assche	3ae95da883	IB/srp: Remove use_mr argument from srp_map_sg_entry() Move the srp_map_desc() call from inside srp_map_sg_entry() to srp_map_sg() such that the use_mr argument can be removed from srp_map_sg_entry(). Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:38 -04:00
Bart Van Assche	0e0d3a4800	IB/srp: Remove the memory registration backtracking code Mapping a discontiguous sg-list requires multiple memory regions and hence can exhaust the memory region pool. The SRP initiator already handles this by temporarily reducing the queue depth. This means that it is safe to remove the memory registration backtracking code. This patch has been tested with direct I/O sizes up to 256 MB. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:37 -04:00
Bart Van Assche	f731ed6293	IB/srp: Add memory descriptor array pointer range checking Although most paths through which a request is submitted check block layer parameters like the max_segments limit, these are not checked when an SG_IO or direct I/O request is submitted. Hence add a range check for the memory descriptor array pointer. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:37 -04:00
Bart Van Assche	7e85c91970	IB/srp: Use multiple registrations for large memory regions Instead of using the global rkey for large memory regions, use multiple registrations. See also the while (dma_len) loop further down in srp_map_sg_entry(). Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:37 -04:00
Bart Van Assche	186fbc6689	IB/srp: Re-enable FMR for non-page aligned buffers During a discussion in 2011 nobody recalled why FMR was not used for non-page aligned buffers (see also http://thread.gmane.org/gmane.linux.drivers.rdma/7149). Re-enable FMR for such buffers. For the reason why the srp_map_fmr() function needs to be modified, see also patch "IB/srp: rework mapping engine to use multiple FMR entries" (commit ID 8f26c9ff9cd0; January 2011). Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:36 -04:00
Jason Gunthorpe	5a783956c2	ib_srpt: Remove ib_get_dma_mr calls The pd now has a local_dma_lkey member which completely replaces ib_get_dma_mr, use it instead. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:35 -04:00
Jason Gunthorpe	e6bf5f48d2	IB/srp: Use pd->local_dma_lkey Replace all leys with pd->local_dma_lkey. This driver does not support iWarp, so this is safe. The insecure use of ib_get_dma_mr is thus isolated to an rkey, and will have to be fixed separately. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:35 -04:00
Jason Gunthorpe	34efc7dfbd	iser-target: Remove ib_get_dma_mr calls The pd now has a local_dma_lkey member which completely replaces ib_get_dma_mr, use it instead. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:35 -04:00
Jason Gunthorpe	256b7ad273	IB/iser: Use pd->local_dma_lkey Replace all leys with pd->local_dma_lkey. This driver does not support iWarp, so this is safe. The insecure use of ib_get_dma_mr is thus isolated to an rkey, and this looks trivially fixed by forcing the use of registration in a future patch. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:34 -04:00
Jason Gunthorpe	b37c788f59	IB/mlx5: Remove ib_get_dma_mr calls The pd now has a local_dma_lkey member which completely replaces ib_get_dma_mr, use it instead. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:34 -04:00
Jason Gunthorpe	7dd9757628	IB/mlx4: Remove ib_get_dma_mr calls The pd now has a local_dma_lkey member which completely replaces ib_get_dma_mr, use it instead. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:34 -04:00
Jason Gunthorpe	77b1f99660	IB/ipoib: Remove ib_get_dma_mr calls The pd now has a local_dma_lkey member which completely replaces ib_get_dma_mr, use it instead. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:34 -04:00
Jason Gunthorpe	4be90bc60d	IB/mad: Remove ib_get_dma_mr calls The pd now has a local_dma_lkey member which completely replaces ib_get_dma_mr, use it instead. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:33 -04:00
Jason Gunthorpe	96249d70dd	IB/core: Guarantee that a local_dma_lkey is available Every single ULP requires a local_dma_lkey to do anything with a QP, so let us ensure one exists for every PD created. If the driver can supply a global local_dma_lkey then use that, otherwise ask the driver to create a local use all physical memory MR associated with the new PD. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: Sagi Grimberg <sagig@dev.mellanox.co.il> Acked-by: Christoph Hellwig <hch@infradead.org> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:33 -04:00
Sagi Grimberg	7332bed085	IB/iser: Chain all iser transaction send work requests Chaning of send work requests benefits performance by reducing the send queue lock contention (acquired in ib_post_send) and saves us HW doorbells which is posted only once. Currently, in normal IO flows iser does not chain the CDB send work request with the registration work request. Also in PI flows, signature work requests are not chained as well. Lets chain those and post only once. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:33 -04:00
Sagi Grimberg	1b16c9894b	IB/iser: Add debug prints to the various memory registration methods Easier to debug when we have the registration details. This patch does not change any functionality. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Adir Lev <adirl@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:32 -04:00
Sagi Grimberg	df749cdc45	IB/iser: Support up to 8MB data transfer in a single command iser support up to 512KB data transfer in a single scsi command. This means that larger IOs will split to different request. While iser can easily saturate FDR/EDR wires, some arrays are fine tuned for 1MB (or larger) IO sizes, hence add an option to support larger transfers (up to 8MB) if the device allows it. Given that a few target implementations don't support data transfers of more than 512KB by default and the fact that larger IO sizes require more resources, we introduce a module parameter to determine the maximum number of 512B sectors in a single scsi command. Users that are interested in larger transfers can change this value given that the target supports larger transfers. At the moment, iser works in 4K pages granularity, In a later stage we will get it to work with system page size instead. IO operations that consists of N pages will need a page vector of size N+1 in case the first SG element contains an offset. Given that some devices allocates memory regions in powers of 2, this means that allocating a region with N+1 pages, will result in region resources allocation of the next power of 2. Since we don't want that to happen, in case we are in the limit of IO size supported and the first SG element has an offset, we align the SG list using a bounce buffer (which is OK given that this is not likely to happen a lot). Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:32 -04:00
Sagi Grimberg	f8db651da2	IB/iser: Pass registration pool a size parameter Hard coded for now. This will allow to allocate different sized MRs depending on the IO size needed (and device capabilities). This patch does not change any functionality. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:32 -04:00
Sagi Grimberg	32467c420b	IB/iser: Unify fast memory registration flows iser_reg_rdma_mem_[fastreg\|fmr] share a lot of code, and logically do the same thing other than the buffer registration method itself (iser_fast_reg_mr vs. iser_fast_reg_fmr). The DIF logic is not implemented in the FMR flow as there is no existing device that supports FMRs and Signature feature. This patch unifies the flow in a single routine iser_reg_rdma_mem and just split to fmr/frwr for the buffer registration itself. Also, for symmetry reasons, unify iser_unreg_rdma_mem (which will call the relevant device specific unreg routine). Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Adir Lev <adirl@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:31 -04:00
Sagi Grimberg	81722909c8	IB/iser: Make reg_desc_get a per device routine As for fmrs we will hold a single registration descriptor as no need for multiple like in the frwr mode (descriptor for each task). This change helps unifying the duplicate registration code paths. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Adir Lev <adirl@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:31 -04:00
Sagi Grimberg	7d0483c927	IB/iser: Rename iser_reg_page_vec to iser_fast_reg_fmr Also, change a name of a local variable. This patch does not change any functionality. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Adir Lev <adirl@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:31 -04:00
Adir Lev	2b3bf95810	IB/iser: Maintain connection fmr_pool under a single registration descriptor This will allow us to unify the memory registration code path between the various methods which vary by the device capabilities. This change will make it easier and less intrusive to remove fmr_pools from the code when we'd want to. The reason we use a single descriptor is to avoid taking a redundant spinlock when working with FMRs. We also change the signature of iser_reg_page_vec to make it match iser_fast_reg_mr (and the future indirect registration method). Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Adir Lev <adirl@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:30 -04:00
Sagi Grimberg	385ad87d4b	IB/iser: Introduce iser registration pool struct Instead of having it a part of the connection structure, have it be under a dedicated (embedded) structure in the connection. A logical separation of the registration pool and the connection structure. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Adir Lev <adirl@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:30 -04:00
Sagi Grimberg	eb6ea8c36c	IB/iser: Move fastreg descriptor allocation to iser_create_fastreg_desc Don't have the caller allocate the structure and worry about freeing it in case the routine failed. This patch does not change any functionality. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Adir Lev <adirl@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:30 -04:00
Sagi Grimberg	48afbff673	IB/iser: Introduce iser_reg_ops Move all the per-device function pointers to an easy extensible iser_reg_ops structure that contains all the iser registration operations. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:29 -04:00
Sagi Grimberg	8c18ed03a9	IB/iser: Remove dead code in fmr_pool alloc/free In the past the we always tried to allocate an fmr_pool and if it failed on ENOSYS (not supported) then we continued with dma mr. This is not the case anymore and if we tried to allocate an fmr_pool then it is supported and we expect to succeed. Also, the check if fmr_pool is allocated when free is called is redundant as well as we are guaranteed it exists. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:29 -04:00
Sagi Grimberg	5190cc2664	IB/iser: Rename struct fast_reg_descriptor -> iser_fr_desc Avoid struct names without iser_ prefix. This patch does not change any functionality. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:29 -04:00
Sagi Grimberg	d711d81d64	IB/iser: Introduce struct iser_reg_resources Have fast_reg_descriptor hold struct iser_reg_resources (mr, frpl, valid flag). This will be useful when the actual buffer registration routines will be passed with the needed registration resources (i.e. iser_reg_resources) without being aware of their nature (i.e. data or protection). In order to achieve this, we remove reg_indicators flags container and place specific flags (mr_valid) within iser_reg_resources struct. We also place the sig_mr_valid and sig_protcted flags in iser_pi_context. This patch also modifies iser_fast_reg_mr to receive the reg_resources instead of the fast_reg_descriptor and a data/protection indicator. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Adir Lev <adirl@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:28 -04:00
Sagi Grimberg	ea18f5d777	IB/iser: Remove an unneeded print for unaligned memory We can do it in iser_aligned_data_len instead and it will save us an argument that is passed to fall_to_counce_buf just for the print. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:28 -04:00
Sagi Grimberg	b9abd8d21d	IB/iser: Remove a redundant always-false condition We always call iser_initialize_task_headers() and set the header tx_sg.lkey to the device mr lkey, so no point in checking it in iser_create_send_desc(). Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:28 -04:00
Sagi Grimberg	8d5944d803	IB/iser: Fix possible bogus DMA unmapping If iser_initialize_task_headers() routine failed before dma mapping, we should not attempt to unmap in cleanup_task(). Fixes: `7414dde0a6` (IB/iser: Fix race between iser connection ...) Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:28 -04:00
Sagi Grimberg	02816a8b88	IB/iser: Get rid of un-maintained counters We don't update those anywhere in the code and they seem pretty useless (no one seem to care about those). qp_tx_queue_full: We never should get this fmr_map_not_avail: We can never get to this eh_abort_cnt: We don't monitor aborts Go ahead and remove them. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:27 -04:00
Sagi Grimberg	d16739055b	IB/iser: Fix missing return status check in iser_send_data_out Since commit "IB/iser: Fix race between iser connection teardown..." iser_initialize_task_headers() might fail, so we need to check that. Fixes: `7414dde0a6` (IB/iser: Fix race between iser connection ...) Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:27 -04:00
Sagi Grimberg	1156cc80f8	IB/iser: Remove '.' from log message Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:27 -04:00
Sagi Grimberg	74ce897b7c	IB/iser: Change minor assignments and logging prints Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:26 -04:00
Jenny Falkovich	db0a6cbd21	IB/iser: Change some module parameters to be RO While we're at it, use permission defines instead of octal values and rearrange a little bit. Signed-off-by: Jenny Derzhavetz <jennyf@mellanox.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:26 -04:00
Kaike Wan	2ca546b92a	IB/sa: Route SA pathrecord query through netlink This patch routes a SA pathrecord query to netlink first and processes the response appropriately. If a failure is returned, the request will be sent through IB. The decision whether to route the request to netlink first is determined by the presence of a listener for the local service netlink multicast group. If the user-space local service netlink multicast group listener is not present, the request will be sent through IB, just like what is currently being done. Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: John Fleck <john.fleck@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:26 -04:00
Kaike Wan	5d2657708e	IB/sa: Allocate SA query with kzalloc Replace kmalloc with kzalloc so that all uninitialized fields in SA query will be zero-ed out to avoid unintentional consequence. This prepares the SA query structure to accept new fields in the future. Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: John Fleck <john.fleck@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:25 -04:00
Kaike Wan	bc10ed7d3d	IB/core: Add rdma netlink helper functions This patch adds a function to check if listeners for a netlink multicast group are present. It also adds a function to receive netlink response messages. Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: John Fleck <john.fleck@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:25 -04:00
Bart Van Assche	bc44bd1d86	IB/srp: Stop the scsi_eh_<n> and scsi_tmf_<n> threads if login fails scsi_host_alloc() not only allocates memory for a SCSI host but also creates the scsi_eh_<n> kernel thread and the scsi_tmf_<n> workqueue. Stop these threads if login fails by calling scsi_host_put(). Reported-by: Konstantin Krotov <kkv@clodo.ru> Fixes: `fb49c8bbaa` ("Remove an extraneous scsi_host_put() from an error path") Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Sebastian Parschauer <sebastian.riemer@profitbricks.com> Cc: <stable@vger.kernel.org> #v3.19 Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:24 -04:00
Bart Van Assche	713ef24e41	IB/srp: Bump driver version and release date Since version 1.0 e.g. scsi-mq has been added. Since this is a significant change, bump the driver version and release date. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Sebastian Parschauer <sebastian.riemer@profitbricks.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:24 -04:00
Bart Van Assche	c257ea6f9f	IB/srp: Handle partial connection success correctly Avoid that the following kernel warning is reported if the SRP target system accepts fewer channels per connection than what was requested by the initiator system: WARNING: at drivers/infiniband/ulp/srp/ib_srp.c:617 srp_destroy_qp+0xb1/0x120 [ib_srp]() Call Trace: [<ffffffff8105d67f>] warn_slowpath_common+0x7f/0xc0 [<ffffffff8105d6da>] warn_slowpath_null+0x1a/0x20 [<ffffffffa05419e1>] srp_destroy_qp+0xb1/0x120 [ib_srp] [<ffffffffa05445fb>] srp_create_ch_ib+0x19b/0x420 [ib_srp] [<ffffffffa0545257>] srp_create_target+0x7d7/0xa94 [ib_srp] [<ffffffff8138dac0>] dev_attr_store+0x20/0x30 [<ffffffff812079ef>] sysfs_write_file+0xef/0x170 [<ffffffff81191fc4>] vfs_write+0xb4/0x130 [<ffffffff8119276f>] sys_write+0x5f/0xa0 [<ffffffff815a0a59>] system_call_fastpath+0x16/0x1b Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Sebastian Parschauer <sebastian.riemer@profitbricks.com> Cc: Christoph Hellwig <hch@lst.de> Cc: stable@vger.kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:24 -04:00
Bart Van Assche	e6300cbd9b	IB/srp: Constify a function argument This patch does not change any functionality. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Sebastian Parschauer <sebastian.riemer@profitbricks.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:23 -04:00
Ariel Nahum	799cdaf8a9	IB/mlx4: Fix incorrect cq flushing in error state When handling a device internal error, the driver is responsible to drain the completion queue with flush errors. In case a completion queue was assigned to multiple send queues, the driver iterates over the send queues and generates flush errors of inflight wqes. The driver must correctly pass the wc array with an offset as a result of the previous send queue iteration. Not doing so will overwrite previously set completions and return a wrong number of polled completions which includes ones which were not correctly set. Fixes: `35f05dabf9` (IB/mlx4: Reset flow support for IB kernel ULPs) Signed-off-by: Ariel Nahum <arieln@mellanox.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Cc: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:23 -04:00
Noa Osherovich	5e99b139f1	IB/mlx4: Use correct SL on AH query under RoCE The mlx4 IB driver implementation for ib_query_ah used a wrong offset (28 instead of 29) when link type is Ethernet. Fixed to use the correct one. Fixes: `fa417f7b52` ('IB/mlx4: Add support for IBoE') Signed-off-by: Shani Michaeli <shanim@mellanox.com> Signed-off-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:23 -04:00
Jack Morgenstein	2b135db3e8	IB/mlx4: Forbid using sysfs to change RoCE pkeys The pkey mapping for RoCE must remain the default mapping: VFs: virtual index 0 = mapped to real index 0 (0xFFFF) All others indices: mapped to a real pkey index containing an invalid pkey. PF: virtual index i = real index i. Don't allow users to change these mappings using files found in sysfs. Fixes: `c1e7e46612` ('IB/mlx4: Add iov directory in sysfs under the ib device') Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:22 -04:00
Jack Morgenstein	2cb8e7f86e	IB/mlx4: Demote mcg message from warning to debug The mcg "too many pending requests" warning message fills the log when OpenSM is downed. Demote the message from warning level to debug level. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:22 -04:00
Jack Morgenstein	90c1d8b635	IB/mlx4: Fix potential deadlock when sending mad to wire send_mad_to_wire takes the same spinlock that is taken in the interrupt context. Therefore, it needs irqsave/restore. Fixes: `b9c5d6a643` ('IB/mlx4: Add multicast group (MCG) paravirtualization for SR-IOV') Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:22 -04:00
Doug Ledford	b8071ad893	IB/core: Remove needless bracketization Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:21 -04:00
Somnath Kotur	cc36929e73	RDMA/ocrdma: Incorporate the moving of GID Table mgmt to IB/Core 1.Change query_gid hook to return value from IB/Core GID management APIs. 2.Get rid of all the netdev notifier chain subscription code as well as maintenance of SGID Table in memory. 3.Implement get_netdev hook in driver. Signed-off-by: Somnath Kotur <somnath.kotur@avagotech.com> Signed-off-by: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:21 -04:00
Moni Shoua	5070cd2239	IB/mlx4: Replace mechanism for RoCE GID management Manage RoCE gid table with logic in IB/core, which is common to all vendors, and remove the mechanism from the mlx4 IB driver. Since management of the GID cache may lead to index mismatch with the hardware GID table, a translation between indexes is required when modifying a QP or creating an address handle. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:21 -04:00
Moni Shoua	e26be1bfef	IB/mlx4: Implement ib_device callbacks get_netdev: get the net_device on the physical port of the IB transport port. In port aggregation mode it is required to return the netdev of the active port. modify_gid: note for a change in the RoCE gid cache. Handle this by writing to the harsware GID table. It is possible that indexes in cahce and hardware tables won't match so a translation is required when modifying a QP or creating an address handle. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:20 -04:00
Matan Barak	238fdf48f2	IB/core: Add RoCE table bonding support Handling bonding and other devices require us to all all GIDs of the net-devices which are upper-devices of the RoCE port related net-device. Active-backup configurations imposes even more challenges as the default GID should only be set on the active devices (this is necessary as otherwise the same MAC could be used for several slaves and thus several slaves will have identical GIDs). Managing these configurations are done by listening to: (a) NETDEV_CHANGEUPPER event (1) if a related net-device is linked, delete all inactive slaves default GIDs and add the upper device GIDs. (2) if a related net-device is unlinked, delete all upper GIDs and add the default GIDs. (b) NETDEV_BONDING_FAILOVER: (1) delete the bond GIDs from inactive slaves (2) delete the inactive slave's default GIDs (3) Add the bond GIDs to the active slave. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:20 -04:00
Dan Carpenter	98d25afa97	IB/core: missing curly braces in ib_find_gid() Smatch says that, based on the indenting, we should probably add curly braces here. Fixes: `03db3a2d81` ('IB/core: Add RoCE GID table management') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:09 -04:00
Matan Barak	03db3a2d81	IB/core: Add RoCE GID table management RoCE GIDs are based on IP addresses configured on Ethernet net-devices which relate to the RDMA (RoCE) device port. Currently, each of the low-level drivers that support RoCE (ocrdma, mlx4) manages its own RoCE port GID table. As there's nothing which is essentially vendor specific, we generalize that, and enhance the RDMA core GID cache to do this job. In order to populate the GID table, we listen for events: (a) netdev up/down/change_addr events - if a netdev is built onto our RoCE device, we need to add/delete its IPs. This involves adding all GIDs related to this ndev, add default GIDs, etc. (b) inet events - add new GIDs (according to the IP addresses) to the table. For programming the port RoCE GID table, providers must implement the add_gid and del_gid callbacks. RoCE GID management requires us to state the associated net_device alongside the GID. This information is necessary in order to manage the GID table. For example, when a net_device is removed, its associated GIDs need to be removed as well. RoCE mandates generating a default GID for each port, based on the related net-device's IPv6 link local. In contrast to the GID based on the regular IPv6 link-local (as we generate GID per IP address), the default GID is also available when the net device is down (in order to support loopback). Locking is done as follows: The patch modify the GID table code both for new RoCE drivers implementing the add_gid/del_gid callbacks and for current RoCE and IB drivers that do not. The flows for updating the table are different, so the locking requirements are too. While updating RoCE GID table, protection against multiple writers is achieved via mutex_lock(&table->lock). Since writing to a table requires us to find an entry (possible a free entry) in the table and then modify it, this mutex protects both the find_gid and write_gid ensuring the atomicity of the action. Each entry in the GID cache is protected by rwlock. In RoCE, writing (usually results from netdev notifier) involves invoking the vendor's add_gid and del_gid callbacks, which could sleep. Therefore, an invalid flag is added for each entry. Updates for RoCE are done via a workqueue, thus sleeping is permitted. In IB, updates are done in write_lock_irq(&device->cache.lock), thus write_gid isn't allowed to sleep and add_gid/del_gid are not called. When passing net-device into/out-of the GID cache, the device is always passed held (dev_hold). The code uses a single work item for updating all RDMA devices, following a netdev or inet notifier. The patch moves the cache from being a client (which was incorrect, as the cache is part of the IB infrastructure) to being explicitly initialized/freed when a device is registered/removed. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:08:50 -04:00
Jason Gunthorpe	55aeed0654	IB/core: Make ib_alloc_device init the kobject This gets rid of the weird in-between state where struct ib_device was allocated but the kobject didn't work. Consequently ib_device_release is now guaranteed to be called in all situations and we needn't duplicate its kfrees on error paths. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:08:50 -04:00
Sagi Grimberg	d9f272c523	IB/core: Drop ib_alloc_fast_reg_mr Fully replaced by a more generic and suitable ib_alloc_mr. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:08:49 -04:00
Sagi Grimberg	1302f8452b	qib: Support ib_alloc_mr verb Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:08:48 -04:00
Sagi Grimberg	e02e4d554d	nes: Support ib_alloc_mr verb Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:08:48 -04:00
Sagi Grimberg	f683d3bdbb	cxgb3: Support ib_alloc_mr verb Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:08:47 -04:00
Sagi Grimberg	a21640347a	iw_cxgb4: Support ib_alloc_mr verb Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:08:47 -04:00
Sagi Grimberg	cacb7d59be	ocrdma: Support ib_alloc_mr verb Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:08:47 -04:00
Sagi Grimberg	679e34d1d0	mlx4: Support ib_alloc_mr verb Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:08:46 -04:00
Sagi Grimberg	b3778ba8de	mlx5: Drop mlx5_ib_alloc_fast_reg_mr Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:08:46 -04:00
Sagi Grimberg	563b67c5f9	IB/srp: Convert to ib_alloc_mr Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:08:45 -04:00
Sagi Grimberg	a89be2cc51	iser-target: Convert to ib_alloc_mr Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:08:45 -04:00
Sagi Grimberg	34780f012c	IB/iser: Convert to ib_alloc_mr Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:08:44 -04:00
Sagi Grimberg	9bee178b4f	IB: Modify ib_create_mr API Use ib_alloc_mr with specific parameters. Change the existing callers. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:08:44 -04:00
Sagi Grimberg	8b91ffc1cf	IB/core: Get rid of redundant verb ib_destroy_mr This was added in a thought of uniting all mr allocation and deallocation routines but the fact is we have a single deallocation routine already, ib_dereg_mr. And, move mlx5_ib_destroy_mr specific logic into mlx5_ib_dereg_mr (includes only signature stuff for now). And, fixup the only callers (iser/isert) accordingly. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:08:44 -04:00
Haggai Eran	be688195bd	IB/cma: Fix net_dev reference leak with failed requests When no matching listening ID is found for a given request, the net_dev that was used to find the request isn't released. Fixes: `0b3ca768fc` ("IB/cma: Use found net_dev for passive connections") Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:08:28 -04:00
Haggai Eran	73fec7fd04	IB/cm: Remove compare_data checks Now that there are no ib_cm clients using the compare_data feature for matching IB CM requests' private data, remove the compare_data parameter of ib_cm_listen and remove the code implementing the feature. Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 15:48:24 -04:00
Haggai Eran	51efe394bc	IB/cma: Share ib_cm_ids between rdma_cm_ids Use ib_cm_insert_listen to create listening IB CM IDs or share existing ones if needed. When given a request on a specific CM ID, the code now matches the request to the RDMA CM ID based on the request parameters, so it no longer needs to rely on the ib_cm's private data matching capabilities. Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 15:48:24 -04:00
Haggai Eran	0b3ca768fc	IB/cma: Use found net_dev for passive connections When receiving a new connection in cma_req_handler, we actually already know the net_dev that is used for the connection's creation. Instead of calling cma_translate_addr to resolve the new connection id's source address, just use the net_dev that was found. Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 15:48:24 -04:00
Haggai Eran	f887f2ac87	IB/cma: Validate routing of incoming requests Pass incoming request parameters through the relevant IPv4/IPv6 routing tables and make sure the network stack is configured to handle such requests. Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 15:48:23 -04:00
Haggai Eran	4c21b5bcef	IB/cma: Add net_dev and private data checks to RDMA CM Instead of relying on a the ib_cm module to check an incoming CM request's private data header, add these checks to the RDMA CM module. This allows a following patch to to clean up the ib_cm interface and remove the code that looks into the private headers. It will also allow supporting namespaces in RDMA CM by making these checks namespace aware later on. Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 15:48:23 -04:00
Haggai Eran	24cad9a7e8	IB/cm: Expose BTH P_Key in CM and SIDR request events The rdma_cm module will later use the P_Key from the BTH to de-mux requests. See discussion at: http://www.spinics.net/lists/netdev/msg336067.html Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Cc: Liran Liss <liranl@mellanox.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 15:48:23 -04:00
Haggai Eran	aac978e152	IB/cma: Helper functions to access port space IDRs Add helper functions to access the IDRs by port-space and port number. Pass around the port-space enum in cma.c instead of using pointers to port-space IDRs. Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Yotam Kenneth <yotamke@mellanox.com> Signed-off-by: Shachar Raindel <raindel@mellanox.com> Signed-off-by: Guy Shapiro <guysh@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 15:48:23 -04:00
Haggai Eran	0c505f70a2	IB/cma: Refactor RDMA IP CM private-data parsing code When receiving a connection request, rdma_cm needs to associate the request with a network device, in order to disambiguate requests. To do this, it needs to know the request's destination IP. For this the module needs to allow getting this information from the private data in the request packet, instead of relying on the information already being in the listening RDMA CM ID. When creating a new incoming connection ID, the code in cma_save_ip{4,6}_info can no longer rely on the listener's private data to find the port number, so it reads it from the requested service ID. Signed-off-by: Guy Shapiro <guysh@mellanox.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Yotam Kenneth <yotamke@mellanox.com> Signed-off-by: Shachar Raindel <raindel@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 15:48:22 -04:00
Haggai Eran	067b171b86	IB/cm: Share listening CM IDs Enabling network namespaces for RDMA CM will allow processes on different namespaces to listen on the same port. In order to leave namespace support out of the CM layer, this requires that multiple RDMA CM IDs will be able to share a single CM ID. This patch adds infrastructure to retrieve an existing listening ib_cm_id, based on its device and service ID, or create a new one if one does not already exist. It also adds a reference count for such instances (cm_id_private.listen_sharecount), and prevents cm_destroy_id from destroying a CM if it is still shared. See the relevant discussion [1]. [1] Re: [PATCH v3 for-next 05/13] IB/cm: Reference count ib_cm_ids http://www.spinics.net/lists/netdev/msg328860.html Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 15:48:22 -04:00
Haggai Eran	15865e7dab	IB/cm: Expose service ID in request events Expose the service ID on an incoming CM or SIDR request to the event handler. This will allow the RDMA CM module to de-multiplex connection requests based on the information encoded in the service ID. Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 15:48:22 -04:00
Guy Shapiro	ddde896e56	IB/ipoib: Return IPoIB devices matching connection parameters Implement the get_net_device_by_port_pkey_ip callback that returns network device to ib_core according to connection parameters. Check the ipoib device and iterate over all child devices to look for a match. For each IPoIB device we iterate through all upper devices when searching for a matching IP, in order to support bonding. Signed-off-by: Guy Shapiro <guysh@mellanox.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Yotam Kenneth <yotamke@mellanox.com> Signed-off-by: Shachar Raindel <raindel@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 15:48:21 -04:00
Yotam Kenneth	9268f72dcb	IB/core: Find the network device matching connection parameters In the case of IPoIB, and maybe in other cases, the network device is managed by an upper-layer protocol (ULP). In order to expose this network device to other users of the IB device, let ULPs implement a callback that returns network device according to connection parameters. The IB device and port, together with the P_Key and the GID should be enough to uniquely identify the ULP net device. However, in current kernels there can be multiple IPoIB interfaces created with the same GID. Furthermore, such configuration may be desireable to support ipvlan-like configurations for RDMA CM with IPoIB. To resolve the device in these cases the code will also take the IP address as an additional input. Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Yotam Kenneth <yotamke@mellanox.com> Signed-off-by: Shachar Raindel <raindel@mellanox.com> Signed-off-by: Guy Shapiro <guysh@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 15:48:21 -04:00
Haggai Eran	7c1eb45a22	IB/core: lock client data with lists_rwsem An ib_client callback that is called with the lists_rwsem locked only for read is protected from changes to the IB client lists, but not from ib_unregister_device() freeing its client data. This is because ib_unregister_device() will remove the device from the device list with lists_rwsem locked for write, but perform the rest of the cleanup, including the call to remove() without that lock. Mark client data that is undergoing de-registration with a new going_down flag in the client data context. Lock the client data list with lists_rwsem for write in addition to using the spinlock, so that functions calling the callback would be able to lock only lists_rwsem for read and let callbacks sleep. Since ib_unregister_client() now marks the client data context, no need for remove() to search the context again, so pass the client data directly to remove() callbacks. Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 15:48:21 -04:00
Haggai Eran	5aa44bb90f	IB/core: Add rwsem to allow reading device list or client list Currently the RDMA subsystem's device list and client list are protected by a single mutex. This prevents adding user-facing APIs that iterate these lists, since using them may cause a deadlock. The patch attempts to solve this problem by adding a read-write semaphore to protect the lists. Readers now don't need the mutex, and are safe just by read-locking the semaphore. The ib_register_device, ib_register_client, ib_unregister_device, and ib_unregister_client functions are modified to lock the semaphore for write during their respective list modification. Also, in order to make sure client callbacks are called only between add() and remove() calls, the code is changed to only add items to the lists after the add() calls and remove from the lists before the remove() calls. This patch attempts to solve a similar need [1] that was seen in the RoCE v2 patch series. [1] http://www.spinics.net/lists/linux-rdma/msg24733.html Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Cc: Matan Barak <matanb@mellanox.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 15:48:15 -04:00
Steve Wise	aaae91f4f0	ipath,qib: Expose max_sge_rd correctly Applications must not assume that max_sge and max_sge_rd are the same, Hence expose max_sge_rd correctly as well. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Acked-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-28 23:02:11 -04:00
Sagi Grimberg	18ebd40773	mlx4, mlx5, mthca: Expose max_sge_rd correctly Applications must not assume that max_sge and max_sge_rd are the same, Hence expose max_sge_rd correctly as well. Reported-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-28 23:02:10 -04:00
Dennis Dalessandro	d4ab347005	IB/core: Add core header changes needed for OPA This patch adds the value of the CNP opcode to the existing list of enumerated opcodes in ib_pack.h Add common OPA header definitions for driver build: - opa_port_info.h - opa_smi.h - hfi1_user.h Additionally, ib_mad.h, has additional definitions that are common to ib_drivers including: - trap support - cca support The qib driver has the duplication removed in favor those in ib_mad.h Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: John, Jubin <jubin.john@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-28 22:54:50 -04:00
Steve Wise	072bf1f7e4	RDMA/amso1100: Deprecate the amso1100 driver and move to staging The HW hasn't been sold since 2005, and the SW has definite bit rot. Its time to remove it. So move it to staging for a few releases and then remove it after that. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-28 22:54:49 -04:00
Dennis Dalessandro	6f9b38903c	IB/ipath: Deprecate ipath driver and move to staging. It is now time for the ipath driver to begin to be phased out of the kernel. This patch moves the ipath driver from the Infiniband sub tree to the staging area where it will remain until the code is removed from the kernel in a few releases. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-28 22:54:49 -04:00
Hariprasad S	84cc6ac62d	iw_cxgb4: Add support for clip Add support for ipv6 address handling clip api provided by lld Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-28 22:54:48 -04:00
Spencer Baugh	6c26a77124	RDMA/cma: fix IPv6 address resolution Resolving a link-local IPv6 address with an unspecified source address was broken by commit `5462eddd7a`, which prevented the IPv6 stack from learning the scope id of the link-local IPv6 address, causing random failures as the IP stack chose a random link to resolve the address on. This commit `5462eddd7a` made us bail out of cma_check_linklocal early if the address passed in was not an IPv6 link-local address. On the address resolution path, the address passed in is the source address; if the source address is the unspecified address, which is not link-local, we will bail out early. This is mostly correct, but if the destination address is a link-local address, then we will be following a link-local route, and we'll need to tell the IPv6 stack what the scope id of the destination address is. This used to be done by last line of cma_check_linklocal, which is skipped when bailing out early: dev_addr->bound_dev_if = sin6->sin6_scope_id; (In cma_bind_addr, the sin6_scope_id of the source address is set to the sin6_scope_id of the destination address, so this is correct) This line is required in turn for the following line, L279 of addr6_resolve, to actually inform the IPv6 stack of the scope id: fl6.flowi6_oif = addr->bound_dev_if; Since we can only know we are in this failure case when we have access to both the source IPv6 address and destination IPv6 address, we have to deal with this further up the stack. So detect this failure case in cma_bind_addr, and set bound_dev_if to the destination address scope id to correct it. Signed-off-by: Spencer Baugh <sbaugh@catern.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-28 22:54:48 -04:00
Jason Gunthorpe	7e967fd0b8	IB/ucma: Fix theoretical user triggered use-after-free Something like this: CPU A CPU B Acked-by: Sean Hefty <sean.hefty@intel.com> ======================== ================================ ucma_destroy_id() wait_for_completion() .. anything ucma_put_ctx() complete() .. continues ... ucma_leave_multicast() mutex_lock(mut) atomic_inc(ctx->ref) mutex_unlock(mut) ucma_free_ctx() ucma_cleanup_multicast() mutex_lock(mut) kfree(mc) rdma_leave_multicast(mc->ctx->cm_id,.. Fix it by latching the ref at 0. Once it goes to 0 mc and ctx cannot leave the mutex(mut) protection. The other atomic_inc in ucma_get_ctx is OK because mutex(mut) protects it from racing with ucma_destroy_id. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-28 22:54:48 -04:00
Hariprasad S	b8ac311246	iw_cxgb4: set the default MPA version to 2 This enables ORD/IRD negotiation and its about time to enable it by default Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-28 22:54:47 -04:00
Steve Wise	7854550ae6	RDMA/iser: Limit sgs to the device fastreg depth Currently the sg tablesize, which dictates fast register page list depth to use, does not take into account the limits of the rdma device. So adjust it once we discover the device fastreg max depth limit. Also adjust the max_sectors based on the resulting sg tablesize. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-28 22:54:47 -04:00
Roland Dreier	d6c7276be1	IB/mlx5: Remove dead code from alloc_cached_mr() The only place that assigns mr inside the loop already does a break. So "if (mr)" will never be true here since the function initializes mr to NULL at the top. We can just drop the extra if and break here. Signed-off-by: Roland Dreier <roland@purestorage.com> Acked-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-28 22:54:47 -04:00
Mike Marciniszyn	d6f1c17e16	IB/qib: Change lkey table allocation to support more MRs The lkey table is allocated with with a get_user_pages() with an order based on a number of index bits from a module parameter. The underlying kernel code cannot allocate that many contiguous pages. There is no reason the underlying memory needs to be physically contiguous. This patch: - switches the allocation/deallocation to vmalloc/vfree - caps the number of bits to 23 to insure at least 1 generation bit o this matches the module parameter description Cc: stable@vger.kernel.org Reviewed-by: Vinit Agnihotri <vinit.abhay.agnihotri@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-28 22:54:46 -04:00
Sagi Grimberg	e0238a6a36	mlx5: Expose correct page_size_cap in device attributes Should be all the page sizes that are supported by the device. Reported-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-28 22:54:46 -04:00
Sagi Grimberg	a3c874200c	mlx5: Fix missing device local_dma_lkey The mlx5 driver exposes device capability IB_DEVICE_LOCAL_DMA_LKEY but does not set the the device local_dma_lkey. This breaks rpcrdma drivers. Query and set this lkey when creating the device resources. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-28 22:54:46 -04:00
Andy Grover	13a3cf08fa	target/iscsi: Replace __kernel_sockaddr_storage with sockaddr_storage It appears to be what the rest of the kernel does, so let's do it too. Signed-off-by: Andy Grover <agrover@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2015-08-26 23:27:25 -07:00
Andy Grover	dc58f760e2	target/iscsi: Replace conn->login_ip with login_sockaddr Very similar to how it went with local_sockaddr. It was embedded in iscsi_login_stats so some changes there, and we needed to copy in a sockaddr_storage comparison function. Hopefully the kernel will get a standard one soon, our implementation makes the 3rd. isert_set_conn_info() became much smaller. IPV6_ADDRESS_SPACE define goes away, had to modify a call to in6_pton(), can just use -1 since we are sure string is null-terminated. Signed-off-by: Andy Grover <agrover@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2015-08-26 23:27:20 -07:00
Andy Grover	69d755747d	target/iscsi: Keep local_ip as the actual sockaddr This is a more natural format that lets us format it with the appropriate printk specifier as needed. This also lets us handle v4-mapped ipv6 addresses a little more nicely, by storing the addr as an actual v4 sockaddr in conn->local_sockaddr. Finally, we no longer need to maintain variables for port, since this is contained in sockaddr. Remove iscsi_np.np_port and iscsi_conn.local_port. Signed-off-by: Andy Grover <agrover@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2015-08-26 23:27:16 -07:00
David S. Miller	dc25b25897	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/usb/qmi_wwan.c Overlapping additions of new device IDs to qmi_wwan.c Signed-off-by: David S. Miller <davem@davemloft.net>	2015-08-21 11:44:04 -07:00
Linus Torvalds	bf6740281e	Changes for 4.2-rc - Bugfix in iw_cxgb4 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJVzkfOAAoJELgmozMOVy/dW2oQAMN8UQxwk+r69rEPWPJSHvG5 0Ia8EuAL4CQavIvWDxWBNmGRe/1SjxXSb6OGdHWD73wpKzoWJcGatS9ZTaXFRvWR wkqfXtZ501ZG9ejQBloJ23sgqqUuk9TEOEfonlowFwjbXA6mB8MEoM/QkmFvhgP0 Wda7jDEe4SVWOhipDMWNWtQhxIJtuwgggHeiBdjQASBTyFmA/SJlGf9UgLcJxShh 0kvuo1nxeC1p4xcKspyMo4x1xlMi0hURp611r8JZ34BuV2q4SUNM3dLsGwgxJipZ P0kvvfjvZxF31pgfeNkPRTli736pLJNPXHMfVJvkBLysqk3LjjS+KfOMTqTiNDYP Rldmk0CH20It1LWTVGIhw4CUOH05sKZSOvObUQAB8MbXgUXYGxbPB2od7PQYnDWA +GhOnWj6PjVQKGgIqO3S9NPgKtFxeA7gbKqYlsOYf/KQtP/Trhghm23jX3+dEfpy bmzn6qzKgM0mcGqzKAjxAMMzh4enAg0X5FFoyGhzk8ZC1b+0/0LgiRDVZvaJdQc7 rVRwe3Tk+pAF65Hk89KeGqwbhOm4JGFxno1BXabtt4orb/uSgqW9W9vf5BRW0mWS C7RQnBrsE82ADJ5sGKNuE31bGu5Daaucf2hOC91vKzUgWmnVtDPpfrh93iBYoUSK RkZD5+bPaDgN9roFzLUh =idkv -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull rdma bugfix from Doug Ledford: "Bugfix in iw_cxgb4" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: iw_cxgb4: gracefully handle unknown CQE status errors	2015-08-17 16:26:30 -07:00
Chuck Lever	1241d7bf2a	core: Remove the ib_reg_phys_mr() and ib_rereg_phys_mr() verbs The verbs are obsolete. The ib_rereg_phys_mr() verb is not used by kernel ULPs, and the last ib_reg_phys_mr() call site in the kernel tree has now been removed. Two staging tree call sites remain in the Lustre client. The Lustre team has been notified of the deprecation of reg_phys_mr. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Acked-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-08-05 16:21:28 -04:00
David S. Miller	5510b3c2a1	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: arch/s390/net/bpf_jit_comp.c drivers/net/ethernet/ti/netcp_ethss.c net/bridge/br_multicast.c net/ipv4/ip_fragment.c All four conflicts were cases of simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-31 23:52:20 -07:00
Linus Torvalds	733db573a6	Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending Pull SCSI target fixes from Nicholas Bellinger: "This series is larger than what I'd normally be conformable with sending for a -rc5 PULL request.. However, the bulk of the series is localized to qla2xxx target specific fixes that address a number of real-world correctness issues, that have been outstanding on the list for ~6 weeks now. They where submitted + verified + acked by the HW LLD vendor, contributed by a major production customer of the code, and are marked for v3.18.y stable code. That said, I don't see a good reason to wait another month to get these fixes into mainline. Beyond the qla2xx specific fixes, this series also includes: - bugfix for a long standing use-after-free in iscsi-target during TPG shutdown + demo-mode sessions. - bugfix for a >= v4.0 regression OOPs in iscsi-target during a iscsi_start_kthreads() failure. - bugfix for a >= v4.0 regression hang in iscsi-target for iser explicit session/connection logout. - bugfix for a iser-target bug where a early CMA REJECTED status during login triggers a NULL pointer dereference OOPs. - bugfixes for a handful of v4.2-rc1 specific regressions related to the larger set of recent backend configfs attribute changes. A big thanks to QLogic + Pure Storage for the qla2xxx target bugfixes" * git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (28 commits) Documentation/target: Fix tcm_mod_builder.py build breakage iser-target: Fix REJECT CM event use-after-free OOPs iscsi-target: Fix iser explicit logout TX kthread leak iscsi-target: Fix iscsit_start_kthreads failure OOPs iscsi-target: Fix use-after-free during TPG session shutdown qla2xxx: terminate exchange when command is aborted by LIO qla2xxx: drop cmds/tmrs arrived while session is being deleted qla2xxx: disable scsi_transport_fc registration in target mode qla2xxx: added sess generations to detect RSCN update races qla2xxx: Abort stale cmds on qla_tgt_wq when plogi arrives qla2xxx: delay plogi/prli ack until existing sessions are deleted qla2xxx: cleanup cmd in qla workqueue before processing TMR qla2xxx: kill sessions/log out initiator on RSCN and port down events qla2xxx: fix command initialization in target mode. qla2xxx: Remove msleep in qlt_send_term_exchange qla2xxx: adjust debug flags qla2xxx: release request queue reservation. qla2xxx: Add flush after updating ATIOQ consumer index. qla2xxx: Enable target mode for ISP27XX qla2xxx: Fix hardware lock/unlock issue causing kernel panic. ...	2015-07-29 09:54:40 -07:00
Linus Torvalds	956325bd55	Changes for 4.2-rc - Two minor bug fixes - Relicense ocrdma driver to dual license, GPL or BSD -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJVt4+aAAoJELgmozMOVy/dsiwQALBVIIscxVh+GGuRvDopAwA7 nfsAAqK/XFO3kT3QNSO3gsj4yhIfgR8EXjLiKVrYAWNIUz4NwtlH0lDDH741pVxS v8dVFuH4KuKw9Hotu4G7W4M3AslszZZQAjSVGSgzrmvIBf9vS6A33S8gfCnNshnZ o/dccJalsAPMsf9OY4vVJE4Jpc0vhwaLxxILK8l3qBImnrJrzoV8jQc+SGWFA2V1 IePSEMuVVcXEVLaxaOCtQEoI40UHDZHAqINw+QvNzGBpaY8FysnwAAl0p9r5a5VU jteYZ+jvAkbSC/GUpbM6s/MziltTookbFyC6pMIbGAPi43Hz1khA2eYofJ30bihP TAyNGv9yzwkPcTjq6NMxpj2O06ITyzDSCLlC6/dOcV9nrfMz/xoPZ8zaVpFuTt/Q bHpXgqyQcz8RWpX4A2c5iomgKwX4lwMtsDf/0fLlZTwHWBc3nGuGHeX1wTtemK/i ClEUNmZR14/jaQdeLJU6bvGmUo03nhCvIVvIJa8wpqL4nAzUsKDQMVTEK2A26mzz SD2JaFMiDACAvhU1UjkmdpUAycyFacbCgkm9FpwsuaK5lFhkzRohD1kM9oyzz0Q8 p0HTikxjU6Ve1s6kM6c6dISD3x5beDR2cNDQuQaIcrDO8HETugQ5m+gXeVOuH8zo PRe/SehnsjTmHHFq6s+v =TYMP -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull rdma fixes from Doug Ledford: - two minor bug fixes - relicense ocrdma driver to dual license, GPL or BSD * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: RDMA/ocrdma: update ocrdma module license string RDMA/ocrdma: update ocrdma license to dual-license IB/ipoib: Fix CONFIG_INFINIBAND_IPOIB_CM RDMA/cxgb3: fail get_dma_mr on 64 bit arches	2015-07-28 14:20:16 -07:00
Hariprasad S	3661df179b	iw_cxgb4: gracefully handle unknown CQE status errors c4iw_poll_cq_on() shouldn't fail the poll operation just because the CQE status is unknown. Rather, it should map this to the "fatal error" status and log the anomaly. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-07-28 11:24:51 -04:00
Hadar Hen Zion	e802f8e4c5	net/mlx4: Prepare VLAN macros for 802.1ad Hardware accelerated support To add Hardware accelerated support in 802.1ad vlan, replace Current VLAN macros to CVLAN. Replace: MLX4_WQE_CTRL_INS_VLAN MLX4_CQE_VLAN_PRESENT_MASK With: MLX4_WQE_CTRL_INS_CVLAN MLX4_CQE_CVLAN_PRESENT_MASK Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-27 15:00:37 -07:00
Nicholas Bellinger	ce9a9fc20a	iser-target: Fix REJECT CM event use-after-free OOPs This patch fixes a bug in iser-target code where the REJECT CM event handler code currently performs a isert_put_conn() for the final isert_conn->kref put, while iscsi_np process context is still blocked in isert_get_login_rx(). Once isert_get_login_rx() is awoking due to login timeout, iscsi_np process context will attempt to invoke iscsi_target_login_sess_out() to cleanup iscsi_conn as expected, and calls isert_wait_conn() + isert_free_conn() which triggers the use-after-free OOPs. To address this bug, move the kref_get_unless_zero() call from isert_connected_handler() into isert_connect_request() immediately preceeding isert_rdma_accept() to ensure the CM handler cleanup paths and isert_free_conn() are always operating with two refs. Cc: Sagi Grimberg <sagig@mellanox.com> Cc: <stable@vger.kernel.org> # v3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2015-07-24 14:19:45 -07:00
Devesh Sharma	b8f5595eb9	RDMA/ocrdma: update ocrdma module license string Change module_license from "GPL" to "Dual BSD/GPL" Cc: Tejun Heo <tj@kernel.org> Cc: Duan Jiong <duanj.fnst@cn.fujitsu.com> Cc: Roland Dreier <roland@purestorage.com> Cc: Jes Sorensen <Jes.Sorensen@redhat.com> Cc: Sasha Levin <levinsasha928@gmail.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Colin Ian King <colin.king@canonical.com> Cc: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Cc: Moni Shoua <monis@mellanox.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Li RongQing <roy.qing.li@gmail.com> Cc: Devendra Naga <devendra.aaru@gmail.com> Signed-off-by: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-07-24 11:34:41 -04:00
Devesh Sharma	71ee67306e	RDMA/ocrdma: update ocrdma license to dual-license Change of license from GPLv2 to dual-license (GPLv2 and BSD 2-Clause) All contributors were contacted off-list and permission to make this change was received. The complete list of contributors are Cc:ed here. Cc: Tejun Heo <tj@kernel.org> Cc: Duan Jiong <duanj.fnst@cn.fujitsu.com> Cc: Roland Dreier <roland@purestorage.com> Cc: Jes Sorensen <Jes.Sorensen@redhat.com> Cc: Sasha Levin <levinsasha928@gmail.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Colin Ian King <colin.king@canonical.com> Cc: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Cc: Moni Shoua <monis@mellanox.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Li RongQing <roy.qing.li@gmail.com> Cc: Devendra Naga <devendra.aaru@gmail.com> Signed-off-by: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-07-24 11:34:34 -04:00
Jason Gunthorpe	efc1eedbf6	IB/ipoib: Fix CONFIG_INFINIBAND_IPOIB_CM If the above is turned off then ipoib_cm_dev_init unconditionally returns ENOSYS, and the newly added error handling in 0b3957 prevents ipoib from coming up at all: kernel: mlx4_0: ipoib_transport_dev_init failed kernel: mlx4_0: failed to initialize port 1 (ret = -12) Fixes: `0b39578bcd` (IB/ipoib: Use dedicated workqueues per interface) Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-07-24 11:34:26 -04:00
Steve Wise	49fa63d8c2	RDMA/cxgb3: fail get_dma_mr on 64 bit arches T3 HW only supports 32 bit MRs. If the system uses 64 bit memory addresses, then a registered 32 bit MR will wrap and write to the wrong memory when used with addresses > 4GB. To prevent this, simply fail to allocate an MR on 64 bit machines (other means of registering memory are still available and software can still work, we just don't allow this means of memory registration). Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-07-24 11:34:17 -04:00
Luis R. Rodriguez	fd0a1b8607	x86/mm/pat, drivers/infiniband/ipath: Replace WARN() with pr_warn() WARN() may confuse users, fix that. ipath_init_one() is part the device's probe so this would only be triggered if a corresponding device was found. Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com> Acked-by: Doug Ledford <dledford@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: andy@silverblocksystems.net Cc: benh@kernel.crashing.org Cc: bp@suse.de Cc: dan.j.williams@intel.com Cc: jkosina@suse.cz Cc: julia.lawall@lip6.fr Cc: luto@amacapital.net Cc: mchehab@osg.samsung.com Link: http://lkml.kernel.org/r/1437167245-28273-2-git-send-email-mcgrof@do-not-panic.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-07-21 09:42:54 +02:00
Johannes Thumshirn	d8b2ba7c59	IB/core: Destroy ocrdma_dev_id IDR on module exit Destroy ocrdma_dev_id IDR on module exit, reclaiming the allocated memory. This was detected by the following semantic patch (written by Luis Rodriguez <mcgrof@suse.com>) <SmPL> @ defines_module_init @ declarer name module_init, module_exit; declarer name DEFINE_IDR; identifier init; @@ module_init(init); @ defines_module_exit @ identifier exit; @@ module_exit(exit); @ declares_idr depends on defines_module_init && defines_module_exit @ identifier idr; @@ DEFINE_IDR(idr); @ on_exit_calls_destroy depends on declares_idr && defines_module_exit @ identifier declares_idr.idr, defines_module_exit.exit; @@ exit(void) { ... idr_destroy(&idr); ... } @ missing_module_idr_destroy depends on declares_idr && defines_module_exit && !on_exit_calls_destroy @ identifier declares_idr.idr, defines_module_exit.exit; @@ exit(void) { ... +idr_destroy(&idr); } </SmPL> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-07-14 13:20:16 -04:00
Johannes Thumshirn	45d254206f	IB/core: Destroy multcast_idr on module exit Destroy multcast_idr on module exit, reclaiming the allocated memory. This was detected by the following semantic patch (written by Luis Rodriguez <mcgrof@suse.com>) <SmPL> @ defines_module_init @ declarer name module_init, module_exit; declarer name DEFINE_IDR; identifier init; @@ module_init(init); @ defines_module_exit @ identifier exit; @@ module_exit(exit); @ declares_idr depends on defines_module_init && defines_module_exit @ identifier idr; @@ DEFINE_IDR(idr); @ on_exit_calls_destroy depends on declares_idr && defines_module_exit @ identifier declares_idr.idr, defines_module_exit.exit; @@ exit(void) { ... idr_destroy(&idr); ... } @ missing_module_idr_destroy depends on declares_idr && defines_module_exit && !on_exit_calls_destroy @ identifier declares_idr.idr, defines_module_exit.exit; @@ exit(void) { ... +idr_destroy(&idr); } </SmPL> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-07-14 13:20:15 -04:00
Doug Ledford	d9a047aeff	IB/mlx4: Optimize do_slave_init There is little chance our memory allocation will fail, so we can combine initializing the work structs with allocating them instead of looping through all of them once to allocate and again to initialize. Then when we need to actually find out if our device is up or in the process of going down, have all of our work structs batched up, take the spin_lock once and only once, and do all of the batch under the one spin_lock invocation instead of incurring all of the locked memory cycles we would otherwise incur to take/release the spin_lock over and over again. Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-07-14 13:20:15 -04:00
Doug Ledford	9bbf282da8	IB/mlx4: Fix memory leak in do_slave_init We create a number of work structs to be queued up to a workqueue, and on completion of the workqueue handler, the workqueue handler frees the allocated memory. If, however, we don't queue the work struct because the device is going down, then we need to free the memory ourselves. Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-07-14 13:20:15 -04:00
Maninder Singh	a39a98ff4c	IB/mlx4: Optimize freeing of items on error unwind On failure, we loop through all possible pointers and test them before calling kfree. But really, why even attempt to free items we didn't allocate when we can easily loop through exactly and only the devices for which the original memory allocation succeeded and free just those. Signed-off-by: Maninder Singh <maninder1.s@samsung.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-07-14 13:20:14 -04:00
Or Gerlitz	43bfb9729e	IB/mlx4: Fix use of flow-counters for process_mad For IB links, reading HCA flow counters through iboe_process_mad() should be used when mlx4_ib_process_mad() is invoked only for VFs PMA queries and exactly nothing else. Fixes: `7193a141eb` ('IB/mlx4: Set VF to read from QP counters') Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-07-14 13:20:14 -04:00
Vaishali Thakkar	cb1ff431c3	IB/ipath: Convert use of __constant_<foo> to <foo> In little endian cases, the macros be16_to_cpu and cpu_to_be64 unfolds to __swab{16,64} which provides special case for constants. In big endian cases, __constant_be16_to_cpu and be16_to_cpu expand directly to the same expression. The same applies for __constant_cpu_to_be64 and cpu_to_be64. So, replace __constant_be16_to_cpu with be16_to_cpu and __constant_cpu_to_be64 with cpu_to_be64, with the goal of getting rid of the definition of __constant_be16_to_cpu and __constant_cpu_to_be64 completely. Signed-off-by: Vaishali Thakkar <vthakkar1994@gmail.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-07-14 13:20:13 -04:00
Erez Shitrit	edcd2a7474	IB/ipoib: Set MTU to max allowed by mode when mode changes When switching between modes (datagram / connected) change the MTU accordingly. datagram mode up to 4K, connected mode up to (64K - 0x10). Signed-off-by: ELi Cohen <eli@mellanox.com> Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-07-14 13:20:13 -04:00
Yuval Shaia	c42687784b	IB/ipoib: Scatter-Gather support in connected mode By default, IPoIB-CM driver uses 64k MTU. Larger MTU gives better performance. This MTU plus overhead puts the memory allocation for IP based packets at 32 4k pages (order 5), which have to be contiguous. When the system memory under pressure, it was observed that allocating 128k contiguous physical memory is difficult and causes serious errors (such as system becomes unusable). This enhancement resolve the issue by removing the physically contiguous memory requirement using Scatter/Gather feature that exists in Linux stack. With this fix Scatter-Gather will be supported also in connected mode. This change reverts some of the change made in commit `e112373fd6` ("IPoIB/cm: Reduce connected mode TX object size"). The ability to use SG in IPoIB CM is possible because the coupling between NETIF_F_SG and NETIF_F_CSUM was removed in commit `ec5f061564` ("net: Kill link between CSUM and SG features.") Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Acked-by: Christian Marie <christian@ponies.io> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-07-14 13:20:13 -04:00
Carol L Soto	59d40dd92c	IB/ucm: Fix bitmap wrap when devnum > IB_UCM_MAX_DEVICES ib_ucm_release_dev clears the wrong bit if devnum is greater than IB_UCM_MAX_DEVICES. Signed-off-by: Carol L Soto <clsoto@linux.vnet.ibm.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-07-14 13:20:12 -04:00
Haggai Eran	8b7cce0dae	IB/ipoib: Prevent lockdep warning in __ipoib_ib_dev_flush __ipoib_ib_dev_flush calls itself recursively on child devices, and lockdep complains about locking vlan_rwsem twice (see below). Use down_read_nested instead of down_read to prevent the warning. ============================================= [ INFO: possible recursive locking detected ] 4.1.0-rc4+ #36 Tainted: G O --------------------------------------------- kworker/u20:2/261 is trying to acquire lock: (&priv->vlan_rwsem){.+.+..}, at: [<ffffffffa0791e2a>] __ipoib_ib_dev_flush+0x3a/0x2b0 [ib_ipoib] but task is already holding lock: (&priv->vlan_rwsem){.+.+..}, at: [<ffffffffa0791e2a>] __ipoib_ib_dev_flush+0x3a/0x2b0 [ib_ipoib] other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&priv->vlan_rwsem); lock(&priv->vlan_rwsem); * DEADLOCK * May be due to missing lock nesting notation 3 locks held by kworker/u20:2/261: #0: ("%s""ipoib_flush"){.+.+..}, at: [<ffffffff810827cc>] process_one_work+0x15c/0x760 #1: ((&priv->flush_heavy)){+.+...}, at: [<ffffffff810827cc>] process_one_work+0x15c/0x760 #2: (&priv->vlan_rwsem){.+.+..}, at: [<ffffffffa0791e2a>] __ipoib_ib_dev_flush+0x3a/0x2b0 [ib_ipoib] stack backtrace: CPU: 3 PID: 261 Comm: kworker/u20:2 Tainted: G O 4.1.0-rc4+ #36 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007 Workqueue: ipoib_flush ipoib_ib_dev_flush_heavy [ib_ipoib] ffff8801c6c54790 ffff8801c9927af8 ffffffff81665238 0000000000000001 ffffffff825b5b30 ffff8801c9927bd8 ffffffff810bba51 ffff880100000000 ffffffff00000001 ffff880100000001 ffff8801c6c55428 ffff8801c6c54790 Call Trace: [<ffffffff81665238>] dump_stack+0x4f/0x6f [<ffffffff810bba51>] __lock_acquire+0x741/0x1820 [<ffffffff810bcbf8>] lock_acquire+0xc8/0x240 [<ffffffffa0791e2a>] ? __ipoib_ib_dev_flush+0x3a/0x2b0 [ib_ipoib] [<ffffffff81669d2c>] down_read+0x4c/0x70 [<ffffffffa0791e2a>] ? __ipoib_ib_dev_flush+0x3a/0x2b0 [ib_ipoib] [<ffffffffa0791e2a>] __ipoib_ib_dev_flush+0x3a/0x2b0 [ib_ipoib] [<ffffffffa0791e4a>] __ipoib_ib_dev_flush+0x5a/0x2b0 [ib_ipoib] [<ffffffffa07920ba>] ipoib_ib_dev_flush_heavy+0x1a/0x20 [ib_ipoib] [<ffffffff81082871>] process_one_work+0x201/0x760 [<ffffffff810827cc>] ? process_one_work+0x15c/0x760 [<ffffffff81082ef0>] worker_thread+0x120/0x4d0 [<ffffffff81082dd0>] ? process_one_work+0x760/0x760 [<ffffffff81082dd0>] ? process_one_work+0x760/0x760 [<ffffffff81088b7e>] kthread+0xfe/0x120 [<ffffffff81088a80>] ? __init_kthread_worker+0x70/0x70 [<ffffffff8166c6e2>] ret_from_fork+0x42/0x70 [<ffffffff81088a80>] ? __init_kthread_worker+0x70/0x70 Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-07-14 13:20:12 -04:00
Haggai Eran	31b57b87fd	IB/ucma: Fix lockdep warning in ucma_lock_files The ucma_lock_files() locks the mut mutex on two files, e.g. for migrating an ID. Use mutex_lock_nested() to prevent the warning below. ============================================= [ INFO: possible recursive locking detected ] 4.1.0-rc6-hmm+ #40 Tainted: G O --------------------------------------------- pingpong_rpc_se/10260 is trying to acquire lock: (&file->mut){+.+.+.}, at: [<ffffffffa047ac55>] ucma_migrate_id+0xc5/0x248 [rdma_ucm] but task is already holding lock: (&file->mut){+.+.+.}, at: [<ffffffffa047ac4b>] ucma_migrate_id+0xbb/0x248 [rdma_ucm] other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&file->mut); lock(&file->mut); * DEADLOCK * May be due to missing lock nesting notation 1 lock held by pingpong_rpc_se/10260: #0: (&file->mut){+.+.+.}, at: [<ffffffffa047ac4b>] ucma_migrate_id+0xbb/0x248 [rdma_ucm] stack backtrace: CPU: 0 PID: 10260 Comm: pingpong_rpc_se Tainted: G O 4.1.0-rc6-hmm+ #40 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007 ffff8801f85b63d0 ffff880195677b58 ffffffff81668f49 0000000000000001 ffffffff825cbbe0 ffff880195677c38 ffffffff810bb991 ffff880100000000 ffff880100000000 ffff880100000001 ffff8801f85b7010 ffffffff8121bee9 Call Trace: [<ffffffff81668f49>] dump_stack+0x4f/0x6e [<ffffffff810bb991>] __lock_acquire+0x741/0x1820 [<ffffffff8121bee9>] ? dput+0x29/0x320 [<ffffffff810bcb38>] lock_acquire+0xc8/0x240 [<ffffffffa047ac55>] ? ucma_migrate_id+0xc5/0x248 [rdma_ucm] [<ffffffff8166b901>] ? mutex_lock_nested+0x291/0x3e0 [<ffffffff8166b6d5>] mutex_lock_nested+0x65/0x3e0 [<ffffffffa047ac55>] ? ucma_migrate_id+0xc5/0x248 [rdma_ucm] [<ffffffff810baeed>] ? trace_hardirqs_on+0xd/0x10 [<ffffffff8166b66e>] ? mutex_unlock+0xe/0x10 [<ffffffffa047ac55>] ucma_migrate_id+0xc5/0x248 [rdma_ucm] [<ffffffffa0478474>] ucma_write+0xa4/0xb0 [rdma_ucm] [<ffffffff81200674>] __vfs_write+0x34/0x100 [<ffffffff8112427c>] ? __audit_syscall_entry+0xac/0x110 [<ffffffff810ec055>] ? current_kernel_time+0xc5/0xe0 [<ffffffff812aa4d3>] ? security_file_permission+0x23/0x90 [<ffffffff8120088d>] ? rw_verify_area+0x5d/0xe0 [<ffffffff812009bb>] vfs_write+0xab/0x120 [<ffffffff81201519>] SyS_write+0x59/0xd0 [<ffffffff8112427c>] ? __audit_syscall_entry+0xac/0x110 [<ffffffff8166ffee>] system_call_fastpath+0x12/0x76 Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-07-14 13:20:12 -04:00
Tatyana Nikolova	0a69127250	RDMA/nes: Fix for incorrect recording of the MAC address Fix for incorrect recording of the MAC address Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-07-14 13:20:11 -04:00
Tatyana Nikolova	165d682289	RDMA/nes: Fix for resolving the neigh Neighbor resolution doesn't work without this fix Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-07-14 13:20:11 -04:00
Tatyana Nikolova	a7f2f24cd7	RDMA/core: Fixes for port mapper client registration Fixes to allow clients to make remove mapping requests, after they have provided the user space service with the mapping information, they are using when the service is restarted. 1) Adding IWPM_REG_VALID, IWPM_REG_INCOMPL and IWPM_REG_UNDEF registration types for the port mapper clients and functions to set/check the registration type. 2) If the port mapper user space service is not available to register the client, then its registration stays IWPM_REG_UNDEF and the registration isn't checked until the service becomes available (no mappings are possible, if the user space service isn't running). 3) After the service is restarted, the user space port mapper pid is set to valid and the client registration is set to IWPM_REG_INCOMPL to allow the client to make remove mapping requests. Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-07-14 13:20:10 -04:00
Amir Vadai	58e9cc90cd	IB/IPoIB: Fix bad error flow in ipoib_add_port() Error values of ib_query_port() and ib_query_device() weren't propagated correctly. Because of that, ipoib_add_port() could return NULL value, which escaped the IS_ERR() check in ipoib_add_one() and we crashed. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-07-14 13:20:10 -04:00
Matan Barak	8a7ff14dcb	IB/mlx4: Do not attemp to report HCA clock offset on VFs mlx4 VFs can provide CQE raw time-stamping services, but they don't have the hca core clock mapped to their PCI bars. As such, we should not attempt to query and report the clock offset to user space for VFs. Doing so causes query_device over VFs to fail with -ENOSUPP. Fixes: `4b664c4355` ('IB/mlx4: Add support for CQ time-stamping') Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-07-14 13:20:10 -04:00
Erez Shitrit	be4b499323	IB/cm: Do not queue work to a device that's going away Whenever ib_cm gets remove_one call, like when there is a hot-unplug event, the driver should mark itself as going_down and confirm that no new works are going to be queued for that device. so, the order of the actions are: 1. mark the going_down bit. 2. flush the wq. 3. [make sure no new works for that device.] 4. unregister mad agent. otherwise, works that are already queued can be scheduled after the mad agent was freed. Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-07-14 13:20:09 -04:00
Sagi Grimberg	3fdf70acec	IB/srp: Avoid using uninitialized variable We might return res which is not initialized. Also reduce code duplication by exporting srp_parse_tmo so srp_tmo_set can reuse it. Detected by Coverity. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Jenny Falkovich <jennyf@mellanox.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-07-14 13:20:09 -04:00
Vaishali Thakkar	b356c1c1f9	IB/srpt: Convert use of __constant_cpu_to_beXX to cpu_to_beXX In little endian cases, the macro cpu_to_be{16,32,64} unfolds to __swab{16,32,64} which provides special case for constants. In big endian cases, __constant_cpu_to_be{16,32,64} and cpu_to_be{16,32,64} expand directly to the same expression. So, replace __constant_cpu_to_be{16,32,64} with cpu_to_be{16,32,64} with the goal of getting rid of the definitions of __constant_cpu_to_be{16,32,64} completely. The Coccinelle semantic patch that performs this transformation is as follows: @@expression x;@@ ( - __constant_cpu_to_be16(x) + cpu_to_be16(x) \| - __constant_cpu_to_be32(x) + cpu_to_be32(x) \| - __constant_cpu_to_be64(x) + cpu_to_be64(x) ) Signed-off-by: Vaishali Thakkar <vthakkar1994@gmail.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-07-14 13:20:09 -04:00
Ira Weiny	3b8ab700af	IB/mad: Remove improper use of BUG_ON We recently added BUG_ON's which were inappropriate for a condition which should never happen. Change these to be WARN_ON_ONCE as a debugging aid. Fixes: `4cd7c9479a` ('IB/mad: Add support for additional MAD info to/from drivers') Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-07-14 13:20:08 -04:00
Ira Weiny	cd4cd565e0	IB/mad: Fix compare between big endian and cpu endian The define OPA_LID_PERMISSIVE is big endian and was compared to the cpu endian variable opa_drslid. Problem caught by 0-day build infrastructure. Fixes: `8e4349d13f` (IB/mad: Add final OPA MAD processing) Signed-off-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: John, Jubin <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-07-14 13:20:08 -04:00
Hal Rosenstock	4139032b48	IB: Add rdma_cap_ib_switch helper and use where appropriate Persuant to Liran's comments on node_type on linux-rdma mailing list: In an effort to reform the RDMA core and ULPs to minimize use of node_type in struct ib_device, an additional bit is added to struct ib_device for is_switch (IB switch). This is needed to be initialized by any IB switch device driver. This is a NEW requirement on such device drivers which are all "out of tree". In addition, an ib_switch helper was added to ib_verbs.h based on the is_switch device bit rather than node_type (although those should be consistent). The RDMA core (MAD, SMI, agent, sa_query, multicast, sysfs) as well as (IPoIB and SRP) ULPs are updated where appropriate to use this new helper. In some cases, the helper is now used under the covers of using rdma_[start end]_port rather than the open coding previously used. Reviewed-by: Sean Hefty <sean.hefty@intel.com> Reviewed-By: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Hal Rosenstock <hal@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-07-14 13:20:08 -04:00
Linus Torvalds	1dc51b8288	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull more vfs updates from Al Viro: "Assorted VFS fixes and related cleanups (IMO the most interesting in that part are f_path-related things and Eric's descriptor-related stuff). UFS regression fixes (it got broken last cycle). 9P fixes. fs-cache series, DAX patches, Jan's file_remove_suid() work" [ I'd say this is much more than "fixes and related cleanups". The file_table locking rule change by Eric Dumazet is a rather big and fundamental update even if the patch isn't huge. - Linus ] * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (49 commits) 9p: cope with bogus responses from server in p9_client_{read,write} p9_client_write(): avoid double p9_free_req() 9p: forgetting to cancel request on interrupted zero-copy RPC dax: bdev_direct_access() may sleep block: Add support for DAX reads/writes to block devices dax: Use copy_from_iter_nocache dax: Add block size note to documentation fs/file.c: __fget() and dup2() atomicity rules fs/file.c: don't acquire files->file_lock in fd_install() fs:super:get_anon_bdev: fix race condition could cause dev exceed its upper limitation vfs: avoid creation of inode number 0 in get_next_ino namei: make set_root_rcu() return void make simple_positive() public ufs: use dir_pages instead of ufs_dir_pages() pagemap.h: move dir_pages() over there remove the pointless include of lglock.h fs: cleanup slight list_entry abuse xfs: Correctly lock inode when removing suid and file capabilities fs: Call security_ops->inode_killpriv on truncate fs: Provide function telling whether file_remove_privs() will do anything ...	2015-07-04 19:36:06 -07:00
Linus Torvalds	5c755fe142	Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending Pull SCSI target updates from Nicholas Bellinger: "It's been a busy development cycle for target-core in a number of different areas. The fabric API usage for se_node_acl allocation is now within target-core code, dropping the external API callers for all fabric drivers tree-wide. There is a new conversion to RCU hlists for se_node_acl and se_portal_group LUN mappings, that turns fast-past LUN lookup into a completely lockless code-path. It also removes the original hard-coded limitation of 256 LUNs per fabric endpoint. The configfs attributes for backends can now be shared between core and driver code, allowing existing drivers to use common code while still allowing flexibility for new backend provided attributes. The highlights include: - Merge sbc_verify_dif_* into common code (sagi) - Remove iscsi-target support for obsolete IFMarker/OFMarker (Christophe Vu-Brugier) - Add bidi support in target/user backend (ilias + vangelis + agover) - Move se_node_acl allocation into target-core code (hch) - Add crc_t10dif_update common helper (akinobu + mkp) - Handle target-core odd SGL mapping for data transfer memory (akinobu) - Move transport ID handling into target-core (hch) - Move task tag into struct se_cmd + support 64-bit tags (bart) - Convert se_node_acl->device_list[] to RCU hlist (nab + hch + paulmck) - Convert se_portal_group->tpg_lun_list[] to RCU hlist (nab + hch + paulmck) - Simplify target backend driver registration (hch) - Consolidate + simplify target backend attribute implementations (hch + nab) - Subsume se_port + t10_alua_tg_pt_gp_member into se_lun (hch) - Drop lun_sep_lock for se_lun->lun_se_dev RCU usage (hch + nab) - Drop unnecessary core_tpg_register TFO parameter (nab) - Use 64-bit LUNs tree-wide (hannes) - Drop left-over TARGET_MAX_LUNS_PER_TRANSPORT limit (hannes)" * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (76 commits) target: Bump core version to v5.0 target: remove target_core_configfs.h target: remove unused TARGET_CORE_CONFIG_ROOT define target: consolidate version defines target: implement WRITE_SAME with UNMAP bit using ->execute_unmap target: simplify UNMAP handling target: replace se_cmd->execute_rw with a protocol_data field target/user: Fix inconsistent kmap_atomic/kunmap_atomic target: Send UA when changing LUN inventory target: Send UA upon LUN RESET tmr completion target: Send UA on ALUA target port group change target: Convert se_lun->lun_deve_lock to normal spinlock target: use 'se_dev_entry' when allocating UAs target: Remove 'ua_nacl' pointer from se_ua structure target_core_alua: Correct UA handling when switching states xen-scsiback: Fix compile warning for 64-bit LUN target: Remove TARGET_MAX_LUNS_PER_TRANSPORT target: use 64-bit LUNs target: Drop duplicate + unused se_dev_check_wce target: Drop unnecessary core_tpg_register TFO parameter ...	2015-07-04 14:13:43 -07:00
Linus Torvalds	2d01eedf1d	Merge branch 'akpm' (patches from Andrew) Merge third patchbomb from Andrew Morton: - the rest of MM - scripts/gdb updates - ipc/ updates - lib/ updates - MAINTAINERS updates - various other misc things * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (67 commits) genalloc: rename of_get_named_gen_pool() to of_gen_pool_get() genalloc: rename dev_get_gen_pool() to gen_pool_get() x86: opt into HAVE_COPY_THREAD_TLS, for both 32-bit and 64-bit MAINTAINERS: add zpool MAINTAINERS: BCACHE: Kent Overstreet has changed email address MAINTAINERS: move Jens Osterkamp to CREDITS MAINTAINERS: remove unused nbd.h pattern MAINTAINERS: update brcm gpio filename pattern MAINTAINERS: update brcm dts pattern MAINTAINERS: update sound soc intel patterns MAINTAINERS: remove website for paride MAINTAINERS: update Emulex ocrdma email addresses bcache: use kvfree() in various places libcxgbi: use kvfree() in cxgbi_free_big_mem() target: use kvfree() in session alloc and free IB/ehca: use kvfree() in ipz_queue_{cd}tor() drm/nouveau/gem: use kvfree() in u_free() drm: use kvfree() in drm_free_large() cxgb4: use kvfree() in t4_free_mem() cxgb3: use kvfree() in cxgb_free_mem() ...	2015-07-01 17:47:51 -07:00
Linus Torvalds	02201e3f1b	Minor merge needed, due to function move. Main excitement here is Peter Zijlstra's lockless rbtree optimization to speed module address lookup. He found some abusers of the module lock doing that too. A little bit of parameter work here too; including Dan Streetman's breaking up the big param mutex so writing a parameter can load another module (yeah, really). Unfortunately that broke the usual suspects, !CONFIG_MODULES and !CONFIG_SYSFS, so those fixes were appended too. Cheers, Rusty. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJVkgKHAAoJENkgDmzRrbjxQpwQAJVmBN6jF3SnwbQXv9vRixjH 58V33sb1G1RW+kXxQ3/e8jLX/4VaN479CufruXQp+IJWXsN/CH0lbC3k8m7u50d7 b1Zeqd/Yrh79rkc11b0X1698uGCSMlzz+V54Z0QOTEEX+nSu2ZZvccFS4UaHkn3z rqDo00lb7rxQz8U25qro2OZrG6D3ub2q20TkWUB8EO4AOHkPn8KWP2r429Axrr0K wlDWDTTt8/IsvPbuPf3T15RAhq1avkMXWn9nDXDjyWbpLfTn8NFnWmtesgY7Jl4t GjbXC5WYekX3w2ZDB9KaT/DAMQ1a7RbMXNSz4RX4VbzDl+yYeSLmIh2G9fZb1PbB PsIxrOgy4BquOWsJPm+zeFPSC3q9Cfu219L4AmxSjiZxC3dlosg5rIB892Mjoyv4 qxmg6oiqtc4Jxv+Gl9lRFVOqyHZrTC5IJ+xgfv1EyP6kKMUKLlDZtxZAuQxpUyxR HZLq220RYnYSvkWauikq4M8fqFM8bdt6hLJnv7bVqllseROk9stCvjSiE3A9szH5 OgtOfYV5GhOeb8pCZqJKlGDw+RoJ21jtNCgOr6DgkNKV9CX/kL/Puwv8gnA0B0eh dxCeB7f/gcLl7Cg3Z3gVVcGlgak6JWrLf5ITAJhBZ8Lv+AtL2DKmwEWS/iIMRmek tLdh/a9GiCitqS0bT7GE =tWPQ -----END PGP SIGNATURE----- Merge tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull module updates from Rusty Russell: "Main excitement here is Peter Zijlstra's lockless rbtree optimization to speed module address lookup. He found some abusers of the module lock doing that too. A little bit of parameter work here too; including Dan Streetman's breaking up the big param mutex so writing a parameter can load another module (yeah, really). Unfortunately that broke the usual suspects, !CONFIG_MODULES and !CONFIG_SYSFS, so those fixes were appended too" * tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (26 commits) modules: only use mod->param_lock if CONFIG_MODULES param: fix module param locks when !CONFIG_SYSFS. rcu: merge fix for Convert ACCESS_ONCE() to READ_ONCE() and WRITE_ONCE() module: add per-module param_lock module: make perm const params: suppress unused variable error, warn once just in case code changes. modules: clarify CONFIG_MODULE_COMPRESS help, suggest 'N'. kernel/module.c: avoid ifdefs for sig_enforce declaration kernel/workqueue.c: remove ifdefs over wq_power_efficient kernel/params.c: export param_ops_bool_enable_only kernel/params.c: generalize bool_enable_only kernel/module.c: use generic module param operaters for sig_enforce kernel/params: constify struct kernel_param_ops uses sysfs: tightened sysfs permission checks module: Rework module_addr_{min,max} module: Use __module_address() for module_address_lookup() module: Make the mod_tree stuff conditional on PERF_EVENTS \|\| TRACING module: Optimize __module_address() using a latched RB-tree rbtree: Implement generic latch_tree seqlock: Introduce raw_read_seqcount_latch() ...	2015-07-01 10:49:25 -07:00
Pekka Enberg	f8c5b93947	IB/ehca: use kvfree() in ipz_queue_{cd}tor() Use kvfree() instead of open-coding it. Signed-off-by: Pekka Enberg <penberg@kernel.org> Cc: Hoang-Nam Nguyen <hnguyen@de.ibm.com> Cc: Christoph Raisch <raisch@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-06-30 19:45:00 -07:00
Linus Torvalds	e0456717e4	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking updates from David Miller: 1) Add TX fast path in mac80211, from Johannes Berg. 2) Add TSO/GRO support to ibmveth, from Thomas Falcon 3) Move away from cached routes in ipv6, just like ipv4, from Martin KaFai Lau. 4) Lots of new rhashtable tests, from Thomas Graf. 5) Run ingress qdisc lockless, from Alexei Starovoitov. 6) Allow servers to fetch TCP packet headers for SYN packets of new connections, for fingerprinting. From Eric Dumazet. 7) Add mode parameter to pktgen, for testing receive. From Alexei Starovoitov. 8) Cache access optimizations via simplifications of build_skb(), from Alexander Duyck. 9) Move page frag allocator under mm/, also from Alexander. 10) Add xmit_more support to hv_netvsc, from KY Srinivasan. 11) Add a counter guard in case we try to perform endless reclassify loops in the packet scheduler. 12) Extern flow dissector to be programmable and use it in new "Flower" classifier. From Jiri Pirko. 13) AF_PACKET fanout rollover fixes, performance improvements, and new statistics. From Willem de Bruijn. 14) Add netdev driver for GENEVE tunnels, from John W Linville. 15) Add ingress netfilter hooks and filtering, from Pablo Neira Ayuso. 16) Fix handling of epoll edge triggers in TCP, from Eric Dumazet. 17) Add an ECN retry fallback for the initial TCP handshake, from Daniel Borkmann. 18) Add tail call support to BPF, from Alexei Starovoitov. 19) Add several pktgen helper scripts, from Jesper Dangaard Brouer. 20) Add zerocopy support to AF_UNIX, from Hannes Frederic Sowa. 21) Favor even port numbers for allocation to connect() requests, and odd port numbers for bind(0), in an effort to help avoid ip_local_port_range exhaustion. From Eric Dumazet. 22) Add Cavium ThunderX driver, from Sunil Goutham. 23) Allow bpf programs to access skb_iif and dev->ifindex SKB metadata, from Alexei Starovoitov. 24) Add support for T6 chips in cxgb4vf driver, from Hariprasad Shenai. 25) Double TCP Small Queues default to 256K to accomodate situations like the XEN driver and wireless aggregation. From Wei Liu. 26) Add more entropy inputs to flow dissector, from Tom Herbert. 27) Add CDG congestion control algorithm to TCP, from Kenneth Klette Jonassen. 28) Convert ipset over to RCU locking, from Jozsef Kadlecsik. 29) Track and act upon link status of ipv4 route nexthops, from Andy Gospodarek. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1670 commits) bridge: vlan: flush the dynamically learned entries on port vlan delete bridge: multicast: add a comment to br_port_state_selection about blocking state net: inet_diag: export IPV6_V6ONLY sockopt stmmac: troubleshoot unexpected bits in des0 & des1 net: ipv4 sysctl option to ignore routes when nexthop link is down net: track link-status of ipv4 nexthops net: switchdev: ignore unsupported bridge flags net: Cavium: Fix MAC address setting in shutdown state drivers: net: xgene: fix for ACPI support without ACPI ip: report the original address of ICMP messages net/mlx5e: Prefetch skb data on RX net/mlx5e: Pop cq outside mlx5e_get_cqe net/mlx5e: Remove mlx5e_cq.sqrq back-pointer net/mlx5e: Remove extra spaces net/mlx5e: Avoid TX CQE generation if more xmit packets expected net/mlx5e: Avoid redundant dev_kfree_skb() upon NOP completion net/mlx5e: Remove re-assignment of wq type in mlx5e_enable_rq() net/mlx5e: Use skb_shinfo(skb)->gso_segs rather than counting them net/mlx5e: Static mapping of netdev priv resources to/from netdev TX queues net/mlx4_en: Use HW counters for rx/tx bytes/packets in PF device ...	2015-06-24 16:49:49 -07:00
Linus Torvalds	acd53127c4	SCSI misc on 20150622 This is the usual grab bag of driver updates (lpfc, hpsa, megaraid_sas, cxgbi, be2iscsi) plus an assortment of minor updates. There are also one new driver: the Cisco snic; the advansys driver has been rewritten to get rid of the warning about converting it to the DMA API, the tape statistics patch got in and finally, there's a resuffle of SCSI header files to separate more cleanly initiator from target mode (and better share the common definitions). Signed-off-by: James Bottomley <JBottomley@Odin.com> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQEcBAABAgAGBQJViKWdAAoJEDeqqVYsXL0MAr8IAMmlA6HBVjMJJFCEOY9corHj e70MNQa7LUgf+JCdOtzGcvHXTiFFd4IHZAwXUJAnsC4IU2QWEfi1bjUTErlqBIGk LoZlXXpEHnFpmWot3OluOzzcGcxede8rVgPiKWVVdojIngBC2+LL/i2vPCJ84ri9 WCVlk6KBvWZXuU6JuOKAb2FO9HOX7Q61wuKAMast2Qc6RNc2ksgc7VbstsITqzZ9 FVEsjmQ5lqUj+xdxBpiUOdUpc22IJ4VcpBgQ2HrThvg6vf4aq937RJ/g4vi/g0SU Utk0a3bUw1H/WnYAfJVFx83nVEsS/954Z7/ERDg1sjlfLYwQtQnpov0XIbPIbZU= =k9IT -----END PGP SIGNATURE----- Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI updates from James Bottomley: "This is the usual grab bag of driver updates (lpfc, hpsa, megaraid_sas, cxgbi, be2iscsi) plus an assortment of minor updates. There is also one new driver: the Cisco snic. The advansys driver has been rewritten to get rid of the warning about converting it to the DMA API, the tape statistics patch got in and finally, there's a resuffle of SCSI header files to separate more cleanly initiator from target mode (and better share the common definitions)" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (156 commits) snic: driver for Cisco SCSI HBA qla2xxx: Fix indentation qla2xxx: Comment out unreachable code fusion: remove dead MTRR code advansys: fix compilation errors and warnings when CONFIG_PCI is not set mptsas: fix depth param in scsi_track_queue_full megaraid: fix irq setup process regression lpfc: Update version to 10.7.0.0 for upstream patch set. lpfc: Fix to drop PLOGIs from fabric node till LOGO processing completes lpfc: Fix scsi task management error message. lpfc: Fix cq_id masking problem. lpfc: Fix scsi prep dma buf error. lpfc: Add support for using block multi-queue lpfc: Devices are not discovered during takeaway/giveback testing lpfc: Fix vport deletion failure. lpfc: Check for active portpeerbeacon. lpfc: Update driver version for upstream patch set 10.6.0.1. lpfc: Change buffer pool empty message to miscellaneous category lpfc: Fix incorrect log message reported for empty FCF record. lpfc: Fix rport leak. ...	2015-06-23 15:55:44 -07:00
Linus Torvalds	f9d1b5a31a	Changes for 4.2 - A large cleanup of how device capabilities are checked for various features - Additional cleanups in the MAD processing - Update to the srp driver - Creation and use of centralized log message helpers - Add const to a number of args to calls and clean up call chain - Add support for extended cq create verb - Add support for timestamps on cq completion - Add support for processing OPA MAD packets -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJVeyzqAAoJELgmozMOVy/di3wP/jml4F9crvQn7UBJjGm/rgcI wzZ2GZTqxQE8dn+W6gQsdKOzy0Ibxx5UYGp9ruInuxAcVh9t1PcylanasaiGMEtY mrGRFjipJ9jYa+yDQDTQi8EFMClZuMSvtRLKjzYITudCXQck37V+F5YlP6VphjX7 JeiM4a+4rD0ukk5PKGvUw51sP1eawKtEdUvnqcOEI2tJgQmzJBP4mXrhVtS/0wSc Pi8TRN5QKi3Drom/tK9QQ/ncoYngi4BKLfszCeU373HJq6qXqsxBYvs3jX6MPzfv Aooj272JxBgCYxkmEfECezDpmi3PbWDJjXj/xCLjfhjISDtHHHVLGVMODZpwUEsL 2wBgwlzdajVopSbSLvsjQNtQw25s7sDWpu+TFKbS0u+W2d0ZOyipM1Xeje+OtDHQ clhwvDhgSfeI/bJ1YdtNLbvINrwsfZD213zD+WH21A/9weAVr3hEfTuSaNFiTiRn 5yywP36TM0wH90KhiWoLrztcHvoE5p7kGuqzv04MRjrMMNHEJK2/IhWvT97Ewngu vWrZl7QRzXYcGspCOp2aJW9Wr2rhGRrv28TF+thpNrIJOB2JM4q4koCKZCcI0s2D E6pY2YQSzvrA/ZSfcWIg4yhugcycIJkOf7ur2N/U43cwGXtaCzPWVnKMApmdnVOO ZEMwD3OZ1OGcCHLhRL8Y =yISf -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull rdma updates from Doug Ledford: - a large cleanup of how device capabilities are checked for various features - additional cleanups in the MAD processing - update to the srp driver - creation and use of centralized log message helpers - add const to a number of args to calls and clean up call chain - add support for extended cq create verb - add support for timestamps on cq completion - add support for processing OPA MAD packets * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (92 commits) IB/mad: Add final OPA MAD processing IB/mad: Add partial Intel OPA MAD support IB/mad: Add partial Intel OPA MAD support IB/core: Add OPA MAD core capability flag IB/mad: Add support for additional MAD info to/from drivers IB/mad: Convert allocations from kmem_cache to kzalloc IB/core: Add ability for drivers to report an alternate MAD size. IB/mad: Support alternate Base Versions when creating MADs IB/mad: Create a generic helper for DR forwarding checks IB/mad: Create a generic helper for DR SMP Recv processing IB/mad: Create a generic helper for DR SMP Send processing IB/mad: Split IB SMI handling from MAD Recv handler IB/mad cleanup: Generalize processing of MAD data IB/mad cleanup: Clean up function params -- find_mad_agent IB/mlx4: Add support for CQ time-stamping IB/mlx4: Add mmap call to map the hardware clock IB/core: Pass hardware specific data in query_device IB/core: Add timestamp_mask and hca_core_clock to query_device IB/core: Extend ib_uverbs_create_cq IB/core: Add CQ creation time-stamping flag ...	2015-06-23 15:53:26 -07:00
Al Viro	dc3f4198ea	make simple_positive() public Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-06-23 18:02:01 -04:00
Linus Torvalds	d70b3ef54c	Merge branch 'x86-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 core updates from Ingo Molnar: "There were so many changes in the x86/asm, x86/apic and x86/mm topics in this cycle that the topical separation of -tip broke down somewhat - so the result is a more traditional architecture pull request, collected into the 'x86/core' topic. The topics were still maintained separately as far as possible, so bisectability and conceptual separation should still be pretty good - but there were a handful of merge points to avoid excessive dependencies (and conflicts) that would have been poorly tested in the end. The next cycle will hopefully be much more quiet (or at least will have fewer dependencies). The main changes in this cycle were: * x86/apic changes, with related IRQ core changes: (Jiang Liu, Thomas Gleixner) - This is the second and most intrusive part of changes to the x86 interrupt handling - full conversion to hierarchical interrupt domains: [IOAPIC domain] ----- \| [MSI domain] --------[Remapping domain] ----- [ Vector domain ] \| (optional) \| [HPET MSI domain] ----- \| \| [DMAR domain] ----------------------------- \| [Legacy domain] ----------------------------- This now reflects the actual hardware and allowed us to distangle the domain specific code from the underlying parent domain, which can be optional in the case of interrupt remapping. It's a clear separation of functionality and removes quite some duct tape constructs which plugged the remap code between ioapic/msi/hpet and the vector management. - Intel IOMMU IRQ remapping enhancements, to allow direct interrupt injection into guests (Feng Wu) * x86/asm changes: - Tons of cleanups and small speedups, micro-optimizations. This is in preparation to move a good chunk of the low level entry code from assembly to C code (Denys Vlasenko, Andy Lutomirski, Brian Gerst) - Moved all system entry related code to a new home under arch/x86/entry/ (Ingo Molnar) - Removal of the fragile and ugly CFI dwarf debuginfo annotations. Conversion to C will reintroduce many of them - but meanwhile they are only getting in the way, and the upstream kernel does not rely on them (Ingo Molnar) - NOP handling refinements. (Borislav Petkov) * x86/mm changes: - Big PAT and MTRR rework: making the code more robust and preparing to phase out exposing direct MTRR interfaces to drivers - in favor of using PAT driven interfaces (Toshi Kani, Luis R Rodriguez, Borislav Petkov) - New ioremap_wt()/set_memory_wt() interfaces to support Write-Through cached memory mappings. This is especially important for good performance on NVDIMM hardware (Toshi Kani) * x86/ras changes: - Add support for deferred errors on AMD (Aravind Gopalakrishnan) This is an important RAS feature which adds hardware support for poisoned data. That means roughly that the hardware marks data which it has detected as corrupted but wasn't able to correct, as poisoned data and raises an APIC interrupt to signal that in the form of a deferred error. It is the OS's responsibility then to take proper recovery action and thus prolonge system lifetime as far as possible. - Add support for Intel "Local MCE"s: upcoming CPUs will support CPU-local MCE interrupts, as opposed to the traditional system- wide broadcasted MCE interrupts (Ashok Raj) - Misc cleanups (Borislav Petkov) * x86/platform changes: - Intel Atom SoC updates ... and lots of other cleanups, fixlets and other changes - see the shortlog and the Git log for details" * 'x86-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (222 commits) x86/hpet: Use proper hpet device number for MSI allocation x86/hpet: Check for irq==0 when allocating hpet MSI interrupts x86/mm/pat, drivers/infiniband/ipath: Use arch_phys_wc_add() and require PAT disabled x86/mm/pat, drivers/media/ivtv: Use arch_phys_wc_add() and require PAT disabled x86/platform/intel/baytrail: Add comments about why we disabled HPET on Baytrail genirq: Prevent crash in irq_move_irq() genirq: Enhance irq_data_to_desc() to support hierarchy irqdomain iommu, x86: Properly handle posted interrupts for IOMMU hotplug iommu, x86: Provide irq_remapping_cap() interface iommu, x86: Setup Posted-Interrupts capability for Intel iommu iommu, x86: Add cap_pi_support() to detect VT-d PI capability iommu, x86: Avoid migrating VT-d posted interrupts iommu, x86: Save the mode (posted or remapped) of an IRTE iommu, x86: Implement irq_set_vcpu_affinity for intel_ir_chip iommu: dmar: Provide helper to copy shared irte fields iommu: dmar: Extend struct irte for VT-d Posted-Interrupts iommu: Add new member capability to struct irq_remap_ops x86/asm/entry/64: Disentangle error_entry/exit gsbase/ebx/usermode code x86/asm/entry/32: Shorten __audit_syscall_entry() args preparation x86/asm/entry/32: Explain reloading of registers after __audit_syscall_entry() ...	2015-06-22 17:59:09 -07:00
Ingo Molnar	7ef3d7d58d	Merge branches 'x86/apic', 'x86/asm', 'x86/mm' and 'x86/platform' into x86/core, to merge last updates Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-06-22 09:15:03 +02:00
Linus Torvalds	d2228e4310	Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending Pull scsi target fixes from Nicholas Bellinger: "Apologies for the late pull request. Here are the outstanding target-pending fixes for v4.1 code. The series contains three patches from Sagi + Co that address a few iser-target issues that have been uncovered during recent testing at Mellanox. Patch #1 has a v3.16+ stable tag, and #2-3 have v3.10+ stable tags" * git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: iser-target: Fix possible use-after-free iser-target: release stale iser connections iser-target: Fix variable-length response error completion	2015-06-20 17:26:01 -07:00
Luis R. Rodriguez	7ea402d01c	x86/mm/pat, drivers/infiniband/ipath: Use arch_phys_wc_add() and require PAT disabled We are burrying direct access to MTRR code support on x86 in order to take advantage of PAT. In the future, we also want to make the default behaviour of ioremap_nocache() to use strong UC, use of mtrr_add() on those systems would make write-combining void. In order to help both enable us to later make strong UC default and in order to phase out direct MTRR access code port the driver over to arch_phys_wc_add() and annotate that the device driver requires systems to boot with PAT disabled, with the 'nopat' kernel parameter. This is a workable compromise given that the ipath device driver powers the old HTX bus cards that only work in AMD systems, while the newer IB/qib device driver powers all PCI-e cards. The ipath device driver is obsolete, hardware is hard to find and because of this its a reasonable compromise to require users of ipath to boot with 'nopat'. Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Doug Ledford <dledford@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Walls <awalls@md.metrocast.net> Cc: Antonino Daplas <adaplas@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dave Airlie <airlied@redhat.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Hal Rosenstock <hal.rosenstock@gmail.com> Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Mike Marciniszyn <mike.marciniszyn@intel.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se> Cc: Roger Pau Monné <roger.pau@citrix.com> Cc: Roland Dreier <roland@purestorage.com> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Stefan Bader <stefan.bader@canonical.com> Cc: Suresh Siddha <sbsiddha@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tomi Valkeinen <tomi.valkeinen@ti.com> Cc: Ville Syrjälä <syrjala@sci.fi> Cc: infinipath@intel.com Cc: jbeulich@suse.com Cc: konrad.wilk@oracle.com Cc: linux-rdma@vger.kernel.org Cc: mchehab@osg.samsung.com Cc: toshi.kani@hp.com Link: http://lkml.kernel.org/r/1434053994-2196-4-git-send-email-mcgrof@do-not-panic.com Link: http://lkml.kernel.org/r/1434356898-25135-5-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-06-18 11:23:42 +02:00
Nicholas Bellinger	bc0c94b140	target: Drop unnecessary core_tpg_register TFO parameter This patch drops unnecessary target_core_fabric_ops parameter usage for core_tpg_register() during fabric driver TFO->fabric_make_tpg() se_portal_group creation callback execution. Instead, use the existing se_wwn->wwn_tf->tf_ops pointer to ensure fabric driver is really using the same TFO provided at module_init time. Also go ahead and drop the forward TFO declarations tree-wide, and handling the special case for iscsi-target discovery TPG. Cc: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2015-06-15 23:23:22 -07:00
Eran Ben Elisha	9616982f3f	net/mlx4_core: Add helper to query counters This is an infrastructure step for querying VF and PF counters. This code was in the IB driver, move it to the mlx4 core driver so it will be accessible for more use cases. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-15 17:23:02 -07:00
Eran Ben Elisha	7193a141eb	IB/mlx4: Set VF to read from QP counters As IB VFs are not capable to read the port counters through MADs, move there to read their own QP counters to gather statistics. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-15 17:23:02 -07:00
Eran Ben Elisha	c3abb51bdb	IB/mlx4: Add RoCE/IB dedicated counters This is an infrastructure step to attach all the QPs opened from the IB driver to a counter in order to collect VF stats from the PF using those counters. If the port's type is Ethernet, the counter policy demands two counters per port (one for RoCE and one for Ethernet). The port default counter (allocated in mlx4_core) is used for the Ethernet netdev QPs and we allocate another counter for RoCE. If the port's traffic is Infiniband, the counter policy demands one counter per port, so it can use the port's default counter. Also, Add 'allocated' flag for each counter in order to clean it at unload. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-15 17:23:02 -07:00
Eran Ben Elisha	47d8417f59	net/mlx4_core: Add sink counter Reserve the last valid counter index for "sink" counter, when a new counter cannot be allocated, the driver will use this counter. In order to avoid allocating this counter on any other flow, fix the indices bitmap allocation range, and reserve the sink counter index. Add macro for the sink counter index and replace all appearences of the index with the macro. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-15 17:23:01 -07:00
Ira Weiny	8e4349d13f	IB/mad: Add final OPA MAD processing For devices which support OPA MADs 1) Use previously defined SMP support functions. 2) Pass correct base version to ib_create_send_mad when processing OPA MADs. 3) Process out_mad_key_index returned by agents for a response. This is necessary because OPA SMP packets must carry a valid pkey. 4) Carry the correct segment size (OPA vs IBTA) of RMPP messages within ib_mad_recv_wc. 5) Handle variable length OPA MADs by: * Adjusting the 'fake' WC for locally routed SMP's to represent the proper incoming byte_len * out_mad_size is used from the local HCA agents 1) when sending agent responses on the wire 2) when passing responses through the local_completions function NOTE: wc.byte_len includes the GRH length and therefore is different from the in_mad_size specified to the local HCA agents. out_mad_size should _not_ include the GRH length as it is added Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-06-12 14:49:18 -04:00
Ira Weiny	f28990bc89	IB/mad: Add partial Intel OPA MAD support Add OPA SMP processing functionality. Define the new OPA SMP format, create support functions for this format using the previously defined helper functions as appropriate. These functions are defined in this patch and used in the final OPA MAD support patch. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-06-12 14:49:17 -04:00
Ira Weiny	548ead1744	IB/mad: Add partial Intel OPA MAD support This patch is the first of 3 which adds processing of OPA MADs 1) Add Intel Omni-Path Architecture defines 2) Increase max management version to accommodate OPA 3) update ib_create_send_mad If the device supports OPA MADs and the MAD being sent is the OPA base version alter the MAD size and sg lengths as appropriate Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-06-12 14:49:17 -04:00
Ira Weiny	4cd7c9479a	IB/mad: Add support for additional MAD info to/from drivers In order to support alternate sized MADs (and variable sized MADs on OPA devices) add in/out MAD size parameters to the process_mad core call. In addition, add an out_mad_pkey_index to communicate the pkey index the driver wishes the MAD stack to use when sending OPA MAD responses. The out MAD size and the out MAD PKey index are required by the MAD stack to generate responses on OPA devices. Furthermore, the in and out MAD parameters are made generic by specifying them as ib_mad_hdr rather than ib_mad. Drivers are modified as needed and are protected by BUG_ON flags if the MAD sizes passed to them is incorrect. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-06-12 14:49:17 -04:00
Ira Weiny	c9082e51b6	IB/mad: Convert allocations from kmem_cache to kzalloc This patch implements allocating alternate receive MAD buffers within the MAD stack. Support for OPA to send/recv variable sized MADs is implemented later. 1) Convert MAD allocations from kmem_cache to kzalloc kzalloc is more flexible to support devices with different sized MADs and research and testing showed that the current use of kmem_cache does not provide performance benefits over kzalloc. 2) Change struct ib_mad_private to use a flex array for the mad data 3) Allocate ib_mad_private based on the size specified by devices in rdma_max_mad_size. 4) Carry the allocated size in ib_mad_private to be used when processing ib_mad_private objects. 5) Alter DMA mappings based on the mad_size of ib_mad_private. 6) Replace the use of sizeof and static defines as appropriate 7) Add appropriate casts for the MAD data when calling processing functions. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-06-12 14:49:17 -04:00
Ira Weiny	337877a466	IB/core: Add ability for drivers to report an alternate MAD size. Add max MAD size to the device immutable data set and have all drivers that support MADs report the current IB MAD size (IB_MGMT_MAD_SIZE) to the core. Verify MAD size data in both the MAD core and when reading the immutable data. OPA drivers will report alternate MAD sizes in subsequent patches. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-06-12 14:49:17 -04:00
Ira Weiny	da2dfaa3a3	IB/mad: Support alternate Base Versions when creating MADs In preparation to support the new OPA MAD Base version, add a base version parameter to ib_create_send_mad and set it to IB_MGMT_BASE_VERSION for current users. Definition of the new base version and it's processing will occur in later patches. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-06-12 14:49:17 -04:00
Ira Weiny	29869eafa6	IB/mad: Create a generic helper for DR forwarding checks IB and OPA SMPs share the same processing algorithm but have different header formats and permissive LID detection. Add a helper function which is generic to processing the DR forwarding checks which can be used by both IB and OPA SMP code. Use this function in the current IB function smi_check_forward_dr_smp. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-06-12 14:49:16 -04:00
Ira Weiny	86f0e67a21	IB/mad: Create a generic helper for DR SMP Recv processing IB and OPA SMPs share the same processing algorithm but have different header formats and permissive LID detection. Add a helper function which is generic to processing DR SMP Recv messages which can be used by both IB and OPA SMP code. Use this function in the current IB function smi_handle_dr_smp_recv. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-06-12 14:49:16 -04:00
Ira Weiny	92f1505604	IB/mad: Create a generic helper for DR SMP Send processing IB and OPA SMPs share the same processing algorithm but have different header formats and permissive LID detection. Add a helper function which is generic to processing DR SMP Send messages which can be used by both IB and OPA SMP code. Use this function in the current IB function smi_handle_dr_smp_send. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-06-12 14:49:16 -04:00
Ira Weiny	e11ae8aa0c	IB/mad: Split IB SMI handling from MAD Recv handler Make a helper function to process Directed Route SMPs to be called by the IB MAD Recv Handler, ib_mad_recv_done_handler. This cleans up the MAD receive handler code a bit and allows for us to better share the SMP processing code between IB and OPA SMPs. IB and OPA SMPs share the same processing algorithm but have different header formats and permissive LID detection. Therefore this and subsequent patches split the common processing code from the IB specific code in anticipation of sharing those algorithms with the OPA code. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-06-12 14:49:16 -04:00
Ira Weiny	83a1d22889	IB/mad cleanup: Generalize processing of MAD data ib_find_send_mad only needs access to the MAD header not the full IB MAD. Change the local variable to ib_mad_hdr and change the corresponding cast. This allows for clean usage of this function with both IB and OPA MADs because OPA MADs carry the same header as IB MADs. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-06-12 14:49:16 -04:00
Ira Weiny	d94bd2667a	IB/mad cleanup: Clean up function params -- find_mad_agent find_mad_agent only needs read only access to the MAD header. Update the ib_mad pointer to be const ib_mad_hdr. Adjust call tree. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-06-12 14:49:16 -04:00
Matan Barak	4b664c4355	IB/mlx4: Add support for CQ time-stamping This includes: * support allocation of CQ with the TIMESTAMP_COMPLETION creation flag. * add timestamp_mask and hca_core_clock to query_device, reporting the number of supported timestamp bits (mask) and the hca_core_clock frequency. * return hca core clock's offset in query_device vendor's data, this is needed in order to read the HCA's core clock. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-06-12 14:49:10 -04:00
Matan Barak	52033cfb5a	IB/mlx4: Add mmap call to map the hardware clock In order to read the HCA's cycle counter efficiently in user space, we need to map the HCA's register. This is done through mmap call. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-06-12 14:49:10 -04:00
Matan Barak	2528e33e68	IB/core: Pass hardware specific data in query_device Vendors should be able to pass vendor specific data to/from user-space via query_device uverb. In order to do this, we need to pass the vendors' specific udata. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-06-12 14:49:10 -04:00
Matan Barak	24306dc661	IB/core: Add timestamp_mask and hca_core_clock to query_device In order to expose timestamp we need to expose two new attributes in query_device to be used for CQ completion time-stamping: timestamp_mask - how many bits are valid in the timestamp, where timestamp values could be 64bits the most. hca_core_clock - timestamp is given in HW cycles, the frequency in KHZ units of the HCA, necessary in order to convert cycles to seconds. This is added both to ib_query_device and its respective uverbs counterpart. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-06-12 14:49:10 -04:00
Matan Barak	565197dd8f	IB/core: Extend ib_uverbs_create_cq ib_uverbs_ex_create_cq follows the extension verbs mechanism. New features (for example, CQ creation flags field which is added in a downstream patch) could used via user-space libraries without breaking the ABI. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-06-12 14:49:10 -04:00
Matan Barak	8e37210b38	IB/core: Change ib_create_cq to use struct ib_cq_init_attr Currently, ib_create_cq uses cqe and comp_vecotr instead of the extendible ib_cq_init_attr struct. Earlier patches already changed the vendors to work with ib_cq_init_attr. This patch changes the consumers too. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-06-12 14:49:10 -04:00
Matan Barak	bcf4c1ea58	IB/core: Change provider's API of create_cq to be extendible Add a new ib_cq_init_attr structure which contains the previous cqe (minimum number of CQ entries) and comp_vector (completion vector) in addition to a new flags field. All vendors' create_cq callbacks are changed in order to work with the new API. This commit does not change any functionality. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-By: Devesh Sharma <devesh.sharma@avagotech.com> to patch #2 Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-06-12 14:49:10 -04:00
Saeed Mahameed	facc9699f0	net/mlx5e: Fix HW MTU settings Previously we configured HW MTU to be netdev->mtu, actually we need to configure netdev->mtu + (ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN). Also, query MTU can not fail, hence make the relevant helper a void functionm, add mlx5e_set_dev_port_mtu, helper function to handle MTU setting. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-11 15:55:25 -07:00
Hariprasad S	74217d4c6a	iw_cxgb4: support for bar2 qid densities exceeding the page size Handle this configuration: Queues Per Page * SGE BAR2 Queue Register Area Size > Page Size Use cxgb4_bar2_sge_qregs() to obtain the proper location within the bar2 region for a given qid. Rework the DB and GTS write functions to make use of this bar2 info. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-06-11 12:22:32 -04:00
Doug Ledford	0699ee7ad7	Merge branch 'for-4.2-misc' into k.o/for-4.2	2015-06-11 01:13:30 -04:00
Colin Ian King	4dc5444279	RDMA/ocrdma: fix double free on pd A reorganisation of the PD allocation and deallocation in commit `9ba1377daa` ("RDMA/ocrdma: Move PD resource management to driver.") introduced a double free on pd, as detected by static analysis by smatch: drivers/infiniband/hw/ocrdma/ocrdma_verbs.c:682 ocrdma_alloc_pd() error: double free of 'pd'^ The original call to ocrdma_mbx_dealloc_pd() (which does not kfree pd) was replaced with a call to _ocrdma_dealloc_pd() (which does kfree pd). The kfree following this call causes the double free, so just remove it to fix the problem. Fixes: `9ba1377daa` ("RDMA/ocrdma: Move PD resource management to driver.") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-By: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-06-11 01:12:28 -04:00
Dan Carpenter	fc3aa45b63	IB/usnic: clean up some error handling code This code causes a static checker warning: drivers/infiniband/hw/usnic/usnic_uiom.c:476 usnic_uiom_alloc_pd() warn: passing zero to 'PTR_ERR' This code isn't buggy, but iommu_domain_alloc() doesn't return an error pointer so we can simplify the error handling and silence the static checker warning. The static checker warning is to catch place which do: if (!ptr) return ERR_PTR(ptr); Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Dave Goodell <dgoodell@cisco.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-06-11 01:11:27 -04:00
Fabian Frederick	ed0de4a8c9	IB/mthca: use swap() in mthca_make_profile() Use kernel.h macro definition. Thanks to Julia Lawall for Coccinelle scripting support. Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-06-11 01:10:59 -04:00
Moni Shoua	9247a8eba6	IB/core: Don't warn on no SA support in event handler Registering an event handler is done for a device. This device may have one RoCE port (no SA cap) and one InfiniBand port (has SA cap). Therefore, warning from the event handler about a specific port that doesn't have SA cap is correct but pollutes the kernel log without a need. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-06-10 23:54:34 -04:00
Sagi Grimberg	524630d582	iser-target: Fix possible use-after-free iser connection termination process happens in 2 stages: - isert_wait_conn: - resumes rdma disconnect - wait for session commands - wait for flush completions (post a marked wr to signal we are done) - wait for logout completion - queue work for connection cleanup (depends on disconnected/timewait events) - isert_free_conn - last reference put on the connection In case we are terminating during IOs, we might be posting send/recv requests after we posted the last work request which might lead to a use-after-free condition in isert_handle_wc. After we posted the last wr in isert_wait_conn we are guaranteed that no successful completions will follow (meaning no new work request posts may happen) but other flush errors might still come. So before we put the last reference on the connection, we repeat the process of posting a marked work request (isert_wait4flush) in order to make sure all pending completions were flushed. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Jenny Falkovich <jennyf@mellanox.com> Cc: stable@vger.kernel.org # 3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2015-06-08 22:17:09 -07:00
Sagi Grimberg	2f1b6b7d9a	iser-target: release stale iser connections When receiving a new iser connect request we serialize the pending requests by adding the newly created iser connection to the np accept list and let the login thread process the connect request one by one (np_accept_wait). In case we received a disconnect request before the iser_conn has begun processing (still linked in np_accept_list) we should detach it from the list and clean it up and not have the login thread process a stale connection. We do it only when the connection state is not already terminating (initiator driven disconnect) as this might lead us to access np_accept_mutex after the np was released in live shutdown scenarios. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Jenny Falkovich <jennyf@mellanox.com> Cc: stable@vger.kernel.org # 3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2015-06-08 22:16:40 -07:00
Sagi Grimberg	9253e667ab	iser-target: Fix variable-length response error completion Since commit "2426bd456a6 target: Report correct response ..." we might get a command with data_size that does not fit to the number of allocated data sg elements. Given that we rely on cmd t_data_nents which might be different than the data_size, we sometimes receive local length error completion. The correct approach would be to take the command data_size into account when constructing the ib sg_list. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Jenny Falkovich <jennyf@mellanox.com> Cc: stable@vger.kernel.org # 3.16+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2015-06-08 22:16:17 -07:00
Haggai Abramonvsky	4aa17b2879	mlx5: Enable mutual support for IB and Ethernet Ethernet functionality is only available when working in ISSI > 0 mode. Previously, the IB driver wasn't ready to work on that mode, and hence building both the IB driver and the Ethernet functionality in the core driver were disallowed by Kconfigs. Now, once we have all the pre-steps in place, we can remove this limitation. The last steps in the IB driver for getting that setup to work are: create dummy SRQ for the driver's use (until now we could use XRC_SRQ as SRQ and XRC_SRQ, after moving to ISSI > 0, we separate XRC SRQs from basic SRQs) and adapt the create QP function to be compatible with ISSI > 0. Signed-off-by: Haggai Abramovsky <hagaya@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-04 16:41:02 -07:00
Majd Dibbiny	647241ea10	IB/mlx5: Don't create IB instance over Ethernet ports Since we still don't have RoCE support in mlx5, avoid creating IB driver instance over Ethernet ports. Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-04 16:41:02 -07:00
Majd Dibbiny	1b5daf11b0	IB/mlx5: Avoid using the MAD_IFC command under ISSI > 0 mode In ISSI > 0 mode, most of the MAD_IFC command features are deprecated, and can't be used. Therefore, when in that mode, we replace all of them with other commands that provide the required functionality. Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-04 16:41:02 -07:00
Haggai Abramonvsky	01949d0109	net/mlx5_core: Enable XRCs and SRQs when using ISSI > 0 When working in ISSI > 0 mode, the model exposed by the device for XRCs and SRQs is different. XRCs use XRC SRQs and plain SRQs are based on RPM (Receive Memory Pool). Add helper functions to create, modify, query, and arm XRC SRQs and RMPs. Signed-off-by: Haggai Abramovsky <hagaya@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-04 16:41:01 -07:00
Hariprasad Shenai	a4cfd929c9	cxgb4: Add ethtool support to get adapter stats Add ethtool support to get adapter specific hardware statistics Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-03 23:40:19 -07:00
Bart Van Assche	ba92999252	target: Minimize SCSI header #include directives Only include SCSI initiator header files in target code that needs these header files, namely the SCSI pass-through code and the tcm_loop driver. Change SCSI_SENSE_BUFFERSIZE into TRANSPORT_SENSE_BUFFER in target code because the former is intended for initiator code and the latter for target code. With this patch the only initiator include directives in target code that remain are as follows: $ git grep -nHE 'include .scsi/(scsi.h\|scsi_host.h\|scsi_device.h\|scsi_cmnd.h)' drivers/target drivers/infiniband/ulp/{isert,srpt} drivers/usb/gadget/legacy/tcm_.[ch] drivers/{vhost,xen} include/{target,trace/events/target.h} drivers/target/loopback/tcm_loop.c:29:#include <scsi/scsi.h> drivers/target/loopback/tcm_loop.c:31:#include <scsi/scsi_host.h> drivers/target/loopback/tcm_loop.c:32:#include <scsi/scsi_device.h> drivers/target/loopback/tcm_loop.c:33:#include <scsi/scsi_cmnd.h> drivers/target/target_core_pscsi.c:39:#include <scsi/scsi_device.h> drivers/target/target_core_pscsi.c:40:#include <scsi/scsi_host.h> drivers/xen/xen-scsiback.c:52:#include <scsi/scsi_host.h> / SG_ALL */ Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: James Bottomley <JBottomley@Odin.com>	2015-06-02 08:03:25 -07:00
Doug Ledford	b806ef3bbe	Merge branch 'for-4.2-misc' into k.o/for-4.2	2015-06-02 09:33:22 -04:00
Ira Weiny	73cdaaeed1	IB/core cleanup: Add const to args - agent_send_response In order to support constant callers of agent_send_response we add const specifiers to the its pointer arguments. Adjust the call tree accordingly. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Hal Rosenstock <hal@mellanox.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-06-02 09:33:13 -04:00
Ira Weiny	a97e2d86a9	IB/core cleanup: Add const on args - device->process_mad The process_mad device function declares some parameters as "in". Make those parameters const and adjust the call tree under process_mad in the various drivers accordingly. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Hal Rosenstock <hal@mellanox.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-06-02 09:33:13 -04:00
Roland Dreier	1156256811	IB/mlx4: Fix error paths in mlx4_ib_create_flow() The unwinding clean up code are err_create_flow starts at the current index i. That means we shouldn't increment i until we're really sure we won't have to destroy the current flow; otherwise we might increment the index, fail inside an is_bonded block, and end up accessing off the end of the reg_id[] array. This was detected by Coverity (CID 1271229). Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-06-02 09:22:31 -04:00
Roland Dreier	18eaf1f195	RDMA/ocrdma: Fix memory leak in _ocrdma_alloc_pd() If ocrdma_get_pd_num() fails, then we need to free the pd struct we allocated. This was detected by Coverity (CID 1271245). Signed-off-by: Roland Dreier <roland@purestorage.com> Acked-By: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-06-02 09:22:31 -04:00
Bart Van Assche	5237496781	IB/ipoib: Fix RCU annotations in ipoib_neigh_hash_init() Avoid that sparse complains about ipoib_neigh_hash_init(). This patch does not change any functionality. See also patch "IPoIB: Fix memory leak in the neigh table deletion flow" (commit ID `66172c0993`). Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Shlomo Pongratz <shlomop@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-06-02 09:22:31 -04:00
Faisal Latif	854ace98e7	RDMA/nes: Enable the use of the tos field in the nes driver RDMA/nes: Enable the use of the tos field in the nes driver Signed-off-by: Faisal Latif <Faisal.Latif@intel.com> Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-06-02 09:22:31 -04:00
Steve Wise	68cdba068d	RDMA/iw_cm: Export tos field to iwarp providers rdma-cma/iw_cm: Export tos field to iwarp providers Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-06-02 09:22:30 -04:00
David S. Miller	dda922c831	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/phy/amd-xgbe-phy.c drivers/net/wireless/iwlwifi/Kconfig include/net/mac80211.h iwlwifi/Kconfig and mac80211.h were both trivial overlapping changes. The drivers/net/phy/amd-xgbe-phy.c file got removed in 'net-next' and the bug fix that happened on the 'net' side is already integrated into the rest of the amd-xgbe driver. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-01 22:51:30 -07:00
Linus Torvalds	dae8f283bf	Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending Pull SCSI target fixes from Nicholas Bellinger: "These are mostly minor fixes, with the exception of the following that address fall-out from recent v4.1-rc1 changes: - regression fix related to the big fabric API registration changes and configfs_depend_item() usage, that required cherry-picking one of HCH's patches from for-next to address the issue for v4.1 code. - remaining TCM-USER -v2 related changes to enforce full CDB passthrough from Andy + Ilias. Also included is a target_core_pscsi driver fix from Andy that addresses a long standing issue with a Scsi_Host reference being leaked on PSCSI device shutdown" * git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: iser-target: Fix error path in isert_create_pi_ctx() target: Use a PASSTHROUGH flag instead of transport_types target: Move passthrough CDB parsing into a common function target/user: Only support full command pass-through target/user: Update example code for new ABI requirements target/pscsi: Don't leak scsi_host if hba is VIRTUAL_HOST target: Fix se_tpg_tfo->tf_subsys regression + remove tf_subsystem target: Drop signal_pending checks after interruptible lock acquire target: Add missing parentheses target: Fix bidi command handling target/user: Disallow full passthrough (pass_level=0) ISCSI: fix minor memory leak	2015-05-31 11:31:42 -07:00
Matan Barak	c66fa19c40	net/mlx4: Add EQ pool Previously, mlx4_en allocated EQs and used them exclusively. This affected RoCE performance, as applications which are events sensitive were limited to use only the legacy EQs. Change that by introducing an EQ pool. This pool is managed by mlx4_core. EQs are assigned to ports (when there are limited number of EQs, multiple ports could be assigned to the same EQs). An exception to this rule is the ASYNC EQ which handles various events. Legacy EQs are completely removed as all EQs could be shared. When a consumer (mlx4_ib/mlx4_en) requests an EQ, it asks for EQ serving on a specific port. The core driver calculates which EQ should be assigned to that request. Because IRQs are shared between IB and Ethernet modules, their names only include the PCI device BDF address. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 23:35:34 -07:00
Matan Barak	48564135cb	net/mlx4_core: Demote simple multicast and broadcast flow steering rules In SRIOV, when simple (i.e - Ethernet L2 only) flow steering rules are created, always create them at MLX4_DOMAIN_NIC priority (instead of the real priority the function created them at). This is done in order to let multiple functions add broadcast/multicast rules without affecting other functions, which is necessary for DPDK in SRIOV. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 23:35:34 -07:00
Christoph Hellwig	7ad34a9367	target: target_core_configfs.h is not needed in fabric drivers Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2015-05-30 22:42:39 -07:00
Bart Van Assche	2fe6e721b5	ib_srpt: Remove set-but-not-used variables Detected these variables by building with W=1. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2015-05-30 22:42:32 -07:00
Bart Van Assche	649ee05499	target: Move task tag into struct se_cmd + support 64-bit tags Simplify target core and target drivers by storing the task tag a.k.a. command identifier inside struct se_cmd. For several transports (e.g. SRP) tags are 64 bits wide. Hence add support for 64-bit tags. (Fix core_tmr_abort_task conversion spec warnings - nab) (Fix up usb-gadget to use 16-bit tags - HCH + bart) Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Andy Grover <agrover@redhat.com> Cc: Sagi Grimberg <sagig@mellanox.com> Cc: <qla2xxx-upstream@qlogic.com> Cc: Felipe Balbi <balbi@ti.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Juergen Gross <jgross@suse.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2015-05-30 22:42:31 -07:00
Christoph Hellwig	2650d71e24	target: move transport ID handling to the core Now that struct se_portal_group contains a protocol identifier field we can take all the code to format an parse protocol identifiers in CDBs into common code instead of leaving this to low-level drivers. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2015-05-30 22:42:30 -07:00
Christoph Hellwig	2aeeafae6b	target: remove the get_fabric_proto_ident method Now that we store the protocol identifier in the tpg structure we don't need this method. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2015-05-30 22:42:30 -07:00
Christoph Hellwig	e4aae5af81	target: change core_tpg_register prototype Remove the unneeded fabric_ptr argument, and change the type argument to pass in a SPC protocol identifier. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2015-05-30 22:42:27 -07:00
Christoph Hellwig	144bc4c2a4	target: move node ACL allocation to core code Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2015-05-30 22:42:23 -07:00
Christoph Hellwig	c7d6a80392	target: refactor init/drop_nodeacl methods By always allocating and adding, respectively removing and freeing the se_node_acl structure in core code we can remove tons of repeated code in the init_nodeacl and drop_nodeacl routines. Additionally this now respects the get_default_queue_depth method in this code path as well. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2015-05-30 22:41:51 -07:00
Christoph Hellwig	e1750d20e6	target: make the tpg_get_default_depth method optional All fabric drivers except for iSCSI always return 1, so implement that as default behavior. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2015-05-30 22:41:50 -07:00
Bart Van Assche	afc16604c0	target: Remove first argument of target_{get,put}_sess_cmd() The first argument of these two functions is always identical to se_cmd->se_sess. Hence remove the first argument. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Andy Grover <agrover@redhat.com> Cc: <qla2xxx-upstream@qlogic.com> Cc: Felipe Balbi <balbi@ti.com> Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2015-05-30 22:41:47 -07:00
Roland Dreier	b2feda4feb	iser-target: Fix error path in isert_create_pi_ctx() We don't assign pi_ctx to desc->pi_ctx until we're certain to succeed in the function. That means the cleanup path should use the local pi_ctx variable, not desc->pi_ctx. This was detected by Coverity (CID 1260062). Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2015-05-30 20:01:04 -07:00
Amir Vadai	f62b8bb8f2	net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet functionality This is the Ethernet part of the driver for the Mellanox ConnectX(R)-4 Single/Dual-Port Adapter supporting 100Gb/s with VPI. The driver extends the existing mlx5 driver with Ethernet functionality. This patch contains the driver entry points but does not include transmit and receive (see the previous patch in the series) routines. It also adds the option MLX5_CORE_EN to Kconfig to enable/disable the Ethernet functionality. Currently, Kconfig is programmed to make Ethernet and Infiniband functionality mutally exclusive. Also changed MLX5_INFINIBAND to be depandant on MLX5_CORE instead of selecting it, since MLX5_CORE could be selected without MLX5_INFINIBAND being selected. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 18:24:51 -07:00
Saeed Mahameed	938fe83c8d	net/mlx5_core: New device capabilities handling - Query all supported types of dev caps on driver load. - Store the Cap data outbox per cap type into driver private data. - Introduce new Macros to access/dump stored caps (using the auto generated data types). - Obsolete SW representation of dev caps (no need for SW copy for each cap). - Modify IB driver to use new macros for checking caps. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 18:23:22 -07:00
Amir Vadai	64ffaa2159	net/mlx5_core,mlx5_ib: Do not use vmap() on coherent memory As David Daney pointed in mlx4_core driver [1], mlx5_core is also misusing the DMA-API. This patch is removing the code that vmap() memory allocated by dma_alloc_coherent(). After this patch, users of this drivers might fail allocating resources on memory fragmeneted systems. This will be fixed later on. [1] - https://patchwork.ozlabs.org/patch/458531/ CC: David Daney <david.daney@cavium.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 18:22:37 -07:00
Luis R. Rodriguez	9c27847dda	kernel/params: constify struct kernel_param_ops uses Most code already uses consts for the struct kernel_param_ops, sweep the kernel for the last offending stragglers. Other than include/linux/moduleparam.h and kernel/params.c all other changes were generated with the following Coccinelle SmPL patch. Merge conflicts between trees can be handled with Coccinelle. In the future git could get Coccinelle merge support to deal with patch --> fail --> grammar --> Coccinelle --> new patch conflicts automatically for us on patches where the grammar is available and the patch is of high confidence. Consider this a feature request. Test compiled on x86_64 against: * allnoconfig * allmodconfig * allyesconfig @ const_found @ identifier ops; @@ const struct kernel_param_ops ops = { }; @ const_not_found depends on !const_found @ identifier ops; @@ -struct kernel_param_ops ops = { +const struct kernel_param_ops ops = { }; Generated-by: Coccinelle SmPL Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Junio C Hamano <gitster@pobox.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Kees Cook <keescook@chromium.org> Cc: Tejun Heo <tj@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: cocci@systeme.lip6.fr Cc: linux-kernel@vger.kernel.org Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2015-05-28 11:32:10 +09:30
Or Gerlitz	74d4943fbb	net/mlx4_core: Modify port values when generting EQEs for VFs As part of enabling single ported VFs over IB ports we need to handle some of the flows for generting EQ events for VFs which don't come into play under Eth ports. This mainly includes port management events derived from changes of the phyiscal port (lid change, client re-register, down/up, etc), VF pkey table changes and VF guid changes initiated by the IB driver. (1) make sure that events are generated only for VFs sitting on the relevant physical port (under the ALL_SLAVES flow). (2) before generating the event, convert from physical (one or two) to VF port (always equals one). Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-24 23:05:09 -04:00
Or Gerlitz	430910b1b9	IB/mlx4: Convert slave port before building address-handle When multiplexling a MAD sent from VF, we should convert the port used by the guest to send the packet to the actual physical port which will be used to transmit the packet, before building the relevant address-handle (AH). This is needed under VPI for single ported VFs, since the code that builds the AH (mlx4_ib_query_ah()) makes decisions based on the input port. If we use the port number provided by the guest, it might have different protocol vs. the one this packat has to go from, and hence the result could be wrong. So far, the conversion was done after the AH was built and it worked for single ported Eth VFs which were not enabled under VPI. When adding support for single ported IB VFs and VPI, we hit that. Fixes: `449fc48866` ('net/mlx4: Adapt code for N-Port VF') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-24 23:05:09 -04:00
Matthew Finlay	c07678bb01	IB/cma: Fix broken AF_IB UD support Support for using UD and AF_IB is currently broken. The IB_CM_SIDR_REQ_RECEIVED message is not handled properly in cma_save_net_info() and we end up falling into code that will try and process the request as ipv4/ipv6, which will end up failing. The resolution is to add a check for the SIDR_REQ and call cma_save_ib_info() with a NULL path record. Change cma_save_ib_info() to copy the src sib info from the listen_id when the path record is NULL. Reported-by: Hari Shankar <Hari.Shankar@netapp.com> Signed-off-by: Matt Finlay <matt@mellanox.com> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-05-20 16:15:56 -04:00
Doug Ledford	175e8efe69	Merge branches 'bart-srp', 'generic-errors', 'ira-cleanups' and 'mwang-v8' into k.o/for-4.2	2015-05-20 16:12:40 -04:00
Ira Weiny	5d9fb04406	IB/core: Change rdma_protocol_iboe to roce After discussion upstream, it was agreed to transition the usage of iboe in the kernel to roce. This keeps our terminology consistent with what was finalized in the IBTA Annex 16 and IBTA Annex 17 publications. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-05-20 15:58:19 -04:00
Ted Kim	c29ed5a456	ib/cm: Change reject message type when destroying cm_id Problem reported by: Ted Kim <ted.h.kim@oracle.com>: We have a case where a Linux system and a non-Linux system are trying to interoperate. The Linux host is the active side and starts the connection establishment, but later decides to not go through with the connection setup and does rdma_destroy_id(). The rdma_destroy_id() eventually works its way down to cm_destroy_id() in core/cm.c, where a REJ is sent. The non-Linux system has some trouble recognizing the REJ because of: A. CM states which can't receive the REJ B. Some issues about REJ formatting (missing comm ID) ISSUE A: That part of the spec says, a Consumer Reject REJ can be sent for a connection abort, but it goes further and says: can send a REJ message with a "Consumer Reject" Reason code if they are in a CM state (i.e. REP Rcvd, MRA(REP) Sent, REQ Rcvd, MRA Sent) that allows a REJ to be sent (lines 35-38). Of the states listed there in that sentence, it would seem to limit the active side to using the Consumer Reject (for the abort case) in just the REP-Rcvd and MRA-REP-Sent states. That is basically only after the active side sees a REP (or alternatively goes down the state transitions to timeout in which case a Timeout REJ is sent). As a fix, in cm-destroy-id() move the IB-CM-MRA-REQ-RCVD case to the same as REQ-SENT. Essentially, make a REJ sent after getting an MRA on active side a timeout rather than Consumer- Reject, which is arguably more correct with the CM state diagrams previous to getting a REP. Signed-off-by: Ted Kim <ted.h.kim@oracle.com> Signed-off-by: Sean Hefty <sean.hefty@intel.com>	2015-05-20 12:41:38 -04:00
Ira Weiny	f9b22e355d	IB/core: Convert core to use bitfield for caps Remove query_protocol callback Use the new Core Capability bits for: rdma_protocol_* rdma_cap_ib_mad rdma_cap_ib_smi rdma_cap_ib_cm rdma_cap_iw_cm rdma_cap_ib_sa rdma_cap_ib_mcast rdma_cap_af_ib rdma_cap_eth_ah Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-05-20 12:38:43 -04:00
Ira Weiny	7738613e7c	IB/core: Add per port immutable struct to ib_device As of commit `5eb620c81c` "IB/core: Add helpers for uncached GID and P_Key searches"; pkey_tbl_len and gid_tbl_len are immutable data which are stored in the ib_device. The per port core capability flags to be added later are also immutable data to be stored in the ib_device object. In preparation for this create a structure for per port immutable data and place the pkey and gid table lengths within this structure. "get_port_immutable" is added as a mandatory device function to allow the drivers to fill in this data. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-05-20 12:38:13 -04:00
Ira Weiny	26c454288a	IB/user_mad: Fix buggy usage of port index The addition of the rdma_cap_ib_mad is technically broken in ib_umad_remove_one because the loop "i" value is not a port value. This bug resulted in the ib_umad failing to properly remove its resources when the core capability functions were converted to bit fields. NOTE: e17371d73908 did not result in broken behavior on its own. It was only an issue when the implementation of rdma_cap_ib_mad was changed. Pass the port value to rdma_cap_ib_mad. Fixes: e17371d73908 ("IB/Verbs: Use management helper rdma_cap_ib_mad()") Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-05-20 12:37:34 -04:00
Ira Weiny	ab8be619b8	IB/user_mad: Use new start/end port functions Use the new common rdma_[start\|end]_port functions instead of using local variables and figuring it out on the fly. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-05-20 12:36:17 -04:00
Ira Weiny	f766c58fa3	IB/mad: Add const qualifiers to query only functions The following functions only need read access to the data passed to them. ib_mad_kernel_rmpp_agent is_rmpp_data_mad rcv_has_same_gid ib_find_send_mad Clarify with const specifiers Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-05-20 12:34:45 -04:00

... 3 4 5 6 7 ...

4951 Commits