linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-07 20:51:47 +00:00

Author	SHA1	Message	Date
Haggai Eran	8605933a22	IB/mlx5: Add MR to radix tree in reg_mr_callback For memory regions that are allocated using reg_umr, the suffix of mlx5_core_create_mkey isn't being called. Instead the creation is completed in a callback function (reg_mr_callback). This means that these MRs aren't being added to the MR radix tree. Add them in the callback. Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-27 11:53:05 -07:00
Haggai Eran	096f7e72c6	IB/mlx5: Fix error handling in reg_umr If ib_post_send fails when posting the UMR work request in reg_umr, the code doesn't release the temporary pas buffer allocated, and doesn't dma_unmap it. Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-27 11:53:05 -07:00
Sagi Grimberg	c7f44fbda6	mlx5_core: Copy DIF fields only when input and output space values match Some DIF implementations (SCSI initiator/target) may want to use different input/output values for application tag and/or reference tag. So in case memory/wire domain values don't match HW must not copy them. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-27 11:53:02 -07:00
Sagi Grimberg	5c273b1677	mlx5_core: Simplify signature handover wqe for interleaved buffers No need for repetition format pattern in case the data and protection are already interleaved in the memory domain since the pattern already exists. A single key entry is sufficient and may save some extra fetch ops. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-27 11:52:58 -07:00
Sagi Grimberg	8524867b9c	mlx5_core: Fix signature handover operation for interleaved buffers When the data and protection are interleaved in the memory domain, no need to expand the mkey total length. At the moment no Linux user works (iSER initiator & target) in interleaved mode. This may change in the future as for SCSI pass-through devices there is no real point in target performing de-interleaving and re-interleaving of the protection data in the PT stage. Regardless, signature verbs support this mode. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-27 11:52:54 -07:00
Or Gerlitz	c7ca4b69d9	IB/iser: Bump version to 1.4 Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-26 08:19:48 -07:00
Roi Dayan	e7eeffa4a0	IB/iser: Add missing newlines to logging messages Logging messages need terminating newlines to avoid possible message interleaving. Add them. Signed-off-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-26 08:19:48 -07:00
Ariel Nahum	66d4e62d27	IB/iser: Fix a possible race in iser connection states transition In some circumstances (multiple targets), RDMA_CM ESTABLISHED event and ep_disconnect may race. In this case, the iser connection state may transition to UP (after ep_disconnect transitioned it to TERMINATING), while the connection is being torn down. Upon RDMA_CM event ESTABLISHED we allow iser connection state to transition to UP only from PENDING. We also make sure to protect this state change (done under the connection lock). Signed-off-by: Ariel Nahum <arieln@mellanox.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-26 08:19:48 -07:00
Ariel Nahum	b73c3adabd	IB/iser: Simplify connection management iSER relies on refcounting to manage iser connections establishment and teardown. Following commit `39ff05dbbb` ("IB/iser: Enhance disconnection logic for multi-pathing"), iser connection maintain 3 references: - iscsi_endpoint (at creation stage) - cma_id (at connection request stage) - iscsi_conn (at bind stage) We can avoid taking explicit refcounts by correctly serializing iser teardown flows (graceful and non-graceful). Our approach is to trigger a scheduled work to handle ordered teardown by gracefully waiting for 2 cleanup stages to complete: 1. Cleanup of live pending tasks indicated by iscsi_conn_stop completion 2. Flush errors processing Each completed stage will notify a waiting worker thread when it is done to allow teardwon continuation. Since iSCSI connection establishment may trigger endpoint disconnect without a successful endpoint connect, we rely on the iscsi <-> iser binding (.conn_bind) to learn about the teardown policy we should take wrt cleanup stages. Since all cleanup worker threads are scheduled (release_wq) in .ep_disconnect it is safe to assume that when module_exit is called, all cleanup workers are already scheduled. Thus proper module unload shall flush all scheduled works before allowing safe exit, to guarantee no resources got left behind. Signed-off-by: Ariel Nahum <arieln@mellanox.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-26 08:19:48 -07:00
David S. Miller	54e5c4def0	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/bonding/bond_alb.c drivers/net/ethernet/altera/altera_msgdma.c drivers/net/ethernet/altera/altera_sgdma.c net/ipv6/xfrm6_output.c Several cases of overlapping changes. The xfrm6_output.c has a bug fix which overlaps the renaming of skb->local_df to skb->ignore_df. In the Altera TSE driver cases, the register access cleanups in net-next overlapped with bug fixes done in net. Similarly a bug fix to send ALB packets in the bonding driver using the right source address overlaps with cleanups in net-next. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-24 00:32:30 -04:00
Linus Torvalds	5fa6a683c0	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: "It looks like a sizeble collection but this is nearly 3 weeks of bug fixing while you were away. 1) Fix crashes over IPSEC tunnels with NAT, the latter can reroute the packet through a non-IPSEC protected path and the code has to be able to handle SKBs attached to routes lacking an attached xfrm state. From Steffen Klassert. 2) Fix OOPSs in ipv4 and ipv6 ipsec layers for unsupported sub-protocols, also from Steffen Klassert. 3) Set local_df on fragmented netfilter skbs otherwise we won't be able to forward successfully, from Florian Westphal. 4) cdc_mbim ipv6 neighbour code does __vlan_find_dev_deep without holding RCU lock, from Bjorn Mork. 5) local_df test in ip_may_fragment is inverted, from Florian Westphal. 6) jme driver doesn't check for DMA mapping failures, from Neil Horman. 7) qlogic driver doesn't calculate number of TX queues properly, from Shahed Shaikh. 8) fib_info_cnt can drift irreversibly positive if we fail to allocate the fi->fib_metrics array, from Sergey Popovich. 9) Fix use after free in ip6_route_me_harder(), also from Sergey Popovich. 10) When SYSCTL is disabled, we don't handle local_port_range and ping_group_range defaults properly at all, from Cong Wang. 11) Unaccelerated VLAN tagged frames improperly handled by cdc_mbim driver, fix from Bjorn Mork. 12) cassini driver needs nested lock annotations for TX locking, from Emil Goode. 13) On init error ipv6 VTI driver can unregister pernet ops twice, oops. Fix from Mahtias Krause. 14) If macvlan device is down, don't propagate IFF_ALLMULTI changes, from Peter Christensen. 15) Missing NULL pointer check while parsing netlink config options in ip6_tnl_validate(). From Susant Sahani. 16) Fix handling of neighbour entries during ipv6 router reachability probing, from Duan Jiong. 17) x86 and s390 JIT address randomization has some address calculation bugs leading to crashes, from Alexei Starovoitov and Heiko Carstens. 18) Clear up those uglies with nop patching and net_get_random_once(), from Hannes Frederic Sowa. 19) Option length miscalculated in ip6_append_data(), fix also from Hannes Frederic Sowa. 20) A while ago we fixed a race during device unregistry when a namespace went down, turns out there is a second place that needs similar protection. From Cong Wang. 21) In the new Altera TSE driver multicast filtering isn't working, disable it and just use promisc mode until the cause is found. From Vince Bridgers. 22) When we disable router enabling in ipv6 we have to flush the cached routes explicitly, from Duan Jiong. 23) NBMA tunnels should not cache routes on the tunnel object because the key is variable, from Timo Teräs. 24) With stacked devices GRO information in skb->cb[] can be not setup properly, make sure it is in all code paths. From Eric Dumazet. 25) Really fix stacked vlan locking, multiple levels of nesting with intervening non-vlan devices are possible. From Vlad Yasevich. 26) Fallback ipip tunnel device's mtu is not setup properly, from Steffen Klassert. 27) The packet scheduler's tcindex filter can crash because we structure copy objects with list_head's inside, oops. From Cong Wang. 28) Fix CHECKSUM_COMPLETE handling for ipv6 GRE tunnels, from Eric Dumazet. 29) In some configurations 'itag' in __mkroute_input() can end up being used uninitialized because of how fib_validate_source() works. Fix it by explitly initializing itag to zero like all the other fib_validate_source() callers do, from Li RongQing" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (116 commits) batman: fix a bogus warning from batadv_is_on_batman_iface() ipv4: initialise the itag variable in __mkroute_input bonding: Send ALB learning packets using the right source bonding: Don't assume 802.1Q when sending alb learning packets. net: doc: Update references to skb->rxhash stmmac: Remove unbalanced clk_disable call ipv6: gro: fix CHECKSUM_COMPLETE support net_sched: fix an oops in tcindex filter can: peak_pci: prevent use after free at netdev removal ip_tunnel: Initialize the fallback device properly vlan: Fix build error wth vlan_get_encap_level() can: c_can: remove obsolete STRICT_FRAME_ORDERING Kconfig option MAINTAINERS: Pravin Shelar is Open vSwitch maintainer. bnx2x: Convert return 0 to return rc bonding: Fix alb mode to only use first level vlans. bonding: Fix stacked device detection in arp monitoring macvlan: Fix lockdep warnings with stacked macvlan devices vlan: Fix lockdep warning with stacked vlan devices. net: Allow for more then a single subclass for netif_addr_lock net: Find the nesting level of a given device by type. ...	2014-05-23 15:29:43 -07:00
Linus Torvalds	a7aa96a92e	Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending Pull scsi target fixes from Nicholas Bellinger: "This series include: - Close race between iser-target network portal shutdown + accepting new connection logins (sagi) - Fix free-after-use regression in tcm_fc post conversion to percpu-ida pre-allocation (nab) - Explicitly disable Immediate + Unsolicited Data for iser-target connections when T10-PI is enabled (sagi + nab) - Allow pi_prot_type + emulate_write_cache attributes to be set to zero regardless of backend support (andy) - memory leak fix (mikulas)" * git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: target: fix memory leak on XCOPY target: Don't allow setting WC emulation if device doesn't support iscsi-target: Disable Immediate + Unsolicited Data with ISER Protection tcm_fc: Fix free-after-use regression in ft_free_cmd iscsi-target: Change BUG_ON to REJECT in iscsit_process_nop_out Target/iscsi,iser: Avoid accepting transport connections during stop stage Target/iser: Fix iscsit_accept_np and rdma_cm racy flow Target/iser: Fix wrong connection requests list addition target: Allow non-supporting backends to set pi_prot_type to 0	2014-05-21 18:03:14 +09:00
Sagi Grimberg	5f80ff8ecc	Target/iser: Gracefully reject T10-PI enabled connect request if not supported In case user chose to set T10-PI enable on the target while the IB device does not support it, gracefully reject the request. Reported-by: Slava Shwartsman <valyushash@gmail.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Cc: stable@vger.kernel.org # 3.15+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2014-05-20 11:18:44 -07:00
Sagi Grimberg	f5ebec9629	Target/iser: Wait for proper cleanup before unloading disconnected_handler works are scheduled on system_wq. When attempting to unload, first make sure all works have cleaned up. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Cc: stable@vger.kernel.org # 3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2014-05-20 11:18:43 -07:00
Sagi Grimberg	88c4015fda	Target/iser: Improve cm events handling There are 4 RDMA_CM events that all basically mean that the user should teardown the IB connection: - DISCONNECTED - ADDR_CHANGE - DEVICE_REMOVAL - TIMEWAIT_EXIT Only in DISCONNECTED/ADDR_CHANGE it makes sense to call rdma_disconnect (send DREQ/DREP to our initiator). So we keep the same teardown handler for all of them but only indicate calling rdma_disconnect for the relevant events. This patch also removes redundant debug prints for each single event. v2 changes: - Call isert_disconnected_handler() for DEVICE_REMOVAL (Or + Sag) Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Cc: stable@vger.kernel.org # 3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2014-05-20 11:17:40 -07:00
Bart Van Assche	5cfb17828d	IB/srp: Add fast registration support Certain HCA types (e.g. Connect-IB) and certain configurations (e.g. ConnectX VF) support fast registration but not FMR. Hence add fast registration support. In function srp_rport_reconnect(), move the the srp_finish_req() loop from after to before the srp_create_target_ib() call. This is needed to avoid that srp_finish_req() tries to queue any invalidation requests for rkeys associated with the old queue pair on the newly allocated queue pair. Invoking srp_finish_req() before the queue pair has been reallocated is safe since srp_claim_req() handles completions correctly that arrive after srp_finish_req() has been invoked. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-20 09:20:52 -07:00
Bart Van Assche	52ede08f00	IB/srp: Rename FMR-related variables The next patch will cause the renamed variables to be shared between the code for FMR and for FR memory registration. Make the names of these variables independent of the memory registration mode. This patch does not change any functionality. The start of this patch was the changes applied via the following shell command: sed -i.orig 's/SRP_FMR_SIZE/SRP_MAX_PAGES_PER_MR/g; \ s/fmr_page_mask/mr_page_mask/g;s/fmr_page_size/mr_page_size/g; \ s/fmr_page_shift/mr_page_shift/g;s/fmr_max_size/mr_max_size/g; \ s/max_pages_per_fmr/max_pages_per_mr/g;s/nfmr/nmdesc/g; \ s/fmr_len/dma_len/g' drivers/infiniband/ulp/srp/ib_srp.[ch] Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-20 09:20:52 -07:00
Bart Van Assche	d1b4289e16	IB/srp: One FMR pool per SRP connection Allocate one FMR pool per SRP connection instead of one SRP pool per HCA. This improves scalability of the SRP initiator. Only request the SCSI mid-layer to retry a SCSI command after a temporary mapping failure (-ENOMEM) but not after a permanent mapping failure. This avoids that SCSI commands are retried indefinitely if a permanent memory mapping failure occurs. Tell the SCSI mid-layer to reduce queue depth temporarily in the unlikely case where an application is queuing many requests with more than max_pages_per_fmr sg-list elements. For FMR pool allocation, base the max_pages_per_fmr parameter on the HCA memory registration limit. Only try to allocate an FMR pool if FMR is supported. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-20 09:20:52 -07:00
Bart Van Assche	b1b8854d16	IB/srp: Introduce the 'register_always' kernel module parameter Add a kernel module parameter that enables memory registration also for SG-lists that can be processed without memory registration. This makes it easier for kernel developers to test the memory registration code. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-20 09:20:52 -07:00
Bart Van Assche	539dde6fc5	IB/srp: Introduce srp_finish_mapping() This patch does not change any functionality. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-20 09:20:52 -07:00
Bart Van Assche	76bc1e1ddd	IB/srp: Introduce srp_map_fmr() This patch does not change any functionality. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-20 09:20:51 -07:00
Bart Van Assche	62154b2e89	IB/srp: Introduce an additional local variable This patch does not change any functionality. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-20 09:20:51 -07:00
Bart Van Assche	af24663bc8	IB/srp: Fix kernel-doc warnings Avoid that the kernel-doc tool warns about missing argument descriptions for the ib_srp.[ch] source files. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-20 09:20:51 -07:00
Bart Van Assche	024ca90151	IB/srp: Fix a sporadic crash triggered by cable pulling Avoid that the loops that iterate over the request ring can encounter a pointer to a SCSI command in req->scmnd that is no longer associated with that request. If the function srp_unmap_data() is invoked twice for a SCSI command that is not in flight then that would cause ib_fmr_pool_unmap() to be invoked with an invalid pointer as argument, resulting in a kernel oops. Reported-by: Sagi Grimberg <sagig@mellanox.com> Reference: http://thread.gmane.org/gmane.linux.drivers.rdma/19068/focus=19069 Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-20 09:20:51 -07:00
Steve Wise	11b8e22d4d	RDMA/cxgb4: Fix vlan support RDMA connections over a vlan interface don't work due to import_ep() not using the correct egress device. - use the real device in import_ep() - use rdma_vlan_dev_real_dev() in get_real_dev(). Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-19 18:00:32 -07:00
Duan Jiong	0cc65dd691	RDMA/ocrdma: Convert to use simple_open() This removes an open-coded duplicate of simple_open(). Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-19 17:55:54 -07:00
Christoph Jaeger	65b302ad31	RDMA/cxgb4: Fix memory leaks in c4iw_alloc() error paths c4iw_alloc() bails out without freeing the storage that 'devp' points to. Picked up by Coverity - CID 1204241. Fixes: `fa658a98a2` ("RDMA/cxgb4: Use the BAR2/WC path for kernel QPs and T5 devices") Signed-off-by: Christoph Jaeger <christophjaeger@linux.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-05-19 17:55:43 -07:00
Sagi Grimberg	9d49f5e284	Target/iser: Fix hangs in connection teardown In ungraceful teardowns isert close flows seem racy such that isert_wait_conn hangs as RDMA_CM_EVENT_DISCONNECTED never gets invoked (no one called rdma_disconnect). Both graceful and ungraceful teardowns will have rx flush errors (isert posts a batch once connection is established). Once all flush errors are consumed we invoke isert_wait_conn and it will be responsible for calling rdma_disconnect. This way it can be sure that rdma_disconnect was called and it won't wait forever. This patch also removes the logout_posted indicator. either the logout completion was consumed and no problem decrementing the post_send_buf_count, or it was consumed as a flush error. no point of keeping it for isert_wait_conn as there is no danger that isert_conn will be accidentally removed while it is running. (Drop unnecessary sleep_on_conn_wait_comp check in isert_cq_rx_comp_err - nab) Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Cc: stable@vger.kernel.org # 3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2014-05-19 14:38:49 -07:00
Sagi Grimberg	e346ab343f	Target/iser: Bail from accept_np if np_thread is trying to close In case np_thread state is in RESET/SHUTDOWN/EXIT states, no point for isert to stall there as we may get a hang in case no one will wake it up later. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Cc: stable@vger.kernel.org # 3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2014-05-19 14:32:58 -07:00
Matan Barak	9433c18891	IB/mlx4: Invoke UPDATE_QP for proxy QP1 on MAC changes When we receive a netdev event indicating a netdev change and/or a netdev address change, we must change the MAC index used by the proxy QP1 (in the QP context), otherwise RoCE CM packets sent by the VF will not carry the same source MAC address as the non-CM packets. We use the UPDATE_QP command to perform this change. In order to avoid modifying a QP context based on netdev event, while the driver attempts to destroy this QP (e.g either the mlx4_ib or ib_mad modules are unloaded), we use mutex locking in both flows. Since the relevant mlx4 proxy GSI QP is created indirectly by the mad module when they create their GSI QP, the mlx4 didn't need to keep track on that QP prior to this change. Now, when QP modifications are needed to this QP from within the driver, we added refernece to it. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 15:12:45 -04:00
Sagi Grimberg	14f4b54fe3	Target/iscsi,iser: Avoid accepting transport connections during stop stage When the target is in stop stage, iSER transport initiates RDMA disconnects. The iSER initiator may wish to establish a new connection over the still existing network portal. In this case iSER transport should not accept and resume new RDMA connections. In order to learn that, iscsi_np is added with enabled flag so the iSER transport can check when deciding weather to accept and resume a new connection request. The iscsi_np is enabled after successful transport setup, and disabled before iscsi_np login threads are cleaned up. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Cc: stable@vger.kernel.org # 3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2014-05-15 17:09:11 -07:00
Sagi Grimberg	531b7bf4bd	Target/iser: Fix iscsit_accept_np and rdma_cm racy flow RDMA CM and iSCSI target flows are asynchronous and completely uncorrelated. Relying on the fact that iscsi_accept_np will be called after CM connection request event and will wait for it is a mistake. When attempting to login to a few targets this flow is racy and unpredictable, but for parallel login to dozens of targets will race and hang every time. The correct synchronizing mechanism in this case is pending on a semaphore rather than a wait_for_event. We keep the pending interruptible for iscsi_np cleanup stage. (Squash patch to remove dead code into parent - nab) Reported-by: Slava Shwartsman <valyushash@gmail.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Cc: stable@vger.kernel.org # 3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2014-05-15 17:09:10 -07:00
Sagi Grimberg	9fe63c88b1	Target/iser: Fix wrong connection requests list addition Should be adding list_add_tail($new, $head) and not the other way around. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Cc: stable@vger.kernel.org # 3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2014-05-15 17:09:10 -07:00
Wilfried Klaebe	7ad24ea4bf	net: get rid of SET_ETHTOOL_OPS net: get rid of SET_ETHTOOL_OPS Dave Miller mentioned he'd like to see SET_ETHTOOL_OPS gone. This does that. Mostly done via coccinelle script: @@ struct ethtool_ops ops; struct net_device dev; @@ - SET_ETHTOOL_OPS(dev, ops); + dev->ethtool_ops = ops; Compile tested only, but I'd seriously wonder if this broke anything. Suggested-by: Dave Miller <davem@davemloft.net> Signed-off-by: Wilfried Klaebe <w-lkml@lebenslange-mailadresse.de> Acked-by: Felipe Balbi <balbi@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-13 17:43:20 -04:00
Hariprasad S	7d0a73a40c	RDMA/cxgb4: Update Kconfig to include Chelsio T5 adapter Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-28 17:29:41 -07:00
Steve Wise	c2f9da92f2	RDMA/cxgb4: Only allow kernel db ringing for T4 devs The whole db drop avoidance stuff is for T4 only. So we cannot allow that to be enabled for T5 devices. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-28 17:29:41 -07:00
Steve Wise	92e5011ab0	RDMA/cxgb4: Force T5 connections to use TAHOE congestion control This is required to work around a T5 HW issue. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-28 17:29:41 -07:00
Steve Wise	cc18b939e1	RDMA/cxgb4: Fix endpoint mutex deadlocks In cases where the cm calls c4iw_modify_rc_qp() with the endpoint mutex held, they must be called with internal == 1. rx_data() and process_mpa_reply() are not doing this. This causes a deadlock because c4iw_modify_rc_qp() might call c4iw_ep_disconnect() in some !internal cases, and c4iw_ep_disconnect() acquires the endpoint mutex. The design was intended to only do the disconnect for !internal calls. Change rx_data(), FPDU_MODE case, to call c4iw_modify_rc_qp() with internal == 1, and then disconnect only after releasing the mutex. Change process_mpa_reply() to call c4iw_modify_rc_qp(TERMINATE) with internal == 1 and set a new attr flag telling it to send a TERMINATE message. Previously this was implied by !internal. Change process_mpa_reply() to return whether the caller should disconnect after releasing the endpoint mutex. Now rx_data() will do the disconnect in the cases where process_mpa_reply() wants to disconnect after the TERMINATE is sent. Change c4iw_modify_rc_qp() RTS->TERM to only disconnect if !internal, and to send a TERMINATE message if attrs->send_term is 1. Change abort_connection() to not aquire the ep mutex for setting the state, and make all calls to abort_connection() do so with the mutex held. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-28 17:29:41 -07:00
Linus Torvalds	38137a5188	InfiniBand/RDMA updates for 3.15-rc2: - Mostly cxgb4 fixes unblocked by the merge of some prerequisites via the net tree. - Drop deprecated MSI-X API use. - A couple other miscellaneous things. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABCAAGBQJTUXCpAAoJEENa44ZhAt0h4tEP/1P1TqA/90sO7WeChze6H2ph kkGJPsCj8i1AWpef/Ax2hvoH1pjxK/VQbpwgkGJElv+rRaP99pandYGPBFluUoJK jI6D/mZSAsl6BXIZZCTcz9ttQNYacDdDL1I22WIdntkkCUsoP1awuHnXS4WLAiPA 3D1XmiSHTGv9nbqd2wKA3sHWFaUN/s+RqsvIXUcNrN4qEUL2KXPOyt7i4Pz8eUq7 /lxG1S854QUrZsW7qFR6Jc3clhNupUDWBBRrII3MOV98W3upHOVsFAdtcLZAkcTA FBCk90VVi0hsPeHqqCZjKUW/JE414vEDzOZfmYOpMj0xzc9sF1POxlvqMfRjvKTD fwYAkNLT7AW74jSU38KZl9pdLsQx187c75pcYUZ4ph6m6bozbtlPGkJaa3M3yyUb DmWUZL1dX8i69dV+w/ABz/4uK0dbCwMeAM9cCCUiedqI5wIuEV4Ld3NOLe7HnAXz O81UrVVnwrOVl8uPmGzY3VliU0OtFqhMBCxTgsZRxuBr9DSx/dUzb4Tzfw7e3Sc+ +uxMMnbuCDysqEfzKsLG0+qu9P3hcwyafTJ9u8pq0D+BCB3tA8ZIKiqrKeMBua2h IJAy8M+zVG2HZCQFreH2oMnXmzVoyp+5Yl28izMDEfDptJ6hel4FLBiLLlYnU1DO zxUN5X1yOHxFkK4YYetE =/PTZ -----END PGP SIGNATURE----- Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband Pull infiniband/rdma updates from Roland Dreier: - mostly cxgb4 fixes unblocked by the merge of some prerequisites via the net tree - drop deprecated MSI-X API use. - a couple other miscellaneous things. * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: RDMA/cxgb4: Fix over-dereference when terminating RDMA/cxgb4: Use uninitialized_var() RDMA/cxgb4: Add missing debug stats RDMA/cxgb4: Initialize reserved fields in a FW work request RDMA/cxgb4: Use pr_warn_ratelimited RDMA/cxgb4: Max fastreg depth depends on DSGL support RDMA/cxgb4: SQ flush fix RDMA/cxgb4: rmb() after reading valid gen bit RDMA/cxgb4: Endpoint timeout fixes RDMA/cxgb4: Use the BAR2/WC path for kernel QPs and T5 devices IB/mlx5: Add block multicast loopback support IB/mthca: Use pci_enable_msix_exact() instead of pci_enable_msix() IB/qib: Use pci_enable_msix_range() instead of pci_enable_msix()	2014-04-18 13:49:42 -07:00
Linus Torvalds	141eaccd01	Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending Pull SCSI target updates from Nicholas Bellinger: "Here are the target pending updates for v3.15-rc1. Apologies in advance for waiting until the second to last day of the merge window to send these out. The highlights this round include: - iser-target support for T10 PI (DIF) offloads (Sagi + Or) - Fix Task Aborted Status (TAS) handling in target-core (Alex Leung) - Pass in transport supported PI at session initialization (Sagi + MKP + nab) - Add WRITE_INSERT + READ_STRIP T10 PI support in target-core (nab + Sagi) - Fix iscsi-target ERL=2 ASYNC_EVENT connection pointer bug (nab) - Fix tcm_fc use-after-free of ft_tpg (Andy Grover) - Use correct ib_sg_dma primitives in ib_isert (Mike Marciniszyn) Also, note the virtio-scsi + vhost-scsi changes to expose T10 PI metadata into KVM guest have been left-out for now, as there where a few comments from MST + Paolo that where not able to be addressed in time for v3.15. Please expect this feature for v3.16-rc1" * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (43 commits) ib_srpt: Use correct ib_sg_dma primitives target/tcm_fc: Rename ft_tport_create to ft_tport_get target/tcm_fc: Rename ft_{add,del}_lport to {add,del}_wwn target/tcm_fc: Rename structs and list members for clarity target/tcm_fc: Limit to 1 TPG per wwn target/tcm_fc: Don't export ft_lport_list target/tcm_fc: Fix use-after-free of ft_tpg target: Add check to prevent Abort Task from aborting itself target: Enable READ_STRIP emulation in target_complete_ok_work target/sbc: Add sbc_dif_read_strip software emulation target: Enable WRITE_INSERT emulation in target_execute_cmd target/sbc: Add sbc_dif_generate software emulation target/sbc: Only expose PI read_cap16 bits when supported by fabric target/spc: Only expose PI mode page bits when supported by fabric target/spc: Only expose PI inquiry bits when supported by fabric target: Pass in transport supported PI at session initialization target/iblock: Fix double bioset_integrity_free bug Target/sbc: Initialize COMPARE_AND_WRITE write_sg scatterlist target/rd: T10-Dif: RAM disk is allocating more space than required. iscsi-target: Fix ERL=2 ASYNC_EVENT connection pointer bug ...	2014-04-12 16:51:08 -07:00
Mike Marciniszyn	b076808051	ib_srpt: Use correct ib_sg_dma primitives The code was incorrectly using sg_dma_address() and sg_dma_len() instead of ib_sg_dma_address() and ib_sg_dma_len(). This prevents srpt from functioning with the Intel HCA and indeed will corrupt memory badly. Cc: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Tested-by: Vinod Kumar <vinod.kumar@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Cc: stable@vger.kernel.org # 3.3+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2014-04-11 15:31:20 -07:00
Roland Dreier	5ae2866f52	Merge branches 'cxgb4', 'misc', 'mlx5' and 'qib' into for-next	2014-04-11 11:36:15 -07:00
Steve Wise	1d1ca9b4fd	RDMA/cxgb4: Fix over-dereference when terminating Need to get the endpoint reference before calling rdma_fini(), which might fail causing us to not get the reference. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-11 11:36:10 -07:00
Steve Wise	97df1c6736	RDMA/cxgb4: Use uninitialized_var() Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-11 11:36:10 -07:00
Steve Wise	98a3e87990	RDMA/cxgb4: Add missing debug stats Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-11 11:36:09 -07:00
Steve Wise	c3f98fa291	RDMA/cxgb4: Initialize reserved fields in a FW work request Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-11 11:36:09 -07:00
Hariprasad Shenai	aec844df10	RDMA/cxgb4: Use pr_warn_ratelimited Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-11 11:36:08 -07:00
Steve Wise	a03d9f94cc	RDMA/cxgb4: Max fastreg depth depends on DSGL support The max depth of a fastreg mr depends on whether the device supports DSGL or not. So compute it dynamically based on the device support and the module use_dsgl option. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-11 11:36:08 -07:00
Steve Wise	b4e2901c52	RDMA/cxgb4: SQ flush fix There is a race when moving a QP from RTS->CLOSING where a SQ work request could be posted after the FW receives the RDMA_RI/FINI WR. The SQ work request will never get processed, and should be completed with FLUSHED status. Function c4iw_flush_sq(), however was dropping the oldest SQ work request when in CLOSING or IDLE states, instead of completing the pending work request. If that oldest pending work request was actually complete and has a CQE in the CQ, then when that CQE is proceessed in poll_cq, we'll BUG_ON() due to the inconsistent SQ/CQ state. This is a very small timing hole and has only been hit once so far. The fix is two-fold: 1) c4iw_flush_sq() MUST always flush all non-completed WRs with FLUSHED status regardless of the QP state. 2) In c4iw_modify_rc_qp(), always set the "in error" bit on the queue before moving the state out of RTS. This ensures that the state transition will not happen while another thread is in post_rc_send(), because set_state() and post_rc_send() both aquire the qp spinlock. Also, once we transition the state out of RTS, subsequent calls to post_rc_send() will fail because the "in error" bit is set. I don't think this fully closes the race where the FW can get a FINI followed a SQ work request being posted (because they are posted to differente EQs), but the #1 fix will handle the issue by flushing the SQ work request. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-11 11:36:08 -07:00
Steve Wise	def4771f4b	RDMA/cxgb4: rmb() after reading valid gen bit Some HW platforms can reorder read operations, so we must rmb() after we see a valid gen bit in a CQE but before we read any other fields from the CQE. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-11 11:36:07 -07:00
Steve Wise	b33bd0cbfa	RDMA/cxgb4: Endpoint timeout fixes 1) timedout endpoint processing can be starved. If there are continual CPL messages flowing into the driver, the endpoint timeout processing can be starved. This condition exposed the other bugs below. Solution: In process_work(), call process_timedout_eps() after each CPL is processed. 2) Connection events can be processed even though the endpoint is on the timeout list. If the endpoint is scheduled for timeout processing, then we must ignore MPA Start Requests and Replies. Solution: Change stop_ep_timer() to return 1 if the ep has already been queued for timeout processing. All the callers of stop_ep_timer() need to check this and act accordingly. There are just a few cases where the caller needs to do something different if stop_ep_timer() returns 1: 1) in process_mpa_reply(), ignore the reply and process_timeout() will abort the connection. 2) in process_mpa_request, ignore the request and process_timeout() will abort the connection. It is ok for callers of stop_ep_timer() to abort the connection since that will leave the state in ABORTING or DEAD, and process_timeout() now ignores timeouts when the ep is in these states. 3) Double insertion on the timeout list. Since the endpoint timers are used for connection setup and teardown, we need to guard against the possibility that an endpoint is already on the timeout list. This is a rare condition and only seen under heavy load and in the presense of the above 2 bugs. Solution: In ep_timeout(), don't queue the endpoint if it is already on the queue. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-11 11:36:07 -07:00
Steve Wise	fa658a98a2	RDMA/cxgb4: Use the BAR2/WC path for kernel QPs and T5 devices Signed-off-by: Steve Wise <swise@opengridcomputing.com> [ Fix cast from u64* to integer. - Roland ] Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-11 11:36:01 -07:00
Eli Cohen	f360d88a2e	IB/mlx5: Add block multicast loopback support Add support for the block multicast loopback QP creation flag along the proper firmware API for that. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-10 18:43:32 -07:00
Alexander Gordeev	9684c2ea6d	IB/mthca: Use pci_enable_msix_exact() instead of pci_enable_msix() As result of the deprecation of the MSI-X/MSI enablement functions pci_enable_msix() and pci_enable_msi_block(), all drivers using these two interfaces need to be updated to use the new pci_enable_msi_range() or pci_enable_msi_exact() and pci_enable_msix_range() or pci_enable_msix_exact() interfaces. Signed-off-by: Alexander Gordeev <agordeev@redhat.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-10 18:41:34 -07:00
Alexander Gordeev	bf3f043e7b	IB/qib: Use pci_enable_msix_range() instead of pci_enable_msix() As result of the deprecation of the MSI-X/MSI enablement functions pci_enable_msix() and pci_enable_msi_block(), all drivers using these two interfaces need to be updated to use the new pci_enable_msi_range() and pci_enable_msix_range() interfaces. Signed-off-by: Alexander Gordeev <agordeev@redhat.com> Acked-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-10 18:39:13 -07:00
Nicholas Bellinger	e70beee783	target: Pass in transport supported PI at session initialization In order to support local WRITE_INSERT + READ_STRIP operations for non PI enabled fabrics, the fabric driver needs to be able signal what protection offload operations are supported. This is done at session initialization time so the modes can be signaled by individual se_wwn + se_portal_group endpoints, as well as optionally across different transports on the same endpoint. For iser-target, set TARGET_PROT_ALL if the underlying ib_device has already signaled PI offload support, and allow this to be exposed via a new iscsit_transport->iscsit_get_sup_prot_ops() callback. For loopback, set TARGET_PROT_ALL to signal SCSI initiator mode operation. For all other drivers, set TARGET_PROT_NORMAL to disable fabric level PI. Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Or Gerlitz <ogerlitz@mellanox.com> Cc: Quinn Tran <quinn.tran@qlogic.com> Cc: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2014-04-07 01:48:54 -07:00
Sagi Grimberg	f225225848	Target/iser: Use Fastreg only if device supports signature Fastreg is mandatory for signature, so if the device doesn't support it we don't need to use it. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2014-04-07 01:48:52 -07:00
Nicholas Bellinger	03e7848a64	iser-target: Add missing se_cmd put for WRITE_PENDING in tx_comp_err This patch fixes a bug where outstanding RDMA_READs with WRITE_PENDING status require an extra target_put_sess_cmd() in isert_put_cmd() code when called from isert_cq_tx_comp_err() + isert_cq_drain_comp_llist() context during session shutdown. The extra kref PUT is required so that transport_generic_free_cmd() invokes the last target_put_sess_cmd() -> target_release_cmd_kref(), which will complete(&se_cmd->cmd_wait_comp) the outstanding se_cmd descriptor with WRITE_PENDING status, and awake the completion in target_wait_for_sess_cmds() to invoke TFO->release_cmd(). The bug was manifesting itself in target_wait_for_sess_cmds() where a se_cmd descriptor with WRITE_PENDING status would end up sleeping indefinately. Acked-by: Sagi Grimberg <sagig@mellanox.com> Cc: Or Gerlitz <ogerlitz@mellanox.com> Cc: <stable@vger.kernel.org> #3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2014-04-07 01:48:51 -07:00
Nicholas Bellinger	131e6abc67	target: Add TFO->abort_task for aborted task resources release Now that TASK_ABORTED status is not generated for all cases by TMR ABORT_TASK + LUN_RESET, a new TFO->abort_task() caller is necessary in order to give fabric drivers a chance to unmap hardware / software resources before the se_cmd descriptor is released via the normal TFO->release_cmd() codepath. This patch adds TFO->aborted_task() in core_tmr_abort_task() in place of the original transport_send_task_abort(), and also updates all fabric drivers to implement this caller. The fabric drivers that include changes to perform cleanup via ->aborted_task() are: - iscsi-target - iser-target - srpt - tcm_qla2xxx The fabric drivers that currently set ->aborted_task() to NOPs are: - loopback - tcm_fc - usb-gadget - sbp-target - vhost-scsi For the latter five, there appears to be no additional cleanup required before invoking TFO->release_cmd() to release the se_cmd descriptor. v2 changes: - Move ->aborted_task() call into transport_cmd_finish_abort (Alex) Cc: Alex Leung <amleung21@yahoo.com> Cc: Mark Rustad <mark.d.rustad@intel.com> Cc: Roland Dreier <roland@kernel.org> Cc: Vu Pham <vu@mellanox.com> Cc: Chris Boot <bootc@bootc.net> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Giridhar Malavali <giridhar.malavali@qlogic.com> Cc: Saurav Kashyap <saurav.kashyap@qlogic.com> Cc: Quinn Tran <quinn.tran@qlogic.com> Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2014-04-07 01:48:51 -07:00
Nicholas Bellinger	f46d6a8a01	iser-target: Match FRMR descriptors to available session tags This patch changes isert_conn_create_fastreg_pool() to follow logic in iscsi_target_locate_portal() for determining how many FRMR descriptors to allocate based upon the number of possible per-session command slots that are available. This addresses an OOPs in isert_reg_rdma() where due to the use of ISCSI_DEF_XMIT_CMDS_MAX could end up returning a bogus fast_reg_descriptor when the number of active tags exceeded the original hardcoded max. Note this also includes moving isert_conn_create_fastreg_pool() from isert_connect_request() to isert_put_login_tx() before posting the final Login Response PDU in order to determine the se_nacl->queue_depth (eg: number of tags) per session the target will be enforcing. v2 changes: - Move isert_conn->conn_fr_pool list_head init into isert_conn_request() v3 changes: - Drop unnecessary list_empty() check in isert_reg_rdma() (Sagi) Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Or Gerlitz <ogerlitz@mellanox.com> Cc: <stable@vger.kernel.org> #3.12+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2014-04-07 01:48:50 -07:00
Sagi Grimberg	5bac4b1a1f	Target/iser: Fail SCSI WRITE command if device detected integrity error If during data-transfer a data-integrity error was detected we must fail the command with CHECK_CONDITION and not execute the command. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Reported-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2014-04-07 01:48:49 -07:00
Sagi Grimberg	96b7973e1c	Target/iser: Move check signature status to a function Remove code duplication from RDMA_READ and RDMA_WRITE completions that do basically the same check. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2014-04-07 01:48:49 -07:00
Sagi Grimberg	897bb2c916	Target/iser: Consider DIF and RDMA_READ completions when calculating post_send counter If protection is involved, iSER target must wait for completion of RDMA_READ before sending SCSI response. So we must consider that when calculating post_send_buf_count additions, also when processing good/error completions. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2014-04-07 01:48:48 -07:00
Sagi Grimberg	c2caa20777	Target/iser: Fix signature work requests accounting As REG_SIG_MR work request and it's LOCAL_INVALIDATE are not accounted in post_send_buf_count we must color these with ISER_FASTREG_LI_WRID in order to process their error completions when the QP flushes. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2014-04-07 01:48:48 -07:00
Sagi Grimberg	9e961ae73c	IB/isert: Support T10-PI protected transactions In case the Target core passed transport T10 protection operation: 1. Register data buffer (data memory region) 2. Register protection buffer if exsists (prot memory region) 3. Register signature region (signature memory region) - use work request IB_WR_REG_SIG_MR 4. Execute RDMA 5. Upon RDMA completion check the signature status - if succeeded send good SCSI response - if failed send SCSI bad response with appropriate sense buffer (Fix up compile error in isert_reg_sig_mr, and fix up incorrect se_cmd->prot_type -> TARGET_PROT_NORMAL comparision - nab) (Fix failed sector assignment in isert_completion_rdma_* - Sagi + nab) (Fix enum assignements for protection type - Sagi) (Fix devision on 32-bit in isert_completion_rdma_* - Sagi + Fengguang) (Fix context change for v3.14-rc6 code - nab) (Fix iscsit_build_rsp_pdu inc_statsn flag usage - nab) Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2014-04-07 01:48:47 -07:00
Sagi Grimberg	f93f3a70da	IB/isert: Accept RDMA_WRITE completions In case of protected transactions, we will need to check the protection status of the transaction before sending SCSI response. So be ready for RDMA_WRITE completions. currently we don't ask for these completions, but for T10-PI we will. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2014-04-07 01:48:46 -07:00
Sagi Grimberg	d3e125dac1	IB/isert: Initialize T10-PI resources Introduce pi_context to hold relevant RDMA protection resources. We eliminate data_key_valid boolean and replace it with indicators container to indicate: - Is the descriptor protected (registered via signature MR) - Is the data_mr key valid (can spare LOCAL_INV WR) - Is the prot_mr key valid (can spare LOCAL_INV WR) - Is the sig_mr key valid (can spare LOCAL_INV WR) Upon connection establishment check if network portal is T10-PI enabled and allocate T10-PI resources if necessary, allocate signature enabled memory regions and mark connection queue-pair as signature enabled. (Fix context change for v3.14-rc6 code - nab) Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2014-04-07 01:48:46 -07:00
Sagi Grimberg	e3d7e4c30c	IB/isert: Introduce isert_map/unmap_data_buf export map/unmap data buffer to a routine that may be used in various places in the code and keep the mapping data in a designated descriptor. Also, let isert_fast_reg_mr to decide weather to use global MR or do fast registration. This commit does not change any functionality. (Fix context change for v3.14-rc6 code - nab) Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2014-04-07 01:48:45 -07:00
Linus Torvalds	877f075aac	Main batch of InfiniBand/RDMA changes for 3.15: - The biggest change is core API extensions and mlx5 low-level driver support for handling DIF/DIX-style protection information, and the addition of PI support to the iSER initiator. Target support will be arriving shortly through the SCSI target tree. - A nice simplification to the "umem" memory pinning library now that we have chained sg lists. Kudos to Yishai Hadas for realizing our code didn't have to be so crazy. - Another nice simplification to the sg wrappers used by qib, ipath and ehca to handle their mapping of memory to adapter. - The usual batch of fixes to bugs found by static checkers etc. from intrepid people like Dan Carpenter and Yann Droneaud. - A large batch of cxgb4, ocrdma, qib driver updates. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABCAAGBQJTPYBnAAoJEENa44ZhAt0hGI4P/29eotGwpkANUQE6FQvxCUL2 CXJtSg52lmYvGJrPK4IhihpbtQmHJz3iXEzlOOWidTw1dJgObR6vFaRymh7+vDLs CdzybMcXdasarqTuYeJbFzhkimpwtWWrMy/8Ik/Jj/5glGQ6cUSpdYZzVtFhYNqf hCGE8iLi+tuekJJj1htut5D6apXM7udcdc2yLJNOdsSj/VUXt1oqG1x9xAi9R8Tq 7o8eFSStdlja0EBQ6Hli2zauCSnQkaUtr8h6EAFbcCtvBK8HqsHSc2gfq2ViFUiN ztt167oWoQnVkR0qCPL5nVt+CRQHHROprVXvbpcTI3aW61gNIl6OrUUOXefzHXac TNi+fdMpiEB/JQ4Z04Jzd1dGCSjYeTqPj4rO4meFjBmxRDdTgZHu7FWwejT1nYJ5 d2abVdCOT+QWlIlM7m/pjdWJII5OYM+4/jtTayGepEaR4fTUzKtPZPBLNUBDBKE+ 4f92PC8LiuPkwJgb6XT96onPz1bDCOnPSEdwoKUFKPeGUcwgVOM/Wx5NU4Yf7rfg RxQwZ7mJXbjCYFlmGGo/0QDy6UEGkIFYlJSzooP+wlK1JvZ5h2M+9QKX2FtwzR+R I2kBxcTXWsM/h88R7MkNqbNIllmhssrJwmAE46OneZbfoBOB+JZjb4nLRTu0jEcS zn6f16GmJ37BKn2/qYY/ =Ww6H -----END PGP SIGNATURE----- Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband Pull infiniband updates from Roland Dreier: "Main batch of InfiniBand/RDMA changes for 3.15: - The biggest change is core API extensions and mlx5 low-level driver support for handling DIF/DIX-style protection information, and the addition of PI support to the iSER initiator. Target support will be arriving shortly through the SCSI target tree. - A nice simplification to the "umem" memory pinning library now that we have chained sg lists. Kudos to Yishai Hadas for realizing our code didn't have to be so crazy. - Another nice simplification to the sg wrappers used by qib, ipath and ehca to handle their mapping of memory to adapter. - The usual batch of fixes to bugs found by static checkers etc. from intrepid people like Dan Carpenter and Yann Droneaud. - A large batch of cxgb4, ocrdma, qib driver updates" * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (102 commits) RDMA/ocrdma: Unregister inet notifier when unloading ocrdma RDMA/ocrdma: Fix warnings about pointer <-> integer casts RDMA/ocrdma: Code clean-up RDMA/ocrdma: Display FW version RDMA/ocrdma: Query controller information RDMA/ocrdma: Support non-embedded mailbox commands RDMA/ocrdma: Handle CQ overrun error RDMA/ocrdma: Display proper value for max_mw RDMA/ocrdma: Use non-zero tag in SRQ posting RDMA/ocrdma: Memory leak fix in ocrdma_dereg_mr() RDMA/ocrdma: Increment abi version count RDMA/ocrdma: Update version string be2net: Add abi version between be2net and ocrdma RDMA/ocrdma: ABI versioning between ocrdma and be2net RDMA/ocrdma: Allow DPP QP creation RDMA/ocrdma: Read ASIC_ID register to select asic_gen RDMA/ocrdma: SQ and RQ doorbell offset clean up RDMA/ocrdma: EQ full catastrophe avoidance RDMA/cxgb4: Disable DSGL use by default RDMA/cxgb4: rx_data() needs to hold the ep mutex ...	2014-04-03 16:57:19 -07:00
Roland Dreier	f7eaa7ed8f	Merge branches 'core', 'cxgb4', 'ip-roce', 'iser', 'misc', 'mlx4', 'nes', 'ocrdma', 'qib', 'sgwrapper', 'srp' and 'usnic' into for-next	2014-04-03 08:30:17 -07:00
Selvin Xavier	2d8f57d56f	RDMA/ocrdma: Unregister inet notifier when unloading ocrdma Unregister the inet notifier during ocrdma unload to avoid a panic after driver unload. Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com> Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-03 08:30:07 -07:00
Roland Dreier	7a1e89d8b7	RDMA/ocrdma: Fix warnings about pointer <-> integer casts We should cast pointers to and from unsigned long to turn them into ints. Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-03 08:30:07 -07:00
Devesh Sharma	fad51b7d36	RDMA/ocrdma: Code clean-up Clean up code. Also modifying GSI QP to error during ocrdma_close is fixed. Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-03 08:30:06 -07:00
Selvin Xavier	334b8db3a6	RDMA/ocrdma: Display FW version Adding a sysfs file for getting the FW version. Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com> Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-03 08:30:06 -07:00
Selvin Xavier	a51f06e167	RDMA/ocrdma: Query controller information Issue mailbox commands to query ocrdma controller information and phy information and print them while adding ocrdma device. Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com> Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-03 08:30:05 -07:00
Selvin Xavier	bbc5ec524e	RDMA/ocrdma: Support non-embedded mailbox commands Added a routine to issue non-embedded mailbox commands for handling large mailbox request/response data. Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com> Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-03 08:30:05 -07:00
Selvin Xavier	1228056bcf	RDMA/ocrdma: Handle CQ overrun error Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com> Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-03 08:30:05 -07:00
Selvin Xavier	ac578aef8b	RDMA/ocrdma: Display proper value for max_mw Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com> Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-03 08:30:04 -07:00
Selvin Xavier	cf5788ade7	RDMA/ocrdma: Use non-zero tag in SRQ posting As part of SRQ receive buffers posting we populate a non-zero tag which will be returned in SRQ receive completions. Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com> Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-03 08:30:04 -07:00
Selvin Xavier	9d1878a369	RDMA/ocrdma: Memory leak fix in ocrdma_dereg_mr() Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com> Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-03 08:30:03 -07:00
Devesh Sharma	2e6e9f2bb8	RDMA/ocrdma: Increment abi version count Increment the ABI version count for driver/library interface. Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-03 08:30:02 -07:00
Devesh Sharma	0154410bd4	RDMA/ocrdma: Update version string Update the driver vrsion string and node description string Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-03 08:29:59 -07:00
Devesh Sharma	b6b87d2e69	RDMA/ocrdma: ABI versioning between ocrdma and be2net While loading RoCE driver be2net driver should check for ABI version to catch functional incompatibilities. Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-03 08:29:51 -07:00
Devesh Sharma	1eebbb6ec3	RDMA/ocrdma: Allow DPP QP creation Allow creating DPP QP even if inline-data is not requested. This is an optimization to lower latency. Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-03 08:29:44 -07:00
Devesh Sharma	21c3391a9a	RDMA/ocrdma: Read ASIC_ID register to select asic_gen ocrdma driver selects execution path based on sli_family and asic generation number. This introduces code to read the asic gen number from pci register instead of obtaining it from the Emulex NIC driver. Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-03 08:29:40 -07:00
Devesh Sharma	2df84fa87f	RDMA/ocrdma: SQ and RQ doorbell offset clean up Introducing new macros to define SQ and RQ doorbell offset. Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-03 08:29:36 -07:00
Devesh Sharma	ea61762679	RDMA/ocrdma: EQ full catastrophe avoidance Stale entries in the CQ being destroyed causes hardware to generate EQEs indefinitely for a given CQ. Thus causing uncontrolled execution of irq_handler. This patch fixes this using following sementics: * irq_handler will ring EQ doorbell atleast once and implement budgeting scheme. * cq_destroy will count number of valid entires during destroy and ring cq-db so that hardware does not generate uncontrolled EQE. * cq_destroy will synchronize with last running irq_handler instance. * arm_cq will always defer arming CQ till poll_cq, except for the first arm_cq call. * poll_cq will always ring cq-db with arm=SET if arm_cq was called prior to enter poll_cq. * poll_cq will always ring cq-db with arm=UNSET if arm_cq was not called prior to enter poll_cq. Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-03 08:29:34 -07:00
Steve Wise	96bb2706c8	RDMA/cxgb4: Disable DSGL use by default Current hardware doesn't correctly support DSGL. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-02 08:53:54 -07:00
Steve Wise	c529fb5046	RDMA/cxgb4: rx_data() needs to hold the ep mutex To avoid racing with other threads doing close/flush/whatever, rx_data() should hold the endpoint mutex. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-02 08:53:54 -07:00
Steve Wise	977116c698	RDMA/cxgb4: Drop RX_DATA packets if the endpoint is gone Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-02 08:53:53 -07:00
Steve Wise	a7db89eb89	RDMA/cxgb4: Lock around accept/reject downcalls There is a race between ULP threads doing an accept/reject, and the ingress processing thread handling close/abort for the same connection. The accept/reject path needs to hold the lock to serialize these paths. Signed-off-by: Steve Wise <swise@opengridcomputing.com> [ Fold in locking fix found by Dan Carpenter <dan.carpenter@oracle.com>. - Roland ] Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-02 08:52:45 -07:00
Moni Shoua	b2853fd6c2	IB/core: Don't resolve passive side RoCE L2 address in CMA REQ handler The code that resolves the passive side source MAC within the rdma_cm connection request handler was both redundant and buggy, so remove it. It was redundant since later, when an RC QP is modified to RTR state, the resolution will take place in the ib_core module. It was buggy because this callback also deals with UD SIDR exchange, for which we incorrectly looked at the REQ member of the CM event and dereferenced a random value. Fixes: `dd5f03beb4` ("IB/core: Ethernet L2 attributes in verbs/cm structures") Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-01 14:05:26 -07:00
Mike Marciniszyn	f3585a6ae3	IB/ehca: Remove ib_sg_dma_address() and ib_sg_dma_len() overloads These methods appear to only mimic the sg_dma_address() and sg_dma_len() behavior. They can be safely removed. Suggested-by: Bart Van Assche <bvanassche@acm.org> Cc: Bart Van Assche <bvanassche@acm.org> Cc: Hoang-Nam Nguyen <hnguyen@de.ibm.com> Cc: Christoph Raisch <raisch@de.ibm.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-01 11:16:31 -07:00
Mike Marciniszyn	49c5c27e05	IB/ipath: Remove ib_sg_dma_address() and ib_sg_dma_len() overloads The removal of these methods is compensated for by code changes to .map_sg to insure that the vanilla sg_dma_address() and sg_dma_len() will do the same thing as the equivalent former ib_sg_dma_address() and ib_sg_dma_len() calls into the drivers. The introduction of this patch required that the struct ipath_dma_mapping_ops be converted to a C99 initializer. Suggested-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-01 11:16:31 -07:00
Mike Marciniszyn	446bf432a9	IB/qib: Remove ib_sg_dma_address() and ib_sg_dma_len() overloads Remove the overload for .dma_len and .dma_address The removal of these methods is compensated for by code changes to .map_sg to insure that the vanilla sg_dma_address() and sg_dma_len() will do the same thing as the equivalent former ib_sg_dma_address() and ib_sg_dma_len() calls into the drivers. Suggested-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Tested-by: Vinod Kumar <vinod.kumar@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-01 11:16:31 -07:00
Or Gerlitz	5de2ad986d	IB/iser: Bump driver version to 1.3 Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-01 11:09:46 -07:00
Or Gerlitz	3ee07d27ac	IB/iser: Update Mellanox copyright note Update Mellanox copyrights for 2014 on the iser initiator driver. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-01 11:09:46 -07:00
Or Gerlitz	4f9208ad3f	IB/iser: Print QP information once connection is established Add an iser info print with the local/remote QP information carried out when the connection is established. While here, fix a little leftover from the T10 work and set a debug print to be carried in debug and not info level. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-01 11:09:46 -07:00
Ariel Nahum	4667f5dfb0	IB/iser: Remove struct iscsi_iser_conn The iscsi stack has existing mechanisms to link back and forth between the iscsi connection and the iscsi transport (e.g iser/tcp) connection. This is done through a dd_data pointer field in struct iscsi_conn which can be set to point to the transport connection, etc. The iscsi_iser_conn structure was used to get this linking done in another way, which is uneeded and adds extra complication to the iser code, so we just remove it. Signed-off-by: Ariel Nahum <arieln@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-01 11:09:46 -07:00
Roi Dayan	1d6c2b736f	IB/iser: Drain the tx cq once before looping on the rx cq The iser disconnection flow isn't done before all the inflight recv/send buffers posted to the QP are either flushed or normally completed to the CQ that serves this connection. The condition check is done in iser_handle_comp_error(). Currently, it's possible for the send buffer completion that makes the posted send buffers counter reach zero to be polled in the drain tx call, which is after the rx cq is fully drained. Since this completion might be not an error one (for example, it might be a completion of the logout request iSCSI PDU) we will skip iser_handle_comp_error(). So the connection will never terminate from the iscsi stack point of view, and we hang. To resolve this race, do the draining of the tx cq before the loop on the rx cq. Signed-off-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-04-01 11:09:46 -07:00

1 2 3 4 5 ...

3965 Commits