linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-28 15:11:31 +00:00

Author	SHA1	Message	Date
Yishai Hadas	35f05dabf9	IB/mlx4: Reset flow support for IB kernel ULPs The driver exposes interfaces that directly relate to HW state. Upon fatal error, consumers of these interfaces (ULPs) that rely on completion of all their posted work-request could hang, thereby introducing dependencies in shutdown order. To prevent this from happening, we manage the relevant resources (CQs, QPs) that are used by the device. Upon a fatal error, we now generate simulated completions for outstanding WQEs that were not completed at the time the HW was reset. It includes invoking the completion event handler for all involved CQs so that the ULPs will poll those CQs. When polled we return simulated CQEs with IB_WC_WR_FLUSH_ERR return code enabling ULPs to clean up their resources and not wait forever for completions upon receiving remove_one. The above change requires an extra check in the data path to make sure that when device is in error state, the simulated CQEs will be returned and no further WQEs will be posted. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-09 14:03:53 -08:00
Moni Shoua	824c25c1ab	IB/mlx4: Always use the correct port for mirrored multicast attachments When attaching a QP to a multicast address in bonded mode, there was an assumption that the port of the QP must be #1. This assumption isn't the case under the flow which enables maximal usage of the physical ports. Fix it by always checking the port of the original flow and create the mirrored flow on the other port. Fixes: `c6215745b6` ('IB/mlx4: Load balance ports in port aggregation mode') Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-09 14:03:53 -08:00
Moni Shoua	c6215745b6	IB/mlx4: Load balance ports in port aggregation mode When the mlx4 IB (RoCE) device works in link aggregation mode, it exposes a single port to upper layers. Therefore, applications always set '1' in port_num attribute when modifying a QP or creating an address handle. To make sure that a node uses all available ports the mlx4 driver will override the port_num attribute with a round robin policy. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-04 16:14:25 -08:00
Moni Shoua	146d6e1983	IB/mlx4: Create mirror flows in port aggregation mode In port aggregation mode flows for port #1 (the only port) should be mirrored on port #2. This is because packets can arrive from either physical ports. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-04 16:14:25 -08:00
Moni Shoua	a575009030	IB/mlx4: Add port aggregation support Register the interface with the mlx4 core driver with port aggregation support and check for port aggregation mode when the 'add' function is called. In this mode, only one physical port is reported to the upper layer (RoCE/IB core stack and ULPs). Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-04 16:14:25 -08:00
Moni Shoua	2f48485d1c	IB/mlx4: Reuse mlx4_mac_to_u64() This function is implemented twice... get rid of one copy. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-04 16:14:25 -08:00
David S. Miller	95f873f2ff	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: arch/arm/boot/dts/imx6sx-sdb.dts net/sched/cls_bpf.c Two simple sets of overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-27 16:59:56 -08:00
Yishai Hadas	872bf2fb69	net/mlx4_core: Maintain a persistent memory for mlx4 device Maintain a persistent memory that should survive reset flow/PCI error. This comes as a preparation for coming series to support above flows. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-25 14:43:13 -08:00
Hariprasad Shenai	cf7fe64aee	iw_cxgb4: Cleanup register defines/MACROS defined in t4fw_ri_api.h Cleanup all the MACROS that are defined in t4fw_ri_api.h and affected files Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-16 01:07:02 -05:00
Hariprasad Shenai	a56c66e808	iw_cxgb4: Cleanup register defines/MACROS defined in t4.h Cleanup all the MACROS defined in t4.h and the affected files Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-16 01:07:01 -05:00
Or Gerlitz	5eff6dadb9	net/mlx4: Don't disable vxlan offloads under DMFS-A0 optimized steering Except for VXLAN steering rules, all offloads should work as they were under plain DMFS mode. Fix that by enabling all the offloads under DMFS-A0 mode, except for VXLAN steering rules. Fixes: `d57febe1a4` "net/mlx4: Add A0 hybrid steering" Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-15 19:35:30 -05:00
Jiri Pirko	df8a39defa	net: rename vlan_tx_* helpers since "tx" is misleading there The same macros are used for rx as well. So rename it. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-13 17:51:08 -05:00
Arnd Bergmann	7835bfb526	infiniband: mlx5: avoid a compile-time warning The return type of find_first_bit() is architecture specific, on ARM it is 'unsigned int', while the asm-generic code used on x86 and a lot of other architectures returns 'unsigned long'. When building the mlx5 driver on ARM, we get a warning about this: infiniband/hw/mlx5/mem.c: In function 'mlx5_ib_cont_pages': infiniband/hw/mlx5/mem.c:84:143: warning: comparison of distinct pointer types lacks a cast m = min(m, find_first_bit(&tmp, sizeof(tmp))); This patch changes the driver to use min_t to make it behave the same way on all architectures. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Eli Cohen <eli@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-13 17:08:21 -05:00
Hariprasad Shenai	bdc590b99f	iw_cxgb4/cxgb4/cxgb4vf/cxgb4i/csiostor: Cleanup register defines/macros related to all other cpl messages This patch cleanups all other macros/register define related to CPL messages that are defined in t4_msg.h and the affected files Signed-off-by: Anish Bhatt <anish@chelsio.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-12 16:19:34 -05:00
Hariprasad Shenai	6c53e938a8	iw_cxgb4/cxgb4/cxgb4i: Cleanup register defines/MACROS related to CM CPL messages This patch cleanups all macros/register define related to connection management CPL messages that are defined in t4_msg.h and the affected files Signed-off-by: Anish Bhatt <anish@chelsio.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-12 16:19:34 -05:00
Hariprasad Shenai	f612b815d7	RDMA/cxgb4/cxgb4vf/csiostor: Cleanup SGE register defines This patch cleanups all SGE related macros/register defines that are defined in t4_regs.h and the affected files. Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-05 16:34:47 -05:00
Roland Dreier	a7cfef21e3	Merge branches 'core', 'cxgb4', 'ipoib', 'iser', 'mlx4', 'ocrdma', 'odp' and 'srp' into for-next	2014-12-15 18:19:20 -08:00
Haggai Eran	b4cfe447d4	IB/mlx5: Implement on demand paging by adding support for MMU notifiers * Implement the relevant invalidation functions (zap MTTs as needed) * Implement interlocking (and rollback in the page fault handlers) for cases of a racing notifier and fault. * With this patch we can now enable the capability bits for supporting RC send/receive/RDMA read/RDMA write, and UD send. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Shachar Raindel <raindel@mellanox.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-12-15 18:19:04 -08:00
Haggai Eran	eab668a6d0	IB/mlx5: Add support for RDMA read/write responder page faults Signed-off-by: Shachar Raindel <raindel@mellanox.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-12-15 18:19:03 -08:00
Haggai Eran	7bdf65d411	IB/mlx5: Handle page faults This patch implement a page fault handler (leaving the pages pinned as of time being). The page fault handler handles initiator and responder page faults for UD/RC transports, for send/receive operations, as well as RDMA read/write initiator support. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Shachar Raindel <raindel@mellanox.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-12-15 18:19:03 -08:00
Haggai Eran	6aec21f6a8	IB/mlx5: Page faults handling infrastructure * Refactor MR registration and cleanup, and fix reg_pages accounting. * Create a work queue to handle page fault events in a kthread context. * Register a fault handler to get events from the core for each QP. The registered fault handler is empty in this patch, and only a later patch implements it. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Shachar Raindel <raindel@mellanox.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-12-15 18:19:03 -08:00
Haggai Eran	832a6b06ab	IB/mlx5: Add mlx5_ib_update_mtt to update page tables after creation The new function allows updating the page tables of a memory region after it was created. This can be used to handle page faults and page invalidations. Since mlx5_ib_update_mtt will need to work from within page invalidation, so it must not block on memory allocation. It employs an atomic memory allocation mechanism that is used as a fallback when kmalloc(GFP_ATOMIC) fails. In order to reuse code from mlx5_ib_populate_pas, the patch splits this function and add the needed parameters. Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Shachar Raindel <raindel@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-12-15 18:19:02 -08:00
Haggai Eran	cc149f751b	IB/mlx5: Changes in memory region creation to support on-demand paging This patch wraps together several changes needed for on-demand paging support in the mlx5_ib_populate_pas function, and when registering memory regions. * Instead of accepting a UMR bit telling the function to enable all access flags, the function now accepts the access flags themselves. * For on-demand paging memory regions, fill the memory tables from the correct list, and enable/disable the access flags per-page according to whether the page is present. * A new bit is set to enable writing of access flags when using the firmware create_mkey command. * Disable contig pages when on-demand paging is enabled. In addition the patch changes the UMR code to use PTR_ALIGN instead of our own macro. Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-12-15 18:19:02 -08:00
Haggai Eran	8cdd312cfe	IB/mlx5: Implement the ODP capability query verb The patch adds infrastructure to query ODP capabilities in the mlx5 driver. The code will read the capabilities from the device, and enable only those capabilities that both the driver and the device supports. At this point ODP is not supported, so no capability is copied from the device, but the patch exposes the global ODP device capability bit. Signed-off-by: Shachar Raindel <raindel@mellanox.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-12-15 18:19:02 -08:00
Haggai Eran	c1395a2a8c	IB/mlx5: Add function to read WQE from user-space Add a helper function mlx5_ib_read_user_wqe to read information from user-space owned work queues. The function will be used in a later patch by the page-fault handling code in mlx5_ib. Signed-off-by: Haggai Eran <haggaie@mellanox.com> [ Add stub for ib_umem_copy_from() for CONFIG_INFINIBAND_USER_MEM=n - Roland ] Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-12-15 18:13:35 -08:00
Haggai Eran	406f9e5fa9	IB/core: Replace ib_umem's offset field with a full address In order to allow umems that do not pin memory, we need the umem to keep track of its region's address. This makes the offset field redundant, and so this patch removes it. Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-12-15 18:13:35 -08:00
Haggai Eran	968e78dd96	IB/mlx5: Enhance UMR support to allow partial page table update The current UMR interface doesn't allow partial updates to a memory region's page tables. This patch changes the interface to allow that. It also changes the way the UMR operation validates the memory region's state. When set, IB_SEND_UMR_FAIL_IF_FREE will cause the UMR operation to fail if the MKEY is in the free state. When it is unchecked the operation will check that it isn't in the free state. Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Shachar Raindel <raindel@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-12-15 18:13:35 -08:00
Haggai Eran	21af2c3ebf	IB/mlx5: Remove per-MR pas and dma pointers Since UMR code now uses its own context struct on the stack, the pas and dma pointers for the UMR operation that remained in the mlx5_ib_mr struct are not necessary. This patch removes them. Fixes: `a74d24168d` ("IB/mlx5: Refactor UMR to have its own context struct") Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-12-15 18:13:35 -08:00
Devesh Sharma	e5f0508d43	RDMA/ocrdma: Always resolve destination mac from GRH for UD QPs For user applications that use UD QPs, always resolve destination MAC from the GRH. This is to avoid failure due to any garbage value in the attr->dmac. Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com> Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-12-15 18:13:09 -08:00
Mitesh Ahuja	95bf0093a9	RDMA/ocrdma: Fix ocrdma_query_qp() to report q_key value for UD QPs Signed-off-by: Mitesh Ahuja <mitesh.ahuja@emulex.com> Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-12-15 18:13:09 -08:00
Jack Morgenstein	9f35e8995b	IB/mlx4: Fix an incorrectly shadowed variable in mlx4_ib_rereg_user_mr This error was detected by sparse static checker: drivers/infiniband/hw/mlx4/mr.c:226:21: warning: symbol 'err' shadows an earlier one drivers/infiniband/hw/mlx4/mr.c:197:13: originally declared here Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-12-15 18:12:29 -08:00
Hariprasad S	e6b11163d4	RDMA/cxgb4: Handle NET_XMIT return codes cxgb4_create_server() and cxgb4_create_server6() return NET_XMIT_* values or a negative errno. iw_cxgb4 need to handle this correctly. Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-12-15 18:10:46 -08:00
Steve Wise	5b34180883	RDMA/cxgb4: Wake up waiters after flushing the qp When transitioning into ERROR state, the QP was getting flushed after waking up any waiters. This can cause applications to miss flushed work requests which can stall an NFS mount. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-12-15 18:10:46 -08:00
Hariprasad Shenai	2550a88d95	RDMA/cxgb4: Limit MRs to < 8GB for T4/T5 devices T4/T5 hardware can't handle MRs >= 8GB due to a hardware bug. So limit registrations to < 8GB for thse devices. Based on original work by Steve Wise <swise@opengridcomputing.com>. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-12-15 18:10:46 -08:00
Hariprasad Shenai	10be6b48fd	RDMA/cxgb4: Fix locking issue in process_mpa_request Fix the following lockdep report: ============================================= [ INFO: possible recursive locking detected ] 3.17.0+ #3 Tainted: G E --------------------------------------------- kworker/u64:3/299 is trying to acquire lock: (&epc->mutex){+.+.+.}, at: [<ffffffffa074e07a>] process_mpa_request+0x1aa/0x3e0 [iw_cxgb4] but task is already holding lock: (&epc->mutex){+.+.+.}, at: [<ffffffffa074e34e>] rx_data+0x9e/0x1f0 [iw_cxgb4] other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&epc->mutex); lock(&epc->mutex); * DEADLOCK * May be due to missing lock nesting notation 3 locks held by kworker/u64:3/299: #0: ("%s""iw_cxgb4"){.+.+.+}, at: [<ffffffff8106f14d>] process_one_work+0x13d/0x4d0 #1: (skb_work){+.+.+.}, at: [<ffffffff8106f14d>] process_one_work+0x13d/0x4d0 #2: (&epc->mutex){+.+.+.}, at: [<ffffffffa074e34e>] rx_data+0x9e/0x1f0 [iw_cxgb4] stack backtrace: CPU: 2 PID: 299 Comm: kworker/u64:3 Tainted: G E 3.17.0+ #3 Hardware name: Dell Inc. PowerEdge T110/0X744K, BIOS 1.2.1 01/28/2010 Workqueue: iw_cxgb4 process_work [iw_cxgb4] ffff8800b91593d0 ffff8800b8a2f9f8 ffffffff815df107 0000000000000001 ffff8800b9158750 ffff8800b8a2fa28 ffffffff8109f0e2 ffff8800bb768a00 ffff8800b91593d0 ffff8800b9158750 0000000000000000 ffff8800b8a2fa88 Call Trace: [<ffffffff815df107>] dump_stack+0x49/0x62 [<ffffffff8109f0e2>] print_deadlock_bug+0xf2/0x100 [<ffffffff810a0f04>] validate_chain+0x454/0x700 [<ffffffff810a1574>] __lock_acquire+0x3c4/0x580 [<ffffffffa074e07a>] ? process_mpa_request+0x1aa/0x3e0 [iw_cxgb4] [<ffffffff810a17cc>] lock_acquire+0x9c/0x110 [<ffffffffa074e07a>] ? process_mpa_request+0x1aa/0x3e0 [iw_cxgb4] [<ffffffff815e111b>] mutex_lock_nested+0x4b/0x360 [<ffffffffa074e07a>] ? process_mpa_request+0x1aa/0x3e0 [iw_cxgb4] [<ffffffff810c181a>] ? del_timer_sync+0xaa/0xd0 [<ffffffff810c1770>] ? try_to_del_timer_sync+0x70/0x70 [<ffffffffa074e07a>] process_mpa_request+0x1aa/0x3e0 [iw_cxgb4] [<ffffffffa074a3ec>] ? update_rx_credits+0xec/0x140 [iw_cxgb4] [<ffffffffa074e381>] rx_data+0xd1/0x1f0 [iw_cxgb4] [<ffffffff8109ff23>] ? mark_held_locks+0x73/0xa0 [<ffffffff815e4b90>] ? _raw_spin_unlock_irqrestore+0x40/0x70 [<ffffffff810a020d>] ? trace_hardirqs_on_caller+0xfd/0x1c0 [<ffffffff810a02dd>] ? trace_hardirqs_on+0xd/0x10 [<ffffffffa074c931>] process_work+0x51/0x80 [iw_cxgb4] [<ffffffff8106f1c8>] process_one_work+0x1b8/0x4d0 [<ffffffff8106f14d>] ? process_one_work+0x13d/0x4d0 [<ffffffff8106f600>] worker_thread+0x120/0x3c0 [<ffffffff8106f4e0>] ? process_one_work+0x4d0/0x4d0 [<ffffffff81074a0e>] kthread+0xde/0x100 [<ffffffff815e4b40>] ? _raw_spin_unlock_irq+0x30/0x40 [<ffffffff81074930>] ? __init_kthread_worker+0x70/0x70 [<ffffffff815e512c>] ret_from_fork+0x7c/0xb0 [<ffffffff81074930>] ? __init_kthread_worker+0x70/0x70 Based on original work by Steve Wise <swise@opengridcomputing.com>. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-12-15 18:10:46 -08:00
Pramod Kumar	123bc2a27a	RDMA/cxgb4: Configure 0B MRs to match HW implementation 0B MRs need some tweaks to work correctly with HW. When writing the TPTE, if the MR length is zero we now: 1) turn off all permissions 2) set the length to -1 While functionality/capabilities of the MR are the same with these changes, it resolves a dapltest 0B RDMA Read test failure. Based on original work by Steve Wise <swise@opengridcomputing.com>. Signed-off-by: Pramod Kumar <pramod@chelsio.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-12-15 18:10:46 -08:00
Pramod Kumar	63a71ba617	RDMA/cxgb4: Increase epd buff size for debug interface IPv6 address string lengths require increasing the buffer size for debugfs handlers. Signed-off-by: Pramod Kumar <pramod@chelsio.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-12-15 18:10:45 -08:00
Matan Barak	d57febe1a4	net/mlx4: Add A0 hybrid steering A0 hybrid steering is a form of high performance flow steering. By using this mode, mlx4 cards use a fast limited table based steering, in order to enable fast steering of unicast packets to a QP. In order to implement A0 hybrid steering we allocate resources from different zones: (1) General range (2) Special MAC-assigned QPs [RSS, Raw-Ethernet] each has its own region. When we create a rss QP or a raw ethernet (A0 steerable and BF ready) QP, we try hard to allocate the QP from range (2). Otherwise, we try hard not to allocate from this range. However, when the system is pushed to its limits and one needs every resource, the allocator uses every region it can. Meaning, when we run out of raw-eth qps, the allocator allocates from the general range (and the special-A0 area is no longer active). If we run out of RSS qps, the mechanism tries to allocate from the raw-eth QP zone. If that is also exhausted, the allocator will allocate from the general range (and the A0 region is no longer active). Note that if a raw-eth qp is allocated from the general range, it attempts to allocate the range such that bits 6 and 7 (blueflame bits) in the QP number are not set. When the feature is used in SRIOV, the VF has to notify the PF what kind of QP attributes it needs. In order to do that, along with the "Eth QP blueflame" bit, we reserve a new "A0 steerable QP". According to the combination of these bits, the PF tries to allocate a suitable QP. In order to maintain backward compatibility (with older PFs), the PF notifies which QP attributes it supports via QUERY_FUNC_CAP command. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-11 14:47:35 -05:00
Eugenia Emantayev	ddae0349fd	net/mlx4: Change QP allocation scheme When using BF (Blue-Flame), the QPN overrides the VLAN, CV, and SV fields in the WQE. Thus, BF may only be used for QPNs with bits 6,7 unset. The current Ethernet driver code reserves a Tx QP range with 256b alignment. This is wrong because if there are more than 64 Tx QPs in use, QPNs >= base + 65 will have bits 6/7 set. This problem is not specific for the Ethernet driver, any entity that tries to reserve more than 64 BF-enabled QPs should fail. Also, using ranges is not necessary here and is wasteful. The new mechanism introduced here will support reservation for "Eth QPs eligible for BF" for all drivers: bare-metal, multi-PF, and VFs (when hypervisors support WC in VMs). The flow we use is: 1. In mlx4_en, allocate Tx QPs one by one instead of a range allocation, and request "BF enabled QPs" if BF is supported for the function 2. In the ALLOC_RES FW command, change param1 to: a. param1[23:0] - number of QPs b. param1[31-24] - flags controlling QPs reservation Bit 31 refers to Eth blueflame supported QPs. Those QPs must have bits 6 and 7 unset in order to be used in Ethernet. Bits 24-30 of the flags are currently reserved. When a function tries to allocate a QP, it states the required attributes for this QP. Those attributes are considered "best-effort". If an attribute, such as Ethernet BF enabled QP, is a must-have attribute, the function has to check that attribute is supported before trying to do the allocation. In a lower layer of the code, mlx4_qp_reserve_range masks out the bits which are unsupported. If SRIOV is used, the PF validates those attributes and masks out unsupported attributes as well. In order to notify VFs which attributes are supported, the VF uses QUERY_FUNC_CAP command. This command's mailbox is filled by the PF, which notifies which QP allocation attributes it supports. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-11 14:47:35 -05:00
Matan Barak	3dca0f42c7	net/mlx4_core: Use tasklet for user-space CQ completion events Previously, we've fired all our completion callbacks straight from our ISR. Some of those callbacks were lightweight (for example, mlx4_en's and IPoIB napi callbacks), but some of them did more work (for example, the user-space RDMA stack uverbs' completion handler). Besides that, doing more than the minimal work in ISR is generally considered wrong, it could even lead to a hard lockup of the system. Since when a lot of completion events are generated by the hardware, the loop over those events could be so long, that we'll get into a hard lockup by the system watchdog. In order to avoid that, add a new way of invoking completion events callbacks. In the interrupt itself, we add the CQs which receive completion event to a per-EQ list and schedule a tasklet. In the tasklet context we loop over all the CQs in the list and invoke the user callback. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-11 14:47:34 -05:00
Eli Cohen	d14e71103b	mlx5: Fix error flow in add_keys If mlx5_core_create_mkey fails, decrease the pending counter to undo the previous increment. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-08 20:45:56 -05:00
Eli Cohen	6a4f139aae	mlx5: Fix sparse warnings 1. Add required __acquire/__release statements to balance spinlock usage. 2. Change the index parameter of begin_wqe() to be unsigned to match supplied argument type. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-08 20:45:56 -05:00
Hariprasad Shenai	b2e1a3f091	RDMA/cxgb4/cxgb4vf/csiostor: Cleanup macros/register defines related to PCIE, RSS and FW This patch cleanups all PCIE, RSS & FW related macros/register defines that are defined in t4fw_api.h and the affected files. Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-22 16:57:47 -05:00
Hariprasad Shenai	5167865aaa	RDMA/cxgb4/csiostor: Cleansup FW related macros/register defines for PF/VF and LDST This patch cleanups PF/VF and LDST related macros/register defines that are defined in t4fw_api.h and the affected files. Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-22 16:57:47 -05:00
Hariprasad Shenai	77a80e23cc	RDMA/cxgb4: Cleanup Filter related macros/register defines This patch cleanups all filter related macros/register defines that are defined in t4fw_api.h and the affected files. Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-22 16:57:46 -05:00
Al Viro	479163f460	mlx5: don't duplicate kvfree() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Eli Cohen <eli@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-21 14:58:18 -05:00
Matan Barak	7ae0e400cd	net/mlx4_core: Flexible (asymmetric) allocation of EQs and MSI-X vectors for PF/VFs Previously, the driver queried the firmware in order to get the number of supported EQs. Under SRIOV, since this was done before the driver notified the firmware how many VFs it actually needs, the firmware had to take into account a worst case scenario and always allocated four EQs per VF, where one was used for events while the others were used for completions. Now, when the firmware supports the asymmetric allocation scheme, denoted by exposing num_sys_eqs > 0 (--> MLX4_DEV_CAP_FLAG2_SYS_EQS), we use the QUERY_FUNC command to query the firmware before enabling SRIOV. Thus we can get more EQs and MSI-X vectors per function. Moreover, when running in the new firmware/driver mode, the limitation that the number of EQs should be a power of two is lifted. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-13 15:16:21 -05:00
Anish Bhatt	d7990b0c34	cxgb4i/cxgb4 : Refactor macros to conform to uniform standards Refactored all macros used in cxgb4i as part of previously started cxgb4 macro names cleanup. Makes them more uniform and avoids namespace collision. Minor changes in other drivers where required as some of these macros are used by multiple drivers, affected drivers are iw_cxgb4, cxgb4(vf) & csiostor Signed-off-by: Anish Bhatt <anish@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-13 14:36:22 -05:00
Hariprasad Shenai	e2ac962895	cxgb4: Cleanup macros so they follow the same style and look consistent, part 2 Various patches have ended up changing the style of the symbolic macros/register defines to different style. As a result, the current kernel.org files are a mix of different macro styles. Since this macro/register defines is used by different drivers a few patch series have ended up adding duplicate macro/register define entries with different styles. This makes these register define/macro files a complete mess and we want to make them clean and consistent. This patch cleans up a part of it. Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-10 12:57:10 -05:00
Or Gerlitz	571e1b2c7a	mlx4: Avoid leaking steering rules on flow creation error flow If mlx4_ib_create_flow() attempts to create > 1 rules with the firmware, and one of these registrations fail, we leaked the already created flow rules. One example of the leak is when the registration of the VXLAN ghost steering rule fails, we didn't unregister the original rule requested by the user, introduced in commit `d2fce8a906` "mlx4: Set user-space raw Ethernet QPs to properly handle VXLAN traffic". While here, add dump of the VXLAN portion of steering rules so it can actually be seen when flow creation fails. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-30 19:48:58 -04:00

1 2 3 4 5 ...

2662 Commits