linux

Author	SHA1	Message	Date
Vladimir Sokolovsky	c9257433f2	mlx4_core: Set RAE and init mtt_sz field in FRMR MPT entries Set the RAE (remote access enable) bit and correctly initialize the MTT size in MPT entries being set up for fast register memory regions. Otherwise the callers can't enable remote access and in fact can't fast register at all (since the HCA will think no MTT entries are allocated). Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-09-02 13:38:29 -07:00
Linus Torvalds	8be1a6d6c7	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: mlx4: Update/add Mellanox Technologies copyright lines to mlx4 driver files mlx4_core: Add VLAN tag field to WQE control segment struct RDMA/nes: CM connection setup/teardown rework IPoIB: Correct help text for INFINIBAND_IPOIB_DEBUG IPoIB/cm: Connected mode is no longer EXPERIMENTAL RDMA/ucm: BKL is not needed for ib_ucm_open() RDMA/ucma: BKL is not needed for ucma_open()	2008-07-26 20:40:36 -07:00
FUJITA Tomonori	8d8bb39b9e	dma-mapping: add the device argument to dma_mapping_error() Add per-device dma_mapping_ops support for CONFIG_X86_64 as POWER architecture does: This enables us to cleanly fix the Calgary IOMMU issue that some devices are not behind the IOMMU (http://lkml.org/lkml/2008/5/8/423). I think that per-device dma_mapping_ops support would be also helpful for KVM people to support PCI passthrough but Andi thinks that this makes it difficult to support the PCI passthrough (see the above thread). So I CC'ed this to KVM camp. Comments are appreciated. A pointer to dma_mapping_ops to struct dev_archdata is added. If the pointer is non NULL, DMA operations in asm/dma-mapping.h use it. If it's NULL, the system-wide dma_ops pointer is used as before. If it's useful for KVM people, I plan to implement a mechanism to register a hook called when a new pci (or dma capable) device is created (it works with hot plugging). It enables IOMMUs to set up an appropriate dma_mapping_ops per device. The major obstacle is that dma_mapping_error doesn't take a pointer to the device unlike other DMA operations. So x86 can't have dma_mapping_ops per device. Note all the POWER IOMMUs use the same dma_mapping_error function so this is not a problem for POWER but x86 IOMMUs use different dma_mapping_error functions. The first patch adds the device argument to dma_mapping_error. The patch is trivial but large since it touches lots of drivers and dma-mapping.h in all the architecture. This patch: dma_mapping_error() doesn't take a pointer to the device unlike other DMA operations. So we can't have dma_mapping_ops per device. Note that POWER already has dma_mapping_ops per device but all the POWER IOMMUs use the same dma_mapping_error function. x86 IOMMUs use device argument. [akpm@linux-foundation.org: fix sge] [akpm@linux-foundation.org: fix svc_rdma] [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: fix bnx2x] [akpm@linux-foundation.org: fix s2io] [akpm@linux-foundation.org: fix pasemi_mac] [akpm@linux-foundation.org: fix sdhci] [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: fix sparc] [akpm@linux-foundation.org: fix ibmvscsi] Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Muli Ben-Yehuda <muli@il.ibm.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:03 -07:00
Jack Morgenstein	51a379d0c8	mlx4: Update/add Mellanox Technologies copyright lines to mlx4 driver files Update existing Mellanox copyright lines to 2008, and add such lines to files where they are missing. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-25 10:32:52 -07:00
Linus Torvalds	5c402355ad	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: MAINTAINERS: Remove Glenn Streiff from NetEffect entry mlx4_core: Improve error message when not enough UAR pages are available IB/mlx4: Add support for memory management extensions and local DMA L_Key IB/mthca: Keep free count for MTT buddy allocator mlx4_core: Keep free count for MTT buddy allocator mlx4_code: Add missing FW status return code IB/mlx4: Rename struct mlx4_lso_seg to mlx4_wqe_lso_seg mlx4_core: Add module parameter to enable QoS support RDMA/iwcm: Remove IB_ACCESS_LOCAL_WRITE from remote QP attributes IPoIB: Include err code in trace message for ib_sa_path_rec_get() failures IB/sa_query: Check if sm_ah is NULL in ib_sa_remove_one() IB/ehca: Release mutex in error path of alloc_small_queue_page() IB/ehca: Use default value for Local CA ACK Delay if FW returns 0 IB/ehca: Filter PATH_MIG events if QP was never armed IB/iser: Add support for RDMA_CM_EVENT_ADDR_CHANGE event RDMA/cma: Add RDMA_CM_EVENT_TIMEWAIT_EXIT event RDMA/cma: Add RDMA_CM_EVENT_ADDR_CHANGE event	2008-07-24 12:56:07 -07:00
Andrea Righi	27ac792ca0	PAGE_ALIGN(): correctly handle 64-bit values on 32-bit architectures On 32-bit architectures PAGE_ALIGN() truncates 64-bit values to the 32-bit boundary. For example: u64 val = PAGE_ALIGN(size); always returns a value < 4GB even if size is greater than 4GB. The problem resides in PAGE_MASK definition (from include/asm-x86/page.h for example): #define PAGE_SHIFT 12 #define PAGE_SIZE (_AC(1,UL) << PAGE_SHIFT) #define PAGE_MASK (~(PAGE_SIZE-1)) ... #define PAGE_ALIGN(addr) (((addr)+PAGE_SIZE-1)&PAGE_MASK) The "~" is performed on a 32-bit value, so everything in "and" with PAGE_MASK greater than 4GB will be truncated to the 32-bit boundary. Using the ALIGN() macro seems to be the right way, because it uses typeof(addr) for the mask. Also move the PAGE_ALIGN() definitions out of include/asm-*/page.h in include/linux/mm.h. See also lkml discussion: http://lkml.org/lkml/2008/6/11/237 [akpm@linux-foundation.org: fix drivers/media/video/uvc/uvc_queue.c] [akpm@linux-foundation.org: fix v850] [akpm@linux-foundation.org: fix powerpc] [akpm@linux-foundation.org: fix arm] [akpm@linux-foundation.org: fix mips] [akpm@linux-foundation.org: fix drivers/media/video/pvrusb2/pvrusb2-dvb.c] [akpm@linux-foundation.org: fix drivers/mtd/maps/uclinux.c] [akpm@linux-foundation.org: fix powerpc] Signed-off-by: Andrea Righi <righi.andrea@gmail.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:21 -07:00
Roland Dreier	7644264082	mlx4_core: Improve error message when not enough UAR pages are available If an mlx4 device with default FW (which gives a UAR BAR size of 8 MB) is used in a system with 64 KB pages, then there are only 8192/64==128 UAR pages available. However, the first 128 UAR pages are reserved for use with event queue doorbells, so no UAR pages are available to do anything else with, which means that the driver cannot work. The current driver fails with a fairly cryptic "Failed to allocate driver access region, aborting" message in this situation. Fix the driver to detect the problem earlier and print out a clearer description of the problem and a suggestion of how to fix it (use a new firmware image). Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-23 08:12:47 -07:00
Roland Dreier	95d04f0735	IB/mlx4: Add support for memory management extensions and local DMA L_Key Add support for the following operations to mlx4 when device firmware supports them: - Send with invalidate and local invalidate send queue work requests; - Allocate/free fast register MRs; - Allocate/free fast register MR page lists; - Fast register MR send queue work requests; - Local DMA L_Key. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-23 08:12:26 -07:00
Roland Dreier	e4044cfc49	mlx4_core: Keep free count for MTT buddy allocator MTT entries are allocated with a buddy allocator, which just keeps bitmaps for each level of the buddy table. However, all free space starts out at the highest order, and small allocations start scanning from the lowest order. When the lowest order tables have no free space, this can lead to scanning potentially millions of bits before finding a free entry at a higher order. We can avoid this by just keeping a count of how many free entries each order has, and skipping the bitmap scan when an order is completely empty. This provides a nice performance boost for a negligible increase in memory usage. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-22 14:19:40 -07:00
Jack Morgenstein	899698dad7	mlx4_code: Add missing FW status return code Add ICM_ERROR firmware status code. In mapping to errnos, -ENFILE seems closest. This is in preparation for providing more detailed log info using mlx4_err() in low-level driver when a non-zero status is returned. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-22 14:19:39 -07:00
Jack Morgenstein	51f5f0ee22	mlx4_core: Add module parameter to enable QoS support Add a module parameter "enable_qos" to mlx4_core. If this param is set, enable support for QoS in the INIT_HCA command. By default, the parameter is set to 0 (disabled). Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-22 14:19:37 -07:00
Vladimir Sokolovsky	2d92865158	mlx4_core: Use MOD_STAT_CFG command to get minimal page size There was a bug in some versions of the mlx4 driver in mlx4_alloc_fmr(), which hardcoded the minimum acceptable page_shift to be 12. However, new ConnectX firmware can support a minimum page_shift of 9 (log_pg_sz of 9 returned by QUERY_DEV_LIM) -- so with old drivers, ib_fmr_alloc() would fail for ULPs using the device minimum when creating FMRs. To preserve firmware compatibility with released mlx4 drivers, the firmware will continue to return 12 as before for log_page_sz in QUERY_DEV_CAP for these drivers. However, to enable new drivers to take advantage of the available smaller page size, the mlx4 driver now first sets the log_pg_sz to the device minimum by setting a log_page_sz value to 0 via the MOD_STAT_CFG command and then reading the real minimum via QUERY_DEV_CAP. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-14 23:48:53 -07:00
Ron Livne	521e575b9a	IB/mlx4: Add support for blocking multicast loopback packets Add support for handling the IB_QP_CREATE_MULTICAST_BLOCK_LOOPBACK flag by using the per-multicast group loopback blocking feature of mlx4 hardware. Signed-off-by: Ron Livne <ronli@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-14 23:48:48 -07:00
Oren Duer	c5057ddccb	mlx4_core: Support creation of FMRs with pages smaller than 4K Don't hard code a test against a minimum page shift of 12, since the device may support smaller pages. Test against the actual smallest page size from the device capabilities. Signed-off-by: Oren Duer <oren@mellanox.co.il> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-05-05 15:56:52 -07:00
Olaf Kirch	bbdc2821db	mlx4_core: Avoid recycling old FMR R_Keys too soon When a FMR is unmapped, mlx4 resets the map count to 0, and clears the upper part of the R_Key which is used as the sequence counter. This poses a problem for RDS, which uses ib_fmr_unmap as a fence operation. RDS assumes that after issuing an unmap, the old R_Keys will be invalid for a "reasonable" period of time. For instance, Oracle processes uses shared memory buffers allocated from a pool of buffers. When a process dies, we want to reclaim these buffers -- but we must make sure there are no pending RDMA operations to/from those buffers. The only way to achieve that is by using unmap and sync the TPT. However, when the sequence count is reset on unmap, there is a high likelihood that a new mapping will be given the same R_Key that was issued a few milliseconds ago. To prevent this, don't reset the sequence count when unmapping a FMR. Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-04-29 13:46:53 -07:00
Yevgeny Petrilin	e463c7b197	mlx4_core: Add a way to set the "collapsed" CQ flag Extend the mlx4_cq_resize() API with a way to set the "collapsed" flag for the CQ being created. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-04-29 13:46:50 -07:00
Yevgeny Petrilin	ed4d3c1061	mlx4_core: Add helper to move QP to ready-to-send Avoid duplicating code in ethernet and FC modules. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-04-25 14:52:32 -07:00
Yevgeny Petrilin	38ae6a5354	mlx4_core: Add HW queues allocation helpers Wrap doorbell, buffer and MTT allocation in helper functions for ethernet and FC modules to use. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-04-25 14:27:08 -07:00
Vladimir Sokolovsky	f5b3a096b1	mlx4_core: CQ resizing should pass a 0 opcode modifier to MODIFY_CQ The call to mlx4_MODIFY_CQ() had a typo so that mlx4_cq_resize() was actually asking the FW to modify a CQ's interrupt moderation rather than asking it to resize a CQ. Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-04-23 11:55:45 -07:00
Yevgeny Petrilin	6296883ca4	mlx4_core: Move kernel doorbell management into core In addition to mlx4_ib, there will be ethernet and FC consumers of mlx4_core, so move the code for managing kernel doorbells into the core module to avoid having to duplicate this multiple times. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-04-23 11:55:45 -07:00
Eli Cohen	4ff08a76bc	IB/mlx4: Fix incorrect comment mlx4 hardware does not support external DDR memory. Moreover, UAR area (BAR 2) can change depending on FW version. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-04-16 21:09:35 -07:00
Eli Cohen	4dc51b3258	IB/mlx4: Fix race when detaching a QP from a multicast group When detaching the last QP from an MCG entry, we need to make sure that at any time, there will be no entry with zero number of QPs which is linked to the list of the MCGs of the corresponding hash index. So don't write back the MCG entry if we are removing the last QP; just unlink the entry. Also, remove an unnecessary MCG read when attaching a QP requires allocation of a new entry in the AMGM. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-04-16 21:09:35 -07:00
Vladimir Sokolovsky	bbf8eed1a0	IB/mlx4: Add support for resizing CQs Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-04-16 21:09:33 -07:00
Eli Cohen	3fdcb97f0b	IB/mlx4: Add support for modifying CQ moderation parameters Signed-off-by: Eli Cohen <eli@mellnaox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-04-16 21:09:33 -07:00
Jack Morgenstein	9b1f38515c	mlx4_core: Increase max number of QPs to 128K With the advent large clusters which utilize multicore hosts, 64K QPs is not enough. We should increase the default maximum for QPs to 128K. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-04-16 21:09:32 -07:00
Eli Cohen	b832be1e40	IB/mlx4: Add IPoIB LSO support Add TSO support to the mlx4_ib driver. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-04-16 21:09:27 -07:00
Eli Cohen	8ff095ec4b	IB/mlx4: Add IPoIB checksum offload support ConnectX devices support checksum generation and verification of TCP and UDP packets for UD IPoIB messages. This patch checks if the HCA supports this and sets the IB_DEVICE_UD_IP_CSUM capability flag if it does. It implements support for handling the IB_SEND_IP_CSUM send flag and setting the csum_ok field in receive work completions. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Ali Ayub <ali@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-04-16 21:01:10 -07:00
Roland Dreier	37608eea86	mlx4_core: Fix confusion between mlx4_event and mlx4_dev_event enums The struct mlx4_interface.event() method was supposed to get an enum mlx4_dev_event, but the driver code was actually passing in the hardware enum mlx4_event values. Fix up the callers of mlx4_dispatch_event() so that they pass in the right type of value, and fix up the event method in mlx4_ib so that it can handle the enum mlx4_dev_event values. This eliminates the need for the subtype parameter to the event method, so remove it. This also fixes the sparse warning drivers/net/mlx4/intf.c:127:48: warning: mixing different enum types drivers/net/mlx4/intf.c:127:48: int enum mlx4_event versus drivers/net/mlx4/intf.c:127:48: int enum mlx4_dev_event Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-04-16 21:01:08 -07:00
Roland Dreier	ca28121114	mlx4_core: Move opening brace of function onto a new line Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-04-16 21:01:04 -07:00
Jack Morgenstein	11e75a7455	mlx4_core: Move table_find from fmr_alloc to fmr_enable mlx4_table_find (for FMR MPTs) requires that ICM memory already be mapped. Before this fix, FMR allocation depended on ICM memory already being mapped for the MPT entry. If all currently mapped entries are taken, the find operation fails (even if the MPT ICM table still had more entries, which were just not mapped yet). This fix moves the mpt find operation to fmr_enable, to guarantee that any required ICM memory mapping has already occurred. Found by Oren Duer of Mellanox. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-02-14 10:43:48 -08:00
Olof Johansson	29c271123d	mlx4_core: Fix build break (missing include) Commit `313abe55` ("mlx4_core: For 64-bit systems, vmap() kernel queue buffers") caused this to pop up on powerpc allyesconfig, looks like a missing include file: drivers/net/mlx4/alloc.c: In function 'mlx4_buf_alloc': drivers/net/mlx4/alloc.c:162: error: implicit declaration of function 'vmap' drivers/net/mlx4/alloc.c:162: error: 'VM_MAP' undeclared (first use in this function) drivers/net/mlx4/alloc.c:162: error: (Each undeclared identifier is reported only once drivers/net/mlx4/alloc.c:162: error: for each function it appears in.) drivers/net/mlx4/alloc.c:162: warning: assignment makes pointer from integer without a cast drivers/net/mlx4/alloc.c: In function 'mlx4_buf_free': drivers/net/mlx4/alloc.c:187: error: implicit declaration of function 'vunmap' Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-02-11 14:19:42 -08:00
Roland Dreier	b57aacfa7a	mlx4_core: Clean up struct mlx4_buf Now that struct mlx4_buf.u is a struct instead of a union because of the vmap() changes, there's no point in having a struct at all. So move .direct and .page_list directly into struct mlx4_buf and get rid of a bunch of unnecessary ".u"s. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-02-06 21:17:59 -08:00
Jack Morgenstein	313abe55a8	mlx4_core: For 64-bit systems, vmap() kernel queue buffers Since kernel virtual memory is not a problem on 64-bit systems, there is no reason to use our own 2-layer page mapping scheme for large kernel queue buffers on such systems. Instead, map the page list to a single virtually contiguous buffer with vmap(), so that can we access buffer memory via direct indexing. Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-02-06 21:17:45 -08:00
Roland Dreier	f33afc26dc	IB: Avoid marking __devinitdata as const Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-02-04 20:20:44 -08:00
Jack Morgenstein	893da75956	mlx4_core: Don't read reserved fields in mlx4_QUERY_ADAPTER() The firmware QUERY_ADAPTER command does not return vendor_id, device_id, and revision_id; eliminate these fields from the query. Initialize the rev_id field of the mlx4 device via init_node_data (MAD IFC query), as is done in the query_device verb implementation. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-02-04 20:20:43 -08:00
Roland Dreier	e8f9b2ed98	mlx4_core: Fix more section mismatches Commit `3d73c288` ("mlx4_core: Fix section mismatches") fixed some of the section mismatches introduced when error recovery was added, but there were still more cases of errory recovery code calling into __devinit code from regular .text. Fix this by getting rid of the now-incorrect __devinit annotations. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-02-04 20:20:41 -08:00
Jack Morgenstein	5920869f1e	mlx4_core: Fix max_eqs masking in QUERY_DEV_CAP log_max_eqs is a 4-bit field, not a 3-bit field in the response to the QUERY_DEV_CAP FW command, so we should mask with 0xf instead of 0x7 when reading it. Found by Yossi Leybovitch of Mellanox. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:29 -08:00
Jack Morgenstein	9ed87fd34c	mlx4_core: Fix state check in mlx4_qp_modify() When checking the states passed in, mlx4_qp_modify() accidentally checks cur_state twice rather than checking cur_state and new_state. Fix this to make sure that both values are in-bounds. Since these values may be passed in from userspace, this bug results in userspace being able to trigger an oops. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Cc: stable <stable@kernel.org> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-11-20 13:01:28 -08:00
Jack Morgenstein	e383d19e90	mlx4_core: Fix thinko in QP destroy (incorrect bitmap_free) Fix thinko in commit `eaf559bf` ("mlx4_core: Don't free special QPs in QP number bitmap"). The old commit had the logic exactly backwards and ended up freeing only special QPs, which not only left the original bug in place but also introduced the problem that the QP number bitmap would get full after a while. Found by Dotan Barak of Mellanox. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-11-14 08:20:03 -08:00
Ali Ayoub	3bba11e5c4	mlx4_core: Fix possible bad free in mlx4_buf_free() When mlx4_buf_free() is called from the error path of mlx4_buf_alloc(), it may be passed a buffer structure that does not have all pages filled in. Add a check for NULL to mlx4_buf_free() so we avoid passing NULL to dma_free_coherent() (which will crash). Signed-off-by: Ali Ayoub <ali@mellanox.co.il> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-11-13 15:26:57 -08:00
Jens Axboe	642f149031	SG: Change sg_set_page() to take length and offset argument Most drivers need to set length and offset as well, so may as well fold those three lines into one. Add sg_assign_page() for those two locations that only needed to set the page, where the offset/length is set outside of the function context. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-10-24 11:20:47 +02:00
Linus Torvalds	0b776eb542	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: mlx4_core: Increase command timeout for INIT_HCA to 10 seconds IPoIB/cm: Use common CQ for CM send completions IB/uverbs: Fix checking of userspace object ownership IB/mlx4: Sanity check userspace send queue sizes IPoIB: Rewrite "if (!likely(...))" as "if (unlikely(!(...)))" IB/ehca: Enable large page MRs by default IB/ehca: Change meaning of hca_cap_mr_pgsize IB/ehca: Fix ehca_encode_hwpage_size() and alloc_fmr() IB/ehca: Fix masking error in {,re}reg_phys_mr() IB/ehca: Supply QP token for SRQ base QPs IPoIB: Use round_jiffies() for ah_reap_task RDMA/cma: Fix deadlock destroying listen requests RDMA/cma: Add locking around QP accesses IB/mthca: Avoid alignment traps when writing doorbells mlx4_core: Kill mlx4_write64_raw()	2007-10-23 09:56:11 -07:00
Jens Axboe	45711f1af6	[SG] Update drivers to use sg helpers Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-10-22 21:19:53 +02:00
Jack Morgenstein	77109cc282	mlx4_core: Increase command timeout for INIT_HCA to 10 seconds The current INIT_HCA firmware command timeout is sufficient for the default number of resources (QPs, CQs, etc) being allocated, but if the HCA profile is modified to increase the amount of resources, then a spurious timeout is detected and HCA initialization fails. Increase the timeout for the INIT_HCA command to 10 seconds, which also brings it into line with all the other command timeouts. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-21 15:06:04 -07:00
Roland Dreier	b027cacdab	mlx4_core: Fix infinite loop on device initialization Commit `3d73c288` ("mlx4_core: Fix section mismatches") introduced a stupid bug in device init: when some of mlx4_init_one() was split off into __mlx4_init_one(), the call from the main mlx4_init_one() function was back to mlx4_init_one() rather than to __mlx4_init_one(), which leads to an obvious infinite loop if the function is every called. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-13 14:10:50 -07:00
Roland Dreier	3d73c2884f	mlx4_core: Fix section mismatches Commit `ee49bd93` ("mlx4_core: Reset device when internal error is detected") introduced some section mismatch problems when CONFIG_HOTPLUG=n, because the error recovery code tears down and reinitializes the device after everything is loaded, which ends up calling into lots of code marked __devinit and __devexit from regular .text. Fix this by getting rid of these now-incorrect section markers. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-10 15:43:54 -07:00
Roland Dreier	2e61c646ed	mlx4_core: Use mmiowb() to avoid firmware commands getting jumbled up Firmware commands are sent to the HCA by writing multiple words to a command register block. Access to this block of registers is serialized with a mutex. However, on large SGI systems writes to the register block may be reordered within the system interconnect and reach the HCA in a different order than they were issued (even with the mutex). Fix this by adding an mmiowb() before dropping the mutex. This bug was observed with real workloads with the similar FW command code in the mthca driver, and adding the mmiowb() as in commit 66547550 ("IB/mthca: Use mmiowb() to avoid firmware commands getting jumbled up") was confirmed to fix the problems, so we should add the same fix to mlx4. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:18 -07:00
Jack Morgenstein	e57ac0c297	mlx4_core: Increase max number of QPs per multicast group to 56 Increase the number of QPs allowed per multicast group from 8 to 56. This allows for one QP per core on 16-core systems, which are now quite common, and allows some space for future growth. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:16 -07:00
Jack Morgenstein	8ad11fb6b0	IB/mlx4: Implement FMRs Implement FMRs for mlx4. This is an adaptation of code from mthca. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:16 -07:00
Jack Morgenstein	d7bb58fb1c	mlx4_core: Write MTTs from CPU instead with of WRITE_MTT FW command Write MTT entries directly to ICM from the driver (eliminating use of WRITE_MTT command). This reduces the number of FW commands needed to register an MR by at least a factor of 2 and speeds up memory registration significantly. This code will also be used to implement FMRs. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:16 -07:00
Roland Dreier	121964ec38	mlx4_core: Fix meaning of dev->caps.reserved_mtts Everything that uses caps.reserved_mtts expects it to be a count of MTT segments, not MTT entries. So convert the value that the FW gives us to a count of segments. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:16 -07:00
Roland Dreier	cf78237d7b	mlx4_core: Reserve the correct number of MTT segments Taking ilog2(dev->caps.reserved_mtts) to find out the order to pass to the MTT buddy allocator will do the wrong thing if reserved_mtts is ever not a power of 2. Be safe and use fls(dev->caps.reserved_mtts - 1). Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:16 -07:00
Jack Morgenstein	5b0bf5e25e	mlx4_core: Support ICM tables in coherent memory Enable having ICM tables in coherent memory, and use coherent memory for the dMPT table. This will allow writing MPT entries for MRs both via the SW2HW_MPT command and also directly by the driver for FMR remapping without needing to flush or worry about cacheline boundaries. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:15 -07:00
Jack Morgenstein	cd9281d873	IB/mlx4: Display misc device information under /sys/class/infiniband/ display the following device information under /sys/class/infiniband/mlx4_X: board_id, fw_ver, hw_rev, hca_type. This patch makes this information available to userspace utilities such as ibstat and ibv_devinfo. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:14 -07:00
Roland Dreier	ea98054fef	mlx4_core: Change capability decoding: SRC->XRC The SRC ("scalable RC") transport has been renamed to XRC ("extended RC"), to avoid having an abbreviation that is so easily confused with an abbreviation for "source." Update the HCA capability decoding output to use the new name. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:13 -07:00
Roland Dreier	d7dc3ccbe4	IB/mlx4: Fix up SRQ limit_watermark endianness mlx4_srq_query() returns a big-endian 16-bit value through an int *, which screws up sparse checking. Fix this so that a CPU-endian value is returned. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:06 -07:00
Michael S. Tsirkin	08fb105540	mlx4_core: Enable MSI-X by default Recover from MSI-X errors by automatically falling back on regular interrupt, instead of asking the user to do this manually. This makes it possible to enable MSI-X by default, and will make it possible to get rid of the msi_x module option in the future. Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:05 -07:00
Roland Dreier	eaf559bf56	mlx4_core: Don't free special QPs in QP number bitmap Special QPs are not allocated using the regular QP number bitmap, so when they are destroyed, their QP number should not be freed in the bitmap. Found by Dotan Barak of Mellanox. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:05 -07:00
Dotan Barak	36ce10d3e8	mlx4_core: Use enum value GO_BIT_TIMEOUT_MSECS Rename GO_BIT_TIMEOUT to GO_BIT_TIMEOUT_MSECS for clarity, and actually use it as the go bit timeout (instead of having the define but then ignoring it and using a hard-coded 10 * HZ for the actual timeout). Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:04 -07:00
Eli Cohen	947b2a8083	mlx4_core: Wait 1 second after reset before accessing device Put a 1000 msec delay after resetting the device before attempting to do config cycles on it. Not waiting causes system hangs on some chipsets, e.g. Intel E7520, when the driver is loaded. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-08-13 08:47:44 -07:00
Jack Morgenstein	0172e2e14c	mlx4_core: Remove kfree() in mlx4_mr_alloc() error flow mlx4_mr_alloc() doesn't actually allocate mr (it just initializes the pointer that the caller passes in), so it shouldn't free it if an error occurs. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-28 08:30:45 -07:00
Roland Dreier	0981582dbf	mlx4_core: Change command token on timeout The FW command token is currently only updated on a command completion event. This means that on command timeout, the same token will be reused for new command, which results in a mess if the timed out command does eventually complete. This is the same change as the patch for mthca from Michael S. Tsirkin <mst@dev.mellanox.co.il> that was just merged. It seems sensible to avoid gratuitous differences in FW command processing between mthca and mlx4. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-20 21:19:43 -07:00
Jack Morgenstein	c9f2ba5ed2	IB/mlx4: Increase max outstanding RDMA reads as target Change the maximum number of outstanding RDMA reads allowed as a target from 4 to 16 to per QP. This allows RDMA read operations to pipeline better. Pointed out by Dotan Barak and Sagi Rotem. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-17 20:50:50 -07:00
Jack Morgenstein	ee49bd9397	mlx4_core: Reset device when internal error is detected Reset the device when an internal error is detected. Also, detect errors by polling the error buffer rather than using interrupts. This is more robust and doesn't depend on MSI-X. Remove the old interrupt handler entirely, since we don't want to support two mechanisms for detecting internal errors. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-17 18:37:42 -07:00
Jack Morgenstein	65541cb7cf	IB/mlx4: Implement query SRQ Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-12 15:41:24 -07:00
Jack Morgenstein	6a775e2ba4	IB/mlx4: Implement query QP Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-12 15:41:00 -07:00
Dotan Barak	149983af60	mlx4_core: Get the maximum message size from reported device capabilities Get the maximum message size from the device capabilities returned from the QUERY_DEV_CAP firmware command, rather than hard-coding 2 GB. Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-09 20:12:26 -07:00
Michael S. Tsirkin	525f5f44c4	mlx4_core: Include linux/mutex.h from mlx4.h mlx4.h uses struct mutex, so although <linux/mutex.h> seems to be pulled in indirectly by one of the headers it includes, the right thing to do is to include <linux/mutex.h> directly. Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-09 20:12:20 -07:00
Bill Nottingham	287aa83dff	drivers/net: fix comparisons of unsigned < 0 Recent gcc versions emit warnings when unsigned variables are compared < 0 or >= 0. Signed-off-by: Bill Nottingham <notting@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-07-08 22:16:39 -04:00
Jack Morgenstein	786f238e4f	mlx4_core: Add new Mellanox device IDs Add new IDs for PCIe gen2 devices. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-02 20:41:35 -07:00
Roland Dreier	5ae2a7a836	IB/mlx4: Handle FW command interface rev 3 Upcoming firmware introduces command interface revision 3, which changes the way port capabilities are queried and set. Update the driver to handle both the new and old command interfaces by adding a new MLX4_FLAG_OLD_PORT_CMDS that it is set after querying the firmware interface revision and then using the correct interface based on the setting of the flag. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-06-18 08:15:02 -07:00
Roland Dreier	0e6e741621	IB/mlx4: Handle new FW requirement for send request prefetching New ConnectX firmware introduces FW command interface revision 2, which requires that for each QP, a chunk of send queue entries (the "headroom") is kept marked as invalid, so that the HCA doesn't get confused if it prefetches entries that haven't been posted yet. Add code to the driver to do this, and also update the user ABI so that userspace can request that the prefetcher be turned off for userspace QPs (we just leave the prefetcher on for all kernel QPs). Unfortunately, marking send queue entries this way is confuses older firmware, so we change the driver to allow only FW command interface revisions 2. This means that users will have to update their firmware to work with the new driver, but the firmware is changing quickly and the old firmware has lots of other bugs anyway, so this shouldn't be too big a deal. Based on a patch from Jack Morgenstein <jackm@dev.mellanox.co.il>. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-06-18 08:13:48 -07:00
Jack Morgenstein	b2d9308ae4	mlx4_core: Don't set MTT address in dMPT entries with PA set If a dMPT entry has the PA flag (direct physical address) set, then the (unused) MTT base address field has to be set to 0. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-06-07 23:24:38 -07:00
Roland Dreier	fe40900f40	mlx4_core: Check firmware command interface revision HCA firmware with incompatible changes to the FW commmand interface is coming soon. Add a check of the interface revision during initialization and bail out if the firmware advertises a revision that the driver doesn't know about. This will avoid strange failures later if the driver goes on using the wrong interface revision. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-06-07 23:24:36 -07:00
Roland Dreier	3e1db334dc	IB/mthca, mlx4_core: Fix typo in comment s/signifant/significant/ Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-06-07 11:51:59 -07:00
Roland Dreier	2c5cb23558	mlx4_core: Free catastrophic error MSI-X interrupt with correct dev_id We need to pass the same dev_id to free_irq() and request_irq(). When using MSI-X, the MLX4_EQ_CATAS interrupt uses a different dev_id from the other interrupts. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-06-07 11:51:58 -07:00
Roland Dreier	b581401ed0	mlx4_core: Initialize ctx_list and ctx_lock earlier We may call mlx4_dispatch_event() before mlx4_register_device() is called for a device, because for example a catastrophic error happens immediately after we enable interrupts. Therefore priv->ctx_list and priv->ctx_lock need to be initialized earlier. This bug was actually exposed by the MSI-X bug that returned IRQ numbers to drivers in reverse order, so that the first FW command interrupt looked to mlx4 like a catastrophic error. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-06-07 11:51:58 -07:00
Eli Cohen	09360d5408	mlx4_core: Fix CQ context layout The reserved6 field should be 64 bits, not just 16 bits. Without this, the structure does not match the hardware layout on 32-bit architectures: the db_rec_addr field ends up at offset 52 instead of offset 56. The bug slipped by because the alignment of __be64 members ends up putting it in the right place on x86-64. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-06-07 11:51:57 -07:00
Roland Dreier	a2cb4a98f2	IB/mlx4: Fix last allocated object tracking in bitmap allocator Set last allocated object to the object after the one just allocated before ORing in the extra top bits. Also handle the case where this wraps around. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-05-29 16:07:09 -07:00
Roland Dreier	23c15c21d3	mlx4_core: Fix array overrun in dump_dev_cap_flags() Don't overrun fname[] array when decoding device flags. This was spotted by the Coverity checker (CID 1642). Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-05-19 08:51:57 -07:00
Al Viro	9cbe05c712	missing includes in mlx4 Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Jeff Garzik <jeff@garzik.org> Acked-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-15 18:56:37 -07:00
Linus Torvalds	de7860c3f3	Merge branch 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband: IPoIB/cm: Optimize stale connection detection IB/mthca: Set cleaned CQEs back to HW ownership when cleaning CQ IB/mthca: Fix posting >255 recv WRs for Tavor RDMA/cma: Add check to validate that cm_id is bound to a device RDMA/cma: Fix synchronization with device removal in cma_iw_handler RDMA/cma: Simplify device removal handling code IB/ehca: Disable scaling code by default, bump version number IB/ehca: Beautify sysfs attribute code and fix compiler warnings IB/ehca: Remove _irqsave, move #ifdef IB/ehca: Fix AQP0/1 QP number IB/ehca: Correctly set GRH mask bit in ehca_modify_qp() IB/ehca: Serialize hypervisor calls in ehca_register_mr() IB/ipath: Shadow the gpio_mask register IB/mlx4: Fix uninitialized spinlock for 32-bit archs mlx4_core: Remove unused doorbell_lock net: Trivial MLX4_DEBUG dependency fix.	2007-05-15 09:52:31 -07:00
Roland Dreier	20eebcf09c	mlx4_core: Remove unused doorbell_lock struct mlx4_priv.doorbell_lock is never used, so delete it. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-05-13 08:54:18 -07:00
Andrew Morton	4093785dd1	mlx4: don't use deprecated IRQ flags Cc: Jeff Garzik <jeff@garzik.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-05-11 17:53:31 -04:00
Roland Dreier	225c7b1fee	IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters Add an InfiniBand driver for Mellanox ConnectX adapters. Because these adapters can also be used as ethernet NICs and Fibre Channel HBAs, the driver is split into two modules: mlx4_core: Handles low-level things like device initialization and processing firmware commands. Also controls resource allocation so that the InfiniBand, ethernet and FC functions can share a device without stepping on each other. mlx4_ib: Handles InfiniBand-specific things; plugs into the InfiniBand midlayer. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-05-08 18:00:38 -07:00

1 2 3 4 5

235 Commits