linux

Author	SHA1	Message	Date
Roland Dreier	0559d8dc13	IB/core: Don't return EINVAL from sysfs rate attribute for invalid speeds Commit `e9319b0cb0` ("IB/core: Fix SDR rates in sysfs") changed our sysfs rate attribute to return EINVAL to userspace if the underlying device driver returns an invalid rate. Apparently some drivers do this when the link is down and some userspace pukes if it gets an error when reading this attribute, so avoid a regression by not return an error to match the old code. Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-04-02 10:57:31 -07:00
Linus Torvalds	0c2fe82a9b	InfiniBand/RDMA changes for the 3.4 merge window. Nothing big really stands out; by patch count lots of fixes to the mlx4 driver plus some cleanups and fixes to the core and other drivers. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABCAAGBQJPZ2THAAoJEENa44ZhAt0hMgkP/AwXjwzz6lYlIizytQ5KOH11 CT/VoHLIGIwY31aRhZRHggWLEbJi8akgfh04K9PiSh/muFRPYk5KJDoQEoJV5Odv 12krqtpZeL41YcAPq0wiyP3Ki5AZDCDumzWbOtmxE9m93mVfi3yFTTWDHFo/7SOx A2kUITdKuZhgBpCFYS23Gs8QTWcMPUlwNlM76edG90BjSySHsK15tbqTUaEOC6Ux SPG0p0c2YjlZCPmRObrCL65o5LFRVajinZ4rXN7rre4LEU+IuXgW+mQU2Kwy8z6a oMPE2l4OMSqj270r2PlVK087gGH69nKfLgLQLJiP0Id+KwdYUk7vQcA+Fu0jwjP3 Ci/ahllvB76IiaNbax0vTy2ohCJmilLLotAP46NW0vPR2/KnmVy0Mqk8pXQQddw4 tnAFNnafn/MjTty8GwqyXR/enJZs29ePcSxcYnKjXYRIgG4ldejL2wktgSra8+gb 3/qFag1HRgZzcICkXGpHj7uWJ2KDe++1m6KTb0THUmrjuiel6ATFZpYoeRed3GfS ZMqpjZvuXM8nA9CrKMkzm5vIiUFF8Qq0hUTLR38F8FiNEhmC08JqH5PqjJ3Z18if R1yG4W4xmp/0ICHg4zyg6LWV3YsweDD+jSCJpW+KNBvu3HtYb41HIlK+i2TTmtPD 3etEZZRwROAyYKcwN01p =DbLx -----END PGP SIGNATURE----- Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband Pull InfiniBand/RDMA changes for the 3.4 merge window from Roland Dreier: "Nothing big really stands out; by patch count lots of fixes to the mlx4 driver plus some cleanups and fixes to the core and other drivers." * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (28 commits) mlx4_core: Scale size of MTT table with system RAM mlx4_core: Allow dynamic MTU configuration for IB ports IB/mlx4: Fix info returned when querying IBoE ports IB/mlx4: Fix possible missed completion event mlx4_core: Report thermal error events mlx4_core: Fix one more static exported function IB: Change CQE "csum_ok" field to a bit flag RDMA/iwcm: Reject connect requests if cmid is not in LISTEN state RDMA/cxgb3: Don't pass irq flags to flush_qp() mlx4_core: Get rid of redundant ext_port_cap flags RDMA/ucma: Fix AB-BA deadlock IB/ehca: Fix ilog2() compile failure IB: Use central enum for speed instead of hard-coded values IB/iser: Post initial receive buffers before sending the final login request IB/iser: Free IB connection resources in the proper place IB/srp: Consolidate repetitive sysfs code IB/srp: Use pr_fmt() and pr_err()/pr_warn() IB/core: Fix SDR rates in sysfs mlx4: Enforce device max FMR maps in FMR alloc IB/mlx4: Set bad_wr for invalid send opcode ...	2012-03-21 10:33:42 -07:00
Roland Dreier	f0e88aeb19	Merge branches 'cma', 'cxgb3', 'cxgb4', 'ehca', 'iser', 'mad', 'nes', 'qib', 'srp' and 'srpt' into for-next	2012-03-19 09:50:33 -07:00
Steve Wise	3eae7c9f97	RDMA/iwcm: Reject connect requests if cmid is not in LISTEN state When destroying a listening cmid, the iwcm first marks the state of the cmid as DESTROYING, then releases the lock and calls into the iWARP provider to destroy the endpoint. Since the cmid is not locked, its possible for the iWARP provider to pass a connection request event to the iwcm, which will be silently dropped by the iwcm. This causes the iWARP provider to never free up the resources from this connection because the assumption is the iwcm will accept or reject this connection. The solution is to reject these connection requests. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-03-07 15:14:53 -08:00
Hefty, Sean	186834b5de	RDMA/ucma: Fix AB-BA deadlock When we destroy a cm_id, we must purge associated events from the event queue. If the cm_id is for a listen request, we also purge corresponding pending connect requests. This requires destroying the cm_id's associated with the connect requests by calling rdma_destroy_id(). rdma_destroy_id() blocks until all outstanding callbacks have completed. The issue is that we hold file->mut while purging events from the event queue. We also acquire file->mut in our event handler. Calling rdma_destroy_id() while holding file->mut can lead to a deadlock, since the event handler callback cannot acquire file->mut, which prevents rdma_destroy_id() from completing. Fix this by moving events to purge from the event queue to a temporary list. We can then release file->mut and call rdma_destroy_id() outside of holding any locks. Bug report by Or Gerlitz <ogerlitz@mellanox.com>: [ INFO: possible circular locking dependency detected ] 3.3.0-rc5-00008-g79f1e43-dirty #34 Tainted: G I tgtd/9018 is trying to acquire lock: (&id_priv->handler_mutex){+.+.+.}, at: [<ffffffffa0359a41>] rdma_destroy_id+0x33/0x1f0 [rdma_cm] but task is already holding lock: (&file->mut){+.+.+.}, at: [<ffffffffa02470fe>] ucma_free_ctx+0xb6/0x196 [rdma_ucm] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&file->mut){+.+.+.}: [<ffffffff810682f3>] lock_acquire+0xf0/0x116 [<ffffffff8135f179>] mutex_lock_nested+0x64/0x2e6 [<ffffffffa0247636>] ucma_event_handler+0x148/0x1dc [rdma_ucm] [<ffffffffa035a79a>] cma_ib_handler+0x1a7/0x1f7 [rdma_cm] [<ffffffffa0333e88>] cm_process_work+0x32/0x119 [ib_cm] [<ffffffffa03362ab>] cm_work_handler+0xfb8/0xfe5 [ib_cm] [<ffffffff810423e2>] process_one_work+0x2bd/0x4a6 [<ffffffff810429e2>] worker_thread+0x1d6/0x350 [<ffffffff810462a6>] kthread+0x84/0x8c [<ffffffff81369624>] kernel_thread_helper+0x4/0x10 -> #0 (&id_priv->handler_mutex){+.+.+.}: [<ffffffff81067b86>] __lock_acquire+0x10d5/0x1752 [<ffffffff810682f3>] lock_acquire+0xf0/0x116 [<ffffffff8135f179>] mutex_lock_nested+0x64/0x2e6 [<ffffffffa0359a41>] rdma_destroy_id+0x33/0x1f0 [rdma_cm] [<ffffffffa024715f>] ucma_free_ctx+0x117/0x196 [rdma_ucm] [<ffffffffa0247255>] ucma_close+0x77/0xb4 [rdma_ucm] [<ffffffff810df6ef>] fput+0x117/0x1cf [<ffffffff810dc76e>] filp_close+0x6d/0x78 [<ffffffff8102b667>] put_files_struct+0xbd/0x17d [<ffffffff8102b76d>] exit_files+0x46/0x4e [<ffffffff8102d057>] do_exit+0x299/0x75d [<ffffffff8102d599>] do_group_exit+0x7e/0xa9 [<ffffffff8103ae4b>] get_signal_to_deliver+0x536/0x555 [<ffffffff81001717>] do_signal+0x39/0x634 [<ffffffff81001d39>] do_notify_resume+0x27/0x69 [<ffffffff81361c03>] retint_signal+0x46/0x83 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&file->mut); lock(&id_priv->handler_mutex); lock(&file->mut); lock(&id_priv->handler_mutex); * DEADLOCK * 1 lock held by tgtd/9018: #0: (&file->mut){+.+.+.}, at: [<ffffffffa02470fe>] ucma_free_ctx+0xb6/0x196 [rdma_ucm] stack backtrace: Pid: 9018, comm: tgtd Tainted: G I 3.3.0-rc5-00008-g79f1e43-dirty #34 Call Trace: [<ffffffff81029e9c>] ? console_unlock+0x18e/0x207 [<ffffffff81066433>] print_circular_bug+0x28e/0x29f [<ffffffff81067b86>] __lock_acquire+0x10d5/0x1752 [<ffffffff810682f3>] lock_acquire+0xf0/0x116 [<ffffffffa0359a41>] ? rdma_destroy_id+0x33/0x1f0 [rdma_cm] [<ffffffff8135f179>] mutex_lock_nested+0x64/0x2e6 [<ffffffffa0359a41>] ? rdma_destroy_id+0x33/0x1f0 [rdma_cm] [<ffffffff8106546d>] ? trace_hardirqs_on_caller+0x11e/0x155 [<ffffffff810654b1>] ? trace_hardirqs_on+0xd/0xf [<ffffffffa0359a41>] rdma_destroy_id+0x33/0x1f0 [rdma_cm] [<ffffffffa024715f>] ucma_free_ctx+0x117/0x196 [rdma_ucm] [<ffffffffa0247255>] ucma_close+0x77/0xb4 [rdma_ucm] [<ffffffff810df6ef>] fput+0x117/0x1cf [<ffffffff810dc76e>] filp_close+0x6d/0x78 [<ffffffff8102b667>] put_files_struct+0xbd/0x17d [<ffffffff8102b5cc>] ? put_files_struct+0x22/0x17d [<ffffffff8102b76d>] exit_files+0x46/0x4e [<ffffffff8102d057>] do_exit+0x299/0x75d [<ffffffff8102d599>] do_group_exit+0x7e/0xa9 [<ffffffff8103ae4b>] get_signal_to_deliver+0x536/0x555 [<ffffffff810654b1>] ? trace_hardirqs_on+0xd/0xf [<ffffffff81001717>] do_signal+0x39/0x634 [<ffffffff8135e037>] ? printk+0x3c/0x45 [<ffffffff8106546d>] ? trace_hardirqs_on_caller+0x11e/0x155 [<ffffffff810654b1>] ? trace_hardirqs_on+0xd/0xf [<ffffffff81361803>] ? _raw_spin_unlock_irq+0x2b/0x40 [<ffffffff81039011>] ? set_current_blocked+0x44/0x49 [<ffffffff81361bce>] ? retint_signal+0x11/0x83 [<ffffffff81001d39>] do_notify_resume+0x27/0x69 [<ffffffff8118a1fe>] ? trace_hardirqs_on_thunk+0x3a/0x3f [<ffffffff81361c03>] retint_signal+0x46/0x83 Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-03-05 12:27:57 -08:00
Or Gerlitz	2e96691c31	IB: Use central enum for speed instead of hard-coded values The kernel IB stack uses one enumeration for IB speed, which wasn't explicitly specified in the verbs header file. Add that enum, and use it all over the code. The IB speed/width notation is also used by iWARP and IBoE HW drivers, which use the convention of rate = speed * width to advertise their port link rate. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-03-05 09:25:16 -08:00
Roland Dreier	e9319b0cb0	IB/core: Fix SDR rates in sysfs Commit `71eeba16` ("IB: Add new InfiniBand link speeds") introduced a bug where eg the rate for IB 4X SDR links iss displayed as "8.5 Gb/sec" instead of "10 Gb/sec" as it used to be. Fix that. Reported-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-02-27 09:15:08 -08:00
Pablo Neira Ayuso	80d326fab5	netlink: add netlink_dump_control structure for netlink_dump_start() Davem considers that the argument list of this interface is getting out of control. This patch tries to address this issue following his proposal: struct netlink_dump_control c = { .dump = dump, .done = done, ... }; netlink_dump_start(..., &c); Suggested by David S. Miller. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-02-26 14:10:06 -05:00
Swapna Thete	0b30704304	IB/mad: Return error response for unsupported MADs Set up a response with appropriate error status and send it for MADs that are not supported by a specific class/version. Reviewed-by: Hal Rosenstock <hal@mellanox.com> Signed-off-by: Swapna Thete <swapna.thete@qlogic.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-02-25 17:47:32 -08:00
David S. Miller	d5ef8a4d87	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/infiniband/hw/nes/nes_cm.c Simple whitespace conflict. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-02-10 23:32:28 -05:00
Roland Dreier	f36ae34238	Merge branches 'cma', 'ipath', 'misc', 'mlx4', 'nes' and 'qib' into for-next	2012-01-30 16:18:21 -08:00
Sean Hefty	9ced69ca52	RDMA/ucma: Discard all events for new connections until accepted After reporting a new connection request to user space, the rdma_ucm will discard subsequent events until the user has associated a user space idenfier with the kernel cm_id. This is needed to avoid reporting a reject/disconnect event to the user for a request that they may not have processed. The user space identifier is set once the user tries to accept the connection request. However, the following race exists in ucma_accept(): ctx->uid = cmd.uid; <events may be reported now> ret = rdma_accept(ctx->cm_id, ...); Once ctx->uid has been set, new events may be reported to the user. While the above mentioned race is avoided, there is an issue that the user _may_ receive a reject/disconnect event if rdma_accept() fails, depending on when the event is processed. To simplify the use of rdma_accept(), discard all events unless rdma_accept() succeeds. This problem was discovered based on questions from Roland Dreier <roland@purestorage.com>. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-01-27 10:07:48 -08:00
Bernd Schubert	e47e321a35	RDMA/core: Fix kernel panic by always initializing qp->usecnt We have just been investigating kernel panics related to cq->ibcq.event_handler() completion calls. The problem is that ib_destroy_qp() fails with -EBUSY. Further investigation revealed qp->usecnt is not initialized. This counter was introduced in linux-3.2 by commit `0e0ec7e063` ("RDMA/core: Export ib_open_qp() to share XRC TGT QPs") but it only gets initialized for IB_QPT_XRC_TGT, but it is checked in ib_destroy_qp() for any QP type. Fix this by initializing qp->usecnt for every QP we create. Signed-off-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de> Signed-off-by: Sven Breuner <sven.breuner@itwm.fraunhofer.de> [ Initialize qp->usecnt in uverbs too. - Sean ] Signed-off-by: Sean Hefty <sean.hefty@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-01-27 09:20:10 -08:00
David Miller	02b619555a	infiniband: Convert dst_fetch_ha() over to dst_neigh_lookup(). Now we must provide the IP destination address, and a reference has to be dropped when we're done with the entry. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-01-25 21:30:36 -05:00
Linus Torvalds	48fa57ac2c	infiniband changes for 3.3 merge window -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABCAAGBQJPBQQDAAoJEENa44ZhAt0hy1EP/A/dz741mEd2QZxYHgloK8XU sSoHvq+vDxHnOvBrDuaHXT47FoY+OSVE+ESeJJQJ9L+B6g3yacP3hNSIcguXFs8Z v011AZAeRvQx3bLu5R9+eDL+YTyotkR0sl/huoLkSwlqrEqGA85eLqf5RSQdxYZf iC1ZXfg0KTrtb6rBvohNcijpmIVEe83SWfnD/ZuCGuWq++DyVJxzECnR7p5D8a9q eMJEnIKVEIpqkqXrPQr/blVSfGQL54QuUdYtoKAS8ZW6BzjIwCGUmKdoT1vaNqX5 sIntxXMcgIgE2r0y/nDK+QIFS4U784eUevIC/LeunbhWUEQX05f3l6+V566/T9hX lvp5M6aonsSSvtqrVi6SF5rvSHFlwPpvAY3+jhjXKLpZ5OxMqf/ZlTN1xN4bin1A whGnznU+51Tjzph6Or8iXo5yExDUQhowX1Z3CYDmh/UqzKHRqaFAuiC071r8GZW3 BEOV9yf/+qPsgtXAiO4jSKlLrOJbMgEI4BoITXTO9HvZH9dHGXDYLvULdHDmFaBi XLg5zcAjou24855miv/gnBQzDc0NWW184BGS9hPE9zmbQlJr7gA4zI0Eggtj+3MO 7z/SLTrxKSfjZJR8Z3cGsnBjCs1VFqV+YQnTkyZYLORLf4F3RbDLe6aJQ+9WBA1g 86J11MjrG30erg3gbXun =8ZmW -----END PGP SIGNATURE----- Merge tag 'infiniband-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband infiniband changes for 3.3 merge window * tag 'infiniband-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: rdma/core: Fix sparse warnings RDMA/cma: Fix endianness bugs RDMA/nes: Fix terminate during AE RDMA/nes: Make unnecessarily global nes_set_pau() static RDMA/nes: Change MDIO bus clock to 2.5MHz IB/cm: Fix layout of APR message IB/mlx4: Fix SL to 802.1Q priority-bits mapping for IBoE IB/qib: Default some module parameters optimally IB/qib: Optimize locking for get_txreq() IB/qib: Fix a possible data corruption when receiving packets IB/qib: Eliminate 64-bit jiffies use IB/qib: Fix style issues IB/uverbs: Protect QP multicast list	2012-01-08 14:05:48 -08:00
Linus Torvalds	972b2c7199	Merge branch 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs * 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (165 commits) reiserfs: Properly display mount options in /proc/mounts vfs: prevent remount read-only if pending removes vfs: count unlinked inodes vfs: protect remounting superblock read-only vfs: keep list of mounts for each superblock vfs: switch ->show_options() to struct dentry * vfs: switch ->show_path() to struct dentry * vfs: switch ->show_devname() to struct dentry * vfs: switch ->show_stats to struct dentry * switch security_path_chmod() to struct path * vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb vfs: trim includes a bit switch mnt_namespace ->root to struct mount vfs: take /proc//mounts and friends to fs/proc_namespace.c vfs: opencode mntget() mnt_set_mountpoint() vfs: spread struct mount - remaining argument of next_mnt() vfs: move fsnotify junk to struct mount vfs: move mnt_devname vfs: move mnt_list to struct mount vfs: switch pnode.h macros to struct mount ...	2012-01-08 12:19:57 -08:00
Roland Dreier	1583676d9e	Merge branches 'cma', 'misc', 'mlx4', 'nes', 'qib' and 'uverbs' into for-next	2012-01-04 09:18:20 -08:00
Sean Hefty	c89d1bedf8	rdma/core: Fix sparse warnings Clean up sparse warnings in the rdma core layer. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-01-04 09:17:45 -08:00
Sean Hefty	46ea5061c7	RDMA/cma: Fix endianness bugs Fix endianness bugs reported by sparse in the RDMA core stack. Note that these are real bugs, but don't affect any existing code to the best of my knowledge. The mlid issue would only affect kernel users of rdma_join_multicast which have the rdma_cm attach/detach its QP. There are no current in tree users that do this. (rdma_join_multicast may be used called by user space applications, which does not have this issue.) And the pkey setting is simply returned as informational. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-01-04 09:13:52 -08:00
Eli Cohen	6f233d300d	IB/cm: Fix layout of APR message Add a missing 16-bit reserved field between ap_status and info fields. Signed-off-by: Eli Cohen <eli@mellanox.com> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-01-03 21:04:18 -08:00
Eli Cohen	e214a0fe2b	IB/uverbs: Protect QP multicast list Userspace verbs multicast attach/detach operations on a QP are done while holding the rwsem of the QP for reading. That's not sufficient since a reader lock allows more than one reader to acquire the lock. However, multicast attach/detach does list manipulation that can corrupt the list if multiple threads run in parallel. Fix this by acquiring the rwsem as a writer to serialize attach/detach operations. Add idr_write_qp() and put_qp_write() to encapsulate this. This fixes oops seen when running applications that perform multicast joins/leaves. Reported by: Mike Dubman <miked@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Cc: <stable@kernel.org> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-01-03 20:36:48 -08:00
Al Viro	2c9ede55ec	switch device_get_devnode() and ->devnode() to umode_t * both callers of device_get_devnode() are only interested in lower 16bits and nobody tries to return anything wider than 16bit anyway. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-01-03 22:54:55 -05:00
David S. Miller	abb434cb05	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: net/bluetooth/l2cap_core.c Just two overlapping changes, one added an initialization of a local variable, and another change added a new local variable. Signed-off-by: David S. Miller <davem@davemloft.net>	2011-12-23 17:13:56 -05:00
Sean Hefty	04ded16724	RDMA/cma: Verify private data length private_data_len is defined as a u8. If the user specifies a large private_data size (> 220 bytes), we will calculate a total length that exceeds 255, resulting in private_data_len wrapping back to 0. This can lead to overwriting random kernel memory. Avoid this by verifying that the resulting size fits into a u8. Reported-by: B. Thery <benjamin.thery@bull.net> Addresses: <http://bugs.openfabrics.org/bugzilla/show_bug.cgi?id=2335> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-12-19 09:15:33 -08:00
David Miller	51d4597451	infiniband: addr: Consolidate code to fetch neighbour hardware address from dst. IPV4 should do exactly what the IPV6 code does here, which is use the neighbour obtained via the dst entry. And now that the two code paths do the same thing, use a common helper function to perform the operation. Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Roland Dreier <roland@purestorage.com>	2011-12-05 15:20:19 -05:00
David Miller	2721745501	net: Rename dst_get_neighbour{, _raw} to dst_get_neighbour_noref{, _raw}. To reflect the fact that a refrence is not obtained to the resulting neighbour entry. Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Roland Dreier <roland@purestorage.com>	2011-12-05 15:20:19 -05:00
David S. Miller	b3613118eb	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2011-12-02 13:49:21 -05:00
Eric Dumazet	580da35a31	IB: Fix RCU lockdep splats Commit `f2c31e32b3` ("net: fix NULL dereferences in check_peer_redir()") forgot to take care of infiniband uses of dst neighbours. Many thanks to Marc Aurele who provided a nice bug report and feedback. Reported-by: Marc Aurele La France <tsi@ualberta.ca> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: David Miller <davem@davemloft.net> Cc: <stable@kernel.org> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-11-29 13:37:11 -08:00
Alexey Dobriyan	4e3fd7a06d	net: remove ipv6_addr_copy() C assignment can handle struct in6_addr copying. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-22 16:43:32 -05:00
Linus Torvalds	32aaeffbd4	Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux * 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits) Revert "tracing: Include module.h in define_trace.h" irq: don't put module.h into irq.h for tracking irqgen modules. bluetooth: macroize two small inlines to avoid module.h ip_vs.h: fix implicit use of module_get/module_put from module.h nf_conntrack.h: fix up fallout from implicit moduleparam.h presence include: replace linux/module.h with "struct module" wherever possible include: convert various register fcns to macros to avoid include chaining crypto.h: remove unused crypto_tfm_alg_modname() inline uwb.h: fix implicit use of asm/page.h for PAGE_SIZE pm_runtime.h: explicitly requires notifier.h linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h miscdevice.h: fix up implicit use of lists and types stop_machine.h: fix implicit use of smp.h for smp_processor_id of: fix implicit use of errno.h in include/linux/of.h of_platform.h: delete needless include <linux/module.h> acpi: remove module.h include from platform/aclinux.h miscdevice.h: delete unnecessary inclusion of module.h device_cgroup.h: delete needless include <linux/module.h> net: sch_generic remove redundant use of <linux/module.h> net: inet_timewait_sock doesnt need <linux/module.h> ... Fix up trivial conflicts (other header files, and removal of the ab3550 mfd driver) in - drivers/media/dvb/frontends/dibx000_common.c - drivers/media/video/{mt9m111.c,ov6650.c} - drivers/mfd/ab3550-core.c - include/linux/dmaengine.h	2011-11-06 19:44:47 -08:00
Linus Torvalds	f470f8d4e7	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (62 commits) mlx4_core: Deprecate log_num_vlan module param IB/mlx4: Don't set VLAN in IBoE WQEs' control segment IB/mlx4: Enable 4K mtu for IBoE RDMA/cxgb4: Mark QP in error before disabling the queue in firmware RDMA/cxgb4: Serialize calls to CQ's comp_handler RDMA/cxgb3: Serialize calls to CQ's comp_handler IB/qib: Fix issue with link states and QSFP cables IB/mlx4: Configure extended active speeds mlx4_core: Add extended port capabilities support IB/qib: Hold links until tuning data is available IB/qib: Clean up checkpatch issue IB/qib: Remove s_lock around header validation IB/qib: Precompute timeout jiffies to optimize latency IB/qib: Use RCU for qpn lookup IB/qib: Eliminate divide/mod in converting idx to egr buf pointer IB/qib: Decode path MTU optimization IB/qib: Optimize RC/UC code by IB operation IPoIB: Use the right function to do DMA unmap pages RDMA/cxgb4: Use correct QID in insert_recv_cqe() RDMA/cxgb4: Make sure flush CQ entries are collected on connection close ...	2011-11-01 10:51:38 -07:00
Roland Dreier	504255f8d0	Merge branches 'amso1100', 'cma', 'cxgb3', 'cxgb4', 'fdr', 'ipath', 'ipoib', 'misc', 'mlx4', 'misc', 'nes', 'qib' and 'xrc' into for-next	2011-11-01 09:37:08 -07:00
Christoph Lameter	bc3e53f682	mm: distinguish between mlocked and pinned pages Some kernel components pin user space memory (infiniband and perf) (by increasing the page count) and account that memory as "mlocked". The difference between mlocking and pinning is: A. mlocked pages are marked with PG_mlocked and are exempt from swapping. Page migration may move them around though. They are kept on a special LRU list. B. Pinned pages cannot be moved because something needs to directly access physical memory. They may not be on any LRU list. I recently saw an mlockalled process where mm->locked_vm became bigger than the virtual size of the process (!) because some memory was accounted for twice: Once when the page was mlocked and once when the Infiniband layer increased the refcount because it needt to pin the RDMA memory. This patch introduces a separate counter for pinned pages and accounts them seperately. Signed-off-by: Christoph Lameter <cl@linux.com> Cc: Mike Marciniszyn <infinipath@qlogic.com> Cc: Roland Dreier <roland@kernel.org> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-10-31 17:30:46 -07:00
Paul Gortmaker	b108d9764c	infiniband: add in export.h for files using EXPORT_SYMBOL/THIS_MODULE These were getting it implicitly via device.h --> module.h but we are going to stop that when we clean up the headers. Fix these in advance so the tree remains biscect-clean. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2011-10-31 19:31:35 -04:00
Paul Gortmaker	e4dd23d753	infiniband: Fix up module files that need to include module.h They had been getting it implicitly via device.h but we can't rely on that for the future, due to a pending cleanup so fix it now. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2011-10-31 19:31:35 -04:00
Paul Gortmaker	fc87af74af	infiniband: Fix up users implicitly relying on getting stat.h They get it via module.h (via device.h) but we want to clean that up. When we do, we'll get things like: CC [M] drivers/infiniband/core/sysfs.o sysfs.c:361: error: 'S_IRUGO' undeclared here (not in a function) sysfs.c:654: error: 'S_IWUSR' undeclared here (not in a function) so add in the stat header it is using explicitly in advance. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2011-10-31 19:31:34 -04:00
Sean Hefty	42849b2697	RDMA/uverbs: Export ib_open_qp() capability to user space Allow processes that share the same XRC domain to open an existing shareable QP. This permits those processes to receive events on the shared QP and transfer ownership, so that any process may modify the QP. The latter allows the creating process to exit, while a remaining process can still transition it for path migration purposes. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-10-13 09:50:56 -07:00
Sean Hefty	0e0ec7e063	RDMA/core: Export ib_open_qp() to share XRC TGT QPs XRC TGT QPs are shared resources among multiple processes. Since the creating process may exit, allow other processes which share the same XRC domain to open an existing QP. This allows us to transfer ownership of an XRC TGT QP to another process. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-10-13 09:49:51 -07:00
Sean Hefty	2622e18ef4	IB/cm: Do not automatically disconnect XRC TGT QPs Because an XRC TGT QP can end up being shared among multiple processes, don't have the ib_cm automatically send a DREQ when the userspace process that owns the ib_cm_id exits. Disconnect can be initiated by the user directly; otherwise, the owner of the XRC INI QP controls the connection. Note that as a result of the process exiting, the ib_cm will stop tracking the XRC connection on the target side. For the purposes of disconnecting, this isn't a big deal. The ib_cm will respond to the DREQ appropriately. For other messages, mainly LAP, the CM will reject the request, since there's no one available to route the request to. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-10-13 09:42:06 -07:00
Sean Hefty	18c441a6c3	RDMA/cma: Support XRC QPs Allow users to connect XRC QPs through the rdma_cm. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-10-13 09:41:36 -07:00
Sean Hefty	638ef7a6c6	RDMA/ucm: Allow user to specify QP type when creating id Allow the user to indicate the QP type separately from the port space when allocating an rdma_cm_id. With RDMA_PS_IB, there is no longer a 1:1 relationship between the QP type and port space, so we need to switch on the QP type to select between UD and connected QPs. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-10-13 09:40:36 -07:00
Sean Hefty	2d2e941529	RDMA/cm: Define new RDMA port space specific to IB Add RDMA_PS_IB. XRC QP types will use the IB port space when operating over the RDMA CM. For the 'IP protocol' field value, we select 0x3F, which is listed as being for 'any local network'. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-10-13 09:39:52 -07:00
Sean Hefty	ef70044647	IB/cm: Update XRC support based on XRC annex errata The XRC annex was updated to have XRC behave more like RD. Specifically, the XRC TGT QPN moves from the local QPN to local EECN field. Lookup of SRQN is done using the REQ/REP protocol. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-10-13 09:38:35 -07:00
Sean Hefty	d26a360b77	IB/cm: Update protocol to support XRC Update the REQ and REP messages to support XRC connection setup according to the XRC Annex. Several existing fields must be set to 0 or 1 when connecting XRC QPs, and a reserved field is changed to an extended transport type. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-10-13 09:37:55 -07:00
Sean Hefty	b93f3c1872	RDMA/uverbs: Export XRC TGT QPs to user space Allow user space to operate on XRC TGT QPs the same way as other types of QPs, with one notable exception: since XRC TGT QPs may be shared among multiple processes, the XRC TGT QP is allowed to exist beyond the lifetime of the creating process. The process that creates the QP is allowed to destroy it, but if the process exits without destroying the QP, then the QP will be left bound to the lifetime of the XRCD. TGT QPs are not associated with CQs or a PD. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-10-13 09:37:07 -07:00
Sean Hefty	9977f4f64b	RDMA/uverbs: Export XRC INI QPs to userspace XRC INI QPs are similar to send only RC QPs. Allow user space to create INI QPs. Note that INI QPs do not require receive CQs. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-10-13 09:32:27 -07:00
Sean Hefty	8541f8de05	RDMA/uverbs: Export XRC SRQs to user space We require additional information to create XRC SRQs than we can exchange using the existing create SRQ ABI. Provide an enhanced create ABI for extended SRQ types. Based on patches by Jack Morgenstein <jackm@dev.mellanox.co.il> and Roland Dreier <roland@purestorage.com> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-10-13 09:29:18 -07:00
Sean Hefty	53d0bd1e7f	RDMA/uverbs: Export XRC domains to user space Allow user space to create XRC domains. Because XRCDs are expected to be shared among multiple processes, we use inodes to identify an XRCD. Based on patches by Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-10-13 09:21:24 -07:00
Sean Hefty	d3d72d909e	RDMA/verbs: Cleanup XRC TGT QPs when destroying XRCD XRC TGT QPs are intended to be shared among multiple users and processes. Allow the destruction of an XRC TGT QP to be done explicitly through ib_destroy_qp() or when the XRCD is destroyed. To support destroying an XRC TGT QP, we need to track TGT QPs with the XRCD. When the XRCD is destroyed, all tracked XRC TGT QPs are also cleaned up. To avoid stale reference issues, if a user is holding a reference on a TGT QP, we increment a reference count on the QP. The user releases the reference by calling ib_release_qp. This releases any access to the QP from a user above verbs, but allows the QP to continue to exist until destroyed by the XRCD. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-10-13 09:20:27 -07:00
Sean Hefty	b42b63cf0d	RDMA/core: Add XRC QPs XRC ("eXtended reliable connected") is an IB transport that provides better scalability by allowing senders to specify which shared receive queue (SRQ) should be used to receive a message, which essentially allows one transport context (QP connection) to serve multiple destinations (as long as they share an adapter, of course). XRC communication is between an initiator (INI) QP and a target (TGT) QP. Target QPs are associated with SRQs through an XRCD. An XRC TGT QP behaves like a receive-only RD QP. XRC INI QPs behave similarly to RC QPs, except that work requests posted to an XRC INI QP must specify the remote SRQ that is the target of the work request. We define two new QP types for XRC, to distinguish between INI and TGT QPs, and update the core layer to support XRC QPs. This patch is derived from work by Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-10-13 09:16:19 -07:00

1 2 3 4 5 ...

657 Commits