Commit Graph

1937 Commits

Author SHA1 Message Date
Roland Dreier
f0e88aeb19 Merge branches 'cma', 'cxgb3', 'cxgb4', 'ehca', 'iser', 'mad', 'nes', 'qib', 'srp' and 'srpt' into for-next 2012-03-19 09:50:33 -07:00
Roland Dreier
42872c7a5e Merge branches 'misc' and 'mlx4' into for-next
Conflicts:
	drivers/infiniband/hw/mlx4/main.c
	drivers/net/ethernet/mellanox/mlx4/main.c
	include/linux/mlx4/device.h
2012-03-12 16:25:28 -07:00
Or Gerlitz
a9c766bb75 IB/mlx4: Fix info returned when querying IBoE ports
To issue a port query, use the QUERY_(Ethernet)_PORT command instead
of the MAD_IFC command, since MAD_IFC attempts to query the firmware
IB SMA, which is irrelevant for IBoE ports.

This allows us to handle both 10Gb/s and 40Gb/s rates (e.g. in sysfs),
using QDR speed (10Gb/s) and width of 1X or 4X.

Signed-off-by: Dotan Barak <dotanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-03-12 16:24:59 -07:00
Eli Cohen
3616f9cead IB/mlx4: Fix possible missed completion event
If an erroneous CQE is polled in the first iteration (i.e. npolled ==
0), we don't update the consumer index and hence the hardware could
get a wrong notion of how many CQEs software polled.  Fix this by
unconditionally updating the doorbell record.  We could change the
check to be something like

	if (npolled || err != -EAGAIN)
		...

but it does not seem worth the effort since a posted write to memory
should not cost too much.
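
As an illustration only, here is a minimal user-space sketch of the behaviour described above. It is not the mlx4 code: `sketch_cq`, `sketch_poll_one()` and `sketch_poll()` are hypothetical names, and the doorbell record is modelled as a plain integer. The point is that a bad CQE in the first iteration still advances the consumer index, so the doorbell has to be written even when npolled == 0.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical model of a CQ; none of these names come from the driver. */
struct sketch_cq {
	const int *status;   /* per-CQE status: 0 = good, nonzero = erroneous CQE */
	uint32_t nr_cqes;    /* number of CQEs the "hardware" has produced */
	uint32_t cons_index; /* software consumer index */
	uint32_t doorbell;   /* consumer index as seen by the hardware */
};

/* Consume one CQE: -EAGAIN if the CQ is empty, -EINVAL for a bad CQE. */
static int sketch_poll_one(struct sketch_cq *cq)
{
	if (cq->cons_index == cq->nr_cqes)
		return -EAGAIN;
	/* The CQE is consumed (index advanced) even when it is erroneous. */
	return cq->status[cq->cons_index++] ? -EINVAL : 0;
}

static int sketch_poll(struct sketch_cq *cq, int num_entries)
{
	int npolled, err = 0;

	for (npolled = 0; npolled < num_entries; ++npolled) {
		err = sketch_poll_one(cq);
		if (err)
			break;
	}

	/* Unconditional update: correct even if the first CQE was bad. */
	cq->doorbell = cq->cons_index;

	return (err == 0 || err == -EAGAIN) ? npolled : err;
}
```

With an "if (npolled)" guard around the doorbell write, the first call in a sequence that starts with a bad CQE would leave the doorbell one entry behind the consumer index.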

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-03-12 16:24:59 -07:00
Or Gerlitz
d927d505c5 IB: Change CQE "csum_ok" field to a bit flag
Use a bit in wc_flags rather than a whole integer to hold the
"checksum OK" flag.  By itself, this change doesn't reduce the size of
struct ib_wc on 64bit machines -- it stays at 56 bytes because of
padding.  However, it will allow adding more fields in the future
without enlarging the struct.  Also, it will let us have a unified
approach with future libibverbs checksum offload reporting, because a
bit flag doesn't break the library ABI.
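
A rough sketch of the idea, assuming made-up names (`sketch_wc`, `SKETCH_WC_*`) and an arbitrarily chosen bit position rather than the real struct ib_wc layout: the "checksum OK" state lives in one bit of the existing flags word, next to the other flags, instead of in its own integer field.

```c
/* Hypothetical flag values; the real enum and bit positions may differ. */
enum sketch_wc_flags {
	SKETCH_WC_GRH        = 1 << 0,
	SKETCH_WC_WITH_IMM   = 1 << 1,
	SKETCH_WC_IP_CSUM_OK = 1 << 2, /* replaces a separate "int csum_ok" */
};

struct sketch_wc {
	int wc_flags;
};

/* Set or clear the checksum bit without disturbing the other flags. */
static void sketch_set_csum_ok(struct sketch_wc *wc, int ok)
{
	if (ok)
		wc->wc_flags |= SKETCH_WC_IP_CSUM_OK;
	else
		wc->wc_flags &= ~SKETCH_WC_IP_CSUM_OK;
}

/* Returns 1 if the checksum bit is set, 0 otherwise. */
static int sketch_csum_ok(const struct sketch_wc *wc)
{
	return !!(wc->wc_flags & SKETCH_WC_IP_CSUM_OK);
}
```

A consumer that used to test wc->csum_ok would test the bit instead, and new flags can be added later without growing the struct.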

This patch was suggested during conversation with Liran Liss
<liranl@mellanox.com>.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-03-08 12:34:27 -08:00
Steve Wise
db4106ce63 RDMA/cxgb3: Don't pass irq flags to flush_qp()
Since flush_qp() is always called with irqs disabled, all the locking
inside flush_qp() and __flush_qp() doesn't need irq save/restore.

Further, passing the flag variable from iwch_modify_qp() is just wrong
and causes a WARN_ON() in local_bh_enable().

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-03-07 15:12:45 -08:00
Or Gerlitz
8154c07fe1 mlx4_core: Get rid of redundant ext_port_cap flags
While doing the work for commit a6f7feae6d ("IB/mlx4: pass SMP
vendor-specific attribute MADs to firmware") we realized that the
firmware would respond on all sorts of vendor-specific MADs.
Therefore commit 97285b7817 ("mlx4_core: Add extended port
capabilities support") adds redundant code into the driver, since
there's no real reason to maintain the extended capabilities of the
port, as they can be queried on demand (e.g. the FDR10 capability).

This patch reverts commit 97285b7817 and removes the check for
extended caps from the mlx4_ib driver port query flow.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-03-06 17:25:18 -08:00
Kyle McMartin
bd50f8924c IB/ehca: Fix ilog2() compile failure
I'm getting compile failures building this driver, which I narrowed
down to the ilog2 call in ehca_get_max_hwpage_size...

    ERROR: ".____ilog2_NaN" [drivers/infiniband/hw/ehca/ib_ehca.ko]
    undefined!
    make[1]: *** [__modpost] Error 1
    make: *** [modules] Error 2

The use of shca->hca_cap_mr_pgsize is confusing the compiler, and
resulting in the __builtin_constant_p in ilog2 going insane.

I tried making it take the u32 pgsize as an argument and the expansion
of shca->_pgsize in the caller, but that failed as well.

With this patch in place, the driver compiles on my GCC 4.6.2 here.

Suggested-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Kyle McMartin <kmcmarti@redhat.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-03-05 10:12:35 -08:00
Or Gerlitz
2e96691c31 IB: Use central enum for speed instead of hard-coded values
The kernel IB stack uses one enumeration for IB speed, which wasn't
explicitly specified in the verbs header file.  Add that enum, and use
it all over the code.

The IB speed/width notation is also used by iWARP and IBoE HW drivers,
which use the convention of rate = speed * width to advertise their
port link rate.
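
To make the rate = speed * width convention concrete, here is a small sketch. The enum and function names are hypothetical (they are not the names added to the verbs header), and per-lane rates are kept in units of 0.1 Gb/s so the arithmetic stays in integers. The per-lane figures assumed are the nominal 2.5, 5 and 10 Gb/s for SDR, DDR and QDR.

```c
#include <assert.h>

/* Hypothetical speed enum for illustration only. */
enum sketch_speed {
	SKETCH_SPEED_SDR,
	SKETCH_SPEED_DDR,
	SKETCH_SPEED_QDR,
};

/* Nominal per-lane signalling rate in units of 0.1 Gb/s. */
static int sketch_lane_rate_tenths(enum sketch_speed speed)
{
	switch (speed) {
	case SKETCH_SPEED_SDR:
		return 25;
	case SKETCH_SPEED_DDR:
		return 50;
	case SKETCH_SPEED_QDR:
		return 100;
	}
	return 0;
}

/* Port rate = per-lane speed * number of lanes (1X, 4X, ...). */
static int sketch_port_rate_tenths(enum sketch_speed speed, int width_lanes)
{
	assert(width_lanes > 0);
	return sketch_lane_rate_tenths(speed) * width_lanes;
}
```

Under these assumptions an Ethernet-based port can advertise 10Gb/s as QDR at 1X and 40Gb/s as QDR at 4X.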

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-03-05 09:25:16 -08:00
Eli Cohen
a5bbe892da mlx4: Enforce device max FMR maps in FMR alloc
ConnectX devices have a limit on the number of mappings that can be
done on an FMR before having to call sync_tpt.  The current
mlx4_ib driver reports the limit correctly in max_map_per_fmr in
.query_device(), but mlx4_core doesn't check it when actually
allocating FMRs.

Add a max_fmr_maps field to struct mlx4_caps and enforce this maximum
value on FMR allocations.
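
The check amounts to something like the following sketch. `sketch_caps` and `sketch_fmr_alloc_check()` are hypothetical stand-ins, not the mlx4_core functions; only the field name max_fmr_maps is taken from the description above, and the -EINVAL return value is an assumption.

```c
#include <errno.h>

/* Stand-in for the device capabilities; only one field is modelled. */
struct sketch_caps {
	int max_fmr_maps; /* mappings allowed before a sync is required */
};

/* Reject an FMR allocation that asks for more maps than the device allows. */
static int sketch_fmr_alloc_check(const struct sketch_caps *caps, int max_maps)
{
	if (max_maps > caps->max_fmr_maps)
		return -EINVAL;
	return 0;
}
```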

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-02-26 01:43:37 -08:00
Eli Cohen
4ba6b8eaa9 IB/mlx4: Set bad_wr for invalid send opcode
If the opcode of a work request exceeds the range of valid opcodes,
return the pointer to the offending work request.
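
A minimal sketch of the pattern, assuming hypothetical types (`sketch_send_wr`) and a made-up opcode limit (`SKETCH_NUM_OPCODES`) rather than the real verbs structures: the post routine walks the chain of work requests and, on the first invalid opcode, reports that request through bad_wr so the caller knows where processing stopped.

```c
#include <errno.h>
#include <stddef.h>

#define SKETCH_NUM_OPCODES 4u /* hypothetical number of valid opcodes */

struct sketch_send_wr {
	struct sketch_send_wr *next;
	unsigned int opcode;
};

static int sketch_post_send(struct sketch_send_wr *wr,
			    struct sketch_send_wr **bad_wr)
{
	for (; wr != NULL; wr = wr->next) {
		if (wr->opcode >= SKETCH_NUM_OPCODES) {
			*bad_wr = wr; /* tell the caller which WR failed */
			return -EINVAL;
		}
		/* ... build and post the descriptor for a valid WR ... */
	}
	return 0;
}
```

Requests before the offending one count as posted; the caller can resume or clean up starting from *bad_wr.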

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-02-26 01:37:30 -08:00
Eric Dumazet
6aeaa48b0d IB/ehca: Use kthread_create_on_node()
Since create_comp_task() creates percpu kthread, it makes sense to use
kthread_create_on_node() to get proper NUMA affinity for kthread
stack.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-02-25 17:47:21 -08:00
Mike Marciniszyn
520b3ee705 IB/qib: Avoid filtering LID on SMA portinfo
The current get portinfo handling filters the LID being sent,
changing zero to 0xffff.

This causes OpenSM to log excessive warning messages.

Reviewed-by: Edward Mascarenhas <edward.mascarenhas@qlogic.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-02-25 17:45:50 -08:00
Mike Marciniszyn
a778f3fddc IB/qib: Add logic for affinity hint
Call irq_set_affinity_hint() to give userspace programs such as
irqbalance the information to be able to distribute qib interrupts
appropriately.

The logic allocates all non-receive interrupts to the first CPU local
to the HCA.  Receive interrupts are allocated round robin starting
with the second CPU local to the HCA with potential wrap back to the
second CPU.
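
The allocation policy above can be sketched in user space as follows. All names are hypothetical, the list of CPUs local to the HCA is passed in as a plain array, and the single-CPU fallback is an assumption that the description does not cover.

```c
#include <stddef.h>

struct sketch_affinity {
	const int *local_cpus; /* CPUs local to the HCA, in order */
	size_t nr_local;       /* must be at least 1 */
	size_t next_rcv;       /* index of the next receive CPU */
};

/* Receive interrupts start at the second local CPU when there is one. */
static size_t sketch_first_rcv(const struct sketch_affinity *a)
{
	/* Assumption: with a single local CPU, everything shares it. */
	return a->nr_local > 1 ? 1 : 0;
}

static void sketch_affinity_init(struct sketch_affinity *a,
				 const int *cpus, size_t n)
{
	a->local_cpus = cpus;
	a->nr_local = n;
	a->next_rcv = sketch_first_rcv(a);
}

/* Pick the CPU to hint for one interrupt. */
static int sketch_pick_cpu(struct sketch_affinity *a, int is_receive)
{
	int cpu;

	if (!is_receive)
		return a->local_cpus[0]; /* all non-receive IRQs: first CPU */

	cpu = a->local_cpus[a->next_rcv];
	if (++a->next_rcv >= a->nr_local)
		a->next_rcv = sketch_first_rcv(a); /* wrap to the second CPU */
	return cpu;
}
```

In the driver the chosen CPU would be turned into a mask and handed to irq_set_affinity_hint(); the sketch stops at picking the CPU number.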

This patch also adds a refinement to the name registered for MSI-X
interrupts so that user level scripts can determine the device
associated with the IRQs when there are multiple HCAs with a
potentially different set of local CPUs.

Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-02-25 17:45:49 -08:00
Tatyana Nikolova
8dd87fba93 RDMA/nes: Fixes for sparse endianness warnings
Fix endianness problems detected by sparse, introduced with the enhanced
MPA patch.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com>
Signed-off-by: Faisal Latif <Faisal.Latif@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-02-25 17:45:37 -08:00
Kumar Sanghvi
91018f8632 RDMA/cxgb4: Add missing peer2peer check in MPAv2 code
Don't worry about p2p_type if peer2peer itself is not requested in the
first place.

Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-02-25 17:45:02 -08:00
David S. Miller
d5ef8a4d87 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/infiniband/hw/nes/nes_cm.c

Simple whitespace conflict.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-10 23:32:28 -05:00
Roland Dreier
f36ae34238 Merge branches 'cma', 'ipath', 'misc', 'mlx4', 'nes' and 'qib' into for-next 2012-01-30 16:18:21 -08:00
Tatyana Nikolova
c5488c571f RDMA/nes: Copyright update
Update copyright information in the source files.

Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com>
Signed-off-by: Faisal Latif <Faisal.Latif@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-01-30 16:18:07 -08:00
Jack Morgenstein
a6f7feae6d IB/mlx4: pass SMP vendor-specific attribute MADs to firmware
In the current code, vendor-specific MADs (e.g. with the FDR-10
attribute) are silently dropped by the driver, resulting in timeouts
at the sending side and inability to query/configure the relevant
feature.  However, the ConnectX firmware is able to handle such MADs.
For unsupported attributes, the firmware returns a GET_RESPONSE MAD
containing an error status.

For example, for a FDR-10 node with LID 11:

    # ibstat mlx4_0 1

    CA: 'mlx4_0'
    Port 1:
    State: Active
    Physical state: LinkUp
    Rate: 40 (FDR10)
    Base lid: 11
    LMC: 0
    SM lid: 24
    Capability mask: 0x02514868
    Port GUID: 0x0002c903002e65d1
    Link layer: InfiniBand

Extended Port Query (EPI) vendor mad timeouts before the patch:

    # smpquery MEPI 11 -d

    ibwarn: [4196] smp_query_via: attr 0xff90 mod 0x0 route Lid 11
    ibwarn: [4196] _do_madrpc: retry 1 (timeout 1000 ms)
    ibwarn: [4196] _do_madrpc: retry 2 (timeout 1000 ms)
    ibwarn: [4196] _do_madrpc: timeout after 3 retries, 3000 ms
    ibwarn: [4196] mad_rpc: _do_madrpc failed; dport (Lid 11)
    smpquery: iberror: [pid 4196] main: failed: operation EPI: ext port info query failed

EPI query works OK with the patch:

    # smpquery MEPI 11 -d

    ibwarn: [6548] smp_query_via: attr 0xff90 mod 0x0 route Lid 11
    ibwarn: [6548] mad_rpc: data offs 64 sz 64
    mad data
    0000 0000 0000 0001 0000 0001 0000 0001
    0000 0000 0000 0000 0000 0000 0000 0000
    0000 0000 0000 0000 0000 0000 0000 0000
    0000 0000 0000 0000 0000 0000 0000 0000
    # Ext Port info: Lid 11 port 0
    StateChangeEnable:...............0x00
    LinkSpeedSupported:..............0x01
    LinkSpeedEnabled:................0x01
    LinkSpeedActive:.................0x01

Signed-off-by: Jack Morgenstein <jackm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Acked-by: Ira Weiny <weiny2@llnl.gov>
Cc: <stable@vger.kernel.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-01-30 16:15:17 -08:00
Tatyana Nikolova
4a4b03f4ef RDMA/nes: Fix fast memory registration opcode
Fix fast memory registration opcode in local invalidate completion.

Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com>
Signed-off-by: Donald Wood <Donald.E.Wood@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-01-27 10:15:13 -08:00
Tatyana Nikolova
94f622bdac RDMA/nes: Fix fast memory registration length
Zero high order word of fast memory registration (FMR) length field.
FMR length field is 32 bits, so high word should always be zero.

Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com>
Signed-off-by: Donald Wood <Donald.E.Wood@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-01-27 10:14:51 -08:00
Mike Marciniszyn
b6bfefb041 IB/qib: Roll back PCIe tuning change
Commit 8d4548f2b ("IB/qib: Default some module parameters optimally")
introduced an issue with older root complexes.  They cannot handle the
pcie_caps of 0x51 (MaxReadReq 4096, MaxPayload=256).

A typical diagnostic in this situation reported by syslog contains
the text:

  [PCIe Poisoned TLP][Send DMA memory read]

Restore the module parameter default to zero, which will avoid any
changes in the root complex.

Reviewed-by: Mark Debbage <mark.debbage@qlogic.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-01-27 10:03:38 -08:00
Julia Lawall
0f3696eb21 IB/qib: Use GFP_ATOMIC when locks are held
alloc_dummy_hdrq() is called with locks held and thus should not use
GFP_KERNEL.

The semantic patch that makes this report is available in
scripts/coccinelle/locks/call_kern.cocci.

Signed-off-by: Julia Lawall <julia.lawall@lip6.fr>
Acked-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-01-27 09:59:22 -08:00
Roland Dreier
7525c85be0 RDMA/nes: Add missing rcu_read_unlock() in nes_addr_resolve_neigh()
Make sure all exit paths from this function unlock everything.

Reported-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-01-27 09:54:30 -08:00
Tatyana Nikolova
81f99dcc93 RDMA/nes: Fix for sending MPA reject frame
Set a reject flag when sending an MPA reject message, to inform the peer
that the application has rejected the connection.

Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com>
Signed-off-by: Faisal Latif <Faisal.Latif@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-01-27 09:50:48 -08:00
Dan Carpenter
ef5352875a IB/ipath: Calling PTR_ERR() on right variable in create_file()
"dentry" is a valid pointer.  "*dentry" was intended.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-01-27 09:48:27 -08:00
David Miller
e55684fadb infiniband: nes: Convert nes_addr_resolve_neigh() over to dst_neigh_lookup().
Now we must provide the IP destination address, and a reference has
to be dropped when we're done with the entry.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-01-25 21:30:37 -05:00
David Miller
64b7007eb9 infiniband: cxgb4: Convert import_ep() over to dst_neigh_lookup().
Now we must provide the IP destination address, and a reference has
to be dropped when we're done with the entry.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-01-25 21:30:37 -05:00
Rusty Russell
90ab5ee941 module_param: make bool parameters really bool (drivers & misc)
module_param(bool) used to counter-intuitively take an int.  In
fddd5201 (mid-2009) we allowed bool or int/unsigned int using a messy
trick.

It's time to remove the int/unsigned int option.  For this version
it'll simply give a warning, but it'll break next kernel version.

Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-01-13 09:32:20 +10:30
Rusty Russell
69116f279a module_param: avoid bool abuse, add bint for special cases.
For historical reasons, we allow module_param(bool) to take an int (or
an unsigned int).  That's going away.

A few drivers really want an int: they set it to -1 and a parameter
will set it to 0 or 1.  This sucks: reading them from sysfs will give
'Y' for both -1 and 1, but if we change it to an int, then the users
might be broken (if they did "param" instead of "param=1").

Use a new 'bint' parser for them.
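
To show what such a parser has to do, here is a user-space sketch. It is not the kernel's implementation: `sketch_parse_bint()` is a hypothetical name, a NULL value stands for a bare "param" with no "=value", and the accepted spellings are an assumption modelled on the usual y/n/1/0 boolean strings.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Parse a boolean-style string but store the result in an int, so the
 * variable can still start out as -1 ("not set by the user").
 */
static int sketch_parse_bint(const char *val, int *out)
{
	if (val == NULL) { /* bare "param" means true */
		*out = 1;
		return 0;
	}

	switch (val[0]) {
	case 'y':
	case 'Y':
	case '1':
		*out = 1;
		return 0;
	case 'n':
	case 'N':
	case '0':
		*out = 0;
		return 0;
	default:
		return -EINVAL; /* *out is left untouched */
	}
}
```

The driver keeps its int with a default of -1, and users who wrote "param" instead of "param=1" keep working.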

(ntfs has a different problem: it needs an int for debug_msgs because
it's also exposed via sysctl.)

Cc: Steve Glendinning <steve.glendinning@smsc.com>
Cc: Jean Delvare <khali@linux-fr.org>
Cc: Guenter Roeck <guenter.roeck@ericsson.com>
Cc: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Cc: Christoph Raisch <raisch@de.ibm.com>
Cc: Roland Dreier <roland@kernel.org>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Cc: linux390@de.ibm.com
Cc: Anton Altaparmakov <anton@tuxera.com>
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: lm-sensors@lm-sensors.org
Cc: linux-rdma@vger.kernel.org
Cc: linux-s390@vger.kernel.org
Cc: linux-ntfs-dev@lists.sourceforge.net
Cc: alsa-devel@alsa-project.org
Acked-by: Takashi Iwai <tiwai@suse.de> (For the sound part)
Acked-by: Guenter Roeck <guenter.roeck@ericsson.com> (For the hwmon driver)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-01-13 09:32:17 +10:30
Linus Torvalds
48fa57ac2c infiniband changes for 3.3 merge window

Merge tag 'infiniband-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

infiniband changes for 3.3 merge window

* tag 'infiniband-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  rdma/core: Fix sparse warnings
  RDMA/cma: Fix endianness bugs
  RDMA/nes: Fix terminate during AE
  RDMA/nes: Make unnecessarily global nes_set_pau() static
  RDMA/nes: Change MDIO bus clock to 2.5MHz
  IB/cm: Fix layout of APR message
  IB/mlx4: Fix SL to 802.1Q priority-bits mapping for IBoE
  IB/qib: Default some module parameters optimally
  IB/qib: Optimize locking for get_txreq()
  IB/qib: Fix a possible data corruption when receiving packets
  IB/qib: Eliminate 64-bit jiffies use
  IB/qib: Fix style issues
  IB/uverbs: Protect QP multicast list
2012-01-08 14:05:48 -08:00
Linus Torvalds
972b2c7199 Merge branch 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
* 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (165 commits)
  reiserfs: Properly display mount options in /proc/mounts
  vfs: prevent remount read-only if pending removes
  vfs: count unlinked inodes
  vfs: protect remounting superblock read-only
  vfs: keep list of mounts for each superblock
  vfs: switch ->show_options() to struct dentry *
  vfs: switch ->show_path() to struct dentry *
  vfs: switch ->show_devname() to struct dentry *
  vfs: switch ->show_stats to struct dentry *
  switch security_path_chmod() to struct path *
  vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb
  vfs: trim includes a bit
  switch mnt_namespace ->root to struct mount
  vfs: take /proc/*/mounts and friends to fs/proc_namespace.c
  vfs: opencode mntget() mnt_set_mountpoint()
  vfs: spread struct mount - remaining argument of next_mnt()
  vfs: move fsnotify junk to struct mount
  vfs: move mnt_devname
  vfs: move mnt_list to struct mount
  vfs: switch pnode.h macros to struct mount *
  ...
2012-01-08 12:19:57 -08:00
Roland Dreier
1583676d9e Merge branches 'cma', 'misc', 'mlx4', 'nes', 'qib' and 'uverbs' into for-next 2012-01-04 09:18:20 -08:00
Tatyana Nikolova
196f40c846 RDMA/nes: Fix terminate during AE
Fix for reset which happens right after sending a terminate message.
Terminate timer is not deleted when the connection is closed.

Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com>
Signed-off-by: Faisal Latif <Faisal.Latif@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-01-04 09:12:39 -08:00
Tatyana Nikolova
b0fda90f2a RDMA/nes: Make unnecessarily global nes_set_pau() static
Warned about by sparse.

Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com>
Signed-off-by: Faisal Latif <Faisal.Latif@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-01-04 09:07:24 -08:00
Tatyana Nikolova
30b7e117af RDMA/nes: Change MDIO bus clock to 2.5MHz
Change the PHY clock divisor to make the MDIO clock 2.5MHz, instead of
3.5MHz (which is out of spec).

Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com>
Signed-off-by: Faisal Latif <Faisal.Latif@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-01-04 09:02:15 -08:00
Or Gerlitz
9106c41069 IB/mlx4: Fix SL to 802.1Q priority-bits mapping for IBoE
For IBoE, SLs 0-7 are mapped to Ethernet 802.1Q user priority bits
(pbits) which are part of the VLAN tag, SLs 8-15 are reserved.

Under Ethernet, the ConnectX firmware treats (decode/encode) the four
bit SL field in various constructs such as QPC / UD WQE / CQE as PPP0
and not as 0PPP. This correlates well to the fact that within the
vlan tag the pbits are located in bits 15-13 and not 12-14.

The current code wasn't consistent around that area - the
encoding was correct for the IBoE QPC.path.schedule_queue field,
but was wrong for IBoE CQEs and when MLX header was built.

These inconsistencies resulted in wrong SL <--> wire 802.1Q pbits
mapping, which is fixed by using SL <--> PPP0 all around the place.
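
The two encodings can be sketched as below. The helper names are hypothetical and this is not the driver code; it only illustrates that the three priority bits sit in the upper three bits of the four-bit field (PPP0), mirroring their position in bits 15-13 of the VLAN tag.

```c
#include <stdint.h>

/* SL 0-7 -> four-bit field laid out as PPP0 (priority in the top 3 bits). */
static uint8_t sketch_sl_to_ppp0(uint8_t sl)
{
	return (uint8_t)((sl & 0x7) << 1);
}

/* Inverse mapping: recover the SL from a PPP0-encoded field. */
static uint8_t sketch_ppp0_to_sl(uint8_t field)
{
	return (uint8_t)((field >> 1) & 0x7);
}

/* 802.1Q tag control: priority in bits 15-13, VLAN ID in bits 11-0. */
static uint16_t sketch_vlan_tci(uint8_t sl, uint16_t vid)
{
	return (uint16_t)(((sl & 0x7) << 13) | (vid & 0x0fff));
}
```

Treating the field as 0PPP instead would shift every priority by one bit, which is the kind of mismatch between SL and wire pbits described above.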

Signed-off-by: Oren Duer <oren@mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-01-03 21:00:02 -08:00
Mike Marciniszyn
8d4548f2b7 IB/qib: Default some module parameters optimally
Minimize the need for users to have to set module parameters to get
good performance.

The following two parameters are changed:
 - rcvhdrcnt to twice the rcvegrcnt
 - pcie_caps=0x51

The rcvhdrcnt at twice the egrcount allows the preemptive NAK code
during reception to function in 100% of the cases rather than a sender
jiffies-based timeout.

The pcie_caps default of 0x51 will set the proposed MaxPayload and
MaxReadReq to 256 and 4096 bytes respectively.  The capabilities on
the root complex will be used to limit those values.
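
As a worked example of how 0x51 yields those values, the sketch below assumes that the low nibble of pcie_caps carries the MaxPayload code, the high nibble carries the MaxReadReq code, and that each code uses the standard PCIe size encoding of 128 bytes shifted left by a three-bit code. The helper names are hypothetical and this is not the qib code.

```c
/* Standard PCIe size encoding: code 0 = 128 bytes, 1 = 256, ... 5 = 4096. */
static unsigned int sketch_pcie_size_bytes(unsigned int code)
{
	return 128u << (code & 0x7);
}

/* Assumed layout: low nibble holds the MaxPayload code. */
static unsigned int sketch_caps_max_payload(unsigned int pcie_caps)
{
	return sketch_pcie_size_bytes(pcie_caps & 0xf);
}

/* Assumed layout: high nibble holds the MaxReadReq code. */
static unsigned int sketch_caps_max_read_req(unsigned int pcie_caps)
{
	return sketch_pcie_size_bytes((pcie_caps >> 4) & 0xf);
}
```

Under these assumptions 0x51 decodes to a payload code of 1 (256 bytes) and a read request code of 5 (4096 bytes), while 0 leaves both at the 128-byte minimum.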

Reviewed-by: Ram Vepa <ram.vepa@qlogic.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-01-03 20:54:01 -08:00
Mike Marciniszyn
4894710951 IB/qib: Optimize locking for get_txreq()
The current code locks the QP s_lock, followed by the pending_lock, I
guess to protect against the allocate failing.

This patch only locks the pending_lock, assuming that the empty case
is an exception, in which case the pending_lock is dropped, and the
original code is executed.  This will save a lock of s_lock in the
normal case.

The observation is that the sdma descriptors will deplete at twice the
rate of txreq's, so this should be rare.

Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-01-03 20:53:31 -08:00
Ram Vepa
eddfb67525 IB/qib: Fix a possible data corruption when receiving packets
Prevent a receive data corruption by ensuring that the write to update
the rcvhdrheadn register to generate an interrupt is at the very end
of the receive processing.

Signed-off-by: Ramkrishna Vepa <ram.vepa@qlogic.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Cc: <stable@kernel.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-01-03 20:53:02 -08:00
Mike Marciniszyn
8482d5d1bc IB/qib: Eliminate 64-bit jiffies use
The qib driver makes use of the 64-bit jiffies API.

Code inspection reveals that that version of the API is not really
required.  This patch converts to use the "normal" jiffies.

Reviewed-by: Ram Vepa <ram.vepa@qlogic.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-01-03 20:52:12 -08:00
Mike Marciniszyn
865b64be86 IB/qib: Fix style issues
More style issues revealed with checkpatch.pl -f.

Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-01-03 20:51:42 -08:00
Al Viro
f9ec80061a infiniband: umode_t noise, including open-coded S_ISDIR()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:55:03 -05:00
David S. Miller
abb434cb05 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	net/bluetooth/l2cap_core.c

Just two overlapping changes, one added an initialization of
a local variable, and another change added a new local variable.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-23 17:13:56 -05:00
Roland Dreier
480390c8f3 Merge branches 'cma', 'mlx4' and 'qib' into for-next 2011-12-19 09:19:49 -08:00
Mike Marciniszyn
29d1b16145 IB/qib: Correct sense on freectxts increment and decrement
Commit 53ab1c6498 ("IB/qib: Correct nfreectxts for multiple HCAs")
reversed the increments and decrements of dd->nfreectxts.  Fix it.

Reviewed-by: Ram Vepa <ram.vepa@qlogic.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-12-19 09:19:34 -08:00
Jack Morgenstein
8e59d254fe mlx4_ib: disable SRIOV mode for IB ports (not yet supported)
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-13 13:56:07 -05:00
Jack Morgenstein
f9baff509f mlx4_core: Add "native" argument to mlx4_cmd and its callers (where needed)
For SRIOV, some Hypervisor commands can be executed directly (native = 1).
Others should go through the command wrapper flow (for tracking resource
usage, for example, or for changing some HCA configurations that slaves
need to be notified of).

This patch sets the groundwork for this capability -- adding the correct
value of "native" in each case.

Note that if SRIOV is not activated, this parameter has no effect.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-13 13:56:05 -05:00
Jack Morgenstein
65dab25deb mlx4: Extending port_mask functionality
The port mask now has an additional state: a port can be set to "none".
In this case neither the mlx4_en nor the mlx4_ib driver takes ownership
of the port.
In multifunction mode there is an option to set the VFs as single-ported devices.
(In single-function mode, both physical ports belong to the same function.)

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-13 13:56:05 -05:00
Roland Dreier
4af3ce0de0 IB/mlx4: Fix shutdown crash accessing a non-existent bitmap
Commit cfcde11c3d ("IB/mlx4: Use flow counters on IBoE ports") added
code that sets elements of counters[] to -1 if no counter is allocated,
but then goes ahead and passes every entry to mlx4_counter_free() on
shutdown.  This is a bad idea, especially if MLX4_DEV_CAP_FLAG_COUNTERS
isn't set so there isn't even an underlying bitmap to free from.

Tested-by: Sean Hefty <sean.hefty@intel.com>
Cc: <stable@kernel.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-12-06 10:47:37 -08:00
David Miller
3786cf189f infiniband: cxgb4: Consolidate 3 copies of the same operation into 1 helper function.
Three pieces of code do the same thing, create a l2t entry and then
import this information into the c4iw_ep object.

Create a helper function and call it from these 3 locations instead.

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Roland Dreier <roland@purestorage.com>
2011-12-05 15:20:20 -05:00
David Miller
40e2bb588f infiniband: nes: Use dst's neighbour entry.
Do this instead of performing a by-hand lookup.

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Roland Dreier <roland@purestorage.com>
2011-12-05 15:20:19 -05:00
David Miller
a4757123ae cxgb3: Rework t3_l2t_get to take a dst_entry instead of a neighbour.
This way we consolidate the RCU locking down into the place where it
actually matters, and also we can make the code handle
dst_get_neighbour_noref() returning NULL properly.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-05 15:20:19 -05:00
David Miller
2721745501 net: Rename dst_get_neighbour{, _raw} to dst_get_neighbour_noref{, _raw}.
To reflect the fact that a reference is not obtained to the
resulting neighbour entry.

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Roland Dreier <roland@purestorage.com>
2011-12-05 15:20:19 -05:00
David S. Miller
b3613118eb Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2011-12-02 13:49:21 -05:00
Roland Dreier
a493f1a24a Merge branches 'cxgb4', 'ipoib', 'misc' and 'qib' into for-next 2011-11-29 18:01:53 -08:00
Eric Dumazet
580da35a31 IB: Fix RCU lockdep splats
Commit f2c31e32b3 ("net: fix NULL dereferences in check_peer_redir()")
forgot to take care of infiniband uses of dst neighbours.

Many thanks to Marc Aurele who provided a nice bug report and feedback.

Reported-by: Marc Aurele La France <tsi@ualberta.ca>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>
Cc: <stable@kernel.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-11-29 13:37:11 -08:00
Mike Marciniszyn
8ee887d74b IB/qib: Fix over-scheduling of QSFP work
Don't over-schedule QSFP work on driver initialization.  It could end
up being run simultaneously on two different CPUs resulting in bad
EEPROM reads.  In combination with setting the physical IB link state
prior to the IBC being brought out of reset, this can cause the link
state machine to start training early with wrong settings.

Signed-off-by: Mitko Haralanov <mitko@qlogic.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-11-28 12:17:33 -08:00
Kumar Sanghvi
01b225e18f RDMA/cxgb4: Fix retry with MPAv1 logic for MPAv2
Fix logic so that we don't retry with MPAv1 once we have done that
already.  Otherwise, we end up retrying with MPAv1 even when it's not
needed on getting peer aborts - and this could lead to kernel panic.

Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-11-28 11:58:07 -08:00
Jonathan Lallinger
c34c97ad8c RDMA/cxgb4: Fix iw_cxgb4 count_rcqes() logic
Fix another place in the code where logic dealing with the t4_cqe was
using the wrong QID.  This fixes the counting logic so that it tests
against the SQ QID instead of the RQ QID when counting RCQES.

Signed-off-by: Jonathan Lallinger <jonathan@ogc.us>
Signed-off-by: Steve Wise <swise@ogc.us>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-11-28 11:53:05 -08:00
David S. Miller
9ca36f7db2 infiniband: Update net drivers for netdev_features_t changes.
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-16 18:05:50 -05:00
Mike Marciniszyn
042f36e156 IB/qib: Don't use schedule_work()
It was mistakenly introduced by dde05cbdf8 ("IB/qib: Hold links
until tuning data is available").

Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-11-08 10:37:53 -08:00
Linus Torvalds
32aaeffbd4 Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux
* 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits)
  Revert "tracing: Include module.h in define_trace.h"
  irq: don't put module.h into irq.h for tracking irqgen modules.
  bluetooth: macroize two small inlines to avoid module.h
  ip_vs.h: fix implicit use of module_get/module_put from module.h
  nf_conntrack.h: fix up fallout from implicit moduleparam.h presence
  include: replace linux/module.h with "struct module" wherever possible
  include: convert various register fcns to macros to avoid include chaining
  crypto.h: remove unused crypto_tfm_alg_modname() inline
  uwb.h: fix implicit use of asm/page.h for PAGE_SIZE
  pm_runtime.h: explicitly requires notifier.h
  linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h
  miscdevice.h: fix up implicit use of lists and types
  stop_machine.h: fix implicit use of smp.h for smp_processor_id
  of: fix implicit use of errno.h in include/linux/of.h
  of_platform.h: delete needless include <linux/module.h>
  acpi: remove module.h include from platform/aclinux.h
  miscdevice.h: delete unnecessary inclusion of module.h
  device_cgroup.h: delete needless include <linux/module.h>
  net: sch_generic remove redundant use of <linux/module.h>
  net: inet_timewait_sock doesnt need <linux/module.h>
  ...

Fix up trivial conflicts (other header files, and  removal of the ab3550 mfd driver) in
 - drivers/media/dvb/frontends/dibx000_common.c
 - drivers/media/video/{mt9m111.c,ov6650.c}
 - drivers/mfd/ab3550-core.c
 - include/linux/dmaengine.h
2011-11-06 19:44:47 -08:00
Roland Dreier
b8108d6886 Merge branches 'iser', 'mthca' and 'qib' into for-next 2011-11-04 09:36:04 -07:00
Mike Marciniszyn
30ab7e230b IB/qib: Fix panic in RC error flushing logic
The following panic can occur when flushing a QP:

    RIP: 0010:[<ffffffffa0168e8b>]  [<ffffffffa0168e8b>] qib_send_complete+0x3b/0x190 [ib_qib]
    RSP: 0018:ffff8803cdc6fc90  EFLAGS: 00010046
    RAX: 0000000000000000 RBX: ffff8803d84ba000 RCX: 0000000000000000
    RDX: 0000000000000005 RSI: ffffc90015a53430 RDI: ffff8803d84ba000
    RBP: ffff8803cdc6fce0 R08: ffff8803cdc6fc90 R09: 0000000000000001
    R10: 00000000ffffffff R11: 0000000000000000 R12: ffff8803d84ba0c0
    R13: ffff8803d84ba5cc R14: 0000000000000800 R15: 0000000000000246
    FS:  0000000000000000(0000) GS:ffff880036600000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
    CR2: 0000000000000034 CR3: 00000003e44f9000 CR4: 00000000000406f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Process qib/0 (pid: 1350, threadinfo ffff8803cdc6e000, task ffff88042728a100)
    Stack:
     53544c5553455201 0000000100000005 0000000000000000 ffff8803d84ba000
     0000000000000000 0000000000000000 0000000000000000 0000000000000000
     0000000000000000 0000000000000001 ffff8803cdc6fd30 ffffffffa0165d7a
    Call Trace:
     [<ffffffffa0165d7a>] qib_make_rc_req+0x36a/0xe80 [ib_qib]
     [<ffffffffa0165a10>] ?  qib_make_rc_req+0x0/0xe80 [ib_qib]
     [<ffffffffa01698b3>] qib_do_send+0xf3/0xb60 [ib_qib]
     [<ffffffff814db757>] ? thread_return+0x4e/0x777
     [<ffffffffa01697c0>] ? qib_do_send+0x0/0xb60 [ib_qib]
     [<ffffffff81088bf0>] worker_thread+0x170/0x2a0
     [<ffffffff8108e530>] ?  autoremove_wake_function+0x0/0x40
     [<ffffffff81088a80>] ? worker_thread+0x0/0x2a0
     [<ffffffff8108e1c6>] kthread+0x96/0xa0
     [<ffffffff8100c1ca>] child_rip+0xa/0x20
     [<ffffffff8108e130>] ? kthread+0x0/0xa0
     [<ffffffff8100c1c0>] ? child_rip+0x0/0x20
    RIP  [<ffffffffa0168e8b>] qib_send_complete+0x3b/0x190 [ib_qib]

The RC error state flush logic in qib_make_rc_req() could return all
of the acked wqes and potentially have emptied the queue.  It would
then unconditionally try to return a flush completion via
qib_send_complete() for an invalid wqe, or worse a valid one that is
not queued. The panic results when the completion code tries to
maintain an MR reference count for a NULL MR.

This fix modifies the logic to send only one completion per
qib_make_rc_req() call, changing the completion status from
IB_WC_SUCCESS to IB_WC_WR_FLUSH_ERR as the completions progress.

The outer loop will call as many times as necessary to flush the queue.

Reviewed-by: Ram Vepa <ram.vepa@qlogic.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-11-04 09:35:44 -07:00
Roland Dreier
e4221314a5 IB/mthca: Fix buddy->num_free allocation size
The num_free field of mthca_buddy is an array of unsigned int, but it
was allocated as an array of pointers.  On 64-bit platforms this
allocates twice as much memory as required.  Fix this by allocating
the correct size for the type.

This is the same bug just fixed in mlx4 by Eli Cohen <eli@mellanox.co.il>.
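
As an illustration only (a user-space sketch, not the driver code), the
size mismatch described above can be avoided by taking sizeof from the
element the pointer refers to.  The struct and function names below are
hypothetical stand-ins.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the allocator state; not the mthca struct. */
struct buddy_sketch {
    unsigned int *num_free;   /* one free-block counter per order */
    int max_order;
};

/* Bytes needed when each element is an unsigned int (the intended size). */
size_t buddy_sketch_bytes(int max_order)
{
    return ((size_t)max_order + 1) * sizeof(unsigned int);
}

/* Bytes requested when each element is mistaken for a pointer. */
size_t buddy_sketch_bytes_as_pointers(int max_order)
{
    return ((size_t)max_order + 1) * sizeof(unsigned int *);
}

/* Allocate max_order + 1 zeroed counters; returns 0 on success, -1 on failure. */
int buddy_sketch_init(struct buddy_sketch *b, int max_order)
{
    b->max_order = max_order;
    /* sizeof(*b->num_free) tracks the element type if it ever changes. */
    b->num_free = calloc((size_t)max_order + 1, sizeof(*b->num_free));
    return b->num_free ? 0 : -1;
}
```

On a typical LP64 platform the pointer-sized variant requests twice the
bytes of the int-sized one; on 32-bit platforms the two usually match,
which is why the bug is easy to miss.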

Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-11-03 17:48:25 -07:00
Linus Torvalds
f470f8d4e7 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (62 commits)
  mlx4_core: Deprecate log_num_vlan module param
  IB/mlx4: Don't set VLAN in IBoE WQEs' control segment
  IB/mlx4: Enable 4K mtu for IBoE
  RDMA/cxgb4: Mark QP in error before disabling the queue in firmware
  RDMA/cxgb4: Serialize calls to CQ's comp_handler
  RDMA/cxgb3: Serialize calls to CQ's comp_handler
  IB/qib: Fix issue with link states and QSFP cables
  IB/mlx4: Configure extended active speeds
  mlx4_core: Add extended port capabilities support
  IB/qib: Hold links until tuning data is available
  IB/qib: Clean up checkpatch issue
  IB/qib: Remove s_lock around header validation
  IB/qib: Precompute timeout jiffies to optimize latency
  IB/qib: Use RCU for qpn lookup
  IB/qib: Eliminate divide/mod in converting idx to egr buf pointer
  IB/qib: Decode path MTU optimization
  IB/qib: Optimize RC/UC code by IB operation
  IPoIB: Use the right function to do DMA unmap pages
  RDMA/cxgb4: Use correct QID in insert_recv_cqe()
  RDMA/cxgb4: Make sure flush CQ entries are collected on connection close
  ...
2011-11-01 10:51:38 -07:00
Roland Dreier
504255f8d0 Merge branches 'amso1100', 'cma', 'cxgb3', 'cxgb4', 'fdr', 'ipath', 'ipoib', 'misc', 'mlx4', 'misc', 'nes', 'qib' and 'xrc' into for-next 2011-11-01 09:37:08 -07:00
Christoph Lameter
bc3e53f682 mm: distinguish between mlocked and pinned pages
Some kernel components pin user space memory (infiniband and perf) (by
increasing the page count) and account that memory as "mlocked".

The difference between mlocking and pinning is:

A. mlocked pages are marked with PG_mlocked and are exempt from
   swapping. Page migration may move them around though.
   They are kept on a special LRU list.

B. Pinned pages cannot be moved because something needs to
   directly access physical memory. They may not be on any
   LRU list.

I recently saw an mlockalled process where mm->locked_vm became
bigger than the virtual size of the process (!) because some
memory was accounted for twice:

Once when the page was mlocked and once when the Infiniband
layer increased the refcount because it needed to pin the RDMA
memory.

This patch introduces a separate counter for pinned pages and
accounts them separately.
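
A minimal user-space sketch of the accounting difference, assuming a
simplified per-process structure.  The struct, the pinned_vm field name
and all function names here are hypothetical illustrations, not the
kernel's definitions.

```c
/* Hypothetical, simplified per-process accounting; not struct mm_struct. */
struct mm_sketch {
    unsigned long locked_vm;  /* pages locked via mlock()/mlockall() */
    unsigned long pinned_vm;  /* pages pinned for direct device access */
};

/* Account pages that were mlocked. */
void account_mlock(struct mm_sketch *mm, unsigned long npages)
{
    mm->locked_vm += npages;
}

/* Old scheme: pinned pages are charged to the mlock counter as well. */
void account_pin_as_locked(struct mm_sketch *mm, unsigned long npages)
{
    mm->locked_vm += npages;
}

/* New scheme: pinned pages get their own counter. */
void account_pin(struct mm_sketch *mm, unsigned long npages)
{
    mm->pinned_vm += npages;
}
```

With the old scheme, ten pages that are both mlocked and pinned show up
as twenty locked pages; with separate counters each reads ten.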

Signed-off-by: Christoph Lameter <cl@linux.com>
Cc: Mike Marciniszyn <infinipath@qlogic.com>
Cc: Roland Dreier <roland@kernel.org>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-10-31 17:30:46 -07:00
Paul Gortmaker
fec14d2fce infiniband: add moduleparam.h to drivers/infiniband as required
These files were getting the moduleparam infrastructure from the
implicit presence of module.h being everywhere, but that is going
away soon.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:31:36 -04:00
Paul Gortmaker
b108d9764c infiniband: add in export.h for files using EXPORT_SYMBOL/THIS_MODULE
These were getting it implicitly via device.h --> module.h but
we are going to stop that when we clean up the headers.

Fix these in advance so the tree remains bisect-clean.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:31:35 -04:00
Paul Gortmaker
e4dd23d753 infiniband: Fix up module files that need to include module.h
They had been getting it implicitly via device.h, but we can't
rely on that in the future because of a pending cleanup, so fix
it now.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:31:35 -04:00
Paul Gortmaker
fc87af74af infiniband: Fix up users implicitly relying on getting stat.h
They get it via module.h (via device.h) but we want to clean that up.
When we do, we'll get things like:

  CC [M]  drivers/infiniband/core/sysfs.o
  sysfs.c:361: error: 'S_IRUGO' undeclared here (not in a function)
  sysfs.c:654: error: 'S_IWUSR' undeclared here (not in a function)

so add in the stat header it is using explicitly in advance.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:31:34 -04:00
Or Gerlitz
80a2dcd8d0 IB/mlx4: Don't set VLAN in IBoE WQEs' control segment
There's no need to set the vlan-related fields in an IBoE send WQE
control segment:

 - the vlan to be used by a UD QP is set in the datagram segment.
 - for GSI (CM) QP, all the headers down to 8021q and MAC are built by
   the software anyway.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-31 11:57:51 -07:00
Or Gerlitz
bcacb89756 IB/mlx4: Enable 4K mtu for IBoE
The IBoE port MTU is derived from the corresponding Ethernet netdevice
MTU, which can support jumbo frames of 9K, and hence surely supports
the max IB mtu of 4K.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-31 11:55:15 -07:00
Tom Tucker
d32ae393db RDMA/cxgb4: Mark QP in error before disabling the queue in firmware
QPs need to be moved to error before telling the firmware to shut down
the queue.  Otherwise, the application can submit WRs that will never
get fetched by the hardware and never flushed by the driver.

Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Acked-by: Steve Wise <swsie@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-31 11:36:08 -07:00
Kumar Sanghvi
581bbe2cd0 RDMA/cxgb4: Serialize calls to CQ's comp_handler
Commit 01e7da6ba5 ("RDMA/cxgb4: Make sure flush CQ entries are
collected on connection close") introduced a potential problem where a
CQ's comp_handler can get called simultaneously from different places
in the iw_cxgb4 driver.  This does not comply with
Documentation/infiniband/core_locking.txt, which states that at any
given point in time, only one callback per CQ should be active.

This problem was reported by Parav Pandit <Parav.Pandit@Emulex.Com>.
Based on discussion between Parav Pandit and Steve Wise, this patch
fixes the above problem by serializing the calls to a CQ's
comp_handler using a spin_lock.
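
The serialization idea can be sketched in user space as follows.  This
is an illustration under assumptions: a pthread mutex stands in for the
kernel spin lock, and every name below is hypothetical rather than
taken from the driver.

```c
#include <pthread.h>

typedef void (*comp_handler_t)(void *cq_context);

/* Hypothetical CQ wrapper; the lock guards every handler invocation. */
struct cq_sketch {
    pthread_mutex_t comp_handler_lock;
    comp_handler_t comp_handler;
    void *cq_context;
};

void cq_sketch_init(struct cq_sketch *cq, comp_handler_t handler, void *ctx)
{
    pthread_mutex_init(&cq->comp_handler_lock, NULL);
    cq->comp_handler = handler;
    cq->cq_context = ctx;
}

/* Every call site goes through here, so at most one callback per CQ
 * is active at any time. */
void cq_sketch_notify(struct cq_sketch *cq)
{
    pthread_mutex_lock(&cq->comp_handler_lock);
    cq->comp_handler(cq->cq_context);
    pthread_mutex_unlock(&cq->comp_handler_lock);
}

/* Example handler: counts how many events it has seen. */
void count_event(void *ctx)
{
    (*(int *)ctx)++;
}
```

Funnelling all call sites through one helper keeps the locking rule in
a single place instead of repeating it at each caller.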

Reported-by: Parav Pandit <Parav.Pandit@Emulex.Com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-31 11:34:53 -07:00
Kumar Sanghvi
f7cc25d018 RDMA/cxgb3: Serialize calls to CQ's comp_handler
iw_cxgb3 has a potential problem where a CQ's comp_handler can get
called simultaneously from different places in the iw_cxgb3 driver.
This does not comply with Documentation/infiniband/core_locking.txt,
which states that at any given point in time, only one callback per
CQ should be active.

Such a problem was reported by Parav Pandit <Parav.Pandit@Emulex.Com>
for the iw_cxgb4 driver.  Based on discussion between Parav Pandit and
Steve Wise, this patch fixes the above problem by serializing the
calls to a CQ's comp_handler using a spin_lock.

Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-31 11:33:17 -07:00
Mitko Haralanov
16d99812d5 IB/qib: Fix issue with link states and QSFP cables
Fix an issue where the link would come up after replugging a cable
even if it has been DISABLED manually.

Signed-off-by: Mitko Haralanov <mitko@qlogic.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-31 10:57:59 -07:00
Marcel Apfelbaum
a5e12dff75 IB/mlx4: Configure extended active speeds
Set the extended active speeds based on the hardware configuration.

Signed-off-by: Marcel Apfelbaum <marcela@dev.mellanox.co.il>
Reviewed-by: Hal Rosenstock <hal@mellanox.com>

[ Move FDR-10 handling into ib_link_query_port().  - Roland ]

Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-28 11:36:16 -07:00
Mitko Haralanov
dde05cbdf8 IB/qib: Hold links until tuning data is available
Hold the link state machine until the tuning data is read from the
QSFP EEPROM so correct tuning settings are applied before the state
machine attempts to bring the link up.  Link is also held on cable
unplug in case a different cable is used.

Signed-off-by: Mitko Haralanov <mitko@qlogic.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-21 15:08:20 -07:00
Mike Marciniszyn
44d75d3d92 IB/qib: Clean up checkpatch issue
This was probably present from initial submission.

Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-21 15:08:18 -07:00
Mike Marciniszyn
9fd5473deb IB/qib: Remove s_lock around header validation
Review of qib_ruc_check_hdr() shows that the s_lock is not required in
the normal case.  The r_lock is held in all cases, and protects the qp
fields that are read.

The s_lock will be needed around the call to qib_migrate_qp() to
ensure that the send engine sees a consistent set of fields.

Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-21 09:38:57 -07:00
Mike Marciniszyn
d0f2faf72d IB/qib: Precompute timeout jiffies to optimize latency
A new field is added to qib_qp called timeout_jiffies. It is
initialized upon create and modify.

The field is now used instead of a computation based on qp->timeout.
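
A rough sketch of the caching idea, assuming the usual InfiniBand
encoding of the timeout as 4.096 us * 2^timeout.  The struct, the tick
rate and the function names are hypothetical; the driver's actual
conversion may differ.

```c
#include <stdint.h>

/* Assumed tick rate for this sketch only. */
#define SKETCH_HZ 1000ULL

/* Hypothetical QP state with the cached, pre-converted timeout. */
struct qp_sketch {
    uint8_t  timeout;        /* encoded value: 4.096 us * 2^timeout */
    uint64_t timeout_ticks;  /* computed once at create/modify time */
};

/* Convert the encoded timeout to ticks, rounding up to a whole tick. */
static uint64_t timeout_to_ticks(uint8_t timeout)
{
    uint64_t usecs = (4096ULL << timeout) / 1000ULL;  /* ns -> us */
    return (usecs * SKETCH_HZ + 999999ULL) / 1000000ULL;
}

/* Called when the QP is created or modified, never on the fast path. */
void qp_sketch_set_timeout(struct qp_sketch *qp, uint8_t timeout)
{
    qp->timeout = timeout;
    qp->timeout_ticks = timeout_to_ticks(timeout);
}
```

The fast path then reads timeout_ticks directly instead of repeating
the shift and divisions for every packet.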

Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-21 09:38:56 -07:00
Mike Marciniszyn
af061a644a IB/qib: Use RCU for qpn lookup
The heavyweight spinlock in qib_lookup_qpn() is replaced with RCU.
The hash list itself is now accessed via jhash functions instead of mod.

The changes should benefit multiple receive contexts in different
processors by not contending for the lock just to read the hash
structures.

The patch also adds a lookaside_qp (pointer) and a lookaside_qpn in
the context.  The interrupt handler will test the current packet's qpn
against lookaside_qpn if the lookaside_qp pointer is non-NULL.  The
pointer is NULL'ed when the interrupt handler exits.

Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-21 09:38:54 -07:00
Mike Marciniszyn
9e1c0e4325 IB/qib: Eliminate divide/mod in converting idx to egr buf pointer
The context init now saves a shift, computed from rcvegrbufs_perchunk
using ilog2(), in rcvegrbufs_perchunk_shift.  A BUG_ON() protects the
power of 2 assumption.
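
The technique can be sketched in user space as below.  It assumes the
per-chunk count is a power of two; assert() stands in for BUG_ON(), and
the struct and helper names are hypothetical.

```c
#include <assert.h>

/* Integer log2 for a power of two (stand-in for the kernel's ilog2()). */
static unsigned int ilog2_sketch(unsigned int n)
{
    unsigned int shift = 0;

    while (n > 1) {
        n >>= 1;
        shift++;
    }
    return shift;
}

/* Hypothetical receive-context state. */
struct ctxt_sketch {
    unsigned int perchunk;        /* buffers per chunk, power of two */
    unsigned int perchunk_shift;  /* log2(perchunk), saved at init */
};

void ctxt_sketch_init(struct ctxt_sketch *c, unsigned int perchunk)
{
    /* The shift/mask trick is only valid for powers of two. */
    assert(perchunk != 0 && (perchunk & (perchunk - 1)) == 0);
    c->perchunk = perchunk;
    c->perchunk_shift = ilog2_sketch(perchunk);
}

/* Equivalent to idx / perchunk, without a divide. */
unsigned int chunk_of(const struct ctxt_sketch *c, unsigned int idx)
{
    return idx >> c->perchunk_shift;
}

/* Equivalent to idx % perchunk, without a modulo. */
unsigned int offset_in_chunk(const struct ctxt_sketch *c, unsigned int idx)
{
    return idx & (c->perchunk - 1);
}
```

With 32 buffers per chunk, index 70 maps to chunk 2 at offset 6, the
same result as 70 / 32 and 70 % 32.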

Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-21 09:38:52 -07:00
Mike Marciniszyn
cc6ea1385b IB/qib: Decode path MTU optimization
Store both the encoded and decoded MTU in the QP structure as a minor
optimization for UC/RC receive routines.

Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-21 09:38:50 -07:00
Mike Marciniszyn
2fc109c890 IB/qib: Optimize RC/UC code by IB operation
The memset for zeroing work completions had been unconditional.

This patch removes the memset and moves the zeroing into the work
completion with a more explicit field by field set.  With this patch,
non-ONLY/non-LAST packets will avoid the overhead since they will not
generate a completion.

Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-21 09:38:49 -07:00
Eric Dumazet
9e903e0852 net: add skb frag size accessors
To ease skb->truesize sanitization, it's better to be able to localize
all references to skb frags size.

Define accessors : skb_frag_size() to fetch frag size, and
skb_frag_size_{set|add|sub}() to manipulate it.
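
As a sketch of the accessor pattern only: the struct below is a
hypothetical stand-in for a fragment descriptor, and the function names
merely mirror the ones listed above rather than reproduce the kernel
definitions.

```c
/* Hypothetical fragment descriptor; callers never touch .size directly. */
struct frag_sketch {
    unsigned int page_offset;
    unsigned int size;
};

/* Fetch the fragment size. */
static inline unsigned int frag_sketch_size(const struct frag_sketch *frag)
{
    return frag->size;
}

/* Set the fragment size. */
static inline void frag_sketch_size_set(struct frag_sketch *frag,
                                        unsigned int size)
{
    frag->size = size;
}

/* Grow the fragment by delta bytes. */
static inline void frag_sketch_size_add(struct frag_sketch *frag, int delta)
{
    frag->size += delta;
}

/* Shrink the fragment by delta bytes. */
static inline void frag_sketch_size_sub(struct frag_sketch *frag, int delta)
{
    frag->size -= delta;
}
```

Because every read and write goes through these helpers, a later change
to how the size is stored or audited touches one place only.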

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-19 03:10:46 -04:00
Jonathan Lallinger
e14d62c05c RDMA/cxgb4: Use correct QID in insert_recv_cqe()
When creating flushed receive CQEs, set the QPID field in the t4_cqe
to the SQ QID and not the RQ QID.  Otherwise the poll code will not
find the correct QP context.

Signed-off by: Jonathan Lallinger <jonathan@ogc.us>
Signed-off by: Steve Wise <swise@ogc.us>

Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-14 14:23:40 -07:00
Kumar Sanghvi
01e7da6ba5 RDMA/cxgb4: Make sure flush CQ entries are collected on connection close
At the time when a peer closes the connection, iw_cxgb4 will not send
a cq event if ibqp.uobject exists.  In that case, it's possible for a
user application to get blocked in ibv_get_cq_event().

To resolve this, call the cq's comp_handler to unblock any read from
ibv_get_cq_event().  This will trigger userspace to poll the cq and
collect flush status completions for any pending work requests.

Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-14 14:23:04 -07:00
Sean Hefty
42849b2697 RDMA/uverbs: Export ib_open_qp() capability to user space
Allow processes that share the same XRC domain to open an existing
shareable QP.  This permits those processes to receive events on the
shared QP and transfer ownership, so that any process may modify the
QP.  The latter allows the creating process to exit, while a remaining
process can still transition it for path migration purposes.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-13 09:50:56 -07:00
Sean Hefty
0a1405da99 IB/mlx4: Add support for XRC QPs
Support the creation of XRC INI and TGT QPs.  To handle the case where
a CQ or PD is not provided, we allocate them internally with the xrcd.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-13 09:44:18 -07:00
Sean Hefty
18abd5ea57 IB/mlx4: Add support for XRC SRQs
Allow the user to create XRC SRQs.  This patch is based on a patch
from Jack Morgenstrein <jackm@dev.mellanox.co.il>.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-13 09:43:46 -07:00
Sean Hefty
012a8ff577 IB/mlx4: Add support for XRC domains
Support creating and destroying XRC domains.  Any sharing of the XRCD
is managed above the low-level driver.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-13 09:43:03 -07:00
Sean Hefty
96104eda01 RDMA/core: Add SRQ type field
Currently, there is only a single ("basic") type of SRQ, but with XRC
support we will add a second.  Prepare for this by defining an SRQ type
and setting all current users to IB_SRQT_BASIC.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-13 09:13:26 -07:00
Randy Dunlap
3e60a77ea2 IB/ipath: Add missing <linux/stat.h> in ipath_chip_init.c
Fix build errors:

    drivers/infiniband/hw/ipath/ipath_init_chip.c:54:1: error: 'S_IRUGO' undeclared here (not in a function)
    drivers/infiniband/hw/ipath/ipath_init_chip.c:54:1: error: bit-field '<anonymous>' width not an integer constant
    drivers/infiniband/hw/ipath/ipath_init_chip.c:67:1: error: 'S_IWUSR' undeclared here (not in a function)
    drivers/infiniband/hw/ipath/ipath_init_chip.c:67:1: error: bit-field '<anonymous>' width not an integer constant

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-10 12:01:22 -07:00
Faisal Latif
0f0bee8bbc RDMA/nes: Support for Packed And Unaligned fpdus
Support for Packed and Unaligned (PAU) FPDUs is needed for
interoperability between NES and non-NES nodes. When the NES hardware
detects a PAU frame, it will pass it to the driver to process the
frame.  The NES driver creates a new frame for each FPDU and forwards it
to the hardware to be sent to its associated qp.

Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com>
Signed-off-by: Faisal Latif <Faisal.Latif@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-10 10:54:47 -07:00
Faisal Latif
6224c7eeff RDMA/nes: Print IP address for critcal errors
Print the IP address of the remote host when a critical asynchronous event is
received.

Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com>
Signed-off-by: Faisal Latif <Faisal.Latif@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-10 10:51:21 -07:00