Commit Graph

141 Commits

Author SHA1 Message Date
Doug Ledford
082eaa5083 Merge branches 'nes', 'cxgb4' and 'iwpm' into k.o/for-4.6 2016-03-16 13:57:43 -04:00
Steve Wise
170003c894 iw_cxgb4: remove port mapper related code
Now that most of the port mapper code been moved to iwcm, we can remove
it from iw_cxgb4.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-16 13:48:21 -04:00
Hariprasad S
ac8e4c69a0 cxgb4/iw_cxgb4: TOS support
This series provides support for iWARP applications to specify a TOS
value and have that map to a VLAN Priority for iw_cxgb4 iWARP connections.

In iw_cxgb4, when allocating an L2T entry, pass the skb_priority based
on the tos value in the cm_id. Also pass the correct tos value during
connection setup so the passive side gets the client's desired tos.
When sending the FLOWC work request to FW, if the egress device is
in a vlan, then use the vlan priority bits as the scheduling class.
This allows associating RDMA connections with scheduling classes to
provide traffic shaping per flow.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-02-29 17:10:14 -05:00
Hariprasad S
6812faefb7 iw_cxgb4: remove false error log entry
Don't log errors if a listening endpoint is going away when procesing a
PASS_ACCEPT_REQ message.  This can happen.  Change the error printk to
a PDBG() debug log entry

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-02-29 17:10:14 -05:00
Linus Torvalds
048ccca8c1 Initial roundup of 4.5 merge window patches
- Remove usage of ib_query_device and instead store attributes in
   ib_device struct
 - Move iopoll out of block and into lib, rename to irqpoll, and use
   in several places in the rdma stack as our new completion queue
   polling library mechanism.  Update the other block drivers that
   already used iopoll to use the new mechanism too.
 - Replace the per-entry GID table locks with a single GID table lock
 - IPoIB multicast cleanup
 - Cleanups to the IB MR facility
 - Add support for 64bit extended IB counters
 - Fix for netlink oops while parsing RDMA nl messages
 - RoCEv2 support for the core IB code
 - mlx4 RoCEv2 support
 - mlx5 RoCEv2 support
 - Cross Channel support for mlx5
 - Timestamp support for mlx5
 - Atomic support for mlx5
 - Raw QP support for mlx5
 - MAINTAINERS update for mlx4/mlx5
 - Misc ocrdma, qib, nes, usNIC, cxgb3, cxgb4, mlx4, mlx5 updates
 - Add support for remote invalidate to the iSER driver (pushed through the
   RDMA tree due to dependencies, acknowledged by nab)
 - Update to NFSoRDMA (pushed through the RDMA tree due to dependencies,
   acknowledged by Bruce)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWoSygAAoJELgmozMOVy/dDjsP/2vbTda2MvQfkfkGEZBQdJSg
 095RN0gQgCJdg78lAl8yuaK8r4VN/7uefpDtFdudH1I/Pei7X0wxN9R1UzFNG4KR
 AD53lz92IVPs15328SbPR2kvNWISR9aBFQo3rlElq3Grqlp0EMn2Ou1vtu87rekF
 aMllxr8Nl0uZhP+eWusOsYpJUUtwirLgRnrAyfqo2UxZh/TMIroT0TCx1KXjVcAg
 dhDARiZAdu3OgSc6OsWqmH+DELEq6dFVA5F+DDBGAb8bFZqlJc7cuMHWInwNsNXT
 so4bnEQ835alTbsdYtqs5DUNS8heJTAJP4Uz0ehkTh/uNCcvnKeUTw1c2P/lXI1k
 7s33gMM+0FXj0swMBw0kKwAF2d9Hhus9UAN7NwjBuOyHcjGRd5q7SAnfWkvKx000
 s9jVW19slb2I38gB58nhjOh8s+vXUArgxnV1+kTia1+bJSR5swvVoWRicRXdF0vh
 TvLX/BjbSIU73g1TnnLNYoBTV3ybFKQ6bVdQW7fzSTDs54dsI1vvdHXi3bYZCpnL
 HVwQTZRfEzkvb0AdKbcvf8p/TlaAHem3ODqtO1eHvO4if1QJBSn+SptTEeJVYYdK
 n4B3l/dMoBH4JXJUmEHB9jwAvYOpv/YLAFIvdL7NFwbqGNsC3nfXFcmkVORB1W3B
 KEMcM2we4bz+uyKMjEAD
 =5oO7
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma

Pull rdma updates from Doug Ledford:
 "Initial roundup of 4.5 merge window patches

   - Remove usage of ib_query_device and instead store attributes in
     ib_device struct

   - Move iopoll out of block and into lib, rename to irqpoll, and use
     in several places in the rdma stack as our new completion queue
     polling library mechanism.  Update the other block drivers that
     already used iopoll to use the new mechanism too.

   - Replace the per-entry GID table locks with a single GID table lock

   - IPoIB multicast cleanup

   - Cleanups to the IB MR facility

   - Add support for 64bit extended IB counters

   - Fix for netlink oops while parsing RDMA nl messages

   - RoCEv2 support for the core IB code

   - mlx4 RoCEv2 support

   - mlx5 RoCEv2 support

   - Cross Channel support for mlx5

   - Timestamp support for mlx5

   - Atomic support for mlx5

   - Raw QP support for mlx5

   - MAINTAINERS update for mlx4/mlx5

   - Misc ocrdma, qib, nes, usNIC, cxgb3, cxgb4, mlx4, mlx5 updates

   - Add support for remote invalidate to the iSER driver (pushed
     through the RDMA tree due to dependencies, acknowledged by nab)

   - Update to NFSoRDMA (pushed through the RDMA tree due to
     dependencies, acknowledged by Bruce)"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (169 commits)
  IB/mlx5: Unify CQ create flags check
  IB/mlx5: Expose Raw Packet QP to user space consumers
  {IB, net}/mlx5: Move the modify QP operation table to mlx5_ib
  IB/mlx5: Support setting Ethernet priority for Raw Packet QPs
  IB/mlx5: Add Raw Packet QP query functionality
  IB/mlx5: Add create and destroy functionality for Raw Packet QP
  IB/mlx5: Refactor mlx5_ib_qp to accommodate other QP types
  IB/mlx5: Allocate a Transport Domain for each ucontext
  net/mlx5_core: Warn on unsupported events of QP/RQ/SQ
  net/mlx5_core: Add RQ and SQ event handling
  net/mlx5_core: Export transport objects
  IB/mlx5: Expose CQE version to user-space
  IB/mlx5: Add CQE version 1 support to user QPs and SRQs
  IB/mlx5: Fix data validation in mlx5_ib_alloc_ucontext
  IB/sa: Fix netlink local service GFP crash
  IB/srpt: Remove redundant wc array
  IB/qib: Improve ipoib UD performance
  IB/mlx4: Advertise RoCE v2 support
  IB/mlx4: Create and use another QP1 for RoCEv2
  IB/mlx4: Enable send of RoCE QP1 packets with IP/UDP headers
  ...
2016-01-23 18:45:06 -08:00
Hariprasad S
28de1f7437 iw_cxgb4: Take clip reference before starting IPv6 listen
The h/w is designed in such a way that, if you do anything IPv6
related, a valid clip entry must be there. So take clip reference
before creating IPv6 listening servers, and then if we fail to
create server, release the clip entry.

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-19 15:23:40 -05:00
Linus Torvalds
7d1fc01afc Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial tree updates from Jiri Kosina.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial:
  floppy: make local variable non-static
  exynos: fixes an incorrect header guard
  dt-bindings: fixes some incorrect header guards
  cpufreq-dt: correct dead link in documentation
  cpufreq: ARM big LITTLE: correct dead link in documentation
  treewide: Fix typos in printk
  Documentation: filesystem: Fix typo in fs/eventfd.c
  fs/super.c: use && instead of & for warn_on condition
  Documentation: fix sysfs-ptp
  lib: scatterlist: fix Kconfig description
2016-01-14 17:04:19 -08:00
Masanari Iida
e3d132d123 treewide: Fix typos in printk
This patch fix multiple spelling typos found in
various part of kernel.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2015-12-08 14:59:19 +01:00
Hariprasad S
963cab5082 iw_cxgb4: Adds support for T6 adapter
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-21 23:16:38 -04:00
Hariprasad S
3dd9a5dc24 iw_cxgb4: reverse the ord/ird in the ESTABLISHED upcall
The ESTABLISHED event should have the peer's ord/ird so
swap the values in the event before the upcall.

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-21 17:16:10 -04:00
Hariprasad S
f57b780c00 iw_cxgb4: fix misuse of ep->ord for minimum ird calculation
When calculating the minimum ird in c4iw_accept_cr(), we need to always
have a value of at least 1 if the RTR message is a 0B read.  The code
was
incorrectly using ep->ord for this logic which was incorrectly adjusting
the ird and causing incorrect ord/ird negotiation when using MPAv2 to
negotiate these values.

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-21 17:16:10 -04:00
Hariprasad S
158c776dba iw_cxgb4: pass the ord/ird in connect reply events
This allows client ULPs to get the negotiated ord/ird which is useful
to avoid stalling the SQ due to exceeding the ORD.

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-21 17:16:10 -04:00
Hariprasad S
99718e59fa iw_cxgb4: detect fatal errors while creating listening filters
In c4iw_create_listen(), if we're using listen filters, then bail out
of the busy loop if the device becomes fatally dead

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-21 17:16:10 -04:00
Nicholas Krause
54b9a96f10 IB/cxgb4: Fix if statement in pick_local_ip6adddrs
This fixes an if statement checking the return value of the function
get_lladdr for success in the function pick_local_ip6addrs to instead
of directly checking the return value of this call check the opposite
as get_lladdr returns zero for success which would incorrectly make
this if statement block not execute with the current if statement
check.

Signed-off-by: Nicholas Krause <xerofoify@gmail.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-09-03 14:03:59 -04:00
Hariprasad S
84cc6ac62d iw_cxgb4: Add support for clip
Add support for ipv6 address handling clip api provided by lld

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-28 22:54:48 -04:00
Hariprasad S
b8ac311246 iw_cxgb4: set the default MPA version to 2
This enables ORD/IRD negotiation and its about time to enable it by
default

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-28 22:54:47 -04:00
Steve Wise
940fd304d2 iw_cxgb4: use wildcard mapping for getting remote addr info
For listening endpoints bound to the wildcard address, we need to pass
the wildcard address mapping to iwpm_get_remote_info() instead of the
mapped address of the new child connection.

Without this fix, and with iwarp port mapping enabled, each iw_cxgb4
connection that is spawned from a listening endpoint bound to the wildcard
address, will generate an annoying dmesg entry about failing to find
the remote address mapping info, and the connection state displayed in
debugfs under /sys/kernel/debug/iw_cxgb4/<pci-slot-no>/eps  will not have
the peer's address/port mapping info.  The connection still works though.

Fixes: 5b6b8fe ("RDMA/cxgb4: Report the actual address of the remote connecting peer")

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com>
Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-11 17:16:49 -04:00
Hariprasad S
179d03bbfd iw_cxgb4: Remove negative advice dmesg warnings
Remove these log messages in favor of per-endpoint counters as well as
device-global counters that can be inspected via debugfs.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-05 13:21:27 -04:00
Steve Wise
5b6b8fe640 RDMA/cxgb4: Report the actual address of the remote connecting peer
Get the actual (non-mapped) ip/tcp address of the connecting peer from
the port mapper

Also setup the passive side endpoint to correctly display the actual
and mapped addresses for the new connection.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-05 09:18:01 -04:00
Hariprasad S
6198dd8d7a iw_cxgb4: 32b platform fixes
- get_dma_mr() was using ~0UL which is should be ~0ULL.  This causes the
DMA MR to get setup incorrectly in hardware.

- wr_log_show() needed a 64b divide function div64_u64() instead of
  doing
division directly.

- fixed warnings about recasting a pointer to a u64

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-05 09:18:01 -04:00
Hariprasad S
0b7410471d iw_cxgb4: Cleanup register defines/MACROS
Cleanup macros and register defines for consistency

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-05 09:18:01 -04:00
Hariprasad Shenai
cf7fe64aee iw_cxgb4: Cleanup register defines/MACROS defined in t4fw_ri_api.h
Cleanup all the MACROS that are defined in t4fw_ri_api.h and affected files

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-16 01:07:02 -05:00
Hariprasad Shenai
bdc590b99f iw_cxgb4/cxgb4/cxgb4vf/cxgb4i/csiostor: Cleanup register defines/macros related to all other cpl messages
This patch cleanups all other macros/register define related to
CPL messages that are defined in t4_msg.h and the affected files

Signed-off-by: Anish Bhatt <anish@chelsio.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-12 16:19:34 -05:00
Hariprasad Shenai
6c53e938a8 iw_cxgb4/cxgb4/cxgb4i: Cleanup register defines/MACROS related to CM CPL messages
This patch cleanups all macros/register define related to connection management
CPL messages that are defined in t4_msg.h and the affected files

Signed-off-by: Anish Bhatt <anish@chelsio.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-12 16:19:34 -05:00
Hariprasad S
e6b11163d4 RDMA/cxgb4: Handle NET_XMIT return codes
cxgb4_create_server() and cxgb4_create_server6() return NET_XMIT_*
values or a negative errno. iw_cxgb4 need to handle this correctly.

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-12-15 18:10:46 -08:00
Hariprasad Shenai
10be6b48fd RDMA/cxgb4: Fix locking issue in process_mpa_request
Fix the following lockdep report:

    =============================================
    [ INFO: possible recursive locking detected ]
    3.17.0+ #3 Tainted: G            E
    ---------------------------------------------
    kworker/u64:3/299 is trying to acquire lock:
     (&epc->mutex){+.+.+.}, at: [<ffffffffa074e07a>]
    process_mpa_request+0x1aa/0x3e0 [iw_cxgb4]

    but task is already holding lock:
     (&epc->mutex){+.+.+.}, at: [<ffffffffa074e34e>] rx_data+0x9e/0x1f0 [iw_cxgb4]

    other info that might help us debug this:
     Possible unsafe locking scenario:

           CPU0
           ----
      lock(&epc->mutex);
      lock(&epc->mutex);

     *** DEADLOCK ***

     May be due to missing lock nesting notation

    3 locks held by kworker/u64:3/299:
     #0:  ("%s""iw_cxgb4"){.+.+.+}, at: [<ffffffff8106f14d>]
    process_one_work+0x13d/0x4d0
     #1:  (skb_work){+.+.+.}, at: [<ffffffff8106f14d>] process_one_work+0x13d/0x4d0
     #2:  (&epc->mutex){+.+.+.}, at: [<ffffffffa074e34e>] rx_data+0x9e/0x1f0
    [iw_cxgb4]

    stack backtrace:
    CPU: 2 PID: 299 Comm: kworker/u64:3 Tainted: G            E  3.17.0+ #3
    Hardware name: Dell Inc. PowerEdge T110/0X744K, BIOS 1.2.1 01/28/2010
    Workqueue: iw_cxgb4 process_work [iw_cxgb4]
     ffff8800b91593d0 ffff8800b8a2f9f8 ffffffff815df107 0000000000000001
     ffff8800b9158750 ffff8800b8a2fa28 ffffffff8109f0e2 ffff8800bb768a00
     ffff8800b91593d0 ffff8800b9158750 0000000000000000 ffff8800b8a2fa88
    Call Trace:
     [<ffffffff815df107>] dump_stack+0x49/0x62
     [<ffffffff8109f0e2>] print_deadlock_bug+0xf2/0x100
     [<ffffffff810a0f04>] validate_chain+0x454/0x700
     [<ffffffff810a1574>] __lock_acquire+0x3c4/0x580
     [<ffffffffa074e07a>] ? process_mpa_request+0x1aa/0x3e0 [iw_cxgb4]
     [<ffffffff810a17cc>] lock_acquire+0x9c/0x110
     [<ffffffffa074e07a>] ? process_mpa_request+0x1aa/0x3e0 [iw_cxgb4]
     [<ffffffff815e111b>] mutex_lock_nested+0x4b/0x360
     [<ffffffffa074e07a>] ? process_mpa_request+0x1aa/0x3e0 [iw_cxgb4]
     [<ffffffff810c181a>] ? del_timer_sync+0xaa/0xd0
     [<ffffffff810c1770>] ? try_to_del_timer_sync+0x70/0x70
     [<ffffffffa074e07a>] process_mpa_request+0x1aa/0x3e0 [iw_cxgb4]
     [<ffffffffa074a3ec>] ? update_rx_credits+0xec/0x140 [iw_cxgb4]
     [<ffffffffa074e381>] rx_data+0xd1/0x1f0 [iw_cxgb4]
     [<ffffffff8109ff23>] ? mark_held_locks+0x73/0xa0
     [<ffffffff815e4b90>] ? _raw_spin_unlock_irqrestore+0x40/0x70
     [<ffffffff810a020d>] ? trace_hardirqs_on_caller+0xfd/0x1c0
     [<ffffffff810a02dd>] ? trace_hardirqs_on+0xd/0x10
     [<ffffffffa074c931>] process_work+0x51/0x80 [iw_cxgb4]
     [<ffffffff8106f1c8>] process_one_work+0x1b8/0x4d0
     [<ffffffff8106f14d>] ? process_one_work+0x13d/0x4d0
     [<ffffffff8106f600>] worker_thread+0x120/0x3c0
     [<ffffffff8106f4e0>] ? process_one_work+0x4d0/0x4d0
     [<ffffffff81074a0e>] kthread+0xde/0x100
     [<ffffffff815e4b40>] ? _raw_spin_unlock_irq+0x30/0x40
     [<ffffffff81074930>] ? __init_kthread_worker+0x70/0x70
     [<ffffffff815e512c>] ret_from_fork+0x7c/0xb0
     [<ffffffff81074930>] ? __init_kthread_worker+0x70/0x70

Based on original work by Steve Wise <swise@opengridcomputing.com>.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-12-15 18:10:46 -08:00
Hariprasad Shenai
5167865aaa RDMA/cxgb4/csiostor: Cleansup FW related macros/register defines for PF/VF and LDST
This patch cleanups PF/VF and LDST related macros/register defines that are
defined in t4fw_api.h and the affected files.

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-22 16:57:47 -05:00
Hariprasad Shenai
77a80e23cc RDMA/cxgb4: Cleanup Filter related macros/register defines
This patch cleanups all filter related macros/register defines that are defined
in t4fw_api.h and the affected files.

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-22 16:57:46 -05:00
Anish Bhatt
d7990b0c34 cxgb4i/cxgb4 : Refactor macros to conform to uniform standards
Refactored all macros used in cxgb4i as part of previously started cxgb4 macro
names cleanup. Makes them more uniform and avoids namespace collision.
Minor changes in other drivers where required as some of these macros are used
 by multiple drivers, affected drivers are iw_cxgb4, cxgb4(vf) & csiostor

Signed-off-by: Anish Bhatt <anish@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-13 14:36:22 -05:00
Hariprasad Shenai
e2ac962895 cxgb4: Cleanup macros so they follow the same style and look consistent, part 2
Various patches have ended up changing the style of the symbolic macros/register
defines to different style.

As a result, the current kernel.org files are a mix of different macro styles.
Since this macro/register defines is used by different drivers a
few patch series have ended up adding duplicate macro/register define entries
with different styles. This makes these register define/macro files a complete
mess and we want to make them clean and consistent. This patch cleans up a part
of it.

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-10 12:57:10 -05:00
Hariprasad S
da22b896b1 RDMA/cxgb4: Fix ntuple calculation for ipv6 and remove duplicate line
This fixes ntuple calculation for IPv6 active open request for T5
adapter.  And also removes an duplicate line which got added in commit
92e7ae7172 ("iw_cxgb4: Choose appropriate hw mtu index and ISS for
iWARP connections")

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-10-14 00:34:08 -07:00
Hariprasad S
d480201b22 RDMA/cxgb4: Add missing neigh_release in find_route
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-10-14 00:34:08 -07:00
Hariprasad S
04524a47c3 RDMA/cxgb4: Take IPv6 into account for best_mtu and set_emss
best_mtu and set_emss were not considering ipv6 header for ipv6 case.

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-10-14 00:34:08 -07:00
David S. Miller
8fd90bb889 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/infiniband/hw/cxgb4/device.c

The cxgb4 conflict was simply overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-22 00:44:59 -07:00
Hariprasad Shenai
dd92b12453 iw_cxgb4: log detailed warnings for negative advice
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-21 20:23:59 -07:00
Hariprasad Shenai
4c2c576322 cxgb4/iw_cxgb4: use firmware ord/ird resource limits
Advertise a larger max read queue depth for qps, and gather the resource limits
from fw and use them to avoid exhaustinq all the resources.

Design:

cxgb4:

Obtain the max_ordird_qp and max_ird_adapter device params from FW
at init time and pass them up to the ULDs when they attach.  If these
parameters are not available, due to older firmware, then hard-code
the values based on the known values for older firmware.
iw_cxgb4:

Fix the c4iw_query_device() to report these correct values based on
adapter parameters.  ibv_query_device() will always return:

max_qp_rd_atom = max_qp_init_rd_atom = min(module_max, max_ordird_qp)
max_res_rd_atom = max_ird_adapter

Bump up the per qp max module option to 32, allowing it to be increased
by the user up to the device max of max_ordird_qp.  32 seems to be
sufficient to maximize throughput for streaming read benchmarks.

Fail connection setup if the negotiated IRD exhausts the available
adapter ird resources.  So the driver will track the amount of ird
resource in use and not send an RI_WR/INIT to FW that would reduce the
available ird resources below zero.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-15 16:25:16 -07:00
Steve Wise
46c1376db1 RDMA/cxgb4: Call iwpm_init() only once
We need to only register with the iwpm core once.  Currently it is
being done for every adapter, which causes a failure for each adapter
but the first, making multiple adapters unusable.

Fixes: 9eccfe109b ("RDMA/cxgb4: Add support for iWARP Port Mapper user space service")
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-07-13 10:00:54 -07:00
Hariprasad S
5dab6d3ab1 RDMA/cxgb4: Clean up connection on ARP error
Based on origninal work by Steve Wise <swise@opengridcomputing.com>

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-07-08 16:56:54 -07:00
Hariprasad S
233b430103 RDMA/cxgb4: Fix skb_leak in reject_cr()
Based on origninal work by Steve Wise <swise@opengridcomputing.com>

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-07-08 16:56:54 -07:00
Hariprasad Shenai
35b1de5579 rdma/cxgb4: Fixes cxgb4 probe failure in VM when PF is exposed through PCI Passthrough
Change logic which determines our Physical Function at PCI Probe time.
Now we read the PL_WHOAMI register and get the Physical Function.

Pass Physical Function to Upper Layer Drivers in lld_info structure in the
new field "pf" added to lld_info.  This is useful for the cases where the
PF, say PF4, is attached to a Virtual Machine via some form of "PCI
Pass Through" technology and the PCI Function shows up as PF0 in the VM.

Based on original work by Casey Leedom <leedom@chelsio.com>

Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-01 18:56:10 -07:00
Linus Torvalds
f9da455b93 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) Seccomp BPF filters can now be JIT'd, from Alexei Starovoitov.

 2) Multiqueue support in xen-netback and xen-netfront, from Andrew J
    Benniston.

 3) Allow tweaking of aggregation settings in cdc_ncm driver, from Bjørn
    Mork.

 4) BPF now has a "random" opcode, from Chema Gonzalez.

 5) Add more BPF documentation and improve test framework, from Daniel
    Borkmann.

 6) Support TCP fastopen over ipv6, from Daniel Lee.

 7) Add software TSO helper functions and use them to support software
    TSO in mvneta and mv643xx_eth drivers.  From Ezequiel Garcia.

 8) Support software TSO in fec driver too, from Nimrod Andy.

 9) Add Broadcom SYSTEMPORT driver, from Florian Fainelli.

10) Handle broadcasts more gracefully over macvlan when there are large
    numbers of interfaces configured, from Herbert Xu.

11) Allow more control over fwmark used for non-socket based responses,
    from Lorenzo Colitti.

12) Do TCP congestion window limiting based upon measurements, from Neal
    Cardwell.

13) Support busy polling in SCTP, from Neal Horman.

14) Allow RSS key to be configured via ethtool, from Venkata Duvvuru.

15) Bridge promisc mode handling improvements from Vlad Yasevich.

16) Don't use inetpeer entries to implement ID generation any more, it
    performs poorly, from Eric Dumazet.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1522 commits)
  rtnetlink: fix userspace API breakage for iproute2 < v3.9.0
  tcp: fixing TLP's FIN recovery
  net: fec: Add software TSO support
  net: fec: Add Scatter/gather support
  net: fec: Increase buffer descriptor entry number
  net: fec: Factorize feature setting
  net: fec: Enable IP header hardware checksum
  net: fec: Factorize the .xmit transmit function
  bridge: fix compile error when compiling without IPv6 support
  bridge: fix smatch warning / potential null pointer dereference
  via-rhine: fix full-duplex with autoneg disable
  bnx2x: Enlarge the dorq threshold for VFs
  bnx2x: Check for UNDI in uncommon branch
  bnx2x: Fix 1G-baseT link
  bnx2x: Fix link for KR with swapped polarity lane
  sctp: Fix sk_ack_backlog wrap-around problem
  net/core: Add VF link state control policy
  net/fsl: xgmac_mdio is dependent on OF_MDIO
  net/fsl: Make xgmac_mdio read error message useful
  net_sched: drr: warn when qdisc is not work conserving
  ...
2014-06-12 14:27:40 -07:00
Hariprasad Shenai
b408ff282d iw_cxgb4: don't truncate the recv window size
Fixed a bug that shows up with recv window sizes that exceed the size of
the RCV_BUFSIZ field in opt0 (>= 1024K).  If the recv window exceeds
this, then we specify the max possible in opt0, add add the rest in via
a RX_DATA_ACK credits.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-10 22:49:54 -07:00
Hariprasad Shenai
92e7ae7172 iw_cxgb4: Choose appropriate hw mtu index and ISS for iWARP connections
Select the appropriate hw mtu index and initial sequence number to optimize
hw memory performance.

Add new cxgb4_best_aligned_mtu() which allows callers to provide enough
information to be used to [possibly] select an MTU which will result in the
TCP Data Segment Size (AKA Maximum Segment Size) to be an aligned value.

If an RTR message exhange is required, then align the ISS to 8B - 1 + 4, so
that after the SYN the send seqno will align on a 4B boundary. The RTR
message exchange will leave the send seqno aligned on an 8B boundary.
If an RTR is not required, then align the ISS to 8B - 1.  The goal is
to have the send seqno be 8B aligned when we send the first FPDU.

Based on original work by Casey Leedom <leeedom@chelsio.com> and
Steve Wise <swise@opengridcomputing.com>

Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-10 22:49:54 -07:00
Roland Dreier
eeaddf3670 Merge branches 'core', 'cxgb3', 'cxgb4', 'iser', 'iwpm', 'misc', 'mlx4', 'mlx5', 'noio', 'ocrdma', 'qib', 'srp' and 'usnic' into for-next 2014-06-10 10:12:14 -07:00
Steve Wise
9eccfe109b RDMA/cxgb4: Add support for iWARP Port Mapper user space service
Based on original work by Vipul Pandya.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>

[ Fix htons -> ntohs to make sparse happy.  - Roland ]

Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-06-10 10:12:06 -07:00
Steve Wise
11b8e22d4d RDMA/cxgb4: Fix vlan support
RDMA connections over a vlan interface don't work due to
import_ep() not using the correct egress device.

 - use the real device in import_ep()
 - use rdma_vlan_dev_real_dev() in get_real_dev().

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-05-19 18:00:32 -07:00
Steve Wise
92e5011ab0 RDMA/cxgb4: Force T5 connections to use TAHOE congestion control
This is required to work around a T5 HW issue.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-04-28 17:29:41 -07:00
Steve Wise
cc18b939e1 RDMA/cxgb4: Fix endpoint mutex deadlocks
In cases where the cm calls c4iw_modify_rc_qp() with the endpoint
mutex held, they must be called with internal == 1.  rx_data() and
process_mpa_reply() are not doing this.  This causes a deadlock
because c4iw_modify_rc_qp() might call c4iw_ep_disconnect() in some
!internal cases, and c4iw_ep_disconnect() acquires the endpoint mutex.
The design was intended to only do the disconnect for !internal calls.

Change rx_data(), FPDU_MODE case, to call c4iw_modify_rc_qp() with
internal == 1, and then disconnect only after releasing the mutex.

Change process_mpa_reply() to call c4iw_modify_rc_qp(TERMINATE) with
internal == 1 and set a new attr flag telling it to send a TERMINATE
message.  Previously this was implied by !internal.

Change process_mpa_reply() to return whether the caller should
disconnect after releasing the endpoint mutex.  Now rx_data() will do
the disconnect in the cases where process_mpa_reply() wants to
disconnect after the TERMINATE is sent.

Change c4iw_modify_rc_qp() RTS->TERM to only disconnect if !internal,
and to send a TERMINATE message if attrs->send_term is 1.

Change abort_connection() to not aquire the ep mutex for setting the
state, and make all calls to abort_connection() do so with the mutex
held.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-04-28 17:29:41 -07:00
Steve Wise
b33bd0cbfa RDMA/cxgb4: Endpoint timeout fixes
1) timedout endpoint processing can be starved. If there are continual
   CPL messages flowing into the driver, the endpoint timeout
   processing can be starved.  This condition exposed the other bugs
   below.

Solution: In process_work(), call process_timedout_eps() after each CPL
is processed.

2) Connection events can be processed even though the endpoint is on
   the timeout list.  If the endpoint is scheduled for timeout
   processing, then we must ignore MPA Start Requests and Replies.

Solution: Change stop_ep_timer() to return 1 if the ep has already been
queued for timeout processing.  All the callers of stop_ep_timer() need
to check this and act accordingly.  There are just a few cases where
the caller needs to do something different if stop_ep_timer() returns 1:

1) in process_mpa_reply(), ignore the reply and  process_timeout()
   will abort the connection.

2) in process_mpa_request, ignore the request and process_timeout()
   will abort the connection.

It is ok for callers of stop_ep_timer() to abort the connection since
that will leave the state in ABORTING or DEAD, and process_timeout()
now ignores timeouts when the ep is in these states.

3) Double insertion on the timeout list.  Since the endpoint timers
   are used for connection setup and teardown, we need to guard
   against the possibility that an endpoint is already on the timeout
   list.  This is a rare condition and only seen under heavy load and
   in the presense of the above 2 bugs.

Solution: In ep_timeout(), don't queue the endpoint if it is already on
the queue.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-04-11 11:36:07 -07:00
Linus Torvalds
877f075aac Main batch of InfiniBand/RDMA changes for 3.15:
- The biggest change is core API extensions and mlx5 low-level driver
    support for handling DIF/DIX-style protection information, and the
    addition of PI support to the iSER initiator.  Target support will be
    arriving shortly through the SCSI target tree.
 
  - A nice simplification to the "umem" memory pinning library now that
    we have chained sg lists.  Kudos to Yishai Hadas for realizing our
    code didn't have to be so crazy.
 
  - Another nice simplification to the sg wrappers used by qib, ipath and
    ehca to handle their mapping of memory to adapter.
 
  - The usual batch of fixes to bugs found by static checkers etc. from
    intrepid people like Dan Carpenter and Yann Droneaud.
 
  - A large batch of cxgb4, ocrdma, qib driver updates.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCAAGBQJTPYBnAAoJEENa44ZhAt0hGI4P/29eotGwpkANUQE6FQvxCUL2
 CXJtSg52lmYvGJrPK4IhihpbtQmHJz3iXEzlOOWidTw1dJgObR6vFaRymh7+vDLs
 CdzybMcXdasarqTuYeJbFzhkimpwtWWrMy/8Ik/Jj/5glGQ6cUSpdYZzVtFhYNqf
 hCGE8iLi+tuekJJj1htut5D6apXM7udcdc2yLJNOdsSj/VUXt1oqG1x9xAi9R8Tq
 7o8eFSStdlja0EBQ6Hli2zauCSnQkaUtr8h6EAFbcCtvBK8HqsHSc2gfq2ViFUiN
 ztt167oWoQnVkR0qCPL5nVt+CRQHHROprVXvbpcTI3aW61gNIl6OrUUOXefzHXac
 TNi+fdMpiEB/JQ4Z04Jzd1dGCSjYeTqPj4rO4meFjBmxRDdTgZHu7FWwejT1nYJ5
 d2abVdCOT+QWlIlM7m/pjdWJII5OYM+4/jtTayGepEaR4fTUzKtPZPBLNUBDBKE+
 4f92PC8LiuPkwJgb6XT96onPz1bDCOnPSEdwoKUFKPeGUcwgVOM/Wx5NU4Yf7rfg
 RxQwZ7mJXbjCYFlmGGo/0QDy6UEGkIFYlJSzooP+wlK1JvZ5h2M+9QKX2FtwzR+R
 I2kBxcTXWsM/h88R7MkNqbNIllmhssrJwmAE46OneZbfoBOB+JZjb4nLRTu0jEcS
 zn6f16GmJ37BKn2/qYY/
 =Ww6H
 -----END PGP SIGNATURE-----

Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

Pull infiniband updates from Roland Dreier:
 "Main batch of InfiniBand/RDMA changes for 3.15:

   - The biggest change is core API extensions and mlx5 low-level driver
     support for handling DIF/DIX-style protection information, and the
     addition of PI support to the iSER initiator.  Target support will
     be arriving shortly through the SCSI target tree.

   - A nice simplification to the "umem" memory pinning library now that
     we have chained sg lists.  Kudos to Yishai Hadas for realizing our
     code didn't have to be so crazy.

   - Another nice simplification to the sg wrappers used by qib, ipath
     and ehca to handle their mapping of memory to adapter.

   - The usual batch of fixes to bugs found by static checkers etc.
     from intrepid people like Dan Carpenter and Yann Droneaud.

   - A large batch of cxgb4, ocrdma, qib driver updates"

* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (102 commits)
  RDMA/ocrdma: Unregister inet notifier when unloading ocrdma
  RDMA/ocrdma: Fix warnings about pointer <-> integer casts
  RDMA/ocrdma: Code clean-up
  RDMA/ocrdma: Display FW version
  RDMA/ocrdma: Query controller information
  RDMA/ocrdma: Support non-embedded mailbox commands
  RDMA/ocrdma: Handle CQ overrun error
  RDMA/ocrdma: Display proper value for max_mw
  RDMA/ocrdma: Use non-zero tag in SRQ posting
  RDMA/ocrdma: Memory leak fix in ocrdma_dereg_mr()
  RDMA/ocrdma: Increment abi version count
  RDMA/ocrdma: Update version string
  be2net: Add abi version between be2net and ocrdma
  RDMA/ocrdma: ABI versioning between ocrdma and be2net
  RDMA/ocrdma: Allow DPP QP creation
  RDMA/ocrdma: Read ASIC_ID register to select asic_gen
  RDMA/ocrdma: SQ and RQ doorbell offset clean up
  RDMA/ocrdma: EQ full catastrophe avoidance
  RDMA/cxgb4: Disable DSGL use by default
  RDMA/cxgb4: rx_data() needs to hold the ep mutex
  ...
2014-04-03 16:57:19 -07:00