5.3 Merge window RDMA pull request

A smaller cycle this time. Notably we see another new driver, 'Soft
 iWarp', and the deletion of an ancient unused driver for nes.
 
 - Revise and simplify the signature offload RDMA MR APIs
 
 - More progress on hoisting object allocation boilerplate code out of the
   drivers
 
 - Driver bug fixes and revisions for hns, hfi1, efa, cxgb4, qib, i40iw
 
 - Tree-wide cleanups: struct_size, put_user_page, xarray, rst doc conversion
 
 - Removal of obsolete ib_ucm chardev and nes driver
 
 - Netlink-based discovery of chardevs and autoloading of the modules
   providing them
 
 - Move more of the rdmavt/hfi1 uapi to include/uapi/rdma
 
 - New driver 'siw' for software-based iWarp running on top of netdev,
   much like rxe's software RoCE.
 
 - mlx5 feature to report events in their raw devx format to userspace
 
 - Expose per-object counters through rdma tool
 
 - Adaptive interrupt moderation for RDMA (DIM), sharing the DIM core
   from netdev
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAl0ozSwACgkQOG33FX4g
 mxqncg//Qe2zSnlbd6r3hofsc1WiHSx/CiXtT52BUGipO+cWQUwO7hGFuUHIFCuZ
 JBg7mc998xkyLIH85a/txd+RwAIApKgHVdd+VlrmybZeYCiERAMFpWg8cHpzrbnw
 l3Ln9fTtJf/NAhO0ZCGV9DCd01fs9yVQgAv21UnLJMUhp9Pzk/iMhu7C7IiSLKvz
 t7iFhEqPXNJdoqZ+wtWyc/463YxKUd9XNg9Z1neQdaeZrX4UjgDbY9x/ub3zOvQV
 jc/IL4GysJ3z8mfx5mAd6sE/jAjhcnJuaGYYATqkxiLZEP+muYwU50CNs951XhJC
 b/EfRQIcLg9kq/u6CP+CuWlMrRWy3U7yj3/mrbbGhlGq88Yt6FGqUf0aFy6TYMaO
 RzTG5ZR+0AmsOrR1QU+DbH9CKX5PGZko6E7UCdjROqUlAUOjNwRr99O5mYrZoM9E
 PdN2vtdWY9COR3Q+7APdhWIA/MdN2vjr3LDsR3H94tru1yi6dB/BPDRcJieozaxn
 2T+YrZbV+9/YgrccpPQCilaQdanXKpkmbYkbEzVLPcOEV/lT9odFDt3eK+6duVDL
 ufu8fs1xapMDHKkcwo5jeNZcoSJymAvHmGfZlo2PPOmh802Ul60bvYKwfheVkhHF
 Eee5/ovCMs1NLqFiq7Zq5mXO0fR0BHyg9VVjJBZm2JtazyuhoHQ=
 =iWcG
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
 "A smaller cycle this time. Notably we see another new driver, 'Soft
  iWarp', and the deletion of an ancient unused driver for nes.

   - Revise and simplify the signature offload RDMA MR APIs

   - More progress on hoisting object allocation boilerplate code out
     of the drivers

   - Driver bug fixes and revisions for hns, hfi1, efa, cxgb4, qib,
     i40iw

   - Tree-wide cleanups: struct_size, put_user_page, xarray, rst doc
     conversion

   - Removal of obsolete ib_ucm chardev and nes driver

   - Netlink-based discovery of chardevs and autoloading of the modules
     providing them

   - Move more of the rdmavt/hfi1 uapi to include/uapi/rdma

   - New driver 'siw' for software-based iWarp running on top of netdev,
     much like rxe's software RoCE.

   - mlx5 feature to report events in their raw devx format to userspace

   - Expose per-object counters through rdma tool

   - Adaptive interrupt moderation for RDMA (DIM), sharing the DIM core
     from netdev"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (194 commits)
  RMDA/siw: Require a 64 bit arch
  RDMA/siw: Mark expected switch fall-throughs
  RDMA/core: Fix -Wunused-const-variable warnings
  rdma/siw: Remove set but not used variable 's'
  rdma/siw: Add missing dependencies on LIBCRC32C and DMA_VIRT_OPS
  RDMA/siw: Add missing rtnl_lock around access to ifa
  rdma/siw: Use proper enumerated type in map_cqe_status
  RDMA/siw: Remove unnecessary kthread create/destroy printouts
  IB/rdmavt: Fix variable shadowing issue in rvt_create_cq
  RDMA/core: Fix race when resolving IP address
  RDMA/core: Make rdma_counter.h compile stand alone
  IB/core: Work on the caller socket net namespace in nldev_newlink()
  RDMA/rxe: Fill in wc byte_len with IB_WC_RECV_RDMA_WITH_IMM
  RDMA/mlx5: Set RDMA DIM to be enabled by default
  RDMA/nldev: Added configuration of RDMA dynamic interrupt moderation to netlink
  RDMA/core: Provide RDMA DIM support for ULPs
  linux/dim: Implement RDMA adaptive moderation (DIM)
  IB/mlx5: Report correctly tag matching rendezvous capability
  docs: infiniband: add it to the driver-api bookset
  IB/mlx5: Implement VHCA tunnel mechanism in DEVX
  ...
Linus Torvalds 2019-07-15 20:38:15 -07:00
commit 2a3c389a0f
221 changed files with 18860 additions and 24846 deletions


@ -423,23 +423,6 @@ Description:
(e.g. driver restart on the VM which owns the VF).
sysfs interface for NetEffect RNIC Low-Level iWARP driver (nes)
---------------------------------------------------------------
What: /sys/class/infiniband/nesX/hw_rev
What: /sys/class/infiniband/nesX/hca_type
What: /sys/class/infiniband/nesX/board_id
Date: Feb, 2008
KernelVersion: v2.6.25
Contact: linux-rdma@vger.kernel.org
Description:
hw_rev: (RO) Hardware revision number
hca_type: (RO) Host Channel Adapter type (NEX020)
board_id: (RO) Manufacturing board id
sysfs interface for Chelsio T4/T5 RDMA driver (cxgb4)
-----------------------------------------------------


@ -90,6 +90,7 @@ needed).
driver-api/index
core-api/index
infiniband/index
media/index
networking/index
input/index


@ -1,4 +1,6 @@
INFINIBAND MIDLAYER LOCKING
===========================
InfiniBand Midlayer Locking
===========================
This guide is an attempt to make explicit the locking assumptions
made by the InfiniBand midlayer. It describes the requirements on
@ -6,45 +8,47 @@ INFINIBAND MIDLAYER LOCKING
protocols that use the midlayer.
Sleeping and interrupt context
==============================
With the following exceptions, a low-level driver implementation of
all of the methods in struct ib_device may sleep. The exceptions
are any methods from the list:
create_ah
modify_ah
query_ah
destroy_ah
post_send
post_recv
poll_cq
req_notify_cq
map_phys_fmr
- create_ah
- modify_ah
- query_ah
- destroy_ah
- post_send
- post_recv
- poll_cq
- req_notify_cq
- map_phys_fmr
which may not sleep and must be callable from any context.
The corresponding functions exported to upper level protocol
consumers:
ib_create_ah
ib_modify_ah
ib_query_ah
ib_destroy_ah
ib_post_send
ib_post_recv
ib_req_notify_cq
ib_map_phys_fmr
- ib_create_ah
- ib_modify_ah
- ib_query_ah
- ib_destroy_ah
- ib_post_send
- ib_post_recv
- ib_req_notify_cq
- ib_map_phys_fmr
are therefore safe to call from any context.
In addition, the function
ib_dispatch_event
- ib_dispatch_event
used by low-level drivers to dispatch asynchronous events through
the midlayer is also safe to call from any context.
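To make the any-context rule concrete, here is a minimal consumer-side sketch (not part of this patch; struct my_conn and its fields are hypothetical) that reposts a receive buffer from a completion event handler, which may run in hard interrupt context::

    #include <rdma/ib_verbs.h>

    struct my_conn {                        /* hypothetical consumer state */
            struct ib_qp *qp;
            struct ib_sge recv_sge;
    };

    static void my_cq_event_handler(struct ib_cq *cq, void *cq_context)
    {
            struct my_conn *conn = cq_context;
            const struct ib_recv_wr *bad_wr;
            struct ib_recv_wr wr = {
                    .wr_id   = (uintptr_t)conn,
                    .sg_list = &conn->recv_sge,
                    .num_sge = 1,
            };

            /* ib_post_recv() is on the any-context list above, so calling
             * it here is legal even if the handler runs in interrupt
             * context. */
            if (ib_post_recv(conn->qp, &wr, &bad_wr))
                    pr_warn("failed to repost receive buffer\n");
    }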
Reentrancy
----------
All of the methods in struct ib_device exported by a low-level
driver must be fully reentrant. The low-level driver is required to
@ -62,6 +66,7 @@ Reentrancy
information between different calls of ib_poll_cq() is not defined.
Callbacks
---------
A low-level driver must not perform a callback directly from the
same callchain as an ib_device method call. For example, it is not
@ -74,18 +79,18 @@ Callbacks
completion event handlers for the same CQ are not called
simultaneously. The driver must guarantee that only one CQ event
handler for a given CQ is running at a time. In other words, the
following situation is not allowed:
following situation is not allowed::
CPU1 CPU2
CPU1 CPU2
low-level driver ->
consumer CQ event callback:
/* ... */
ib_req_notify_cq(cq, ...);
low-level driver ->
/* ... */ consumer CQ event callback:
/* ... */
return from CQ event handler
low-level driver ->
consumer CQ event callback:
/* ... */
ib_req_notify_cq(cq, ...);
low-level driver ->
/* ... */ consumer CQ event callback:
/* ... */
return from CQ event handler
The context in which completion event and asynchronous event
callbacks run is not defined. Depending on the low-level driver, it
@ -93,6 +98,7 @@ Callbacks
Upper level protocol consumers may not sleep in a callback.
Hot-plug
--------
A low-level driver announces that a device is ready for use by
consumers when it calls ib_register_device(), all initialization


@ -0,0 +1,23 @@
.. SPDX-License-Identifier: GPL-2.0
==========
InfiniBand
==========
.. toctree::
:maxdepth: 1
core_locking
ipoib
opa_vnic
sysfs
tag_matching
user_mad
user_verbs
.. only:: subproject and html
Indices
=======
* :ref:`genindex`


@ -1,4 +1,6 @@
IP OVER INFINIBAND
==================
IP over InfiniBand
==================
The ib_ipoib driver is an implementation of the IP over InfiniBand
protocol as specified by RFC 4391 and 4392, issued by the IETF ipoib
@ -8,16 +10,17 @@ IP OVER INFINIBAND
masqueraded to the kernel as ethernet interfaces).
Partitions and P_Keys
=====================
When the IPoIB driver is loaded, it creates one interface for each
port using the P_Key at index 0. To create an interface with a
different P_Key, write the desired P_Key into the main interface's
/sys/class/net/<intf name>/create_child file. For example:
/sys/class/net/<intf name>/create_child file. For example::
echo 0x8001 > /sys/class/net/ib0/create_child
This will create an interface named ib0.8001 with P_Key 0x8001. To
remove a subinterface, use the "delete_child" file:
remove a subinterface, use the "delete_child" file::
echo 0x8001 > /sys/class/net/ib0/delete_child
@ -28,6 +31,7 @@ Partitions and P_Keys
rtnl_link_ops, where children created using either way behave the same.
Datagram vs Connected modes
===========================
The IPoIB driver supports two modes of operation: datagram and
connected. The mode is set and read through an interface's
@ -51,6 +55,7 @@ Datagram vs Connected modes
networking stack to use the smaller UD MTU for these neighbours.
Stateless offloads
==================
If the IB HW supports IPoIB stateless offloads, IPoIB advertises
TCP/IP checksum and/or Large Send (LSO) offloading capability to the
@ -60,9 +65,10 @@ Stateless offloads
on/off using ethtool calls. Currently LRO is supported only for
checksum offload capable devices.
Stateless offloads are supported only in datagram mode.
Stateless offloads are supported only in datagram mode.
Interrupt moderation
====================
If the underlying IB device supports CQ event moderation, one can
use ethtool to set interrupt mitigation parameters and thus reduce
@ -71,6 +77,7 @@ Interrupt moderation
moderation is supported.
Debugging Information
=====================
By compiling the IPoIB driver with CONFIG_INFINIBAND_IPOIB_DEBUG set
to 'y', tracing messages are compiled into the driver. They are
@ -79,7 +86,7 @@ Debugging Information
runtime through files in /sys/module/ib_ipoib/.
CONFIG_INFINIBAND_IPOIB_DEBUG also enables files in the debugfs
virtual filesystem. By mounting this filesystem, for example with
virtual filesystem. By mounting this filesystem, for example with::
mount -t debugfs none /sys/kernel/debug
@ -96,10 +103,13 @@ Debugging Information
performance, because it adds tests to the fast path.
References
==========
Transmission of IP over InfiniBand (IPoIB) (RFC 4391)
http://ietf.org/rfc/rfc4391.txt
http://ietf.org/rfc/rfc4391.txt
IP over InfiniBand (IPoIB) Architecture (RFC 4392)
http://ietf.org/rfc/rfc4392.txt
http://ietf.org/rfc/rfc4392.txt
IP over InfiniBand: Connected Mode (RFC 4755)
http://ietf.org/rfc/rfc4755.txt


@ -1,3 +1,7 @@
=================================================================
Intel Omni-Path (OPA) Virtual Network Interface Controller (VNIC)
=================================================================
Intel Omni-Path (OPA) Virtual Network Interface Controller (VNIC) feature
supports Ethernet functionality over Omni-Path fabric by encapsulating
the Ethernet packets between HFI nodes.
@ -17,70 +21,72 @@ an independent Ethernet network. The configuration is performed by an
Ethernet Manager (EM) which is part of the trusted Fabric Manager (FM)
application. HFI nodes can have multiple VNICs each connected to a
different virtual Ethernet switch. The below diagram presents a case
of two virtual Ethernet switches with two HFI nodes.
of two virtual Ethernet switches with two HFI nodes::
+-------------------+
| Subnet/ |
| Ethernet |
| Manager |
+-------------------+
/ /
/ /
/ /
/ /
+-----------------------------+ +------------------------------+
| Virtual Ethernet Switch | | Virtual Ethernet Switch |
| +---------+ +---------+ | | +---------+ +---------+ |
| | VPORT | | VPORT | | | | VPORT | | VPORT | |
+--+---------+----+---------+-+ +-+---------+----+---------+---+
| \ / |
| \ / |
| \/ |
| / \ |
| / \ |
+-----------+------------+ +-----------+------------+
| VNIC | VNIC | | VNIC | VNIC |
+-----------+------------+ +-----------+------------+
| HFI | | HFI |
+------------------------+ +------------------------+
+-------------------+
| Subnet/ |
| Ethernet |
| Manager |
+-------------------+
/ /
/ /
/ /
/ /
+-----------------------------+ +------------------------------+
| Virtual Ethernet Switch | | Virtual Ethernet Switch |
| +---------+ +---------+ | | +---------+ +---------+ |
| | VPORT | | VPORT | | | | VPORT | | VPORT | |
+--+---------+----+---------+-+ +-+---------+----+---------+---+
| \ / |
| \ / |
| \/ |
| / \ |
| / \ |
+-----------+------------+ +-----------+------------+
| VNIC | VNIC | | VNIC | VNIC |
+-----------+------------+ +-----------+------------+
| HFI | | HFI |
+------------------------+ +------------------------+
The Omni-Path encapsulated Ethernet packet format is as described below.
Bits Field
------------------------------------
==================== ================================
Bits Field
==================== ================================
Quad Word 0:
0-19 SLID (lower 20 bits)
20-30 Length (in Quad Words)
31 BECN bit
32-51 DLID (lower 20 bits)
52-56 SC (Service Class)
57-59 RC (Routing Control)
60 FECN bit
61-62 L2 (=10, 16B format)
63 LT (=1, Link Transfer Head Flit)
0-19 SLID (lower 20 bits)
20-30 Length (in Quad Words)
31 BECN bit
32-51 DLID (lower 20 bits)
52-56 SC (Service Class)
57-59 RC (Routing Control)
60 FECN bit
61-62 L2 (=10, 16B format)
63 LT (=1, Link Transfer Head Flit)
Quad Word 1:
0-7 L4 type (=0x78 ETHERNET)
8-11 SLID[23:20]
12-15 DLID[23:20]
16-31 PKEY
32-47 Entropy
48-63 Reserved
0-7 L4 type (=0x78 ETHERNET)
8-11 SLID[23:20]
12-15 DLID[23:20]
16-31 PKEY
32-47 Entropy
48-63 Reserved
Quad Word 2:
0-15 Reserved
16-31 L4 header
32-63 Ethernet Packet
0-15 Reserved
16-31 L4 header
32-63 Ethernet Packet
Quad Words 3 to N-1:
0-63 Ethernet packet (pad extended)
0-63 Ethernet packet (pad extended)
Quad Word N (last):
0-23 Ethernet packet (pad extended)
24-55 ICRC
56-61 Tail
62-63 LT (=01, Link Transfer Tail Flit)
0-23 Ethernet packet (pad extended)
24-55 ICRC
56-61 Tail
62-63 LT (=01, Link Transfer Tail Flit)
==================== ================================
Ethernet packet is padded on the transmit side to ensure that the VNIC OPA
packet is quad word aligned. The 'Tail' field contains the number of bytes
@ -123,7 +129,7 @@ operation. It also handles the encapsulation of Ethernet packets with an
Omni-Path header in the transmit path. For each VNIC interface, the
information required for encapsulation is configured by the EM via VEMA MAD
interface. It also passes any control information to the HW dependent driver
by invoking the RDMA netdev control operations.
by invoking the RDMA netdev control operations::
+-------------------+ +----------------------+
| | | Linux |


@ -1,4 +1,6 @@
SYSFS FILES
===========
Sysfs files
===========
The sysfs interface has moved to
Documentation/ABI/stable/sysfs-class-infiniband.


@ -1,12 +1,16 @@
==================
Tag matching logic
==================
The MPI standard defines a set of rules, known as tag-matching, for matching
source send operations to destination receives. The following parameters must
match the following source and destination parameters:
* Communicator
* User tag - wild card may be specified by the receiver
* Source rank - wild card may be specified by the receiver
* Destination rank
The ordering rules require that when more than one pair of send and receive
message envelopes may match, the pair that includes the earliest posted-send
and the earliest posted-receive is the pair that must be used to satisfy the
@ -35,6 +39,7 @@ the header to initiate an RDMA READ operation directly to the matching buffer.
A fin message needs to be received in order for the buffer to be reused.
Tag matching implementation
===========================
There are two types of matching objects used, the posted receive list and the
unexpected message list. The application posts receive buffers through calls


@ -1,6 +1,9 @@
USERSPACE MAD ACCESS
====================
Userspace MAD access
====================
Device files
============
Each port of each InfiniBand device has a "umad" device and an
"issm" device attached. For example, a two-port HCA will have two
@ -8,12 +11,13 @@ Device files
device of each type (for switch port 0).
Creating MAD agents
===================
A MAD agent can be created by filling in a struct ib_user_mad_reg_req
and then calling the IB_USER_MAD_REGISTER_AGENT ioctl on a file
descriptor for the appropriate device file. If the registration
request succeeds, a 32-bit id will be returned in the structure.
For example:
For example::
struct ib_user_mad_reg_req req = { /* ... */ };
ret = ioctl(fd, IB_USER_MAD_REGISTER_AGENT, (char *) &req);
@ -26,12 +30,14 @@ Creating MAD agents
ioctl. Also, all agents registered through a file descriptor will
be unregistered when the descriptor is closed.
2014 -- a new registration ioctl is now provided which allows additional
2014
a new registration ioctl is now provided which allows additional
fields to be provided during registration.
Users of this registration call are implicitly setting the use of
pkey_index (see below).
Receiving MADs
==============
MADs are received using read(). The receive side now supports
RMPP. The buffer passed to read() must be at least one
@ -41,7 +47,8 @@ Receiving MADs
MAD (RMPP), the errno is set to ENOSPC and the length of the
buffer needed is set in mad.length.
Example for normal MAD (non RMPP) reads:
Example for normal MAD (non RMPP) reads::
struct ib_user_mad *mad;
mad = malloc(sizeof *mad + 256);
ret = read(fd, mad, sizeof *mad + 256);
@ -50,7 +57,8 @@ Receiving MADs
free(mad);
}
Example for RMPP reads:
Example for RMPP reads::
struct ib_user_mad *mad;
mad = malloc(sizeof *mad + 256);
ret = read(fd, mad, sizeof *mad + 256);
@ -76,11 +84,12 @@ Receiving MADs
poll()/select() may be used to wait until a MAD can be read.
Sending MADs
============
MADs are sent using write(). The agent ID for sending should be
filled into the id field of the MAD, the destination LID should be
filled into the lid field, and so on. The send side does support
RMPP so arbitrary length MAD can be sent. For example:
RMPP so arbitrary length MAD can be sent. For example::
struct ib_user_mad *mad;
@ -97,6 +106,7 @@ Sending MADs
perror("write");
Transaction IDs
===============
Users of the umad devices can use the lower 32 bits of the
transaction ID field (that is, the least significant half of the
@ -105,6 +115,7 @@ Transaction IDs
the kernel and will be overwritten before a MAD is sent.
P_Key Index Handling
====================
The old ib_umad interface did not allow setting the P_Key index for
MADs that are sent and did not provide a way for obtaining the P_Key
@ -119,6 +130,7 @@ P_Key Index Handling
default, and the IB_USER_MAD_ENABLE_PKEY ioctl will be removed.
Setting IsSM Capability Bit
===========================
To set the IsSM capability bit for a port, simply open the
corresponding issm device file. If the IsSM bit is already set,
@ -129,25 +141,26 @@ Setting IsSM Capability Bit
the issm file.
/dev files
==========
To create the appropriate character device files automatically with
udev, a rule like
udev, a rule like::
KERNEL=="umad*", NAME="infiniband/%k"
KERNEL=="issm*", NAME="infiniband/%k"
can be used. This will create device nodes named
can be used. This will create device nodes named::
/dev/infiniband/umad0
/dev/infiniband/issm0
for the first port, and so on. The InfiniBand device and port
associated with these devices can be determined from the files
associated with these devices can be determined from the files::
/sys/class/infiniband_mad/umad0/ibdev
/sys/class/infiniband_mad/umad0/port
and
and::
/sys/class/infiniband_mad/issm0/ibdev
/sys/class/infiniband_mad/issm0/port


@ -1,4 +1,6 @@
USERSPACE VERBS ACCESS
======================
Userspace verbs access
======================
The ib_uverbs module, built by enabling CONFIG_INFINIBAND_USER_VERBS,
enables direct userspace access to IB hardware via "verbs," as
@ -13,6 +15,7 @@ USERSPACE VERBS ACCESS
libmthca userspace driver be installed.
User-kernel communication
=========================
Userspace communicates with the kernel for slow path, resource
management operations via the /dev/infiniband/uverbsN character
@ -28,6 +31,7 @@ User-kernel communication
system call.
Resource management
===================
Since creation and destruction of all IB resources is done by
commands passed through a file descriptor, the kernel can keep track
@ -41,6 +45,7 @@ Resource management
prevent one process from touching another process's resources.
Memory pinning
==============
Direct userspace I/O requires that memory regions that are potential
I/O targets be kept resident at the same physical address. The
@ -54,13 +59,14 @@ Memory pinning
number of pages pinned by a process.
/dev files
==========
To create the appropriate character device files automatically with
udev, a rule like
udev, a rule like::
KERNEL=="uverbs*", NAME="infiniband/%k"
can be used. This will create device nodes named
can be used. This will create device nodes named::
/dev/infiniband/uverbs0


@ -11018,14 +11018,6 @@ F: driver/net/net_failover.c
F: include/net/net_failover.h
F: Documentation/networking/net_failover.rst
NETEFFECT IWARP RNIC DRIVER (IW_NES)
M: Faisal Latif <faisal.latif@intel.com>
L: linux-rdma@vger.kernel.org
W: http://www.intel.com/Products/Server/Adapters/Server-Cluster/Server-Cluster-overview.htm
S: Supported
F: drivers/infiniband/hw/nes/
F: include/uapi/rdma/nes-abi.h
NETEM NETWORK EMULATOR
M: Stephen Hemminger <stephen@networkplumber.org>
L: netem@lists.linux-foundation.org (moderated for non-subscribers)
@ -14755,6 +14747,13 @@ M: Chris Boot <bootc@bootc.net>
S: Maintained
F: drivers/leds/leds-net48xx.c
SOFT-IWARP DRIVER (siw)
M: Bernard Metzler <bmt@zurich.ibm.com>
L: linux-rdma@vger.kernel.org
S: Supported
F: drivers/infiniband/sw/siw/
F: include/uapi/rdma/siw-abi.h
SOFT-ROCE DRIVER (rxe)
M: Moni Shoua <monis@mellanox.com>
L: linux-rdma@vger.kernel.org


@ -7,6 +7,7 @@ menuconfig INFINIBAND
depends on m || IPV6 != m
depends on !ALPHA
select IRQ_POLL
select DIMLIB
---help---
Core support for InfiniBand (IB). Make sure to also select
any protocols you wish to use as well as drivers for your
@ -36,17 +37,6 @@ config INFINIBAND_USER_ACCESS
libibverbs, libibcm and a hardware driver library from
rdma-core <https://github.com/linux-rdma/rdma-core>.
config INFINIBAND_USER_ACCESS_UCM
tristate "Userspace CM (UCM, DEPRECATED)"
depends on BROKEN || COMPILE_TEST
depends on INFINIBAND_USER_ACCESS
help
The UCM module has known security flaws, which no one is
interested to fix. The user-space part of this code was
dropped from the upstream a long time ago.
This option is DEPRECATED and planned to be removed.
config INFINIBAND_EXP_LEGACY_VERBS_NEW_UAPI
bool "Allow experimental legacy verbs in new ioctl uAPI (EXPERIMENTAL)"
depends on INFINIBAND_USER_ACCESS
@ -98,7 +88,6 @@ source "drivers/infiniband/hw/efa/Kconfig"
source "drivers/infiniband/hw/i40iw/Kconfig"
source "drivers/infiniband/hw/mlx4/Kconfig"
source "drivers/infiniband/hw/mlx5/Kconfig"
source "drivers/infiniband/hw/nes/Kconfig"
source "drivers/infiniband/hw/ocrdma/Kconfig"
source "drivers/infiniband/hw/vmw_pvrdma/Kconfig"
source "drivers/infiniband/hw/usnic/Kconfig"
@ -108,6 +97,7 @@ source "drivers/infiniband/hw/hfi1/Kconfig"
source "drivers/infiniband/hw/qedr/Kconfig"
source "drivers/infiniband/sw/rdmavt/Kconfig"
source "drivers/infiniband/sw/rxe/Kconfig"
source "drivers/infiniband/sw/siw/Kconfig"
endif
source "drivers/infiniband/ulp/ipoib/Kconfig"


@ -6,13 +6,12 @@ obj-$(CONFIG_INFINIBAND) += ib_core.o ib_cm.o iw_cm.o \
$(infiniband-y)
obj-$(CONFIG_INFINIBAND_USER_MAD) += ib_umad.o
obj-$(CONFIG_INFINIBAND_USER_ACCESS) += ib_uverbs.o $(user_access-y)
obj-$(CONFIG_INFINIBAND_USER_ACCESS_UCM) += ib_ucm.o $(user_access-y)
ib_core-y := packer.o ud_header.o verbs.o cq.o rw.o sysfs.o \
device.o fmr_pool.o cache.o netlink.o \
roce_gid_mgmt.o mr_pool.o addr.o sa_query.o \
multicast.o mad.o smi.o agent.o mad_rmpp.o \
nldev.o restrack.o
nldev.o restrack.o counters.o
ib_core-$(CONFIG_SECURITY_INFINIBAND) += security.o
ib_core-$(CONFIG_CGROUP_RDMA) += cgroup.o
@ -29,8 +28,6 @@ rdma_ucm-y := ucma.o
ib_umad-y := user_mad.o
ib_ucm-y := ucm.o
ib_uverbs-y := uverbs_main.o uverbs_cmd.o uverbs_marshall.o \
rdma_core.o uverbs_std_types.o uverbs_ioctl.o \
uverbs_std_types_cq.o \


@ -337,7 +337,7 @@ static int dst_fetch_ha(const struct dst_entry *dst,
neigh_event_send(n, NULL);
ret = -ENODATA;
} else {
memcpy(dev_addr->dst_dev_addr, n->ha, MAX_ADDR_LEN);
neigh_ha_snapshot(dev_addr->dst_dev_addr, n, dst->dev);
}
neigh_release(n);


@ -60,6 +60,7 @@ extern bool ib_devices_shared_netns;
int ib_device_register_sysfs(struct ib_device *device);
void ib_device_unregister_sysfs(struct ib_device *device);
int ib_device_rename(struct ib_device *ibdev, const char *name);
int ib_device_set_dim(struct ib_device *ibdev, u8 use_dim);
typedef void (*roce_netdev_callback)(struct ib_device *device, u8 port,
struct net_device *idev, void *cookie);
@ -88,6 +89,15 @@ typedef int (*nldev_callback)(struct ib_device *device,
int ib_enum_all_devs(nldev_callback nldev_cb, struct sk_buff *skb,
struct netlink_callback *cb);
struct ib_client_nl_info {
struct sk_buff *nl_msg;
struct device *cdev;
unsigned int port;
u64 abi;
};
int ib_get_client_nl_info(struct ib_device *ibdev, const char *client_name,
struct ib_client_nl_info *res);
enum ib_cache_gid_default_mode {
IB_CACHE_GID_DEFAULT_MODE_SET,
IB_CACHE_GID_DEFAULT_MODE_DELETE


@ -0,0 +1,634 @@
// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
/*
* Copyright (c) 2019 Mellanox Technologies. All rights reserved.
*/
#include <rdma/ib_verbs.h>
#include <rdma/rdma_counter.h>
#include "core_priv.h"
#include "restrack.h"
#define ALL_AUTO_MODE_MASKS (RDMA_COUNTER_MASK_QP_TYPE)
static int __counter_set_mode(struct rdma_counter_mode *curr,
enum rdma_nl_counter_mode new_mode,
enum rdma_nl_counter_mask new_mask)
{
if ((new_mode == RDMA_COUNTER_MODE_AUTO) &&
((new_mask & (~ALL_AUTO_MODE_MASKS)) ||
(curr->mode != RDMA_COUNTER_MODE_NONE)))
return -EINVAL;
curr->mode = new_mode;
curr->mask = new_mask;
return 0;
}
/**
* rdma_counter_set_auto_mode() - Turn on/off per-port auto mode
*
* When @on is true, the @mask must be set; When @on is false, it goes
* into manual mode if there's any counter, so that the user is able to
* manually access them.
*/
int rdma_counter_set_auto_mode(struct ib_device *dev, u8 port,
bool on, enum rdma_nl_counter_mask mask)
{
struct rdma_port_counter *port_counter;
int ret;
port_counter = &dev->port_data[port].port_counter;
mutex_lock(&port_counter->lock);
if (on) {
ret = __counter_set_mode(&port_counter->mode,
RDMA_COUNTER_MODE_AUTO, mask);
} else {
if (port_counter->mode.mode != RDMA_COUNTER_MODE_AUTO) {
ret = -EINVAL;
goto out;
}
if (port_counter->num_counters)
ret = __counter_set_mode(&port_counter->mode,
RDMA_COUNTER_MODE_MANUAL, 0);
else
ret = __counter_set_mode(&port_counter->mode,
RDMA_COUNTER_MODE_NONE, 0);
}
out:
mutex_unlock(&port_counter->lock);
return ret;
}
static struct rdma_counter *rdma_counter_alloc(struct ib_device *dev, u8 port,
enum rdma_nl_counter_mode mode)
{
struct rdma_port_counter *port_counter;
struct rdma_counter *counter;
int ret;
if (!dev->ops.counter_dealloc || !dev->ops.counter_alloc_stats)
return NULL;
counter = kzalloc(sizeof(*counter), GFP_KERNEL);
if (!counter)
return NULL;
counter->device = dev;
counter->port = port;
counter->res.type = RDMA_RESTRACK_COUNTER;
counter->stats = dev->ops.counter_alloc_stats(counter);
if (!counter->stats)
goto err_stats;
port_counter = &dev->port_data[port].port_counter;
mutex_lock(&port_counter->lock);
if (mode == RDMA_COUNTER_MODE_MANUAL) {
ret = __counter_set_mode(&port_counter->mode,
RDMA_COUNTER_MODE_MANUAL, 0);
if (ret)
goto err_mode;
}
port_counter->num_counters++;
mutex_unlock(&port_counter->lock);
counter->mode.mode = mode;
kref_init(&counter->kref);
mutex_init(&counter->lock);
return counter;
err_mode:
mutex_unlock(&port_counter->lock);
kfree(counter->stats);
err_stats:
kfree(counter);
return NULL;
}
static void rdma_counter_free(struct rdma_counter *counter)
{
struct rdma_port_counter *port_counter;
port_counter = &counter->device->port_data[counter->port].port_counter;
mutex_lock(&port_counter->lock);
port_counter->num_counters--;
if (!port_counter->num_counters &&
(port_counter->mode.mode == RDMA_COUNTER_MODE_MANUAL))
__counter_set_mode(&port_counter->mode, RDMA_COUNTER_MODE_NONE,
0);
mutex_unlock(&port_counter->lock);
rdma_restrack_del(&counter->res);
kfree(counter->stats);
kfree(counter);
}
static void auto_mode_init_counter(struct rdma_counter *counter,
const struct ib_qp *qp,
enum rdma_nl_counter_mask new_mask)
{
struct auto_mode_param *param = &counter->mode.param;
counter->mode.mode = RDMA_COUNTER_MODE_AUTO;
counter->mode.mask = new_mask;
if (new_mask & RDMA_COUNTER_MASK_QP_TYPE)
param->qp_type = qp->qp_type;
}
static bool auto_mode_match(struct ib_qp *qp, struct rdma_counter *counter,
enum rdma_nl_counter_mask auto_mask)
{
struct auto_mode_param *param = &counter->mode.param;
bool match = true;
if (rdma_is_kernel_res(&counter->res) != rdma_is_kernel_res(&qp->res))
return false;
/* Ensure that counter belong to right PID */
if (!rdma_is_kernel_res(&counter->res) &&
!rdma_is_kernel_res(&qp->res) &&
(task_pid_vnr(counter->res.task) != current->pid))
return false;
if (auto_mask & RDMA_COUNTER_MASK_QP_TYPE)
match &= (param->qp_type == qp->qp_type);
return match;
}
static int __rdma_counter_bind_qp(struct rdma_counter *counter,
struct ib_qp *qp)
{
int ret;
if (qp->counter)
return -EINVAL;
if (!qp->device->ops.counter_bind_qp)
return -EOPNOTSUPP;
mutex_lock(&counter->lock);
ret = qp->device->ops.counter_bind_qp(counter, qp);
mutex_unlock(&counter->lock);
return ret;
}
static int __rdma_counter_unbind_qp(struct ib_qp *qp)
{
struct rdma_counter *counter = qp->counter;
int ret;
if (!qp->device->ops.counter_unbind_qp)
return -EOPNOTSUPP;
mutex_lock(&counter->lock);
ret = qp->device->ops.counter_unbind_qp(qp);
mutex_unlock(&counter->lock);
return ret;
}
static void counter_history_stat_update(const struct rdma_counter *counter)
{
struct ib_device *dev = counter->device;
struct rdma_port_counter *port_counter;
int i;
port_counter = &dev->port_data[counter->port].port_counter;
if (!port_counter->hstats)
return;
for (i = 0; i < counter->stats->num_counters; i++)
port_counter->hstats->value[i] += counter->stats->value[i];
}
/**
* rdma_get_counter_auto_mode - Find the counter that @qp should be bound
* with in auto mode
*
* Return: The counter (with ref-count increased) if found
*/
static struct rdma_counter *rdma_get_counter_auto_mode(struct ib_qp *qp,
u8 port)
{
struct rdma_port_counter *port_counter;
struct rdma_counter *counter = NULL;
struct ib_device *dev = qp->device;
struct rdma_restrack_entry *res;
struct rdma_restrack_root *rt;
unsigned long id = 0;
port_counter = &dev->port_data[port].port_counter;
rt = &dev->res[RDMA_RESTRACK_COUNTER];
xa_lock(&rt->xa);
xa_for_each(&rt->xa, id, res) {
if (!rdma_is_visible_in_pid_ns(res))
continue;
counter = container_of(res, struct rdma_counter, res);
if ((counter->device != qp->device) || (counter->port != port))
goto next;
if (auto_mode_match(qp, counter, port_counter->mode.mask))
break;
next:
counter = NULL;
}
if (counter && !kref_get_unless_zero(&counter->kref))
counter = NULL;
xa_unlock(&rt->xa);
return counter;
}
static void rdma_counter_res_add(struct rdma_counter *counter,
struct ib_qp *qp)
{
if (rdma_is_kernel_res(&qp->res)) {
rdma_restrack_set_task(&counter->res, qp->res.kern_name);
rdma_restrack_kadd(&counter->res);
} else {
rdma_restrack_attach_task(&counter->res, qp->res.task);
rdma_restrack_uadd(&counter->res);
}
}
static void counter_release(struct kref *kref)
{
struct rdma_counter *counter;
counter = container_of(kref, struct rdma_counter, kref);
counter_history_stat_update(counter);
counter->device->ops.counter_dealloc(counter);
rdma_counter_free(counter);
}
/**
* rdma_counter_bind_qp_auto - Check and bind the QP to a counter base on
* the auto-mode rule
*/
int rdma_counter_bind_qp_auto(struct ib_qp *qp, u8 port)
{
struct rdma_port_counter *port_counter;
struct ib_device *dev = qp->device;
struct rdma_counter *counter;
int ret;
if (!rdma_is_port_valid(dev, port))
return -EINVAL;
port_counter = &dev->port_data[port].port_counter;
if (port_counter->mode.mode != RDMA_COUNTER_MODE_AUTO)
return 0;
counter = rdma_get_counter_auto_mode(qp, port);
if (counter) {
ret = __rdma_counter_bind_qp(counter, qp);
if (ret) {
kref_put(&counter->kref, counter_release);
return ret;
}
} else {
counter = rdma_counter_alloc(dev, port, RDMA_COUNTER_MODE_AUTO);
if (!counter)
return -ENOMEM;
auto_mode_init_counter(counter, qp, port_counter->mode.mask);
ret = __rdma_counter_bind_qp(counter, qp);
if (ret) {
rdma_counter_free(counter);
return ret;
}
rdma_counter_res_add(counter, qp);
}
return 0;
}
/**
* rdma_counter_unbind_qp - Unbind a qp from a counter
* @force:
* true - Decrease the counter ref-count anyway (e.g., qp destroy)
*/
int rdma_counter_unbind_qp(struct ib_qp *qp, bool force)
{
struct rdma_counter *counter = qp->counter;
int ret;
if (!counter)
return -EINVAL;
ret = __rdma_counter_unbind_qp(qp);
if (ret && !force)
return ret;
kref_put(&counter->kref, counter_release);
return 0;
}
int rdma_counter_query_stats(struct rdma_counter *counter)
{
struct ib_device *dev = counter->device;
int ret;
if (!dev->ops.counter_update_stats)
return -EINVAL;
mutex_lock(&counter->lock);
ret = dev->ops.counter_update_stats(counter);
mutex_unlock(&counter->lock);
return ret;
}
static u64 get_running_counters_hwstat_sum(struct ib_device *dev,
u8 port, u32 index)
{
struct rdma_restrack_entry *res;
struct rdma_restrack_root *rt;
struct rdma_counter *counter;
unsigned long id = 0;
u64 sum = 0;
rt = &dev->res[RDMA_RESTRACK_COUNTER];
xa_lock(&rt->xa);
xa_for_each(&rt->xa, id, res) {
if (!rdma_restrack_get(res))
continue;
xa_unlock(&rt->xa);
counter = container_of(res, struct rdma_counter, res);
if ((counter->device != dev) || (counter->port != port) ||
rdma_counter_query_stats(counter))
goto next;
sum += counter->stats->value[index];
next:
xa_lock(&rt->xa);
rdma_restrack_put(res);
}
xa_unlock(&rt->xa);
return sum;
}
/**
* rdma_counter_get_hwstat_value() - Get the sum value of all counters on a
* specific port, including the running ones and history data
*/
u64 rdma_counter_get_hwstat_value(struct ib_device *dev, u8 port, u32 index)
{
struct rdma_port_counter *port_counter;
u64 sum;
port_counter = &dev->port_data[port].port_counter;
sum = get_running_counters_hwstat_sum(dev, port, index);
sum += port_counter->hstats->value[index];
return sum;
}
static struct ib_qp *rdma_counter_get_qp(struct ib_device *dev, u32 qp_num)
{
struct rdma_restrack_entry *res = NULL;
struct ib_qp *qp = NULL;
res = rdma_restrack_get_byid(dev, RDMA_RESTRACK_QP, qp_num);
if (IS_ERR(res))
return NULL;
if (!rdma_is_visible_in_pid_ns(res))
goto err;
qp = container_of(res, struct ib_qp, res);
if (qp->qp_type == IB_QPT_RAW_PACKET && !capable(CAP_NET_RAW))
goto err;
return qp;
err:
rdma_restrack_put(&qp->res);
return NULL;
}
static int rdma_counter_bind_qp_manual(struct rdma_counter *counter,
struct ib_qp *qp)
{
if ((counter->device != qp->device) || (counter->port != qp->port))
return -EINVAL;
return __rdma_counter_bind_qp(counter, qp);
}
static struct rdma_counter *rdma_get_counter_by_id(struct ib_device *dev,
u32 counter_id)
{
struct rdma_restrack_entry *res;
struct rdma_counter *counter;
res = rdma_restrack_get_byid(dev, RDMA_RESTRACK_COUNTER, counter_id);
if (IS_ERR(res))
return NULL;
if (!rdma_is_visible_in_pid_ns(res)) {
rdma_restrack_put(res);
return NULL;
}
counter = container_of(res, struct rdma_counter, res);
kref_get(&counter->kref);
rdma_restrack_put(res);
return counter;
}
/**
* rdma_counter_bind_qpn() - Bind QP @qp_num to counter @counter_id
*/
int rdma_counter_bind_qpn(struct ib_device *dev, u8 port,
u32 qp_num, u32 counter_id)
{
struct rdma_counter *counter;
struct ib_qp *qp;
int ret;
qp = rdma_counter_get_qp(dev, qp_num);
if (!qp)
return -ENOENT;
counter = rdma_get_counter_by_id(dev, counter_id);
if (!counter) {
ret = -ENOENT;
goto err;
}
if (counter->res.task != qp->res.task) {
ret = -EINVAL;
goto err_task;
}
ret = rdma_counter_bind_qp_manual(counter, qp);
if (ret)
goto err_task;
rdma_restrack_put(&qp->res);
return 0;
err_task:
kref_put(&counter->kref, counter_release);
err:
rdma_restrack_put(&qp->res);
return ret;
}
/**
* rdma_counter_bind_qpn_alloc() - Alloc a counter and bind QP @qp_num to it
* The id of new counter is returned in @counter_id
*/
int rdma_counter_bind_qpn_alloc(struct ib_device *dev, u8 port,
u32 qp_num, u32 *counter_id)
{
struct rdma_counter *counter;
struct ib_qp *qp;
int ret;
if (!rdma_is_port_valid(dev, port))
return -EINVAL;
qp = rdma_counter_get_qp(dev, qp_num);
if (!qp)
return -ENOENT;
if (rdma_is_port_valid(dev, qp->port) && (qp->port != port)) {
ret = -EINVAL;
goto err;
}
counter = rdma_counter_alloc(dev, port, RDMA_COUNTER_MODE_MANUAL);
if (!counter) {
ret = -ENOMEM;
goto err;
}
ret = rdma_counter_bind_qp_manual(counter, qp);
if (ret)
goto err_bind;
if (counter_id)
*counter_id = counter->id;
rdma_counter_res_add(counter, qp);
rdma_restrack_put(&qp->res);
return ret;
err_bind:
rdma_counter_free(counter);
err:
rdma_restrack_put(&qp->res);
return ret;
}
/**
* rdma_counter_unbind_qpn() - Unbind QP @qp_num from a counter
*/
int rdma_counter_unbind_qpn(struct ib_device *dev, u8 port,
u32 qp_num, u32 counter_id)
{
struct rdma_port_counter *port_counter;
struct ib_qp *qp;
int ret;
if (!rdma_is_port_valid(dev, port))
return -EINVAL;
qp = rdma_counter_get_qp(dev, qp_num);
if (!qp)
return -ENOENT;
if (rdma_is_port_valid(dev, qp->port) && (qp->port != port)) {
ret = -EINVAL;
goto out;
}
port_counter = &dev->port_data[port].port_counter;
if (!qp->counter || qp->counter->id != counter_id ||
port_counter->mode.mode != RDMA_COUNTER_MODE_MANUAL) {
ret = -EINVAL;
goto out;
}
ret = rdma_counter_unbind_qp(qp, false);
out:
rdma_restrack_put(&qp->res);
return ret;
}
int rdma_counter_get_mode(struct ib_device *dev, u8 port,
enum rdma_nl_counter_mode *mode,
enum rdma_nl_counter_mask *mask)
{
struct rdma_port_counter *port_counter;
port_counter = &dev->port_data[port].port_counter;
*mode = port_counter->mode.mode;
*mask = port_counter->mode.mask;
return 0;
}
void rdma_counter_init(struct ib_device *dev)
{
struct rdma_port_counter *port_counter;
u32 port;
if (!dev->ops.alloc_hw_stats || !dev->port_data)
return;
rdma_for_each_port(dev, port) {
port_counter = &dev->port_data[port].port_counter;
port_counter->mode.mode = RDMA_COUNTER_MODE_NONE;
mutex_init(&port_counter->lock);
port_counter->hstats = dev->ops.alloc_hw_stats(dev, port);
if (!port_counter->hstats)
goto fail;
}
return;
fail:
rdma_for_each_port(dev, port) {
port_counter = &dev->port_data[port].port_counter;
kfree(port_counter->hstats);
port_counter->hstats = NULL;
}
return;
}
void rdma_counter_release(struct ib_device *dev)
{
struct rdma_port_counter *port_counter;
u32 port;
if (!dev->ops.alloc_hw_stats)
return;
rdma_for_each_port(dev, port) {
port_counter = &dev->port_data[port].port_counter;
kfree(port_counter->hstats);
}
}
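None of the core code above does anything until a driver supplies the new counter ops; their shapes follow directly from the calls made in this file. A hedged driver-side sketch, where fdev, to_fdev(), foo_stat_names and every foo_hw_* helper are hypothetical:

static struct rdma_hw_stats *foo_counter_alloc_stats(struct rdma_counter *counter)
{
        /* Reuse the same stat layout the driver already reports per port. */
        return rdma_alloc_hw_stats_struct(foo_stat_names,
                                          ARRAY_SIZE(foo_stat_names),
                                          RDMA_HW_STATS_DEFAULT_LIFESPAN);
}

static int foo_counter_bind_qp(struct rdma_counter *counter, struct ib_qp *qp)
{
        /* Attach the HW counter set identified by counter->id to this QP. */
        return foo_hw_bind(to_fdev(qp->device), counter->id, qp->qp_num);
}

static int foo_counter_unbind_qp(struct ib_qp *qp)
{
        return foo_hw_unbind(to_fdev(qp->device), qp->counter->id, qp->qp_num);
}

static int foo_counter_dealloc(struct rdma_counter *counter)
{
        return foo_hw_free(to_fdev(counter->device), counter->id);
}

static int foo_counter_update_stats(struct rdma_counter *counter)
{
        /* Refresh counter->stats->value[] from hardware. */
        return foo_hw_query(to_fdev(counter->device), counter->id,
                            counter->stats);
}

static const struct ib_device_ops foo_counter_ops = {
        .counter_alloc_stats = foo_counter_alloc_stats,
        .counter_bind_qp = foo_counter_bind_qp,
        .counter_dealloc = foo_counter_dealloc,
        .counter_unbind_qp = foo_counter_unbind_qp,
        .counter_update_stats = foo_counter_update_stats,
};

Calling ib_set_device_ops(&fdev->ib_dev, &foo_counter_ops) from the driver's registration path should then be enough for the rdma tool to drive both auto and manual binding through nldev.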


@ -18,6 +18,53 @@
#define IB_POLL_FLAGS \
(IB_CQ_NEXT_COMP | IB_CQ_REPORT_MISSED_EVENTS)
static const struct dim_cq_moder
rdma_dim_prof[RDMA_DIM_PARAMS_NUM_PROFILES] = {
{1, 0, 1, 0},
{1, 0, 4, 0},
{2, 0, 4, 0},
{2, 0, 8, 0},
{4, 0, 8, 0},
{16, 0, 8, 0},
{16, 0, 16, 0},
{32, 0, 16, 0},
{32, 0, 32, 0},
};
static void ib_cq_rdma_dim_work(struct work_struct *w)
{
struct dim *dim = container_of(w, struct dim, work);
struct ib_cq *cq = dim->priv;
u16 usec = rdma_dim_prof[dim->profile_ix].usec;
u16 comps = rdma_dim_prof[dim->profile_ix].comps;
dim->state = DIM_START_MEASURE;
cq->device->ops.modify_cq(cq, comps, usec);
}
static void rdma_dim_init(struct ib_cq *cq)
{
struct dim *dim;
if (!cq->device->ops.modify_cq || !cq->device->use_cq_dim ||
cq->poll_ctx == IB_POLL_DIRECT)
return;
dim = kzalloc(sizeof(struct dim), GFP_KERNEL);
if (!dim)
return;
dim->state = DIM_START_MEASURE;
dim->tune_state = DIM_GOING_RIGHT;
dim->profile_ix = RDMA_DIM_START_PROFILE;
dim->priv = cq;
cq->dim = dim;
INIT_WORK(&dim->work, ib_cq_rdma_dim_work);
}
static int __ib_process_cq(struct ib_cq *cq, int budget, struct ib_wc *wcs,
int batch)
{
@ -78,6 +125,7 @@ static void ib_cq_completion_direct(struct ib_cq *cq, void *private)
static int ib_poll_handler(struct irq_poll *iop, int budget)
{
struct ib_cq *cq = container_of(iop, struct ib_cq, iop);
struct dim *dim = cq->dim;
int completed;
completed = __ib_process_cq(cq, budget, cq->wc, IB_POLL_BATCH);
@ -87,6 +135,9 @@ static int ib_poll_handler(struct irq_poll *iop, int budget)
irq_poll_sched(&cq->iop);
}
if (dim)
rdma_dim(dim, completed);
return completed;
}
@ -105,6 +156,8 @@ static void ib_cq_poll_work(struct work_struct *work)
if (completed >= IB_POLL_BUDGET_WORKQUEUE ||
ib_req_notify_cq(cq, IB_POLL_FLAGS) > 0)
queue_work(cq->comp_wq, &cq->work);
else if (cq->dim)
rdma_dim(cq->dim, completed);
}
static void ib_cq_completion_workqueue(struct ib_cq *cq, void *private)
@ -113,7 +166,7 @@ static void ib_cq_completion_workqueue(struct ib_cq *cq, void *private)
}
/**
* __ib_alloc_cq - allocate a completion queue
* __ib_alloc_cq_user - allocate a completion queue
* @dev: device to allocate the CQ for
* @private: driver private data, accessible from cq->cq_context
* @nr_cqe: number of CQEs to allocate
@ -139,25 +192,30 @@ struct ib_cq *__ib_alloc_cq_user(struct ib_device *dev, void *private,
struct ib_cq *cq;
int ret = -ENOMEM;
cq = dev->ops.create_cq(dev, &cq_attr, NULL);
if (IS_ERR(cq))
return cq;
cq = rdma_zalloc_drv_obj(dev, ib_cq);
if (!cq)
return ERR_PTR(ret);
cq->device = dev;
cq->uobject = NULL;
cq->event_handler = NULL;
cq->cq_context = private;
cq->poll_ctx = poll_ctx;
atomic_set(&cq->usecnt, 0);
cq->wc = kmalloc_array(IB_POLL_BATCH, sizeof(*cq->wc), GFP_KERNEL);
if (!cq->wc)
goto out_destroy_cq;
goto out_free_cq;
cq->res.type = RDMA_RESTRACK_CQ;
rdma_restrack_set_task(&cq->res, caller);
ret = dev->ops.create_cq(cq, &cq_attr, NULL);
if (ret)
goto out_free_wc;
rdma_restrack_kadd(&cq->res);
rdma_dim_init(cq);
switch (cq->poll_ctx) {
case IB_POLL_DIRECT:
cq->comp_handler = ib_cq_completion_direct;
@ -178,29 +236,29 @@ struct ib_cq *__ib_alloc_cq_user(struct ib_device *dev, void *private,
break;
default:
ret = -EINVAL;
goto out_free_wc;
goto out_destroy_cq;
}
return cq;
out_destroy_cq:
rdma_restrack_del(&cq->res);
cq->device->ops.destroy_cq(cq, udata);
out_free_wc:
kfree(cq->wc);
rdma_restrack_del(&cq->res);
out_destroy_cq:
cq->device->ops.destroy_cq(cq, udata);
out_free_cq:
kfree(cq);
return ERR_PTR(ret);
}
EXPORT_SYMBOL(__ib_alloc_cq_user);
/**
* ib_free_cq - free a completion queue
* ib_free_cq_user - free a completion queue
* @cq: completion queue to free.
* @udata: User data or NULL for kernel object
*/
void ib_free_cq_user(struct ib_cq *cq, struct ib_udata *udata)
{
int ret;
if (WARN_ON_ONCE(atomic_read(&cq->usecnt)))
return;
@ -218,9 +276,12 @@ void ib_free_cq_user(struct ib_cq *cq, struct ib_udata *udata)
WARN_ON_ONCE(1);
}
kfree(cq->wc);
rdma_restrack_del(&cq->res);
ret = cq->device->ops.destroy_cq(cq, udata);
WARN_ON_ONCE(ret);
cq->device->ops.destroy_cq(cq, udata);
if (cq->dim)
cancel_work_sync(&cq->dim->work);
kfree(cq->dim);
kfree(cq->wc);
kfree(cq);
}
EXPORT_SYMBOL(ib_free_cq_user);
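Nothing changes for ULPs: any CQ allocated with a completion context other than IB_POLL_DIRECT, on a device whose driver sets use_cq_dim and implements modify_cq, picks up a dim context in rdma_dim_init() above. A hedged consumer sketch (ibdev and my_private are placeholders):

static int my_setup_cq(struct ib_device *ibdev, void *my_private)
{
        struct ib_cq *cq;

        /* IB_POLL_WORKQUEUE and IB_POLL_SOFTIRQ are both DIM-eligible. */
        cq = ib_alloc_cq(ibdev, my_private, 128, 0, IB_POLL_WORKQUEUE);
        if (IS_ERR(cq))
                return PTR_ERR(cq);

        /* ... post work and consume completions as usual; the moderation
         * profile is adjusted per completion batch behind the scenes ... */

        ib_free_cq(cq);         /* also cancels and frees cq->dim, as above */
        return 0;
}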


@ -46,6 +46,7 @@
#include <rdma/rdma_netlink.h>
#include <rdma/ib_addr.h>
#include <rdma/ib_cache.h>
#include <rdma/rdma_counter.h>
#include "core_priv.h"
#include "restrack.h"
@ -270,7 +271,7 @@ struct ib_port_data_rcu {
struct ib_port_data pdata[];
};
static int ib_device_check_mandatory(struct ib_device *device)
static void ib_device_check_mandatory(struct ib_device *device)
{
#define IB_MANDATORY_FUNC(x) { offsetof(struct ib_device_ops, x), #x }
static const struct {
@ -305,8 +306,6 @@ static int ib_device_check_mandatory(struct ib_device *device)
break;
}
}
return 0;
}
/*
@ -375,7 +374,7 @@ struct ib_device *ib_device_get_by_name(const char *name,
down_read(&devices_rwsem);
device = __ib_device_get_by_name(name);
if (device && driver_id != RDMA_DRIVER_UNKNOWN &&
device->driver_id != driver_id)
device->ops.driver_id != driver_id)
device = NULL;
if (device) {
@ -449,6 +448,15 @@ int ib_device_rename(struct ib_device *ibdev, const char *name)
return 0;
}
int ib_device_set_dim(struct ib_device *ibdev, u8 use_dim)
{
if (use_dim > 1)
return -EINVAL;
ibdev->use_cq_dim = use_dim;
return 0;
}
static int alloc_name(struct ib_device *ibdev, const char *name)
{
struct ib_device *device;
@ -494,10 +502,12 @@ static void ib_device_release(struct device *device)
if (dev->port_data) {
ib_cache_release_one(dev);
ib_security_release_port_pkey_list(dev);
rdma_counter_release(dev);
kfree_rcu(container_of(dev->port_data, struct ib_port_data_rcu,
pdata[0]),
rcu_head);
}
xa_destroy(&dev->compat_devs);
xa_destroy(&dev->client_data);
kfree_rcu(dev, rcu_head);
@ -1193,10 +1203,7 @@ static int setup_device(struct ib_device *device)
int ret;
setup_dma_device(device);
ret = ib_device_check_mandatory(device);
if (ret)
return ret;
ib_device_check_mandatory(device);
ret = setup_port_data(device);
if (ret) {
@ -1321,6 +1328,8 @@ int ib_register_device(struct ib_device *device, const char *name)
ib_device_register_rdmacg(device);
rdma_counter_init(device);
/*
* Ensure that ADD uevent is not fired because it
* is too early amd device is not initialized yet.
@ -1479,7 +1488,7 @@ void ib_unregister_driver(enum rdma_driver_id driver_id)
down_read(&devices_rwsem);
xa_for_each (&devices, index, ib_dev) {
if (ib_dev->driver_id != driver_id)
if (ib_dev->ops.driver_id != driver_id)
continue;
get_device(&ib_dev->dev);
@ -1749,6 +1758,104 @@ void ib_unregister_client(struct ib_client *client)
}
EXPORT_SYMBOL(ib_unregister_client);
static int __ib_get_global_client_nl_info(const char *client_name,
struct ib_client_nl_info *res)
{
struct ib_client *client;
unsigned long index;
int ret = -ENOENT;
down_read(&clients_rwsem);
xa_for_each_marked (&clients, index, client, CLIENT_REGISTERED) {
if (strcmp(client->name, client_name) != 0)
continue;
if (!client->get_global_nl_info) {
ret = -EOPNOTSUPP;
break;
}
ret = client->get_global_nl_info(res);
if (WARN_ON(ret == -ENOENT))
ret = -EINVAL;
if (!ret && res->cdev)
get_device(res->cdev);
break;
}
up_read(&clients_rwsem);
return ret;
}
static int __ib_get_client_nl_info(struct ib_device *ibdev,
const char *client_name,
struct ib_client_nl_info *res)
{
unsigned long index;
void *client_data;
int ret = -ENOENT;
down_read(&ibdev->client_data_rwsem);
xan_for_each_marked (&ibdev->client_data, index, client_data,
CLIENT_DATA_REGISTERED) {
struct ib_client *client = xa_load(&clients, index);
if (!client || strcmp(client->name, client_name) != 0)
continue;
if (!client->get_nl_info) {
ret = -EOPNOTSUPP;
break;
}
ret = client->get_nl_info(ibdev, client_data, res);
if (WARN_ON(ret == -ENOENT))
ret = -EINVAL;
/*
* The cdev is guaranteed valid as long as we are inside the
* client_data_rwsem as remove_one can't be called. Keep it
* valid for the caller.
*/
if (!ret && res->cdev)
get_device(res->cdev);
break;
}
up_read(&ibdev->client_data_rwsem);
return ret;
}
/**
* ib_get_client_nl_info - Fetch the nl_info from a client
* @device - IB device
* @client_name - Name of the client
* @res - Result of the query
*/
int ib_get_client_nl_info(struct ib_device *ibdev, const char *client_name,
struct ib_client_nl_info *res)
{
int ret;
if (ibdev)
ret = __ib_get_client_nl_info(ibdev, client_name, res);
else
ret = __ib_get_global_client_nl_info(client_name, res);
#ifdef CONFIG_MODULES
if (ret == -ENOENT) {
request_module("rdma-client-%s", client_name);
if (ibdev)
ret = __ib_get_client_nl_info(ibdev, client_name, res);
else
ret = __ib_get_global_client_nl_info(client_name, res);
}
#endif
if (ret) {
if (ret == -ENOENT)
return -EOPNOTSUPP;
return ret;
}
if (WARN_ON(!res->cdev))
return -EINVAL;
return 0;
}
/**
* ib_set_client_data - Set IB client context
* @device:Device to set context for
@ -2039,7 +2146,7 @@ struct ib_device *ib_device_get_by_netdev(struct net_device *ndev,
(uintptr_t)ndev) {
if (rcu_access_pointer(cur->netdev) == ndev &&
(driver_id == RDMA_DRIVER_UNKNOWN ||
cur->ib_dev->driver_id == driver_id) &&
cur->ib_dev->ops.driver_id == driver_id) &&
ib_device_try_get(cur->ib_dev)) {
res = cur->ib_dev;
break;
@ -2344,12 +2451,28 @@ void ib_set_device_ops(struct ib_device *dev, const struct ib_device_ops *ops)
#define SET_OBJ_SIZE(ptr, name) SET_DEVICE_OP(ptr, size_##name)
if (ops->driver_id != RDMA_DRIVER_UNKNOWN) {
WARN_ON(dev_ops->driver_id != RDMA_DRIVER_UNKNOWN &&
dev_ops->driver_id != ops->driver_id);
dev_ops->driver_id = ops->driver_id;
}
if (ops->owner) {
WARN_ON(dev_ops->owner && dev_ops->owner != ops->owner);
dev_ops->owner = ops->owner;
}
if (ops->uverbs_abi_ver)
dev_ops->uverbs_abi_ver = ops->uverbs_abi_ver;
dev_ops->uverbs_no_driver_id_binding |=
ops->uverbs_no_driver_id_binding;
SET_DEVICE_OP(dev_ops, add_gid);
SET_DEVICE_OP(dev_ops, advise_mr);
SET_DEVICE_OP(dev_ops, alloc_dm);
SET_DEVICE_OP(dev_ops, alloc_fmr);
SET_DEVICE_OP(dev_ops, alloc_hw_stats);
SET_DEVICE_OP(dev_ops, alloc_mr);
SET_DEVICE_OP(dev_ops, alloc_mr_integrity);
SET_DEVICE_OP(dev_ops, alloc_mw);
SET_DEVICE_OP(dev_ops, alloc_pd);
SET_DEVICE_OP(dev_ops, alloc_rdma_netdev);
@ -2357,6 +2480,11 @@ void ib_set_device_ops(struct ib_device *dev, const struct ib_device_ops *ops)
SET_DEVICE_OP(dev_ops, alloc_xrcd);
SET_DEVICE_OP(dev_ops, attach_mcast);
SET_DEVICE_OP(dev_ops, check_mr_status);
SET_DEVICE_OP(dev_ops, counter_alloc_stats);
SET_DEVICE_OP(dev_ops, counter_bind_qp);
SET_DEVICE_OP(dev_ops, counter_dealloc);
SET_DEVICE_OP(dev_ops, counter_unbind_qp);
SET_DEVICE_OP(dev_ops, counter_update_stats);
SET_DEVICE_OP(dev_ops, create_ah);
SET_DEVICE_OP(dev_ops, create_counters);
SET_DEVICE_OP(dev_ops, create_cq);
@ -2409,6 +2537,7 @@ void ib_set_device_ops(struct ib_device *dev, const struct ib_device_ops *ops)
SET_DEVICE_OP(dev_ops, iw_reject);
SET_DEVICE_OP(dev_ops, iw_rem_ref);
SET_DEVICE_OP(dev_ops, map_mr_sg);
SET_DEVICE_OP(dev_ops, map_mr_sg_pi);
SET_DEVICE_OP(dev_ops, map_phys_fmr);
SET_DEVICE_OP(dev_ops, mmap);
SET_DEVICE_OP(dev_ops, modify_ah);
@ -2445,6 +2574,7 @@ void ib_set_device_ops(struct ib_device *dev, const struct ib_device_ops *ops)
SET_DEVICE_OP(dev_ops, unmap_fmr);
SET_OBJ_SIZE(dev_ops, ib_ah);
SET_OBJ_SIZE(dev_ops, ib_cq);
SET_OBJ_SIZE(dev_ops, ib_pd);
SET_OBJ_SIZE(dev_ops, ib_srq);
SET_OBJ_SIZE(dev_ops, ib_ucontext);
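The chardev discovery path added above (ib_get_client_nl_info() plus the request_module("rdma-client-%s", ...) fallback) only works for clients that fill in the new optional callbacks. A hedged sketch of the client side, with the my_* names, MY_ABI_VERSION and the add/remove handlers assumed to exist elsewhere:

static int my_client_get_nl_info(struct ib_device *ibdev, void *client_data,
                                 struct ib_client_nl_info *res)
{
        struct my_client_dev *cdev = client_data;  /* hypothetical per-device state */

        res->abi = MY_ABI_VERSION;
        res->cdev = &cdev->dev;    /* struct device backing the char device */
        return 0;
}

static struct ib_client my_client = {
        .name           = "my-client",
        .add            = my_client_add_one,
        .remove         = my_client_remove_one,
        .get_nl_info    = my_client_get_nl_info,
};

/* Matches the request_module() format string above, so netlink lookups can
 * autoload the module that provides this chardev. */
MODULE_ALIAS("rdma-client-my-client");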


@ -34,14 +34,18 @@ void ib_mr_pool_put(struct ib_qp *qp, struct list_head *list, struct ib_mr *mr)
EXPORT_SYMBOL(ib_mr_pool_put);
int ib_mr_pool_init(struct ib_qp *qp, struct list_head *list, int nr,
enum ib_mr_type type, u32 max_num_sg)
enum ib_mr_type type, u32 max_num_sg, u32 max_num_meta_sg)
{
struct ib_mr *mr;
unsigned long flags;
int ret, i;
for (i = 0; i < nr; i++) {
mr = ib_alloc_mr(qp->pd, type, max_num_sg);
if (type == IB_MR_TYPE_INTEGRITY)
mr = ib_alloc_mr_integrity(qp->pd, max_num_sg,
max_num_meta_sg);
else
mr = ib_alloc_mr(qp->pd, type, max_num_sg);
if (IS_ERR(mr)) {
ret = PTR_ERR(mr);
goto out;
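A hedged caller-side fragment to show the extended signature (the wrapper name, list and sizing are hypothetical); with IB_MR_TYPE_INTEGRITY the pool entries come from ib_alloc_mr_integrity() as in the hunk above:

static int my_init_pi_mr_pool(struct ib_qp *qp, struct list_head *mr_list,
                              int nr, u32 max_data_sg, u32 max_meta_sg)
{
        /* Data and metadata (protection information) scatterlists now get
         * separate limits when the pool is sized. */
        return ib_mr_pool_init(qp, mr_list, nr, IB_MR_TYPE_INTEGRITY,
                               max_data_sg, max_meta_sg);
}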


@ -42,84 +42,105 @@
#include "cma_priv.h"
#include "restrack.h"
/*
* Sort array elements by the netlink attribute name
*/
static const struct nla_policy nldev_policy[RDMA_NLDEV_ATTR_MAX] = {
[RDMA_NLDEV_ATTR_DEV_INDEX] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING,
.len = IB_DEVICE_NAME_MAX - 1},
[RDMA_NLDEV_ATTR_PORT_INDEX] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_FW_VERSION] = { .type = NLA_NUL_STRING,
.len = IB_FW_VERSION_NAME_MAX - 1},
[RDMA_NLDEV_ATTR_NODE_GUID] = { .type = NLA_U64 },
[RDMA_NLDEV_ATTR_SYS_IMAGE_GUID] = { .type = NLA_U64 },
[RDMA_NLDEV_ATTR_SUBNET_PREFIX] = { .type = NLA_U64 },
[RDMA_NLDEV_ATTR_LID] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_SM_LID] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_LMC] = { .type = NLA_U8 },
[RDMA_NLDEV_ATTR_PORT_STATE] = { .type = NLA_U8 },
[RDMA_NLDEV_ATTR_PORT_PHYS_STATE] = { .type = NLA_U8 },
[RDMA_NLDEV_ATTR_DEV_NODE_TYPE] = { .type = NLA_U8 },
[RDMA_NLDEV_ATTR_RES_SUMMARY] = { .type = NLA_NESTED },
[RDMA_NLDEV_ATTR_RES_SUMMARY_ENTRY] = { .type = NLA_NESTED },
[RDMA_NLDEV_ATTR_RES_SUMMARY_ENTRY_NAME] = { .type = NLA_NUL_STRING,
.len = 16 },
[RDMA_NLDEV_ATTR_RES_SUMMARY_ENTRY_CURR] = { .type = NLA_U64 },
[RDMA_NLDEV_ATTR_CHARDEV] = { .type = NLA_U64 },
[RDMA_NLDEV_ATTR_CHARDEV_ABI] = { .type = NLA_U64 },
[RDMA_NLDEV_ATTR_CHARDEV_NAME] = { .type = NLA_NUL_STRING,
.len = RDMA_NLDEV_ATTR_EMPTY_STRING },
[RDMA_NLDEV_ATTR_CHARDEV_TYPE] = { .type = NLA_NUL_STRING,
.len = RDMA_NLDEV_ATTR_CHARDEV_TYPE_SIZE },
[RDMA_NLDEV_ATTR_DEV_DIM] = { .type = NLA_U8 },
[RDMA_NLDEV_ATTR_DEV_INDEX] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING,
.len = IB_DEVICE_NAME_MAX },
[RDMA_NLDEV_ATTR_DEV_NODE_TYPE] = { .type = NLA_U8 },
[RDMA_NLDEV_ATTR_DEV_PROTOCOL] = { .type = NLA_NUL_STRING,
.len = RDMA_NLDEV_ATTR_EMPTY_STRING },
[RDMA_NLDEV_ATTR_DRIVER] = { .type = NLA_NESTED },
[RDMA_NLDEV_ATTR_DRIVER_ENTRY] = { .type = NLA_NESTED },
[RDMA_NLDEV_ATTR_DRIVER_PRINT_TYPE] = { .type = NLA_U8 },
[RDMA_NLDEV_ATTR_DRIVER_STRING] = { .type = NLA_NUL_STRING,
.len = RDMA_NLDEV_ATTR_EMPTY_STRING },
[RDMA_NLDEV_ATTR_DRIVER_S32] = { .type = NLA_S32 },
[RDMA_NLDEV_ATTR_DRIVER_S64] = { .type = NLA_S64 },
[RDMA_NLDEV_ATTR_DRIVER_U32] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_DRIVER_U64] = { .type = NLA_U64 },
[RDMA_NLDEV_ATTR_FW_VERSION] = { .type = NLA_NUL_STRING,
.len = RDMA_NLDEV_ATTR_EMPTY_STRING },
[RDMA_NLDEV_ATTR_LID] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_LINK_TYPE] = { .type = NLA_NUL_STRING,
.len = IFNAMSIZ },
[RDMA_NLDEV_ATTR_LMC] = { .type = NLA_U8 },
[RDMA_NLDEV_ATTR_NDEV_INDEX] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_NDEV_NAME] = { .type = NLA_NUL_STRING,
.len = IFNAMSIZ },
[RDMA_NLDEV_ATTR_NODE_GUID] = { .type = NLA_U64 },
[RDMA_NLDEV_ATTR_PORT_INDEX] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_PORT_PHYS_STATE] = { .type = NLA_U8 },
[RDMA_NLDEV_ATTR_PORT_STATE] = { .type = NLA_U8 },
[RDMA_NLDEV_ATTR_RES_CM_ID] = { .type = NLA_NESTED },
[RDMA_NLDEV_ATTR_RES_CM_IDN] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_RES_CM_ID_ENTRY] = { .type = NLA_NESTED },
[RDMA_NLDEV_ATTR_RES_CQ] = { .type = NLA_NESTED },
[RDMA_NLDEV_ATTR_RES_CQE] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_RES_CQN] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_RES_CQ_ENTRY] = { .type = NLA_NESTED },
[RDMA_NLDEV_ATTR_RES_CTXN] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_RES_DST_ADDR] = {
.len = sizeof(struct __kernel_sockaddr_storage) },
[RDMA_NLDEV_ATTR_RES_IOVA] = { .type = NLA_U64 },
[RDMA_NLDEV_ATTR_RES_KERN_NAME] = { .type = NLA_NUL_STRING,
.len = RDMA_NLDEV_ATTR_EMPTY_STRING },
[RDMA_NLDEV_ATTR_RES_LKEY] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_RES_LOCAL_DMA_LKEY] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_RES_LQPN] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_RES_MR] = { .type = NLA_NESTED },
[RDMA_NLDEV_ATTR_RES_MRLEN] = { .type = NLA_U64 },
[RDMA_NLDEV_ATTR_RES_MRN] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_RES_MR_ENTRY] = { .type = NLA_NESTED },
[RDMA_NLDEV_ATTR_RES_PATH_MIG_STATE] = { .type = NLA_U8 },
[RDMA_NLDEV_ATTR_RES_PD] = { .type = NLA_NESTED },
[RDMA_NLDEV_ATTR_RES_PDN] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_RES_PD_ENTRY] = { .type = NLA_NESTED },
[RDMA_NLDEV_ATTR_RES_PID] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_RES_POLL_CTX] = { .type = NLA_U8 },
[RDMA_NLDEV_ATTR_RES_PS] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_RES_QP] = { .type = NLA_NESTED },
[RDMA_NLDEV_ATTR_RES_QP_ENTRY] = { .type = NLA_NESTED },
[RDMA_NLDEV_ATTR_RES_LQPN] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_RES_RKEY] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_RES_RQPN] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_RES_RQ_PSN] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_RES_SQ_PSN] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_RES_PATH_MIG_STATE] = { .type = NLA_U8 },
[RDMA_NLDEV_ATTR_RES_TYPE] = { .type = NLA_U8 },
[RDMA_NLDEV_ATTR_RES_SRC_ADDR] = {
.len = sizeof(struct __kernel_sockaddr_storage) },
[RDMA_NLDEV_ATTR_RES_STATE] = { .type = NLA_U8 },
[RDMA_NLDEV_ATTR_RES_PID] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_RES_KERN_NAME] = { .type = NLA_NUL_STRING,
.len = TASK_COMM_LEN },
[RDMA_NLDEV_ATTR_RES_CM_ID] = { .type = NLA_NESTED },
[RDMA_NLDEV_ATTR_RES_CM_ID_ENTRY] = { .type = NLA_NESTED },
[RDMA_NLDEV_ATTR_RES_PS] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_RES_SRC_ADDR] = {
.len = sizeof(struct __kernel_sockaddr_storage) },
[RDMA_NLDEV_ATTR_RES_DST_ADDR] = {
.len = sizeof(struct __kernel_sockaddr_storage) },
[RDMA_NLDEV_ATTR_RES_CQ] = { .type = NLA_NESTED },
[RDMA_NLDEV_ATTR_RES_CQ_ENTRY] = { .type = NLA_NESTED },
[RDMA_NLDEV_ATTR_RES_CQE] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_RES_SUMMARY] = { .type = NLA_NESTED },
[RDMA_NLDEV_ATTR_RES_SUMMARY_ENTRY] = { .type = NLA_NESTED },
[RDMA_NLDEV_ATTR_RES_SUMMARY_ENTRY_CURR]= { .type = NLA_U64 },
[RDMA_NLDEV_ATTR_RES_SUMMARY_ENTRY_NAME]= { .type = NLA_NUL_STRING,
.len = RDMA_NLDEV_ATTR_EMPTY_STRING },
[RDMA_NLDEV_ATTR_RES_TYPE] = { .type = NLA_U8 },
[RDMA_NLDEV_ATTR_RES_UNSAFE_GLOBAL_RKEY]= { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_RES_USECNT] = { .type = NLA_U64 },
[RDMA_NLDEV_ATTR_RES_POLL_CTX] = { .type = NLA_U8 },
[RDMA_NLDEV_ATTR_RES_MR] = { .type = NLA_NESTED },
[RDMA_NLDEV_ATTR_RES_MR_ENTRY] = { .type = NLA_NESTED },
[RDMA_NLDEV_ATTR_RES_RKEY] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_RES_LKEY] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_RES_IOVA] = { .type = NLA_U64 },
[RDMA_NLDEV_ATTR_RES_MRLEN] = { .type = NLA_U64 },
[RDMA_NLDEV_ATTR_RES_PD] = { .type = NLA_NESTED },
[RDMA_NLDEV_ATTR_RES_PD_ENTRY] = { .type = NLA_NESTED },
[RDMA_NLDEV_ATTR_RES_LOCAL_DMA_LKEY] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_RES_UNSAFE_GLOBAL_RKEY] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_NDEV_INDEX] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_NDEV_NAME] = { .type = NLA_NUL_STRING,
.len = IFNAMSIZ },
[RDMA_NLDEV_ATTR_DRIVER] = { .type = NLA_NESTED },
[RDMA_NLDEV_ATTR_DRIVER_ENTRY] = { .type = NLA_NESTED },
[RDMA_NLDEV_ATTR_DRIVER_STRING] = { .type = NLA_NUL_STRING,
.len = RDMA_NLDEV_ATTR_ENTRY_STRLEN },
[RDMA_NLDEV_ATTR_DRIVER_PRINT_TYPE] = { .type = NLA_U8 },
[RDMA_NLDEV_ATTR_DRIVER_S32] = { .type = NLA_S32 },
[RDMA_NLDEV_ATTR_DRIVER_U32] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_DRIVER_S64] = { .type = NLA_S64 },
[RDMA_NLDEV_ATTR_DRIVER_U64] = { .type = NLA_U64 },
[RDMA_NLDEV_ATTR_RES_PDN] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_RES_CQN] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_RES_MRN] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_RES_CM_IDN] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_RES_CTXN] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_LINK_TYPE] = { .type = NLA_NUL_STRING,
.len = RDMA_NLDEV_ATTR_ENTRY_STRLEN },
[RDMA_NLDEV_SYS_ATTR_NETNS_MODE] = { .type = NLA_U8 },
[RDMA_NLDEV_ATTR_DEV_PROTOCOL] = { .type = NLA_NUL_STRING,
.len = RDMA_NLDEV_ATTR_ENTRY_STRLEN },
[RDMA_NLDEV_ATTR_SM_LID] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_SUBNET_PREFIX] = { .type = NLA_U64 },
[RDMA_NLDEV_ATTR_STAT_AUTO_MODE_MASK] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_STAT_MODE] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_STAT_RES] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_STAT_COUNTER] = { .type = NLA_NESTED },
[RDMA_NLDEV_ATTR_STAT_COUNTER_ENTRY] = { .type = NLA_NESTED },
[RDMA_NLDEV_ATTR_STAT_COUNTER_ID] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_STAT_HWCOUNTERS] = { .type = NLA_NESTED },
[RDMA_NLDEV_ATTR_STAT_HWCOUNTER_ENTRY] = { .type = NLA_NESTED },
[RDMA_NLDEV_ATTR_STAT_HWCOUNTER_ENTRY_NAME] = { .type = NLA_NUL_STRING },
[RDMA_NLDEV_ATTR_STAT_HWCOUNTER_ENTRY_VALUE] = { .type = NLA_U64 },
[RDMA_NLDEV_ATTR_SYS_IMAGE_GUID] = { .type = NLA_U64 },
[RDMA_NLDEV_ATTR_UVERBS_DRIVER_ID] = { .type = NLA_U32 },
[RDMA_NLDEV_NET_NS_FD] = { .type = NLA_U32 },
[RDMA_NLDEV_SYS_ATTR_NETNS_MODE] = { .type = NLA_U8 },
};
static int put_driver_name_print_type(struct sk_buff *msg, const char *name,
@ -232,6 +253,8 @@ static int fill_dev_info(struct sk_buff *msg, struct ib_device *device)
return -EMSGSIZE;
if (nla_put_u8(msg, RDMA_NLDEV_ATTR_DEV_NODE_TYPE, device->node_type))
return -EMSGSIZE;
if (nla_put_u8(msg, RDMA_NLDEV_ATTR_DEV_DIM, device->use_cq_dim))
return -EMSGSIZE;
/*
* Link type is determined on first port and mlx4 device
@ -532,6 +555,9 @@ static int fill_res_cq_entry(struct sk_buff *msg, bool has_cap_net_admin,
nla_put_u8(msg, RDMA_NLDEV_ATTR_RES_POLL_CTX, cq->poll_ctx))
goto err;
if (nla_put_u8(msg, RDMA_NLDEV_ATTR_DEV_DIM, (cq->dim != NULL)))
goto err;
if (nla_put_u32(msg, RDMA_NLDEV_ATTR_RES_CQN, res->id))
goto err;
if (!rdma_is_kernel_res(res) &&
@ -623,6 +649,152 @@ static int fill_res_pd_entry(struct sk_buff *msg, bool has_cap_net_admin,
err: return -EMSGSIZE;
}
static int fill_stat_counter_mode(struct sk_buff *msg,
struct rdma_counter *counter)
{
struct rdma_counter_mode *m = &counter->mode;
if (nla_put_u32(msg, RDMA_NLDEV_ATTR_STAT_MODE, m->mode))
return -EMSGSIZE;
if (m->mode == RDMA_COUNTER_MODE_AUTO)
if ((m->mask & RDMA_COUNTER_MASK_QP_TYPE) &&
nla_put_u8(msg, RDMA_NLDEV_ATTR_RES_TYPE, m->param.qp_type))
return -EMSGSIZE;
return 0;
}
static int fill_stat_counter_qp_entry(struct sk_buff *msg, u32 qpn)
{
struct nlattr *entry_attr;
entry_attr = nla_nest_start(msg, RDMA_NLDEV_ATTR_RES_QP_ENTRY);
if (!entry_attr)
return -EMSGSIZE;
if (nla_put_u32(msg, RDMA_NLDEV_ATTR_RES_LQPN, qpn))
goto err;
nla_nest_end(msg, entry_attr);
return 0;
err:
nla_nest_cancel(msg, entry_attr);
return -EMSGSIZE;
}
static int fill_stat_counter_qps(struct sk_buff *msg,
struct rdma_counter *counter)
{
struct rdma_restrack_entry *res;
struct rdma_restrack_root *rt;
struct nlattr *table_attr;
struct ib_qp *qp = NULL;
unsigned long id = 0;
int ret = 0;
table_attr = nla_nest_start(msg, RDMA_NLDEV_ATTR_RES_QP);
rt = &counter->device->res[RDMA_RESTRACK_QP];
xa_lock(&rt->xa);
xa_for_each(&rt->xa, id, res) {
if (!rdma_is_visible_in_pid_ns(res))
continue;
qp = container_of(res, struct ib_qp, res);
if (qp->qp_type == IB_QPT_RAW_PACKET && !capable(CAP_NET_RAW))
continue;
if (!qp->counter || (qp->counter->id != counter->id))
continue;
ret = fill_stat_counter_qp_entry(msg, qp->qp_num);
if (ret)
goto err;
}
xa_unlock(&rt->xa);
nla_nest_end(msg, table_attr);
return 0;
err:
xa_unlock(&rt->xa);
nla_nest_cancel(msg, table_attr);
return ret;
}
static int fill_stat_hwcounter_entry(struct sk_buff *msg,
const char *name, u64 value)
{
struct nlattr *entry_attr;
entry_attr = nla_nest_start(msg, RDMA_NLDEV_ATTR_STAT_HWCOUNTER_ENTRY);
if (!entry_attr)
return -EMSGSIZE;
if (nla_put_string(msg, RDMA_NLDEV_ATTR_STAT_HWCOUNTER_ENTRY_NAME,
name))
goto err;
if (nla_put_u64_64bit(msg, RDMA_NLDEV_ATTR_STAT_HWCOUNTER_ENTRY_VALUE,
value, RDMA_NLDEV_ATTR_PAD))
goto err;
nla_nest_end(msg, entry_attr);
return 0;
err:
nla_nest_cancel(msg, entry_attr);
return -EMSGSIZE;
}
static int fill_stat_counter_hwcounters(struct sk_buff *msg,
struct rdma_counter *counter)
{
struct rdma_hw_stats *st = counter->stats;
struct nlattr *table_attr;
int i;
table_attr = nla_nest_start(msg, RDMA_NLDEV_ATTR_STAT_HWCOUNTERS);
if (!table_attr)
return -EMSGSIZE;
for (i = 0; i < st->num_counters; i++)
if (fill_stat_hwcounter_entry(msg, st->names[i], st->value[i]))
goto err;
nla_nest_end(msg, table_attr);
return 0;
err:
nla_nest_cancel(msg, table_attr);
return -EMSGSIZE;
}
static int fill_res_counter_entry(struct sk_buff *msg, bool has_cap_net_admin,
struct rdma_restrack_entry *res,
uint32_t port)
{
struct rdma_counter *counter =
container_of(res, struct rdma_counter, res);
if (port && port != counter->port)
return 0;
/* Dump it even if the query failed */
rdma_counter_query_stats(counter);
if (nla_put_u32(msg, RDMA_NLDEV_ATTR_PORT_INDEX, counter->port) ||
nla_put_u32(msg, RDMA_NLDEV_ATTR_STAT_COUNTER_ID, counter->id) ||
fill_res_name_pid(msg, &counter->res) ||
fill_stat_counter_mode(msg, counter) ||
fill_stat_counter_qps(msg, counter) ||
fill_stat_counter_hwcounters(msg, counter))
return -EMSGSIZE;
return 0;
}
static int nldev_get_doit(struct sk_buff *skb, struct nlmsghdr *nlh,
struct netlink_ext_ack *extack)
{
@ -704,6 +876,14 @@ static int nldev_set_doit(struct sk_buff *skb, struct nlmsghdr *nlh,
goto put_done;
}
if (tb[RDMA_NLDEV_ATTR_DEV_DIM]) {
u8 use_dim;
use_dim = nla_get_u8(tb[RDMA_NLDEV_ATTR_DEV_DIM]);
err = ib_device_set_dim(device, use_dim);
goto done;
}
done:
ib_device_put(device);
put_done:
@ -990,19 +1170,15 @@ static const struct nldev_fill_res_entry fill_entries[RDMA_RESTRACK_MAX] = {
.entry = RDMA_NLDEV_ATTR_RES_PD_ENTRY,
.id = RDMA_NLDEV_ATTR_RES_PDN,
},
[RDMA_RESTRACK_COUNTER] = {
.fill_res_func = fill_res_counter_entry,
.nldev_cmd = RDMA_NLDEV_CMD_STAT_GET,
.nldev_attr = RDMA_NLDEV_ATTR_STAT_COUNTER,
.entry = RDMA_NLDEV_ATTR_STAT_COUNTER_ENTRY,
.id = RDMA_NLDEV_ATTR_STAT_COUNTER_ID,
},
};
static bool is_visible_in_pid_ns(struct rdma_restrack_entry *res)
{
/*
* 1. Kern resources should be visible in init name space only
* 2. Present only resources visible in the current namespace
*/
if (rdma_is_kernel_res(res))
return task_active_pid_ns(current) == &init_pid_ns;
return task_active_pid_ns(current) == task_active_pid_ns(res->task);
}
static int res_get_common_doit(struct sk_buff *skb, struct nlmsghdr *nlh,
struct netlink_ext_ack *extack,
enum rdma_restrack_type res_type)
@ -1047,7 +1223,7 @@ static int res_get_common_doit(struct sk_buff *skb, struct nlmsghdr *nlh,
goto err;
}
if (!is_visible_in_pid_ns(res)) {
if (!rdma_is_visible_in_pid_ns(res)) {
ret = -ENOENT;
goto err_get;
}
@ -1159,7 +1335,7 @@ static int res_get_common_dumpit(struct sk_buff *skb,
* objects.
*/
xa_for_each(&rt->xa, id, res) {
if (!is_visible_in_pid_ns(res))
if (!rdma_is_visible_in_pid_ns(res))
continue;
if (idx < start || !rdma_restrack_get(res))
@ -1237,6 +1413,7 @@ RES_GET_FUNCS(cm_id, RDMA_RESTRACK_CM_ID);
RES_GET_FUNCS(cq, RDMA_RESTRACK_CQ);
RES_GET_FUNCS(pd, RDMA_RESTRACK_PD);
RES_GET_FUNCS(mr, RDMA_RESTRACK_MR);
RES_GET_FUNCS(counter, RDMA_RESTRACK_COUNTER);
static LIST_HEAD(link_ops);
static DECLARE_RWSEM(link_ops_rwsem);
@ -1299,7 +1476,7 @@ static int nldev_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
nla_strlcpy(ndev_name, tb[RDMA_NLDEV_ATTR_NDEV_NAME],
sizeof(ndev_name));
ndev = dev_get_by_name(&init_net, ndev_name);
ndev = dev_get_by_name(sock_net(skb->sk), ndev_name);
if (!ndev)
return -ENODEV;
@ -1347,6 +1524,90 @@ static int nldev_dellink(struct sk_buff *skb, struct nlmsghdr *nlh,
return 0;
}
static int nldev_get_chardev(struct sk_buff *skb, struct nlmsghdr *nlh,
struct netlink_ext_ack *extack)
{
struct nlattr *tb[RDMA_NLDEV_ATTR_MAX];
char client_name[RDMA_NLDEV_ATTR_CHARDEV_TYPE_SIZE];
struct ib_client_nl_info data = {};
struct ib_device *ibdev = NULL;
struct sk_buff *msg;
u32 index;
int err;
err = nlmsg_parse(nlh, 0, tb, RDMA_NLDEV_ATTR_MAX - 1, nldev_policy,
extack);
if (err || !tb[RDMA_NLDEV_ATTR_CHARDEV_TYPE])
return -EINVAL;
nla_strlcpy(client_name, tb[RDMA_NLDEV_ATTR_CHARDEV_TYPE],
sizeof(client_name));
if (tb[RDMA_NLDEV_ATTR_DEV_INDEX]) {
index = nla_get_u32(tb[RDMA_NLDEV_ATTR_DEV_INDEX]);
ibdev = ib_device_get_by_index(sock_net(skb->sk), index);
if (!ibdev)
return -EINVAL;
if (tb[RDMA_NLDEV_ATTR_PORT_INDEX]) {
data.port = nla_get_u32(tb[RDMA_NLDEV_ATTR_PORT_INDEX]);
if (!rdma_is_port_valid(ibdev, data.port)) {
err = -EINVAL;
goto out_put;
}
} else {
data.port = -1;
}
} else if (tb[RDMA_NLDEV_ATTR_PORT_INDEX]) {
return -EINVAL;
}
msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
if (!msg) {
err = -ENOMEM;
goto out_put;
}
nlh = nlmsg_put(msg, NETLINK_CB(skb).portid, nlh->nlmsg_seq,
RDMA_NL_GET_TYPE(RDMA_NL_NLDEV,
RDMA_NLDEV_CMD_GET_CHARDEV),
0, 0);
data.nl_msg = msg;
err = ib_get_client_nl_info(ibdev, client_name, &data);
if (err)
goto out_nlmsg;
err = nla_put_u64_64bit(msg, RDMA_NLDEV_ATTR_CHARDEV,
huge_encode_dev(data.cdev->devt),
RDMA_NLDEV_ATTR_PAD);
if (err)
goto out_data;
err = nla_put_u64_64bit(msg, RDMA_NLDEV_ATTR_CHARDEV_ABI, data.abi,
RDMA_NLDEV_ATTR_PAD);
if (err)
goto out_data;
if (nla_put_string(msg, RDMA_NLDEV_ATTR_CHARDEV_NAME,
dev_name(data.cdev))) {
err = -EMSGSIZE;
goto out_data;
}
nlmsg_end(msg, nlh);
put_device(data.cdev);
if (ibdev)
ib_device_put(ibdev);
return rdma_nl_unicast(msg, NETLINK_CB(skb).portid);
out_data:
put_device(data.cdev);
out_nlmsg:
nlmsg_free(msg);
out_put:
if (ibdev)
ib_device_put(ibdev);
return err;
}
static int nldev_sys_get_doit(struct sk_buff *skb, struct nlmsghdr *nlh,
struct netlink_ext_ack *extack)
{
@ -1399,11 +1660,375 @@ static int nldev_set_sys_set_doit(struct sk_buff *skb, struct nlmsghdr *nlh,
return err;
}
static int nldev_stat_set_doit(struct sk_buff *skb, struct nlmsghdr *nlh,
struct netlink_ext_ack *extack)
{
u32 index, port, mode, mask = 0, qpn, cntn = 0;
struct nlattr *tb[RDMA_NLDEV_ATTR_MAX];
struct ib_device *device;
struct sk_buff *msg;
int ret;
ret = nlmsg_parse(nlh, 0, tb, RDMA_NLDEV_ATTR_MAX - 1,
nldev_policy, extack);
/* Currently only counter for QP is supported */
if (ret || !tb[RDMA_NLDEV_ATTR_STAT_RES] ||
!tb[RDMA_NLDEV_ATTR_DEV_INDEX] ||
!tb[RDMA_NLDEV_ATTR_PORT_INDEX] || !tb[RDMA_NLDEV_ATTR_STAT_MODE])
return -EINVAL;
if (nla_get_u32(tb[RDMA_NLDEV_ATTR_STAT_RES]) != RDMA_NLDEV_ATTR_RES_QP)
return -EINVAL;
index = nla_get_u32(tb[RDMA_NLDEV_ATTR_DEV_INDEX]);
device = ib_device_get_by_index(sock_net(skb->sk), index);
if (!device)
return -EINVAL;
port = nla_get_u32(tb[RDMA_NLDEV_ATTR_PORT_INDEX]);
if (!rdma_is_port_valid(device, port)) {
ret = -EINVAL;
goto err;
}
msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
if (!msg) {
ret = -ENOMEM;
goto err;
}
nlh = nlmsg_put(msg, NETLINK_CB(skb).portid, nlh->nlmsg_seq,
RDMA_NL_GET_TYPE(RDMA_NL_NLDEV,
RDMA_NLDEV_CMD_STAT_SET),
0, 0);
mode = nla_get_u32(tb[RDMA_NLDEV_ATTR_STAT_MODE]);
if (mode == RDMA_COUNTER_MODE_AUTO) {
if (tb[RDMA_NLDEV_ATTR_STAT_AUTO_MODE_MASK])
mask = nla_get_u32(
tb[RDMA_NLDEV_ATTR_STAT_AUTO_MODE_MASK]);
ret = rdma_counter_set_auto_mode(device, port,
mask ? true : false, mask);
if (ret)
goto err_msg;
} else {
qpn = nla_get_u32(tb[RDMA_NLDEV_ATTR_RES_LQPN]);
if (tb[RDMA_NLDEV_ATTR_STAT_COUNTER_ID]) {
cntn = nla_get_u32(tb[RDMA_NLDEV_ATTR_STAT_COUNTER_ID]);
ret = rdma_counter_bind_qpn(device, port, qpn, cntn);
} else {
ret = rdma_counter_bind_qpn_alloc(device, port,
qpn, &cntn);
}
if (ret)
goto err_msg;
if (fill_nldev_handle(msg, device) ||
nla_put_u32(msg, RDMA_NLDEV_ATTR_PORT_INDEX, port) ||
nla_put_u32(msg, RDMA_NLDEV_ATTR_STAT_COUNTER_ID, cntn) ||
nla_put_u32(msg, RDMA_NLDEV_ATTR_RES_LQPN, qpn)) {
ret = -EMSGSIZE;
goto err_fill;
}
}
nlmsg_end(msg, nlh);
ib_device_put(device);
return rdma_nl_unicast(msg, NETLINK_CB(skb).portid);
err_fill:
rdma_counter_unbind_qpn(device, port, qpn, cntn);
err_msg:
nlmsg_free(msg);
err:
ib_device_put(device);
return ret;
}
static int nldev_stat_del_doit(struct sk_buff *skb, struct nlmsghdr *nlh,
struct netlink_ext_ack *extack)
{
struct nlattr *tb[RDMA_NLDEV_ATTR_MAX];
struct ib_device *device;
struct sk_buff *msg;
u32 index, port, qpn, cntn;
int ret;
ret = nlmsg_parse(nlh, 0, tb, RDMA_NLDEV_ATTR_MAX - 1,
nldev_policy, extack);
if (ret || !tb[RDMA_NLDEV_ATTR_STAT_RES] ||
!tb[RDMA_NLDEV_ATTR_DEV_INDEX] || !tb[RDMA_NLDEV_ATTR_PORT_INDEX] ||
!tb[RDMA_NLDEV_ATTR_STAT_COUNTER_ID] ||
!tb[RDMA_NLDEV_ATTR_RES_LQPN])
return -EINVAL;
if (nla_get_u32(tb[RDMA_NLDEV_ATTR_STAT_RES]) != RDMA_NLDEV_ATTR_RES_QP)
return -EINVAL;
index = nla_get_u32(tb[RDMA_NLDEV_ATTR_DEV_INDEX]);
device = ib_device_get_by_index(sock_net(skb->sk), index);
if (!device)
return -EINVAL;
port = nla_get_u32(tb[RDMA_NLDEV_ATTR_PORT_INDEX]);
if (!rdma_is_port_valid(device, port)) {
ret = -EINVAL;
goto err;
}
msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
if (!msg) {
ret = -ENOMEM;
goto err;
}
nlh = nlmsg_put(msg, NETLINK_CB(skb).portid, nlh->nlmsg_seq,
RDMA_NL_GET_TYPE(RDMA_NL_NLDEV,
RDMA_NLDEV_CMD_STAT_SET),
0, 0);
cntn = nla_get_u32(tb[RDMA_NLDEV_ATTR_STAT_COUNTER_ID]);
qpn = nla_get_u32(tb[RDMA_NLDEV_ATTR_RES_LQPN]);
ret = rdma_counter_unbind_qpn(device, port, qpn, cntn);
if (ret)
goto err_unbind;
if (fill_nldev_handle(msg, device) ||
nla_put_u32(msg, RDMA_NLDEV_ATTR_PORT_INDEX, port) ||
nla_put_u32(msg, RDMA_NLDEV_ATTR_STAT_COUNTER_ID, cntn) ||
nla_put_u32(msg, RDMA_NLDEV_ATTR_RES_LQPN, qpn)) {
ret = -EMSGSIZE;
goto err_fill;
}
nlmsg_end(msg, nlh);
ib_device_put(device);
return rdma_nl_unicast(msg, NETLINK_CB(skb).portid);
err_fill:
rdma_counter_bind_qpn(device, port, qpn, cntn);
err_unbind:
nlmsg_free(msg);
err:
ib_device_put(device);
return ret;
}
static int stat_get_doit_default_counter(struct sk_buff *skb,
struct nlmsghdr *nlh,
struct netlink_ext_ack *extack,
struct nlattr *tb[])
{
struct rdma_hw_stats *stats;
struct nlattr *table_attr;
struct ib_device *device;
int ret, num_cnts, i;
struct sk_buff *msg;
u32 index, port;
u64 v;
if (!tb[RDMA_NLDEV_ATTR_DEV_INDEX] || !tb[RDMA_NLDEV_ATTR_PORT_INDEX])
return -EINVAL;
index = nla_get_u32(tb[RDMA_NLDEV_ATTR_DEV_INDEX]);
device = ib_device_get_by_index(sock_net(skb->sk), index);
if (!device)
return -EINVAL;
if (!device->ops.alloc_hw_stats || !device->ops.get_hw_stats) {
ret = -EINVAL;
goto err;
}
port = nla_get_u32(tb[RDMA_NLDEV_ATTR_PORT_INDEX]);
if (!rdma_is_port_valid(device, port)) {
ret = -EINVAL;
goto err;
}
msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
if (!msg) {
ret = -ENOMEM;
goto err;
}
nlh = nlmsg_put(msg, NETLINK_CB(skb).portid, nlh->nlmsg_seq,
RDMA_NL_GET_TYPE(RDMA_NL_NLDEV,
RDMA_NLDEV_CMD_STAT_GET),
0, 0);
if (fill_nldev_handle(msg, device) ||
nla_put_u32(msg, RDMA_NLDEV_ATTR_PORT_INDEX, port)) {
ret = -EMSGSIZE;
goto err_msg;
}
stats = device->port_data ? device->port_data[port].hw_stats : NULL;
if (stats == NULL) {
ret = -EINVAL;
goto err_msg;
}
mutex_lock(&stats->lock);
num_cnts = device->ops.get_hw_stats(device, stats, port, 0);
if (num_cnts < 0) {
ret = -EINVAL;
goto err_stats;
}
table_attr = nla_nest_start(msg, RDMA_NLDEV_ATTR_STAT_HWCOUNTERS);
if (!table_attr) {
ret = -EMSGSIZE;
goto err_stats;
}
for (i = 0; i < num_cnts; i++) {
v = stats->value[i] +
rdma_counter_get_hwstat_value(device, port, i);
if (fill_stat_hwcounter_entry(msg, stats->names[i], v)) {
ret = -EMSGSIZE;
goto err_table;
}
}
nla_nest_end(msg, table_attr);
mutex_unlock(&stats->lock);
nlmsg_end(msg, nlh);
ib_device_put(device);
return rdma_nl_unicast(msg, NETLINK_CB(skb).portid);
err_table:
nla_nest_cancel(msg, table_attr);
err_stats:
mutex_unlock(&stats->lock);
err_msg:
nlmsg_free(msg);
err:
ib_device_put(device);
return ret;
}
static int stat_get_doit_qp(struct sk_buff *skb, struct nlmsghdr *nlh,
struct netlink_ext_ack *extack, struct nlattr *tb[])
{
static enum rdma_nl_counter_mode mode;
static enum rdma_nl_counter_mask mask;
struct ib_device *device;
struct sk_buff *msg;
u32 index, port;
int ret;
if (tb[RDMA_NLDEV_ATTR_STAT_COUNTER_ID])
return nldev_res_get_counter_doit(skb, nlh, extack);
if (!tb[RDMA_NLDEV_ATTR_STAT_MODE] ||
!tb[RDMA_NLDEV_ATTR_DEV_INDEX] || !tb[RDMA_NLDEV_ATTR_PORT_INDEX])
return -EINVAL;
index = nla_get_u32(tb[RDMA_NLDEV_ATTR_DEV_INDEX]);
device = ib_device_get_by_index(sock_net(skb->sk), index);
if (!device)
return -EINVAL;
port = nla_get_u32(tb[RDMA_NLDEV_ATTR_PORT_INDEX]);
if (!rdma_is_port_valid(device, port)) {
ret = -EINVAL;
goto err;
}
msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
if (!msg) {
ret = -ENOMEM;
goto err;
}
nlh = nlmsg_put(msg, NETLINK_CB(skb).portid, nlh->nlmsg_seq,
RDMA_NL_GET_TYPE(RDMA_NL_NLDEV,
RDMA_NLDEV_CMD_STAT_GET),
0, 0);
ret = rdma_counter_get_mode(device, port, &mode, &mask);
if (ret)
goto err_msg;
if (fill_nldev_handle(msg, device) ||
nla_put_u32(msg, RDMA_NLDEV_ATTR_PORT_INDEX, port) ||
nla_put_u32(msg, RDMA_NLDEV_ATTR_STAT_MODE, mode))
goto err_msg;
if ((mode == RDMA_COUNTER_MODE_AUTO) &&
nla_put_u32(msg, RDMA_NLDEV_ATTR_STAT_AUTO_MODE_MASK, mask))
goto err_msg;
nlmsg_end(msg, nlh);
ib_device_put(device);
return rdma_nl_unicast(msg, NETLINK_CB(skb).portid);
err_msg:
nlmsg_free(msg);
err:
ib_device_put(device);
return ret;
}
static int nldev_stat_get_doit(struct sk_buff *skb, struct nlmsghdr *nlh,
struct netlink_ext_ack *extack)
{
struct nlattr *tb[RDMA_NLDEV_ATTR_MAX];
int ret;
ret = nlmsg_parse(nlh, 0, tb, RDMA_NLDEV_ATTR_MAX - 1,
nldev_policy, extack);
if (ret)
return -EINVAL;
if (!tb[RDMA_NLDEV_ATTR_STAT_RES])
return stat_get_doit_default_counter(skb, nlh, extack, tb);
switch (nla_get_u32(tb[RDMA_NLDEV_ATTR_STAT_RES])) {
case RDMA_NLDEV_ATTR_RES_QP:
ret = stat_get_doit_qp(skb, nlh, extack, tb);
break;
default:
ret = -EINVAL;
break;
}
return ret;
}
static int nldev_stat_get_dumpit(struct sk_buff *skb,
struct netlink_callback *cb)
{
struct nlattr *tb[RDMA_NLDEV_ATTR_MAX];
int ret;
ret = nlmsg_parse(cb->nlh, 0, tb, RDMA_NLDEV_ATTR_MAX - 1,
nldev_policy, NULL);
if (ret || !tb[RDMA_NLDEV_ATTR_STAT_RES])
return -EINVAL;
switch (nla_get_u32(tb[RDMA_NLDEV_ATTR_STAT_RES])) {
case RDMA_NLDEV_ATTR_RES_QP:
ret = nldev_res_get_counter_dumpit(skb, cb);
break;
default:
ret = -EINVAL;
break;
}
return ret;
}
static const struct rdma_nl_cbs nldev_cb_table[RDMA_NLDEV_NUM_OPS] = {
[RDMA_NLDEV_CMD_GET] = {
.doit = nldev_get_doit,
.dump = nldev_get_dumpit,
},
[RDMA_NLDEV_CMD_GET_CHARDEV] = {
.doit = nldev_get_chardev,
},
[RDMA_NLDEV_CMD_SET] = {
.doit = nldev_set_doit,
.flags = RDMA_NL_ADMIN_PERM,
@ -1449,6 +2074,17 @@ static const struct rdma_nl_cbs nldev_cb_table[RDMA_NLDEV_NUM_OPS] = {
},
[RDMA_NLDEV_CMD_SYS_SET] = {
.doit = nldev_set_sys_set_doit,
},
[RDMA_NLDEV_CMD_STAT_SET] = {
.doit = nldev_stat_set_doit,
.flags = RDMA_NL_ADMIN_PERM,
},
[RDMA_NLDEV_CMD_STAT_GET] = {
.doit = nldev_stat_get_doit,
.dump = nldev_stat_get_dumpit,
},
[RDMA_NLDEV_CMD_STAT_DEL] = {
.doit = nldev_stat_del_doit,
.flags = RDMA_NL_ADMIN_PERM,
},
};
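
The table above wires the new stat netlink ops (RDMA_NLDEV_CMD_STAT_GET/SET/DEL) into the nldev dispatcher next to the existing GET/SET handlers and the new GET_CHARDEV op. As a rough illustration of how a userspace client drives one of these ops, the sketch below is not part of the patch: it assumes libnl-3 is available and hard-codes device index 0 and port 1, which real tooling (iproute2's "rdma statistic") would first discover via RDMA_NLDEV_CMD_GET. It sends a STAT_GET request with no RDMA_NLDEV_ATTR_STAT_RES attribute, so the kernel takes the stat_get_doit_default_counter() path shown earlier and replies with the per-port hardware counters.

#include <linux/netlink.h>
#include <netlink/netlink.h>
#include <netlink/msg.h>
#include <netlink/attr.h>
#include <rdma/rdma_netlink.h>

int main(void)
{
        struct nl_sock *sk = nl_socket_alloc();
        struct nl_msg *msg;

        if (!sk || nl_connect(sk, NETLINK_RDMA) < 0)
                return 1;

        msg = nlmsg_alloc();
        if (!msg)
                return 1;

        /* Omitting RDMA_NLDEV_ATTR_STAT_RES requests the default hw counters */
        nlmsg_put(msg, NL_AUTO_PORT, NL_AUTO_SEQ,
                  RDMA_NL_GET_TYPE(RDMA_NL_NLDEV, RDMA_NLDEV_CMD_STAT_GET),
                  0, NLM_F_REQUEST | NLM_F_ACK);
        nla_put_u32(msg, RDMA_NLDEV_ATTR_DEV_INDEX, 0);  /* assumed ibdev index */
        nla_put_u32(msg, RDMA_NLDEV_ATTR_PORT_INDEX, 1); /* assumed port */

        if (nl_send_auto(sk, msg) < 0)
                return 1;
        nl_recvmsgs_default(sk); /* parsing of the reply attributes omitted */

        nlmsg_free(msg);
        nl_socket_free(sk);
        return 0;
}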


@ -6,6 +6,7 @@
#include <rdma/rdma_cm.h>
#include <rdma/ib_verbs.h>
#include <rdma/restrack.h>
#include <rdma/rdma_counter.h>
#include <linux/mutex.h>
#include <linux/sched/task.h>
#include <linux/pid_namespace.h>
@ -45,6 +46,7 @@ static const char *type2str(enum rdma_restrack_type type)
[RDMA_RESTRACK_CM_ID] = "CM_ID",
[RDMA_RESTRACK_MR] = "MR",
[RDMA_RESTRACK_CTX] = "CTX",
[RDMA_RESTRACK_COUNTER] = "COUNTER",
};
return names[type];
@ -169,6 +171,8 @@ static struct ib_device *res_to_dev(struct rdma_restrack_entry *res)
return container_of(res, struct ib_mr, res)->device;
case RDMA_RESTRACK_CTX:
return container_of(res, struct ib_ucontext, res)->device;
case RDMA_RESTRACK_COUNTER:
return container_of(res, struct rdma_counter, res)->device;
default:
WARN_ONCE(true, "Wrong resource tracking type %u\n", res->type);
return NULL;
@ -190,6 +194,20 @@ void rdma_restrack_set_task(struct rdma_restrack_entry *res,
}
EXPORT_SYMBOL(rdma_restrack_set_task);
/**
* rdma_restrack_attach_task() - attach the task onto this resource
* @res: resource entry
* @task: the task to attach, the current task will be used if it is NULL.
*/
void rdma_restrack_attach_task(struct rdma_restrack_entry *res,
struct task_struct *task)
{
if (res->task)
put_task_struct(res->task);
get_task_struct(task);
res->task = task;
}
static void rdma_restrack_add(struct rdma_restrack_entry *res)
{
struct ib_device *dev = res_to_dev(res);
@ -203,15 +221,22 @@ static void rdma_restrack_add(struct rdma_restrack_entry *res)
kref_init(&res->kref);
init_completion(&res->comp);
if (res->type != RDMA_RESTRACK_QP)
ret = xa_alloc_cyclic(&rt->xa, &res->id, res, xa_limit_32b,
&rt->next_id, GFP_KERNEL);
else {
if (res->type == RDMA_RESTRACK_QP) {
/* Special case to ensure that LQPN points to right QP */
struct ib_qp *qp = container_of(res, struct ib_qp, res);
ret = xa_insert(&rt->xa, qp->qp_num, res, GFP_KERNEL);
res->id = ret ? 0 : qp->qp_num;
} else if (res->type == RDMA_RESTRACK_COUNTER) {
/* Special case to ensure that cntn points to right counter */
struct rdma_counter *counter;
counter = container_of(res, struct rdma_counter, res);
ret = xa_insert(&rt->xa, counter->id, res, GFP_KERNEL);
res->id = ret ? 0 : counter->id;
} else {
ret = xa_alloc_cyclic(&rt->xa, &res->id, res, xa_limit_32b,
&rt->next_id, GFP_KERNEL);
}
if (!ret)
@ -237,7 +262,8 @@ EXPORT_SYMBOL(rdma_restrack_kadd);
*/
void rdma_restrack_uadd(struct rdma_restrack_entry *res)
{
if (res->type != RDMA_RESTRACK_CM_ID)
if ((res->type != RDMA_RESTRACK_CM_ID) &&
(res->type != RDMA_RESTRACK_COUNTER))
res->task = NULL;
if (!res->task)
@ -323,3 +349,16 @@ out:
}
}
EXPORT_SYMBOL(rdma_restrack_del);
bool rdma_is_visible_in_pid_ns(struct rdma_restrack_entry *res)
{
/*
* 1. Kern resources should be visible in init
* namespace only
* 2. Present only resources visible in the current
* namespace
*/
if (rdma_is_kernel_res(res))
return task_active_pid_ns(current) == &init_pid_ns;
return task_active_pid_ns(current) == task_active_pid_ns(res->task);
}


@ -25,4 +25,7 @@ struct rdma_restrack_root {
int rdma_restrack_init(struct ib_device *dev);
void rdma_restrack_clean(struct ib_device *dev);
void rdma_restrack_attach_task(struct rdma_restrack_entry *res,
struct task_struct *task);
bool rdma_is_visible_in_pid_ns(struct rdma_restrack_entry *res);
#endif /* _RDMA_CORE_RESTRACK_H_ */


@ -51,24 +51,23 @@ static inline bool rdma_rw_io_needs_mr(struct ib_device *dev, u8 port_num,
return false;
}
static inline u32 rdma_rw_fr_page_list_len(struct ib_device *dev)
static inline u32 rdma_rw_fr_page_list_len(struct ib_device *dev,
bool pi_support)
{
u32 max_pages;
if (pi_support)
max_pages = dev->attrs.max_pi_fast_reg_page_list_len;
else
max_pages = dev->attrs.max_fast_reg_page_list_len;
/* arbitrary limit to avoid allocating gigantic resources */
return min_t(u32, dev->attrs.max_fast_reg_page_list_len, 256);
return min_t(u32, max_pages, 256);
}
/* Caller must have zero-initialized *reg. */
static int rdma_rw_init_one_mr(struct ib_qp *qp, u8 port_num,
struct rdma_rw_reg_ctx *reg, struct scatterlist *sg,
u32 sg_cnt, u32 offset)
static inline int rdma_rw_inv_key(struct rdma_rw_reg_ctx *reg)
{
u32 pages_per_mr = rdma_rw_fr_page_list_len(qp->pd->device);
u32 nents = min(sg_cnt, pages_per_mr);
int count = 0, ret;
reg->mr = ib_mr_pool_get(qp, &qp->rdma_mrs);
if (!reg->mr)
return -EAGAIN;
int count = 0;
if (reg->mr->need_inval) {
reg->inv_wr.opcode = IB_WR_LOCAL_INV;
@ -79,6 +78,25 @@ static int rdma_rw_init_one_mr(struct ib_qp *qp, u8 port_num,
reg->inv_wr.next = NULL;
}
return count;
}
/* Caller must have zero-initialized *reg. */
static int rdma_rw_init_one_mr(struct ib_qp *qp, u8 port_num,
struct rdma_rw_reg_ctx *reg, struct scatterlist *sg,
u32 sg_cnt, u32 offset)
{
u32 pages_per_mr = rdma_rw_fr_page_list_len(qp->pd->device,
qp->integrity_en);
u32 nents = min(sg_cnt, pages_per_mr);
int count = 0, ret;
reg->mr = ib_mr_pool_get(qp, &qp->rdma_mrs);
if (!reg->mr)
return -EAGAIN;
count += rdma_rw_inv_key(reg);
ret = ib_map_mr_sg(reg->mr, sg, nents, &offset, PAGE_SIZE);
if (ret < 0 || ret < nents) {
ib_mr_pool_put(qp, &qp->rdma_mrs, reg->mr);
@ -102,7 +120,8 @@ static int rdma_rw_init_mr_wrs(struct rdma_rw_ctx *ctx, struct ib_qp *qp,
u64 remote_addr, u32 rkey, enum dma_data_direction dir)
{
struct rdma_rw_reg_ctx *prev = NULL;
u32 pages_per_mr = rdma_rw_fr_page_list_len(qp->pd->device);
u32 pages_per_mr = rdma_rw_fr_page_list_len(qp->pd->device,
qp->integrity_en);
int i, j, ret = 0, count = 0;
ctx->nr_ops = (sg_cnt + pages_per_mr - 1) / pages_per_mr;
@ -343,13 +362,14 @@ int rdma_rw_ctx_signature_init(struct rdma_rw_ctx *ctx, struct ib_qp *qp,
u64 remote_addr, u32 rkey, enum dma_data_direction dir)
{
struct ib_device *dev = qp->pd->device;
u32 pages_per_mr = rdma_rw_fr_page_list_len(qp->pd->device);
u32 pages_per_mr = rdma_rw_fr_page_list_len(qp->pd->device,
qp->integrity_en);
struct ib_rdma_wr *rdma_wr;
struct ib_send_wr *prev_wr = NULL;
int count = 0, ret;
if (sg_cnt > pages_per_mr || prot_sg_cnt > pages_per_mr) {
pr_err("SG count too large\n");
pr_err("SG count too large: sg_cnt=%d, prot_sg_cnt=%d, pages_per_mr=%d\n",
sg_cnt, prot_sg_cnt, pages_per_mr);
return -EINVAL;
}
@ -358,75 +378,58 @@ int rdma_rw_ctx_signature_init(struct rdma_rw_ctx *ctx, struct ib_qp *qp,
return -ENOMEM;
sg_cnt = ret;
ret = ib_dma_map_sg(dev, prot_sg, prot_sg_cnt, dir);
if (!ret) {
ret = -ENOMEM;
goto out_unmap_sg;
if (prot_sg_cnt) {
ret = ib_dma_map_sg(dev, prot_sg, prot_sg_cnt, dir);
if (!ret) {
ret = -ENOMEM;
goto out_unmap_sg;
}
prot_sg_cnt = ret;
}
prot_sg_cnt = ret;
ctx->type = RDMA_RW_SIG_MR;
ctx->nr_ops = 1;
ctx->sig = kcalloc(1, sizeof(*ctx->sig), GFP_KERNEL);
if (!ctx->sig) {
ctx->reg = kcalloc(1, sizeof(*ctx->reg), GFP_KERNEL);
if (!ctx->reg) {
ret = -ENOMEM;
goto out_unmap_prot_sg;
}
ret = rdma_rw_init_one_mr(qp, port_num, &ctx->sig->data, sg, sg_cnt, 0);
if (ret < 0)
goto out_free_ctx;
count += ret;
prev_wr = &ctx->sig->data.reg_wr.wr;
ret = rdma_rw_init_one_mr(qp, port_num, &ctx->sig->prot,
prot_sg, prot_sg_cnt, 0);
if (ret < 0)
goto out_destroy_data_mr;
count += ret;
if (ctx->sig->prot.inv_wr.next)
prev_wr->next = &ctx->sig->prot.inv_wr;
else
prev_wr->next = &ctx->sig->prot.reg_wr.wr;
prev_wr = &ctx->sig->prot.reg_wr.wr;
ctx->sig->sig_mr = ib_mr_pool_get(qp, &qp->sig_mrs);
if (!ctx->sig->sig_mr) {
ctx->reg->mr = ib_mr_pool_get(qp, &qp->sig_mrs);
if (!ctx->reg->mr) {
ret = -EAGAIN;
goto out_destroy_prot_mr;
goto out_free_ctx;
}
if (ctx->sig->sig_mr->need_inval) {
memset(&ctx->sig->sig_inv_wr, 0, sizeof(ctx->sig->sig_inv_wr));
count += rdma_rw_inv_key(ctx->reg);
ctx->sig->sig_inv_wr.opcode = IB_WR_LOCAL_INV;
ctx->sig->sig_inv_wr.ex.invalidate_rkey = ctx->sig->sig_mr->rkey;
memcpy(ctx->reg->mr->sig_attrs, sig_attrs, sizeof(struct ib_sig_attrs));
prev_wr->next = &ctx->sig->sig_inv_wr;
prev_wr = &ctx->sig->sig_inv_wr;
ret = ib_map_mr_sg_pi(ctx->reg->mr, sg, sg_cnt, NULL, prot_sg,
prot_sg_cnt, NULL, SZ_4K);
if (unlikely(ret)) {
pr_err("failed to map PI sg (%d)\n", sg_cnt + prot_sg_cnt);
goto out_destroy_sig_mr;
}
ctx->sig->sig_wr.wr.opcode = IB_WR_REG_SIG_MR;
ctx->sig->sig_wr.wr.wr_cqe = NULL;
ctx->sig->sig_wr.wr.sg_list = &ctx->sig->data.sge;
ctx->sig->sig_wr.wr.num_sge = 1;
ctx->sig->sig_wr.access_flags = IB_ACCESS_LOCAL_WRITE;
ctx->sig->sig_wr.sig_attrs = sig_attrs;
ctx->sig->sig_wr.sig_mr = ctx->sig->sig_mr;
if (prot_sg_cnt)
ctx->sig->sig_wr.prot = &ctx->sig->prot.sge;
prev_wr->next = &ctx->sig->sig_wr.wr;
prev_wr = &ctx->sig->sig_wr.wr;
ctx->reg->reg_wr.wr.opcode = IB_WR_REG_MR_INTEGRITY;
ctx->reg->reg_wr.wr.wr_cqe = NULL;
ctx->reg->reg_wr.wr.num_sge = 0;
ctx->reg->reg_wr.wr.send_flags = 0;
ctx->reg->reg_wr.access = IB_ACCESS_LOCAL_WRITE;
if (rdma_protocol_iwarp(qp->device, port_num))
ctx->reg->reg_wr.access |= IB_ACCESS_REMOTE_WRITE;
ctx->reg->reg_wr.mr = ctx->reg->mr;
ctx->reg->reg_wr.key = ctx->reg->mr->lkey;
count++;
ctx->sig->sig_sge.addr = 0;
ctx->sig->sig_sge.length = ctx->sig->data.sge.length;
if (sig_attrs->wire.sig_type != IB_SIG_TYPE_NONE)
ctx->sig->sig_sge.length += ctx->sig->prot.sge.length;
ctx->reg->sge.addr = ctx->reg->mr->iova;
ctx->reg->sge.length = ctx->reg->mr->length;
if (sig_attrs->wire.sig_type == IB_SIG_TYPE_NONE)
ctx->reg->sge.length -= ctx->reg->mr->sig_attrs->meta_length;
rdma_wr = &ctx->sig->data.wr;
rdma_wr->wr.sg_list = &ctx->sig->sig_sge;
rdma_wr = &ctx->reg->wr;
rdma_wr->wr.sg_list = &ctx->reg->sge;
rdma_wr->wr.num_sge = 1;
rdma_wr->remote_addr = remote_addr;
rdma_wr->rkey = rkey;
@ -434,21 +437,18 @@ int rdma_rw_ctx_signature_init(struct rdma_rw_ctx *ctx, struct ib_qp *qp,
rdma_wr->wr.opcode = IB_WR_RDMA_WRITE;
else
rdma_wr->wr.opcode = IB_WR_RDMA_READ;
prev_wr->next = &rdma_wr->wr;
prev_wr = &rdma_wr->wr;
ctx->reg->reg_wr.wr.next = &rdma_wr->wr;
count++;
return count;
out_destroy_prot_mr:
if (prot_sg_cnt)
ib_mr_pool_put(qp, &qp->rdma_mrs, ctx->sig->prot.mr);
out_destroy_data_mr:
ib_mr_pool_put(qp, &qp->rdma_mrs, ctx->sig->data.mr);
out_destroy_sig_mr:
ib_mr_pool_put(qp, &qp->sig_mrs, ctx->reg->mr);
out_free_ctx:
kfree(ctx->sig);
kfree(ctx->reg);
out_unmap_prot_sg:
ib_dma_unmap_sg(dev, prot_sg, prot_sg_cnt, dir);
if (prot_sg_cnt)
ib_dma_unmap_sg(dev, prot_sg, prot_sg_cnt, dir);
out_unmap_sg:
ib_dma_unmap_sg(dev, sg, sg_cnt, dir);
return ret;
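
For orientation, the reworked rdma_rw_ctx_signature_init() above now rests on the new integrity MR API (ib_alloc_mr_integrity(), ib_map_mr_sg_pi() and the IB_WR_REG_MR_INTEGRITY work request) instead of the removed IB_WR_REG_SIG_MR flow. The following is a minimal ULP-side sketch of that flow, not taken from the patch: the helper name and its pd/qp/scatterlist/sig_attrs parameters are assumed to exist, error handling is trimmed, and the QP is assumed to have been created with IB_QP_CREATE_INTEGRITY_EN.

#include <linux/err.h>
#include <linux/scatterlist.h>
#include <linux/sizes.h>
#include <linux/string.h>
#include <rdma/ib_verbs.h>

static int example_reg_pi_mr(struct ib_pd *pd, struct ib_qp *qp,
                             struct scatterlist *data_sg, int data_nents,
                             struct scatterlist *prot_sg, int prot_nents,
                             const struct ib_sig_attrs *sig_attrs)
{
        struct ib_reg_wr reg_wr = {};
        struct ib_mr *mr;
        int ret;

        /* One integrity MR now covers both the data and the protection SGLs */
        mr = ib_alloc_mr_integrity(pd, data_nents, prot_nents);
        if (IS_ERR(mr))
                return PTR_ERR(mr);

        memcpy(mr->sig_attrs, sig_attrs, sizeof(*sig_attrs));

        /* Negative errno on failure; real callers also check that all
         * data and protection elements were mapped. */
        ret = ib_map_mr_sg_pi(mr, data_sg, data_nents, NULL,
                              prot_sg, prot_nents, NULL, SZ_4K);
        if (ret < 0)
                goto out_dereg;

        reg_wr.wr.opcode = IB_WR_REG_MR_INTEGRITY;
        reg_wr.mr = mr;
        reg_wr.key = mr->lkey;
        reg_wr.access = IB_ACCESS_LOCAL_WRITE;

        ret = ib_post_send(qp, &reg_wr.wr, NULL);
out_dereg:
        if (ret)
                ib_dereg_mr(mr);
        return ret;
}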
@ -491,22 +491,8 @@ struct ib_send_wr *rdma_rw_ctx_wrs(struct rdma_rw_ctx *ctx, struct ib_qp *qp,
switch (ctx->type) {
case RDMA_RW_SIG_MR:
rdma_rw_update_lkey(&ctx->sig->data, true);
if (ctx->sig->prot.mr)
rdma_rw_update_lkey(&ctx->sig->prot, true);
ctx->sig->sig_mr->need_inval = true;
ib_update_fast_reg_key(ctx->sig->sig_mr,
ib_inc_rkey(ctx->sig->sig_mr->lkey));
ctx->sig->sig_sge.lkey = ctx->sig->sig_mr->lkey;
if (ctx->sig->data.inv_wr.next)
first_wr = &ctx->sig->data.inv_wr;
else
first_wr = &ctx->sig->data.reg_wr.wr;
last_wr = &ctx->sig->data.wr.wr;
break;
case RDMA_RW_MR:
/* fallthrough */
for (i = 0; i < ctx->nr_ops; i++) {
rdma_rw_update_lkey(&ctx->reg[i],
ctx->reg[i].wr.wr.opcode !=
@ -605,7 +591,7 @@ EXPORT_SYMBOL(rdma_rw_ctx_destroy);
/**
* rdma_rw_ctx_destroy_signature - release all resources allocated by
* rdma_rw_ctx_init_signature
* rdma_rw_ctx_signature_init
* @ctx: context to release
* @qp: queue pair to operate on
* @port_num: port num to which the connection is bound
@ -623,16 +609,12 @@ void rdma_rw_ctx_destroy_signature(struct rdma_rw_ctx *ctx, struct ib_qp *qp,
if (WARN_ON_ONCE(ctx->type != RDMA_RW_SIG_MR))
return;
ib_mr_pool_put(qp, &qp->rdma_mrs, ctx->sig->data.mr);
ib_mr_pool_put(qp, &qp->sig_mrs, ctx->reg->mr);
kfree(ctx->reg);
ib_dma_unmap_sg(qp->pd->device, sg, sg_cnt, dir);
if (ctx->sig->prot.mr) {
ib_mr_pool_put(qp, &qp->rdma_mrs, ctx->sig->prot.mr);
if (prot_sg_cnt)
ib_dma_unmap_sg(qp->pd->device, prot_sg, prot_sg_cnt, dir);
}
ib_mr_pool_put(qp, &qp->sig_mrs, ctx->sig->sig_mr);
kfree(ctx->sig);
}
EXPORT_SYMBOL(rdma_rw_ctx_destroy_signature);
@ -653,7 +635,7 @@ unsigned int rdma_rw_mr_factor(struct ib_device *device, u8 port_num,
unsigned int mr_pages;
if (rdma_rw_can_use_mr(device, port_num))
mr_pages = rdma_rw_fr_page_list_len(device);
mr_pages = rdma_rw_fr_page_list_len(device, false);
else
mr_pages = device->attrs.max_sge_rd;
return DIV_ROUND_UP(maxpages, mr_pages);
@ -679,9 +661,8 @@ void rdma_rw_init_qp(struct ib_device *dev, struct ib_qp_init_attr *attr)
* we'll need two additional MRs for the registrations and the
* invalidation.
*/
if (attr->create_flags & IB_QP_CREATE_SIGNATURE_EN)
factor += 6; /* (inv + reg) * (data + prot + sig) */
else if (rdma_rw_can_use_mr(dev, attr->port_num))
if (attr->create_flags & IB_QP_CREATE_INTEGRITY_EN ||
rdma_rw_can_use_mr(dev, attr->port_num))
factor += 2; /* inv + reg */
attr->cap.max_send_wr += factor * attr->cap.max_rdma_ctxs;
@ -697,20 +678,22 @@ void rdma_rw_init_qp(struct ib_device *dev, struct ib_qp_init_attr *attr)
int rdma_rw_init_mrs(struct ib_qp *qp, struct ib_qp_init_attr *attr)
{
struct ib_device *dev = qp->pd->device;
u32 nr_mrs = 0, nr_sig_mrs = 0;
u32 nr_mrs = 0, nr_sig_mrs = 0, max_num_sg = 0;
int ret = 0;
if (attr->create_flags & IB_QP_CREATE_SIGNATURE_EN) {
if (attr->create_flags & IB_QP_CREATE_INTEGRITY_EN) {
nr_sig_mrs = attr->cap.max_rdma_ctxs;
nr_mrs = attr->cap.max_rdma_ctxs * 2;
nr_mrs = attr->cap.max_rdma_ctxs;
max_num_sg = rdma_rw_fr_page_list_len(dev, true);
} else if (rdma_rw_can_use_mr(dev, attr->port_num)) {
nr_mrs = attr->cap.max_rdma_ctxs;
max_num_sg = rdma_rw_fr_page_list_len(dev, false);
}
if (nr_mrs) {
ret = ib_mr_pool_init(qp, &qp->rdma_mrs, nr_mrs,
IB_MR_TYPE_MEM_REG,
rdma_rw_fr_page_list_len(dev));
max_num_sg, 0);
if (ret) {
pr_err("%s: failed to allocated %d MRs\n",
__func__, nr_mrs);
@ -720,10 +703,10 @@ int rdma_rw_init_mrs(struct ib_qp *qp, struct ib_qp_init_attr *attr)
if (nr_sig_mrs) {
ret = ib_mr_pool_init(qp, &qp->sig_mrs, nr_sig_mrs,
IB_MR_TYPE_SIGNATURE, 2);
IB_MR_TYPE_INTEGRITY, max_num_sg, max_num_sg);
if (ret) {
pr_err("%s: failed to allocated %d SIG MRs\n",
__func__, nr_mrs);
__func__, nr_sig_mrs);
goto out_free_rdma_mrs;
}
}


@ -43,6 +43,7 @@
#include <rdma/ib_mad.h>
#include <rdma/ib_pma.h>
#include <rdma/ib_cache.h>
#include <rdma/rdma_counter.h>
struct ib_port;
@ -800,9 +801,12 @@ static int update_hw_stats(struct ib_device *dev, struct rdma_hw_stats *stats,
return 0;
}
static ssize_t print_hw_stat(struct rdma_hw_stats *stats, int index, char *buf)
static ssize_t print_hw_stat(struct ib_device *dev, int port_num,
struct rdma_hw_stats *stats, int index, char *buf)
{
return sprintf(buf, "%llu\n", stats->value[index]);
u64 v = rdma_counter_get_hwstat_value(dev, port_num, index);
return sprintf(buf, "%llu\n", stats->value[index] + v);
}
static ssize_t show_hw_stats(struct kobject *kobj, struct attribute *attr,
@ -828,7 +832,7 @@ static ssize_t show_hw_stats(struct kobject *kobj, struct attribute *attr,
ret = update_hw_stats(dev, stats, hsa->port_num, hsa->index);
if (ret)
goto unlock;
ret = print_hw_stat(stats, hsa->index, buf);
ret = print_hw_stat(dev, hsa->port_num, stats, hsa->index, buf);
unlock:
mutex_unlock(&stats->lock);
@ -999,6 +1003,8 @@ static void setup_hw_stats(struct ib_device *device, struct ib_port *port,
goto err;
port->hw_stats_ag = hsag;
port->hw_stats = stats;
if (device->port_data)
device->port_data[port_num].hw_stats = stats;
} else {
struct kobject *kobj = &device->dev.kobj;
ret = sysfs_create_group(kobj, hsag);
@ -1289,6 +1295,8 @@ const struct attribute_group ib_dev_attr_group = {
void ib_free_port_attrs(struct ib_core_device *coredev)
{
struct ib_device *device = rdma_device_to_ibdev(&coredev->dev);
bool is_full_dev = &device->coredev == coredev;
struct kobject *p, *t;
list_for_each_entry_safe(p, t, &coredev->port_list, entry) {
@ -1298,6 +1306,8 @@ void ib_free_port_attrs(struct ib_core_device *coredev)
if (port->hw_stats_ag)
free_hsag(&port->kobj, port->hw_stats_ag);
kfree(port->hw_stats);
if (device->port_data && is_full_dev)
device->port_data[port->port_num].hw_stats = NULL;
if (port->pma_table)
sysfs_remove_group(p, port->pma_table);

File diff suppressed because it is too large.


@ -52,6 +52,8 @@
#include <rdma/rdma_cm_ib.h>
#include <rdma/ib_addr.h>
#include <rdma/ib.h>
#include <rdma/rdma_netlink.h>
#include "core_priv.h"
MODULE_AUTHOR("Sean Hefty");
MODULE_DESCRIPTION("RDMA Userspace Connection Manager Access");
@ -81,7 +83,7 @@ struct ucma_file {
};
struct ucma_context {
int id;
u32 id;
struct completion comp;
atomic_t ref;
int events_reported;
@ -94,7 +96,7 @@ struct ucma_context {
struct list_head list;
struct list_head mc_list;
/* mark that device is in process of destroying the internal HW
* resources, protected by the global mut
* resources, protected by the ctx_table lock
*/
int closing;
/* sync between removal event and id destroy, protected by file mut */
@ -104,7 +106,7 @@ struct ucma_context {
struct ucma_multicast {
struct ucma_context *ctx;
int id;
u32 id;
int events_reported;
u64 uid;
@ -122,9 +124,8 @@ struct ucma_event {
struct work_struct close_work;
};
static DEFINE_MUTEX(mut);
static DEFINE_IDR(ctx_idr);
static DEFINE_IDR(multicast_idr);
static DEFINE_XARRAY_ALLOC(ctx_table);
static DEFINE_XARRAY_ALLOC(multicast_table);
static const struct file_operations ucma_fops;
@ -133,7 +134,7 @@ static inline struct ucma_context *_ucma_find_context(int id,
{
struct ucma_context *ctx;
ctx = idr_find(&ctx_idr, id);
ctx = xa_load(&ctx_table, id);
if (!ctx)
ctx = ERR_PTR(-ENOENT);
else if (ctx->file != file || !ctx->cm_id)
@ -145,7 +146,7 @@ static struct ucma_context *ucma_get_ctx(struct ucma_file *file, int id)
{
struct ucma_context *ctx;
mutex_lock(&mut);
xa_lock(&ctx_table);
ctx = _ucma_find_context(id, file);
if (!IS_ERR(ctx)) {
if (ctx->closing)
@ -153,7 +154,7 @@ static struct ucma_context *ucma_get_ctx(struct ucma_file *file, int id)
else
atomic_inc(&ctx->ref);
}
mutex_unlock(&mut);
xa_unlock(&ctx_table);
return ctx;
}
@ -216,10 +217,7 @@ static struct ucma_context *ucma_alloc_ctx(struct ucma_file *file)
INIT_LIST_HEAD(&ctx->mc_list);
ctx->file = file;
mutex_lock(&mut);
ctx->id = idr_alloc(&ctx_idr, ctx, 0, 0, GFP_KERNEL);
mutex_unlock(&mut);
if (ctx->id < 0)
if (xa_alloc(&ctx_table, &ctx->id, ctx, xa_limit_32b, GFP_KERNEL))
goto error;
list_add_tail(&ctx->list, &file->ctx_list);
@ -238,13 +236,10 @@ static struct ucma_multicast* ucma_alloc_multicast(struct ucma_context *ctx)
if (!mc)
return NULL;
mutex_lock(&mut);
mc->id = idr_alloc(&multicast_idr, NULL, 0, 0, GFP_KERNEL);
mutex_unlock(&mut);
if (mc->id < 0)
mc->ctx = ctx;
if (xa_alloc(&multicast_table, &mc->id, NULL, xa_limit_32b, GFP_KERNEL))
goto error;
mc->ctx = ctx;
list_add_tail(&mc->list, &ctx->mc_list);
return mc;
@ -319,9 +314,9 @@ static void ucma_removal_event_handler(struct rdma_cm_id *cm_id)
* handled separately below.
*/
if (ctx->cm_id == cm_id) {
mutex_lock(&mut);
xa_lock(&ctx_table);
ctx->closing = 1;
mutex_unlock(&mut);
xa_unlock(&ctx_table);
queue_work(ctx->file->close_wq, &ctx->close_work);
return;
}
@ -523,9 +518,7 @@ static ssize_t ucma_create_id(struct ucma_file *file, const char __user *inbuf,
err2:
rdma_destroy_id(cm_id);
err1:
mutex_lock(&mut);
idr_remove(&ctx_idr, ctx->id);
mutex_unlock(&mut);
xa_erase(&ctx_table, ctx->id);
mutex_lock(&file->mut);
list_del(&ctx->list);
mutex_unlock(&file->mut);
@ -537,13 +530,13 @@ static void ucma_cleanup_multicast(struct ucma_context *ctx)
{
struct ucma_multicast *mc, *tmp;
mutex_lock(&mut);
mutex_lock(&ctx->file->mut);
list_for_each_entry_safe(mc, tmp, &ctx->mc_list, list) {
list_del(&mc->list);
idr_remove(&multicast_idr, mc->id);
xa_erase(&multicast_table, mc->id);
kfree(mc);
}
mutex_unlock(&mut);
mutex_unlock(&ctx->file->mut);
}
static void ucma_cleanup_mc_events(struct ucma_multicast *mc)
@ -614,11 +607,11 @@ static ssize_t ucma_destroy_id(struct ucma_file *file, const char __user *inbuf,
if (copy_from_user(&cmd, inbuf, sizeof(cmd)))
return -EFAULT;
mutex_lock(&mut);
xa_lock(&ctx_table);
ctx = _ucma_find_context(cmd.id, file);
if (!IS_ERR(ctx))
idr_remove(&ctx_idr, ctx->id);
mutex_unlock(&mut);
__xa_erase(&ctx_table, ctx->id);
xa_unlock(&ctx_table);
if (IS_ERR(ctx))
return PTR_ERR(ctx);
@ -630,14 +623,14 @@ static ssize_t ucma_destroy_id(struct ucma_file *file, const char __user *inbuf,
flush_workqueue(ctx->file->close_wq);
/* At this point it's guaranteed that there is no inflight
* closing task */
mutex_lock(&mut);
xa_lock(&ctx_table);
if (!ctx->closing) {
mutex_unlock(&mut);
xa_unlock(&ctx_table);
ucma_put_ctx(ctx);
wait_for_completion(&ctx->comp);
rdma_destroy_id(ctx->cm_id);
} else {
mutex_unlock(&mut);
xa_unlock(&ctx_table);
}
resp.events_reported = ucma_free_ctx(ctx);
@ -951,8 +944,7 @@ static ssize_t ucma_query_path(struct ucma_context *ctx,
}
}
if (copy_to_user(response, resp,
sizeof(*resp) + (i * sizeof(struct ib_path_rec_data))))
if (copy_to_user(response, resp, struct_size(resp, path_data, i)))
ret = -EFAULT;
kfree(resp);
@ -1432,9 +1424,7 @@ static ssize_t ucma_process_join(struct ucma_file *file,
goto err3;
}
mutex_lock(&mut);
idr_replace(&multicast_idr, mc, mc->id);
mutex_unlock(&mut);
xa_store(&multicast_table, mc->id, mc, 0);
mutex_unlock(&file->mut);
ucma_put_ctx(ctx);
@ -1444,9 +1434,7 @@ err3:
rdma_leave_multicast(ctx->cm_id, (struct sockaddr *) &mc->addr);
ucma_cleanup_mc_events(mc);
err2:
mutex_lock(&mut);
idr_remove(&multicast_idr, mc->id);
mutex_unlock(&mut);
xa_erase(&multicast_table, mc->id);
list_del(&mc->list);
kfree(mc);
err1:
@ -1508,8 +1496,8 @@ static ssize_t ucma_leave_multicast(struct ucma_file *file,
if (copy_from_user(&cmd, inbuf, sizeof(cmd)))
return -EFAULT;
mutex_lock(&mut);
mc = idr_find(&multicast_idr, cmd.id);
xa_lock(&multicast_table);
mc = xa_load(&multicast_table, cmd.id);
if (!mc)
mc = ERR_PTR(-ENOENT);
else if (mc->ctx->file != file)
@ -1517,8 +1505,8 @@ static ssize_t ucma_leave_multicast(struct ucma_file *file,
else if (!atomic_inc_not_zero(&mc->ctx->ref))
mc = ERR_PTR(-ENXIO);
else
idr_remove(&multicast_idr, mc->id);
mutex_unlock(&mut);
__xa_erase(&multicast_table, mc->id);
xa_unlock(&multicast_table);
if (IS_ERR(mc)) {
ret = PTR_ERR(mc);
@ -1615,14 +1603,14 @@ static ssize_t ucma_migrate_id(struct ucma_file *new_file,
* events being added before existing events.
*/
ucma_lock_files(cur_file, new_file);
mutex_lock(&mut);
xa_lock(&ctx_table);
list_move_tail(&ctx->list, &new_file->ctx_list);
ucma_move_events(ctx, new_file);
ctx->file = new_file;
resp.events_reported = ctx->events_reported;
mutex_unlock(&mut);
xa_unlock(&ctx_table);
ucma_unlock_files(cur_file, new_file);
response:
@ -1757,18 +1745,15 @@ static int ucma_close(struct inode *inode, struct file *filp)
ctx->destroying = 1;
mutex_unlock(&file->mut);
mutex_lock(&mut);
idr_remove(&ctx_idr, ctx->id);
mutex_unlock(&mut);
xa_erase(&ctx_table, ctx->id);
flush_workqueue(file->close_wq);
/* At that step once ctx was marked as destroying and workqueue
* was flushed we are safe from any inflights handlers that
* might put other closing task.
*/
mutex_lock(&mut);
xa_lock(&ctx_table);
if (!ctx->closing) {
mutex_unlock(&mut);
xa_unlock(&ctx_table);
ucma_put_ctx(ctx);
wait_for_completion(&ctx->comp);
/* rdma_destroy_id ensures that no event handlers are
@ -1776,7 +1761,7 @@ static int ucma_close(struct inode *inode, struct file *filp)
*/
rdma_destroy_id(ctx->cm_id);
} else {
mutex_unlock(&mut);
xa_unlock(&ctx_table);
}
ucma_free_ctx(ctx);
@ -1805,6 +1790,19 @@ static struct miscdevice ucma_misc = {
.fops = &ucma_fops,
};
static int ucma_get_global_nl_info(struct ib_client_nl_info *res)
{
res->abi = RDMA_USER_CM_ABI_VERSION;
res->cdev = ucma_misc.this_device;
return 0;
}
static struct ib_client rdma_cma_client = {
.name = "rdma_cm",
.get_global_nl_info = ucma_get_global_nl_info,
};
MODULE_ALIAS_RDMA_CLIENT("rdma_cm");
static ssize_t show_abi_version(struct device *dev,
struct device_attribute *attr,
char *buf)
@ -1833,7 +1831,14 @@ static int __init ucma_init(void)
ret = -ENOMEM;
goto err2;
}
ret = ib_register_client(&rdma_cma_client);
if (ret)
goto err3;
return 0;
err3:
unregister_net_sysctl_table(ucma_ctl_table_hdr);
err2:
device_remove_file(ucma_misc.this_device, &dev_attr_abi_version);
err1:
@ -1843,11 +1848,10 @@ err1:
static void __exit ucma_cleanup(void)
{
ib_unregister_client(&rdma_cma_client);
unregister_net_sysctl_table(ucma_ctl_table_hdr);
device_remove_file(ucma_misc.this_device, &dev_attr_abi_version);
misc_deregister(&ucma_misc);
idr_destroy(&ctx_idr);
idr_destroy(&multicast_idr);
}
module_init(ucma_init);


@ -54,9 +54,10 @@ static void __ib_umem_release(struct ib_device *dev, struct ib_umem *umem, int d
for_each_sg_page(umem->sg_head.sgl, &sg_iter, umem->sg_nents, 0) {
page = sg_page_iter_page(&sg_iter);
if (!PageDirty(page) && umem->writable && dirty)
set_page_dirty_lock(page);
put_page(page);
if (umem->writable && dirty)
put_user_pages_dirty_lock(&page, 1);
else
put_user_page(page);
}
sg_free_table(&umem->sg_head);
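
This hunk is part of the tree-wide put_user_page conversion noted in the pull request: pages that were pinned through the get_user_pages*() family are now released with put_user_page()/put_user_pages_dirty_lock() instead of put_page(), so that GUP-pinned pages can be accounted separately. A self-contained illustration of the pairing follows; it is not from the patch, and the helper name and its arguments are hypothetical.

#include <linux/errno.h>
#include <linux/mm.h>

static int example_pin_and_release(unsigned long uaddr, bool dirtied)
{
        struct page *page;
        int npages;

        npages = get_user_pages_fast(uaddr, 1, FOLL_WRITE, &page);
        if (npages != 1)
                return npages < 0 ? npages : -EFAULT;

        /* ... read from / write to the pinned page ... */

        if (dirtied)
                put_user_pages_dirty_lock(&page, 1); /* release and mark dirty */
        else
                put_user_page(page);                 /* release only */
        return 0;
}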
@ -244,7 +245,6 @@ struct ib_umem *ib_umem_get(struct ib_udata *udata, unsigned long addr,
umem->context = context;
umem->length = size;
umem->address = addr;
umem->page_shift = PAGE_SHIFT;
umem->writable = ib_access_writable(access);
umem->owning_mm = mm = current->mm;
mmgrab(mm);
@ -361,6 +361,9 @@ static void __ib_umem_release_tail(struct ib_umem *umem)
*/
void ib_umem_release(struct ib_umem *umem)
{
if (!umem)
return;
if (umem->is_odp) {
ib_umem_odp_release(to_ib_umem_odp(umem));
__ib_umem_release_tail(umem);
@ -385,7 +388,7 @@ int ib_umem_page_count(struct ib_umem *umem)
n = 0;
for_each_sg(umem->sg_head.sgl, sg, umem->nmap, i)
n += sg_dma_len(sg) >> umem->page_shift;
n += sg_dma_len(sg) >> PAGE_SHIFT;
return n;
}


@ -59,7 +59,7 @@ static u64 node_start(struct umem_odp_node *n)
struct ib_umem_odp *umem_odp =
container_of(n, struct ib_umem_odp, interval_tree);
return ib_umem_start(&umem_odp->umem);
return ib_umem_start(umem_odp);
}
/* Note that the representation of the intervals in the interval tree
@ -72,7 +72,7 @@ static u64 node_last(struct umem_odp_node *n)
struct ib_umem_odp *umem_odp =
container_of(n, struct ib_umem_odp, interval_tree);
return ib_umem_end(&umem_odp->umem) - 1;
return ib_umem_end(umem_odp) - 1;
}
INTERVAL_TREE_DEFINE(struct umem_odp_node, rb, u64, __subtree_last,
@ -107,8 +107,6 @@ static void ib_umem_notifier_end_account(struct ib_umem_odp *umem_odp)
static int ib_umem_notifier_release_trampoline(struct ib_umem_odp *umem_odp,
u64 start, u64 end, void *cookie)
{
struct ib_umem *umem = &umem_odp->umem;
/*
* Increase the number of notifiers running, to
* prevent any further fault handling on this MR.
@ -119,8 +117,8 @@ static int ib_umem_notifier_release_trampoline(struct ib_umem_odp *umem_odp,
* all pending page faults. */
smp_wmb();
complete_all(&umem_odp->notifier_completion);
umem->context->invalidate_range(umem_odp, ib_umem_start(umem),
ib_umem_end(umem));
umem_odp->umem.context->invalidate_range(
umem_odp, ib_umem_start(umem_odp), ib_umem_end(umem_odp));
return 0;
}
@ -151,6 +149,7 @@ static int ib_umem_notifier_invalidate_range_start(struct mmu_notifier *mn,
{
struct ib_ucontext_per_mm *per_mm =
container_of(mn, struct ib_ucontext_per_mm, mn);
int rc;
if (mmu_notifier_range_blockable(range))
down_read(&per_mm->umem_rwsem);
@ -167,11 +166,14 @@ static int ib_umem_notifier_invalidate_range_start(struct mmu_notifier *mn,
return 0;
}
return rbt_ib_umem_for_each_in_range(&per_mm->umem_tree, range->start,
range->end,
invalidate_range_start_trampoline,
mmu_notifier_range_blockable(range),
NULL);
rc = rbt_ib_umem_for_each_in_range(&per_mm->umem_tree, range->start,
range->end,
invalidate_range_start_trampoline,
mmu_notifier_range_blockable(range),
NULL);
if (rc)
up_read(&per_mm->umem_rwsem);
return rc;
}
static int invalidate_range_end_trampoline(struct ib_umem_odp *item, u64 start,
@ -205,10 +207,9 @@ static const struct mmu_notifier_ops ib_umem_notifiers = {
static void add_umem_to_per_mm(struct ib_umem_odp *umem_odp)
{
struct ib_ucontext_per_mm *per_mm = umem_odp->per_mm;
struct ib_umem *umem = &umem_odp->umem;
down_write(&per_mm->umem_rwsem);
if (likely(ib_umem_start(umem) != ib_umem_end(umem)))
if (likely(ib_umem_start(umem_odp) != ib_umem_end(umem_odp)))
rbt_ib_umem_insert(&umem_odp->interval_tree,
&per_mm->umem_tree);
up_write(&per_mm->umem_rwsem);
@ -217,10 +218,9 @@ static void add_umem_to_per_mm(struct ib_umem_odp *umem_odp)
static void remove_umem_from_per_mm(struct ib_umem_odp *umem_odp)
{
struct ib_ucontext_per_mm *per_mm = umem_odp->per_mm;
struct ib_umem *umem = &umem_odp->umem;
down_write(&per_mm->umem_rwsem);
if (likely(ib_umem_start(umem) != ib_umem_end(umem)))
if (likely(ib_umem_start(umem_odp) != ib_umem_end(umem_odp)))
rbt_ib_umem_remove(&umem_odp->interval_tree,
&per_mm->umem_tree);
complete_all(&umem_odp->notifier_completion);
@ -351,7 +351,7 @@ struct ib_umem_odp *ib_alloc_odp_umem(struct ib_umem_odp *root,
umem->context = ctx;
umem->length = size;
umem->address = addr;
umem->page_shift = PAGE_SHIFT;
odp_data->page_shift = PAGE_SHIFT;
umem->writable = root->umem.writable;
umem->is_odp = 1;
odp_data->per_mm = per_mm;
@ -405,18 +405,19 @@ int ib_umem_odp_get(struct ib_umem_odp *umem_odp, int access)
struct mm_struct *mm = umem->owning_mm;
int ret_val;
umem_odp->page_shift = PAGE_SHIFT;
if (access & IB_ACCESS_HUGETLB) {
struct vm_area_struct *vma;
struct hstate *h;
down_read(&mm->mmap_sem);
vma = find_vma(mm, ib_umem_start(umem));
vma = find_vma(mm, ib_umem_start(umem_odp));
if (!vma || !is_vm_hugetlb_page(vma)) {
up_read(&mm->mmap_sem);
return -EINVAL;
}
h = hstate_vma(vma);
umem->page_shift = huge_page_shift(h);
umem_odp->page_shift = huge_page_shift(h);
up_read(&mm->mmap_sem);
}
@ -424,16 +425,16 @@ int ib_umem_odp_get(struct ib_umem_odp *umem_odp, int access)
init_completion(&umem_odp->notifier_completion);
if (ib_umem_num_pages(umem)) {
if (ib_umem_odp_num_pages(umem_odp)) {
umem_odp->page_list =
vzalloc(array_size(sizeof(*umem_odp->page_list),
ib_umem_num_pages(umem)));
ib_umem_odp_num_pages(umem_odp)));
if (!umem_odp->page_list)
return -ENOMEM;
umem_odp->dma_list =
vzalloc(array_size(sizeof(*umem_odp->dma_list),
ib_umem_num_pages(umem)));
ib_umem_odp_num_pages(umem_odp)));
if (!umem_odp->dma_list) {
ret_val = -ENOMEM;
goto out_page_list;
@ -456,16 +457,14 @@ out_page_list:
void ib_umem_odp_release(struct ib_umem_odp *umem_odp)
{
struct ib_umem *umem = &umem_odp->umem;
/*
* Ensure that no more pages are mapped in the umem.
*
* It is the driver's responsibility to ensure, before calling us,
* that the hardware will not attempt to access the MR any more.
*/
ib_umem_odp_unmap_dma_pages(umem_odp, ib_umem_start(umem),
ib_umem_end(umem));
ib_umem_odp_unmap_dma_pages(umem_odp, ib_umem_start(umem_odp),
ib_umem_end(umem_odp));
remove_umem_from_per_mm(umem_odp);
put_per_mm(umem_odp);
@ -487,7 +486,7 @@ void ib_umem_odp_release(struct ib_umem_odp *umem_odp)
* The function returns -EFAULT if the DMA mapping operation fails. It returns
* -EAGAIN if a concurrent invalidation prevents us from updating the page.
*
* The page is released via put_page even if the operation failed. For
* The page is released via put_user_page even if the operation failed. For
* on-demand pinning, the page is released whenever it isn't stored in the
* umem.
*/
@ -498,8 +497,8 @@ static int ib_umem_odp_map_dma_single_page(
u64 access_mask,
unsigned long current_seq)
{
struct ib_umem *umem = &umem_odp->umem;
struct ib_device *dev = umem->context->device;
struct ib_ucontext *context = umem_odp->umem.context;
struct ib_device *dev = context->device;
dma_addr_t dma_addr;
int remove_existing_mapping = 0;
int ret = 0;
@ -514,10 +513,9 @@ static int ib_umem_odp_map_dma_single_page(
goto out;
}
if (!(umem_odp->dma_list[page_index])) {
dma_addr = ib_dma_map_page(dev,
page,
0, BIT(umem->page_shift),
DMA_BIDIRECTIONAL);
dma_addr =
ib_dma_map_page(dev, page, 0, BIT(umem_odp->page_shift),
DMA_BIDIRECTIONAL);
if (ib_dma_mapping_error(dev, dma_addr)) {
ret = -EFAULT;
goto out;
@ -536,15 +534,16 @@ static int ib_umem_odp_map_dma_single_page(
}
out:
put_page(page);
put_user_page(page);
if (remove_existing_mapping) {
ib_umem_notifier_start_account(umem_odp);
umem->context->invalidate_range(
context->invalidate_range(
umem_odp,
ib_umem_start(umem) + (page_index << umem->page_shift),
ib_umem_start(umem) +
((page_index + 1) << umem->page_shift));
ib_umem_start(umem_odp) +
(page_index << umem_odp->page_shift),
ib_umem_start(umem_odp) +
((page_index + 1) << umem_odp->page_shift));
ib_umem_notifier_end_account(umem_odp);
ret = -EAGAIN;
}
@ -581,27 +580,26 @@ int ib_umem_odp_map_dma_pages(struct ib_umem_odp *umem_odp, u64 user_virt,
u64 bcnt, u64 access_mask,
unsigned long current_seq)
{
struct ib_umem *umem = &umem_odp->umem;
struct task_struct *owning_process = NULL;
struct mm_struct *owning_mm = umem_odp->umem.owning_mm;
struct page **local_page_list = NULL;
u64 page_mask, off;
int j, k, ret = 0, start_idx, npages = 0, page_shift;
unsigned int flags = 0;
int j, k, ret = 0, start_idx, npages = 0;
unsigned int flags = 0, page_shift;
phys_addr_t p = 0;
if (access_mask == 0)
return -EINVAL;
if (user_virt < ib_umem_start(umem) ||
user_virt + bcnt > ib_umem_end(umem))
if (user_virt < ib_umem_start(umem_odp) ||
user_virt + bcnt > ib_umem_end(umem_odp))
return -EFAULT;
local_page_list = (struct page **)__get_free_page(GFP_KERNEL);
if (!local_page_list)
return -ENOMEM;
page_shift = umem->page_shift;
page_shift = umem_odp->page_shift;
page_mask = ~(BIT(page_shift) - 1);
off = user_virt & (~page_mask);
user_virt = user_virt & page_mask;
@ -621,7 +619,7 @@ int ib_umem_odp_map_dma_pages(struct ib_umem_odp *umem_odp, u64 user_virt,
if (access_mask & ODP_WRITE_ALLOWED_BIT)
flags |= FOLL_WRITE;
start_idx = (user_virt - ib_umem_start(umem)) >> page_shift;
start_idx = (user_virt - ib_umem_start(umem_odp)) >> page_shift;
k = start_idx;
while (bcnt > 0) {
@ -659,7 +657,7 @@ int ib_umem_odp_map_dma_pages(struct ib_umem_odp *umem_odp, u64 user_virt,
ret = -EFAULT;
break;
}
put_page(local_page_list[j]);
put_user_page(local_page_list[j]);
continue;
}
@ -686,8 +684,8 @@ int ib_umem_odp_map_dma_pages(struct ib_umem_odp *umem_odp, u64 user_virt,
* ib_umem_odp_map_dma_single_page().
*/
if (npages - (j + 1) > 0)
release_pages(&local_page_list[j+1],
npages - (j + 1));
put_user_pages(&local_page_list[j+1],
npages - (j + 1));
break;
}
}
@ -711,21 +709,20 @@ EXPORT_SYMBOL(ib_umem_odp_map_dma_pages);
void ib_umem_odp_unmap_dma_pages(struct ib_umem_odp *umem_odp, u64 virt,
u64 bound)
{
struct ib_umem *umem = &umem_odp->umem;
int idx;
u64 addr;
struct ib_device *dev = umem->context->device;
struct ib_device *dev = umem_odp->umem.context->device;
virt = max_t(u64, virt, ib_umem_start(umem));
bound = min_t(u64, bound, ib_umem_end(umem));
virt = max_t(u64, virt, ib_umem_start(umem_odp));
bound = min_t(u64, bound, ib_umem_end(umem_odp));
/* Note that during the run of this function, the
* notifiers_count of the MR is > 0, preventing any racing
* faults from completion. We might be racing with other
* invalidations, so we must make sure we free each page only
* once. */
mutex_lock(&umem_odp->umem_mutex);
for (addr = virt; addr < bound; addr += BIT(umem->page_shift)) {
idx = (addr - ib_umem_start(umem)) >> umem->page_shift;
for (addr = virt; addr < bound; addr += BIT(umem_odp->page_shift)) {
idx = (addr - ib_umem_start(umem_odp)) >> umem_odp->page_shift;
if (umem_odp->page_list[idx]) {
struct page *page = umem_odp->page_list[idx];
dma_addr_t dma = umem_odp->dma_list[idx];
@ -733,7 +730,8 @@ void ib_umem_odp_unmap_dma_pages(struct ib_umem_odp *umem_odp, u64 virt,
WARN_ON(!dma_addr);
ib_dma_unmap_page(dev, dma_addr, PAGE_SIZE,
ib_dma_unmap_page(dev, dma_addr,
BIT(umem_odp->page_shift),
DMA_BIDIRECTIONAL);
if (dma & ODP_WRITE_ALLOWED_BIT) {
struct page *head_page = compound_head(page);


@ -54,6 +54,7 @@
#include <rdma/ib_mad.h>
#include <rdma/ib_user_mad.h>
#include <rdma/rdma_netlink.h>
#include "core_priv.h"
@ -744,7 +745,7 @@ found:
"process %s did not enable P_Key index support.\n",
current->comm);
dev_warn(&file->port->dev,
" Documentation/infiniband/user_mad.txt has info on the new ABI.\n");
" Documentation/infiniband/user_mad.rst has info on the new ABI.\n");
}
}
@ -1124,11 +1125,48 @@ static const struct file_operations umad_sm_fops = {
.llseek = no_llseek,
};
static int ib_umad_get_nl_info(struct ib_device *ibdev, void *client_data,
struct ib_client_nl_info *res)
{
struct ib_umad_device *umad_dev = client_data;
if (!rdma_is_port_valid(ibdev, res->port))
return -EINVAL;
res->abi = IB_USER_MAD_ABI_VERSION;
res->cdev = &umad_dev->ports[res->port - rdma_start_port(ibdev)].dev;
return 0;
}
static struct ib_client umad_client = {
.name = "umad",
.add = ib_umad_add_one,
.remove = ib_umad_remove_one
.remove = ib_umad_remove_one,
.get_nl_info = ib_umad_get_nl_info,
};
MODULE_ALIAS_RDMA_CLIENT("umad");
static int ib_issm_get_nl_info(struct ib_device *ibdev, void *client_data,
struct ib_client_nl_info *res)
{
struct ib_umad_device *umad_dev =
ib_get_client_data(ibdev, &umad_client);
if (!rdma_is_port_valid(ibdev, res->port))
return -EINVAL;
res->abi = IB_USER_MAD_ABI_VERSION;
res->cdev = &umad_dev->ports[res->port - rdma_start_port(ibdev)].sm_dev;
return 0;
}
static struct ib_client issm_client = {
.name = "issm",
.get_nl_info = ib_issm_get_nl_info,
};
MODULE_ALIAS_RDMA_CLIENT("issm");
static ssize_t ibdev_show(struct device *dev, struct device_attribute *attr,
char *buf)
@ -1387,13 +1425,17 @@ static int __init ib_umad_init(void)
}
ret = ib_register_client(&umad_client);
if (ret) {
pr_err("couldn't register ib_umad client\n");
if (ret)
goto out_class;
}
ret = ib_register_client(&issm_client);
if (ret)
goto out_client;
return 0;
out_client:
ib_unregister_client(&umad_client);
out_class:
class_unregister(&umad_class);
@ -1411,6 +1453,7 @@ out:
static void __exit ib_umad_cleanup(void)
{
ib_unregister_client(&issm_client);
ib_unregister_client(&umad_client);
class_unregister(&umad_class);
unregister_chrdev_region(base_umad_dev,

@ -756,7 +756,9 @@ static int ib_uverbs_reg_mr(struct uverbs_attr_bundle *attrs)
mr->device = pd->device;
mr->pd = pd;
mr->type = IB_MR_TYPE_USER;
mr->dm = NULL;
mr->sig_attrs = NULL;
mr->uobject = uobj;
atomic_inc(&pd->usecnt);
mr->res.type = RDMA_RESTRACK_MR;
@ -1021,12 +1023,11 @@ static struct ib_ucq_object *create_cq(struct uverbs_attr_bundle *attrs,
attr.comp_vector = cmd->comp_vector;
attr.flags = cmd->flags;
cq = ib_dev->ops.create_cq(ib_dev, &attr, &attrs->driver_udata);
if (IS_ERR(cq)) {
ret = PTR_ERR(cq);
cq = rdma_zalloc_drv_obj(ib_dev, ib_cq);
if (!cq) {
ret = -ENOMEM;
goto err_file;
}
cq->device = ib_dev;
cq->uobject = &obj->uobject;
cq->comp_handler = ib_uverbs_comp_handler;
@ -1034,6 +1035,10 @@ static struct ib_ucq_object *create_cq(struct uverbs_attr_bundle *attrs,
cq->cq_context = ev_file ? &ev_file->ev_queue : NULL;
atomic_set(&cq->usecnt, 0);
ret = ib_dev->ops.create_cq(cq, &attr, &attrs->driver_udata);
if (ret)
goto err_free;
obj->uobject.object = cq;
memset(&resp, 0, sizeof resp);
resp.base.cq_handle = obj->uobject.id;
@ -1054,7 +1059,9 @@ static struct ib_ucq_object *create_cq(struct uverbs_attr_bundle *attrs,
err_cb:
ib_destroy_cq_user(cq, uverbs_get_cleared_udata(attrs));
cq = NULL;
err_free:
kfree(cq);
err_file:
if (ev_file)
ib_uverbs_release_ucq(attrs->ufile, ev_file, obj);
@ -2541,7 +2548,7 @@ static int ib_uverbs_detach_mcast(struct uverbs_attr_bundle *attrs)
struct ib_uqp_object *obj;
struct ib_qp *qp;
struct ib_uverbs_mcast_entry *mcast;
int ret = -EINVAL;
int ret;
bool found = false;
ret = uverbs_request(attrs, &cmd, sizeof(cmd));
@ -3715,9 +3722,6 @@ static int ib_uverbs_ex_modify_cq(struct uverbs_attr_bundle *attrs)
* trailing driver_data flex array. In this case the size of the base struct
* cannot be changed.
*/
#define offsetof_after(_struct, _member) \
(offsetof(_struct, _member) + sizeof(((_struct *)NULL)->_member))
#define UAPI_DEF_WRITE_IO(req, resp) \
.write.has_resp = 1 + \
BUILD_BUG_ON_ZERO(offsetof(req, response) != 0) + \
@ -3748,11 +3752,11 @@ static int ib_uverbs_ex_modify_cq(struct uverbs_attr_bundle *attrs)
*/
#define UAPI_DEF_WRITE_IO_EX(req, req_last_member, resp, resp_last_member) \
.write.has_resp = 1, \
.write.req_size = offsetof_after(req, req_last_member), \
.write.resp_size = offsetof_after(resp, resp_last_member)
.write.req_size = offsetofend(req, req_last_member), \
.write.resp_size = offsetofend(resp, resp_last_member)
#define UAPI_DEF_WRITE_I_EX(req, req_last_member) \
.write.req_size = offsetof_after(req, req_last_member)
.write.req_size = offsetofend(req, req_last_member)
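
offsetofend() is the kernel's generic helper from <linux/stddef.h> and yields the same value the removed local offsetof_after() macro computed: the offset of the first byte past the named member. A small illustration with a hypothetical struct:

/* hypothetical layout, for illustration only */
struct foo_req {
	__u64 response;
	__u32 handle;
};

/*
 * offsetofend(struct foo_req, handle)
 *	== offsetof(struct foo_req, handle) + sizeof(((struct foo_req *)0)->handle)
 *	== 8 + 4 == 12
 */
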
const struct uapi_definition uverbs_def_write_intf[] = {
DECLARE_UVERBS_OBJECT(

@ -51,6 +51,7 @@
#include <rdma/ib.h>
#include <rdma/uverbs_std_types.h>
#include <rdma/rdma_netlink.h>
#include "uverbs.h"
#include "core_priv.h"
@ -198,7 +199,7 @@ void ib_uverbs_release_file(struct kref *ref)
ib_dev = srcu_dereference(file->device->ib_dev,
&file->device->disassociate_srcu);
if (ib_dev && !ib_dev->ops.disassociate_ucontext)
module_put(ib_dev->owner);
module_put(ib_dev->ops.owner);
srcu_read_unlock(&file->device->disassociate_srcu, srcu_key);
if (atomic_dec_and_test(&file->device->refcount))
@ -1065,7 +1066,7 @@ static int ib_uverbs_open(struct inode *inode, struct file *filp)
module_dependent = !(ib_dev->ops.disassociate_ucontext);
if (module_dependent) {
if (!try_module_get(ib_dev->owner)) {
if (!try_module_get(ib_dev->ops.owner)) {
ret = -ENODEV;
goto err;
}
@ -1100,7 +1101,7 @@ static int ib_uverbs_open(struct inode *inode, struct file *filp)
return stream_open(inode, filp);
err_module:
module_put(ib_dev->owner);
module_put(ib_dev->ops.owner);
err:
mutex_unlock(&dev->lists_mutex);
@ -1148,12 +1149,41 @@ static const struct file_operations uverbs_mmap_fops = {
.compat_ioctl = ib_uverbs_ioctl,
};
static int ib_uverbs_get_nl_info(struct ib_device *ibdev, void *client_data,
struct ib_client_nl_info *res)
{
struct ib_uverbs_device *uverbs_dev = client_data;
int ret;
if (res->port != -1)
return -EINVAL;
res->abi = ibdev->ops.uverbs_abi_ver;
res->cdev = &uverbs_dev->dev;
/*
* To support DRIVER_ID binding in userspace some of the driver need
* upgrading to expose their PCI dependent revision information
* through get_context instead of relying on modalias matching. When
* the drivers are fixed they can drop this flag.
*/
if (!ibdev->ops.uverbs_no_driver_id_binding) {
ret = nla_put_u32(res->nl_msg, RDMA_NLDEV_ATTR_UVERBS_DRIVER_ID,
ibdev->ops.driver_id);
if (ret)
return ret;
}
return 0;
}
static struct ib_client uverbs_client = {
.name = "uverbs",
.no_kverbs_req = true,
.add = ib_uverbs_add_one,
.remove = ib_uverbs_remove_one
.remove = ib_uverbs_remove_one,
.get_nl_info = ib_uverbs_get_nl_info,
};
MODULE_ALIAS_RDMA_CLIENT("uverbs");
static ssize_t ibdev_show(struct device *device, struct device_attribute *attr,
char *buf)
@ -1186,7 +1216,7 @@ static ssize_t abi_version_show(struct device *device,
srcu_key = srcu_read_lock(&dev->disassociate_srcu);
ib_dev = srcu_dereference(dev->ib_dev, &dev->disassociate_srcu);
if (ib_dev)
ret = sprintf(buf, "%d\n", ib_dev->uverbs_abi_ver);
ret = sprintf(buf, "%u\n", ib_dev->ops.uverbs_abi_ver);
srcu_read_unlock(&dev->disassociate_srcu, srcu_key);
return ret;

@ -111,9 +111,9 @@ static int UVERBS_HANDLER(UVERBS_METHOD_CQ_CREATE)(
INIT_LIST_HEAD(&obj->comp_list);
INIT_LIST_HEAD(&obj->async_list);
cq = ib_dev->ops.create_cq(ib_dev, &attr, &attrs->driver_udata);
if (IS_ERR(cq)) {
ret = PTR_ERR(cq);
cq = rdma_zalloc_drv_obj(ib_dev, ib_cq);
if (!cq) {
ret = -ENOMEM;
goto err_event_file;
}
@ -122,10 +122,15 @@ static int UVERBS_HANDLER(UVERBS_METHOD_CQ_CREATE)(
cq->comp_handler = ib_uverbs_comp_handler;
cq->event_handler = ib_uverbs_cq_event_handler;
cq->cq_context = ev_file ? &ev_file->ev_queue : NULL;
obj->uobject.object = cq;
obj->uobject.user_handle = user_handle;
atomic_set(&cq->usecnt, 0);
cq->res.type = RDMA_RESTRACK_CQ;
ret = ib_dev->ops.create_cq(cq, &attr, &attrs->driver_udata);
if (ret)
goto err_free;
obj->uobject.object = cq;
obj->uobject.user_handle = user_handle;
rdma_restrack_uadd(&cq->res);
ret = uverbs_copy_to(attrs, UVERBS_ATTR_CREATE_CQ_RESP_CQE, &cq->cqe,
@ -136,7 +141,9 @@ static int UVERBS_HANDLER(UVERBS_METHOD_CQ_CREATE)(
return 0;
err_cq:
ib_destroy_cq_user(cq, uverbs_get_cleared_udata(attrs));
cq = NULL;
err_free:
kfree(cq);
err_event_file:
if (ev_file)
uverbs_uobject_put(ev_file_uobj);
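
This hunk, the write-path create_cq() hunk above it, and the __ib_create_cq() change further down all move CQ creation to the core-allocated model: the core allocates the driver's CQ container with rdma_zalloc_drv_obj(), fills in the embedded ib_cq, and only then calls ->create_cq(), which now returns an int and must not allocate or free the ib_cq itself. Sketched for a made-up 'foo' driver (names are hypothetical, the callback signatures follow the hunks above):

/* requires <rdma/ib_verbs.h>; illustrative only */
struct foo_cq {
	struct ib_cq ibcq;	/* must be embedded; the core allocates foo_cq */
	/* driver-private CQ state */
};

static int foo_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
			 struct ib_udata *udata)
{
	struct foo_cq *cq = container_of(ibcq, struct foo_cq, ibcq);

	pr_debug("created foo_cq %p\n", cq);
	/* program the hardware here; on error just return it, the core frees ibcq */
	return 0;
}

static void foo_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata)
{
	/* tear down hardware state only; do not kfree(ibcq) */
}

static const struct ib_device_ops foo_dev_ops = {
	.create_cq = foo_create_cq,
	.destroy_cq = foo_destroy_cq,
	INIT_RDMA_OBJ_SIZE(ib_cq, foo_cq, ibcq),
};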

@ -128,6 +128,7 @@ static int UVERBS_HANDLER(UVERBS_METHOD_DM_MR_REG)(
mr->device = pd->device;
mr->pd = pd;
mr->type = IB_MR_TYPE_DM;
mr->dm = dm;
mr->uobject = uobj;
atomic_inc(&pd->usecnt);

@ -22,6 +22,8 @@ static void *uapi_add_elm(struct uverbs_api *uapi, u32 key, size_t alloc_size)
return ERR_PTR(-EOVERFLOW);
elm = kzalloc(alloc_size, GFP_KERNEL);
if (!elm)
return ERR_PTR(-ENOMEM);
rc = radix_tree_insert(&uapi->radix, key, elm);
if (rc) {
kfree(elm);
@ -645,7 +647,7 @@ struct uverbs_api *uverbs_alloc_api(struct ib_device *ibdev)
return ERR_PTR(-ENOMEM);
INIT_RADIX_TREE(&uapi->radix, GFP_KERNEL);
uapi->driver_id = ibdev->driver_id;
uapi->driver_id = ibdev->ops.driver_id;
rc = uapi_merge_def(uapi, ibdev, uverbs_core_api, false);
if (rc)

@ -209,7 +209,7 @@ __attribute_const__ int ib_rate_to_mbps(enum ib_rate rate)
EXPORT_SYMBOL(ib_rate_to_mbps);
__attribute_const__ enum rdma_transport_type
rdma_node_get_transport(enum rdma_node_type node_type)
rdma_node_get_transport(unsigned int node_type)
{
if (node_type == RDMA_NODE_USNIC)
@ -299,6 +299,7 @@ struct ib_pd *__ib_alloc_pd(struct ib_device *device, unsigned int flags,
mr->device = pd->device;
mr->pd = pd;
mr->type = IB_MR_TYPE_DMA;
mr->uobject = NULL;
mr->need_inval = false;
@ -316,7 +317,7 @@ struct ib_pd *__ib_alloc_pd(struct ib_device *device, unsigned int flags,
EXPORT_SYMBOL(__ib_alloc_pd);
/**
* ib_dealloc_pd - Deallocates a protection domain.
* ib_dealloc_pd_user - Deallocates a protection domain.
* @pd: The protection domain to deallocate.
* @udata: Valid user data or NULL for kernel object
*
@ -1157,6 +1158,10 @@ struct ib_qp *ib_create_qp_user(struct ib_pd *pd,
qp_init_attr->cap.max_recv_sge))
return ERR_PTR(-EINVAL);
if ((qp_init_attr->create_flags & IB_QP_CREATE_INTEGRITY_EN) &&
!(device->attrs.device_cap_flags & IB_DEVICE_INTEGRITY_HANDOVER))
return ERR_PTR(-EINVAL);
/*
* If the callers is using the RDMA API calculate the resources
* needed for the RDMA READ/WRITE operations.
@ -1232,6 +1237,8 @@ struct ib_qp *ib_create_qp_user(struct ib_pd *pd,
qp->max_write_sge = qp_init_attr->cap.max_send_sge;
qp->max_read_sge = min_t(u32, qp_init_attr->cap.max_send_sge,
device->attrs.max_sge_rd);
if (qp_init_attr->create_flags & IB_QP_CREATE_INTEGRITY_EN)
qp->integrity_en = true;
return qp;
@ -1683,6 +1690,14 @@ static int _ib_modify_qp(struct ib_qp *qp, struct ib_qp_attr *attr,
}
}
/*
* Bind this qp to a counter automatically based on the rdma counter
* rules. This only set in RST2INIT with port specified
*/
if (!qp->counter && (attr_mask & IB_QP_PORT) &&
((attr_mask & IB_QP_STATE) && attr->qp_state == IB_QPS_INIT))
rdma_counter_bind_qp_auto(qp, attr->port_num);
ret = ib_security_modify_qp(qp, attr, attr_mask, udata);
if (ret)
goto out;
@ -1878,6 +1893,7 @@ int ib_destroy_qp_user(struct ib_qp *qp, struct ib_udata *udata)
if (!qp->uobject)
rdma_rw_cleanup_mrs(qp);
rdma_counter_unbind_qp(qp, true);
rdma_restrack_del(&qp->res);
ret = qp->device->ops.destroy_qp(qp, udata);
if (!ret) {
@ -1916,21 +1932,28 @@ struct ib_cq *__ib_create_cq(struct ib_device *device,
const char *caller)
{
struct ib_cq *cq;
int ret;
cq = device->ops.create_cq(device, cq_attr, NULL);
cq = rdma_zalloc_drv_obj(device, ib_cq);
if (!cq)
return ERR_PTR(-ENOMEM);
if (!IS_ERR(cq)) {
cq->device = device;
cq->uobject = NULL;
cq->comp_handler = comp_handler;
cq->event_handler = event_handler;
cq->cq_context = cq_context;
atomic_set(&cq->usecnt, 0);
cq->res.type = RDMA_RESTRACK_CQ;
rdma_restrack_set_task(&cq->res, caller);
rdma_restrack_kadd(&cq->res);
cq->device = device;
cq->uobject = NULL;
cq->comp_handler = comp_handler;
cq->event_handler = event_handler;
cq->cq_context = cq_context;
atomic_set(&cq->usecnt, 0);
cq->res.type = RDMA_RESTRACK_CQ;
rdma_restrack_set_task(&cq->res, caller);
ret = device->ops.create_cq(cq, cq_attr, NULL);
if (ret) {
kfree(cq);
return ERR_PTR(ret);
}
rdma_restrack_kadd(&cq->res);
return cq;
}
EXPORT_SYMBOL(__ib_create_cq);
@ -1949,7 +1972,9 @@ int ib_destroy_cq_user(struct ib_cq *cq, struct ib_udata *udata)
return -EBUSY;
rdma_restrack_del(&cq->res);
return cq->device->ops.destroy_cq(cq, udata);
cq->device->ops.destroy_cq(cq, udata);
kfree(cq);
return 0;
}
EXPORT_SYMBOL(ib_destroy_cq_user);
@ -1966,6 +1991,7 @@ int ib_dereg_mr_user(struct ib_mr *mr, struct ib_udata *udata)
{
struct ib_pd *pd = mr->pd;
struct ib_dm *dm = mr->dm;
struct ib_sig_attrs *sig_attrs = mr->sig_attrs;
int ret;
rdma_restrack_del(&mr->res);
@ -1974,6 +2000,7 @@ int ib_dereg_mr_user(struct ib_mr *mr, struct ib_udata *udata)
atomic_dec(&pd->usecnt);
if (dm)
atomic_dec(&dm->usecnt);
kfree(sig_attrs);
}
return ret;
@ -1981,7 +2008,7 @@ int ib_dereg_mr_user(struct ib_mr *mr, struct ib_udata *udata)
EXPORT_SYMBOL(ib_dereg_mr_user);
/**
* ib_alloc_mr() - Allocates a memory region
* ib_alloc_mr_user() - Allocates a memory region
* @pd: protection domain associated with the region
* @mr_type: memory region type
* @max_num_sg: maximum sg entries available for registration.
@ -2001,6 +2028,9 @@ struct ib_mr *ib_alloc_mr_user(struct ib_pd *pd, enum ib_mr_type mr_type,
if (!pd->device->ops.alloc_mr)
return ERR_PTR(-EOPNOTSUPP);
if (WARN_ON_ONCE(mr_type == IB_MR_TYPE_INTEGRITY))
return ERR_PTR(-EINVAL);
mr = pd->device->ops.alloc_mr(pd, mr_type, max_num_sg, udata);
if (!IS_ERR(mr)) {
mr->device = pd->device;
@ -2011,12 +2041,66 @@ struct ib_mr *ib_alloc_mr_user(struct ib_pd *pd, enum ib_mr_type mr_type,
mr->need_inval = false;
mr->res.type = RDMA_RESTRACK_MR;
rdma_restrack_kadd(&mr->res);
mr->type = mr_type;
mr->sig_attrs = NULL;
}
return mr;
}
EXPORT_SYMBOL(ib_alloc_mr_user);
/**
* ib_alloc_mr_integrity() - Allocates an integrity memory region
* @pd: protection domain associated with the region
* @max_num_data_sg: maximum data sg entries available for registration
* @max_num_meta_sg: maximum metadata sg entries available for
* registration
*
* Notes:
* Memory registration page/sg lists must not exceed max_num_sg,
* also the integrity page/sg lists must not exceed max_num_meta_sg.
*
*/
struct ib_mr *ib_alloc_mr_integrity(struct ib_pd *pd,
u32 max_num_data_sg,
u32 max_num_meta_sg)
{
struct ib_mr *mr;
struct ib_sig_attrs *sig_attrs;
if (!pd->device->ops.alloc_mr_integrity ||
!pd->device->ops.map_mr_sg_pi)
return ERR_PTR(-EOPNOTSUPP);
if (!max_num_meta_sg)
return ERR_PTR(-EINVAL);
sig_attrs = kzalloc(sizeof(struct ib_sig_attrs), GFP_KERNEL);
if (!sig_attrs)
return ERR_PTR(-ENOMEM);
mr = pd->device->ops.alloc_mr_integrity(pd, max_num_data_sg,
max_num_meta_sg);
if (IS_ERR(mr)) {
kfree(sig_attrs);
return mr;
}
mr->device = pd->device;
mr->pd = pd;
mr->dm = NULL;
mr->uobject = NULL;
atomic_inc(&pd->usecnt);
mr->need_inval = false;
mr->res.type = RDMA_RESTRACK_MR;
rdma_restrack_kadd(&mr->res);
mr->type = IB_MR_TYPE_INTEGRITY;
mr->sig_attrs = sig_attrs;
return mr;
}
EXPORT_SYMBOL(ib_alloc_mr_integrity);
/* "Fast" memory regions */
struct ib_fmr *ib_alloc_fmr(struct ib_pd *pd,
@ -2226,19 +2310,17 @@ EXPORT_SYMBOL(ib_create_wq);
*/
int ib_destroy_wq(struct ib_wq *wq, struct ib_udata *udata)
{
int err;
struct ib_cq *cq = wq->cq;
struct ib_pd *pd = wq->pd;
if (atomic_read(&wq->usecnt))
return -EBUSY;
err = wq->device->ops.destroy_wq(wq, udata);
if (!err) {
atomic_dec(&pd->usecnt);
atomic_dec(&cq->usecnt);
}
return err;
wq->device->ops.destroy_wq(wq, udata);
atomic_dec(&pd->usecnt);
atomic_dec(&cq->usecnt);
return 0;
}
EXPORT_SYMBOL(ib_destroy_wq);
@ -2375,6 +2457,43 @@ int ib_set_vf_guid(struct ib_device *device, int vf, u8 port, u64 guid,
}
EXPORT_SYMBOL(ib_set_vf_guid);
/**
* ib_map_mr_sg_pi() - Map the dma mapped SG lists for PI (protection
* information) and set an appropriate memory region for registration.
* @mr: memory region
* @data_sg: dma mapped scatterlist for data
* @data_sg_nents: number of entries in data_sg
* @data_sg_offset: offset in bytes into data_sg
* @meta_sg: dma mapped scatterlist for metadata
* @meta_sg_nents: number of entries in meta_sg
* @meta_sg_offset: offset in bytes into meta_sg
* @page_size: page vector desired page size
*
* Constraints:
* - The MR must be allocated with type IB_MR_TYPE_INTEGRITY.
*
* Return: 0 on success.
*
* After this completes successfully, the memory region
* is ready for registration.
*/
int ib_map_mr_sg_pi(struct ib_mr *mr, struct scatterlist *data_sg,
int data_sg_nents, unsigned int *data_sg_offset,
struct scatterlist *meta_sg, int meta_sg_nents,
unsigned int *meta_sg_offset, unsigned int page_size)
{
if (unlikely(!mr->device->ops.map_mr_sg_pi ||
WARN_ON_ONCE(mr->type != IB_MR_TYPE_INTEGRITY)))
return -EOPNOTSUPP;
mr->page_size = page_size;
return mr->device->ops.map_mr_sg_pi(mr, data_sg, data_sg_nents,
data_sg_offset, meta_sg,
meta_sg_nents, meta_sg_offset);
}
EXPORT_SYMBOL(ib_map_mr_sg_pi);
/**
* ib_map_mr_sg() - Map the largest prefix of a dma mapped SG list
* and set it the memory region.
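
ib_alloc_mr_integrity() and ib_map_mr_sg_pi() above are the consumer side of the reworked signature/integrity MR API. A rough sketch of how a ULP might use them, assuming pd is a valid protection domain and data_sg/meta_sg are already DMA-mapped scatterlists; every name here is invented for illustration:

/* requires <rdma/ib_verbs.h>; illustrative only */
static struct ib_mr *foo_alloc_pi_mr(struct ib_pd *pd,
				     struct scatterlist *data_sg, int data_nents,
				     struct scatterlist *meta_sg, int meta_nents)
{
	struct ib_mr *mr;
	int ret;

	mr = ib_alloc_mr_integrity(pd, data_nents, meta_nents);
	if (IS_ERR(mr))
		return mr;

	/* describe the protection profile in mr->sig_attrs (mem/wire domains) here */

	ret = ib_map_mr_sg_pi(mr, data_sg, data_nents, NULL,
			      meta_sg, meta_nents, NULL, PAGE_SIZE);
	if (ret < 0) {
		ib_dereg_mr(mr);
		return ERR_PTR(ret);
	}

	/* the caller can now register the MR with an IB_WR_REG_MR_INTEGRITY work request */
	return mr;
}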

@ -7,7 +7,6 @@ obj-$(CONFIG_INFINIBAND_EFA) += efa/
obj-$(CONFIG_INFINIBAND_I40IW) += i40iw/
obj-$(CONFIG_MLX4_INFINIBAND) += mlx4/
obj-$(CONFIG_MLX5_INFINIBAND) += mlx5/
obj-$(CONFIG_INFINIBAND_NES) += nes/
obj-$(CONFIG_INFINIBAND_OCRDMA) += ocrdma/
obj-$(CONFIG_INFINIBAND_VMWARE_PVRDMA) += vmw_pvrdma/
obj-$(CONFIG_INFINIBAND_USNIC) += usnic/

@ -805,10 +805,8 @@ int bnxt_re_destroy_qp(struct ib_qp *ib_qp, struct ib_udata *udata)
rdev->sqp_ah = NULL;
}
if (!IS_ERR_OR_NULL(qp->rumem))
ib_umem_release(qp->rumem);
if (!IS_ERR_OR_NULL(qp->sumem))
ib_umem_release(qp->sumem);
ib_umem_release(qp->rumem);
ib_umem_release(qp->sumem);
mutex_lock(&rdev->qp_lock);
list_del(&qp->list);
@ -1201,12 +1199,8 @@ struct ib_qp *bnxt_re_create_qp(struct ib_pd *ib_pd,
qp_destroy:
bnxt_qplib_destroy_qp(&rdev->qplib_res, &qp->qplib_qp);
free_umem:
if (udata) {
if (qp->rumem)
ib_umem_release(qp->rumem);
if (qp->sumem)
ib_umem_release(qp->sumem);
}
ib_umem_release(qp->rumem);
ib_umem_release(qp->sumem);
fail:
kfree(qp);
return ERR_PTR(rc);
@ -1302,8 +1296,7 @@ void bnxt_re_destroy_srq(struct ib_srq *ib_srq, struct ib_udata *udata)
if (qplib_srq->cq)
nq = qplib_srq->cq->nq;
bnxt_qplib_destroy_srq(&rdev->qplib_res, qplib_srq);
if (srq->umem)
ib_umem_release(srq->umem);
ib_umem_release(srq->umem);
atomic_dec(&rdev->srq_count);
if (nq)
nq->budget--;
@ -1412,8 +1405,7 @@ int bnxt_re_create_srq(struct ib_srq *ib_srq,
return 0;
fail:
if (srq->umem)
ib_umem_release(srq->umem);
ib_umem_release(srq->umem);
exit:
return rc;
}
@ -2517,9 +2509,8 @@ int bnxt_re_post_recv(struct ib_qp *ib_qp, const struct ib_recv_wr *wr,
}
/* Completion Queues */
int bnxt_re_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata)
void bnxt_re_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata)
{
int rc;
struct bnxt_re_cq *cq;
struct bnxt_qplib_nq *nq;
struct bnxt_re_dev *rdev;
@ -2528,29 +2519,20 @@ int bnxt_re_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata)
rdev = cq->rdev;
nq = cq->qplib_cq.nq;
rc = bnxt_qplib_destroy_cq(&rdev->qplib_res, &cq->qplib_cq);
if (rc) {
dev_err(rdev_to_dev(rdev), "Failed to destroy HW CQ");
return rc;
}
if (!IS_ERR_OR_NULL(cq->umem))
ib_umem_release(cq->umem);
bnxt_qplib_destroy_cq(&rdev->qplib_res, &cq->qplib_cq);
ib_umem_release(cq->umem);
atomic_dec(&rdev->cq_count);
nq->budget--;
kfree(cq->cql);
kfree(cq);
return 0;
}
struct ib_cq *bnxt_re_create_cq(struct ib_device *ibdev,
const struct ib_cq_init_attr *attr,
struct ib_udata *udata)
int bnxt_re_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
struct ib_udata *udata)
{
struct bnxt_re_dev *rdev = to_bnxt_re_dev(ibdev, ibdev);
struct bnxt_re_dev *rdev = to_bnxt_re_dev(ibcq->device, ibdev);
struct bnxt_qplib_dev_attr *dev_attr = &rdev->dev_attr;
struct bnxt_re_cq *cq = NULL;
struct bnxt_re_cq *cq = container_of(ibcq, struct bnxt_re_cq, ib_cq);
int rc, entries;
int cqe = attr->cqe;
struct bnxt_qplib_nq *nq = NULL;
@ -2559,11 +2541,8 @@ struct ib_cq *bnxt_re_create_cq(struct ib_device *ibdev,
/* Validate CQ fields */
if (cqe < 1 || cqe > dev_attr->max_cq_wqes) {
dev_err(rdev_to_dev(rdev), "Failed to create CQ -max exceeded");
return ERR_PTR(-EINVAL);
return -EINVAL;
}
cq = kzalloc(sizeof(*cq), GFP_KERNEL);
if (!cq)
return ERR_PTR(-ENOMEM);
cq->rdev = rdev;
cq->qplib_cq.cq_handle = (u64)(unsigned long)(&cq->qplib_cq);
@ -2641,15 +2620,13 @@ struct ib_cq *bnxt_re_create_cq(struct ib_device *ibdev,
}
}
return &cq->ib_cq;
return 0;
c2fail:
if (udata)
ib_umem_release(cq->umem);
ib_umem_release(cq->umem);
fail:
kfree(cq->cql);
kfree(cq);
return ERR_PTR(rc);
return rc;
}
static u8 __req_to_ib_wc_status(u8 qstatus)
@ -3353,8 +3330,7 @@ int bnxt_re_dereg_mr(struct ib_mr *ib_mr, struct ib_udata *udata)
mr->npages = 0;
mr->pages = NULL;
}
if (!IS_ERR_OR_NULL(mr->ib_umem))
ib_umem_release(mr->ib_umem);
ib_umem_release(mr->ib_umem);
kfree(mr);
atomic_dec(&rdev->mr_count);
@ -3630,10 +3606,10 @@ int bnxt_re_alloc_ucontext(struct ib_ucontext *ctx, struct ib_udata *udata)
u32 chip_met_rev_num = 0;
int rc;
dev_dbg(rdev_to_dev(rdev), "ABI version requested %d",
ibdev->uverbs_abi_ver);
dev_dbg(rdev_to_dev(rdev), "ABI version requested %u",
ibdev->ops.uverbs_abi_ver);
if (ibdev->uverbs_abi_ver != BNXT_RE_ABI_VERSION) {
if (ibdev->ops.uverbs_abi_ver != BNXT_RE_ABI_VERSION) {
dev_dbg(rdev_to_dev(rdev), " is different from the device %d ",
BNXT_RE_ABI_VERSION);
return -EPERM;

@ -94,11 +94,11 @@ struct bnxt_re_qp {
};
struct bnxt_re_cq {
struct ib_cq ib_cq;
struct bnxt_re_dev *rdev;
spinlock_t cq_lock; /* protect cq */
u16 cq_count;
u16 cq_period;
struct ib_cq ib_cq;
struct bnxt_qplib_cq qplib_cq;
struct bnxt_qplib_cqe *cql;
#define MAX_CQL_PER_POLL 1024
@ -190,10 +190,9 @@ int bnxt_re_post_send(struct ib_qp *qp, const struct ib_send_wr *send_wr,
const struct ib_send_wr **bad_send_wr);
int bnxt_re_post_recv(struct ib_qp *qp, const struct ib_recv_wr *recv_wr,
const struct ib_recv_wr **bad_recv_wr);
struct ib_cq *bnxt_re_create_cq(struct ib_device *ibdev,
const struct ib_cq_init_attr *attr,
struct ib_udata *udata);
int bnxt_re_destroy_cq(struct ib_cq *cq, struct ib_udata *udata);
int bnxt_re_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
struct ib_udata *udata);
void bnxt_re_destroy_cq(struct ib_cq *cq, struct ib_udata *udata);
int bnxt_re_poll_cq(struct ib_cq *cq, int num_entries, struct ib_wc *wc);
int bnxt_re_req_notify_cq(struct ib_cq *cq, enum ib_cq_notify_flags flags);
struct ib_mr *bnxt_re_get_dma_mr(struct ib_pd *pd, int mr_access_flags);

@ -596,6 +596,10 @@ static void bnxt_re_unregister_ib(struct bnxt_re_dev *rdev)
}
static const struct ib_device_ops bnxt_re_dev_ops = {
.owner = THIS_MODULE,
.driver_id = RDMA_DRIVER_BNXT_RE,
.uverbs_abi_ver = BNXT_RE_ABI_VERSION,
.add_gid = bnxt_re_add_gid,
.alloc_hw_stats = bnxt_re_ib_alloc_hw_stats,
.alloc_mr = bnxt_re_alloc_mr,
@ -637,6 +641,7 @@ static const struct ib_device_ops bnxt_re_dev_ops = {
.reg_user_mr = bnxt_re_reg_user_mr,
.req_notify_cq = bnxt_re_req_notify_cq,
INIT_RDMA_OBJ_SIZE(ib_ah, bnxt_re_ah, ib_ah),
INIT_RDMA_OBJ_SIZE(ib_cq, bnxt_re_cq, ib_cq),
INIT_RDMA_OBJ_SIZE(ib_pd, bnxt_re_pd, ib_pd),
INIT_RDMA_OBJ_SIZE(ib_srq, bnxt_re_srq, ib_srq),
INIT_RDMA_OBJ_SIZE(ib_ucontext, bnxt_re_ucontext, ib_uctx),
@ -648,7 +653,6 @@ static int bnxt_re_register_ib(struct bnxt_re_dev *rdev)
int ret;
/* ib device init */
ibdev->owner = THIS_MODULE;
ibdev->node_type = RDMA_NODE_IB_CA;
strlcpy(ibdev->node_desc, BNXT_RE_DESC " HCA",
strlen(BNXT_RE_DESC) + 5);
@ -661,7 +665,6 @@ static int bnxt_re_register_ib(struct bnxt_re_dev *rdev)
ibdev->local_dma_lkey = BNXT_QPLIB_RSVD_LKEY;
/* User space */
ibdev->uverbs_abi_ver = BNXT_RE_ABI_VERSION;
ibdev->uverbs_cmd_mask =
(1ull << IB_USER_VERBS_CMD_GET_CONTEXT) |
(1ull << IB_USER_VERBS_CMD_QUERY_DEVICE) |
@ -691,7 +694,6 @@ static int bnxt_re_register_ib(struct bnxt_re_dev *rdev)
rdma_set_device_sysfs_group(ibdev, &bnxt_re_dev_attr_group);
ibdev->driver_id = RDMA_DRIVER_BNXT_RE;
ib_set_device_ops(ibdev, &bnxt_re_dev_ops);
ret = ib_device_set_netdev(&rdev->ibdev, rdev->netdev, 1);
if (ret)

@ -174,7 +174,6 @@ int cxio_create_cq(struct cxio_rdev *rdev_p, struct t3_cq *cq, int kernel)
return -ENOMEM;
}
dma_unmap_addr_set(cq, mapping, cq->dma_addr);
memset(cq->queue, 0, size);
setup.id = cq->cqid;
setup.base_addr = (u64) (cq->dma_addr);
setup.size = 1UL << cq->size_log2;
@ -187,20 +186,6 @@ int cxio_create_cq(struct cxio_rdev *rdev_p, struct t3_cq *cq, int kernel)
return (rdev_p->t3cdev_p->ctl(rdev_p->t3cdev_p, RDMA_CQ_SETUP, &setup));
}
#ifdef notyet
int cxio_resize_cq(struct cxio_rdev *rdev_p, struct t3_cq *cq)
{
struct rdma_cq_setup setup;
setup.id = cq->cqid;
setup.base_addr = (u64) (cq->dma_addr);
setup.size = 1UL << cq->size_log2;
setup.credits = setup.size;
setup.credit_thres = setup.size; /* TBD: overflow recovery */
setup.ovfl_mode = 1;
return (rdev_p->t3cdev_p->ctl(rdev_p->t3cdev_p, RDMA_CQ_SETUP, &setup));
}
#endif
static u32 get_qpid(struct cxio_rdev *rdev_p, struct cxio_ucontext *uctx)
{
struct cxio_qpid_list *entry;
@ -219,7 +204,7 @@ static u32 get_qpid(struct cxio_rdev *rdev_p, struct cxio_ucontext *uctx)
if (!qpid)
goto out;
for (i = qpid+1; i & rdev_p->qpmask; i++) {
entry = kmalloc(sizeof *entry, GFP_KERNEL);
entry = kmalloc(sizeof(*entry), GFP_KERNEL);
if (!entry)
break;
entry->qpid = i;
@ -237,7 +222,7 @@ static void put_qpid(struct cxio_rdev *rdev_p, u32 qpid,
{
struct cxio_qpid_list *entry;
entry = kmalloc(sizeof *entry, GFP_KERNEL);
entry = kmalloc(sizeof(*entry), GFP_KERNEL);
if (!entry)
return;
pr_debug("%s qpid 0x%x\n", __func__, qpid);
@ -317,17 +302,15 @@ err1:
return -ENOMEM;
}
int cxio_destroy_cq(struct cxio_rdev *rdev_p, struct t3_cq *cq)
void cxio_destroy_cq(struct cxio_rdev *rdev_p, struct t3_cq *cq)
{
int err;
err = cxio_hal_clear_cq_ctx(rdev_p, cq->cqid);
cxio_hal_clear_cq_ctx(rdev_p, cq->cqid);
kfree(cq->sw_queue);
dma_free_coherent(&(rdev_p->rnic_info.pdev->dev),
(1UL << (cq->size_log2))
* sizeof(struct t3_cqe) + 1, cq->queue,
dma_unmap_addr(cq, mapping));
cxio_hal_put_cqid(rdev_p->rscp, cq->cqid);
return err;
}
int cxio_destroy_qp(struct cxio_rdev *rdev_p, struct t3_wq *wq,
@ -538,8 +521,6 @@ static int cxio_hal_init_ctrl_qp(struct cxio_rdev *rdev_p)
dma_unmap_addr_set(&rdev_p->ctrl_qp, mapping,
rdev_p->ctrl_qp.dma_addr);
rdev_p->ctrl_qp.doorbell = (void __iomem *)rdev_p->rnic_info.kdb_addr;
memset(rdev_p->ctrl_qp.workq, 0,
(1 << T3_CTRL_QP_SIZE_LOG2) * sizeof(union t3_wr));
mutex_init(&rdev_p->ctrl_qp.lock);
init_waitqueue_head(&rdev_p->ctrl_qp.waitq);
@ -565,9 +546,9 @@ static int cxio_hal_init_ctrl_qp(struct cxio_rdev *rdev_p)
wqe->sge_cmd = cpu_to_be64(sge_cmd);
wqe->ctx1 = cpu_to_be64(ctx1);
wqe->ctx0 = cpu_to_be64(ctx0);
pr_debug("CtrlQP dma_addr 0x%llx workq %p size %d\n",
(unsigned long long)rdev_p->ctrl_qp.dma_addr,
rdev_p->ctrl_qp.workq, 1 << T3_CTRL_QP_SIZE_LOG2);
pr_debug("CtrlQP dma_addr %pad workq %p size %d\n",
&rdev_p->ctrl_qp.dma_addr, rdev_p->ctrl_qp.workq,
1 << T3_CTRL_QP_SIZE_LOG2);
skb->priority = CPL_PRIORITY_CONTROL;
return iwch_cxgb3_ofld_send(rdev_p->t3cdev_p, skb);
err:
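
The pr_debug() changes in this file (and the matching ones in the iwch and cxgb4 hunks below) switch DMA address printing from a cast to unsigned long long to the %pad printk specifier, which takes a pointer to the dma_addr_t rather than its value:

/* generic illustration, not from this patch */
static void print_dma_handle(dma_addr_t handle)
{
	pr_debug("dma handle %pad\n", &handle);	/* %pad takes a pointer to dma_addr_t */
}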

@ -158,8 +158,7 @@ void cxio_rdev_close(struct cxio_rdev *rdev);
int cxio_hal_cq_op(struct cxio_rdev *rdev, struct t3_cq *cq,
enum t3_cq_opcode op, u32 credit);
int cxio_create_cq(struct cxio_rdev *rdev, struct t3_cq *cq, int kernel);
int cxio_destroy_cq(struct cxio_rdev *rdev, struct t3_cq *cq);
int cxio_resize_cq(struct cxio_rdev *rdev, struct t3_cq *cq);
void cxio_destroy_cq(struct cxio_rdev *rdev, struct t3_cq *cq);
void cxio_release_ucontext(struct cxio_rdev *rdev, struct cxio_ucontext *uctx);
void cxio_init_ucontext(struct cxio_rdev *rdev, struct cxio_ucontext *uctx);
int cxio_create_qp(struct cxio_rdev *rdev, u32 kernel_domain, struct t3_wq *wq,

@ -170,7 +170,7 @@ static void release_tid(struct t3cdev *tdev, u32 hwtid, struct sk_buff *skb)
{
struct cpl_tid_release *req;
skb = get_skb(skb, sizeof *req, GFP_KERNEL);
skb = get_skb(skb, sizeof(*req), GFP_KERNEL);
if (!skb)
return;
req = skb_put(skb, sizeof(*req));

@ -88,7 +88,7 @@ static int iwch_alloc_ucontext(struct ib_ucontext *ucontext,
return 0;
}
static int iwch_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata)
static void iwch_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata)
{
struct iwch_cq *chp;
@ -100,17 +100,16 @@ static int iwch_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata)
wait_event(chp->wait, !atomic_read(&chp->refcnt));
cxio_destroy_cq(&chp->rhp->rdev, &chp->cq);
kfree(chp);
return 0;
}
static struct ib_cq *iwch_create_cq(struct ib_device *ibdev,
const struct ib_cq_init_attr *attr,
struct ib_udata *udata)
static int iwch_create_cq(struct ib_cq *ibcq,
const struct ib_cq_init_attr *attr,
struct ib_udata *udata)
{
struct ib_device *ibdev = ibcq->device;
int entries = attr->cqe;
struct iwch_dev *rhp;
struct iwch_cq *chp;
struct iwch_dev *rhp = to_iwch_dev(ibcq->device);
struct iwch_cq *chp = to_iwch_cq(ibcq);
struct iwch_create_cq_resp uresp;
struct iwch_create_cq_req ureq;
static int warned;
@ -118,19 +117,13 @@ static struct ib_cq *iwch_create_cq(struct ib_device *ibdev,
pr_debug("%s ib_dev %p entries %d\n", __func__, ibdev, entries);
if (attr->flags)
return ERR_PTR(-EINVAL);
rhp = to_iwch_dev(ibdev);
chp = kzalloc(sizeof(*chp), GFP_KERNEL);
if (!chp)
return ERR_PTR(-ENOMEM);
return -EINVAL;
if (udata) {
if (!t3a_device(rhp)) {
if (ib_copy_from_udata(&ureq, udata, sizeof (ureq))) {
kfree(chp);
return ERR_PTR(-EFAULT);
}
if (ib_copy_from_udata(&ureq, udata, sizeof(ureq)))
return -EFAULT;
chp->user_rptr_addr = (u32 __user *)(unsigned long)ureq.user_rptr_addr;
}
}
@ -151,10 +144,9 @@ static struct ib_cq *iwch_create_cq(struct ib_device *ibdev,
entries = roundup_pow_of_two(entries);
chp->cq.size_log2 = ilog2(entries);
if (cxio_create_cq(&rhp->rdev, &chp->cq, !udata)) {
kfree(chp);
return ERR_PTR(-ENOMEM);
}
if (cxio_create_cq(&rhp->rdev, &chp->cq, !udata))
return -ENOMEM;
chp->rhp = rhp;
chp->ibcq.cqe = 1 << chp->cq.size_log2;
spin_lock_init(&chp->lock);
@ -163,8 +155,7 @@ static struct ib_cq *iwch_create_cq(struct ib_device *ibdev,
init_waitqueue_head(&chp->wait);
if (xa_store_irq(&rhp->cqs, chp->cq.cqid, chp, GFP_KERNEL)) {
cxio_destroy_cq(&chp->rhp->rdev, &chp->cq);
kfree(chp);
return ERR_PTR(-ENOMEM);
return -ENOMEM;
}
if (udata) {
@ -172,10 +163,10 @@ static struct ib_cq *iwch_create_cq(struct ib_device *ibdev,
struct iwch_ucontext *ucontext = rdma_udata_to_drv_context(
udata, struct iwch_ucontext, ibucontext);
mm = kmalloc(sizeof *mm, GFP_KERNEL);
mm = kmalloc(sizeof(*mm), GFP_KERNEL);
if (!mm) {
iwch_destroy_cq(&chp->ibcq, udata);
return ERR_PTR(-ENOMEM);
return -ENOMEM;
}
uresp.cqid = chp->cq.cqid;
uresp.size_log2 = chp->cq.size_log2;
@ -185,7 +176,7 @@ static struct ib_cq *iwch_create_cq(struct ib_device *ibdev,
spin_unlock(&ucontext->mmap_lock);
mm->key = uresp.key;
mm->addr = virt_to_phys(chp->cq.queue);
if (udata->outlen < sizeof uresp) {
if (udata->outlen < sizeof(uresp)) {
if (!warned++)
pr_warn("Warning - downlevel libcxgb3 (non-fatal)\n");
mm->len = PAGE_ALIGN((1UL << uresp.size_log2) *
@ -196,86 +187,19 @@ static struct ib_cq *iwch_create_cq(struct ib_device *ibdev,
sizeof(struct t3_cqe));
uresp.memsize = mm->len;
uresp.reserved = 0;
resplen = sizeof uresp;
resplen = sizeof(uresp);
}
if (ib_copy_to_udata(udata, &uresp, resplen)) {
kfree(mm);
iwch_destroy_cq(&chp->ibcq, udata);
return ERR_PTR(-EFAULT);
return -EFAULT;
}
insert_mmap(ucontext, mm);
}
pr_debug("created cqid 0x%0x chp %p size 0x%0x, dma_addr 0x%0llx\n",
pr_debug("created cqid 0x%0x chp %p size 0x%0x, dma_addr %pad\n",
chp->cq.cqid, chp, (1 << chp->cq.size_log2),
(unsigned long long)chp->cq.dma_addr);
return &chp->ibcq;
}
static int iwch_resize_cq(struct ib_cq *cq, int cqe, struct ib_udata *udata)
{
#ifdef notyet
struct iwch_cq *chp = to_iwch_cq(cq);
struct t3_cq oldcq, newcq;
int ret;
pr_debug("%s ib_cq %p cqe %d\n", __func__, cq, cqe);
/* We don't downsize... */
if (cqe <= cq->cqe)
return 0;
/* create new t3_cq with new size */
cqe = roundup_pow_of_two(cqe+1);
newcq.size_log2 = ilog2(cqe);
/* Dont allow resize to less than the current wce count */
if (cqe < Q_COUNT(chp->cq.rptr, chp->cq.wptr)) {
return -ENOMEM;
}
/* Quiesce all QPs using this CQ */
ret = iwch_quiesce_qps(chp);
if (ret) {
return ret;
}
ret = cxio_create_cq(&chp->rhp->rdev, &newcq);
if (ret) {
return ret;
}
/* copy CQEs */
memcpy(newcq.queue, chp->cq.queue, (1 << chp->cq.size_log2) *
sizeof(struct t3_cqe));
/* old iwch_qp gets new t3_cq but keeps old cqid */
oldcq = chp->cq;
chp->cq = newcq;
chp->cq.cqid = oldcq.cqid;
/* resize new t3_cq to update the HW context */
ret = cxio_resize_cq(&chp->rhp->rdev, &chp->cq);
if (ret) {
chp->cq = oldcq;
return ret;
}
chp->ibcq.cqe = (1<<chp->cq.size_log2) - 1;
/* destroy old t3_cq */
oldcq.cqid = newcq.cqid;
ret = cxio_destroy_cq(&chp->rhp->rdev, &oldcq);
if (ret) {
pr_err("%s - cxio_destroy_cq failed %d\n", __func__, ret);
}
/* add user hooks here */
/* resume qps */
ret = iwch_resume_qps(chp);
return ret;
#else
return -ENOSYS;
#endif
&chp->cq.dma_addr);
return 0;
}
static int iwch_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags)
@ -422,8 +346,7 @@ static int iwch_dereg_mr(struct ib_mr *ib_mr, struct ib_udata *udata)
xa_erase_irq(&rhp->mrs, mmid);
if (mhp->kva)
kfree((void *) (unsigned long) mhp->kva);
if (mhp->umem)
ib_umem_release(mhp->umem);
ib_umem_release(mhp->umem);
pr_debug("%s mmid 0x%x ptr %p\n", __func__, mmid, mhp);
kfree(mhp);
return 0;
@ -553,7 +476,7 @@ static struct ib_mr *iwch_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
for_each_sg_dma_page(mhp->umem->sg_head.sgl, &sg_iter, mhp->umem->nmap, 0) {
pages[i++] = cpu_to_be64(sg_page_iter_dma_address(&sg_iter));
if (i == PAGE_SIZE / sizeof *pages) {
if (i == PAGE_SIZE / sizeof(*pages)) {
err = iwch_write_pbl(mhp, pages, i, n);
if (err)
goto pbl_done;
@ -587,7 +510,7 @@ pbl_done:
pr_debug("%s user resp pbl_addr 0x%x\n", __func__,
uresp.pbl_addr);
if (ib_copy_to_udata(udata, &uresp, sizeof (uresp))) {
if (ib_copy_to_udata(udata, &uresp, sizeof(uresp))) {
iwch_dereg_mr(&mhp->ibmr, udata);
err = -EFAULT;
goto err;
@ -880,13 +803,13 @@ static struct ib_qp *iwch_create_qp(struct ib_pd *pd,
struct iwch_mm_entry *mm1, *mm2;
mm1 = kmalloc(sizeof *mm1, GFP_KERNEL);
mm1 = kmalloc(sizeof(*mm1), GFP_KERNEL);
if (!mm1) {
iwch_destroy_qp(&qhp->ibqp, udata);
return ERR_PTR(-ENOMEM);
}
mm2 = kmalloc(sizeof *mm2, GFP_KERNEL);
mm2 = kmalloc(sizeof(*mm2), GFP_KERNEL);
if (!mm2) {
kfree(mm1);
iwch_destroy_qp(&qhp->ibqp, udata);
@ -903,7 +826,7 @@ static struct ib_qp *iwch_create_qp(struct ib_pd *pd,
uresp.db_key = ucontext->key;
ucontext->key += PAGE_SIZE;
spin_unlock(&ucontext->mmap_lock);
if (ib_copy_to_udata(udata, &uresp, sizeof (uresp))) {
if (ib_copy_to_udata(udata, &uresp, sizeof(uresp))) {
kfree(mm1);
kfree(mm2);
iwch_destroy_qp(&qhp->ibqp, udata);
@ -911,7 +834,7 @@ static struct ib_qp *iwch_create_qp(struct ib_pd *pd,
}
mm1->key = uresp.key;
mm1->addr = virt_to_phys(qhp->wq.queue);
mm1->len = PAGE_ALIGN(wqsize * sizeof (union t3_wr));
mm1->len = PAGE_ALIGN(wqsize * sizeof(union t3_wr));
insert_mmap(ucontext, mm1);
mm2->key = uresp.db_key;
mm2->addr = qhp->wq.udb & PAGE_MASK;
@ -919,10 +842,11 @@ static struct ib_qp *iwch_create_qp(struct ib_pd *pd,
insert_mmap(ucontext, mm2);
}
qhp->ibqp.qp_num = qhp->wq.qpid;
pr_debug("%s sq_num_entries %d, rq_num_entries %d qpid 0x%0x qhp %p dma_addr 0x%llx size %d rq_addr 0x%x\n",
__func__, qhp->attr.sq_num_entries, qhp->attr.rq_num_entries,
qhp->wq.qpid, qhp, (unsigned long long)qhp->wq.dma_addr,
1 << qhp->wq.size_log2, qhp->wq.rq_addr);
pr_debug(
"%s sq_num_entries %d, rq_num_entries %d qpid 0x%0x qhp %p dma_addr %pad size %d rq_addr 0x%x\n",
__func__, qhp->attr.sq_num_entries, qhp->attr.rq_num_entries,
qhp->wq.qpid, qhp, &qhp->wq.dma_addr, 1 << qhp->wq.size_log2,
qhp->wq.rq_addr);
return &qhp->ibqp;
}
@ -932,7 +856,7 @@ static int iwch_ib_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
struct iwch_dev *rhp;
struct iwch_qp *qhp;
enum iwch_qp_attr_mask mask = 0;
struct iwch_qp_attributes attrs;
struct iwch_qp_attributes attrs = {};
pr_debug("%s ib_qp %p\n", __func__, ibqp);
@ -944,7 +868,6 @@ static int iwch_ib_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
if (!attr_mask)
return 0;
memset(&attrs, 0, sizeof attrs);
qhp = to_iwch_qp(ibqp);
rhp = qhp->rhp;
@ -1040,7 +963,6 @@ static int iwch_query_device(struct ib_device *ibdev, struct ib_device_attr *pro
return -EINVAL;
dev = to_iwch_dev(ibdev);
memset(props, 0, sizeof *props);
memcpy(&props->sys_image_guid, dev->rdev.t3cdev_p->lldev->dev_addr, 6);
props->hw_ver = dev->rdev.t3cdev_p->type;
props->fw_ver = fw_vers_string_to_u64(dev);
@ -1304,6 +1226,11 @@ static void get_dev_fw_ver_str(struct ib_device *ibdev, char *str)
}
static const struct ib_device_ops iwch_dev_ops = {
.owner = THIS_MODULE,
.driver_id = RDMA_DRIVER_CXGB3,
.uverbs_abi_ver = IWCH_UVERBS_ABI_VERSION,
.uverbs_no_driver_id_binding = 1,
.alloc_hw_stats = iwch_alloc_stats,
.alloc_mr = iwch_alloc_mr,
.alloc_mw = iwch_alloc_mw,
@ -1341,8 +1268,8 @@ static const struct ib_device_ops iwch_dev_ops = {
.query_port = iwch_query_port,
.reg_user_mr = iwch_reg_user_mr,
.req_notify_cq = iwch_arm_cq,
.resize_cq = iwch_resize_cq,
INIT_RDMA_OBJ_SIZE(ib_pd, iwch_pd, ibpd),
INIT_RDMA_OBJ_SIZE(ib_cq, iwch_cq, ibcq),
INIT_RDMA_OBJ_SIZE(ib_ucontext, iwch_ucontext, ibucontext),
};
@ -1351,7 +1278,6 @@ int iwch_register_device(struct iwch_dev *dev)
pr_debug("%s iwch_dev %p\n", __func__, dev);
memset(&dev->ibdev.node_guid, 0, sizeof(dev->ibdev.node_guid));
memcpy(&dev->ibdev.node_guid, dev->rdev.t3cdev_p->lldev->dev_addr, 6);
dev->ibdev.owner = THIS_MODULE;
dev->device_cap_flags = IB_DEVICE_LOCAL_DMA_LKEY |
IB_DEVICE_MEM_WINDOW |
IB_DEVICE_MEM_MGT_EXTENSIONS;
@ -1383,12 +1309,10 @@ int iwch_register_device(struct iwch_dev *dev)
dev->ibdev.phys_port_cnt = dev->rdev.port_info.nports;
dev->ibdev.num_comp_vectors = 1;
dev->ibdev.dev.parent = &dev->rdev.rnic_info.pdev->dev;
dev->ibdev.uverbs_abi_ver = IWCH_UVERBS_ABI_VERSION;
memcpy(dev->ibdev.iw_ifname, dev->rdev.t3cdev_p->lldev->name,
sizeof(dev->ibdev.iw_ifname));
dev->ibdev.driver_id = RDMA_DRIVER_CXGB3;
rdma_set_device_sysfs_group(&dev->ibdev, &iwch_attr_group);
ib_set_device_ops(&dev->ibdev, &iwch_dev_ops);
return ib_register_device(&dev->ibdev, "cxgb3_%d");
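
As in this driver, the converted providers now carry the module owner, driver_id and uverbs ABI version in the shared ib_device_ops table instead of assigning ibdev->owner, ibdev->driver_id and ibdev->uverbs_abi_ver at registration time. In sketch form for a made-up driver (RDMA_DRIVER_BAR and BAR_UVERBS_ABI_VERSION are hypothetical placeholders):

/* illustrative only */
static const struct ib_device_ops bar_dev_ops = {
	.owner = THIS_MODULE,
	.driver_id = RDMA_DRIVER_BAR,		/* hypothetical enum rdma_driver_id value */
	.uverbs_abi_ver = BAR_UVERBS_ABI_VERSION,

	/* verb callbacks ... */
};

static void bar_set_ops(struct ib_device *ibdev)
{
	ib_set_device_ops(ibdev, &bar_dev_ops);
}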

@ -953,7 +953,7 @@ static int send_mpa_req(struct c4iw_ep *ep, struct sk_buff *skb,
mpalen = sizeof(*mpa) + ep->plen;
if (mpa_rev_to_use == 2)
mpalen += sizeof(struct mpa_v2_conn_params);
wrlen = roundup(mpalen + sizeof *req, 16);
wrlen = roundup(mpalen + sizeof(*req), 16);
skb = get_skb(skb, wrlen, GFP_KERNEL);
if (!skb) {
connect_reply_upcall(ep, -ENOMEM);
@ -997,8 +997,9 @@ static int send_mpa_req(struct c4iw_ep *ep, struct sk_buff *skb,
}
if (mpa_rev_to_use == 2) {
mpa->private_data_size = htons(ntohs(mpa->private_data_size) +
sizeof (struct mpa_v2_conn_params));
mpa->private_data_size =
htons(ntohs(mpa->private_data_size) +
sizeof(struct mpa_v2_conn_params));
pr_debug("initiator ird %u ord %u\n", ep->ird,
ep->ord);
mpa_v2_params.ird = htons((u16)ep->ird);
@ -1057,7 +1058,7 @@ static int send_mpa_reject(struct c4iw_ep *ep, const void *pdata, u8 plen)
mpalen = sizeof(*mpa) + plen;
if (ep->mpa_attr.version == 2 && ep->mpa_attr.enhanced_rdma_conn)
mpalen += sizeof(struct mpa_v2_conn_params);
wrlen = roundup(mpalen + sizeof *req, 16);
wrlen = roundup(mpalen + sizeof(*req), 16);
skb = get_skb(NULL, wrlen, GFP_KERNEL);
if (!skb) {
@ -1088,8 +1089,9 @@ static int send_mpa_reject(struct c4iw_ep *ep, const void *pdata, u8 plen)
if (ep->mpa_attr.version == 2 && ep->mpa_attr.enhanced_rdma_conn) {
mpa->flags |= MPA_ENHANCED_RDMA_CONN;
mpa->private_data_size = htons(ntohs(mpa->private_data_size) +
sizeof (struct mpa_v2_conn_params));
mpa->private_data_size =
htons(ntohs(mpa->private_data_size) +
sizeof(struct mpa_v2_conn_params));
mpa_v2_params.ird = htons(((u16)ep->ird) |
(peer2peer ? MPA_V2_PEER2PEER_MODEL :
0));
@ -1136,7 +1138,7 @@ static int send_mpa_reply(struct c4iw_ep *ep, const void *pdata, u8 plen)
mpalen = sizeof(*mpa) + plen;
if (ep->mpa_attr.version == 2 && ep->mpa_attr.enhanced_rdma_conn)
mpalen += sizeof(struct mpa_v2_conn_params);
wrlen = roundup(mpalen + sizeof *req, 16);
wrlen = roundup(mpalen + sizeof(*req), 16);
skb = get_skb(NULL, wrlen, GFP_KERNEL);
if (!skb) {
@ -1171,8 +1173,9 @@ static int send_mpa_reply(struct c4iw_ep *ep, const void *pdata, u8 plen)
if (ep->mpa_attr.version == 2 && ep->mpa_attr.enhanced_rdma_conn) {
mpa->flags |= MPA_ENHANCED_RDMA_CONN;
mpa->private_data_size = htons(ntohs(mpa->private_data_size) +
sizeof (struct mpa_v2_conn_params));
mpa->private_data_size =
htons(ntohs(mpa->private_data_size) +
sizeof(struct mpa_v2_conn_params));
mpa_v2_params.ird = htons((u16)ep->ird);
mpa_v2_params.ord = htons((u16)ep->ord);
if (peer2peer && (ep->mpa_attr.p2p_type !=

@ -34,16 +34,15 @@
#include "iw_cxgb4.h"
static int destroy_cq(struct c4iw_rdev *rdev, struct t4_cq *cq,
struct c4iw_dev_ucontext *uctx, struct sk_buff *skb,
struct c4iw_wr_wait *wr_waitp)
static void destroy_cq(struct c4iw_rdev *rdev, struct t4_cq *cq,
struct c4iw_dev_ucontext *uctx, struct sk_buff *skb,
struct c4iw_wr_wait *wr_waitp)
{
struct fw_ri_res_wr *res_wr;
struct fw_ri_res *res;
int wr_len;
int ret;
wr_len = sizeof *res_wr + sizeof *res;
wr_len = sizeof(*res_wr) + sizeof(*res);
set_wr_txq(skb, CPL_PRIORITY_CONTROL, 0);
res_wr = __skb_put_zero(skb, wr_len);
@ -59,14 +58,13 @@ static int destroy_cq(struct c4iw_rdev *rdev, struct t4_cq *cq,
res->u.cq.iqid = cpu_to_be32(cq->cqid);
c4iw_init_wr_wait(wr_waitp);
ret = c4iw_ref_send_wait(rdev, skb, wr_waitp, 0, 0, __func__);
c4iw_ref_send_wait(rdev, skb, wr_waitp, 0, 0, __func__);
kfree(cq->sw_queue);
dma_free_coherent(&(rdev->lldi.pdev->dev),
cq->memsize, cq->queue,
dma_unmap_addr(cq, mapping));
c4iw_put_cqid(rdev, cq->cqid, uctx);
return ret;
}
static int create_cq(struct c4iw_rdev *rdev, struct t4_cq *cq,
@ -104,7 +102,6 @@ static int create_cq(struct c4iw_rdev *rdev, struct t4_cq *cq,
goto err3;
}
dma_unmap_addr_set(cq, mapping, cq->dma_addr);
memset(cq->queue, 0, cq->memsize);
if (user && ucontext->is_32b_cqe) {
cq->qp_errp = &((struct t4_status_page *)
@ -117,7 +114,7 @@ static int create_cq(struct c4iw_rdev *rdev, struct t4_cq *cq,
}
/* build fw_ri_res_wr */
wr_len = sizeof *res_wr + sizeof *res;
wr_len = sizeof(*res_wr) + sizeof(*res);
skb = alloc_skb(wr_len, GFP_KERNEL);
if (!skb) {
@ -970,7 +967,7 @@ int c4iw_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *wc)
return !err || err == -ENODATA ? npolled : err;
}
int c4iw_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata)
void c4iw_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata)
{
struct c4iw_cq *chp;
struct c4iw_ucontext *ucontext;
@ -988,18 +985,16 @@ int c4iw_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata)
ucontext ? &ucontext->uctx : &chp->cq.rdev->uctx,
chp->destroy_skb, chp->wr_waitp);
c4iw_put_wr_wait(chp->wr_waitp);
kfree(chp);
return 0;
}
struct ib_cq *c4iw_create_cq(struct ib_device *ibdev,
const struct ib_cq_init_attr *attr,
struct ib_udata *udata)
int c4iw_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
struct ib_udata *udata)
{
struct ib_device *ibdev = ibcq->device;
int entries = attr->cqe;
int vector = attr->comp_vector;
struct c4iw_dev *rhp;
struct c4iw_cq *chp;
struct c4iw_dev *rhp = to_c4iw_dev(ibcq->device);
struct c4iw_cq *chp = to_c4iw_cq(ibcq);
struct c4iw_create_cq ucmd;
struct c4iw_create_cq_resp uresp;
int ret, wr_len;
@ -1010,22 +1005,16 @@ struct ib_cq *c4iw_create_cq(struct ib_device *ibdev,
pr_debug("ib_dev %p entries %d\n", ibdev, entries);
if (attr->flags)
return ERR_PTR(-EINVAL);
rhp = to_c4iw_dev(ibdev);
return -EINVAL;
if (vector >= rhp->rdev.lldi.nciq)
return ERR_PTR(-EINVAL);
return -EINVAL;
if (udata) {
if (udata->inlen < sizeof(ucmd))
ucontext->is_32b_cqe = 1;
}
chp = kzalloc(sizeof(*chp), GFP_KERNEL);
if (!chp)
return ERR_PTR(-ENOMEM);
chp->wr_waitp = c4iw_alloc_wr_wait(GFP_KERNEL);
if (!chp->wr_waitp) {
ret = -ENOMEM;
@ -1095,10 +1084,10 @@ struct ib_cq *c4iw_create_cq(struct ib_device *ibdev,
if (ucontext) {
ret = -ENOMEM;
mm = kmalloc(sizeof *mm, GFP_KERNEL);
mm = kmalloc(sizeof(*mm), GFP_KERNEL);
if (!mm)
goto err_remove_handle;
mm2 = kmalloc(sizeof *mm2, GFP_KERNEL);
mm2 = kmalloc(sizeof(*mm2), GFP_KERNEL);
if (!mm2)
goto err_free_mm;
@ -1135,10 +1124,11 @@ struct ib_cq *c4iw_create_cq(struct ib_device *ibdev,
mm2->len = PAGE_SIZE;
insert_mmap(ucontext, mm2);
}
pr_debug("cqid 0x%0x chp %p size %u memsize %zu, dma_addr 0x%0llx\n",
chp->cq.cqid, chp, chp->cq.size,
chp->cq.memsize, (unsigned long long)chp->cq.dma_addr);
return &chp->ibcq;
pr_debug("cqid 0x%0x chp %p size %u memsize %zu, dma_addr %pad\n",
chp->cq.cqid, chp, chp->cq.size, chp->cq.memsize,
&chp->cq.dma_addr);
return 0;
err_free_mm2:
kfree(mm2);
err_free_mm:
@ -1154,8 +1144,7 @@ err_free_skb:
err_free_wr_wait:
c4iw_put_wr_wait(chp->wr_waitp);
err_free_chp:
kfree(chp);
return ERR_PTR(ret);
return ret;
}
int c4iw_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags)

@ -327,7 +327,7 @@ static int qp_open(struct inode *inode, struct file *file)
unsigned long index;
int count = 1;
qpd = kmalloc(sizeof *qpd, GFP_KERNEL);
qpd = kmalloc(sizeof(*qpd), GFP_KERNEL);
if (!qpd)
return -ENOMEM;
@ -421,7 +421,7 @@ static int stag_open(struct inode *inode, struct file *file)
int ret = 0;
int count = 1;
stagd = kmalloc(sizeof *stagd, GFP_KERNEL);
stagd = kmalloc(sizeof(*stagd), GFP_KERNEL);
if (!stagd) {
ret = -ENOMEM;
goto out;
@ -1075,7 +1075,7 @@ static void *c4iw_uld_add(const struct cxgb4_lld_info *infop)
pr_info("Chelsio T4/T5 RDMA Driver - version %s\n",
DRV_VERSION);
ctx = kzalloc(sizeof *ctx, GFP_KERNEL);
ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
if (!ctx) {
ctx = ERR_PTR(-ENOMEM);
goto out;
@ -1243,10 +1243,9 @@ static int c4iw_uld_state_change(void *handle, enum cxgb4_state new_state)
case CXGB4_STATE_START_RECOVERY:
pr_info("%s: Fatal Error\n", pci_name(ctx->lldi.pdev));
if (ctx->dev) {
struct ib_event event;
struct ib_event event = {};
ctx->dev->rdev.flags |= T4_FATAL_ERROR;
memset(&event, 0, sizeof event);
event.event = IB_EVENT_DEVICE_FATAL;
event.device = &ctx->dev->ibdev;
ib_dispatch_event(&event);

@ -490,13 +490,13 @@ struct c4iw_qp {
struct t4_wq wq;
spinlock_t lock;
struct mutex mutex;
struct kref kref;
wait_queue_head_t wait;
int sq_sig_all;
struct c4iw_srq *srq;
struct work_struct free_work;
struct c4iw_ucontext *ucontext;
struct c4iw_wr_wait *wr_waitp;
struct completion qp_rel_comp;
refcount_t qp_refcnt;
};
static inline struct c4iw_qp *to_c4iw_qp(struct ib_qp *ibqp)
@ -992,10 +992,9 @@ struct ib_mr *c4iw_reg_user_mr(struct ib_pd *pd, u64 start,
struct ib_udata *udata);
struct ib_mr *c4iw_get_dma_mr(struct ib_pd *pd, int acc);
int c4iw_dereg_mr(struct ib_mr *ib_mr, struct ib_udata *udata);
int c4iw_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata);
struct ib_cq *c4iw_create_cq(struct ib_device *ibdev,
const struct ib_cq_init_attr *attr,
struct ib_udata *udata);
void c4iw_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata);
int c4iw_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
struct ib_udata *udata);
int c4iw_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags);
int c4iw_modify_srq(struct ib_srq *ib_srq, struct ib_srq_attr *attr,
enum ib_srq_attr_mask srq_attr_mask,

@ -130,8 +130,9 @@ static int _c4iw_write_mem_inline(struct c4iw_rdev *rdev, u32 addr, u32 len,
copy_len = len > C4IW_MAX_INLINE_SIZE ? C4IW_MAX_INLINE_SIZE :
len;
wr_len = roundup(sizeof *req + sizeof *sc +
roundup(copy_len, T4_ULPTX_MIN_IO), 16);
wr_len = roundup(sizeof(*req) + sizeof(*sc) +
roundup(copy_len, T4_ULPTX_MIN_IO),
16);
if (!skb) {
skb = alloc_skb(wr_len, GFP_KERNEL | __GFP_NOFAIL);
@ -807,8 +808,7 @@ int c4iw_dereg_mr(struct ib_mr *ib_mr, struct ib_udata *udata)
mhp->attr.pbl_size << 3);
if (mhp->kva)
kfree((void *) (unsigned long) mhp->kva);
if (mhp->umem)
ib_umem_release(mhp->umem);
ib_umem_release(mhp->umem);
pr_debug("mmid 0x%x ptr %p\n", mmid, mhp);
c4iw_put_wr_wait(mhp->wr_waitp);
kfree(mhp);

@ -271,7 +271,6 @@ static int c4iw_query_device(struct ib_device *ibdev, struct ib_device_attr *pro
return -EINVAL;
dev = to_c4iw_dev(ibdev);
memset(props, 0, sizeof *props);
memcpy(&props->sys_image_guid, dev->rdev.lldi.ports[0]->dev_addr, 6);
props->hw_ver = CHELSIO_CHIP_RELEASE(dev->rdev.lldi.adapter_type);
props->fw_ver = dev->rdev.lldi.fw_vers;
@ -490,6 +489,10 @@ static int fill_res_entry(struct sk_buff *msg, struct rdma_restrack_entry *res)
}
static const struct ib_device_ops c4iw_dev_ops = {
.owner = THIS_MODULE,
.driver_id = RDMA_DRIVER_CXGB4,
.uverbs_abi_ver = C4IW_UVERBS_ABI_VERSION,
.alloc_hw_stats = c4iw_alloc_stats,
.alloc_mr = c4iw_alloc_mr,
.alloc_mw = c4iw_alloc_mw,
@ -534,6 +537,7 @@ static const struct ib_device_ops c4iw_dev_ops = {
.reg_user_mr = c4iw_reg_user_mr,
.req_notify_cq = c4iw_arm_cq,
INIT_RDMA_OBJ_SIZE(ib_pd, c4iw_pd, ibpd),
INIT_RDMA_OBJ_SIZE(ib_cq, c4iw_cq, ibcq),
INIT_RDMA_OBJ_SIZE(ib_srq, c4iw_srq, ibsrq),
INIT_RDMA_OBJ_SIZE(ib_ucontext, c4iw_ucontext, ibucontext),
};
@ -561,7 +565,6 @@ void c4iw_register_device(struct work_struct *work)
pr_debug("c4iw_dev %p\n", dev);
memset(&dev->ibdev.node_guid, 0, sizeof(dev->ibdev.node_guid));
memcpy(&dev->ibdev.node_guid, dev->rdev.lldi.ports[0]->dev_addr, 6);
dev->ibdev.owner = THIS_MODULE;
dev->device_cap_flags = IB_DEVICE_LOCAL_DMA_LKEY | IB_DEVICE_MEM_WINDOW;
if (fastreg_support)
dev->device_cap_flags |= IB_DEVICE_MEM_MGT_EXTENSIONS;
@ -594,13 +597,11 @@ void c4iw_register_device(struct work_struct *work)
dev->ibdev.phys_port_cnt = dev->rdev.lldi.nports;
dev->ibdev.num_comp_vectors = dev->rdev.lldi.nciq;
dev->ibdev.dev.parent = &dev->rdev.lldi.pdev->dev;
dev->ibdev.uverbs_abi_ver = C4IW_UVERBS_ABI_VERSION;
memcpy(dev->ibdev.iw_ifname, dev->rdev.lldi.ports[0]->name,
sizeof(dev->ibdev.iw_ifname));
rdma_set_device_sysfs_group(&dev->ibdev, &c4iw_attr_group);
dev->ibdev.driver_id = RDMA_DRIVER_CXGB4;
ib_set_device_ops(&dev->ibdev, &c4iw_dev_ops);
ret = set_netdevs(&dev->ibdev, &dev->rdev);
if (ret)

@ -274,7 +274,6 @@ static int create_qp(struct c4iw_rdev *rdev, struct t4_wq *wq,
(unsigned long long)virt_to_phys(wq->sq.queue),
wq->rq.queue,
(unsigned long long)virt_to_phys(wq->rq.queue));
memset(wq->rq.queue, 0, wq->rq.memsize);
dma_unmap_addr_set(&wq->rq, mapping, wq->rq.dma_addr);
}
@ -303,7 +302,7 @@ static int create_qp(struct c4iw_rdev *rdev, struct t4_wq *wq,
wq->rq.msn = 1;
/* build fw_ri_res_wr */
wr_len = sizeof *res_wr + 2 * sizeof *res;
wr_len = sizeof(*res_wr) + 2 * sizeof(*res);
if (need_rq)
wr_len += sizeof(*res);
skb = alloc_skb(wr_len, GFP_KERNEL);
@ -439,7 +438,7 @@ static int build_immd(struct t4_sq *sq, struct fw_ri_immd *immdp,
rem -= len;
}
}
len = roundup(plen + sizeof *immdp, 16) - (plen + sizeof *immdp);
len = roundup(plen + sizeof(*immdp), 16) - (plen + sizeof(*immdp));
if (len)
memset(dstp, 0, len);
immdp->op = FW_RI_DATA_IMMD;
@ -528,7 +527,7 @@ static int build_rdma_send(struct t4_sq *sq, union t4_wr *wqe,
T4_MAX_SEND_INLINE, &plen);
if (ret)
return ret;
size = sizeof wqe->send + sizeof(struct fw_ri_immd) +
size = sizeof(wqe->send) + sizeof(struct fw_ri_immd) +
plen;
} else {
ret = build_isgl((__be64 *)sq->queue,
@ -537,7 +536,7 @@ static int build_rdma_send(struct t4_sq *sq, union t4_wr *wqe,
wr->sg_list, wr->num_sge, &plen);
if (ret)
return ret;
size = sizeof wqe->send + sizeof(struct fw_ri_isgl) +
size = sizeof(wqe->send) + sizeof(struct fw_ri_isgl) +
wr->num_sge * sizeof(struct fw_ri_sge);
}
} else {
@ -545,7 +544,7 @@ static int build_rdma_send(struct t4_sq *sq, union t4_wr *wqe,
wqe->send.u.immd_src[0].r1 = 0;
wqe->send.u.immd_src[0].r2 = 0;
wqe->send.u.immd_src[0].immdlen = 0;
size = sizeof wqe->send + sizeof(struct fw_ri_immd);
size = sizeof(wqe->send) + sizeof(struct fw_ri_immd);
plen = 0;
}
*len16 = DIV_ROUND_UP(size, 16);
@ -579,7 +578,7 @@ static int build_rdma_write(struct t4_sq *sq, union t4_wr *wqe,
T4_MAX_WRITE_INLINE, &plen);
if (ret)
return ret;
size = sizeof wqe->write + sizeof(struct fw_ri_immd) +
size = sizeof(wqe->write) + sizeof(struct fw_ri_immd) +
plen;
} else {
ret = build_isgl((__be64 *)sq->queue,
@ -588,7 +587,7 @@ static int build_rdma_write(struct t4_sq *sq, union t4_wr *wqe,
wr->sg_list, wr->num_sge, &plen);
if (ret)
return ret;
size = sizeof wqe->write + sizeof(struct fw_ri_isgl) +
size = sizeof(wqe->write) + sizeof(struct fw_ri_isgl) +
wr->num_sge * sizeof(struct fw_ri_sge);
}
} else {
@ -596,7 +595,7 @@ static int build_rdma_write(struct t4_sq *sq, union t4_wr *wqe,
wqe->write.u.immd_src[0].r1 = 0;
wqe->write.u.immd_src[0].r2 = 0;
wqe->write.u.immd_src[0].immdlen = 0;
size = sizeof wqe->write + sizeof(struct fw_ri_immd);
size = sizeof(wqe->write) + sizeof(struct fw_ri_immd);
plen = 0;
}
*len16 = DIV_ROUND_UP(size, 16);
@ -683,7 +682,7 @@ static int build_rdma_read(union t4_wr *wqe, const struct ib_send_wr *wr,
}
wqe->read.r2 = 0;
wqe->read.r5 = 0;
*len16 = DIV_ROUND_UP(sizeof wqe->read, 16);
*len16 = DIV_ROUND_UP(sizeof(wqe->read), 16);
return 0;
}
@ -766,8 +765,8 @@ static int build_rdma_recv(struct c4iw_qp *qhp, union t4_recv_wr *wqe,
&wqe->recv.isgl, wr->sg_list, wr->num_sge, NULL);
if (ret)
return ret;
*len16 = DIV_ROUND_UP(sizeof wqe->recv +
wr->num_sge * sizeof(struct fw_ri_sge), 16);
*len16 = DIV_ROUND_UP(
sizeof(wqe->recv) + wr->num_sge * sizeof(struct fw_ri_sge), 16);
return 0;
}
@ -886,47 +885,21 @@ static int build_inv_stag(union t4_wr *wqe, const struct ib_send_wr *wr,
{
wqe->inv.stag_inv = cpu_to_be32(wr->ex.invalidate_rkey);
wqe->inv.r2 = 0;
*len16 = DIV_ROUND_UP(sizeof wqe->inv, 16);
*len16 = DIV_ROUND_UP(sizeof(wqe->inv), 16);
return 0;
}
static void free_qp_work(struct work_struct *work)
{
struct c4iw_ucontext *ucontext;
struct c4iw_qp *qhp;
struct c4iw_dev *rhp;
qhp = container_of(work, struct c4iw_qp, free_work);
ucontext = qhp->ucontext;
rhp = qhp->rhp;
pr_debug("qhp %p ucontext %p\n", qhp, ucontext);
destroy_qp(&rhp->rdev, &qhp->wq,
ucontext ? &ucontext->uctx : &rhp->rdev.uctx, !qhp->srq);
c4iw_put_wr_wait(qhp->wr_waitp);
kfree(qhp);
}
static void queue_qp_free(struct kref *kref)
{
struct c4iw_qp *qhp;
qhp = container_of(kref, struct c4iw_qp, kref);
pr_debug("qhp %p\n", qhp);
queue_work(qhp->rhp->rdev.free_workq, &qhp->free_work);
}
void c4iw_qp_add_ref(struct ib_qp *qp)
{
pr_debug("ib_qp %p\n", qp);
kref_get(&to_c4iw_qp(qp)->kref);
refcount_inc(&to_c4iw_qp(qp)->qp_refcnt);
}
void c4iw_qp_rem_ref(struct ib_qp *qp)
{
pr_debug("ib_qp %p\n", qp);
kref_put(&to_c4iw_qp(qp)->kref, queue_qp_free);
if (refcount_dec_and_test(&to_c4iw_qp(qp)->qp_refcnt))
complete(&to_c4iw_qp(qp)->qp_rel_comp);
}
static void add_to_fc_list(struct list_head *head, struct list_head *entry)
@ -1606,7 +1579,7 @@ static void post_terminate(struct c4iw_qp *qhp, struct t4_cqe *err_cqe,
FW_WR_LEN16_V(DIV_ROUND_UP(sizeof(*wqe), 16)));
wqe->u.terminate.type = FW_RI_TYPE_TERMINATE;
wqe->u.terminate.immdlen = cpu_to_be32(sizeof *term);
wqe->u.terminate.immdlen = cpu_to_be32(sizeof(*term));
term = (struct terminate_message *)wqe->u.terminate.termmsg;
if (qhp->attr.layer_etype == (LAYER_MPA|DDP_LLP)) {
term->layer_etype = qhp->attr.layer_etype;
@ -1751,16 +1724,15 @@ static int rdma_fini(struct c4iw_dev *rhp, struct c4iw_qp *qhp,
static void build_rtr_msg(u8 p2p_type, struct fw_ri_init *init)
{
pr_debug("p2p_type = %d\n", p2p_type);
memset(&init->u, 0, sizeof init->u);
memset(&init->u, 0, sizeof(init->u));
switch (p2p_type) {
case FW_RI_INIT_P2PTYPE_RDMA_WRITE:
init->u.write.opcode = FW_RI_RDMA_WRITE_WR;
init->u.write.stag_sink = cpu_to_be32(1);
init->u.write.to_sink = cpu_to_be64(1);
init->u.write.u.immd_src[0].op = FW_RI_DATA_IMMD;
init->u.write.len16 = DIV_ROUND_UP(sizeof init->u.write +
sizeof(struct fw_ri_immd),
16);
init->u.write.len16 = DIV_ROUND_UP(
sizeof(init->u.write) + sizeof(struct fw_ri_immd), 16);
break;
case FW_RI_INIT_P2PTYPE_READ_REQ:
init->u.write.opcode = FW_RI_RDMA_READ_WR;
@ -1768,7 +1740,7 @@ static void build_rtr_msg(u8 p2p_type, struct fw_ri_init *init)
init->u.read.to_src_lo = cpu_to_be32(1);
init->u.read.stag_sink = cpu_to_be32(1);
init->u.read.to_sink_lo = cpu_to_be32(1);
init->u.read.len16 = DIV_ROUND_UP(sizeof init->u.read, 16);
init->u.read.len16 = DIV_ROUND_UP(sizeof(init->u.read), 16);
break;
}
}
@ -1782,7 +1754,7 @@ static int rdma_init(struct c4iw_dev *rhp, struct c4iw_qp *qhp)
pr_debug("qhp %p qid 0x%x tid %u ird %u ord %u\n", qhp,
qhp->wq.sq.qid, qhp->ep->hwtid, qhp->ep->ird, qhp->ep->ord);
skb = alloc_skb(sizeof *wqe, GFP_KERNEL);
skb = alloc_skb(sizeof(*wqe), GFP_KERNEL);
if (!skb) {
ret = -ENOMEM;
goto out;
@ -2099,10 +2071,12 @@ int c4iw_destroy_qp(struct ib_qp *ib_qp, struct ib_udata *udata)
{
struct c4iw_dev *rhp;
struct c4iw_qp *qhp;
struct c4iw_ucontext *ucontext;
struct c4iw_qp_attributes attrs;
qhp = to_c4iw_qp(ib_qp);
rhp = qhp->rhp;
ucontext = qhp->ucontext;
attrs.next_state = C4IW_QP_STATE_ERROR;
if (qhp->attr.state == C4IW_QP_STATE_TERMINATE)
@ -2120,7 +2094,17 @@ int c4iw_destroy_qp(struct ib_qp *ib_qp, struct ib_udata *udata)
c4iw_qp_rem_ref(ib_qp);
wait_for_completion(&qhp->qp_rel_comp);
pr_debug("ib_qp %p qpid 0x%0x\n", ib_qp, qhp->wq.sq.qid);
pr_debug("qhp %p ucontext %p\n", qhp, ucontext);
destroy_qp(&rhp->rdev, &qhp->wq,
ucontext ? &ucontext->uctx : &rhp->rdev.uctx, !qhp->srq);
c4iw_put_wr_wait(qhp->wr_waitp);
kfree(qhp);
return 0;
}
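
The hunks above replace the cxgb4 QP's kref plus deferred-workqueue free with a refcount_t and a completion: the destroy path drops its own reference, waits until every other holder has dropped theirs, and then tears the QP down synchronously. A minimal sketch of that pattern, with made-up names (my_qp, my_qp_get/put) and no claim about other constraints on the destroy context:

        #include <linux/completion.h>
        #include <linux/refcount.h>
        #include <linux/slab.h>

        struct my_qp {
                refcount_t refcnt;
                struct completion rel_comp;
        };

        static struct my_qp *my_qp_alloc(void)
        {
                struct my_qp *qp = kzalloc(sizeof(*qp), GFP_KERNEL);

                if (qp) {
                        refcount_set(&qp->refcnt, 1);   /* creator's reference */
                        init_completion(&qp->rel_comp);
                }
                return qp;
        }

        static void my_qp_get(struct my_qp *qp)
        {
                refcount_inc(&qp->refcnt);
        }

        static void my_qp_put(struct my_qp *qp)
        {
                /* Last reference gone: wake the destroy path instead of freeing */
                if (refcount_dec_and_test(&qp->refcnt))
                        complete(&qp->rel_comp);
        }

        static void my_qp_destroy(struct my_qp *qp)
        {
                my_qp_put(qp);                          /* drop the initial reference */
                wait_for_completion(&qp->rel_comp);     /* wait out remaining users */
                kfree(qp);                              /* now safe to free synchronously */
        }
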
@ -2230,8 +2214,8 @@ struct ib_qp *c4iw_create_qp(struct ib_pd *pd, struct ib_qp_init_attr *attrs,
spin_lock_init(&qhp->lock);
mutex_init(&qhp->mutex);
init_waitqueue_head(&qhp->wait);
kref_init(&qhp->kref);
INIT_WORK(&qhp->free_work, free_qp_work);
init_completion(&qhp->qp_rel_comp);
refcount_set(&qhp->qp_refcnt, 1);
ret = xa_insert_irq(&rhp->qps, qhp->wq.sq.qid, qhp, GFP_KERNEL);
if (ret)
@ -2302,7 +2286,7 @@ struct ib_qp *c4iw_create_qp(struct ib_pd *pd, struct ib_qp_init_attr *attrs,
ucontext->key += PAGE_SIZE;
}
spin_unlock(&ucontext->mmap_lock);
ret = ib_copy_to_udata(udata, &uresp, sizeof uresp);
ret = ib_copy_to_udata(udata, &uresp, sizeof(uresp));
if (ret)
goto err_free_ma_sync_key;
sq_key_mm->key = uresp.sq_key;
@ -2386,7 +2370,7 @@ int c4iw_ib_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
struct c4iw_dev *rhp;
struct c4iw_qp *qhp;
enum c4iw_qp_attr_mask mask = 0;
struct c4iw_qp_attributes attrs;
struct c4iw_qp_attributes attrs = {};
pr_debug("ib_qp %p\n", ibqp);
@ -2398,7 +2382,6 @@ int c4iw_ib_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
if (!attr_mask)
return 0;
memset(&attrs, 0, sizeof attrs);
qhp = to_c4iw_qp(ibqp);
rhp = qhp->rhp;
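
A small note on the hunk above: the empty initializer on attrs zeroes the whole structure at its declaration, which is why the later memset() is dropped. Illustrative sketch with a made-up struct:

        struct demo_attrs {
                int next_state;
                int enable_rdma_read;
        };

        static void demo(void)
        {
                struct demo_attrs attrs = {};   /* every member starts out zeroed */

                attrs.next_state = 1;           /* set only what this caller needs */
                /* ... pass &attrs on ... */
        }
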
@ -2482,8 +2465,8 @@ int c4iw_ib_query_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
{
struct c4iw_qp *qhp = to_c4iw_qp(ibqp);
memset(attr, 0, sizeof *attr);
memset(init_attr, 0, sizeof *init_attr);
memset(attr, 0, sizeof(*attr));
memset(init_attr, 0, sizeof(*init_attr));
attr->qp_state = to_ib_qp_state(qhp->attr.state);
init_attr->cap.max_send_wr = qhp->attr.sq_num_entries;
init_attr->cap.max_recv_wr = qhp->attr.rq_num_entries;

View File

@ -126,7 +126,7 @@ u32 c4iw_get_cqid(struct c4iw_rdev *rdev, struct c4iw_dev_ucontext *uctx)
rdev->stats.qid.cur += rdev->qpmask + 1;
mutex_unlock(&rdev->stats.lock);
for (i = qid+1; i & rdev->qpmask; i++) {
entry = kmalloc(sizeof *entry, GFP_KERNEL);
entry = kmalloc(sizeof(*entry), GFP_KERNEL);
if (!entry)
goto out;
entry->qid = i;
@ -137,13 +137,13 @@ u32 c4iw_get_cqid(struct c4iw_rdev *rdev, struct c4iw_dev_ucontext *uctx)
* now put the same ids on the qp list since they all
* map to the same db/gts page.
*/
entry = kmalloc(sizeof *entry, GFP_KERNEL);
entry = kmalloc(sizeof(*entry), GFP_KERNEL);
if (!entry)
goto out;
entry->qid = qid;
list_add_tail(&entry->entry, &uctx->qpids);
for (i = qid+1; i & rdev->qpmask; i++) {
entry = kmalloc(sizeof *entry, GFP_KERNEL);
entry = kmalloc(sizeof(*entry), GFP_KERNEL);
if (!entry)
goto out;
entry->qid = i;
@ -165,7 +165,7 @@ void c4iw_put_cqid(struct c4iw_rdev *rdev, u32 qid,
{
struct c4iw_qid_list *entry;
entry = kmalloc(sizeof *entry, GFP_KERNEL);
entry = kmalloc(sizeof(*entry), GFP_KERNEL);
if (!entry)
return;
pr_debug("qid 0x%x\n", qid);
@ -200,7 +200,7 @@ u32 c4iw_get_qpid(struct c4iw_rdev *rdev, struct c4iw_dev_ucontext *uctx)
rdev->stats.qid.cur += rdev->qpmask + 1;
mutex_unlock(&rdev->stats.lock);
for (i = qid+1; i & rdev->qpmask; i++) {
entry = kmalloc(sizeof *entry, GFP_KERNEL);
entry = kmalloc(sizeof(*entry), GFP_KERNEL);
if (!entry)
goto out;
entry->qid = i;
@ -211,13 +211,13 @@ u32 c4iw_get_qpid(struct c4iw_rdev *rdev, struct c4iw_dev_ucontext *uctx)
* now put the same ids on the cq list since they all
* map to the same db/gts page.
*/
entry = kmalloc(sizeof *entry, GFP_KERNEL);
entry = kmalloc(sizeof(*entry), GFP_KERNEL);
if (!entry)
goto out;
entry->qid = qid;
list_add_tail(&entry->entry, &uctx->cqids);
for (i = qid; i & rdev->qpmask; i++) {
entry = kmalloc(sizeof *entry, GFP_KERNEL);
entry = kmalloc(sizeof(*entry), GFP_KERNEL);
if (!entry)
goto out;
entry->qid = i;
@ -239,7 +239,7 @@ void c4iw_put_qpid(struct c4iw_rdev *rdev, u32 qid,
{
struct c4iw_qid_list *entry;
entry = kmalloc(sizeof *entry, GFP_KERNEL);
entry = kmalloc(sizeof(*entry), GFP_KERNEL);
if (!entry)
return;
pr_debug("qid 0x%x\n", qid);

View File

@ -7,10 +7,8 @@
#define _EFA_H_
#include <linux/bitops.h>
#include <linux/idr.h>
#include <linux/interrupt.h>
#include <linux/pci.h>
#include <linux/sched.h>
#include <rdma/efa-abi.h>
#include <rdma/ib_verbs.h>
@ -136,10 +134,9 @@ int efa_destroy_qp(struct ib_qp *ibqp, struct ib_udata *udata);
struct ib_qp *efa_create_qp(struct ib_pd *ibpd,
struct ib_qp_init_attr *init_attr,
struct ib_udata *udata);
int efa_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata);
struct ib_cq *efa_create_cq(struct ib_device *ibdev,
const struct ib_cq_init_attr *attr,
struct ib_udata *udata);
void efa_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata);
int efa_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
struct ib_udata *udata);
struct ib_mr *efa_reg_mr(struct ib_pd *ibpd, u64 start, u64 length,
u64 virt_addr, int access_flags,
struct ib_udata *udata);
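
The prototype changes above are the CQ allocation boilerplate moving into the core: efa_create_cq() now receives an ib_cq embedded in a core-allocated efa_cq, and efa_destroy_cq() returns void because the core frees the object. A rough sketch of the pattern with made-up driver names (my_cq, my_create_cq), not the efa implementation itself:

        #include <rdma/ib_verbs.h>

        struct my_cq {
                u32 cqn;                /* driver-private state (assumed) */
                struct ib_cq ibcq;      /* member named in INIT_RDMA_OBJ_SIZE below */
        };

        static int my_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
                                struct ib_udata *udata)
        {
                struct my_cq *cq = container_of(ibcq, struct my_cq, ibcq);

                /* ibcq arrives already allocated and zeroed by the IB core; only
                 * the hardware CQ for attr->cqe entries remains to be set up.
                 */
                cq->cqn = 0;
                return 0;               /* on failure return -Exxx; the core frees cq */
        }

        static void my_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata)
        {
                /* tear down the hardware CQ; do not kfree() the container here */
        }

        static const struct ib_device_ops my_dev_ops = {
                .create_cq = my_create_cq,
                .destroy_cq = my_destroy_cq,
                INIT_RDMA_OBJ_SIZE(ib_cq, my_cq, ibcq),
        };
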

View File

@ -39,8 +39,6 @@
enum efa_cmd_status {
EFA_CMD_SUBMITTED,
EFA_CMD_COMPLETED,
/* Abort - canceled by the driver */
EFA_CMD_ABORTED,
};
struct efa_comp_ctx {
@ -280,36 +278,34 @@ static void efa_com_dealloc_ctx_id(struct efa_com_admin_queue *aq,
static inline void efa_com_put_comp_ctx(struct efa_com_admin_queue *aq,
struct efa_comp_ctx *comp_ctx)
{
u16 comp_id = comp_ctx->user_cqe->acq_common_descriptor.command &
EFA_ADMIN_ACQ_COMMON_DESC_COMMAND_ID_MASK;
u16 cmd_id = comp_ctx->user_cqe->acq_common_descriptor.command &
EFA_ADMIN_ACQ_COMMON_DESC_COMMAND_ID_MASK;
u16 ctx_id = cmd_id & (aq->depth - 1);
ibdev_dbg(aq->efa_dev, "Putting completion command_id %d\n", comp_id);
ibdev_dbg(aq->efa_dev, "Put completion command_id %#x\n", cmd_id);
comp_ctx->occupied = 0;
efa_com_dealloc_ctx_id(aq, comp_id);
efa_com_dealloc_ctx_id(aq, ctx_id);
}
static struct efa_comp_ctx *efa_com_get_comp_ctx(struct efa_com_admin_queue *aq,
u16 command_id, bool capture)
u16 cmd_id, bool capture)
{
if (command_id >= aq->depth) {
ibdev_err(aq->efa_dev,
"command id is larger than the queue size. cmd_id: %u queue size %d\n",
command_id, aq->depth);
return NULL;
}
u16 ctx_id = cmd_id & (aq->depth - 1);
if (aq->comp_ctx[command_id].occupied && capture) {
ibdev_err(aq->efa_dev, "Completion context is occupied\n");
if (aq->comp_ctx[ctx_id].occupied && capture) {
ibdev_err(aq->efa_dev,
"Completion context for command_id %#x is occupied\n",
cmd_id);
return NULL;
}
if (capture) {
aq->comp_ctx[command_id].occupied = 1;
ibdev_dbg(aq->efa_dev, "Taking completion ctxt command_id %d\n",
command_id);
aq->comp_ctx[ctx_id].occupied = 1;
ibdev_dbg(aq->efa_dev,
"Take completion ctxt for command_id %#x\n", cmd_id);
}
return &aq->comp_ctx[command_id];
return &aq->comp_ctx[ctx_id];
}
static struct efa_comp_ctx *__efa_com_submit_admin_cmd(struct efa_com_admin_queue *aq,
@ -320,6 +316,7 @@ static struct efa_comp_ctx *__efa_com_submit_admin_cmd(struct efa_com_admin_queu
{
struct efa_comp_ctx *comp_ctx;
u16 queue_size_mask;
u16 cmd_id;
u16 ctx_id;
u16 pi;
@ -328,13 +325,16 @@ static struct efa_comp_ctx *__efa_com_submit_admin_cmd(struct efa_com_admin_queu
ctx_id = efa_com_alloc_ctx_id(aq);
/* cmd_id LSBs are the ctx_id and MSBs are entropy bits from pc */
cmd_id = ctx_id & queue_size_mask;
cmd_id |= aq->sq.pc & ~queue_size_mask;
cmd_id &= EFA_ADMIN_AQ_COMMON_DESC_COMMAND_ID_MASK;
cmd->aq_common_descriptor.command_id = cmd_id;
cmd->aq_common_descriptor.flags |= aq->sq.phase &
EFA_ADMIN_AQ_COMMON_DESC_PHASE_MASK;
cmd->aq_common_descriptor.command_id |= ctx_id &
EFA_ADMIN_AQ_COMMON_DESC_COMMAND_ID_MASK;
comp_ctx = efa_com_get_comp_ctx(aq, ctx_id, true);
comp_ctx = efa_com_get_comp_ctx(aq, cmd_id, true);
if (!comp_ctx) {
efa_com_dealloc_ctx_id(aq, ctx_id);
return ERR_PTR(-EINVAL);
@ -532,16 +532,6 @@ static int efa_com_wait_and_process_admin_cq_polling(struct efa_comp_ctx *comp_c
msleep(aq->poll_interval);
}
if (comp_ctx->status == EFA_CMD_ABORTED) {
ibdev_err(aq->efa_dev, "Command was aborted\n");
atomic64_inc(&aq->stats.aborted_cmd);
err = -ENODEV;
goto out;
}
WARN_ONCE(comp_ctx->status != EFA_CMD_COMPLETED,
"Invalid completion status %d\n", comp_ctx->status);
err = efa_com_comp_status_to_errno(comp_ctx->comp_status);
out:
efa_com_put_comp_ctx(aq, comp_ctx);
@ -665,66 +655,6 @@ int efa_com_cmd_exec(struct efa_com_admin_queue *aq,
return err;
}
/**
* efa_com_abort_admin_commands - Abort all the outstanding admin commands.
* @edev: EFA communication layer struct
*
* This method aborts all the outstanding admin commands.
* The caller should then call efa_com_wait_for_abort_completion to make sure
* all the commands were completed.
*/
static void efa_com_abort_admin_commands(struct efa_com_dev *edev)
{
struct efa_com_admin_queue *aq = &edev->aq;
struct efa_comp_ctx *comp_ctx;
unsigned long flags;
u16 i;
spin_lock(&aq->sq.lock);
spin_lock_irqsave(&aq->cq.lock, flags);
for (i = 0; i < aq->depth; i++) {
comp_ctx = efa_com_get_comp_ctx(aq, i, false);
if (!comp_ctx)
break;
comp_ctx->status = EFA_CMD_ABORTED;
complete(&comp_ctx->wait_event);
}
spin_unlock_irqrestore(&aq->cq.lock, flags);
spin_unlock(&aq->sq.lock);
}
/**
* efa_com_wait_for_abort_completion - Wait for admin commands abort.
* @edev: EFA communication layer struct
*
* This method wait until all the outstanding admin commands will be completed.
*/
static void efa_com_wait_for_abort_completion(struct efa_com_dev *edev)
{
struct efa_com_admin_queue *aq = &edev->aq;
int i;
/* all mine */
for (i = 0; i < aq->depth; i++)
down(&aq->avail_cmds);
/* let it go */
for (i = 0; i < aq->depth; i++)
up(&aq->avail_cmds);
}
static void efa_com_admin_flush(struct efa_com_dev *edev)
{
struct efa_com_admin_queue *aq = &edev->aq;
clear_bit(EFA_AQ_STATE_RUNNING_BIT, &aq->state);
efa_com_abort_admin_commands(edev);
efa_com_wait_for_abort_completion(edev);
}
/**
* efa_com_admin_destroy - Destroy the admin and the async events queues.
* @edev: EFA communication layer struct
@ -737,7 +667,7 @@ void efa_com_admin_destroy(struct efa_com_dev *edev)
struct efa_com_admin_sq *sq = &aq->sq;
u16 size;
efa_com_admin_flush(edev);
clear_bit(EFA_AQ_STATE_RUNNING_BIT, &aq->state);
devm_kfree(edev->dmadev, aq->comp_ctx_pool);
devm_kfree(edev->dmadev, aq->comp_ctx);
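
The efa_com changes above make the admin command id carry two things at once: the low bits (depth - 1 mask) select the completion-context slot, while the upper bits come from the submission queue's producer counter, so a stale completion is unlikely to alias a live slot. A standalone sketch of the arithmetic; the mask width and sample values are assumptions, not the device ABI:

        #include <stdint.h>
        #include <stdio.h>

        #define CMD_ID_MASK 0xfffu      /* assumed width of the command_id field */

        int main(void)
        {
                uint16_t depth = 64;            /* admin queue depth (power of two) */
                uint16_t qmask = depth - 1;
                uint16_t pc = 0x1234;           /* producer counter at submit time */
                uint16_t ctx_id = 5;            /* allocated completion-context slot */
                uint16_t cmd_id;

                cmd_id = (ctx_id & qmask) | (pc & ~qmask);
                cmd_id &= CMD_ID_MASK;

                /* The completion side recovers the slot with the same low-bit mask */
                printf("cmd_id %#x -> ctx slot %u\n", cmd_id, cmd_id & qmask);
                return 0;
        }
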

View File

@ -45,7 +45,6 @@ struct efa_com_admin_sq {
/* Don't use anything other than atomic64 */
struct efa_com_stats_admin {
atomic64_t aborted_cmd;
atomic64_t submitted_cmd;
atomic64_t completed_cmd;
atomic64_t no_completion;

View File

@ -3,7 +3,6 @@
* Copyright 2018-2019 Amazon.com, Inc. or its affiliates. All rights reserved.
*/
#include "efa.h"
#include "efa_com.h"
#include "efa_com_cmd.h"
@ -57,7 +56,7 @@ int efa_com_create_qp(struct efa_com_dev *edev,
res->send_sub_cq_idx = cmd_completion.send_sub_cq_idx;
res->recv_sub_cq_idx = cmd_completion.recv_sub_cq_idx;
return err;
return 0;
}
int efa_com_modify_qp(struct efa_com_dev *edev,
@ -181,7 +180,7 @@ int efa_com_create_cq(struct efa_com_dev *edev,
result->cq_idx = cmd_completion.cq_idx;
result->actual_depth = params->cq_depth;
return err;
return 0;
}
int efa_com_destroy_cq(struct efa_com_dev *edev,
@ -307,7 +306,8 @@ int efa_com_create_ah(struct efa_com_dev *edev,
(struct efa_admin_acq_entry *)&cmd_completion,
sizeof(cmd_completion));
if (err) {
ibdev_err(edev->efa_dev, "Failed to create ah [%d]\n", err);
ibdev_err(edev->efa_dev, "Failed to create ah for %pI6 [%d]\n",
ah_cmd.dest_addr, err);
return err;
}

View File

@ -100,7 +100,7 @@ static int efa_request_mgmnt_irq(struct efa_dev *dev)
nr_cpumask_bits, &irq->affinity_hint_mask, irq->vector);
irq_set_affinity_hint(irq->vector, &irq->affinity_hint_mask);
return err;
return 0;
}
static void efa_setup_mgmnt_irq(struct efa_dev *dev)
@ -197,6 +197,10 @@ static void efa_stats_init(struct efa_dev *dev)
}
static const struct ib_device_ops efa_dev_ops = {
.owner = THIS_MODULE,
.driver_id = RDMA_DRIVER_EFA,
.uverbs_abi_ver = EFA_UVERBS_ABI_VERSION,
.alloc_pd = efa_alloc_pd,
.alloc_ucontext = efa_alloc_ucontext,
.create_ah = efa_create_ah,
@ -220,6 +224,7 @@ static const struct ib_device_ops efa_dev_ops = {
.reg_user_mr = efa_reg_mr,
INIT_RDMA_OBJ_SIZE(ib_ah, efa_ah, ibah),
INIT_RDMA_OBJ_SIZE(ib_cq, efa_cq, ibcq),
INIT_RDMA_OBJ_SIZE(ib_pd, efa_pd, ibpd),
INIT_RDMA_OBJ_SIZE(ib_ucontext, efa_ucontext, ibucontext),
};
@ -259,12 +264,10 @@ static int efa_ib_device_add(struct efa_dev *dev)
if (err)
goto err_release_doorbell_bar;
dev->ibdev.owner = THIS_MODULE;
dev->ibdev.node_type = RDMA_NODE_UNSPECIFIED;
dev->ibdev.phys_port_cnt = 1;
dev->ibdev.num_comp_vectors = 1;
dev->ibdev.dev.parent = &pdev->dev;
dev->ibdev.uverbs_abi_ver = EFA_UVERBS_ABI_VERSION;
dev->ibdev.uverbs_cmd_mask =
(1ull << IB_USER_VERBS_CMD_GET_CONTEXT) |
@ -287,7 +290,6 @@ static int efa_ib_device_add(struct efa_dev *dev)
dev->ibdev.uverbs_ex_cmd_mask =
(1ull << IB_USER_VERBS_EX_CMD_QUERY_DEVICE);
dev->ibdev.driver_id = RDMA_DRIVER_EFA;
ib_set_device_ops(&dev->ibdev, &efa_dev_ops);
err = ib_register_device(&dev->ibdev, "efa_%d");

View File

@ -447,12 +447,6 @@ void efa_dealloc_pd(struct ib_pd *ibpd, struct ib_udata *udata)
struct efa_dev *dev = to_edev(ibpd->device);
struct efa_pd *pd = to_epd(ibpd);
if (udata->inlen &&
!ib_is_udata_cleared(udata, 0, udata->inlen)) {
ibdev_dbg(&dev->ibdev, "Incompatible ABI params\n");
return;
}
ibdev_dbg(&dev->ibdev, "Dealloc pd[%d]\n", pd->pdn);
efa_pd_dealloc(dev, pd->pdn);
}
@ -470,12 +464,6 @@ int efa_destroy_qp(struct ib_qp *ibqp, struct ib_udata *udata)
struct efa_qp *qp = to_eqp(ibqp);
int err;
if (udata->inlen &&
!ib_is_udata_cleared(udata, 0, udata->inlen)) {
ibdev_dbg(&dev->ibdev, "Incompatible ABI params\n");
return -EINVAL;
}
ibdev_dbg(&dev->ibdev, "Destroy qp[%u]\n", ibqp->qp_num);
err = efa_destroy_qp_handle(dev, qp->qp_handle);
if (err)
@ -870,31 +858,18 @@ static int efa_destroy_cq_idx(struct efa_dev *dev, int cq_idx)
return efa_com_destroy_cq(&dev->edev, &params);
}
int efa_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata)
void efa_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata)
{
struct efa_dev *dev = to_edev(ibcq->device);
struct efa_cq *cq = to_ecq(ibcq);
int err;
if (udata->inlen &&
!ib_is_udata_cleared(udata, 0, udata->inlen)) {
ibdev_dbg(&dev->ibdev, "Incompatible ABI params\n");
return -EINVAL;
}
ibdev_dbg(&dev->ibdev,
"Destroy cq[%d] virt[0x%p] freed: size[%lu], dma[%pad]\n",
cq->cq_idx, cq->cpu_addr, cq->size, &cq->dma_addr);
err = efa_destroy_cq_idx(dev, cq->cq_idx);
if (err)
return err;
efa_destroy_cq_idx(dev, cq->cq_idx);
dma_unmap_single(&dev->pdev->dev, cq->dma_addr, cq->size,
DMA_FROM_DEVICE);
kfree(cq);
return 0;
}
static int cq_mmap_entries_setup(struct efa_dev *dev, struct efa_cq *cq,
@ -910,17 +885,20 @@ static int cq_mmap_entries_setup(struct efa_dev *dev, struct efa_cq *cq,
return 0;
}
static struct ib_cq *do_create_cq(struct ib_device *ibdev, int entries,
int vector, struct ib_ucontext *ibucontext,
struct ib_udata *udata)
int efa_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
struct ib_udata *udata)
{
struct efa_ucontext *ucontext = rdma_udata_to_drv_context(
udata, struct efa_ucontext, ibucontext);
struct efa_ibv_create_cq_resp resp = {};
struct efa_com_create_cq_params params;
struct efa_com_create_cq_result result;
struct ib_device *ibdev = ibcq->device;
struct efa_dev *dev = to_edev(ibdev);
struct efa_ibv_create_cq cmd = {};
struct efa_cq *cq = to_ecq(ibcq);
bool cq_entry_inserted = false;
struct efa_cq *cq;
int entries = attr->cqe;
int err;
ibdev_dbg(ibdev, "create_cq entries %d\n", entries);
@ -978,19 +956,13 @@ static struct ib_cq *do_create_cq(struct ib_device *ibdev, int entries,
goto err_out;
}
cq = kzalloc(sizeof(*cq), GFP_KERNEL);
if (!cq) {
err = -ENOMEM;
goto err_out;
}
cq->ucontext = to_eucontext(ibucontext);
cq->ucontext = ucontext;
cq->size = PAGE_ALIGN(cmd.cq_entry_size * entries * cmd.num_sub_cqs);
cq->cpu_addr = efa_zalloc_mapped(dev, &cq->dma_addr, cq->size,
DMA_FROM_DEVICE);
if (!cq->cpu_addr) {
err = -ENOMEM;
goto err_free_cq;
goto err_out;
}
params.uarn = cq->ucontext->uarn;
@ -1009,8 +981,8 @@ static struct ib_cq *do_create_cq(struct ib_device *ibdev, int entries,
err = cq_mmap_entries_setup(dev, cq, &resp);
if (err) {
ibdev_dbg(ibdev,
"Could not setup cq[%u] mmap entries\n", cq->cq_idx);
ibdev_dbg(ibdev, "Could not setup cq[%u] mmap entries\n",
cq->cq_idx);
goto err_destroy_cq;
}
@ -1026,11 +998,10 @@ static struct ib_cq *do_create_cq(struct ib_device *ibdev, int entries,
}
}
ibdev_dbg(ibdev,
"Created cq[%d], cq depth[%u]. dma[%pad] virt[0x%p]\n",
ibdev_dbg(ibdev, "Created cq[%d], cq depth[%u]. dma[%pad] virt[0x%p]\n",
cq->cq_idx, result.actual_depth, &cq->dma_addr, cq->cpu_addr);
return &cq->ibcq;
return 0;
err_destroy_cq:
efa_destroy_cq_idx(dev, cq->cq_idx);
@ -1039,23 +1010,9 @@ err_free_mapped:
DMA_FROM_DEVICE);
if (!cq_entry_inserted)
free_pages_exact(cq->cpu_addr, cq->size);
err_free_cq:
kfree(cq);
err_out:
atomic64_inc(&dev->stats.sw_stats.create_cq_err);
return ERR_PTR(err);
}
struct ib_cq *efa_create_cq(struct ib_device *ibdev,
const struct ib_cq_init_attr *attr,
struct ib_udata *udata)
{
struct efa_ucontext *ucontext = rdma_udata_to_drv_context(udata,
struct efa_ucontext,
ibucontext);
return do_create_cq(ibdev, attr->cqe, attr->comp_vector,
&ucontext->ibucontext, udata);
return err;
}
static int umem_to_page_list(struct efa_dev *dev,
@ -1065,21 +1022,15 @@ static int umem_to_page_list(struct efa_dev *dev,
u8 hp_shift)
{
u32 pages_in_hp = BIT(hp_shift - PAGE_SHIFT);
struct sg_dma_page_iter sg_iter;
unsigned int page_idx = 0;
struct ib_block_iter biter;
unsigned int hp_idx = 0;
ibdev_dbg(&dev->ibdev, "hp_cnt[%u], pages_in_hp[%u]\n",
hp_cnt, pages_in_hp);
for_each_sg_dma_page(umem->sg_head.sgl, &sg_iter, umem->nmap, 0) {
if (page_idx % pages_in_hp == 0) {
page_list[hp_idx] = sg_page_iter_dma_address(&sg_iter);
hp_idx++;
}
page_idx++;
}
rdma_for_each_block(umem->sg_head.sgl, &biter, umem->nmap,
BIT(hp_shift))
page_list[hp_idx++] = rdma_block_iter_dma_address(&biter);
return 0;
}
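
umem_to_page_list() above switches from hand-rolled page counting to the core's block iterator: rdma_for_each_block() walks a DMA-mapped scatterlist in fixed-size blocks and yields one block-aligned DMA address per iteration. A hedged sketch with an assumed helper name:

        #include <rdma/ib_umem.h>
        #include <rdma/ib_verbs.h>

        /* Collect up to 'max' block-aligned DMA addresses from a mapped umem */
        static unsigned int fill_block_list(struct ib_umem *umem, u64 *blocks,
                                            unsigned int max, unsigned long block_size)
        {
                struct ib_block_iter biter;
                unsigned int n = 0;

                rdma_for_each_block(umem->sg_head.sgl, &biter, umem->nmap, block_size) {
                        if (n == max)
                                break;
                        blocks[n++] = rdma_block_iter_dma_address(&biter);
                }
                return n;
        }
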
@ -1114,14 +1065,14 @@ err:
*/
static int pbl_chunk_list_create(struct efa_dev *dev, struct pbl_context *pbl)
{
unsigned int entry, payloads_in_sg, chunk_list_size, chunk_idx, payload_idx;
struct pbl_chunk_list *chunk_list = &pbl->phys.indirect.chunk_list;
int page_cnt = pbl->phys.indirect.pbl_buf_size_in_pages;
struct scatterlist *pages_sgl = pbl->phys.indirect.sgl;
unsigned int chunk_list_size, chunk_idx, payload_idx;
int sg_dma_cnt = pbl->phys.indirect.sg_dma_cnt;
struct efa_com_ctrl_buff_info *ctrl_buf;
u64 *cur_chunk_buf, *prev_chunk_buf;
struct scatterlist *sg;
struct ib_block_iter biter;
dma_addr_t dma_addr;
int i;
@ -1155,18 +1106,15 @@ static int pbl_chunk_list_create(struct efa_dev *dev, struct pbl_context *pbl)
chunk_idx = 0;
payload_idx = 0;
cur_chunk_buf = chunk_list->chunks[0].buf;
for_each_sg(pages_sgl, sg, sg_dma_cnt, entry) {
payloads_in_sg = sg_dma_len(sg) >> EFA_CHUNK_PAYLOAD_SHIFT;
for (i = 0; i < payloads_in_sg; i++) {
cur_chunk_buf[payload_idx++] =
(sg_dma_address(sg) & ~(EFA_CHUNK_PAYLOAD_SIZE - 1)) +
(EFA_CHUNK_PAYLOAD_SIZE * i);
rdma_for_each_block(pages_sgl, &biter, sg_dma_cnt,
EFA_CHUNK_PAYLOAD_SIZE) {
cur_chunk_buf[payload_idx++] =
rdma_block_iter_dma_address(&biter);
if (payload_idx == EFA_PTRS_PER_CHUNK) {
chunk_idx++;
cur_chunk_buf = chunk_list->chunks[chunk_idx].buf;
payload_idx = 0;
}
if (payload_idx == EFA_PTRS_PER_CHUNK) {
chunk_idx++;
cur_chunk_buf = chunk_list->chunks[chunk_idx].buf;
payload_idx = 0;
}
}
@ -1314,30 +1262,30 @@ static int pbl_create(struct efa_dev *dev,
int err;
pbl->pbl_buf_size_in_bytes = hp_cnt * EFA_CHUNK_PAYLOAD_PTR_SIZE;
pbl->pbl_buf = kzalloc(pbl->pbl_buf_size_in_bytes,
GFP_KERNEL | __GFP_NOWARN);
if (pbl->pbl_buf) {
pbl->pbl_buf = kvzalloc(pbl->pbl_buf_size_in_bytes, GFP_KERNEL);
if (!pbl->pbl_buf)
return -ENOMEM;
if (is_vmalloc_addr(pbl->pbl_buf)) {
pbl->physically_continuous = 0;
err = umem_to_page_list(dev, umem, pbl->pbl_buf, hp_cnt,
hp_shift);
if (err)
goto err_free;
err = pbl_indirect_initialize(dev, pbl);
if (err)
goto err_free;
} else {
pbl->physically_continuous = 1;
err = umem_to_page_list(dev, umem, pbl->pbl_buf, hp_cnt,
hp_shift);
if (err)
goto err_continuous;
goto err_free;
err = pbl_continuous_initialize(dev, pbl);
if (err)
goto err_continuous;
} else {
pbl->physically_continuous = 0;
pbl->pbl_buf = vzalloc(pbl->pbl_buf_size_in_bytes);
if (!pbl->pbl_buf)
return -ENOMEM;
err = umem_to_page_list(dev, umem, pbl->pbl_buf, hp_cnt,
hp_shift);
if (err)
goto err_indirect;
err = pbl_indirect_initialize(dev, pbl);
if (err)
goto err_indirect;
goto err_free;
}
ibdev_dbg(&dev->ibdev,
@ -1346,24 +1294,20 @@ static int pbl_create(struct efa_dev *dev,
return 0;
err_continuous:
kfree(pbl->pbl_buf);
return err;
err_indirect:
vfree(pbl->pbl_buf);
err_free:
kvfree(pbl->pbl_buf);
return err;
}
static void pbl_destroy(struct efa_dev *dev, struct pbl_context *pbl)
{
if (pbl->physically_continuous) {
if (pbl->physically_continuous)
dma_unmap_single(&dev->pdev->dev, pbl->phys.continuous.dma_addr,
pbl->pbl_buf_size_in_bytes, DMA_TO_DEVICE);
kfree(pbl->pbl_buf);
} else {
else
pbl_indirect_terminate(dev, pbl);
vfree(pbl->pbl_buf);
}
kvfree(pbl->pbl_buf);
}
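
pbl_create()/pbl_destroy() above collapse the "try kzalloc, fall back to vzalloc" dance into kvzalloc()/kvfree(), using is_vmalloc_addr() afterwards to decide whether the buffer is physically contiguous. A minimal sketch of that allocation pattern; the helper names are made up:

        #include <linux/mm.h>
        #include <linux/slab.h>

        static void *alloc_pbl_buf(size_t size, bool *is_contiguous)
        {
                void *buf = kvzalloc(size, GFP_KERNEL);

                if (!buf)
                        return NULL;
                /* kmalloc memory is physically contiguous, vmalloc memory is not */
                *is_contiguous = !is_vmalloc_addr(buf);
                return buf;
        }

        static void free_pbl_buf(void *buf)
        {
                kvfree(buf);    /* handles both kmalloc and vmalloc memory */
        }
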
static int efa_create_inline_pbl(struct efa_dev *dev, struct efa_mr *mr,
@ -1417,56 +1361,6 @@ static int efa_create_pbl(struct efa_dev *dev,
return 0;
}
static void efa_cont_pages(struct ib_umem *umem, u64 addr,
unsigned long max_page_shift,
int *count, u8 *shift, u32 *ncont)
{
struct scatterlist *sg;
u64 base = ~0, p = 0;
unsigned long tmp;
unsigned long m;
u64 len, pfn;
int i = 0;
int entry;
addr = addr >> PAGE_SHIFT;
tmp = (unsigned long)addr;
m = find_first_bit(&tmp, BITS_PER_LONG);
if (max_page_shift)
m = min_t(unsigned long, max_page_shift - PAGE_SHIFT, m);
for_each_sg(umem->sg_head.sgl, sg, umem->nmap, entry) {
len = DIV_ROUND_UP(sg_dma_len(sg), PAGE_SIZE);
pfn = sg_dma_address(sg) >> PAGE_SHIFT;
if (base + p != pfn) {
/*
* If either the offset or the new
* base are unaligned update m
*/
tmp = (unsigned long)(pfn | p);
if (!IS_ALIGNED(tmp, 1 << m))
m = find_first_bit(&tmp, BITS_PER_LONG);
base = pfn;
p = 0;
}
p += len;
i += len;
}
if (i) {
m = min_t(unsigned long, ilog2(roundup_pow_of_two(i)), m);
*ncont = DIV_ROUND_UP(i, (1 << m));
} else {
m = 0;
*ncont = 0;
}
*shift = PAGE_SHIFT + m;
*count = i;
}
struct ib_mr *efa_reg_mr(struct ib_pd *ibpd, u64 start, u64 length,
u64 virt_addr, int access_flags,
struct ib_udata *udata)
@ -1474,11 +1368,10 @@ struct ib_mr *efa_reg_mr(struct ib_pd *ibpd, u64 start, u64 length,
struct efa_dev *dev = to_edev(ibpd->device);
struct efa_com_reg_mr_params params = {};
struct efa_com_reg_mr_result result = {};
unsigned long max_page_shift;
struct pbl_context pbl;
unsigned int pg_sz;
struct efa_mr *mr;
int inline_size;
int npages;
int err;
if (udata->inlen &&
@ -1515,13 +1408,24 @@ struct ib_mr *efa_reg_mr(struct ib_pd *ibpd, u64 start, u64 length,
params.iova = virt_addr;
params.mr_length_in_bytes = length;
params.permissions = access_flags & 0x1;
max_page_shift = fls64(dev->dev_attr.page_size_cap);
efa_cont_pages(mr->umem, start, max_page_shift, &npages,
&params.page_shift, &params.page_num);
pg_sz = ib_umem_find_best_pgsz(mr->umem,
dev->dev_attr.page_size_cap,
virt_addr);
if (!pg_sz) {
err = -EOPNOTSUPP;
ibdev_dbg(&dev->ibdev, "Failed to find a suitable page size in page_size_cap %#llx\n",
dev->dev_attr.page_size_cap);
goto err_unmap;
}
params.page_shift = __ffs(pg_sz);
params.page_num = DIV_ROUND_UP(length + (start & (pg_sz - 1)),
pg_sz);
ibdev_dbg(&dev->ibdev,
"start %#llx length %#llx npages %d params.page_shift %u params.page_num %u\n",
start, length, npages, params.page_shift, params.page_num);
"start %#llx length %#llx params.page_shift %u params.page_num %u\n",
start, length, params.page_shift, params.page_num);
inline_size = ARRAY_SIZE(params.pbl.inline_pbl_array);
if (params.page_num <= inline_size) {
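
efa_reg_mr() above drops the driver-local efa_cont_pages() in favor of the core helper: ib_umem_find_best_pgsz() returns the largest page size from the supplied bitmap that can tile the umem for the given IOVA, or 0 if none fits. A sketch of the calling convention with an assumed wrapper name:

        #include <linux/bitops.h>
        #include <linux/errno.h>
        #include <rdma/ib_umem.h>

        static int pick_mr_page_shift(struct ib_umem *umem, unsigned long pgsz_bitmap,
                                      unsigned long iova, unsigned int *page_shift)
        {
                unsigned long pg_sz = ib_umem_find_best_pgsz(umem, pgsz_bitmap, iova);

                if (!pg_sz)
                        return -EOPNOTSUPP;     /* nothing in the bitmap tiles the MR */
                *page_shift = __ffs(pg_sz);     /* pg_sz is a power of two */
                return 0;
        }
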
@ -1567,12 +1471,6 @@ int efa_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata)
struct efa_mr *mr = to_emr(ibmr);
int err;
if (udata->inlen &&
!ib_is_udata_cleared(udata, 0, udata->inlen)) {
ibdev_dbg(&dev->ibdev, "Incompatible ABI params\n");
return -EINVAL;
}
ibdev_dbg(&dev->ibdev, "Deregister mr[%d]\n", ibmr->lkey);
if (mr->umem) {
@ -1580,8 +1478,8 @@ int efa_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata)
err = efa_com_dereg_mr(&dev->edev, &params);
if (err)
return err;
ib_umem_release(mr->umem);
}
ib_umem_release(mr->umem);
kfree(mr);
@ -1707,13 +1605,15 @@ static int __efa_mmap(struct efa_dev *dev, struct efa_ucontext *ucontext,
err = -EINVAL;
}
if (err)
if (err) {
ibdev_dbg(
&dev->ibdev,
"Couldn't mmap address[%#llx] length[%#llx] mmap_flag[%d] err[%d]\n",
entry->address, length, entry->mmap_flag, err);
return err;
}
return err;
return 0;
}
int efa_mmap(struct ib_ucontext *ibucontext,

View File

@ -10,6 +10,7 @@ obj-$(CONFIG_INFINIBAND_HFI1) += hfi1.o
hfi1-y := \
affinity.o \
aspm.o \
chip.o \
device.o \
driver.o \

View File

@ -0,0 +1,270 @@
// SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause)
/*
* Copyright(c) 2019 Intel Corporation.
*
*/
#include "aspm.h"
/* Time after which the timer interrupt will re-enable ASPM */
#define ASPM_TIMER_MS 1000
/* Time for which interrupts are ignored after a timer has been scheduled */
#define ASPM_RESCHED_TIMER_MS (ASPM_TIMER_MS / 2)
/* Two interrupts within this time trigger ASPM disable */
#define ASPM_TRIGGER_MS 1
#define ASPM_TRIGGER_NS (ASPM_TRIGGER_MS * 1000 * 1000ull)
#define ASPM_L1_SUPPORTED(reg) \
((((reg) & PCI_EXP_LNKCAP_ASPMS) >> 10) & 0x2)
uint aspm_mode = ASPM_MODE_DISABLED;
module_param_named(aspm, aspm_mode, uint, 0444);
MODULE_PARM_DESC(aspm, "PCIe ASPM: 0: disable, 1: enable, 2: dynamic");
static bool aspm_hw_l1_supported(struct hfi1_devdata *dd)
{
struct pci_dev *parent = dd->pcidev->bus->self;
u32 up, dn;
/*
* If the driver does not have access to the upstream component,
* it cannot support ASPM L1 at all.
*/
if (!parent)
return false;
pcie_capability_read_dword(dd->pcidev, PCI_EXP_LNKCAP, &dn);
dn = ASPM_L1_SUPPORTED(dn);
pcie_capability_read_dword(parent, PCI_EXP_LNKCAP, &up);
up = ASPM_L1_SUPPORTED(up);
/* ASPM works on A-step but is reported as not supported */
return (!!dn || is_ax(dd)) && !!up;
}
/* Set L1 entrance latency for slower entry to L1 */
static void aspm_hw_set_l1_ent_latency(struct hfi1_devdata *dd)
{
u32 l1_ent_lat = 0x4u;
u32 reg32;
pci_read_config_dword(dd->pcidev, PCIE_CFG_REG_PL3, &reg32);
reg32 &= ~PCIE_CFG_REG_PL3_L1_ENT_LATENCY_SMASK;
reg32 |= l1_ent_lat << PCIE_CFG_REG_PL3_L1_ENT_LATENCY_SHIFT;
pci_write_config_dword(dd->pcidev, PCIE_CFG_REG_PL3, reg32);
}
static void aspm_hw_enable_l1(struct hfi1_devdata *dd)
{
struct pci_dev *parent = dd->pcidev->bus->self;
/*
* If the driver does not have access to the upstream component,
* it cannot support ASPM L1 at all.
*/
if (!parent)
return;
/* Enable ASPM L1 first in upstream component and then downstream */
pcie_capability_clear_and_set_word(parent, PCI_EXP_LNKCTL,
PCI_EXP_LNKCTL_ASPMC,
PCI_EXP_LNKCTL_ASPM_L1);
pcie_capability_clear_and_set_word(dd->pcidev, PCI_EXP_LNKCTL,
PCI_EXP_LNKCTL_ASPMC,
PCI_EXP_LNKCTL_ASPM_L1);
}
void aspm_hw_disable_l1(struct hfi1_devdata *dd)
{
struct pci_dev *parent = dd->pcidev->bus->self;
/* Disable ASPM L1 first in downstream component and then upstream */
pcie_capability_clear_and_set_word(dd->pcidev, PCI_EXP_LNKCTL,
PCI_EXP_LNKCTL_ASPMC, 0x0);
if (parent)
pcie_capability_clear_and_set_word(parent, PCI_EXP_LNKCTL,
PCI_EXP_LNKCTL_ASPMC, 0x0);
}
static void aspm_enable(struct hfi1_devdata *dd)
{
if (dd->aspm_enabled || aspm_mode == ASPM_MODE_DISABLED ||
!dd->aspm_supported)
return;
aspm_hw_enable_l1(dd);
dd->aspm_enabled = true;
}
static void aspm_disable(struct hfi1_devdata *dd)
{
if (!dd->aspm_enabled || aspm_mode == ASPM_MODE_ENABLED)
return;
aspm_hw_disable_l1(dd);
dd->aspm_enabled = false;
}
static void aspm_disable_inc(struct hfi1_devdata *dd)
{
unsigned long flags;
spin_lock_irqsave(&dd->aspm_lock, flags);
aspm_disable(dd);
atomic_inc(&dd->aspm_disabled_cnt);
spin_unlock_irqrestore(&dd->aspm_lock, flags);
}
static void aspm_enable_dec(struct hfi1_devdata *dd)
{
unsigned long flags;
spin_lock_irqsave(&dd->aspm_lock, flags);
if (atomic_dec_and_test(&dd->aspm_disabled_cnt))
aspm_enable(dd);
spin_unlock_irqrestore(&dd->aspm_lock, flags);
}
/* ASPM processing for each receive context interrupt */
void __aspm_ctx_disable(struct hfi1_ctxtdata *rcd)
{
bool restart_timer;
bool close_interrupts;
unsigned long flags;
ktime_t now, prev;
spin_lock_irqsave(&rcd->aspm_lock, flags);
/* PSM contexts are open */
if (!rcd->aspm_intr_enable)
goto unlock;
prev = rcd->aspm_ts_last_intr;
now = ktime_get();
rcd->aspm_ts_last_intr = now;
/* An interrupt pair close together in time */
close_interrupts = ktime_to_ns(ktime_sub(now, prev)) < ASPM_TRIGGER_NS;
/* Don't push out our timer till this much time has elapsed */
restart_timer = ktime_to_ns(ktime_sub(now, rcd->aspm_ts_timer_sched)) >
ASPM_RESCHED_TIMER_MS * NSEC_PER_MSEC;
restart_timer = restart_timer && close_interrupts;
/* Disable ASPM and schedule timer */
if (rcd->aspm_enabled && close_interrupts) {
aspm_disable_inc(rcd->dd);
rcd->aspm_enabled = false;
restart_timer = true;
}
if (restart_timer) {
mod_timer(&rcd->aspm_timer,
jiffies + msecs_to_jiffies(ASPM_TIMER_MS));
rcd->aspm_ts_timer_sched = now;
}
unlock:
spin_unlock_irqrestore(&rcd->aspm_lock, flags);
}
/* Timer function for re-enabling ASPM in the absence of interrupt activity */
static void aspm_ctx_timer_function(struct timer_list *t)
{
struct hfi1_ctxtdata *rcd = from_timer(rcd, t, aspm_timer);
unsigned long flags;
spin_lock_irqsave(&rcd->aspm_lock, flags);
aspm_enable_dec(rcd->dd);
rcd->aspm_enabled = true;
spin_unlock_irqrestore(&rcd->aspm_lock, flags);
}
/*
* Disable interrupt processing for verbs contexts when PSM or VNIC contexts
* are open.
*/
void aspm_disable_all(struct hfi1_devdata *dd)
{
struct hfi1_ctxtdata *rcd;
unsigned long flags;
u16 i;
for (i = 0; i < dd->first_dyn_alloc_ctxt; i++) {
rcd = hfi1_rcd_get_by_index(dd, i);
if (rcd) {
del_timer_sync(&rcd->aspm_timer);
spin_lock_irqsave(&rcd->aspm_lock, flags);
rcd->aspm_intr_enable = false;
spin_unlock_irqrestore(&rcd->aspm_lock, flags);
hfi1_rcd_put(rcd);
}
}
aspm_disable(dd);
atomic_set(&dd->aspm_disabled_cnt, 0);
}
/* Re-enable interrupt processing for verbs contexts */
void aspm_enable_all(struct hfi1_devdata *dd)
{
struct hfi1_ctxtdata *rcd;
unsigned long flags;
u16 i;
aspm_enable(dd);
if (aspm_mode != ASPM_MODE_DYNAMIC)
return;
for (i = 0; i < dd->first_dyn_alloc_ctxt; i++) {
rcd = hfi1_rcd_get_by_index(dd, i);
if (rcd) {
spin_lock_irqsave(&rcd->aspm_lock, flags);
rcd->aspm_intr_enable = true;
rcd->aspm_enabled = true;
spin_unlock_irqrestore(&rcd->aspm_lock, flags);
hfi1_rcd_put(rcd);
}
}
}
static void aspm_ctx_init(struct hfi1_ctxtdata *rcd)
{
spin_lock_init(&rcd->aspm_lock);
timer_setup(&rcd->aspm_timer, aspm_ctx_timer_function, 0);
rcd->aspm_intr_supported = rcd->dd->aspm_supported &&
aspm_mode == ASPM_MODE_DYNAMIC &&
rcd->ctxt < rcd->dd->first_dyn_alloc_ctxt;
}
void aspm_init(struct hfi1_devdata *dd)
{
struct hfi1_ctxtdata *rcd;
u16 i;
spin_lock_init(&dd->aspm_lock);
dd->aspm_supported = aspm_hw_l1_supported(dd);
for (i = 0; i < dd->first_dyn_alloc_ctxt; i++) {
rcd = hfi1_rcd_get_by_index(dd, i);
if (rcd)
aspm_ctx_init(rcd);
hfi1_rcd_put(rcd);
}
/* Start with ASPM disabled */
aspm_hw_set_l1_ent_latency(dd);
dd->aspm_enabled = false;
aspm_hw_disable_l1(dd);
/* Now turn on ASPM if configured */
aspm_enable_all(dd);
}
void aspm_exit(struct hfi1_devdata *dd)
{
aspm_disable_all(dd);
/* Turn on ASPM on exit to conserve power */
aspm_enable(dd);
}

View File

@ -57,266 +57,20 @@ enum aspm_mode {
ASPM_MODE_DYNAMIC = 2, /* ASPM enabled/disabled dynamically */
};
/* Time after which the timer interrupt will re-enable ASPM */
#define ASPM_TIMER_MS 1000
/* Time for which interrupts are ignored after a timer has been scheduled */
#define ASPM_RESCHED_TIMER_MS (ASPM_TIMER_MS / 2)
/* Two interrupts within this time trigger ASPM disable */
#define ASPM_TRIGGER_MS 1
#define ASPM_TRIGGER_NS (ASPM_TRIGGER_MS * 1000 * 1000ull)
#define ASPM_L1_SUPPORTED(reg) \
(((reg & PCI_EXP_LNKCAP_ASPMS) >> 10) & 0x2)
void aspm_init(struct hfi1_devdata *dd);
void aspm_exit(struct hfi1_devdata *dd);
void aspm_hw_disable_l1(struct hfi1_devdata *dd);
void __aspm_ctx_disable(struct hfi1_ctxtdata *rcd);
void aspm_disable_all(struct hfi1_devdata *dd);
void aspm_enable_all(struct hfi1_devdata *dd);
static inline bool aspm_hw_l1_supported(struct hfi1_devdata *dd)
{
struct pci_dev *parent = dd->pcidev->bus->self;
u32 up, dn;
/*
* If the driver does not have access to the upstream component,
* it cannot support ASPM L1 at all.
*/
if (!parent)
return false;
pcie_capability_read_dword(dd->pcidev, PCI_EXP_LNKCAP, &dn);
dn = ASPM_L1_SUPPORTED(dn);
pcie_capability_read_dword(parent, PCI_EXP_LNKCAP, &up);
up = ASPM_L1_SUPPORTED(up);
/* ASPM works on A-step but is reported as not supported */
return (!!dn || is_ax(dd)) && !!up;
}
/* Set L1 entrance latency for slower entry to L1 */
static inline void aspm_hw_set_l1_ent_latency(struct hfi1_devdata *dd)
{
u32 l1_ent_lat = 0x4u;
u32 reg32;
pci_read_config_dword(dd->pcidev, PCIE_CFG_REG_PL3, &reg32);
reg32 &= ~PCIE_CFG_REG_PL3_L1_ENT_LATENCY_SMASK;
reg32 |= l1_ent_lat << PCIE_CFG_REG_PL3_L1_ENT_LATENCY_SHIFT;
pci_write_config_dword(dd->pcidev, PCIE_CFG_REG_PL3, reg32);
}
static inline void aspm_hw_enable_l1(struct hfi1_devdata *dd)
{
struct pci_dev *parent = dd->pcidev->bus->self;
/*
* If the driver does not have access to the upstream component,
* it cannot support ASPM L1 at all.
*/
if (!parent)
return;
/* Enable ASPM L1 first in upstream component and then downstream */
pcie_capability_clear_and_set_word(parent, PCI_EXP_LNKCTL,
PCI_EXP_LNKCTL_ASPMC,
PCI_EXP_LNKCTL_ASPM_L1);
pcie_capability_clear_and_set_word(dd->pcidev, PCI_EXP_LNKCTL,
PCI_EXP_LNKCTL_ASPMC,
PCI_EXP_LNKCTL_ASPM_L1);
}
static inline void aspm_hw_disable_l1(struct hfi1_devdata *dd)
{
struct pci_dev *parent = dd->pcidev->bus->self;
/* Disable ASPM L1 first in downstream component and then upstream */
pcie_capability_clear_and_set_word(dd->pcidev, PCI_EXP_LNKCTL,
PCI_EXP_LNKCTL_ASPMC, 0x0);
if (parent)
pcie_capability_clear_and_set_word(parent, PCI_EXP_LNKCTL,
PCI_EXP_LNKCTL_ASPMC, 0x0);
}
static inline void aspm_enable(struct hfi1_devdata *dd)
{
if (dd->aspm_enabled || aspm_mode == ASPM_MODE_DISABLED ||
!dd->aspm_supported)
return;
aspm_hw_enable_l1(dd);
dd->aspm_enabled = true;
}
static inline void aspm_disable(struct hfi1_devdata *dd)
{
if (!dd->aspm_enabled || aspm_mode == ASPM_MODE_ENABLED)
return;
aspm_hw_disable_l1(dd);
dd->aspm_enabled = false;
}
static inline void aspm_disable_inc(struct hfi1_devdata *dd)
{
unsigned long flags;
spin_lock_irqsave(&dd->aspm_lock, flags);
aspm_disable(dd);
atomic_inc(&dd->aspm_disabled_cnt);
spin_unlock_irqrestore(&dd->aspm_lock, flags);
}
static inline void aspm_enable_dec(struct hfi1_devdata *dd)
{
unsigned long flags;
spin_lock_irqsave(&dd->aspm_lock, flags);
if (atomic_dec_and_test(&dd->aspm_disabled_cnt))
aspm_enable(dd);
spin_unlock_irqrestore(&dd->aspm_lock, flags);
}
/* ASPM processing for each receive context interrupt */
static inline void aspm_ctx_disable(struct hfi1_ctxtdata *rcd)
{
bool restart_timer;
bool close_interrupts;
unsigned long flags;
ktime_t now, prev;
/* Quickest exit for minimum impact */
if (!rcd->aspm_intr_supported)
if (likely(!rcd->aspm_intr_supported))
return;
spin_lock_irqsave(&rcd->aspm_lock, flags);
/* PSM contexts are open */
if (!rcd->aspm_intr_enable)
goto unlock;
prev = rcd->aspm_ts_last_intr;
now = ktime_get();
rcd->aspm_ts_last_intr = now;
/* An interrupt pair close together in time */
close_interrupts = ktime_to_ns(ktime_sub(now, prev)) < ASPM_TRIGGER_NS;
/* Don't push out our timer till this much time has elapsed */
restart_timer = ktime_to_ns(ktime_sub(now, rcd->aspm_ts_timer_sched)) >
ASPM_RESCHED_TIMER_MS * NSEC_PER_MSEC;
restart_timer = restart_timer && close_interrupts;
/* Disable ASPM and schedule timer */
if (rcd->aspm_enabled && close_interrupts) {
aspm_disable_inc(rcd->dd);
rcd->aspm_enabled = false;
restart_timer = true;
}
if (restart_timer) {
mod_timer(&rcd->aspm_timer,
jiffies + msecs_to_jiffies(ASPM_TIMER_MS));
rcd->aspm_ts_timer_sched = now;
}
unlock:
spin_unlock_irqrestore(&rcd->aspm_lock, flags);
}
/* Timer function for re-enabling ASPM in the absence of interrupt activity */
static inline void aspm_ctx_timer_function(struct timer_list *t)
{
struct hfi1_ctxtdata *rcd = from_timer(rcd, t, aspm_timer);
unsigned long flags;
spin_lock_irqsave(&rcd->aspm_lock, flags);
aspm_enable_dec(rcd->dd);
rcd->aspm_enabled = true;
spin_unlock_irqrestore(&rcd->aspm_lock, flags);
}
/*
* Disable interrupt processing for verbs contexts when PSM or VNIC contexts
* are open.
*/
static inline void aspm_disable_all(struct hfi1_devdata *dd)
{
struct hfi1_ctxtdata *rcd;
unsigned long flags;
u16 i;
for (i = 0; i < dd->first_dyn_alloc_ctxt; i++) {
rcd = hfi1_rcd_get_by_index(dd, i);
if (rcd) {
del_timer_sync(&rcd->aspm_timer);
spin_lock_irqsave(&rcd->aspm_lock, flags);
rcd->aspm_intr_enable = false;
spin_unlock_irqrestore(&rcd->aspm_lock, flags);
hfi1_rcd_put(rcd);
}
}
aspm_disable(dd);
atomic_set(&dd->aspm_disabled_cnt, 0);
}
/* Re-enable interrupt processing for verbs contexts */
static inline void aspm_enable_all(struct hfi1_devdata *dd)
{
struct hfi1_ctxtdata *rcd;
unsigned long flags;
u16 i;
aspm_enable(dd);
if (aspm_mode != ASPM_MODE_DYNAMIC)
return;
for (i = 0; i < dd->first_dyn_alloc_ctxt; i++) {
rcd = hfi1_rcd_get_by_index(dd, i);
if (rcd) {
spin_lock_irqsave(&rcd->aspm_lock, flags);
rcd->aspm_intr_enable = true;
rcd->aspm_enabled = true;
spin_unlock_irqrestore(&rcd->aspm_lock, flags);
hfi1_rcd_put(rcd);
}
}
}
static inline void aspm_ctx_init(struct hfi1_ctxtdata *rcd)
{
spin_lock_init(&rcd->aspm_lock);
timer_setup(&rcd->aspm_timer, aspm_ctx_timer_function, 0);
rcd->aspm_intr_supported = rcd->dd->aspm_supported &&
aspm_mode == ASPM_MODE_DYNAMIC &&
rcd->ctxt < rcd->dd->first_dyn_alloc_ctxt;
}
static inline void aspm_init(struct hfi1_devdata *dd)
{
struct hfi1_ctxtdata *rcd;
u16 i;
spin_lock_init(&dd->aspm_lock);
dd->aspm_supported = aspm_hw_l1_supported(dd);
for (i = 0; i < dd->first_dyn_alloc_ctxt; i++) {
rcd = hfi1_rcd_get_by_index(dd, i);
if (rcd)
aspm_ctx_init(rcd);
hfi1_rcd_put(rcd);
}
/* Start with ASPM disabled */
aspm_hw_set_l1_ent_latency(dd);
dd->aspm_enabled = false;
aspm_hw_disable_l1(dd);
/* Now turn on ASPM if configured */
aspm_enable_all(dd);
}
static inline void aspm_exit(struct hfi1_devdata *dd)
{
aspm_disable_all(dd);
/* Turn on ASPM on exit to conserve power */
aspm_enable(dd);
__aspm_ctx_disable(rcd);
}
#endif /* _ASPM_H */
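
The aspm rework above is mostly code motion: the function bodies move from aspm.h into the new aspm.c, and the header keeps only a tiny inline aspm_ctx_disable() that takes the likely() early exit and calls the out-of-line __aspm_ctx_disable() otherwise. The general shape of that inline-fast-path/out-of-line-slow-path split, with made-up names:

        #include <linux/compiler.h>
        #include <linux/types.h>

        struct my_ctx {
                bool feature_supported;
                /* ... */
        };

        /* slow path, implemented out of line in the .c file */
        void __my_ctx_disable(struct my_ctx *ctx);

        /* fast path kept inline in the header: one predictable branch per interrupt */
        static inline void my_ctx_disable(struct my_ctx *ctx)
        {
                if (likely(!ctx->feature_supported))
                        return;
                __my_ctx_disable(ctx);
        }
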

View File

@ -987,9 +987,6 @@ static int __i2c_debugfs_open(struct inode *in, struct file *fp, u32 target)
struct hfi1_pportdata *ppd;
int ret;
if (!try_module_get(THIS_MODULE))
return -ENODEV;
ppd = private2ppd(fp);
ret = acquire_chip_resource(ppd->dd, i2c_target(target), 0);
@ -1155,6 +1152,7 @@ static int exprom_wp_debugfs_release(struct inode *in, struct file *fp)
{ \
.name = nm, \
.ops = { \
.owner = THIS_MODULE, \
.read = readroutine, \
.write = writeroutine, \
.llseek = generic_file_llseek, \
@ -1165,6 +1163,7 @@ static int exprom_wp_debugfs_release(struct inode *in, struct file *fp)
{ \
.name = nm, \
.ops = { \
.owner = THIS_MODULE, \
.read = readf, \
.write = writef, \
.llseek = generic_file_llseek, \

View File

@ -2744,8 +2744,7 @@ static int pma_get_opa_portstatus(struct opa_pma_mad *pmp,
u16 link_width;
u16 link_speed;
response_data_size = sizeof(struct opa_port_status_rsp) +
num_vls * sizeof(struct _vls_pctrs);
response_data_size = struct_size(rsp, vls, num_vls);
if (response_data_size > sizeof(pmp->data)) {
pmp->mad_hdr.status |= OPA_PM_STATUS_REQUEST_TOO_LARGE;
return reply((struct ib_mad_hdr *)pmp);
@ -3014,8 +3013,7 @@ static int pma_get_opa_datacounters(struct opa_pma_mad *pmp,
}
/* Sanity check */
response_data_size = sizeof(struct opa_port_data_counters_msg) +
num_vls * sizeof(struct _vls_dctrs);
response_data_size = struct_size(req, port[0].vls, num_vls);
if (response_data_size > sizeof(pmp->data)) {
pmp->mad_hdr.status |= IB_SMP_INVALID_FIELD;
@ -3232,8 +3230,7 @@ static int pma_get_opa_porterrors(struct opa_pma_mad *pmp,
return reply((struct ib_mad_hdr *)pmp);
}
response_data_size = sizeof(struct opa_port_error_counters64_msg) +
num_vls * sizeof(struct _vls_ectrs);
response_data_size = struct_size(req, port[0].vls, num_vls);
if (response_data_size > sizeof(pmp->data)) {
pmp->mad_hdr.status |= IB_SMP_INVALID_FIELD;
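
The mad.c hunks above replace open-coded "sizeof(header) + n * sizeof(element)" response sizing with struct_size(), which does the same arithmetic for a struct ending in a flexible array member and saturates instead of wrapping on overflow. A small sketch; the struct and field names are illustrative only:

        #include <linux/overflow.h>
        #include <linux/slab.h>
        #include <linux/types.h>

        struct demo_rsp {
                u32 num_vls;
                u64 vls[];              /* flexible array member */
        };

        static struct demo_rsp *alloc_demo_rsp(u32 num_vls)
        {
                struct demo_rsp *rsp;

                /* struct_size(rsp, vls, num_vls) ==
                 *   sizeof(*rsp) + num_vls * sizeof(rsp->vls[0]),
                 * saturating to SIZE_MAX rather than wrapping on overflow.
                 */
                rsp = kzalloc(struct_size(rsp, vls, num_vls), GFP_KERNEL);
                if (rsp)
                        rsp->num_vls = num_vls;
                return rsp;
        }
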

View File

@ -1,5 +1,5 @@
/*
* Copyright(c) 2015 - 2018 Intel Corporation.
* Copyright(c) 2015 - 2019 Intel Corporation.
*
* This file is provided under a dual BSD/GPLv2 license. When using or
* redistributing this file, you may do so under either license.
@ -450,10 +450,6 @@ static int hfi1_pcie_caps;
module_param_named(pcie_caps, hfi1_pcie_caps, int, 0444);
MODULE_PARM_DESC(pcie_caps, "Max PCIe tuning: Payload (0..3), ReadReq (4..7)");
uint aspm_mode = ASPM_MODE_DISABLED;
module_param_named(aspm, aspm_mode, uint, 0444);
MODULE_PARM_DESC(aspm, "PCIe ASPM: 0: disable, 1: enable, 2: dynamic");
/**
* tune_pcie_caps() - Code to adjust PCIe capabilities.
* @dd: Valid device data structure

View File

@ -1594,9 +1594,8 @@ void hfi1_sc_wantpiobuf_intr(struct send_context *sc, u32 needint)
else
sc_del_credit_return_intr(sc);
trace_hfi1_wantpiointr(sc, needint, sc->credit_ctrl);
if (needint) {
if (needint)
sc_return_credits(sc);
}
}
/**

View File

@ -1,5 +1,5 @@
/*
* Copyright(c) 2015 - 2018 Intel Corporation.
* Copyright(c) 2015 - 2019 Intel Corporation.
*
* This file is provided under a dual BSD/GPLv2 license. When using or
* redistributing this file, you may do so under either license.
@ -348,7 +348,7 @@ int hfi1_setup_wqe(struct rvt_qp *qp, struct rvt_swqe *wqe, bool *call_send)
break;
case IB_QPT_GSI:
case IB_QPT_UD:
ah = ibah_to_rvtah(wqe->ud_wr.ah);
ah = rvt_get_swqe_ah(wqe);
if (wqe->length > (1 << ah->log_pmtu))
return -EINVAL;
if (ibp->sl_to_sc[rdma_ah_get_sl(&ah->attr)] == 0xf)
@ -702,8 +702,8 @@ void qp_iter_print(struct seq_file *s, struct rvt_qp_iter *iter)
sde ? sde->this_idx : 0,
send_context,
send_context ? send_context->sw_index : 0,
ibcq_to_rvtcq(qp->ibqp.send_cq)->queue->head,
ibcq_to_rvtcq(qp->ibqp.send_cq)->queue->tail,
ib_cq_head(qp->ibqp.send_cq),
ib_cq_tail(qp->ibqp.send_cq),
qp->pid,
qp->s_state,
qp->s_ack_state,

View File

@ -1830,23 +1830,14 @@ void hfi1_rc_send_complete(struct rvt_qp *qp, struct hfi1_opa_header *opah)
}
while (qp->s_last != qp->s_acked) {
u32 s_last;
wqe = rvt_get_swqe_ptr(qp, qp->s_last);
if (cmp_psn(wqe->lpsn, qp->s_sending_psn) >= 0 &&
cmp_psn(qp->s_sending_psn, qp->s_sending_hpsn) <= 0)
break;
trdma_clean_swqe(qp, wqe);
rvt_qp_wqe_unreserve(qp, wqe);
s_last = qp->s_last;
trace_hfi1_qp_send_completion(qp, wqe, s_last);
if (++s_last >= qp->s_size)
s_last = 0;
qp->s_last = s_last;
/* see post_send() */
barrier();
rvt_put_qp_swqe(qp, wqe);
rvt_qp_swqe_complete(qp,
trace_hfi1_qp_send_completion(qp, wqe, qp->s_last);
rvt_qp_complete_swqe(qp,
wqe,
ib_hfi1_wc_opcode[wqe->wr.opcode],
IB_WC_SUCCESS);
@ -1890,19 +1881,10 @@ struct rvt_swqe *do_rc_completion(struct rvt_qp *qp,
trace_hfi1_rc_completion(qp, wqe->lpsn);
if (cmp_psn(wqe->lpsn, qp->s_sending_psn) < 0 ||
cmp_psn(qp->s_sending_psn, qp->s_sending_hpsn) > 0) {
u32 s_last;
trdma_clean_swqe(qp, wqe);
rvt_put_qp_swqe(qp, wqe);
rvt_qp_wqe_unreserve(qp, wqe);
s_last = qp->s_last;
trace_hfi1_qp_send_completion(qp, wqe, s_last);
if (++s_last >= qp->s_size)
s_last = 0;
qp->s_last = s_last;
/* see post_send() */
barrier();
rvt_qp_swqe_complete(qp,
trace_hfi1_qp_send_completion(qp, wqe, qp->s_last);
rvt_qp_complete_swqe(qp,
wqe,
ib_hfi1_wc_opcode[wqe->wr.opcode],
IB_WC_SUCCESS);
@ -3026,8 +3008,7 @@ send_last:
wc.dlid_path_bits = 0;
wc.port_num = 0;
/* Signal completion event if the solicited bit is set. */
rvt_cq_enter(ibcq_to_rvtcq(qp->ibqp.recv_cq), &wc,
ib_bth_is_solicited(ohdr));
rvt_recv_cq(qp, &wc, ib_bth_is_solicited(ohdr));
break;
case OP(RDMA_WRITE_ONLY):

View File

@ -475,7 +475,7 @@ static struct rvt_qp *first_qp(struct hfi1_ctxtdata *rcd,
* Must hold the qp s_lock and the exp_lock.
*
* Return:
* false if either of the conditions below are statisfied:
* false if either of the conditions below are satisfied:
* 1. The list is empty or
* 2. The indicated qp is at the head of the list and the
* HFI1_S_WAIT_TID_SPACE bit is set in qp->s_flags.
@ -2024,7 +2024,6 @@ static int tid_rdma_rcv_error(struct hfi1_packet *packet,
trace_hfi1_tid_req_rcv_err(qp, 0, e->opcode, e->psn, e->lpsn, req);
if (e->opcode == TID_OP(READ_REQ)) {
struct ib_reth *reth;
u32 offset;
u32 len;
u32 rkey;
u64 vaddr;
@ -2036,7 +2035,6 @@ static int tid_rdma_rcv_error(struct hfi1_packet *packet,
* The requester always restarts from the start of the original
* request.
*/
offset = delta_psn(psn, e->psn) * qp->pmtu;
len = be32_to_cpu(reth->length);
if (psn != e->psn || len != req->total_len)
goto unlock;
@ -4550,7 +4548,7 @@ void hfi1_rc_rcv_tid_rdma_ack(struct hfi1_packet *packet)
struct rvt_swqe *wqe;
struct tid_rdma_request *req;
struct tid_rdma_flow *flow;
u32 aeth, psn, req_psn, ack_psn, fspsn, resync_psn, ack_kpsn;
u32 aeth, psn, req_psn, ack_psn, resync_psn, ack_kpsn;
unsigned long flags;
u16 fidx;
@ -4754,7 +4752,6 @@ done:
IB_AETH_CREDIT_MASK) {
case 0: /* PSN sequence error */
flow = &req->flows[req->acked_tail];
fspsn = full_flow_psn(flow, flow->flow_state.spsn);
trace_hfi1_tid_flow_rcv_tid_ack(qp, req->acked_tail,
flow);
req->r_ack_psn = mask_psn(be32_to_cpu(ohdr->bth[2]));

View File

@ -79,6 +79,8 @@ __print_symbolic(opcode, \
ib_opcode_name(RC_ATOMIC_ACKNOWLEDGE), \
ib_opcode_name(RC_COMPARE_SWAP), \
ib_opcode_name(RC_FETCH_ADD), \
ib_opcode_name(RC_SEND_LAST_WITH_INVALIDATE), \
ib_opcode_name(RC_SEND_ONLY_WITH_INVALIDATE), \
ib_opcode_name(TID_RDMA_WRITE_REQ), \
ib_opcode_name(TID_RDMA_WRITE_RESP), \
ib_opcode_name(TID_RDMA_WRITE_DATA), \

View File

@ -476,8 +476,7 @@ last_imm:
wc.dlid_path_bits = 0;
wc.port_num = 0;
/* Signal completion event if the solicited bit is set. */
rvt_cq_enter(ibcq_to_rvtcq(qp->ibqp.recv_cq), &wc,
ib_bth_is_solicited(ohdr));
rvt_recv_cq(qp, &wc, ib_bth_is_solicited(ohdr));
break;
case OP(RDMA_WRITE_FIRST):

View File

@ -1,5 +1,5 @@
/*
* Copyright(c) 2015 - 2018 Intel Corporation.
* Copyright(c) 2015 - 2019 Intel Corporation.
*
* This file is provided under a dual BSD/GPLv2 license. When using or
* redistributing this file, you may do so under either license.
@ -87,7 +87,7 @@ static void ud_loopback(struct rvt_qp *sqp, struct rvt_swqe *swqe)
rcu_read_lock();
qp = rvt_lookup_qpn(ib_to_rvt(sqp->ibqp.device), &ibp->rvp,
swqe->ud_wr.remote_qpn);
rvt_get_swqe_remote_qpn(swqe));
if (!qp) {
ibp->rvp.n_pkt_drops++;
rcu_read_unlock();
@ -105,7 +105,7 @@ static void ud_loopback(struct rvt_qp *sqp, struct rvt_swqe *swqe)
goto drop;
}
ah_attr = &ibah_to_rvtah(swqe->ud_wr.ah)->attr;
ah_attr = rvt_get_swqe_ah_attr(swqe);
ppd = ppd_from_ibp(ibp);
if (qp->ibqp.qp_num > 1) {
@ -135,8 +135,8 @@ static void ud_loopback(struct rvt_qp *sqp, struct rvt_swqe *swqe)
if (qp->ibqp.qp_num) {
u32 qkey;
qkey = (int)swqe->ud_wr.remote_qkey < 0 ?
sqp->qkey : swqe->ud_wr.remote_qkey;
qkey = (int)rvt_get_swqe_remote_qkey(swqe) < 0 ?
sqp->qkey : rvt_get_swqe_remote_qkey(swqe);
if (unlikely(qkey != qp->qkey))
goto drop; /* silently drop per IBTA spec */
}
@ -240,7 +240,7 @@ static void ud_loopback(struct rvt_qp *sqp, struct rvt_swqe *swqe)
if (qp->ibqp.qp_type == IB_QPT_GSI || qp->ibqp.qp_type == IB_QPT_SMI) {
if (sqp->ibqp.qp_type == IB_QPT_GSI ||
sqp->ibqp.qp_type == IB_QPT_SMI)
wc.pkey_index = swqe->ud_wr.pkey_index;
wc.pkey_index = rvt_get_swqe_pkey_index(swqe);
else
wc.pkey_index = sqp->s_pkey_index;
} else {
@ -255,8 +255,7 @@ static void ud_loopback(struct rvt_qp *sqp, struct rvt_swqe *swqe)
wc.dlid_path_bits = rdma_ah_get_dlid(ah_attr) & ((1 << ppd->lmc) - 1);
wc.port_num = qp->port_num;
/* Signal completion event if the solicited bit is set. */
rvt_cq_enter(ibcq_to_rvtcq(qp->ibqp.recv_cq), &wc,
swqe->wr.send_flags & IB_SEND_SOLICITED);
rvt_recv_cq(qp, &wc, swqe->wr.send_flags & IB_SEND_SOLICITED);
ibp->rvp.n_loop_pkts++;
bail_unlock:
spin_unlock_irqrestore(&qp->r_lock, flags);
@ -283,20 +282,21 @@ static void hfi1_make_bth_deth(struct rvt_qp *qp, struct rvt_swqe *wqe,
bth0 |= IB_BTH_SOLICITED;
bth0 |= extra_bytes << 20;
if (qp->ibqp.qp_type == IB_QPT_GSI || qp->ibqp.qp_type == IB_QPT_SMI)
*pkey = hfi1_get_pkey(ibp, wqe->ud_wr.pkey_index);
*pkey = hfi1_get_pkey(ibp, rvt_get_swqe_pkey_index(wqe));
else
*pkey = hfi1_get_pkey(ibp, qp->s_pkey_index);
if (!bypass)
bth0 |= *pkey;
ohdr->bth[0] = cpu_to_be32(bth0);
ohdr->bth[1] = cpu_to_be32(wqe->ud_wr.remote_qpn);
ohdr->bth[1] = cpu_to_be32(rvt_get_swqe_remote_qpn(wqe));
ohdr->bth[2] = cpu_to_be32(mask_psn(wqe->psn));
/*
* Qkeys with the high order bit set mean use the
* qkey from the QP context instead of the WR (see 10.2.5).
*/
ohdr->u.ud.deth[0] = cpu_to_be32((int)wqe->ud_wr.remote_qkey < 0 ?
qp->qkey : wqe->ud_wr.remote_qkey);
ohdr->u.ud.deth[0] =
cpu_to_be32((int)rvt_get_swqe_remote_qkey(wqe) < 0 ? qp->qkey :
rvt_get_swqe_remote_qkey(wqe));
ohdr->u.ud.deth[1] = cpu_to_be32(qp->ibqp.qp_num);
}
@ -316,7 +316,7 @@ void hfi1_make_ud_req_9B(struct rvt_qp *qp, struct hfi1_pkt_state *ps,
ibp = to_iport(qp->ibqp.device, qp->port_num);
ppd = ppd_from_ibp(ibp);
ah_attr = &ibah_to_rvtah(wqe->ud_wr.ah)->attr;
ah_attr = rvt_get_swqe_ah_attr(wqe);
extra_bytes = -wqe->length & 3;
nwords = ((wqe->length + extra_bytes) >> 2) + SIZE_OF_CRC;
@ -380,7 +380,7 @@ void hfi1_make_ud_req_16B(struct rvt_qp *qp, struct hfi1_pkt_state *ps,
struct hfi1_pportdata *ppd;
struct hfi1_ibport *ibp;
u32 dlid, slid, nwords, extra_bytes;
u32 dest_qp = wqe->ud_wr.remote_qpn;
u32 dest_qp = rvt_get_swqe_remote_qpn(wqe);
u32 src_qp = qp->ibqp.qp_num;
u16 len, pkey;
u8 l4, sc5;
@ -388,7 +388,7 @@ void hfi1_make_ud_req_16B(struct rvt_qp *qp, struct hfi1_pkt_state *ps,
ibp = to_iport(qp->ibqp.device, qp->port_num);
ppd = ppd_from_ibp(ibp);
ah_attr = &ibah_to_rvtah(wqe->ud_wr.ah)->attr;
ah_attr = rvt_get_swqe_ah_attr(wqe);
/*
* Build 16B Management Packet if either the destination
@ -450,7 +450,7 @@ void hfi1_make_ud_req_16B(struct rvt_qp *qp, struct hfi1_pkt_state *ps,
if (is_mgmt) {
l4 = OPA_16B_L4_FM;
pkey = hfi1_get_pkey(ibp, wqe->ud_wr.pkey_index);
pkey = hfi1_get_pkey(ibp, rvt_get_swqe_pkey_index(wqe));
hfi1_16B_set_qpn(&ps->s_txreq->phdr.hdr.opah.u.mgmt,
dest_qp, src_qp);
} else {
@ -515,7 +515,7 @@ int hfi1_make_ud_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
/* Construct the header. */
ibp = to_iport(qp->ibqp.device, qp->port_num);
ppd = ppd_from_ibp(ibp);
ah_attr = &ibah_to_rvtah(wqe->ud_wr.ah)->attr;
ah_attr = rvt_get_swqe_ah_attr(wqe);
priv->hdr_type = hfi1_get_hdr_type(ppd->lid, ah_attr);
if ((!hfi1_check_mcast(rdma_ah_get_dlid(ah_attr))) ||
(rdma_ah_get_dlid(ah_attr) == be32_to_cpu(OPA_LID_PERMISSIVE))) {
@ -1061,7 +1061,7 @@ void hfi1_ud_rcv(struct hfi1_packet *packet)
dlid & ((1 << ppd_from_ibp(ibp)->lmc) - 1);
wc.port_num = qp->port_num;
/* Signal completion event if the solicited bit is set. */
rvt_cq_enter(ibcq_to_rvtcq(qp->ibqp.recv_cq), &wc, solicited);
rvt_recv_cq(qp, &wc, solicited);
return;
drop:

View File

@ -118,13 +118,10 @@ int hfi1_acquire_user_pages(struct mm_struct *mm, unsigned long vaddr, size_t np
void hfi1_release_user_pages(struct mm_struct *mm, struct page **p,
size_t npages, bool dirty)
{
size_t i;
for (i = 0; i < npages; i++) {
if (dirty)
set_page_dirty_lock(p[i]);
put_page(p[i]);
}
if (dirty)
put_user_pages_dirty_lock(p, npages);
else
put_user_pages(p, npages);
if (mm) { /* during close after signal, mm can be NULL */
atomic64_sub(npages, &mm->pinned_vm);
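
hfi1_release_user_pages() above is part of the tree-wide put_user_page conversion: the per-page set_page_dirty_lock() + put_page() loop becomes a single call over the whole array. In this kernel the dirty variant takes just the array and count, so the caller still chooses between the two entry points; a sketch:

        #include <linux/mm.h>

        static void release_pinned_pages(struct page **pages, unsigned long npages,
                                         bool dirty)
        {
                if (dirty)
                        put_user_pages_dirty_lock(pages, npages);
                else
                        put_user_pages(pages, npages);
        }
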

View File

@ -1779,6 +1779,9 @@ static int get_hw_stats(struct ib_device *ibdev, struct rdma_hw_stats *stats,
}
static const struct ib_device_ops hfi1_dev_ops = {
.owner = THIS_MODULE,
.driver_id = RDMA_DRIVER_HFI1,
.alloc_hw_stats = alloc_hw_stats,
.alloc_rdma_netdev = hfi1_vnic_alloc_rn,
.get_dev_fw_str = hfi1_get_dev_fw_str,
@ -1829,7 +1832,6 @@ int hfi1_register_ib_device(struct hfi1_devdata *dd)
*/
if (!ib_hfi1_sys_image_guid)
ib_hfi1_sys_image_guid = ibdev->node_guid;
ibdev->owner = THIS_MODULE;
ibdev->phys_port_cnt = dd->num_pports;
ibdev->dev.parent = &dd->pcidev->dev;
@ -1923,7 +1925,7 @@ int hfi1_register_ib_device(struct hfi1_devdata *dd)
rdma_set_device_sysfs_group(&dd->verbs_dev.rdi.ibdev,
&ib_hfi1_attr_group);
ret = rvt_register_device(&dd->verbs_dev.rdi, RDMA_DRIVER_HFI1);
ret = rvt_register_device(&dd->verbs_dev.rdi);
if (ret)
goto err_verbs_txreq;
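
The verbs.c hunk above mirrors a core change in this cycle: module owner, driver id and uverbs ABI version are now declared once in struct ib_device_ops and applied by ib_set_device_ops(), rather than each driver writing the ib_device fields directly, which is also why rvt_register_device() no longer takes a driver id argument. A hedged sketch with placeholder values:

        #include <linux/module.h>
        #include <rdma/ib_verbs.h>

        static const struct ib_device_ops my_dev_ops = {
                .owner = THIS_MODULE,
                .driver_id = RDMA_DRIVER_UNKNOWN,   /* a real driver uses its own id */
                .uverbs_abi_ver = 1,                /* assumed ABI version */
                /* ... verbs callbacks ... */
        };

        static void my_set_ops(struct ib_device *ibdev)
        {
                ib_set_device_ops(ibdev, &my_dev_ops);
        }
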

View File

@ -8,25 +8,24 @@ config INFINIBAND_HNS
is used in Hisilicon Hip06 and more further ICT SoC based on
platform device.
To compile this driver as a module, choose M here: the module
will be called hns-roce.
To compile HIP06 or HIP08 driver as module, choose M here.
config INFINIBAND_HNS_HIP06
tristate "Hisilicon Hip06 Family RoCE support"
bool "Hisilicon Hip06 Family RoCE support"
depends on INFINIBAND_HNS && HNS && HNS_DSAF && HNS_ENET
---help---
RoCE driver support for Hisilicon RoCE engine in Hisilicon Hip06 and
Hip07 SoC. These RoCE engines are platform devices.
To compile this driver as a module, choose M here: the module
will be called hns-roce-hw-v1.
To compile this driver, choose Y here: if INFINIBAND_HNS is m, this
module will be called hns-roce-hw-v1
config INFINIBAND_HNS_HIP08
tristate "Hisilicon Hip08 Family RoCE support"
bool "Hisilicon Hip08 Family RoCE support"
depends on INFINIBAND_HNS && PCI && HNS3
---help---
RoCE driver support for Hisilicon RoCE engine in Hisilicon Hip08 SoC.
The RoCE engine is a PCI device.
To compile this driver as a module, choose M here: the module
will be called hns-roce-hw-v2.
To compile this driver, choose Y here: if INFINIBAND_HNS is m, this
module will be called hns-roce-hw-v2.

View File

@ -5,11 +5,16 @@
ccflags-y := -I $(srctree)/drivers/net/ethernet/hisilicon/hns3
obj-$(CONFIG_INFINIBAND_HNS) += hns-roce.o
hns-roce-objs := hns_roce_main.o hns_roce_cmd.o hns_roce_pd.o \
hns_roce_ah.o hns_roce_hem.o hns_roce_mr.o hns_roce_qp.o \
hns_roce_cq.o hns_roce_alloc.o hns_roce_db.o hns_roce_srq.o hns_roce_restrack.o
obj-$(CONFIG_INFINIBAND_HNS_HIP06) += hns-roce-hw-v1.o
hns-roce-hw-v1-objs := hns_roce_hw_v1.o
obj-$(CONFIG_INFINIBAND_HNS_HIP08) += hns-roce-hw-v2.o
hns-roce-hw-v2-objs := hns_roce_hw_v2.o hns_roce_hw_v2_dfx.o
ifdef CONFIG_INFINIBAND_HNS_HIP06
hns-roce-hw-v1-objs := hns_roce_hw_v1.o $(hns-roce-objs)
obj-$(CONFIG_INFINIBAND_HNS) += hns-roce-hw-v1.o
endif
ifdef CONFIG_INFINIBAND_HNS_HIP08
hns-roce-hw-v2-objs := hns_roce_hw_v2.o hns_roce_hw_v2_dfx.o $(hns-roce-objs)
obj-$(CONFIG_INFINIBAND_HNS) += hns-roce-hw-v2.o
endif

View File

@ -34,6 +34,7 @@
#include <linux/platform_device.h>
#include <linux/vmalloc.h>
#include "hns_roce_device.h"
#include <rdma/ib_umem.h>
int hns_roce_bitmap_alloc(struct hns_roce_bitmap *bitmap, unsigned long *obj)
{
@ -67,7 +68,6 @@ void hns_roce_bitmap_free(struct hns_roce_bitmap *bitmap, unsigned long obj,
{
hns_roce_bitmap_free_range(bitmap, obj, 1, rr);
}
EXPORT_SYMBOL_GPL(hns_roce_bitmap_free);
int hns_roce_bitmap_alloc_range(struct hns_roce_bitmap *bitmap, int cnt,
int align, unsigned long *obj)
@ -174,7 +174,6 @@ void hns_roce_buf_free(struct hns_roce_dev *hr_dev, u32 size,
kfree(buf->page_list);
}
}
EXPORT_SYMBOL_GPL(hns_roce_buf_free);
int hns_roce_buf_alloc(struct hns_roce_dev *hr_dev, u32 size, u32 max_direct,
struct hns_roce_buf *buf, u32 page_shift)
@ -238,6 +237,104 @@ err_free:
return -ENOMEM;
}
int hns_roce_get_kmem_bufs(struct hns_roce_dev *hr_dev, dma_addr_t *bufs,
int buf_cnt, int start, struct hns_roce_buf *buf)
{
int i, end;
int total;
end = start + buf_cnt;
if (end > buf->npages) {
dev_err(hr_dev->dev,
"invalid kmem region,offset %d,buf_cnt %d,total %d!\n",
start, buf_cnt, buf->npages);
return -EINVAL;
}
total = 0;
for (i = start; i < end; i++)
if (buf->nbufs == 1)
bufs[total++] = buf->direct.map +
((dma_addr_t)i << buf->page_shift);
else
bufs[total++] = buf->page_list[i].map;
return total;
}
int hns_roce_get_umem_bufs(struct hns_roce_dev *hr_dev, dma_addr_t *bufs,
int buf_cnt, int start, struct ib_umem *umem,
int page_shift)
{
struct ib_block_iter biter;
int total = 0;
int idx = 0;
u64 addr;
if (page_shift < PAGE_SHIFT) {
dev_err(hr_dev->dev, "invalid page shift %d!\n", page_shift);
return -EINVAL;
}
/* convert system page cnt to hw page cnt */
rdma_for_each_block(umem->sg_head.sgl, &biter, umem->nmap,
1 << page_shift) {
addr = rdma_block_iter_dma_address(&biter);
if (idx >= start) {
bufs[total++] = addr;
if (total >= buf_cnt)
goto done;
}
idx++;
}
done:
return total;
}
void hns_roce_init_buf_region(struct hns_roce_buf_region *region, int hopnum,
int offset, int buf_cnt)
{
if (hopnum == HNS_ROCE_HOP_NUM_0)
region->hopnum = 0;
else
region->hopnum = hopnum;
region->offset = offset;
region->count = buf_cnt;
}
void hns_roce_free_buf_list(dma_addr_t **bufs, int region_cnt)
{
int i;
for (i = 0; i < region_cnt; i++) {
kfree(bufs[i]);
bufs[i] = NULL;
}
}
int hns_roce_alloc_buf_list(struct hns_roce_buf_region *regions,
dma_addr_t **bufs, int region_cnt)
{
struct hns_roce_buf_region *r;
int i;
for (i = 0; i < region_cnt; i++) {
r = &regions[i];
bufs[i] = kcalloc(r->count, sizeof(dma_addr_t), GFP_KERNEL);
if (!bufs[i])
goto err_alloc;
}
return 0;
err_alloc:
hns_roce_free_buf_list(bufs, i);
return -ENOMEM;
}
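
Side note, not part of the patch: hns_roce_alloc_buf_list() above uses the common allocate-then-roll-back pattern, freeing only the region arrays allocated before the failing one. A minimal userspace sketch of the same pattern, with calloc()/free() standing in for kcalloc()/kfree() and made-up inputs:

#include <stdlib.h>

/* On the first failed allocation, free indices 0..i-1 and report failure. */
static int alloc_lists(unsigned long **bufs, const int *counts, int nregions)
{
	int i;

	for (i = 0; i < nregions; i++) {
		bufs[i] = calloc(counts[i], sizeof(unsigned long));
		if (!bufs[i])
			goto err_alloc;
	}
	return 0;

err_alloc:
	while (--i >= 0) {
		free(bufs[i]);
		bufs[i] = NULL;
	}
	return -1;
}

int main(void)
{
	unsigned long *bufs[3];
	int counts[3] = { 16, 32, 64 };	/* invented region sizes */
	int i;

	if (alloc_lists(bufs, counts, 3))
		return 1;
	for (i = 0; i < 3; i++)
		free(bufs[i]);
	return 0;
}
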
void hns_roce_cleanup_bitmap(struct hns_roce_dev *hr_dev)
{
if (hr_dev->caps.flags & HNS_ROCE_CAP_FLAG_SRQ)

@ -103,7 +103,6 @@ void hns_roce_cmd_event(struct hns_roce_dev *hr_dev, u16 token, u8 status,
context->out_param = out_param;
complete(&context->done);
}
EXPORT_SYMBOL_GPL(hns_roce_cmd_event);
/* this should be called with "use_events" */
static int __hns_roce_cmd_mbox_wait(struct hns_roce_dev *hr_dev, u64 in_param,
@ -162,7 +161,7 @@ static int hns_roce_cmd_mbox_wait(struct hns_roce_dev *hr_dev, u64 in_param,
u64 out_param, unsigned long in_modifier,
u8 op_modifier, u16 op, unsigned long timeout)
{
int ret = 0;
int ret;
down(&hr_dev->cmd.event_sem);
ret = __hns_roce_cmd_mbox_wait(hr_dev, in_param, out_param,
@ -204,7 +203,6 @@ int hns_roce_cmd_mbox(struct hns_roce_dev *hr_dev, u64 in_param, u64 out_param,
return ret;
}
EXPORT_SYMBOL_GPL(hns_roce_cmd_mbox);
int hns_roce_cmd_init(struct hns_roce_dev *hr_dev)
{
@ -291,7 +289,6 @@ struct hns_roce_cmd_mailbox
return mailbox;
}
EXPORT_SYMBOL_GPL(hns_roce_alloc_cmd_mailbox);
void hns_roce_free_cmd_mailbox(struct hns_roce_dev *hr_dev,
struct hns_roce_cmd_mailbox *mailbox)
@ -302,4 +299,3 @@ void hns_roce_free_cmd_mailbox(struct hns_roce_dev *hr_dev,
dma_pool_free(hr_dev->cmd.pool, mailbox->buf, mailbox->dma);
kfree(mailbox);
}
EXPORT_SYMBOL_GPL(hns_roce_free_cmd_mailbox);

@ -205,7 +205,6 @@ void hns_roce_free_cq(struct hns_roce_dev *hr_dev, struct hns_roce_cq *hr_cq)
hns_roce_table_put(hr_dev, &cq_table->table, hr_cq->cqn);
hns_roce_bitmap_free(&cq_table->bitmap, hr_cq->cqn, BITMAP_NO_RR);
}
EXPORT_SYMBOL_GPL(hns_roce_free_cq);
static int hns_roce_ib_get_cq_umem(struct hns_roce_dev *hr_dev,
struct ib_udata *udata,
@ -235,8 +234,7 @@ static int hns_roce_ib_get_cq_umem(struct hns_roce_dev *hr_dev,
&buf->hr_mtt);
} else {
ret = hns_roce_mtt_init(hr_dev, ib_umem_page_count(*umem),
(*umem)->page_shift,
&buf->hr_mtt);
PAGE_SHIFT, &buf->hr_mtt);
}
if (ret)
goto err_buf;
@ -300,15 +298,15 @@ static void hns_roce_ib_free_cq_buf(struct hns_roce_dev *hr_dev,
&buf->hr_buf);
}
struct ib_cq *hns_roce_ib_create_cq(struct ib_device *ib_dev,
const struct ib_cq_init_attr *attr,
struct ib_udata *udata)
int hns_roce_ib_create_cq(struct ib_cq *ib_cq,
const struct ib_cq_init_attr *attr,
struct ib_udata *udata)
{
struct hns_roce_dev *hr_dev = to_hr_dev(ib_dev);
struct hns_roce_dev *hr_dev = to_hr_dev(ib_cq->device);
struct device *dev = hr_dev->dev;
struct hns_roce_ib_create_cq ucmd;
struct hns_roce_ib_create_cq_resp resp = {};
struct hns_roce_cq *hr_cq = NULL;
struct hns_roce_cq *hr_cq = to_hr_cq(ib_cq);
struct hns_roce_uar *uar = NULL;
int vector = attr->comp_vector;
int cq_entries = attr->cqe;
@ -319,13 +317,9 @@ struct ib_cq *hns_roce_ib_create_cq(struct ib_device *ib_dev,
if (cq_entries < 1 || cq_entries > hr_dev->caps.max_cqes) {
dev_err(dev, "Creat CQ failed. entries=%d, max=%d\n",
cq_entries, hr_dev->caps.max_cqes);
return ERR_PTR(-EINVAL);
return -EINVAL;
}
hr_cq = kzalloc(sizeof(*hr_cq), GFP_KERNEL);
if (!hr_cq)
return ERR_PTR(-ENOMEM);
if (hr_dev->caps.min_cqes)
cq_entries = max(cq_entries, hr_dev->caps.min_cqes);
@ -416,7 +410,7 @@ struct ib_cq *hns_roce_ib_create_cq(struct ib_device *ib_dev,
goto err_cqc;
}
return &hr_cq->ib_cq;
return 0;
err_cqc:
hns_roce_free_cq(hr_dev, hr_cq);
@ -428,9 +422,8 @@ err_dbmap:
err_mtt:
hns_roce_mtt_cleanup(hr_dev, &hr_cq->hr_buf.hr_mtt);
if (udata)
ib_umem_release(hr_cq->umem);
else
ib_umem_release(hr_cq->umem);
if (!udata)
hns_roce_ib_free_cq_buf(hr_dev, &hr_cq->hr_buf,
hr_cq->ib_cq.cqe);
@ -439,47 +432,37 @@ err_db:
hns_roce_free_db(hr_dev, &hr_cq->db);
err_cq:
kfree(hr_cq);
return ERR_PTR(ret);
return ret;
}
EXPORT_SYMBOL_GPL(hns_roce_ib_create_cq);
int hns_roce_ib_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata)
void hns_roce_ib_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata)
{
struct hns_roce_dev *hr_dev = to_hr_dev(ib_cq->device);
struct hns_roce_cq *hr_cq = to_hr_cq(ib_cq);
int ret = 0;
if (hr_dev->hw->destroy_cq) {
ret = hr_dev->hw->destroy_cq(ib_cq, udata);
} else {
hns_roce_free_cq(hr_dev, hr_cq);
hns_roce_mtt_cleanup(hr_dev, &hr_cq->hr_buf.hr_mtt);
if (udata) {
ib_umem_release(hr_cq->umem);
if (hr_cq->db_en == 1)
hns_roce_db_unmap_user(
rdma_udata_to_drv_context(
udata,
struct hns_roce_ucontext,
ibucontext),
&hr_cq->db);
} else {
/* Free the buff of stored cq */
hns_roce_ib_free_cq_buf(hr_dev, &hr_cq->hr_buf,
ib_cq->cqe);
if (hr_dev->caps.flags & HNS_ROCE_CAP_FLAG_RECORD_DB)
hns_roce_free_db(hr_dev, &hr_cq->db);
}
kfree(hr_cq);
hr_dev->hw->destroy_cq(ib_cq, udata);
return;
}
return ret;
hns_roce_free_cq(hr_dev, hr_cq);
hns_roce_mtt_cleanup(hr_dev, &hr_cq->hr_buf.hr_mtt);
ib_umem_release(hr_cq->umem);
if (udata) {
if (hr_cq->db_en == 1)
hns_roce_db_unmap_user(rdma_udata_to_drv_context(
udata,
struct hns_roce_ucontext,
ibucontext),
&hr_cq->db);
} else {
/* Free the buff of stored cq */
hns_roce_ib_free_cq_buf(hr_dev, &hr_cq->hr_buf, ib_cq->cqe);
if (hr_dev->caps.flags & HNS_ROCE_CAP_FLAG_RECORD_DB)
hns_roce_free_db(hr_dev, &hr_cq->db);
}
}
EXPORT_SYMBOL_GPL(hns_roce_ib_destroy_cq);
void hns_roce_cq_completion(struct hns_roce_dev *hr_dev, u32 cqn)
{
@ -495,7 +478,6 @@ void hns_roce_cq_completion(struct hns_roce_dev *hr_dev, u32 cqn)
++cq->arm_sn;
cq->comp(cq);
}
EXPORT_SYMBOL_GPL(hns_roce_cq_completion);
void hns_roce_cq_event(struct hns_roce_dev *hr_dev, u32 cqn, int event_type)
{
@ -517,7 +499,6 @@ void hns_roce_cq_event(struct hns_roce_dev *hr_dev, u32 cqn, int event_type)
if (atomic_dec_and_test(&cq->refcount))
complete(&cq->free);
}
EXPORT_SYMBOL_GPL(hns_roce_cq_event);
int hns_roce_init_cq_table(struct hns_roce_dev *hr_dev)
{

@ -51,7 +51,6 @@ out:
return ret;
}
EXPORT_SYMBOL(hns_roce_db_map_user);
void hns_roce_db_unmap_user(struct hns_roce_ucontext *context,
struct hns_roce_db *db)
@ -67,7 +66,6 @@ void hns_roce_db_unmap_user(struct hns_roce_ucontext *context,
mutex_unlock(&context->page_mutex);
}
EXPORT_SYMBOL(hns_roce_db_unmap_user);
static struct hns_roce_db_pgdir *hns_roce_alloc_db_pgdir(
struct device *dma_device)
@ -78,7 +76,8 @@ static struct hns_roce_db_pgdir *hns_roce_alloc_db_pgdir(
if (!pgdir)
return NULL;
bitmap_fill(pgdir->order1, HNS_ROCE_DB_PER_PAGE / 2);
bitmap_fill(pgdir->order1,
HNS_ROCE_DB_PER_PAGE / HNS_ROCE_DB_TYPE_COUNT);
pgdir->bits[0] = pgdir->order0;
pgdir->bits[1] = pgdir->order1;
pgdir->page = dma_alloc_coherent(dma_device, PAGE_SIZE,
@ -116,7 +115,7 @@ found:
db->u.pgdir = pgdir;
db->index = i;
db->db_record = pgdir->page + db->index;
db->dma = pgdir->db_dma + db->index * 4;
db->dma = pgdir->db_dma + db->index * HNS_ROCE_DB_UNIT_SIZE;
db->order = order;
return 0;
@ -150,7 +149,6 @@ out:
return ret;
}
EXPORT_SYMBOL_GPL(hns_roce_alloc_db);
void hns_roce_free_db(struct hns_roce_dev *hr_dev, struct hns_roce_db *db)
{
@ -170,7 +168,8 @@ void hns_roce_free_db(struct hns_roce_dev *hr_dev, struct hns_roce_db *db)
i >>= o;
set_bit(i, db->u.pgdir->bits[o]);
if (bitmap_full(db->u.pgdir->order1, HNS_ROCE_DB_PER_PAGE / 2)) {
if (bitmap_full(db->u.pgdir->order1,
HNS_ROCE_DB_PER_PAGE / HNS_ROCE_DB_TYPE_COUNT)) {
dma_free_coherent(hr_dev->dev, PAGE_SIZE, db->u.pgdir->page,
db->u.pgdir->db_dma);
list_del(&db->u.pgdir->list);
@ -179,4 +178,3 @@ void hns_roce_free_db(struct hns_roce_dev *hr_dev, struct hns_roce_db *db)
mutex_unlock(&hr_dev->pgdir_mutex);
}
EXPORT_SYMBOL_GPL(hns_roce_free_db);

@ -37,9 +37,12 @@
#define DRV_NAME "hns_roce"
/* hip08 is a pci device, it includes two versions according to the pci revision id */
#define PCI_REVISION_ID_HIP08_A 0x20
#define PCI_REVISION_ID_HIP08_B 0x21
#define HNS_ROCE_HW_VER1 ('h' << 24 | 'i' << 16 | '0' << 8 | '6')
#define MAC_ADDR_OCTET_NUM 6
#define HNS_ROCE_MAX_MSG_LEN 0x80000000
#define HNS_ROCE_ALOGN_UP(a, b) ((((a) + (b) - 1) / (b)) * (b))
@ -48,6 +51,10 @@
#define HNS_ROCE_BA_SIZE (32 * 4096)
#define BA_BYTE_LEN 8
#define BITS_PER_BYTE 8
/* Hardware specification only for v1 engine */
#define HNS_ROCE_MIN_CQE_NUM 0x40
#define HNS_ROCE_MIN_WQE_NUM 0x20
@ -55,6 +62,7 @@
/* Hardware specification only for v1 engine */
#define HNS_ROCE_MAX_INNER_MTPT_NUM 0x7
#define HNS_ROCE_MAX_MTPT_PBL_NUM 0x100000
#define HNS_ROCE_MAX_SGE_NUM 2
#define HNS_ROCE_EACH_FREE_CQ_WAIT_MSECS 20
#define HNS_ROCE_MAX_FREE_CQ_WAIT_CNT \
@ -64,6 +72,9 @@
#define HNS_ROCE_MAX_IRQ_NUM 128
#define HNS_ROCE_SGE_IN_WQE 2
#define HNS_ROCE_SGE_SHIFT 4
#define EQ_ENABLE 1
#define EQ_DISABLE 0
@ -81,6 +92,7 @@
#define HNS_ROCE_MAX_PORTS 6
#define HNS_ROCE_MAX_GID_NUM 16
#define HNS_ROCE_GID_SIZE 16
#define HNS_ROCE_SGE_SIZE 16
#define HNS_ROCE_HOP_NUM_0 0xff
@ -111,6 +123,8 @@
#define PAGES_SHIFT_24 24
#define PAGES_SHIFT_32 32
#define HNS_ROCE_PCI_BAR_NUM 2
#define HNS_ROCE_IDX_QUE_ENTRY_SZ 4
#define SRQ_DB_REG 0x230
@ -213,6 +227,9 @@ enum hns_roce_mtt_type {
MTT_TYPE_IDX
};
#define HNS_ROCE_DB_TYPE_COUNT 2
#define HNS_ROCE_DB_UNIT_SIZE 4
enum {
HNS_ROCE_DB_PER_PAGE = PAGE_SIZE / 4
};
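
Side note, not part of the patch: the new HNS_ROCE_DB_TYPE_COUNT and HNS_ROCE_DB_UNIT_SIZE macros above just give names to the magic 2 and 4 used for doorbell page accounting. A quick standalone calculation, assuming the common 4 KiB PAGE_SIZE (it is architecture-dependent):

#include <stdio.h>

int main(void)
{
	int page_size = 4096;		/* assumed PAGE_SIZE for illustration */
	int db_unit_size = 4;		/* HNS_ROCE_DB_UNIT_SIZE */
	int db_type_count = 2;		/* HNS_ROCE_DB_TYPE_COUNT */
	int per_page = page_size / db_unit_size;

	printf("doorbell records per page: %d\n", per_page);
	printf("order-1 bitmap entries:    %d\n", per_page / db_type_count);
	return 0;
}
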
@ -324,6 +341,29 @@ struct hns_roce_mtt {
enum hns_roce_mtt_type mtt_type;
};
struct hns_roce_buf_region {
int offset; /* page offset */
u32 count; /* page count */
int hopnum; /* addressing hop num */
};
#define HNS_ROCE_MAX_BT_REGION 3
#define HNS_ROCE_MAX_BT_LEVEL 3
struct hns_roce_hem_list {
struct list_head root_bt;
/* link all bt dma mem by hop config */
struct list_head mid_bt[HNS_ROCE_MAX_BT_REGION][HNS_ROCE_MAX_BT_LEVEL];
struct list_head btm_bt; /* link all bottom bt in @mid_bt */
dma_addr_t root_ba; /* pointer to the root ba table */
int bt_pg_shift;
};
/* memory translate region */
struct hns_roce_mtr {
struct hns_roce_hem_list hem_list;
int buf_pg_shift;
};
struct hns_roce_mw {
struct ib_mw ibmw;
u32 pdn;
@ -413,8 +453,8 @@ struct hns_roce_buf {
struct hns_roce_db_pgdir {
struct list_head list;
DECLARE_BITMAP(order0, HNS_ROCE_DB_PER_PAGE);
DECLARE_BITMAP(order1, HNS_ROCE_DB_PER_PAGE / 2);
unsigned long *bits[2];
DECLARE_BITMAP(order1, HNS_ROCE_DB_PER_PAGE / HNS_ROCE_DB_TYPE_COUNT);
unsigned long *bits[HNS_ROCE_DB_TYPE_COUNT];
u32 *page;
dma_addr_t db_dma;
};
@ -472,7 +512,7 @@ struct hns_roce_idx_que {
u32 buf_size;
struct ib_umem *umem;
struct hns_roce_mtt mtt;
u64 *bitmap;
unsigned long *bitmap;
};
struct hns_roce_srq {
@ -535,7 +575,7 @@ struct hns_roce_av {
u8 hop_limit;
__le32 sl_tclass_flowlabel;
u8 dgid[HNS_ROCE_GID_SIZE];
u8 mac[6];
u8 mac[ETH_ALEN];
__le16 vlan;
bool vlan_en;
};
@ -620,6 +660,14 @@ struct hns_roce_qp {
struct ib_umem *umem;
struct hns_roce_mtt mtt;
struct hns_roce_mtr mtr;
/* this define must be less than HNS_ROCE_MAX_BT_REGION */
#define HNS_ROCE_WQE_REGION_MAX 3
struct hns_roce_buf_region regions[HNS_ROCE_WQE_REGION_MAX];
int region_cnt;
int wqe_bt_pg_shift;
u32 buff_size;
struct mutex mutex;
u8 port;
@ -830,6 +878,9 @@ struct hns_roce_caps {
u32 mtt_ba_pg_sz;
u32 mtt_buf_pg_sz;
u32 mtt_hop_num;
u32 wqe_sq_hop_num;
u32 wqe_sge_hop_num;
u32 wqe_rq_hop_num;
u32 sccc_ba_pg_sz;
u32 sccc_buf_pg_sz;
u32 sccc_hop_num;
@ -921,7 +972,7 @@ struct hns_roce_hw {
int (*poll_cq)(struct ib_cq *ibcq, int num_entries, struct ib_wc *wc);
int (*dereg_mr)(struct hns_roce_dev *hr_dev, struct hns_roce_mr *mr,
struct ib_udata *udata);
int (*destroy_cq)(struct ib_cq *ibcq, struct ib_udata *udata);
void (*destroy_cq)(struct ib_cq *ibcq, struct ib_udata *udata);
int (*modify_cq)(struct ib_cq *cq, u16 cq_count, u16 cq_period);
int (*init_eq)(struct hns_roce_dev *hr_dev);
void (*cleanup_eq)(struct hns_roce_dev *hr_dev);
@ -940,6 +991,16 @@ struct hns_roce_hw {
const struct ib_device_ops *hns_roce_dev_srq_ops;
};
enum hns_phy_state {
HNS_ROCE_PHY_SLEEP = 1,
HNS_ROCE_PHY_POLLING = 2,
HNS_ROCE_PHY_DISABLED = 3,
HNS_ROCE_PHY_TRAINING = 4,
HNS_ROCE_PHY_LINKUP = 5,
HNS_ROCE_PHY_LINKERR = 6,
HNS_ROCE_PHY_TEST = 7
};
struct hns_roce_dev {
struct ib_device ib_dev;
struct platform_device *pdev;
@ -962,7 +1023,7 @@ struct hns_roce_dev {
struct hns_roce_caps caps;
struct xarray qp_table_xa;
unsigned char dev_addr[HNS_ROCE_MAX_PORTS][MAC_ADDR_OCTET_NUM];
unsigned char dev_addr[HNS_ROCE_MAX_PORTS][ETH_ALEN];
u64 sys_image_guid;
u32 vendor_id;
u32 vendor_part_id;
@ -1084,6 +1145,19 @@ void hns_roce_mtt_cleanup(struct hns_roce_dev *hr_dev,
int hns_roce_buf_write_mtt(struct hns_roce_dev *hr_dev,
struct hns_roce_mtt *mtt, struct hns_roce_buf *buf);
void hns_roce_mtr_init(struct hns_roce_mtr *mtr, int bt_pg_shift,
int buf_pg_shift);
int hns_roce_mtr_attach(struct hns_roce_dev *hr_dev, struct hns_roce_mtr *mtr,
dma_addr_t **bufs, struct hns_roce_buf_region *regions,
int region_cnt);
void hns_roce_mtr_cleanup(struct hns_roce_dev *hr_dev,
struct hns_roce_mtr *mtr);
/* hns roce hw needs the current block and next block addr from mtt */
#define MTT_MIN_COUNT 2
int hns_roce_mtr_find(struct hns_roce_dev *hr_dev, struct hns_roce_mtr *mtr,
int offset, u64 *mtt_buf, int mtt_max, u64 *base_addr);
int hns_roce_init_pd_table(struct hns_roce_dev *hr_dev);
int hns_roce_init_mr_table(struct hns_roce_dev *hr_dev);
int hns_roce_init_eq_table(struct hns_roce_dev *hr_dev);
@ -1148,6 +1222,18 @@ int hns_roce_buf_alloc(struct hns_roce_dev *hr_dev, u32 size, u32 max_direct,
int hns_roce_ib_umem_write_mtt(struct hns_roce_dev *hr_dev,
struct hns_roce_mtt *mtt, struct ib_umem *umem);
void hns_roce_init_buf_region(struct hns_roce_buf_region *region, int hopnum,
int offset, int buf_cnt);
int hns_roce_alloc_buf_list(struct hns_roce_buf_region *regions,
dma_addr_t **bufs, int count);
void hns_roce_free_buf_list(dma_addr_t **bufs, int count);
int hns_roce_get_kmem_bufs(struct hns_roce_dev *hr_dev, dma_addr_t *bufs,
int buf_cnt, int start, struct hns_roce_buf *buf);
int hns_roce_get_umem_bufs(struct hns_roce_dev *hr_dev, dma_addr_t *bufs,
int buf_cnt, int start, struct ib_umem *umem,
int page_shift);
int hns_roce_create_srq(struct ib_srq *srq,
struct ib_srq_init_attr *srq_init_attr,
struct ib_udata *udata);
@ -1178,11 +1264,11 @@ void hns_roce_release_range_qp(struct hns_roce_dev *hr_dev, int base_qpn,
__be32 send_ieth(const struct ib_send_wr *wr);
int to_hr_qp_type(int qp_type);
struct ib_cq *hns_roce_ib_create_cq(struct ib_device *ib_dev,
const struct ib_cq_init_attr *attr,
struct ib_udata *udata);
int hns_roce_ib_create_cq(struct ib_cq *ib_cq,
const struct ib_cq_init_attr *attr,
struct ib_udata *udata);
int hns_roce_ib_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata);
void hns_roce_ib_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata);
void hns_roce_free_cq(struct hns_roce_dev *hr_dev, struct hns_roce_cq *hr_cq);
int hns_roce_db_map_user(struct hns_roce_ucontext *context,

@ -56,7 +56,6 @@ bool hns_roce_check_whether_mhop(struct hns_roce_dev *hr_dev, u32 type)
return false;
}
EXPORT_SYMBOL_GPL(hns_roce_check_whether_mhop);
static bool hns_roce_check_hem_null(struct hns_roce_hem **hem, u64 start_idx,
u32 bt_chunk_num)
@ -165,7 +164,7 @@ int hns_roce_calc_hem_mhop(struct hns_roce_dev *hr_dev,
+ PAGE_SHIFT);
mhop->bt_chunk_size = 1 << (hr_dev->caps.mtt_ba_pg_sz
+ PAGE_SHIFT);
mhop->ba_l0_num = mhop->bt_chunk_size / 8;
mhop->ba_l0_num = mhop->bt_chunk_size / BA_BYTE_LEN;
mhop->hop_num = hr_dev->caps.mtt_hop_num;
break;
case HEM_TYPE_CQE:
@ -173,7 +172,7 @@ int hns_roce_calc_hem_mhop(struct hns_roce_dev *hr_dev,
+ PAGE_SHIFT);
mhop->bt_chunk_size = 1 << (hr_dev->caps.cqe_ba_pg_sz
+ PAGE_SHIFT);
mhop->ba_l0_num = mhop->bt_chunk_size / 8;
mhop->ba_l0_num = mhop->bt_chunk_size / BA_BYTE_LEN;
mhop->hop_num = hr_dev->caps.cqe_hop_num;
break;
case HEM_TYPE_SRQWQE:
@ -181,7 +180,7 @@ int hns_roce_calc_hem_mhop(struct hns_roce_dev *hr_dev,
+ PAGE_SHIFT);
mhop->bt_chunk_size = 1 << (hr_dev->caps.srqwqe_ba_pg_sz
+ PAGE_SHIFT);
mhop->ba_l0_num = mhop->bt_chunk_size / 8;
mhop->ba_l0_num = mhop->bt_chunk_size / BA_BYTE_LEN;
mhop->hop_num = hr_dev->caps.srqwqe_hop_num;
break;
case HEM_TYPE_IDX:
@ -189,7 +188,7 @@ int hns_roce_calc_hem_mhop(struct hns_roce_dev *hr_dev,
+ PAGE_SHIFT);
mhop->bt_chunk_size = 1 << (hr_dev->caps.idx_ba_pg_sz
+ PAGE_SHIFT);
mhop->ba_l0_num = mhop->bt_chunk_size / 8;
mhop->ba_l0_num = mhop->bt_chunk_size / BA_BYTE_LEN;
mhop->hop_num = hr_dev->caps.idx_hop_num;
break;
default:
@ -206,7 +205,7 @@ int hns_roce_calc_hem_mhop(struct hns_roce_dev *hr_dev,
* MTT/CQE alloc hem for bt pages.
*/
bt_num = hns_roce_get_bt_num(table->type, mhop->hop_num);
chunk_ba_num = mhop->bt_chunk_size / 8;
chunk_ba_num = mhop->bt_chunk_size / BA_BYTE_LEN;
chunk_size = table->type < HEM_TYPE_MTT ? mhop->buf_chunk_size :
mhop->bt_chunk_size;
table_idx = (*obj & (table->num_obj - 1)) /
@ -234,7 +233,6 @@ int hns_roce_calc_hem_mhop(struct hns_roce_dev *hr_dev,
return 0;
}
EXPORT_SYMBOL_GPL(hns_roce_calc_hem_mhop);
static struct hns_roce_hem *hns_roce_alloc_hem(struct hns_roce_dev *hr_dev,
int npages,
@ -376,18 +374,19 @@ static int hns_roce_set_hem(struct hns_roce_dev *hr_dev,
bt_cmd = hr_dev->reg_base + ROCEE_BT_CMD_H_REG;
end = msecs_to_jiffies(HW_SYNC_TIMEOUT_MSECS) + jiffies;
while (1) {
if (readl(bt_cmd) >> BT_CMD_SYNC_SHIFT) {
if (!(time_before(jiffies, end))) {
dev_err(dev, "Write bt_cmd err,hw_sync is not zero.\n");
spin_unlock_irqrestore(lock, flags);
return -EBUSY;
}
} else {
end = HW_SYNC_TIMEOUT_MSECS;
while (end) {
if (!(readl(bt_cmd) >> BT_CMD_SYNC_SHIFT))
break;
}
mdelay(HW_SYNC_SLEEP_TIME_INTERVAL);
end -= HW_SYNC_SLEEP_TIME_INTERVAL;
}
if (end <= 0) {
dev_err(dev, "Write bt_cmd err,hw_sync is not zero.\n");
spin_unlock_irqrestore(lock, flags);
return -EBUSY;
}
bt_cmd_l = (u32)bt_ba;
@ -435,7 +434,7 @@ static int hns_roce_table_mhop_get(struct hns_roce_dev *hr_dev,
buf_chunk_size = mhop.buf_chunk_size;
bt_chunk_size = mhop.bt_chunk_size;
hop_num = mhop.hop_num;
chunk_ba_num = bt_chunk_size / 8;
chunk_ba_num = bt_chunk_size / BA_BYTE_LEN;
bt_num = hns_roce_get_bt_num(table->type, hop_num);
switch (bt_num) {
@ -620,7 +619,6 @@ out:
mutex_unlock(&table->mutex);
return ret;
}
EXPORT_SYMBOL_GPL(hns_roce_table_get);
static void hns_roce_table_mhop_put(struct hns_roce_dev *hr_dev,
struct hns_roce_hem_table *table,
@ -645,7 +643,7 @@ static void hns_roce_table_mhop_put(struct hns_roce_dev *hr_dev,
bt_chunk_size = mhop.bt_chunk_size;
hop_num = mhop.hop_num;
chunk_ba_num = bt_chunk_size / 8;
chunk_ba_num = bt_chunk_size / BA_BYTE_LEN;
bt_num = hns_roce_get_bt_num(table->type, hop_num);
switch (bt_num) {
@ -763,7 +761,6 @@ void hns_roce_table_put(struct hns_roce_dev *hr_dev,
mutex_unlock(&table->mutex);
}
EXPORT_SYMBOL_GPL(hns_roce_table_put);
void *hns_roce_table_find(struct hns_roce_dev *hr_dev,
struct hns_roce_hem_table *table,
@ -799,7 +796,7 @@ void *hns_roce_table_find(struct hns_roce_dev *hr_dev,
i = mhop.l0_idx;
j = mhop.l1_idx;
if (mhop.hop_num == 2)
hem_idx = i * (mhop.bt_chunk_size / 8) + j;
hem_idx = i * (mhop.bt_chunk_size / BA_BYTE_LEN) + j;
else if (mhop.hop_num == 1 ||
mhop.hop_num == HNS_ROCE_HOP_NUM_0)
hem_idx = i;
@ -836,7 +833,6 @@ out:
mutex_unlock(&table->mutex);
return addr;
}
EXPORT_SYMBOL_GPL(hns_roce_table_find);
int hns_roce_table_get_range(struct hns_roce_dev *hr_dev,
struct hns_roce_hem_table *table,
@ -999,7 +995,7 @@ int hns_roce_init_hem_table(struct hns_roce_dev *hr_dev,
}
obj_per_chunk = buf_chunk_size / obj_size;
num_hem = (nobj + obj_per_chunk - 1) / obj_per_chunk;
bt_chunk_num = bt_chunk_size / 8;
bt_chunk_num = bt_chunk_size / BA_BYTE_LEN;
if (type >= HEM_TYPE_MTT)
num_bt_l0 = bt_chunk_num;
@ -1156,3 +1152,463 @@ void hns_roce_cleanup_hem(struct hns_roce_dev *hr_dev)
&hr_dev->mr_table.mtt_cqe_table);
hns_roce_cleanup_hem_table(hr_dev, &hr_dev->mr_table.mtt_table);
}
struct roce_hem_item {
struct list_head list; /* link all hems in the same bt level */
struct list_head sibling; /* link all hems in last hop for mtt */
void *addr;
dma_addr_t dma_addr;
size_t count; /* max ba numbers */
int start; /* start buf offset in this hem */
int end; /* end buf offset in this hem */
};
static struct roce_hem_item *hem_list_alloc_item(struct hns_roce_dev *hr_dev,
int start, int end,
int count, bool exist_bt,
int bt_level)
{
struct roce_hem_item *hem;
hem = kzalloc(sizeof(*hem), GFP_KERNEL);
if (!hem)
return NULL;
if (exist_bt) {
hem->addr = dma_alloc_coherent(hr_dev->dev,
count * BA_BYTE_LEN,
&hem->dma_addr, GFP_KERNEL);
if (!hem->addr) {
kfree(hem);
return NULL;
}
}
hem->count = count;
hem->start = start;
hem->end = end;
INIT_LIST_HEAD(&hem->list);
INIT_LIST_HEAD(&hem->sibling);
return hem;
}
static void hem_list_free_item(struct hns_roce_dev *hr_dev,
struct roce_hem_item *hem, bool exist_bt)
{
if (exist_bt)
dma_free_coherent(hr_dev->dev, hem->count * BA_BYTE_LEN,
hem->addr, hem->dma_addr);
kfree(hem);
}
static void hem_list_free_all(struct hns_roce_dev *hr_dev,
struct list_head *head, bool exist_bt)
{
struct roce_hem_item *hem, *temp_hem;
list_for_each_entry_safe(hem, temp_hem, head, list) {
list_del(&hem->list);
hem_list_free_item(hr_dev, hem, exist_bt);
}
}
static void hem_list_link_bt(struct hns_roce_dev *hr_dev, void *base_addr,
u64 table_addr)
{
*(u64 *)(base_addr) = table_addr;
}
/* assign L0 table address to hem from root bt */
static void hem_list_assign_bt(struct hns_roce_dev *hr_dev,
struct roce_hem_item *hem, void *cpu_addr,
u64 phy_addr)
{
hem->addr = cpu_addr;
hem->dma_addr = (dma_addr_t)phy_addr;
}
static inline bool hem_list_page_is_in_range(struct roce_hem_item *hem,
int offset)
{
return (hem->start <= offset && offset <= hem->end);
}
static struct roce_hem_item *hem_list_search_item(struct list_head *ba_list,
int page_offset)
{
struct roce_hem_item *hem, *temp_hem;
struct roce_hem_item *found = NULL;
list_for_each_entry_safe(hem, temp_hem, ba_list, list) {
if (hem_list_page_is_in_range(hem, page_offset)) {
found = hem;
break;
}
}
return found;
}
static bool hem_list_is_bottom_bt(int hopnum, int bt_level)
{
/*
* hopnum base address table levels
* 0 L0(buf)
* 1 L0 -> buf
* 2 L0 -> L1 -> buf
* 3 L0 -> L1 -> L2 -> buf
*/
return bt_level >= (hopnum ? hopnum - 1 : hopnum);
}
/**
* calc base address entries num
* @hopnum: num of multihop addressing
* @bt_level: base address table level
* @unit: ba entries per bt page
*/
static u32 hem_list_calc_ba_range(int hopnum, int bt_level, int unit)
{
u32 step;
int max;
int i;
if (hopnum <= bt_level)
return 0;
/*
* hopnum bt_level range
* 1 0 unit
* ------------
* 2 0 unit * unit
* 2 1 unit
* ------------
* 3 0 unit * unit * unit
* 3 1 unit * unit
* 3 2 unit
*/
step = 1;
max = hopnum - bt_level;
for (i = 0; i < max; i++)
step = step * unit;
return step;
}
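
Side note, not part of the patch: the comment tables above boil down to range = unit^(hopnum - bt_level), which hem_list_calc_ba_range() computes with a multiply loop. A standalone reproduction with an illustrative unit (the driver derives unit from the BT page size; 512 here is only an example for a 4 KiB page of 8-byte BAs):

#include <stdio.h>

static unsigned int calc_ba_range(int hopnum, int bt_level, int unit)
{
	unsigned int step = 1;
	int i;

	if (hopnum <= bt_level)
		return 0;
	for (i = 0; i < hopnum - bt_level; i++)
		step *= unit;
	return step;
}

int main(void)
{
	int unit = 512;	/* 4096-byte BT page / 8-byte BA, example only */
	int hop, lvl;

	for (hop = 1; hop <= 3; hop++)
		for (lvl = 0; lvl < hop; lvl++)
			printf("hopnum %d, bt_level %d -> range %u\n",
			       hop, lvl, calc_ba_range(hop, lvl, unit));
	return 0;
}
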
/**
* calc the root ba entries which could cover all regions
* @regions: buf region array
* @region_cnt: array size of @regions
* @unit: ba entries per bt page
*/
int hns_roce_hem_list_calc_root_ba(const struct hns_roce_buf_region *regions,
int region_cnt, int unit)
{
struct hns_roce_buf_region *r;
int total = 0;
int step;
int i;
for (i = 0; i < region_cnt; i++) {
r = (struct hns_roce_buf_region *)&regions[i];
if (r->hopnum > 1) {
step = hem_list_calc_ba_range(r->hopnum, 1, unit);
if (step > 0)
total += (r->count + step - 1) / step;
} else {
total += r->count;
}
}
return total;
}
static int hem_list_alloc_mid_bt(struct hns_roce_dev *hr_dev,
const struct hns_roce_buf_region *r, int unit,
int offset, struct list_head *mid_bt,
struct list_head *btm_bt)
{
struct roce_hem_item *hem_ptrs[HNS_ROCE_MAX_BT_LEVEL] = { NULL };
struct list_head temp_list[HNS_ROCE_MAX_BT_LEVEL];
struct roce_hem_item *cur, *pre;
const int hopnum = r->hopnum;
int start_aligned;
int distance;
int ret = 0;
int max_ofs;
int level;
u32 step;
int end;
if (hopnum <= 1)
return 0;
if (hopnum > HNS_ROCE_MAX_BT_LEVEL) {
dev_err(hr_dev->dev, "invalid hopnum %d!\n", hopnum);
return -EINVAL;
}
if (offset < r->offset) {
dev_err(hr_dev->dev, "invalid offset %d,min %d!\n",
offset, r->offset);
return -EINVAL;
}
distance = offset - r->offset;
max_ofs = r->offset + r->count - 1;
for (level = 0; level < hopnum; level++)
INIT_LIST_HEAD(&temp_list[level]);
/* config L1 bt to last bt and link them to corresponding parent */
for (level = 1; level < hopnum; level++) {
cur = hem_list_search_item(&mid_bt[level], offset);
if (cur) {
hem_ptrs[level] = cur;
continue;
}
step = hem_list_calc_ba_range(hopnum, level, unit);
if (step < 1) {
ret = -EINVAL;
goto err_exit;
}
start_aligned = (distance / step) * step + r->offset;
end = min_t(int, start_aligned + step - 1, max_ofs);
cur = hem_list_alloc_item(hr_dev, start_aligned, end, unit,
true, level);
if (!cur) {
ret = -ENOMEM;
goto err_exit;
}
hem_ptrs[level] = cur;
list_add(&cur->list, &temp_list[level]);
if (hem_list_is_bottom_bt(hopnum, level))
list_add(&cur->sibling, &temp_list[0]);
/* link bt to parent bt */
if (level > 1) {
pre = hem_ptrs[level - 1];
step = (cur->start - pre->start) / step * BA_BYTE_LEN;
hem_list_link_bt(hr_dev, pre->addr + step,
cur->dma_addr);
}
}
list_splice(&temp_list[0], btm_bt);
for (level = 1; level < hopnum; level++)
list_splice(&temp_list[level], &mid_bt[level]);
return 0;
err_exit:
for (level = 1; level < hopnum; level++)
hem_list_free_all(hr_dev, &temp_list[level], true);
return ret;
}
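
Side note, not part of the patch: the start_aligned computation above snaps the requested page offset down to the nearest step boundary measured from the region start, i.e. (distance / step) * step + r->offset. A couple of worked numbers (all invented) make the rounding visible:

#include <stdio.h>

int main(void)
{
	int region_offset = 100, step = 64;	/* invented values */
	int offsets[] = { 100, 150, 163, 228 };
	int i;

	for (i = 0; i < 4; i++) {
		int distance = offsets[i] - region_offset;
		int start_aligned = (distance / step) * step + region_offset;

		printf("offset %d -> start_aligned %d\n",
		       offsets[i], start_aligned);
	}
	return 0;
}
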
static int hem_list_alloc_root_bt(struct hns_roce_dev *hr_dev,
struct hns_roce_hem_list *hem_list, int unit,
const struct hns_roce_buf_region *regions,
int region_cnt)
{
struct roce_hem_item *hem, *temp_hem, *root_hem;
struct list_head temp_list[HNS_ROCE_MAX_BT_REGION];
const struct hns_roce_buf_region *r;
struct list_head temp_root;
struct list_head temp_btm;
void *cpu_base;
u64 phy_base;
int ret = 0;
int offset;
int total;
int step;
int i;
r = &regions[0];
root_hem = hem_list_search_item(&hem_list->root_bt, r->offset);
if (root_hem)
return 0;
INIT_LIST_HEAD(&temp_root);
total = r->offset;
/* point to the last region */
r = &regions[region_cnt - 1];
root_hem = hem_list_alloc_item(hr_dev, total, r->offset + r->count - 1,
unit, true, 0);
if (!root_hem)
return -ENOMEM;
list_add(&root_hem->list, &temp_root);
hem_list->root_ba = root_hem->dma_addr;
INIT_LIST_HEAD(&temp_btm);
for (i = 0; i < region_cnt; i++)
INIT_LIST_HEAD(&temp_list[i]);
total = 0;
for (i = 0; i < region_cnt && total < unit; i++) {
r = &regions[i];
if (!r->count)
continue;
/* all regions' mid[x][0] share the root_bt's trunk */
cpu_base = root_hem->addr + total * BA_BYTE_LEN;
phy_base = root_hem->dma_addr + total * BA_BYTE_LEN;
/* if hopnum is 0 or 1, cut a new fake hem from the root bt
* whose address is shared with all regions.
*/
if (hem_list_is_bottom_bt(r->hopnum, 0)) {
hem = hem_list_alloc_item(hr_dev, r->offset,
r->offset + r->count - 1,
r->count, false, 0);
if (!hem) {
ret = -ENOMEM;
goto err_exit;
}
hem_list_assign_bt(hr_dev, hem, cpu_base, phy_base);
list_add(&hem->list, &temp_list[i]);
list_add(&hem->sibling, &temp_btm);
total += r->count;
} else {
step = hem_list_calc_ba_range(r->hopnum, 1, unit);
if (step < 1) {
ret = -EINVAL;
goto err_exit;
}
/* if a mid bt exists, link L1 to L0 */
list_for_each_entry_safe(hem, temp_hem,
&hem_list->mid_bt[i][1], list) {
offset = hem->start / step * BA_BYTE_LEN;
hem_list_link_bt(hr_dev, cpu_base + offset,
hem->dma_addr);
total++;
}
}
}
list_splice(&temp_btm, &hem_list->btm_bt);
list_splice(&temp_root, &hem_list->root_bt);
for (i = 0; i < region_cnt; i++)
list_splice(&temp_list[i], &hem_list->mid_bt[i][0]);
return 0;
err_exit:
for (i = 0; i < region_cnt; i++)
hem_list_free_all(hr_dev, &temp_list[i], false);
hem_list_free_all(hr_dev, &temp_root, true);
return ret;
}
/* construct the base address table and link them by address hop config */
int hns_roce_hem_list_request(struct hns_roce_dev *hr_dev,
struct hns_roce_hem_list *hem_list,
const struct hns_roce_buf_region *regions,
int region_cnt)
{
const struct hns_roce_buf_region *r;
int ofs, end;
int ret = 0;
int unit;
int i;
if (region_cnt > HNS_ROCE_MAX_BT_REGION) {
dev_err(hr_dev->dev, "invalid region region_cnt %d!\n",
region_cnt);
return -EINVAL;
}
unit = (1 << hem_list->bt_pg_shift) / BA_BYTE_LEN;
for (i = 0; i < region_cnt; i++) {
r = &regions[i];
if (!r->count)
continue;
end = r->offset + r->count;
for (ofs = r->offset; ofs < end; ofs += unit) {
ret = hem_list_alloc_mid_bt(hr_dev, r, unit, ofs,
hem_list->mid_bt[i],
&hem_list->btm_bt);
if (ret) {
dev_err(hr_dev->dev,
"alloc hem trunk fail ret=%d!\n", ret);
goto err_alloc;
}
}
}
ret = hem_list_alloc_root_bt(hr_dev, hem_list, unit, regions,
region_cnt);
if (ret)
dev_err(hr_dev->dev, "alloc hem root fail ret=%d!\n", ret);
else
return 0;
err_alloc:
hns_roce_hem_list_release(hr_dev, hem_list);
return ret;
}
void hns_roce_hem_list_release(struct hns_roce_dev *hr_dev,
struct hns_roce_hem_list *hem_list)
{
int i, j;
for (i = 0; i < HNS_ROCE_MAX_BT_REGION; i++)
for (j = 0; j < HNS_ROCE_MAX_BT_LEVEL; j++)
hem_list_free_all(hr_dev, &hem_list->mid_bt[i][j],
j != 0);
hem_list_free_all(hr_dev, &hem_list->root_bt, true);
INIT_LIST_HEAD(&hem_list->btm_bt);
hem_list->root_ba = 0;
}
void hns_roce_hem_list_init(struct hns_roce_hem_list *hem_list,
int bt_page_order)
{
int i, j;
INIT_LIST_HEAD(&hem_list->root_bt);
INIT_LIST_HEAD(&hem_list->btm_bt);
for (i = 0; i < HNS_ROCE_MAX_BT_REGION; i++)
for (j = 0; j < HNS_ROCE_MAX_BT_LEVEL; j++)
INIT_LIST_HEAD(&hem_list->mid_bt[i][j]);
hem_list->bt_pg_shift = bt_page_order;
}
void *hns_roce_hem_list_find_mtt(struct hns_roce_dev *hr_dev,
struct hns_roce_hem_list *hem_list,
int offset, int *mtt_cnt, u64 *phy_addr)
{
struct list_head *head = &hem_list->btm_bt;
struct roce_hem_item *hem, *temp_hem;
void *cpu_base = NULL;
u64 phy_base = 0;
int nr = 0;
list_for_each_entry_safe(hem, temp_hem, head, sibling) {
if (hem_list_page_is_in_range(hem, offset)) {
nr = offset - hem->start;
cpu_base = hem->addr + nr * BA_BYTE_LEN;
phy_base = hem->dma_addr + nr * BA_BYTE_LEN;
nr = hem->end + 1 - offset;
break;
}
}
if (mtt_cnt)
*mtt_cnt = nr;
if (phy_addr)
*phy_addr = phy_base;
return cpu_base;
}

@ -34,8 +34,8 @@
#ifndef _HNS_ROCE_HEM_H
#define _HNS_ROCE_HEM_H
#define HW_SYNC_TIMEOUT_MSECS 500
#define HW_SYNC_SLEEP_TIME_INTERVAL 20
#define HW_SYNC_TIMEOUT_MSECS (25 * HW_SYNC_SLEEP_TIME_INTERVAL)
#define BT_CMD_SYNC_SHIFT 31
enum {
@ -133,6 +133,20 @@ int hns_roce_calc_hem_mhop(struct hns_roce_dev *hr_dev,
struct hns_roce_hem_mhop *mhop);
bool hns_roce_check_whether_mhop(struct hns_roce_dev *hr_dev, u32 type);
void hns_roce_hem_list_init(struct hns_roce_hem_list *hem_list,
int bt_page_order);
int hns_roce_hem_list_calc_root_ba(const struct hns_roce_buf_region *regions,
int region_cnt, int unit);
int hns_roce_hem_list_request(struct hns_roce_dev *hr_dev,
struct hns_roce_hem_list *hem_list,
const struct hns_roce_buf_region *regions,
int region_cnt);
void hns_roce_hem_list_release(struct hns_roce_dev *hr_dev,
struct hns_roce_hem_list *hem_list);
void *hns_roce_hem_list_find_mtt(struct hns_roce_dev *hr_dev,
struct hns_roce_hem_list *hem_list,
int offset, int *mtt_cnt, u64 *phy_addr);
static inline void hns_roce_hem_first(struct hns_roce_hem *hem,
struct hns_roce_hem_iter *iter)
{

@ -717,7 +717,7 @@ static int hns_roce_v1_rsv_lp_qp(struct hns_roce_dev *hr_dev)
union ib_gid dgid;
u64 subnet_prefix;
int attr_mask = 0;
int ret = -ENOMEM;
int ret;
int i, j;
u8 queue_en[HNS_ROCE_V1_RESV_QP] = { 0 };
u8 phy_port;
@ -730,10 +730,16 @@ static int hns_roce_v1_rsv_lp_qp(struct hns_roce_dev *hr_dev)
/* Reserved cq for loop qp */
cq_init_attr.cqe = HNS_ROCE_MIN_WQE_NUM * 2;
cq_init_attr.comp_vector = 0;
cq = hns_roce_ib_create_cq(&hr_dev->ib_dev, &cq_init_attr, NULL);
if (IS_ERR(cq)) {
dev_err(dev, "Create cq for reserved loop qp failed!");
ibdev = &hr_dev->ib_dev;
cq = rdma_zalloc_drv_obj(ibdev, ib_cq);
if (!cq)
return -ENOMEM;
ret = hns_roce_ib_create_cq(cq, &cq_init_attr, NULL);
if (ret) {
dev_err(dev, "Create cq for reserved loop qp failed!");
goto alloc_cq_failed;
}
free_mr->mr_free_cq = to_hr_cq(cq);
free_mr->mr_free_cq->ib_cq.device = &hr_dev->ib_dev;
@ -743,7 +749,6 @@ static int hns_roce_v1_rsv_lp_qp(struct hns_roce_dev *hr_dev)
free_mr->mr_free_cq->ib_cq.cq_context = NULL;
atomic_set(&free_mr->mr_free_cq->ib_cq.usecnt, 0);
ibdev = &hr_dev->ib_dev;
pd = rdma_zalloc_drv_obj(ibdev, ib_pd);
if (!pd)
goto alloc_mem_failed;
@ -818,7 +823,7 @@ static int hns_roce_v1_rsv_lp_qp(struct hns_roce_dev *hr_dev)
attr.dest_qp_num = hr_qp->qpn;
memcpy(rdma_ah_retrieve_dmac(&attr.ah_attr),
hr_dev->dev_addr[port],
MAC_ADDR_OCTET_NUM);
ETH_ALEN);
memcpy(&dgid.raw, &subnet_prefix, sizeof(u64));
memcpy(&dgid.raw[8], hr_dev->dev_addr[port], 3);
@ -865,9 +870,9 @@ alloc_pd_failed:
kfree(pd);
alloc_mem_failed:
if (hns_roce_ib_destroy_cq(cq, NULL))
dev_err(dev, "Destroy cq for create_lp_qp failed!\n");
hns_roce_ib_destroy_cq(cq, NULL);
alloc_cq_failed:
kfree(cq);
return ret;
}
@ -894,10 +899,8 @@ static void hns_roce_v1_release_lp_qp(struct hns_roce_dev *hr_dev)
i, ret);
}
ret = hns_roce_ib_destroy_cq(&free_mr->mr_free_cq->ib_cq, NULL);
if (ret)
dev_err(dev, "Destroy cq for mr_free failed(%d)!\n", ret);
hns_roce_ib_destroy_cq(&free_mr->mr_free_cq->ib_cq, NULL);
kfree(&free_mr->mr_free_cq->ib_cq);
hns_roce_dealloc_pd(&free_mr->mr_free_pd->ibpd, NULL);
kfree(&free_mr->mr_free_pd->ibpd);
}
@ -966,8 +969,7 @@ static int hns_roce_v1_recreate_lp_qp(struct hns_roce_dev *hr_dev)
struct hns_roce_free_mr *free_mr;
struct hns_roce_v1_priv *priv;
struct completion comp;
unsigned long end =
msecs_to_jiffies(HNS_ROCE_V1_RECREATE_LP_QP_TIMEOUT_MSECS) + jiffies;
unsigned long end = HNS_ROCE_V1_RECREATE_LP_QP_TIMEOUT_MSECS;
priv = (struct hns_roce_v1_priv *)hr_dev->priv;
free_mr = &priv->free_mr;
@ -987,10 +989,11 @@ static int hns_roce_v1_recreate_lp_qp(struct hns_roce_dev *hr_dev)
queue_work(free_mr->free_mr_wq, &(lp_qp_work->work));
while (time_before_eq(jiffies, end)) {
while (end) {
if (try_wait_for_completion(&comp))
return 0;
msleep(HNS_ROCE_V1_RECREATE_LP_QP_WAIT_VALUE);
end -= HNS_ROCE_V1_RECREATE_LP_QP_WAIT_VALUE;
}
lp_qp_work->comp_flag = 0;
@ -1104,8 +1107,7 @@ static int hns_roce_v1_dereg_mr(struct hns_roce_dev *hr_dev,
struct hns_roce_free_mr *free_mr;
struct hns_roce_v1_priv *priv;
struct completion comp;
unsigned long end =
msecs_to_jiffies(HNS_ROCE_V1_FREE_MR_TIMEOUT_MSECS) + jiffies;
unsigned long end = HNS_ROCE_V1_FREE_MR_TIMEOUT_MSECS;
unsigned long start = jiffies;
int npages;
int ret = 0;
@ -1135,10 +1137,11 @@ static int hns_roce_v1_dereg_mr(struct hns_roce_dev *hr_dev,
queue_work(free_mr->free_mr_wq, &(mr_work->work));
while (time_before_eq(jiffies, end)) {
while (end) {
if (try_wait_for_completion(&comp))
goto free_mr;
msleep(HNS_ROCE_V1_FREE_MR_WAIT_VALUE);
end -= HNS_ROCE_V1_FREE_MR_WAIT_VALUE;
}
mr_work->comp_flag = 0;
@ -1161,8 +1164,7 @@ free_mr:
hns_roce_bitmap_free(&hr_dev->mr_table.mtpt_bitmap,
key_to_hw_index(mr->key), 0);
if (mr->umem)
ib_umem_release(mr->umem);
ib_umem_release(mr->umem);
kfree(mr);
@ -1557,6 +1559,7 @@ static int hns_roce_v1_profile(struct hns_roce_dev *hr_dev)
caps->reserved_mrws = 1;
caps->reserved_uars = 0;
caps->reserved_cqs = 0;
caps->reserved_qps = 12; /* 2 SQP per port, six ports total 12 */
caps->chunk_sz = HNS_ROCE_V1_TABLE_CHUNK_SIZE;
for (i = 0; i < caps->num_ports; i++)
@ -1742,11 +1745,14 @@ static int hns_roce_v1_set_gid(struct hns_roce_dev *hr_dev, u8 port,
int gid_index, const union ib_gid *gid,
const struct ib_gid_attr *attr)
{
unsigned long flags;
u32 *p = NULL;
u8 gid_idx = 0;
gid_idx = hns_get_gid_index(hr_dev, port, gid_index);
spin_lock_irqsave(&hr_dev->iboe.lock, flags);
p = (u32 *)&gid->raw[0];
roce_raw_write(*p, hr_dev->reg_base + ROCEE_PORT_GID_L_0_REG +
(HNS_ROCE_V1_GID_NUM * gid_idx));
@ -1763,6 +1769,8 @@ static int hns_roce_v1_set_gid(struct hns_roce_dev *hr_dev, u8 port,
roce_raw_write(*p, hr_dev->reg_base + ROCEE_PORT_GID_H_0_REG +
(HNS_ROCE_V1_GID_NUM * gid_idx));
spin_unlock_irqrestore(&hr_dev->iboe.lock, flags);
return 0;
}
@ -2458,10 +2466,10 @@ static int hns_roce_v1_clear_hem(struct hns_roce_dev *hr_dev,
bt_cmd = hr_dev->reg_base + ROCEE_BT_CMD_H_REG;
end = msecs_to_jiffies(HW_SYNC_TIMEOUT_MSECS) + jiffies;
end = HW_SYNC_TIMEOUT_MSECS;
while (1) {
if (readl(bt_cmd) >> BT_CMD_SYNC_SHIFT) {
if (!(time_before(jiffies, end))) {
if (!end) {
dev_err(dev, "Write bt_cmd err,hw_sync is not zero.\n");
spin_unlock_irqrestore(&hr_dev->bt_cmd_lock,
flags);
@ -2470,7 +2478,8 @@ static int hns_roce_v1_clear_hem(struct hns_roce_dev *hr_dev,
} else {
break;
}
msleep(HW_SYNC_SLEEP_TIME_INTERVAL);
mdelay(HW_SYNC_SLEEP_TIME_INTERVAL);
end -= HW_SYNC_SLEEP_TIME_INTERVAL;
}
bt_cmd_val[0] = (__le32)bt_ba;
@ -3633,9 +3642,8 @@ int hns_roce_v1_destroy_qp(struct ib_qp *ibqp, struct ib_udata *udata)
hns_roce_mtt_cleanup(hr_dev, &hr_qp->mtt);
if (udata)
ib_umem_release(hr_qp->umem);
else {
ib_umem_release(hr_qp->umem);
if (!udata) {
kfree(hr_qp->sq.wrid);
kfree(hr_qp->rq.wrid);
@ -3649,7 +3657,7 @@ int hns_roce_v1_destroy_qp(struct ib_qp *ibqp, struct ib_udata *udata)
return 0;
}
static int hns_roce_v1_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata)
static void hns_roce_v1_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata)
{
struct hns_roce_dev *hr_dev = to_hr_dev(ibcq->device);
struct hns_roce_cq *hr_cq = to_hr_cq(ibcq);
@ -3658,7 +3666,6 @@ static int hns_roce_v1_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata)
u32 cqe_cnt_cur;
u32 cq_buf_size;
int wait_time = 0;
int ret = 0;
hns_roce_free_cq(hr_dev, hr_cq);
@ -3680,7 +3687,6 @@ static int hns_roce_v1_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata)
if (wait_time > HNS_ROCE_MAX_FREE_CQ_WAIT_CNT) {
dev_warn(dev, "Destroy cq 0x%lx timeout!\n",
hr_cq->cqn);
ret = -ETIMEDOUT;
break;
}
wait_time++;
@ -3688,17 +3694,12 @@ static int hns_roce_v1_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata)
hns_roce_mtt_cleanup(hr_dev, &hr_cq->hr_buf.hr_mtt);
if (ibcq->uobject)
ib_umem_release(hr_cq->umem);
else {
ib_umem_release(hr_cq->umem);
if (!udata) {
/* Free the buff of stored cq */
cq_buf_size = (ibcq->cqe + 1) * hr_dev->caps.cq_entry_sz;
hns_roce_buf_free(hr_dev, cq_buf_size, &hr_cq->hr_buf.hr_buf);
}
kfree(hr_cq);
return ret;
}
static void set_eq_cons_index_v1(struct hns_roce_eq *eq, int req_not)
@ -3902,7 +3903,8 @@ static int hns_roce_v1_aeq_int(struct hns_roce_dev *hr_dev,
*/
dma_rmb();
dev_dbg(dev, "aeqe = %p, aeqe->asyn.event_type = 0x%lx\n", aeqe,
dev_dbg(dev, "aeqe = %pK, aeqe->asyn.event_type = 0x%lx\n",
aeqe,
roce_get_field(aeqe->asyn,
HNS_ROCE_AEQE_U32_4_EVENT_TYPE_M,
HNS_ROCE_AEQE_U32_4_EVENT_TYPE_S));
@ -4265,7 +4267,6 @@ static int hns_roce_v1_create_eq(struct hns_roce_dev *hr_dev,
}
eq->buf_list[i].map = tmp_dma_addr;
memset(eq->buf_list[i].buf, 0, HNS_ROCE_BA_SIZE);
}
eq->cons_index = 0;
roce_set_field(tmp, ROCEE_CAEP_AEQC_AEQE_SHIFT_CAEP_AEQC_STATE_M,

@ -1098,7 +1098,7 @@ static int hns_roce_cmq_send(struct hns_roce_dev *hr_dev,
if (ret == CMD_RST_PRC_SUCCESS)
return 0;
if (ret == CMD_RST_PRC_EBUSY)
return ret;
return -EBUSY;
ret = __hns_roce_cmq_send(hr_dev, desc, num);
if (ret) {
@ -1106,7 +1106,7 @@ static int hns_roce_cmq_send(struct hns_roce_dev *hr_dev,
if (retval == CMD_RST_PRC_SUCCESS)
return 0;
else if (retval == CMD_RST_PRC_EBUSY)
return retval;
return -EBUSY;
}
return ret;
@ -1130,6 +1130,45 @@ static int hns_roce_cmq_query_hw_info(struct hns_roce_dev *hr_dev)
return 0;
}
static void hns_roce_function_clear(struct hns_roce_dev *hr_dev)
{
struct hns_roce_func_clear *resp;
struct hns_roce_cmq_desc desc;
unsigned long end;
int ret;
hns_roce_cmq_setup_basic_desc(&desc, HNS_ROCE_OPC_FUNC_CLEAR, false);
resp = (struct hns_roce_func_clear *)desc.data;
ret = hns_roce_cmq_send(hr_dev, &desc, 1);
if (ret) {
dev_err(hr_dev->dev, "Func clear write failed, ret = %d.\n",
ret);
return;
}
msleep(HNS_ROCE_V2_READ_FUNC_CLEAR_FLAG_INTERVAL);
end = HNS_ROCE_V2_FUNC_CLEAR_TIMEOUT_MSECS;
while (end) {
msleep(HNS_ROCE_V2_READ_FUNC_CLEAR_FLAG_FAIL_WAIT);
end -= HNS_ROCE_V2_READ_FUNC_CLEAR_FLAG_FAIL_WAIT;
hns_roce_cmq_setup_basic_desc(&desc, HNS_ROCE_OPC_FUNC_CLEAR,
true);
ret = hns_roce_cmq_send(hr_dev, &desc, 1);
if (ret)
continue;
if (roce_get_bit(resp->func_done, FUNC_CLEAR_RST_FUN_DONE_S)) {
hr_dev->is_reset = true;
return;
}
}
dev_err(hr_dev->dev, "Func clear fail.\n");
}
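
Side note, not part of the patch: hns_roce_function_clear() above, like the other reworked wait loops in this series, uses a plain millisecond countdown — sleep a fixed interval, subtract it from the remaining budget, give up at zero. A standalone sketch of that pattern, with usleep() standing in for msleep(), invented timeout values, and a dummy poll_done():

#include <stdbool.h>
#include <unistd.h>

static bool poll_done(void)
{
	return false;	/* pretend the hardware never sets the done flag */
}

static int wait_for_done(void)
{
	long timeout_ms = 200;		/* total budget, invented */
	const long interval_ms = 20;	/* per-iteration sleep, invented */

	while (timeout_ms > 0) {
		usleep(interval_ms * 1000);
		timeout_ms -= interval_ms;
		if (poll_done())
			return 0;
	}
	return -1;			/* timed out */
}

int main(void)
{
	return wait_for_done() ? 1 : 0;
}
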
static int hns_roce_query_fw_ver(struct hns_roce_dev *hr_dev)
{
struct hns_roce_query_fw_info *resp;
@ -1574,7 +1613,10 @@ static int hns_roce_v2_profile(struct hns_roce_dev *hr_dev)
caps->mtt_ba_pg_sz = 0;
caps->mtt_buf_pg_sz = 0;
caps->mtt_hop_num = HNS_ROCE_MTT_HOP_NUM;
caps->cqe_ba_pg_sz = 0;
caps->wqe_sq_hop_num = 2;
caps->wqe_sge_hop_num = 1;
caps->wqe_rq_hop_num = 2;
caps->cqe_ba_pg_sz = 6;
caps->cqe_buf_pg_sz = 0;
caps->cqe_hop_num = HNS_ROCE_CQE_HOP_NUM;
caps->srqwqe_ba_pg_sz = 0;
@ -1774,7 +1816,6 @@ static int hns_roce_init_link_table(struct hns_roce_dev *hr_dev,
goto err_alloc_buf_failed;
link_tbl->pg_list[i].map = t;
memset(link_tbl->pg_list[i].buf, 0, buf_chk_sz);
entry[i].blk_ba0 = (t >> 12) & 0xffffffff;
roce_set_field(entry[i].blk_ba1_nxt_ptr,
@ -1891,6 +1932,9 @@ static void hns_roce_v2_exit(struct hns_roce_dev *hr_dev)
{
struct hns_roce_v2_priv *priv = hr_dev->priv;
if (hr_dev->pci_dev->revision == 0x21)
hns_roce_function_clear(hr_dev);
hns_roce_free_link_table(hr_dev, &priv->tpq);
hns_roce_free_link_table(hr_dev, &priv->tsq);
}
@ -1974,7 +2018,7 @@ static int hns_roce_v2_chk_mbox(struct hns_roce_dev *hr_dev,
unsigned long timeout)
{
struct device *dev = hr_dev->dev;
unsigned long end = 0;
unsigned long end;
u32 status;
end = msecs_to_jiffies(timeout) + jiffies;
@ -2340,15 +2384,10 @@ static void *get_srq_wqe(struct hns_roce_srq *srq, int n)
static void hns_roce_free_srq_wqe(struct hns_roce_srq *srq, int wqe_index)
{
u32 bitmap_num;
int bit_num;
/* always called with interrupts disabled. */
spin_lock(&srq->lock);
bitmap_num = wqe_index / (sizeof(u64) * 8);
bit_num = wqe_index % (sizeof(u64) * 8);
srq->idx_que.bitmap[bitmap_num] |= (1ULL << bit_num);
bitmap_clear(srq->idx_que.bitmap, wqe_index, 1);
srq->tail++;
spin_unlock(&srq->lock);
@ -2977,7 +3016,7 @@ static int hns_roce_v2_clear_hem(struct hns_roce_dev *hr_dev,
{
struct device *dev = hr_dev->dev;
struct hns_roce_cmd_mailbox *mailbox;
int ret = 0;
int ret;
u16 op = 0xff;
if (!hns_roce_check_whether_mhop(hr_dev, table->type))
@ -3026,7 +3065,6 @@ static int hns_roce_v2_clear_hem(struct hns_roce_dev *hr_dev,
}
static int hns_roce_v2_qp_modify(struct hns_roce_dev *hr_dev,
struct hns_roce_mtt *mtt,
enum ib_qp_state cur_state,
enum ib_qp_state new_state,
struct hns_roce_v2_qp_context *context,
@ -3426,7 +3464,9 @@ static void modify_qp_init_to_init(struct ib_qp *ibqp,
else
roce_set_field(context->byte_4_sqpn_tst,
V2_QPC_BYTE_4_SGE_SHIFT_M,
V2_QPC_BYTE_4_SGE_SHIFT_S, hr_qp->sq.max_gs > 2 ?
V2_QPC_BYTE_4_SGE_SHIFT_S,
hr_qp->sq.max_gs >
HNS_ROCE_V2_UC_RC_SGE_NUM_IN_WQE ?
ilog2((unsigned int)hr_qp->sge.sge_cnt) : 0);
roce_set_field(qpc_mask->byte_4_sqpn_tst, V2_QPC_BYTE_4_SGE_SHIFT_M,
@ -3520,6 +3560,31 @@ static void modify_qp_init_to_init(struct ib_qp *ibqp,
}
}
static bool check_wqe_rq_mtt_count(struct hns_roce_dev *hr_dev,
struct hns_roce_qp *hr_qp, int mtt_cnt,
u32 page_size)
{
struct device *dev = hr_dev->dev;
if (hr_qp->rq.wqe_cnt < 1)
return true;
if (mtt_cnt < 1) {
dev_err(dev, "qp(0x%lx) rqwqe buf ba find failed\n",
hr_qp->qpn);
return false;
}
if (mtt_cnt < MTT_MIN_COUNT &&
(hr_qp->rq.offset + page_size) < hr_qp->buff_size) {
dev_err(dev, "qp(0x%lx) next rqwqe buf ba find failed\n",
hr_qp->qpn);
return false;
}
return true;
}
static int modify_qp_init_to_rtr(struct ib_qp *ibqp,
const struct ib_qp_attr *attr, int attr_mask,
struct hns_roce_v2_qp_context *context,
@ -3529,25 +3594,27 @@ static int modify_qp_init_to_rtr(struct ib_qp *ibqp,
struct hns_roce_dev *hr_dev = to_hr_dev(ibqp->device);
struct hns_roce_qp *hr_qp = to_hr_qp(ibqp);
struct device *dev = hr_dev->dev;
u64 mtts[MTT_MIN_COUNT] = { 0 };
dma_addr_t dma_handle_3;
dma_addr_t dma_handle_2;
dma_addr_t dma_handle;
u64 wqe_sge_ba;
u32 page_size;
u8 port_num;
u64 *mtts_3;
u64 *mtts_2;
u64 *mtts;
int count;
u8 *dmac;
u8 *smac;
int port;
/* Search qp buf's mtts */
mtts = hns_roce_table_find(hr_dev, &hr_dev->mr_table.mtt_table,
hr_qp->mtt.first_seg, &dma_handle);
if (!mtts) {
dev_err(dev, "qp buf pa find failed\n");
return -EINVAL;
}
page_size = 1 << (hr_dev->caps.mtt_buf_pg_sz + PAGE_SHIFT);
count = hns_roce_mtr_find(hr_dev, &hr_qp->mtr,
hr_qp->rq.offset / page_size, mtts,
MTT_MIN_COUNT, &wqe_sge_ba);
if (!ibqp->srq)
if (!check_wqe_rq_mtt_count(hr_dev, hr_qp, count, page_size))
return -EINVAL;
/* Search IRRL's mtts */
mtts_2 = hns_roce_table_find(hr_dev, &hr_dev->qp_table.irrl_table,
@ -3571,7 +3638,7 @@ static int modify_qp_init_to_rtr(struct ib_qp *ibqp,
}
dmac = (u8 *)attr->ah_attr.roce.dmac;
context->wqe_sge_ba = (u32)(dma_handle >> 3);
context->wqe_sge_ba = (u32)(wqe_sge_ba >> 3);
qpc_mask->wqe_sge_ba = 0;
/*
@ -3581,22 +3648,23 @@ static int modify_qp_init_to_rtr(struct ib_qp *ibqp,
* 0 at the same time, else set them to 0x1.
*/
roce_set_field(context->byte_12_sq_hop, V2_QPC_BYTE_12_WQE_SGE_BA_M,
V2_QPC_BYTE_12_WQE_SGE_BA_S, dma_handle >> (32 + 3));
V2_QPC_BYTE_12_WQE_SGE_BA_S, wqe_sge_ba >> (32 + 3));
roce_set_field(qpc_mask->byte_12_sq_hop, V2_QPC_BYTE_12_WQE_SGE_BA_M,
V2_QPC_BYTE_12_WQE_SGE_BA_S, 0);
roce_set_field(context->byte_12_sq_hop, V2_QPC_BYTE_12_SQ_HOP_NUM_M,
V2_QPC_BYTE_12_SQ_HOP_NUM_S,
hr_dev->caps.mtt_hop_num == HNS_ROCE_HOP_NUM_0 ?
0 : hr_dev->caps.mtt_hop_num);
hr_dev->caps.wqe_sq_hop_num == HNS_ROCE_HOP_NUM_0 ?
0 : hr_dev->caps.wqe_sq_hop_num);
roce_set_field(qpc_mask->byte_12_sq_hop, V2_QPC_BYTE_12_SQ_HOP_NUM_M,
V2_QPC_BYTE_12_SQ_HOP_NUM_S, 0);
roce_set_field(context->byte_20_smac_sgid_idx,
V2_QPC_BYTE_20_SGE_HOP_NUM_M,
V2_QPC_BYTE_20_SGE_HOP_NUM_S,
((ibqp->qp_type == IB_QPT_GSI) || hr_qp->sq.max_gs > 2) ?
hr_dev->caps.mtt_hop_num : 0);
((ibqp->qp_type == IB_QPT_GSI) ||
hr_qp->sq.max_gs > HNS_ROCE_V2_UC_RC_SGE_NUM_IN_WQE) ?
hr_dev->caps.wqe_sge_hop_num : 0);
roce_set_field(qpc_mask->byte_20_smac_sgid_idx,
V2_QPC_BYTE_20_SGE_HOP_NUM_M,
V2_QPC_BYTE_20_SGE_HOP_NUM_S, 0);
@ -3604,8 +3672,8 @@ static int modify_qp_init_to_rtr(struct ib_qp *ibqp,
roce_set_field(context->byte_20_smac_sgid_idx,
V2_QPC_BYTE_20_RQ_HOP_NUM_M,
V2_QPC_BYTE_20_RQ_HOP_NUM_S,
hr_dev->caps.mtt_hop_num == HNS_ROCE_HOP_NUM_0 ?
0 : hr_dev->caps.mtt_hop_num);
hr_dev->caps.wqe_rq_hop_num == HNS_ROCE_HOP_NUM_0 ?
0 : hr_dev->caps.wqe_rq_hop_num);
roce_set_field(qpc_mask->byte_20_smac_sgid_idx,
V2_QPC_BYTE_20_RQ_HOP_NUM_M,
V2_QPC_BYTE_20_RQ_HOP_NUM_S, 0);
@ -3613,7 +3681,7 @@ static int modify_qp_init_to_rtr(struct ib_qp *ibqp,
roce_set_field(context->byte_16_buf_ba_pg_sz,
V2_QPC_BYTE_16_WQE_SGE_BA_PG_SZ_M,
V2_QPC_BYTE_16_WQE_SGE_BA_PG_SZ_S,
hr_dev->caps.mtt_ba_pg_sz + PG_SHIFT_OFFSET);
hr_qp->wqe_bt_pg_shift + PG_SHIFT_OFFSET);
roce_set_field(qpc_mask->byte_16_buf_ba_pg_sz,
V2_QPC_BYTE_16_WQE_SGE_BA_PG_SZ_M,
V2_QPC_BYTE_16_WQE_SGE_BA_PG_SZ_S, 0);
@ -3626,29 +3694,24 @@ static int modify_qp_init_to_rtr(struct ib_qp *ibqp,
V2_QPC_BYTE_16_WQE_SGE_BUF_PG_SZ_M,
V2_QPC_BYTE_16_WQE_SGE_BUF_PG_SZ_S, 0);
page_size = 1 << (hr_dev->caps.mtt_buf_pg_sz + PAGE_SHIFT);
context->rq_cur_blk_addr = (u32)(mtts[hr_qp->rq.offset / page_size]
>> PAGE_ADDR_SHIFT);
context->rq_cur_blk_addr = (u32)(mtts[0] >> PAGE_ADDR_SHIFT);
qpc_mask->rq_cur_blk_addr = 0;
roce_set_field(context->byte_92_srq_info,
V2_QPC_BYTE_92_RQ_CUR_BLK_ADDR_M,
V2_QPC_BYTE_92_RQ_CUR_BLK_ADDR_S,
mtts[hr_qp->rq.offset / page_size]
>> (32 + PAGE_ADDR_SHIFT));
mtts[0] >> (32 + PAGE_ADDR_SHIFT));
roce_set_field(qpc_mask->byte_92_srq_info,
V2_QPC_BYTE_92_RQ_CUR_BLK_ADDR_M,
V2_QPC_BYTE_92_RQ_CUR_BLK_ADDR_S, 0);
context->rq_nxt_blk_addr = (u32)(mtts[hr_qp->rq.offset / page_size + 1]
>> PAGE_ADDR_SHIFT);
context->rq_nxt_blk_addr = (u32)(mtts[1] >> PAGE_ADDR_SHIFT);
qpc_mask->rq_nxt_blk_addr = 0;
roce_set_field(context->byte_104_rq_sge,
V2_QPC_BYTE_104_RQ_NXT_BLK_ADDR_M,
V2_QPC_BYTE_104_RQ_NXT_BLK_ADDR_S,
mtts[hr_qp->rq.offset / page_size + 1]
>> (32 + PAGE_ADDR_SHIFT));
mtts[1] >> (32 + PAGE_ADDR_SHIFT));
roce_set_field(qpc_mask->byte_104_rq_sge,
V2_QPC_BYTE_104_RQ_NXT_BLK_ADDR_M,
V2_QPC_BYTE_104_RQ_NXT_BLK_ADDR_S, 0);
@ -3708,13 +3771,14 @@ static int modify_qp_init_to_rtr(struct ib_qp *ibqp,
roce_set_field(qpc_mask->byte_20_smac_sgid_idx,
V2_QPC_BYTE_20_SGID_IDX_M,
V2_QPC_BYTE_20_SGID_IDX_S, 0);
memcpy(&(context->dmac), dmac, 4);
memcpy(&(context->dmac), dmac, sizeof(u32));
roce_set_field(context->byte_52_udpspn_dmac, V2_QPC_BYTE_52_DMAC_M,
V2_QPC_BYTE_52_DMAC_S, *((u16 *)(&dmac[4])));
qpc_mask->dmac = 0;
roce_set_field(qpc_mask->byte_52_udpspn_dmac, V2_QPC_BYTE_52_DMAC_M,
V2_QPC_BYTE_52_DMAC_S, 0);
/* mtu * (2^LP_PKTN_INI) should not be bigger than the max message length of 64 KB */
roce_set_field(context->byte_56_dqpn_err, V2_QPC_BYTE_56_LP_PKTN_INI_M,
V2_QPC_BYTE_56_LP_PKTN_INI_S, 4);
roce_set_field(qpc_mask->byte_56_dqpn_err, V2_QPC_BYTE_56_LP_PKTN_INI_M,
@ -3756,6 +3820,7 @@ static int modify_qp_init_to_rtr(struct ib_qp *ibqp,
roce_set_field(qpc_mask->byte_132_trrl, V2_QPC_BYTE_132_TRRL_TAIL_MAX_M,
V2_QPC_BYTE_132_TRRL_TAIL_MAX_S, 0);
/* rocee sends 2^lp_sgen_ini segs each time */
roce_set_field(context->byte_168_irrl_idx,
V2_QPC_BYTE_168_LP_SGEN_INI_M,
V2_QPC_BYTE_168_LP_SGEN_INI_S, 3);
@ -3774,18 +3839,30 @@ static int modify_qp_rtr_to_rts(struct ib_qp *ibqp,
struct hns_roce_dev *hr_dev = to_hr_dev(ibqp->device);
struct hns_roce_qp *hr_qp = to_hr_qp(ibqp);
struct device *dev = hr_dev->dev;
dma_addr_t dma_handle;
u64 sge_cur_blk = 0;
u64 sq_cur_blk = 0;
u32 page_size;
u64 *mtts;
int count;
/* Search qp buf's mtts */
mtts = hns_roce_table_find(hr_dev, &hr_dev->mr_table.mtt_table,
hr_qp->mtt.first_seg, &dma_handle);
if (!mtts) {
dev_err(dev, "qp buf pa find failed\n");
count = hns_roce_mtr_find(hr_dev, &hr_qp->mtr, 0, &sq_cur_blk, 1, NULL);
if (count < 1) {
dev_err(dev, "qp(0x%lx) buf pa find failed\n", hr_qp->qpn);
return -EINVAL;
}
if (hr_qp->sge.offset) {
page_size = 1 << (hr_dev->caps.mtt_buf_pg_sz + PAGE_SHIFT);
count = hns_roce_mtr_find(hr_dev, &hr_qp->mtr,
hr_qp->sge.offset / page_size,
&sge_cur_blk, 1, NULL);
if (count < 1) {
dev_err(dev, "qp(0x%lx) sge pa find failed\n",
hr_qp->qpn);
return -EINVAL;
}
}
/* Not support alternate path and path migration */
if ((attr_mask & IB_QP_ALT_PATH) ||
(attr_mask & IB_QP_PATH_MIG_STATE)) {
@ -3799,37 +3876,37 @@ static int modify_qp_rtr_to_rts(struct ib_qp *ibqp,
* we should set all bits of the relevant fields in context mask to
* 0 at the same time, else set them to 0x1.
*/
context->sq_cur_blk_addr = (u32)(mtts[0] >> PAGE_ADDR_SHIFT);
context->sq_cur_blk_addr = (u32)(sq_cur_blk >> PAGE_ADDR_SHIFT);
roce_set_field(context->byte_168_irrl_idx,
V2_QPC_BYTE_168_SQ_CUR_BLK_ADDR_M,
V2_QPC_BYTE_168_SQ_CUR_BLK_ADDR_S,
mtts[0] >> (32 + PAGE_ADDR_SHIFT));
sq_cur_blk >> (32 + PAGE_ADDR_SHIFT));
qpc_mask->sq_cur_blk_addr = 0;
roce_set_field(qpc_mask->byte_168_irrl_idx,
V2_QPC_BYTE_168_SQ_CUR_BLK_ADDR_M,
V2_QPC_BYTE_168_SQ_CUR_BLK_ADDR_S, 0);
page_size = 1 << (hr_dev->caps.mtt_buf_pg_sz + PAGE_SHIFT);
context->sq_cur_sge_blk_addr =
((ibqp->qp_type == IB_QPT_GSI) || hr_qp->sq.max_gs > 2) ?
((u32)(mtts[hr_qp->sge.offset / page_size]
>> PAGE_ADDR_SHIFT)) : 0;
context->sq_cur_sge_blk_addr = ((ibqp->qp_type == IB_QPT_GSI) ||
hr_qp->sq.max_gs > HNS_ROCE_V2_UC_RC_SGE_NUM_IN_WQE) ?
((u32)(sge_cur_blk >>
PAGE_ADDR_SHIFT)) : 0;
roce_set_field(context->byte_184_irrl_idx,
V2_QPC_BYTE_184_SQ_CUR_SGE_BLK_ADDR_M,
V2_QPC_BYTE_184_SQ_CUR_SGE_BLK_ADDR_S,
((ibqp->qp_type == IB_QPT_GSI) || hr_qp->sq.max_gs > 2) ?
(mtts[hr_qp->sge.offset / page_size] >>
((ibqp->qp_type == IB_QPT_GSI) || hr_qp->sq.max_gs >
HNS_ROCE_V2_UC_RC_SGE_NUM_IN_WQE) ?
(sge_cur_blk >>
(32 + PAGE_ADDR_SHIFT)) : 0);
qpc_mask->sq_cur_sge_blk_addr = 0;
roce_set_field(qpc_mask->byte_184_irrl_idx,
V2_QPC_BYTE_184_SQ_CUR_SGE_BLK_ADDR_M,
V2_QPC_BYTE_184_SQ_CUR_SGE_BLK_ADDR_S, 0);
context->rx_sq_cur_blk_addr = (u32)(mtts[0] >> PAGE_ADDR_SHIFT);
context->rx_sq_cur_blk_addr = (u32)(sq_cur_blk >> PAGE_ADDR_SHIFT);
roce_set_field(context->byte_232_irrl_sge,
V2_QPC_BYTE_232_RX_SQ_CUR_BLK_ADDR_M,
V2_QPC_BYTE_232_RX_SQ_CUR_BLK_ADDR_S,
mtts[0] >> (32 + PAGE_ADDR_SHIFT));
sq_cur_blk >> (32 + PAGE_ADDR_SHIFT));
qpc_mask->rx_sq_cur_blk_addr = 0;
roce_set_field(qpc_mask->byte_232_irrl_sge,
V2_QPC_BYTE_232_RX_SQ_CUR_BLK_ADDR_M,
@ -4144,7 +4221,7 @@ static int hns_roce_v2_modify_qp(struct ib_qp *ibqp,
roce_set_field(context->byte_224_retry_msg,
V2_QPC_BYTE_224_RETRY_MSG_PSN_M,
V2_QPC_BYTE_224_RETRY_MSG_PSN_S,
attr->sq_psn >> 16);
attr->sq_psn >> V2_QPC_BYTE_220_RETRY_MSG_PSN_S);
roce_set_field(qpc_mask->byte_224_retry_msg,
V2_QPC_BYTE_224_RETRY_MSG_PSN_M,
V2_QPC_BYTE_224_RETRY_MSG_PSN_S, 0);
@ -4230,7 +4307,7 @@ static int hns_roce_v2_modify_qp(struct ib_qp *ibqp,
V2_QPC_BYTE_60_QP_ST_S, 0);
/* SW pass context to HW */
ret = hns_roce_v2_qp_modify(hr_dev, &hr_qp->mtt, cur_state, new_state,
ret = hns_roce_v2_qp_modify(hr_dev, cur_state, new_state,
context, hr_qp);
if (ret) {
dev_err(dev, "hns_roce_qp_modify failed(%d)\n", ret);
@ -4374,11 +4451,12 @@ static int hns_roce_v2_query_qp(struct ib_qp *ibqp, struct ib_qp_attr *qp_attr,
V2_QPC_BYTE_56_DQPN_M,
V2_QPC_BYTE_56_DQPN_S);
qp_attr->qp_access_flags = ((roce_get_bit(context->byte_76_srqn_op_en,
V2_QPC_BYTE_76_RRE_S)) << 2) |
((roce_get_bit(context->byte_76_srqn_op_en,
V2_QPC_BYTE_76_RWE_S)) << 1) |
((roce_get_bit(context->byte_76_srqn_op_en,
V2_QPC_BYTE_76_ATE_S)) << 3);
V2_QPC_BYTE_76_RRE_S)) << V2_QP_RWE_S) |
((roce_get_bit(context->byte_76_srqn_op_en,
V2_QPC_BYTE_76_RWE_S)) << V2_QP_RRE_S) |
((roce_get_bit(context->byte_76_srqn_op_en,
V2_QPC_BYTE_76_ATE_S)) << V2_QP_ATE_S);
if (hr_qp->ibqp.qp_type == IB_QPT_RC ||
hr_qp->ibqp.qp_type == IB_QPT_UC) {
struct ib_global_route *grh =
@ -4487,7 +4565,7 @@ static int hns_roce_v2_destroy_qp_common(struct hns_roce_dev *hr_dev,
(hr_qp->ibqp.qp_type == IB_QPT_UD))
hns_roce_release_range_qp(hr_dev, hr_qp->qpn, 1);
hns_roce_mtt_cleanup(hr_dev, &hr_qp->mtt);
hns_roce_mtr_cleanup(hr_dev, &hr_qp->mtr);
if (udata) {
struct hns_roce_ucontext *context =
@ -4501,7 +4579,6 @@ static int hns_roce_v2_destroy_qp_common(struct hns_roce_dev *hr_dev,
if (hr_qp->rq.wqe_cnt && (hr_qp->rdb_en == 1))
hns_roce_db_unmap_user(context, &hr_qp->rdb);
ib_umem_release(hr_qp->umem);
} else {
kfree(hr_qp->sq.wrid);
kfree(hr_qp->rq.wrid);
@ -4509,6 +4586,7 @@ static int hns_roce_v2_destroy_qp_common(struct hns_roce_dev *hr_dev,
if (hr_qp->rq.wqe_cnt)
hns_roce_free_db(hr_dev, &hr_qp->rdb);
}
ib_umem_release(hr_qp->umem);
if ((hr_dev->caps.flags & HNS_ROCE_CAP_FLAG_RQ_INLINE) &&
hr_qp->rq.wqe_cnt) {
@ -4682,7 +4760,6 @@ static void hns_roce_irq_work_handle(struct work_struct *work)
dev_warn(dev, "Path migration failed.\n");
break;
case HNS_ROCE_EVENT_TYPE_COMM_EST:
dev_info(dev, "Communication established.\n");
break;
case HNS_ROCE_EVENT_TYPE_SQ_DRAINED:
dev_warn(dev, "Send queue drained.\n");
@ -5151,8 +5228,8 @@ static void hns_roce_mhop_free_eq(struct hns_roce_dev *hr_dev,
dma_free_coherent(dev, bt_chk_sz, eq->bt_l1[i],
eq->l1_dma[i]);
for (j = 0; j < bt_chk_sz / 8; j++) {
idx = i * (bt_chk_sz / 8) + j;
for (j = 0; j < bt_chk_sz / BA_BYTE_LEN; j++) {
idx = i * (bt_chk_sz / BA_BYTE_LEN) + j;
if ((i == eq->l0_last_num - 1)
&& j == eq->l1_last_num - 1) {
eqe_alloc = (buf_chk_sz / eq->eqe_size)
@ -5368,9 +5445,9 @@ static int hns_roce_mhop_alloc_eq(struct hns_roce_dev *hr_dev,
buf_chk_sz = 1 << (hr_dev->caps.eqe_buf_pg_sz + PAGE_SHIFT);
bt_chk_sz = 1 << (hr_dev->caps.eqe_ba_pg_sz + PAGE_SHIFT);
ba_num = (PAGE_ALIGN(eq->entries * eq->eqe_size) + buf_chk_sz - 1)
/ buf_chk_sz;
bt_num = (ba_num + bt_chk_sz / 8 - 1) / (bt_chk_sz / 8);
ba_num = DIV_ROUND_UP(PAGE_ALIGN(eq->entries * eq->eqe_size),
buf_chk_sz);
bt_num = DIV_ROUND_UP(ba_num, bt_chk_sz / BA_BYTE_LEN);
/* hop_num = 0 */
if (mhop_num == HNS_ROCE_HOP_NUM_0) {
@ -5387,8 +5464,6 @@ static int hns_roce_mhop_alloc_eq(struct hns_roce_dev *hr_dev,
eq->cur_eqe_ba = eq->l0_dma;
eq->nxt_eqe_ba = 0;
memset(eq->bt_l0, 0, eq->entries * eq->eqe_size);
return 0;
}
@ -5415,12 +5490,12 @@ static int hns_roce_mhop_alloc_eq(struct hns_roce_dev *hr_dev,
goto err_dma_alloc_l0;
if (mhop_num == 1) {
if (ba_num > (bt_chk_sz / 8))
if (ba_num > (bt_chk_sz / BA_BYTE_LEN))
dev_err(dev, "ba_num %d is too large for 1 hop\n",
ba_num);
/* alloc buf */
for (i = 0; i < bt_chk_sz / 8; i++) {
for (i = 0; i < bt_chk_sz / BA_BYTE_LEN; i++) {
if (eq_buf_cnt + 1 < ba_num) {
size = buf_chk_sz;
} else {
@ -5444,7 +5519,7 @@ static int hns_roce_mhop_alloc_eq(struct hns_roce_dev *hr_dev,
} else if (mhop_num == 2) {
/* alloc L1 BT and buf */
for (i = 0; i < bt_chk_sz / 8; i++) {
for (i = 0; i < bt_chk_sz / BA_BYTE_LEN; i++) {
eq->bt_l1[i] = dma_alloc_coherent(dev, bt_chk_sz,
&(eq->l1_dma[i]),
GFP_KERNEL);
@ -5452,8 +5527,8 @@ static int hns_roce_mhop_alloc_eq(struct hns_roce_dev *hr_dev,
goto err_dma_alloc_l1;
*(eq->bt_l0 + i) = eq->l1_dma[i];
for (j = 0; j < bt_chk_sz / 8; j++) {
idx = i * bt_chk_sz / 8 + j;
for (j = 0; j < bt_chk_sz / BA_BYTE_LEN; j++) {
idx = i * bt_chk_sz / BA_BYTE_LEN + j;
if (eq_buf_cnt + 1 < ba_num) {
size = buf_chk_sz;
} else {
@ -5498,8 +5573,8 @@ err_dma_alloc_l1:
dma_free_coherent(dev, bt_chk_sz, eq->bt_l1[i],
eq->l1_dma[i]);
for (j = 0; j < bt_chk_sz / 8; j++) {
idx = i * bt_chk_sz / 8 + j;
for (j = 0; j < bt_chk_sz / BA_BYTE_LEN; j++) {
idx = i * bt_chk_sz / BA_BYTE_LEN + j;
dma_free_coherent(dev, buf_chk_sz, eq->buf[idx],
eq->buf_dma[idx]);
}
@ -5522,11 +5597,11 @@ err_dma_alloc_buf:
dma_free_coherent(dev, bt_chk_sz, eq->bt_l1[i],
eq->l1_dma[i]);
for (j = 0; j < bt_chk_sz / 8; j++) {
for (j = 0; j < bt_chk_sz / BA_BYTE_LEN; j++) {
if (i == record_i && j >= record_j)
break;
idx = i * bt_chk_sz / 8 + j;
idx = i * bt_chk_sz / BA_BYTE_LEN + j;
dma_free_coherent(dev, buf_chk_sz,
eq->buf[idx],
eq->buf_dma[idx]);
@ -5972,18 +6047,19 @@ out:
return ret;
}
static int find_empty_entry(struct hns_roce_idx_que *idx_que)
static int find_empty_entry(struct hns_roce_idx_que *idx_que,
unsigned long size)
{
int bit_num;
int i;
int wqe_idx;
/* bitmap[i] is set zero if all bits are allocated */
for (i = 0; idx_que->bitmap[i] == 0; ++i)
;
bit_num = ffs(idx_que->bitmap[i]);
idx_que->bitmap[i] &= ~(1ULL << (bit_num - 1));
if (unlikely(bitmap_full(idx_que->bitmap, size)))
return -ENOSPC;
return i * sizeof(u64) * 8 + (bit_num - 1);
wqe_idx = find_first_zero_bit(idx_que->bitmap, size);
bitmap_set(idx_que->bitmap, wqe_idx, 1);
return wqe_idx;
}
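
Note the polarity flip in this rework: the old idx_que bitmap marked free entries with set bits (filled with ~0UL and scanned with ffs()), while the new code uses the usual linux/bitmap.h convention where a clear bit means free. A minimal sketch of the new allocate/release pair (free_idx() is hypothetical; the driver clears the bit elsewhere on completion):

        #include <linux/bitmap.h>
        #include <linux/errno.h>

        static int alloc_idx(unsigned long *bitmap, unsigned long size)
        {
                int idx;

                if (bitmap_full(bitmap, size))            /* every slot taken */
                        return -ENOSPC;

                idx = find_first_zero_bit(bitmap, size);  /* clear bit == free slot */
                bitmap_set(bitmap, idx, 1);
                return idx;
        }

        static void free_idx(unsigned long *bitmap, int idx)
        {
                bitmap_clear(bitmap, idx, 1);
        }
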
static void fill_idx_queue(struct hns_roce_idx_que *idx_que,
@ -6029,7 +6105,13 @@ static int hns_roce_v2_post_srq_recv(struct ib_srq *ibsrq,
break;
}
wqe_idx = find_empty_entry(&srq->idx_que);
wqe_idx = find_empty_entry(&srq->idx_que, srq->max);
if (wqe_idx < 0) {
ret = -ENOMEM;
*bad_wr = wr;
break;
}
fill_idx_queue(&srq->idx_que, ind, wqe_idx);
wqe = get_srq_wqe(srq, wqe_idx);
dseg = (struct hns_roce_v2_wqe_data_seg *)wqe;
@ -6041,9 +6123,9 @@ static int hns_roce_v2_post_srq_recv(struct ib_srq *ibsrq,
}
if (i < srq->max_gs) {
dseg->len = 0;
dseg->lkey = cpu_to_le32(0x100);
dseg->addr = 0;
dseg[i].len = 0;
dseg[i].lkey = cpu_to_le32(0x100);
dseg[i].addr = 0;
}
srq->wrid[wqe_idx] = wr->wr_id;
@ -6059,7 +6141,8 @@ static int hns_roce_v2_post_srq_recv(struct ib_srq *ibsrq,
*/
wmb();
srq_db.byte_4 = HNS_ROCE_V2_SRQ_DB << 24 | srq->srqn;
srq_db.byte_4 = HNS_ROCE_V2_SRQ_DB << V2_DB_BYTE_4_CMD_S |
(srq->srqn & V2_DB_BYTE_4_TAG_M);
srq_db.parameter = srq->head;
hns_roce_write64(hr_dev, (__le32 *)&srq_db, srq->db_reg_l);
@ -6301,6 +6384,7 @@ static int hns_roce_hw_v2_reset_notify_down(struct hnae3_handle *handle)
if (!hr_dev)
return 0;
hr_dev->is_reset = true;
hr_dev->active = false;
hr_dev->dis_db = true;



@ -54,7 +54,7 @@
#define HNS_ROCE_V2_MAX_CQ_NUM 0x100000
#define HNS_ROCE_V2_MAX_CQC_TIMER_NUM 0x100
#define HNS_ROCE_V2_MAX_SRQ_NUM 0x100000
#define HNS_ROCE_V2_MAX_CQE_NUM 0x10000
#define HNS_ROCE_V2_MAX_CQE_NUM 0x400000
#define HNS_ROCE_V2_MAX_SRQWQE_NUM 0x8000
#define HNS_ROCE_V2_MAX_RQ_SGE_NUM 0x100
#define HNS_ROCE_V2_MAX_SQ_SGE_NUM 0xff
@ -241,6 +241,7 @@ enum hns_roce_opcode_type {
HNS_ROCE_OPC_POST_MB = 0x8504,
HNS_ROCE_OPC_QUERY_MB_ST = 0x8505,
HNS_ROCE_OPC_CFG_BT_ATTR = 0x8506,
HNS_ROCE_OPC_FUNC_CLEAR = 0x8508,
HNS_ROCE_OPC_CLR_SCCC = 0x8509,
HNS_ROCE_OPC_QUERY_SCCC = 0x850a,
HNS_ROCE_OPC_RESET_SCCC = 0x850b,
@ -886,6 +887,10 @@ struct hns_roce_v2_qp_context {
#define V2_QPC_BYTE_256_SQ_FLUSH_IDX_S 16
#define V2_QPC_BYTE_256_SQ_FLUSH_IDX_M GENMASK(31, 16)
#define V2_QP_RWE_S 1 /* rdma write enable */
#define V2_QP_RRE_S 2 /* rdma read enable */
#define V2_QP_ATE_S 3 /* rdma atomic enable */
struct hns_roce_v2_cqe {
__le32 byte_4;
union {
@ -1226,6 +1231,22 @@ struct hns_roce_query_fw_info {
__le32 rsv[5];
};
struct hns_roce_func_clear {
__le32 rst_funcid_en;
__le32 func_done;
__le32 rsv[4];
};
#define FUNC_CLEAR_RST_FUN_DONE_S 0
/* Each physical function manages up to 248 virtual functions
* it takes up to 100ms for each function to execute clear
* if an abnormal reset occurs, it is executed twice at most;
* so it takes up to 249 * 2 * 100ms.
*/
#define HNS_ROCE_V2_FUNC_CLEAR_TIMEOUT_MSECS (249 * 2 * 100)
#define HNS_ROCE_V2_READ_FUNC_CLEAR_FLAG_INTERVAL 40
#define HNS_ROCE_V2_READ_FUNC_CLEAR_FLAG_FAIL_WAIT 20
struct hns_roce_cfg_llm_a {
__le32 base_addr_l;
__le32 base_addr_h;


@ -57,17 +57,16 @@ int hns_get_gid_index(struct hns_roce_dev *hr_dev, u8 port, int gid_index)
{
return gid_index * hr_dev->caps.num_ports + port;
}
EXPORT_SYMBOL_GPL(hns_get_gid_index);
static int hns_roce_set_mac(struct hns_roce_dev *hr_dev, u8 port, u8 *addr)
{
u8 phy_port;
u32 i = 0;
if (!memcmp(hr_dev->dev_addr[port], addr, MAC_ADDR_OCTET_NUM))
if (!memcmp(hr_dev->dev_addr[port], addr, ETH_ALEN))
return 0;
for (i = 0; i < MAC_ADDR_OCTET_NUM; i++)
for (i = 0; i < ETH_ALEN; i++)
hr_dev->dev_addr[port][i] = addr[i];
phy_port = hr_dev->iboe.phy_port[port];
@ -78,18 +77,13 @@ static int hns_roce_add_gid(const struct ib_gid_attr *attr, void **context)
{
struct hns_roce_dev *hr_dev = to_hr_dev(attr->device);
u8 port = attr->port_num - 1;
unsigned long flags;
int ret;
if (port >= hr_dev->caps.num_ports)
return -EINVAL;
spin_lock_irqsave(&hr_dev->iboe.lock, flags);
ret = hr_dev->hw->set_gid(hr_dev, port, attr->index, &attr->gid, attr);
spin_unlock_irqrestore(&hr_dev->iboe.lock, flags);
return ret;
}
@ -98,18 +92,13 @@ static int hns_roce_del_gid(const struct ib_gid_attr *attr, void **context)
struct hns_roce_dev *hr_dev = to_hr_dev(attr->device);
struct ib_gid_attr zattr = { };
u8 port = attr->port_num - 1;
unsigned long flags;
int ret;
if (port >= hr_dev->caps.num_ports)
return -EINVAL;
spin_lock_irqsave(&hr_dev->iboe.lock, flags);
ret = hr_dev->hw->set_gid(hr_dev, port, attr->index, &zgid, &zattr);
spin_unlock_irqrestore(&hr_dev->iboe.lock, flags);
return ret;
}
@ -272,7 +261,8 @@ static int hns_roce_query_port(struct ib_device *ib_dev, u8 port_num,
props->active_mtu = mtu ? min(props->max_mtu, mtu) : IB_MTU_256;
props->state = (netif_running(net_dev) && netif_carrier_ok(net_dev)) ?
IB_PORT_ACTIVE : IB_PORT_DOWN;
props->phys_state = (props->state == IB_PORT_ACTIVE) ? 5 : 3;
props->phys_state = (props->state == IB_PORT_ACTIVE) ?
HNS_ROCE_PHY_LINKUP : HNS_ROCE_PHY_DISABLED;
spin_unlock_irqrestore(&hr_dev->iboe.lock, flags);
@ -319,7 +309,7 @@ static int hns_roce_modify_port(struct ib_device *ib_dev, u8 port_num, int mask,
static int hns_roce_alloc_ucontext(struct ib_ucontext *uctx,
struct ib_udata *udata)
{
int ret = 0;
int ret;
struct hns_roce_ucontext *context = to_hr_ucontext(uctx);
struct hns_roce_ib_alloc_ucontext_resp resp = {};
struct hns_roce_dev *hr_dev = to_hr_dev(uctx->device);
@ -423,6 +413,11 @@ static void hns_roce_unregister_device(struct hns_roce_dev *hr_dev)
}
static const struct ib_device_ops hns_roce_dev_ops = {
.owner = THIS_MODULE,
.driver_id = RDMA_DRIVER_HNS,
.uverbs_abi_ver = 1,
.uverbs_no_driver_id_binding = 1,
.add_gid = hns_roce_add_gid,
.alloc_pd = hns_roce_alloc_pd,
.alloc_ucontext = hns_roce_alloc_ucontext,
@ -451,6 +446,7 @@ static const struct ib_device_ops hns_roce_dev_ops = {
.reg_user_mr = hns_roce_reg_user_mr,
INIT_RDMA_OBJ_SIZE(ib_ah, hns_roce_ah, ibah),
INIT_RDMA_OBJ_SIZE(ib_cq, hns_roce_cq, ib_cq),
INIT_RDMA_OBJ_SIZE(ib_pd, hns_roce_pd, ibpd),
INIT_RDMA_OBJ_SIZE(ib_ucontext, hns_roce_ucontext, ibucontext),
};
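
This hunk (and the matching i40iw and mlx4 ones further down) shows the 5.3 pattern of moving .owner, .driver_id and .uverbs_abi_ver out of struct ib_device and into the const struct ib_device_ops table, next to the INIT_RDMA_OBJ_SIZE() entries that let ib_core allocate the driver's container structs itself. A stripped-down sketch for a hypothetical 'foo' driver (everything prefixed foo_ is a placeholder; the ops fields and the macro are the real ones used above):

        #include <linux/module.h>
        #include <rdma/ib_verbs.h>

        struct foo_pd {
                struct ib_pd ibpd;      /* embedded core object must sit first */
                u32 pdn;
        };

        static const struct ib_device_ops foo_dev_ops = {
                .owner = THIS_MODULE,
                .driver_id = RDMA_DRIVER_HNS,   /* value taken from the hunk above */
                .uverbs_abi_ver = 1,

                /* ib_core allocates a struct foo_pd of this size before .alloc_pd runs */
                INIT_RDMA_OBJ_SIZE(ib_pd, foo_pd, ibpd),
        };
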
@ -489,14 +485,12 @@ static int hns_roce_register_device(struct hns_roce_dev *hr_dev)
ib_dev = &hr_dev->ib_dev;
ib_dev->owner = THIS_MODULE;
ib_dev->node_type = RDMA_NODE_IB_CA;
ib_dev->dev.parent = dev;
ib_dev->phys_port_cnt = hr_dev->caps.num_ports;
ib_dev->local_dma_lkey = hr_dev->caps.reserved_lkey;
ib_dev->num_comp_vectors = hr_dev->caps.num_comp_vectors;
ib_dev->uverbs_abi_ver = 1;
ib_dev->uverbs_cmd_mask =
(1ULL << IB_USER_VERBS_CMD_GET_CONTEXT) |
(1ULL << IB_USER_VERBS_CMD_QUERY_DEVICE) |
@ -545,7 +539,6 @@ static int hns_roce_register_device(struct hns_roce_dev *hr_dev)
ib_set_device_ops(ib_dev, hr_dev->hw->hns_roce_dev_srq_ops);
}
ib_dev->driver_id = RDMA_DRIVER_HNS;
ib_set_device_ops(ib_dev, hr_dev->hw->hns_roce_dev_ops);
ib_set_device_ops(ib_dev, &hns_roce_dev_ops);
for (i = 0; i < hr_dev->caps.num_ports; i++) {
@ -980,7 +973,6 @@ error_failed_cmq_init:
return ret;
}
EXPORT_SYMBOL_GPL(hns_roce_init);
void hns_roce_exit(struct hns_roce_dev *hr_dev)
{
@ -1001,7 +993,6 @@ void hns_roce_exit(struct hns_roce_dev *hr_dev)
if (hr_dev->hw->reset)
hr_dev->hw->reset(hr_dev, false);
}
EXPORT_SYMBOL_GPL(hns_roce_exit);
MODULE_LICENSE("Dual BSD/GPL");
MODULE_AUTHOR("Wei Hu <xavier.huwei@huawei.com>");


@ -47,7 +47,6 @@ unsigned long key_to_hw_index(u32 key)
{
return (key << 24) | (key >> 8);
}
EXPORT_SYMBOL_GPL(key_to_hw_index);
static int hns_roce_sw2hw_mpt(struct hns_roce_dev *hr_dev,
struct hns_roce_cmd_mailbox *mailbox,
@ -66,7 +65,6 @@ int hns_roce_hw2sw_mpt(struct hns_roce_dev *hr_dev,
mpt_index, !mailbox, HNS_ROCE_CMD_HW2SW_MPT,
HNS_ROCE_CMD_TIMEOUT_MSECS);
}
EXPORT_SYMBOL_GPL(hns_roce_hw2sw_mpt);
static int hns_roce_buddy_alloc(struct hns_roce_buddy *buddy, int order,
unsigned long *seg)
@ -293,7 +291,6 @@ void hns_roce_mtt_cleanup(struct hns_roce_dev *hr_dev, struct hns_roce_mtt *mtt)
break;
}
}
EXPORT_SYMBOL_GPL(hns_roce_mtt_cleanup);
static void hns_roce_loop_free(struct hns_roce_dev *hr_dev,
struct hns_roce_mr *mr, int err_loop_index,
@ -314,11 +311,11 @@ static void hns_roce_loop_free(struct hns_roce_dev *hr_dev,
dma_free_coherent(dev, pbl_bt_sz, mr->pbl_bt_l1[i],
mr->pbl_l1_dma_addr[i]);
for (j = 0; j < pbl_bt_sz / 8; j++) {
for (j = 0; j < pbl_bt_sz / BA_BYTE_LEN; j++) {
if (i == loop_i && j >= loop_j)
break;
bt_idx = i * pbl_bt_sz / 8 + j;
bt_idx = i * pbl_bt_sz / BA_BYTE_LEN + j;
dma_free_coherent(dev, pbl_bt_sz,
mr->pbl_bt_l2[bt_idx],
mr->pbl_l2_dma_addr[bt_idx]);
@ -329,8 +326,8 @@ static void hns_roce_loop_free(struct hns_roce_dev *hr_dev,
dma_free_coherent(dev, pbl_bt_sz, mr->pbl_bt_l1[i],
mr->pbl_l1_dma_addr[i]);
for (j = 0; j < pbl_bt_sz / 8; j++) {
bt_idx = i * pbl_bt_sz / 8 + j;
for (j = 0; j < pbl_bt_sz / BA_BYTE_LEN; j++) {
bt_idx = i * pbl_bt_sz / BA_BYTE_LEN + j;
dma_free_coherent(dev, pbl_bt_sz,
mr->pbl_bt_l2[bt_idx],
mr->pbl_l2_dma_addr[bt_idx]);
@ -533,7 +530,7 @@ static int hns_roce_mr_alloc(struct hns_roce_dev *hr_dev, u32 pd, u64 iova,
{
struct device *dev = hr_dev->dev;
unsigned long index = 0;
int ret = 0;
int ret;
/* Allocate a key for mr from mr_table */
ret = hns_roce_bitmap_alloc(&hr_dev->mr_table.mtpt_bitmap, &index);
@ -559,7 +556,8 @@ static int hns_roce_mr_alloc(struct hns_roce_dev *hr_dev, u32 pd, u64 iova,
mr->pbl_l0_dma_addr = 0;
} else {
if (!hr_dev->caps.pbl_hop_num) {
mr->pbl_buf = dma_alloc_coherent(dev, npages * 8,
mr->pbl_buf = dma_alloc_coherent(dev,
npages * BA_BYTE_LEN,
&(mr->pbl_dma_addr),
GFP_KERNEL);
if (!mr->pbl_buf)
@ -590,9 +588,8 @@ static void hns_roce_mhop_free(struct hns_roce_dev *hr_dev,
if (mhop_num == HNS_ROCE_HOP_NUM_0)
return;
/* hop_num = 1 */
if (mhop_num == 1) {
dma_free_coherent(dev, (unsigned int)(npages * 8),
dma_free_coherent(dev, (unsigned int)(npages * BA_BYTE_LEN),
mr->pbl_buf, mr->pbl_dma_addr);
return;
}
@ -603,12 +600,13 @@ static void hns_roce_mhop_free(struct hns_roce_dev *hr_dev,
if (mhop_num == 2) {
for (i = 0; i < mr->l0_chunk_last_num; i++) {
if (i == mr->l0_chunk_last_num - 1) {
npages_allocated = i * (pbl_bt_sz / 8);
npages_allocated =
i * (pbl_bt_sz / BA_BYTE_LEN);
dma_free_coherent(dev,
(npages - npages_allocated) * 8,
mr->pbl_bt_l1[i],
mr->pbl_l1_dma_addr[i]);
(npages - npages_allocated) * BA_BYTE_LEN,
mr->pbl_bt_l1[i],
mr->pbl_l1_dma_addr[i]);
break;
}
@ -621,16 +619,17 @@ static void hns_roce_mhop_free(struct hns_roce_dev *hr_dev,
dma_free_coherent(dev, pbl_bt_sz, mr->pbl_bt_l1[i],
mr->pbl_l1_dma_addr[i]);
for (j = 0; j < pbl_bt_sz / 8; j++) {
bt_idx = i * (pbl_bt_sz / 8) + j;
for (j = 0; j < pbl_bt_sz / BA_BYTE_LEN; j++) {
bt_idx = i * (pbl_bt_sz / BA_BYTE_LEN) + j;
if ((i == mr->l0_chunk_last_num - 1)
&& j == mr->l1_chunk_last_num - 1) {
npages_allocated = bt_idx *
(pbl_bt_sz / 8);
(pbl_bt_sz / BA_BYTE_LEN);
dma_free_coherent(dev,
(npages - npages_allocated) * 8,
(npages - npages_allocated) *
BA_BYTE_LEN,
mr->pbl_bt_l2[bt_idx],
mr->pbl_l2_dma_addr[bt_idx]);
@ -675,7 +674,8 @@ static void hns_roce_mr_free(struct hns_roce_dev *hr_dev,
npages = ib_umem_page_count(mr->umem);
if (!hr_dev->caps.pbl_hop_num)
dma_free_coherent(dev, (unsigned int)(npages * 8),
dma_free_coherent(dev,
(unsigned int)(npages * BA_BYTE_LEN),
mr->pbl_buf, mr->pbl_dma_addr);
else
hns_roce_mhop_free(hr_dev, mr);
@ -1059,6 +1059,7 @@ static int hns_roce_ib_umem_write_mr(struct hns_roce_dev *hr_dev,
for_each_sg_dma_page(umem->sg_head.sgl, &sg_iter, umem->nmap, 0) {
page_addr = sg_page_iter_dma_address(&sg_iter);
if (!hr_dev->caps.pbl_hop_num) {
/* for hip06, page addr is aligned to 4K */
mr->pbl_buf[i++] = page_addr >> 12;
} else if (hr_dev->caps.pbl_hop_num == 1) {
mr->pbl_buf[i++] = page_addr;
@ -1069,7 +1070,7 @@ static int hns_roce_ib_umem_write_mr(struct hns_roce_dev *hr_dev,
mr->pbl_bt_l2[i][j] = page_addr;
j++;
if (j >= (pbl_bt_sz / 8)) {
if (j >= (pbl_bt_sz / BA_BYTE_LEN)) {
i++;
j = 0;
}
@ -1117,7 +1118,8 @@ struct ib_mr *hns_roce_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
} else {
u64 pbl_size = 1;
bt_size = (1 << (hr_dev->caps.pbl_ba_pg_sz + PAGE_SHIFT)) / 8;
bt_size = (1 << (hr_dev->caps.pbl_ba_pg_sz + PAGE_SHIFT)) /
BA_BYTE_LEN;
for (i = 0; i < hr_dev->caps.pbl_hop_num; i++)
pbl_size *= bt_size;
if (n > pbl_size) {
@ -1293,9 +1295,7 @@ int hns_roce_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata)
} else {
hns_roce_mr_free(hr_dev, mr);
if (mr->umem)
ib_umem_release(mr->umem);
ib_umem_release(mr->umem);
kfree(mr);
}
@ -1491,3 +1491,119 @@ int hns_roce_dealloc_mw(struct ib_mw *ibmw)
return 0;
}
void hns_roce_mtr_init(struct hns_roce_mtr *mtr, int bt_pg_shift,
int buf_pg_shift)
{
hns_roce_hem_list_init(&mtr->hem_list, bt_pg_shift);
mtr->buf_pg_shift = buf_pg_shift;
}
void hns_roce_mtr_cleanup(struct hns_roce_dev *hr_dev,
struct hns_roce_mtr *mtr)
{
hns_roce_hem_list_release(hr_dev, &mtr->hem_list);
}
static int hns_roce_write_mtr(struct hns_roce_dev *hr_dev,
struct hns_roce_mtr *mtr, dma_addr_t *bufs,
struct hns_roce_buf_region *r)
{
int offset;
int count;
int npage;
u64 *mtts;
int end;
int i;
offset = r->offset;
end = offset + r->count;
npage = 0;
while (offset < end) {
mtts = hns_roce_hem_list_find_mtt(hr_dev, &mtr->hem_list,
offset, &count, NULL);
if (!mtts)
return -ENOBUFS;
/* Save page addr, low 12 bits : 0 */
for (i = 0; i < count; i++) {
if (hr_dev->hw_rev == HNS_ROCE_HW_VER1)
mtts[i] = cpu_to_le64(bufs[npage] >>
PAGE_ADDR_SHIFT);
else
mtts[i] = cpu_to_le64(bufs[npage]);
npage++;
}
offset += count;
}
return 0;
}
int hns_roce_mtr_attach(struct hns_roce_dev *hr_dev, struct hns_roce_mtr *mtr,
dma_addr_t **bufs, struct hns_roce_buf_region *regions,
int region_cnt)
{
struct hns_roce_buf_region *r;
int ret;
int i;
ret = hns_roce_hem_list_request(hr_dev, &mtr->hem_list, regions,
region_cnt);
if (ret)
return ret;
for (i = 0; i < region_cnt; i++) {
r = &regions[i];
ret = hns_roce_write_mtr(hr_dev, mtr, bufs[i], r);
if (ret) {
dev_err(hr_dev->dev,
"write mtr[%d/%d] err %d,offset=%d.\n",
i, region_cnt, ret, r->offset);
goto err_write;
}
}
return 0;
err_write:
hns_roce_hem_list_release(hr_dev, &mtr->hem_list);
return ret;
}
int hns_roce_mtr_find(struct hns_roce_dev *hr_dev, struct hns_roce_mtr *mtr,
int offset, u64 *mtt_buf, int mtt_max, u64 *base_addr)
{
u64 *mtts = mtt_buf;
int mtt_count;
int total = 0;
u64 *addr;
int npage;
int left;
if (mtts == NULL || mtt_max < 1)
goto done;
left = mtt_max;
while (left > 0) {
mtt_count = 0;
addr = hns_roce_hem_list_find_mtt(hr_dev, &mtr->hem_list,
offset + total,
&mtt_count, NULL);
if (!addr || !mtt_count)
goto done;
npage = min(mtt_count, left);
memcpy(&mtts[total], addr, BA_BYTE_LEN * npage);
left -= npage;
total += npage;
}
done:
if (base_addr)
*base_addr = mtr->hem_list.root_ba;
return total;
}
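
A condensed, hypothetical caller for the new mtr helpers above, pieced together from the QP code later in this diff (error handling and the real buffer setup are omitted):

        static int foo_attach_wqe_buf(struct hns_roce_dev *hr_dev,
                                      struct hns_roce_qp *hr_qp,
                                      dma_addr_t **buf_list)
        {
                u64 mtts[4];
                u64 root_ba;
                int ret;

                hns_roce_mtr_init(&hr_qp->mtr, PAGE_SHIFT + hr_qp->wqe_bt_pg_shift,
                                  PAGE_SHIFT + hr_dev->caps.mtt_buf_pg_sz);

                ret = hns_roce_mtr_attach(hr_dev, &hr_qp->mtr, buf_list,
                                          hr_qp->regions, hr_qp->region_cnt);
                if (ret)
                        return ret;

                /* later, e.g. when filling the QP context: translated pages + root BA */
                hns_roce_mtr_find(hr_dev, &hr_qp->mtr, 0, mtts, ARRAY_SIZE(mtts),
                                  &root_ba);

                /* teardown path: hns_roce_mtr_cleanup(hr_dev, &hr_qp->mtr); */
                return 0;
        }
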


@ -83,18 +83,16 @@ int hns_roce_alloc_pd(struct ib_pd *ibpd, struct ib_udata *udata)
return 0;
}
EXPORT_SYMBOL_GPL(hns_roce_alloc_pd);
void hns_roce_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata)
{
hns_roce_pd_free(to_hr_dev(pd->device), to_hr_pd(pd)->pdn);
}
EXPORT_SYMBOL_GPL(hns_roce_dealloc_pd);
int hns_roce_uar_alloc(struct hns_roce_dev *hr_dev, struct hns_roce_uar *uar)
{
struct resource *res;
int ret = 0;
int ret;
/* Using bitmap to manager UAR index */
ret = hns_roce_bitmap_alloc(&hr_dev->uar_table.bitmap, &uar->logic_idx);


@ -64,7 +64,6 @@ void hns_roce_qp_event(struct hns_roce_dev *hr_dev, u32 qpn, int event_type)
if (atomic_dec_and_test(&qp->refcount))
complete(&qp->free);
}
EXPORT_SYMBOL_GPL(hns_roce_qp_event);
static void hns_roce_ib_qp_event(struct hns_roce_qp *hr_qp,
enum hns_roce_event type)
@ -139,7 +138,6 @@ enum hns_roce_qp_state to_hns_roce_state(enum ib_qp_state state)
return HNS_ROCE_QP_NUM_STATE;
}
}
EXPORT_SYMBOL_GPL(to_hns_roce_state);
static int hns_roce_gsi_qp_alloc(struct hns_roce_dev *hr_dev, unsigned long qpn,
struct hns_roce_qp *hr_qp)
@ -242,7 +240,6 @@ void hns_roce_qp_remove(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp)
__xa_erase(xa, hr_qp->qpn & (hr_dev->caps.num_qps - 1));
xa_unlock_irqrestore(xa, flags);
}
EXPORT_SYMBOL_GPL(hns_roce_qp_remove);
void hns_roce_qp_free(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp)
{
@ -257,22 +254,19 @@ void hns_roce_qp_free(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp)
hns_roce_table_put(hr_dev, &qp_table->trrl_table,
hr_qp->qpn);
hns_roce_table_put(hr_dev, &qp_table->irrl_table, hr_qp->qpn);
hns_roce_table_put(hr_dev, &qp_table->qp_table, hr_qp->qpn);
}
}
EXPORT_SYMBOL_GPL(hns_roce_qp_free);
void hns_roce_release_range_qp(struct hns_roce_dev *hr_dev, int base_qpn,
int cnt)
{
struct hns_roce_qp_table *qp_table = &hr_dev->qp_table;
if (base_qpn < SQP_NUM)
if (base_qpn < hr_dev->caps.reserved_qps)
return;
hns_roce_bitmap_free_range(&qp_table->bitmap, base_qpn, cnt, BITMAP_RR);
}
EXPORT_SYMBOL_GPL(hns_roce_release_range_qp);
static int hns_roce_set_rq_size(struct hns_roce_dev *hr_dev,
struct ib_qp_cap *cap, bool is_user, int has_rq,
@ -392,8 +386,8 @@ static int hns_roce_set_user_sq_size(struct hns_roce_dev *hr_dev,
hr_qp->sq.wqe_shift), PAGE_SIZE);
} else {
page_size = 1 << (hr_dev->caps.mtt_buf_pg_sz + PAGE_SHIFT);
hr_qp->sge.sge_cnt =
max(page_size / (1 << hr_qp->sge.sge_shift), ex_sge_num);
hr_qp->sge.sge_cnt = ex_sge_num ?
max(page_size / (1 << hr_qp->sge.sge_shift), ex_sge_num) : 0;
hr_qp->buff_size = HNS_ROCE_ALOGN_UP((hr_qp->rq.wqe_cnt <<
hr_qp->rq.wqe_shift), page_size) +
HNS_ROCE_ALOGN_UP((hr_qp->sge.sge_cnt <<
@ -422,6 +416,91 @@ static int hns_roce_set_user_sq_size(struct hns_roce_dev *hr_dev,
return 0;
}
static int split_wqe_buf_region(struct hns_roce_dev *hr_dev,
struct hns_roce_qp *hr_qp,
struct hns_roce_buf_region *regions,
int region_max, int page_shift)
{
int page_size = 1 << page_shift;
bool is_extend_sge;
int region_cnt = 0;
int buf_size;
int buf_cnt;
if (hr_qp->buff_size < 1 || region_max < 1)
return region_cnt;
if (hr_qp->sge.sge_cnt > 0)
is_extend_sge = true;
else
is_extend_sge = false;
/* sq region */
if (is_extend_sge)
buf_size = hr_qp->sge.offset - hr_qp->sq.offset;
else
buf_size = hr_qp->rq.offset - hr_qp->sq.offset;
if (buf_size > 0 && region_cnt < region_max) {
buf_cnt = DIV_ROUND_UP(buf_size, page_size);
hns_roce_init_buf_region(&regions[region_cnt],
hr_dev->caps.wqe_sq_hop_num,
hr_qp->sq.offset / page_size,
buf_cnt);
region_cnt++;
}
/* sge region */
if (is_extend_sge) {
buf_size = hr_qp->rq.offset - hr_qp->sge.offset;
if (buf_size > 0 && region_cnt < region_max) {
buf_cnt = DIV_ROUND_UP(buf_size, page_size);
hns_roce_init_buf_region(&regions[region_cnt],
hr_dev->caps.wqe_sge_hop_num,
hr_qp->sge.offset / page_size,
buf_cnt);
region_cnt++;
}
}
/* rq region */
buf_size = hr_qp->buff_size - hr_qp->rq.offset;
if (buf_size > 0) {
buf_cnt = DIV_ROUND_UP(buf_size, page_size);
hns_roce_init_buf_region(&regions[region_cnt],
hr_dev->caps.wqe_rq_hop_num,
hr_qp->rq.offset / page_size,
buf_cnt);
region_cnt++;
}
return region_cnt;
}
static int calc_wqe_bt_page_shift(struct hns_roce_dev *hr_dev,
struct hns_roce_buf_region *regions,
int region_cnt)
{
int bt_pg_shift;
int ba_num;
int ret;
bt_pg_shift = PAGE_SHIFT + hr_dev->caps.mtt_ba_pg_sz;
/* all root ba entries must in one bt page */
do {
ba_num = (1 << bt_pg_shift) / BA_BYTE_LEN;
ret = hns_roce_hem_list_calc_root_ba(regions, region_cnt,
ba_num);
if (ret <= ba_num)
break;
bt_pg_shift++;
} while (ret > ba_num);
return bt_pg_shift - PAGE_SHIFT;
}
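
As a rough worked example of the loop above: assuming a 4 KiB PAGE_SIZE and mtt_ba_pg_sz == 0, bt_pg_shift starts at 12, so one BT page holds (1 << 12) / BA_BYTE_LEN = 512 root base-address entries; only if hns_roce_hem_list_calc_root_ba() reports more than 512 root BAs for the SQ/SGE/RQ regions is the shift bumped, and the function returns the shift relative to PAGE_SHIFT (0 in that case).
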
static int hns_roce_set_kernel_sq_size(struct hns_roce_dev *hr_dev,
struct ib_qp_cap *cap,
struct hns_roce_qp *hr_qp)
@ -534,15 +613,17 @@ static int hns_roce_create_qp_common(struct hns_roce_dev *hr_dev,
struct ib_udata *udata, unsigned long sqpn,
struct hns_roce_qp *hr_qp)
{
dma_addr_t *buf_list[ARRAY_SIZE(hr_qp->regions)] = { 0 };
struct device *dev = hr_dev->dev;
struct hns_roce_ib_create_qp ucmd;
struct hns_roce_ib_create_qp_resp resp = {};
struct hns_roce_ucontext *uctx = rdma_udata_to_drv_context(
udata, struct hns_roce_ucontext, ibucontext);
struct hns_roce_buf_region *r;
unsigned long qpn = 0;
int ret = 0;
u32 page_shift;
u32 npages;
int buf_count;
int ret;
int i;
mutex_init(&hr_qp->mutex);
@ -596,6 +677,7 @@ static int hns_roce_create_qp_common(struct hns_roce_dev *hr_dev,
init_attr->cap.max_recv_sge];
}
page_shift = PAGE_SHIFT + hr_dev->caps.mtt_buf_pg_sz;
if (udata) {
if (ib_copy_from_udata(&ucmd, udata, sizeof(ucmd))) {
dev_err(dev, "ib_copy_from_udata error for create qp\n");
@ -617,32 +699,28 @@ static int hns_roce_create_qp_common(struct hns_roce_dev *hr_dev,
ret = PTR_ERR(hr_qp->umem);
goto err_rq_sge_list;
}
hr_qp->mtt.mtt_type = MTT_TYPE_WQE;
page_shift = PAGE_SHIFT;
if (hr_dev->caps.mtt_buf_pg_sz) {
npages = (ib_umem_page_count(hr_qp->umem) +
(1 << hr_dev->caps.mtt_buf_pg_sz) - 1) /
(1 << hr_dev->caps.mtt_buf_pg_sz);
page_shift += hr_dev->caps.mtt_buf_pg_sz;
ret = hns_roce_mtt_init(hr_dev, npages,
page_shift,
&hr_qp->mtt);
} else {
ret = hns_roce_mtt_init(hr_dev,
ib_umem_page_count(hr_qp->umem),
page_shift, &hr_qp->mtt);
}
hr_qp->region_cnt = split_wqe_buf_region(hr_dev, hr_qp,
hr_qp->regions, ARRAY_SIZE(hr_qp->regions),
page_shift);
ret = hns_roce_alloc_buf_list(hr_qp->regions, buf_list,
hr_qp->region_cnt);
if (ret) {
dev_err(dev, "hns_roce_mtt_init error for create qp\n");
goto err_buf;
dev_err(dev, "alloc buf_list error for create qp\n");
goto err_alloc_list;
}
ret = hns_roce_ib_umem_write_mtt(hr_dev, &hr_qp->mtt,
hr_qp->umem);
if (ret) {
dev_err(dev, "hns_roce_ib_umem_write_mtt error for create qp\n");
goto err_mtt;
for (i = 0; i < hr_qp->region_cnt; i++) {
r = &hr_qp->regions[i];
buf_count = hns_roce_get_umem_bufs(hr_dev,
buf_list[i], r->count, r->offset,
hr_qp->umem, page_shift);
if (buf_count != r->count) {
dev_err(dev,
"get umem buf err, expect %d,ret %d.\n",
r->count, buf_count);
ret = -ENOBUFS;
goto err_get_bufs;
}
}
if ((hr_dev->caps.flags & HNS_ROCE_CAP_FLAG_SQ_RECORD_DB) &&
@ -653,7 +731,7 @@ static int hns_roce_create_qp_common(struct hns_roce_dev *hr_dev,
&hr_qp->sdb);
if (ret) {
dev_err(dev, "sq record doorbell map failed!\n");
goto err_mtt;
goto err_get_bufs;
}
/* indicate kernel supports sq record db */
@ -715,7 +793,6 @@ static int hns_roce_create_qp_common(struct hns_roce_dev *hr_dev,
}
/* Allocate QP buf */
page_shift = PAGE_SHIFT + hr_dev->caps.mtt_buf_pg_sz;
if (hns_roce_buf_alloc(hr_dev, hr_qp->buff_size,
(1 << page_shift) * 2,
&hr_qp->hr_buf, page_shift)) {
@ -723,21 +800,28 @@ static int hns_roce_create_qp_common(struct hns_roce_dev *hr_dev,
ret = -ENOMEM;
goto err_db;
}
hr_qp->mtt.mtt_type = MTT_TYPE_WQE;
/* Write MTT */
ret = hns_roce_mtt_init(hr_dev, hr_qp->hr_buf.npages,
hr_qp->hr_buf.page_shift, &hr_qp->mtt);
hr_qp->region_cnt = split_wqe_buf_region(hr_dev, hr_qp,
hr_qp->regions, ARRAY_SIZE(hr_qp->regions),
page_shift);
ret = hns_roce_alloc_buf_list(hr_qp->regions, buf_list,
hr_qp->region_cnt);
if (ret) {
dev_err(dev, "hns_roce_mtt_init error for kernel create qp\n");
goto err_buf;
dev_err(dev, "alloc buf_list error for create qp!\n");
goto err_alloc_list;
}
ret = hns_roce_buf_write_mtt(hr_dev, &hr_qp->mtt,
&hr_qp->hr_buf);
if (ret) {
dev_err(dev, "hns_roce_buf_write_mtt error for kernel create qp\n");
goto err_mtt;
for (i = 0; i < hr_qp->region_cnt; i++) {
r = &hr_qp->regions[i];
buf_count = hns_roce_get_kmem_bufs(hr_dev,
buf_list[i], r->count, r->offset,
&hr_qp->hr_buf);
if (buf_count != r->count) {
dev_err(dev,
"get kmem buf err, expect %d,ret %d.\n",
r->count, buf_count);
ret = -ENOBUFS;
goto err_get_bufs;
}
}
hr_qp->sq.wrid = kcalloc(hr_qp->sq.wqe_cnt, sizeof(u64),
@ -761,6 +845,17 @@ static int hns_roce_create_qp_common(struct hns_roce_dev *hr_dev,
}
}
hr_qp->wqe_bt_pg_shift = calc_wqe_bt_page_shift(hr_dev, hr_qp->regions,
hr_qp->region_cnt);
hns_roce_mtr_init(&hr_qp->mtr, PAGE_SHIFT + hr_qp->wqe_bt_pg_shift,
page_shift);
ret = hns_roce_mtr_attach(hr_dev, &hr_qp->mtr, buf_list,
hr_qp->regions, hr_qp->region_cnt);
if (ret) {
dev_err(dev, "mtr attach error for create qp\n");
goto err_mtr;
}
if (init_attr->qp_type == IB_QPT_GSI &&
hr_dev->hw_rev == HNS_ROCE_HW_VER1) {
/* In v1 engine, GSI QP context in RoCE engine's register */
@ -796,6 +891,7 @@ static int hns_roce_create_qp_common(struct hns_roce_dev *hr_dev,
}
hr_qp->event = hns_roce_ib_qp_event;
hns_roce_free_buf_list(buf_list, hr_qp->region_cnt);
return 0;
@ -810,6 +906,9 @@ err_qpn:
if (!sqpn)
hns_roce_release_range_qp(hr_dev, qpn, 1);
err_mtr:
hns_roce_mtr_cleanup(hr_dev, &hr_qp->mtr);
err_wrid:
if (udata) {
if ((hr_dev->caps.flags & HNS_ROCE_CAP_FLAG_RECORD_DB) &&
@ -829,14 +928,13 @@ err_sq_dbmap:
hns_roce_qp_has_sq(init_attr))
hns_roce_db_unmap_user(uctx, &hr_qp->sdb);
err_mtt:
hns_roce_mtt_cleanup(hr_dev, &hr_qp->mtt);
err_get_bufs:
hns_roce_free_buf_list(buf_list, hr_qp->region_cnt);
err_buf:
if (hr_qp->umem)
ib_umem_release(hr_qp->umem);
else
err_alloc_list:
if (!hr_qp->umem)
hns_roce_buf_free(hr_dev, hr_qp->buff_size, &hr_qp->hr_buf);
ib_umem_release(hr_qp->umem);
err_db:
if (!udata && hns_roce_qp_has_rq(init_attr) &&
@ -923,7 +1021,6 @@ struct ib_qp *hns_roce_create_qp(struct ib_pd *pd,
return &hr_qp->ibqp;
}
EXPORT_SYMBOL_GPL(hns_roce_create_qp);
int to_hr_qp_type(int qp_type)
{
@ -942,7 +1039,6 @@ int to_hr_qp_type(int qp_type)
return transport_type;
}
EXPORT_SYMBOL_GPL(to_hr_qp_type);
int hns_roce_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
int attr_mask, struct ib_udata *udata)
@ -1062,7 +1158,6 @@ void hns_roce_lock_cqs(struct hns_roce_cq *send_cq, struct hns_roce_cq *recv_cq)
spin_lock_nested(&send_cq->lock, SINGLE_DEPTH_NESTING);
}
}
EXPORT_SYMBOL_GPL(hns_roce_lock_cqs);
void hns_roce_unlock_cqs(struct hns_roce_cq *send_cq,
struct hns_roce_cq *recv_cq) __releases(&send_cq->lock)
@ -1079,7 +1174,6 @@ void hns_roce_unlock_cqs(struct hns_roce_cq *send_cq,
spin_unlock_irq(&recv_cq->lock);
}
}
EXPORT_SYMBOL_GPL(hns_roce_unlock_cqs);
static void *get_wqe(struct hns_roce_qp *hr_qp, int offset)
{
@ -1091,20 +1185,17 @@ void *get_recv_wqe(struct hns_roce_qp *hr_qp, int n)
{
return get_wqe(hr_qp, hr_qp->rq.offset + (n << hr_qp->rq.wqe_shift));
}
EXPORT_SYMBOL_GPL(get_recv_wqe);
void *get_send_wqe(struct hns_roce_qp *hr_qp, int n)
{
return get_wqe(hr_qp, hr_qp->sq.offset + (n << hr_qp->sq.wqe_shift));
}
EXPORT_SYMBOL_GPL(get_send_wqe);
void *get_send_extend_sge(struct hns_roce_qp *hr_qp, int n)
{
return hns_roce_buf_offset(&hr_qp->hr_buf, hr_qp->sge.offset +
(n << hr_qp->sge.sge_shift));
}
EXPORT_SYMBOL_GPL(get_send_extend_sge);
bool hns_roce_wq_overflow(struct hns_roce_wq *hr_wq, int nreq,
struct ib_cq *ib_cq)
@ -1123,7 +1214,6 @@ bool hns_roce_wq_overflow(struct hns_roce_wq *hr_wq, int nreq,
return cur + nreq >= hr_wq->max_post;
}
EXPORT_SYMBOL_GPL(hns_roce_wq_overflow);
int hns_roce_init_qp_table(struct hns_roce_dev *hr_dev)
{
@ -1135,11 +1225,7 @@ int hns_roce_init_qp_table(struct hns_roce_dev *hr_dev)
mutex_init(&qp_table->scc_mutex);
xa_init(&hr_dev->qp_table_xa);
/* In hw v1, a port include two SQP, six ports total 12 */
if (hr_dev->caps.max_sq_sg <= 2)
reserved_from_bot = SQP_NUM;
else
reserved_from_bot = hr_dev->caps.reserved_qps;
reserved_from_bot = hr_dev->caps.reserved_qps;
ret = hns_roce_bitmap_init(&qp_table->bitmap, hr_dev->caps.num_qps,
hr_dev->caps.num_qps - 1, reserved_from_bot,


@ -30,7 +30,6 @@ void hns_roce_srq_event(struct hns_roce_dev *hr_dev, u32 srqn, int event_type)
if (atomic_dec_and_test(&srq->refcount))
complete(&srq->free);
}
EXPORT_SYMBOL_GPL(hns_roce_srq_event);
static void hns_roce_ib_srq_event(struct hns_roce_srq *srq,
enum hns_roce_event event_type)
@ -181,28 +180,19 @@ static int hns_roce_create_idx_que(struct ib_pd *pd, struct hns_roce_srq *srq,
{
struct hns_roce_dev *hr_dev = to_hr_dev(pd->device);
struct hns_roce_idx_que *idx_que = &srq->idx_que;
u32 bitmap_num;
int i;
bitmap_num = HNS_ROCE_ALOGN_UP(srq->max, 8 * sizeof(u64));
idx_que->bitmap = kcalloc(1, bitmap_num / 8, GFP_KERNEL);
idx_que->bitmap = bitmap_zalloc(srq->max, GFP_KERNEL);
if (!idx_que->bitmap)
return -ENOMEM;
bitmap_num = bitmap_num / (8 * sizeof(u64));
idx_que->buf_size = srq->idx_que.buf_size;
if (hns_roce_buf_alloc(hr_dev, idx_que->buf_size, (1 << page_shift) * 2,
&idx_que->idx_buf, page_shift)) {
kfree(idx_que->bitmap);
bitmap_free(idx_que->bitmap);
return -ENOMEM;
}
for (i = 0; i < bitmap_num; i++)
idx_que->bitmap[i] = ~(0UL);
return 0;
}
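
Same theme as the find_empty_entry() rework earlier: bitmap_zalloc()/bitmap_free() take a count of bits and hide the word-rounding that the old HNS_ROCE_ALOGN_UP + kcalloc arithmetic open-coded, and the bitmap now starts all-clear (free) instead of being filled with ~0UL. A tiny sketch (the helper name is illustrative):

        #include <linux/bitmap.h>
        #include <linux/gfp.h>

        static unsigned long *make_idx_bitmap(unsigned long nbits)
        {
                /* nbits is a bit count; rounding up to whole longs happens inside */
                return bitmap_zalloc(nbits, GFP_KERNEL);
        }
        /* tear-down pairs with bitmap_free(), not kfree() */
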
@ -264,8 +254,7 @@ int hns_roce_create_srq(struct ib_srq *ib_srq,
} else
ret = hns_roce_mtt_init(hr_dev,
ib_umem_page_count(srq->umem),
srq->umem->page_shift,
&srq->mtt);
PAGE_SHIFT, &srq->mtt);
if (ret)
goto err_buf;
@ -291,10 +280,9 @@ int hns_roce_create_srq(struct ib_srq *ib_srq,
ret = hns_roce_mtt_init(hr_dev, npages,
page_shift, &srq->idx_que.mtt);
} else {
ret = hns_roce_mtt_init(hr_dev,
ib_umem_page_count(srq->idx_que.umem),
srq->idx_que.umem->page_shift,
&srq->idx_que.mtt);
ret = hns_roce_mtt_init(
hr_dev, ib_umem_page_count(srq->idx_que.umem),
PAGE_SHIFT, &srq->idx_que.mtt);
}
if (ret) {
@ -391,21 +379,19 @@ err_idx_buf:
hns_roce_mtt_cleanup(hr_dev, &srq->idx_que.mtt);
err_idx_mtt:
if (udata)
ib_umem_release(srq->idx_que.umem);
ib_umem_release(srq->idx_que.umem);
err_create_idx:
hns_roce_buf_free(hr_dev, srq->idx_que.buf_size,
&srq->idx_que.idx_buf);
kfree(srq->idx_que.bitmap);
bitmap_free(srq->idx_que.bitmap);
err_srq_mtt:
hns_roce_mtt_cleanup(hr_dev, &srq->mtt);
err_buf:
if (udata)
ib_umem_release(srq->umem);
else
ib_umem_release(srq->umem);
if (!udata)
hns_roce_buf_free(hr_dev, srq_buf_size, &srq->buf);
return ret;
@ -419,15 +405,15 @@ void hns_roce_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata)
hns_roce_srq_free(hr_dev, srq);
hns_roce_mtt_cleanup(hr_dev, &srq->mtt);
if (ibsrq->uobject) {
if (udata) {
hns_roce_mtt_cleanup(hr_dev, &srq->idx_que.mtt);
ib_umem_release(srq->idx_que.umem);
ib_umem_release(srq->umem);
} else {
kvfree(srq->wrid);
hns_roce_buf_free(hr_dev, srq->max << srq->wqe_shift,
&srq->buf);
}
ib_umem_release(srq->idx_que.umem);
ib_umem_release(srq->umem);
}
int hns_roce_init_srq_table(struct hns_roce_dev *hr_dev)


@ -4279,11 +4279,11 @@ static void i40iw_qhash_ctrl(struct i40iw_device *iwdev,
/* if not found then add a child listener if interface is going up */
if (!ifup)
return;
child_listen_node = kzalloc(sizeof(*child_listen_node), GFP_ATOMIC);
child_listen_node = kmemdup(parent_listen_node,
sizeof(*child_listen_node), GFP_ATOMIC);
if (!child_listen_node)
return;
node_allocated = true;
memcpy(child_listen_node, parent_listen_node, sizeof(*child_listen_node));
memcpy(child_listen_node->loc_addr, ipaddr, ipv4 ? 4 : 16);
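
The i40iw change above folds a kzalloc() + memcpy() pair into kmemdup(); the general idiom, sketched with a made-up struct:

        #include <linux/slab.h>
        #include <linux/string.h>

        struct foo_node {
                int loc_addr[4];
                int qhash_set;
        };

        /* before: n = kzalloc(sizeof(*n), GFP_ATOMIC); then memcpy(n, src, sizeof(*n)); */
        static struct foo_node *foo_clone(const struct foo_node *src)
        {
                return kmemdup(src, sizeof(*src), GFP_ATOMIC);  /* NULL on allocation failure */
        }
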


@ -772,6 +772,8 @@ static int i40iw_query_qp(struct ib_qp *ibqp,
struct i40iw_qp *iwqp = to_iwqp(ibqp);
struct i40iw_sc_qp *qp = &iwqp->sc_qp;
attr->qp_state = iwqp->ibqp_state;
attr->cur_qp_state = attr->qp_state;
attr->qp_access_flags = 0;
attr->cap.max_send_wr = qp->qp_uk.sq_size;
attr->cap.max_recv_wr = qp->qp_uk.rq_size;
@ -1064,44 +1066,38 @@ void i40iw_cq_wq_destroy(struct i40iw_device *iwdev, struct i40iw_sc_cq *cq)
* @ib_cq: cq pointer
* @udata: user data or NULL for kernel object
*/
static int i40iw_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata)
static void i40iw_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata)
{
struct i40iw_cq *iwcq;
struct i40iw_device *iwdev;
struct i40iw_sc_cq *cq;
if (!ib_cq) {
i40iw_pr_err("ib_cq == NULL\n");
return 0;
}
iwcq = to_iwcq(ib_cq);
iwdev = to_iwdev(ib_cq->device);
cq = &iwcq->sc_cq;
i40iw_cq_wq_destroy(iwdev, cq);
cq_free_resources(iwdev, iwcq);
kfree(iwcq);
i40iw_rem_devusecount(iwdev);
return 0;
}
/**
* i40iw_create_cq - create cq
* @ibdev: device pointer from stack
* @ibcq: CQ allocated
* @attr: attributes for cq
* @udata: user data
*/
static struct ib_cq *i40iw_create_cq(struct ib_device *ibdev,
const struct ib_cq_init_attr *attr,
struct ib_udata *udata)
static int i40iw_create_cq(struct ib_cq *ibcq,
const struct ib_cq_init_attr *attr,
struct ib_udata *udata)
{
struct ib_device *ibdev = ibcq->device;
struct i40iw_device *iwdev = to_iwdev(ibdev);
struct i40iw_cq *iwcq;
struct i40iw_cq *iwcq = to_iwcq(ibcq);
struct i40iw_pbl *iwpbl;
u32 cq_num = 0;
struct i40iw_sc_cq *cq;
struct i40iw_sc_dev *dev = &iwdev->sc_dev;
struct i40iw_cq_init_info info;
struct i40iw_cq_init_info info = {};
enum i40iw_status_code status;
struct i40iw_cqp_request *cqp_request;
struct cqp_commands_info *cqp_info;
@ -1111,22 +1107,16 @@ static struct ib_cq *i40iw_create_cq(struct ib_device *ibdev,
int entries = attr->cqe;
if (iwdev->closing)
return ERR_PTR(-ENODEV);
return -ENODEV;
if (entries > iwdev->max_cqe)
return ERR_PTR(-EINVAL);
iwcq = kzalloc(sizeof(*iwcq), GFP_KERNEL);
if (!iwcq)
return ERR_PTR(-ENOMEM);
memset(&info, 0, sizeof(info));
return -EINVAL;
err_code = i40iw_alloc_resource(iwdev, iwdev->allocated_cqs,
iwdev->max_cq, &cq_num,
&iwdev->next_cq);
if (err_code)
goto error;
return err_code;
cq = &iwcq->sc_cq;
cq->back_cq = (void *)iwcq;
@ -1233,15 +1223,13 @@ static struct ib_cq *i40iw_create_cq(struct ib_device *ibdev,
}
i40iw_add_devusecount(iwdev);
return (struct ib_cq *)iwcq;
return 0;
cq_destroy:
i40iw_cq_wq_destroy(iwdev, cq);
cq_free_resources:
cq_free_resources(iwdev, iwcq);
error:
kfree(iwcq);
return ERR_PTR(err_code);
return err_code;
}
/**
@ -2018,8 +2006,7 @@ static int i40iw_dereg_mr(struct ib_mr *ib_mr, struct ib_udata *udata)
struct cqp_commands_info *cqp_info;
u32 stag_idx;
if (iwmr->region)
ib_umem_release(iwmr->region);
ib_umem_release(iwmr->region);
if (iwmr->type != IW_MEMREG_TYPE_MEM) {
/* region is released. only test for userness. */
@ -2655,6 +2642,11 @@ static int i40iw_query_pkey(struct ib_device *ibdev,
}
static const struct ib_device_ops i40iw_dev_ops = {
.owner = THIS_MODULE,
.driver_id = RDMA_DRIVER_I40IW,
/* NOTE: Older kernels wrongly use 0 for the uverbs_abi_ver */
.uverbs_abi_ver = I40IW_ABI_VER,
.alloc_hw_stats = i40iw_alloc_hw_stats,
.alloc_mr = i40iw_alloc_mr,
.alloc_pd = i40iw_alloc_pd,
@ -2694,6 +2686,7 @@ static const struct ib_device_ops i40iw_dev_ops = {
.reg_user_mr = i40iw_reg_user_mr,
.req_notify_cq = i40iw_req_notify_cq,
INIT_RDMA_OBJ_SIZE(ib_pd, i40iw_pd, ibpd),
INIT_RDMA_OBJ_SIZE(ib_cq, i40iw_cq, ibcq),
INIT_RDMA_OBJ_SIZE(ib_ucontext, i40iw_ucontext, ibucontext),
};
@ -2712,7 +2705,6 @@ static struct i40iw_ib_device *i40iw_init_rdma_device(struct i40iw_device *iwdev
i40iw_pr_err("iwdev == NULL\n");
return NULL;
}
iwibdev->ibdev.owner = THIS_MODULE;
iwdev->iwibdev = iwibdev;
iwibdev->iwdev = iwdev;
@ -2771,9 +2763,6 @@ void i40iw_port_ibevent(struct i40iw_device *iwdev)
*/
void i40iw_destroy_rdma_device(struct i40iw_ib_device *iwibdev)
{
if (!iwibdev)
return;
ib_unregister_device(&iwibdev->ibdev);
wait_event_timeout(iwibdev->iwdev->close_wq,
!atomic64_read(&iwibdev->iwdev->use_count),
@ -2795,7 +2784,6 @@ int i40iw_register_rdma_device(struct i40iw_device *iwdev)
return -ENOMEM;
iwibdev = iwdev->iwibdev;
rdma_set_device_sysfs_group(&iwibdev->ibdev, &i40iw_attr_group);
iwibdev->ibdev.driver_id = RDMA_DRIVER_I40IW;
ret = ib_register_device(&iwibdev->ibdev, "i40iw%d");
if (ret)
goto error;


@ -172,14 +172,14 @@ err_buf:
}
#define CQ_CREATE_FLAGS_SUPPORTED IB_UVERBS_CQ_FLAGS_TIMESTAMP_COMPLETION
struct ib_cq *mlx4_ib_create_cq(struct ib_device *ibdev,
const struct ib_cq_init_attr *attr,
struct ib_udata *udata)
int mlx4_ib_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
struct ib_udata *udata)
{
struct ib_device *ibdev = ibcq->device;
int entries = attr->cqe;
int vector = attr->comp_vector;
struct mlx4_ib_dev *dev = to_mdev(ibdev);
struct mlx4_ib_cq *cq;
struct mlx4_ib_cq *cq = to_mcq(ibcq);
struct mlx4_uar *uar;
void *buf_addr;
int err;
@ -187,14 +187,10 @@ struct ib_cq *mlx4_ib_create_cq(struct ib_device *ibdev,
udata, struct mlx4_ib_ucontext, ibucontext);
if (entries < 1 || entries > dev->dev->caps.max_cqes)
return ERR_PTR(-EINVAL);
return -EINVAL;
if (attr->flags & ~CQ_CREATE_FLAGS_SUPPORTED)
return ERR_PTR(-EINVAL);
cq = kzalloc(sizeof(*cq), GFP_KERNEL);
if (!cq)
return ERR_PTR(-ENOMEM);
return -EINVAL;
entries = roundup_pow_of_two(entries + 1);
cq->ibcq.cqe = entries - 1;
@ -269,7 +265,7 @@ struct ib_cq *mlx4_ib_create_cq(struct ib_device *ibdev,
goto err_cq_free;
}
return &cq->ibcq;
return 0;
err_cq_free:
mlx4_cq_free(dev->dev, &cq->mcq);
@ -281,19 +277,15 @@ err_dbmap:
err_mtt:
mlx4_mtt_cleanup(dev->dev, &cq->buf.mtt);
if (udata)
ib_umem_release(cq->umem);
else
ib_umem_release(cq->umem);
if (!udata)
mlx4_ib_free_cq_buf(dev, &cq->buf, cq->ibcq.cqe);
err_db:
if (!udata)
mlx4_db_free(dev->dev, &cq->db);
err_cq:
kfree(cq);
return ERR_PTR(err);
return err;
}
static int mlx4_alloc_resize_buf(struct mlx4_ib_dev *dev, struct mlx4_ib_cq *cq,
@ -475,18 +467,15 @@ err_buf:
kfree(cq->resize_buf);
cq->resize_buf = NULL;
if (cq->resize_umem) {
ib_umem_release(cq->resize_umem);
cq->resize_umem = NULL;
}
ib_umem_release(cq->resize_umem);
cq->resize_umem = NULL;
out:
mutex_unlock(&cq->resize_mutex);
return err;
}
int mlx4_ib_destroy_cq(struct ib_cq *cq, struct ib_udata *udata)
void mlx4_ib_destroy_cq(struct ib_cq *cq, struct ib_udata *udata)
{
struct mlx4_ib_dev *dev = to_mdev(cq->device);
struct mlx4_ib_cq *mcq = to_mcq(cq);
@ -501,15 +490,11 @@ int mlx4_ib_destroy_cq(struct ib_cq *cq, struct ib_udata *udata)
struct mlx4_ib_ucontext,
ibucontext),
&mcq->db);
ib_umem_release(mcq->umem);
} else {
mlx4_ib_free_cq_buf(dev, &mcq->buf, cq->cqe);
mlx4_db_free(dev->dev, &mcq->db);
}
kfree(mcq);
return 0;
ib_umem_release(mcq->umem);
}
static void dump_cqe(void *cqe)
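
The mlx4 (and, below, mlx5) CQ code is part of the tree-wide conversion in which ib_core allocates the CQ container through the INIT_RDMA_OBJ_SIZE(ib_cq, ...) entries in the ops tables, so create returns an int instead of an ERR_PTR and destroy becomes void. A schematic skeleton of the new driver-side shape (all foo_* names are placeholders):

        static int foo_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
                                 struct ib_udata *udata)
        {
                struct foo_cq *cq = to_foo_cq(ibcq);    /* container already allocated by ib_core */

                if (attr->cqe < 1)
                        return -EINVAL;                 /* plain errno, no ERR_PTR() */

                /* ... set up the hardware CQ, using udata for the user path ... */
                return 0;
        }

        static void foo_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata)
        {
                /* ... tear down the hardware CQ; ib_core frees the container ... */
        }
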


@ -1089,7 +1089,8 @@ static int mlx4_ib_alloc_ucontext(struct ib_ucontext *uctx,
if (!dev->ib_active)
return -EAGAIN;
if (ibdev->uverbs_abi_ver == MLX4_IB_UVERBS_NO_DEV_CAPS_ABI_VERSION) {
if (ibdev->ops.uverbs_abi_ver ==
MLX4_IB_UVERBS_NO_DEV_CAPS_ABI_VERSION) {
resp_v3.qp_tab_size = dev->dev->caps.num_qps;
resp_v3.bf_reg_size = dev->dev->caps.bf_reg_size;
resp_v3.bf_regs_per_page = dev->dev->caps.bf_regs_per_page;
@ -1111,7 +1112,7 @@ static int mlx4_ib_alloc_ucontext(struct ib_ucontext *uctx,
INIT_LIST_HEAD(&context->wqn_ranges_list);
mutex_init(&context->wqn_ranges_mutex);
if (ibdev->uverbs_abi_ver == MLX4_IB_UVERBS_NO_DEV_CAPS_ABI_VERSION)
if (ibdev->ops.uverbs_abi_ver == MLX4_IB_UVERBS_NO_DEV_CAPS_ABI_VERSION)
err = ib_copy_to_udata(udata, &resp_v3, sizeof(resp_v3));
else
err = ib_copy_to_udata(udata, &resp, sizeof(resp));
@ -2509,6 +2510,10 @@ static void get_fw_ver_str(struct ib_device *device, char *str)
}
static const struct ib_device_ops mlx4_ib_dev_ops = {
.owner = THIS_MODULE,
.driver_id = RDMA_DRIVER_MLX4,
.uverbs_abi_ver = MLX4_IB_UVERBS_ABI_VERSION,
.add_gid = mlx4_ib_add_gid,
.alloc_mr = mlx4_ib_alloc_mr,
.alloc_pd = mlx4_ib_alloc_pd,
@ -2560,6 +2565,7 @@ static const struct ib_device_ops mlx4_ib_dev_ops = {
.resize_cq = mlx4_ib_resize_cq,
INIT_RDMA_OBJ_SIZE(ib_ah, mlx4_ib_ah, ibah),
INIT_RDMA_OBJ_SIZE(ib_cq, mlx4_ib_cq, ibcq),
INIT_RDMA_OBJ_SIZE(ib_pd, mlx4_ib_pd, ibpd),
INIT_RDMA_OBJ_SIZE(ib_srq, mlx4_ib_srq, ibsrq),
INIT_RDMA_OBJ_SIZE(ib_ucontext, mlx4_ib_ucontext, ibucontext),
@ -2642,7 +2648,6 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
ibdev->dev = dev;
ibdev->bond_next_port = 0;
ibdev->ib_dev.owner = THIS_MODULE;
ibdev->ib_dev.node_type = RDMA_NODE_IB_CA;
ibdev->ib_dev.local_dma_lkey = dev->caps.reserved_lkey;
ibdev->num_ports = num_ports;
@ -2651,11 +2656,6 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
ibdev->ib_dev.num_comp_vectors = dev->caps.num_comp_vectors;
ibdev->ib_dev.dev.parent = &dev->persist->pdev->dev;
if (dev->caps.userspace_caps)
ibdev->ib_dev.uverbs_abi_ver = MLX4_IB_UVERBS_ABI_VERSION;
else
ibdev->ib_dev.uverbs_abi_ver = MLX4_IB_UVERBS_NO_DEV_CAPS_ABI_VERSION;
ibdev->ib_dev.uverbs_cmd_mask =
(1ull << IB_USER_VERBS_CMD_GET_CONTEXT) |
(1ull << IB_USER_VERBS_CMD_QUERY_DEVICE) |
@ -2729,6 +2729,10 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
ib_set_device_ops(&ibdev->ib_dev, &mlx4_ib_dev_fs_ops);
}
if (!dev->caps.userspace_caps)
ibdev->ib_dev.ops.uverbs_abi_ver =
MLX4_IB_UVERBS_NO_DEV_CAPS_ABI_VERSION;
mlx4_ib_alloc_eqs(dev, ibdev);
spin_lock_init(&iboe->lock);
@ -2839,7 +2843,6 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
goto err_steer_free_bitmap;
rdma_set_device_sysfs_group(&ibdev->ib_dev, &mlx4_attr_group);
ibdev->ib_dev.driver_id = RDMA_DRIVER_MLX4;
if (ib_register_device(&ibdev->ib_dev, "mlx4_%d"))
goto err_diag_counters;


@ -743,10 +743,9 @@ int mlx4_ib_map_mr_sg(struct ib_mr *ibmr, struct scatterlist *sg, int sg_nents,
unsigned int *sg_offset);
int mlx4_ib_modify_cq(struct ib_cq *cq, u16 cq_count, u16 cq_period);
int mlx4_ib_resize_cq(struct ib_cq *ibcq, int entries, struct ib_udata *udata);
struct ib_cq *mlx4_ib_create_cq(struct ib_device *ibdev,
const struct ib_cq_init_attr *attr,
struct ib_udata *udata);
int mlx4_ib_destroy_cq(struct ib_cq *cq, struct ib_udata *udata);
int mlx4_ib_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
struct ib_udata *udata);
void mlx4_ib_destroy_cq(struct ib_cq *cq, struct ib_udata *udata);
int mlx4_ib_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *wc);
int mlx4_ib_arm_cq(struct ib_cq *cq, enum ib_cq_notify_flags flags);
void __mlx4_ib_cq_clean(struct mlx4_ib_cq *cq, u32 qpn, struct mlx4_ib_srq *srq);
@ -907,7 +906,7 @@ void mlx4_ib_sl2vl_update(struct mlx4_ib_dev *mdev, int port);
struct ib_wq *mlx4_ib_create_wq(struct ib_pd *pd,
struct ib_wq_init_attr *init_attr,
struct ib_udata *udata);
int mlx4_ib_destroy_wq(struct ib_wq *wq, struct ib_udata *udata);
void mlx4_ib_destroy_wq(struct ib_wq *wq, struct ib_udata *udata);
int mlx4_ib_modify_wq(struct ib_wq *wq, struct ib_wq_attr *wq_attr,
u32 wq_attr_mask, struct ib_udata *udata);


@ -258,7 +258,7 @@ int mlx4_ib_umem_calc_optimal_mtt_size(struct ib_umem *umem, u64 start_va,
int *num_of_mtts)
{
u64 block_shift = MLX4_MAX_MTT_SHIFT;
u64 min_shift = umem->page_shift;
u64 min_shift = PAGE_SHIFT;
u64 last_block_aligned_end = 0;
u64 current_block_start = 0;
u64 first_block_start = 0;
@ -295,8 +295,8 @@ int mlx4_ib_umem_calc_optimal_mtt_size(struct ib_umem *umem, u64 start_va,
* in access to the wrong data.
*/
misalignment_bits =
(start_va & (~(((u64)(BIT(umem->page_shift))) - 1ULL)))
^ current_block_start;
(start_va & (~(((u64)(PAGE_SIZE)) - 1ULL))) ^
current_block_start;
block_shift = min(alignment_of(misalignment_bits),
block_shift);
}
@ -368,8 +368,7 @@ end:
}
static struct ib_umem *mlx4_get_umem_mr(struct ib_udata *udata, u64 start,
u64 length, u64 virt_addr,
int access_flags)
u64 length, int access_flags)
{
/*
* Force registering the memory as writable if the underlying pages
@ -415,8 +414,7 @@ struct ib_mr *mlx4_ib_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
if (!mr)
return ERR_PTR(-ENOMEM);
mr->umem =
mlx4_get_umem_mr(udata, start, length, virt_addr, access_flags);
mr->umem = mlx4_get_umem_mr(udata, start, length, access_flags);
if (IS_ERR(mr->umem)) {
err = PTR_ERR(mr->umem);
goto err_free;
@ -505,7 +503,7 @@ int mlx4_ib_rereg_user_mr(struct ib_mr *mr, int flags,
mlx4_mr_rereg_mem_cleanup(dev->dev, &mmr->mmr);
ib_umem_release(mmr->umem);
mmr->umem = mlx4_get_umem_mr(udata, start, length, virt_addr,
mmr->umem = mlx4_get_umem_mr(udata, start, length,
mr_access_flags);
if (IS_ERR(mmr->umem)) {
err = PTR_ERR(mmr->umem);
@ -514,7 +512,7 @@ int mlx4_ib_rereg_user_mr(struct ib_mr *mr, int flags,
goto release_mpt_entry;
}
n = ib_umem_page_count(mmr->umem);
shift = mmr->umem->page_shift;
shift = PAGE_SHIFT;
err = mlx4_mr_rereg_mem_write(dev->dev, &mmr->mmr,
virt_addr, length, n, shift,


@ -1207,10 +1207,9 @@ err_mtt:
mlx4_mtt_cleanup(dev->dev, &qp->mtt);
err_buf:
if (qp->umem)
ib_umem_release(qp->umem);
else
if (!qp->umem)
mlx4_buf_free(dev->dev, qp->buf_size, &qp->buf);
ib_umem_release(qp->umem);
err_db:
if (!udata && qp_has_rq(init_attr))
@ -1421,7 +1420,6 @@ static void destroy_qp_common(struct mlx4_ib_dev *dev, struct mlx4_ib_qp *qp,
mlx4_ib_db_unmap_user(mcontext, &qp->db);
}
ib_umem_release(qp->umem);
} else {
kvfree(qp->sq.wrid);
kvfree(qp->rq.wrid);
@ -1432,6 +1430,7 @@ static void destroy_qp_common(struct mlx4_ib_dev *dev, struct mlx4_ib_qp *qp,
if (qp->rq.wqe_cnt)
mlx4_db_free(dev->dev, &qp->db);
}
ib_umem_release(qp->umem);
del_gid_entries(qp);
}
@ -4248,7 +4247,7 @@ int mlx4_ib_modify_wq(struct ib_wq *ibwq, struct ib_wq_attr *wq_attr,
return err;
}
int mlx4_ib_destroy_wq(struct ib_wq *ibwq, struct ib_udata *udata)
void mlx4_ib_destroy_wq(struct ib_wq *ibwq, struct ib_udata *udata)
{
struct mlx4_ib_dev *dev = to_mdev(ibwq->device);
struct mlx4_ib_qp *qp = to_mqp((struct ib_qp *)ibwq);
@ -4259,8 +4258,6 @@ int mlx4_ib_destroy_wq(struct ib_wq *ibwq, struct ib_udata *udata)
destroy_qp_common(dev, qp, MLX4_IB_RWQ_SRC, udata);
kfree(qp);
return 0;
}
struct ib_rwq_ind_table


@ -115,7 +115,7 @@ int mlx4_ib_create_srq(struct ib_srq *ib_srq,
return PTR_ERR(srq->umem);
err = mlx4_mtt_init(dev->dev, ib_umem_page_count(srq->umem),
srq->umem->page_shift, &srq->mtt);
PAGE_SHIFT, &srq->mtt);
if (err)
goto err_buf;
@ -204,10 +204,9 @@ err_mtt:
mlx4_mtt_cleanup(dev->dev, &srq->mtt);
err_buf:
if (srq->umem)
ib_umem_release(srq->umem);
else
if (!srq->umem)
mlx4_buf_free(dev->dev, buf_size, &srq->buf);
ib_umem_release(srq->umem);
err_db:
if (!udata)
@ -275,13 +274,13 @@ void mlx4_ib_destroy_srq(struct ib_srq *srq, struct ib_udata *udata)
struct mlx4_ib_ucontext,
ibucontext),
&msrq->db);
ib_umem_release(msrq->umem);
} else {
kvfree(msrq->wrid);
mlx4_buf_free(dev->dev, msrq->msrq.max << msrq->msrq.wqe_shift,
&msrq->buf);
mlx4_db_free(dev->dev, &msrq->db);
}
ib_umem_release(msrq->umem);
}
void mlx4_ib_free_srq_wqe(struct mlx4_ib_srq *srq, int wqe_index)


@ -884,15 +884,15 @@ static void notify_soft_wc_handler(struct work_struct *work)
cq->ibcq.comp_handler(&cq->ibcq, cq->ibcq.cq_context);
}
struct ib_cq *mlx5_ib_create_cq(struct ib_device *ibdev,
const struct ib_cq_init_attr *attr,
struct ib_udata *udata)
int mlx5_ib_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
struct ib_udata *udata)
{
struct ib_device *ibdev = ibcq->device;
int entries = attr->cqe;
int vector = attr->comp_vector;
struct mlx5_ib_dev *dev = to_mdev(ibdev);
struct mlx5_ib_cq *cq = to_mcq(ibcq);
u32 out[MLX5_ST_SZ_DW(create_cq_out)];
struct mlx5_ib_cq *cq;
int uninitialized_var(index);
int uninitialized_var(inlen);
u32 *cqb = NULL;
@ -904,18 +904,14 @@ struct ib_cq *mlx5_ib_create_cq(struct ib_device *ibdev,
if (entries < 0 ||
(entries > (1 << MLX5_CAP_GEN(dev->mdev, log_max_cq_sz))))
return ERR_PTR(-EINVAL);
return -EINVAL;
if (check_cq_create_flags(attr->flags))
return ERR_PTR(-EOPNOTSUPP);
return -EOPNOTSUPP;
entries = roundup_pow_of_two(entries + 1);
if (entries > (1 << MLX5_CAP_GEN(dev->mdev, log_max_cq_sz)))
return ERR_PTR(-EINVAL);
cq = kzalloc(sizeof(*cq), GFP_KERNEL);
if (!cq)
return ERR_PTR(-ENOMEM);
return -EINVAL;
cq->ibcq.cqe = entries - 1;
mutex_init(&cq->resize_mutex);
@ -930,13 +926,13 @@ struct ib_cq *mlx5_ib_create_cq(struct ib_device *ibdev,
err = create_cq_user(dev, udata, cq, entries, &cqb, &cqe_size,
&index, &inlen);
if (err)
goto err_create;
return err;
} else {
cqe_size = cache_line_size() == 128 ? 128 : 64;
err = create_cq_kernel(dev, cq, entries, cqe_size, &cqb,
&index, &inlen);
if (err)
goto err_create;
return err;
INIT_WORK(&cq->notify_work, notify_soft_wc_handler);
}
@ -981,7 +977,7 @@ struct ib_cq *mlx5_ib_create_cq(struct ib_device *ibdev,
kvfree(cqb);
return &cq->ibcq;
return 0;
err_cmd:
mlx5_core_destroy_cq(dev->mdev, &cq->mcq);
@ -992,14 +988,10 @@ err_cqb:
destroy_cq_user(cq, udata);
else
destroy_cq_kernel(dev, cq);
err_create:
kfree(cq);
return ERR_PTR(err);
return err;
}
int mlx5_ib_destroy_cq(struct ib_cq *cq, struct ib_udata *udata)
void mlx5_ib_destroy_cq(struct ib_cq *cq, struct ib_udata *udata)
{
struct mlx5_ib_dev *dev = to_mdev(cq->device);
struct mlx5_ib_cq *mcq = to_mcq(cq);
@ -1009,10 +1001,6 @@ int mlx5_ib_destroy_cq(struct ib_cq *cq, struct ib_udata *udata)
destroy_cq_user(mcq, udata);
else
destroy_cq_kernel(dev, mcq);
kfree(mcq);
return 0;
}
static int is_equal_rsn(struct mlx5_cqe64 *cqe64, u32 rsn)
@ -1138,11 +1126,6 @@ static int resize_user(struct mlx5_ib_dev *dev, struct mlx5_ib_cq *cq,
return 0;
}
static void un_resize_user(struct mlx5_ib_cq *cq)
{
ib_umem_release(cq->resize_umem);
}
static int resize_kernel(struct mlx5_ib_dev *dev, struct mlx5_ib_cq *cq,
int entries, int cqe_size)
{
@ -1165,12 +1148,6 @@ ex:
return err;
}
static void un_resize_kernel(struct mlx5_ib_dev *dev, struct mlx5_ib_cq *cq)
{
free_cq_buf(dev, cq->resize_buf);
cq->resize_buf = NULL;
}
static int copy_resize_cqes(struct mlx5_ib_cq *cq)
{
struct mlx5_ib_dev *dev = to_mdev(cq->ibcq.device);
@ -1351,10 +1328,11 @@ ex_alloc:
kvfree(in);
ex_resize:
if (udata)
un_resize_user(cq);
else
un_resize_kernel(dev, cq);
ib_umem_release(cq->resize_umem);
if (!udata) {
free_cq_buf(dev, cq->resize_buf);
cq->resize_buf = NULL;
}
ex:
mutex_unlock(&cq->resize_mutex);
return err;

File diff suppressed because it is too large.


@ -200,19 +200,33 @@ static void pma_cnt_assign(struct ib_pma_portcounters *pma_cnt,
vl_15_dropped);
}
static int process_pma_cmd(struct mlx5_core_dev *mdev, u8 port_num,
static int process_pma_cmd(struct mlx5_ib_dev *dev, u8 port_num,
const struct ib_mad *in_mad, struct ib_mad *out_mad)
{
int err;
struct mlx5_core_dev *mdev;
bool native_port = true;
u8 mdev_port_num;
void *out_cnt;
int err;
mdev = mlx5_ib_get_native_port_mdev(dev, port_num, &mdev_port_num);
if (!mdev) {
/* Fail to get the native port, likely due to 2nd port is still
* unaffiliated. In such case default to 1st port and attached
* PF device.
*/
native_port = false;
mdev = dev->mdev;
mdev_port_num = 1;
}
/* Declaring support of extended counters */
if (in_mad->mad_hdr.attr_id == IB_PMA_CLASS_PORT_INFO) {
struct ib_class_port_info cpi = {};
cpi.capability_mask = IB_PMA_CLASS_CAP_EXT_WIDTH;
memcpy((out_mad->data + 40), &cpi, sizeof(cpi));
return IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_REPLY;
err = IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_REPLY;
goto done;
}
if (in_mad->mad_hdr.attr_id == IB_PMA_PORT_COUNTERS_EXT) {
@ -221,11 +235,13 @@ static int process_pma_cmd(struct mlx5_core_dev *mdev, u8 port_num,
int sz = MLX5_ST_SZ_BYTES(query_vport_counter_out);
out_cnt = kvzalloc(sz, GFP_KERNEL);
if (!out_cnt)
return IB_MAD_RESULT_FAILURE;
if (!out_cnt) {
err = IB_MAD_RESULT_FAILURE;
goto done;
}
err = mlx5_core_query_vport_counter(mdev, 0, 0,
port_num, out_cnt, sz);
mdev_port_num, out_cnt, sz);
if (!err)
pma_cnt_ext_assign(pma_cnt_ext, out_cnt);
} else {
@ -234,20 +250,23 @@ static int process_pma_cmd(struct mlx5_core_dev *mdev, u8 port_num,
int sz = MLX5_ST_SZ_BYTES(ppcnt_reg);
out_cnt = kvzalloc(sz, GFP_KERNEL);
if (!out_cnt)
return IB_MAD_RESULT_FAILURE;
if (!out_cnt) {
err = IB_MAD_RESULT_FAILURE;
goto done;
}
err = mlx5_core_query_ib_ppcnt(mdev, port_num,
err = mlx5_core_query_ib_ppcnt(mdev, mdev_port_num,
out_cnt, sz);
if (!err)
pma_cnt_assign(pma_cnt, out_cnt);
}
}
kvfree(out_cnt);
if (err)
return IB_MAD_RESULT_FAILURE;
return IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_REPLY;
err = err ? IB_MAD_RESULT_FAILURE :
IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_REPLY;
done:
if (native_port)
mlx5_ib_put_native_port_mdev(dev, port_num);
return err;
}
int mlx5_ib_process_mad(struct ib_device *ibdev, int mad_flags, u8 port_num,
@ -259,8 +278,6 @@ int mlx5_ib_process_mad(struct ib_device *ibdev, int mad_flags, u8 port_num,
struct mlx5_ib_dev *dev = to_mdev(ibdev);
const struct ib_mad *in_mad = (const struct ib_mad *)in;
struct ib_mad *out_mad = (struct ib_mad *)out;
struct mlx5_core_dev *mdev;
u8 mdev_port_num;
int ret;
if (WARN_ON_ONCE(in_mad_size != sizeof(*in_mad) ||
@ -269,19 +286,14 @@ int mlx5_ib_process_mad(struct ib_device *ibdev, int mad_flags, u8 port_num,
memset(out_mad->data, 0, sizeof(out_mad->data));
mdev = mlx5_ib_get_native_port_mdev(dev, port_num, &mdev_port_num);
if (!mdev)
return IB_MAD_RESULT_FAILURE;
if (MLX5_CAP_GEN(mdev, vport_counters) &&
if (MLX5_CAP_GEN(dev->mdev, vport_counters) &&
in_mad->mad_hdr.mgmt_class == IB_MGMT_CLASS_PERF_MGMT &&
in_mad->mad_hdr.method == IB_MGMT_METHOD_GET) {
ret = process_pma_cmd(mdev, mdev_port_num, in_mad, out_mad);
ret = process_pma_cmd(dev, port_num, in_mad, out_mad);
} else {
ret = process_mad(ibdev, mad_flags, port_num, in_wc, in_grh,
in_mad, out_mad);
}
mlx5_ib_put_native_port_mdev(dev, port_num);
return ret;
}

Some files were not shown because too many files have changed in this diff.