Commit Graph

1384 Commits

Author SHA1 Message Date
Leon Romanovsky
9abb0d1bbd RDMA: Simplify get firmware interface
There is a need to forward FW version to user space
application through RDMA netlink. In order to make it safe, there
is need to declare nla_policy and limit the size of FW string.

The new define IB_FW_VERSION_NAME_MAX will limit the size of
FW version string. That define was chosen to be equal to
ETHTOOL_FWVERS_LEN, because many drivers anyway are limited
by that value indirectly.

The introduction of this define allows us to remove the string size
from get_fw_str function signature.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2017-08-10 13:28:10 +03:00
Leon Romanovsky
ac50525374 RDMA/netlink: Expose device and port capability masks
The port capability mask is exposed to user space via sysfs interface,
while device capabilities are available for verbs only.

This patch provides those capabilities through netlink interface.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
2017-08-10 13:28:10 +03:00
Leon Romanovsky
c3f66f7b00 RDMA/netlink: Implement nldev port doit callback
Provide ability to get specific to device and port information.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
2017-08-10 13:28:09 +03:00
Leon Romanovsky
7d02f605f0 RDMA/netlink: Add nldev port dumpit implementation
This patch implements the query interface to get all
ports data for the specific device.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
2017-08-10 13:28:09 +03:00
Leon Romanovsky
e5c9469efc RDMA/netlink: Add nldev device doit implementation
Provide ability to query specific device.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
2017-08-10 13:28:08 +03:00
Leon Romanovsky
b4c598a67e RDMA/netlink: Implement nldev device dumpit calback
This patch adds the ability to return all available devices
together with their properties.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
2017-08-10 13:28:08 +03:00
Leon Romanovsky
6c80b41abe RDMA/netlink: Add nldev initialization flows
Add nldev init and exit flows to the RDMA/core.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
2017-08-10 13:28:07 +03:00
Leon Romanovsky
1a6e7c31d7 RDMA/netlink: Add netlink device definitions to UAPI
Introduce new defines to rdma_netlink.h, so the RDMA configuration tool
will be able to communicate with RDMA subsystem by using the shared defines.

The addition of new client (NLDEV) revealed the fact that we exposed by
mistake the RDMA_NL_I40IW define which is not backed by any RDMA netlink
by now and it won't be exposed in the future too. So this patch reuses
the value and deletes the old defines.

The NLDEV operates with objects. The struct ib_device has two straightforward
objects: device itself and ports of that device.

This brings us to propose the following commands to work on those objects:
 * RDMA_NLDEV_CMD_{GET,SET,NEW,DEL} - works on ib_device itself
 * RDMA_NLDEV_CMD_PORT_{GET,SET,NEW,DEL} - works on ports of specific ib_device

Those commands receive/return the device index (RDMA_NLDEV_ATTR_DEV_INDEX)
and port index (RDMA_NLDEV_ATTR_PORT_INDEX). For device object accesses,
the RDMA_NLDEV_ATTR_PORT_INDEX will return the maximum number of ports
for specific ib_device and for port access the actual port index.

The port index starts from 1 to follow RDMA/core internal semantics and
the sysfs exposed knobs.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
2017-08-10 13:28:07 +03:00
Leon Romanovsky
8bc67414f2 RDMA/netlink: Update copyright
Add Mellanox to the copyright header.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
2017-08-10 13:21:57 +03:00
Leon Romanovsky
647c75ac59 RDMA/netlink: Convert LS to doit callback
RDMA_NL_LS protocol is actually does not dump anything,
but sets data and it should be handled by doit callback.

This patch actually converts RDMA_NL_LS to doit callback, while
preserving IWCM and RDMA_CM flows through netlink_dump_start().

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
2017-08-10 13:21:56 +03:00
Leon Romanovsky
c729943a77 RDMA/netlink: Reduce indirection access to cb_table
Introduce intermediate variable to store access to fields
of cb_table.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
2017-08-10 13:21:56 +03:00
Leon Romanovsky
1830ba21b9 RDMA/netlink: Add and implement doit netlink callback
The .doit callback is used by netlink core to differentiate
between get and set operations. Common convention is to use
that call for command operations like (SET, ADD, e.t.c.) and/or
access without NLF_M_DUMP flag.

This commit adds proper declaration and implementation
to RDMA netlink.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
2017-08-10 13:21:55 +03:00
Leon Romanovsky
ecc82c53f9 RDMA/core: Add and expose static device index
This patch adds static device index in similar fashion to
already available in netdev world (struct net->ifindex).

In downstream patches, the RDMA nelink will use this idx-to-ib_device
conversion, so as part of this commit, we are exposing the translation
function to be visible for IB/core users.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2017-08-10 13:21:54 +03:00
Leon Romanovsky
8030c8357a RDMA/core: Add iterator over ib_devices
The coming nldev needs to iterate over all IB devices in the system
and in order to not expose the ib_devices list outside the devices.c,
it is necessary to provide function iterator.

Current version is written explicitly for nldev callback to avoid
over-engineering at this stage, but it can be easily extended for
other types.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
2017-08-10 13:21:54 +03:00
Leon Romanovsky
3250b4dbd8 RDMA/netlink: Rename netlink callback struct
The RDMA netlink client infrastructure was removed and made obsolete.
The old infrastructure defined struct ibnl_client_cbs. Now that all
uses of this have been updated to the new infrastructure, rename the
struct to be compliant with the current stack naming standards:
struct rdma_nl_cbs.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
2017-08-10 13:20:15 +03:00
Leon Romanovsky
ff61c425c1 RDMA/netlink: Simplify and rename ibnl_chk_listeners
Make ibnl_chk_listeners function to be one line by removing
unneeded comparison.

Rename that function to be complaint to other functions in RDMA netlink.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
2017-08-10 13:19:03 +03:00
Leon Romanovsky
4d7f693af0 RDMA/netlink: Rename and remove redundant parameter from ibnl_multicast
The pointer to netlink header was not used in the ibnl_multicast
function, so let's remove it and simplify the function
signature.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
2017-08-10 13:19:03 +03:00
Leon Romanovsky
f00e646370 RDMA/netlink: Rename and remove redundant parameter from ibnl_unicast*
Netlink message header is not needed for unicast reply, hence remove it.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
2017-08-10 13:19:02 +03:00
Leon Romanovsky
1a1c116f3d RDMA/netlink: Simplify the put_msg and put_attr
Reuse standard macros to cancel the netlink message
in case of error.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
2017-08-10 13:19:01 +03:00
Leon Romanovsky
e3a2b93ddd RDMA/netlink: Add flag to consolidate common handling
Add ability to provide flags to control RDMA netlink callbacks
and convert addr.c and sa_query.c to be first users of such
infrastructure. It allows to move their CAP_NET_ADMIN checks
into netlink core.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
2017-08-10 13:18:45 +03:00
Leon Romanovsky
5d7ee40907 RDMA/iwcm: Remove extra EXPORT_SYMBOLS
The iwcm exports functions which are not used outside of ib_core.
This patch simply removes these EXPORT_SYMBOLS.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Chien Tin Tung <chien.tin.tung@intel.com>
2017-08-10 13:17:43 +03:00
Leon Romanovsky
93fa50760b RDMA/iwcm: Remove useless check of netlink client validity
RDMA netlink implementation guarantees that supplied
client number is in allowed range.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Chien Tin Tung <chien.tin.tung@intel.com>
2017-08-10 13:17:26 +03:00
Leon Romanovsky
3c3e75d5ff RDMA/netlink: Avoid double pass for RDMA netlink messages
The standard netlink_rcv_skb function skips messages without
NLM_F_REQUEST flag in it, while SA netlink client issues them.

In commit bc10ed7d3d ("IB/core: Add rdma netlink helper functions")
the local function was introduced to allow such messages.

This led to double pass for every incoming message.

In this patch, we unify that local implementation and netlink_rcv_skb
functions, so there will be no need for double pass anymore.

As a outcome, this combined function gained more strict check
for NLM_F_REQUEST flag and it is now allowed for SA pathquery
client only.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2017-08-10 13:15:42 +03:00
Leon Romanovsky
64401b69b2 RDMA/netlink: Remove redundant owner option for netlink callbacks
Owner field is not needed to be set because netlink is part of ib_core
which will be unloaded last after all other modules are unloaded.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2017-08-10 13:15:41 +03:00
Leon Romanovsky
c9901724a2 RDMA/netlink: Remove netlink clients infrastructure
RDMA netlink has a complicated infrastructure for dynamically
registering and de-registering netlink clients to the NETLINK_RDMA
group. The complicated portion of this code is not widely used because
2 of the 3 current clients are statically compiled together with
netlink.c. The infrastructure, therefore, is deemed overkill.

Refactor the code to eliminate the dynamically added clients. Now all
clients are pre-registered in a client array at compile time, and at run
time they merely check-in with the infrastructure to pass their callback
table for inclusion in the pre-sized client array.

This also allows for future cleanups and removal of unneeded code in the
iwcm* netlink handler.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Chien Tin Tung <chien.tin.tung@intel.com>
2017-08-10 13:13:06 +03:00
Ismail, Mustafa
9047811b77 RDMA/core: Add wait/retry version of ibnl_unicast
Add a wait/retry version of ibnl_unicast, ibnl_unicast_wait,
and modify ibnl_unicast to not wait/retry.  This eliminates
the undesirable wait for future users of ibnl_unicast.

Change Portmapper calls originating from kernel to user-space
to use ibnl_unicast_wait and take advantage of the wait/retry
logic in netlink_unicast.

Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Chien Tin Tung <chien.tin.tung@intel.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2017-08-09 16:08:27 +03:00
Doug Ledford
3c7f67d188 IB/cma: Fix default RoCE type setting
The initial patch for changing the stack to use RoCEv2 GIDs by default
set the CMA_PREFERRED_ROCE_GID_TYPE to an incorrect value.  Instead of
an absolute value, we needed to set the right bit in a bitmask.  Correct
the default setting so we use RoCEv2 by default.

Fixes: 63a5f483af (IB/cma: Set default gid type to RoCEv2)
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-28 13:47:24 -04:00
Doug Ledford
a5f66725c7 Merge branch 'misc' into k.o/for-next 2017-07-27 09:00:38 -04:00
Yishai Hadas
2dee0e5458 IB/uverbs: Enable QP creation with a given source QP number
Enable QP creation with a given source QP number, the created QP will
use the source QPN as its wire QP number.

To create such a QP, root privileges (i.e. CAP_NET_RAW) are required
from the user application.

This comes as a pre-patch for downstream patches in this series to
allow user space applications to accelerate traffic which is typically
handled by IPoIB ULP.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-24 10:40:46 -04:00
Noa Osherovich
9636a56fa8 IB/core: Add support for RoCEv2 multicast
When creating address handle from multicast GID, set MAC according to
the appropriate formula instead of searching for it in the GID table:
- For IPv4 multicast GID use ip_eth_mc_map().
- For IPv6 multicast GID use ipv6_eth_mc_map().

Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-24 10:40:23 -04:00
Noa Osherovich
be1d325a33 IB/core: Set RoCEv2 MGID according to spec
RoCEv2 Annex states that for RoCEv2 over IPv4, the corresponding
IPv4 address is encoded into the GID according to the following rule:
GID= :ffff:<IPv4 address>

Remove the 0xff0e prefix for RoCEv2 packets with IPv4 and leave it
zeroed and change rdma_is_multicast_addr() to consider the new logic.

Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-24 10:40:23 -04:00
Noa Osherovich
5236333592 IB/core: Fix the validations of a multicast LID in attach or detach operations
RoCE Annex (A16.9.10/11) declares that during attach (detach) QP to a
multicast group, if the QP is associated with a RoCE port, the
multicast group MLID is unused and is ignored.

During attach or detach multicast, when the QP is associated with a
port, it is enough to check the port's link layer and validate the
LID only if it is Infiniband. Otherwise, avoid validating the
multicast LID.

Fixes: 8561eae60f ("IB/core: For multicast functions, verify that LIDs are multicast LIDs")
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-24 10:40:23 -04:00
Yuval Shaia
d41861942f IB/core: Add generic function to extract IB speed from netdev
Logic of retrieving netdev speed from net_device and translating it to
IB speed is implemented in rxe, in usnic and in bnxt drivers.

Define new function which merges all.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Christian Benvenuti <benve@cisco.com>
Reviewed-by: Selvin Xavier <selvin.xavier@broadcom.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-24 08:45:11 -04:00
Moni Shoua
63a5f483af IB/cma: Set default gid type to RoCEv2
RoCEv2 is the preferred RDMA protocol for Ethernet link layer because
of its advantages over RoCEv1. For better user experience make it the
default choice for RDMA_CM connections if device/port support it.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-24 08:43:55 -04:00
Ismail, Mustafa
a62ab66b13 RDMA/core: Initialize port_num in qp_attr
Initialize the port_num for iWARP in rdma_init_qp_attr.

Fixes: 5ecce4c9b17b("Check port number supplied by user verbs cmds")
Cc: <stable@vger.kernel.org> # v2.6.14+
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Tested-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-20 11:24:13 -04:00
Ismail, Mustafa
5a7a88f1b4 RDMA/uverbs: Fix the check for port number
The port number is only valid if IB_QP_PORT is set in the mask.
So only check port number if it is valid to prevent modify_qp from
failing due to an invalid port number.

Fixes: 5ecce4c9b17b("Check port number supplied by user verbs cmds")
Cc: <stable@vger.kernel.org> # v2.6.14+
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Tested-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-20 11:24:13 -04:00
Matan Barak
266098b841 IB/core: Fix sparse warnings
Delete unused variables to prevent sparse warnings.

Fixes: db1b5ddd53 ("IB/core: Rename uverbs event file structure")
Fixes: fd3c7904db ("IB/core: Change idr objects to use the new schema")

Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-20 11:20:50 -04:00
Tadeusz Struk
ebc9ca43e1 IB/core: Allow QP state transition from reset to error
Playing with IP-O-IB interface can trigger a warning message:
"ib0: Failed to modify QP to ERROR state" to be logged.
This happens when the QP is in IB_QPS_RESET state and the stack
is trying to transition it to IB_QPS_ERR state in ipoib_ib_dev_stop().

According to the IB spec, Table 91 - "QP State Transition Properties"
it looks like the transition from reset to error is valid:

Transition: Any State to Error
Required Attributes: None
Optional Attributes: None allowed
Actions: Queue processing is stopped. Work Requests pending or in
process are completed in error, when possible.

This patch allows the transition and quiets the message.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-17 21:21:30 -04:00
Majd Dibbiny
8fe8bacb92 IB/core: Add ordered workqueue for RoCE GID management
Currently the RoCE GID management uses the ib_wq to do add and delete new GIDs
according to the netdev events.

The ib_wq isn't an ordered workqueue and thus two work elements can be executed
concurrently which will result in unexpected behavior and inconsistency of the
GIDs cache content.

Example:
ifconfig eth1 11.11.11.11/16 up

This command will invoke the following netdev events in the following order:
1. NETDEV_UP
2. NETDEV_DOWN
3. NETDEV_UP

If (2) and (3) will be executed concurrently or in reverse order, instead of
having a new GID with 11.11.11.11 IP, we will end up without any new GIDs.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-17 21:21:25 -04:00
Parav Pandit
f7c8f2e9dd IB/uverbs: Make use of ib_modify_qp variant to avoid resolving DMAC
This patch makes use of IB core's ib_modify_qp_with_udata function that
also resolves the DMAC and handles udata.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-17 21:20:49 -04:00
Parav Pandit
a512c2fbef IB/core: Introduce modify QP operation with udata
This patch adds new function ib_modify_qp_with_udata so that
uverbs layer can avoid handling L2 mac address at verbs layer
and depend on the core layer to resolve the mac address consistently
for all required QPs.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-17 21:20:41 -04:00
Moni Shoua
cbd09aebc2 IB/core: Don't resolve IP address to the loopback device
When resolving an IP address that is on the host of the caller the
result from querying the routing table is the loopback device. This is
not a valid response, because it doesn't represent the RDMA device and
the port.

Therefore, callers need to check the resolved device and if it is a
loopback device find an alternative way to resolve it. To avoid this we
make sure that the response from rdma_resolve_ip() will not be the
loopback device.

While that, we fix an static checker warning about dereferencing an
unintitialized pointer using the same solution as in commit abeffce90c
("net/mlx5e: Fix a -Wmaybe-uninitialized warning") as a reference.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-17 11:45:42 -04:00
Moni Shoua
bebb2a473a IB/core: Namespace is mandatory input for address resolution
In function addr_resolve() the namespace is a required input parameter
and not an output. It is passed later for searching the routing table
and device addresses. Also, it shouldn't be copied back to the caller.

Fixes: 565edd1d55 ('IB/addr: Pass network namespace as a parameter')
Cc: <stable@vger.kernel.org> # v4.3+
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-17 11:45:34 -04:00
Gustavo A. R. Silva
28b5b3a23b RDMA/core: Document confusing code
While looking into Coverity ID 1351047 I ran into the following
piece of code at
drivers/infiniband/core/verbs.c:496:

ret = rdma_addr_find_l2_eth_by_grh(&dgid, &sgid,
                                   ah_attr->dmac,
                                   wc->wc_flags & IB_WC_WITH_VLAN ?
                                   NULL : &vlan_id,
                                   &if_index, &hoplimit);

The issue here is that the position of arguments in the call to
rdma_addr_find_l2_eth_by_grh() function do not match the order of
the parameters:

&dgid is passed to sgid
&sgid is passed to dgid

This is the function prototype:

int rdma_addr_find_l2_eth_by_grh(const union ib_gid *sgid,
 				 const union ib_gid *dgid,
 				 u8 *dmac, u16 *vlan_id, int *if_index,
 				 int *hoplimit)

My question here is if this is intentional?

Answer:
Yes. ib_init_ah_from_wc() creates ah from the incoming packet.
Incoming packet has dgid of the receiver node on which this code is
getting executed and sgid contains the GID of the sender.

When resolving mac address of destination, you use arrived dgid as
sgid and use sgid as dgid because sgid contains destinations GID whom to
respond to.

Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-17 11:45:17 -04:00
Linus Torvalds
a7d4026834 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security layer fixes from James Morris:
 "Bugfixes for TPM and SELinux"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  IB/core: Fix static analysis warning in ib_policy_change_task
  IB/core: Fix uninitialized variable use in check_qp_port_pkey_settings
  tpm: do not suspend/resume if power stays on
  tpm: use tpm2_pcr_read() in tpm2_do_selftest()
  tpm: use tpm_buf functions in tpm2_pcr_read()
  tpm_tis: make ilb_base_addr static
  tpm: consolidate the TPM startup code
  tpm: Enable CLKRUN protocol for Braswell systems
  tpm/tpm_crb: fix priv->cmd_size initialisation
  tpm: fix a kernel memory leak in tpm-sysfs.c
  tpm: Issue a TPM2_Shutdown for TPM2 devices.
  Add "shutdown" to "struct class".
2017-07-07 17:06:28 -07:00
Daniel Jurgens
a750cfde13 IB/core: Fix static analysis warning in ib_policy_change_task
ib_get_cached_subnet_prefix can technically fail, but the only way it
could is not possible based on the loop conditions. Check the return
value before using the variable sp to resolve a static analysis warning.

-v1:
- Fix check to !ret. Paul Moore

Fixes: 8f408ab64b ("selinux lsm IB/core: Implement LSM notification
system")
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-07-07 09:49:26 +10:00
Daniel Jurgens
79d0636ac7 IB/core: Fix uninitialized variable use in check_qp_port_pkey_settings
Check the return value from get_pkey_and_subnet_prefix to prevent using
uninitialized variables.

Fixes: d291f1a652 ("IB/core: Enforce PKey security on QPs")
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-07-07 09:49:26 +10:00
Linus Torvalds
9871ab22f2 Fixes #3 for 4.12-rc
- 2 Fixes for OPA found by debug kernel
 - 1 Fix for user supplied input causing kernel problems
 - 1 Fix for the IPoIB fixes submitted around -rc4
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJZXVfPAAoJELgmozMOVy/dRRcP/AiN4wyEQ897se1fKXAktL1g
 a17tiSkK2MukAVHbM++9Ea/YXK66e2s7Ls8Pd230E85N3V48rSUhWZUIUQLOm+gS
 b98z53uNs6KkdBCezXABsHIi4PB6u1CfzaFaUfN5WI3ymAgsYqpQWMtNyO6GNe/R
 Dur3vDieXPNJ2x+F1jiNxHFBXLKofCG0y1FX88zqsQI5vVVq7ASKgaaSX3T1emQY
 18l4Dd7pesrWj4QD9jaqQiYkruF5VC1NE8/he8Zzy6XjSgnUZZfjbjuMptbW4y3y
 Tvvd5bjMAkJhCbK1mhe1dZHPlYJhAguUBZfThjVSKtiMGwRhGA4SYkRtek3nZOga
 /OLhERgj0VomHx7o+Pwp74DWnsSv08EMoc4hXKHZPPyxok83r9czejqm7mC2VbGd
 Sa8LmVeLQp79e9MbGAj+PbNRHf9CE9dnLeFUmbj+qptXUVGvT8j9U1a9iTjTz0+2
 NX/O4iWjtnt/CIkH9dhN9aWolswbmO2jSmmzb/x2EuCLv94GNtTyZLSifvxSYMnN
 IWO86aGQmuUkWJ3RI/5tzq+gVzI6bdKB9hG5DOPWN/uJVF9nWkq3c69Bv9djvUoM
 xi/rI0grxTqYHelRx3ja4ZqaI43R6YwL928XdtZJKQ/uNanq65Lyd6KKz3W7hT0l
 emCoqb2MjuzsNWIPkSgg
 =JEor
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma

Pull rdma update from Doug Ledford:
 "This includes two bugs against the newly added opa vnic that were
  found by turning on the debug kernel options:

   - sleeping while holding a lock, so a one line fix where they
     switched it from GFP_KERNEL allocation to a GFP_ATOMIC allocation

   - a case where they had an isolated caller of their code that could
     call them in an atomic context so they had to switch their use of a
     mutex to a spinlock to be safe, so this was considerably more lines
     of diff because all uses of that lock had to be switched

  In addition, the bug that was discussed with you already about an out
  of bounds array access in ib_uverbs_modify_qp and ib_uverbs_create_ah
  and is only seven lines of diff.

  And finally, one fix to an earlier fix in the -rc cycle that broke
  hfi1 and qib in regards to IPoIB (this one is, unfortunately, larger
  than I would like for a -rc7 submission, but fixing the problem
  required that we not treat all devices as though they had allocated a
  netdev universally because it isn't true, and it took 70 lines of diff
  to resolve the issue, but the final patch has been vetted by Intel and
  Mellanox and they've both given their approval to the fix).

  Summary:

   - Two fixes for OPA found by debug kernel
   - Fix for user supplied input causing kernel problems
   - Fix for the IPoIB fixes submitted around -rc4"

[ Doug sent this having not noticed the 4.12 release, so I guess I'll be
  getting another rdma pull request with the actuakl merge window
  updates and not just fixes.

  Oh well - it would have been nice if this small update had been the
  merge window one.     - Linus ]

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
  IB/core, opa_vnic, hfi1, mlx5: Properly free rdma_netdev
  RDMA/uverbs: Check port number supplied by user verbs cmds
  IB/opa_vnic: Use spinlock instead of mutex for stats_lock
  IB/opa_vnic: Use GFP_ATOMIC while sending trap
2017-07-06 11:45:08 -07:00
Linus Torvalds
5518b69b76 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
 "Reasonably busy this cycle, but perhaps not as busy as in the 4.12
  merge window:

   1) Several optimizations for UDP processing under high load from
      Paolo Abeni.

   2) Support pacing internally in TCP when using the sch_fq packet
      scheduler for this is not practical. From Eric Dumazet.

   3) Support mutliple filter chains per qdisc, from Jiri Pirko.

   4) Move to 1ms TCP timestamp clock, from Eric Dumazet.

   5) Add batch dequeueing to vhost_net, from Jason Wang.

   6) Flesh out more completely SCTP checksum offload support, from
      Davide Caratti.

   7) More plumbing of extended netlink ACKs, from David Ahern, Pablo
      Neira Ayuso, and Matthias Schiffer.

   8) Add devlink support to nfp driver, from Simon Horman.

   9) Add RTM_F_FIB_MATCH flag to RTM_GETROUTE queries, from Roopa
      Prabhu.

  10) Add stack depth tracking to BPF verifier and use this information
      in the various eBPF JITs. From Alexei Starovoitov.

  11) Support XDP on qed device VFs, from Yuval Mintz.

  12) Introduce BPF PROG ID for better introspection of installed BPF
      programs. From Martin KaFai Lau.

  13) Add bpf_set_hash helper for TC bpf programs, from Daniel Borkmann.

  14) For loads, allow narrower accesses in bpf verifier checking, from
      Yonghong Song.

  15) Support MIPS in the BPF selftests and samples infrastructure, the
      MIPS eBPF JIT will be merged in via the MIPS GIT tree. From David
      Daney.

  16) Support kernel based TLS, from Dave Watson and others.

  17) Remove completely DST garbage collection, from Wei Wang.

  18) Allow installing TCP MD5 rules using prefixes, from Ivan
      Delalande.

  19) Add XDP support to Intel i40e driver, from Björn Töpel

  20) Add support for TC flower offload in nfp driver, from Simon
      Horman, Pieter Jansen van Vuuren, Benjamin LaHaise, Jakub
      Kicinski, and Bert van Leeuwen.

  21) IPSEC offloading support in mlx5, from Ilan Tayari.

  22) Add HW PTP support to macb driver, from Rafal Ozieblo.

  23) Networking refcount_t conversions, From Elena Reshetova.

  24) Add sock_ops support to BPF, from Lawrence Brako. This is useful
      for tuning the TCP sockopt settings of a group of applications,
      currently via CGROUPs"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1899 commits)
  net: phy: dp83867: add workaround for incorrect RX_CTRL pin strap
  dt-bindings: phy: dp83867: provide a workaround for incorrect RX_CTRL pin strap
  cxgb4: Support for get_ts_info ethtool method
  cxgb4: Add PTP Hardware Clock (PHC) support
  cxgb4: time stamping interface for PTP
  nfp: default to chained metadata prepend format
  nfp: remove legacy MAC address lookup
  nfp: improve order of interfaces in breakout mode
  net: macb: remove extraneous return when MACB_EXT_DESC is defined
  bpf: add missing break in for the TCP_BPF_SNDCWND_CLAMP case
  bpf: fix return in load_bpf_file
  mpls: fix rtm policy in mpls_getroute
  net, ax25: convert ax25_cb.refcount from atomic_t to refcount_t
  net, ax25: convert ax25_route.refcount from atomic_t to refcount_t
  net, ax25: convert ax25_uid_assoc.refcount from atomic_t to refcount_t
  net, sctp: convert sctp_ep_common.refcnt from atomic_t to refcount_t
  net, sctp: convert sctp_transport.refcnt from atomic_t to refcount_t
  net, sctp: convert sctp_chunk.refcnt from atomic_t to refcount_t
  net, sctp: convert sctp_datamsg.refcnt from atomic_t to refcount_t
  net, sctp: convert sctp_auth_bytes.refcnt from atomic_t to refcount_t
  ...
2017-07-05 12:31:59 -07:00
Linus Torvalds
e24dd9ee53 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security layer updates from James Morris:

 - a major update for AppArmor. From JJ:

     * several bug fixes and cleanups

     * the patch to add symlink support to securityfs that was floated
       on the list earlier and the apparmorfs changes that make use of
       securityfs symlinks

     * it introduces the domain labeling base code that Ubuntu has been
       carrying for several years, with several cleanups applied. And it
       converts the current mediation over to using the domain labeling
       base, which brings domain stacking support with it. This finally
       will bring the base upstream code in line with Ubuntu and provide
       a base to upstream the new feature work that Ubuntu carries.

     * This does _not_ contain any of the newer apparmor mediation
       features/controls (mount, signals, network, keys, ...) that
       Ubuntu is currently carrying, all of which will be RFC'd on top
       of this.

 - Notable also is the Infiniband work in SELinux, and the new file:map
   permission. From Paul:

      "While we're down to 21 patches for v4.13 (it was 31 for v4.12),
       the diffstat jumps up tremendously with over 2k of line changes.

       Almost all of these changes are the SELinux/IB work done by
       Daniel Jurgens; some other noteworthy changes include a NFS v4.2
       labeling fix, a new file:map permission, and reporting of policy
       capabilities on policy load"

   There's also now genfscon labeling support for tracefs, which was
   lost in v4.1 with the separation from debugfs.

 - Smack incorporates a safer socket check in file_receive, and adds a
   cap_capable call in privilege check.

 - TPM as usual has a bunch of fixes and enhancements.

 - Multiple calls to security_add_hooks() can now be made for the same
   LSM, to allow LSMs to have hook declarations across multiple files.

 - IMA now supports different "ima_appraise=" modes (eg. log, fix) from
   the boot command line.

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (126 commits)
  apparmor: put back designators in struct initialisers
  seccomp: Switch from atomic_t to recount_t
  seccomp: Adjust selftests to avoid double-join
  seccomp: Clean up core dump logic
  IMA: update IMA policy documentation to include pcr= option
  ima: Log the same audit cause whenever a file has no signature
  ima: Simplify policy_func_show.
  integrity: Small code improvements
  ima: fix get_binary_runtime_size()
  ima: use ima_parse_buf() to parse template data
  ima: use ima_parse_buf() to parse measurements headers
  ima: introduce ima_parse_buf()
  ima: Add cgroups2 to the defaults list
  ima: use memdup_user_nul
  ima: fix up #endif comments
  IMA: Correct Kconfig dependencies for hash selection
  ima: define is_ima_appraise_enabled()
  ima: define Kconfig IMA_APPRAISE_BOOTPARAM option
  ima: define a set of appraisal rules requiring file signatures
  ima: extend the "ima_policy" boot command line to support multiple policies
  ...
2017-07-05 11:26:35 -07:00