Commit Graph

404 Commits

Author SHA1 Message Date
Yishai Hadas
a1069c1c75 IB/mlx5: Use uid as part of PD commands
Use uid as part of PD commands so that the firmware can manage the
PD object in a secured way.

For example when a QP is created its uid must match the CQ uid which it
uses.

Next patches in this series will use the uid from the PD, then will come
a patch to set the uid on the PD so that all objects will work properly
in one change.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-25 14:06:04 -06:00
Doug Ledford
f9882bb506 Merge branch 'mlx5-vport-loopback' into rdma.git
For dependencies, branch based on 'mlx5-next' of
    git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git

mlx5 mcast/ucast loopback control enhancements from Leon Romanovsky:

====================
This is short series from Mark which extends handling of loopback
traffic. Originally mlx5 IB dynamically enabled/disabled both unicast
and multicast based on number of users. However RAW ethernet QPs need
more granular access.
====================

Fixed failed automerge in mlx5_ib.h (minor context conflict issue)

mlx5-vport-loopback branch:
    RDMA/mlx5: Enable vport loopback when user context or QP mandate
    RDMA/mlx5: Allow creating RAW ethernet QP with loopback support
    RDMA/mlx5: Refactor transport domain bookkeeping logic
    net/mlx5: Rename incorrect naming in IFC file

Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-09-21 20:41:58 -04:00
Mark Bloch
0042f9e458 RDMA/mlx5: Enable vport loopback when user context or QP mandate
A user can create a QP which can accept loopback traffic, but that's not
enough. We need to enable loopback on the vport as well. Currently vport
loopback is enabled only when more than one user is using the IB device.
Update the logic to consider whether a QP which supports loopback was
created; if so, enable vport loopback even if there is only a single user.
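
The decision described above can be sketched in plain C. This is only an
illustrative model under stated assumptions, not the driver's code: the
struct lb_state, its fields and lb_should_enable() are hypothetical names.

```c
#include <stdbool.h>

/* Hypothetical bookkeeping; the names are illustrative, not the driver's. */
struct lb_state {
	unsigned int users;         /* user contexts currently using the device */
	unsigned int loopback_qps;  /* QPs created with loopback support */
};

/*
 * Old rule: enable vport loopback only when more than one user exists.
 * New rule: also enable it when at least one loopback-capable QP exists,
 * even with a single user.
 */
static bool lb_should_enable(const struct lb_state *lb)
{
	return lb->users > 1 || lb->loopback_qps > 0;
}
```

With one user and no loopback QP the function returns false; creating one
loopback QP for that same single user flips it to true.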

Signed-off-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-09-21 20:20:59 -04:00
Mark Bloch
a560f1d9af RDMA/mlx5: Refactor transport domain bookkeeping logic
In preparation to enable loopback on a single user context, move the logic
that enables/disables loopback to separate functions and group variables
under a single struct.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-09-21 20:20:59 -04:00
Jason Gunthorpe
f27a0d50a4 RDMA/umem: Use umem->owning_mm inside ODP
Since ODP had a single struct mmu_notifier located in the ucontext it
could only handle a single MM at a time, and this prevented it from using
the new owning_mm system.

With the prior rework it is now simple to let ODP track multiple MMs per
ucontext, finish the job so that the per_mm is allocated on a mm by mm
basis, and freed when the last umem is dropped from the ucontext.

As a side effect the new saner locking removes the lockdep splat about
nesting the umem_rwsem between mmu_notifier_unregister and
ib_umem_odp_release.

It also makes ODP work with multiple processes, across fork, etc.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-09-21 11:58:36 -04:00
Jason Gunthorpe
e2cd1d1ad2 RDMA/mlx5: Use rdma_user_mmap_io
Rely on the new core code helper to map BAR memory from the driver.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-09-20 16:19:30 -04:00
Mark Bloch
a7ee18bdee RDMA/mlx5: Allow creating a matcher for a NIC TX flow table
Currently a matcher can only be created and attached to a NIC RX flow
table. Extend it to allow it on NIC TX flow tables as well.

In order to achieve that, we:

1) Expose a new attribute: MLX5_IB_ATTR_FLOW_MATCHER_FLOW_FLAGS.
   enum ib_flow_flags is used as valid flags. Only
   IB_FLOW_ATTR_FLAGS_EGRESS is supported.

2) Remove the requirement to have a DEVX or QP destination when creating a
   flow. A flow added to NIC TX flow table will forward the packet outside
   of the vport (Wire or E-Switch in the SR-IOV case).

Signed-off-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-11 09:28:07 -06:00
Mark Bloch
b47fd4ffe2 RDMA/mlx5: Add NIC TX namespace when getting a flow table
Add the ability to get a NIC TX flow table when using _get_flow_table().
This allows creating a matcher and a flow rule on the NIC TX path.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-11 09:28:07 -06:00
Mark Bloch
b823dd6d86 RDMA/mlx5: Refactor raw flow creation
Move struct mlx5_flow_act to be passed from the method entry point.
This allows adding flow action support to the raw create flow path.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-11 09:28:07 -06:00
Mark Bloch
501f14e37b RDMA/mlx5: Don't overwrite action if already set
We support only a single action of each type per flow rule. If the user
passes more than one flow action of the same type, fail the flow creation.
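
A minimal sketch of such a check, assuming the requested action types are
tracked in a bitmask. The ACT_* bits and flow_add_action() are hypothetical
names chosen for illustration, and the -EINVAL return value is an assumption.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical action bits, one per action type. */
enum {
	ACT_DECAP    = 1u << 0,
	ACT_MOD_HDR  = 1u << 1,
	ACT_REFORMAT = 1u << 2,
};

/* Record an action type; refuse it if that type was already requested. */
static int flow_add_action(uint32_t *action_mask, uint32_t type)
{
	if (*action_mask & type)
		return -EINVAL;	/* same action type passed twice */

	*action_mask |= type;
	return 0;
}
```

The first action of a given type is accepted; a second one of the same type
is rejected and leaves the mask unchanged instead of overwriting the first.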

Signed-off-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-11 09:28:06 -06:00
Mark Bloch
2ea2620390 RDMA/mlx5: Refactor flow action parsing to be more generic
Make the parsing of flow actions more generic so it could be used by
mlx5 raw create flow.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-11 09:28:06 -06:00
Mark Bloch
e806f9328b RDMA/mlx5: Enable attaching packet reformat action to steering flows
Any matching packet will be mutated based on the packet reformat context
which is attached to that given flow rule.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-11 09:28:06 -06:00
Mark Bloch
5c2db53f62 RDMA/mlx5: Enable reformat on NIC RX if supported
A L3_TUNNEL_TO_L2 decap flow action requires the encap bit to be enabled
on the flow table, so enable it if supported. This allows attaching those
flow actions to NIC RX steering. We don't enable it if running on a
representor.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-11 09:28:06 -06:00
Mark Bloch
10a308964e RDMA/mlx5: Enable attaching DECAP action to steering flows
Any matching packet will be stripped of its VXLAN tunnel, only the inner
L2 onward is left. The user will receive the decapsulated packet.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-11 09:28:06 -06:00
Mark Bloch
4adda1122c RDMA/mlx5: Enable decap and packet reformat on flow tables
If NIC RX flow tables support decap operation, enable it on creation.
This allows steering rules to decapsulate tunnelled packets. If NIC TX
flow tables support reformat operation, enable it on creation.

We don't enable those capabilities on representors as the E-Switch should
handle packet modification (can be configured via TC) and as current
hardware can't handle both FDB and NIC flow tables with decap/packet
reformat support.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-11 09:28:06 -06:00
Mark Bloch
b1085be3f4 RDMA/mlx5: Enable attaching modify header to steering flows
When creating a flow steering rule, allow the user to attach a modify
header action.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-11 09:28:06 -06:00
Mark Bloch
78dd0c430f RDMA/mlx5: Add NIC TX steering support
Just like ingress steering, allow a user to create steering rules that
match egress vport traffic. We expose the same number of priorities as
the bypass (NIC RX) steering.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-11 09:28:06 -06:00
Jason Gunthorpe
af68ccbc11 Merge branch 'mlx5-flow-mutate' into rdma.git for-next
For dependencies, branch based on 'mellanox/mlx5-next' of
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git

Pull Flow actions to mutate packets from Leon Romanovsky:

====================
This series exposes the ability to create flow actions which can
mutate packet headers. We do that by exposing two new verbs:
 * modify header - can change existing packet headers.
 * packet reformat - can encapsulate or decapsulate a packet.

Once created a flow action must be attached to a steering rule for it to
take effect.

The first 10 patches refactor mlx5_core code, rename internal structures
to better reflect their operation and export needed functions so the RDMA
side can allocate the action.

The last 5 patches expose via the IOCTL infrastructure mlx5_ib methods
which do the actual allocation of resources and return a handle to the
user. A user of this API is expected to know how to work with the device's
spec as the input to those functions is HW dependent.

An example usage of the modify header action is routing: a user can create
an action which edits the L2 header and decrements the TTL.

An example usage of the packet reformat action is VXLAN encap/decap which
is done by the HW.
====================

* branch 'mlx5-flow-mutate':
  RDMA/mlx5: Extend packet reformat verbs
  RDMA/mlx5: Add new flow action verb - packet reformat
  RDMA/uverbs: Add generic function to fill in flow action object
  RDMA/mlx5: Add a new flow action verb - modify header
  RDMA/uverbs: Add UVERBS_ATTR_CONST_IN to the specs language
  net/mlx5: Export packet reformat alloc/dealloc functions
  net/mlx5: Pass a namespace for packet reformat ID allocation
  net/mlx5: Expose new packet reformat capabilities
  {net, RDMA}/mlx5: Rename encap to reformat packet
  net/mlx5: Move header encap type to IFC header file
  net/mlx5: Break encap/decap into two separated flow table creation flags
  net/mlx5: Add support for more namespaces when allocating modify header
  net/mlx5: Export modify header alloc/dealloc functions
  net/mlx5: Add proper NIC TX steering flow tables support
  net/mlx5: Cleanup flow namespace getter switch logic

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-05 15:24:58 -06:00
Mark Bloch
b4749bf256 RDMA/mlx5: Add a new flow action verb - modify header
Expose the ability to create a flow action which changes packet
headers. The data passed from userspace should be modify header actions as
defined by HW specification.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-05 15:23:58 -06:00
Majd Dibbiny
c6a21c3864 IB/mlx5: Change TX affinity assignment in RoCE LAG mode
In the current code, the TX affinity is per RoCE device, which can cause
unfairness between different contexts. For example, if we open two
contexts and each opens 10 QPs concurrently, all of the QPs of the first
context might end up on the first port instead of being distributed
across the two ports as expected.

To overcome this unfairness between processes, we maintain per device TX
affinity, and per process TX affinity.

The allocation algorithm is as follows:

1. Hold two tx_port_affinity atomic variables, one per RoCE device and one
   per ucontext. Both initialized to 0.

2. In mlx5_ib_alloc_ucontext do:
 2.1. ucontext.tx_port_affinity = device.tx_port_affinity
 2.2. device.tx_port_affinity += 1

3. In modify QP RST2INIT:
 3.1. qp.tx_port_affinity = ucontext.tx_port_affinity % MLX5_PORT_NUM
 3.2. ucontext.tx_port_affinity += 1
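
The three steps can be modelled in user-space C with C11 atomics. This is a
sketch under stated assumptions, not the driver code: struct roce_dev,
struct roce_uctx and both functions are hypothetical names, PORT_NUM stands
in for MLX5_PORT_NUM and is assumed to be 2, and the returned port is a
zero-based index.

```c
#include <stdatomic.h>

#define PORT_NUM 2	/* stand-in for MLX5_PORT_NUM, assumed to be 2 */

/* Step 1: one counter per device and one per user context. */
struct roce_dev  { atomic_uint tx_port_affinity; };
struct roce_uctx { atomic_uint tx_port_affinity; };

/* Step 2: a new context starts from the device counter, then bumps it. */
static void uctx_init_affinity(struct roce_dev *dev, struct roce_uctx *uctx)
{
	unsigned int start = atomic_fetch_add(&dev->tx_port_affinity, 1);

	atomic_init(&uctx->tx_port_affinity, start);
}

/* Step 3: a QP takes the context counter modulo the number of ports. */
static unsigned int qp_pick_port(struct roce_uctx *uctx)
{
	return atomic_fetch_add(&uctx->tx_port_affinity, 1) % PORT_NUM;
}
```

With two contexts opened back to back, the first context's QPs alternate
starting at port index 0 and the second context's QPs alternate starting at
port index 1, so each context spreads its own QPs over both ports.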

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-04 16:26:14 -06:00
Jason Gunthorpe
89982f7cce Linux 4.18

Merge tag 'v4.18' into rdma.git for-next

Resolve merge conflicts from the -rc cycle against the rdma.git tree:

Conflicts:
 drivers/infiniband/core/uverbs_cmd.c
  - New ifs added to ib_uverbs_ex_create_flow in -rc and for-next
  - Merge removal of file->ucontext in for-next with new code in -rc
 drivers/infiniband/core/uverbs_main.c
  - for-next removed code from ib_uverbs_write() that was modified
    in for-rc

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-08-16 13:12:00 -06:00
Jason Gunthorpe
7d96c9b176 IB/uverbs: Have the core code create the uverbs_root_spec
There is no reason for drivers to do this, the core code should take care
of everything. The drivers will provide their information from rodata to
describe their modifications to the core's base uapi specification.

The core uses this to build up the runtime uapi for each device.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
2018-08-10 16:06:24 -06:00
Jason Gunthorpe
9f49a5b5c2 RDMA/netdev: Use priv_destructor for netdev cleanup
Now that the unregister_netdev flow for IPoIB no longer relies on external
code, we can introduce the use of priv_destructor and
needs_free_netdev.

The rdma_netdev flow is switched to use the netdev common priv_destructor
instead of the special free_rdma_netdev and the IPOIB ULP adjusted:
 - priv_destructor needs to switch to point to the ULP's destructor
   which will then call the rdma_ndev's in the right order
 - We need to be careful around the error unwind of register_netdev
   as it sometimes calls priv_destructor on failure
 - ULPs need to use ndo_init/uninit to ensure proper ordering
   of failures around register_netdev

Switching to priv_destructor is a necessary pre-requisite to using
the rtnl new_link mechanism.

The VNIC user for rdma_netdev should also be revised, but that is left for
another patch.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Denis Drozdov <denisd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-08-02 20:27:43 -06:00
Jason Gunthorpe
bccd06223f IB/uverbs: Add UVERBS_ATTR_FLAGS_IN to the specs language
This clearly indicates that the input is a bitwise combination of values
in an enum, and identifies which enum contains the definition of the bits.

Special accessors are provided that handle the mandatory validation of the
allowed bits and enforce the correct type for bitwise flags.

If we had introduced this at the start then the kabi would have uniformly
used u64 data to pass flags, however today there is a mixture of u64 and
u32 flags. All places are converted to accept both sizes and the accessor
fixes it. This allows all existing flags to grow to u64 in future without
any hassle.

Finally all flags are, by definition, optional. If flags are not passed
the accessor does not fail, but provides a value of zero.
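
The accessor behaviour described above can be sketched as follows. This is
an illustration only: get_flags64() and its parameters are hypothetical and
are not the uverbs API, and the specific errno values are assumptions.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical accessor. 'data' is the raw attribute payload, or NULL if
 * the attribute was not passed; 'len' is its size in bytes.
 */
static int get_flags64(uint64_t *to, const void *data, size_t len,
		       uint64_t allowed_bits)
{
	uint64_t flags = 0;	/* flags are optional: absent means zero */

	if (data) {
		if (len == sizeof(uint32_t)) {
			uint32_t v;

			memcpy(&v, data, sizeof(v));
			flags = v;	/* widen u32 input to u64 */
		} else if (len == sizeof(uint64_t)) {
			memcpy(&flags, data, sizeof(flags));
		} else {
			return -EINVAL;	/* neither u32 nor u64 */
		}
	}

	/* Mandatory validation: reject any bit outside the allowed set. */
	if (flags & ~allowed_bits)
		return -EOPNOTSUPP;

	*to = flags;
	return 0;
}
```

Callers always receive a u64 regardless of which size user space passed,
which is what lets existing u32 flags grow later without changing callers.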

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
2018-07-30 20:23:29 -06:00
Qing Huang
2577188edc IB/mlx5: avoid excessive warning msgs when creating VFs on 2nd port
When a CX5 device is configured in dual-port RoCE mode, after creating
many VFs against port 1, creating the same number of VFs against port 2
will flood kernel/syslog with something like
"mlx5_*:mlx5_ib_bind_slave_port:4266:(pid 5269): port 2 already
affiliated."

So basically, when traversing mlx5_ib_dev_list, mlx5_ib_add_slave_port()
repeatedly attempts to bind the new mpi structure to every device on the
list until it finds an unbound device.

Change the log level from warn to dbg to avoid log flooding as the warning
should be harmless.

Signed-off-by: Qing Huang <qing.huang@oracle.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-07-26 16:02:28 -06:00
Yishai Hadas
cb80fb1892 IB/mlx5: Enable driver uapi commands for flow steering
Expose the mlx5 flow steering parsing trees, exposing the functionality to
user space.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-07-24 14:33:52 -06:00
Yishai Hadas
6346f0bfa0 IB/mlx5: Add support for a flow table destination for driver flow steering
Add support to set a destination that is a flow table, this can come from
the DEVX destination.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-07-24 14:03:56 -06:00
Yishai Hadas
d4be3f4466 IB/mlx5: Support adding flow steering rule by raw description
Add support to set a public flow steering rule when its destination is a
TIR by using raw specification data.

The logic follows the verbs API but instead of using ib_spec(s) the raw,
device specific, description is used.

This allows supporting specialty matchers without having to define new
matches in the verbs struct based language.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-07-24 14:03:56 -06:00
Yishai Hadas
3226944124 IB/mlx5: Introduce driver create and destroy flow methods
Introduce driver create and destroy flow methods on the uverbs flow
object.

This allows the driver to get its specific device attributes to match the
underlay specification while still using the generic ib_flow object for
cleanup and code sharing.

The IB object's attributes are set via the ib_set_flow() helper function.

The specific implementation for the given specification is added in
downstream patches.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-07-24 14:03:49 -06:00
Kamal Heib
aa09ea6e6b RDMA/mlx5: Remove set but not used variables
Remove "uctx" and "pa" variables that were set but not used.

Fixes: a8b92ca1b0 ("IB/mlx5: Introduce DEVX")
Fixes: 8f06228733 ("RDMA/mlx5: Remove debug prints of VMA pointers")
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-07-23 15:34:54 -06:00
Artemy Kovalyov
b02289b3d6 RDMA: Validate grh_required when handling AVs
Extend the existing grh_required flag to check that a GRH is present when
AVs are handled.

Since we don't want to do query_port during the AV checks for performance
reasons, move the flag into the immutable_data.

Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-07-10 11:13:04 -06:00
Jason Gunthorpe
2f944c0fbf RDMA: Fix storage of PortInfo CapabilityMask in the kernel
The internal flag IP_BASED_GIDS was added to a field that was being used
to hold the PortInfo CapabilityMask without considering the effects this
will have. Since most drivers just use the value from the HW MAD it means
IP_BASED_GIDS will also become set on any HW that sets the IBA flag
IsOtherLocalChangesNoticeSupported - which is not intended.

Fix this by keeping port_cap_flags only for the IBA CapabilityMask value
and store unrelated flags externally. Move the bit definitions for this to
ib_mad.h to make it clear what is happening.

To keep the uAPI unchanged define a new set of flags in the uapi header
that are only used by ib_uverbs_query_port_resp.port_cap_flags which match
the current flags supported in rdma-core, and the values exposed by the
current kernel.

Fixes: b4a26a2728 ("IB: Report using RoCE IP based gids in port caps")
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-07-10 11:06:45 -06:00
Parav Pandit
921c0f5ba5 IB/mlx5: Honor cnt_set_id_valid flag instead of set_id
It is incorrect to depend on set_id value to know if counters were
allocated or not. set_id_valid field is set to true when counters
were allocated. Therefore, use set_id_valid while deciding to
free counters.
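
A minimal sketch of the pattern, assuming a simplified counter structure.
struct port_counters and dealloc_counters() are hypothetical names, and the
firmware call is replaced by a comment.

```c
#include <stdbool.h>
#include <stdint.h>

struct port_counters {
	uint16_t set_id;	/* the value alone does not say "allocated" */
	bool set_id_valid;	/* true only after a successful allocation */
};

/* Free the counter set only if it was allocated; report whether we did. */
static bool dealloc_counters(struct port_counters *cnts)
{
	if (!cnts->set_id_valid)
		return false;

	/* A real driver would issue the firmware dealloc command here. */
	cnts->set_id_valid = false;
	return true;
}
```

The decision depends only on set_id_valid, so a second call on the same
structure does nothing rather than freeing twice.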

Cc: <stable@vger.kernel.org> # 4.15
Fixes: aac4492ef2 ("IB/mlx5: Update counter implementation for dual port RoCE")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-07-09 13:30:25 -06:00
Leon Romanovsky
e3f1ed1f5a RDMA/mlx5: Remove unused port number parameter
Clean up the code a little by dropping the unused port_num parameter.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-07-09 13:24:21 -06:00
Jason Gunthorpe
83bb444233 RDMA/uverbs: Remove UA_FLAGS
This bit of boilerplate isn't really necessary, we can use bitfields
instead of a flags enum and the macros can then individually initialize
them through the __VA_ARGS__ like everything else.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-07-04 13:47:01 -06:00
Jason Gunthorpe
9a119cd597 RDMA/uverbs: Get rid of the & in method specifications
Hide it inside the macros. The & is confusing and interferes with using
this as a generic DSL in later patches.

Since this also touches almost every line, also run the specs through
clang-format (with 'BinPackParameters: false') to make the maintenance
easier.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-07-04 13:47:01 -06:00
Jason Gunthorpe
87fc2a620a RDMA/uverbs: Store the specs_root in the struct ib_uverbs_device
The specs are required to operate the uverbs file, so they belong inside
the ib_uverbs_device, not inside the ib_device. The spec passed in the
ib_device is just a communication from the driver and should not be used
during runtime.

This also changes the lifetime of the spec memory to match the
ib_uverbs_device, however at this time the spec_root can still contain
driver pointers after disassociation, so it cannot be used if ib_dev is
NULL. This is preparation for another series.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-07-04 13:47:01 -06:00
Jason Gunthorpe
8193abb6a8 Merge branch 'mlx5-dump-fill-mkey' into rdma.git for-next
For dependencies, branch based on 'mellanox/mlx5-next' of
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git

Pull Dump and fill MKEY from Leon Romanovsky:

====================
The MLX5 IB HCA offers a memory key, dump_fill_mkey, to increase
performance when used in send or receive operations.

It is used to force local HCA operations to skip the PCI bus access, while
keeping track of the processed length in the ibv_sge handling.

In this three patch series, we expose various bits in our HW spec
file (mlx5_ifc.h), move a FW command that mlx5_core does not need, and
export the memory key to user space through our mlx5-abi header file.
====================

Botched auto-merge in mlx5_ib_alloc_ucontext() resolved by hand.

* branch 'mlx5-dump-fill-mkey':
  IB/mlx5: Expose dump and fill memory key
  net/mlx5: Add hardware definitions for dump_fill_mkey
  net/mlx5: Limit scope of dump_fill_mkey function
  net/mlx5: Rate limit errors in command interface

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-07-04 13:23:46 -06:00
Yonatan Cohen
25bb36e75d IB/mlx5: Expose dump and fill memory key
The MLX5 IB HCA offers a memory key, dump_fill_mkey, to boost
performance when used in send or receive operations.

It is used to force local HCA operations to skip the PCI bus access,
while keeping track of the processed length in the ibv_sge handling.

Meaning, instead of a PCI write access the HCA leaves the target
memory untouched, and skips filling that packet section. Similar
behavior is done upon send, the HCA skips data in memory relevant
to this key and saves PCI bus access.

This functionality saves PCI read/write operations.

Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Reviewed-by: Guy Levi <guyle@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-07-04 13:16:04 -06:00
Maor Gottlieb
a93b632c45 IB/mlx5: Fix GRE flow specification
Currently the driver sets the mask of the gre_protocol to 0xffff
without considering the user request.

Fix it by copying the mask from the verbs spec.
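
The change can be pictured with simplified types. This is a sketch under
stated assumptions: the structs and set_gre_match() are hypothetical, and
byte-order handling is left out.

```c
#include <stdint.h>

/* Hypothetical, simplified types; byte-order handling is omitted. */
struct gre_filter { uint16_t protocol; };
struct gre_spec   { struct gre_filter val, mask; };
struct gre_match  { uint16_t criteria_protocol, value_protocol; };

static void set_gre_match(struct gre_match *m, const struct gre_spec *spec)
{
	/* Before the fix the criteria were hard-coded to 0xffff. */
	m->criteria_protocol = spec->mask.protocol;
	m->value_protocol = spec->val.protocol;
}
```

A user who supplies a partial mask now gets a partial match instead of an
exact match on the whole protocol field.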

Fixes: da2f22ae77 ("IB/mlx5: Add support for GRE flow specification")
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Ariel Levkovich <lariel@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-07-03 15:46:07 -06:00
Bart Van Assche
7496a511a0 IB/mlx5: Remove set-but-not-used variables
Avoid that the compiler complains about set-but-not-used variables when
building with W=1. This patch does not change any functionality.

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-07-03 14:14:21 -06:00
Leon Romanovsky
1517799965 RDMA/mlx5: Don't leak UARs in case of free fails
The failure in releasing one UAR doesn't mean that we can't continue to
release the rest of the system pages, so don't return too early.

As part of cleanup, there is no need to print warning if
mlx5_cmd_free_uar() fails because such warning will be printed as part of
mlx5_cmd_exec().
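
The "don't return too early" pattern can be sketched like this. It is an
illustration only: free_all_uars() and the demo_free_uar() stub are
hypothetical, with the stub standing in for the firmware free command.

```c
#include <stddef.h>

static unsigned int demo_calls;	/* how many times the stub was invoked */

/* Hypothetical stand-in for the firmware command; fails for index 2 only. */
static int demo_free_uar(unsigned int uar_index)
{
	demo_calls++;
	return uar_index == 2 ? -1 : 0;
}

/*
 * Release every UAR in the array. A failure is counted but does not end
 * the loop, so the remaining entries are still released.
 */
static size_t free_all_uars(const unsigned int *uars, size_t n,
			    int (*free_uar)(unsigned int))
{
	size_t failed = 0;

	for (size_t i = 0; i < n; i++) {
		if (free_uar(uars[i]) != 0)
			failed++;
	}
	return failed;
}
```

Returning on the first error would have skipped every UAR after the failing
one; here all entries are attempted and only the failure count is reported.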

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-06-29 11:45:50 -06:00
David S. Miller
04c6faa175 mlx5-fixes-2018-06-26

Merge tag 'mlx5-fixes-2018-06-26' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-fixes-2018-06-26

Fixes for mlx5 core and netdev driver:

Two fixes from Alex Vesker to address command interface issues
 - Race in command interface polling mode
 - Incorrect raw command length parsing

From Shay Agroskin, Fix wrong size allocation for QoS ETC TC register.

From Or Gerlitz and Eli Cohen, Address backward compatibility issues for when
Eswitch capability is not advertised for the PF host driver
    - Fix required capability for manipulating MPFS
    - E-Switch, Disallow vlan/spoofcheck setup if not being esw manager
    - Avoid dealing with vport IB/eth representors if not being e-switch manager
    - E-Switch, Avoid setup attempt if not being e-switch manager
    - Don't attempt to dereference the ppriv struct if not being eswitch manager
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-28 16:21:35 +09:00
Or Gerlitz
aff2252a2a IB/mlx5: Avoid dealing with vport representors if not being e-switch manager
In smartnic env, the host (PF) driver might not be an e-switch
manager, hence the switchdev mode representors are running on
the embedded cpu (EC) and not at the host.

As such, we should avoid dealing with vport representors if
not being esw manager.

Fixes: b5ca15ad7e ('IB/mlx5: Add proper representors support')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-06-26 15:26:30 -07:00
Yishai Hadas
d0e84c0ad3 IB/mlx5: Add support for drain SQ & RQ
This patch follows the logic from ib_core but considers the internal
device state upon executing the involved commands.

Specifically, when the device is in an internal error state, modifying
the QP to an error state can be assumed to succeed, because every
in-progress WR is going to be flushed in error in any case, as that
modify command expects.

In addition, as the drain should never fail, the driver makes sure that
post_send/recv will succeed even if the device is already in an internal
error state. As such, once the driver supplies the simulated/SW CQEs, the
CQE for the drain WR will be handled as well.

In case of an internal error state, the CQE for the drain WR may be
completed as part of the main task that handled the error state or by
the task that issued the drain WR.

As the above depends on scheduling, the code takes the relevant locks and
actions to make sure that the completion handler for that WR is always
called after the post_send/recv was issued, but not in parallel with the
other task that handles the error flow.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-06-25 14:32:36 -06:00
Jason Gunthorpe
4d7dff2b8b Merge branch 'icrc-counter' into rdma.git for-next
For dependencies, branch based on 'mellanox/mlx5-next' of
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git

Pull RoCE ICRC counters from Leon Romanovsky:

====================
This series exposes RoCE ICRC counter through existing RDMA hw_counters
sysfs interface.

The first patch has all HW definitions in mlx5_ifc.h file and second patch
is the actual counter implementation.
====================

* branch 'icrc-counter':
  IB/mlx5: Support RoCE ICRC encapsulated error counter
  net/mlx5: Add RoCE RX ICRC encapsulated counter
2018-06-22 08:53:27 -06:00
Talat Batheesh
9f876f3de6 IB/mlx5: Support RoCE ICRC encapsulated error counter
This patch adds support to query the counter that counts the
RoCE packets with corrupted ICRC (Invariant Cyclic Redundancy Code).

This counter will be under
/sys/class/infiniband/<mlx5-dev>/ports/<port>/hw_counters/

rx_icrc_encapsulated - The number of RoCE packets with ICRC
error.

Signed-off-by: Talat Batheesh <talatb@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-06-22 08:51:14 -06:00
Linus Torvalds
1abd8a8f39 4.18-rc
Regression and crashing bug fixes:
 
 - mlx4/5: Fixes for issues found from various checkers
 - A resource tracking and uverbs regression in the core code
 - qedr: NULL pointer regression found during testing
 - rxe: Various small bugs

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma fixes from Jason Gunthorpe:
 "Here are eight fairly small fixes collected over the last two weeks.

  Regression and crashing bug fixes:

   - mlx4/5: Fixes for issues found from various checkers

   - A resource tracking and uverbs regression in the core code

   - qedr: NULL pointer regression found during testing

   - rxe: Various small bugs"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  IB/rxe: Fix missing completion for mem_reg work requests
  RDMA/core: Save kernel caller name when creating CQ using ib_create_cq()
  IB/uverbs: Fix ordering of ucontext check in ib_uverbs_write
  IB/mlx4: Fix an error handling path in 'mlx4_ib_rereg_user_mr()'
  RDMA/qedr: Fix NULL pointer dereference when running over iWARP without RDMA-CM
  IB/mlx5: Fix return value check in flow_counters_set_data()
  IB/mlx5: Fix memory leak in mlx5_ib_create_flow
  IB/rxe: avoid double kfree skb
2018-06-21 07:22:30 +09:00
Leon Romanovsky
cfdeb8934b RDMA/mlx5: Refactor transport domain checks
Put all relevant checks for transport domain in the
mlx5_ib_alloc/dealloc_transport_domain functions.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-06-19 13:32:17 -06:00
Yishai Hadas
c59450c463 IB/mlx5: Expose DEVX tree
Expose DEVX tree to be used by upper layers.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-06-19 10:53:02 -06:00