Commit Graph

192 Commits

Author SHA1 Message Date
Robin Murphy
cc97c6d94e RDMA/usnic: Refactor usnic_uiom_alloc_pd()
Rather than hard-coding pci_bus_type, pass the PF device through to
usnic_uiom_alloc_pd() and retrieve its bus there. This prepares for
iommu_domain_alloc() changing to take a device rather than a bus_type.
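
A minimal sketch of the resulting shape, with the usnic_uiom_pd layout and
error handling simplified (illustrative only, not the actual diff):

#include <linux/device.h>
#include <linux/err.h>
#include <linux/iommu.h>
#include <linux/slab.h>

struct example_uiom_pd {
        struct iommu_domain *domain;
};

/* Take the PF's struct device and derive the bus from it, instead of
 * hard-coding &pci_bus_type.
 */
static struct example_uiom_pd *example_uiom_alloc_pd(struct device *dev)
{
        struct example_uiom_pd *pd = kzalloc(sizeof(*pd), GFP_KERNEL);

        if (!pd)
                return ERR_PTR(-ENOMEM);

        pd->domain = iommu_domain_alloc(dev->bus);
        if (!pd->domain) {
                kfree(pd);
                return ERR_PTR(-ENOMEM);
        }
        return pd;
}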

Link: https://lore.kernel.org/r/ef607cb3f5a09920b86971b8c8e60af8c647457e.1649169359.git.robin.murphy@arm.com
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-04-08 14:20:20 -03:00
Robin Murphy
d9539fb7c2 RDMA/usnic: Stop using iommu_present()
Even if an IOMMU might be present for some PCI segment in the system,
that doesn't necessarily mean it provides translation for the device(s)
we care about. Replace iommu_present() with a more appropriate check at
probe time, and garbage-collect the resulting empty init function.
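
A minimal sketch of the kind of probe-time check this describes, using the
generic device_iommu_mapped() helper (the exact usnic check and error code
are assumptions):

#include <linux/device.h>
#include <linux/iommu.h>
#include <linux/pci.h>

static int example_probe_check(struct pci_dev *pdev)
{
        /* Ask whether an IOMMU actually translates *this* device, rather
         * than iommu_present(&pci_bus_type), which only says some IOMMU
         * exists for the bus type.
         */
        if (!device_iommu_mapped(&pdev->dev)) {
                dev_err(&pdev->dev, "IOMMU translation is required\n");
                return -EPERM;
        }
        return 0;
}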

Link: https://lore.kernel.org/r/f707b4248e1d33b6d2c7f1d7c94febb802cf9890.1649161199.git.robin.murphy@arm.com
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-04-08 14:12:12 -03:00
Jason Gunthorpe
e945c653c8 RDMA: Split kernel-only global device caps from uverbs device caps
Split out flags from ib_device::device_cap_flags that are only used
internally to the kernel into kernel_cap_flags that is not part of the
uapi. This limits the device_cap_flags to being the same bitmap that will
be copied to userspace.

This cleanly splits out the uverbs flags from the kernel flags to avoid
confusion in the flags bitmap.

Add some short comments describing what each of the kernel flags is
connected to. Remove unused kernel flags.
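
A rough sketch of what the split looks like from a driver's query_device(),
assuming the IBK_* flag names this series introduces (values chosen purely
for illustration):

#include <rdma/ib_verbs.h>

static int example_query_device(struct ib_device *ibdev,
                                struct ib_device_attr *attr,
                                struct ib_udata *udata)
{
        /* uverbs-visible bitmap: copied as-is to userspace */
        attr->device_cap_flags = IB_DEVICE_MEM_MGT_EXTENSIONS;
        /* kernel-only bitmap: never part of the uapi */
        attr->kernel_cap_flags = IBK_LOCAL_DMA_LKEY;
        return 0;
}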

Link: https://lore.kernel.org/r/0-v2-22c19e565eef+139a-kern_caps_jgg@nvidia.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-04-06 15:02:13 -03:00
Leon Romanovsky
cad4c6caad RDMA/usnic: Delete useless module.h include
There is no need to include module.h in the following files.

Link: https://lore.kernel.org/r/745480bafd6f63c97a7049f34d84ef17dbc167d6.1642960861.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-01-28 13:03:13 -04:00
Greg Kroah-Hartman
01097139e7 RDMA: Use default_groups in kobj_type
There are currently two ways to create a set of sysfs files for a
kobj_type: through the default_attrs field and through the default_groups
field.  Move the IB code to use the default_groups field, which has been
the preferred way since commit aa30f47cf6 ("kobject: Add support for
default attribute groups to kobj_type"), so that we can soon get rid of
the obsolete default_attrs field.
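
For reference, a generic sketch of the preferred pattern (attribute and
type names here are made up):

#include <linux/kobject.h>
#include <linux/sysfs.h>

static ssize_t foo_show(struct kobject *kobj, struct kobj_attribute *attr,
                        char *buf)
{
        return sysfs_emit(buf, "example\n");
}
static struct kobj_attribute foo_attr = __ATTR_RO(foo);

static struct attribute *example_attrs[] = {
        &foo_attr.attr,
        NULL,
};
ATTRIBUTE_GROUPS(example);              /* generates example_groups */

static struct kobj_type example_ktype = {
        .sysfs_ops      = &kobj_sysfs_ops,
        .default_groups = example_groups,  /* preferred over default_attrs */
};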

Link: https://lore.kernel.org/r/20220103152259.531034-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-01-05 15:15:15 -04:00
Dan Carpenter
10f2d1cbf8 RDMA/usnic: Clean up usnic_ib_alloc_pd()
Remove the unnecessary "umem_pd" variable. Also, usnic_uiom_alloc_pd()
never returns NULL, so remove the NULL check.

Link: https://lore.kernel.org/r/20211118113924.GH1147@kili
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-11-19 12:02:30 -04:00
Jakub Kicinski
fd92213e9a RDMA: Constify netdev->dev_addr accesses
netdev->dev_addr will become const soon, make sure drivers propagate the
qualifier.
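
A minimal sketch of what propagating the qualifier means in a driver that
only reads the address (the helper name is made up):

#include <linux/etherdevice.h>
#include <linux/netdevice.h>

static void example_copy_mac(struct net_device *netdev, u8 *mac_out)
{
        const u8 *addr = netdev->dev_addr;      /* read-only access */

        ether_addr_copy(mac_out, addr);
}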

Link: https://lore.kernel.org/r/20211019182604.1441387-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Acked-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-25 14:33:09 -03:00
Leon Romanovsky
a86cd017a4 RDMA/usnic: Lock VF with mutex instead of spinlock
The usnic VF doesn't need a lock in atomic context to create QPs, so it is
safe to use a mutex instead of a spinlock. This change fixes the following
smatch error.

Smatch static checker warning:

   lib/kobject.c:289 kobject_set_name_vargs()
    warn: sleeping in atomic context
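
The general shape of the conversion, with illustrative structure and
function names:

#include <linux/mutex.h>

struct example_vf {
        struct mutex lock;      /* was: spinlock_t lock */
};

static void example_use_vf(struct example_vf *vf)
{
        /* QP group creation can sleep (e.g. kobject naming allocates with
         * GFP_KERNEL), so the VF must not be held under a spinlock here.
         */
        mutex_lock(&vf->lock);
        /* ... find a free VF / create the QP group ... */
        mutex_unlock(&vf->lock);
}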

Fixes: 514aee660d ("RDMA: Globally allocate and release QP memory")
Link: https://lore.kernel.org/r/2a0e295786c127e518ebee8bb7cafcb819a625f6.1631520231.git.leonro@nvidia.com
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-24 10:55:28 -03:00
Leon Romanovsky
514aee660d RDMA: Globally allocate and release QP memory
Convert the QP object to follow the IB/core general allocation scheme.
This change allows us to make sure that restrack properly krefs the
memory.
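
A rough sketch of the convention, assuming the post-conversion .create_qp
signature and the INIT_RDMA_OBJ_SIZE() declaration (driver names are
illustrative):

#include <rdma/ib_verbs.h>

struct example_qp {
        struct ib_qp ibqp;      /* core object must be embedded first */
        u32 create_flags;       /* driver-private state */
};

static int example_create_qp(struct ib_qp *ibqp,
                             struct ib_qp_init_attr *init_attr,
                             struct ib_udata *udata)
{
        struct example_qp *qp = container_of(ibqp, struct example_qp, ibqp);

        /* ibqp was allocated by ib_core (and is released by it, with
         * restrack holding a kref); the driver only fills in its part.
         */
        qp->create_flags = init_attr->create_flags;
        return 0;
}

static const struct ib_device_ops example_dev_ops = {
        .create_qp = example_create_qp,
        INIT_RDMA_OBJ_SIZE(ib_qp, example_qp, ibqp),
};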

Link: https://lore.kernel.org/r/48e767124758aeecc433360ddd85eaa6325b34d9.1627040189.git.leonro@nvidia.com
Reviewed-by: Gal Pressman <galpress@amazon.com> #efa
Tested-by: Gal Pressman <galpress@amazon.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> #rdma and core
Tested-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Tested-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-08-03 13:44:27 -03:00
Jason Gunthorpe
915e4af59f RDMA: Remove rdma_set_device_sysfs_group()
The driver's device group can be specified as part of the ops structure
like the device's port group. No need for the complicated API.
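
A short sketch of the simpler arrangement (attribute contents elided,
names made up):

#include <rdma/ib_verbs.h>

static struct attribute *example_dev_attrs[] = {
        /* &dev_attr_board_id.attr, ... */
        NULL,
};

static const struct attribute_group example_attr_group = {
        .attrs = example_dev_attrs,
};

static const struct ib_device_ops example_dev_ops = {
        /* replaces the old rdma_set_device_sysfs_group() call */
        .device_group = &example_attr_group,
};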

Link: https://lore.kernel.org/r/8964785a34fd3a29ff5b6693493f575b717e594d.1623427137.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-16 20:58:32 -03:00
Linus Torvalds
f34b2cf178 RDMA merge window pull request

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
 "This is significantly bug fixes and general cleanups. The noteworthy
  new features are fairly small:

   - XRC support for HNS and improves RQ operations

   - Bug fixes and updates for hns, mlx5, bnxt_re, hfi1, i40iw, rxe, siw
     and qib

   - Quite a few general cleanups on spelling, error handling, static
     checker detections, etc

   - Increase the number of device ports supported beyond 255. High port
     count software switches now exist

   - Several bug fixes for rtrs

   - mlx5 Device Memory support for host controlled atomics

   - Report SRQ tables through to rdma-tool"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (145 commits)
  IB/qib: Remove redundant assignment to ret
  RDMA/nldev: Add copy-on-fork attribute to get sys command
  RDMA/bnxt_re: Fix a double free in bnxt_qplib_alloc_res
  RDMA/siw: Fix a use after free in siw_alloc_mr
  IB/hfi1: Remove redundant variable rcd
  RDMA/nldev: Add QP numbers to SRQ information
  RDMA/nldev: Return SRQ information
  RDMA/restrack: Add support to get resource tracking for SRQ
  RDMA/nldev: Return context information
  RDMA/core: Add CM to restrack after successful attachment to a device
  RDMA/cma: Skip device which doesn't support CM
  RDMA/rxe: Fix a bug in rxe_fill_ip_info()
  RDMA/mlx5: Expose private query port
  RDMA/mlx4: Remove an unused variable
  RDMA/mlx5: Fix type assignment for ICM DM
  IB/mlx5: Set right RoCE l3 type and roce version while deleting GID
  RDMA/i40iw: Fix error unwinding when i40iw_hmc_sd_one fails
  RDMA/cxgb4: add missing qpid increment
  IB/ipoib: Remove unnecessary struct declaration
  RDMA/bnxt_re: Get rid of custom module reference counting
  ...
2021-05-01 09:15:05 -07:00
Sami Tolvanen
4f0f586bf0 treewide: Change list_sort to use const pointers
list_sort() internally casts the comparison function passed to it
to a different type with constant struct list_head pointers, and
uses this pointer to call the functions, which trips indirect call
Control-Flow Integrity (CFI) checking.

Instead of removing the consts, this change defines the
list_cmp_func_t type and changes the comparison function types of
all list_sort() callers to use const pointers, thus avoiding type
mismatches.
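
A small sketch of a caller-side comparison function with the new
const-qualified signature (the item type is made up):

#include <linux/list.h>
#include <linux/list_sort.h>

struct example_item {
        struct list_head node;
        int key;
};

/* Matches list_cmp_func_t: const struct list_head pointers throughout. */
static int example_cmp(void *priv, const struct list_head *a,
                       const struct list_head *b)
{
        const struct example_item *ia = list_entry(a, struct example_item, node);
        const struct example_item *ib = list_entry(b, struct example_item, node);

        return ia->key - ib->key;
}

/* usage: list_sort(NULL, &some_list_head, example_cmp); */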

Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210408182843.1754385-10-samitolvanen@google.com
2021-04-08 16:04:22 -07:00
Mark Bloch
1fb7f8973f RDMA: Support more than 255 rdma ports
Current code uses many different types when dealing with a port of an
RDMA device: u8, unsigned int and u32. Switch to u32 to clean up the
logic.

This allows us to make (at least) the core view consistent and use the
same type. Unfortunately not all places can be converted. Many uverbs
functions expect port to be u8 so keep those places in order not to break
UAPIs.  HW/Spec defined values must also not be changed.

With the switch to u32 we can now support devices with more than 255
ports. U32_MAX is reserved to make the control logic a bit easier to deal
with. As a device with U32_MAX ports probably isn't going to happen any
time soon, this seems like a non-issue.

When a device with more than 255 ports is created, uverbs will report the
RDMA device as having 255 ports, as this is the maximum currently
supported.

The verbs interface is not changed yet because the IBTA spec limits the
port size in too many places to be u8, and all applications that rely on
verbs won't be able to cope with this change. At this stage, we are only
extending the interfaces that use the vendor channel.

Once the limitation is lifted, mlx5 in switchdev mode will be able to have
thousands of SFs created by the device. Since the only instance of an RDMA
device that reports more than 255 ports will be a representor device,
which exposes itself as a RAW Ethernet only device, CM/MAD/IPoIB and other
ULPs aren't affected by this change and their sysfs interfaces exposed to
userspace can remain unchanged.

While here, clean up some alignment issues and remove unneeded sanity
checks (mainly in rdmavt).
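
As a sketch of what the type change means at a driver op boundary
(illustrative op, not a specific driver):

#include <rdma/ib_verbs.h>

/* port_num widens from u8 to u32 in the core-facing ops */
static int example_query_port(struct ib_device *ibdev, u32 port_num,
                              struct ib_port_attr *props)
{
        props->max_mtu = IB_MTU_4096;
        props->state = IB_PORT_ACTIVE;
        return 0;
}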

Link: https://lore.kernel.org/r/20210301070420.439400-1-leon@kernel.org
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-26 09:31:21 -03:00
Joe Perches
f068cb1db2 RDMA/usnic: Fix misuse of sysfs_emit_at
In commit e28bf1f03b ("RDMA: Convert various random sprintf sysfs _show
uses to sysfs_emit") I mistakenly used len = sysfs_emit_at to overwrite
the last trailing space of potentially multiple entry output.

Instead, use a more common style by removing the trailing space from the
output formats, adding a prefixing space to the continuation formats, and
converting the final terminating output newline from the defective
	len = sysfs_emit_at(buf, len, "\n");
to the now appropriate and typical
	len += sysfs_emit_at(buf, len, "\n");
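
A sketch of the corrected multi-entry pattern (hypothetical show function):

#include <linux/device.h>
#include <linux/sysfs.h>

static ssize_t example_show(struct device *dev, struct device_attribute *attr,
                            char *buf)
{
        int i, len = 0;

        for (i = 0; i < 4; i++)
                len += sysfs_emit_at(buf, len, "%sentry%d", i ? " " : "", i);

        /* accumulate the newline too: len +=, not len = */
        len += sysfs_emit_at(buf, len, "\n");

        return len;
}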

Fixes: e28bf1f03b ("RDMA: Convert various random sprintf sysfs _show uses to sysfs_emit")
Link: https://lore.kernel.org/r/5eb794b9c9bca0494d94b2b209f1627fa4e7b555.camel@perches.com
Reported-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-19 20:27:29 -04:00
Dinghao Liu
a306aba9c8 RDMA/usnic: Fix memleak in find_free_vf_and_create_qp_grp
If usnic_ib_qp_grp_create() fails at the first call, dev_list will not be
freed on error, which leads to a memory leak.
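
A generic sketch of the shape of the fix, with the failing call stubbed out
(names are illustrative, not the usnic code):

#include <linux/slab.h>

static int example_try_create(void **dev_list)
{
        return -ENODEV;   /* stands in for usnic_ib_qp_grp_create() failing */
}

static int example_find_free_vf_and_create_qp_grp(void)
{
        void **dev_list;
        int err;

        dev_list = kcalloc(8, sizeof(*dev_list), GFP_KERNEL);
        if (!dev_list)
                return -ENOMEM;

        err = example_try_create(dev_list);
        /* both the error path and the success path release dev_list;
         * previously the error path returned without freeing it
         */
        kfree(dev_list);
        return err;
}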

Fixes: e3cf00d0a8 ("IB/usnic: Add Cisco VIC low-level hardware driver")
Link: https://lore.kernel.org/r/20201226074248.2893-1-dinghao.liu@zju.edu.cn
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-07 16:07:12 -04:00
Joe Perches
e28bf1f03b RDMA: Convert various random sprintf sysfs _show uses to sysfs_emit
Manual changes for sysfs_emit as cocci scripts can't easily convert them.

Link: https://lore.kernel.org/r/ecde7791467cddb570c6f6d2c908ffbab9145cac.1602122880.git.joe@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-30 21:03:52 -03:00
Joe Perches
45808361d4 RDMA: Manual changes for sysfs_emit and neatening
Make changes to use sysfs_emit in the RDMA code as cocci scripts can not
be written to handle _all_ the possible variants of various sprintf family
uses in sysfs show functions.

While there, make the code more legible and update its style to be more
like the typical kernel styles.

Miscellanea:

o Use intermediate pointers for dereferences
o Add and use string lookup functions
o return early when any intermediate call fails so normal return is
  at the bottom of the function
o mlx4/mcg.c:sysfs_show_group: use scnprintf to format intermediate strings

Link: https://lore.kernel.org/r/f5c9e4c9d8dafca1b7b70bd597ee7f8f219c31c8.1602122880.git.joe@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-30 21:03:52 -03:00
Joe Perches
1c7fd72687 RDMA: Convert sysfs device * show functions to use sysfs_emit()
Done with cocci script:

@@
identifier d_show;
identifier dev, attr, buf;
@@

ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
{
	<...
	return
-	sprintf(buf,
+	sysfs_emit(buf,
	...);
	...>
}

@@
identifier d_show;
identifier dev, attr, buf;
@@

ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
{
	<...
	return
-	snprintf(buf, PAGE_SIZE,
+	sysfs_emit(buf,
	...);
	...>
}

@@
identifier d_show;
identifier dev, attr, buf;
@@

ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
{
	<...
	return
-	scnprintf(buf, PAGE_SIZE,
+	sysfs_emit(buf,
	...);
	...>
}

@@
identifier d_show;
identifier dev, attr, buf;
expression chr;
@@

ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
{
	<...
	return
-	strcpy(buf, chr);
+	sysfs_emit(buf, chr);
	...>
}

@@
identifier d_show;
identifier dev, attr, buf;
identifier len;
@@

ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
{
	<...
	len =
-	sprintf(buf,
+	sysfs_emit(buf,
	...);
	...>
	return len;
}

@@
identifier d_show;
identifier dev, attr, buf;
identifier len;
@@

ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
{
	<...
	len =
-	snprintf(buf, PAGE_SIZE,
+	sysfs_emit(buf,
	...);
	...>
	return len;
}

@@
identifier d_show;
identifier dev, attr, buf;
identifier len;
@@

ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
{
	<...
	len =
-	scnprintf(buf, PAGE_SIZE,
+	sysfs_emit(buf,
	...);
	...>
	return len;
}

@@
identifier d_show;
identifier dev, attr, buf;
identifier len;
@@

ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
{
	<...
-	len += scnprintf(buf + len, PAGE_SIZE - len,
+	len += sysfs_emit_at(buf, len,
	...);
	...>
	return len;
}

@@
identifier d_show;
identifier dev, attr, buf;
expression chr;
@@

ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
{
	...
-	strcpy(buf, chr);
-	return strlen(buf);
+	return sysfs_emit(buf, chr);
}

Link: https://lore.kernel.org/r/7f406fa8e3aa2552c022bec680f621e38d1fe414.1602122879.git.joe@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-26 19:53:21 -03:00
Jason Gunthorpe
1f11a7610e RDMA: Check create_flags during create_qp
Each driver should check that the QP attrs create_flags is supported.
Unfortunately, when create_flags was added to the QP attrs the drivers were
not updated. uverbs_ex_cmd_mask was used to block it - even though kernel
drivers use these flags too.

Check that flags is zero in all drivers that don't use it, remove
IB_USER_VERBS_EX_CMD_CREATE_QP from uverbs_ex_cmd_mask. Fix the error code
to be EOPNOTSUPP.
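
A sketch of the per-driver check, using the .create_qp signature of that
time (ib_qp pointer return); the rest of QP creation is elided:

#include <rdma/ib_verbs.h>

static struct ib_qp *example_create_qp(struct ib_pd *pd,
                                       struct ib_qp_init_attr *init_attr,
                                       struct ib_udata *udata)
{
        /* This driver supports no create_flags, so reject them here
         * instead of relying on uverbs_ex_cmd_mask to filter the command.
         */
        if (init_attr->create_flags)
                return ERR_PTR(-EOPNOTSUPP);

        /* ... the rest of QP creation ... */
        return ERR_PTR(-ENOMEM);        /* placeholder for the elided body */
}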

Link: https://lore.kernel.org/r/8-v1-caa70ba3d1ab+1436e-ucmd_mask_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-26 19:27:59 -03:00
Jason Gunthorpe
1c407cb5d7 RDMA: Check flags during create_cq
Each driver should check that the CQ attrs are supported. Unfortunately,
when flags was added to the CQ attrs the drivers were not updated;
uverbs_ex_cmd_mask was used to block it. This was missed when create CQ
was converted to ioctl, so non-zero flags could have been passed into
drivers.

Check that flags is zero in all drivers that don't use it, remove
IB_USER_VERBS_EX_CMD_CREATE_CQ from uverbs_ex_cmd_mask.

Fixes: 41b2a71fc8 ("IB/uverbs: Move ioctl path of create_cq and destroy_cq to a new file")
Link: https://lore.kernel.org/r/7-v1-caa70ba3d1ab+1436e-ucmd_mask_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-26 19:27:59 -03:00
Jason Gunthorpe
26e990badd RDMA: Check attr_mask during modify_qp
Each driver should check that it can support the provided attr_mask during
modify_qp. IB_USER_VERBS_EX_CMD_MODIFY_QP was being used to block
modify_qp_ex because the driver didn't check RATE_LIMIT.

Link: https://lore.kernel.org/r/6-v1-caa70ba3d1ab+1436e-ucmd_mask_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-26 19:27:58 -03:00
Jason Gunthorpe
44ce37bc8b RDMA: Move more uverbs_cmd_mask settings to the core
These functions all depend on the driver providing a specific op:

- REREG_MR is rereg_user_mr(). bnxt_re set this without providing the op
- ATTACH/DETACH_MCAST is attach_mcast()/detach_mcast(). usnic set this
  without providing the op
- OPEN_QP doesn't involve the driver but requires an XRCD. qedr provides
  xrcd but forgot to set it, usnic doesn't provide XRCD but sets it anyhow.
- OPEN/CLOSE_XRCD are the ops alloc_xrcd()/dealloc_xrcd()
- CREATE_SRQ/DESTROY_SRQ are the ops create_srq()/destroy_srq()
- QUERY/MODIFY_SRQ is op query_srq()/modify_srq(). hns sets this but
  sometimes supplies a NULL op.
- RESIZE_CQ is op resize_cq(). bnxt_re sets this but doesn't supply an op
- ALLOC/DEALLOC_MW is alloc_mw()/dealloc_mw(). cxgb4 provided a
  (now deleted) implementation but no userspace

All drivers were checked that no drivers provide the op without also
setting uverbs_cmd_mask so this should have no functional change.

Link: https://lore.kernel.org/r/4-v1-caa70ba3d1ab+1436e-ucmd_mask_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-26 19:27:58 -03:00
Jason Gunthorpe
c074bb1e30 RDMA: Remove elements in uverbs_cmd_mask that all drivers set
This is a step toward eliminating uverbs_cmd_mask. Preset this list in the
core code. Only the op reg_user_mr wasn't already being required from the
drivers.

Link: https://lore.kernel.org/r/3-v1-caa70ba3d1ab+1436e-ucmd_mask_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-26 19:27:57 -03:00
Jason Gunthorpe
e0477b34d9 RDMA: Explicitly pass in the dma_device to ib_register_device
The code in setup_dma_device has become rather convoluted; move all of
this to the drivers. Drivers now pass in a DMA capable struct device which
will be used to set up DMA, or drivers must fully configure the ibdev for
DMA and pass in NULL.
DMA and pass in NULL.

Other than setting the masks in rvt all drivers were doing this already
anyhow.

mthca, mlx4 and mlx5 were already setting up maximum DMA segment size for
DMA based on their hardware limits in:
__mthca_init_one()
  dma_set_max_seg_size (1G)

__mlx4_init_one()
  dma_set_max_seg_size (1G)

mlx5_pci_init()
  set_dma_caps()
    dma_set_max_seg_size (2G)

Other non-software drivers (except usnic) had their maximum DMA segment
size extended to UINT_MAX [1, 2] instead of the previous 2G.

[1] https://lore.kernel.org/linux-rdma/20200924114940.GE9475@nvidia.com/
[2] https://lore.kernel.org/linux-rdma/20200924114940.GE9475@nvidia.com/
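
A sketch of what this looks like at registration time for a PCI-based
driver (the device name format here is illustrative):

#include <linux/pci.h>
#include <rdma/ib_verbs.h>

static int example_register(struct ib_device *ibdev, struct pci_dev *pdev)
{
        /* Pass the DMA-capable PCI device explicitly; drivers that fully
         * configure the ibdev for DMA themselves pass NULL instead.
         */
        return ib_register_device(ibdev, "example_%d", &pdev->dev);
}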

Link: https://lore.kernel.org/r/20201008082752.275846-1-leon@kernel.org
Link: https://lore.kernel.org/r/6b2ed339933d066622d5715903870676d8cc523a.1602590106.git.mchehab+huawei@kernel.org
Suggested-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-16 13:53:46 -03:00
Leon Romanovsky
43d781b9fa RDMA: Allow fail of destroy CQ
Like any other verbs object, a CQ shouldn't fail during destroy, but
mlx5_ib didn't follow this contract when it mixed IB verbs objects with
DEVX. Such a mix leads to a situation where FW and kernel are fully
interdependent on the reference counting of each side.

Kernel verbs and drivers that don't have DEVX flows shouldn't fail.
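
For a driver without such interdependencies the new contract is trivial to
satisfy; a sketch, assuming the int-returning .destroy_cq signature:

#include <rdma/ib_verbs.h>

static int example_destroy_cq(struct ib_cq *cq, struct ib_udata *udata)
{
        /* ... release the driver's CQ resources ... */
        return 0;       /* kernel verbs / non-DEVX paths never fail here */
}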

Fixes: e39afe3d6d ("RDMA: Convert CQ allocations to be under core responsibility")
Link: https://lore.kernel.org/r/20200907120921.476363-7-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-09 14:14:29 -03:00
Leon Romanovsky
91a7c58fce RDMA: Restore ability to fail on PD deallocate
The IB verbs objects are counted by the kernel, and ib_core ensures that
deallocating a PD will succeed, since it is only called once all other
objects that depend on the PD have been released. This is achieved by
managing various reference counters on such objects.

The mlx5 driver didn't follow this standard flow when it allowed DEVX
objects, which are not managed by ib_core, to be interleaved with the ones
under ib_core responsibility.

In such interleaved scenarios the deallocate command can fail, and ib_core
will leave the uobject in its internal DB and attempt to clean it up later
to free resources anyway.

This change partially restores the returned value from dealloc_pd() for
all drivers, keeping in mind that non-DEVX devices and kernel verbs paths
shouldn't fail.

Fixes: 21a428a019 ("RDMA: Handle PD allocations by IB/core")
Link: https://lore.kernel.org/r/20200907120921.476363-2-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-09 13:57:22 -03:00
Jason Gunthorpe
6989aa62d3 Linux 5.9-rc3

Merge tag 'v5.9-rc3' into rdma.git for-next

Required due to dependencies in following patches.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-31 12:28:12 -03:00
Kamal Heib
b9caebb290 RDMA/usnic: Remove the query_pkey callback
Now that query_pkey() isn't required by the RDMA core, this callback
can be removed from the usnic provider. The libfabric userspace never
touches the pkey.

Link: https://lore.kernel.org/r/20200820125346.111902-1-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-27 09:02:42 -03:00
Colin Ian King
dfd022a9ea RDMA/usnic: Fix spelling mistake "transistion" -> "transition"
There is a spelling mistake in a usnic_err error message. Fix it.

Link: https://lore.kernel.org/r/20200805141459.23069-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-18 13:18:53 -03:00
Christophe JAILLET
3e9fed7fb6 RDMA/usnic: switch from 'pci_' to 'dma_' API
The wrappers in include/linux/pci-dma-compat.h should go away.

The patch has been generated with the coccinelle script below.
It has been compile tested.

When memory is allocated, GFP_ATOMIC should be used to be consistent with
the surrounding code.

@@
@@
-    PCI_DMA_BIDIRECTIONAL
+    DMA_BIDIRECTIONAL

@@
@@
-    PCI_DMA_TODEVICE
+    DMA_TO_DEVICE

@@
@@
-    PCI_DMA_FROMDEVICE
+    DMA_FROM_DEVICE

@@
@@
-    PCI_DMA_NONE
+    DMA_NONE

@@
expression e1, e2, e3;
@@
-    pci_alloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3;
@@
-    pci_zalloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3, e4;
@@
-    pci_free_consistent(e1, e2, e3, e4)
+    dma_free_coherent(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_single(e1, e2, e3, e4)
+    dma_map_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_single(e1, e2, e3, e4)
+    dma_unmap_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4, e5;
@@
-    pci_map_page(e1, e2, e3, e4, e5)
+    dma_map_page(&e1->dev, e2, e3, e4, e5)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_page(e1, e2, e3, e4)
+    dma_unmap_page(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_sg(e1, e2, e3, e4)
+    dma_map_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_sg(e1, e2, e3, e4)
+    dma_unmap_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
+    dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_device(e1, e2, e3, e4)
+    dma_sync_single_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
+    dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_device(e1, e2, e3, e4)
+    dma_sync_sg_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2;
@@
-    pci_dma_mapping_error(e1, e2)
+    dma_mapping_error(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_dma_mask(e1, e2)
+    dma_set_mask(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_consistent_dma_mask(e1, e2)
+    dma_set_coherent_mask(&e1->dev, e2)

Link: https://lore.kernel.org/r/20200711073120.249146-1-christophe.jaillet@wanadoo.fr
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-16 15:42:05 -03:00
Masahiro Yamada
a7f7f6248d treewide: replace '---help---' in Kconfig files with 'help'
Since commit 84af7a6194 ("checkpatch: kconfig: prefer 'help' over
'---help---'"), the number of '---help---' has been gradually
decreasing, but there are still more than 2400 instances.

This commit finishes the conversion. While I touched the lines,
I also fixed the indentation.

There are a variety of indentation styles found.

  a) 4 spaces + '---help---'
  b) 7 spaces + '---help---'
  c) 8 spaces + '---help---'
  d) 1 space + 1 tab + '---help---'
  e) 1 tab + '---help---'    (correct indentation)
  f) 1 tab + 1 space + '---help---'
  g) 1 tab + 2 spaces + '---help---'

In order to convert all of them to 1 tab + 'help', I ran the
following command:

  $ find . -name 'Kconfig*' | xargs sed -i 's/^[[:space:]]*---help---/\thelp/'

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2020-06-14 01:57:21 +09:00
Michel Lespinasse
d8ed45c5dc mmap locking API: use coccinelle to convert mmap_sem rwsem call sites
This change converts the existing mmap_sem rwsem calls to use the new mmap
locking API instead.

The change is generated using coccinelle with the following rule:

// spatch --sp-file mmap_lock_api.cocci --in-place --include-headers --dir .

@@
expression mm;
@@
(
-init_rwsem
+mmap_init_lock
|
-down_write
+mmap_write_lock
|
-down_write_killable
+mmap_write_lock_killable
|
-down_write_trylock
+mmap_write_trylock
|
-up_write
+mmap_write_unlock
|
-downgrade_write
+mmap_write_downgrade
|
-down_read
+mmap_read_lock
|
-down_read_killable
+mmap_read_lock_killable
|
-down_read_trylock
+mmap_read_trylock
|
-up_read
+mmap_read_unlock
)
-(&mm->mmap_sem)
+(mm)

Signed-off-by: Michel Lespinasse <walken@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Liam Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ying Han <yinghan@google.com>
Link: http://lkml.kernel.org/r/20200520052908.204642-5-walken@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09 09:39:14 -07:00
Jason Gunthorpe
4d12c04caa RDMA: Remove 'max_map_per_fmr'
Now that FMR support is gone, this attribute can be deleted from all
places.

Link: https://lore.kernel.org/r/13-v3-f58e6669d5d3+2cf-fmr_removal_jgg@mellanox.com
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-06-02 20:32:54 -03:00
Kamal Heib
bb8865f435 RDMA/providers: Fix return value when QP type isn't supported
The proper return code is "-EOPNOTSUPP" when the requested QP type is
not supported by the provider.

Link: https://lore.kernel.org/r/20200130082049.463-1-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-04 12:13:42 -04:00
Gustavo A. R. Silva
5b361328ca RDMA: Replace zero-length array with flexible-array member
The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these is a flexible array member[1][2],
introduced in C99:

struct foo {
        int stuff;
        struct boo array[];
};

By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure, which
will help us prevent some kinds of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.

Also notice that dynamic memory allocations won't be affected by
this change:

"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]

This issue was found with the help of Coccinelle.

[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 7649773293 ("cxgb3/l2t: Fix undefined behaviour")

Link: https://lore.kernel.org/r/20200213010425.GA13068@embeddedor.com
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> # added a few more
2020-02-20 13:33:51 -04:00
John Hubbard
f1f6a7dd9b mm, tree-wide: rename put_user_page*() to unpin_user_page*()
In order to provide a clearer, more symmetric API for pinning and
unpinning DMA pages.  This way, pin_user_pages*() calls match up with
unpin_user_pages*() calls, and the API is a lot closer to being
self-explanatory.
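
A minimal sketch of the symmetric pairing the rename aims for (buffer
management elided):

#include <linux/mm.h>

static int example_pin_then_unpin(unsigned long addr, struct page **pages,
                                  int nr)
{
        int pinned = pin_user_pages_fast(addr, nr, FOLL_WRITE, pages);

        if (pinned <= 0)
                return pinned;

        /* ... use the pinned pages for DMA ... */

        unpin_user_pages(pages, pinned);        /* was: put_user_pages() */
        return 0;
}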

Link: http://lkml.kernel.org/r/20200107224558.2362728-23-jhubbard@nvidia.com
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Björn Töpel <bjorn.topel@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Leon Romanovsky <leonro@mellanox.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31 10:30:38 -08:00
John Hubbard
dfa0a4fff1 IB/{core,hw,umem}: set FOLL_PIN via pin_user_pages*(), fix up ODP
Convert infiniband to use the new pin_user_pages*() calls.

Also, revert earlier changes to Infiniband ODP that had it using
put_user_page().  ODP is "Case 3" in
Documentation/core-api/pin_user_pages.rst, which is to say, normal
get_user_pages() and put_page() is the API to use there.

The new pin_user_pages*() calls replace corresponding get_user_pages*()
calls, and set the FOLL_PIN flag.  The FOLL_PIN flag requires that the
caller must return the pages via put_user_page*() calls, but infiniband
was already doing that as part of an earlier commit.

Link: http://lkml.kernel.org/r/20200107224558.2362728-14-jhubbard@nvidia.com
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Björn Töpel <bjorn.topel@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Leon Romanovsky <leonro@mellanox.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31 10:30:37 -08:00
akpm@linux-foundation.org
2d15eb31b5 mm/gup: add make_dirty arg to put_user_pages_dirty_lock()
From: John Hubbard <jhubbard@nvidia.com>
Subject: mm/gup: add make_dirty arg to put_user_pages_dirty_lock()

Patch series "mm/gup: add make_dirty arg to put_user_pages_dirty_lock()",
v3.

There are about 50+ patches in my tree [2], and I'll be sending out the
remaining ones in a few more groups:

* The block/bio related changes (Jerome mostly wrote those, but I've had
  to move stuff around extensively, and add a little code)

* mm/ changes

* other subsystem patches

* an RFC that shows the current state of the tracking patch set.  That
  can only be applied after all call sites are converted, but it's good to
  get an early look at it.

This is part of a tree-wide conversion, as described in fc1d8e7cca ("mm:
introduce put_user_page*(), placeholder versions").

This patch (of 3):

Provide a more capable variation of put_user_pages_dirty_lock(), and delete
put_user_pages_dirty().  This is based on the following:

1.  Lots of call sites become simpler if a bool is passed into
   put_user_page*(), instead of making the call site choose which
   put_user_page*() variant to call.

2.  Christoph Hellwig's observation that set_page_dirty_lock() is
   usually correct, and set_page_dirty() is usually a bug, or at least
   questionable, within a put_user_page*() calling chain.

This leads to the following API choices:

    * put_user_pages_dirty_lock(page, npages, make_dirty)

    * There is no put_user_pages_dirty(). You have to
      hand code that, in the rare case that it's
      required.

[jhubbard@nvidia.com: remove unused variable in siw_free_plist()]
  Link: http://lkml.kernel.org/r/20190729074306.10368-1-jhubbard@nvidia.com
Link: http://lkml.kernel.org/r/20190724044537.10458-2-jhubbard@nvidia.com
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:08 -07:00
Arnd Bergmann
2ac5a6d3a9 RDMA/usnic: Avoid overly large buffers on stack
It's never a good idea to put a 1000-byte buffer on the kernel stack. The
compiler warns about this instance when usnic_ib_log_vf() gets inlined
into usnic_ib_pci_probe():

drivers/infiniband/hw/usnic/usnic_ib_main.c:543:12: error: stack frame size of 1044 bytes in function 'usnic_ib_pci_probe' [-Werror,-Wframe-larger-than=]

As this is only called for debugging purposes in the setup path, it's
trivial to convert to a dynamic allocation.
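
The general shape of such a conversion (buffer size and message are
illustrative):

#include <linux/device.h>
#include <linux/kernel.h>
#include <linux/slab.h>

static void example_log_vf(struct device *dev)
{
        char *buf = kmalloc(1000, GFP_KERNEL); /* was a stack array */

        if (!buf)
                return;

        scnprintf(buf, 1000, "VF state for %s", dev_name(dev));
        dev_dbg(dev, "%s\n", buf);
        kfree(buf);
}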

Link: https://lore.kernel.org/r/20190906155730.2750200-1-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-09-13 16:55:55 -03:00
Kamal Heib
72a7720fca RDMA: Introduce ib_port_phys_state enum
In order to improve readability, add ib_port_phys_state enum to replace
the use of magic numbers.
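
A sketch of the intended usage in a driver's query_port() path (assuming
the IB_PORT_PHYS_STATE_* names this patch adds):

#include <rdma/ib_verbs.h>

static void example_fill_port_state(struct ib_port_attr *props)
{
        props->state = IB_PORT_ACTIVE;
        /* was: props->phys_state = 5; */
        props->phys_state = IB_PORT_PHYS_STATE_LINK_UP;
}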

Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Reviewed-by: Andrew Boyer <aboyer@tobark.org>
Acked-by: Michal Kalderon <michal.kalderon@marvell.com>
Acked-by: Bernard Metzler <bmt@zurich.ibm.com>
Link: https://lore.kernel.org/r/20190807103138.17219-2-kamalheib1@gmail.com
Signed-off-by: Doug Ledford <dledford@redhat.com>
2019-08-12 10:18:52 -04:00
Chuhong Yuan
4f96061b92 IB/usnic: Use dev_get_drvdata
Instead of using to_pci_dev + pci_get_drvdata, use dev_get_drvdata to make
the code simpler.
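
A sketch of the simplification (return type generalized here):

#include <linux/device.h>
#include <linux/pci.h>

static void *example_drvdata(struct device *dev)
{
        /* was: return pci_get_drvdata(to_pci_dev(dev)); */
        return dev_get_drvdata(dev);
}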

Link: https://lore.kernel.org/r/20190723114928.18424-1-hslester96@gmail.com
Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-07-25 14:02:00 -03:00
Linus Torvalds
2a3c389a0f 5.3 Merge window RDMA pull request

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
 "A smaller cycle this time. Notably we see another new driver, 'Soft
  iWarp', and the deletion of an ancient unused driver for nes.

   - Revise and simplify the signature offload RDMA MR APIs

   - More progress on hoisting object allocation boiler plate code out
     of the drivers

   - Driver bug fixes and revisions for hns, hfi1, efa, cxgb4, qib,
     i40iw

   - Tree wide cleanups: struct_size, put_user_page, xarray, rst doc
     conversion

   - Removal of obsolete ib_ucm chardev and nes driver

   - netlink based discovery of chardevs and autoloading of the modules
     providing them

   - Move more of the rdmavt/hfi1 uapi to include/uapi/rdma

   - New driver 'siw' for software based iWarp running on top of netdev,
     much like rxe's software RoCE.

   - mlx5 feature to report events in their raw devx format to userspace

   - Expose per-object counters through rdma tool

   - Adaptive interrupt moderation for RDMA (DIM), sharing the DIM core
     from netdev"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (194 commits)
  RMDA/siw: Require a 64 bit arch
  RDMA/siw: Mark expected switch fall-throughs
  RDMA/core: Fix -Wunused-const-variable warnings
  rdma/siw: Remove set but not used variable 's'
  rdma/siw: Add missing dependencies on LIBCRC32C and DMA_VIRT_OPS
  RDMA/siw: Add missing rtnl_lock around access to ifa
  rdma/siw: Use proper enumerated type in map_cqe_status
  RDMA/siw: Remove unnecessary kthread create/destroy printouts
  IB/rdmavt: Fix variable shadowing issue in rvt_create_cq
  RDMA/core: Fix race when resolving IP address
  RDMA/core: Make rdma_counter.h compile stand alone
  IB/core: Work on the caller socket net namespace in nldev_newlink()
  RDMA/rxe: Fill in wc byte_len with IB_WC_RECV_RDMA_WITH_IMM
  RDMA/mlx5: Set RDMA DIM to be enabled by default
  RDMA/nldev: Added configuration of RDMA dynamic interrupt moderation to netlink
  RDMA/core: Provide RDMA DIM support for ULPs
  linux/dim: Implement RDMA adaptive moderation (DIM)
  IB/mlx5: Report correctly tag matching rendezvous capability
  docs: infiniband: add it to the driver-api bookset
  IB/mlx5: Implement VHCA tunnel mechanism in DEVX
  ...
2019-07-15 20:38:15 -07:00
Jason Gunthorpe
371bb62158 Linux 5.2-rc6

Merge tag 'v5.2-rc6' into rdma.git for-next

For dependencies in next patches.

Resolve conflicts:
- Use uverbs_get_cleared_udata() with new cq allocation flow
- Continue to delete nes despite SPDX conflict
- Resolve list appends in mlx5_command_str()
- Use u16 for vport_rule stuff
- Resolve list appends in struct ib_client

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-06-28 21:18:23 -03:00
Leon Romanovsky
e39afe3d6d RDMA: Convert CQ allocations to be under core responsibility
Ensure that CQ is allocated and freed by IB/core and not by drivers.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Gal Pressman <galpress@amazon.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Tested-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2019-06-11 16:39:49 -04:00
Leon Romanovsky
a52c8e2469 RDMA: Clean destroy CQ in drivers do not return errors
Like all other destroy commands, the .destroy_cq() call is not supposed
to fail. In all flows, the attempt to return early caused memory leaks.

This patch converts .destroy_cq() so that it does not return any errors.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Gal Pressman <galpress@amazon.com>
Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2019-06-11 16:17:10 -04:00
Jason Gunthorpe
7a15414252 RDMA: Move owner into struct ib_device_ops
This more closely follows how other subsystems work, with owner being a
member of the structure containing the function pointers.
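
A sketch of the pattern, which the two commits below extend to
uverbs_abi_ver and driver_id as well (the ABI version value is
illustrative):

#include <linux/module.h>
#include <rdma/ib_verbs.h>

static const struct ib_device_ops example_dev_ops = {
        .owner          = THIS_MODULE,
        .driver_id      = RDMA_DRIVER_USNIC,
        .uverbs_abi_ver = 4,    /* the driver's uverbs ABI version */
        /* ... function pointers ... */
};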

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-06-10 16:56:03 -03:00
Jason Gunthorpe
72c6ec18eb RDMA: Move uverbs_abi_ver into struct ib_device_ops
No reason for every driver to emit code to set this; just make it part of
the driver's existing static const ops structure.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-06-10 16:56:02 -03:00
Jason Gunthorpe
b9560a419b RDMA: Move driver_id into struct ib_device_ops
No reason for every driver to emit code to set this; just make it part of
the driver's existing static const ops structure.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-06-10 16:56:02 -03:00
Florian Westphal
2638eb8b50 net: ipv4: provide __rcu annotation for ifa_list
ifa_list is protected by RCU, yet the code doesn't reflect this.

Add the __rcu annotations and fix up all places that are now reported by
sparse.

I've done this in the same commit to not add intermediate patches that
result in new warnings.
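
A minimal sketch of what an annotated reader looks like once ifa_list is
__rcu (the helper name is made up):

#include <linux/inetdevice.h>
#include <linux/rcupdate.h>

static __be32 example_first_ifa_address(struct in_device *in_dev)
{
        const struct in_ifaddr *ifa;
        __be32 addr = 0;

        rcu_read_lock();
        ifa = rcu_dereference(in_dev->ifa_list);        /* sparse-checked */
        if (ifa)
                addr = ifa->ifa_address;
        rcu_read_unlock();

        return addr;
}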

Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-02 18:08:36 -07:00
John Hubbard
ea99697458 RDMA: Convert put_page() to put_user_page*()
For infiniband code that retains pages via get_user_pages*(), release
those pages via the new put_user_page(), or put_user_pages*(), instead of
put_page().

This is a tiny part of the second step of fixing the problem described in
[1]. The steps are:

1) Provide put_user_page*() routines, intended to be used for releasing
   pages that were pinned via get_user_pages*().

2) Convert all of the call sites for get_user_pages*(), to invoke
   put_user_page*(), instead of put_page(). This involves dozens of call
   sites, and will take some time.

3) After (2) is complete, use get_user_pages*() and put_user_page*() to
   implement tracking of these pages. This tracking will be separate from
   the existing struct page refcounting.

4) Use the tracking and identification of these pages, to implement
   special handling (especially in writeback paths) when the pages are
   backed by a filesystem. Again, [1] provides details as to why that is
   desirable.

[1] https://lwn.net/Articles/753027/ : "The Trouble with get_user_pages()"

Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
Acked-by: Jason Gunthorpe <jgg@mellanox.com>
Tested-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-05-27 20:11:11 -03:00