Commit Graph

4216 Commits

Author SHA1 Message Date
Leon Romanovsky
931b3c1a83 RDMA/mlx5: Fix existence check for extended address vector
The extended address vector is the highest bit in be32 variable,
but it was compared with the lowest. This patch fixes the endianness
of that check and removes already declared define.

Fixes: 17d2f88f92 ("IB/mlx5: Add ODP atomics support")
Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-04 14:24:05 -04:00
Amrani, Ram
c75d3ec8c0 RDMA/qedr: Prevent memory overrun in verbs' user responses
Wrap ib_copy_to_udata with a function that ensures that the data
being copied over to user space isn't longer than the allowed.

Fixes: cecbcddf64 ("qedr: Add support for QP verbs")
Fixes: a7efd7773e ("qedr: Add support for PD,PKEY and CQ verbs")
Fixes: ac1b36e55a ("qedr: Add support for user context verbs")
Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-20 11:20:50 -04:00
Ganesh Goudar
720336c42e iw_cxgb4: don't use WR keys/addrs for 0 byte reads
Only use the read sge lkey/addr and the remote rkey/addr if the
length of the read is not zero. Otherwise the read response might
be treated as the RTR read response and not delivered to the
application. Or worse Terminator hardware will fail a 0B read
if the STAG is 0 even if the read length is 0.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-20 11:20:50 -04:00
Håkon Bugge
4542e3c79a IB/mlx4: Fix CM REQ retries in paravirt mode
CM REQs cannot be successfully retried, because a new pv_cm_id is
created for each request, without checking if one already exists.

By checking if an id exists before creating one, the bug is fixed.

This bug can be provoked by running an RDMA CM user-land application,
but inserting a five seconds delay before the rdma_accept() call on
the passive side. This delay is larger than the default CMA timeout,
and triggers a retry from the active side. The retried REQ will use
another pv_cm_id (the cm_id on the wire). This confuses the CM
protocol and two REJs are sent from the passive side.

Here is an excerpt from ibdump running without the patch:

3.285092       LID: 4 -> LID: 4       SDP 290 CM: ConnectRequest(SDP Hello)
7.382711       LID: 4 -> LID: 4       SDP 290 CM: ConnectRequest(SDP Hello)
7.382861       LID: 4 -> LID: 4       InfiniBand 290 CM: ConnectReject
7.387644       LID: 4 -> LID: 4       InfiniBand 290 CM: ConnectReject

and here is the same with bug fix applied:

3.251010       LID: 4 -> LID: 4       SDP 290 CM: ConnectRequest(SDP Hello)
7.349387       LID: 4 -> LID: 4       SDP 290 CM: ConnectRequest(SDP Hello)
8.258443       LID: 4 -> LID: 4       SDP 290 CM: ConnectReply(SDP Hello)
8.259890       LID: 4 -> LID: 4       InfiniBand 290 CM: ReadyToUse

Suggested-by: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
Reported-by: Wei Lin Guay <wei.lin.guay@oracle.com>
Tested-by: Wei Lin Guay <wei.lin.guay@oracle.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Acked-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-20 11:20:50 -04:00
Selvin Xavier
601577b7d1 RDMA/bnxt_re: Fix the value reported for local ack delay
Local ack delay exposed by the driver is 0 which means infinite QP
timeout. Reporting the default value to 16 (approx 260ms)

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-20 11:20:50 -04:00
Selvin Xavier
499e456981 RDMA/bnxt_re: Report MISSED_EVENTS in req_notify_cq
While invoking the req_notify_cq hook, ULPs can request
whether the CQs have any CQEs pending. If CQEs are pending,
drivers can indicate  it by returning 1 for req_notify_cq.
The stack will poll CQ again till CQ is empty.

This patch peeks the CQ for any valid entries and return accordingly.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-20 11:20:50 -04:00
Devesh Sharma
10d1dedf9b RDMA/bnxt_re: Fix return value of poll routine
Fix the incorrect reporting of number of polled
entries by taking into account the max CQ depth
in the driver.

Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-20 11:20:50 -04:00
Devesh Sharma
254cd2590d RDMA/bnxt_re: Enable atomics only if host bios supports
Driver shall check if the host system bios has enabled
Atomic operations capability in PCI Device Control 2
register of the pci-device. Expose the ATOMIC_HCA
flag only if the Atomic operations capability is set.

Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-20 11:20:50 -04:00
Somnath Kotur
536f092805 RDMA/bnxt_re: Specify RDMA component when allocating stats context
Starting FW version 20.6.47, firmware is keeping separate statistics
for L2 and RDMA. However, driver needs to specify RDMA or not when
allocating stat_ctx.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-20 11:20:50 -04:00
Eddie Wai
a25d112fe9 RDMA/bnxt_re: Fixed the max_rd_atomic support for initiator and destination QP
There's a couple of bugs in the support of max_rd_atomic and
max_dest_rd_atomic. In the modify_qp, if the requested max_rd_atomic,
which is the ORRQ size, is greater than what the chip can support,
then we have to cap the request to chip max as we can't have the HW
overflow the ORRQ. Capping the max_rd_atomic support internally is okay
to do as the remaining read/atomic WRs will still be sitting in the SQ.
However, for the max_dest_rd_atomic, the driver has to error out as
this dictates the IRRQ size and we can't control what the remote
side sends.

Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-20 11:20:50 -04:00
Selvin Xavier
58d4a671d0 RDMA/bnxt_re: Report supported value to IB stack in query_device
- Report supported value for max_mr_size to IB stack in query_device.
   Also, check and log if MR size requested by application in
   reg_user_mr() is greater than value currently supported by driver.
 - Report only 4K page size support for now
 - Fix Max_QP value returned by ibv_devinfo -vv.
   In case of PF, FW reserves 129 QPs for creating QP1s of VFs
   and PF. So the max_qp value reported by FW for PF doesn'tt include
   the QP1. Fixing this issue by adding 1 with the value reported
   by FW.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-20 11:20:50 -04:00
Selvin Xavier
4a62c5e9e2 RDMA/bnxt_re: Do not free the ctx_tbl entry if delete GID fails
This fix is added only to avoid system crash in some a
specific scenario. When bnxt_re driver is loaded and if
user tries to change interface mac address, delete GID
fails because QP1 is still associated with existing MAC
(default GID). If the above command fails GID tables are
not modified in the h/w or driver, but the GID context memory
is freed. Now, if the user changes the mac back to the original
value, another add_gid comes to the driver where the driver
reports that the GID is already present in its table
and tries to access the context which was already freed.

So, in this case, in order to  avoid NULL pointer de-reference,
this patch removes the context memory free  if delete_gid fails
and the same context memory is re-used in new add_gid.
Memory cleanup will be taken care during driver unload, while
deleting the GID table.

Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-20 11:20:50 -04:00
Somnath Kotur
ab69d4c8da RDMA/bnxt_re: Fix WQE Size posted to HW to prevent it from throwing error
Posting WQE size of 2 results in a WQE_FORMAT_ERROR
thrown by the HW as it requires host to supply WQE Size with room
for atleast one SGE so that the resulting WQE size be atleast 3.

Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-20 11:20:49 -04:00
Devesh Sharma
b3b2c7c550 RDMA/bnxt_re: Free doorbell page index (DPI) during dealloc ucontext
The driver must free the DPI during the dealloc_ucontext
instead of freeing it during dealloc_pd. However, the DPI
allocation scheme remains unchanged.

Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-20 11:20:49 -04:00
Dan Carpenter
396551eb00 IB/mlx5: Fix a warning message
"umem" is a valid pointer.  We intended to print "*umem" or even just
"err" instead.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-20 11:20:49 -04:00
Dan Carpenter
f0c6e88288 RDMA/ocrdma: Fix error codes in ocrdma_create_srq()
If either of these allocations fail then we return ERR_PTR(0).  That's
equivalent to NULL and results in a NULL pointer dereference in the
caller.

Fixes: fe2caefcdf ("RDMA/ocrdma: Add driver for Emulex OneConnect IBoE RDMA adapter")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-20 11:20:49 -04:00
Dan Carpenter
dd75cfa6d3 RDMA/ocrdma: Fix an error code in ocrdma_alloc_pd()
We should preserve the original "status" error code instead of resetting
it to zero.  Returning ERR_PTR(0) is the same as NULL and results in a
NULL dereference in the callers.  I added a printk() on error instead.

Fixes: 45e86b33ec ("RDMA/ocrdma: Cache recv DB until QP moved to RTR")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-20 11:20:49 -04:00
Dan Carpenter
9064d6055c IB/cxgb3: Fix error codes in iwch_alloc_mr()
We accidentally don't set the error code on some error paths.  It means
return ERR_PTR(0) which is NULL and results in a NULL dereference in the
caller.

Fixes: 13a239330a ("RDMA/cxgb3: Don't ignore insert_handle() failures")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-20 11:20:49 -04:00
Dan Carpenter
6ebedacbb4 cxgb4: Fix error codes in c4iw_create_cq()
If one of these kmalloc() calls fails then we return ERR_PTR(0) which is
NULL.  It results in a NULL dereference in the callers.

Fixes: cfdda9d764 ("RDMA/cxgb4: Add driver for Chelsio T4 RNIC")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-20 11:20:49 -04:00
Dan Carpenter
6031e079aa IB/i40iw: Fix error code in i40iw_create_cq()
We accidentally forgot to set the error code if ib_copy_from_udata()
fails.  It means we return ERR_PTR(0) which is NULL and results in a
NULL dereference in the callers.

Fixes: d374984179 ("i40iw: add files for iwarp interface")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-20 11:20:49 -04:00
Dan Carpenter
653f0a71da RDMA/bnxt_re: checking for NULL instead of IS_ERR()
bnxt_re_alloc_mw() doesn't return NULL, it returns error pointers.

Fixes: 9152e0b722 ("RDMA/bnxt_re: HW workarounds for handling specific conditions")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-20 11:20:49 -04:00
Tatyana Nikolova
af56e53ccd i40iw: Free QP PBLEs when the QP is destroyed
If the physical buffer list entries (PBLEs) of a QP are freed
up at i40iw_dereg_mr, they can be assigned to a newly
created QP before the previous QP is destroyed. Fix this
by freeing PBLEs only when the QP is destroyed.

Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-20 11:20:49 -04:00
Shiraz Saleem
44b99f88cd i40iw: Avoid memory leak of CQP request objects
Control Queue Pair (CQP) request objects, which have
not received a completion upon interface close, remain
in memory.

To fix this, identify and free all pending CQP request
objects during destroy CQP OP.

Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-20 11:20:49 -04:00
Henry Orosco
c709d7f229 i40iw: Update list correctly
To avoid infinite loop, in i40iw_ieq_handle_exception, update
plist inside while loop.

Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-20 11:20:49 -04:00
Henry Orosco
c5c9d27e6c i40iw: Add missing memory barrier
Add missing write memory barrier before writing the
header containing valid bit to the WQE in i40iw_puda_send.

Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-20 11:20:49 -04:00
Shiraz Saleem
b5e452a04a i40iw: Free QP resources on CQP destroy QP failure
Current flow leaves software QP structures in memory if
Control Queue Pair (CQP) destroy QP OP fails. To fix this,
free QP resources on fail of CQP destroy QP OP.

Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-20 11:20:49 -04:00
Shiraz Saleem
6327cb09df i40iw: Release cm_id ref on PCI function reset
On PCI function reset, cm_id reference is not released
which causes an application hang, as it waits on the
cm_id to be released on rdma_destroy.

To fix this, call i40iw_cm_disconn during a PCI function
reset to clean-up resources and release cm_id reference.

Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-20 11:20:49 -04:00
Shiraz Saleem
6c1d94de4e i40iw: Utilize iwdev->reset during PCI function reset
Utilize iwdev->reset on a PCI function reset notification
instead of passing in reset flag for resource clean-up.

Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-20 11:20:48 -04:00
Mustafa Ismail
415920aa17 i40iw: Do not poll CCQ after it is destroyed
Control Queue Pair (CQP) OPs, in this case - Update SDs,
cannot poll the Control Completion Queue (CCQ) after CCQ is
destroyed. Instead, poll via registers.

Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-20 11:20:48 -04:00
Mustafa Ismail
be8822db62 i40iw: Fix order of cleanup in close
The order for calling i40iw_destroy_pble_pool is incorrect.
Also, add PBLE_CHUNK_MEM init state to track pble pool
creation and destruction.

Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-20 11:20:48 -04:00
oulijun
5f110ac4be IB/hns: Fix for checkpatch.pl comment style warnings
This patch correct the comment style warnings caught by
checkpatch.pl script.

Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-17 21:21:29 -04:00
oulijun
d322f004aa IB/hns: Fix the bug with modifying the MAC address without removing the driver
When modified the MAC address used hns_roce_mac function, we release and create
reserved qp again, It is not necessary to use spin_lock_bh and spin_unlock_bh in
handle_en_event, Otherwise, it will occur a error. This patch mainly fixes it.

Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-17 21:21:29 -04:00
oulijun
9de61d3fcd IB/hns: Fix the bug with rdma operation
When opcode of work request is RDMA read and write, it
should use rdma_wr to get remote_addr and rkey. This
patch fixes it.

Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-17 21:21:28 -04:00
oulijun
58c4f0d85f IB/hns: Fix the bug with wild pointer when destroy rc qp
When destroyed rc qp, the hr_qp will be used after freed. This patch
will fix it.

Signed-off-by: Lijun Ou <oulijun@huawei.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-17 21:21:28 -04:00
oulijun
5802883d4b IB/hns: Fix the bug of polling cq failed for loopback Qps
In hip06 SoC, RoCE driver creates 8 reserved loopback QPs to
ensure zero wqe when free mr. However, if the enabled phy
port number is less than 6, it will fail in polling cqe with
8 reserved loopback QPs.

In order to solve this problem, the number of loopback Qps
will be adjusted based on the number of enabled phy port.

Signed-off-by: Shaobo Xu <xushaobo2@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-17 21:21:27 -04:00
Leon Romanovsky
12cc1a0273 IB/mlx5: Clean mr_cache debugfs in case of failure
The failure in creation of debugfs entries for mr_cache left entries,
which were already created.

It caused to mismatch and misguiding for the end users. The solution
is to clean mr_cache debugfs root, so no leftovers will be in the
system. In addition, let's document why the error is not needed to be
forwarded to user in case of failure.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-17 21:21:25 -04:00
Leon Romanovsky
8900b894e7 {net, IB}/mlx4: Remove gfp flags argument
The caller to the driver marks GFP_NOIO allocations with help
of memalloc_noio-* calls now. This makes redundant to pass down
to the driver gfp flags, which can be GFP_KERNEL only.

The patch removes the gfp flags argument and updates all driver paths.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-17 21:21:24 -04:00
Leon Romanovsky
0f4d027c3b IB/{rdmavt, qib, hfi1}: Remove gfp flags argument
The caller to the driver marks GFP_NOIO allocations with help
of memalloc_noio-* calls now. This makes redundant to pass down
to the driver gfp flags, which can be GFP_KERNEL only.

The patch removes the gfp flags argument and updates all driver paths.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-17 21:21:23 -04:00
Leon Romanovsky
98e77d9fd7 IB: Convert msleep below 20ms to usleep_range
The msleep(1) may do not sleep 1 ms as expected
and will sleep longer. The simple conversion from
msleep to usleep_range between 1ms and 2ms can solve an
issue.

The full and comprehensive explanation can be found at [1] and [2].

[1] https://lkml.org/lkml/2007/8/3/250
[2] Documentation/timers/timers-howto.txt

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-17 21:21:22 -04:00
Bart Van Assche
99975cd4fd mlx5: Avoid that mlx5_ib_sg_to_klms() overflows the klms[] array
ib_map_mr_sg() can pass an SG-list to .map_mr_sg() that is larger
than what fits into a single MR. .map_mr_sg() must not attempt to
map more SG-list elements than what fits into a single MR.
Hence make sure that mlx5_ib_sg_to_klms() does not write outside
the MR klms[] array.

Fixes: b005d31647 ("mlx5: Add arbitrary sg list support")
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Leon Romanovsky <leonro@mellanox.com>
Cc: Israel Rukshin <israelr@mellanox.com>
Cc: <stable@vger.kernel.org>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-17 11:45:07 -04:00
Dennis Dalessandro
91647f4c2d IB/hfi1: Ensure dd->gi_mask can not be overflowed
As the code stands today the array access in remap_intr() is OK. To
future proof the code though we should explicitly check to ensure the
index value is not outside of the valid range. This is not a straight
forward calculation so err on the side of caution.

Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-17 11:39:45 -04:00
Linus Torvalds
9871ab22f2 Fixes #3 for 4.12-rc
- 2 Fixes for OPA found by debug kernel
 - 1 Fix for user supplied input causing kernel problems
 - 1 Fix for the IPoIB fixes submitted around -rc4
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJZXVfPAAoJELgmozMOVy/dRRcP/AiN4wyEQ897se1fKXAktL1g
 a17tiSkK2MukAVHbM++9Ea/YXK66e2s7Ls8Pd230E85N3V48rSUhWZUIUQLOm+gS
 b98z53uNs6KkdBCezXABsHIi4PB6u1CfzaFaUfN5WI3ymAgsYqpQWMtNyO6GNe/R
 Dur3vDieXPNJ2x+F1jiNxHFBXLKofCG0y1FX88zqsQI5vVVq7ASKgaaSX3T1emQY
 18l4Dd7pesrWj4QD9jaqQiYkruF5VC1NE8/he8Zzy6XjSgnUZZfjbjuMptbW4y3y
 Tvvd5bjMAkJhCbK1mhe1dZHPlYJhAguUBZfThjVSKtiMGwRhGA4SYkRtek3nZOga
 /OLhERgj0VomHx7o+Pwp74DWnsSv08EMoc4hXKHZPPyxok83r9czejqm7mC2VbGd
 Sa8LmVeLQp79e9MbGAj+PbNRHf9CE9dnLeFUmbj+qptXUVGvT8j9U1a9iTjTz0+2
 NX/O4iWjtnt/CIkH9dhN9aWolswbmO2jSmmzb/x2EuCLv94GNtTyZLSifvxSYMnN
 IWO86aGQmuUkWJ3RI/5tzq+gVzI6bdKB9hG5DOPWN/uJVF9nWkq3c69Bv9djvUoM
 xi/rI0grxTqYHelRx3ja4ZqaI43R6YwL928XdtZJKQ/uNanq65Lyd6KKz3W7hT0l
 emCoqb2MjuzsNWIPkSgg
 =JEor
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma

Pull rdma update from Doug Ledford:
 "This includes two bugs against the newly added opa vnic that were
  found by turning on the debug kernel options:

   - sleeping while holding a lock, so a one line fix where they
     switched it from GFP_KERNEL allocation to a GFP_ATOMIC allocation

   - a case where they had an isolated caller of their code that could
     call them in an atomic context so they had to switch their use of a
     mutex to a spinlock to be safe, so this was considerably more lines
     of diff because all uses of that lock had to be switched

  In addition, the bug that was discussed with you already about an out
  of bounds array access in ib_uverbs_modify_qp and ib_uverbs_create_ah
  and is only seven lines of diff.

  And finally, one fix to an earlier fix in the -rc cycle that broke
  hfi1 and qib in regards to IPoIB (this one is, unfortunately, larger
  than I would like for a -rc7 submission, but fixing the problem
  required that we not treat all devices as though they had allocated a
  netdev universally because it isn't true, and it took 70 lines of diff
  to resolve the issue, but the final patch has been vetted by Intel and
  Mellanox and they've both given their approval to the fix).

  Summary:

   - Two fixes for OPA found by debug kernel
   - Fix for user supplied input causing kernel problems
   - Fix for the IPoIB fixes submitted around -rc4"

[ Doug sent this having not noticed the 4.12 release, so I guess I'll be
  getting another rdma pull request with the actuakl merge window
  updates and not just fixes.

  Oh well - it would have been nice if this small update had been the
  merge window one.     - Linus ]

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
  IB/core, opa_vnic, hfi1, mlx5: Properly free rdma_netdev
  RDMA/uverbs: Check port number supplied by user verbs cmds
  IB/opa_vnic: Use spinlock instead of mutex for stats_lock
  IB/opa_vnic: Use GFP_ATOMIC while sending trap
2017-07-06 11:45:08 -07:00
Niranjana Vishwanathapura
8e95960199 IB/core, opa_vnic, hfi1, mlx5: Properly free rdma_netdev
IPOIB is calling free_rdma_netdev even though alloc_rdma_netdev has
returned -EOPNOTSUPP.
Move free_rdma_netdev from ib_device structure to rdma_netdev structure
thus ensuring proper cleanup function is called for the rdma net device.

Fix the following trace:

ib0: Failed to modify QP to ERROR state
BUG: unable to handle kernel paging request at 0000000000001d20
IP: hfi1_vnic_free_rn+0x26/0xb0 [hfi1]
Call Trace:
 ipoib_remove_one+0xbe/0x160 [ib_ipoib]
 ib_unregister_device+0xd0/0x170 [ib_core]
 rvt_unregister_device+0x29/0x90 [rdmavt]
 hfi1_unregister_ib_device+0x1a/0x100 [hfi1]
 remove_one+0x4b/0x220 [hfi1]
 pci_device_remove+0x39/0xc0
 device_release_driver_internal+0x141/0x200
 driver_detach+0x3f/0x80
 bus_remove_driver+0x55/0xd0
 driver_unregister+0x2c/0x50
 pci_unregister_driver+0x2a/0xa0
 hfi1_mod_cleanup+0x10/0xf65 [hfi1]
 SyS_delete_module+0x171/0x250
 do_syscall_64+0x67/0x150
 entry_SYSCALL64_slow_path+0x25/0x25

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-05 17:11:00 -04:00
Linus Torvalds
5518b69b76 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
 "Reasonably busy this cycle, but perhaps not as busy as in the 4.12
  merge window:

   1) Several optimizations for UDP processing under high load from
      Paolo Abeni.

   2) Support pacing internally in TCP when using the sch_fq packet
      scheduler for this is not practical. From Eric Dumazet.

   3) Support mutliple filter chains per qdisc, from Jiri Pirko.

   4) Move to 1ms TCP timestamp clock, from Eric Dumazet.

   5) Add batch dequeueing to vhost_net, from Jason Wang.

   6) Flesh out more completely SCTP checksum offload support, from
      Davide Caratti.

   7) More plumbing of extended netlink ACKs, from David Ahern, Pablo
      Neira Ayuso, and Matthias Schiffer.

   8) Add devlink support to nfp driver, from Simon Horman.

   9) Add RTM_F_FIB_MATCH flag to RTM_GETROUTE queries, from Roopa
      Prabhu.

  10) Add stack depth tracking to BPF verifier and use this information
      in the various eBPF JITs. From Alexei Starovoitov.

  11) Support XDP on qed device VFs, from Yuval Mintz.

  12) Introduce BPF PROG ID for better introspection of installed BPF
      programs. From Martin KaFai Lau.

  13) Add bpf_set_hash helper for TC bpf programs, from Daniel Borkmann.

  14) For loads, allow narrower accesses in bpf verifier checking, from
      Yonghong Song.

  15) Support MIPS in the BPF selftests and samples infrastructure, the
      MIPS eBPF JIT will be merged in via the MIPS GIT tree. From David
      Daney.

  16) Support kernel based TLS, from Dave Watson and others.

  17) Remove completely DST garbage collection, from Wei Wang.

  18) Allow installing TCP MD5 rules using prefixes, from Ivan
      Delalande.

  19) Add XDP support to Intel i40e driver, from Björn Töpel

  20) Add support for TC flower offload in nfp driver, from Simon
      Horman, Pieter Jansen van Vuuren, Benjamin LaHaise, Jakub
      Kicinski, and Bert van Leeuwen.

  21) IPSEC offloading support in mlx5, from Ilan Tayari.

  22) Add HW PTP support to macb driver, from Rafal Ozieblo.

  23) Networking refcount_t conversions, From Elena Reshetova.

  24) Add sock_ops support to BPF, from Lawrence Brako. This is useful
      for tuning the TCP sockopt settings of a group of applications,
      currently via CGROUPs"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1899 commits)
  net: phy: dp83867: add workaround for incorrect RX_CTRL pin strap
  dt-bindings: phy: dp83867: provide a workaround for incorrect RX_CTRL pin strap
  cxgb4: Support for get_ts_info ethtool method
  cxgb4: Add PTP Hardware Clock (PHC) support
  cxgb4: time stamping interface for PTP
  nfp: default to chained metadata prepend format
  nfp: remove legacy MAC address lookup
  nfp: improve order of interfaces in breakout mode
  net: macb: remove extraneous return when MACB_EXT_DESC is defined
  bpf: add missing break in for the TCP_BPF_SNDCWND_CLAMP case
  bpf: fix return in load_bpf_file
  mpls: fix rtm policy in mpls_getroute
  net, ax25: convert ax25_cb.refcount from atomic_t to refcount_t
  net, ax25: convert ax25_route.refcount from atomic_t to refcount_t
  net, ax25: convert ax25_uid_assoc.refcount from atomic_t to refcount_t
  net, sctp: convert sctp_ep_common.refcnt from atomic_t to refcount_t
  net, sctp: convert sctp_transport.refcnt from atomic_t to refcount_t
  net, sctp: convert sctp_chunk.refcnt from atomic_t to refcount_t
  net, sctp: convert sctp_datamsg.refcnt from atomic_t to refcount_t
  net, sctp: convert sctp_auth_bytes.refcnt from atomic_t to refcount_t
  ...
2017-07-05 12:31:59 -07:00
Linus Torvalds
974668417b driver core patches for 4.13-rc1
Here is the big driver core update for 4.13-rc1.
 
 The large majority of this is a lot of cleanup of old fields in the
 driver core structures and their remaining usages in random drivers.
 All of those fixes have been reviewed by the various subsystem
 maintainers.  There's also some small firmware updates in here, a new
 kobject uevent api interface that makes userspace interaction easier,
 and a few other minor things.
 
 All of these have been in linux-next for a long while with no reported
 issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWVpX4A8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ymobgCfd0d13IfpZoq1N41wc6z2Z0xD7cwAnRMeH1/p
 kEeISGpHPYP9f8PBh9FO
 =Hfqt
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-4.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
 "Here is the big driver core update for 4.13-rc1.

  The large majority of this is a lot of cleanup of old fields in the
  driver core structures and their remaining usages in random drivers.
  All of those fixes have been reviewed by the various subsystem
  maintainers. There's also some small firmware updates in here, a new
  kobject uevent api interface that makes userspace interaction easier,
  and a few other minor things.

  All of these have been in linux-next for a long while with no reported
  issues"

* tag 'driver-core-4.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (56 commits)
  arm: mach-rpc: ecard: fix build error
  zram: convert remaining CLASS_ATTR() to CLASS_ATTR_RO()
  driver-core: remove struct bus_type.dev_attrs
  powerpc: vio_cmo: use dev_groups and not dev_attrs for bus_type
  powerpc: vio: use dev_groups and not dev_attrs for bus_type
  USB: usbip: convert to use DRIVER_ATTR_RW
  s390: drivers: convert to use DRIVER_ATTR_RO/WO
  platform: thinkpad_acpi: convert to use DRIVER_ATTR_RO/RW
  pcmcia: ds: convert to use DRIVER_ATTR_RO
  wireless: ipw2x00: convert to use DRIVER_ATTR_RW
  net: ehea: convert to use DRIVER_ATTR_RO
  net: caif: convert to use DRIVER_ATTR_RO
  TTY: hvc: convert to use DRIVER_ATTR_RW
  PCI: pci-driver: convert to use DRIVER_ATTR_WO
  IB: nes: convert to use DRIVER_ATTR_RW
  HID: hid-core: convert to use DRIVER_ATTR_RO and drv_groups
  arm: ecard: fix dev_groups patch typo
  tty: serdev: use dev_groups and not dev_attrs for bus_type
  sparc: vio: use dev_groups and not dev_attrs for bus_type
  hid: intel-ish-hid: use dev_groups and not dev_attrs for bus_type
  ...
2017-07-03 20:27:48 -07:00
Reshetova, Elena
633547973f net: convert sk_buff.users from atomic_t to refcount_t
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01 07:39:07 -07:00
Ilan Tayari
095b0927f0 IB/mlx5: Respect mlx5_core reserved GIDs
Reserved gids are taken by the mlx5_core, report smaller GID table
size to IB core.

Set mlx5_query_roce_port's return value back to int. In case of
error, return an indication. This rolls back some of the change
in commit 50f22fd8ec ("IB/mlx5: Set mlx5_query_roce_port's return value to void")

Change set_roce_addr to use gid_set function, instead of directly
sending the command.

Signed-off-by: Ilan Tayari <ilant@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-27 16:36:47 +03:00
David S. Miller
3d09198243 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Two entries being added at the same time to the IFLA
policy table, whilst parallel bug fixes to decnet
routing dst handling overlapping with the dst gc removal
in net-next.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-21 17:35:22 -04:00
Kalderon, Michal
7003cdd6a1 qed*: Rename qed_roce_if.h to qed_rdma_if.h
Rename the qed_roce_if file to qed_rdma_if as it
represents a common interface for RoCE and iWARP.

this commit affects RDMA/qedr as well.

Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-21 15:32:59 -04:00
yuan linyu
de77b966ce net: introduce __skb_put_[zero, data, u8]
follow Johannes Berg, semantic patch file as below,
@@
identifier p, p2;
expression len;
expression skb;
type t, t2;
@@
(
-p = __skb_put(skb, len);
+p = __skb_put_zero(skb, len);
|
-p = (t)__skb_put(skb, len);
+p = __skb_put_zero(skb, len);
)
... when != p
(
p2 = (t2)p;
-memset(p2, 0, len);
|
-memset(p, 0, len);
)

@@
identifier p;
expression len;
expression skb;
type t;
@@
(
-t p = __skb_put(skb, len);
+t p = __skb_put_zero(skb, len);
)
... when != p
(
-memset(p, 0, len);
)

@@
type t, t2;
identifier p, p2;
expression skb;
@@
t *p;
...
(
-p = __skb_put(skb, sizeof(t));
+p = __skb_put_zero(skb, sizeof(t));
|
-p = (t *)__skb_put(skb, sizeof(t));
+p = __skb_put_zero(skb, sizeof(t));
)
... when != p
(
p2 = (t2)p;
-memset(p2, 0, sizeof(*p));
|
-memset(p, 0, sizeof(*p));
)

@@
expression skb, len;
@@
-memset(__skb_put(skb, len), 0, len);
+__skb_put_zero(skb, len);

@@
expression skb, len, data;
@@
-memcpy(__skb_put(skb, len), data, len);
+__skb_put_data(skb, data, len);

@@
expression SKB, C, S;
typedef u8;
identifier fn = {__skb_put};
fresh identifier fn2 = fn ## "_u8";
@@
- *(u8 *)fn(SKB, S) = C;
+ fn2(SKB, C);

Signed-off-by: yuan linyu <Linyu.Yuan@alcatel-sbell.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20 13:30:14 -04:00