linux/drivers/infiniband/hw/mlx5
Idan Burstein 064e526247 IB/mlx5: posting klm/mtt list inline in the send queue for reg_wr
Most kernel RDMA ULPs (e.g. NVMe over Fabrics in its default
"register_always=Y" mode) register and invalidate the user buffer
on each IO.

Today the mlx5 driver posts the registration work request using a
scatter/gather entry for the MTT/KLM list. Fetching the MTT/KLM list
becomes the bottleneck for the number of IO operations that the
NVMe over Fabrics host driver can perform on a single adapter, as
shown below.

This patch adds support for inline registration work requests
when the MTT/KLM list is at most 64B in size.
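
The size check described above can be sketched in plain C. This is a
minimal userspace illustration, not the driver code: the helper names
(mr_list_size, reg_wr_use_inline) and the macro names are hypothetical,
and the descriptor sizes (8 bytes per MTT entry, 16 bytes per KLM entry)
are assumptions about the mlx5 descriptor layout. Only the 64B threshold
comes from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed descriptor sizes: an MTT entry holds one 64-bit page address,
 * a KLM entry holds a byte count, a key and a 64-bit address. */
#define MTT_DESC_SIZE 8u
#define KLM_DESC_SIZE 16u

/* Threshold from the commit text: lists of at most 64 bytes go inline. */
#define UMR_INLINE_THRESHOLD 64u

/* Hypothetical helper: total size in bytes of the MTT/KLM list. */
static size_t mr_list_size(size_t ndescs, size_t desc_size)
{
	assert(desc_size == MTT_DESC_SIZE || desc_size == KLM_DESC_SIZE);
	return ndescs * desc_size;
}

/* Hypothetical helper: true when the list is small enough to be copied
 * into the work request itself, false when the work request should carry
 * a scatter/gather entry pointing at the list instead. */
static bool reg_wr_use_inline(size_t ndescs, size_t desc_size)
{
	return mr_list_size(ndescs, desc_size) <= UMR_INLINE_THRESHOLD;
}
```

Under these assumed sizes, a list of up to 8 MTT entries or 4 KLM
entries would be posted inline, and anything larger would keep using
the scatter/gather path.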

The result for NVMe over Fabrics is an increase of more than 3.5x for
small IOs, as shown below (about 3.8x at 512B on ConnectX-5). I expect
the performance of other ULPs (e.g. iSER, SRP, NFS over RDMA) to
improve as well.

The following results were taken against a single NVMe-oF (RoCE link layer)
subsystem with a single namespace backed by null_blk, using the fio benchmark
(with rw=randread, numjobs=48, iodepth={16,64}, ioengine=libaio, direct=1):

ConnectX-5 (PCI width x16)
---------------------------

Block Size       s/g reg_wr            inline reg_wr
++++++++++     +++++++++++++++        ++++++++++++++++
512B            1302.8K/34.82%         4951.9K/99.02%
1KB             1284.3K/33.86%         4232.7K/98.09%
2KB             1238.6K/34.1%          2797.5K/80.04%
4KB             1169.3K/32.46%         1941.3K/61.35%
8KB             1013.4K/30.08%         1236.6K/39.47%
16KB            695.7K/20.19%          696.9K/20.59%
32KB            350.3K/9.64%           350.6K/10.3%
64KB            175.86K/5.27%          175.9K/5.28%

ConnectX-4 (PCI width x8)
---------------------------

Block Size       s/g reg_wr            inline reg_wr
++++++++++     +++++++++++++++        ++++++++++++++++
512B            1285.8K/42.66%          4242.7K/98.18%
1KB             1254.1K/41.74%          3569.2K/96.00%
2KB             1185.9K/39.83%          2173.9K/75.58%
4KB             1069.4K/36.46%          1343.3K/47.47%
8KB             755.1K/27.77%           748.7K/29.14%

Tested-by: Nitzan Carmi <nitzanc@mellanox.com>
Signed-off-by: Idan Burstein <idanb@mellanox.com>
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>

Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-05-09 12:08:21 -04:00
..
ah.c IB/mlx5: Enable ECN capable bits for UD RoCE v2 QPs 2018-03-27 14:43:10 -06:00
cmd.c net/mlx5: Mkey creation command adjustments 2018-04-05 13:04:49 -06:00
cmd.h IB/mlx5: Device memory support in mlx5_ib 2018-04-05 13:04:49 -06:00
cong.c IB/mlx5: Change debugfs to have per port contents 2018-01-08 11:42:22 -07:00
cq.c mlx5: Move dump error CQE function out of mlx5_ib for code sharing 2018-03-27 17:17:28 -07:00
doorbell.c
gsi.c
ib_rep.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-03-23 11:31:58 -04:00
ib_rep.h IB/mlx5: Add proper representors support 2018-02-23 12:36:39 -08:00
ib_virt.c IB/mlx5: Restore IB guid/policy for virtual functions 2017-07-24 10:34:28 -04:00
Kconfig
mad.c IB/mlx5: Route MADs for dual port RoCE 2018-01-08 11:42:23 -07:00
main.c Merge candidates for 4.17 merge window 2018-04-06 17:35:43 -07:00
Makefile IB/mlx5: Add basic regiser/unregister representors code 2018-02-23 12:36:39 -08:00
mem.c IB/mlx5: Simplify mlx5_ib_cont_pages 2017-09-25 11:47:24 -04:00
mlx5_ib.h IB/mlx5: Device memory mr registration support 2018-04-05 13:04:49 -06:00
mr.c Merge candidates for 4.17 merge window 2018-04-06 17:35:43 -07:00
odp.c IB/mlx5: Move locks initialization to the corresponding stage 2018-01-03 17:26:59 -07:00
qp.c IB/mlx5: posting klm/mtt list inline in the send queue for reg_wr 2018-05-09 12:08:21 -04:00
srq.c IB/mlx5: Fix integer overflows in mlx5_ib_create_srq 2018-03-13 16:31:21 -04:00