linux/drivers/infiniband/hw
Yangyang Li 4ad8181426 RDMA/hns: Fix RNR retransmission issue for HIP08
Due to the discrete nature of the HIP08 timer unit, a requester might
finish the timeout period sooner, in elapsed real time, than its responder
does, even when both sides share the identical RNR timeout length included
in the RNR Nak packet and the responder indeed starts the timing prior to
the requester. Furthermore, if such an early resend packet arrives before
the responder's timeout period has expired, the IB protocol entitles the
responder to drop the packet silently.
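
The timing skew described above can be illustrated with a toy model. This is
only a sketch under an assumption that is not stated in this commit: each side
counts edges of its own free-running tick source, so the real elapsed time of
an N-tick timeout depends on where in the tick period the timer was started.
The function name and all numbers are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Toy model (assumption): a timer counts edges of a free-running tick source
 * whose edges fall at phase, phase + period, phase + 2 * period, ...
 * A timeout of n_ticks started at `start` expires on the n_ticks-th edge
 * strictly after `start`.
 */
static uint64_t expiry_time(uint64_t start, uint64_t phase,
                            uint64_t period, uint32_t n_ticks)
{
    assert(period > 0 && start >= phase);

    /* Index of the last tick edge at or before `start`. */
    uint64_t last_edge = (start - phase) / period;

    return phase + (last_edge + n_ticks) * period;
}
```

In this model, with a period of 1000 time units and a 10-tick timeout, a
responder with phase 0 that starts at 1010 expires at 11000, while a requester
with phase 500 that starts later, at 1490, expires at 10500, i.e. before the
responder.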

To address this problem, we rely on the following hardware facts:

1) The timing resolution regarding the transmission arrangements is 1
   microsecond, e.g. if the cq_period field is set to 3, the hardware
   interprets it as 3 microseconds

2) A QPC field tells the hardware how many timing units (ticks)
   constitute a full microsecond; by default, this is 1000

3) It takes 14ns for the processor to handle a packet in the buffer, so
   an RNR timeout length of 10ns, being shorter than that, would ensure our
   processing mechanism is disabled during the entire timeout period and
   the packet won't be dropped silently

To achieve (3), we permanently set the QPC field mentioned in (2) to zero,
which nominally indicates that every time tick is equivalent to a
microsecond in wall-clock time; now, an RNR timeout period with a face value
of 10 lasts only 10 ticks, which is 10ns in wall-clock time.
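
The arithmetic above can be sketched as follows. This is a minimal model, not
driver code: the function name is hypothetical, the 1ns tick length is derived
from the default of 1000 ticks per microsecond, and treating a field value of
zero as "one tick per unit" is our reading of the description above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed: 1000 ticks per microsecond, hence 1 ns per tick. */
#define TICK_NS 1u

/*
 * Hypothetical model: wall-clock duration, in nanoseconds, of a timer
 * programmed with `face_value` when the QPC field holds
 * `ticks_per_us_field`.
 */
static uint64_t timer_wall_ns(uint32_t face_value, uint32_t ticks_per_us_field)
{
    /* Field set to zero: each unit of the face value lasts a single tick. */
    uint64_t ticks_per_unit = ticks_per_us_field ? ticks_per_us_field : 1;

    return (uint64_t)face_value * ticks_per_unit * TICK_NS;
}
```

Under these assumptions, a face value of 10 lasts 10000ns with the default
field value of 1000 and 10ns with the field set to zero, which is below the
14ns handling time from (3).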

Note that the driver compensates by multiplying certain configuration
parameters (cq_period, eq_period and ack_timeout) by 1000, since the user
assumes the configured timing unit to be microseconds.
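
A minimal sketch of such a compensation for a linear microsecond field like
cq_period or eq_period is shown below. The helper name and the `field_max`
clamp are hypothetical and are not taken from the driver; how the driver
treats a value that does not fit the hardware field is not described here.

```c
#include <assert.h>
#include <stdint.h>

/* From the description above: 1000 ticks per microsecond. */
#define HW_TICKS_PER_US 1000u

/*
 * Hypothetical helper: convert a user-supplied value in microseconds to
 * hardware ticks, clamped to the largest value the field can hold.
 */
static uint32_t us_to_hw_ticks(uint32_t usec, uint32_t field_max)
{
    /* Widen before multiplying so the product cannot overflow 32 bits. */
    uint64_t ticks = (uint64_t)usec * HW_TICKS_PER_US;

    return ticks > field_max ? field_max : (uint32_t)ticks;
}
```

For example, a user value of 3 microseconds becomes 3000 ticks, while a value
whose product exceeds `field_max` is clamped to `field_max`.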

Also, this workaround is applied only on HIP08, since other hardware has
already solved this issue.

Fixes: cfc85f3e4b ("RDMA/hns: Add profile support for hip08 driver")
Link: https://lore.kernel.org/r/20211209140655.49493-1-liangwenpeng@huawei.com
Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-12-14 19:45:04 -04:00
..
bnxt_re RDMA/bnxt_re: Remove unsupported bnxt_re_modify_ah callback 2021-11-03 09:06:36 -03:00
cxgb4 RDMA: Remove redundant 'flush_workqueue()' calls 2021-10-12 13:21:23 -03:00
efa RDMA/efa: Add support for dmabuf memory regions 2021-10-28 08:58:26 -03:00
hfi1 IB/hfi1: Fix leak of rcvhdrtail_dummy_kvaddr 2021-12-07 13:22:54 -04:00
hns RDMA/hns: Fix RNR retransmission issue for HIP08 2021-12-14 19:45:04 -04:00
irdma RDMA/irdma: Don't arm the CQ more than two times if no CE for this CQ 2021-12-07 13:53:01 -04:00
mlx4 RDMA/mlx4: Do not fail the registration on port stats 2021-11-17 16:45:16 -04:00
mlx5 RDMA/mlx5: Fix releasing unallocated memory in dereg MR flow 2021-11-25 13:16:39 -04:00
mthca RDMA: switch from 'pci_' to 'dma_' API 2021-08-23 13:43:54 -03:00
ocrdma RDMA: Globally allocate and release QP memory 2021-08-03 13:44:27 -03:00
qedr RDMA v5.16 merge window pull request 2021-11-03 08:05:59 -07:00
qib Linux 5.15 2021-11-01 14:49:20 -03:00
usnic RDMA: Constify netdev->dev_addr accesses 2021-10-25 14:33:09 -03:00
vmw_pvrdma RDMA: switch from 'pci_' to 'dma_' API 2021-08-23 13:43:54 -03:00
Makefile RDMA/irdma: Add irdma Kconfig/Makefile and remove i40iw 2021-06-02 20:06:36 -03:00