linux/drivers/infiniband/hw/hns
Yangyang Li 3ec5f54f7a RDMA/hns: Fix an cmd queue issue when resetting
If an IMP reset caused by hardware errors and an hns RoCE driver reset
occur at the same time, there is a possibility that the IMP will stop
processing commands, leaving users unable to use the hardware. The logs are
as follows:

 hns3 0000:fd:00.1: cleaned 0, need to clean 1
 hns3 0000:fd:00.1: firmware version query failed -11
 hns3 0000:fd:00.1: Cmd queue init failed
 hns3 0000:fd:00.1: Upgrade reset level
 hns3 0000:fd:00.1: global reset interrupt

The hns NIC driver divides the reset process into three states:
initialization, hardware resetting and software resetting. The RoCE driver
obtains the reset state through interfaces provided by the NIC driver, and
does not send commands to the IMP while the driver is in any of these
states. The root cause of this issue is a time gap between states 1 and 2:
if the RoCE driver sends commands to the IMP during this gap, the IMP
stops working because it is not ready.
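
The gap described above can be illustrated with a small user-space model. This is only a sketch and not the driver's code: the enum, its values and both function names are hypothetical, and the gap is modeled as an explicit phase purely to make it visible.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the reset timeline; none of these names are kernel symbols. */
enum reset_phase {
	PHASE_NORMAL,       /* no reset in progress */
	PHASE_INIT,         /* status 1: initialization */
	PHASE_GAP,          /* window between status 1 and status 2 */
	PHASE_HW_RESETTING, /* status 2: hardware resetting */
	PHASE_SW_RESETTING, /* status 3: software resetting */
};

/* True when one of the three reported statuses is visible to the RoCE side. */
static bool reset_status_reported(enum reset_phase phase)
{
	assert(phase <= PHASE_SW_RESETTING);
	return phase == PHASE_INIT ||
	       phase == PHASE_HW_RESETTING ||
	       phase == PHASE_SW_RESETTING;
}

/* Gating that relies only on the three statuses: the gap is not covered. */
static bool may_send_status_only(enum reset_phase phase)
{
	return !reset_status_reported(phase);
}
```

In this model, may_send_status_only(PHASE_GAP) returns true, which corresponds to a command reaching the IMP before it is ready.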

To eliminate the time gap, the hns NIC driver added a new interface in
commit a4de02287a ("net: hns3: provide .get_cmdq_stat interface for the
client"), so that the RoCE driver can ensure that no commands are sent
during a reset.
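
A rough user-space sketch of gating on a command queue query in addition to the reset status follows. The struct names, the cmdq_disabled callback, its return convention and the error codes are all assumptions made for illustration. The commit message does not give the signature or semantics of .get_cmdq_stat, so this does not reproduce the real interface.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the NIC-side handle and ops table. */
struct nic_handle {
	bool resetting;     /* one of the three reset statuses is reported */
	bool cmdq_disabled; /* command queue cannot accept commands right now */
};

struct nic_ops {
	/* Assumed convention: returns true while the command queue is unusable. */
	bool (*cmdq_disabled)(const struct nic_handle *handle);
};

static bool model_cmdq_disabled(const struct nic_handle *handle)
{
	return handle->cmdq_disabled;
}

static const struct nic_ops model_ops = {
	.cmdq_disabled = model_cmdq_disabled,
};

/* Returns 0 when a command may be sent, a negative errno value otherwise. */
static int roce_cmd_gate(const struct nic_ops *ops,
			 const struct nic_handle *handle)
{
	assert(ops != NULL && handle != NULL);

	if (handle->resetting)
		return -EBUSY; /* a reset status is already visible */
	if (ops->cmdq_disabled && ops->cmdq_disabled(handle))
		return -EIO;   /* covers the window the statuses miss */
	return 0;
}
```

The second check is what closes the window in this model: even when no reset status is reported yet, a disabled command queue blocks the send.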

Link: https://lore.kernel.org/r/1592314778-52822-1-git-send-email-liweihang@huawei.com
Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-06-18 10:48:39 -03:00
hns_roce_ah.c RDMA: Group create AH arguments in struct 2020-05-02 20:19:53 -03:00
hns_roce_alloc.c RDMA/hns: Change all page_shift to unsigned 2020-05-25 14:02:12 -03:00
hns_roce_cmd.c RDMA/hns: Optimize cmd init and mode selection for hip08 2019-09-16 10:52:20 -03:00
hns_roce_cmd.h RDMA/hns: Rename the functions used inside creating cq 2019-11-25 10:31:48 -04:00
hns_roce_common.h RDMA/hns: Remove unused code about assert 2020-05-25 14:02:12 -03:00
hns_roce_cq.c RDMA/hns: Add CQ flag instead of independent enable flag 2020-05-25 14:02:11 -03:00
hns_roce_db.c IB: Allow calls to ib_umem_get from kernel ULPs 2020-01-16 16:14:28 +02:00
hns_roce_device.h RDMA/hns: Fix a calltrace when registering MR from userspace 2020-06-18 10:47:04 -03:00
hns_roce_hem.c RDMA/hns: Change all page_shift to unsigned 2020-05-25 14:02:12 -03:00
hns_roce_hem.h RDMA/hns: Change all page_shift to unsigned 2020-05-25 14:02:12 -03:00
hns_roce_hw_v1.c RDMA/hns: Fix a calltrace when registering MR from userspace 2020-06-18 10:47:04 -03:00
hns_roce_hw_v1.h RDMA/hns: Remove asynchronic QP destroy 2019-04-24 10:55:31 -03:00
hns_roce_hw_v2_dfx.c RDMA/hns: Dump detailed driver-specific CQ 2019-04-08 13:05:25 -03:00
hns_roce_hw_v2.c RDMA/hns: Fix an cmd queue issue when resetting 2020-06-18 10:48:39 -03:00
hns_roce_hw_v2.h RDMA/hns: Remove redundant type cast for general pointers 2020-05-25 14:20:45 -03:00
hns_roce_main.c RDMA/hns: Remove unused code about assert 2020-05-25 14:02:12 -03:00
hns_roce_mr.c RDMA/hns: Fix a calltrace when registering MR from userspace 2020-06-18 10:47:04 -03:00
hns_roce_pd.c RDMA/hns: Unify format of prints 2020-03-26 16:52:26 -03:00
hns_roce_qp.c RDMA/hns: Remove redundant parameters from free_srq/qp_wrid() 2020-05-25 14:20:45 -03:00
hns_roce_restrack.c RDMA/hns: Fix memory leak on 'context' on error return path 2019-10-28 13:41:23 -03:00
hns_roce_srq.c RDMA/hns: Remove redundant parameters from free_srq/qp_wrid() 2020-05-25 14:20:45 -03:00
Kconfig treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
Makefile RDMA/hns: Fix build error again 2019-10-29 16:16:54 -03:00