linux

mirror of https://github.com/torvalds/linux.git synced 2024-12-17 16:43:08 +00:00

Author	SHA1	Message	Date
John Garry	da19eaba6e	scsi: hisi_sas: Delete unused I_T_NEXUS_RESET_PHYUP_TIMEOUT There is no user, so delete it. Link: https://lore.kernel.org/r/1645112566-115804-6-git-send-email-john.garry@huawei.com Tested-by: Yihang Li <liyihang6@hisilicon.com> Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-19 15:59:34 -05:00
John Garry	2dd6801a67	scsi: libsas: Delete SAS_SG_ERR No LLDD sets exec status as SAS_SG_ERR, so remove support. Link: https://lore.kernel.org/r/1645112566-115804-5-git-send-email-john.garry@huawei.com Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-19 15:59:34 -05:00
John Garry	25882c82f8	scsi: libsas: Delete lldd_clear_aca callback This callback is never called, so remove support. Link: https://lore.kernel.org/r/1645112566-115804-4-git-send-email-john.garry@huawei.com Tested-by: Yihang Li <liyihang6@hisilicon.com> Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Jack Wang <jinpu.wang@ionos.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-19 15:59:34 -05:00
John Garry	1d6049a3b1	scsi: libsas: Use enum for response frame DATAPRES field As defined in table 126 of the SAS spec 1.1, use an enum for the DATAPRES field, which makes reading the code easier. Also change sas_ssp_task_response() to use a switch statement, which is more suitable (than if-else), as suggested by Christoph. Link: https://lore.kernel.org/r/1645112566-115804-3-git-send-email-john.garry@huawei.com Suggested-by: Xiang Chen <chenxiang66@hisilicon.com> Tested-by: Yihang Li <liyihang6@hisilicon.com> Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-19 15:59:34 -05:00
John Garry	9aacf6fe90	scsi: libsas: Handle non-TMF codes in sas_scsi_find_task() LLDD TMF callbacks may return linux or other error codes instead of TMF codes. This may cause problems in sas_scsi_find_task() -> .lldd_query_task(), as only TMF codes are handled there. As such, we may not return a task_disposition type from sas_scsi_find_task(). Function sas_eh_handle_sas_errors() only handles that type, and will only progress error handling for those recognised types. Return TASK_ABORT_FAILED upon exit on the assumption that the command may still be alive and error handling should be escalated. Link: https://lore.kernel.org/r/1645112566-115804-2-git-send-email-john.garry@huawei.com Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-19 15:59:34 -05:00
Martin K. Petersen	ac2beb4e3b	Merge branch '5.17/scsi-fixes' into 5.18/scsi-staging Pull 5.17 fixes branch into 5.18 tree to resolve a few pm8001 driver merge conflicts. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-14 21:51:29 -05:00
Sreekanth Reddy	22754f7fbb	scsi: mpi3mr: Bump driver version to 8.0.0.68.0 Bump mpi3mr driver version to 8.0.0.68.0. Link: https://lore.kernel.org/r/20220210095817.22828-10-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-11 17:40:04 -05:00
Sreekanth Reddy	d44b5fefb2	scsi: mpi3mr: Fix memory leaks Fix memory leaks related to operational reply queue's memory segments which are not getting freed while unloading the driver. Link: https://lore.kernel.org/r/20220210095817.22828-9-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-11 17:40:04 -05:00
Sreekanth Reddy	21401408dd	scsi: mpi3mr: Update the copyright year Update the copyright year to 2017-2022. Link: https://lore.kernel.org/r/20220210095817.22828-8-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-11 17:40:04 -05:00
Sreekanth Reddy	9992246127	scsi: mpi3mr: Fix reporting of actual data transfer size The driver is missing to set the residual size while completing an I/O. Ensure proper data transfer size is reported to the kernel on I/O completion based on the transfer length reported by the firmware. Link: https://lore.kernel.org/r/20220210095817.22828-7-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-11 17:40:04 -05:00
Sreekanth Reddy	b3911ab3a7	scsi: mpi3mr: Fix cmnd getting marked as in use forever When a driver command which requires the driver to issue a follow up command using the same command frame is outstanding and a soft reset operation occurs, then that driver command frame is getting marked as in use permanently and won't be reused again. Clear the driver command frames while flushing out the outstanding commands and avoid issuing any new requests using these command frames while soft reset is going on. Link: https://lore.kernel.org/r/20220210095817.22828-6-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-11 17:40:03 -05:00
Sreekanth Reddy	191a3ef586	scsi: mpi3mr: Fix hibernation issue Hibernation operation fails when it is issued for second time. This is because the driver is trying to release the IOC's PCI resources after setting power state to D3. Set the IOC's power state to D3 only after releasing the IOC's PCI resources. Link: https://lore.kernel.org/r/20220210095817.22828-5-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-11 17:40:03 -05:00
Sreekanth Reddy	04b27e538d	scsi: mpi3mr: Update MPI3 headers Update MPI3 headers. Link: https://lore.kernel.org/r/20220210095817.22828-4-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-11 17:40:03 -05:00
Sreekanth Reddy	6d211f1d26	scsi: mpi3mr: Fix printing of pending I/O count Print proper pending I/O count after issuing target reset TM operation. Link: https://lore.kernel.org/r/20220210095817.22828-3-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-11 17:40:03 -05:00
Sreekanth Reddy	580e674220	scsi: mpi3mr: Fix deadlock while canceling the fw event During controller reset, the driver tries to flush all the pending firmware event works from worker queue that are queued prior to the reset. However, if any work is waiting for device addition/removal operation to be completed at the SML, then a deadlock is observed. This is due to the controller reset waiting for the device addition/removal to be completed and the device/addition removal is waiting for the controller reset to be completed. To limit this deadlock, continue with the controller reset handling without canceling the work which is waiting for device addition/removal operation to complete at SML. Link: https://lore.kernel.org/r/20220210095817.22828-2-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-11 17:40:03 -05:00
Xiang Chen	3a20e64281	scsi: libsas: Remove unused parameter for function sas_ata_eh() Input parameter work_q is not unused in function sas_ata_eh(), so remove it. Link: https://lore.kernel.org/r/1644561778-183074-4-git-send-email-chenxiang66@hisilicon.com Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-11 17:13:29 -05:00
Xiang Chen	59803ccb65	scsi: libsas: Remove duplicated setting for task->task_state_flags Task->task_state_flags is already set in function sas_alloc_task(), so remove duplicated setting. Link: https://lore.kernel.org/r/1644561778-183074-3-git-send-email-chenxiang66@hisilicon.com Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-11 17:13:29 -05:00
Xiang Chen	26d4a969dd	scsi: libsas: Use void for sas_discover_event() return code The callers of function sas_discover_event() do not check its return value. The function also only ever returns 0, so use void instead. Link: https://lore.kernel.org/r/1644561778-183074-2-git-send-email-chenxiang66@hisilicon.com Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-11 17:13:29 -05:00
Don Brace	31b17c3aeb	scsi: smartpqi: Fix unused variable pqi_pm_ops for clang Driver added a new dev_pm_ops structure used only if CONFIG_PM is set. The CONFIG_PM MACRO needed to be moved up in the code to avoid the compiler warnings. The hunk to move the location was missing from the above patch. Found by kernel test robot by building driver with CONFIG_PM disabled. Link: https://lore.kernel.org/all/202202090657.bstNLuce-lkp@intel.com/ Link: https://lore.kernel.org/r/20220210201151.236170-1-don.brace@microchip.com Fixes: `c66e078ad8` ("scsi: smartpqi: Fix hibernate and suspend") Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Mike Mcgowen <mike.mcgowen@microchip.com> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-11 17:06:12 -05:00
John Garry	26fc0ea74f	scsi: libsas: Drop SAS_TASK_AT_INITIATOR This flag is now only ever set, so delete it. This also avoids a use-after-free in the pm8001 queue path, as reported in the following: https://lore.kernel.org/linux-scsi/c3cb7228-254e-9584-182b-007ac5e6fe0a@huawei.com/T/#m28c94c6d3ff582ec4a9fa54819180740e8bd4cfb https://lore.kernel.org/linux-scsi/0cc0c435-b4f2-9c76-258d-865ba50a29dd@huawei.com/ [mkp: checkpatch + two SAS_TASK_AT_INITIATOR references] Link: https://lore.kernel.org/r/1644489804-85730-3-git-send-email-john.garry@huawei.com Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-11 17:02:50 -05:00
John Garry	c39d5aa457	scsi: isci: Drop SAS_TASK_AT_INITIATOR check in isci_task_abort_task() In the queue path, move around when we assign sas_task->lldd_task such that this pointer and the SAS_TASK_AT_INITIATOR flag are set atomically. It is also not required to clear SAS_TASK_AT_INITIATOR in isci_task_execute_task() error path as it is also cleared immediately after in isci_task_refuse() call. Now the following items may be considered: - SAS_TASK_STATE_DONE and SAS_TASK_AT_INITIATOR are mutually exclusive apart from possibly when SAS_TASK_STATE_DONE is set in sas_scsi_find_task(), but that is after .lldd_abort_task, i.e. the considered callback, is called. - If isci_task_refuse() is called in the queue path, then sas_task->lldd_task and SAS_TASK_AT_INITIATOR are cleared atomically in isci_task_refuse(). - In the completion path, SAS_TASK_STATE_DONE is set and SAS_TASK_AT_INITIATOR is cleared atomically before the sas_task.lldd_task is cleared later. So in isci_task_abort_task() if SAS_TASK_STATE_DONE is not set and sas_task.lldd_task is still set, then SAS_TASK_AT_INITIATOR must be set - so we can drop this check on SAS_TASK_AT_INITIATOR. [mkp: checkpatch] Link: https://lore.kernel.org/r/1644489804-85730-2-git-send-email-john.garry@huawei.com Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-11 16:59:57 -05:00
Gleb Chesnokov	fa1d43f396	scsi: qla2xxx: Remove unused qla_sess_op_cmd_list from scsi_qla_host_t The qla_sess_op_cmd_list was introduced in commit `8b2f5ff3d0` ("qla2xxx: cleanup cmd in qla workqueue before processing TMR"). Then the usage of this list was dropped in commit `fb35265b12` ("scsi: qla2xxx: Remove session creation redundant code"). Thus, remove this list since it is no longer used. Link: https://lore.kernel.org/r/AS8PR10MB49524AAB4C8016E4AFF17FFB9D2D9@AS8PR10MB4952.EURPRD10.PROD.OUTLOOK.COM Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Gleb Chesnokov <Chesnokov.G@raidix.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-11 16:51:24 -05:00
Yang Li	106b7a2549	scsi: pm8001: Clean up inconsistent indenting Eliminate the following smatch warning: drivers/scsi/pm8001/pm8001_ctl.c:760 pm8001_update_flash() warn: inconsistent indenting Link: https://lore.kernel.org/r/20220208025500.29511-1-yang.lee@linux.alibaba.com Reported-by: Abaci Robot <abaci@linux.alibaba.com> Acked-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-11 16:45:25 -05:00
Kees Cook	03e4383c7c	scsi: ibmvscsis: Silence -Warray-bounds warning Instead of doing a cast to storage that is too small, add a union for the high 64 bits. Silences the warnings under -Warray-bounds: drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c: In function 'ibmvscsis_send_messages': drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c:1934:44: error: array subscript 'struct viosrp_crq[0]' is partly outside array bounds of 'u64[1]' {aka 'long long unsigned int[1]'} [-Werror=array-bounds] 1934 \| crq->valid = VALID_CMD_RESP_EL; \| ^~ drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c:1875:13: note: while referencing 'msg_hi' 1875 \| u64 msg_hi = 0; \| ^~~~~~ There is no change to the resulting binary instructions. Link: https://lore.kernel.org/lkml/20220125142430.75c3160e@canb.auug.org.au Link: https://lore.kernel.org/r/20220208061231.3429486-1-keescook@chromium.org Cc: Michael Cyr <mikecyr@linux.ibm.com> Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Tyrel Datwyler <tyreld@linux.ibm.com> Cc: linux-scsi@vger.kernel.org Cc: target-devel@vger.kernel.org Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Reviewed-by: Tyrel Datwyler <tyreld@linux.ibm.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-11 16:42:22 -05:00
Saurav Kashyap	49b729f58e	scsi: qla2xxx: Add qla2x00_async_done() for async routines This done routine will delete the timer and check for its return value and decrease the reference count accordingly. This prevents boot hangs reported after commit `31e6cdbe0e` ("scsi: qla2xxx: Implement ref count for SRB") was merged. Link: https://lore.kernel.org/r/20220208093946.4471-1-njavali@marvell.com Fixes: `31e6cdbe0e` ("scsi: qla2xxx: Implement ref count for SRB") Reported-by: Ewan Milne <emilne@redhat.com> Tested-by: Ewan D. Milne <emilne@redhat.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-11 16:36:34 -05:00
James Smart	5852ed2a6a	scsi: lpfc: Reduce log messages seen after firmware download Messages around firmware download were incorrectly tagged as being related to discovery trace events. Thus, firmware download status ended up dumping the trace log as well as the firmware update message. As there were a couple of log messages in this state, the trace log was dumped multiple times. Resolve this by converting from trace events to SLI events. Link: https://lore.kernel.org/r/20220207180442.72836-1-jsmart2021@gmail.com Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-07 23:45:02 -05:00
James Smart	c80b27cfd9	scsi: lpfc: Remove NVMe support if kernel has NVME_FC disabled The driver is initiating NVMe PRLIs to determine device NVMe support. This should not be occurring if CONFIG_NVME_FC support is disabled. Correct this by changing the default value for FC4 support. Currently it defaults to FCP and NVMe. With change, when NVME_FC support is not enabled in the kernel, the default value is just FCP. Link: https://lore.kernel.org/r/20220207180516.73052-1-jsmart2021@gmail.com Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-07 23:42:58 -05:00
Don Brace	62ed6622aa	scsi: smartpqi: Update version to 2.1.14-035 Link: https://lore.kernel.org/r/164375215867.440833.17567317655622946368.stgit@brunhilda.pdev.net Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Gerry Morong <gerry.morong@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-07 23:38:36 -05:00
Kevin Barnett	291c2e0071	scsi: smartpqi: Fix lsscsi -t SAS addresses Correct lsscsi -t output for newer controllers that support 16-byte WWID in the SAS address field. lsscsi -t was displaying all zeros for SAS addresses. When we added support to smartpqi for 16-byte WWIDs in the RPL data for newer controllers, we were copying the wrong part of the 16-byte WWID to the SAS address field. Link: https://lore.kernel.org/r/164375215363.440833.7298523719813806902.stgit@brunhilda.pdev.net Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-07 23:38:36 -05:00
Kevin Barnett	c66e078ad8	scsi: smartpqi: Fix hibernate and suspend Restructure the hibernate/suspend code to allow workarounds for the controller boot differences. Newer controllers have subtle differences in the way that they boot up. Link: https://lore.kernel.org/r/164375214859.440833.14683009064111314948.stgit@brunhilda.pdev.net Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-07 23:38:36 -05:00
Mike McGowen	5e6935864d	scsi: smartpqi: Fix BUILD_BUG_ON() statements Add calls to the functions at the beginning driver initialization. The BUILD_BUG_ON() statements that are currently in functions named verify_structures() in the modules smartpqi_init.c and smartpqi_sis.c do not work as currently implemented. Link: https://lore.kernel.org/r/164375214355.440833.13129778749209816497.stgit@brunhilda.pdev.net Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Signed-off-by: Mike McGowen <Mike.McGowen@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-07 23:38:36 -05:00
Mike McGowen	c52efc9238	scsi: smartpqi: Fix NUMA node not updated during init Correct NUMA node association when calling pqi_pci_probe(). In the function pqi_pci_probe(), the driver makes an OS call to get the NUMA node associated with a controller. If the call returns that there is no associated node, the driver attempts to set it to node 0. The problem is that a different local variable (cp_node) was being used to do this, but the original local variable (node) was still being used in the call to pqi_alloc_ctrl_info(). The value of "node" is not updated if the conditional after the call to dev_to_node() evaluates to TRUE. Link: https://lore.kernel.org/r/164375213850.440833.5243014942807747074.stgit@brunhilda.pdev.net Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Signed-off-by: Mike McGowen <Mike.McGowen@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-07 23:38:35 -05:00
Kevin Barnett	00598b056a	scsi: smartpqi: Expose SAS address for SATA drives Remove UNIQUE_WWID_IN_REPORT_PHYS_LUN PQI feature. This feature was originally added to solve a problem with NVMe drives, but this problem has since been solved a different way, so this PQI feature is no longer required for any type of drive. The kernel was not creating symbolic links in sysfs between SATA drives and their enclosure. The driver was enabling the UNIQUE_WWID_IN_REPORT_PHYS_LUN PQI feature, which causes the FW to return a WWID for SATA drives that is derived from a unique ID read from the SATA drive itself. The driver was exposing this WWID as the drive's SAS address. However, because this SAS address does not match the SAS address returned by an enclosure's SES Page 0xA data, the Linux kernel was not able to match a SATA drive with its enclosure. Link: https://lore.kernel.org/r/164375213346.440833.12379222470149882747.stgit@brunhilda.pdev.net Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-07 23:38:35 -05:00
Mike McGowen	5d8fbce04d	scsi: smartpqi: Speed up RAID 10 sequential reads Use all data disks for sequential read operations. Testing discovered inconsistent performance on RAID 10 volumes when performing 256K sequential reads. The driver was only using a single tracker to determine which physical drive to send a request to for AIO requests. Change the single tracker (next_bypass_group) to an array of trackers based on the number of data disks in a row of the RAID map. Link: https://lore.kernel.org/r/164375212842.440833.6733971458765002128.stgit@brunhilda.pdev.net Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Signed-off-by: Mike McGowen <Mike.McGowen@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-07 23:38:35 -05:00
Mahesh Rajashekhara	3ada501d60	scsi: smartpqi: Fix kdump issue when controller is locked up Avoid dropping into shell if the controller is in locked up state. Driver issues SIS soft reset to bring back the controller to SIS mode while OS boots into kdump mode. If the controller is in lockup state, SIS soft reset does not work. Since the controller lockup code has not been cleared, driver considers the firmware is no longer up and running. Driver returns back an error code to OS and the kdump fails. Link: https://lore.kernel.org/r/164375212337.440833.11955356190354940369.stgit@brunhilda.pdev.net Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Signed-off-by: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-07 23:38:35 -05:00
Mahesh Rajashekhara	27655e9db4	scsi: smartpqi: Update volume size after expansion After modifying logical volume size, lsblk command still shows previous size of logical volume. When the driver gets any event from firmware it schedules rescan worker with delay of 10 seconds. If array expansion is so quick that it gets completed in a second, the driver could not catch logical volume expansion due to worker delay. Since driver does not detect volume expansion, driver would not call rescan device to update new size to the OS. Link: https://lore.kernel.org/r/164375211833.440833.17023155389220583731.stgit@brunhilda.pdev.net Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Signed-off-by: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-07 23:38:35 -05:00
Sagar Biradar	b73357a1fd	scsi: smartpqi: Avoid drive spin-down during suspend On certain systems (based on PCI IDs), when the OS transitions the system into the suspend (S3) state, the BMIC flush cache command will indicate a system RESTART instead of SUSPEND. This avoids drive spin-down. Link: https://lore.kernel.org/r/164375211330.440833.7203813692110347698.stgit@brunhilda.pdev.net Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Signed-off-by: Sagar Biradar <sagar.biradar@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-07 23:38:35 -05:00
Balsundar P	42dc0426fb	scsi: smartpqi: Resolve delay issue with PQI_HZ value Change PQI_HZ to HZ. PQI_HZ macro was set to 1000 when HZ value is less than 1000. By default, PQI_HZ will result into a delay of 10 seconds(for kernel, which has HZ = 100). So in this case when firmware raises an event, rescan worker will be scheduled after a delay of (10 x PQI_HZ) = 100 seconds instead of 10 seconds. Also driver uses PQI_HZ at many instances, which might result in some other issues with respect to delay. Link: https://lore.kernel.org/r/164375210825.440833.15510172447583227486.stgit@brunhilda.pdev.net Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Signed-off-by: Balsundar P <balsundar.p@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-07 23:38:34 -05:00
Kevin Barnett	9e98e60bfc	scsi: smartpqi: Fix a typo in func pqi_aio_submit_io() Use correct pqi_aio_path_request structure to calculate the offset to sg_descriptors. The function pqi_aio_submit_io() uses the pqi_raid_path_request structure to calculate the offset of the structure member sg_descriptors. This is incorrect. It should be using the pqi_aio_path_request structure instead. This typo is benign because the offsets are the same in both structures. Link: https://lore.kernel.org/r/164375210321.440833.2566086558909686629.stgit@brunhilda.pdev.net Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-07 23:38:34 -05:00
Kevin Barnett	b4dc06a907	scsi: smartpqi: Fix a name typo and cleanup code Rename the function pqi_is_io_high_priority() to pqi_is_io_high_priority(). Remove 2 unnecessary lines from the function, and adjusted the white space. Link: https://lore.kernel.org/r/164375209818.440833.10908948825731635853.stgit@brunhilda.pdev.net Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-07 23:38:34 -05:00
Murthy Bhat	94a68c8143	scsi: smartpqi: Quickly propagate path failures to SCSI midlayer Return DID_NO_CONNECT when a path failure is detected. When a path fails during I/O and AIO path gets disabled for a multipath device, the I/O was retried in the RAID path slowing down path fail detection. Returning DID_NO_CONNECT allows multipath to switch paths more quickly. Link: https://lore.kernel.org/r/164375209313.440833.9992416628621839233.stgit@brunhilda.pdev.net Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Sagar Biradar <sagar.biradar@microchip.com> Signed-off-by: Murthy Bhat <Murthy.Bhat@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-07 23:38:34 -05:00
Sagar Biradar	70ba20be4b	scsi: smartpqi: Eliminate drive spin down on warm boot Avoid drive spin down during a warm boot. Call the BMIC Flush Cache command (0xc2) to indicate the reason for the flush cache (shutdown, hibernate, suspend, or restart). Link: https://lore.kernel.org/r/164375208810.440833.11254644025650871435.stgit@brunhilda.pdev.net Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Signed-off-by: Sagar Biradar <sagar.biradar@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-07 23:38:34 -05:00
Gilbert Wu	2a47834d94	scsi: smartpqi: Enable SATA NCQ priority in sysfs Add device attribute 'sas_ncq_prio_enable' to enable SATA NCQ priority support and provide I/O priority in SCSI command and pass priority information to controller firmware. This device attribute works only when device has NCQ priority support and controller firmware can handle I/Os with NCQ priority attribute. Link: https://lore.kernel.org/r/164375208306.440833.7392577382127815362.stgit@brunhilda.pdev.net Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Signed-off-by: Gilbert Wu <Gilbert.Wu@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-07 23:38:34 -05:00
Don Brace	c57ee4ccb3	scsi: smartpqi: Add PCI IDs Add in new ZTE controllers: VID / DID / SVID / SDID ---- ---- ---- ---- ZTE SmartROC3100 RS241-18i 2G 9005 / 028F / 1CF2 / 5449 ZTE SmartROC3100 RS242-18i 4G 9005 / 028F / 1CF2 / 544A ZTE SmartIOC2100 RS243-18i 9005 / 028F / 1CF2 / 544B ZTE SmartROC3100 RM241B-18i 2G 9005 / 028F / 1CF2 / 544D ZTE SmartROC3100 RM242B-18i 4G 9005 / 028F / 1CF2 / 544E ZTE SmartIOC2100 RM243B-18i 9005 / 028F / 1CF2 / 544F Add PCI ID for 1100-24i controller: VID / DID / SVID / SDID ---- ---- ---- ---- HBA 1100-24i 9005 / 028F / 9005 / 1304 Add PCI IDs for HPE and Adaptec devices: VID / DID / SVID / SDID ---- ---- ---- ---- Adaptec Smart HBA 2200-8io /e 9005 / 028F / 9005 / 1463 Adaptec Smart HBA 2200-16io /e 9005 / 028F / 9005 / 14C2 HPE SR308i-p Gen11 9005 / 028F / 1590 / 0382 HPE SR308i-o Gen11 9005 / 028F / 1590 / 0383 HPE SR932i-p Gen11 9005 / 028F / 1590 / 0381 Add PCI IDs for Inspur controllers: VID / DID / SVID / SDID ---- ---- ---- ---- INSPUR RS0800M5H24i 9005 / 028F / 1BD4 / 006B INSPUR RS0800M5E8I 9005 / 028F / 1BD4 / 006C INSPUR RS0800M5H8I 9005 / 028F / 1BD4 / 006D INSPUR RS0804M5R16i 9005 / 028F / 1BD4 / 006F INSPUR RS0800M5E16i 9005 / 028F / 1BD4 / 0070 INSPUR RS0800M5H16i 9005 / 028F / 1BD4 / 0071 INSPUR RS0800M5E16i 9005 / 028F / 1BD4 / 0072 NT RAID 3100-24i 9005 / 028F / 1F0C / 3161 Add HPE and Adaptec OROC PCI IDs: VID / DID / SVID / SDID ---- ---- ---- ---- HPE SR216i-o Gen11 9005 / 028F / 9005 / 036F Adaptec SmartRAID 3284-16io /e/uC 9005 / 028F / 9005 / 1473 Adaptec SmartRAID 3254-16io /e 9005 / 028F / 9005 / 1474 Add PCI IDs for new channel controllers: VID / DID / SVID / SDID ---- ---- ---- ---- Adaptec SmartRAID 3254-8i /e 9005 / 028F / 9005 / 14A4 Adaptec SmartRAID 3252-8i /e 9005 / 028F / 9005 / 14A5 Adaptec SmartRAID 3204-8i /e 9005 / 028F / 9005 / 14A6 Align PCI IDs with OOB driver. Easier to check for differences. Link: https://lore.kernel.org/r/164375207802.440833.15947108153078495425.stgit@brunhilda.pdev.net Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Co-developed-by: Mike McGowen <Mike.McGowen@microchip.com> Signed-off-by: Mike McGowen <Mike.McGowen@microchip.com> Co-developed-by: Murthy Bhat <Murthy.Bhat@microchip.com> Signed-off-by: Murthy Bhat <Murthy.Bhat@microchip.com> Co-developed-by: Sagar Biradar <sagar.biradar@microchip.com> Signed-off-by: Sagar Biradar <sagar.biradar@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-07 23:38:33 -05:00
Don Brace	c4ff687d25	scsi: smartpqi: Fix rmmod stack trace Prevent "BUG: scheduling while atomic: rmmod" stack trace. Stop setting spin_locks before calling OS functions to remove devices. Link: https://lore.kernel.org/r/164375207296.440833.4996145011193819683.stgit@brunhilda.pdev.net Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-07 23:38:33 -05:00
Kees Cook	d20b3dae63	scsi: mpt3sas: Convert to flexible arrays This converts to a flexible array instead of the old-style 1-element arrays. The existing code already did the correct math for finding the size of the resulting flexible array structure, so there is no binary difference. The other two structures converted to use flexible arrays appear to have no users at all. Link: https://lore.kernel.org/r/20220201223948.1455637-1-keescook@chromium.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-07 23:16:34 -05:00
Sebastian Andrzej Siewior	b84b6ec0f9	scsi: core: Add scsi_done_direct() for immediate completion Add scsi_done_direct() which behaves like scsi_done() except that it invokes blk_mq_complete_request_direct() in order to complete the request. Callers from process context can complete the request directly instead waking ksoftirqd. Link: https://lore.kernel.org/r/Yfw7JaszshmfYa1d@flow Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-07 23:14:15 -05:00
Martin Wilck	7cddf7e8d1	scsi: core: Make "access_state" sysfs attribute always visible If a SCSI device handler module is loaded after some SCSI devices have already been probed (e.g. via request_module() by dm-multipath), the "access_state" and "preferred_path" sysfs attributes remain invisible for these devices, although the handler is attached and live. The reason is that the visibility is only checked when the sysfs attribute group is first created. This results in an inconsistent user experience depending on the load order of SCSI low-level drivers vs. device handler modules. This patch changes user space API: attempting to read the "access_state" or "preferred_path" attributes will now result in -EINVAL rather than -ENODEV for devices that have no device handler, and tests for the existence of these attributes will have a different result. Link: https://lore.kernel.org/r/20220127141351.30706-1-mwilck@suse.com Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-02-07 23:06:13 -05:00
Minghao Chi (CGEL ZTE)	d1d87c33f4	scsi: lpfc: Remove redundant flush_workqueue() call destroy_workqueue() already drains the queue before destroying it, so there is no need to flush it explicitly. Remove the redundant flush_workqueue() call. Link: https://lore.kernel.org/r/20220127014330.1185114-1-chi.minghao@zte.com.cn Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Minghao Chi (CGEL ZTE) <chi.minghao@zte.com.cn> Signed-off-by: CGEL ZTE <cgel.zte@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-31 17:10:37 -05:00
Minghao Chi (CGEL ZTE)	0603be7192	scsi: qedi: Remove redundant flush_workqueue() calls destroy_workqueue() already drains the queue before destroying it, so there is no need to flush it explicitly. Remove the redundant flush_workqueue() calls. Link: https://lore.kernel.org/r/20220127013934.1184923-1-chi.minghao@zte.com.cn Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Minghao Chi (CGEL ZTE) <chi.minghao@zte.com.cn> Signed-off-by: CGEL ZTE <cgel.zte@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-31 17:09:44 -05:00
Yang Guang	2245ea91fd	scsi: bfa: Replace snprintf() with sysfs_emit() coccinelle report: ./drivers/scsi/bfa/bfad_attr.c:908:8-16: WARNING: use scnprintf or sprintf ./drivers/scsi/bfa/bfad_attr.c:860:8-16: WARNING: use scnprintf or sprintf ./drivers/scsi/bfa/bfad_attr.c:888:8-16: WARNING: use scnprintf or sprintf ./drivers/scsi/bfa/bfad_attr.c:853:8-16: WARNING: use scnprintf or sprintf ./drivers/scsi/bfa/bfad_attr.c:808:8-16: WARNING: use scnprintf or sprintf ./drivers/scsi/bfa/bfad_attr.c:728:8-16: WARNING: use scnprintf or sprintf ./drivers/scsi/bfa/bfad_attr.c:822:8-16: WARNING: use scnprintf or sprintf ./drivers/scsi/bfa/bfad_attr.c:927:9-17: WARNING: use scnprintf or sprintf ./drivers/scsi/bfa/bfad_attr.c:900:8-16: WARNING: use scnprintf or sprintf ./drivers/scsi/bfa/bfad_attr.c:874:8-16: WARNING: use scnprintf or sprintf ./drivers/scsi/bfa/bfad_attr.c:714:8-16: WARNING: use scnprintf or sprintf ./drivers/scsi/bfa/bfad_attr.c:839:8-16: WARNING: use scnprintf or sprintf Use sysfs_emit() instead of scnprintf() or sprintf(). Link: https://lore.kernel.org/r/def83ff75faec64ba592b867a8499b1367bae303.1643181468.git.yang.guang5@zte.com.cn Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Yang Guang <yang.guang5@zte.com.cn> Signed-off-by: David Yang <davidcomponentone@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-31 16:59:59 -05:00
Yang Guang	0ad3867b0f	scsi: mvsas: Replace snprintf() with sysfs_emit() coccinelle report: ./drivers/scsi/mvsas/mv_init.c:699:8-16: WARNING: use scnprintf or sprintf ./drivers/scsi/mvsas/mv_init.c:747:8-16: WARNING: use scnprintf or sprintf Use sysfs_emit() instead of scnprintf() or sprintf(). Link: https://lore.kernel.org/r/c1711f7cf251730a8ceb5bdfc313bf85662b3395.1643182948.git.yang.guang5@zte.com.cn Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Yang Guang <yang.guang5@zte.com.cn> Signed-off-by: David Yang <davidcomponentone@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-31 16:59:11 -05:00
Yin Xiujiang	687ba48e16	scsi: bnx2fc: Make use of the helper macro kthread_run() Repalce kthread_create/wake_up_process() with kthread_run() to simplify the code. Link: https://lore.kernel.org/r/20220126014248.466806-1-yinxiujiang@kylinos.cn Signed-off-by: Yin Xiujiang <yinxiujiang@kylinos.cn> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-31 16:56:57 -05:00
John Garry	c763ec4c10	scsi: hisi_sas: Fix setting of hisi_sas_slot.is_internal The hisi_sas_slot.is_internal member is not set properly for ATA commands which the driver sends directly. A TMF struct pointer is normally used as a test to set this, but it is NULL for those commands. It's not ideal, but pass an empty TMF struct to set that member properly. Link: https://lore.kernel.org/r/1643627607-138785-1-git-send-email-john.garry@huawei.com Fixes: `dc313f6b12` ("scsi: hisi_sas: Factor out task prep and delivery code") Reported-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-31 16:50:59 -05:00
Cai Huoqing	dd84a4b0fe	scsi: bnx2fc: Fix typo in comments Replace 'Offlaod' with 'Offload'. Link: https://lore.kernel.org/r/20220128063101.6953-1-cai.huoqing@linux.dev Acked-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Cai Huoqing <cai.huoqing@linux.dev> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-31 16:49:31 -05:00
Jinyoung Choi	f681d1078d	scsi: ufs: Add checking lifetime attribute for WriteBooster Because WB performs writes in SLC mode, it is not possible to use WriteBooster indefinitely. Vendors can set a lifetime limit in the device. If the lifetime exceeds this limit, the device ican disable the WB feature. The feature is defined in the "bWriteBoosterBufferLifeTimeEst (IDN = 1E)" attribute. With lifetime exceeding the limit value, the current driver continuously performs the following query: - Write Flag: WB_ENABLE / DISABLE - Read attr: Available Buffer Size - Read attr: Current Buffer Size This patch recognizes that WriteBooster is no longer supported by the device, and prevents unnecessary queries. Link: https://lore.kernel.org/r/1891546521.01643252701746.JavaMail.epsvc@epcpadp3 Reviewed-by: Asutosh Das <quic_asutoshd@quicinc.com> Acked-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Jinyoung Choi <j-young.choi@samsung.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-31 16:44:57 -05:00
John Garry	df7abcaa12	scsi: pm8001: Fix use-after-free for aborted SSP/STP sas_task Currently a use-after-free may occur if a sas_task is aborted by the upper layer before we handle the I/O completion in mpi_ssp_completion() or mpi_sata_completion(). In this case, the following are the two steps in handling those I/O completions: - Call complete() to inform the upper layer handler of completion of the I/O. - Release driver resources associated with the sas_task in pm8001_ccb_task_free() call. When complete() is called, the upper layer may free the sas_task. As such, we should not touch the associated sas_task afterwards, but we do so in the pm8001_ccb_task_free() call. Fix by swapping the complete() and pm8001_ccb_task_free() calls ordering. Link: https://lore.kernel.org/r/1643289172-165636-4-git-send-email-john.garry@huawei.com Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Acked-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-31 16:40:07 -05:00
John Garry	61f162aa43	scsi: pm8001: Fix use-after-free for aborted TMF sas_task Currently a use-after-free may occur if a TMF sas_task is aborted before we handle the IO completion in mpi_ssp_completion(). The abort occurs due to timeout. When the timeout occurs, the SAS_TASK_STATE_ABORTED flag is set and the sas_task is freed in pm8001_exec_internal_tmf_task(). However, if the I/O completion occurs later, the I/O completion still thinks that the sas_task is available. Fix this by clearing the ccb->task if the TMF times out - the I/O completion handler does nothing if this pointer is cleared. Link: https://lore.kernel.org/r/1643289172-165636-3-git-send-email-john.garry@huawei.com Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Acked-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-31 16:40:07 -05:00
John Garry	0aed75fd30	scsi: pm8001: Fix warning for undescribed param in process_one_iomb() make W=1 complains of an undescribed function parameter: drivers/scsi/pm8001/pm80xx_hwi.c:3938: warning: Function parameter or member 'circularQ' not described in 'process_one_iomb' Fix it. Link: https://lore.kernel.org/r/1643289172-165636-2-git-send-email-john.garry@huawei.com Reported-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Acked-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-31 16:40:07 -05:00
Ming Lei	edb854a368	scsi: core: Reallocate device's budget map on queue depth change We currently use ->cmd_per_lun as initial queue depth for setting up the budget_map. Martin Wilck reported that it is common for the queue_depth to be subsequently updated in slave_configure() based on detected hardware characteristics. As a result, for some drivers, the static host template settings for cmd_per_lun and can_queue won't actually get used in practice. And if the default values are used to allocate the budget_map, memory may be consumed unnecessarily. Fix the issue by reallocating the budget_map after ->slave_configure() returns. At that time the device queue_depth should accurately reflect what the hardware needs. Link: https://lore.kernel.org/r/20220127153733.409132-1-ming.lei@redhat.com Cc: Bart Van Assche <bvanassche@acm.org> Reported-by: Martin Wilck <martin.wilck@suse.com> Suggested-by: Martin Wilck <martin.wilck@suse.com> Tested-by: Martin Wilck <mwilck@suse.com> Reviewed-by: Martin Wilck <mwilck@suse.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-31 12:56:01 -05:00
John Meneghini	936bd03405	scsi: bnx2fc: Make bnx2fc_recv_frame() mp safe Running tests with a debug kernel shows that bnx2fc_recv_frame() is modifying the per_cpu lport stats counters in a non-mpsafe way. Just boot a debug kernel and run the bnx2fc driver with the hardware enabled. [ 1391.699147] BUG: using smp_processor_id() in preemptible [00000000] code: bnx2fc_ [ 1391.699160] caller is bnx2fc_recv_frame+0xbf9/0x1760 [bnx2fc] [ 1391.699174] CPU: 2 PID: 4355 Comm: bnx2fc_l2_threa Kdump: loaded Tainted: G B [ 1391.699180] Hardware name: HP ProLiant DL120 G7, BIOS J01 07/01/2013 [ 1391.699183] Call Trace: [ 1391.699188] dump_stack_lvl+0x57/0x7d [ 1391.699198] check_preemption_disabled+0xc8/0xd0 [ 1391.699205] bnx2fc_recv_frame+0xbf9/0x1760 [bnx2fc] [ 1391.699215] ? do_raw_spin_trylock+0xb5/0x180 [ 1391.699221] ? bnx2fc_npiv_create_vports.isra.0+0x4e0/0x4e0 [bnx2fc] [ 1391.699229] ? bnx2fc_l2_rcv_thread+0xb7/0x3a0 [bnx2fc] [ 1391.699240] bnx2fc_l2_rcv_thread+0x1af/0x3a0 [bnx2fc] [ 1391.699250] ? bnx2fc_ulp_init+0xc0/0xc0 [bnx2fc] [ 1391.699258] kthread+0x364/0x420 [ 1391.699263] ? _raw_spin_unlock_irq+0x24/0x50 [ 1391.699268] ? set_kthread_struct+0x100/0x100 [ 1391.699273] ret_from_fork+0x22/0x30 Restore the old get_cpu/put_cpu code with some modifications to reduce the size of the critical section. Link: https://lore.kernel.org/r/20220124145110.442335-1-jmeneghi@redhat.com Fixes: `d576a5e80c` ("bnx2fc: Improve stats update mechanism") Tested-by: Guangwu Zhang <guazhang@redhat.com> Acked-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: John Meneghini <jmeneghi@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-31 12:17:53 -05:00
Ajish Koshy	c26b85ea16	scsi: pm80xx: Fix double completion for SATA devices Current code handles completions for SATA devices in mpi_sata_completion() and mpi_sata_event(). However, at the time when any SATA event happens, for almost all the event types, the command is still in the target. It is therefore incorrect to complete the task in sata_event(). There are some events for which we get sata_completions, some need recovery procedure and others abort. All the tasks must be completed via sata_completion() path. Removed the task done related code from sata_events(). For tasks where we don't get completions, let top layer call abort() to abort the command post timeout. Link: https://lore.kernel.org/r/20220124082255.86223-1-Ajish.Koshy@microchip.com Acked-by: Jack Wang <jinpu.wang@ionos.com> Co-developed-by: Viswas G <Viswas.G@microchip.com> Signed-off-by: Viswas G <Viswas.G@microchip.com> Signed-off-by: Ajish Koshy <Ajish.Koshy@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-31 12:14:49 -05:00
Douglas Gilbert	0790797aca	scsi: scsi_debug: Add environmental reporting log subpage Log subpages are starting to appear in real devices (e.g. SSDs) so add support for one. Adopt approach where all "wild" sub-pages are themselves listed as long as there is at least one non-wild page or subpage for a given page number. Link: https://lore.kernel.org/r/20220109012853.301953-10-dgilbert@interlog.com Signed-off-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-25 00:27:23 -05:00
Douglas Gilbert	7109f3701a	scsi: scsi_debug: Add no_rwlock parameter By default, this driver places a read lock around all user data fetches and a write lock around all user data modifying operations (e.g. WRITE commands). These locks have "per store" granularity. Other drivers that have a similar function (e.g. null_blk) do not take this data integrity step and run significantly faster in some tests. In the common case of a (simulated) device to device copy (e.g. what dd and its variants do) there should be no need for locks around data accesses. So add the driver and sysfs parameter no_rwlock which is boolean and when set does what its name suggests. The default is false for backward comaptibility. Link: https://lore.kernel.org/r/20220109012853.301953-7-dgilbert@interlog.com Signed-off-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-25 00:25:06 -05:00
Douglas Gilbert	500d0d2480	scsi: scsi_debug: Divide power on reset UNIT ATTENTION To distinguish between resets sent by the SCSI mid-level error handling and newly introduced devices (LUs), this Unit Attention: power on, reset, or bus reset occurred [0x29,0x0] has been subdivided into that UA for the reset case and this new UA: power on occurred [0x29,0x1] for the new device (LU) case. This makes debug a little easier to follow when it is turned on (e.g. 'echo 0x1 > opts'). Bump driver version number. Link: https://lore.kernel.org/r/20220109012853.301953-6-dgilbert@interlog.com Signed-off-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-25 00:25:05 -05:00
Douglas Gilbert	b05d4e481e	scsi: scsi_debug: Refine sdebug_blk_mq_poll() Refine the sdebug_blk_mq_poll() function so it only takes the spinlock on the queue when it can see one or more requests with the in_use bitmap flag set. Link: https://lore.kernel.org/r/20220109012853.301953-5-dgilbert@interlog.com Signed-off-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-25 00:25:05 -05:00
Douglas Gilbert	7d5a129b86	scsi: scsi_debug: Use TASK SET FULL more When the internal in_use bit array in this driver is full returning SCSI_MLQUEUE_HOST_BUSY leads to the mid-level reissuing the request which is unhelpful. Previously TASK SET FULL status was only returned if ALL_TSF [0x400] is placed in the opts variable (at load time or via sysfs). Now ignore that setting and always return TASK SET FULL when in_use array is full. Also set DID_ABORT together with TASK SET FULL so the mid-level gives up immediately. Aside: the situations addressed by this patch lead to lockups and timeouts. They have only been detected when blk_poll() is used. That mechanism is relatively new in the SCSI subsystem suggesting the mid-level may need more work in that area. Link: https://lore.kernel.org/r/20220109012853.301953-4-dgilbert@interlog.com Signed-off-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-25 00:25:05 -05:00
Douglas Gilbert	d9d23a5a34	scsi: scsi_debug: Strengthen defer_t accesses Use READ_ONCE() and WRITE_ONCE() macros when accessing the sdebug_defer::defer_t value. Link: https://lore.kernel.org/r/20220109012853.301953-3-dgilbert@interlog.com Signed-off-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-25 00:25:05 -05:00
Douglas Gilbert	2aad3cd853	scsi: scsi_debug: Address races following module load When scsi_debug is loaded as a module with many (simulated) hosts, targets, and devices (LUs), modprobe can take a long time to return. Only a small amount of this time is spent in the scsi_debug_init(); the rest is other parts of the kernel reacting to to the appearance of new storage devices. As soon as scsi_debug_init() has completed the user space may call 'rmmod scsi_debug' and this was found to cause race problems as outlined here: https://bugzilla.kernel.org/show_bug.cgi?id=212337 To reliably generate this race a sysfs parameter called rm_all_hosts was added and the code was strengthened in this area. The main change was to make the count of scsi_debug hosts present an atomic. Then it was found that the handling of the existing add_host parameter needed the same strengthening. Further: 'echo -9999 > /sys/bus/pseudo/drivers/scsi_debug/add_host has the same effect as rm_all_hosts so rm_all_hosts was not needed. To inhibit a race between two invocations of writes to add_host, a mutex was added. Also address a possible race when rmmod is called but LUs are still being added. The logic to remove (all) hosts is rather crude: it works backwards down a linked lists of hosts. Any pending requests are terminated with DID_NO_CONNECT as are any new requests. In the case where not all hosts are being removed, the ones that remain may have lost requests as just outlined. The lowest numbered host (id) hosts will remain. Cc: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20220109012853.301953-2-dgilbert@interlog.com Signed-off-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-25 00:25:05 -05:00
Tong Zhang	4db09593af	scsi: myrs: Fix crash in error case In myrs_detect(), cs->disable_intr is NULL when privdata->hw_init() fails with non-zero. In this case, myrs_cleanup(cs) will call a NULL ptr and crash the kernel. [ 1.105606] myrs 0000:00:03.0: Unknown Initialization Error 5A [ 1.105872] myrs 0000:00:03.0: Failed to initialize Controller [ 1.106082] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 1.110774] Call Trace: [ 1.110950] myrs_cleanup+0xe4/0x150 [myrs] [ 1.111135] myrs_probe.cold+0x91/0x56a [myrs] [ 1.111302] ? DAC960_GEM_intr_handler+0x1f0/0x1f0 [myrs] [ 1.111500] local_pci_probe+0x48/0x90 Link: https://lore.kernel.org/r/20220123225717.1069538-1-ztong0001@gmail.com Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Tong Zhang <ztong0001@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-25 00:09:41 -05:00
Colin Ian King	efd7bb1d75	scsi: 53c700: Remove redundant assignment to pointer SCp Pointer SCp is being re-assigned the same value that it was initialized to a few lines earlier, the assignment is redundant and can be removed. Link: https://lore.kernel.org/r/20220123175530.110462-1-colin.i.king@gmail.com Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-25 00:08:18 -05:00
Kiwoong Kim	c99b9b2301	scsi: ufs: Treat link loss as fatal error This event is raised when link is lost as specified in UFSHCI spec and that means communication is not possible. Thus initializing UFS interface needs to be done. Make UFS driver considers Link Lost as fatal in the INT_FATAL_ERRORS mask. This will trigger a host reset whenever a link lost interrupt occurs. Link: https://lore.kernel.org/r/1642743475-54275-1-git-send-email-kwmad.kim@samsung.com Signed-off-by: Kiwoong Kim <kwmad.kim@samsung.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-25 00:07:24 -05:00
Kiwoong Kim	ad6c8a4264	scsi: ufs: Use generic error code in ufshcd_set_dev_pwr_mode() The return value of ufshcd_set_dev_pwr_mode() is passed to device PM core. However, the function currently returns a SCSI result which the PM core doesn't understand. This might lead to unexpected behaviors in userland; a platform reset was observed in Android. Use a generic error code for SSU failures. Link: https://lore.kernel.org/r/1642743182-54098-1-git-send-email-kwmad.kim@samsung.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Kiwoong Kim <kwmad.kim@samsung.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-25 00:03:45 -05:00
Nilesh Javali	0dd392d16d	scsi: qla2xxx: Update version to 10.02.07.300-k Link: https://lore.kernel.org/r/20220110050218.3958-18-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-24 23:57:32 -05:00
Joe Carnuccio	cfbafad7c6	scsi: qla2xxx: Check for firmware dump already collected While allocating firmware dump, check if dump is already collected and do not re-allocate the buffer. Link: https://lore.kernel.org/r/20220110050218.3958-17-njavali@marvell.com Cc: stable@vger.kernel.org Signed-off-by: Joe Carnuccio <joe.carnuccio@cavium.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-24 23:57:32 -05:00
Joe Carnuccio	0d6a536cb1	scsi: qla2xxx: Add devids and conditionals for 28xx This is an update to the original 28xx adapter enablement. Add a bunch of conditionals that are applicable for 28xx. Link: https://lore.kernel.org/r/20220110050218.3958-16-njavali@marvell.com Fixes: `ecc89f25e2` ("scsi: qla2xxx: Add Device ID for ISP28XX") Cc: stable@vger.kernel.org Signed-off-by: Joe Carnuccio <joe.carnuccio@cavium.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-24 23:57:32 -05:00
Saurav Kashyap	a60447e7d4	scsi: qla2xxx: Suppress a kernel complaint in qla_create_qpair() [ 12.323788] BUG: using smp_processor_id() in preemptible [00000000] code: systemd-udevd/1020 [ 12.332297] caller is qla2xxx_create_qpair+0x32a/0x5d0 [qla2xxx] [ 12.338417] CPU: 7 PID: 1020 Comm: systemd-udevd Tainted: G I --------- --- 5.14.0-29.el9.x86_64 #1 [ 12.348827] Hardware name: Dell Inc. PowerEdge R610/0F0XJ6, BIOS 6.6.0 05/22/2018 [ 12.356356] Call Trace: [ 12.358821] dump_stack_lvl+0x34/0x44 [ 12.362514] check_preemption_disabled+0xd9/0xe0 [ 12.367164] qla2xxx_create_qpair+0x32a/0x5d0 [qla2xxx] [ 12.372481] qla2x00_probe_one+0xa3a/0x1b80 [qla2xxx] [ 12.377617] ? _raw_spin_lock_irqsave+0x19/0x40 [ 12.384284] local_pci_probe+0x42/0x80 [ 12.390162] ? pci_match_device+0xd7/0x110 [ 12.396366] pci_device_probe+0xfd/0x1b0 [ 12.402372] really_probe+0x1e7/0x3e0 [ 12.408114] __driver_probe_device+0xfe/0x180 [ 12.414544] driver_probe_device+0x1e/0x90 [ 12.420685] __driver_attach+0xc0/0x1c0 [ 12.426536] ? __device_attach_driver+0xe0/0xe0 [ 12.433061] ? __device_attach_driver+0xe0/0xe0 [ 12.439538] bus_for_each_dev+0x78/0xc0 [ 12.445294] bus_add_driver+0x12b/0x1e0 [ 12.451021] driver_register+0x8f/0xe0 [ 12.456631] ? 0xffffffffc07bc000 [ 12.461773] qla2x00_module_init+0x1be/0x229 [qla2xxx] [ 12.468776] do_one_initcall+0x44/0x200 [ 12.474401] ? load_module+0xad3/0xba0 [ 12.479908] ? kmem_cache_alloc_trace+0x45/0x410 [ 12.486268] do_init_module+0x5c/0x280 [ 12.491730] __do_sys_init_module+0x12e/0x1b0 [ 12.497785] do_syscall_64+0x3b/0x90 [ 12.503029] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 12.509764] RIP: 0033:0x7f554f73ab2e Link: https://lore.kernel.org/r/20220110050218.3958-15-njavali@marvell.com Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-24 23:57:32 -05:00
Joe Carnuccio	4c103a802c	scsi: qla2xxx: Fix T10 PI tag escape and IP guard options for 28XX adapters 28XX adapters are capable of detecting both T10 PI tag escape values as well as IP guard. This was missed due to the adapter type missed in the corresponding macros. Fix this by adding support for 28xx in those macros. Link: https://lore.kernel.org/r/20220110050218.3958-14-njavali@marvell.com Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Tested-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Joe Carnuccio <joe.carnuccio@cavium.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-24 23:57:31 -05:00
Quinn Tran	73825fd7a3	scsi: qla2xxx: edif: Fix clang warning Silence compile warning due to unaligned memory access. qla_edif.c:713:45: warning: taking address of packed member 'u' of class or structure 'auth_complete_cmd' may result in an unaligned pointer value [-Waddress-of-packed-member] fcport = qla2x00_find_fcport_by_pid(vha, &appplogiok.u.d_id); Link: https://lore.kernel.org/r/20220110050218.3958-13-njavali@marvell.com Cc: stable@vger.kernel.org Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-24 23:57:31 -05:00
Nilesh Javali	14cb838d24	scsi: qla2xxx: Fix warning for missing error code Fix smatch-reported warning message: drivers/scsi/qla2xxx/qla_target.c:3324 qlt_xmit_response() warn: missing error code 'res' Link: https://lore.kernel.org/r/20220110050218.3958-12-njavali@marvell.com Fixes: `4a8f71014b` ("scsi: qla2xxx: Fix unmap of already freed sgl") Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-24 23:57:31 -05:00
Arun Easi	8ad4be3d15	scsi: qla2xxx: Fix device reconnect in loop topology A device logout in loop topology initiates a device connection teardown which loses the FW device handle. In loop topo, the device handle is not regrabbed leading to device login failures and eventually to loss of the device. Fix this by taking the main login path that does it. Link: https://lore.kernel.org/r/20220110050218.3958-11-njavali@marvell.com Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-24 23:57:31 -05:00
Shreyas Deodhar	65120de26a	scsi: qla2xxx: Add ql2xnvme_queues module param to configure number of NVMe queues Add ql2xnvme_queues module parameter to configure number of NVMe queues Usage: Number of NVMe Queues that can be configured. Final value will be min(ql2xnvme_queues, num_cpus, num_chip_queues), 1 - Minimum number of queues supported 8 - Default value 128 - Maximum number of queues supported Link: https://lore.kernel.org/r/20220110050218.3958-10-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Shreyas Deodhar <sdeodhar@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-24 23:57:31 -05:00
Bikash Hazarika	1cfbbacbee	scsi: qla2xxx: Fix wrong FDMI data for 64G adapter Corrected transmission speed mask values for FC. Supported Speed: 16 32 20 Gb/s ===> Should be 64 instead of 20 Supported Speed: 16G 32G 48G ===> Should be 64G instead of 48G Link: https://lore.kernel.org/r/20220110050218.3958-9-njavali@marvell.com Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Bikash Hazarika <bhazarika@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-24 23:57:31 -05:00
Quinn Tran	355f5ffe84	scsi: qla2xxx: Add retry for exec firmware Per FW request, Exec FW can fail due to temporary error resulting in driver not attaching to the adapter. Add retry of this command up to 4 retries. Link: https://lore.kernel.org/r/20220110050218.3958-8-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-24 23:57:30 -05:00
Quinn Tran	afd438ff87	scsi: qla2xxx: Fix scheduling while atomic The driver makes a call into midlayer (fc_remote_port_delete) which can put the thread to sleep. The thread that originates the call is in interrupt context. The combination of the two trigger a crash. Schedule the call in non-interrupt context where it is more safe. kernel: BUG: scheduling while atomic: swapper/7/0/0x00010000 kernel: Call Trace: kernel: <IRQ> kernel: dump_stack+0x66/0x81 kernel: __schedule_bug.cold.90+0x5/0x1d kernel: __schedule+0x7af/0x960 kernel: schedule+0x28/0x80 kernel: schedule_timeout+0x26d/0x3b0 kernel: wait_for_completion+0xb4/0x140 kernel: ? wake_up_q+0x70/0x70 kernel: __wait_rcu_gp+0x12c/0x160 kernel: ? sdev_evt_alloc+0xc0/0x180 [scsi_mod] kernel: synchronize_sched+0x6c/0x80 kernel: ? call_rcu_bh+0x20/0x20 kernel: ? __bpf_trace_rcu_invoke_callback+0x10/0x10 kernel: sdev_evt_alloc+0xfd/0x180 [scsi_mod] kernel: starget_for_each_device+0x85/0xb0 [scsi_mod] kernel: ? scsi_init_io+0x360/0x3d0 [scsi_mod] kernel: scsi_init_io+0x388/0x3d0 [scsi_mod] kernel: device_for_each_child+0x54/0x90 kernel: fc_remote_port_delete+0x70/0xe0 [scsi_transport_fc] kernel: qla2x00_schedule_rport_del+0x62/0xf0 [qla2xxx] kernel: qla2x00_mark_device_lost+0x9c/0xd0 [qla2xxx] kernel: qla24xx_handle_plogi_done_event+0x55f/0x570 [qla2xxx] kernel: qla2x00_async_login_sp_done+0xd2/0x100 [qla2xxx] kernel: qla24xx_logio_entry+0x13a/0x3c0 [qla2xxx] kernel: qla24xx_process_response_queue+0x306/0x400 [qla2xxx] kernel: qla24xx_msix_rsp_q+0x3f/0xb0 [qla2xxx] kernel: __handle_irq_event_percpu+0x40/0x180 kernel: handle_irq_event_percpu+0x30/0x80 kernel: handle_irq_event+0x36/0x60 Link: https://lore.kernel.org/r/20220110050218.3958-7-njavali@marvell.com Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-24 23:57:30 -05:00
Quinn Tran	e35920ab78	scsi: qla2xxx: Fix premature hw access after PCI error After a recoverable PCI error has been detected and recovered, qla driver needs to check to see if the error condition still persist and/or wait for the OS to give the resume signal. Sep 8 22:26:03 localhost kernel: WARNING: CPU: 9 PID: 124606 at qla_tmpl.c:440 qla27xx_fwdt_entry_t266+0x55/0x60 [qla2xxx] Sep 8 22:26:03 localhost kernel: RIP: 0010:qla27xx_fwdt_entry_t266+0x55/0x60 [qla2xxx] Sep 8 22:26:03 localhost kernel: Call Trace: Sep 8 22:26:03 localhost kernel: ? qla27xx_walk_template+0xb1/0x1b0 [qla2xxx] Sep 8 22:26:03 localhost kernel: ? qla27xx_execute_fwdt_template+0x12a/0x160 [qla2xxx] Sep 8 22:26:03 localhost kernel: ? qla27xx_fwdump+0xa0/0x1c0 [qla2xxx] Sep 8 22:26:03 localhost kernel: ? qla2xxx_pci_mmio_enabled+0xfb/0x120 [qla2xxx] Sep 8 22:26:03 localhost kernel: ? report_mmio_enabled+0x44/0x80 Sep 8 22:26:03 localhost kernel: ? report_slot_reset+0x80/0x80 Sep 8 22:26:03 localhost kernel: ? pci_walk_bus+0x70/0x90 Sep 8 22:26:03 localhost kernel: ? aer_dev_correctable_show+0xc0/0xc0 Sep 8 22:26:03 localhost kernel: ? pcie_do_recovery+0x1bb/0x240 Sep 8 22:26:03 localhost kernel: ? aer_recover_work_func+0xaa/0xd0 Sep 8 22:26:03 localhost kernel: ? process_one_work+0x1a7/0x360 .. Sep 8 22:26:03 localhost kernel: qla2xxx [0000:42:00.2]-8041:22: detected PCI disconnect. Sep 8 22:26:03 localhost kernel: qla2xxx [0000:42:00.2]-107ff:22: qla27xx_fwdt_entry_t262: dump ram MB failed. Area 5h start 198013h end 198013h Sep 8 22:26:03 localhost kernel: qla2xxx [0000:42:00.2]-107ff:22: Unable to capture FW dump Sep 8 22:26:03 localhost kernel: qla2xxx [0000:42:00.2]-1015:22: cmd=0x0, waited 5221 msecs Sep 8 22:26:03 localhost kernel: qla2xxx [0000:42:00.2]-680d:22: mmio enabled returning. Sep 8 22:26:03 localhost kernel: qla2xxx [0000:42:00.2]-d04c:22: MBX Command timeout for cmd 0, iocontrol=ffffffff jiffies=10140f2e5 mb[0-3]=[0xffff 0xffff 0xffff 0xffff] Link: https://lore.kernel.org/r/20220110050218.3958-6-njavali@marvell.com Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-24 23:57:30 -05:00
Quinn Tran	64f24af75b	scsi: qla2xxx: Fix warning message due to adisc being flushed Fix warning message due to adisc being flushed. Linux kernel triggered a warning message where a different error code type is not matching up with the expected type. Add additional translation of one error code type to another. WARNING: CPU: 2 PID: `1131623` at drivers/scsi/qla2xxx/qla_init.c:498 qla2x00_async_adisc_sp_done+0x294/0x2b0 [qla2xxx] CPU: 2 PID: `1131623` Comm: drmgr Not tainted 5.13.0-rc1-autotest #1 .. GPR28: c000000aaa9c8890 c0080000079ab678 c00000140a104800 c00000002bd19000 NIP [c00800000790857c] qla2x00_async_adisc_sp_done+0x294/0x2b0 [qla2xxx] LR [c008000007908578] qla2x00_async_adisc_sp_done+0x290/0x2b0 [qla2xxx] Call Trace: [c00000001cdc3620] [c008000007908578] qla2x00_async_adisc_sp_done+0x290/0x2b0 [qla2xxx] (unreliable) [c00000001cdc3710] [c0080000078f3080] __qla2x00_abort_all_cmds+0x1b8/0x580 [qla2xxx] [c00000001cdc3840] [c0080000078f589c] qla2x00_abort_all_cmds+0x34/0xd0 [qla2xxx] [c00000001cdc3880] [c0080000079153d8] qla2x00_abort_isp_cleanup+0x3f0/0x570 [qla2xxx] [c00000001cdc3920] [c0080000078fb7e8] qla2x00_remove_one+0x3d0/0x480 [qla2xxx] [c00000001cdc39b0] [c00000000071c274] pci_device_remove+0x64/0x120 [c00000001cdc39f0] [c0000000007fb818] device_release_driver_internal+0x168/0x2a0 [c00000001cdc3a30] [c00000000070e304] pci_stop_bus_device+0xb4/0x100 [c00000001cdc3a70] [c00000000070e4f0] pci_stop_and_remove_bus_device+0x20/0x40 [c00000001cdc3aa0] [c000000000073940] pci_hp_remove_devices+0x90/0x130 [c00000001cdc3b30] [c0080000070704d0] disable_slot+0x38/0x90 [rpaphp] [ c00000001cdc3b60] [c00000000073eb4c] power_write_file+0xcc/0x180 [c00000001cdc3be0] [c0000000007354bc] pci_slot_attr_store+0x3c/0x60 [c00000001cdc3c00] [c00000000055f820] sysfs_kf_write+0x60/0x80 [c00000001cdc3c20] [c00000000055df10] kernfs_fop_write_iter+0x1a0/0x290 [c00000001cdc3c70] [c000000000447c4c] new_sync_write+0x14c/0x1d0 [c00000001cdc3d10] [c00000000044b134] vfs_write+0x224/0x330 [c00000001cdc3d60] [c00000000044b3f4] ksys_write+0x74/0x130 [c00000001cdc3db0] [c00000000002df70] system_call_exception+0x150/0x2d0 [c00000001cdc3e10] [c00000000000d45c] system_call_common+0xec/0x278 Link: https://lore.kernel.org/r/20220110050218.3958-5-njavali@marvell.com Cc: stable@vger.kernel.org Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-24 23:57:30 -05:00
Quinn Tran	725d3a0d31	scsi: qla2xxx: Fix stuck session in gpdb Fix stuck sessions in get port database. When a thread is in the process of re-establishing a session, a flag is set to prevent multiple threads / triggers from doing the same task. This flag was left on, where any attempt to relogin was locked out. Clear this flag, if the attempt has failed. Link: https://lore.kernel.org/r/20220110050218.3958-4-njavali@marvell.com Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-24 23:57:30 -05:00
Saurav Kashyap	31e6cdbe0e	scsi: qla2xxx: Implement ref count for SRB The timeout handler and the done function are racing. When qla2x00_async_iocb_timeout() starts to run it can be preempted by the normal response path (via the firmware?). qla24xx_async_gpsc_sp_done() releases the SRB unconditionally. When scheduling back to qla2x00_async_iocb_timeout() qla24xx_async_abort_cmd() will access an freed sp->qpair pointer: qla2xxx [0000:83:00.0]-2871:0: Async-gpsc timeout - hdl=63d portid=234500 50:06:0e:80:08:77:b6:21. qla2xxx [0000:83:00.0]-2853:0: Async done-gpsc res 0, WWPN 50:06:0e:80:08:77:b6:21 qla2xxx [0000:83:00.0]-2854:0: Async-gpsc OUT WWPN 20:45:00:27:f8:75:33:00 speeds=2c00 speed=0400. qla2xxx [0000:83:00.0]-28d8:0: qla24xx_handle_gpsc_event 50:06:0e:80:08:77:b6:21 DS 7 LS 6 rc 0 login 1\|1 rscn 1\|0 lid 5 BUG: unable to handle kernel NULL pointer dereference at 0000000000000004 IP: qla24xx_async_abort_cmd+0x1b/0x1c0 [qla2xxx] Obvious solution to this is to introduce a reference counter. One reference is taken for the normal code path (the 'good' case) and one for the timeout path. As we always race between the normal good case and the timeout/abort handler we need to serialize it. Also we cannot assume any order between the handlers. Since this is slow path we can use proper synchronization via locks. When we are able to cancel a timer (del_timer returns 1) we know there can't be any error handling in progress because the timeout handler hasn't expired yet, thus we can safely decrement the refcounter by one. If we are not able to cancel the timer, we know an abort handler is running. We have to make sure we call sp->done() in the abort handlers before calling kref_put(). Link: https://lore.kernel.org/r/20220110050218.3958-3-njavali@marvell.com Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Co-developed-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-24 23:57:30 -05:00
Daniel Wagner	d4523bd6fd	scsi: qla2xxx: Refactor asynchronous command initialization Move common open-coded asynchronous command initializing code such as setting up the timer and the done callback into one function. This is a preparation step and allows us later on to change the low level error flow handling at a central place. Link: https://lore.kernel.org/r/20220110050218.3958-2-njavali@marvell.com Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-24 23:57:29 -05:00
Christophe JAILLET	012d98dae4	scsi: bfa: Remove useless DMA-32 fallback configuration As stated in [1], dma_set_mask() with a 64-bit mask never fails if dev->dma_mask is non-NULL. So, if it fails, the 32-bit case will also fail for the same reason. Simplify code and remove some dead code accordingly. [1]: https://lore.kernel.org/linux-kernel/YL3vSPK5DXTNvgdx@infradead.org/#t Link: https://lore.kernel.org/r/5663cef9b54004fa56cca7ce65f51eadfc3ecddb.1642238127.git.christophe.jaillet@wanadoo.fr Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-24 23:30:29 -05:00
Christophe JAILLET	8001fa240f	scsi: hisi_sas: Remove useless DMA-32 fallback configuration As stated in [1], dma_set_mask() with a 64-bit mask never fails if dev->dma_mask is non-NULL. So, if it fails, the 32-bit case will also fail for the same reason. Simplify code and remove some dead code accordingly. [1]: https://lore.kernel.org/linux-kernel/YL3vSPK5DXTNvgdx@infradead.org/#t Link: https://lore.kernel.org/r/1bf2d3660178b0e6f172e5208bc0bd68d31d9268.1642237482.git.christophe.jaillet@wanadoo.fr Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-24 23:30:29 -05:00
Christophe JAILLET	fb8d5ea8fd	scsi: 3w-sas: Remove useless DMA-32 fallback configuration As stated in [1], dma_set_mask() with a 64-bit mask never fails if dev->dma_mask is non-NULL. So, if it fails, the 32-bit case will also fail for the same reason. Simplify code and remove some dead code accordingly. [1]: https://lore.kernel.org/linux-kernel/YL3vSPK5DXTNvgdx@infradead.org/#t Link: https://lore.kernel.org/r/dbbe8671ca760972d80f8d35f3170b4609bee368.1642236763.git.christophe.jaillet@wanadoo.fr Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-24 23:30:28 -05:00
John Meneghini	847f9ea4c5	scsi: bnx2fc: Flush destroy_work queue before calling bnx2fc_interface_put() The bnx2fc_destroy() functions are removing the interface before calling destroy_work. This results multiple WARNings from sysfs_remove_group() as the controller rport device attributes are removed too early. Replace the fcoe_port's destroy_work queue. It's not needed. The problem is easily reproducible with the following steps. Example: $ dmesg -w & $ systemctl enable --now fcoe $ fipvlan -s -c ens2f1 $ fcoeadm -d ens2f1.802 [ 583.464488] host2: libfc: Link down on port (7500a1) [ 583.472651] bnx2fc: 7500a1 - rport not created Yet!! [ 583.490468] ------------[ cut here ]------------ [ 583.538725] sysfs group 'power' not found for kobject 'rport-2:0-0' [ 583.568814] WARNING: CPU: 3 PID: 192 at fs/sysfs/group.c:279 sysfs_remove_group+0x6f/0x80 [ 583.607130] Modules linked in: dm_service_time 8021q garp mrp stp llc bnx2fc cnic uio rpcsec_gss_krb5 auth_rpcgss nfsv4 ... [ 583.942994] CPU: 3 PID: 192 Comm: kworker/3:2 Kdump: loaded Not tainted 5.14.0-39.el9.x86_64 #1 [ 583.984105] Hardware name: HP ProLiant DL120 G7, BIOS J01 07/01/2013 [ 584.016535] Workqueue: fc_wq_2 fc_rport_final_delete [scsi_transport_fc] [ 584.050691] RIP: 0010:sysfs_remove_group+0x6f/0x80 [ 584.074725] Code: ff 5b 48 89 ef 5d 41 5c e9 ee c0 ff ff 48 89 ef e8 f6 b8 ff ff eb d1 49 8b 14 24 48 8b 33 48 c7 c7 ... [ 584.162586] RSP: 0018:ffffb567c15afdc0 EFLAGS: 00010282 [ 584.188225] RAX: 0000000000000000 RBX: ffffffff8eec4220 RCX: 0000000000000000 [ 584.221053] RDX: ffff8c1586ce84c0 RSI: ffff8c1586cd7cc0 RDI: ffff8c1586cd7cc0 [ 584.255089] RBP: 0000000000000000 R08: 0000000000000000 R09: ffffb567c15afc00 [ 584.287954] R10: ffffb567c15afbf8 R11: ffffffff8fbe7f28 R12: ffff8c1486326400 [ 584.322356] R13: ffff8c1486326480 R14: ffff8c1483a4a000 R15: 0000000000000004 [ 584.355379] FS: 0000000000000000(0000) GS:ffff8c1586cc0000(0000) knlGS:0000000000000000 [ 584.394419] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 584.421123] CR2: 00007fe95a6f7840 CR3: 0000000107674002 CR4: 00000000000606e0 [ 584.454888] Call Trace: [ 584.466108] device_del+0xb2/0x3e0 [ 584.481701] device_unregister+0x13/0x60 [ 584.501306] bsg_unregister_queue+0x5b/0x80 [ 584.522029] bsg_remove_queue+0x1c/0x40 [ 584.541884] fc_rport_final_delete+0xf3/0x1d0 [scsi_transport_fc] [ 584.573823] process_one_work+0x1e3/0x3b0 [ 584.592396] worker_thread+0x50/0x3b0 [ 584.609256] ? rescuer_thread+0x370/0x370 [ 584.628877] kthread+0x149/0x170 [ 584.643673] ? set_kthread_struct+0x40/0x40 [ 584.662909] ret_from_fork+0x22/0x30 [ 584.680002] ---[ end trace 53575ecefa942ece ]--- Link: https://lore.kernel.org/r/20220115040044.1013475-1-jmeneghi@redhat.com Fixes: `0cbf32e168` ("[SCSI] bnx2fc: Avoid calling bnx2fc_if_destroy with unnecessary locks") Tested-by: Guangwu Zhang <guazhang@redhat.com> Co-developed-by: Maurizio Lombardi <mlombard@redhat.com> Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Signed-off-by: John Meneghini <jmeneghi@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-24 23:30:28 -05:00
John Garry	62afb379a0	scsi: pm8001: Fix bogus FW crash for maxcpus=1 According to the comment in check_fw_ready() we should not check the IOP1_READY field in register SCRATCH_PAD_1 for 8008 or 8009 controllers. However we check this very field in process_oq() for processing the highest index interrupt vector. The highest interrupt vector is checked as the FW is programmed to signal fatal errors through this irq. Change that function to not check IOP1_READY for those mentioned controllers, but do check ILA_READY in both cases. The reason I assume that this was not hit earlier was because we always allocated 64 MSI(X), and just did not pass the vector index check in process_oq(), i.e. the handler never ran for vector index 63. Link: https://lore.kernel.org/r/1642508105-95432-1-git-send-email-john.garry@huawei.com Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-24 23:30:27 -05:00
Saurav Kashyap	64fd4af627	scsi: qedf: Change context reset messages to ratelimited If FCoE is not configured, libfc/libfcoe keeps on retrying FLOGI and after 3 retries driver does a context reset and tries fipvlan again. This leads to context reset message flooding the logs. Hence ratelimit the message to prevent flooding the logs. Link: https://lore.kernel.org/r/20220117135311.6256-4-njavali@marvell.com Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-24 23:30:26 -05:00
Saurav Kashyap	5239ab63f1	scsi: qedf: Fix refcount issue when LOGO is received during TMF Hung task call trace was seen during LOGO processing. [ 974.309060] [0000:00:00.0]:[qedf_eh_device_reset:868]: 1:0:2:0: LUN RESET Issued... [ 974.309065] [0000:00:00.0]:[qedf_initiate_tmf:2422]: tm_flags 0x10 sc_cmd 00000000c16b930f op = 0x2a target_id = 0x2 lun=0 [ 974.309178] [0000:00:00.0]:[qedf_initiate_tmf:2431]: portid=016900 tm_flags =LUN RESET [ 974.309222] [0000:00:00.0]:[qedf_initiate_tmf:2438]: orig io_req = 00000000ec78df8f xid = 0x180 ref_cnt = 1. [ 974.309625] host1: rport 016900: Received LOGO request while in state Ready [ 974.309627] host1: rport 016900: Delete port [ 974.309642] host1: rport 016900: work event 3 [ 974.309644] host1: rport 016900: lld callback ev 3 [ 974.313243] [0000:61:00.2]:[qedf_execute_tmf:2383]:1: fcport is uploading, not executing flush. [ 974.313295] [0000:61:00.2]:[qedf_execute_tmf:2400]:1: task mgmt command success... [ 984.031088] INFO: task jbd2/dm-15-8:7645 blocked for more than 120 seconds. [ 984.031136] Not tainted 4.18.0-305.el8.x86_64 #1 [ 984.031166] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 984.031209] jbd2/dm-15-8 D 0 7645 2 0x80004080 [ 984.031212] Call Trace: [ 984.031222] __schedule+0x2c4/0x700 [ 984.031230] ? unfreeze_partials.isra.83+0x16e/0x1a0 [ 984.031233] ? bit_wait_timeout+0x90/0x90 [ 984.031235] schedule+0x38/0xa0 [ 984.031238] io_schedule+0x12/0x40 [ 984.031240] bit_wait_io+0xd/0x50 [ 984.031243] __wait_on_bit+0x6c/0x80 [ 984.031248] ? free_buffer_head+0x21/0x50 [ 984.031251] out_of_line_wait_on_bit+0x91/0xb0 [ 984.031257] ? init_wait_var_entry+0x50/0x50 [ 984.031268] jbd2_journal_commit_transaction+0x112e/0x19f0 [jbd2] [ 984.031280] kjournald2+0xbd/0x270 [jbd2] [ 984.031284] ? finish_wait+0x80/0x80 [ 984.031291] ? commit_timeout+0x10/0x10 [jbd2] [ 984.031294] kthread+0x116/0x130 [ 984.031300] ? kthread_flush_work_fn+0x10/0x10 [ 984.031305] ret_from_fork+0x1f/0x40 There was a ref count issue when LOGO is received during TMF. This leads to one of the I/Os hanging with the driver. Fix the ref count. Link: https://lore.kernel.org/r/20220117135311.6256-3-njavali@marvell.com Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-24 23:30:26 -05:00
Saurav Kashyap	b70a99fd13	scsi: qedf: Add stag_work to all the vports Call trace seen when creating NPIV ports, only 32 out of 64 show online. stag work was not initialized for vport, hence initialize the stag work. WARNING: CPU: 8 PID: 645 at kernel/workqueue.c:1635 __queue_delayed_work+0x68/0x80 CPU: 8 PID: 645 Comm: kworker/8:1 Kdump: loaded Tainted: G IOE --------- -- 4.18.0-348.el8.x86_64 #1 Hardware name: Dell Inc. PowerEdge MX740c/0177V9, BIOS 2.12.2 07/09/2021 Workqueue: events fc_lport_timeout [libfc] RIP: 0010:__queue_delayed_work+0x68/0x80 Code: 89 b2 88 00 00 00 44 89 82 90 00 00 00 48 01 c8 48 89 42 50 41 81 f8 00 20 00 00 75 1d e9 60 24 07 00 44 89 c7 e9 98 f6 ff ff <0f> 0b eb c5 0f 0b eb a1 0f 0b eb a7 0f 0b eb ac 44 89 c6 e9 40 23 RSP: 0018:ffffae514bc3be40 EFLAGS: 00010006 RAX: ffff8d25d6143750 RBX: 0000000000000202 RCX: 0000000000000002 RDX: ffff8d2e31383748 RSI: ffff8d25c000d600 RDI: ffff8d2e31383788 RBP: ffff8d2e31380de0 R08: 0000000000002000 R09: ffff8d2e31383750 R10: ffffffffc0c957e0 R11: ffff8d2624800000 R12: ffff8d2e31380a58 R13: ffff8d2d915eb000 R14: ffff8d25c499b5c0 R15: ffff8d2e31380e18 FS: 0000000000000000(0000) GS:ffff8d2d1fb00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055fd0484b8b8 CR3: 00000008ffc10006 CR4: 00000000007706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: queue_delayed_work_on+0x36/0x40 qedf_elsct_send+0x57/0x60 [qedf] fc_lport_enter_flogi+0x90/0xc0 [libfc] fc_lport_timeout+0xb7/0x140 [libfc] process_one_work+0x1a7/0x360 ? create_worker+0x1a0/0x1a0 worker_thread+0x30/0x390 ? create_worker+0x1a0/0x1a0 kthread+0x116/0x130 ? kthread_flush_work_fn+0x10/0x10 ret_from_fork+0x35/0x40 ---[ end trace 008f00f722f2c2ff ]-- Initialize stag work for all the vports. Link: https://lore.kernel.org/r/20220117135311.6256-2-njavali@marvell.com Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-24 23:30:25 -05:00
Xiaoke Wang	a65b32748f	scsi: ufs: ufshcd-pltfrm: Check the return value of devm_kstrdup() devm_kstrdup() returns pointer to allocated string on success, NULL on failure. So it is better to check the return value of it. Link: https://lore.kernel.org/r/tencent_4257E15D4A94FF9020DDCC4BB9B21C041408@qq.com Reviewed-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Xiaoke Wang <xkernel.wang@foxmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-24 23:30:25 -05:00
Yang Yingliang	61263b3a11	scsi: elx: efct: Don't use GFP_KERNEL under spin lock GFP_KERNEL/GFP_DMA can't be used under a spin lock. According the comment, els_ios_lock is used to protect els ios list so we can move down the spin lock to avoid using this flag under the lock. Link: https://lore.kernel.org/r/20220111012441.3232527-1-yangyingliang@huawei.com Fixes: `8f406ef728` ("scsi: elx: libefc: Extended link Service I/O handling") Reported-by: Hulk Robot <hulkci@huawei.com> Reviewed-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-24 23:30:23 -05:00

1 2 3 4 5 ...

22624 Commits