scsi: mpt3sas: Fix deadlock while cancelling the running firmware event

Do not cancel the currently running firmware event work if the event type is
different from MPT3SAS_REMOVE_UNRESPONDING_DEVICES. Otherwise, a deadlock
can be observed while cancelling the current firmware event work if a hard
reset operation is called as part of processing the current event.

Link: https://lore.kernel.org/r/20210518051625.1596742-2-suganath-prabu.subramani@broadcom.com
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Suganath Prabu S 2021-05-18 10:46:23 +05:30 committed by Martin K. Petersen
parent ea2f0f7753
commit e2fac6c44a

@@ -3697,6 +3697,28 @@ _scsih_fw_event_cleanup_queue(struct MPT3SAS_ADAPTER *ioc)
ioc->fw_events_cleanup = 1;
while ((fw_event = dequeue_next_fw_event(ioc)) ||
(fw_event = ioc->current_event)) {
/*
* Don't call cancel_work_sync() for current_event
* other than MPT3SAS_REMOVE_UNRESPONDING_DEVICES;
* otherwise we may observe a deadlock if the current
* hard reset was issued as part of processing the current_event.
*
* The original logic of cleaning the current_event was added
* to handle back-to-back host resets issued by the user,
* i.e. during back-to-back host resets, the driver used to process
* two instances of the MPT3SAS_REMOVE_UNRESPONDING_DEVICES
* event back to back, and this made the drives unregister
* the devices from SML.
*/
if (fw_event == ioc->current_event &&
ioc->current_event->event !=
MPT3SAS_REMOVE_UNRESPONDING_DEVICES) {
ioc->current_event = NULL;
continue;
}
/*
* Wait on the fw_event to complete. If this returns 1, then
* the event was never executed, and we need a put for the