scsi: target: core: Fix race during ACL removal

Under huge load there is a possibility of race condition in updating
se_dev_entry object in ACL removal procedure:

 NIP [c0080000154093d0] transport_lookup_cmd_lun+0x1f8/0x3d0 [target_core_mod]
 LR [c00800001542ab34] target_submit_cmd_map_sgls+0x11c/0x300 [target_core_mod]
 Call Trace:
   target_submit_cmd_map_sgls+0x11c/0x300 [target_core_mod]
   target_submit_cmd+0x44/0x60 [target_core_mod]
   tcm_qla2xxx_handle_cmd+0x88/0xe0 [tcm_qla2xxx]
   qlt_do_work+0x2e4/0x3d0 [qla2xxx]
   process_one_work+0x298/0x5c

Despite usage of RCU primitives with deve->se_lun pointer, it has not
become dereference-safe because deve->se_lun is updated and not
synchronized with a reader. That change might be in a release function
called by synchronize_rcu(). But, in fact, there is no point in setting
that pointer to NULL for deleting deve. All access to deve->se_lun is
already under rcu_read_lock. And either deve->se_lun is always valid or
deve is not valid itself and will not be found in the list_for_*.  The same
applicable for deve->se_lun_acl too.  So a better solution is to remove
that NULLing.

Link: https://lore.kernel.org/r/20220727214125.19647-2-d.bogdanov@yadro.com
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This commit is contained in:
Dmitry Bogdanov 2022-07-28 00:41:24 +03:00 committed by Martin K. Petersen
parent 00511d2abf
commit dd0a66ada0

View File

@ -434,9 +434,6 @@ void core_disable_device_list_for_node(
kref_put(&orig->pr_kref, target_pr_kref_release);
wait_for_completion(&orig->pr_comp);
rcu_assign_pointer(orig->se_lun, NULL);
rcu_assign_pointer(orig->se_lun_acl, NULL);
kfree_rcu(orig, rcu_head);
core_scsi3_free_pr_reg_from_nacl(dev, nacl);