linux/drivers/scsi
Bill Kuzeja a5dd506e15 scsi: qla2xxx: Fix scsi scan hang triggered if adapter fails during init
A system can hit hung task timeouts if a QLogic board fails during
initialization (if the board breaks again or fails the init). The hang
occurs in the scsi scan.

In a nutshell, since commit beb9e315e6 ("qla2xxx: Prevent removal and
board_disable race"), it is possible for ha (base_vha->hw) to be freed
early by a call to qla2x00_remove_one when pdev->enable_cnt equals zero:

       if (!atomic_read(&pdev->enable_cnt)) {
               scsi_host_put(base_vha->host);
               kfree(ha);
               pci_set_drvdata(pdev, NULL);
               return;
       }

Almost always, the scsi_host_put above frees the vha structure
(attached to the end of the Scsi_Host we're putting) since it's the last
put, and life is good.  However, if we are entering this routine because
the adapter has broken sometime during initialization AND a scsi scan is
already in progress (and has done its own scsi_host_get), vha will not
be freed. What's worse, the scsi scan will access the freed ha structure
through qla2xxx_scan_finished:

        if (time > vha->hw->loop_reset_delay * HZ)
                return 1;
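
The lifetime mismatch described above can be sketched as a small userspace
model. This is only an illustration, not driver code: model_host, model_ha,
model_host_get, model_host_put and model_remove_one are hypothetical
stand-ins, and the "released" flag replaces the real freeing of the host so
that the state can be inspected afterwards.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the separately allocated ha (base_vha->hw). */
struct model_ha {
    unsigned long loop_reset_delay;
};

/* Hypothetical stand-in for a Scsi_Host with vha attached at its end. */
struct model_host {
    int refcount;          /* references held on the host */
    int released;          /* set once the last reference is dropped */
    struct model_ha *ha;   /* pointer kept inside the vha part */
};

static void model_host_get(struct model_host *h)
{
    assert(h != NULL);
    h->refcount++;
}

static void model_host_put(struct model_host *h)
{
    assert(h != NULL && h->refcount > 0);
    if (--h->refcount == 0)
        h->released = 1;   /* the real code would free host and vha here */
}

/* Models the early-exit path: one put, then ha is freed unconditionally. */
static void model_remove_one(struct model_host *h)
{
    struct model_ha *ha = h->ha;

    model_host_put(h);
    free(ha);
    h->ha = NULL;          /* model only: the real pointer is left dangling */
}
```

If a scan has taken its own reference with model_host_get before
model_remove_one runs, the host (and so vha) survives the put while ha is
already gone, which is the state the scan then walks into.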

The scsi scan keeps checking to see if a scan is complete by calling
qla2xxx_scan_finished. There is a timeout value that limits the length
of time a scan can take (hw->loop_reset_delay, usually set to 5
seconds), but this definition is in the data structure (hw) that can get
freed early.

This can yield unpredictable results, the worst of which is that the
scsi scan can hang indefinitely. This happens when the freed structure
gets reused and loop_reset_delay gets overwritten with garbage, which
the scan obliviously uses as its timeout value.
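
To make the effect of a corrupted timeout concrete, here is a minimal
userspace sketch of the timeout leg of the check only. MODEL_HZ and
scan_timed_out are hypothetical names, the tick rate of 1000 is an
assumption, and the garbage value is passed in as a plain argument rather
than read from freed memory.

```c
#include <assert.h>
#include <limits.h>

/* Assumed tick rate for this model; the kernel's HZ is a build setting. */
#define MODEL_HZ 1000UL

/*
 * Same shape as the quoted test: report a timeout (1) once the elapsed
 * ticks exceed loop_reset_delay seconds, otherwise keep waiting (0).
 */
static int scan_timed_out(unsigned long elapsed_ticks,
                          unsigned long loop_reset_delay)
{
    /* Model-only guard so the multiplication below cannot overflow. */
    assert(loop_reset_delay <= ULONG_MAX / MODEL_HZ);

    return elapsed_ticks > loop_reset_delay * MODEL_HZ;
}
```

With the usual delay of 5, a scan that has run for 60000 ticks is cut off.
If the field has been overwritten with, say, 200, the same call reports
"not timed out" and the scan keeps polling.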

The fix for this is simple: at the top of qla2xxx_scan_finished, check
for the UNLOADING bit in the vha structure (vha is not freed at this
point).  If UNLOADING is set, we exit the scan for this adapter
immediately, before any reference to the ha structure is made, and
continue on.
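
The ordering of the checks can be sketched in userspace as follows. This is
a model under stated assumptions, not the patch itself: model_vha, model_hw,
MODEL_UNLOADING, MODEL_HZ, the flags field and the loop_ready field are all
hypothetical stand-ins for the driver's structures.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_HZ        1000UL       /* assumed tick rate */
#define MODEL_UNLOADING (1UL << 0)   /* hypothetical bit position */

struct model_hw {
    unsigned long loop_reset_delay;  /* seconds */
};

struct model_vha {
    unsigned long flags;             /* holds MODEL_UNLOADING */
    struct model_hw *hw;             /* may be invalid once unloading */
    int loop_ready;                  /* stand-in for the loop state test */
};

/* Returns 1 when the scan should stop for this adapter, 0 to keep polling. */
static int model_scan_finished(const struct model_vha *vha,
                               unsigned long time)
{
    assert(vha != NULL);

    /* Check the flag in vha first; vha is still valid at this point. */
    if (vha->flags & MODEL_UNLOADING)
        return 1;

    /* Only now is it safe to follow the hw pointer. */
    if (time > vha->hw->loop_reset_delay * MODEL_HZ)
        return 1;

    return vha->loop_ready;
}
```

Because the flag test comes first, a vha marked as unloading returns
without its hw pointer ever being dereferenced, so the model works even
when that pointer is NULL.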

This problem is hard to hit, but I have run into it doing negative
testing many times now (with a test specifically designed to bring it
out), so I can verify that this fix works. My testing has been against a
RHEL7 driver variant, but the bug and patch are equally relevant to
the upstream driver.

Fixes: beb9e315e6 ("qla2xxx: Prevent removal and board_disable race")
Cc: <stable@vger.kernel.org> # v3.18+
Signed-off-by: Bill Kuzeja <william.kuzeja@stratus.com>
Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-01 16:39:01 -04:00
..
aacraid SCSI misc on 20161006 2016-10-07 09:28:53 -07:00
aic7xxx aic7xxx: Fix queue depth handling 2016-02-23 21:27:02 -05:00
aic94xx scsi: aic94xx: Add missing error code assignment before test 2016-08-25 23:39:25 -04:00
arcmsr scsi: arcmsr: Send SYNCHRONIZE_CACHE command to firmware 2016-10-26 22:17:43 -04:00
arm scsi: rename SCSI_MAX_{SG, SG_CHAIN}_SEGMENTS 2016-04-15 16:53:14 -04:00
be2iscsi scsi: be2iscsi: Replace _bh with _irqsave/irqrestore 2016-10-17 13:35:31 -04:00
bfa scsi: bfa: Do not dereference port before it is null checked 2016-09-02 06:09:16 -04:00
bnx2fc scsi: bnx2fc: Mark symbols static where possible 2016-09-09 07:11:07 -04:00
bnx2i bnx2i: fix spelling mistake "complection" -> "completion" 2016-07-12 23:16:31 -04:00
csiostor scsi: csiostor: Fix completion usage 2016-09-14 13:19:15 -04:00
cxgbi chcr/cxgb4i/cxgbit/RDMA/cxgb4: Allocate resources dynamically for all cxgb4 ULD's 2016-09-19 01:37:32 -04:00
cxlflash scsi: cxlflash: Fix context reference tracking on detach 2016-09-14 12:47:42 -04:00
device_handler scsi: scsi_dh_alua: Fix a reference counting bug 2016-11-01 13:32:24 -04:00
dpt
esas2r scsi: esas2r: don't reinitialize adapter's req_table 2016-08-25 22:28:17 -04:00
fcoe SCSI misc on 20161006 2016-10-07 09:28:53 -07:00
fnic fnic: pci_dma_mapping_error() doesn't return an error code 2016-07-20 20:49:17 -04:00
hisi_sas scsi: hisi_sas: send three identify before phy up 2016-09-14 12:54:18 -04:00
ibmvscsi scsi: ibmvfc: Fix I/O hang when port is not mapped 2016-09-19 19:50:33 -04:00
ibmvscsi_tgt scsi: ibmvscsis: Fixed unused variable 2016-09-19 11:51:59 -04:00
isci Merge branch 'for-4.7-zac' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata 2016-05-23 17:53:39 -07:00
libfc scsi: libfc: do not send ABTS when resetting exchanges 2016-08-18 22:35:17 -04:00
libsas SCSI misc on 20160727 2016-07-27 14:48:37 -07:00
lpfc scsi: lpfc: Mark symbols static where possible 2016-09-26 20:35:51 -04:00
megaraid scsi: megaraid_sas: Fix data integrity failure for JBOD (passthrough) devices 2016-10-24 21:31:43 -04:00
mpt3sas scsi: mpt3sas: Fix for block device of raid exists even after deleting raid disk 2016-11-01 13:31:22 -04:00
mvsas scsi: mvsas: Mark symbols static where possible 2016-09-26 21:15:55 -04:00
osd scsi/osd: open code blk_make_request 2016-07-20 17:38:35 -06:00
pcmcia scsi: Do not set cmd_per_lun to 1 in the host template 2015-05-31 18:06:28 -07:00
pm8001 scsi: pm8001: Mark symbols static where possible 2016-09-26 21:10:45 -04:00
qla2xxx scsi: qla2xxx: Fix scsi scan hang triggered if adapter fails during init 2016-11-01 16:39:01 -04:00
qla4xxx scsi: qla4xxx: Mark symbols static where possible 2016-09-02 06:06:43 -04:00
smartpqi scsi: smartpqi: raid bypass lba calculation fix 2016-09-19 11:55:26 -04:00
snic snic: Fix use-after-free in case of a dma mapping error 2016-07-12 23:16:31 -04:00
sym53c8xx_2 scsi: sym53c8xx_2: Use complete() instead complete_all() 2016-09-14 13:19:29 -04:00
ufs scsi: ufs: Get a TM service response from the correct offset 2016-09-21 16:28:57 -04:00
.gitignore
3w-9xxx.c 3w-9xxx: don't unmap bounce buffered commands 2015-10-07 10:24:48 -07:00
3w-9xxx.h 3w-9xxx: fix command completion race 2015-04-27 10:10:19 -07:00
3w-sas.c 3w-sas: fix command completion race 2015-04-27 10:04:39 -07:00
3w-sas.h 3w-sas: fix command completion race 2015-04-27 10:04:39 -07:00
3w-xxxx.c 3w-xxxx: Pass through compat mode ioctls 2016-01-08 12:51:03 -05:00
3w-xxxx.h 3w-xxxx: fix command completion race 2015-04-27 10:05:55 -07:00
53c700_d.h_shipped
53c700.c scsi: remove current_cmnd field from struct scsi_device 2016-07-13 22:33:23 -04:00
53c700.h scsi: remove current_cmnd field from struct scsi_device 2016-07-13 22:33:23 -04:00
53c700.scr
a100u2w.c scsi: a100u2w: trivial typo in printk 2015-08-07 15:03:42 +02:00
a100u2w.h
a2091.c
a2091.h
a3000.c scsi: drop owner assignment from platform_drivers 2014-10-20 16:21:33 +02:00
a3000.h
a4000t.c scsi: drop owner assignment from platform_drivers 2014-10-20 16:21:33 +02:00
advansys.c Merge branch 'mkp-fixes' into fixes 2015-12-03 09:32:33 -08:00
aha152x.c scsi: Do not set cmd_per_lun to 1 in the host template 2015-05-31 18:06:28 -07:00
aha152x.h
aha1542.c scsi: aha1542: avoid uninitialized variable warnings 2016-02-23 21:27:02 -05:00
aha1542.h aha1542: fix include guard and remove useless changelog 2015-04-09 18:08:31 -07:00
aha1740.c scsi: Do not set cmd_per_lun to 1 in the host template 2015-05-31 18:06:28 -07:00
aha1740.h scsi: Do not set cmd_per_lun to 1 in the host template 2015-05-31 18:06:28 -07:00
am53c974.c am53c974: Fix crash during modprobe 2015-04-17 10:13:56 -07:00
atari_scsi.c atari_scsi: Allow can_queue to be increased for Falcon 2016-04-11 16:57:09 -04:00
atp870u.c atp870u: Introduce atp870_init() 2015-11-25 22:08:55 -05:00
atp870u.h atp870u: Remove scam_on from struct atp_unit 2015-11-25 22:08:52 -05:00
BusLogic.c scsi: replace seq_printf with seq_puts 2015-02-02 09:57:45 -08:00
BusLogic.h
bvme6000_scsi.c scsi: drop owner assignment from platform_drivers 2014-10-20 16:21:33 +02:00
ch.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2015-04-14 09:50:27 -07:00
constants.c scsi: fix upper bounds check of sense key in scsi_sense_key_string() 2016-08-16 00:49:32 -04:00
dc395x.c scsi: print single-character strings with seq_putc 2015-02-02 09:57:46 -08:00
dc395x.h
dmx3191d.c ncr5380: Remove DONT_USE_INTR and AUTOPROBE_IRQ macros 2016-04-11 16:57:09 -04:00
dpt_i2o.c dpt_i2o: fix build warning 2016-02-23 21:27:02 -05:00
dpti.h scsi: use 64-bit LUNs 2014-07-17 22:07:37 +02:00
eata_generic.h
eata_pio.c eata_pio: missing break statement 2016-05-10 22:01:07 -04:00
eata_pio.h
eata.c scsi: drop reason argument from ->change_queue_depth 2014-11-24 14:45:27 +01:00
esp_scsi.c scsi: use host wide tags by default 2015-11-09 17:11:57 -08:00
esp_scsi.h esp_scsi: correctly detect am53c974 2014-11-24 16:13:16 +01:00
fdomain.c scsi: fdomain: drop fdomain_pci_tbl when built-in 2016-02-23 21:27:02 -05:00
fdomain.h
FlashPoint.c FlashPoint: fix build warning 2015-11-09 16:32:14 -08:00
g_NCR5380_mmio.c
g_NCR5380.c ncr5380: Update usage documentation 2016-04-11 16:57:09 -04:00
g_NCR5380.h ncr5380: Merge DMA implementation from atari_NCR5380 core driver 2016-04-11 16:57:09 -04:00
gdth_ioctl.h
gdth_proc.c gdth: replace struct timeval with ktime_get_real_seconds() 2016-02-25 21:16:49 -05:00
gdth_proc.h
gdth.c gdth: replace struct timeval with ktime_get_real_seconds() 2016-02-25 21:16:49 -05:00
gdth.h
gvp11.c
gvp11.h
hosts.c SCSI misc on 20161006 2016-10-07 09:28:53 -07:00
hpsa_cmd.h scsi: hpsa: Check for vpd support before sending 2016-09-14 14:19:31 -04:00
hpsa.c scsi: hpsa: correct call to hpsa_do_reset 2016-09-21 16:42:05 -04:00
hpsa.h scsi: hpsa: correct call to hpsa_do_reset 2016-09-21 16:42:05 -04:00
hptiop.c hptiop: Support HighPoint RR36xx HBAs and Support SAS tape and SAS media changer 2015-08-12 13:14:57 -07:00
hptiop.h hptiop: Support HighPoint RR36xx HBAs and Support SAS tape and SAS media changer 2015-08-12 13:14:57 -07:00
imm.c imm: check parport_claim 2016-02-25 21:10:53 -05:00
imm.h
initio.c SCSI: initio: remove duplicate module device table 2015-11-20 11:39:03 -05:00
initio.h
ipr.c scsi: ipr: Fix async error WARN_ON 2016-10-14 16:26:31 -04:00
ipr.h scsi: ipr: Don't log unnecessary 9084 error details 2016-09-19 11:57:33 -04:00
ips.c ips: remove pointless #warning 2015-06-02 17:24:54 -07:00
ips.h
iscsi_boot_sysfs.c ibft: Expose iBFT acpi header via sysfs 2016-05-16 11:14:29 -04:00
iscsi_tcp.c scsi_tcp: block BH in TCP callbacks 2016-05-19 11:36:49 -07:00
iscsi_tcp.h iscsi_tcp: Use ahash 2016-01-27 20:36:10 +08:00
jazz_esp.c scsi: drop owner assignment from platform_drivers 2014-10-20 16:21:33 +02:00
Kconfig scsi: dtc: remove from tree 2016-09-26 20:49:25 -04:00
lasi700.c
libiscsi_tcp.c iscsi_tcp: Use ahash 2016-01-27 20:36:10 +08:00
libiscsi.c scsi: libiscsi: Fix locking in __iscsi_conn_send_pdu 2016-10-17 13:34:44 -04:00
mac53c94.c PCI: Remove includes of asm/pci-bridge.h 2016-02-05 16:29:28 -06:00
mac53c94.h
mac_esp.c scsi: drop owner assignment from platform_drivers 2014-10-20 16:21:33 +02:00
mac_scsi.c mac_scsi: Fix pseudo DMA implementation 2016-04-11 16:57:09 -04:00
Makefile scsi: dtc: remove from tree 2016-09-26 20:49:25 -04:00
megaraid.c megaraid : use dev_printk when possible 2015-08-26 07:23:04 -07:00
megaraid.h
mesh.c PCI: Remove includes of asm/pci-bridge.h 2016-02-05 16:29:28 -06:00
mesh.h
mvme16x_scsi.c scsi: drop owner assignment from platform_drivers 2014-10-20 16:21:33 +02:00
mvme147.c
mvme147.h
mvumi.c scsi: mvumi: use __maybe_unused to hide pm functions 2016-03-05 17:07:46 -05:00
mvumi.h
ncr53c8xx.c scsi: drop reason argument from ->change_queue_depth 2014-11-24 14:45:27 +01:00
ncr53c8xx.h scsi: Remove CONFIG_SCSI_MULTI_LUN 2014-07-17 22:07:35 +02:00
NCR53c406a.c scsi: Do not set cmd_per_lun to 1 in the host template 2015-05-31 18:06:28 -07:00
NCR5380.c scsi: NCR5380: no longer mark irq probing as __init 2016-10-17 14:13:03 -04:00
NCR5380.h scsi: ncr5380: Improve interrupt latency during PIO tranfers 2016-09-14 14:11:12 -04:00
NCR_D700.c
NCR_D700.h
NCR_Q720.c
NCR_Q720.h
nsp32_debug.c
nsp32_io.h
nsp32.c scsi: Do not set cmd_per_lun to 1 in the host template 2015-05-31 18:06:28 -07:00
nsp32.h
osst_detect.h
osst_options.h
osst.c scsi: remove scsi_driver owner field 2014-11-24 20:01:28 +01:00
osst.h
pmcraid.c scsi: pmcraid: mark symbols static where possible 2016-09-04 01:28:07 -04:00
pmcraid.h
ppa.c scsi: ppa: use new parport device model 2016-02-23 21:27:02 -05:00
ppa.h
ps3rom.c scsi: Do not set cmd_per_lun to 1 in the host template 2015-05-31 18:06:28 -07:00
qla1280.c qla1280: Don't allocate 512kb of host tags 2016-04-30 09:25:26 -07:00
qla1280.h
qlogicfas408.c
qlogicfas408.h
qlogicfas.c scsi: Do not set cmd_per_lun to 1 in the host template 2015-05-31 18:06:28 -07:00
qlogicpti.c qlogicpti: Return correct error code 2016-03-01 20:06:49 -05:00
qlogicpti.h
raid_class.c
script_asm.pl
scsi_common.c scsi: add scsi_set_sense_field_pointer() 2016-04-04 12:07:42 -04:00
scsi_debug.c scsi: scsi_debug: Fix memory leak if LBP enabled and module is unloaded 2016-10-26 22:14:50 -04:00
scsi_devinfo.c scsi: blacklist all RDAC devices for BLIST_NO_ULD_ATTACH 2016-08-16 00:58:13 -04:00
scsi_dh.c scsi: Replace wrong device handler name for CLARiiON arrays 2016-10-11 17:56:51 -04:00
scsi_error.c Merge remote-tracking branch 'mkp-scsi/4.7/scsi-fixes' into fixes 2016-06-18 11:59:01 -07:00
scsi_ioctl.c scsi: return EAGAIN when resetting a device under EH 2014-11-12 11:16:12 +01:00
scsi_lib_dma.c
scsi_lib.c blk-mq: remove ->map_queue 2016-09-15 08:42:03 -06:00
scsi_logging.c scsi_logging: return void for dev_printk() functions 2015-02-04 08:00:24 -08:00
scsi_logging.h scsi: simplify scsi_log_(send|completion) 2014-11-12 11:16:05 +01:00
scsi_module.c
scsi_netlink.c net: Use netlink_ns_capable to verify the permisions of netlink messages 2014-04-24 13:44:54 -04:00
scsi_pm.c scsi: Set request queue runtime PM status back to active on resume 2016-02-19 10:52:45 -05:00
scsi_priv.h SCSI misc on 20161006 2016-10-07 09:28:53 -07:00
scsi_proc.c scsi: disable automatic target scan 2016-04-11 16:57:09 -04:00
scsi_sas_internal.h scsi_transport_sas: add 'scsi_target_id' sysfs attribute 2016-03-14 21:05:04 -04:00
scsi_scan.c scsi: Remove one useless stack variable 2016-10-11 18:02:09 -04:00
scsi_sysctl.c scsi: convert use of typedef ctl_table to struct ctl_table 2014-06-06 16:08:16 -07:00
scsi_sysfs.c Revert "scsi: fix soft lockup in scsi_remove_target() on module removal" 2016-04-15 16:53:07 -04:00
scsi_trace.c scsi-trace: define ZBC_IN and ZBC_OUT 2016-04-11 16:57:09 -04:00
scsi_transport_api.h
scsi_transport_fc.c scsi_transport_fc: Unexport scsi_is_fc_vport() 2016-04-11 16:57:09 -04:00
scsi_transport_iscsi.c scsi_transport_iscsi: Declare local symbols static 2016-04-11 16:57:09 -04:00
scsi_transport_sas.c scsi: sas: remove is_sas_attached() 2016-08-18 22:23:20 -04:00
scsi_transport_spi.c [SCSI] Fix printk typos in drivers/scsi 2015-08-07 14:28:45 +02:00
scsi_transport_srp.c IB/srp: Avoid using uninitialized variable 2015-07-14 13:20:09 -04:00
scsi_typedefs.h
scsi.c scsi: Avoid that toggling use_blk_mq triggers a memory leak 2016-09-26 20:58:42 -04:00
scsi.h
scsicam.c scsi: PC partition tables are little endian 2014-11-12 11:15:54 +01:00
sd_dif.c scsi: sd: Move DIF protection types to t10-pi.h 2016-09-15 09:51:14 -04:00
sd.c scsi: sd: Move DIF protection types to t10-pi.h 2016-09-15 09:51:14 -04:00
sd.h scsi: sd: Move DIF protection types to t10-pi.h 2016-09-15 09:51:14 -04:00
sense_codes.h scsi: move Additional Sense Codes to separate file 2016-04-11 16:57:09 -04:00
ses.c scsi: ses: use scsi_is_sas_rphy instead of is_sas_attached 2016-08-18 22:22:19 -04:00
sg.c scsi: sg: Use mult_frac, drop MULDIV macro 2016-08-30 22:18:59 -04:00
sgiwd93.c scsi: drop owner assignment from platform_drivers 2014-10-20 16:21:33 +02:00
sim710.c scsi: sim710: fix build warning 2016-02-23 21:27:02 -05:00
sni_53c710.c scsi: drop owner assignment from platform_drivers 2014-10-20 16:21:33 +02:00
sr_ioctl.c sr: reduce debug noise in sr_do_ioctl 2015-01-20 19:43:24 +01:00
sr_vendor.c scsi: Implement sr_printk() 2014-07-17 22:07:39 +02:00
sr.c scsi: sr: constify sr_pm_ops structure 2016-09-04 01:28:08 -04:00
sr.h scsi: introduce sdev_prefix_printk() 2014-11-12 11:15:57 +01:00
st_options.h
st.c st: clear ILI if Medium Error 2016-04-25 22:08:16 -04:00
st.h st: Remove obsolete scsi_tape.max_pfn 2015-11-18 11:59:09 -05:00
stex.c stex: Add S3/S4 support 2016-02-23 21:27:02 -05:00
storvsc_drv.c scsi: storvsc: Filter out storvsc messages CD-ROM medium not present 2016-07-12 23:16:31 -04:00
sun3_scsi_vme.c scsi/NCR5380: merge sun3_scsi_vme.c into sun3_scsi.c 2014-05-28 12:16:28 +02:00
sun3_scsi.c ncr5380: Remove disused atari_NCR5380.c core driver 2016-04-11 16:57:09 -04:00
sun3_scsi.h sun3_scsi: Move macro definitions 2014-11-20 09:11:15 +01:00
sun3x_esp.c arch, drivers: don't include <asm/io.h> directly, use <linux/io.h> instead 2015-08-10 23:07:05 -04:00
sun_esp.c scsi: drop owner assignment from platform_drivers 2014-10-20 16:21:33 +02:00
sym53c416.c scsi: Do not set cmd_per_lun to 1 in the host template 2015-05-31 18:06:28 -07:00
sym53c416.h
virtio_scsi.c SCSI misc on 20161006 2016-10-07 09:28:53 -07:00
vmw_pvscsi.c scsi: vmw_pvscsi: return SUCCESS for successful command aborts 2016-11-01 13:31:23 -04:00
vmw_pvscsi.h scsi: vmw_pvscsi: return SUCCESS for successful command aborts 2016-11-01 13:31:23 -04:00
wd33c93.c scsi: print single-character strings with seq_putc 2015-02-02 09:57:46 -08:00
wd33c93.h
wd719x.c drivers/scsi/wd719x.c: remove last declaration using DEFINE_PCI_DEVICE_TABLE 2016-09-01 17:52:01 -07:00
wd719x.h scsi: Do not set cmd_per_lun to 1 in the host template 2015-05-31 18:06:28 -07:00
xen-scsifront.c xen: Use correctly the Xen memory terminologies 2015-09-08 18:03:49 +01:00
zalon.c
zorro7xx.c