Fix module parameter descriptions mentioning default values that no longer
match the code.
Signed-off-by: Steven J. Magnani <steve@digidescorp.com>
Acked-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
qla2x00_mem_alloc() returns 1 on success and -ENOMEM on failure. On the
one hand the caller assumes non-zero is success but on the other hand
the caller also assumes that it returns an error code.
I've fixed it to return zero on success and a negative error code on
failure. This matches the documentation as well.
[jejb: checkpatch fix]
Fixes: e315cd28b9 ('[SCSI] qla2xxx: Code changes for qla data structure refactoring')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Saurav Kashyap <saurav.kashyap@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
isci is needlessly tying libata's hands by returning
SAM_STAT_CHECK_CONDITION to some ata errors. Instead, prefer
SAS_PROTO_RESPONSE to let libata (via sas_ata_task_done()) disposition
the device-to-host fis.
For example isci is triggering an HSM Violation where AHCI is showing a
simple media error for the same bus condition:
isci:
ata7.00: failed command: READ VERIFY SECTOR(S)
ata7.00: cmd 40/00:01:00:00:00/00:00:00:00:00/e0 tag 0
res 01/04:00:00:00:00/00:00:00:00:00/e0 Emask 0x3 (HSM violation)
ahci:
ata6.00: failed command: READ VERIFY SECTOR(S)
ata6.00: cmd 40/00:01:00:00:00/00:00:00:00:00/e0 tag 0
res 51/40:01:00:00:00/00:00:00:00:00/e0 Emask 0x9 (media error)
Note that the isci response matches this from sas_ata_task_done():
/* We saw a SAS error. Send a vague error. */
[..]
dev->sata_dev.fis[3] = 0x04; /* status err */
dev->sata_dev.fis[2] = ATA_ERR;
The end effect is that isci is needlessly triggering hard resets when
they are not necessary.
Reported-by: Xun Ni <xun.ni@intel.com>
Tested-by: Nelson Cheng <nelson.cheng@intel.com>
Acked-by: Lukasz Dorau <lukasz.dorau@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
libsas sometimes short circuits timeouts to force commands into error
recovery. It is misleading to log that the command timed-out in
sas_scsi_timed_out() when in fact it was just queued for error handling.
It's also redundant in the case of a true timeout as libata eh will
detect and report timeouts via it's AC_ERR_TIMEOUT facility.
Given that some environments consider "timeout" errors to be indicative
of impending device failure demote the sas_scsi_timed_out() timeout
message to be disabled by default. This parallels ata_scsi_timed_out().
[jejb: checkpatch fix]
Reported-by: Xun Ni <xun.ni@intel.com>
Tested-by: Nelson Cheng <nelson.cheng@intel.com>
Acked-by: Lukasz Dorau <lukasz.dorau@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Send aborts to the firmware via the request/response queue mechanism.
Signed-off-by: Armen Baloyan <armen.baloyan@qlogic.com>
Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
- Fix interpreting the wrong IOCB type for task management
functions in the response path.
- Merge the task management function handling for various adapters.
Signed-off-by: Armen Baloyan <armen.baloyan@qlogic.com>
Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Allow for the capture of a firmware dump but have a sysfs node
(allow_cna_fw_dump) to allow the feature to be enabled/disabled dynamically.
The default is off.
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Tell the mid-layer that number of commands we can queue is the available
resources we have minus a small amount for internal commands.
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
SCSI retry delay upon SAM_STAT_BUSY/_SET_FULL was not being handled
in bnx2fc. This patch adds such handling by returning TARGET_BUSY
to the SCSI ML for the corresponding LUN until the retry timer expires.
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
The problem has been identified to be a change in the scsi_remove_device
path where a call to the pm_runtime_set_memalloc_noio was added when
del_gendisk is called in this path. Note that the new pm routine
attempts to cycle through all parent devices from the FC target device
to set the memalloc_noio flag. Because of this new change, a dependency
was created between the FC target device and the parent netdev device
in the destroy path.
In order to synchronized the destroy paths, bnx2fc has been modified
to flush all destroy workqueues in the NETDEV_UNREGISTER return path.
[ 4.123584] BUG: soft lockup - CPU#8 stuck for 22s! [kworker/8:3:8082]
[ 4.123713] Call Trace:
[ 4.123719] [<ffffffff815dfbe0>] klist_next+0x20/0xf0
[ 4.123725] [<ffffffff813e9220>] ? pm_save_wakeup_count+0x70/0x70
[ 4.123731] [<ffffffff813d9e4e>] device_for_each_child+0x4e/0x70
[ 4.123735] [<ffffffff813e9554>] pm_runtime_set_memalloc_noio+0x94/0xf0
[ 4.123740] [<ffffffff812d4d74>] del_gendisk+0x264/0x2a0
[ 4.123747] [<ffffffffa00c6dc9>] sd_remove+0x69/0xb0 [sd_mod]
[ 4.123751] [<ffffffff813de24f>] __device_release_driver+0x7f/0xf0
[ 4.123754] [<ffffffff813de2e3>] device_release_driver+0x23/0x30
[ 4.123757] [<ffffffff813ddab4>] bus_remove_device+0xf4/0x170
[ 4.123760] [<ffffffff813da475>] device_del+0x135/0x1d0
[ 4.123765] [<ffffffff81411b75>] __scsi_remove_device+0xc5/0xd0
[ 4.123768] [<ffffffff81411ba6>] scsi_remove_device+0x26/0x40
[ 4.123770] [<ffffffff81411d40>] scsi_remove_target+0x160/0x210
[ 4.123775] [<ffffffffa0420e4c>] fc_rport_final_delete+0xac/0x1f0 [scsi_transport_fc]
[ 4.123780] [<ffffffff810774ab>] process_one_work+0x17b/0x460
[ 4.123783] [<ffffffff8107825b>] worker_thread+0x11b/0x400
[ 4.123786] [<ffffffff81078140>] ? rescuer_thread+0x3e0/0x3e0
[ 4.123791] [<ffffffff8107e9c0>] kthread+0xc0/0xd0
[ 4.123794] [<ffffffff8107e900>] ? kthread_create_on_node+0x110/0x110
[ 4.123798] [<ffffffff8160ceec>] ret_from_fork+0x7c/0xb0
[ 4.123801] [<ffffffff8107e900>] ? kthread_create_on_node+0x110/0x110
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
pm80xx_get_gsm_dump() was returning "1" in error case
instead of negative error value.
Signed-off-by: Viswas G <Viswas.G@pmcs.com>
Reviewed-by: Jack Wang <jinpu.wang@profitbricks.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Removed redundant code snippets in pm8001_hwi.c and
pm8001_ctl.c
Signed-off-by: Viswas G <Viswas.G@pmcs.com>
Reviewed-by: Jack Wang <jinpu.wang@profitbricks.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
If the initialization of storvsc fails, the storvsc_device_destroy()
causes NULL pointer dereference.
storvsc_bus_scan()
scsi_scan_target()
__scsi_scan_target()
scsi_probe_and_add_lun(hostdata=NULL)
scsi_alloc_sdev(hostdata=NULL)
sdev->hostdata = hostdata
now the host allocation fails
__scsi_remove_device(sdev)
calls sdev->host->hostt->slave_destroy() ==
storvsc_device_destroy(sdev)
access of sdev->hostdata->request_mempool
Signed-off-by: Ales Novak <alnovak@suse.cz>
Signed-off-by: Thomas Abraham <tabraham@suse.com>
Reviewed-by: Jiri Kosina <jkosina@suse.cz>
Acked-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
In the first place, the loop 'for' in the macro 'for_each_isci_host'
(drivers/scsi/isci/host.h:314) is incorrect, because it accesses
the 3rd element of 2 element array. After the 2nd iteration it executes
the instruction:
ihost = to_pci_info(pdev)->hosts[2]
(while the size of the 'hosts' array equals 2) and reads an
out of range element.
In the second place, this loop is incorrectly optimized by GCC v4.8
(see http://marc.info/?l=linux-kernel&m=138998871911336&w=2).
As a result, on platforms with two SCU controllers,
the loop is executed more times than it can be (for i=0,1 and 2).
It causes kernel panic during entering the S3 state
and the following oops after 'rmmod isci':
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff8131360b>] __list_add+0x1b/0xc0
Oops: 0000 [#1] SMP
RIP: 0010:[<ffffffff8131360b>] [<ffffffff8131360b>] __list_add+0x1b/0xc0
Call Trace:
[<ffffffff81661b84>] __mutex_lock_slowpath+0x114/0x1b0
[<ffffffff81661c3f>] mutex_lock+0x1f/0x30
[<ffffffffa03e97cb>] sas_disable_events+0x1b/0x50 [libsas]
[<ffffffffa03e9818>] sas_unregister_ha+0x18/0x60 [libsas]
[<ffffffffa040316e>] isci_unregister+0x1e/0x40 [isci]
[<ffffffffa0403efd>] isci_pci_remove+0x5d/0x100 [isci]
[<ffffffff813391cb>] pci_device_remove+0x3b/0xb0
[<ffffffff813fbf7f>] __device_release_driver+0x7f/0xf0
[<ffffffff813fc8f8>] driver_detach+0xa8/0xb0
[<ffffffff813fbb8b>] bus_remove_driver+0x9b/0x120
[<ffffffff813fcf2c>] driver_unregister+0x2c/0x50
[<ffffffff813381f3>] pci_unregister_driver+0x23/0x80
[<ffffffffa04152f8>] isci_exit+0x10/0x1e [isci]
[<ffffffff810d199b>] SyS_delete_module+0x16b/0x2d0
[<ffffffff81012a21>] ? do_notify_resume+0x61/0xa0
[<ffffffff8166ce29>] system_call_fastpath+0x16/0x1b
The loop has been corrected.
This patch fixes kernel panic during entering the S3 state
and the above oops.
Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
Reviewed-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
Tested-by: Lukasz Dorau <lukasz.dorau@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Remove an erroneous BUG_ON() in the case of a hard reset timeout. The
reset timeout handler puts the port into the "awaiting link-up" state.
The timeout causes the device to be disconnected and we need to be in
the awaiting link-up state to re-connect the port. The BUG_ON() made
the incorrect assumption that resets never timeout and we always
complete the reset in the "resetting" state.
Testing this patch also uncovered that libata continues to attempt to
reset the port long after the driver has torn down the context. Once
the driver has committed to abandoning the link it must indicate to
libata that recovery ends by returning -ENODEV from
->lldd_I_T_nexus_reset().
Cc: <stable@vger.kernel.org>
Acked-by: Lukasz Dorau <lukasz.dorau@intel.com>
Reported-by: David Milburn <dmilburn@redhat.com>
Reported-by: Xun Ni <xun.ni@intel.com>
Tested-by: Xun Ni <xun.ni@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>