Commit Graph

250442 Commits

Author SHA1 Message Date
Dan Williams
3673f4bf6a [SCSI] libsas: check dev->gone before submitting sata i/o
Head off doomed-to-fail i/o in sas_queuecommand before sending it down
the ata path.

Before:
sd 7:0:0:0: [sdd] Synchronizing SCSI cache
ata8: no sense translation for status: 0x00
ata8: translated ATA stat/err 0x00/00 to SCSI SK/ASC/ASCQ 0xb/00/00
ata8.00: device reported invalid CHS sector 0
ata8: status=0x00 { }
ata8: no sense translation for status: 0x00
ata8: translated ATA stat/err 0x00/00 to SCSI SK/ASC/ASCQ 0xb/00/00
ata8.00: device reported invalid CHS sector 0
ata8: status=0x00 { }
ata8: no sense translation for status: 0x00
ata8: translated ATA stat/err 0x00/00 to SCSI SK/ASC/ASCQ 0xb/00/00
ata8.00: device reported invalid CHS sector 0
ata8: status=0x00 { }
sd 7:0:0:0: [sdd]  Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
sd 7:0:0:0: [sdd]  Sense Key : Aborted Command [current] [descriptor]
sd 7:0:0:0: [sdd]  Add. Sense: No additional sense information
sd 7:0:0:0: [sdd] Stopping disk

After:
sd 9:0:0:0: [sdd] Synchronizing SCSI cache
sd 9:0:0:0: [sdd]  Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
sd 9:0:0:0: [sdd] Stopping disk
sd 9:0:0:0: [sdd] START_STOP FAILED
sd 9:0:0:0: [sdd]  Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK

This is a cosmetic change as sata i/o can still leak to a gone device,
but this addresses the nominal hotplug case when releasing the target.

Acked-by: Jack Wang <jack_wang@usish.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-26 22:49:33 -05:00
Dan Williams
90f1e10d08 [SCSI] libsas: fix/amend device gone notification in sas_deform_port()
Commit 56dd2c06 "libsas: Don't issue commands to devices that have been
hot-removed" edited Darrick's original patch to remove setting 'gone' in
the sas_deform_port() path because that prevented scsi sync cache
commands from being issued when the driver was unloaded.  However, this
allows true device gone notifications (as signaled port phy events) to
trigger sync cache commands to devices that are known to be unreachable.

Teach libsas which sas_deform_port() invocations are likely device gone
events.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-26 22:49:32 -05:00
James Bottomley
c95286d81b [SCSI] MAINTAINERS update for SCSI (new email address)
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-26 15:16:48 -05:00
Samuel Thibault
fad4dab5e4 [SCSI] Fix Ultrastor asm snippet
Commit 1292500b replaced

"=m" (*field) : "1" (*field)

with

"=m" (*field) :

with comment "The following patch fixes it by using the '+' operator on
the (*field) operand, marking it as read-write to gcc."
'+' was actually forgotten.  This really puts it.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Cc: stable@kernel.org
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 13:25:35 -04:00
Andrew Morton
6e2037b0fd [SCSI] osst: fix warning
drivers/scsi/osst.c: In function '__os_scsi_tape_open':
drivers/scsi/osst.c:4705: warning: ISO C90 forbids mixed declarations and code

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 13:09:41 -04:00
Roel Kluin
a061f57d66 [SCSI] osst: wrong index used in inner loop
Index i was already used in the outer loop.  Fixes a potentially-infinite
loop.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Cc: Willem Riede <osst@riede.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 13:08:57 -04:00
Vasiliy Kulikov
8e91096597 [SCSI] aic94xx: world-writable sysfs update_bios file
Don't allow everybody to load firmware.

Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 13:08:39 -04:00
Nicholas Bellinger
86cfa7fcb7 [SCSI] MAINTAINERS: Add drivers/target/ entry
Signed-off-by: Nicholas A. Bellinger <nab@linux-iscsi.org>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 13:05:36 -04:00
Nicholas Bellinger
e66ecd505a [SCSI] target: Convert TASK_ATTR to scsi_tcq.h definitions
This patch converts target core and follwing scsi-misc upstream fabric
modules to use include/scsi/scsi_tcq.h includes for SIMPLE, HEAD_OF_QUEUE
and ORDERED SCSI tasks instead of scsi/libsas.h with TASK_ATTR*

*) tcm_loop: Convert tcm_loop_allocate_core_cmd() + tcm_loop_device_reset() to
   scsi_tcq.h
*) tcm_fc: Convert ft_send_cmd() from FCP_PTA_* to scsi_tcq.h

Reported-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Nicholas A. Bellinger <nab@linux-iscsi.org>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 13:03:56 -04:00
Nicholas Bellinger
d60b7a0fc9 [SCSI] target: Convert REPORT_LUNs to use int_to_scsilun
This patch converts transport_core_report_lun_response() to use
drivers/scsi/scsi_scan.c:int_to_scsilun instead of using the
struct target_core_fabric_ops->pack_lun() fabric provided API vector.

It also removes the tfo->pack_lun check from target_fabric_tf_ops_check()
and removes from struct target_core_fabric_ops->pack_lun() from
target_core_fabric_ops.h, and the following mainline scsi-misc fabric
modules:

*) tcm_loop: Drop tcm_loop_pack_lun() usage
*) tcm_fc: Drop ft_pack_lun() usage

Reported-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Nicholas A. Bellinger <nab@linux-iscsi.org>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 13:02:42 -04:00
Nicholas Bellinger
af57c3ac99 [SCSI] target: Fix task->task_execute_queue=1 clear bug + LUN_RESET OOPs
This patch fixes a bug where task->task_execute_queue=1 was not being
cleared once se_task had been removed from se_device->execute_task_list,
resulting in an OOPs in core_tmr_lun_reset() for the task->task_active=0
case where transport_remove_task_from_execute_queue() was incorrectly
being called.

This patch fixes two cases in transport_get_task_from_execute_queue()
and transport_remove_task_from_execute_queue() to properly clear
task->task_execute_queue=0 once list_del(&task->t_execute_list) has
been called.

It also adds an explict check in transport_remove_task_from_execute_queue()
to dump_stack + return if called with task->task_execute_queue=0.

Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Cc: stable@kernel.org
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 13:01:05 -04:00
Nicholas Bellinger
f436677262 [SCSI] target: Fix bug with task_sg chained transport_free_dev_tasks release
This patch addresses a bug in the target core release path for HW
operation where transport_free_dev_tasks() was incorrectly being called
from transport_lun_remove_cmd() while releasing a se_cmd reference and
calling struct target_core_fabric_ops->queue_data_in().

This would result in a OOPs with HW target mode when the release of
se_task->task_sg[] would happen before pci_unmap_sg() can be called in
HW target mode fabric module code.  This patch addresses the issue by
moving transport_free_dev_tasks() from transport_lun_remove_cmd() into
transport_generic_free_cmd(), and adding TRANSPORT_FREE_CMD_INTR and
transport_generic_free_cmd_intr() to allow se_cmd descriptor release
to happen fromfrom within transport_processing_thread() process context
when release of se_cmd is not possible from HW interrupt context.

Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Cc: stable@kernel.org
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 13:00:10 -04:00
Nicholas Bellinger
53ab6709b4 [SCSI] target: Fix interrupt context bug with stats_lock and core_tmr_alloc_req
This patch fixes two bugs wrt to the interrupt context usage of target
core with HW target mode drivers.  It first converts the usage of struct
se_device->stats_lock in transport_get_lun_for_cmd() and core_tmr_lun_reset()
to properly use spin_lock_irq() to address an BUG with CONFIG_LOCKDEP_SUPPORT=y
enabled.

This patch also adds a 'in_interrupt()' check to allow GFP_ATOMIC usage from
core_tmr_alloc_req() to fix a 'sleeping in interrupt context' BUG with HW
target fabrics that require this logic to function.

Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Cc: stable@kernel.org
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:58:17 -04:00
Nicholas Bellinger
97868c8905 [SCSI] target: Fix multi task->task_sg[] chaining logic bug
This patch fixes a bug in transport_do_task_sg_chain() used by HW target
mode modules with sg_chain() to provide a single sg_next() walkable memory
layout for use with pci_map_sg() and friends.  This patch addresses an
issue with mapping multiple small block max_sector tasks across multiple
struct se_task->task_sg[] mappings for HW target mode operation.

This was causing OOPs with (cmd->t_task->t_tasks_no > 1) I/O traffic for
HW target drivers using transport_do_task_sg_chain(), and has been tested
so far with tcm_fc(openfcoe), tcm_qla2xxx, and ib_srpt fabrics with
t_tasks_no > 1 IBLOCK backends using a smaller max_sectors to trigger the
original issue.

Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Acked-by: Kiran Patil <kiran.patil@intel.com>
Cc: stable@kernel.org
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:56:58 -04:00
David Jeffery
3eef6257de [SCSI] Reduce error recovery time by reducing use of TURs
In error recovery, most scsi error recovery stages will send a TUR command
for every bad command when a driver's error handler reports success.  When
several bad commands to the same device, this results in a device
being probed multiple times.

This becomes very problematic if the device or connection is in a state
where the device still doesn't respond to commands even after a recovery
function returns success.  The error handler must wait for the test
commands to time out.  The time waiting for the redundant commands can
drastically lengthen error recovery.

This patch alters the scsi mid-layer's error routines to send test commands
once per device instead of once per bad command.  This can drastically
lower error recovery time.

[jejb: fixed up whitespace and formatting]
Signed-of-by: David Jeffery <djeffery@redhat.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:51:53 -04:00
Luben Tuikov
0bcaa11154 [SCSI] Retrieve the Caching mode page (version 2)
Some kernel transport drivers unconditionally disable
retrieval of the Caching mode page. One such for example is
the BBB/CBI transport over USB. Such a restraint is too
harsh as some devices do support the Caching mode
page. Unconditionally enabling the retrieval of this mode
page over those transports at their transport code level may
result in some devices failing and becoming unusable.

This patch implements a method of retrieving the Caching
mode page without unconditionally enabling it in the
transports which unconditionally disable it. The idea is to
ask for all supported pages, page code 0x3F, and then search
for the Caching mode page in the mode parameter data
returned. The sd driver already asks for all the mode pages
supported by the attached device by setting the page code to
0x3F in order to find out if the media is write protected by
reading the WP bit in the Device Specific Parameter
field. It then attempts to retrieve only the Caching mode
page by setting the page code to 8 and actually attempting
to retrieve it if and only if the transport allows it.

The method implemented here is that if the transport doesn't
allow retrieval of the Caching mode page and the device is
not RBC, then we ask for all pages supported by setting the
page code to 0x3F (similarly to how the WP bit is retrieved
above), and then we search for the Caching mode page in the
mode parameter data returned.

With this patch, devices over SATA, report this (no change):

Oct 22 18:45:58 localhost kernel: sd 0:0:0:0: [sda] 976773168 512-byte logical blocks: (500 GB/465 GiB)
Oct 22 18:45:58 localhost kernel: sd 0:0:0:0: Attached scsi generic sg0 type 0
Oct 22 18:45:58 localhost kernel: sd 0:0:0:0: [sda] Write Protect is off
Oct 22 18:45:58 localhost kernel: sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
Oct 22 18:45:58 localhost kernel: sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

Smart devices report their Caching mode page. This is a
change where we'd previously see the kernel making
assumption about the device's cache being write-through:

Oct 22 18:45:58 localhost kernel: sd 6:0:0:0: Attached scsi generic sg2 type 0
Oct 22 18:45:58 localhost kernel: sd 6:0:0:0: [sdb] 610472646 4096-byte logical blocks: (2.50 TB/2.27 TiB)
Oct 22 18:45:58 localhost kernel: sd 6:0:0:0: [sdb] Write Protect is off
Oct 22 18:45:58 localhost kernel: sd 6:0:0:0: [sdb] Mode Sense: 47 00 10 08
Oct 22 18:45:58 localhost kernel: sd 6:0:0:0: [sdb] Write cache: enabled, read cache: enabled, supports DPO and FUA

And "dumb" devices over BBB, are correctly shown not to
support reporting the Caching mode page:

Oct 22 18:49:06 localhost kernel: sd 7:0:0:0: [sdc] 15663104 512-byte logical blocks: (8.01 GB/7.46 GiB)
Oct 22 18:49:06 localhost kernel: sd 7:0:0:0: [sdc] Write Protect is off
Oct 22 18:49:06 localhost kernel: sd 7:0:0:0: [sdc] Mode Sense: 23 00 00 00
Oct 22 18:49:06 localhost kernel: sd 7:0:0:0: [sdc] No Caching mode page present
Oct 22 18:49:06 localhost kernel: sd 7:0:0:0: [sdc] Assuming drive cache: write through

Version 2 adds this:

Some devices don't support page code 0x3F, and others require a
fixed transfer length of 192 bytes. This single commit includes a
patch by Alan Stern which fixes this.

Reported-and-tested-by: Richard Senior <richard@r-senior.demon.co.uk>
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Luben Tuikov <ltuikov@yahoo.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:43:52 -04:00
Eddie Wai
9ae58e144d [SCSI] bnx2i: Optimized the iSCSI offload performance
Modified the event coalescing code for iSCSI offload to combat both
corner cases and optimize performance as follows:

1. Added mechanism to loop back a second time to process any leftover
CQEs that was generated by the hardware during the time the driver is
busy processing previous CQEs in the bh.  This not only helps the
performance but also fixes the corner case when no more CQEs are being
generated in the pipeline; so those leftover CQEs will get a a chance
to be processed.

2. Added ARM_CQE_FP to distinguish between fast path arming versus
slow path arming.  This change will guarantee that the CQEs will
always get a chance to be re-armed during fast path completions.

3. Removed the inline event coalescing division for perf optimization.
Also fixed a division-by-zero error when the event_coal_div module
param was set to 0.

4. Changed the default SQ WQEs size from 256 to 128 to match chip
default.

5. Changed the cmd_per_lun from 32 to 24.

Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:41:10 -04:00
Eddie Wai
d5307a078b [SCSI] bnx2i: Updated the connection shutdown/cleanup timeout
Modified the 10s wait time for inflight offload connections to
advance to the next state to 2s based on test result.
Modified the 20s shutdown timeout to 30s based on test result.

Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Cc: stable@kernel.org
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:40:57 -04:00
Eddie Wai
7287c63e98 [SCSI] bnx2i: Fixed packet error created when the sq_size is set to 16
The number of chip's internal command cell, which is use to generate
SCSI cmd packets to the target, was not initialized correctly by
the driver when the sq_size is changed from the default 128.
This, in turn, will create a problem where the chip's transmit pipe
will erroneously reuse an old command cell that is no longer valid.
The fix is to correctly initialize the chip's command cell upon setup.

Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Cc: stable@kernel.org
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:40:45 -04:00
Vikas Chaudhary
6b278656f2 [SCSI] qla4xxx: Update driver version to 5.02.00-k7
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:40:34 -04:00
Harish Zunjarrao
7ad633c06b [SCSI] qla4xxx: Added vendor specific sysfs attributes
Added fw_version, serial_num, iscsi version and boot loader version
sysfs attributes.

Signed-off-by: Harish Zunjarrao <harish.zunjarrao@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:40:23 -04:00
Vikas Chaudhary
8f0722cae6 [SCSI] qla4xxx: Remove host_lock in queuecommand function
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:40:12 -04:00
Lalit Chandivade
1b46807e0b [SCSI] qla4xxx: Remove AF_DPC_SCHEDULED flag from ha.
Since queue_work does not requeue, there is no need to check
if a work is in progress using the AF_DPC_SCHEDULED flag.
queue_work would return if work is pending without adding the
work, do_dpc would again get invoked from qla4xxx_timer if
there is still DPC flags set.

Signed-off-by: Lalit Chandivade <lalit.chandivade@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:40:02 -04:00
Vikas Chaudhary
977f46a4bb [SCSI] qla4xxx: Don't check FW alive if ISP82XX reset is in progress
Corrected logic to don't check for F/W is alive if reset is already
in progress for ISP82XX

Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:39:52 -04:00
Vikas Chaudhary
0160ef1269 [SCSI] qla4xxx: Don't process mbx interrupt unconditionally
Do not process interrupt unconditionally during mailbox processing  which can
lead to spurious interrupt. Mailbox completion are now polled if interrupt are
disabled or wait for interrupt to come in if its enabled

Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:39:42 -04:00
Prasanna Mumbai
6d78bd56be [SCSI] qla4xxx: Complete the cmd if sense_len is zero
Complete the cmd if sense length is zero. For cases where sense
data spans across multiple iocb's by FW, we need to hold on to the
I/O (ha->status_srb != NULL) till we have processed them all and
copied the sense data from internal buffer to scsi_cmd sense buffer.

Signed-off-by: Prasanna Mumbai <prasanna.mumbai@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:39:29 -04:00
Vikas Chaudhary
68d92ebf59 [SCSI] qla4xxx: Dump HW/FW reg to figure out what caused FW to be hung for ISP82XX
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:39:20 -04:00
Vikas Chaudhary
cb74428ee3 [SCSI] qla4xxx: Updated the reset sequence for ISP82xx
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:38:57 -04:00
Prasanna Mumbai
185f107ef9 [SCSI] qla4xxx: update function qla4xxx_isr_decode_mailbox()
- Added MBOX_ASTS_DUPLICATE_IP AEN handling.
- Update MBOX_AEN_REG_COUNT to 8 so that driver will save status
  of all mbox registers in aen_q

Signed-off-by: Prasanna Mumbai <prasanna.mumbai@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:38:46 -04:00
Martin K. Petersen
c498bf1a1b [SCSI] scsi_trace: Decode UNMAP bit in WRITE SAME(10)
As of SBC3r26 WRITE SAME(10) supports the UNMAP bit.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:38:36 -04:00
Martin K. Petersen
756aca7edd [SCSI] mpt2sas: Fix missing reference tag seed with Type 2 devices
Ensure that the initial reference tag is passed on to the HBA firmware
for DIF Type 2 devices.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Kashyap Desai <Kashyap.Desai@lsi.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:38:25 -04:00
Martin K. Petersen
2a8cfad06e [SCSI] sd: Unmap discard alignment needs to be converted to bytes
The block layer discard alignment is reported in bytes, not in units of
the logical block size.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:38:15 -04:00
Jing Huang
45d7f0cc58 [SCSI] bfa: kdump fix
Root cause: When kernel crashes, bfa IOC state machine and FW doesn't get
a notification and hence are not cleanly shutdown. So registers holding
driver/IOC state information are not reset back to valid disabled/parking
values. This causes subsequent driver initialization to hang during kdump
kernel boot.

Fix description: during the initialization of first PCI function, reset
corresponding register when unclean shutown is detect by reading chip
registers. This will make sure that ioc/fw gets clean re-initialization.

Signed-off-by: Jing Huang <huangj@brocade.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:38:02 -04:00
Wayne Boyer
a5442ba4a4 [SCSI] ipr: fix possible false positive detection of stuck interrupt
If the driver is getting flooded with interrupts, there's a possibility
that the interrupt service routine could falsely detect a stuck interrupt
condition and reset the adapter.

This patch changes the logic such that the routine will loop back into
the command processing code one more time after detecting the stuck
interrupt signature.  If there are no commands to process after that pass,
and the interrupt is still not cleared, then the driver will print the
"Error clearing HRRQ" message and reset the adapter.

Signed-off-by: Wayne Boyer <wayneb@linux.vnet.ibm.com>
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:37:50 -04:00
Robert Love
d85e607b34 [SCSI] libfcoe: Remove unnecessary module state checks
libfcoe's interface consists of create, destroy, enable,
disable and create_vn2vn. These are currently module
paramaters added durring the module initialization. A
concern arose that the module parameters were being added
with write permissions before the module had completed
initialization. The following code was added to each
sysfs store file.

* Make sure the module has been initialized, and is not about to be
* removed.  Module parameter sysfs files are writable before the
* module_init function is called and after module_exit.
*/
if (THIS_MODULE->state != MODULE_STATE_LIVE)
    goto out_nodev;

This check was called out as unhelpful as the module can
go dead at any time and therefore its state isn't a reliable
thing to look at as a sign of stability and initialization
completion. Also, that functional interfaces like these
should be added after module initialization.

This patch removes the unnecessary checks and hopes to
disprove the concern about initialization ordering.

Recent fcoe transport rework changes now require fcoe
transports to register with libfcoe before any operation
can take place. libfcoe may access some static variables
but nothing that could cause a problem. Once a fcoe transport
is registered, libfcoe is usable and any interface calls will
be functional.

Signed-off-by: Robert Love <robert.w.love@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:37:35 -04:00
Yi Zou
8467b96c03 [SCSI] libfc: do not immediately retry the cmd when seq_send fails in fc_fcp_send_data
Currently, when seq_send() fails in fc_fcp_send_data(),
fc_fcp_retry_cmd() would complete this failed I/O directly and let
scsi-ml retry. However, target side is not notified which may hang the
target. Instead, we should just bail out from from fc_fcp_send_data
and let scsi-ml times it out and aborts this I/O instead.

Signed-off-by: Yi Zou <yi.zou@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:37:25 -04:00
Vasu Dev
0a219edb26 [SCSI] libfc: fix race in SRR response
In this case fsp was freed before error handler was invoked,
this is fixed by having SRR fsp reference freed by exch
destructor so that fsp will be always held until it exch
is freed.

Also don't reset fsp->recov_seq since this is needed by
SRR error handler to do exch done.

Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:37:15 -04:00
Vasu Dev
8d23f4ba38 [SCSI] libfc: don't call resp handler after FC_EX_TIMEOUT
In cases exch is already timed out then exch layer could
end up calling resp handler again for its response frame
received after timeout, though in this case fc_exch_timeout
handler would have already called resp with FC_EX_TIMEOUT.

This would cause REC response handler to release its
fsp pkt hold twice instead once and possibly similar issues
with other ELS exchanges in this race.

To avoid this race have resp updated under exch lock
in rx path, the resp would get set to NULL in case
of FC_EX_TIMEOUT under the same lock to prevent resp
callback after FC_EX_TIMEOUT.

Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:37:03 -04:00
Yi Zou
6a716a8535 [SCSI] libfc: release DDP context if frame_send() fails
In case frame_send() fails, make sure to let the underlying HW release the DDP
context that has already been set up before calling frame_send().

Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:36:51 -04:00
Hillf Danton
83383dd11a [SCSI] libfc: fix mm leak in handling incoming request for target discovery
When handling incoming request, if the operation code carried by the
received frame is not RSCN, the frame should be freed as in the RSCN
case, or there is memory leakage.

Signed-off-by: Hillf Danton <dhillf@gmail.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:36:41 -04:00
Neerav Parikh
bdf252183e [SCSI] fcoe: Prevent creation of an NPIV port with duplicate WWPN
This patch adds a validation step before allowing creation of a new NPIV port.
It checks whether the WWPN passed for the new NPIV port to be created is unique
for the given physical port.

Signed-off-by: Neerav Parikh <Neerav.Parikh@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:36:29 -04:00
Bhanu Prakash Gollapudi
c051ad2e57 [SCSI] libfcoe: Incorrect CVL handling for NPIV ports
Host doesnt handle CVL to NPIV instantiated ports correctly.
- As per FC-BB-5 Rev 2 CVLs with no VN_Port descriptors shall be treated as
  implicit logout of ALL vn_ports.
- CVL for NPIV ports should be handled before physical port even if descriptor
  for physical port appears before NPIV ports

Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:36:17 -04:00
adam radford
4f788dce0b [SCSI] megaraid_sas: Version and Changelog update
Signed-off-by: Adam Radford <aradford@gmail.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:36:06 -04:00
adam radford
3cc6851f9a [SCSI] megaraid_sas: Add 1078 OCR support
Signed-off-by: Adam Radford <aradford@gmail.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:35:56 -04:00
adam radford
495c560970 [SCSI] megaraid_sas: Convert 6,10,12 byte CDB's for FastPath IO
The following patch for megaraid_sas converts 6,10,12 byte CDB's to 16
byte CDB for large LBA's for FastPath IO.

Signed-off-by: Adam Radford <aradford@gmail.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:35:46 -04:00
adam radford
541f90b7c6 [SCSI] megaraid_sas: Fix bug where AENs could be lost in probe() and resume()
Signed-off-by: Adam Radford <aradford@gmail.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:35:34 -04:00
adam radford
46fd256e05 [SCSI] megaraid_sas: Disable interrupts/free_irq() in megasas_shutdown()
The following patch for megaraid_sas disables interrupts and
free_irq() in megasas_shutdown().

Signed-off-by: Adam Radford <aradford@gmail.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:35:09 -04:00
adam radford
7e70e73365 [SCSI] megaraid_sas: Check MFI_REG_STATE.fault.resetAdapter
The following patch for megaraid_sas fixes the function
megasas_reset_fusion() and makes the reset code check
MFI_REG_STATE.fault.resetAdapter.

Signed-off-by: Adam Radford <aradford@gmail.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:34:59 -04:00
adam radford
70d031f36f [SCSI] megaraid_sas: Remove un-used function
The following patch for megaraid_sas removes un-used function
megasas_return_cmd_for_smid().

Signed-off-by: Adam Radford <aradford@gmail.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:34:47 -04:00
adam radford
3f1abce4ab [SCSI] megaraid_sas: Remove MSI-X black list, use MFI_REG_STATE instead
This patch for megaraid_sas removes the MSI-X black list and uses
MFI_REG_STATE.ready.msiEnable instead.

Signed-off-by: Adam Radford <aradford@gmail.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:34:12 -04:00