Currently all port deletion is routed through the FCoE
workqueue (fcoe_wq). When fc_remove_host is called on
an N_Port (for example, from fcoe_destroy) the vports
are queued into a FC Transport workqueue. fc_remove_host
flushes that queue and each vport is passed to fcoe's
fcoe_vport_destroy, which simply queues the associated
fcoe_ports for later deletion. This queue cannot be
flushed within the N_Port's destroy path because of
circular locking issues. The result is that the NPIV
ports are destroyed after the N_Port, which is reverse
of how they are created.
This quirk causes fcoe to keep references on the
fcoe_interface shared by each of these ports (N_Port
and NPIV). Changing the ordering such that NPIV ports
are destroyed before the N_Port will allow us to remove
reference counting on the fcoe_interface instances.
This patch simply allows fcoe_vport_destroy to destroy
NPIV ports without deferring them to a workqueue context.
This ensures that when fc_remove_host is called the
NPIV ports will be destroyed first before the N_Port and
allows reference counting on the fcoe's fcoe_interface
to be removed in a later patch.
Signed-off-by: Robert Love <robert.w.love@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
The label implies that it should be called when
there is 'nomod.' I read that to mean that the
module reference 'get' failed. However, it's only
called when the module reference 'get' succeeded.
I think it makes more sense to name the label,
'out_putmod' since it should be called when we
need to 'put' the module reference taken in the
routine before returning.
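A minimal userspace sketch of the label convention argued for above.
The mock_* names are hypothetical stand-ins, not the fcoe code: the
'out_putmod' label is reached only after the 'get' succeeded, so it is
the one that must 'put' the reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a module with a reference count. */
struct mock_module {
	int refcnt;
	bool live;	/* false models a module that is going away */
};

/* Mock of a module 'get': fails when the module is not live. */
static bool mock_module_get(struct mock_module *m)
{
	assert(m != NULL);
	if (!m->live)
		return false;
	m->refcnt++;
	return true;
}

/* Mock of a module 'put': drops one reference. */
static void mock_module_put(struct mock_module *m)
{
	assert(m != NULL && m->refcnt > 0);
	m->refcnt--;
}

/* Returns 0 on success (reference kept), negative on failure. */
static int mock_create(struct mock_module *m, bool setup_fails)
{
	int rc;

	if (!mock_module_get(m)) {
		rc = -1;	/* 'get' failed: no reference to drop */
		goto out;
	}

	if (setup_fails) {
		rc = -2;	/* 'get' succeeded: the reference must be 'put' */
		goto out_putmod;
	}

	return 0;

out_putmod:
	mock_module_put(m);
out:
	return rc;
}
```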
Signed-off-by: Robert Love <robert.w.love@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Allow FDMI attributes to be exposed via the fc_host
class object for the fcoe driver.
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This is more of a debug statement. As a KERN_ERR we generate
log entries anytime any netdev goes up or down, so when booting
there are notification log entries for all system interfaces
including 'lo'. This is too much. Let's just log when necessary.
Signed-off-by: Robert Love <robert.w.love@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This allows the controller to do WRITE_INSERT and READ_STRIP for SAS
disks that support protection information. SAS disks must be formatted
with protection information via sg_format to use this feature.
sg3_utils-1.32 -- sg_format version 1.19 20110730
sg_format usage:
sg_format --format --verbose --pinfo /dev/sda
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
A device logout sent in the delete path of a fcport would clear the
port handle binding inside the firmware. This could lead to queued
work items for the fcport, if any, getting incorrect results. This
patch fixes the issue by checking for device name changes after a
call to get port database.
Signed-off-by: Arun Easi <arun.easi@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Add a field to the qla_hw_data struct to allow us to set the maximum number of
fabric devices on a per adapter basis based on ISP type.
[jejb: fix up missing rval = QLA_SUCCESS to prevent uninit var warning]
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Rather than continuously allocating and freeing swl within the discovery
process, simply pre-allocate it the first time that it's needed, cache it
through the rest of the lifecycle of the driver and free it at module unload.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Don't use the default 30-second mailbox-command timeout for these
serial requests; instead, limit the TMO to the standard 2*RATOV
plus some fudge-factor.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
During command failure/non-recognition, the upper-layer
FC-transport expects the drivers to set
job-reply->reply_payload_rcv_len. Do this in a consistent manner
to avoid duplication.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
During rport tear-down, make sure we do an implicit LOGO of the fcport in our
firmware to try to clear any residual commands associated with that fcport.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Make sure that all calls to ha->isp_ops->fabric_login() check the
return value for failure.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
When thermal temperature initially fails, return a blank string to the
sysfs interface. This fixes the initial display of 0.00 followed by
subsequent display of a blank line; the initial 0.00 should not have been
displayed for cards that do not support thermal temperature.
Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Instead of processing each RSCN individually, use only the name server results
from the switch to tell the existence of a given fcport.
Signed-off-by: Arun Easi <arun.easi@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
The driver is logging a slew of 'good' status requests for ELS/CT passthrough
commands. Change some log messages from:
* ql_log() -> ql_dbg()
* ql_log_info -> ql_dbg_user
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Rework the structures related to SRB processing to minimize the memory
allocations per I/O and manage resources associated with and completions
from common routines.
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Original 'defaults' were not OUI valid.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Added Sync. mode to support Series 7/8/9 controller families: This is a
compatibility mode for all these controller families. The Async. (Performance)
mode can be changed in the future. First Async. mode version added for Series
7; Controller parameter aac_sync_mode added
Signed-off-by: Mahesh Rajashekhara <aacraid@pmc-sierra.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
The error_mask module param override had a bug which prevented
the new module param values from taking effect.
Also changed the type attribute of the error_mask1/2 module params
from int to uint to allow the MSB to be set.
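A minimal userspace sketch of why the type matters. The two parsers
below are hypothetical and only mimic the range check of a signed
versus an unsigned parameter; they are not the kernel's module param
code. A value with the MSB set does not fit in an int, but does fit in
an unsigned int.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical parser with the range of a signed int parameter. */
static int parse_int_param(const char *s, int *out)
{
	char *end;
	long long v;

	assert(s != NULL && out != NULL);
	errno = 0;
	v = strtoll(s, &end, 0);
	if (errno || end == s || *end != '\0' || v < INT_MIN || v > INT_MAX)
		return -1;	/* not a number or out of int range */
	*out = (int)v;
	return 0;
}

/* Hypothetical parser with the range of an unsigned int parameter. */
static int parse_uint_param(const char *s, unsigned int *out)
{
	char *end;
	unsigned long long v;

	assert(s != NULL && out != NULL);
	errno = 0;
	v = strtoull(s, &end, 0);
	if (errno || end == s || *end != '\0' || v > UINT_MAX)
		return -1;	/* not a number or out of uint range */
	*out = (unsigned int)v;
	return 0;
}
```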
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Acked-by: Anil Veerabhadrappa <anilgv@broadcom.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
bnx2i_percpu_thread_create() creates a per-CPU kthread, and should use
the proper NUMA-aware API.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Eddie Wai <eddie.wai@broadcom.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
If 'drv_fcxp = kzalloc(sizeof(struct bfad_fcxp), GFP_KERNEL);' fails
and returns NULL, then we'll leak the memory allocated to 'bsg_fcpt'
when we jump to 'out:' and the variable subsequently goes out of
scope.
Also remove the cast of the kzalloc() return value. kzalloc() returns
a void* which is implicitly converted, so the explicit cast is
pointless.
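A minimal userspace sketch of the leak pattern and its fix, using
hypothetical mock_* helpers in place of kzalloc()/kfree(): when the
second allocation fails, the first one has to be released before
jumping to 'out'.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

static int live_allocs;	/* outstanding mock allocations */

/* Mock allocator that can be told to fail. */
static void *mock_zalloc(size_t size, int fail)
{
	void *p;

	assert(size > 0);
	p = fail ? NULL : calloc(1, size);
	if (p)
		live_allocs++;
	return p;
}

static void mock_free(void *p)
{
	if (p) {
		live_allocs--;
		free(p);
	}
}

/* Returns 0 on success, -1 if an allocation failed. */
static int mock_bsg_request(int second_alloc_fails)
{
	void *bsg_fcpt, *drv_fcxp;
	int rc = 0;

	bsg_fcpt = mock_zalloc(64, 0);
	if (!bsg_fcpt) {
		rc = -1;
		goto out;
	}

	drv_fcxp = mock_zalloc(32, second_alloc_fails);
	if (!drv_fcxp) {
		mock_free(bsg_fcpt);	/* without this, bsg_fcpt leaks */
		rc = -1;
		goto out;
	}

	/* ... request processing would go here ... */

	mock_free(drv_fcxp);
	mock_free(bsg_fcpt);
out:
	return rc;
}
```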
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Acked-by: Krishna Gudipati <kgudipat@brocade.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Enable clock gating for power savings on entry to controller ready
state. Disable SCU clock gating for power savings on exit from the
controller ready state.
The gating is fully automated by silicon after setting the mode.
Signed-off-by: Marcin Tomczak <marcin.tomczak@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
If the driver/lib has called scsi_done and cleaned up internally but
scsi layer has not yet called blk_mark_rq_complete when the command
times out we hit a problem if the timeout code calls blk_mark_rq_complete first.
When the time out code calls into the driver we were returning
BLK_EH_RESET_TIMER and that causes the timeout code to just call
us again later.
We need to be returning BLK_EH_HANDLED so the timeout code can complete
the completion process because it had called blk_mark_rq_complete
on the command and now owns its processing.
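A minimal sketch of the decision described above, using hypothetical
mock types rather than the block layer's enum. The "still in flight"
branch is an assumption added for contrast; only the already-completed
case is taken from the description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the timeout handler return codes. */
enum mock_eh_return {
	MOCK_EH_HANDLED,	/* timeout code completes the command */
	MOCK_EH_RESET_TIMER	/* timeout code re-arms and calls back later */
};

struct mock_task {
	bool driver_done;	/* driver already called scsi_done and cleaned up */
};

static enum mock_eh_return mock_eh_timed_out(const struct mock_task *task)
{
	assert(task != NULL);

	/*
	 * The driver has nothing left to do for this command, so resetting
	 * the timer would only make the timeout code call in again later.
	 */
	if (task->driver_done)
		return MOCK_EH_HANDLED;

	/* Assumption: a command still in flight gets more time. */
	return MOCK_EH_RESET_TIMER;
}
```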
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Problem description from Xi Wang:
A large max_r2t could lead to integer overflow in subsequent call to
iscsi_tcp_r2tpool_alloc(), allocating a smaller buffer than expected
and leading to out-of-bounds write.
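A minimal sketch of the overflow class described above, assuming a
32-bit size computation of count * item_size. The helper names are
hypothetical, and the guarded variant is one generic way to reject
such values, not necessarily the check this patch applies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Size as a 32-bit product: silently wraps for large counts. */
static uint32_t unchecked_size32(uint32_t count, uint32_t item_size)
{
	return (uint32_t)((uint64_t)count * item_size);
}

/* Same product, but refuses inputs whose result does not fit. */
static int checked_size32(uint32_t count, uint32_t item_size, uint32_t *out)
{
	assert(out != NULL);
	if (item_size != 0 && count > UINT32_MAX / item_size)
		return -1;	/* would overflow */
	*out = count * item_size;
	return 0;
}
```

For example, 0x10000001 items of 16 bytes wrap to a 16-byte buffer in
the unchecked version, while the checked version rejects the request.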
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
NETLINK_CREDS's pid now returns 0, so I guess we are supposed to
be using NETLINK_CB. This changed while the patch to export the
pid was getting merged upstream, so it was not noticed until both
the network and iscsi changes were in the same tree.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Cc: stable@kernel.org
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This patch fixes the host byte settings DID_TARGET_FAILURE and
DID_NEXUS_FAILURE. The function __scsi_error_from_host_byte tries to reset
the host byte to DID_OK, but that does not happen because of the OR operation.
Here is the flow:
scsi_softirq_done -> scsi_decide_disposition -> __scsi_error_from_host_byte
Let's take an example with DID_NEXUS_FAILURE. In scsi_decide_disposition,
result will be set to DID_NEXUS_FAILURE (=0x11). Then, in
__scsi_error_from_host_byte, we OR the result with DID_OK in order to reset
it back to DID_OK. Because DID_OK is 0, the OR leaves the host byte unchanged.
This patch fixes this issue.
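A minimal userspace sketch of the problem, assuming the host byte
lives in bits 16-23 of the result word. The helper names are
illustrative, not the kernel's: OR-ing in DID_OK (0) cannot clear bits
that are already set, while masking the field first can.

```c
#include <assert.h>

#define DID_OK			0x00
#define DID_NEXUS_FAILURE	0x11

/* Extract the host byte (bits 16-23) from a result word. */
static unsigned int host_byte(unsigned int result)
{
	return (result >> 16) & 0xff;
}

/* OR-only update: can set bits in the host byte, never clear them. */
static unsigned int set_host_byte_or(unsigned int result, unsigned int status)
{
	assert(status <= 0xff);
	return result | (status << 16);
}

/* Masked update: clear the host byte field, then store the new value. */
static unsigned int set_host_byte_masked(unsigned int result,
					 unsigned int status)
{
	assert(status <= 0xff);
	return (result & 0xff00ffffu) | (status << 16);
}
```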
Signed-off-by: Babu Moger <babu.moger@netapp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
When there are 255 NPIV ports, and the interface is brought down & up, both
physical and NPIV ports are logged off and never logged back in. Since
discovery happens on a single CPU, XID resources on that CPU are limited,
and discovery fails once they are exhausted. Increase the XID resource range
to ensure that the discovery completes successfully. Also ensure that
fc_exch_mgr_alloc() doesn't fail on a system that has a lower number of CPUs.
Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Host drops sessions when a flood of unsolicited LOGOs is received from the
target. Because of insufficient PLOGI retries, upon exceeding the retry count
of 3, the target sessions are dropped. Increased the retry count to 255 to
allow sufficient retries in this scenario.
Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
System panics while accessing stale pointer - timer_work_queue - in the IO path
before bnx2fc_stop is called. Fix is to destroy the workqueue after the destroy
operation is complete.
Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Allow FDMI attributes to be exposed via the fc_host
class object for the fcoe driver.
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Acked-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This adds support for updating the FC-GS FDMI attributes
in the fcoe driver.
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Acked-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This patch adds support for Fabric Device Management
Interface as per FC-GS-4 spec. in libfc. Any driver
making use of libfc can enable fdmi state machine
for a given lport.
If lport has enabled FDMI support the lport state
machine will transition into FDMI after completing
the DNS states and before entering the SCR state.
The FDMI state transition is such that if there is an
error, it won't stop the lport state machine from
transitioning and it will behave as if there was
no FDMI support.
The FDMI HBA attributes are registered with the Management
server via Register HBA (RHBA) command and the port
attributes are registered using the Register Port (RPA)
command.
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Acked-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Currently the libfc Common Transport(CT) calls assume that
the CT requests are Name Server specific only. This patch
makes it more flexible to allow more FC-GS services to make
use of these routines.
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Acked-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This adds FC-GS Fabric Device Management Interface
(FDMI) related attributes to fc_host_attr structure.
This is in preparation for allowing FDMI attributes
to be registered via libfc.
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Acked-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Use find_first_zero_bit to find the first cleared bit in a memory region.
This also includes the following minor changes.
- Use bitmap_zero
- Reduce unnecessary atomic bitops usage
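For reference, a minimal userspace sketch of what a "find first zero
bit" search does. This is a simple bit-by-bit loop written for
illustration, not the kernel's find_first_zero_bit implementation; it
returns the bit count when no zero bit exists.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD	(sizeof(unsigned long) * CHAR_BIT)

/* Index of the first cleared bit in map, or nbits if every bit is set. */
static size_t first_zero_bit(const unsigned long *map, size_t nbits)
{
	size_t i;

	assert(map != NULL);
	for (i = 0; i < nbits; i++) {
		unsigned long mask = 1UL << (i % BITS_PER_WORD);

		if (!(map[i / BITS_PER_WORD] & mask))
			return i;
	}
	return nbits;
}
```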
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
The SCSI GET LBA STATUS command was introduced in SBC-3 revision
20 in September 2009. At that time the Parameter Data Length
field in the response had an associated byte offset of 8.
Then in SBC-3 revision 25 (October 2010) that byte offset was
changed to 4. The sg_get_lba_status utility in sg3_utils version
1.33 (released earlier today) has been changed to calculate
the newer response length. However the implementation of
GET LBA STATUS command in the scsi_debug driver still uses the
original byte offset.
Modify the Parameter Data Length field value in the GET LBA STATUS command
response to comply with the change in SBC-3 revision 25.
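A minimal sketch of the arithmetic, under the assumption that the
field holds the total response length minus the byte offset named
above, and that a response is an 8-byte header followed by 16-byte
descriptors. The helper names and layout constants are illustrative
and not taken from scsi_debug.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LBA_STATUS_HDR_LEN	8	/* assumed header size in bytes */
#define LBA_STATUS_DESC_LEN	16	/* assumed descriptor size in bytes */

/* Store a 32-bit value in big-endian byte order. */
static void put_be32(uint32_t v, uint8_t *p)
{
	p[0] = (uint8_t)(v >> 24);
	p[1] = (uint8_t)(v >> 16);
	p[2] = (uint8_t)(v >> 8);
	p[3] = (uint8_t)v;
}

/* Load a 32-bit big-endian value. */
static uint32_t get_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Write the Parameter Data Length field for ndesc descriptors and
 * return the total response length. offset is 8 for the original
 * definition and 4 for the revised one.
 */
static uint32_t fill_param_data_len(uint8_t *buf, uint32_t ndesc,
				    uint32_t offset)
{
	uint32_t total = LBA_STATUS_HDR_LEN + ndesc * LBA_STATUS_DESC_LEN;

	assert(buf != NULL && (offset == 4 || offset == 8));
	put_be32(total - offset, buf);
	return total;
}
```

With one descriptor the response is 24 bytes, so the field value moves
from 16 (offset 8) to 20 (offset 4).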
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Certain types of changes to devices should not be interpreted as a device
change that would cause the device to be removed and re-added. These include
RAID level and Firmware revision changes. However, these attribute changes DO
need to be reflected in the controller info structure's dev structure list, so
that sysfs and /proc info files for the devices will reflect the new values.
Signed-off-by: Scott Teel <scott.stacy.teel@hp.com>
Acked-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Reduce confusion and inaccuracy caused by dated naming of vars and functions
referring to external target devices.
CURRENT NAMING:            PROPOSED NAMING:
"MSA2xxx devices"          "external target devices"
msa2xxx_model              ext_target_model
is_msa2xxx                 is_ext_target
add_msa2xxx_enclosure      add_ext_target_dev
nmsa2xxx_enclosures        n_ext_target_devs
Signed-off-by: Scott Teel <scott.teel@hp.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Driver limits SAS external target IDs to range 1-8.
Need to increase limit and clean up overlapping concepts of targets and paths
in the code.
There are several defined constants that control this:
HPSA_MAX_TARGETS_PER_CTLR 16
MAX_MSA2XXX_ENCLOSURES 32
HPSA_MAX_PATHS 8
We can condense this to one constant:
MAX_EXT_TARGETS 32
SAS switches allow for 8 connections, and there is capacity for 4 switches per
enclosure in largest blade enclosure type.
Signed-off-by: Scott Teel <scott.teel@hp.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
It should call hpsa_set_bus_target_lun rather
than individually setting bus, target and lun.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Some distros have a "rescan-scsi-bus.sh" script which depends on
SCSI REPORT LUNs not reporting something different than what the
driver tells the kernel, even if the driver uses scan_start and
scan_finished methods of the SCSI host template to override the
usual SCSI midlayer discovery code. Previously, 1 was added to
the LUN to make room to insert the RAID controller device at
LUN 0. Now, the RAID controller is moved to bus 3, and 1 is no
longer added to the LUN. However, SCSI REPORT LUNS on Smart Array
doesn't report physical devices like tape drives or auto-loaders
as it turns out, so those particular device types still won't match.
Generally the logical drives are reported first however, so at
least those should match.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Upgraded firmware on Smart Array P7xx (and some others) made them show up as
SCSI revision 5 devices and this caused the driver to fail to map MSA2xxx
logical drives to the correct bus/target/lun. A symptom of this would be that
the target ID of the logical drives as presented by the external storage array
is ignored, and all such logical drives are assigned to target zero,
differentiated only by LUN. Some multipath software reportedly does not deal
well with this behavior, failing to recognize different paths to the same
device as such.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Scott Teel <scott.teel@hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Sometimes, for testing purposes (e.g. testing rmmod on a system
that normally boots using hpsa) it's nice to rename the driver
and split it into two drivers and restrict it to certain
controllers. This makes that easier.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
hpsa_register_scsi just calls hpsa_scsi_detect. Move
the guts of hpsa_scsi_detect into hpsa_register_scsi and
get rid of hpsa_scsi_detect.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
We had both h->max_sg_entries and h->maxsgentries in the per controller
structure which is terribly confusing. max_sg_entries was really
just a constant, 32, which defines how big the "block fetch table"
is, which is as large as the max number of SG elements embedded
within a command (excluding SG elements in chain blocks).
MAXSGENTRIES was the constant used to denote the max number of SG
elements embedded within a command, also a poor name.
So renamed MAXSGENTRIES to SG_ENTRIES_IN_CMD, and removed
h->max_sg_entries and replaced it with SG_ENTRIES_IN_CMD.
h->maxsgentries is unchanged, and is the maximum number of sg
elements the controller will support in a command, including
those in chain blocks, minus 1 for the chain block pointer.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Note: this is a replacement patch for the issue pointed out in
http://www.gossamer-threads.com/lists/linux/kernel/1477270
Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
FC Discovery state machine fixes.
- Fix bug with driver returning the inactive ndlp (125743)
- Fix discovery problem when in pt2pt by copying old ndlp state before
state change (126887)
- Fix ndlp nodelist not empty wait timeout during driver unloading (127052)
Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
T10 DIF fixes and enhancements:
- Add SLI4 Lancer support for T10 DIF / BlockGuard (121980)
- Fix SLI4 BlockGuard behavior when protection data is generated by HBA (121980)
- Enhance debugfs for injecting T10 DIF errors (123966, 132966)
- Fix Incorrect usage of bghm for BlockGuard errors (127022)
Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
SLI related fixes:
- Fix REG_RPI fails on SLI4 HBA putting NPort into NPR state (126230)
- Fix ELS FDISC failing with local reject / invalid RPI. (126350)
- Fix reset port when reset is needed during fw_dump (125807)
- Fix unbounded firmware revision string from port causing panic (126560)
- Fix driver behavior when receiving an ADISC (126654)
- Fix driver not returning when bad ndlp found in abts error event
handling (126209)
- Add more driver logs in area of SLI4 port error attention and reset
recovery (126813, 124466)
- Fix failure in handling large CQ/EQ identifiers in an IOV
environment (126856)
- Fix for driver using duplicate RPIs after lancer port reset (126723)
- Clear vport->fc_myDID in lpfc_els_issue_fdisc to guarantee a
zero SID (126779, 126897)
- Fix for SLI4 Port delivery for BLS ABORT ACC (126289)
Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
BSG and User interface fixes:
- Fix driver processing an els command using 16Gb FC Adapter (126345)
- Change SLI4 FC port internal loopback to inner internal (126409)
- Fix bug with driver dump command type 4 using 16Gb FC Adapter (126406)
- Create character device to take a reference on the driver (126082)
Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
IO_XFER_ERROR_BREAK and IO_XFER_OPEN_RETRY_TIMEOUT handling lacks the
required actions outlined in the programming manual for the pm8001. Due to
the overlapping code requirements of these recovery responses, we found it
necessary to bundle them together into one patch.
When a break is received during the command phase (ssp_completion), this is a
result of a timeout or interruption on the bus. Logic suggests that we should
retry the command.
When a break is received during the data-phase (ssp_event), the task must be
aborted on the target or it will retain a data-phase lock turning the target
reticent to all future media commands yet will successfully respond to TUR,
INQUIRY and ABORT leading eventually to target failure through several
abort-cycle loops.
The open retry interval is exceedingly short resulting in occasional target
drop-off during expander resets or when targets push-back during bad-block
remapping. Increased effective timeout from 130ms to 1.5 seconds for each try
so as to trigger after the administrative inquiry/tur timeout in the scsi
subsystem to keep error-recovery harmonics to a minimum.
When an open retry timeout event is received, the action required by the
targets is to issue an abort for the outstanding command then logic suggests
we retry the command as this state is usually an indication of a credit block
or busy condition on the target.
We hijacked the pm8001_handle_event work queue handler so that it will handle
task as an argument instead of device for the workers in support of the
deferred handling outlined above.
Moderate to Heavy bad-path testing on a 2.6.32 vintage kernel, compile-testing
on scsi-misc-2.6 kernel ...
Signed-off-by: Mark Salyzyn <mark_salyzyn@xyratex.com>
Acked-by: Jack Wang <jack_wang@usish.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Jack noticed I dropped a patch fragment associated with a flags automatic
variable in mpi_set_phys_g3_with_ssc (ooops) and that the pre-emptive locking
that piggy-backed this patch was not in-fact necessary because of underlying
atomic accesses to the hardware. Here is the updated patch fixing these two
issues.
The pm8001 driver is missing the FUNC_GET_EVENTS handler in the phy control
function. Since the pm8001_bar4_shift function was not designed to be called
at runtime, added locking surrounding the adjustment for all accesses.
Signed-off-by: Mark Salyzyn <mark_salyzyn@xyratex.com>
Acked-by: Jack Wang <jack_wang@usish.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
pm8001_phy_control PHY_FUNC_HARD_RESET locks up on second try via
smp_phy_control because response HW_EVENT_PHY_START_STATUS fails to complete
previous command. The PM8001F_RUN_TIME flag is not treated as a bit, but a
state in all readers, yet once we are operational or in the run time state,
the flags use a bit-set operation.
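A minimal sketch of the mismatch described above, with hypothetical
MOCKF_* flag values. Readers compare the whole flags word against the
run-time value, so a writer that only ORs the run-time bit in leaves
the init-time bit set and the comparison fails. The assigning writer
is shown for contrast and is not claimed to be the exact change made
by this patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values for illustration. */
#define MOCKF_INIT_TIME	(1U << 0)
#define MOCKF_RUN_TIME	(1U << 1)

/* Writer that treats RUN_TIME as a bit: earlier bits stay set. */
static unsigned int enter_run_time_bitset(unsigned int flags)
{
	return flags | MOCKF_RUN_TIME;
}

/* Writer that treats RUN_TIME as a state: replaces the whole word. */
static unsigned int enter_run_time_assign(unsigned int flags)
{
	assert((flags & ~(MOCKF_INIT_TIME | MOCKF_RUN_TIME)) == 0);
	return MOCKF_RUN_TIME;
}

/* Reader that treats the flags word as a state, not a bit mask. */
static bool reader_sees_run_time(unsigned int flags)
{
	return flags == MOCKF_RUN_TIME;
}
```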
Signed-off-by: Mark Salyzyn <mark_salyzyn@xyratex.com>
Acked-by: Jack Wang <jack_wang@usish.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This implements basic power management for SCSI tapes.
Signed-off-by: Oliver Neukum <oneukum@suse.de>
Acked-by: Kai Mäkisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This patch (as1520) fixes a bug in the SCSI layer's power management
implementation.
LUN scanning can be carried out asynchronously in do_scan_async(), and
sd uses an asynchronous thread for the time-consuming parts of disk
probing in sd_probe_async(). Currently nothing coordinates these
async threads with system sleep transitions; they can and do attempt
to continue scanning/probing SCSI devices even after the host adapter
has been suspended. As one might expect, the outcome is not ideal.
This is what the "prepare" stage of system suspend was created for.
After the prepare callback has been called for a host, target, or
device, drivers are not allowed to register any children underneath
them. Currently the SCSI prepare callback is not implemented; this
patch rectifies that omission.
For SCSI hosts, the prepare routine calls scsi_complete_async_scans()
to wait until async scanning is finished. It might be slightly more
efficient to wait only until the host in question has been scanned,
but there's currently no way to do that. Besides, during a sleep
transition we will ultimately have to wait until all the host scanning
has finished anyway.
For SCSI devices, the prepare routine calls async_synchronize_full()
to wait until sd probing is finished. The routine does nothing for
SCSI targets, because asynchronous target scanning is done only as
part of host scanning.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
CC: <stable@kernel.org>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
In do_scan_async(), calling scsi_autopm_put_host(shost) may reference a
freed shost and cause a "Poison overwritten" warning.
This can happen when, for example, a USB device is disconnected just
as the do_scan_async() thread starts to run. The scsi_host_put() called
in scsi_finish_async_scan() then frees shost (because the
refcount of shost->shost_gendev decreases to 1 after the USB disconnect).
If shost is referenced again after that point, the system shows the
warning in the log below.
To make scsi_autopm_put_host(shost) always reference a valid shost,
put it just before scsi_host_put() in function
scsi_finish_async_scan().
[ 299.281565] =============================================================================
[ 299.281634] BUG kmalloc-4096 (Tainted: G I ): Poison overwritten
[ 299.281682] -----------------------------------------------------------------------------
[ 299.281684]
[ 299.281752] INFO: 0xffff880056c305d0-0xffff880056c305d0. First byte 0x6a instead of 0x6b
[ 299.281816] INFO: Allocated in scsi_host_alloc+0x4a/0x490 age=1688 cpu=1 pid=2004
[ 299.281870] __slab_alloc+0x617/0x6c1
[ 299.281901] __kmalloc+0x28c/0x2e0
[ 299.281931] scsi_host_alloc+0x4a/0x490
[ 299.281966] usb_stor_probe1+0x5b/0xc40 [usb_storage]
[ 299.282010] storage_probe+0xa4/0xe0 [usb_storage]
[ 299.282062] usb_probe_interface+0x172/0x330 [usbcore]
[ 299.282105] driver_probe_device+0x257/0x3b0
[ 299.282138] __driver_attach+0x103/0x110
[ 299.282171] bus_for_each_dev+0x8e/0xe0
[ 299.282201] driver_attach+0x26/0x30
[ 299.282230] bus_add_driver+0x1c4/0x430
[ 299.282260] driver_register+0xb6/0x230
[ 299.282298] usb_register_driver+0xe5/0x270 [usbcore]
[ 299.282337] 0xffffffffa04ab03d
[ 299.282364] do_one_initcall+0x47/0x230
[ 299.282396] sys_init_module+0xa0f/0x1fe0
[ 299.282429] INFO: Freed in scsi_host_dev_release+0x18a/0x1d0 age=85 cpu=0 pid=2008
[ 299.282482] __slab_free+0x3c/0x2a1
[ 299.282510] kfree+0x296/0x310
[ 299.282536] scsi_host_dev_release+0x18a/0x1d0
[ 299.282574] device_release+0x74/0x100
[ 299.282606] kobject_release+0xc7/0x2a0
[ 299.282637] kobject_put+0x54/0xa0
[ 299.282668] put_device+0x27/0x40
[ 299.282694] scsi_host_put+0x1d/0x30
[ 299.282723] do_scan_async+0x1fc/0x2b0
[ 299.282753] kthread+0xdf/0xf0
[ 299.282782] kernel_thread_helper+0x4/0x10
[ 299.282817] INFO: Slab 0xffffea00015b0c00 objects=7 used=7 fp=0x (null) flags=0x100000000004080
[ 299.282882] INFO: Object 0xffff880056c30000 @offset=0 fp=0x (null)
[ 299.282884]
...
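A minimal userspace sketch of the ordering issue, using a hypothetical
mock_host with a 'freed' flag instead of real memory release. Dropping
the last reference before the autopm put means the put touches a freed
host; doing the autopm put first keeps the host valid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted host. */
struct mock_host {
	int refcnt;
	int pm_usage;
	bool freed;	/* set instead of actually freeing memory */
};

/* Drop one reference; the last put "frees" the host. */
static void mock_host_put(struct mock_host *h)
{
	assert(h != NULL && h->refcnt > 0);
	if (--h->refcnt == 0)
		h->freed = true;
}

/* Returns false when called on a host that was already freed. */
static bool mock_autopm_put(struct mock_host *h)
{
	if (h->freed)
		return false;	/* this models the use-after-free */
	h->pm_usage--;
	return true;
}

/* Old order: the host may already be gone when autopm put runs. */
static bool finish_scan_old(struct mock_host *h)
{
	mock_host_put(h);
	return mock_autopm_put(h);
}

/* New order: autopm put runs while the reference is still held. */
static bool finish_scan_new(struct mock_host *h)
{
	bool ok = mock_autopm_put(h);

	mock_host_put(h);
	return ok;
}
```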
Signed-off-by: Huajun Li <huajun.li.lee@gmail.com>
Cc: stable@kernel.org
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
With IOs running and PegHalt testing the system reboots when memory reset is
performed during device initialization.
Signed-off-by: Shyam Sundar <shyam.sundar@qlogic.com>
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Complete the timed-out mailbox command before initiating another abort cycle
to recover, so that mailbox commands issued during the next reset cycle don't
fail due to a pending mailbox access timeout.
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Remove the check for a NULL fcport so that the host reset will run
unconditionally to unwedge any commands before the device is offlined and to
prevent a quick runthrough of the SCSI error handling.
Signed-off-by: Michael Christie <mchristi@redhat.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
ISP2200 adapters only have 24 mailbox registers so read only that many.
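A minimal sketch of the idea, not the driver's actual code: the register
count is chosen per adapter family and the read loop is bounded by it. The
names isp_family, mbx_count and read_mailboxes are hypothetical, and the
count of 32 for other adapters is an assumption made only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical adapter families; not the driver's real identifiers. */
enum isp_family { ISP_FAMILY_2200, ISP_FAMILY_OTHER };

#define MBX_COUNT_ISP2200 24 /* from the commit message */
#define MBX_COUNT_OTHER   32 /* assumed value, for illustration only */

/* Number of mailbox registers that exist on the given family. */
static size_t mbx_count(enum isp_family fam)
{
	return fam == ISP_FAMILY_2200 ? MBX_COUNT_ISP2200 : MBX_COUNT_OTHER;
}

/*
 * Copy mailbox registers into out[], never touching more registers than
 * the adapter has and never writing past out_len. Returns the number of
 * registers copied.
 */
static size_t read_mailboxes(enum isp_family fam, const uint16_t *regs,
			     uint16_t *out, size_t out_len)
{
	size_t n = mbx_count(fam);

	if (n > out_len)
		n = out_len;
	for (size_t i = 0; i < n; i++)
		out[i] = regs[i];
	return n;
}
```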
Reported-by: Olatunji Ruwase <oor@cs.cmu.edu>
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This can cause instability in mailbox command state machine handling.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Not clearing the options flags in mbx1 could lead the firmware
into interpreting old data in mbx1 through mbx8. This could
lead to inadvertent DMA read/write operations to stale memory.
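As a rough illustration only, the fix amounts to writing mbx1 explicitly
rather than leaving whatever a previous command stored there. The struct
mbx_cmd layout, MBX_REGS and mbx_prepare_no_options are hypothetical names,
not the driver's API.

```c
#include <stdint.h>

#define MBX_REGS 32 /* assumed array size, for illustration only */

/* Hypothetical mailbox command buffer that is reused between commands. */
struct mbx_cmd {
	uint16_t mb[MBX_REGS];
};

/*
 * Prepare a command that carries no options. mb[1] is cleared explicitly
 * so that stale option flags from an earlier command cannot make the
 * firmware treat mb[1]..mb[8] as valid parameters.
 */
static void mbx_prepare_no_options(struct mbx_cmd *mc, uint16_t opcode)
{
	mc->mb[0] = opcode;
	mc->mb[1] = 0; /* no option flags */
}
```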
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Many locations within the driver would use an inconsistent set of
checks to determine ISP-reset state. Consolidate the checks into
this inline-helper.
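The shape of such a helper might look like the sketch below. The flag names,
struct host_state and reset_active are hypothetical stand-ins; the real
driver's flags and helper name are not given in this message.

```c
#include <stdbool.h>

/* Hypothetical reset-related flag bits. */
#define FLAG_ABORT_NEEDED (1UL << 0)
#define FLAG_ABORT_ACTIVE (1UL << 1)
#define FLAG_ABORT_RETRY  (1UL << 2)

/* Hypothetical per-host state holding the flag word. */
struct host_state {
	unsigned long flags;
};

/*
 * Single place that answers "is an ISP reset pending or in progress?",
 * so every caller tests the same set of flags.
 */
static inline bool reset_active(const struct host_state *hs)
{
	return (hs->flags & (FLAG_ABORT_NEEDED | FLAG_ABORT_ACTIVE |
			     FLAG_ABORT_RETRY)) != 0;
}
```

Callers then write if (reset_active(hs)) instead of repeating a subset of
the individual flag tests.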
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
A NULL orom pointer was passed in for verification, which caused a page
fault. Set a default version when there is no orom struct.
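A sketch of the guard, under the assumption that the version is read from
the orom struct when one exists. struct oem_rom, DEFAULT_ROM_VERSION and
rom_version are hypothetical names and values, not the isci driver's.

```c
#include <stddef.h>

/* Hypothetical option-ROM descriptor. */
struct oem_rom {
	unsigned char version;
};

#define DEFAULT_ROM_VERSION 0x10 /* assumed default, for illustration */

/* Return the ROM version, or the default when no orom struct exists. */
static unsigned char rom_version(const struct oem_rom *orom)
{
	if (orom == NULL)
		return DEFAULT_ROM_VERSION; /* avoid dereferencing NULL */
	return orom->version;
}
```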
Reported-by: Dan Melnic <dan@seamicro.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
In some scenarios, an EEH error can take a long time to be detected, since the
driver issues an MMIO read only after a device reset command times out and we
try to reset the adapter. This patch adds some code in ipr_cancel_op() to read
a hardware register so we detect the error earlier in case the op is being
aborted because of a timeout caused by a frozen adapter slot.
Another problem in such scenarios is that in __ipr_eh_host_reset() we change the
dump state flag from WAIT_FOR_DUMP to GET_DUMP, and the flag is later changed
from GET_DUMP to READ_DUMP in ipr_reset_restore_cfg_space(). However, if
ipr_reset_restore_cfg_space() has already been called by the PCI EEH code by
the time the SCSI error handling calls __ipr_eh_host_reset(), we end up with
the flag in an inconsistent state. This patch also prevents this problem.
Signed-off-by: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com>
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
If ioc->pci_error_recovery is set, the goto out in
mpt2sas_base_hard_reset_handler() leads to unlocking the unheld
ioc->reset_in_progress_mutex.
The patch fixes the issue by jumping after the mutex_unlock() call.
Found by Linux Driver Verification project (linuxtesting.org).
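The control flow of the fix can be sketched as follows. This is a userspace
illustration with a toy lock that counts unlocks of an unheld mutex; struct
ctrl, hard_reset and the toy_* helpers are hypothetical, not the driver's
code.

```c
#include <stdbool.h>

/* Toy lock that records an unlock attempted while not held. */
struct toy_mutex {
	bool held;
	int bad_unlocks;
};

static void toy_lock(struct toy_mutex *m)
{
	m->held = true;
}

static void toy_unlock(struct toy_mutex *m)
{
	if (!m->held)
		m->bad_unlocks++;
	m->held = false;
}

/* Hypothetical controller state. */
struct ctrl {
	struct toy_mutex reset_mutex;
	bool pci_error_recovery;
	int resets_done;
};

static int hard_reset(struct ctrl *c)
{
	int r = 0;

	/* Early exit happens before the lock is taken ... */
	if (c->pci_error_recovery)
		goto out_unlocked;

	toy_lock(&c->reset_mutex);
	c->resets_done++;
	toy_unlock(&c->reset_mutex);

out_unlocked:
	/* ... so its label must sit after the unlock, not before it. */
	return r;
}
```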
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Acked-by: "Nandigama, Nagalakshmi" <Nagalakshmi.Nandigama@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
m68k: Fix assembler constraint to prevent overeager gcc optimisation
mac_esp: rename irq
mac_scsi: dont enable mac_scsi irq before requesting it
macfb: fix black and white modes
m68k/irq: Remove obsolete IRQ_FLG_* definitions
Fix up trivial conflict in arch/m68k/kernel/process_mm.c as per Geert.
Rename the "Mac ESP" irq as "ESP" to be consistent with all the other Mac
drivers and ESP drivers.
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Don't enable the SCSI irq when initialising the chip -- the irq has no
handler yet.
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6
SCSI updates on 20120118
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (49 commits)
[SCSI] libfc: remove redundant timer init for fcp
[SCSI] fcoe: Move fcoe_debug_logging from fcoe.h to fcoe.c
[SCSI] libfc: Declare local functions static
[SCSI] fcoe: fix regression on offload em matching function for initiator/target
[SCSI] qla4xxx: Update driver version to 5.02.00-k12
[SCSI] qla4xxx: Cleanup modinfo display
[SCSI] qla4xxx: Update license
[SCSI] qla4xxx: Clear the RISC interrupt bit during FW init
[SCSI] qla4xxx: Added error logging for firmware abort
[SCSI] qla4xxx: Disable generating pause frames in case of FW hung
[SCSI] qla4xxx: Temperature monitoring for ISP82XX core.
[SCSI] megaraid: fix sparse warnings
[SCSI] sg: convert to kstrtoul_from_user()
[SCSI] don't change sdev starvation list order without request dispatched
[SCSI] isci: fix, prevent port from getting stuck in the 'configuring' state
[SCSI] isci: fix start OOB
[SCSI] isci: fix io failures while wide port links are coming up
[SCSI] isci: allow more time for wide port targets
[SCSI] isci: enable wide port targets
[SCSI] isci: Fix IO fails when pull cable from phy in x4 wideport in MPC mode.
...