We recently changed the locking in this function, but this return was
missed. It needs an unlock and the IRQs need to be restored.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Without this patch, scsi_show_result prints hostbyte as invalid for statuses
that are not defined in hostbyte_table (when scsi logging is enabled).
Signed-off-by: Babu Moger <babu.moger@netapp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
I think ioremap() ends up being equivalent to ioremap_nocache
by default, but we should signal our intent that these mappings
should be non-cacheable.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
In the abort handler, when asked to abort a command which
is not known to the driver, SUCCESS is returned, but the
diagnostic message incorrectly indicates the abort failed.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
It turns out Smart Array logical drives do not support target
reset and when the target reset fails, the logical drive will
be taken off line. Symptoms look like this:
hpsa 0000:03:00.0: Abort request on C1:B0:T0:L0
hpsa 0000:03:00.0: resetting device 1:0:0:0
hpsa 0000:03:00.0: cp ffff880037c56000 is reported invalid (probably means target device no longer present)
hpsa 0000:03:00.0: resetting device failed.
sd 1:0:0:0: Device offlined - not ready after error recovery
sd 1:0:0:0: rejecting I/O to offline device
EXT3-fs error (device sdb1): read_block_bitmap:
LUN reset is supported though, and is what we should be using.
Target reset is also disruptive in shared SAS situations,
for example, an external MSA1210m which does support target
reset attached to Smart Arrays in multiple hosts -- a target
reset from one host is disruptive to other hosts as all LUNs
on the target will be reset and will abort all outstanding i/os
back to all the attached hosts. So we should use LUN reset,
not target reset.
Tested this with Smart Array logical drives and with tape drives.
Not sure how this bug survived since 2009, except it must be very
rare for a Smart Array to require more than 30s to complete a request.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
The scsi netlink code confuses the netlink port id with a process id,
going so far as to read NETLINK_CREDS(skb)->pid instead of the correct
NETLINK_CB(skb).pid. Fortunately it does not matter because nothing
registers to respond to scsi netlink requests.
The only interesting use of the scsi_netlink interface is
fc_host_post_vendor_event which sends a netlink multicast message.
Since nothing registers to handle scsi netlink messages kill all of the
registration logic, while retaining the same error handling behavior
preserving the userspace visible behavior and removing all of the
confused code that thought a netlink port id was a process id.
This was tested with a kernel allyesconfig build which had no problems.
Cc: James Bottomley <James.Bottomley@parallels.com>
Cc: James Smart <James.Smart@Emulex.Com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
isci update for 3.6
1/ Fix the workaround for drives that have a slow response to COMSAS.
Drives with this problem intermittently take a long time to be
identified, or fail to be identified altogether.
2/ A minor fix for the efi variable code failure path
3/ A handful of smatch fixups from Dan Carpenter
* pci/stephen-const:
make drivers with pci error handlers const
scsi: make pci error handlers const
netdev: make pci_error_handlers const
PCI: Make pci_error_handlers const
It is a frequent mistake to confuse the netlink port identifier with a
process identifier. Try to reduce this confusion by renaming fields
that hold port identifiers portid instead of pid.
I have carefully avoided changing the structures exported to
userspace to avoid changing the userspace API.
I have successfully built an allyesconfig kernel with this change.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch defines netlink_kernel_create as a wrapper function of
__netlink_kernel_create to hide the struct module *me parameter
(which seems to be THIS_MODULE in all existing netlink subsystems).
Suggested by David S. Miller.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We have an old FIXME in reg.h which points out that we should standardise
on PVR_foo for our PVR #defines. Currently we use PVR_ on 32-bit and PV_
on 64-bit.
So do that rename and remove the FIXME.
Seeing as we're touching all but one usage of __is_processor(), rename it
to something less ugly and more indicative of what it does, which is
simply to check the PVR version.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Remove PCI vendor and subvendor IDs, as they are already defined in
pci_ids.h.
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
This allows a user to adjust the wait time in seconds after I/O timeout before
resetting the adapter.
Signed-off-by: Adam Radford <aradford@gmail.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This allows a user to adjust the queue depth of the adapter when throttled due
to I/O timeout.
Signed-off-by: Adam Radford <aradford@gmail.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Reduce the amount of time the host lock is held in the interrupt handler
for improved performance.
[jejb: fix up checkpatch noise]
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Reduce the amount of time the host lock is held in queuecommand
for improved performance.
[jejb: fix up checkpatch noise]
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
When device discovery is disabled during driver load time using module
parameter "disable_discovery=1" and when diag reset is issued then from logs,
it is observed that the devices get added, removed and then added with new
target ids.
So, in order to limit this turn-off the code which is deleting and devices
across host reset when the disable_discovery module parameter is turned on.
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This patch provides a command line option to disable "Port enable" during
the driver load.
The objective of this command line option is to load the driver and do
all the necessary initialization excluding port enable(i.e. delay
device discovery)
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Changeset in MPI 2.0 Rev V(2.0.14) specification
1) Bumped MPI2_HEADER_VERSION_UNIT.
2) Added a product specific range to event values.
3) Added clarification to Direct-Attached SAS PHY Power condition.
4) Updated timing requirements for performing Hard Reset.
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
When specifying the command line option "max_sectors" less than 64, then
warning message should provide correct upper boundary value 32767 instead of
8192.
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
A new sysfs shost attribute called "BMR_status" is implemented to
report Backup Rail Monitor status.
This attribute is located in the path
/sys/class/scsi_host/host#/BMR_status
when reading this adapter attribute, then driver will output the state
of GPIO[24]. It returns "0" if BMR is healthy and it returns "1" for failure.
if it returns an empty string then it means that there was an error while
obtaining the BMR status. Then check dmesg for what error has occured.
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
libsas and ipr pass flags to ata_host_init that are meant for the port.
ata_host flags:
ATA_HOST_SIMPLEX = (1 << 0), /* Host is simplex, one DMA channel per host only */
ATA_HOST_STARTED = (1 << 1), /* Host started */
ATA_HOST_PARALLEL_SCAN = (1 << 2), /* Ports on this host can be scanned in parallel */
ATA_HOST_IGNORE_ATA = (1 << 3), /* Ignore ATA devices on this host. */
flags passed by libsas:
ATA_FLAG_SATA = (1 << 1),
ATA_FLAG_PIO_DMA = (1 << 7), /* PIO cmds via DMA */
ATA_FLAG_NCQ = (1 << 10), /* host supports NCQ */
The only one that aliases is ATA_HOST_STARTED which is a 'don't care' in
the libsas and ipr cases since ata_hosts from these sources are not
registered with libata.
Reported-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Brian King <brking@us.ibm.com>
Acked-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Provide a "simple-dev-pm-ops" implementation that shuts down the domain
and the device on suspend, and resumes the device and the domain on
resume. All of the mechanics of restoring domain connectivity are
handled by libsas once isci has notified libsas that all links should be
back up. libsas is in charge of handling links that did not resume, or
resumed out of order.
Signed-off-by: Artur Wojcik <artur.wojcik@intel.com>
Signed-off-by: Jacek Danecki <jacek.danecki@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
libsas power management routines to suspend and recover the sas domain
based on a model where the lldd is allowed and expected to be
"forgetful".
sas_suspend_ha - disable event processing allowing the lldd to take down
links without concern for causing hotplug events.
Regardless of whether the lldd actually posts link down
messages libsas notifies the lldd that all
domain_devices are gone.
sas_prep_resume_ha - on the way back up before the lldd starts link
training clean out any spurious events that were
generated on the way down, and re-enable event
processing
sas_resume_ha - after the lldd has started and decided that all phys
have posted link-up events this routine is called to let
libsas start it's own timeout of any phys that did not
resume. After the timeout an lldd can cancel the
phy teardown by posting a link-up event.
Storage for ex_change_count (u16) and phy_change_count (u8) are changed
to int so they can be set to -1 to indicate 'invalidated'.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jacek Danecki <jacek.danecki@intel.com>
Tested-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This is a particularly nasty SCSI ATA Translation Layer (SATL) problem.
SAT-2 says (section 8.12.2)
if the device is in the stopped state as the result of
processing a START STOP UNIT command (see 9.11), then the SATL
shall terminate the TEST UNIT READY command with CHECK CONDITION
status with the sense key set to NOT READY and the additional
sense code of LOGICAL UNIT NOT READY, INITIALIZING COMMAND
REQUIRED;
mpt2sas internal SATL seems to implement this. The result is very confusing
standby behaviour (using hdparm -y). If you suspend a drive and then send
another command, usually it wakes up. However, if the next command is a TEST
UNIT READY, the SATL sees that the drive is suspended and proceeds to follow
the SATL rules for this, returning NOT READY to all subsequent commands. This
means that the ordering of TEST UNIT READY is crucial: if you send TUR and
then a command, you get a NOT READY to both back. If you send a command and
then a TUR, you get GOOD status because the preceeding command woke the drive.
This bit us badly because
commit 85ef06d1d2
Author: Tejun Heo <tj@kernel.org>
Date: Fri Jul 1 16:17:47 2011 +0200
block: flush MEDIA_CHANGE from drivers on close(2)
Changed our ordering on TEST UNIT READY commands meaning that SATA drives
connected to an mpt2sas now suspend and refuse to wake (because the mpt2sas
SATL sees the suspend *before* the drives get awoken by the next ATA command)
resulting in lots of failed commands.
The standard is completely nuts forcing this inconsistent behaviour, but we
have to work around it.
The fix for this is twofold:
1. Set the allow_restart flag so we wake the drive when we see it has been
suspended
2. Return all TEST UNIT READY status directly to the mid layer without any
further error handling which prevents us causing error handling which
may offline the device just because of a media check TUR.
Reported-by: Matthias Prager <linux@matthiasprager.de>
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley <JBottomley@Parallels.com>