If a device gets offlined as a result of the Inquiry sent
during scanning, the following oops can occur. After the
disk gets put into the SDEV_OFFLINE state, the error handler
sends back the failed inquiry, which wakes the thread doing
the scan. This starts a race between the scanning thread
freeing the scsi device and the error handler calling
scsi_run_host_queues to restart the host. Since the disk
is in the SDEV_OFFLINE state, scsi_device_get will still
work, which results in __scsi_iterate_devices getting
a reference to the scsi disk when it shouldn't.
The following execution thread causes the oops:
CPU 0 (scan) CPU 1 (eh)
---------------------------------------------------------
scsi_probe_and_add_lun
....
scsi_eh_offline_sdevs
scsi_eh_flush_done_q
scsi_destroy_sdev
scsi_device_dev_release
scsi_restart_operations
scsi_run_host_queues
__scsi_iterate_devices
get_device
scsi_device_dev_release_usercontext
scsi_run_queue
<---OOPS--->
The patch fixes this by changing the state of the sdev to SDEV_DEL
before doing the final put_device, which should prevent the race
from occurring.
Original oops follows:
Badness in kref_get at lib/kref.c:32
Call Trace:
[C00000002F4476D0] [C00000000000EE20] .show_stack+0x68/0x1b0 (unreliable)
[C00000002F447770] [C00000000037515C] .program_check_exception+0x1cc/0x5a8
[C00000002F447840] [C00000000000446C] program_check_common+0xec/0x100
Exception: 700 at .kref_get+0x10/0x28
LR = .kobject_get+0x20/0x3c
[C00000002F447B30] [C00000002F447BC0] 0xc00000002f447bc0 (unreliable)
[C00000002F447BB0] [C000000000254BDC] .get_device+0x20/0x3c
[C00000002F447C30] [D000000000063188] .scsi_device_get+0x34/0xdc [scsi_mod]
[C00000002F447CC0] [D0000000000633EC] .__scsi_iterate_devices+0x50/0xbc [scsi_mod]
[C00000002F447D60] [D00000000006A910] .scsi_run_host_queues+0x34/0x5c [scsi_mod]
[C00000002F447DF0] [D000000000069054] .scsi_error_handler+0xdb4/0xe44 [scsi_mod]
[C00000002F447EE0] [C00000000007B4E0] .kthread+0x128/0x178
[C00000002F447F90] [C000000000025E84] .kernel_thread+0x4c/0x68
Unable to handle kernel paging request for <7>PCI: Enabling device: (0002:41:01.1), cmd 143
data at address 0x000001b8
Faulting instruction address: 0xd0000000000698e4
sym1: <1010-66> rev 0x1 at pci 0002:41:01.1 irq 216
sym1: No NVRAM, ID 7, Fast-80, LVD, parity checking
sym1: SCSI BUS has been reset.
scsi2 : sym-2.2.2
cpu 0x0: Vector: 300 (Data Access) at [c00000002f447a30]
pc: d0000000000698e4: .scsi_run_queue+0x2c/0x218 [scsi_mod]
lr: d00000000006a904: .scsi_run_host_queues+0x28/0x5c [scsi_mod]
sp: c00000002f447cb0
msr: 9000000000009032
dar: 1b8
dsisr: 40000000
current = 0xc0000000045fecd0
paca = 0xc00000000048ee80
pid = 1123, comm = scsi_eh_1
enter ? for help
[c00000002f447d60] d00000000006a904 .scsi_run_host_queues+0x28/0x5c [scsi_mod]
[c00000002f447df0] d000000000069054 .scsi_error_handler+0xdb4/0xe44 [scsi_mod]
[c00000002f447ee0] c00000000007b4e0 .kthread+0x128/0x178
[c00000002f447f90] c000000000025e84 .kernel_thread+0x4c/0x68
Signed-off-by: Brian King <brking@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This is a resend of a patch I generated in response to an email sent
by Ruben Faelens <parasietje@gmail.com>. His original email to
linux-scsi requested a method in which he could spin down a scsi disk
when not in use and have the kernel automatically spin it back up when
an I/O was generated to the disk. The infrastructure to automatically
spin a disk up has been in the scsi error handler for some time now,
but it is not enabled by default. This patch adds an sd sysfs attribute
which allows userspace to enable this behavior.
Signed-off-by: Brian King <brking@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Original post was incorrect as it didn't realize that we already had
a self-referenc due to device_initialize(), and we were really only
missing the put on our own reference. This was hidden by the other bug
which had the midlayer reusing stargets after they were already free,
which was doing too many puts on our rport.
Updating FC transport for:
- Add put in fc_rport_final_delete(), to release the rport.
Prior, we were leaving the rport with a reference, thus the shost
with references, etc. If the driver was unloaded, shosts and rports
remained, along with work threads, etc
- Fix fc_rport_create failure path - too many put's on parent
- Add commenting to easily track ref taking.
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Updated patch to address comments from Pat Mansfield and Michael Reed:
Bumped max to 600 (10mins). Set default dev_loss_tmo to a value other
than the max (30s).
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
In a prior posting to linux-scsi on the fc transport and workq
deadlocks, we noted a second error that did not have a patch:
http://marc.theaimsgroup.com/?l=linux-scsi&m=114467847711383&w=2
- There's a deadlock where scsi_remove_target() has to sit behind
scsi_scan_target() due to contention over the scan_lock().
Subsequently we posted a request for comments about the deadlock:
http://marc.theaimsgroup.com/?l=linux-scsi&m=114469358829500&w=2
This posting resolves the second error. Here's what we now understand,
and are implementing:
If the lldd deletes the rport while a scan is active, the sdev's queue
is blocked which stops the issuing of commands associated with the scan.
At this point, the scan stalls, and does so with the shost->scan_mutex held.
If, at this point, if any scan or delete request is made on the host, it
will stall waiting for the scan_mutex.
For the FC transport, we queue all delete work to a single workq.
So, things worked fine when competing with the scan, as long as the
target blocking the scan was the same target at the top of our delete
workq, as the delete workq routine always unblocked just prior to
requesting the delete. Unfortunately, if the top of our delete workq
was for a different target, we deadlock. Additionally, if the target
blocking scan returned, we were unblocking it in the scan workq routine,
which really won't execute until the existing stalled scan workq
completes (e.g. we're re-scheduling it while it is in the midst of its
execution).
This patch moves the unblock out of the workq routines and moves it to
the context that is scheduling the work. This ensures that at some point,
we will unblock the target that is blocking scan. Please note, however,
that the deadlock condition may still occur while it waits for the
transport to timeout an unblock on a target. Worst case, this is bounded
by the transport dev_loss_tmo (default: 30 seconds).
Finally, Michael Reed deserves the credit for the bulk of this patch,
analysis, and it's testing. Thank you for your help.
Note: The request for comments statements about the gross-ness of the
scan_mutex still stand.
Signed-off-by: Michael Reed <mdr@sgi.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This removes the duplicate functionality which had been added to
the lpfc driver.
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The scsi midlayer portion of the patch
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This may seem like a DILLIGAF, but after chatting with the F/W folks,
there is no harm in dropping the page calculation as denoted in the
enclosed patch for these older adapters in this new age of 4GB+ memory
sticks. Any resource optimization within the old-old-old adapters for
systems with less than 4G of memory is of little consequence. The
existing AAC_QUIRK_31BIT flag in linit.c should look after the rest of
the legacy hardware DMA limitations.
Signed-off-by: Mark Salyzyn <aacraid@adaptec.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The scsi layer is already calling add_disk_randomness in scsi_end_request.
Signed-off-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
We currently stuff a truncated size into the geometry logic and return the
result which can produce bizarre reports for a 4Tb array. Since that
mapping logic isn't useful for disks that big don't try and map this way at
all.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
From: Randy Dunlap <rdunlap@xenotime.net>
Fix sparse warnings: use NULL instead of 0 for pointers:
drivers/scsi/lpfc/lpfc_els.c:827:56: warning: Using plain integer as NULL pointer
drivers/scsi/lpfc/lpfc_els.c:2781:18: warning: Using plain integer as NULL pointer
drivers/scsi/lpfc/lpfc_els.c:2782:18: warning: Using plain integer as NULL pointer
drivers/scsi/lpfc/lpfc_init.c:951:21: warning: Using plain integer as NULL pointer
drivers/scsi/lpfc/lpfc_init.c:956:20: warning: Using plain integer as NULL pointer
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Acked-by: James Smart <James.Smart@Emulex.Com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Original code incorrectly assigned it to the driver's
link-down-timeout value (a value in seconds).
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Also remove qla2xxx_probe_one/qla2xxx_remove_one stubs previously
used with external firmware module loaders.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
As there is no point in failing the initialization process when
firmware informs the host software that it could not transition
beyond a CONFIG_WAIT nor WAIT_FOR_LOGIN state. Previous logic
would mark such conditions as a general *failure* and subsequently
tear-down the scsi-host during initialization.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Similar in form to QLogic's standard offering -- via
the 'extended_error_logging' module parameter.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
- macro usage statements should terminate with a ';'
- remove unused macros.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The host section of ISP24xx NVRAMs contain a new bit which
allows a user to selectively disable ports of an HBA. These
ports (hosts) will not be presented to the midlayer.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
- Defer firmware dump-data raw-to-textual conversion to
user-space.
- Add module parameter (ql2xallocfwdump) to allow for per-HBA
allocations of firmware dump memory.
- Dump request and response queue data as per firmware group
request.
- Add extended firmware trace support for ISP24XX/ISP54XX chips.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
We have to be able to remove SCSI devices even when they are suspended, so
QUIESCE -> CANCEL must be a legal state transition. This patch (as727)
adds the transition to the state machine.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch simplifies "good_bytes" computation in sd_rw_intr().
sd: "good_bytes" computation is always done in terms of the resolution
of the device's medium, since after that it is the number of good bytes
we pass around and other layers/contexts (as opposed ot sd) can translate
that to their own resolution (block layer:512). It also makes
scsi_io_completion() processing more straightforward, eliminating the
3rd argument to the function.
It also fixes a couple of bugs like not checking return value,
using "break" instead of "return;", etc.
I've been running with this patch for some time now on a
test (do-it-all) system.
Signed-off-by: Luben Tuikov <ltuikov@yahoo.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Even with the latest fixes aic79xx still occasionally triggers the
BUG_ON in slave_destroy. Rather than trying to figure out the various
levels of interaction here I've decided to remove the callback altogether.
The primary reason for the slave_alloc / slave_destroy is to keep an
index of pointers to the sdevs associated with a given target.
However, by changing the arguments to the affected functions slightly
it's possible to avoid the use of that index entirely.
The only performance penalty we'll incur is in writing the
information for /proc/scsi/XXX, as we'll have to recurse over all
available sdevs to find the correct ones. But I doubt that reading
from /proc is in any way time-critical.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
According to Anthony Cheung all HP XP arrays with "OPEN-"
types support REPORT_LUN. So there is no reason why we
shouldn't use it.
Signed-off-by: Anthony Cheung <anthony.cheung@hp.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The patch adds support for a ZCR controller (Device ID : 0x413).
It also has a critical bug fix :
Disable controller interrupt before firing INIT cmd to FW. Interrupt
is enabled after required initialization is over. This is done to
ensure that driver is ready to handle interrupts when it is generated
by the controller.
Signed-off-by: Sumant Patro <Sumant.Patro@lsil.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch fixes a condition where ibmvscsi treats a transport error as a
"busy" condition, so no errors were returned to the scsi mid-layer.
In a RAID environment this means that I/O hung rather than failing
over.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
- add 'virtual_gb' parameter to simulate large storage
(by wrapping in dev_size_mb megabytes of actual ram)
- add 'no_lun_0' parameter to skip lun 0 on each target
(but still respond as required to INQUIRY + REPORT LUNS)
- add well know lu support
- add MODE SELECT commands support [pages: 0xa and 0x1c]
- add LOG SENSE command support [pages: 0xd and 0x2f]
- add READ CAPACITY (16) support
- increase number of mode pages supported (to read),
mainly transport specific (SAS) mode (sub)pages
- add more VPD pages and extend others, including
ATA information VPD page
- START STOP UNIT now maintains a state machine
- READ (16) and WRITE (16) cope with lbas larger
than 32 bits (needed for the 'virtual_gb' parameter)
- allow single command transfers up to 32 MB
- more precise error (sense data) messages
Signed-off-by: Douglas Gilbert <dougg@torque.net>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
binfmt_flat.c calls set_personality with PER_LINUX as the personality.
On the arm architecture this results in the program running in 26bit
usermode. PER_LINUX_32BIT should be used instead. This doesn't affect
other architectures that use binfmt_flat.
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Change enable_irq() macro to be a statement, not expression.
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix PLL setting for the Coldfire 5249 CPU. This brings it into line with
the new style frequency configuration of m68knommu parts.
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix flush code for the ColdFire 5206/5206e/5272 cases.
Add support for the new ColdFire 532x CPU family
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Here is a patch to the system call handling for 5307/5272/etc to:
- fix the strace support (one tested the wrong bit)
- make all system calls a little bit faster by inlining set_esp0 and
supporting ENOSYS out of the critical path.
- remove extraneous spaces
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch solve a bug triggered by execvp (this function use calloc to
store the argument list and gcc 3.4.x align the stack to word, not to dword).
This situation aren't related to signal handling and all 2.6.x have the bug.
On ColdFire targets we must force the stack to be aligned.
Original patch from Andrea Tarani <andrea.tarani@gilbarco.com>,
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Remove list of fixed clock frequency options used for configuring master
clock, and make field an int. Much more flexible this way, no need to add
more options for every new used freqency.
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Remove list of fixed clock frequency options used for configuring master
clock, and make field an int. Much more flexible this way, no need to add
more options for every new used freqency.
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add support for the AVNET 5282 board.
Patch submitted by Daniel Alomar <dalomar@serrasold.com>.
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add configure support for the new Freescale 532x family of CPUs.
Patch submitted by Matt Waddel <Matt.Waddel@freescale.com>.
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This reverts commits
3e3318dee0 [PATCH] swsusp: x86_64 mark special saveable/unsaveable pages
b6370d96e0 [PATCH] swsusp: i386 mark special saveable/unsaveable pages
ce4ab0012b [PATCH] swsusp: add architecture special saveable pages support
because not only do they apparently cause page faults on x86, the
infrastructure doesn't compile on powerpc.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add build support for the new Freescale 532x CPU platforms.
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add support for the UART addressing on the new Freescale M532x CPU family.
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add build support for new Freescale M532x CPU family timer.
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
A cleanup of m68knommu/kernel/setup.c :
- No need to initialize global pointers to NULL, they will have that value
automatically, and they eat up space in my data segment image in FLASH.
- Remove get_cpuinfo. It has been replaced by show_cpuinfo.
Signed-off-by: Philippe De Muyter <phdm@macqel.be>
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Don't rely on DEBUG having a value, check for it being defined.
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Include the ColdFire 532x support when including ColdFire peripharp
support definitions.
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>