Calling a ccw driver's notify function without the ccw device lock
held opens up a race window between discovery and handling of a change
in the device operational state. As a result, the device driver may
encounter unexpected device malfunction, leading to out-of-retry
situations or similar.
Remove race by extending the ccw device lock from state change
discovery to the calling of the notify function.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Extend the scsw data structure to the format required by fcx. Also
provide helper functions for easier access to fields which are present
in both the traditional as well as the modified format.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Use a generic wait_queue to prevent the wait_queue in dasd_sleep_on_
functions from being referenced by callback_data while it does not
exist any more.
Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
When the dasd_int_handler is called with an error code instead of
an irb, the associated request should be restarted. This handling
was missing from the -ETIMEDOUT case. In fact it should be done in
any case.
Signed-off-by: Stefan Weinhuber <wein@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Most noteable part of this commit is the new local header file entry.h
which contains all the function declarations of functions that get only
called from asm code or are arch internal. That way we can avoid extern
declarations in C files.
This is more or less the same that was done for sparc64.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
__FUNCTION__ is gcc-specific, use __func__
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
I compiled the kernel without deadline, and the dasd code exits the old
scheduler (CFQ), fails to load the new one (deadline), and then things just
hang - with one of these (sorry about the weird chars - I copy & pasted it
from a 3270 console):
dasd(eckd): 0.0.0151: 3390/0A(CU:3990/01) Cyl:3338 Head:15 Sec:224
------------ cut here ------------
Badness at kernel/mutex.c:134
Modules linked in: dasd_eckd_mod dasd_mod
CPU: 0 Not tainted 2.6.25-rc3 #9
Process exe (pid: 538, task: 000000000d172000, ksp: 000000000d21ef88)
Krnl PSW : 0404000180000000 000000000022fb5c (mutex_lock_nested+0x2a4/0x2cc)
R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:0 CC:0 PM:0 EA:3
Krnl GPRS: 0000000000024218 000000000076fc78 0000000000000000 000000000000000f
000000000022f92e 0000000000449898 000000000f921c00 000003e000162590
00000000001539c4 000000000d172000 070000007fffffff 000000000d21f400
000000000f8f2560 00000000002413f8 000000000022fb44 000000000d21f400
Krnl Code: 000000000022fb50: bf2f1000 icm %r2,15,0(%r1)
000000000022fb54: a774fef6 brc 7,22f940
000000000022fb58: a7f40001 brc 15,22fb5a
>000000000022fb5c: a7f4fef2 brc 15,22f940
000000000022fb60: c0e5fffa112a brasl %r14,171db4
000000000022fb66: 1222 ltr %r2,%r2
000000000022fb68: a784fedb brc 8,22f91e
000000000022fb6c: c010002a0086 larl %r1,76fc78
Call Trace:
(<000000000022f92e> mutex_lock_nested+0x76/0x2cc)
<00000000001539c4> elevator_exit+0x38/0x80
<0000000000156ffe> blk_cleanup_queue+0x62/0x7c
<000003e0001d5414> dasd_change_state+0xe0/0x8ec
<000003e0001d5cae> dasd_set_target_state+0x8e/0x9c
<000003e0001d5f74> dasd_generic_set_online+0x160/0x284
<000003e00011e83a> dasd_eckd_set_online+0x2e/0x40
<0000000000199bf4> ccw_device_set_online+0x170/0x2c0
<0000000000199d9e> online_store_recog_and_online+0x5a/0x14c
<000000000019a08a> online_store+0xbe/0x2ec
<000000000018456c> dev_attr_store+0x38/0x58
<000000000010efbc> sysfs_write_file+0x130/0x190
<00000000000af582> vfs_write+0xb2/0x160
<00000000000afc7c> sys_write+0x54/0x9c
<0000000000025e16> sys32_write+0x2e/0x50
<0000000000024218> sysc_noemu+0x10/0x16
<0000000077e82bd2> 0x77e82bd2
Set elevator pointer to NULL in order to avoid double elevator_exit
calls when elevator_init call for deadline iosched fails.
Also make sure the dasd device driver depends on IOSCHED_DEADLINE so
the default IO scheduler of the dasd driver is present.
Signed-off-by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
After setting the status of the cqr and releasing the lock for the
block cqr queue, we call the cqr callback function, which will usually
just trigger the dasd_block_tasklet. But when the tasklet is already
running the cqr might be processed before we invoke the callback
function. In rare cases the callback pointer may already be invalid
by the time we want to call it, which will result in a panic.
Solution: Call the callback function first and then release the lock.
Signed-off-by: Stefan Weinhuber <wein@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
When an alias device is set offline while it is in use this may
result in a panic in the cleanup part of the dasd_block_tasklet.
The problem here is that there may exist some ccw requests that were
originally created for the alias device and transferred to the base
device when the alias was set offline. When these request are
cleaned up later, the discipline pointer in the alias device may not
be valid anymore. To fix this use the base device discipline to find
the cleanup function.
Signed-off-by: Stefan Weinhuber <wein@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Adding interface control check (ifcc) handling in error recovery.
First retry up to 255 times and if all retries fail try an alternate
path if possible.
Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This patch converts s390 to use blk_end_request interfaces.
Related 'uptodate' arguments are converted to 'error'.
As a result, the interfaces of internal functions below are changed:
o dasd_end_request
o tapeblock_end_request
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux390@de.ibm.com
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Add time to the 'expires' value to avoid a loop caused by the cqr
termination function
Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Parallel access volumes (PAV) is a storage server feature, that allows
to start multiple channel programs on the same DASD in parallel. It
defines alias devices which can be used as alternative paths to the
same disk. With the old base PAV support we only needed rudimentary
functionality in the DASD device driver. As the mapping between base
and alias devices was static, we just had to export an identifier
(uid) and could leave the combining of devices to external layers
like a device mapper multipath.
Now hyper PAV removes the requirement to dedicate alias devices to
specific base devices. Instead each alias devices can be combined with
multiple base device on a per request basis. This requires full
support by the DASD device driver as now each channel program itself
has to identify the target base device.
The changes to the dasd device driver and the ECKD discipline are:
- Separate subchannel device representation (dasd_device) from block
device representation (dasd_block). Only base devices are block
devices.
- Gather information about base and alias devices and possible
combinations.
- For each request decide which dasd_device should be used (base or
alias) and build specific channel program.
- Support summary unit checks, which allow the storage server to
upgrade / downgrade between base and hyper PAV at runtime (support
is mandatory).
Signed-off-by: Stefan Weinhuber <wein@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Using the return value of ccw_device_set_online as return value for
dasd_generic_probe() causes the DASD to fail setting online
Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Some of the code has been gradually transitioned to using the proper
struct request_queue, but there's lots left. So do a full sweet of
the kernel and get rid of this typedef and replace its uses with
the proper type.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Instead of the deprecated read_dev_chars() and read_conf_data_lpm(),
implement dasd_generic_read_dev_chars() and dasd_eckd_read_conf_lpm().
These should even recover better from error than the original cio
functions.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This patch adds a sysfs-attribute 'status' to make the DASD device-status
accessible from user-space. In addition, the DASD driver generates an
uevent(CHANGE) for the ccw-device on each device-status change.
This enables user-space applications (e.g. udev) to do related processing.
Signed-off-by: Horst Hummel <horst.hummel@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This patch adds support for clock synchronization to an external time
reference (ETR). The external time reference sends an oscillator
signal and a synchronization signal every 2^20 microseconds to keep
the TOD clocks of all connected servers in sync. For availability
two ETR units can be connected to a machine. If the clock deviates
for more than the sync-check tolerance all cpus get a machine check
that indicates that the clock is out of sync. For the lovely details
how to get the clock back in sync see the code below.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The reserve/release IOCTLs sometimes do not work. If second system
does a 'steal lock' the pending unit check (Format 3 Msg F) is
delivered. Since ERP is disabled for reserve/release, the IOCTL call
fails. We have to allow basic ERP (retries) for reserve/release IOCTLs.
Signed-off-by: Horst Hummel <horst.hummel@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
It is now possible to enable/disable ERP related logging without re-compile
and re-ipl. A additional sysfs-attribute 'erplog' allows to switch the
logging non-interruptive.
Signed-off-by: Horst Hummel <horst.hummel@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
In case a request timed out and termination did not work, the console was
flooded with retry messages (every 1/10s). Now we use a 5s delay per retry and
generate a more precise message.
Signed-off-by: Horst Hummel <horst.hummel@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Clean up dasd timer when when a dasd device is set offline.
Signed-off-by: Stefan Weinhuber <wein@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Enhanced default DBF level to get most important messages
in debug feature files.
Signed-off-by: Horst Hummel <horst.hummel@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The dasd_device_from_cdev function is called from interrupt context
to get the struct dasd_device associated with a ccw device. The
driver_data of the ccw device points to the dasd_devmap structure
which contains the pointer to the dasd_device structure. The lock
that protects the dasd_devmap structure is acquire with out irqsave.
To prevent the deadlock in dasd_device_from_cdev if it is called
from interrupt context the dependency to the dasd_devmap structure
needs to be removed. Let the driver_data of the ccw device point
to the dasd_device structure directly and use the ccw device lock
to protect the access.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fix clear_IO handling (need to wait for interrupt) and
introduced error-handling in shutdown processing.
Signed-off-by: Horst Hummel <horst.hummel@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The request queue flush function of the dasd driver has to dequeue
the requests first and then call the end request function. Otherwise
a kernel bug in ll_rw_block.c might get triggered.
Signed-off-by: Horst Hummel <horst.hummel@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/devfs-2.6: (22 commits)
[PATCH] devfs: Remove it from the feature_removal.txt file
[PATCH] devfs: Last little devfs cleanups throughout the kernel tree.
[PATCH] devfs: Rename TTY_DRIVER_NO_DEVFS to TTY_DRIVER_DYNAMIC_DEV
[PATCH] devfs: Remove the tty_driver devfs_name field as it's no longer needed
[PATCH] devfs: Remove the line_driver devfs_name field as it's no longer needed
[PATCH] devfs: Remove the videodevice devfs_name field as it's no longer needed
[PATCH] devfs: Remove the gendisk devfs_name field as it's no longer needed
[PATCH] devfs: Remove the miscdevice devfs_name field as it's no longer needed
[PATCH] devfs: Remove the devfs_fs_kernel.h file from the tree
[PATCH] devfs: Remove devfs_remove() function from the kernel tree
[PATCH] devfs: Remove devfs_mk_cdev() function from the kernel tree
[PATCH] devfs: Remove devfs_mk_bdev() function from the kernel tree
[PATCH] devfs: Remove devfs_mk_symlink() function from the kernel tree
[PATCH] devfs: Remove devfs_mk_dir() function from the kernel tree
[PATCH] devfs: Remove devfs_*_tape() functions from the kernel tree
[PATCH] devfs: Remove devfs support from the sound subsystem
[PATCH] devfs: Remove devfs support from the ide subsystem.
[PATCH] devfs: Remove devfs support from the serial subsystem
[PATCH] devfs: Remove devfs from the init code
[PATCH] devfs: Remove devfs from the partition code
...
Add support for parallel-access-volumes to the dasd driver. This
allows concurrent access to dasd devices with multiple channel
programs.
Signed-off-by: Horst Hummel <horst.hummel@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Dasd code cleanup: 1) remove white space, 2) remove the emacs override
sections, and 3) use kzalloc instead of kmalloc.
Signed-off-by: Horst Hummel <horst.hummel@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The dasd state machine is not designed to enable an unformatted device, since
'unformatted' is a final state. The BIODASDENABLE ioctl calls
dasd_enable_device() which never returns if the device is in this special
state. Return -EPERM in dasd_increase_state for unformatted devices to make
dasd_enable_device terminate. Note: To get such an unformatted device online
it has to be re-analyzed. This means that the device needs to be disabled
prior to re-enablement.
Signed-off-by: Horst Hummel <horst.hummel@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Using the fail-fast flag in i/o requests on a dasd disk which has been
quiesced leads to kernel panics. Modify the request start function to only
work on requests in a valid state.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The dasd driver sometimes print the misleading message "Can't offline dasd
device with open count = 0". The reason why it can't offline the device in
this case is that the device is still in the startup phase. Print a more
meaningful message.
Signed-off-by: Horst Hummel <horst.hummel@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: (21 commits)
BUG_ON() Conversion in drivers/video/
BUG_ON() Conversion in drivers/parisc/
BUG_ON() Conversion in drivers/block/
BUG_ON() Conversion in sound/sparc/cs4231.c
BUG_ON() Conversion in drivers/s390/block/dasd.c
BUG_ON() Conversion in lib/swiotlb.c
BUG_ON() Conversion in kernel/cpu.c
BUG_ON() Conversion in ipc/msg.c
BUG_ON() Conversion in block/elevator.c
BUG_ON() Conversion in fs/coda/
BUG_ON() Conversion in fs/binfmt_elf_fdpic.c
BUG_ON() Conversion in input/serio/hil_mlc.c
BUG_ON() Conversion in md/dm-hw-handler.c
BUG_ON() Conversion in md/bitmap.c
The comment describing how MS_ASYNC works in msync.c is confusing
rcu: undeclared variable used in documentation
fix typos "wich" -> "which"
typo patch for fs/ufs/super.c
Fix simple typos
tabify drivers/char/Makefile
...
MODULE_PARM was actually breaking: recent gcc version optimize them out as
unused. It's time to replace the last users, which are generally in the
most unloved drivers anyway.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
this changes if() BUG(); constructs to BUG_ON() which is
cleaner, contains unlikely() and can better optimized away.
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Convert all kmalloc + memset sequences in drivers/s390 to kzalloc usage.
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The DASD extended error reporting is a facility that allows to get detailed
information about certain problems in the DASD I/O. This information can be
used to implement fail-over applications that can recover these problems.
Signed-off-by: Stefan Weinhuber <wein@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Handle ioctls implemented in dasd_ioctl through the normal switch statement
that most drivers use instead of the awkward dasd_ioctl_no_register routine.
This avoids searching a linear list on every call to dasd_ioctl(), and allows
to give the various ioctl implementation functions sane prototypes, aswell as
moving the check for bdev->bd_disk->private_data from the individual functions
to dasd_ioctl. (I think it can't actually every be NULL, but let's keep that
for later)
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
DASD allows to open a device as soon as gendisk is registered, which means the
device is a fake device (capacity=0) and we do know nothing about blocksize
and partitions at that point of time. In case the device is opened by
someone, the bdev and inode creation is done with the fake device info and the
following partition detection code is just using the wrong data.
To avoid this modify the DASD state machine to make sure that the open is
rejected until the device analysis is either finished or an unformatted device
was detected.
Signed-off-by: Horst Hummel <horst.hummel@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Revert dasd eer module until we have a common understanding of how the
interface should be.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When using the dasd diag discipline, the base discipline module (eckd or fba)
can be unloaded, even though the dasd driver requires both discipline modules
(base and diag) to work correctly.
Implement reference counting for both base and diag discipline modules in
order to fix this.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>