The zfcpdump tool requires a method to attach exactly one LUN. The
easiest way to achieve this is to add a new zfcp module parameter.
When allow_lun_scan is set to "false", zfcp only accepts LUNs that
have been configured through the unit_add sysfs interface.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The function zfcp_cache_hw_align is only called from zfcp_module_init,
so it should be declared with __init as well.
Reviewed-by: Steffen Maier <maier@linux.vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Initialization of the qdio waitqueue should happen when the qdio data
is initialized and the QDIOUP flag should be handled in the qdio code
as well. Adjust the code accordingly and remove the superfluos
function zfcp_erp_adapter_strategy_open_qdio.
Reviewed-by: Steffen Maier <maier@linux.vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This patch is the final cleanup of the redesign from the zfcp tracing.
Structures and elements which were used by multiple areas of the
former debug tracing are now changed to the new scheme.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This patch is the continuation to redesign the zfcp tracing to a more
straight-forward and easy to extend scheme.
This patch deals with all trace records of the zfcp SCSI area.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This patch is the continuation to redesign the zfcp tracing to a more
straight-forward and easy to extend scheme.
This patch deals with all trace records of the zfcp HBA area.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This patch is the continuation to redesign the zfcp tracing to a more
straight-forward and easy to extend scheme.
This patch deals with all trace records of the zfcp SAN area.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The tracing environment of the zfcp LLD has become very bulky and hard
to maintain. Small changes involve a large modification process which
is error-prone and not effective. This patch is the first of a set to
redesign the zfcp tracing to a more straight-forward and easy to
extend scheme. It removes all interpretation and visualization parts
and focuses on bare logging of the information.
This patch deals with all trace records of the zfcp error recovery.
Signed-off-by: Swen schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Interrupting the connection to the FCP channel while I/O requests are
being issued can lead to this deadlock. scsi_dispatch_cmd already
holds the host_lock while the recovery trigger tries to acquire the
host_lock again when iterating through the scsi_devices.
INFO: lockdep is turned off.
BUG: spinlock lockup on CPU#1, blast/9660, 0000000078f38878
CPU: 1 Not tainted 2.6.35.7SWEN2 #2
Process blast (pid: 9660, task: 0000000071f75940, ksp: 0000000074393ac0)
0000000074393640 00000000743935c0 0000000000000002 0000000000000000
0000000074393660 00000000743935d8 00000000743935d8 00000000005590c2
0000000000000000 0000000078f38878 0000000026ede800 0000000078f38878
000000000000000d 040000000000000c 0000000074393628 0000000000000000
0000000000000000 0000000000100b2a 00000000743935c0 0000000074393600
Call Trace:
([<0000000000100a32>] show_trace+0xee/0x144)
[<00000000003be202>] do_raw_spin_lock+0x112/0x178
[<000000000055d408>] _raw_spin_lock_irqsave+0x90/0xb0
[<00000000003f1514>] __scsi_iterate_devices+0x38/0xbc
[<00000000004849b0>] zfcp_erp_clear_adapter_status+0xd0/0x16c
[<000000000048587a>] zfcp_erp_adapter_reopen+0x3a/0xb4
[<0000000000489812>] zfcp_fsf_req_send+0x166/0x180
[<000000000048c8d6>] zfcp_fsf_fcp_cmnd+0x272/0x408
[<000000000048f864>] zfcp_scsi_queuecommand+0x11c/0x1e0
[<00000000003f1f2a>] scsi_dispatch_cmd+0x1d6/0x324
[<00000000003f9910>] scsi_request_fn+0x42c/0x56c
[<00000000003828ae>] __blk_run_queue+0x86/0x140
[<000000000037f742>] elv_insert+0x11a/0x208
[<000000000038104c>] blk_insert_cloned_request+0x84/0xe4
[<000003c0032b7c64>] dm_dispatch_request+0x6c/0x94 [dm_mod]
[<000003c0032b7d5c>] map_request+0xd0/0x100 [dm_mod]
[<000003c0032b9a78>] dm_request_fn+0xec/0x1bc [dm_mod]
[<0000000000382c0e>] generic_unplug_device+0x5a/0x6c
[<000003c0032b7f98>] dm_unplug_all+0x74/0x9c [dm_mod]
[<00000000001d1272>] sync_page+0x76/0x9c
[<00000000001d12ba>] sync_page_killable+0x22/0x60
[<000000000055a768>] __wait_on_bit_lock+0xc0/0x124
[<00000000001d1140>] __lock_page_killable+0x78/0x84
[<00000000001d351c>] generic_file_aio_read+0x5a4/0x7e8
[<0000000000228ec0>] do_sync_read+0xc8/0x12c
[<0000000000229edc>] vfs_read+0xac/0x1ac
[<000000000022a0d8>] SyS_read+0x58/0xa8
[<00000000001146de>] sysc_noemu+0x10/0x16
[<00000200000493c4>] 0x200000493c4
INFO: lockdep is turned off.
Call zfcp_fsf_fcp_cmnd without the host_lock and disable the
interrupts when acquiring the req_q_lock. According to the patch
description in "[PATCH] Eliminate error handler overload of the SCSI
serial number", the serial_number is not used, so simply drop the
queuecommand wrapper function and run zfcp_scsi_queuecommand without
holding the host_lock.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The ERP got values assigned for which no reference was taken. This
can lead to an unpredictable race condition. Fix this by only
assigning the values which are required and for which a reference was
pulled or is held implicitly.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
If the evaluation of GPN_FT requests wants to remove an invalid port
from the system the zfcp_erp_port_shutdown function is triggered.
Depending on the system status a superior action (e.g. adapter reopen)
is required. This can lead to an invalid mem access of the port struct
which might be freed at the time since the superior action is not
holding a reference of the port which triggered this ERP action.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The request data assignment between the fsf abort initiator and its
corresponding handler is not consistent and leads to an unpredictable
behaviour, e.g. kernel panic. This patch fixes this issue and assigns
the correct value.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Use correct bit positions of rsid field.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The qdio device indicator is freed before the device is notified that
the indicator is reset. This sequence contains a race when the freed
indicator is used by a new device while the reset of the indicator is
still pending. Do the reset operation before freeing the indicator to
avoid that potential race.
Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The big kernel lock has been removed from all these files at some point,
leaving only the #include.
Remove this too as a cleanup.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Move the mid-layer's ->queuecommand() invocation from being locked
with the host lock to being unlocked to facilitate speeding up the
critical path for drivers who don't need this lock taken anyway.
The patch below presents a simple SCSI host lock push-down as an
equivalent transformation. No locking or other behavior should change
with this patch. All existing bugs and locking orders are preserved.
Additionally, add one parameter to queuecommand,
struct Scsi_Host *
and remove one parameter from queuecommand,
void (*done)(struct scsi_cmnd *)
Scsi_Host* is a convenient pointer that most host drivers need anyway,
and 'done' is redundant to struct scsi_cmnd->scsi_done.
Minimal code disturbance was attempted with this change. Most drivers
needed only two one-line modifications for their host lock push-down.
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Acked-by: James Bottomley <James.Bottomley@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6:
[S390] kprobes: Fix the return address of multiple kretprobes
[S390] kprobes: disable interrupts throughout
[S390] ftrace: build without frame pointers on s390
[S390] mm: add devmem_is_allowed() for STRICT_DEVMEM checking
[S390] vmlogrdr: purge after recording is switched off
[S390] cio: fix incorrect ccw_device_init_count
[S390] tape: add medium state notifications
[S390] fix get_user_pages_fast
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (66 commits)
can-bcm: fix minor heap overflow
gianfar: Do not call device_set_wakeup_enable() under a spinlock
ipv6: Warn users if maximum number of routes is reached.
docs: Add neigh/gc_thresh3 and route/max_size documentation.
axnet_cs: fix resume problem for some Ax88790 chip
ipv6: addrconf: don't remove address state on ifdown if the address is being kept
tcp: Don't change unlocked socket state in tcp_v4_err().
x25: Prevent crashing when parsing bad X.25 facilities
cxgb4vf: add call to Firmware to reset VF State.
cxgb4vf: Fail open if link_start() fails.
cxgb4vf: flesh out PCI Device ID Table ...
cxgb4vf: fix some errors in Gather List to skb conversion
cxgb4vf: fix bug in Generic Receive Offload
cxgb4vf: don't implement trivial (and incorrect) ndo_select_queue()
ixgbe: Look inside vlan when determining offload protocol.
bnx2x: Look inside vlan when determining checksum proto.
vlan: Add function to retrieve EtherType from vlan packets.
virtio-net: init link state correctly
ucc_geth: Fix deadlock
ucc_geth: Do not bring the whole IF down when TX failure.
...
If automatic purge is enabled for a vmlogrdr device, old records are
purged before an IUCV recording service is switched on or off. If z/VM
generates a large number of records between purging and switching the
recording service off, these records remain queued, and may have a
negative performance impact on the z/VM system. To avoid this problem,
we need to purge the records after recording is switched off.
Signed-off-by: Stefan Weinhuber <wein@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
If device recognition is interrupted by a subchannel event
indicating that the device is gone, ccw_device_init_count
is not correctly decreased.
Fix this by reporting the corresponding event to the device
recognition callback via the state machine.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
QDIO is running independent from netdevice state. We are not
allowed to schedule NAPI in case the netdevice is not open.
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For a certain Hipersockets specific error code in the xmit path, the
qeth driver tries to invoke dev_queue_xmit again.
Commit 79640a4ca6 introduces a busylock
causing locking problems in case of re-invoked dev_queue_xmit by qeth.
This patch removes the attempts to retry packet sending with
dev_queue_xmit from the qeth driver.
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (70 commits)
[SCSI] pmcraid: add support for set timestamp command and other fixes
[SCSI] pmcraid: remove duplicate struct member
[SCSI] qla4xxx: Fix cmd check in qla4xxx_cmd_wait
[SCSI] megaraid_sas: Version and documentation update
[SCSI] megaraid_sas: Add three times Online controller reset
[SCSI] megaraid_sas: Add input parameter for max_sectors
[SCSI] megaraid_sas: support devices update flag
[SCSI] libosd: write/read_sg_kern API
[SCSI] libosd: Support for scatter gather write/read commands
[SCSI] libosd: Free resources in reverse order of allocation
[SCSI] libosd: Fix bug in attr_page handling
[SCSI] lpfc 8.3.18: Update lpfc driver version to 8.3.18
[SCSI] lpfc 8.3.18: Add new WQE support
[SCSI] lpfc 8.3.18: Fix critical errors
[SCSI] lpfc 8.3.18: Adapter Shutdown and Unregistration cleanup
[SCSI] lpfc 8.3.18: Add logic to detect last devloss timeout
[SCSI] lpfc 8.3.18: Add support of received ELS commands
[SCSI] lpfc 8.3.18: FC/FCoE Discovery fixes
[SCSI] ipr: add definitions for a new adapter
[SCSI] bfa: fix comments for c files
...
Get rid of the format string "%s" usage with volatile strings
to prevent use after free errors in the s390dbf.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The BIODASDSNID ioctl executes a 'Sense Path Group ID'
command on a DASD ECKD device. The returned path group data
allows user space programs to determine path state and
path group ID of the channel paths to the device.
Signed-off-by: Stefan Weinhuber <wein@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Use the FCP_RSP_INFO length to correctly skip the FCP_RSP_INFO field.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
zfcp_unit_release calls put_device on the port. Ensure that get_device
has been called before possibly triggering the release function
through put_device or device_unregister.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
If an exchange config is executed while the local link is down, the
request succeeds but the returned data is incomplete. Proceeding with
the adapter activation is leading to an unpredictable behaviour (e.g.
kernel panic) caused by invalid values. In such a scenario the
recommended ERP is to retry the action and wait for a link up event.
If the issue persists the activation has to fail.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Sigend-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Store the facility list once at system startup with stfl/stfle and
reuse the result for all facility tests.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
For the DASD DIAG discipline IO is started through special diagnose
calls. Unsolicited interrupts may contain information about the device
itself. But this information is not needed because the device is not
used directly.
Fix the case that an unimplemented dicipline function may be called
by ignoring unsolicited interrupts for the DIAG disciplin.
Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The dasd interrupt handler needs to distinguish solicited from
unsolicited interrupts, as unsolicited interrupts may require special
handling (e.g. summary unit checks) and solicited interrupts require
proper error recovery for the failed I/O request.
The interrupt handler needs to check several bit fields in the
interrupt response block (irb) to make this distinction.
So far our check of the status control bits has not been specific
enough, which may lead to a failed request getting just retried
instead of the necessary error recovery.
Signed-off-by: Stefan Weinhuber <wein@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Writing to /proc/dasd/statistics while the debug level of the
generic dasd debug entry is set to DBF_DEBUG will lead to an
use after free when accessing the debug entry later.
Since for the format string "%s" in the s390 dbf only a pointer
to the string is stored in the debug feature and the buffer used
here is freed afterwards.
To fix this just remove the debug message.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Function ccw_device_cancel_halt_clear may cause an unexpected kernel
panic if a clear function is currently active at the subchannel for
which it is called. In that case, the iretry counter used to determine
the number of retries is never initialized, leading to an immediate
failure of the function which results in a kernel panic.
Fix this by initializing the iretry counter when the function is
first called. Also replace the kernel panic with a return code: a
single malfunctioning I/O device should not automatically cause a
system-wide kernel panic.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Read external interrupts parameters from the lowcore in the first
level interrupt handler in entry[64].S.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This patch adds a notification mechanism to inform ccw drivers
about changes to channel paths, which occured while the device
is online.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Update the subchannel descriptor while resuming from hibernate
in order to obtain current link addresses.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Update the channel path descriptors after hibernation.
This is done unlocked, since we are the only active
task at this time.
Note: chsc_determine_base_channel_path_desc is changed
to use spin_lock_irqsave, since it's called with
interrupts disabled in this case.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Update the channel path descriptor at the beginning of to the
vary_on operation.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
chsc_determine_channel_path_desc is called by a wrapper
who allocates a response struct. The response data
is then memcpy'ed to this response struct by
chsc_determine_channel_path_desc.
Change chsc_determine_base_channel_path_desc to use the
global chsc_page and deliver it to the function doing
the actual chsc call. The channel path desriptor is
then directly read from the response data.
As a result we get rid of the additional allocation
for the response struct.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Most wrappers around the channel subsystem call have their own logic
to allocate memory (with proper alignment) or use preallocated or
static memory. This patch converts most users of the channel
subsystem call to use the same preallocated page (proteced by a
spinlock).
Note: The sei_page which is used in our crw handler to call
"store event information" has to coexist, since
a) in crw context, while accessing the sei_page, sleeping is allowed
(which will conflict with the spinlock protection of the chsc_page)
b) in crw context, while accessing the sei_page, channel subsystem
calls are allowed (which itself would require the page).
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This patch fixes:
* kfree vs. free_page usage
* structure definition for determine_css_characteristics
* naming convention for the chsc init function
* deregistration of crw handlers in the cleanup path
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Serialize access to members of struct channel_path with a mutex.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
If a ccwdevice is lost during hibernation and a different
ccwdevice is attached to the same subchannel, we will
deregister the old ccw device and register the new one.
Since deregistration is not allowed in this context, we
handle this action later. However, some parts of the
registration process for the new device were started anyway,
so that the old device structure is no longer accessible.
Fix this by deferring both actions to the afterwards
scheduled subchannel event.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The dasd_eckd_dump_sense_dbf function uses a macro for s390 debug
feature that can handle up to 8 parameters (for the DASD device
driver).
Fix the function to use only the maximum number of parameters.
Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
fix this sparse warning:
drivers/s390/cio/css.c:580:6: warning: symbol 'css_schedule_eval_all_unreg'
was not declared. Should it be static?
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The usual way to recover a failed DASD ECKD request (cqr) is to create
a new request with an appropriate recovery CCW program. Certain
features, e.g. failfast, can be enabled per request and are stored in
the requests flags. These flags have to be copied from the failed to
the recovery request, to let the recovery request use the same
features as the original one.
Signed-off-by: Stefan Weinhuber <wein@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>