State transition to DEV_STATE_STOPPED indicates all outstanding I/O has
finished. Add wait queue to wait for this state.
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In a system where the ctrl-alt-del init action initiated by signal
quiesce suspends the machine the quiesce handler override for
_machine_restart, _machine_halt and _machine_power_off needs to be
undone, otherwise the override is still present in the resumed
system. The next shutdown would then load the quiesce state psw
instead of performing the correct shutdown action.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The monreader device driver doesn't set dev->driver_data to NULL after
freeing the corresponding data structure. This leads to a use after
free bug in the freeze/thaw suspend/resume functions after the device
has been opened and closed once. Fix this by clearing dev->driver_data
in the close() function.
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Now that sys_sysctl is a generic wrapper around /proc/sys .ctl_name
and .strategy members of sysctl tables are dead code. Remove them.
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Allow the architecture to request a normal jiffy tick when the system
goes idle and tick_nohz_stop_sched_tick is called . On s390 the hook is
used to prevent the system going fully idle if there has been an
interrupt other than a clock comparator interrupt since the last wakeup.
On s390 the HiperSockets response time for 1 connection ping-pong goes
down from 42 to 34 microseconds. The CPU cost decreases by 27%.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
LKML-Reference: <20090929122533.402715150@de.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6:
[SCSI] zfcp: Flush SCSI registration work when adding unit
[SCSI] zfcp: Fix timer initialization for ct and els requests
[SCSI] zfcp: Warn about storage devices with broken PLOGI data
[SCSI] zfcp: Handle WWPN mismatch in PLOGI payload
[SCSI] zfcp: fix kfree handling in zfcp_init_device_setup
[SCSI] fix memory leak in initialization
After copying uts->nodename to the static nodename array the static
version isn't necessarily zero termininated, since the size of the
array is one byte too short.
Afterwards doing strncat(data, nodename, strlen(nodename)); may copy
an arbitrary large amount of bytes.
Fix this by getting rid of the static array and using strncat with
proper length limit.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fix missing unregister_sysctl_table in case the SCLP doesn't provide
the requested feature. Also simplify the whole error handling while
at it.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
If a suspended z/VM guest has been logged off before the resume the
'SET SMSG IUCV' CP command need to be repeated to reenable sending
message via SMSG. This fixes the following error:
HCPMFS057I H4214002 not receiving; SMSG off
Error: non-zero CP response for command 'SMSG H4214002 CMM SHRINK 5010': #57
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fix the size of the local buffer and use snprintf to prevent
further miscalculations. Also fix the usage of bitwise vs logic
operations.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This patch updates misc percpu related symbols such that percpu
symbols are unique and don't clash with local symbols. This serves
two purposes of decreasing the possibility of global percpu symbol
collision and allowing dropping per_cpu__ prefix from percpu symbols.
* drivers/crypto/padlock-aes.c: s/last_cword/paes_last_cword/
* drivers/lguest/x86/core.c: s/last_cpu/lg_last_cpu/
* drivers/s390/net/netiucv.c: rename the variable used in a macro to
avoid clashing with percpu symbol
* arch/mn10300/kernel/kprobes.c: replace current_ prefix with cur_ for
static variables. Please note that percpu symbol current_kprobe
can't be changed as it's used by generic code.
Partly based on Rusty Russell's "alloc_percpu: rename percpu vars
which cause name clashes" patch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Chuck Ebbert <cebbert@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux390@de.ibm.com
When configuring a LUN for use in zfcp, flush the SCSI work to ensure
the SCSI device has been created before returning. This means that a
configuration procedure can run these commands in a script and the
SCSI device is available immediately after the unit_add:
echo 1 > /sys/bus/ccw/drivers/zfcp/0.0.181d/online
echo 0x401040C300000000 > \
/sys/bus/ccw/drivers/zfcp/0.0.181d/0x500507630313c562/unit_add
lsscsi
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Add HZ since the start_timer function expects jiffies, not seconds.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
After opening a remote port zfcp checks if the WWPN returned in the
PLOGI maches the WWPN of the port that should have been opened. On a
mismatch zfcp assumes that the DID just changed, queries the FC
nameserver and tries again. If the situation persists the erp will
give up.
With this strategy, if the remote port always returns the wrong PLOGI
data, the remote port will not be opened. Introduce a warning, so that
the system administrator knows why the remote port is not being opened
and to have a pointer to investigate the problem on the storage
system.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
For ports, zfcp gets the DID from the FC nameserver and tries to open
the port. If the open succeeds, zfcp compares the WWPN from the
nameserver with the WWPN in the PLOGI payload. In case of a mismatch,
zfcp assumes that the DID of the port just changed and we opened the
wrong port. This means that zfcp has to forget the DID, lookup the DID
again and retry.
This error case had a problem that zfcp forgets the DID, but never
looks up a new one, stalling the ERP in this case. Fix this by
triggering the DID lookup and properly exit from the ERP. The DID
lookup will trigger a new ERP action.
Also ensure when trying to open the port again with the new DID, first
close the open port, even in the NOESC case.
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The pointer that is allocated with kmalloc() is passed to strsep()
which modifies it. Later on the modified pointer value will be passed
to kfree. Save the original pointer and pass that one to kfree
instead.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Group device now cleanly reacts to failures during channel start and
implements a clean rollback.
Signed-off-by: Einar Lueck <elelueck@de.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The creation of a new lcs device requires a call to the function
ccw_device_set_online() for the read and the write channel. Failure
of either call should terminate the lcs device creation immediately
with return code -ENODEV.
Signed-off-by: Klaus-Dieter Wacker <kdwacker@de.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Timer_list structure in lcs_send_lancmd() is allocated on stack.
Initialization with init_timer() leads to above ODEBUG message.
Instead use init_timer_on_stack() which prevents above msg.
Signed-off-by: Klaus-Dieter Wacker <kdwacker@de.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix this build error:
next-20091013 randconfig build on s390x build breaks with
drivers/s390/built-in.o:(.data+0x3354): undefined reference to `sclp_vt220_pm_event_fn'
Reported-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Signed-off-by: Michael Holzheu <michael.holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Use cio_is_console() in io_subchannel_probe to indicate that the
special handling is console specific. As long as there is no other
subchannel for which this might be true, it is misleading to speak
of "early devices". Should more of these devices be introduced,
a cleanup of all console special handling is in order anyway.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
8d65af78 "sysctl: remove "struct file *" argument of ->proc_handler"
removed the struct file argument from all proc_handlers but didn't
change the call home proc handler (or call home was merged later).
So fix this now.
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Hans-Joachim Picht <hans@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
If the rdc_buffer is above 2G we need indirect addresssing so we have
to use an idaw to give the rdc_buffer to the ccw.
If the rdc_buffer is under 2G nothing changes.
Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Replace spin_lock with spin_lock_irqsave in dasd_eckd_restore_device.
Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
When setting a channel attached tape online under Linux 2.6.31, the
"vol_id" process from udev hangs in sync_page():
2 sync_page+144 [0x1dfaac]
3 __wait_on_bit_lock+194 [0x58c23e]
4 __lock_page+116 [0x1df9dc]
5 truncate_inode_pages_range+728 [0x1ed7cc]
6 __blkdev_put+244 [0x25f738]
7 __fput+300 [0x229c4c]
8 filp_close+122 [0x225a3a]
The reason for that is an error in the request queue handling. It can
happen that we fetch a request, but do not process it further because
the number of queued requests exceeds TAPEBLOCK_MIN_REQUEUE.
To fix this, we should call blk_peek_request() instead of
blk_fetch_request() in the while condition and fetch the request in
the loop body afterwards.
This bug was introduced with the patch "block: implement and enforce
request peek/start/fetch" (9934c8c045)
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6: (34 commits)
[SCSI] qla2xxx: Fix NULL ptr deref bug in fail path during queue create
[SCSI] st: fix possible memory use after free after MTSETBLK ioctl
[SCSI] be2iscsi: Moving to pci_pools v3
[SCSI] libiscsi: iscsi_session_setup to allow for private space
[SCSI] be2iscsi: add 10Gbps iSCSI - BladeEngine 2 driver
[SCSI] zfcp: Fix hang when offlining device with offline chpid
[SCSI] zfcp: Fix lockdep warning when offlining device with offline chpid
[SCSI] zfcp: Fix oops during shutdown of offline device
[SCSI] zfcp: Fix initial device and cfdc for delayed adapter allocation
[SCSI] zfcp: correctly initialize unchained requests
[SCSI] mpt2sas: Bump version 02.100.03.00
[SCSI] mpt2sas: Support dev remove when phy status is MPI2_EVENT_SAS_TOPO_PHYSTATUS_VACANT
[SCSI] mpt2sas: Timeout occurred within the HANDSHAKE logic while waiting on firmware to ACK.
[SCSI] mpt2sas: Call init_completion on a per request basis.
[SCSI] mpt2sas: Target Reset will be issued from Interrupt context.
[SCSI] mpt2sas: Added SCSIIO, Internal and high priority memory pools to support multiple TM
[SCSI] mpt2sas: Copyright change to 2009.
[SCSI] mpt2sas: Added mpi2_history.txt for MPI2 headers.
[SCSI] mpt2sas: Update driver to MPI2 REV K headers.
[SCSI] bfa: Brocade BFA FC SCSI driver
...
There is a race while re-reading the device characteristics. After
cleaning the memory area a cqr is build which reads the device
characteristics. This may take a rather long time and the device
characteristics structure is zero during this. Now it could be
possible that the block tasklet starts working and a new cqr will be
build. The build_cp command refers to the device characteristics
structure and this may lead into a divide by zero exception.
Fix this by re-reading the device characteristics into a temporary
structur and copy the data to the original structure. Also take the
ccwdev_lock.
Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Improve the comments for switch cases without a break. This fixes
some warnings of a code checker tool.
Signed-off-by: Felix Beck <felix.beck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Allow users to set boxed devices offline. After setting them
offline, the device state will still be boxed.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
When a ccw device appears not operational, inform the associated
device driver and act according to the response: if the driver
wants to keep the device, put it into the disconnected state.
If not, or if there is no driver or if the device is not online,
unregister it. This approach is consistent with no-path event
handling.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
When there is no path left to a ccw device, inform the associated
device driver and act according to the response: if the driver
wants to keep the device, put it into the disconnected state.
If not, or if there is no driver or if the device is not online,
unregister it.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
There is a memory leak in /proc/cio_ignore. The iterator is allocated
in cio_ignore_proc_seq_start, but never freed in
cio_ignore_proc_seq_stop, because we cannot use the iterator
that was passed by seqfile. The seqfile interface passes the last
seen iterator to the stop function and not the first one. Since our
next function will return NULL at the end, the iter passed to
cio_ignore_proc_seq_stop is NULL. The original iter has leaked.
The solution is to use seq_open_private.
Found with kmemleak:
unreferenced object 0x1c720580 (size 32):
comm "head", pid 973, jiffies 4294958302
hex dump (first 32 bytes):
00 00 00 00 00 00 00 04 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<0000000000203154>] kmem_cache_alloc+0x190/0x19c
[<00000000003fb462>] cio_ignore_proc_seq_start+0x5e/0x128
[<0000000000231018>] seq_read+0xc8/0x4bc
[<0000000000273954>] proc_reg_read+0xa8/0xf4
[<000000000020e3d8>] vfs_read+0xac/0x1a4
[<000000000020e5c6>] SyS_read+0x52/0xa8
[<000000000011836e>] sysc_noemu+0x10/0x16
[<0000004690b7936c>] 0x4690b7936c
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Move dev_set_name to when we know that the device will actually be
registered in order to avoid a memory leak if the allocated memory
for the channel path has to be freed.
Signed-off-by: Michael Ernst <mernst@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fix this build failure:
drivers/s390/built-in.o: In function `raw3270_pm_unfreeze':
(.text+0x3ac04): undefined reference to `ccw_device_force_console'
with:
CONFIG_TN3270=y
CONFIG_TN3270_CONSOLE=n
CONFIG_TN3215_CONSOLE=n
Reported-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This string query operation was supposed to be replaced by the
generic get_sset_count() starting in 2007. Convert qeth's
implementation.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Running chchp --vary 0 and chccwdev -d on a FCP device with scsi
devices attached can lead to this thread hanging:
================================================================
STACK TRACE FOR TASK: 0x2fbfcc00 (kslowcrw)
STACK:
0 schedule+1136 [0x45f99c]
1 schedule_timeout+534 [0x46054e]
2 wait_for_common+374 [0x45f442]
3 blk_execute_rq+160 [0x217a2c]
4 scsi_execute+278 [0x26daf2]
5 scsi_execute_req+150 [0x26dc86]
6 sd_sync_cache+138 [0x28460a]
7 sd_shutdown+130 [0x28486a]
8 sd_remove+104 [0x284c84]
9 __device_release_driver+152 [0x257430]
10 device_release_driver+56 [0x2575c8]
11 bus_remove_device+214 [0x25672a]
12 device_del+352 [0x25456c]
13 __scsi_remove_device+108 [0x272630]
14 scsi_remove_device+66 [0x2726ba]
15 zfcp_ccw_remove+824 [0x335558]
16 ccw_device_remove+62 [0x2b3f2a]
17 __device_release_driver+152 [0x257430]
18 device_release_driver+56 [0x2575c8]
19 bus_remove_device+214 [0x25672a]
20 device_del+352 [0x25456c]
21 ccw_device_unregister+92 [0x2b48c4]
22 io_subchannel_remove+108 [0x2b4950]
23 css_remove+62 [0x2af7ee]
24 __device_release_driver+152 [0x257430]
25 device_release_driver+56 [0x2575c8]
26 bus_remove_device+214 [0x25672a]
27 device_del+352 [0x25456c]
28 device_unregister+38 [0x25464a]
29 css_sch_device_unregister+68 [0x2af97c]
30 ccw_device_call_sch_unregister+78 [0x2b581e]
31 worker_thread+604 [0x69eb0]
32 kthread+154 [0x6ff42]
33 kernel_thread_starter+6 [0x1c952]
================================================================
The problem is that the chchp --vary 0 leads to zfcp first calling
fc_remote_port_delete which blocks all scsi devices on the remote
port. Calling scsi_remove_device later lets the sd driver issue a
SYNCHRONIZE_CACHE command. This command stays on the "stopped" request
requeue because the SCSI device is blocked. Fix this by first removing
the scsi and fc hosts which removes all scsi devices and do not use
scsi_remove_device.
Reviewed-by: Felix Beck <felix.beck@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.31-39.x.20090917-s390xdefault #1
-------------------------------------------------------
kslowcrw/83 is trying to acquire lock:
(&adapter->scan_work){+.+.+.}, at: [<0000000000169c5c>] __cancel_work_timer+0x64/0x3d4
but task is already holding lock:
(&zfcp_data.config_mutex){+.+.+.}, at: [<00000000004671ea>] zfcp_ccw_remove+0x66/0x384
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&zfcp_data.config_mutex){+.+.+.}:
[<0000000000189962>] __lock_acquire+0xe26/0x1834
[<000000000018a4b6>] lock_acquire+0x146/0x178
[<000000000058cb5a>] mutex_lock_nested+0x82/0x3ec
[<0000000000477170>] zfcp_fc_scan_ports+0x3ec/0x728
[<0000000000168e34>] worker_thread+0x278/0x3a8
[<000000000016ff08>] kthread+0x9c/0xa4
[<0000000000109ebe>] kernel_thread_starter+0x6/0xc
[<0000000000109eb8>] kernel_thread_starter+0x0/0xc
-> #0 (&adapter->scan_work){+.+.+.}:
[<0000000000189e60>] __lock_acquire+0x1324/0x1834
[<000000000018a4b6>] lock_acquire+0x146/0x178
[<0000000000169c9a>] __cancel_work_timer+0xa2/0x3d4
[<0000000000465cb2>] zfcp_adapter_dequeue+0x32/0x14c
[<00000000004673e4>] zfcp_ccw_remove+0x260/0x384
[<00000000004250f6>] ccw_device_remove+0x42/0x1ac
[<00000000003cb6be>] __device_release_driver+0x9a/0x10c
[<00000000003cb856>] device_release_driver+0x3a/0x4c
[<00000000003ca94c>] bus_remove_device+0xcc/0x114
[<00000000003c8506>] device_del+0x162/0x21c
[<0000000000425ff2>] ccw_device_unregister+0x5e/0x7c
[<000000000042607e>] io_subchannel_remove+0x6e/0x9c
[<000000000041ff9a>] css_remove+0x3e/0x7c
[<00000000003cb6be>] __device_release_driver+0x9a/0x10c
[<00000000003cb856>] device_release_driver+0x3a/0x4c
[<00000000003ca94c>] bus_remove_device+0xcc/0x114
[<00000000003c8506>] device_del+0x162/0x21c
[<00000000003c85e8>] device_unregister+0x28/0x38
[<0000000000420152>] css_sch_device_unregister+0x46/0x58
[<00000000004276a6>] io_subchannel_sch_event+0x28e/0x794
[<0000000000420442>] css_evaluate_known_subchannel+0x46/0xd0
[<0000000000420ebc>] slow_eval_known_fn+0x88/0xa0
[<00000000003caffa>] bus_for_each_dev+0x7e/0xd0
[<000000000042188c>] for_each_subchannel_staged+0x6c/0xd4
[<0000000000421a00>] css_slow_path_func+0x54/0xd8
[<0000000000168e34>] worker_thread+0x278/0x3a8
[<000000000016ff08>] kthread+0x9c/0xa4
[<0000000000109ebe>] kernel_thread_starter+0x6/0xc
[<0000000000109eb8>] kernel_thread_starter+0x0/0xc
cancel_work_sync is called while holding the config_mutex. But the
work that is being cancelled or flushed also uses the config_mutex.
Fix the resulting deadlock possibility by calling cancel_work_sync
earlier without holding the mutex. The best place to do is is after
offlining the device. No new port scan work will be scheduled for the
offline device, so this is a safe place to call cancel_work_sync.
Reviewed-by: Felix Beck <felix.beck@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
With the change that the zfcp_adapter struct is only allocated when
the device is set online, the shutdown handler has to check for a
non-existing zfcp_adapter struct. On the other hand, this check is not
necessary in the offline callback, since an online device has the
zfcp_adapter allocated and we go through the offline callback before
removing the ccw device.
Reviewed-by: Felix Beck <felix.beck@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
With the change for delaying the allocation of zfcp_adapter, the
initial device parameter function has to first call
ccw_device_set_online which allocates the zfcp_adapter structure.
Change this and adapt the cfdc part accordingly.
Reviewed-by: Felix Beck <felix.beck@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The common initialization of ct/gs and els requests missed the
initialization of unchained requests. Fix this by moving the common
parts to a place that is called for all ct/gs and els requests.
Reviewed-by: Felix Beck <felix.beck@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
* remove asm/atomic.h inclusion from linux/utsname.h --
not needed after kref conversion
* remove linux/utsname.h inclusion from files which do not need it
NOTE: it looks like fs/binfmt_elf.c do not need utsname.h, however
due to some personality stuff it _is_ needed -- cowardly leave ELF-related
headers and files alone.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6: (22 commits)
[S390] Update default configuration.
[S390] hibernate: Do real CPU swap at resume time
[S390] dasd: tolerate devices that have no feature codes
[S390] zcrypt: Do not add/remove devices in s/r callbacks
[S390] hibernate: make sure pfn_is_nosave handles lowcore pages
[S390] smp: introduce LC_ORDER and simplify lowcore handling
[S390] ptrace: use common code for simple peek/poke operations
[S390] fix disabled_wait inline assembly clobber list
[S390] Change kernel_page_present coding style.
[S390] hibernation: reset system after resume
[S390] hibernation: fix guest page hinting related crash
[S390] Get rid of init_module/delete_module compat functions.
[S390] Convert sys_execve to function with parameters.
[S390] Convert sys_clone to function with parameters.
[S390] qdio: change state of all primed input buffers
[S390] qdio: reduce per device debug messages
[S390] cio: introduce consistent subchannel scanning
[S390] cio: idset use actual number of ssids
[S390] cio: dont kfree vmalloced memory
[S390] cio: introduce css_settle
...
The DASD device driver reads the feature codes of a device during
device initialization. These codes are later used to determine the
availability of advanced features like PAV or High Performance FICON.
Some very old devices do not support the command to read feature
codes and the initialization routine fails.
As the feature codes are not necessary for basic DASD operations, we
can support such devices by just ignoring missing feature codes.
Signed-off-by: Stefan Weinhuber <wein@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Devices are no longer removed or added in the suspend and resume
callbacks. Instead they are marked unregistered in suspend. In the
resume callback the ap_scan_bus method is scheduled. The bus scan
function will remove the old device and add new ones. This way all
the device handling will be done in only one function. Additionaly
the case where the domain might change during suspend/resume is
caught. In that case the devices qid needs to re-calculated in
order of having it found by the scan method.
Signed-off-by: Felix Beck <felix.beck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
If input buffers stay in primed state qdio may not receive further interrupts
for the input queue depending on the firmware. That can cause a connection
hang on OSA cards.
Change the state of all primed input buffers that are not acknowledged to
not initialized.
Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Even if turned off the debug message overhead is measurable in the hot path.
Reduce the number of debug message calls in do_QDIO and qdio_kick_handler.
Also use hex numbers to save space in the debug entries.
Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Previously, there were multiple subchannel scanning mechanisms
which could potentially conflict with each other. Fix this problem
by moving blacklist and ccw driver triggered scanning to the
existing evaluation method.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The functions idset_sch_new and for_each_subchannel_staged
use different values for the number of subchannel sets. Make
it consistent by changing idset_sch_new to also use the actual
number of subchannel sets.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Don't use kfree to free memory allocated by vmalloc.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Introduce the css_driver callback settle which can be implemented
by a subchannel driver to wait for the subchannel type specific
asynchronous work to finish.
In channel_subsystem_init_sync we call that for each subchannel
driver.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Use css_eval_scheduled to determine if all scheduled subchannel
evaluation is finished. Wait for this value to be 0 in the
channel subsystem init function.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Define initialization sequence of css and ccw bus init calls by merging
them into a single init call. Also introduce channel_subsystem_init_sync
to wait for the initialization of devices to finish.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (34 commits)
trivial: fix typo in aic7xxx comment
trivial: fix comment typo in drivers/ata/pata_hpt37x.c
trivial: typo in kernel-parameters.txt
trivial: fix typo in tracing documentation
trivial: add __init/__exit macros in drivers/gpio/bt8xxgpio.c
trivial: add __init macro/ fix of __exit macro location in ipmi_poweroff.c
trivial: remove unnecessary semicolons
trivial: Fix duplicated word "options" in comment
trivial: kbuild: remove extraneous blank line after declaration of usage()
trivial: improve help text for mm debug config options
trivial: doc: hpfall: accept disk device to unload as argument
trivial: doc: hpfall: reduce risk that hpfall can do harm
trivial: SubmittingPatches: Fix reference to renumbered step
trivial: fix typos "man[ae]g?ment" -> "management"
trivial: media/video/cx88: add __init/__exit macros to cx88 drivers
trivial: fix typo in CONFIG_DEBUG_FS in gcov doc
trivial: fix missing printk space in amd_k7_smp_check
trivial: fix typo s/ketymap/keymap/ in comment
trivial: fix typo "to to" in multiple files
trivial: fix typos in comments s/DGBU/DBGU/
...
Let attribute group vectors be declared "const". We'd
like to let most attribute metadata live in read-only
sections... this is a start.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (209 commits)
[SCSI] fix oops during scsi scanning
[SCSI] libsrp: fix memory leak in srp_ring_free()
[SCSI] libiscsi, bnx2i: make bound ep check common
[SCSI] libiscsi: add completion function for drivers that do not need pdu processing
[SCSI] scsi_dh_rdac: changes for rdac debug logging
[SCSI] scsi_dh_rdac: changes to collect the rdac debug information during the initialization
[SCSI] scsi_dh_rdac: move the init code from rdac_activate to rdac_bus_attach
[SCSI] sg: fix oops in the error path in sg_build_indirect()
[SCSI] mptsas : Bump version to 3.04.12
[SCSI] mptsas : FW event thread and scsi mid layer deadlock in SYNCHRONIZE CACHE command
[SCSI] mptsas : Send DID_NO_CONNECT for pending IOs of removed device
[SCSI] mptsas : PAE Kernel more than 4 GB kernel panic
[SCSI] mptsas : NULL pointer on big endian systems causing Expander not to tear off
[SCSI] mptsas : Sanity check for phyinfo is added
[SCSI] scsi_dh_rdac: Add support for Sun StorageTek ST2500, ST2510 and ST2530
[SCSI] pmcraid: PMC-Sierra MaxRAID driver to support 6Gb/s SAS RAID controller
[SCSI] qla2xxx: Update version number to 8.03.01-k6.
[SCSI] qla2xxx: Properly delete rports attached to a vport.
[SCSI] qla2xxx: Correct various NPIV issues.
[SCSI] qla2xxx: Correct qla2x00_eh_wait_on_command() to wait correctly.
...
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1623 commits)
netxen: update copyright
netxen: fix tx timeout recovery
netxen: fix file firmware leak
netxen: improve pci memory access
netxen: change firmware write size
tg3: Fix return ring size breakage
netxen: build fix for INET=n
cdc-phonet: autoconfigure Phonet address
Phonet: back-end for autoconfigured addresses
Phonet: fix netlink address dump error handling
ipv6: Add IFA_F_DADFAILED flag
net: Add DEVTYPE support for Ethernet based devices
mv643xx_eth.c: remove unused txq_set_wrr()
ucc_geth: Fix hangs after switching from full to half duplex
ucc_geth: Rearrange some code to avoid forward declarations
phy/marvell: Make non-aneg speed/duplex forcing work for 88E1111 PHYs
drivers/net/phy: introduce missing kfree
drivers/net/wan: introduce missing kfree
net: force bridge module(s) to be GPL
Subject: [PATCH] appletalk: Fix skb leak when ipddp interface is not loaded
...
Fixed up trivial conflicts:
- arch/x86/include/asm/socket.h
converted to <asm-generic/socket.h> in the x86 tree. The generic
header has the same new #define's, so that works out fine.
- drivers/net/tun.c
fix conflict between 89f56d1e9 ("tun: reuse struct sock fields") that
switched over to using 'tun->socket.sk' instead of the redundantly
available (and thus removed) 'tun->sk', and 2b980dbd ("lsm: Add hooks
to the TUN driver") which added a new 'tun->sk' use.
Noted in 'next' by Stephen Rothwell.
For messages from the tape core that is shared between the 3590 and 34xx
tape disciplines, we want to have the "tape" prefix instead of "tape_3590"
or "tape_34xx". In order to fix this, we now use the pr_xxx printk macros.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Merge cpuid.h header file into cpu.h.
While at it convert from typedef to struct declaration and also
convert cio code to use proper lowcore structure instead of casts.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Currently, when a tape device is set online and no cartridge is loaded, we
get the messages "The tape cartridge has been successfully unloaded" and
"Determining the size of the recorded area". These messages are not correct.
To fix this, we now print the "cartridge loaded/unloaded" messages only,
when the load/unload event really occurs. In addition to that, the message
"Determining the size of the recorded area" is only printed, if a cartridge
is loaded.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Use a console_initcall() to initialize the s390 virtio console and
clean up s390 console initialization in setup.c.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
If dev_set_name fails during scanning the AP bus, the reserved memory
has to be freed.
Signed-off-by: Felix Beck <felix.beck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Currently in the suspend process checksums for the XPRAM partitions are
created and stored. During the resume process it is checked,
if the checksums are still the same. If this is not the case, a kernel panic
is triggered. Unfortunately this prevents XPRAM from beeing used as suspend
device, because in this case after the checksum has been created, the
memory image is written to XPRAM and therefore the contents of the suspend
partition is changed. In order to allow XPRAM to be used as suspend device,
this patch removes the checksum validation.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The vmur class is allocated after the CCW driver is registered
and it is destroyed before the CCW driver is unregistered.
This is not the correct sequence, because the vmur class can be used
via driver core callbacks that are triggered during the CCW driver
deregistration. For Example:
1. vmur device is online
2. vmur module is unloaded
This leads to the following function call stack:
<4> [<0000000000387286>] device_destroy+0x36/0x5c
<4> [<000003e000209714>] ur_set_offline_force+0x9c/0x10c [vmur]
<4> [<000003e00020a928>] ur_remove+0x64/0xbc [vmur]
<4> [<00000000003e4d2e>] ccw_device_remove+0x42/0x1ac
<4> [<000000000038a1aa>] __device_release_driver+0x9a/0xe4
<4> [<000000000038a2da>] driver_detach+0xe6/0xec
<4> [<0000000000388ee4>] bus_remove_driver+0xc0/0x108
<4> [<000003e00020ad5a>] ur_exit+0x52/0x84 [vmur]
In device_destroy() the vmur class is used. Since it is already freed,
this can lead to a kernel panic.
To fix the problem, the vmur class has to be allocated before the CCW
driver is registered and destroyed after the CCW driver has ben unregistered.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
With CONFIG_DEBUG_OBJECTS_TIMERS=y "chccwdev --online" for a tape device
will fail with message "ODEBUG: object is on stack, but not annotated".
We now use init_timer_on_stack.
Signed-off-by: Frank Munzert <munzert@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Don't use kfree directly after device registration started.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This patch fixes message naming so that generic dasd messages do not
contain the device discipline. For this purpose the dev_ makros are
replaced by pr_ makros for generic dasd messages.
Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
A DASD device that is not ready or online has no defined disk layout,
so all requests that arrive in such a state need to be returned as
failed.
Signed-off-by: Stefan Weinhuber <wein@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
We used the init_name to set the console ccw_device's name early
at the boot stage. This patch moves the name setting (for all ccw
devices) to the point where we actually register the device. At this
time we can do dynamic allocations and therefore use dev_set_name.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
We use a test_and_clear_bit to prevent a device from being
unregistered twice. Unfortunately in this cases the "final"
put_device (from device_initialize) was issued more than once,
resulting in an use after free error. Fix this by moving this
put_device to ccw_device_unregister.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
We used the init_name to set the console subchannels name early
at the boot stage. With the patch cio: fix memleak in subchannel validation
we moved the name setting to the point where we actually register the
console subchannel. At this time we can do dynamic allocations and therefore
use dev_set_name.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
When scanning for new subchannels we have a code path where we allocate
memory for a struct subchannel, set the device name (which is dynamically
allocated now) and do a check if the underlying device is blacklisted - if
so we free the subchannel structure.
Since we have not set up refcounting at this stage, the device name's memory
is lost. Fix this by moving the dev_set_name after the blacklist test.
Note: With this patch the init_name for the console subchannel becomes
virtually obsolete.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
When using s390dbf with "%s" in sprintf format strings the string itself
is not copied to the dbf buffer.
Since in this case only pointers are stored in the s390dbf, we should
not use dev_name - which is bound to the lifetime of the device.
Reading this entry from s390dbf after the device was released will cause
an use after free error.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The number of qdio debugfs entries was limited. Remove this limit
and group the queue files in a per device directory.
Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
When unit checks trigger sensing the device state is set to W4SENSE
until sense completion; then the device state is set back to
ONLINE. If a unit check occurs while set online or set offline
requests are processed then it might happen that the device's
temporary W4SENSE state causes these functions to terminate,
leaving the device in an inconsistent state when the state is set
back to ONLINE later on so that the device cannot be set online or
offline any longer.
To solve this, set online/offline and related rollback or error
routines are processed only if the device is in a final or
DISCONNECTED state.
Signed-off-by: Michael Ernst <mernst@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Ensure to always hold an extra device reference for scheduling a
subchannel deregistration, by moving the get_device to
ccw_device_schedule_sch_unregister. This fixes an use after free
error in ccw_device_call_sch_unregister where put_device was called
on an already freed device structure.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
With commit c38f960809 polling was
stopped for the queue even if new data is available.
Return immediately after scheduling the queue tasklet if the queue
is not done.
Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Move debug traces for start I/O and interrupt events to exclusive
trace levels. Also change tracing in hot-path from sprintf (costly)
to hex.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
If online/offline processing of a ccw device fails, resulting in not
operational state, notify the driver and unregister the device in case
the driver dosn't want to keep it.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Ensure that the hardware interruption parameter for a subchannel is
reset when the associated subchannel data structure is freed.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
All scsw helper functions are very short and usage of them shouldn't
result in function calls. Therefore we move them to a separate header
file.
Also saves a lot of EXPORT_SYMBOLs.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Path verification events occurring for offline devices are currently
ignored. As a result, offline devices are not removed, even though
they might no longer be accessible (for example because the last path
to the device was varied offline). Fix this by scheduling a status
evaluation for the affected subchannel when a path verification event
occurs.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Remove expensive ktime_get()/ktime_us_delta() functions from the hot
path and use get_clock_monotonic() instead. This elimates seven
function calls and avoids a lot of unnecessary calculations.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The timestamp calculation used for s390dbf output is the same in a
private zfcp function and in debug.c. Replace both with a common
inline function.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
dev_set_name tries to allocate memory, so check the return value for
allocation failures. After dev_set_name succeeds, call device_register
as next step to be able to use put_device during error handling.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Don't use kfree directly after device registration started.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The config semaphore is only used as a mutex, so replace it with a
simple mutex.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
So far, zfcp allocated all resources required for FCP
adapters/subchannels when the device was discovered in the ccw_probe
callback. If there are lots of unused FCP subchannels attached to a
system, this is a waste of resources. To alleviate this, defer the
resource allocation to the first call to ccw_set_online. To avoid
disruptions during possible following calls to ccw_set_offline and
then ccw_set_online, keep the adapter resources until the device is
finally being removed via ccw_remove. While doing this, also manage
the zfcp erp thread together with all other adapter resources in
zfcp_adapter_enqueue/dequeue.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The recommendation for a timeout of 2 * R_A_TOV is the same for ct/gs
and els requests, so set it in the common function used for
initializing both request types. Besides, the timer inside zfcp should
only run longer than the timeout set for the channel, so 10 seconds
more should be enough (instead of 60 seconds).
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Switch the creation of the zfcp erp thread from the deprecated
kernel_thread API to the kthread API. This allows also the removal of
some flags in zfcp since the kthread API handles thread creation and
shutdown internally. To allow the usage of the kthread_stop function,
replace the erp ready semaphore with a waitqueue for waiting until erp
actions arrive on the ready queue.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The fc_rport structure reserves a reference where a LLD can put
information required in a situation where the fc transport class is
triggering LLD callbacks. The zfcp driver was using this variable
directly which is discouraged. This patch solves this issue by making
this reference unnecessary. In addition the dev_loss_tmo callback is
removed, it is not required: zfcp does not access the fc_rport after
calling fc_remote_port_delete.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Update the Fibre Channel related code to use the zfcp_fc prefix.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Extract independent data structures and introduce common _setup and
_destroy routines for QDIO and Fibre Channel related data structures
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Change the dbf data and functions to use the zfcp_dbf prefix
throughout the code. Also change the calls to dbf to use zfcp_dbf
instead of zfcp_adapter.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Don't let the erp wait for gid_pn requests to complete. Instead, queue
the gid_pn work, exit erp and let the finished gid_pn work trigger a
new port reopen.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The zfcp_adapter structure was growing over time to a size of almost
one memory page. To reduce the size of the data structure and to
seperate different layers, put all qdio related data in the new
zfcp_qdio data structure.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Split all qdio related attributes out of zfcp_fsf_req and put it in
new structure.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Remove the global driver work queue and replace it with a workqueue
local to the adapter. The usage of this workqueue makes this the
correct place for the structure. In addition multiple adapters won't
block each other due to the serialization of the queued work.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The flag ZFCP_REQ_AUTO_CLEANUP was useless as the
ZFCP_STATUS_FSFREQ_CLEANUP flag is there for exactly the same purpose.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Remove the special case for NO_QTCB requests and optimize the
mempool and cache processing for fsfreqs. Especially use seperate
mempools for the zfcp_fsf_req and zfcp_qtcb structs.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The combination wait_queue/wakeup in conjunction with the flag
ZFCP_STATUS_FSFREQ_COMPLETED to signal the completion of an fsfreq
was not race-safe and can be better solved by a completion.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
There is no need for the QDIO layer to have knowledge or do things
wich are done better by the FSF layer and vice versa. Straighten a
few things to improve vividness.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
An adapter shutdown implicitly closes all open ports. Make sure to
mark all WKA ports as offline, not only the directory server. Also
make sure that no pending wka port work is running when the adapter is
being removed.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
When the FCP channel returns a series of commands with the error
status "test link", zfcp will send a series of ELS ADISC commands.
This is technically no problem, but it is enough to only issue one
test command per remote port. So, track whether a ELS ADISC command is
already pending, and do not send a new one if there is already a
pending command.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Using a bitwise OR to not set anything at all is pointless so remove
the useless statement.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The default trace level is to only trace failed FSF commands. Thus it
is not necessary to collect trace data for most FSF commands, since
it will be thrown away later. Restructure the FSF/HBA trace
infrastructure to first check the trace level in a inline function and
only do the expensive data collection for matching trace levels.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The default trace level is to only trace failed SCSI commands. Thus it
is not necessary to collect trace data for most SCSI commands since it
will be thrown away later. Restructure the SCSI trace infrastructure
to first check the trace level in a inline function and only do the
expensive data collection for matching trace levels.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The struct zfcp_adapter includes everything related to the debug
traces. This introduces dependences between the definitions in
zfcp_def.h and zfcp_dbf.h. Move all debug related data structures to a
new data structure to break those dependencies and manage the debug
data in zfcp_dbf.[hc].
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
In certain error scenarios ports, rports are getting attached,
validated and removed from the systems environment. Depending on the
layer this occurs asynchronously. This patch fixes the few races
which existed and ensures all references and cross references are
cleared at the time they're invalid. In addition fc transports
actions are only scheduled when required.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The generic packet receive code takes care of setting
netdev->last_rx when necessary, for the sake of the
bonding ARP monitor.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Neil Horman <nhorman@txudriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
No need to put ethtool_ops in data, they should be const.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If nothing has been written into the qeth sysfs-attribute layer2,
its value is "-1" meaning "not yet defined". But the value is
displayed as "1" meaning "layer2 selected". The patch changes the
reading of this "-1"-value to "-1" to make clear the layer2-attribute
has not yet been defined.
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
qeth displayed an entry in /proc/service_level even when no valid
MCL-string was available (the MCL info is blank). The change is to
create an entry in /proc/service_level only when MCL-string is
non-zero.
Signed-off-by: Klaus-Dieter Wacker <kdwacker@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clear separation of cast-type determination (send path) for layer-2
resp. layer-3. Allowing to have inline functions for qeth layer-
discipline.
Signed-off-by: Klaus-Dieter Wacker <kdwacker@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In case the IP address list contains entries (not removed when the device was set
offline) this entries should be registered next time the device is brought online.
In the past this was done implicitly with the device open call but since we wait
in the set IPv4 IPA and the device open common code holds various locks this
does not work any longer.
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Qeth HiperSockets support now retries sending of packets when the
IBM System z signals a temporary resource shortage (e.g. target
buffer full). The packet is enqueued into the device queue.
After 3 times of unsuccessful send the packet is dropped.
Signed-off-by: Klaus-Dieter Wacker <kdwacker@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the NULL test on block is needed, it should be before the dereference of
the base field.
The semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@r@
expression E1,E2;
identifier fld;
statement S1,S2;
@@
E1 = E2->fld;
(
if (E1 == NULL) S1 else S2
|
*if (E2 == NULL) S1 else S2
)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
If io_subchannel_initialize_dev fails it will release the only
reference to the ccw device therefore the caller should not
kfree this device since this is done in the release function.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6: (23 commits)
[SCSI] sd: Avoid sending extended inquiry to legacy devices
[SCSI] libsas: fix wide port hotplug issues
[SCSI] libfc: fix a circular locking warning during sending RRQ
[SCSI] qla4xxx: Remove hiwat code so scsi eh does not get escalated when we can make progress
[SCSI] qla4xxx: Fix srb lookup in qla4xxx_eh_device_reset
[SCSI] qla4xxx: Fix Driver Fault Recovery Completion
[SCSI] qla4xxx: add timeout handler
[SCSI] qla4xxx: Correct Extended Sense Data Errors
[SCSI] libiscsi: disable bh in and abort handler.
[SCSI] zfcp: Fix tracing of request id for abort requests
[SCSI] zfcp: Fix wka port processing
[SCSI] zfcp: avoid double notify in lowmem scenario
[SCSI] zfcp: Add port only once to FC transport class
[SCSI] zfcp: Recover from stalled outbound queue
[SCSI] zfcp: Fix erp escalation procedure
[SCSI] zfcp: Fix logic for physical port close
[SCSI] zfcp: Use -EIO for SBAL allocation failures
[SCSI] zfcp: Use unchained mode for small ct and els requests
[SCSI] zfcp: Use correct flags for zfcp_erp_notify
[SCSI] zfcp: Return -ENOMEM for allocation failures in zfcp_fsf
...
The trace record for SCSI abort requests has a field for the request
id of the request to be aborted. Put the real request id instead of
zero.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Under certain conditions it is possible that a WKA port ist not opened
within the expected timeframe of half a second. In this situation
the WKA port remains in the state OPENING preventing any succeding
request to open the port. This led to unrecoverable remote ports.
Fixing this by always setting an appropriate WKA port status before
leaving the function and removing the timeout value here since it's
not needed here because the general timeout processing would deal
with it if required.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
In a LOWMEM condition an ERP notification would have been sent twice
causing an unpredictable behaviour of the ERP.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
When calling fc_remote_port_add make sure to not call it again before
fc_remote_port_delete has been called. In other words, ensure to
create a new fc_rport, then delete it, then create a new one again.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Depending on interruptions on some storage systems, the complete
channel can stall which looks like an outbound queue stall to Linux.
When trying to acquire a free SBAL for a non-SCSI command, zfcp waits
for 5 seconds for a free slot to appear. This is the right place to
detect a queue stall: If the wait times out, we assume a stalled queue
and try to recover this.
The overall strategy should be to trigger the erp from specific
events, and not try an overall escalation from one failed port to a
full-blown queue recovery. If we manage to send a command, the status
codes for this command or a timeout will trigger the right follow-on
actions.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
If an action fails, retry it until the erp count exceeds the
threshold. If there is something fundamentally wrong, the FSF layer
will trigger a more appropriate action depending on the FSF status
codes.
The followup for successful actions is a different followup than
retrying failed actions, so split the code two functions to make this
clear.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
After closing the port, we want it to be "not open" to consider the
action to be successful.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The ELS ADISC and the GID_PN requests sent from zfcp fit into
unchained FSF requests. Change the FSF allocation logic to use
unchained requests whenever possible where everything fits in one
SBAL. This avoids acquiring more SBALs than necessary, especially
during zfcp recovery when things might be stalled.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
zfcp_erp_notify uses the ZFCP_ERP_STATUS_* flags, so it is
ZFCP_STATUS_ERP_LOWMEM instead of ZFCP_ERP_NOMEM. Signalling
ZFCP_ERP_FAILED is not necessary, the missing d_id will show that the
nameserver did not return the d_id.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
When a fsf_req or a qtcb cannot be allocated return -ENOMEM instead of
-EIO.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
req_q_util is not atomic, so the qdio_stat_lock must be held when
reading this variable.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
We should not modify the port status after triggering an ERP action
for the port. It is not guaranteed which status is finally active
when the ERP action is performed. This can lead to situations which
are unwanted and hard to debug in case of a failure.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Every time a request is enqueued or there is some work outstanding
from the ap_tasklet, the ap_poll_timer is scheduled again.
Unfortunately it was permanently called. It looked as if it was
started in the past and thus imediately expired.
This has been changed. First it is checked if the hrtimer is already
expired. Then the expiring time is forwarded and the timer restarted.
Signed-off-by: Felix Beck <felix.beck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* Remove smp_lock.h from files which don't need it (including some headers!)
* Add smp_lock.h to files which do need it
* Make smp_lock.h include conditional in hardirq.h
It's needed only for one kernel_locked() usage which is under CONFIG_PREEMPT
This will make hardirq.h inclusion cheaper for every PREEMPT=n config
(which includes allmodconfig/allyesconfig, BTW)
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
From: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
remove loop, add some debug data and use get_sense function
Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fix this:
drivers/s390/char/monreader.c: In function 'mon_open':
drivers/s390/char/monreader.c:323: warning: passing argument 1 of 'dev_set_drvdata' from incompatible pointer type
include/linux/device.h:457: note: expected 'struct device *' but argument is of type 'struct device **'
drivers/s390/char/monreader.c: In function 'monreader_freeze':
drivers/s390/char/monreader.c:466: warning: passing argument 1 of 'dev_get_drvdata' from incompatible pointer type
include/linux/device.h:452: note: expected 'const struct device *' but argument is of type 'struct device **'
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Define an empty static inline version of sclp_console_pm_event()
to fix the build error below for !SCLP_CONSOLE.
drivers/s390/built-in.o: In function `sclp_rw_pm_event':
sclp_rw.c:(.text+0x12f68): undefined reference to `sclp_console_pm_event'
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This patch converts the remaining occurences of raw return values to their
symbolic counterparts in ndo_start_xmit() functions that were missed by the
previous automatic conversion.
Additionally code that assumed the symbolic value of NETDEV_TX_OK to be zero
is changed to explicitly use NETDEV_TX_OK.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is the result of an automatic spatch transformation to convert
all ndo_start_xmit() return values of 0 to NETDEV_TX_OK.
Some occurences are missed by the automatic conversion, those will be
handled in a seperate patch.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>