The block device drivers have all gained new lock_kernel
calls from a recent pushdown, and some of the drivers
were already using the BKL before.
This turns the BKL into a set of per-driver mutexes.
Still need to check whether this is safe to do.
file=$1
name=$2
if grep -q lock_kernel ${file} ; then
if grep -q 'include.*linux.mutex.h' ${file} ; then
sed -i '/include.*<linux\/smp_lock.h>/d' ${file}
else
sed -i 's/include.*<linux\/smp_lock.h>.*$/include <linux\/mutex.h>/g' ${file}
fi
sed -i ${file} \
-e "/^#include.*linux.mutex.h/,$ {
1,/^\(static\|int\|long\)/ {
/^\(static\|int\|long\)/istatic DEFINE_MUTEX(${name}_mutex);
} }" \
-e "s/\(un\)*lock_kernel\>[ ]*()/mutex_\1lock(\&${name}_mutex)/g" \
-e '/[ ]*cycle_kernel_lock();/d'
else
sed -i -e '/include.*\<smp_lock.h\>/d' ${file} \
-e '/cycle_kernel_lock()/d'
fi
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
st_open() suggests that llseek() doesn't work: "We really want to do
nonseekable_open(inode, filp); here, but some versions of tar incorrectly
call lseek on tapes and bail out if that fails. So we disallow pread()
and pwrite(), but permit lseeks."
Instead of using the fallback default_llseek() the driver should use
noop_llseek() which leaves the file->f_pos untouched but succeeds.
Signed-off-by: Jan Blunck <jblunck@suse.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Kai Makisara <Kai.Makisara@kolumbus.fi>
Cc: Willem Riede <osst@riede.org>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the followings.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Except for SCSI no device drivers distinguish between physical and
hardware segment limits. Consolidate the two into a single segment
limit.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
dio transfer always resets mdata->page_order to zero. It breaks
high-order pages previously allocated for non-dio transfer.
This patches adds reserved_page_order to st_buffer structure to save
page order for non-dio transfer.
http://bugzilla.kernel.org/show_bug.cgi?id=14563
When enlarge_buffer() allocates 524288 from 0, st uses six-order page
allocation. So mdata->page_order is 6 and frp_seg is 2.
After that, if st uses dio, sgl_map_user_pages() sets
mdata->page_order to 0 for st_do_scsi(). After that, when we call
normalize_buffer(), it frees only free frp_seg * PAGE_SIZE (2 * 4096)
though we should free frp_seg * PAGE_SIZE << 6 (2 * 4096 << 6). So we
see buffer_size is set to 516096 (524288 - 8192).
Reported-by: Joachim Breuer <linux-kernel@jmbreuer.net>
Tested-by: Joachim Breuer <linux-kernel@jmbreuer.net>
Acked-by: Kai Makisara <kai.makisara@kolumbus.fi>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: stable@kernel.org
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
value cannot logically be less than START and greater than BUFFERSIZE.
#define EXTENDED_SENSE_START 18
// vi include/scsi/scsi_cmnd.h +105
#define SCSI_SENSE_BUFFERSIZE 96
[akpm@linux-foundation.org: fix warning]
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Kai Makisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
A memory use after free bug can manifest if the MTSETBLK or SET_DENS_AND_BLK
ioctl features are used to set the tape's blocksize from 0 to non-zero.
After the driver sets the new block size, in this one case it calls
normalize_buffer() to free the device's internal data buffers. However, the
ioctl code assumes there is always a buffer and does not check or allocate
a buffer if there isn't one. So any following ioctl calls can corrupt
a part of memory by writing data to memory that the st driver had freed.
This patch removes the normalize_buffer() call and the specialness of
changing from a 0 to non-zero blocksize to fix the possible use of
memory after it has been freed by the st driver.
signed-off-by: David Jeffery <djeffery@redhat.com>
Acked-by: Kai Makisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Conflicts:
drivers/message/fusion/mptsas.c
fixed up conflict between req->data_len accessors and mptsas driver updates.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This patch fixes the GCC 4.4 warning reported by David Binderman and Sergey
Senozhatsky. The old version was working correctly but was not easy to read.
Signed-off-by: Kai Makisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Convert all external users of queue limits to using wrapper functions
instead of poking the request queue variables directly.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
rq->data_len served two purposes - the length of data buffer on issue
and the residual count on completion. This duality creates some
headaches.
First of all, block layer and low level drivers can't really determine
what rq->data_len contains while a request is executing. It could be
the total request length or it coulde be anything else one of the
lower layers is using to keep track of residual count. This
complicates things because blk_rq_bytes() and thus
[__]blk_end_request_all() relies on rq->data_len for PC commands.
Drivers which want to report residual count should first cache the
total request length, update rq->data_len and then complete the
request with the cached data length.
Secondly, it makes requests default to reporting full residual count,
ie. reporting that no data transfer occurred. The residual count is
an exception not the norm; however, the driver should clear
rq->data_len to zero to signify the normal cases while leaving it
alone means no data transfer occurred at all. This reverse default
behavior complicates code unnecessarily and renders block PC on some
drivers (ide-tape/floppy) unuseable.
This patch adds rq->resid_len which is used only for residual count.
While at it, remove now unnecessasry blk_rq_bytes() caching in
ide_pc_intr() as rq->data_len is not changed anymore.
Boaz : spotted missing conversion in osd
Sergei : spotted too early conversion to blk_rq_bytes() in ide-tape
[ Impact: cleanup residual count handling, report 0 resid by default ]
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Cc: Borislav Petkov <petkovbb@googlemail.com>
Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Cc: Mike Miller <mike.miller@hp.com>
Cc: Eric Moore <Eric.Moore@lsi.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Doug Gilbert <dgilbert@interlog.com>
Cc: Mike Miller <mike.miller@hp.com>
Cc: Eric Moore <Eric.Moore@lsi.com>
Cc: Darrick J. Wong <djwong@us.ibm.com>
Cc: Pete Zaitcev <zaitcev@redhat.com>
Cc: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
The SUGGEST_* flags in the SCSI command result have been out of fashion
for a while and we don't actually use them in the error handling.
Remove the remaining occurrences.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Make enlarge_buffer() retry allocation if the previously chosen page
order was too small. Really limit the page order to 6. Return error if
the maximum order is not large enough for the request.
Signed-off-by: Kai Makisara <Kai.Makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This integrates st_scsi_kern_execute and st_do_scsi. IOW, it removes
st_scsi_kern_execute. Then st has a single function, st_do_scsi, to
perform SCSI commands.
Signed-off-by: Kai Makisara <Kai.Makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
frp_sg_current in struct st_buffer is always zero. We don't need it.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Kai Makisara <Kai.Makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
orig_frp_segs in struct st_buffer is always zero. We don't need it.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Kai Makisara <Kai.Makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
- remove the from_initialization argument, which is always 1. We
always need to use GFP_ATOMIC.
- 'got' valuable is initialized to zero and doesn't change. We don't
need it.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Kai Makisara <Kai.Makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This removes the usage of struct scatterlist completely.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Kai Makisara <Kai.Makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This removes struct st_buff_fragment and use reserved_pages array to
store fragment buffer.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Kai Makisara <Kai.Makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This removes unused buf_to_sg() that the non-dio path used.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Kai Makisara <Kai.Makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This patch converts the dio path (mmap) to use st_scsi_execute. IOW,
it removes scsi_execute_async in the non dio path.
scsi_execute_async has gone! This also remove unused st_sleep_done.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Kai Makisara <Kai.Makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This patch converts the non-dio path (fragment buffer path) to use
st_scsi_execute. IOW, it removes scsi_execute_async in the non-dio
path.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Kai Makisara <Kai.Makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
st_scsi_execute is a helper function to perform SCSI commands
involving data transfer between user and kernel space (st_read and
st_write).
It's the future plan to combine this with st_scsi_kern_execute helper
function.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Kai Makisara <Kai.Makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This adds struct rq_map_data and the array of pointers to store
fragment buffers to struct st_buffer.
This patch doesn't remove st_buf_fragment but the latter patch does.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Kai Makisara <Kai.Makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This patch simiplifies the fragment buffer management a bit, all the
buffers in the fragment list become the same size. This is necessary
to use the block layer API (sg driver was modified in the same way)
since the block layer API takes the same size page frames instead of
scatter gatter.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Kai Makisara <Kai.Makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This replaces st_do_scsi in st_int_ioctl with st_scsi_kern_execute.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Kai Makisara <Kai.Makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This replaces st_do_scsi in get_location (READ_POSITION) with
st_scsi_kern_execute.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Kai Makisara <Kai.Makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This replaces st_do_scsi in write_mode_page (MODE_SELECT) with
st_scsi_kern_execute.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Kai Makisara <Kai.Makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This replaces st_do_scsi in read_mode_page (MODE_SENSE) with
st_scsi_kern_execute.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Kai Makisara <Kai.Makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This replaces st_do_scsi in check_tape (READ_BLOCK_LIMITS and
MODE_SENSE) with st_scsi_kern_execute.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Kai Makisara <Kai.Makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This replaces st_do_scsi in st_flush (WRITE FILEMARKS) with
st_scsi_kern_execute.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Kai Makisara <Kai.Makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This replaces st_do_scsi in cross_eof (SPACE) with
st_scsi_kern_execute.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Kai Makisara <Kai.Makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This replaces st_do_scsi in do_load_unload (START STOP) with
st_scsi_kern_execute.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Kai Makisara <Kai.Makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This replaces st_do_scsi in set_location (LOCATE 10) with
st_scsi_kern_execute.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Kai Makisara <Kai.Makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This replaces st_do_scsi in test_ready (TEST_UNIT_READY) with
st_scsi_kern_execute.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Kai Makisara <Kai.Makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
st_scsi_kern_execute is a helper function to perform SCSI commands
synchronously. It supports data transfer with a liner in-kernel buffer
(not scatter gather). st_scsi_kern_execute internally uses
scsi_execute().
The majority of st_do_scsi can be replaced with
st_scsi_kern_execute. This is a preparation for rewriting st_do_scsi
to remove obsolete scsi_execute_async().
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This moves st_request initialization code to st_allocate_request()
form st_do_scsi(). This is a preparation for making
st_allocate_request() usable for everyone, not only st_do_scsi().
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Kai Makisara <Kai.Makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Since we're trying to eliminate struct scsi_device timeout, the tape
driver has to be updated to use the block queue timeout instead. The
tape use of scsi_device timeout looks to be self consistent, so I don't
think this necessarily fixes any bug, but it has to be done to allow me
to remove the timeout parameter from struct scsi_device.
Acked-by: Kai Makisara <Kai.Makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Now that device_create() has been audited, rename things back to the
original call to be sane.
Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Mike Christie noticed a bogus memset. It can be removed as dead code
since the number of bytes in the driver buffer in fixed block mode is
always a multiple of the tape block size.
Signed-off-by: Kai Mäkisara <Kai.Makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Move buffer pointer back when data could not be written. Bug found by
Mike Christie.
Signed-off-by: Kai Mäkisara <Kai.Makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
There is a race from when a device is created with device_create() and
then the drvdata is set with a call to dev_set_drvdata() in which a
sysfs file could be open, yet the drvdata will be NULL, causing all
sorts of bad things to happen.
This patch fixes the problem by using the new function,
device_create_drvdata(). It fixes the problem in all of the scsi
drivers that need it.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Doug Gilbert <dgilbert@interlog.com>
Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
There's a change in the SCSI tree that adds another class_device, so change
it to an ordinary device
[jejb: this one got rebased until it's basically cosmetic only]
Cc: Kai Makisara <Kai.Makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
It's big, but there doesn't seem to be a way to split it up smaller...
Signed-off-by: Tony Jones <tonyj@suse.de>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Cc: Roland Dreier <rolandd@cisco.com>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch fixes the following namespace collision with
include/asm-avr32/cacheflush.h :
<-- snip -->
...
CC [M] drivers/scsi/st.o
/home/bunk/linux/kernel-2.6/git/linux-2.6/drivers/scsi/st.c:629:53: error: macro "flush_write_buffer" passed 1 arguments, but takes just 0
...
make[3]: *** [drivers/scsi/st.o] Error 1
<-- snip -->
st now uses st_flush_write_buffer()
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Show the current binary tape driver and mode options is sysfs. A file
(options) is created in each directory in /sys/class/scsi_tape. The files
contain masks showing the options. The mask bit definitions are the same as
used when setting the options using the MTSETDRVBUFFER function in the
MTIOCTOP ioctl (defined in include/linux/mtio.h). For example:
> cat /sys/class/scsi_tape/nst0/options
0x00000d07
[jejb: updated doc with correction from Randy Dunlap]
Signed-off-by: Kai Makisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Add new option MT_ST_SILI to enable setting the SILI bit in reads in variable
block mode. If SILI is set, reading a block shorter than the byte count does
not result in CHECK CONDITION. The length of the block is determined using the
residual count from the HBA. Avoiding the REQUEST SENSE command for every
block speeds up some real applications considerably.
Signed-off-by: Kai Makisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Remove the now useless counting of adjacent pages from the debugging code in
to make it compile when DEBUG is set non-zero.
Signed-off-by: Kai Makisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Convert st to unlocked_ioctl. The necessary locking was already in place.
Signed-off-by: Kai Makisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This is caused by a missing scatterlist initialisation (it only shows
up when sg list handling debugging is turned on).
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Kai Makisara <Kai.Makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Most drivers need to set length and offset as well, so may as well fold
those three lines into one.
Add sg_assign_page() for those two locations that only needed to set
the page, where the offset/length is set outside of the function context.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
The SCSI Tape driver uses a semaphore as mutex. Use the mutex API
instead of the (binary) semaphore.
Signed-off-by: Matthias Kaehlcke <matthias.kaehlcke@gmail.com>
Acked-by: Kai Makisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
bsg uses scsi_cmd_ioctl() for some SCSI/sg ioctl
commands. scsi_cmd_ioctl() gets a request queue from a gendisk
arguement. This prevents bsg being bound to SCSI devices that don't
have a gendisk (like OSD). This adds a request_queue argument to
scsi_cmd_ioctl(). The SCSI/sg ioctl commands doesn't use a gendisk so
it's safe for any SCSI devices to use scsi_cmd_ioctl().
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
The following patch adds support for sysfs/uevent modalias
attribute for scsi devices (like disks, tapes, cdroms etc),
based on whatever current sd.c, sr.c, st.c and osst.c drivers
supports.
The modalias format is like this:
scsi:type-0x04
(for TYPE_WORM, handled by sr.c now).
Several comments.
o This hexadecimal type value is because all TYPE_XXX constants
in include/scsi/scsi.h are given in hex, but __stringify() will
not convert them to decimal (so it will NOT be scsi:type-4).
Since it does not really matter in which format it is, while
both modalias in module and modalias attribute match each other,
I descided to go for that 0x%02x format (and added a comment in
include/scsi/scsi.h to keep them that way), instead of changing
them all to decimal.
o There was no .uevent routine for SCSI bus. It might be a good
idea to add some more ueven environment variables in there.
o osst.c driver handles tapes too, like st.c, but only SOME tapes.
With this setup, hotplug scripts (or whatever is used by the
user) will try to load both st and osst modules for all SCSI
tapes found, because both modules have scsi:type-0x01 alias).
It is not harmful, but one extra module is no good either.
It is possible to solve this, by exporting more info in
modalias attribute, including vendor and device identification
strings, so that modalias becomes something like
scsi:type-0x12:vendor-Adaptec LTD:device-OnStream Tape Drive
and having that, match for all 3 attributes, not only device
type. But oh well, vendor and device strings may be large,
and they do contain spaces and whatnot.
So I left them for now, awaiting for comments first.
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Many struct file_operations in the kernel can be "const". Marking them const
moves these to the .rodata section, which avoids false sharing with potential
dirty data. In addition it'll catch accidental writes at compile time to
these shared resources.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On Thu, 1 Feb 2007, Andrew Morton wrote:
> On Thu, 1 Feb 2007 15:34:29 -0800
> bugme-daemon@bugzilla.kernel.org wrote:
>
> > http://bugzilla.kernel.org/show_bug.cgi?id=7919
> >
> > Summary: Tape dies if wrong block size used
> > Kernel Version: 2.6.20-rc5
> > Status: NEW
> > Severity: normal
> > Owner: scsi_drivers-other@kernel-bugs.osdl.org
> > Submitter: dmartin@sccd.ctc.edu
> >
> >
> > Most recent kernel where this bug did *NOT* occur: 2.6.17.14
> >
> > Other Kernels Tested and Results:
> >
> > OK 2.6.15.7
> > OK 2.6.16.37
> > OK 2.6.17.14
> > BAD 2.6.18.6
> > BAD 2.6.18-1.2869.fc6
> > BAD 2.6.19.2 +
> > BAD 2.6.20-rc5
> >
> > NOTE: 2.6.18-1.2869.fc6 is a Fedora modified kernel, all others are from kernel.org
> >
...
> > Steps to reproduce:
> > Get a Adaptec AHA-2940U/UW/D / AIC-7881U card and a tape drive,
> > install a recent kernel
> > set the tape block size - mt setblk 4096
> > read from or write to tape using wrong block size - tar -b 7 -cvf /dev/tape foo
> >
Write does not trigger this bug because the driver refuses in fixed block
mode writes that are not a multiple of the block size. Read does trigger
it in my system.
The bug is not associated with any specific HBA. st tries to do direct i/o
in fixed block mode with reads that are not a multiple of tape block size.
The patch in this message fixes the st problem by switching to using the
driver buffer up to the next close of the device file in fixed block mode
if the user asks for a read like this.
I don't know why the bug has surfaced only after 2.6.17 although the st
problem is old. There may be another bug in the block subsystem and this
patch works around it. However, the patch fixes a problem in st and in
this way it is a valid fix.
This patch may also fix the bug 7900.
The patch compiles and is lightly tested.
Signed-off-by: Kai Makisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
On Wed, 24 Jan 2007, Andrew Morton wrote:
> On Mon, 22 Jan 2007 13:07:20 -0800
> bugme-daemon@bugzilla.kernel.org wrote:
>
> > http://bugzilla.kernel.org/show_bug.cgi?id=7864
> >
> > Summary: A MTIOCTOP/MTWEOF within the early warning will cause
> > the file number to be incorrect
> > Kernel Version: 2.6.19.2
> > Status: NEW
> > Severity: low
> > Owner: io_scsi@kernel-bugs.osdl.org
> > Submitter: ce_reisinger@yahoo.com
> >
> >
> > Write records to a SCSI tape until a write fails with a ENOSPC (you have reached
> > early warning.
> > Now perform a:
> > struct mtget before, after;
> > ioctl(fd, MTIOCGET, &before);
> > struct mtop mtop = { MTWEOF, 1 };
> > ioctl(fd, MTIOCTOP, &mtop);
> > ioctl(fd, MTIOCGET, &after);
> >
> > Check the value of mt_fileno in the before and after structures. Notice the
> > after is 2 greater then the before.
> >
> > The problem appears to be in the block of code starting at line 2817 in st.c.
> > This block is entered because the drive did return a CHECK CONDITION with NO
> > SENSE and the SENSE_EOM bit set. At lines 2824/5 the fileno is incremented. But
> > it has already been increased by the number of filemarks requested by the
> > MTIOCTOP. I believe that the residue count in the sense data should be
> > subtracted from fileno, not a increment as is done.
> >
>
> Thanks. Could you please send us a tested patch to fix these things, as
> per http://www.zip.com.au/~akpm/linux/patches/stuff/tpp.txt ?
>
The analysis is basically correct and explains the bug. According to the
SCSI standards, the sense code is NO SENSE or RECOVERED ERROR in case
writing filemark(s) succeeds. If it fails (partly or completely) the sense
code is VOLUME OVERFLOW. The patch below is tested to fix the case when
one filemark is successfully written after the EOM early warning. It
should also fix the case at real EOM but this has not been tested.
Carl, thanks for reporting the bug and providing the analysis for the fix.
Signed-off-by: Kai Makisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Printk -> sdev_printk change originally from Luben Tuikov
<ltuikov@yahoo.com>. Loglevel changes prompted by Matthew Wilcox
<matthew@wil.cx>.
Signed-off-by: Kai Makisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Based on the original patch from Hannes Reinecke <hare@suse.de>
Fix st_open() to return -ENOMEDIUM instead of -EIO if no medium is
found.
Signed-off-by: Kai Makisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
- Notice and handle sysfs errors in module init, tape init
- Properly unwind errors in module init
- Remove bogus st_sysfs_class==NULL test, it is guaranteed !NULL at that point
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Convert this:
st0: Error with sense data: <6>st: Current: sense key: Illegal Request
Additional sense: Invalid field in cdb
To this:
st0: Current: sense key: Illegal Request
Additional sense: Invalid field in cdb
Signed-off-by: Luben Tuikov <ltuikov@yahoo.com>
Acked-by: Kai Makisara <kai.makisara@kolumbus.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
I noticed that in_use in st_buffer is not used. The patch below
against 2.6.17-rc3 removes it, assuming there is no future use for it.
It was tested in a sparc SS20 with a DLT4000.
Signed-off-by: Martin Habets <errandir_news@mph.eclipse.co.uk>
Acked-by: Kai Mkisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Pass the POSIX lock owner ID to the flush operation.
This is useful for filesystems which don't want to store any locking state
in inode->i_flock but want to handle locking/unlocking POSIX locks
internally. FUSE is one such filesystem but I think it possible that some
network filesystems would need this also.
Also add a flag to indicate that a POSIX locking request was generated by
close(), so filesystems using the above feature won't send an extra locking
request in this case.
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Use ARRAY_SIZE macro instead of sizeof(x)/sizeof(x[0]) and remove
duplicates of the macro.
Signed-off-by: Tobias Klauser <tklauser@nuerscht.ch>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
As devfs has been disabled from the kernel tree for a number of months
now (5 to be exact), here's a patch against 2.6.16-rc1-git1 that removes
support for it from the SCSI subsystem.
The patch also removes the scsi_disk devfs_name field as it's no longer
needed.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Change the core SCSI code to use kzalloc rather than kmalloc+memset
where possible.
Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
When the scsi_execute_async interface was added it ended up reducing
the flexibility of userspace to send arbitrary scsi commands through
sg using SG_IO. The SG_IO interface allows userspace to specify the
CDB length. This is now ignored in scsi_execute_async and it is
guessed using the COMMAND_SIZE macro, which is not always correct,
particularly for vendor specific commands. This patch adds a cmd_len
parameter to the scsi_execute_async interface to allow the caller
to specify the length of the CDB.
Signed-off-by: Brian King <brking@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
LLDDs should never see REQ_BLOCK_PC requests, we can handle them just
fine in the core code. There is a small behaviour change in that some
check in sr's rw_intr are bypassed, but I consider the old behaviour
a bug.
Mike found this cleanup opportunity and provdided early patches, so all
the credit goes to him, even if I redid the patches from scratch beause
that was easier than forward-porting the old patches.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
the scsi layer is using semaphores in a mutex way, this patch converts
these into using mutexes instead
Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This merge is pretty extensive. The conflict is over the new
req->retries parameter, so I had to change the prototype to
scsi_setup_blk_pc_cmnd() and the usage in sd, sr and st.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Patch from Kai minus last sg_segs clearing which was merged already.
> > Was there a oops or lockup or any debug output you can send me? I will try
> > some more large request tests with scsi_debug. You also have to compile your
> > kernel with SCSI_MAX_PHYS_SEGMENTS == 255 to get larger requests now.
>
It was an oops in sgl_unmap_user_pages(). The reason is this:
/* XXX: just for debug. Remove when PageReserved is removed */
BUG_ON(PageReserved(page));
I was using /dev/zero as input and it triggers this. When I used a file as
input, this did not trigger. Should this BUG_ON be removed?
In the same log I noticed that there was another ->sg_segs inconsistency.
Also, the field ->last_SRpnt was not reset when scsi_execute_async()
failed. This caused the error message "Async command already active"
later and prevented proper close.
While doing the changes, I noticed that the current code (since
2.6.0-test4) does not set the pages dirty when reading with direct i/o.
All of these st problems (including the one I sent earlier) are fixed in
the patch at the end of this message. These fixes should probably be
included already in 2.6.15.
After these fixes, the tape seems to operate as expected. Without other
changes, the largest block size with sym53c896 SCSI adapter is 384 kB. The
maximum number of sg segments is set to 96 and clustering is disabled in
the driver. 96 x 4 kB = 384 kB. OK.
I enabled clustering and set max_sectors to 10000 in the SCSI HBA driver.
Now the block size limit is 5000 kB as expected.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
convert st to always send scatterlists and kill scsi_request
usage.
This is the same as last time as it was posted, but with Kai's patches
merged and we now pass the bytes value to scsi_execute_async.
TODO:
- move DIO code to common place or make block layers usable for ULDs.
- move buffer allocation code to common place for all ULDs to use. And
make buffer allocation code handle all queue limits so we can find
out about problems before calling scsi_execute_async.
- move indirect (copy_to/from_user) paths commone place or make block
layers usable for ULDs.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
sd does not allow scsi_io_completion to retry commands for
SG_IO requests, and it make sense that it should not happen for st
SG_IO commands too. If for st we hit the bottom of scsi_io_completion
we will probably screw things up pretty bad. This patch returns to the
block layer that the whole command completed and relies on the caller to check
the request errors field. For initialization commands like in sd, this adds
the previous behavior where scsi_io_completion did not process the error.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This follows on from Jens' patch and consolidates all of the ULD
separate handlers for REQ_BLOCK_PC into a single call which has his
fix for our direction bug.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
patch below marks a few scsi core datastructures as const, so that they end up
in the .rodata section and don't cacheline share with things that get dirtied
Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2.6.15-rc1 made sg's st_unmap_user_pages and st's sgl_unmap_user_pages
BUG on a PageReserved page. But that's wrong: they could be unmapping
the ZERO_PAGE, which is marked PG_reserved; and perhaps others (while
get_user_pages is still permitted on VM_PFNMAP areas - that may change).
More change is needed here: sg claims to dirty even pages written from,
and st claims not to dirty even pages read into; and SetPageDirty is not
adequate for this nowadays. Fixes to those follow in a later patch: for
the moment just fix the 2.6.15 regression.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Acked-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Nick and I had already been looking at drivers/scsi/{sg.c,st.c},
brought there by __put_page in sg.c's peculiar sg_rb_correct4mmap,
which we'd like to remove. But that's irrelevant to your pain, except...
One extract from the patches I'd like to send Doug and Kai for 2.6.15
or 2.6.16 is this below: since the incomplete get_user_pages path omits
to reset res, but has already released all the pages, it will result in
premature freeing of user pages, and behaviour just like you've seen.
Though I'd have thought incomplete get_user_pages was an exceptional
case, and a bit surprised you'd encounter it. Perhaps there's some
other premature freeing in the driver, and this instance has nothing
whatever to do with it.
If the problem were easily reproducible, it'd be great if you could
try this patch; but I think you've said it's not :-(
Signed-off-by: Kai Makisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This is the drivers/scsi/ part of the big kfree cleanup patch.
Remove pointless checks for NULL prior to calling kfree() in drivers/scsi/.
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Acked-by: Kai Makisara <kai.makisara@kolumbus.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Remove PageReserved() calls from core code by tightening VM_RESERVED
handling in mm/ to cover PageReserved functionality.
PageReserved special casing is removed from get_page and put_page.
All setting and clearing of PageReserved is retained, and it is now flagged
in the page_alloc checks to help ensure we don't introduce any refcount
based freeing of Reserved pages.
MAP_PRIVATE, PROT_WRITE of VM_RESERVED regions is tentatively being
deprecated. We never completely handled it correctly anyway, and is be
reintroduced in future if required (Hugh has a proof of concept).
Once PageReserved() calls are removed from kernel/power/swsusp.c, and all
arch/ and driver code, the Set and Clear calls, and the PG_reserved bit can
be trivially removed.
Last real user of PageReserved is swsusp, which uses PageReserved to
determine whether a struct page points to valid memory or not. This still
needs to be addressed (a generic page_is_ram() should work).
A last caveat: the ZERO_PAGE is now refcounted and managed with rmap (and
thus mapcounted and count towards shared rss). These writes to the struct
page could cause excessive cacheline bouncing on big systems. There are a
number of ways this could be addressed if it is an issue.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Refcount bug fix for filemap_xip.c
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This should eliminate (at least in the mid layer) to make numeric
assumptions about any of the enumeration variables. As a side effect,
it will also make all the messages consistent and line us up nicely for
the error logging strategy (if it ever shows itself again).
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The previous patch adding the ability to nest struct class_device
changed the paramaters to the call class_device_create(). This patch
fixes up all in-kernel users of the function.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This fixes an issue in scsi command initialization from a request
where sd, sr, st, and scsi_lib all fail to copy the request's
cmd_len to the scsi command's cmd_len field.
Signed-off-by: Timothy Thelin <timothy.thelin@wdc.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
reported by Doug Gilbert and fixed by him in sg.c (see [PATCH] sg direct
io/mmap oops). Doug fixed the comparison in sg.c. This fix for st.c does not
touch the comparison but makes both arguments signed to remove the
problem. The new code is adapted from linux/fs/bio.c.
Signed-off-by: Kai Makisara <kai.makisara@kolumbus.fi>
Rejections fixed up and
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
I have rediffed the patch against 2.6.13-rc5, done a couple of cosmetic
cleanups, and run some tests. Brian King has acknowledged that it fixes the
problems he has seen. Seems mature enough for inclusion into 2.6.14 (or
later)?
Nate's explanation of the changes:
I've attached patches against 2.6.13rc2. These are basically identical
to my earlier patches, as I found that all issues I'd seen in earlier
kernels still existed in this kernel.
To summarize, the changes are: (more details in my original email)
- add a kref to the scsi_tape structure, and associate reference
counting stuff
- set sr_request->end_io = blk_end_sync_rq so we get notified when an IO
is rejected when the device goes away
- check rq_status when IOs complete, else we don't know that IOs
rejected for a dead device in fact did not complete
- change last_SRpnt so it's set before an async IO is issued (in case
st_sleep_done is bypassed)
- fix a bogus use of last_SRpnt in st_chk_result
Signed-off-by: Nate Dailey <nate.dailey@stratus.com>
Signed-off-by: Kai Makisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Removing the SCSI tape module results in an oops in class_device_destroy if
any devices are present. The patch at the end of this message fixes the bug
by moving class_destroy() later in exit_st() so that the class still exists
when devices are removed. (The bug is old but class_simple_device_remove() did
nothing when the class did not exist.)
The patch also fixes a "class leak" in init_st() error path.
I would like to get this into 2.6.13 but it may be too late?
Signed-off-by: Kai Makisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch is against 2.6.12-rc3 + linus-patch from April 30. The patch
contains the following fixes:
- CAP_SYS_RAWIO is used instead of CAP_SYS_ADMIN; fix from Alan Cox
- only direct sending of SCSI commands requires this permission
- the st status is modified is successful unload is performed using
SCSI_IOCTL_STOP_UNIT
Signed-off-by: Kai Makisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.
Let it rip!