Instead of terminating after five retries, commands terminated by
ABORTED_COMMAND sense are retrying forever. The problem was
introduced by:
commit b60af5b0ad
Author: Alan Stern <stern@rowland.harvard.edu>
Date: Mon Nov 3 15:56:47 2008 -0500
[SCSI] simplify scsi_io_completion()
Which introduced an error whereby ABORTED_COMMAND now gets erroneously
retried in scsi_io_completion. Fix this by returning the behaviour
back to the default no retry.
Reported-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Tested-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Andrew Vasquez said:
> There's a problem that is causing commands returned by the LLD with
> a DID_RESET status to be reissued with cleared cmd->sdb data which
> in our tests are manifesting in firmware detected overruns. Here's
> a snippet of a READ_10 scsi_cmnd upon completion by the storage
The problem is caused by:
commit b60af5b0ad
Author: Alan Stern <stern@rowland.harvard.edu>
Date: Mon Nov 3 15:56:47 2008 -0500
[SCSI] simplify scsi_io_completion()
Because scsi_release_buffers() is called before commands that go
through the ACTION_RETRY and ACTION_DELAYED_RETRY legs are requeued.
However, they're not re-prepared, so nothing ever reallocates the
buffer resources to them. Fix this by releasing the buffers only if
we're not going to go down these legs (but call scsi_release_buffers()
on all other legs, including two in scsi_end_request(); the latter need
a special version, __scsi_release_buffers(), because the final one can
be called after the request has been freed, so the bidi test in
scsi_release_buffers(), which touches the request, has to be skipped).
Reported-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
patch
commit b60af5b0ad
Author: Alan Stern <stern@rowland.harvard.edu>
Date: Mon Nov 3 15:56:47 2008 -0500
[SCSI] simplify scsi_io_completion()
broke DIX error handling. Also, we are now using EILSEQ to indicate
integrity errors to the upper layers (as opposed to regular EIO
failures). This allows filesystems to inspect buffers and decide
whether to retry the I/O. Update scsi_io_completion() accordingly.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
A bug was introduced by
commit b60af5b0ad
Author: Alan Stern <stern@rowland.harvard.edu>
Date: Mon Nov 3 15:56:47 2008 -0500
[SCSI] simplify scsi_io_completion()
because the simplification uses scsi_queue_insert(). The problem with
this function is that it expects to be called from the completion path
while the command is still outstanding, so it decrements the device
and host busy counts to do the requeue. The problem is that
scsi_io_completion() is a path executed well after these counts have
*already* been decremented, leading to a double decrement if the
command goes down any error path leading to ACTION_DELAYED_RETRY.
The fix is to allow a private function __scsi_queue_insert() with a
flag to say whether the busy counters should be decremented. This is
made static to scsi_lib.c to discourage other use.
Reported-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
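The double decrement described above for the scsi_queue_insert() path can be modelled in a few lines. This is only a sketch under stated assumptions: struct busy_counts and both model_* functions are hypothetical stand-ins, not the kernel's data structures, and the boolean flag mirrors the "should the busy counters be decremented" flag the message describes for __scsi_queue_insert().

```c
#include <stdbool.h>

/* Hypothetical stand-in for the device and host busy accounting. */
struct busy_counts {
    int device_busy;
    int host_busy;
};

/*
 * Requeue helper. 'unbusy' tells it whether the command is still
 * counted as outstanding, i.e. whether the counts must be dropped here.
 */
void model_queue_insert(struct busy_counts *c, bool unbusy)
{
    if (unbusy) {
        c->device_busy--;
        c->host_busy--;
    }
    /* ... the command would be put back on the queue here ... */
}

/*
 * Completion-path model: the counts are dropped once when the command
 * completes, and only afterwards is the retry decision taken.
 */
void model_complete_and_retry(struct busy_counts *c, bool unbusy_on_requeue)
{
    c->device_busy--;
    c->host_busy--;
    model_queue_insert(c, unbusy_on_requeue);
}
```

Starting from one outstanding command, calling model_complete_and_retry() with the flag set leaves both counters at -1, which is the double decrement; with the flag clear they end at 0.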
This patch (as1191) adds a missing "default" case in
scsi_io_completion(), thereby fixing an "uninitialized variable"
error. It also adds a missing newline to a log entry.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
scsi_execute() and scsi_execute_req() discard the residual length
information. Some callers need it. This adds residual argument
(optional) to scsi_execute and scsi_execute_req.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This patch (as1142b) consolidates a lot of repetitious code in
scsi_io_completion(). It also fixes a few comments. Most
importantly, however, it clearly distinguishes among the three sorts
of retries that can be done when a command fails to complete:
Unprepare the request and resubmit it, so that a new
command will be created for it.
Requeue the request directly so that it will be retried
immediately using the same command.
Requeue the request so that it will be retried following
a short delay.
Complete the remainder of the request with an I/O error.
[jejb: Updates
1. For several error conditions, we would now print the sense twice
in slightly different ways, so unify the location of sense
printing.
2. I added more descriptions to actual failure conditions for
better debugging
3. according to spec, ABORTED_COMMAND is supposed to be retried
(except on DIF failure). Our old behaviour of erroring it looks
to be a bug.
4. I'd prefer not to default initialise the action variable because
that ensures that every leg of the error handler has an
associated action and the compiler will warn if someone later
accidentally misses one or removes one.
]
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
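The outcomes listed in the scsi_io_completion() consolidation above can be pictured as one enumerated action per error leg. The sketch below is illustrative only: the enumerator names are chosen to resemble the ACTION_* names used elsewhere in this log (MODEL_ACTION_FAIL and MODEL_ACTION_REPREP are assumptions), and struct model_plan is a hypothetical way of recording what each action implies.

```c
#include <stdbool.h>

/* One action per way a failed command can be disposed of. */
enum model_action {
    MODEL_ACTION_FAIL,          /* complete the remainder with an I/O error */
    MODEL_ACTION_REPREP,        /* unprepare and resubmit; a new command is built */
    MODEL_ACTION_RETRY,         /* requeue immediately, same command */
    MODEL_ACTION_DELAYED_RETRY  /* requeue, same command, after a short delay */
};

/* Hypothetical description of what an action entails. */
struct model_plan {
    bool io_error;      /* request is finished with an error */
    bool new_command;   /* existing command is released first */
    bool requeue;       /* request goes back on the queue */
    bool delay;         /* retry is deferred */
};

struct model_plan model_plan_for(enum model_action action)
{
    struct model_plan p = { false, false, false, false };

    /* No default case: a missing enumerator can then be flagged by -Wswitch. */
    switch (action) {
    case MODEL_ACTION_FAIL:
        p.io_error = true;
        break;
    case MODEL_ACTION_REPREP:
        p.new_command = true;
        p.requeue = true;
        break;
    case MODEL_ACTION_RETRY:
        p.requeue = true;
        break;
    case MODEL_ACTION_DELAYED_RETRY:
        p.requeue = true;
        p.delay = true;
        break;
    }
    return p;
}
```

Leaving out the default case follows the same reasoning as item 4 of the notes above: every leg has to name its action explicitly, and the compiler can point out one that does not.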
It's called under that lock everywhere else and it does alter the
request state, so it should be.
This one occurrence in scsi_requeue_command() could open a window where
req->special is set to NULL while the request is going through either
timeout or completion processing, leading to NULL pointer derefs of the
sort complained of in bugzillas 12020 and 12195.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Close possible infinite loop with interrupts off when devices are
added back to the starved list.
Fixes: http://bugzilla.kernel.org/show_bug.cgi?id=11898
Reported-by: <alex.shi@intel.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This patch implements q->lld_busy_fn() for scsi mid layer to export
its busy state for request stacking drivers.
For efficiency, no lock is taken to check the busy state of
shost/starget/sdev, since the returned value is not guaranteed and
may be changed after request stacking drivers call the function,
regardless of taking lock or not.
When scsi can't dispatch I/Os anymore and needs to kill I/Os
(e.g. !sdev), scsi needs to return 'not busy'.
Otherwise, request stacking drivers may hold requests forever.
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
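A rough sketch of the busy query described above for q->lld_busy_fn(), assuming a hypothetical struct model_busy_dev that simply carries the three busy states. It is not the kernel implementation; it only shows the two rules stated in the message: no locking, and "not busy" when there is no device left to dispatch to.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of the host, target and device busy states. */
struct model_busy_dev {
    bool host_busy;
    bool target_busy;
    bool device_busy;
};

/*
 * Advisory busy check for a stacking driver. No lock is taken, so the
 * answer may already be stale when the caller looks at it.
 * dev == NULL models the "!sdev" case: I/O will be killed rather than
 * dispatched, so report "not busy" to avoid holding requests forever.
 */
bool model_lld_busy(const struct model_busy_dev *dev)
{
    if (dev == NULL)
        return false;
    return dev->host_busy || dev->target_busy || dev->device_busy;
}
```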
This patch refactors the busy-checking code of scsi_device,
Scsi_Host and scsi_target. There should be no functional change.
This is a preparation for another patch which exports scsi's busy
state to the block layer for request stacking drivers.
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
On Tue, 12 Aug 2008 15:08:14 +0200
Giuliano Pochini <pochini@shiny.it> wrote:
> Fujitsu magneto-optical drive, Adaptec 29160 and
> Linux Jay 2.6.26 #7 SMP Sun Aug 10 18:34:22 CEST 2008 ppc 7455, altivec supported PowerMac3,6 GNU/Linux
>
> When I insert a disk and I mount it, scsi_test_unit_ready() is called and
> the do-while loop gets sshdr->sense_key == UNIT_ATTENTION in the first
> cycle and 0 in the second one. So the if below misses the UNIT_ATTENTION
> and sdev->changed = 1 is not executed. At this point bad things can
> happen... I'm not sure how to fix this. Any clue ?
The problem is essentially caused by us eating UNIT_ATTENTION
conditions in scsi_test_unit_ready(). Fix by updating the ->changed
flag when this happens if the media is removable.
[pochini@shiny.it: updates to tidy up patch]
Signed-off-by: Giuliano Pochini <pochini@shiny.it>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
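The UNIT_ATTENTION problem above can be reproduced with a small model of the retry loop. This is a sketch, not the kernel's scsi_test_unit_ready(): struct model_dev and model_test_unit_ready() are hypothetical, and the sense keys are fed in as an array instead of coming from a device. The value 0x06 is the standard UNIT ATTENTION sense key.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_UNIT_ATTENTION 0x06  /* UNIT ATTENTION sense key */

/* Hypothetical device state used only by this sketch. */
struct model_dev {
    bool removable;
    bool changed;
};

/*
 * sense_keys[i] is the sense key seen on attempt i (0 means the
 * command succeeded). The loop retries while it sees UNIT ATTENTION,
 * so the final key alone cannot tell the caller that a media change
 * was reported earlier. Recording it inside the loop keeps that fact.
 */
int model_test_unit_ready(struct model_dev *dev, const int *sense_keys,
                          size_t attempts)
{
    int key = 0;

    for (size_t i = 0; i < attempts; i++) {
        key = sense_keys[i];
        if (key == MODEL_UNIT_ATTENTION && dev->removable)
            dev->changed = true;
        if (key != MODEL_UNIT_ATTENTION)
            break;
    }
    return key;
}
```

With the sequence reported above (UNIT_ATTENTION on the first cycle, 0 on the second), the function returns 0 yet the changed flag is set for a removable device.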
This checks the errors the scsi-ml determined were retryable
and returns if we should fast fail it based on the request
fail fast flags.
Without the patch, drivers like lpfc, qla2xxx and fcoe would return
DID_ERROR for what they determine is a temporary communication problem.
There is no loss of connectivity at that time and the driver thinks
that it would be fast to retry at the driver level. SCSI-ml will however
see fast fail on the request and DID_ERROR and will fast fail the io.
This will then cause dm-multipath to fail the path and possibly switch
target controllers when we should be retrying at the scsi layer.
We also were fast failing device errors to dm-multipath when,
unless the scsi_dh modules think otherwise, we want to retry at
the scsi layer because multipath can only retry the IO like scsi
should have done. multipath is a little dumber though because it
does not know what the error was for and assumes that it should fail
the paths.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
SCSI-ml manages the queueing limits for the device and host, but
does not do so at the target level. However, something similar
can come in useful when a driver is transitioning a transport object to
the blocked state, because at that time we do not want to queue
io and we do not want the queuecommand to be called again.
The patch adds code similar to the existing SCSI_ML_*BUSY handlers.
You can now return SCSI_MLQUEUE_TARGET_BUSY when we hit
a transport level queueing issue like the hw cannot allocate some
resource at the iscsi session/connection level, or the target has temporarily
closed or shrunk the queueing window, or if we are transitioning
to the blocked state.
bnx2i, when they rework their firmware according to netdev
developers requests, will also need to be able to limit queueing at this
level. bnx2i will hook into libiscsi, but will allocate a scsi host per
netdevice/hba, so unlike pure software iscsi/iser which is allocating
a host per session, it cannot set the scsi_host->can_queue and return
SCSI_MLQUEUE_HOST_BUSY to reflect queueing limits on the transport.
The iscsi class/driver can also set a scsi_target->can_queue value which
reflects the max commands the driver/class can support. For iscsi this
reflects the number of commands we can support for each session due to
session/connection hw limits, driver limits, and to also reflect the
session/target's queueing window.
Changes:
v1 - initial patch.
v2 - Fix scsi_run_queue handling of multiple blocked targets.
Previously we would break from the main loop if a device was added back on
the starved list. We now run over the list and check if any target is
blocked.
v3 - Rediff for scsi-misc.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
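As a minimal sketch of the target-level limit described above, the check below decides whether another command may be sent to a target. struct model_target and its fields are hypothetical stand-ins, and treating can_queue <= 0 as "no limit" is an assumption made for the example, not something this message states.

```c
#include <stdbool.h>

/* Hypothetical, simplified target accounting. */
struct model_target {
    int can_queue;    /* max outstanding commands; <= 0 means no limit (assumed) */
    int target_busy;  /* commands currently outstanding on the target */
    bool blocked;     /* set while the target is backing off after a busy return */
};

/* Returns true if one more command may be queued to this target. */
bool model_target_queue_ready(const struct model_target *t)
{
    if (t->blocked)
        return false;
    if (t->can_queue > 0 && t->target_busy >= t->can_queue)
        return false;
    return true;
}
```

A driver that cannot express its transport limit through the host-wide can_queue (the bnx2i case above) would rely on a check of this shape at the target level instead.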
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (37 commits)
[SCSI] zfcp: fix double dbf id usage
[SCSI] zfcp: wait on SCSI work to be finished before proceeding with init dev
[SCSI] zfcp: fix erp list usage without using locks
[SCSI] zfcp: prevent fc_remote_port_delete calls for unregistered rport
[SCSI] zfcp: fix deadlock caused by shared work queue tasks
[SCSI] zfcp: put threshold data in hba trace
[SCSI] zfcp: Simplify zfcp data structures
[SCSI] zfcp: Simplify get_adapter_by_busid
[SCSI] zfcp: remove all typedefs and replace them with standards
[SCSI] zfcp: attach and release SAN nameserver port on demand
[SCSI] zfcp: remove unused references, declarations and flags
[SCSI] zfcp: Update message with input from review
[SCSI] zfcp: add queue_full sysfs attribute
[SCSI] scsi_dh: suppress comparison warning
[SCSI] scsi_dh: add Dell product information into rdac device handler
[SCSI] qla2xxx: remove the unused SCSI_QLOGIC_FC_FIRMWARE option
[SCSI] qla2xxx: fix printk format warnings
[SCSI] qla2xxx: Update version number to 8.02.01-k8.
[SCSI] qla2xxx: Ignore payload reserved-bits during RSCN processing.
[SCSI] qla2xxx: Additional residual-count corrections during UNDERRUN handling.
...
Right now SCSI and others do their own command timeout handling.
Move those bits to the block layer.
Instead of having a timer per command, we try to be a bit more clever
and simply have one per-queue. This avoids the overhead of having to
tear down and setup a timer for each command, so it will result in a lot
less timer fiddling.
Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
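One way to picture the single per-queue timer described above: rather than arming a timer for every command, scan the in-flight requests and arm one timer for the earliest deadline. The sketch below assumes a hypothetical struct model_req and ignores counter wrap-around; it is not the block layer's implementation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical in-flight request record. */
struct model_req {
    bool started;            /* handed to the driver, not yet completed */
    unsigned long deadline;  /* absolute expiry time in timer ticks */
};

/*
 * Store the earliest deadline among started requests in *expiry and
 * return true, or return false if nothing is in flight (the queue
 * timer can then stay idle). Wrap-around is deliberately ignored.
 */
bool model_next_expiry(const struct model_req *reqs, size_t n,
                       unsigned long *expiry)
{
    bool found = false;

    for (size_t i = 0; i < n; i++) {
        if (!reqs[i].started)
            continue;
        if (!found || reqs[i].deadline < *expiry) {
            *expiry = reqs[i].deadline;
            found = true;
        }
    }
    return found;
}
```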
Brian King <brking@linux.vnet.ibm.com> reported that fibre channel
devices can oops during scanning if their ports block (because the
device goes from CREATED -> BLOCK -> RUNNING rather than CREATED ->
BLOCK -> CREATED).
Fix this by adding a new state: CREATED_BLOCK which can only transition
back to CREATED and disallow the CREATED -> BLOCK transition. Now both
the created and blocked states that the mid-layer recognises can include
CREATED_BLOCK.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
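The CREATED_BLOCK rule above can be written down as a small transition check. This sketch covers only the four states named in the message; the real device state machine has more. The RUNNING/BLOCK and CREATED -> RUNNING edges are assumptions added so the example is complete, and all MODEL_* names are hypothetical.

```c
#include <stdbool.h>

enum model_state {
    MODEL_CREATED,
    MODEL_RUNNING,
    MODEL_BLOCK,
    MODEL_CREATED_BLOCK
};

/* Returns true if moving from 'from' to 'to' is allowed in this model. */
bool model_transition_ok(enum model_state from, enum model_state to)
{
    switch (to) {
    case MODEL_CREATED:
        /* Only a device blocked during creation returns to CREATED. */
        return from == MODEL_CREATED_BLOCK;
    case MODEL_RUNNING:
        return from == MODEL_CREATED || from == MODEL_BLOCK;
    case MODEL_BLOCK:
        /* CREATED -> BLOCK is disallowed; CREATED must use CREATED_BLOCK. */
        return from == MODEL_RUNNING;
    case MODEL_CREATED_BLOCK:
        return from == MODEL_CREATED;
    }
    return false;
}
```

Because CREATED_BLOCK can only go back to CREATED, a port that blocks during scanning can no longer unblock straight into RUNNING.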
Sometimes, particularly for USB devices with the last sector bug,
requests get completed in chunks. There's a bug in this in that if
one of the chunks gets an error, we complete that chunk with an error
but never move on to the remaining ones, leading to the request
hanging (because it's not fully completed).
Fix this by completing all remaining chunks if an error is encountered.
Cc: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
[jejb: fixed up a ton of missed conversions.
All of you are on notice this has happened, driver trees will now
need to be rebased]
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Cc: SCSI List <linux-scsi@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
I goofed and did not see the macro for checking if a request is tagged.
This patch has us use blk_rq_tagged instead of digging into the req->tag.
Patch was made over scsi-misc.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
If initiator or target reject the I/O due to DIF errors there is no
point in retrying.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Implement support for DMA of protection information for devices that
are data integrity capable.
- Add support for mapping an extra scatter-gather list containing
the protection information.
- Allocate protection scsi_data_buffer if host is DIX (integrity DMA)
capable.
- Accessor function for checking whether a device has protection
enabled.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
When drivers use a shared tag map we can end up with more requests
than tags, because the tag map is shost->can_queue tags and there
can be sdevs * sdev->queue_depth requests. In scsi_request_fn
if tag allocation fails we just drop down to dequeueing the
request without a tag. The problem is that drivers using the shared tag
map rely on a valid tag always being set, because they will use the
tag number to look up commands later.
This patch has us check if we got a valid tag when the host lock
is held right before we check if the host queue is ready. We do the
check here because to allocate the tag we need the q lock, but
if the tag is bad we want to add the device/q onto the starved list
which requires the host lock.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (102 commits)
[SCSI] scsi_dh: fix kconfig related build errors
[SCSI] sym53c8xx: Fix bogus sym_que_entry re-implementation of container_of
[SCSI] scsi_cmnd.h: remove double inclusion of linux/blkdev.h
[SCSI] make struct scsi_{host,target}_type static
[SCSI] fix locking in host use of blk_plug_device()
[SCSI] zfcp: Cleanup external header file
[SCSI] zfcp: Cleanup code in zfcp_erp.c
[SCSI] zfcp: zfcp_fsf cleanup.
[SCSI] zfcp: consolidate sysfs things into one file.
[SCSI] zfcp: Cleanup of code in zfcp_aux.c
[SCSI] zfcp: Cleanup of code in zfcp_scsi.c
[SCSI] zfcp: Move status accessors from zfcp to SCSI include file.
[SCSI] zfcp: Small QDIO cleanups
[SCSI] zfcp: Adapter reopen for large number of unsolicited status
[SCSI] zfcp: Fix error checking for ELS ADISC requests
[SCSI] zfcp: wait until adapter is finished with ERP during auto-port
[SCSI] ibmvfc: IBM Power Virtual Fibre Channel Adapter Client Driver
[SCSI] sg: Add target reset support
[SCSI] lib: Add support for the T10 (SCSI) Data Integrity Field CRC
[SCSI] sd: Move scsi_disk() accessor function to sd.h
...
scsi_lib.c:scsi_host_queue_ready() plugs the device with incorrect
locking. It should actually have the queue lock held, but it's
holding the host lock. Fix this by eliminating the call. The host
ready has no need to plug the queue because if it returns 0 in
scsi_request_fn, control transfers to not_ready, which acquires
the queue lock and plugs the device if it's at zero depth.
Reported-by: Elias Oltmanns <eo@nebensachen.de>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The data integrity changes need to dynamically allocate
scsi_data_buffers too. Rename scsi_bidi_sdb_cache for clarity.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This patch (as1108) fixes a problem that can occur with certain USB
mass-storage devices: They return invalid data together with a residue
indicating that the data should be ignored. Rather than leave the
invalid data in a transfer buffer, where it can get misinterpreted,
the patch clears the invalid portion of the buffer.
This solves a problem (wrong write-protect setting detected) reported
by Maciej Rutecki and Peter Teoh.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Tested-by: Peter Teoh <htmldeveloper@gmail.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
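The buffer clearing described above for as1108 amounts to zeroing the part of the transfer buffer that the residue marks as not transferred. A minimal sketch, assuming the invalid portion is the tail of the buffer and using a hypothetical helper name:

```c
#include <stddef.h>
#include <string.h>

/*
 * Zero the last 'resid' bytes of a 'len'-byte transfer buffer.
 * 'resid' is the residue reported by the device; it is clamped to
 * 'len' so a bogus residue cannot make the memset run out of bounds.
 */
void model_clear_residue(unsigned char *buf, size_t len, size_t resid)
{
    if (resid > len)
        resid = len;
    memset(buf + (len - resid), 0, resid);
}
```

Zeroed bytes are harder to misread than stale ones, which is the point of the fix: a stale mode page could otherwise be taken for a valid write-protect setting.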
Some storage devices that can be accessed through multiple paths need
special handling to:
1. Activate the passive path of the storage access.
2. Decode and handle the special sense codes returned by the devices.
3. Handle the I/Os being sent to the passive path, especially
during the device probe time.
As of today this special device handling is done at the dm-multipath
layer using dm-handlers. That works well for (1); for (2) to be handled
at the dm layer, scsi sense information needs to be exported from SCSI to the dm layer,
which is not very attractive; (3) cannot be done at all at the dm layer.
Device handler has been moved to SCSI mainly to handle (2) and (3) properly.
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6:
[SCSI] aic94xx: fix section mismatch
[SCSI] u14-34f: Fix 32bit only problem
[SCSI] dpt_i2o: sysfs code
[SCSI] dpt_i2o: 64 bit support
[SCSI] dpt_i2o: move from virt_to_bus/bus_to_virt to dma_alloc_coherent
[SCSI] dpt_i2o: use standard __init / __exit code
[SCSI] megaraid_sas: fix suspend/resume sections
[SCSI] aacraid: Add Power Management support
[SCSI] aacraid: Fix jbod operations scan issues
[SCSI] aacraid: Fix warning about macro side-effects
[SCSI] add support for variable length extended commands
[SCSI] Let scsi_cmnd->cmnd use request->cmd buffer
[SCSI] bsg: add large command support
[SCSI] aacraid: Fix down_interruptible() to check the return value correctly
[SCSI] megaraid_sas; Update the Version and Changelog
[SCSI] ibmvscsi: Handle non SCSI error status
[SCSI] bug fix for free list handling
[SCSI] ipr: Rename ipr's state scsi host attribute to prevent collisions
[SCSI] megaraid_mbox: fix Dell CERC firmware problem
Add support for variable-length, extended, and vendor specific
CDBs to scsi-ml. It is now possible for initiators and ULD's
to issue these types of commands. LLDs need not change much.
All they need is to raise the .max_cmd_len to the longest command
they support (see iscsi patch).
- clean-up some code paths that did not expect commands to be
larger than 16, and change cmd_len members' type to short as
char is not enough.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
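To see why a char-sized cmd_len is not enough for the variable-length CDB support above, consider what happens when a length above 255 is stored in an 8-bit field. The helper names below are hypothetical, the example assumes 8-bit chars, and 260 is used purely as an illustrative length.

```c
#include <stddef.h>

/* Store a CDB length in an 8-bit field: values above 255 wrap. */
unsigned char model_len_as_char(size_t cdb_len)
{
    return (unsigned char)cdb_len;
}

/* Store the same length in a 16-bit field: values up to 65535 survive. */
unsigned short model_len_as_short(size_t cdb_len)
{
    return (unsigned short)cdb_len;
}
```

A 16-byte CDB fits either way, but a 260-byte one comes back as 4 from the char version and as 260 from the short version.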
- struct scsi_cmnd had a 16 bytes command buffer of its own.
This is an unnecessary duplication and copy of request's
cmd. It is probably left overs from the time that scsi_cmnd
could function without a request attached. So clean that up.
- Once above is done, few places, apart from scsi-ml, needed
adjustments due to changing the data type of scsi_cmnd->cmnd.
- Lots of drivers still use MAX_COMMAND_SIZE. So I have left
that #define but equate it to BLK_MAX_CDB. The way I see it
and is reflected in the patch below is.
MAX_COMMAND_SIZE - means: The longest fixed-length (*) SCSI CDB
as per the SCSI standard and is not related
to the implementation.
BLK_MAX_CDB. - The allocated space at the request level
- I have audited all ISA drivers and made sure none use ->cmnd in a DMA
Operation. Same audit was done by Andi Kleen.
(*)fixed-length here means commands whose size can be determined
by their opcode and the CDB does not carry a length specifier, (unlike
the VARIABLE_LENGTH_CMD(0x7f) command). This is actually not exactly
true and the SCSI standard also defines extended commands and
vendor specific commands that can be bigger than 16 bytes. The kernel
will support these using the same infrastructure used for VARLEN CDB's.
So in effect MAX_COMMAND_SIZE means the maximum size command
scsi-ml supports without specifying a cmd_len by ULD's
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
We can save some atomic ops in the IO path, if we clearly define
the rules of how to modify the queue flags.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Currently, if the barrier command fails, the error return isn't seen
by the block layer and it proceeds on regardless. The problem is that
SCSI always returns no error for REQ_TYPE_BLOCK_PC ... it expects the
submitter to pick the errors out of req->errors, which the block
barrier functions don't do.
Since it appears that the way SG_IO and scsi_execute_request() work
they discard the block error return and always use req->errors, the
best fix for this is to have the SCSI layer return an error to block
if one actually occurred (this also allows us to filter out spurious
errors, like deferred sense).
This patch is a bug fix that will need backporting to stable, but it's
also quite a big change and in need of testing, so we'll incubate in
the main kernel tree and backport at the -rc2 or so stage if no
problems turn up.
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This patch makes the needlessly global scsi_end_bidi_request() static.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Commit:
a341cd0f (SCSI: add asynchronous event notification API)
breaks:
285e9670 (sr,sd: send media state change modification events)
by introducing an event filter, which is removed here to make the
events we depend on happen again.
Fix this by removing the event filter. It's pretty much broken at the
moment, since a user can't set it (the attribute being read only). A
proper fix will be to make the event discriminator distinguish between
AN and Polled media change events.
Cc: David Zeuthen <david@fubar.dk>
Cc: kristen accardi <kaccardi@gmail.com>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
With padding and draining moved into it, block layer now may extend
requests as directed by queue parameters, so now a request has two
sizes - the original request size and the extended size which matches
the size of area pointed to by bios and later by sgs. The latter size
is what lower layers are primarily interested in when allocating,
filling up DMA tables and setting up the controller.
Both padding and draining extend the data area to accommodate
controller characteristics. As any controller which speaks SCSI can
handle underflows, feeding larger data area is safe.
So, this patch makes the primary data length field, request->data_len,
indicate the size of the full data area and adds a separate length field,
request->raw_data_len, for the unmodified request size. The latter is
reported to higher layers (userland) and used where the original
request size should be fed to the controller or device.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
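The relationship between the two length fields above can be sketched for the padding case. struct model_request and model_apply_padding() are hypothetical, only padding is modelled (draining would extend data_len further), and pad_mask is assumed to be a power of two minus one.

```c
/* Hypothetical request carrying both lengths described above. */
struct model_request {
    unsigned int data_len;      /* extended size, matches the mapped data area */
    unsigned int raw_data_len;  /* original size as submitted */
};

/*
 * Record the original length and round the data area up to the
 * controller's alignment. pad_mask is alignment - 1, e.g. 3 for
 * 4-byte alignment.
 */
void model_apply_padding(struct model_request *rq, unsigned int len,
                         unsigned int pad_mask)
{
    rq->raw_data_len = len;
    rq->data_len = (len + pad_mask) & ~pad_mask;
}
```

For a 510-byte request and a pad mask of 3, data_len becomes 512 while raw_data_len stays at 510, so DMA setup sees the padded size and userland still sees what it asked for.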
When sending a SCSI command to a tape drive via the SCSI Generic (sg)
driver, if the command has a data transfer length more than
scatter_elem_sz (32 KB default) and not a multiple of 512, then I either
hit BUG_ON(!valid_dma_direction(direction)) in dma_unmap_sg() or else
the command never completes (depending on the LLDD).
When constructing scatterlists, the sg driver rounds up the scatterlist
element sizes to be a multiple of 512. This can result in
sum(scatterlist lengths) > bufflen. In this case, scsi_req_map_sg()
incorrectly sets bio->bi_size to sum(scatterlist lengths) rather than to
bufflen. When the command completes, req_bio_endio() detects that
bio->bi_size != 0, and so it doesn't call bio_endio(). This causes the
command to be resubmitted, resulting in BUG_ON or the command never
completing.
This patch makes scsi_req_map_sg() set bio->bi_size to bufflen rather
than to sum(scatterlist lengths), which fixes the problem.
Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Acked-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
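The mismatch between sum(scatterlist lengths) and bufflen described above is easy to reproduce with arithmetic. The sketch below assumes the buffer is split into elements of at most elem_sz bytes and that each element is rounded up to a multiple of 512; both helper names are hypothetical and this is not the sg driver's code.

```c
#include <stddef.h>

/* Round n up to the next multiple of 512. */
size_t model_round_up_512(size_t n)
{
    return (n + 511) / 512 * 512;
}

/*
 * Total of the scatterlist element lengths for a transfer of
 * 'bufflen' bytes split into elements of at most 'elem_sz' bytes,
 * each rounded up to a multiple of 512.
 */
size_t model_sg_total(size_t bufflen, size_t elem_sz)
{
    size_t total = 0;

    if (elem_sz == 0)
        return 0;
    while (bufflen > 0) {
        size_t chunk = bufflen < elem_sz ? bufflen : elem_sz;

        total += model_round_up_512(chunk);
        bufflen -= chunk;
    }
    return total;
}
```

A 32868-byte transfer with 32768-byte elements gives 32768 + 512 = 33280 bytes of scatterlist, 412 more than bufflen. Sizing the bio from that total leaves 412 bytes that never complete, which is why the fix uses bufflen instead.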
This is a one-line patch to add the following to __scsi_alloc_queue():
dma_set_seg_boundary(dev, shost->dma_boundary);
This is the simplest approach but the result looks odd,
__scsi_alloc_queue() does:
blk_queue_segment_boundary(q, shost->dma_boundary);
dma_set_seg_boundary(dev, shost->dma_boundary);
blk_queue_max_segment_size(q, dma_get_max_seg_size(dev));
I think that it would be better to set up segment boundary in the same
way as we did for the maximum segment size. That is, removing
shost->dma_boundary and LLDs call pci_set_dma_seg_boundary (or its
friends).
Then __scsi_alloc_queue() can set up both limits in the same way:
blk_queue_segment_boundary(q, dma_get_seg_boundary(dev));
blk_queue_max_segment_size(q, dma_get_max_seg_size(dev));
killing dma_boundary in scsi_host_template needs a large patch for
libata (dma_boundary is used by only libata and sym53c8xx). I'll send
a patch to do that if it is acceptable. James and Jeff?
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Greg KH <greg@kroah.com>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
request_queue and device struct must have the same value of a segment
size limit. This patch adds blk_queue_segment_boundary in
__scsi_alloc_queue so LLDs don't need to call both
blk_queue_segment_boundary and set_dma_max_seg_size. An LLD that wants to
change the default value (64KB) can call device_dma_parameters accessors
like pci_set_dma_max_seg_size when allocating scsi_host.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Jeff Garzik <jeff@garzik.org>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
scsi_init_queue is expected to clean up allocated things when it
fails.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Needs to call kmem_cache_destroy for scsi_bidi_sdb_cache in
scsi_exit_queue.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
With the sg table code, every SCSI driver is now either chain capable
or broken (or has sg_tablesize set so chaining is never activated), so
there's no need to have a check in the host template.
Also tidy up the code by moving the scatterlist size defines into the
SCSI includes and permit the last entry of the scatterlist pools not
to be a power of two.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
At the block level bidi request uses req->next_rq pointer for a second
bidi_read request.
At Scsi-midlayer a second scsi_data_buffer structure is used for the
bidi_read part. This bidi scsi_data_buffer is put on
request->next_rq->special. Struct scsi_cmnd is not changed.
- Define scsi_bidi_cmnd() to return true if it is a bidi request and a
second sgtable was allocated.
- Define scsi_in()/scsi_out() to return the in or out scsi_data_buffer
from this command. This API is to isolate users from the mechanics of
bidi.
- Define scsi_end_bidi_request() to do what scsi_end_request() does but
for a bidi request. This is necessary because bidi commands are a bit
tricky here. (See comments in body)
- scsi_release_buffers() will also release the bidi_read scsi_data_buffer
- scsi_io_completion() on bidi commands will now call
scsi_end_bidi_request() and return.
- The previous work done in scsi_init_io() is now done in a new
scsi_init_sgtable() (which is 99% identical to old scsi_init_io())
The new scsi_init_io() will call the above twice if needed also for
the bidi_read command. Only at this point is a command bidi.
- In scsi_error.c at scsi_eh_prep/restore_cmnd() make sure bidi-lld is not
confused by a get-sense command that looks like bidi. This is done
by putting NULL at request->next_rq, and restoring.
[jejb: update to sg_table and resolve conflicts
also update to blk-end-request and resolve conflicts]
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
In preparation for bidi we abstract all IO members of scsi_cmnd,
that will need to duplicate, into a substructure.
- Group all IO members of scsi_cmnd into a scsi_data_buffer
structure.
- Adjust accessors to new members.
- scsi_{alloc,free}_sgtable receive a scsi_data_buffer instead of
scsi_cmnd. And work on it.
- Adjust scsi_init_io() and scsi_release_buffers() for above
change.
- Fix other parts of scsi_lib/scsi.c to members migration. Use
accessors where appropriate.
- fix Documentation about scsi_cmnd in scsi_host.h
- scsi_error.c
* Changed needed members of struct scsi_eh_save.
* Careful considerations in scsi_eh_prep/restore_cmnd.
- sd.c and sr.c
* sd and sr would adjust IO size to align on device's block
size so code needs to change once we move to scsi_data_buff
implementation.
* Convert code to use scsi_for_each_sg
* Use data accessors where appropriate.
- tgt: convert libsrp to use scsi_data_buffer
- isd200: This driver still bangs on scsi_cmnd IO members,
so need changing
[jejb: rebased on top of sg_table patches fixed up conflicts
and used the synergy to eliminate use_sg and sg_count]
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
If we export scsi_init_io()/scsi_release_buffers() instead of
scsi_{alloc,free}_sgtable() from scsi_lib then tgt code is much more
insulated from scsi_lib changes. As a bonus it will also gain bidi
capability when it comes.
[jejb: rebase on to sg_table and fix up rejections]
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
SCSI sg table allocation has a maximum size (of SCSI_MAX_SG_SEGMENTS,
currently 128) and this will cause a BUG_ON() in SCSI if something
tries an allocation over it. This patch adds a size limit to the
chaining allocator to allow the specification of the maximum
allocation size for chaining, so we always chain in units of the
maximum SCSI allocation size.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
This patch converts scsi mid-layer to use blk_end_request interfaces.
Related 'uptodate' arguments are converted to 'error'.
As a result, the interface of internal function, scsi_end_request(),
is changed.
Cc: James Bottomley <James.Bottomley@SteelEye.com>
Cc: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Also change scsi_alloc_sgtable() to just return 0/failure, since it
maps to the command passed in. ->request_buffer is now no longer needed;
once drivers are adapted to use scsi_sglist() it can be killed.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>