Commit Graph

305 Commits

Author SHA1 Message Date
Linus Torvalds
23d4ed53b7 Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block layer fixes from Jens Axboe:
 "Final small batch of fixes to be included before -rc1.  Some general
  cleanups in here as well, but some of the blk-mq fixes we need for the
  NVMe conversion and/or scsi-mq.  The pull request contains:

   - Support for not merging across a specified "chunk size", if set by
     the driver.  Some NVMe devices perform poorly for IO that crosses
     such a chunk, so we need to support it generically as part of
     request merging avoid having to do complicated split logic.  From
     me.

   - Bump max tag depth to 10Ki tags.  Some scsi devices have a huge
     shared tag space.  Before we failed with EINVAL if a too large tag
     depth was specified, now we truncate it and pass back the actual
     value.  From me.

   - Various blk-mq rq init fixes from me and others.

   - A fix for enter on a dying queue for blk-mq from Keith.  This is
     needed to prevent oopsing on hot device removal.

   - Fixup for blk-mq timer addition from Ming Lei.

   - Small round of performance fixes for mtip32xx from Sam Bradshaw.

   - Minor stack leak fix from Rickard Strandqvist.

   - Two __init annotations from Fabian Frederick"

* 'for-linus' of git://git.kernel.dk/linux-block:
  block: add __init to blkcg_policy_register
  block: add __init to elv_register
  block: ensure that bio_add_page() always accepts a page for an empty bio
  blk-mq: add timer in blk_mq_start_request
  blk-mq: always initialize request->start_time
  block: blk-exec.c: Cleaning up local variable address returnd
  mtip32xx: minor performance enhancements
  blk-mq: ->timeout should be cleared in blk_mq_rq_ctx_init()
  blk-mq: don't allow queue entering for a dying queue
  blk-mq: bump max tag depth to 10K tags
  block: add blk_rq_set_block_pc()
  block: add notion of a chunk size for request merging
2014-06-11 08:41:17 -07:00
Linus Torvalds
1c54fc1efe SCSI for-linus on 20140609
This patch consists of the usual driver updates (qla2xxx, qla4xxx, lpfc,
 be2iscsi, fnic, ufs, NCR5380) The NCR5380 is the addition to maintained status
 of a long neglected driver for older hardware.  In addition there are a lot of
 minor fixes and cleanups and some more updates to make scsi mq ready.
 
 Signed-off-by: James Bottomley <JBottomley@Parallels.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJTlcwiAAoJEDeqqVYsXL0M5qsIALzVPLd4yxA16zCiaPQUeIV5
 mfYmwISFlN+qW3AcUeSH4D13YgegCjEBfqaDMWvIkgouxLy/7jpxtChutq3MCzUE
 cDT1B9+ZrzoqBISRNHEh/gx5F1MOF2VPuqG2pe0J90wyRCNzJscB6PbtWMAo86CA
 2eu7wq3K9FXxCC1qY0PzwBLXHqUcgk5GWiK9CM/k4W0NiTVeNmwPeh5i91IQnBHx
 E2l7NAXgNLyCf5tyeswvZ4pW0T3hlaswNmBB4qC8oJm4U6UqMN+tk4ML63Pz7uPe
 4mlHG0uI8Vbdi13iv1EDUZ9Vo8iqVrzP2UAhakgP9poKSGqE4d/MD0EKNGQB2Es=
 =cBY8
 -----END PGP SIGNATURE-----

Merge tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This patch consists of the usual driver updates (qla2xxx, qla4xxx,
  lpfc, be2iscsi, fnic, ufs, NCR5380) The NCR5380 is the addition to
  maintained status of a long neglected driver for older hardware.  In
  addition there are a lot of minor fixes and cleanups and some more
  updates to make scsi mq ready"

* tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (130 commits)
  include/scsi/osd_protocol.h: remove unnecessary __constant
  mvsas: Recognise device/subsystem 9485/9485 as 88SE9485
  Revert "be2iscsi: Fix processing cqe for cxn whose endpoint is freed"
  mptfusion: fix msgContext in mptctl_hp_hostinfo
  acornscsi: remove linked command support
  scsi/NCR5380: dprintk macro
  fusion: Remove use of DEF_SCSI_QCMD
  fusion: Add free msg frames to the head, not tail of list
  mpt2sas: Add free smids to the head, not tail of list
  mpt2sas: Remove use of DEF_SCSI_QCMD
  mpt2sas: Remove uses of serial_number
  mpt3sas: Remove use of DEF_SCSI_QCMD
  mpt3sas: Remove uses of serial_number
  qla2xxx: Use kmemdup instead of kmalloc + memcpy
  qla4xxx: Use kmemdup instead of kmalloc + memcpy
  qla2xxx: fix incorrect debug printk
  be2iscsi: Bump the driver version
  be2iscsi: Fix processing cqe for cxn whose endpoint is freed
  be2iscsi: Fix destroy MCC-CQ before MCC-EQ is destroyed
  be2iscsi: Fix memory corruption in MBX path
  ...
2014-06-09 18:54:06 -07:00
Jens Axboe
f27b087b81 block: add blk_rq_set_block_pc()
With the optimizations around not clearing the full request at alloc
time, we are leaving some of the needed init for REQ_TYPE_BLOCK_PC
up to the user allocating the request.

Add a blk_rq_set_block_pc() that sets the command type to
REQ_TYPE_BLOCK_PC, and properly initializes the members associated
with this type of request. Update callers to use this function instead
of manipulating rq->cmd_type directly.

Includes fixes from Christoph Hellwig <hch@lst.de> for my half-assed
attempt.

Signed-off-by: Jens Axboe <axboe@fb.com>
2014-06-06 07:57:37 -06:00
Linus Torvalds
681a289548 Merge branch 'for-3.16/core' of git://git.kernel.dk/linux-block into next
Pull block core updates from Jens Axboe:
 "It's a big(ish) round this time, lots of development effort has gone
  into blk-mq in the last 3 months.  Generally we're heading to where
  3.16 will be a feature complete and performant blk-mq.  scsi-mq is
  progressing nicely and will hopefully be in 3.17.  A nvme port is in
  progress, and the Micron pci-e flash driver, mtip32xx, is converted
  and will be sent in with the driver pull request for 3.16.

  This pull request contains:

   - Lots of prep and support patches for scsi-mq have been integrated.
     All from Christoph.

   - API and code cleanups for blk-mq from Christoph.

   - Lots of good corner case and error handling cleanup fixes for
     blk-mq from Ming Lei.

   - A flew of blk-mq updates from me:

     * Provide strict mappings so that the driver can rely on the CPU
       to queue mapping.  This enables optimizations in the driver.

     * Provided a bitmap tagging instead of percpu_ida, which never
       really worked well for blk-mq.  percpu_ida relies on the fact
       that we have a lot more tags available than we really need, it
       fails miserably for cases where we exhaust (or are close to
       exhausting) the tag space.

     * Provide sane support for shared tag maps, as utilized by scsi-mq

     * Various fixes for IO timeouts.

     * API cleanups, and lots of perf tweaks and optimizations.

   - Remove 'buffer' from struct request.  This is ancient code, from
     when requests were always virtually mapped.  Kill it, to reclaim
     some space in struct request.  From me.

   - Remove 'magic' from blk_plug.  Since we store these on the stack
     and since we've never caught any actual bugs with this, lets just
     get rid of it.  From me.

   - Only call part_in_flight() once for IO completion, as includes two
     atomic reads.  Hopefully we'll get a better implementation soon, as
     the part IO stats are now one of the more expensive parts of doing
     IO on blk-mq.  From me.

   - File migration of block code from {mm,fs}/ to block/.  This
     includes bio.c, bio-integrity.c, bounce.c, and ioprio.c.  From me,
     from a discussion on lkml.

  That should describe the meat of the pull request.  Also has various
  little fixes and cleanups from Dave Jones, Shaohua Li, Duan Jiong,
  Fengguang Wu, Fabian Frederick, Randy Dunlap, Robert Elliott, and Sam
  Bradshaw"

* 'for-3.16/core' of git://git.kernel.dk/linux-block: (100 commits)
  blk-mq: push IPI or local end_io decision to __blk_mq_complete_request()
  blk-mq: remember to start timeout handler for direct queue
  block: ensure that the timer is always added
  blk-mq: blk_mq_unregister_hctx() can be static
  blk-mq: make the sysfs mq/ layout reflect current mappings
  blk-mq: blk_mq_tag_to_rq should handle flush request
  block: remove dead code in scsi_ioctl:blk_verify_command
  blk-mq: request initialization optimizations
  block: add queue flag for disabling SG merging
  block: remove 'magic' from struct blk_plug
  blk-mq: remove alloc_hctx and free_hctx methods
  blk-mq: add file comments and update copyright notices
  blk-mq: remove blk_mq_alloc_request_pinned
  blk-mq: do not use blk_mq_alloc_request_pinned in blk_mq_map_request
  blk-mq: remove blk_mq_wait_for_tags
  blk-mq: initialize request in __blk_mq_alloc_request
  blk-mq: merge blk_mq_alloc_reserved_request into blk_mq_alloc_request
  blk-mq: add helper to insert requests from irq context
  blk-mq: remove stale comment for blk_mq_complete_request()
  blk-mq: allow non-softirq completions
  ...
2014-06-02 09:29:34 -07:00
Christoph Hellwig
a1b73fc194 scsi: reintroduce scsi_driver.init_command
Instead of letting the ULD play games with the prep_fn move back to
the model of a central prep_fn with a callback to the ULD.  This
already cleans up and shortens the code by itself, and will be required
to properly support blk-mq in the SCSI midlayer.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Nicholas Bellinger <nab@linux-iscsi.org>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Reviewed-by: Hannes Reinecke <hare@suse.de>
2014-05-19 12:35:09 +02:00
Christoph Hellwig
bc85dc500f scsi: remove scsi_end_request
By folding scsi_end_request into its only caller we can significantly clean
up the completion logic.  We can use simple goto labels now to only have
a single place to finish or requeue command there instead of the previous
convoluted logic.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Nicholas Bellinger <nab@linux-iscsi.org>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Reviewed-by: Hannes Reinecke <hare@suse.de>
2014-05-19 12:33:41 +02:00
Christoph Hellwig
c682adf3e1 scsi: explicitly release bidi buffers
Instead of trying to guess when we have a BIDI buffer in scsi_release_buffers
add a function to explicitly free the BIDI ressoures in the one place that
handles them.  This avoids needing a special __scsi_release_buffers for the
case where we already have freed the request as well.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
2014-05-19 12:33:41 +02:00
Alan Stern
644373a421 [SCSI] Fix command result state propagation
We're seeing a case where the contents of scmd->result isn't being reset after
a SCSI command encounters an error, is resubmitted, times out and then gets
handled.  The error handler acts on the stale result of the previous error
instead of the timeout.  Fix this by properly zeroing the scmd->status before
the command is resubmitted.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2014-04-21 14:27:26 -07:00
Christoph Hellwig
68c03d9193 [SCSI] don't reference freed command in scsi_prep_return
Patch

commit 0479633686
Author: Christoph Hellwig <hch@infradead.org>
Date:   Thu Feb 20 14:20:55 2014 -0800

    [SCSI] do not manipulate device reference counts in scsi_get/put_command

Introduced a use after free:I in the kill case of scsi_prep_return we have to
release our device reference, but we do this trying to reference the just
freed command.  Use the local sdev pointer instead.

Fixes: 0479633686
Reported-by: Joe Lawrence <joe.lawrence@stratus.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2014-04-21 07:57:22 -07:00
Christoph Hellwig
5e012aad85 [SCSI] don't reference freed command in scsi_init_sgtable
Patch

commit 0479633686
Author: Christoph Hellwig <hch@infradead.org>
Date:   Thu Feb 20 14:20:55 2014 -0800

    [SCSI] do not manipulate device reference counts in scsi_get/put_command

Introduced a use after free: when scsi_init_io fails we have to release our
device reference, but we do this trying to reference the just freed command.
Add a local scsi_device pointer to fix this.

Fixes: 0479633686
Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2014-04-21 07:57:21 -07:00
Jens Axboe
b4f42e2831 block: remove struct request buffer member
This was used in the olden days, back when onions were proper
yellow. Basically it mapped to the current buffer to be
transferred. With highmem being added more than a decade ago,
most drivers map pages out of a bio, and rq->buffer isn't
pointing at anything valid.

Convert old style drivers to just use bio_data().

For the discard payload use case, just reference the page
in the bio.

Signed-off-by: Jens Axboe <axboe@fb.com>
2014-04-15 14:03:02 -06:00
Jens Axboe
f89e0dd9d1 Linux 3.15-rc1
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJTSv89AAoJEHm+PkMAQRiGd7AIAKL45dFRhX96W53uzGlpXtv7
 Ecs9CMY5mtsB/rqtV/NSQaUELlNb4Ilb4lITh7NaLWAZxDJ12GwVsIbFoaBj7Ypx
 tfBNNxffGGnTTn2C1PpPQmLytLBvqXIVHMaPpDvnHYJl6g9BihshLzyMrsrizqnA
 DPJ0xMdgGp6BLC4qm0ZH1mS2q+TyXLN2ZCjJ1lp6NNYwBnwOe/ABjnUa0Ztu7aii
 837N8h6nEuKHTr6DgrYHEgYVc3DPHPHaly/UJ8v4U30myzRv83YkD5DsBSUjSLj8
 CzxJEnECa1XljLJK7SjRHy5pSP9tcUlmjx3Fk7IpQixmT+A15cov6fQ967jNlDw=
 =Hxnc
 -----END PGP SIGNATURE-----

Merge tag 'v3.15-rc1' into for-3.16/core

We don't like this, but things have diverged with the blk-mq fixes
in 3.15-rc1. So merge it in.
2014-04-15 14:02:24 -06:00
Martin K. Petersen
2bfad21ecc scsi: Make sure cmd_flags are 64-bit
cmd_flags in struct request is now 64 bits wide but the scsi_execute
functions truncated arguments passed to int leading to errors. Make sure
the flags parameters are u64.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Jens Axboe <axboe@fb.com>
CC: Jan Kara <jack@suse.cz>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-04-09 20:26:20 -06:00
Jens Axboe
59c3d45e48 block: remove 'q' parameter from kblockd_schedule_*_work()
The queue parameter is never used, just get rid of it.

Signed-off-by: Jens Axboe <axboe@fb.com>
2014-04-09 10:17:00 -06:00
Christoph Hellwig
134997a041 [SCSI] remove a useless get/put_device pair in scsi_requeue_command
Avoid a spurious device get/put pair by cleaning up scsi_requeue_command
and folding scsi_unprep_request into it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2014-03-15 10:19:25 -07:00
Bart Van Assche
27e9e0f12a [SCSI] remove a useless get/put_device pair in scsi_next_command
Eliminate a get_device() / put_device() pair from scsi_next_command().
Both are atomic operations hence removing these slightly improves
performance.

[hch: slight changes due to different context]
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2014-03-15 10:19:25 -07:00
Bart Van Assche
613be1f626 [SCSI] remove a useless get/put_device pair in scsi_request_fn
SCSI devices may only be removed by calling scsi_remove_device().
That function must invoke blk_cleanup_queue() before the final put
of sdev->sdev_gendev. Since blk_cleanup_queue() waits for the
block queue to drain and then tears it down, scsi_request_fn cannot
be active anymore after blk_cleanup_queue() has returned and hence
the get_device()/put_device() pair in scsi_request_fn is unnecessary.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2014-03-15 10:19:25 -07:00
Christoph Hellwig
0479633686 [SCSI] do not manipulate device reference counts in scsi_get/put_command
Many callers won't need this and we can optimize them away.  In addition
the handling in the __-prefixed variants was inconsistant to start with.

Based on an earlier patch from Bart Van Assche.

[jejb: fix kerneldoc probelm picked up by Fengguang Wu]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2014-03-15 10:19:24 -07:00
Christoph Hellwig
21a05df547 [SCSI] avoid taking host_lock in scsi_run_queue unless nessecary
If we don't have starved devices we don't need to take the host lock
to iterate over them.  Also split the function up to be more clear.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2014-03-15 10:19:24 -07:00
Eiichi Tsukata
ee60b2c52e [SCSI] Add timeout to avoid infinite command retry
Currently, scsi error handling in scsi_io_completion() tries to
unconditionally requeue scsi command when device keeps some error state.
For example, UNIT_ATTENTION causes infinite retry with
action == ACTION_RETRY.
This is because retryable errors are thought to be temporary and the scsi
device will soon recover from those errors. Normally, such retry policy is
appropriate because the device will soon recover from temporary error state.

But there is no guarantee that device is able to recover from error state
immediately. Some hardware error can prevent device from recovering.

This patch adds timeout in scsi_io_completion() to avoid infinite command
retry in scsi_io_completion(). Once scsi command retry time is longer than
this timeout, the command is treated as failure.

Signed-off-by: Eiichi Tsukata <eiichi.tsukata.xh@hitachi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2014-03-15 10:19:19 -07:00
Russell King
e83b366487 Fix uses of dma_max_pfn() when converting to a limiting address
We must use a 64-bit for this, otherwise overflowed bits get lost, and
that can result in a lower than intended value set.

Fixes: 8e0cb8a1f6 ("ARM: 7797/1: mmc: Use dma_max_pfn(dev) helper for bounce_limit calculations")
Fixes: 7d35496dd9 ("ARM: 7796/1: scsi: Use dma_max_pfn(dev) helper for bounce_limit calculations")
Tested-Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2014-02-17 23:08:41 +00:00
Santosh Shilimkar
7d35496dd9 ARM: 7796/1: scsi: Use dma_max_pfn(dev) helper for bounce_limit calculations
DMA bounce limit is the maximum direct DMA'able memory beyond which
bounce buffers has to be used to perform dma operations. SCSI driver
relies on dma_mask but its calculation is based on max_*pfn which
don't have uniform meaning across architectures. So make use of
dma_max_pfn() which is expected to return the DMAable maximum pfn
value across architectures.

Cc: linux-scsi@vger.kernel.org
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-10-31 14:49:26 +00:00
Linus Torvalds
357397a141 Merge branch 'for-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata
Pull libata changes from Tejun Heo:
 "Two interesting changes.

   - libata acpi handling has been restructured so that the association
     between ata devices and ACPI handles are less convoluted.  This
     change shouldn't change visible behavior.

   - Queued TRIM support, which enables sending TRIM to the device
     without draining in-flight RW commands, is added.  Currently only
     enabled for ahci (and likely to stay that way for the foreseeable
     future).

  Other changes are driver-specific updates / fixes"

* 'for-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
  libata: bugfix: Remove __le32 in ata_tf_to_fis()
  libata: acpi: Remove ata_dev_acpi_handle stub in libata.h
  libata: Add support for queued DSM TRIM
  libata: Add support for SEND/RECEIVE FPDMA QUEUED
  libata: Add H2D FIS "auxiliary" port flag
  libata: Populate host-to-device FIS "auxiliary" field
  ata: acpi: rework the ata acpi bind support
  sata, highbank: send extra clock cycles in SGPIO patterns
  sata, highbank: set tx_atten override bits
  devicetree: create a separate binding description for sata_highbank
  drivers/ata/sata_rcar.c: simplify use of devm_ioremap_resource
  sata highbank: enable 64-bit DMA mask when using LPAE
  ata: pata_samsung_cf: add missing __iomem annotation
  ata: pata_arasan: Staticize local symbols
  sata_mv: Remove unneeded CONFIG_HAVE_CLK ifdefs
  ata: use dev_get_platdata()
  sata_mv: Remove unneeded forward declaration
  libata: acpi: remove dead code for ata_acpi_(un)bind
  libata: move 'struct ata_taskfile' and friends from ata.h to libata.h
2013-09-03 18:19:53 -07:00
Ewan D. Milne
279afdfe78 [SCSI] Generate uevents on certain unit attention codes
Generate a uevent when the following Unit Attention ASC/ASCQ
codes are received:

    2A/01  MODE PARAMETERS CHANGED
    2A/09  CAPACITY DATA HAS CHANGED
    38/07  THIN PROVISIONING SOFT THRESHOLD REACHED
    3F/03  INQUIRY DATA HAS CHANGED
    3F/0E  REPORTED LUNS DATA HAS CHANGED

Log kernel messages when the following Unit Attention ASC/ASCQ
codes are received that are not as specific as those above:

    2A/xx  PARAMETERS CHANGED
    3F/xx  TARGET OPERATING CONDITIONS HAVE CHANGED

Added logic to set expecting_lun_change for other LUNs on the target
after REPORTED LUNS DATA HAS CHANGED is received, so that duplicate
uevents are not generated, and clear expecting_lun_change when a
REPORT LUNS command completes, in accordance with the SPC-3
specification regarding reporting of the 3F 0E ASC/ASCQ UA.

[jejb: remove SPC3 test in scsi_report_lun_change and some docbook fixes and
       unused variable fix, both reported by Fengguang Wu]
Signed-off-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-08-26 18:52:27 +04:00
Hannes Reinecke
7e782af576 [SCSI] Return ENODATA on medium error
When a medium error is detected the SCSI stack should return
ENODATA to the upper layers.

[jejb: fix whitespace error]
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-08-23 12:54:53 -04:00
Hannes Reinecke
a9d6ceb838 [SCSI] return ENOSPC on thin provisioning failure
When the thin provisioning hard threshold is reached we
should return ENOSPC to inform upper layers about this fact.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-08-23 12:43:54 -04:00
Hannes Reinecke
0f7f6234d3 [SCSI] Document enhanced error codes
Document the various error codes returned on I/O failure.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-08-23 12:31:59 -04:00
Aaron Lu
f1bc1e4c44 ata: acpi: rework the ata acpi bind support
Binding ACPI handle to SCSI device has several drawbacks, namely:
1 During ATA device initialization time, ACPI handle will be needed
  while SCSI devices are not created yet. So each time ACPI handle is
  needed, instead of retrieving the handle by ACPI_HANDLE macro,
  a namespace scan is performed to find the handle for the corresponding
  ATA device. This is inefficient, and also expose a restriction on
  calling path not holding any lock.
2 The binding to SCSI device tree makes code complex, while at the same
  time doesn't bring us any benefit. All ACPI handlings are still done
  in ATA module, not in SCSI.

Rework the ATA ACPI binding code to bind ACPI handle to ATA transport
devices(ATA port and ATA device). The binding needs to be done only once,
since the ATA transport devices do not go away with hotplug. And due to
this, the flush_work call in hotplug handler for ATA bay is no longer
needed.

Tested on an Intel test platform for binding and runtime power off for
ODD(ZPODD) and hard disk; on an ASUS S400C for binding and normal boot
and S3, where its SATA port node has _SDD and _GTF control methods when
configured as an AHCI controller and its PATA device node has _GTF
control method when configured as an IDE controller. SATA PMP binding
and ATA hotplug is not tested.

Signed-off-by: Aaron Lu <aaron.lu@intel.com>
Tested-by: Dirk Griesbach <spamthis@freenet.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
2013-08-23 12:09:23 -04:00
Bart Van Assche
0516c08d10 [SCSI] enable destruction of blocked devices which fail LUN scanning
If something goes wrong during LUN scanning, e.g. a transport layer
failure occurs, then __scsi_remove_device() can get invoked by the
LUN scanning code for a SCSI device in state SDEV_CREATED_BLOCK and
before the SCSI device has been added to sysfs (is_visible == 0).
Make sure that even in this case the transition into state SDEV_DEL
occurs. This avoids that __scsi_remove_device() can get invoked a
second time by scsi_forget_host() if this last function is invoked
from another thread than the thread that performs LUN scanning.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-07-09 12:14:09 +01:00
James Bottomley
e2eb7244bc [SCSI] Fix race between starved list and device removal
scsi_run_queue() examines all SCSI devices that are present on
the starved list. Since scsi_run_queue() unlocks the SCSI host
lock a SCSI device can get removed after it has been removed
from the starved list and before its queue is run. Protect
against that race condition by holding a reference on the
queue while running it.

Reported-by: Chanho Min <chanho.min@lge.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-07-09 12:14:08 +01:00
Lin Ming
9b21493c45 [SCSI] sd: use REQ_PM in sd's runtime suspend operation
With the introduction of REQ_PM, modify sd's runtime suspend operation
functions to use that flag so that the operations to put the device into
runtime suspended state(i.e. sync cache and stop device) will not affect
its runtime PM status.

Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Aaron Lu <aaron.lu@intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-05-06 12:48:17 -07:00
Rafael J. Wysocki
53540098b2 ACPI / glue: Add .match() callback to struct acpi_bus_type
USB uses the .find_bridge() callback from struct acpi_bus_type
incorrectly, because as a result of the way it is used by USB every
device in the system that doesn't have a bus type or parent is
passed to usb_acpi_find_device() for inspection.

What USB actually needs, though, is to call usb_acpi_find_device()
for USB ports that don't have a bus type defined, but have
usb_port_device_type as their device type, as well as for USB
devices.

To fix that replace the struct bus_type pointer in struct
acpi_bus_type used for matching devices to specific subsystems
with a .match() callback to be used for this purpose and update
the users of struct acpi_bus_type, including USB, accordingly.
Define the .match() callback routine for USB, usb_acpi_bus_match(),
in such a way that it will cover both USB devices and USB ports
and remove the now redundant .find_bridge() callback pointer from
usb_acpi_bus.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Jeff Garzik <jgarzik@pobox.com>
2013-03-04 14:23:40 +01:00
Aaron Lu
6f4c827e68 [libata] scsi: no poll when ODD is powered off
When the ODD is powered off, any action the user did to the ODD that
would generate a media event will trigger an ACPI interrupt, so the
poll for media event is no longer necessary. And the poll will also
cause a runtime status change, which will stop the ODD from staying in
powered off state, so the poll should better be stopped.

But since we don't have access to the gendisk structure in LLDs, here
comes the disk_events_disable_depth for scsi device. This field is a
hint set by LLDs to convey information to upper layer drivers. A value
of 0 means media poll is necessary for the device, while values above 0
means media poll is not needed and should better be skipped. So we can
increase its value when we are to power off the ODD in ATA layer and
decrease its value when the ODD is powered on, effectively silence the
media events poll.

Signed-off-by: Aaron Lu <aaron.lu@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2013-01-25 15:36:43 -05:00
Linus Torvalds
60da5bf47d Merge branch 'for-3.8/core' of git://git.kernel.dk/linux-block
Pull block layer core updates from Jens Axboe:
 "Here are the core block IO bits for 3.8.  The branch contains:

   - The final version of the surprise device removal fixups from Bart.

   - Don't hide EFI partitions under advanced partition types.  It's
     fairly wide spread these days.  This is especially dangerous for
     systems that have both msdos and efi partition tables, where you
     want to keep them in sync.

   - Cleanup of using -1 instead of the proper NUMA_NO_NODE

   - Export control of bdi flusher thread CPU mask and default to using
     the home node (if known) from Jeff.

   - Export unplug tracepoint for MD.

   - Core improvements from Shaohua.  Reinstate the recursive merge, as
     the original bug has been fixed.  Add plugging for discard and also
     fix a problem handling non pow-of-2 discard limits.

  There's a trivial merge in block/blk-exec.c due to a fix that went
  into 3.7-rc at a later point than -rc4 where this is based."

* 'for-3.8/core' of git://git.kernel.dk/linux-block:
  block: export block_unplug tracepoint
  block: add plug for blkdev_issue_discard
  block: discard granularity might not be power of 2
  deadline: Allow 0ms deadline latency, increase the read speed
  partitions: enable EFI/GPT support by default
  bsg: Remove unused function bsg_goose_queue()
  block: Make blk_cleanup_queue() wait until request_fn finished
  block: Avoid scheduling delayed work on a dead queue
  block: Avoid that request_fn is invoked on a dead queue
  block: Let blk_drain_queue() caller obtain the queue lock
  block: Rename queue dead flag
  bdi: add a user-tunable cpu_list for the bdi flusher threads
  block: use NUMA_NO_NODE instead of -1
  block: recursive merge requests
  block CFQ: avoid moving request to different queue
2012-12-17 08:27:23 -08:00
Bart Van Assche
3f3299d5c0 block: Rename queue dead flag
QUEUE_FLAG_DEAD is used to indicate that queuing new requests must
stop. After this flag has been set queue draining starts. However,
during the queue draining phase it is still safe to invoke the
queue's request_fn, so QUEUE_FLAG_DYING is a better name for this
flag.

This patch has been generated by running the following command
over the kernel source tree:

git grep -lEw 'blk_queue_dead|QUEUE_FLAG_DEAD' |
    xargs sed -i.tmp -e 's/blk_queue_dead/blk_queue_dying/g'      \
        -e 's/QUEUE_FLAG_DEAD/QUEUE_FLAG_DYING/g';                \
sed -i.tmp -e "s/QUEUE_FLAG_DYING$(printf \\t)*5/QUEUE_FLAG_DYING$(printf \\t)5/g" \
    include/linux/blkdev.h;                                       \
sed -i.tmp -e 's/ DEAD/ DYING/g' -e 's/dead queue/a dying queue/' \
    -e 's/Dead queue/A dying queue/' block/blk-core.c

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: James Bottomley <JBottomley@Parallels.com>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Chanho Min <chanho.min@lge.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-12-06 14:30:58 +01:00
Martin K. Petersen
5db44863b6 [SCSI] sd: Implement support for WRITE SAME
Implement support for WRITE SAME(10) and WRITE SAME(16) in the SCSI disk
driver.

 - We set the default maximum to 0xFFFF because there are several
   devices out there that only support two-byte block counts even with
   WRITE SAME(16). We only enable transfers bigger than 0xFFFF if the
   device explicitly reports MAXIMUM WRITE SAME LENGTH in the BLOCK
   LIMITS VPD.

 - max_write_same_blocks can be overriden per-device basis in sysfs.

 - The UNMAP discovery heuristics remain unchanged but the discard
   limits are tweaked to match the "real" WRITE SAME commands.

 - In the error handling logic we now distinguish between WRITE SAME
   with and without UNMAP set.

The discovery process heuristics are:

 - If the device reports a SCSI level of SPC-3 or greater we'll issue
   READ SUPPORTED OPERATION CODES to find out whether WRITE SAME(16) is
   supported. If that's the case we will use it.

 - If the device supports the block limits VPD and reports a MAXIMUM
   WRITE SAME LENGTH bigger than 0xFFFF we will use WRITE SAME(16).

 - Otherwise we will use WRITE SAME(10) unless the target LBA is beyond
   0xFFFFFFFF or the block count exceeds 0xFFFF.

 - no_write_same is set for ATA, FireWire and USB.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Reviewed-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-11-13 22:45:42 -08:00
James Bottomley
fe709ed827 Merge SCSI misc branch into isci-for-3.6 tag 2012-10-02 08:55:12 +01:00
Vikas Chaudhary
0e58076b37 [SCSI] scsi_lib: Set the device state from transport-offline to running
FC and iSCSI class set SCSI devices to transport-offline state after
fast_io_fail/replacement_timeout has fired, but after relogin, function
scsi_internal_device_unblock() is not setting scsi device state to running.
Due to this the devices even after being relogged in remain offline.

Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-09-14 17:59:22 +01:00
Mike Snitzer
27c419739b [SCSI] scsi_lib: fix scsi_io_completion's SG_IO error propagation
The following v3.4-rc1 commit unmasked an existing bug in scsi_io_completion's
SG_IO error handling: 47ac56d [SCSI] scsi_error: classify some ILLEGAL_REQUEST
sense as a permanent TARGET_ERROR

Given that certain ILLEGAL_REQUEST are now properly categorized as
TARGET_ERROR the host_byte is being set (before host_byte wasn't ever
set for these ILLEGAL_REQUEST).

In scsi_io_completion, initialize req->errors with cmd->result _after_
the SG_IO block that calls __scsi_error_from_host_byte (which may
modify the host_byte).

Before this fix:

    cdb to send: 12 01 01 00 00 00
ioctl(3, SG_IO, {'S', SG_DXFER_NONE, cmd[6]=[12, 01, 01, 00, 00, 00],
    mx_sb_len=32, iovec_count=0, dxfer_len=0, timeout=20000, flags=0,
    status=02, masked_status=01, sb[19]=[70, 00, 05, 00, 00, 00, 00, 0b,
    00, 00, 00, 00, 24, 00, 00, 00, 00, 00, 00], host_status=0x10,
    driver_status=0x8, resid=0, duration=0, info=0x1}) = 0
SCSI Status: Check Condition

Sense Information:
sense buffer empty

After:

    cdb to send: 12 01 01 00 00 00
ioctl(3, SG_IO, {'S', SG_DXFER_NONE, cmd[6]=[12, 01, 01, 00, 00, 00],
    mx_sb_len=32, iovec_count=0, dxfer_len=0, timeout=20000, flags=0,
    status=02, masked_status=01, sb[19]=[70, 00, 05, 00, 00, 00, 00, 0b,
    00, 00, 00, 00, 24, 00, 00, 00, 00, 00, 00], host_status=0,
    driver_status=0x8, resid=0, duration=0, info=0x1}) = 0
SCSI Status: Check Condition

Sense Information:
 Fixed format, current;  Sense key: Illegal Request
 Additional sense: Invalid field in cdb
 Raw sense data (in hex):
        70 00 05 00 00 00 00 0b  00 00 00 00 24 00 00 00
        00 00 00

Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Reviewed-by: Babu Moger <babu.moger@netapp.com>
Cc: stable@vger.kernel.org # 3.4
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-08-22 09:42:31 +04:00
Jeff Garzik
8407884dd9 Merge branch 'master' [vanilla Linus master] into libata-dev.git/upstream
Two bits were appended to the end of the bitfield
list in struct scsi_device.  Resolve that conflict
by including both bits.

Conflicts:
	include/scsi/scsi_device.h
2012-07-25 15:58:48 -04:00
Bart Van Assche
b485462aca [SCSI] Stop accepting SCSI requests before removing a device
Avoid that the code for requeueing SCSI requests triggers a
crash by making sure that that code isn't scheduled anymore
after a device has been removed.

Also, source code inspection of __scsi_remove_device() revealed
a race condition in this function: no new SCSI requests must be
accepted for a SCSI device after device removal started.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-07-20 08:58:41 +01:00
Bart Van Assche
84feb1664e [SCSI] Change return type of scsi_queue_insert() into void
The return value of scsi_queue_insert() is ignored by all its
callers, hence change the return type of this function into
void.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Reviewed-by: Tejun Heo <tj@kernel.org>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-07-20 08:58:41 +01:00
Bart Van Assche
940f5d47e2 [SCSI] Avoid dangling pointer in scsi_requeue_command()
When we call scsi_unprep_request() the command associated with the request
gets destroyed and therefore drops its reference on the device.  If this was
the only reference, the device may get released and we end up with a NULL
pointer deref when we call blk_requeue_request.

Reported-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Reviewed-by: Tejun Heo <tj@kernel.org>
Cc: <stable@kernel.org>
[jejb: enhance commend and add commit log for stable]
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-07-20 08:58:40 +01:00
Bart Van Assche
67bd941300 [SCSI] Fix device removal NULL pointer dereference
Use blk_queue_dead() to test whether the queue is dead instead
of !sdev. Since scsi_prep_fn() may be invoked concurrently with
__scsi_remove_device(), keep the queuedata (sdev) pointer in
__scsi_remove_device(). This patch fixes a kernel oops that
can be triggered by USB device removal. See also
http://www.spinics.net/lists/linux-scsi/msg56254.html.

Other changes included in this patch:
- Swap the blk_cleanup_queue() and kfree() calls in
  scsi_host_dev_release() to make that code easier to grasp.
- Remove the queue dead check from scsi_run_queue() since the
  queue state can change anyway at any point in that function
  where the queue lock is not held.
- Remove the queue dead check from the start of scsi_request_fn()
  since it is redundant with the scsi_device_online() check.

Reported-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Reviewed-by: Tejun Heo <tj@kernel.org>
Cc: <stable@kernel.org>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-07-20 08:58:40 +01:00
Mike Christie
d075498c98 [SCSI] remove old comment from block/unblock functions
We do not hold the host lock when calling these functions,
so remove comment.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-07-20 08:58:23 +01:00
Mike Christie
5d9fb5cc1b [SCSI] core, classes, mpt2sas: have scsi_internal_device_unblock take new state
This has scsi_internal_device_unblock/scsi_target_unblock take
the new state to set the devices as an argument instead of
always setting to running. The patch also converts users of these
functions.

This allows the FC and iSCSI class to transition devices from blocked
to transport-offline, so that when fast_io_fail/replacement_timeout
has fired we do not set the devices back to running. Instead, we
set them to SDEV_TRANSPORT_OFFLINE.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-07-20 08:58:22 +01:00
Mike Christie
1b8d262061 [SCSI] add new SDEV_TRANSPORT_OFFLINE state
This patch adds a new state SDEV_TRANSPORT_OFFLINE. It will
be used by transport classes to offline devices for cases like
when the fast_io_fail/recovery_tmo fires. In those cases we
want all IO to fail, and we have not yet escalated to dev_loss_tmo
behavior where we are removing the devices.

Currently to handle this state, transport classes are setting
the scsi_device's state to running, setting their internal
session/port structs state to something that indicates failed,
and then failing IO from some transport check in the queuecommand.

The reason for the new value is so that users can distinguish
between a device failure that is a result of a transport problem
vs the wide range of errors that devices get offlined for
when a scsi command times out and we offline the devices there.
It also fixes the confusion as to why the transport class is
failing IO, but has set the device state from blocked to running.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-07-20 08:58:21 +01:00
Holger Macht
de50ada55b [SCSI] add wrapper to access and set scsi_bus_type in struct acpi_bus_type
For being able to bind ata devices against acpi devices, scsi_bus_type
needs to be set as bus in struct acpi_bus_type. So add wrapper to
scsi_lib to accomplish that.

Signed-off-by: Holger Macht <holger@homac.de>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2012-06-29 11:38:09 -04:00
Jun'ichi Nomura
b7e94a1686 [SCSI] Fix dm-multipath starvation when scsi host is busy
block congestion control doesn't have any concept of fairness across
multiple queues.  This means that if SCSI reports the host as busy in
the queue congestion control it can result in an unfair starvation
situation in dm-mp if there are multiple multipath devices on the same
host.  For example:
http://www.redhat.com/archives/dm-devel/2012-May/msg00123.html

The fix for this is to report only the sdev busy state (and ignore the
host busy state) in the block congestion control call back.
The host is still congested, but the SCSI subsystem will sort out the
congestion in a fair way because it knows the relation between the
queues and the host.

[jejb: fixed up trailing whitespace]
Reported-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
Tested-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-05-23 09:34:17 +01:00
James Bottomley
e346933365 isci update for 3.5
1/ Rework remote-node-context (RNC) handling for proper management of
    the silicon state machine in error handling and hot-plug conditions.
    Further details below, suffice to say if the RNC is mismanaged the
    silicon state machines may lock up.
 
 2/ Refactor the initialization code to be reused for suspend/resume support
 
 3/ Miscellaneous bug fixes to address discovery issues and hardware
    compatibility.
 
 RNC rework details from Jeff Skirvin:
 
 In the controller, devices as they appear on a SAS domain (or
 direct-attached SATA devices) are represented by memory structures known
 as "Remote Node Contexts" (RNCs).  These structures are transferred from
 main memory to the controller using a set of register commands; these
 commands include setting up the context ("posting"), removing the
 context ("invalidating"), and commands to control the scheduling of
 commands and connections to that remote device ("suspensions" and
 "resumptions").  There is a similar path to control RNC scheduling from
 the protocol engine, which interprets the results of command and data
 transmission and reception.
 
 In general, the controller chooses among non-suspended RNCs to find one
 that has work requiring scheduling the transmission of command and data
 frames to a target.  Likewise, when a target tries to return data back
 to the initiator, the state of the RNC is used by the controller to
 determine how to treat the incoming request. As an example, if the RNC
 is in the state "TX/RX Suspended", incoming SSP connection requests from
 the target will be rejected by the controller hardware.  When an RNC is
 "TX Suspended", it will not be selected by the controller hardware to
 start outgoing command or data operations (with certain priority-based
 exceptions).
 
 As mentioned above, there are two sources for management of the RNC
 states: commands from driver software, and the result of transmission
 and reception conditions of commands and data signaled by the controller
 hardware.  As an example of the latter, if an outgoing SSP command ends
 with a OPEN_REJECT(BAD_DESTINATION) status, the RNC state will
 transition to the "TX Suspended" state, and this is signaled by the
 controller hardware in the status to the completion of the pending
 command as well as signaled in a controller hardware event.  Examples of
 the former are included in the patch changelogs.
 
 Driver software is required to suspend the RNC in a "TX/RX Suspended"
 condition before any outstanding commands can be terminated.  Failure to
 guarantee this can lead to a complete hardware hang condition.  Earlier
 versions of the driver software did not guarantee that an RNC was
 correctly managed before I/O termination, and so operated in an unsafe
 way.
 
 Further, the driver performed unnecessary contortions to preserve the
 remote device command state and so was more complicated than it needed
 to be.  A simplifying driver assumption is that once an I/O has entered
 the error handler path without having completed in the target, the
 requirement on the driver is that all use of the sas_task must end.
 Beyond that, recovery of operation is dependent on libsas and other
 components to reset, rediscover and reconfigure the device before normal
 operation can restart.  In the driver, this simplifying assumption meant
 that the RNC management could be reduced to entry into the suspended
 state, terminating the targeted I/O request, and resuming the RNC as
 needed for device-specific management such as an SSP Abort Task or LUN
 Reset Management request.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJPtYFXAAoJEB7SkWpmfYgCJkcP/1VvsuuitNy/YM9P1tb/RvQ7
 ytJzjGtWiAABHVWwjgB+Ng7hUTaP2r6l8KeNfwxpwXyNdBAUNysYEUBHAfPsKzKz
 espTmw3wVCnREajgKXZwFp9aTj8DcYFB6vKcC/ddACt3uRNjjA9En36+6797r8Vg
 YdebyjFX2FxwoUj0icTUiV/OXgb8w723imnCl8bfOhhFRi4eFZ4EJ23AdMkUya1i
 uYePAPJPSQJuU/87gNIx4JcR0qHJ1ziGPEY+XC47CzEeXbBTSPgWOwanQ6KPoXRJ
 XVxamfcKAjRdtwQ4m1vYBSE32RTdrjhujVbkGiPi6QaEbCLjhLSCIYyuS3XMckV+
 TCZ16o5kd/I6ZtZOeP4zZRGnNBkOPzY44qiJeKffjWDhTrFacx4XWJB/ftWPgEA5
 N2zFH3RM4sY0FUJ3I/Qe5CERNdCXMtcj+UAf3nHpAIVcv46Lp+qoSkdEx05uuuiN
 +D/dSlubktuvuzmB5WisL3qrjNEkkLTAGQpZs1j0ojLEBm0XAgV5EzqmHiZ0GOPD
 OQNFxeei9SlqgtKIIP0bymRispPrG2HVCOvExYMxzKR6fjxofZLAs/aWOsdhxgMq
 TlAyZJ6OmGI+KX68HHzoMpT9iquvmP64WGkfHzCx296BfSKiruLh/Jzt5gGwv+Z1
 5tlpnUr9dUTxx7qkQXvj
 =HYvO
 -----END PGP SIGNATURE-----

Merge tag 'isci-for-3.5' into misc

isci update for 3.5

1/ Rework remote-node-context (RNC) handling for proper management of
   the silicon state machine in error handling and hot-plug conditions.
   Further details below, suffice to say if the RNC is mismanaged the
   silicon state machines may lock up.

2/ Refactor the initialization code to be reused for suspend/resume support

3/ Miscellaneous bug fixes to address discovery issues and hardware
   compatibility.

RNC rework details from Jeff Skirvin:

In the controller, devices as they appear on a SAS domain (or
direct-attached SATA devices) are represented by memory structures known
as "Remote Node Contexts" (RNCs).  These structures are transferred from
main memory to the controller using a set of register commands; these
commands include setting up the context ("posting"), removing the
context ("invalidating"), and commands to control the scheduling of
commands and connections to that remote device ("suspensions" and
"resumptions").  There is a similar path to control RNC scheduling from
the protocol engine, which interprets the results of command and data
transmission and reception.

In general, the controller chooses among non-suspended RNCs to find one
that has work requiring scheduling the transmission of command and data
frames to a target.  Likewise, when a target tries to return data back
to the initiator, the state of the RNC is used by the controller to
determine how to treat the incoming request. As an example, if the RNC
is in the state "TX/RX Suspended", incoming SSP connection requests from
the target will be rejected by the controller hardware.  When an RNC is
"TX Suspended", it will not be selected by the controller hardware to
start outgoing command or data operations (with certain priority-based
exceptions).

As mentioned above, there are two sources for management of the RNC
states: commands from driver software, and the result of transmission
and reception conditions of commands and data signaled by the controller
hardware.  As an example of the latter, if an outgoing SSP command ends
with a OPEN_REJECT(BAD_DESTINATION) status, the RNC state will
transition to the "TX Suspended" state, and this is signaled by the
controller hardware in the status to the completion of the pending
command as well as signaled in a controller hardware event.  Examples of
the former are included in the patch changelogs.

Driver software is required to suspend the RNC in a "TX/RX Suspended"
condition before any outstanding commands can be terminated.  Failure to
guarantee this can lead to a complete hardware hang condition.  Earlier
versions of the driver software did not guarantee that an RNC was
correctly managed before I/O termination, and so operated in an unsafe
way.

Further, the driver performed unnecessary contortions to preserve the
remote device command state and so was more complicated than it needed
to be.  A simplifying driver assumption is that once an I/O has entered
the error handler path without having completed in the target, the
requirement on the driver is that all use of the sas_task must end.
Beyond that, recovery of operation is dependent on libsas and other
components to reset, rediscover and reconfigure the device before normal
operation can restart.  In the driver, this simplifying assumption meant
that the RNC management could be reduced to entry into the suspended
state, terminating the targeted I/O request, and resuming the RNC as
needed for device-specific management such as an SSP Abort Task or LUN
Reset Management request.
2012-05-21 12:17:30 +01:00