Commit Graph

28 Commits

Author SHA1 Message Date
Anil Chintalapati (achintal)
efc7a28838 fnic: IOMMU Fault occurs when IO and abort IO is out of order
When I/O is aborted by mid-layer, fnic FW will complete the I/O before
completing the abort task. In some cases abort request is completed before
the I/O, which could lead to inconsistent driver and firmware states.
In this case firmware reset would clear the inconsistent state.

Signed-off-by: Anil Chintalapati <achintal@cisco.com>
Signed-off-by: Sesidhar Baddela <sebaddel@cisco.com>
Signed-off-by: Hiral Shah <hishah@cisco.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-12-30 13:31:45 +01:00
Hiral Shah
41df7b02db Fnic: Fnic Driver crashed with NULL pointer reference
When issuing I/O request, if the I/O completes before returning from
fnic_queuecommand(), we may be referencing scsi_cmnd structure that may
be freed by interrupt handler. Acquring IO lock would synchronize
fnic_queuecommand and interrupt handler.

- Increment fnic version from 1.6.0.15 to 1.6.0.16

Signed-off-by: Hiral Shah <hishah@cisco.com>
Signed-off-by: Sesidhar Baddela <sebaddel@cisco.com>
Signed-off-by: Anil Chintalapati <achintal@cisco.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-11-20 09:11:00 +01:00
Hiral Shah
35061e21a1 Fnic: Improper resue of exchange Ids
IOs belonging to an rport are aborted with Internal terminate option
when rport goes offline. Any new IO issued to the rport during this
time can reuse the terminated exchange which will cause inconsistent
state of the exchange between local port and remote port.

fc_rport_priv is set to RPORT_ST_DELETE before exchanges are aborted by
libfc. Not issuing amy more I/O requests when RPORT_ST_DELETE is set,
will avoid inconsistent state of the exchange between local port and
remote port.

- Increment fnic version from 1.6.0.13 to 1.6.0.14

Signed-off-by: Hiral Shah <hishah@cisco.com>
Signed-off-by: Sesidhar Baddela <sebaddel@cisco.com>
Signed-off-by: Anil Chintalapati <achintal@cisco.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-11-20 09:10:39 +01:00
Christoph Hellwig
5066863337 scsi: remove abuses of scsi_populate_tag
Unless we want to build a SPI tag message we should just check SCMD_TAGGED
instead of reverse engineering a tag type through the use of
scsi_populate_tag_msg.

Also rename the function to spi_populate_tag_msg, make it behave like the
other spi message helpers, and move it to the spi transport class.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Reviewed-by: Hannes Reinecke <hare@suse.de>
2014-11-12 11:19:41 +01:00
Christoph Hellwig
4d63716898 fnic: reject device resets without assigned tags for the blk-mq case
Current the midlayer fakes up a struct request for the explicit reset
ioctls, and those don't have a tag allocated to them.  The fnic driver pokes
into midlayer structures to paper over this design issue, but that won't
work for the blk-mq case.

Either someone who can actually test the hardware will have to come up with
a similar hack for the blk-mq case, or we'll have to bite the bullet and fix
the way the EH ioctls work for real, but until that happens we fail these
explicit requests here.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Webb Scales <webbnh@hp.com>
Acked-by: Jens Axboe <axboe@kernel.dk>
Tested-by: Bart Van Assche <bvanassche@acm.org>
Tested-by: Robert Elliott <elliott@hp.com>
Cc: Hiral Patel <hiralpat@cisco.com>
Cc: Suma Ramars <sramars@cisco.com>
Cc: Brian Uchino <buchino@cisco.com>
2014-07-25 17:16:34 -04:00
Hannes Reinecke
9cb78c16f5 scsi: use 64-bit LUNs
The SCSI standard defines 64-bit values for LUNs, and large arrays
employing large or hierarchical LUN numbers become more and more
common.

So update the linux SCSI stack to use 64-bit LUN numbers.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Christoph Hellwig <hch@infradead.org>
Reviewed-by: Ewan Milne <emilne@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-07-17 22:07:37 +02:00
Hiral Shah
668186637e fnic: Failing to queue aborts due to Q full cause terminate driver timeout
In fnic abort handler, abort queuing can be failed when hardware queue is full.
The command state is left as abort queued. The command with abort queued state
will never be queued next time for abort or termiantion.
Fix restores the command state in above case.

Signed-off-by: Hiral Shah <hishah@cisco.com>
Signed-off-by: Sesidhar Baddela <sebaddel@cisco.com>
Signed-off-by: Narsimhulu Musini <nmusini@cisco.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-05-19 13:32:55 +02:00
Hiral Patel
67125b0287 [SCSI] fnic: Fnic Statistics Collection
This feature gathers active and cumulative per fnic stats for io,
abort, terminate, reset, vlan discovery path and it also includes
various important stats for debugging issues. It also provided
debugfs and ioctl interface for user to retrieve these stats.
It also provides functionality to reset cumulative stats through
user interface.

Signed-off-by: Hiral Patel <hiralpat@cisco.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-10-25 09:57:57 +01:00
Narsimhulu Musini
441fbd2595 [SCSI] fnic: host reset returns nonzero value(errno) on success
Fixed appropriate error codes that returns negative error number on failure,
and 0 on success. fnic_reset() is used directly by the fc transport callback
issue_fc_host_lip which requires a negative error number on failure.

Signed-off-by: Narsimhulu Musini <nmusini@cisco.com>
Signed-off-by: Hiral Patel <hiralpat@cisco.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-10-25 09:57:56 +01:00
Hiral Patel
fc85799ee3 [SCSI] fnic: fnic Driver Tuneables Exposed through CLI
Introduced module params to provide dynamic way of configuring
queue depth.

Added support to get max io throttle count through UCSM to
configure maximum outstanding IOs supported by fnic and push
that value to scsi mid-layer.

  Supported IO throttle values:

  UCSM IO THROTTLE VALUE        FNIC MAX OUTSTANDING IOS
  ------------------------------------------------------
        16 (Default)                    2048
        <= 256                          256
        > 256                           <ucsm value>

Signed-off-by: Hiral Patel <hiralpat@cisco.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-09-11 15:59:25 -07:00
Sesidhar Beddel
d0385d9265 [SCSI] fnic: Kernel panic while running sh/nosh with max lun cfg
Kernel panics due to NULL lport while executing the log message because
of synchronization issues between libfc and scsi transport fc. Checking
for NULL pointers at the beginning of this routine would resolve the issue
from kernel panic point of view.

Signed-off-by: Sesidhar Baddel <sebaddel@cisco.com>
Signed-off-by: Hiral Patel <hiralpat@cisco.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-09-11 15:56:44 -07:00
Sesidhar Beddel
1259c5dc75 [SCSI] fnic: Hitting BUG_ON(io_req->abts_done) in fnic_rport_exch_reset
Hitting BUG_ON(io_req->abts_done) in fnic_rport_exch_reset in case of
timing issue and also to some extent locking issue where abts and terminate
is happening around same timing.

The code changes are intended to update CMD_STATE(sc) and
io_req->abts_done together.

Signed-off-by: Sesidhar Beddel <sebaddel@cisco.com>
Signed-off-by: Hiral Patel <hiralpat@cisco.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-09-11 15:54:08 -07:00
Suma Ramars
318c7c4325 [SCSI] fnic: Remove QUEUE_FULL handling code
Remove fnic driver QUEUE_FULL handling code instead let SCSI mid layer
handle queue full and use its algorithm to ramp down/up queue

Signed-off-by: Suma Ramars <sramars@cisco.com>
Signed-off-by: Hiral Patel <hiralpat@cisco.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-09-11 15:49:30 -07:00
Dan Carpenter
5d65f91896 [SCSI] fnic: potential dead lock in fnic_is_abts_pending()
There is an unlock missing if the == FNIC_IOREQ_ABTS_PENDING is
false.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Hiral Patel <hiralpat@cisco.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-06-26 17:41:44 -07:00
Hiral Patel
4d7007b49d [SCSI] fnic: Fnic Trace Utility
Fnic Trace utility is a tracing functionality built directly into fnic driver
to trace events. The benefit that trace buffer brings to fnic driver is the
ability to see what it happening inside the fnic driver. It also provides the
capability to trace every IO event inside fnic driver to debug panics, hangs
and potentially IO corruption issues. This feature makes it easy to find
problems in fnic driver and it also helps in tracking down strange bugs in a
more manageable way. Trace buffer is shared across all fnic instances for
this implementation.

Signed-off-by: Hiral Patel <hiralpat@cisco.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-02-22 17:32:07 +00:00
Hiral Patel
14eb5d905d [SCSI] fnic: New debug flags and debug log messages
Added new fnic debug flags for identifying IO state at every stage of IO while
debugging and also added more log messages for better debugging capability.

Signed-off-by: Sesidhar Baddela <sebaddel@cisco.com>
Signed-off-by: Hiral Patel <hiralpat@cisco.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-02-22 17:31:09 +00:00
Hiral Patel
a0bf1ca27b [SCSI] fnic: fnic driver may hit BUG_ON on device reset
The issue was observed when LUN Reset is issued through IOCTL or sg_reset
utility.

fnic driver issues LUN RESET to firmware. On successful completion of device
reset, driver cleans up all the pending IOs that were issued prior to device
reset. These pending IOs are expected to be in ABTS_PENDING state. This works
fine, when the device reset operation resulted from midlayer, but not when
device reset was triggered from IOCTL path as the pending IOs were not in
ABTS_PENDING state. execution path hits panic if the pending IO is not in
ABTS_PENDING state.

Changes:
The fix replaces BUG_ON check in fnic_clean_pending_aborts() with marking
pending IOs as ABTS_PENDING if they were not in ABTS_PENDING state and skips
if they were already in ABTS_PENDING state. An extra check is added to validate
the abort status of the commands after a delay of 2 * E_D_TOV using a
helper function. The helper function returns 1 if it finds any pending IO in
ABTS_PENDING state, belong to the LUN on which device reset was issued else 0.
With this, device reset operation returns success only if the helper funciton
returns 0, otherwise it returns failure.

Other changes:
- Removed code in fnic_clean_pending_aborts() that returns failure if it finds
  io_req NULL, instead of returning failure added code to continue with next io
- Added device reset flags for debugging in fnic_terminate_rport_io,
  fnic_rport_exch_reset, and fnic_clean_pending_aborts

Signed-off-by: Narsimhulu Musini <nmusini@cisco.com>
Signed-off-by: Hiral Patel <hiralpat@cisco.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-02-22 17:30:19 +00:00
Hiral Patel
03298552cb [SCSI] fnic: fixing issues in device and firmware reset code
1. Handling overlapped firmware resets
     This fix serialize multiple firmware resets to avoid situation where fnic
     device fails to come up for link up event, when firmware resets are issued
     back to back. If there are overlapped firmware resets are issued,
     the firmware reset operation checks whether there is any firmware reset in
     progress, if so it polls for its completion in a loop with 100ms delay.

2. Handling device reset timeout
     fnic_device_reset code has been modified to handle Device reset timeout:
     - Issue terminate on device reset timeout.
     - Introduced flags field (one of the scratch fields in scsi_cmnd).
     With this, device reset request would have DEVICE_RESET flag set for other
     routines to determine the type of the request.
     Also modified fnic_terminate_rport_io, fnic_rport_exch_rset, completion
     routines to handle SCSI commands with DEVICE_RESET flag.

3. LUN/Device Reset hangs when issued through IOCTL using utilities like
   sg_reset.
     Each SCSI command is associated with a valid tag, fnic uses this tag to
     retrieve associated scsi command on completion. the LUN/Device Reset issued
     through IOCTL resulting into a SCSI command that is not associated with a
     valid tag. So fnic fails to retrieve associated scsi command on completion,
     which causes hang. This fix allocates tag, associates it with the
     scsi command and frees the tag, when the operation completed.

4. Preventing IOs during firmware reset.
     Current fnic implementation allows IO submissions during firmware reset.
     This fix synchronizes IO submissions and firmware reset operations.
     It ensures that IOs issued to fnic prior to reset will be issued to the
     firmware before firmware reset.

Signed-off-by: Narsimhulu Musini <nmusini@cisco.com>
Signed-off-by: Hiral Patel <hiralpat@cisco.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-02-22 17:28:19 +00:00
Abhijeet Joglekar
0c79c74272 [SCSI] fnic: fix incorrect use of SLAB_CACHE_DMA flag
Driver was incorrectly using the SLAB_CACHE_DMA flag when creating a cache
for SGLs. fnic device does not have 24-bit DMA restrictions. Remove the flag
and allocations from ZONE_DMA.

Thanks to Roland Dreier and David Rientjes for pointing out the bug.

Signed-off-by: Abhijeet Joglekar <abjoglek@cisco.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-06-29 16:05:41 -05:00
Lucas De Marchi
25985edced Fix common misspellings
Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2011-03-31 11:26:23 -03:00
Jeff Garzik
f281233d3e SCSI host lock push-down
Move the mid-layer's ->queuecommand() invocation from being locked
with the host lock to being unlocked to facilitate speeding up the
critical path for drivers who don't need this lock taken anyway.

The patch below presents a simple SCSI host lock push-down as an
equivalent transformation.  No locking or other behavior should change
with this patch.  All existing bugs and locking orders are preserved.

Additionally, add one parameter to queuecommand,
	struct Scsi_Host *
and remove one parameter from queuecommand,
	void (*done)(struct scsi_cmnd *)

Scsi_Host* is a convenient pointer that most host drivers need anyway,
and 'done' is redundant to struct scsi_cmnd->scsi_done.

Minimal code disturbance was attempted with this change.  Most drivers
needed only two one-line modifications for their host lock push-down.

Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Acked-by: James Bottomley <James.Bottomley@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-11-16 13:33:23 -08:00
Roel Kluin
0db6f4353d [SCSI] fnic: fnic_scsi.c: clean up
In fnic_abort_cmd() and fnic_device_reset() assign `rport' earlier to make
FNIC_SCSI_DBG() calls cleaner.

In fnic_clean_pending_aborts() `rport' is not used.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Acked-by: Abhijeet Joglekar <abjoglek@cisco.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-07-27 12:01:51 -05:00
Tejun Heo
5a0e3ad6af include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability.  As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and try to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   wildly available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-30 22:02:32 +09:00
Joe Eykholt
78112e5558 [SCSI] fnic: Add FIP support to the fnic driver
Use libfcoe as a common FIP implementation with fcoe.
FIP or non-FIP mode is fully automatic if the firmware
supports and enables it.

Even if FIP is not supported, this uses libfcoe for the non-FIP
handling of FLOGI and its response.

Use the new lport_set_port_id() notification to capture
successful FLOGI responses and port_id resets.

While transitioning between Ethernet and FC mode, all rx and
tx FC frames are queued.  In Ethernet mode, all frames are
passed to the exchange manager to capture FLOGI responses.

Change to set data_src_addr to the ctl_src_addr whenever it
would have previously been zero because we're not logged in.
This seems safer so we'll never send a frame with a 0 source MAC.
This also eliminates a special case for sending FLOGI frames.

Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-12-04 12:01:19 -06:00
Christof Schmitt
65d430fa99 [SCSI] scsi_transport_fc: Introduce helper function for blocking scsi_eh
Move the duplicated code from FC LLDs to SCSI FC transport class.

Acked-by: James Smart <james.smart@emulex.com>
Acked-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Acked-by: Abhijeet Joglekar <abjoglek@cisco.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-12-04 12:00:52 -06:00
Abhijeet Joglekar
4b53662bd5 [SCSI] fnic: Pad the unused bytes of CDB to 0s
Signed-off-by: Abhijeet Joglekar <abjoglek@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-12-04 12:00:36 -06:00
Roel Kluin
87a2d34b03 [SCSI] fnic: remove redundant BUG_ONs and fix checks on unsigned
The shost sg tablesize is set to FNIC_MAX_SG_DESC_CNT and fnic uses
scsi_dma_map, so both BUG_ONs can be removed.

scsi_dma_map may return -ENOMEM, sg_count should be int to catch that.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-06-25 11:08:23 -05:00
Abhijeet Joglekar
5df6d737dd [SCSI] fnic: Add new Cisco PCI-Express FCoE HBA
fnic is a driver for the Cisco PCI-Express FCoE HBA

Signed-off-by: Abhijeet Joglekar <abjoglek@cisco.com>
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-05-13 22:13:09 -04:00