Commit Graph

1251 Commits

Author SHA1 Message Date
Dick Kennedy
e7981a2c72 scsi: lpfc: Fix oops if nvmet_fc_register_targetport fails
if nvmet targetport registration fails, the driver encounters a NULL
pointer oops in lpfc_hb_timeout_handler.

To fix: if registration fails, ensure nvmet_support is cleared on the
port structure.

Also enhanced the log message on failure.

Cc: <stable@vger.kernel.org> # 4.12+
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02 22:46:38 -04:00
Dick Kennedy
cf4c8c8610 scsi: lpfc: Revise NVME module parameter descriptions for better clarity
The descriptions for lpfc_xri_split and lpfc_enable_fc4_type were
poor. Revise for better understanding:

  lpfc_xri_split - Percentage of FCP XRI resources versus NVME
  lpfc_enable_fc4_type - Enable FC4 Protocol support - FCP / NVME

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02 22:46:38 -04:00
James Smart
c578f6f4b9 scsi: lpfc: Set missing abort context
Always set ctxp->state to LPFC_NVMET_STE_ABORT if ABORT op gets called

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02 22:46:37 -04:00
James Smart
e3246a123d scsi: lpfc: Reduce log spew on controller reconnects
There are several log messages that report abnormal terminations that by
default are marked warn. These are typically the result of failures due
to invalid controller state or abort completions. They are all natural
when a controller resets.

Unfortunately, as they are logged by default, it makes the admin very
concerned.

Convert the messages to Info.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02 22:46:37 -04:00
Dick Kennedy
8e036a9497 scsi: lpfc: Fix FCP hba_wqidx assignment
The driver is encountering  oops in lpfc_sli_calc_ring.

The driver is setting hba_wqidx for FCP based on the policy in use for
NVME. The two may not be the same.  Change to set the wqidx based on the
FCP policy.

Cc: <stable@vger.kernel.org> # 4.12+
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02 22:46:36 -04:00
Dick Kennedy
f485c18db2 scsi: lpfc: Move CQ processing to a soft IRQ
Under heavy target nvme load duration, the lpfc irq handler is
encountering cpu lockup warnings.

Convert the driver to a shortened ISR handler which identifies the
interrupting condition then schedules a workq thread to process the
completion queue the interrupt was for. This moves all the real work
into the workq element.

As nvmet_fc upcalls are no longer in ISR context, don't set the feature
flags

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02 22:46:36 -04:00
Dick Kennedy
c8a4ce0bf3 scsi: lpfc: Make ktime sampling more accurate
Need to make ktime samples more accurate

If ktime is turned on in the middle of an IO, the max calculation could
be misleading. Base sampling on the start time of the IO as opposed to
ktime_on.

Make ISR ktime timestamps be from when CQE is read instead of EQE.
Added additional sanity checks when deciding whether to accept an IO
sample or not.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02 22:46:35 -04:00
Dick Kennedy
e8bcf0ae4c scsi: lpfc: PLOGI failures during NPIV testing
Local Reject/Invalid RPI errors seen during discovery.

Temporary RPI cleanup was occurring regardless of SLI rev. It's only
necessary on SLI-4.

Adjust the test for whether cleanup is necessary.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02 22:46:35 -04:00
Dick Kennedy
2299e4323d scsi: lpfc: Fix warning messages when NVME_TARGET_FC not defined
Warning messages when NVME_TARGET_FC not defined on ppc builds

The lpfc_nvmet_replenish_context() function is only meaningful when NVME
target mode enabled. Surround the function body with ifdefs for target
mode enablement.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02 22:46:34 -04:00
Dick Kennedy
2b75d0f934 scsi: lpfc: Fix lpfc nvme host rejecting IO with Not Ready message
In a link bounce scenario, a condition can occur where the discovery
engine swaps an ndlp structure (address change for an nport). While the
swap was successfully executed by the discovery engine, the driver did
not properly detect a change in the ndlp bound to the nvme rport.  This
error resulted in the nvme host transport issuing an IO to the correct
nvme rport, but the lpfc driver addressed a ndlp with an NLP_UNUSED
status and failed the io. This resulting it it looking like there were
missing namespaces and applications failed due to io errors.

To fix, in lpfc_nvme_register_rport, rework the "rebind" case to break
the nvme rport<->ndlp association when the ndlp already has an
nrport. Then rebind the rport to the correct ndlp data and backpointers.

[mkp: typo]

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02 22:46:34 -04:00
Dick Kennedy
1234a6d54f scsi: lpfc: Fix crash receiving ELS while detaching driver
The driver crashes when attempting to use a freed ndpl pointer.

The pci_remove_one handler runs on a separate kernel thread. The order
of the removal is starting by freeing all of the ndlps and then
disabling interrupts. In between these two events the driver can still
receive an ELS and process it. When it tries to use the ndlp pointer
will be NULL

Change the order of the pci_remove_one vs disable interrupts so that
interrupts are disabled before the ndlp's are freed.

Cc: <stable@vger.kernel.org> # 4.12+
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02 22:46:33 -04:00
Dick Kennedy
401bb4169d scsi: lpfc: fix pci hot plug crash in list_add call
During pci hot plug, the kernel crashes in a list_add_call

The lookup by tag function will return null if the IOCB is out of range
or does not have the on txcmplq flag set.

Fix: Check for null return from lookup by tag.

Cc: <stable@vger.kernel.org> # 4.12+
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02 22:46:32 -04:00
Dick Kennedy
1901762f2c scsi: lpfc: fix pci hot plug crash in timer management routines
During pci hot plug, the kernel crashes in timer management code.

The sli4 remove_one handler is not stoping the timers as it starts to
remove the port so that it can be swapped.

Fix: Stop the timers early in the handler routine.

Note: Fix in SLI-4 only. SLI-3 already stopped the timers properly.

Cc: <stable@vger.kernel.org> # 4.12+
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02 22:46:32 -04:00
Linus Torvalds
0b33ce72ea SCSI fixes on 20170930
Eight mostly minor fixes for recently discovered issues in drivers.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJZz2OGAAoJEAVr7HOZEZN4NTQQAL1avebL1Abau5ODtGTciJrZ
 I4QqUuJ0PGo9AKWIA/s34cY7kdPt2qhIoUJY0bLKdv8QtR5MDqfJ6c+2ianKenWm
 EkDKYBQ2csBqzH9VN0tEIPQhvLr7oxvCgEDjfvusxWoUX4AgSv2bMaxIpgs4GUxS
 hOH4fg/6A0HuhP/HjtYfd89DBdhhKOPU6oRx6YLF4ctlfZxAcALsvVjNAGaJzZX8
 Bpv2K/xWOwb/UghjJvxv1wYZ+AL0BHAFVZilBFpyX+ClRhmU9d9SVYs43CbwSu+Y
 41qIAYLJXrls6CSWJMTEo/+mhbPMQBS6Q3wSewOaNZXkx4comjJHpx/k2HsiVk7K
 ObN9eIXNzyby132pIIHc9ZRSpJRTr0/jjb9FetneZlLHaNabtIf67JA0j4fIoCEh
 Qb73uR030mCNvQ+xvK8DxF9UE1zQXnXiJWoIH9NrGtFtknQOmFlfFfmo/gMDI8w6
 aYkxuHdqn2oFjKDuZOQ4zYa8ptcZAAml64gj2YiNiwgNosEpt9UMJ5WRknrGJ7po
 jHohzKyKkSykjOxgOL1Noh5d7AoWPtJ1Qbg+Leyg1WLnRGHTgAYYmv1LabOdMhbM
 SIMhNRBFQunCBKTyes/RB2Nykl1OXipnxIaRgSrRhsEiTOVutmyy7jr6BbZSB2vY
 XBpadk38SGSsZva+Sfk0
 =v2G5
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Eight mostly minor fixes for recently discovered issues in drivers"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: ILLEGAL REQUEST + ASC==27 => target failure
  scsi: aacraid: Add a small delay after IOP reset
  scsi: scsi_transport_fc: Also check for NOTPRESENT in fc_remote_port_add()
  scsi: scsi_transport_fc: set scsi_target_id upon rescan
  scsi: scsi_transport_iscsi: fix the issue that iscsi_if_rx doesn't parse nlmsg properly
  scsi: aacraid: error: testing array offset 'bus' after use
  scsi: lpfc: Don't return internal MBXERR_ERROR code from probe function
  scsi: aacraid: Fix 2T+ drives on SmartIOC-2000
2017-09-30 12:50:56 -07:00
Thomas Meyer
df2f7729f2 scsi: lpfc: Cocci spatch "pool_zalloc-simple"
Use *_pool_zalloc rather than *_pool_alloc followed by memset with 0.
Found by coccinelle spatch "api/alloc/pool_zalloc-simple.cocci"

Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-09-25 19:05:38 -04:00
James Smart
8e009ce846 lpfc: remove use of FC-specific error codes
The lpfc driver uses the FC-specific error when it needed to return an
error to the FC-NVME transport. Convert to use a generic value instead.

Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-09-25 08:56:05 -06:00
Stefano Brivio
5c756065e4 scsi: lpfc: Don't return internal MBXERR_ERROR code from probe function
Internal error codes happen to be positive, thus the PCI driver core
won't treat them as failure, but we do. This would cause a crash later
on as lpfc_pci_remove_one() is called (e.g. as shutdown function).

Fixes: 6d368e5321 ("[SCSI] lpfc 8.3.24: Add resource extent support")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Acked-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-09-15 21:15:32 -04:00
Colin Ian King
858e51e8cb scsi: lpfc: remove redundant null check on eqe
The pointer eqe is always non-null inside the while loop, so the check
to see if eqe is NULL is redudant and hence can be removed.

Detected by CoverityScan CID#1248693 ("Logically Dead Code")

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-09-15 15:39:11 -04:00
Linus Torvalds
572c01ba19 SCSI misc on 20170907
This is mostly updates of the usual suspects: lpfc, qla2xxx, hisi_sas, megaraid_sas, zfcp and a host of minor updates.
 
 The major driver change here is the elimination of the block based
 cciss driver in favour of the SCSI based hpsa driver (which now drives
 all the legacy cases cciss used to be required for).  Plus a reset
 handler clean up and the redo of the SAS SMP handler to use bsg lib.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJZscDNAAoJEAVr7HOZEZN4DWIQAK/UkkrvKpV/jLATM/yi7CoL
 QidY86Hmwwl7A9HQ+2fjLfAsye0xcCzRwkucKK90IP5b4pefHhiJJfiMKAAe3TUW
 xstnY5z5jaOhDG4nyJFoSm5fH5qXkMnJ8NZRK8f6Qg5yBN5dStEKqoBboNsz4KBI
 md7idw0mbp5i2GXlJwSpc5eDS97GiPL6WkwgGaGKfXF1NDau0GbEdjijfz55haCD
 pMhY7WJh/71RfOq/1ThXT1Z3khOlVcKXrkdO+602n7zh/klRBRtBC8m2a6xCfZPj
 n7Pb/s0jhCQPd+e/Xtv7WEbY8uNOCrGoVgZ6U5EGrT5IeTfep24ackYqerjMhE63
 esi4BJY8lUP9SGleLMgjYWyCHdmxBJRa7UI614DWN/H0QoGP6j/2EzGoi5Fw04vC
 H8/+aqPPWZc9KUBioRYo8xWO8YgMqL2eyXY+Tc9cwxqAe2T6k/NC1zJVgDFKXfzb
 QoWW4v9NNmYwf5vL/7tNgkeTMFQV66yUR7dR3SGTSk8UIrJ40ok0JyUAsDg86ZAH
 BfMkWwhWQ6Byoel0Y7Ti88T49Cox/64r/I0ux06Qgg99+KpRLT7z20+GLIEHgXxg
 116C39rgvYKqzc7W8RCyj8qSROuMVzg6QFbB6n+1PEsYIX2O8A2Re3jdS34q2LbX
 aBDm/Lfdl4kkJrV9xY6P
 =nQUG
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This is mostly updates of the usual suspects: lpfc, qla2xxx, hisi_sas,
  megaraid_sas, zfcp and a host of minor updates.

  The major driver change here is the elimination of the block based
  cciss driver in favour of the SCSI based hpsa driver (which now drives
  all the legacy cases cciss used to be required for). Plus a reset
  handler clean up and the redo of the SAS SMP handler to use bsg lib"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (279 commits)
  scsi: scsi-mq: Always unprepare before requeuing a request
  scsi: Show .retries and .jiffies_at_alloc in debugfs
  scsi: Improve requeuing behavior
  scsi: Call scsi_initialize_rq() for filesystem requests
  scsi: qla2xxx: Reset the logo flag, after target re-login.
  scsi: qla2xxx: Fix slow mem alloc behind lock
  scsi: qla2xxx: Clear fc4f_nvme flag
  scsi: qla2xxx: add missing includes for qla_isr
  scsi: qla2xxx: Fix an integer overflow in sysfs code
  scsi: aacraid: report -ENOMEM to upper layer from aac_convert_sgraw2()
  scsi: aacraid: get rid of one level of indentation
  scsi: aacraid: fix indentation errors
  scsi: storvsc: fix memory leak on ring buffer busy
  scsi: scsi_transport_sas: switch to bsg-lib for SMP passthrough
  scsi: smartpqi: remove the smp_handler stub
  scsi: hpsa: remove the smp_handler stub
  scsi: bsg-lib: pass the release callback through bsg_setup_queue
  scsi: Rework handling of scsi_device.vpd_pg8[03]
  scsi: Rework the code for caching Vital Product Data (VPD)
  scsi: rcu: Introduce rcu_swap_protected()
  ...
2017-09-07 21:11:05 -07:00
Arnd Bergmann
5fe5a6c9ac scsi: lpfc: avoid false-positive gcc-8 warning
This is an interesting regression with gcc-8, showing a harmless warning
for correct code:

In file included from include/linux/kernel.h:13:0,
                 ...
                 from drivers/scsi/lpfc/lpfc_debugfs.c:23:
include/linux/printk.h:301:2: error: 'eq' may be used uninitialized in this function [-Werror=maybe-uninitialized]
  printk(KERN_ERR pr_fmt(fmt), ##__VA_ARGS__)
  ^~~~~~
In file included from drivers/scsi/lpfc/lpfc_debugfs.c:58:0:
drivers/scsi/lpfc/lpfc_debugfs.h:451:31: note: 'eq' was declared here

I managed to reduce the warning into a small test case for gcc-8 that I
reported in the gcc bugzilla[1].

As a workaround, this changes the logic to move the two assignments of
'eq' out of the conditions and instead make the index conditional.  This
works for all configurations I tried and avoids adding a bogus
initialization.

Acked-by: James Smart <james.smart@broadcom.com>
Link: [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81958
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-25 18:26:52 -04:00
Arnd Bergmann
7362617319 scsi: lpfc: avoid an unused function warning
The only reference to lpfc_nvmet_replenish_context() is inside of an
disabled:

drivers/scsi/lpfc/lpfc_nvmet.c:1457:1: error: 'lpfc_nvmet_replenish_context' defined but not used [-Werror=unused-function]

This replaces the preprocessor conditional with a C condition, so the
compiler can see that the function is intentionally unused.

Fixes: 9a38e4f1c82f ("scsi: lpfc: Fix MRQ > 1 context list handling")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-25 18:26:28 -04:00
Dick Kennedy
610448367c scsi: lpfc: lpfc version bump 11.4.0.3
Update driver version to 11.4.0.3

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:44 -04:00
Maurizio Lombardi
286871a666 scsi: lpfc: fix "integer constant too large" error on 32bit archs
cc1: warnings being treated as errors
drivers/scsi/lpfc/lpfc_init.c: In function 'lpfc_get_wwpn':
drivers/scsi/lpfc/lpfc_init.c:3253: error: integer constant is too large for 'long' type

Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:44 -04:00
James Smart
44fd7fe3dd scsi: lpfc: Add Buffer to Buffer credit recovery support
Add Buffer to buffer credit recovery support to the driver.  This is a
negotiated feature with the peer that allows for both sides to detect
dropped RRDY's and FC Frames and recover credit.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:43 -04:00
James Smart
d58734f05f scsi: lpfc: remove console log clutter
Change hw queue binding messages to info - not error.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:43 -04:00
Dick Kennedy
6b486ce9ee scsi: lpfc: Fix bad sgl reposting after 2nd adapter reset
Port issue was fixed, the hbacmd reset would take more than 8 minutes to
complete.

There were conflicting NVME SGL posting/reposting responsibilities
between lpfc_online()/lpfc_sli4_hba_setup() and
lpfc_nvme_create_localport().  The lpfc_online() causes a REPOST on
existing NVME SGLs which is not released during the fc port reset.
However, lpfc_nvme_create_localport() wants to allocate new NVME buffers
and post them. Both cancelled out each other which had a side effect of
hosing the mailbox handling that was used to remove the sgl lists -
causing multiple 60s mbx timeouts.

Fix by preserving all SGL lists over the fc port reset.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:42 -04:00
Dick Kennedy
a145fda381 scsi: lpfc: Fix nvme target failure after 2nd adapter reset
The nonrecovery occurred because the lpfc nvme initiator function did
not reestablish its localport creation with the nvme host transport in
lpfc_oneline.  Because of that, an NVME rport binding could not take
place.

Corrected by recreating the localport in the adapter reset recovery
routine.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:42 -04:00
Dick Kennedy
c6e0c92506 scsi: lpfc: Fix relative offset error on large nvmet target ios
If the nvmet_fc transport breaks an io into multiple sequences, the
driver will improperly set the relative offset on the 2nd through N
sequences.

Correct by properly formatting the hw cmd so the relative offset is
picked up from the hw cmd.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:41 -04:00
Dick Kennedy
66d7ce93a0 scsi: lpfc: Fix MRQ > 1 context list handling
Various oops including cpu LOCKUPs were seen.

For asynchronously received ius where the driver must assign exchange
resources, the resources were on a single get (free) list and put list
(finished, waiting to be put on get list). As all cpus are sharing the
lists, an interrupt for a receive frame may have to wait for all the
other cpus to place their done work onto the put list before it can
acquire the lock to pull from the list.

Fix by breaking the resource lists into per-cpu lists or at least more
than 1 list with cpu's sharing the lists). A cpu would allocate from the
free list for its own cpu, and put its done work on the its own put list
- avoiding the contention. As cpu load may vary, when empty, a cpu may
grab from another cpu, thereby changing resource distribution.  But
searching for a resource only occurs on 1 or a few cpus until a single
resource can be allocated. if the condition reoccurs, it starts looking
at a different cpu.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:41 -04:00
Dick Kennedy
e3e2863def scsi: lpfc: Limit amount of work processed in IRQ
Various oops being seen on being in the ISR too long and cpu lockups,
when under heavy load.

The amount of work being posted off of completion queues kept the ISR
running almost all the time

Correct the issue by limiting the amount of work per iteration.

[mkp: typo]

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:40 -04:00
Dick Kennedy
176de5bb20 scsi: lpfc: Correct issues with FAWWN and FDISCs
When using fabric-assigned WWNs, the switch doesn't like copy of the
FLOGI payload, which includes valid VVL bits, to be used as the FDISC
payload.

Rather than wait for corrected switch firmware, ensure the VVL bits are
marked invalid on FDISCs.

[mkp: typo]

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:39 -04:00
Dick Kennedy
991f0c0e33 scsi: lpfc: Fix NVME PRLI handling during RSCN
A race condition was found whereby the initiator would receive the RSCN
for a new NVME device before it had a chance to register its FC4 support
with the fabric. Thus, when queried by the initiator, it would see that
the target supported FC-NVME.

Corrected by making the assumption that the target always supports
FC-NVME thus a PRLI is sent. It's ok for the target to reject it.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:39 -04:00
Dick Kennedy
4b40d02b8b scsi: lpfc: Fix crash in lpfc nvmet when fc port is reset
In adapter reset tests, an oops was seen with a NULL pointer in
lpfc_free_rq_buffer+0x20/0x60

The driver is failing to properly repost the nvmet sgl list when
recovering from the reset. Thus the driver eventually trys to walk an
errant buffer list.

Corrected the sgl buffer recovery as well as strengthening the
initialization of the bufferlist.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:38 -04:00
Dick Kennedy
4adc041b4d scsi: lpfc: Fix duplicate NVME rport entries and namespaces.
After lip, the driver sometimes would have two rports for the same
device, allowing the namespaces to be duplicated by nvme.

In lpfc_plogi_confirm_nport() the driver was not swapping the nrport
maintained by the ndlp's undergoing address swapping. This allowed the
2nd rport to sneak in as it was considered a separate device.

This patch adds the fixes to Swap the nrport in each ndlp and take care
of the reference counts on the ndlps similar to FCP rports.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:38 -04:00
Dick Kennedy
8db1c2b3e7 scsi: lpfc: Fix handling of FCP and NVME FC4 types in Pt2Pt topology
After link bounce in a NVME Pt2Pt config, the driver managed to map the
same nport twice, resulting in multiple device nodes for the same
namespace.

In Pt2Pt, the driver must send PRLI's for both (scsi) FCP and NVME
rather than using fabric aids. The driver was inconsistent on handling
various PRLI completions, especially rejects, which had reject codes
cross the different protocol PRLI completions.

Fixed to perform the following: if nvmet mode (fc port can only be a
nvme target) - rejects all unsolicitly FCP PRLI's. Never issues a FCP
PRLI.

The multiple protocol PRLI's are sent simultaneously. However, driver
will now only state transition after both PRLI's are complete. New flags
were added to aid tracking the responses from the different PRLI's.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:37 -04:00
Dick Kennedy
cd22d6057c scsi: lpfc: Correct return error codes to align with nvme_fc transport
Modify driver return error codes to align with host nvme transport.

Driver isn't returning Exxx error codes to properly reflect out of
resource or connectivity conditions (-EBUSY), yet there were hard error
conditions returning -EBUSY.

Ensure the following situations return the proper return code:

 - Temporary failures or temporary resource availability: -EBUSY

 - Connectivity issues: -ENODEV

All others are treated as hard errors and return an -Exxx value that
indicates the type of error.

Also, lpfc_sli4_issue_wqe() was modified to not translate error from
-Exxx to WQE state.  This allows lpfc_nvme_fcp_io_submit() routine to
just return whatever -E value was returned from other routines.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:37 -04:00
Dick Kennedy
ffb70cd6b6 scsi: lpfc: convert info messages to standard messages
Transitioned some informational discovery messages to now always be
displayed when log_verbose is set.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:36 -04:00
Dick Kennedy
bb6a8a2c24 scsi: lpfc: Fix oops when NVME Target is discovered in a nonNVME environment
lpfc oops when it discovers a NVME target but is configured for SCSI
only operation. Oops is in lpfc_nvme_register_port+0x33/0x300.

The localport is not valid so it should not have been referenced.

Added validity check for localport

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:36 -04:00
Dick Kennedy
d2aa48761e scsi: lpfc: Fix rediscovery on switch blade pull
When the switch blade is pulled out then plugged back in, the driver
does not issue a PLOGI to the target

When the switch blade is pulled out, it does not reset the link. The
driver ends up issuing a LOGO to the target, and finally sees devloss.
Since the driver believes that a LOGO is outstanding, it does not issue
a PLOGI to the target upon link up

Correct by placing the ndlp in UNUSED state When devloss happens in
LOGO_ISSUE state.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:35 -04:00
Dick Kennedy
2877cbffb7 scsi: lpfc: Fix loop mode target discovery
The driver does not discover targets when in loop mode.

The NLP type is correctly getting set when a fabric connection is
detected but, not for loop. The unknown NLP type means that the driver
does not issue a PRLI when in loop topology. Thus target discovery
fails.

Fix by checking the topology during discovery.  If it is loop, set the
NLP FC4 type to FCP.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:35 -04:00
Dick Kennedy
1fe68477d2 scsi: lpfc: Fix plogi collision that causes illegal state transition
Message "0271 Illegal State Transition: node" seen in logs, all luns are
unuseable for that target.

A window exists in the rcv_plogi path where if the state is plogi issue
but the driver has not issued a plogi, then two reglogins will be sent
for the same RPI. The first one to complete will advance the state to
prli issue the second one will be detected as an illegal state, and
leave the node in an unusable state.

Correct the completion routine for the PLOGI ACC that detects the state
change when the driver starts discovery on the node again and drop the
REGLOGIN mailbox command.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:34 -04:00
Gustavo A. R. Silva
44ed33e6c5 scsi: lpfc: remove useless code in lpfc_sli4_bsg_link_diag_test
Remove variable assignments. The value stored in local variable _rc_ is
overwritten at line 2448:rc = lpfc_sli4_bsg_set_link_diag_state(phba,
0); before it can be used.

Addresses-Coverity-ID: 1226935
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:09 -04:00
James Smart
5073842093 lpfc: support nvmet_fc defer_rcv callback
Currently, calls to nvmet_fc_rcv_fcp_req() always copied the
FC-NVME cmd iu to a temporary buffer before returning, allowing
the driver to immediately repost the buffer to the hardware.

To address timing conditions on queue element structures vs async
command reception, the nvmet_fc transport occasionally may need to
hold on to the command iu buffer for a short period. In these cases,
the nvmet_fc_rcv_fcp_req() will return a special return code
(-EOVERFLOW). In these cases, the LLDD must delay until the new
defer_rcv lldd callback is called before recycling the buffer back
to the hw.

This patch adds support for the new nvmet_fc transport defer_rcv
callback and recognition of the new error code when passing commands
to the transport.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-08-10 11:19:05 +02:00
Romain Perier
771db5c0e3 scsi: lpfc: Replace PCI pool old API
The PCI pool API is deprecated. This commit replaces the PCI pool old
API by the appropriate function with the DMA pool API. It also updates
some comments, accordingly.

Signed-off-by: Romain Perier <romain.perier@collabora.com>
Reviewed-by: Peter Senna Tschudin <peter.senna@collabora.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-07 14:04:01 -04:00
Linus Torvalds
130568d5ea Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull more block updates from Jens Axboe:
 "This is a followup for block changes, that didn't make the initial
  pull request. It's a bit of a mixed bag, this contains:

   - A followup pull request from Sagi for NVMe. Outside of fixups for
     NVMe, it also includes a series for ensuring that we properly
     quiesce hardware queues when browsing live tags.

   - Set of integrity fixes from Dmitry (mostly), fixing various issues
     for folks using DIF/DIX.

   - Fix for a bug introduced in cciss, with the req init changes. From
     Christoph.

   - Fix for a bug in BFQ, from Paolo.

   - Two followup fixes for lightnvm/pblk from Javier.

   - Depth fix from Ming for blk-mq-sched.

   - Also from Ming, performance fix for mtip32xx that was introduced
     with the dynamic initialization of commands"

* 'for-linus' of git://git.kernel.dk/linux-block: (44 commits)
  block: call bio_uninit in bio_endio
  nvmet: avoid unneeded assignment of submit_bio return value
  nvme-pci: add module parameter for io queue depth
  nvme-pci: compile warnings in nvme_alloc_host_mem()
  nvmet_fc: Accept variable pad lengths on Create Association LS
  nvme_fc/nvmet_fc: revise Create Association descriptor length
  lightnvm: pblk: remove unnecessary checks
  lightnvm: pblk: control I/O flow also on tear down
  cciss: initialize struct scsi_req
  null_blk: fix error flow for shared tags during module_init
  block: Fix __blkdev_issue_zeroout loop
  nvme-rdma: unconditionally recycle the request mr
  nvme: split nvme_uninit_ctrl into stop and uninit
  virtio_blk: quiesce/unquiesce live IO when entering PM states
  mtip32xx: quiesce request queues to make sure no submissions are inflight
  nbd: quiesce request queues to make sure no submissions are inflight
  nvme: kick requeue list when requeueing a request instead of when starting the queues
  nvme-pci: quiesce/unquiesce admin_q instead of start/stop its hw queues
  nvme-loop: quiesce/unquiesce admin_q instead of start/stop its hw queues
  nvme-fc: quiesce/unquiesce admin_q instead of start/stop its hw queues
  ...
2017-07-11 15:36:52 -07:00
Linus Torvalds
9031114841 SCSI misc on 20170704
This is mostly updates of the usual suspects: lpfc, qla2xxx, bnx2fc,
 qedf, hpsa, hisi_sas, smartpqi, cxlflash, aacraid, csiostor along with
 a host of minor and miscellaneous changes.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJZW7JMAAoJEAVr7HOZEZN4IM4P/AqBtvH+6Lo1Eb+3A/HnHskK
 hIxVxgBxaw3fhW5AegDfVCvrdVTTEkCB/g5CIKN8NCWEx6OGmCX0Lu6lnjld9BOZ
 cTlPtzNwFGlgHrz34LwCc3vlc5ovMpTQBrpGAQpGGWoAZIP+c3ilEihIYTEMNCsN
 dmjI71AigDE5g6X1OT361IJ1gydkjfG41IcRe/jlMtEgRNdy3B2JVIdATL89Pw4b
 0uZO3uUTn8EGEKUdyJZCNpie7sGZv8u2LhA+Znby2C4h3bwWNV/d0p7ped4xrQY5
 yVpZEUbYVdcOOYBgeBJlfwOhvjRQTdxeK4d7W9XTb+AQf30F3DgSepdMCdf3BjVt
 KnQvBOTxyidB8xsCL46wlxxNew3qoUtaKoY88WUOOnnJwU5U7hlRtPkf/eO2i5QF
 +k7fxUpFfkBTS4I2gXnyGWpmSoxwJerd0knojSOjrjJcAlcgM65+pocUAea/0Dpr
 SsfL2sTb12gk6bkF9UlRv8/4aSsWYb92WW1nbTt2nFRXncPNN5Qzc3lGj//36O+b
 2bka+aSKVAFoNAnQ1pUE8EJxSboy5q7y4509iZzO/Fom+pVuzBROm5fmrpcOE5dP
 IjW7gqSFB6578tnNiK049rrrPja5wkUa+Ptc8s0FjPAVyIDrp2RN+f2nljOBBhW8
 3Z1nXMG0eFqvb5taLtfZ
 =D9QX
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This is mostly updates of the usual suspects: lpfc, qla2xxx, bnx2fc,
  qedf, hpsa, hisi_sas, smartpqi, cxlflash, aacraid, csiostor along with
  a host of minor and miscellaneous changes"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (276 commits)
  qla2xxx: Fix NVMe entry_type for iocb packet on BE system
  scsi: qla2xxx: avoid unused-function warning
  scsi: snic: fix a couple of spelling mistakes/typos
  scsi: qla2xxx: fix a bunch of typos and spelling mistakes
  scsi: lpfc: don't double count abort errors
  scsi: lpfc: spin_lock_irq() is not nestable
  scsi: hisi_sas: optimise DMA slot memory
  scsi: ibmvfc: constify dev_pm_ops structures.
  scsi: ibmvscsi: constify dev_pm_ops structures.
  scsi: cxlflash: Update debug prints in reset handlers
  scsi: cxlflash: Update send_tmf() parameters
  scsi: cxlflash: Avoid double free of character device
  scsi: Add STARGET_CREATED_REMOVE state to scsi_target_state
  scsi: ses: do not add a device to an enclosure if enclosure_add_links() fails.
  scsi: ufs: flush eh_work when eh_work scheduled.
  scsi: qla2xxx: Protect access to qpair members with qpair->qp_lock
  scsi: sun_esp: fix device reference leaks
  scsi: fnic: changing queue command to return result DID_IMM_RETRY when rport is init
  scsi: fnic: correct speed display and add support for 25,40 and 100G
  scsi: fnic: added timestamp reporting in fnic debug stats
  ...
2017-07-06 12:10:33 -07:00
Linus Torvalds
3bad2f1c67 Merge branch 'work.misc-set_fs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull misc user access cleanups from Al Viro:
 "The first pile is assorted getting rid of cargo-culted access_ok(),
  cargo-culted set_fs() and field-by-field copyouts.

  The same description applies to a lot of stuff in other branches -
  this is just the stuff that didn't fit into a more specific topical
  branch"

* 'work.misc-set_fs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  Switch flock copyin/copyout primitives to copy_{from,to}_user()
  fs/fcntl: return -ESRCH in f_setown when pid/pgid can't be found
  fs/fcntl: f_setown, avoid undefined behaviour
  fs/fcntl: f_setown, allow returning error
  lpfc debugfs: get rid of pointless access_ok()
  adb: get rid of pointless access_ok()
  isdn: get rid of pointless access_ok()
  compat statfs: switch to copy_to_user()
  fs/locks: don't mess with the address limit in compat_fcntl64
  nfsd_readlink(): switch to vfs_get_link()
  drbd: ->sendpage() never needed set_fs()
  fs/locks: pass kernel struct flock to fcntl_getlk/setlk
  fs: locks: Fix some troubles at kernel-doc comments
2017-07-05 13:13:32 -07:00
Dmitry Monakhov
128b6f9fdd t10-pi: Move opencoded contants to common header
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-07-03 16:56:25 -06:00
Dan Carpenter
88e9617420 scsi: lpfc: don't double count abort errors
If lpfc_nvmet_unsol_fcp_issue_abort() fails then we accidentally
increment "tgtp->xmt_abort_rsp_error" and then two lines later we
increment it a second time.

Fixes: 547077a44b ("scsi: lpfc: Adding additional stats counters for nvme.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-07-01 17:09:11 -04:00
Dan Carpenter
c4031db72b scsi: lpfc: spin_lock_irq() is not nestable
We're calling spin_lock_irq() multiple times, the problem is that on the
first spin_unlock_irq() then we will re-enable IRQs and we don't want
that.

Fixes: 966bb5b711 ("scsi: lpfc: Break up IO ctx list into a separate get and put list")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-07-01 17:08:41 -04:00
Al Viro
ca1579f6c6 Merge remote-tracking branch 'jl/locks-4.13' into work.misc-set_fs 2017-06-26 23:52:33 -04:00
James Smart
cb45e5295a scsi: lpfc: fix refcount error on node list
A change in remote port removal introduced a spurious put which can
cause a premature structure teardown. The affects were most notable when
the driver attempted to unload as a null pointer would be hit.

Fix by removing the unnecessary put.

Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-26 15:01:01 -04:00
James Smart
00cefeb964 scsi: lpfc: Fix nvme io stoppage after link bounce
On link down, transport is calling driver to abort outstanding ios.
Driver erroneously rejects the abort if the port indicates it isn't
logged in - which will be the case after the link down. Thus, the io
can't clean up. This prevents reconnection at the transport level.

Fix by allowing abort to proceed.

Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-26 15:01:00 -04:00
James Smart
b609b1c753 scsi: lpfc: update to revision to 11.4.0.1
Set lpfc driver revision to 11.4.0.1

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-19 21:41:12 -04:00
James Smart
d6564e5260 scsi: lpfc: Driver responds LS_RJT to Beacon Off ELS - Linux
Beacon OFF from switch is rejected by driver.

Driver fails Beacon OFF if frequency is set to 0. As per fc-ls spec,
status, capability, frequency and duration fields are only applicable
for Beacon ON.

Remove frequency and type checks. Reject Beacon ON if duration is non
zero.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-19 21:41:03 -04:00
James Smart
4550f9c75e scsi: lpfc: Fix crash in lpfc_sli_ringtxcmpl_put when nvmet gets an abort request.
When running nvme detach-ns /dev/nvme0n1 -n 1 command, the nvmet lpfc
driver crashes with this stack dump:

kernel BUG at /root/NVME/lpfc_8.4/lpfc_sli.c:1393!
invalid opcode: 0000 [#1] SMP
Workqueue: nvmet-fc-cpu0 nvmet_fc_do_work_on_cpu [nvmet_fc]
 lpfc_sli4_issue_wqe+0x357/0x440 [lpfc]
 lpfc_nvmet_xmt_fcp_abort+0x36b/0x5c0 [lpfc]
 nvmet_fc_abort_op+0x30/0x50 [nvmet_fc]
 nvmet_fc_do_work_on_cpu+0xd9/0x130 [nvmet_fc]
 process_one_work+0x14e/0x410
 worker_thread+0x116/0x490
 kthread+0xc7/0xe0
 ret_from_fork+0x3f/0x70

Crash is due to an uninitialized iocbq->vport pointer.

Explicitly set the iocbq->vport field to phba->pport in
lpfc_nvmet_sol_fcp_issue_abort as it does all abort iocbq initialization
in the routine.  Using phba->pport is ok because target does not support
NPIV instances.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-19 21:40:53 -04:00
James Smart
11e644e2a2 scsi: lpfc: Fix crash doing IO with resets
During every reset, IOCBs are allocated. So, at one point, number of
allocated IOCBs reaches maximum limit and lpfc_sli_next_iotag fails.

Allocate IOCBs only during initialization. Reuse them after every reset
instead of allocating new set of IOCBs.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-19 21:40:44 -04:00
James Smart
569dbe84a3 scsi: lpfc: Fix crash after firmware flash when IO is running.
OS crashes after the completion of firmware download.

Failure in posting SCSI SGL buffers because number of SGL buffers is
less than total count. Some of the pending IOs are not completed by
driver. SGL buffers for these IOs are not added back to the list.
Pending IOs are not completed because lpfc_wq_list list is initialized
before completion of pending IOs.

Postpone lpfc_wq_list reinitialization by moving
lpfc_sli4_queue_destroy() after lpfc_hba_down_post().

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-19 21:40:33 -04:00
James Smart
09559e8112 scsi: lpfc: Fix SLI3 drivers attempting NVME ELS commands.
In a server with an 8G adapter and a 32G adapter, running NVME and FCP,
the server would crash with the following stack.

RIP: 0010: ... lpfc_nvme_register_port+0x38/0x420 [lpfc]
 lpfc_nlp_state_cleanup+0x154/0x4f0 [lpfc]
 lpfc_nlp_set_state+0x9d/0x1a0 [lpfc]
 lpfc_cmpl_prli_prli_issue+0x35f/0x440 [lpfc]
 lpfc_disc_state_machine+0x78/0x1c0 [lpfc]
 lpfc_cmpl_els_prli+0x17c/0x1f0 [lpfc]
 lpfc_sli_sp_handle_rspiocb+0x39b/0x6b0 [lpfc]
 lpfc_sli_handle_slow_ring_event_s3+0x134/0x2d0 [lpfc]
 lpfc_work_done+0x8ac/0x13b0 [lpfc]
 lpfc_do_work+0xf1/0x1b0 [lpfc]

Crash, on the 8G adapter, is due to a vport which does not have a nvme
local port structure. It's not supposed to have one. NVME is not
supported on the 8G adapter, so the NVME PRLI, which started this flow
shouldn't have been sent in the first place.

Correct discovery engine to recognize when on an SLI3 rport, which
doesn't support SLI3, if the rport supports only NVME, don't send a NVME
PRLI. Instead, as no FC4 will be used, a LOGO is sent.  If rport is FCP
and NVME, only execute the SCSI PRLI.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-19 21:40:21 -04:00
James Smart
966bb5b711 scsi: lpfc: Break up IO ctx list into a separate get and put list
Since unsol rcv ISR and command cmpl ISR both access/lock this list,
separate get/put lists will reduce contention.

Replaced
struct list_head lpfc_nvmet_ctx_list;
with
struct list_head lpfc_nvmet_ctx_get_list;
struct list_head lpfc_nvmet_ctx_put_list;
and all correpsonding locks and counters.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-19 21:40:10 -04:00
James Smart
810ffa4789 scsi: lpfc: Reduce time spent in IRQ for received NVME commands
Removed unnecessary bzero of context area. Due to size of sg list, added
a substantial delay and played havoc on cpu caches.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-19 21:39:57 -04:00
James Smart
56bc802842 scsi: lpfc: Vport creation is failing with "Link Down" error
Vport creation fails for SLI-3 adapters.

Mailbox submission fails because mailbox interrupt is disabled. Mailbox
interrupt is disabled during port reset.

Do reset only for physical port.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-19 21:39:37 -04:00
James Smart
b4fd681e8a scsi: lpfc: Fix nvme_info sysfs output to be consistent
First line of nvme_info output is not consistent. There is an Extra
colon in the format.

First line of output will contain one of the following strings:
NVME Initiator Enabled
NVME Target Enabled
NVME Disabled

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-19 21:39:22 -04:00
James Smart
d41b65bcdc scsi: lpfc: Fix system panic when express lane enabled.
There is a null pointer dereference that can happen in the FOF interrupt
handler.

The driver was not setting up cq->assoc_qp_for sli4_hba->oas_cq.

Initialize cq->assoc_qp before accessing it.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-19 21:39:01 -04:00
James Smart
1080f7ec61 scsi: lpfc: update to revision to 11.4.0.0
Set lpfc driver revision to 11.4.0.0

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-12 21:37:32 -04:00
James Smart
0cf07f84dd scsi: lpfc: Add auto EQ delay logic
Administrator intervention is currently required to get good numbers
when switching from running latency tests to IOPS tests.

The configured interrupt coalescing values will greatly effect the
results of these tests.  Currently, the driver has a single coalescing
value set by values of the module attribute.  This patch changes the
driver to support auto-configuration of the coalescing value based on
the total number of outstanding IOs and average number of CQEs processed
per interrupt for an EQ.  Values are checked every 5 seconds.

The driver defaults to the automatic selection. Automatic selection can
be disabled by the new lpfc_auto_imax module_parameter.

Older hardware can only change interrupt coalescing by mailbox
command. Newer hardware supports change via a register. The patch
support both.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-12 21:37:31 -04:00
James Smart
78e1d2009f scsi: lpfc: Fix defects reported by Coverity Scan
Addressed the following reported defects:

** CID 1411552:  Control flow issues  (MISSING_BREAK)
/drivers/scsi/lpfc/lpfc_sli.c: 13259 in lpfc_sli4_nvmet_handle_rcqe()

** CID 1411553:  Memory - illegal accesses  (OVERRUN)
/drivers/scsi/lpfc/lpfc_sli.c: 16218 in lpfc_fc_frame_check()

** CID 1411553:  Memory - illegal accesses  (OVERRUN)
   Overrunning array "lpfc_rctl_names" of 202 8-byte elements at element
   index 244 (byte offset 1952) using index "fc_hdr->fh_r_ctl" (which
   evaluates to 244).

** CID 1411554:  Null pointer dereferences  (REVERSE_INULL)
/drivers/scsi/lpfc/lpfc_nvmet.c: 2131 in lpfc_nvmet_unsol_fcp_abort_cmp()

** CID 1411555:  Memory - illegal accesses  (UNINIT)
/drivers/scsi/lpfc/lpfc_nvmet.c: 180 in lpfc_nvmet_ctxbuf_post()

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-12 21:37:31 -04:00
James Smart
b57ab7469d scsi: lpfc: Fix vports not logging into target
vports cannot login to target.

For vports, lpfc_nodelist is allocated for targets only on completion of
GFF_ID command. Driver checks if lpfc_nodelist exists for target before
sending GFF_ID. So, GFF_ID and PLOGI are not sent.

As mentioned by the comment in lpfc_prep_node_fc4type() routine, do not
send GFF_ID only if this NPortID is previously identified as FCP
target. Send GFF_ID if it is a newly identified remote port from GID_FT
response.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-12 21:37:31 -04:00
James Smart
dea37e82fa scsi: lpfc: Fix PRLI retry handling when target rejects it.
The nvmet driver was rejecting the initiator's PRLI because its reg_rpi
for the PLOGI was still outstanding.  The initiator would resend the
PRLI without delay and get the same answer.  The PRLI retries would
exhaust causing the nvme initiator to set the nvmet ndlp to UNMAPPED.

The driver's lpfc_els_retry handler did not have a policy for an LS_RJT
with explanation CMD_IN_PROGRESS for PRLI or NVME_PRLI.  This caused the
delay to remain at 0 but retry set 1.

Fix: When the ELS response is LS_RJT, TPC and the command was PRLI or
NVME_PRLI, just set the delay to 1000 mS to get a 1 second delay on the
PRLI retry.  This was enough to allow the REG_RPI to complete at the
target.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-12 21:37:31 -04:00
James Smart
e92974f6cd scsi: lpfc: Null pointer dereference when log_verbose is set to 0xffffffff
Kernel panic when log_verbose is set to 0xffffffff

phba->pport is dereferenced before it is initialized

Fix: Do not dereference phba->pport if it is NULL

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-12 21:37:31 -04:00
James Smart
b83d005e63 scsi: lpfc: Fix System panic after loading the driver
System panic with general protection fault during driver load

The driver uses a static array sli4_hba.handler_name to store the irq
handler names. If the io_channel_irqs exceeds the pre-allocated size
(32+1), then the driver will overwrite other fields of sli4_hba.

Fix: Dynamically allocate handler_name.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-12 21:37:31 -04:00
James Smart
ecbb227e63 scsi: lpfc: Fix crash on powering off BFS VM with passthrough device
Null pointer dereference when BFS VM is powered off

The driver incorrectly uses sli3_ring on SLI-4 adapters

Use the correct ring structure based on sli_rev

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Tested-by: Raphael Silva <raphasil@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-12 21:37:31 -04:00
James Smart
522dceeb62 scsi: lpfc: Fix return value of board_mode store routine in case of online failure
On hbacmd reset failure, observing wrong string "nline" in kernel log.

On failure, non negative value (1) is returned from sysfs store
routine. It is interpreted as count by kernel and store routine is
called again with the remaining characters as input.

Fix: Return negative error code (-EIO) in case of failure.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-12 21:37:31 -04:00
James Smart
2cee780800 scsi: lpfc: Fix counters so outstandng NVME IO count is accurate
NVME FC counters don't reflect actual results

Since counters are not atomic, or protected by a lock, the values often
get screwed up.

Make them atomic, like NVMET.  Fix up sysfs and debugfs display
accordingly Added Outstanding IOs to stats display

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-12 21:37:31 -04:00
James Smart
14041bd170 scsi: lpfc: Fix Port going offline after multiple resets.
Observing lpfc port down after issuing hbacmd reset command

Failure in posting SGL buffers. If there is only one SGL buffer and rrq
is valid for its XRI, we are rightly returning NULL but not adding the
buffer back to the SGL list. So, number of buffers become less than
total count and repost fails during reset.

Add SGL buffer back to list before returning NULL.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-12 21:37:31 -04:00
James Smart
6599e12428 scsi: lpfc: Fix nvmet node ref count handling
When unloading the driver, the NVMET driver would wait the full 30
seconds for its UNMAPPED initiator node to get removed before continuing
with the unload process.  NVMEI worked correctly.

For each rport put into UNMAPPED or MAPPED state by NVMET, the driver
puts a reference on the NDLP.  The difference is that NVMEI has a
unregister call for its rports and the extra reference is removed in the
unregister process.  For NVMET, the driver has to remove the reference
explicitly when dropping out of UNMAPPED or MAPPED because there is no
unregister call.

Add a call to lpfc_nlp_put on the ndlp when NVMET and the old state was
UNMAPPED or MAPPED.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-12 21:37:30 -04:00
James Smart
92721c3b97 scsi: lpfc: Fix Lun Priority level shown as NA
Lun Priority level shown as NA

Remote port is not getting registered for nameserver and fdmi.  Due to
which dfc SendCTPassThru cmd is failing.

Made changes to register the remote port for both.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-12 21:37:30 -04:00
James Smart
ce1b591c5b scsi: lpfc: Add changes to assist in NVMET debugging
Inconsistent error messages and context state checks

Context state sanity checks were not accurate or inconsistent in the
code paths.

Separated LS context states from FCP.
Added and modified context state sanity checks.
Use context state to determine if a sol or unsol ABORT is needed.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-12 21:37:30 -04:00
James Smart
7d790f04d7 scsi: lpfc: Fix nvme port role handling in sysfs and debugfs handlers.
While debugging Devloss and recovery, debugfs and sysfs were found to
not show the NVME port roles consistently.

The port role FC_PORT_ROLE_NVME_DISCOVERY was added with the devloss
bringup and the other issues were just oversight.

Add NVME Target and DISCSRVC to debugfs nodeinfo and sysfs nvme info
handlers. The full port role was added to the NVME data only not the
generic nodelist.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-12 21:37:30 -04:00
James Smart
80cc004393 scsi: lpfc: Fix transition nvme-i rport handling to nport only.
As the devloss API was implemented in the nvmei driver, an evaluation of
the nvme transport and the lpfc driver showed dual management of the
rports.  This creates a bug possibility when the thread count and SAN
size increases.

The nvmei driver code was based on a very early transport and was not
revisited until the devloss API was introduced.

Remove the listhead in the driver's rport data structure and the
listhead in the driver's lport data structure.  Remove all rport_list
traversal.  Convert the driver to use the nrport (nvme rport) pointer
that is now NULL or nonNULL depending on a devloss action.  Convert
debugfs and nvme_info in sysfs to use the fc_nodes list in the vport.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-12 21:37:30 -04:00
James Smart
7a06dcd3f8 scsi: lpfc: Add nvme initiator devloss support
Add nvme initiator devloss support

The existing implementation was based on no devloss behavior in the
transport (e.g. immediate teardown) so code didn't properly handle
delayed nvme rport device unregister calls.  In addition, the driver was
not correctly cycling the rport port role for each
register-unregister-reregister process.

This patch does the following:

Rework the code to properly handle rport device unregister calls and
potential re-allocation of the remoteport structure if the port comes
back in under dev_loss_tmo.

Correct code that was incorrectly cycling the rport port role for each
register-unregister-reregister process.

Prep the code to enable calling the nvme_fc transport api to dynamically
update dev_loss_tmo when the scsi sysfs interface changes it.

Memset the rpinfo structure in the registration call to enforce "accept
nvme transport defaults" in the registration call.  Driver parameters do
influence the dev_loss_tmo transport setting dynamically.

Simplifies the register function: the driver was incorrectly searching
its local rport list to determine resume or new semantics, which is not
valid as the transport already handles this.  The rport was resumed if
the rport handed back matches the ndlp->nrport pointer.  Otherwise,
devloss fired and the ndlp's nrport is NULL.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-12 21:37:30 -04:00
Colin Ian King
39aa23f98a scsi: lpfc: make a couple of functions static
functions lpfc_nvmet_cleanup_io_context and lpfc_nvmet_setup_io_context
can be made static as they do not need to be in global scope.

Cleans up sparse warnings:
  "warning: symbol 'lpfc_nvmet_cleanup_io_context' was not declared.
   Should it be static?"
  "warning: symbol 'lpfc_nvmet_setup_io_context' was not declared.
   Should it be static?"

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-12 21:04:42 -04:00
Colin Ian King
1ab6207210 scsi: lpfc: fix spelling mistake "entrys" -> "entries"
Trivial fix to spelling mistake in debugfs message

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-12 20:48:05 -04:00
Linus Torvalds
ef918d3c80 SCSI fixes on 20170610
This is a set of user visible fixes (excepting one format string
 change).  Four of the qla2xxx fixes only affect the firmware dump
 path, but it's still important to the enterprise.  The rest are
 various NULL pointer crash conditions or outright driver hangs.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJZPGMCAAoJEAVr7HOZEZN4jekQAJN1x7WIB4HuI3EoHSSMej8h
 bdssAbqQn/H++nIJ/e6ZRHKt0P6ngSzXuIb4lwbOJUQa7sxEWaWDeXywPEqjDYqP
 BBjloYOrAff492uYXL48xjG4Xl4qOxb8GfKT7iFptIzAdk/2Rxhj56XqlhY7IMSG
 ut4binbz+3v0NEKnI6od+uxvXAc6EumyF0zW9a4rbjK/wAukciRIGWkOrsQpa8cJ
 VdgUsMdbpjTlYbMnPfHa+oUqKkWir3PI9rQ01AvMUugrqAXiAPLgoHFB6H8eVVn7
 vzVnJd31RoUrv6JNnWcRsi0VWsciPw5XBpd6VRVjZUdOlUds3vW7n1G2ut5TfAAp
 sYkFSuhxcWgp3QJpqDbS/l976dXyfdzhQpahgYLbRAuhoi8HDmcpwzTdWC9a41tw
 k2sqAbgZd60ZHu8OSrD2HqJrkMqSXzklMkZMS33nfE1Ki7c+aWHImby4P+lEKIIw
 nJCiVc3yO+TcWvdH5w+6Fu/nA0HJ9OcFEk1P+4Xz38n5o/WcduoXG6NgpVT+mKXO
 zQZDEYbWQYixDEs1m8fJpTHu5p2tXYzdMS9L/Fa0B2MQ3kY9XIT41rHqnJPBOp2R
 wKXksIyzQagW6r0bQ2lFkth0elLHGxDlwfCDgrN6zQFrdBcpRfT+GdTDpDWiWggt
 qgIbBvEO4sd12V5miQsK
 =jZXV
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "This is a set of user visible fixes (excepting one format string
  change).

  Four of the qla2xxx fixes only affect the firmware dump path, but it's
  still important to the enterprise. The rest are various NULL pointer
  crash conditions or outright driver hangs"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: cxgb4i: libcxgbi: in error case RST tcp conn
  scsi: scsi_debug: Avoid PI being disabled when TPGS is enabled
  scsi: qla2xxx: Fix extraneous ref on sp's after adapter break
  scsi: lpfc: prevent potential null pointer dereference
  scsi: lpfc: Avoid NULL pointer dereference in lpfc_els_abort()
  scsi: lpfc: nvmet_fc: fix format string
  scsi: qla2xxx: Fix crash due to NULL pointer dereference of ctx
  scsi: qla2xxx: Fix mailbox pointer error in fwdump capture
  scsi: qla2xxx: Set bit 15 for DIAG_ECHO_TEST MBC
  scsi: qla2xxx: Modify T262 FW dump template to specify same start/end to debug customer issues
  scsi: qla2xxx: Fix crash due to mismatch mumber of Q-pair creation for Multi queue
  scsi: qla2xxx: Fix NULL pointer access due to redundant fc_host_port_name call
  scsi: qla2xxx: Fix recursive loop during target mode configuration for ISP25XX leaving system unresponsive
  scsi: bnx2fc: fix race condition in bnx2fc_get_host_stats()
  scsi: qla2xxx: don't disable a not previously enabled PCI device
2017-06-11 11:21:08 -07:00
Al Viro
20dcf8e244 lpfc debugfs: get rid of pointless access_ok()
copy_from_user() needs no protection - it does the checks itself

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-06-04 13:57:25 -04:00
Gustavo A. R. Silva
e6ef6a77f5 scsi: lpfc: prevent potential null pointer dereference
Null check at line 966: if (ndlp) {, implies that ndlp might be NULL.
Functions lpfc_nlp_set_state() and lpfc_issue_els_prli() dereference
pointer ndlp. Include these function calls inside the IF block that
tests pointer ndlp.

Addresses-Coverity-ID: 1401856
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-31 22:45:15 -04:00
Guilherme G. Piccoli
7c9fdfb700 scsi: lpfc: Avoid NULL pointer dereference in lpfc_els_abort()
We might have a NULL pring in lpfc_els_abort(), for example on error
recovery path, since queues are destroyed during error recovery
mechanism.

In this case, we should just drop the abort since the queues will be
recreated anyway. This patch just verifies for NULL pointer and stop the
abortion of the queue in case of a NULL pring.

Also, this patch converts return type of lpfc_els_abort() from int to
void, since it's not checked anywhere.

Reported-by: Harsha Thyagaraja <hathyaga@in.ibm.com>
Reported-by: Naresh Bannoth <nbannoth@in.ibm.com>
Tested-by: Raphael Silva <raphasil@linux.vnet.ibm.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Acked-by: James Smart  <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-31 22:44:13 -04:00
Arnd Bergmann
9094367a91 scsi: lpfc: nvmet_fc: fix format string
The lpfc_nvmeio_data() tracing helper always takes a format string and
three additional arguments. The latest caller has a format string with
only two integer arguments, causing this harmless warning:

drivers/scsi/lpfc/lpfc_nvmet.c: In function 'lpfc_nvmet_xmt_fcp_release':
drivers/scsi/lpfc/lpfc_nvmet.c:802:25: error: too many arguments for format [-Werror=format-extra-args]
  lpfc_nvmeio_data(phba, "NVMET FCP FREE: xri x%x ste %d\n", ctxp->oxid,

We could add a dummy argument here, but it seems reasonable to print
the 'abort' flag as the third argument.

Fixes: 19b58d9473 ("nvmet_fc: add req_release to lldd api")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-31 22:44:06 -04:00
Linus Torvalds
be941bf2e6 SCSI fixes on 20170524
This is quite a big update because it includes a rework of the lpfc
 driver to separate the NVMe part from the FC part.  The reason for
 doing this is because two separate trees (the nvme and scsi trees
 respectively) want to update the individual components and this
 separation will prevent a really nasty cross tree entanglement by the
 time we reach the next merge window.  The rest of the fixes are the
 usual minor sort with no significant security implications.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJZJhx1AAoJEAVr7HOZEZN4+lMQALqrWA4Kty2nHU1EfWXd8lOR
 VJt6TlthMQWn57MCuwi1Q6bQR8PXaDr9yDvSkHu1Kqu0ZnmZRRs5CsKgN5RFkO7s
 F8jZlqKtE36lfavqv+Li+ie110NfFDJVoQOACqhRybcT7En59nwu8dvPJZ1vXtCO
 qevukGFyDnHR3VJR/LJOGs7NUmVdGegUxALfOZHH22oOVU8v+iAARfgM0DI4bPS7
 BTlhJDEVL0/uiYb/D1l8xVQCCuChX7yVygPLC57Ag8eRMAiTVyTN6Y1L6AGeDye0
 hHty1Cv0yfEf51ZXNCizIvMlcEIB6lA40VUiZ62c2+Dp9TOceVgbVrVLF28c2e2o
 z73xcrnUBdPi1znGOrQuJlTBLBYUvsFrq4ZhzlS5vGsUNslYyFi5p8xtnbHxrIQq
 qRfTLeYWuOSyULvIiYkFyZkksr7up21wsaplN5OrNw0f0hTOf8ff2duM09MTARQO
 xxTTS1/TD2KCMm4qh638qNbrIdZgjvMFeNP+G/XagloZ5D8NCdn+pzm/vLm+7lAx
 D4AhwHcQ7I57YhDHLs56yhzL7cPyPsxeFPtYKFO7Vz1B0Xw+prgKRcCA+vOrs0ae
 vKMV1ctyo5E0BfUk7lYl3NP0IPqupc82GeO5IvUmh+swNYrg3TCct13Afr4sa0n+
 yNlLgoYLnJ3mVGMWvDgL
 =NtGp
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "This is quite a big update because it includes a rework of the lpfc
  driver to separate the NVMe part from the FC part.

  The reason for doing this is because two separate trees (the nvme and
  scsi trees respectively) want to update the individual components and
  this separation will prevent a really nasty cross tree entanglement by
  the time we reach the next merge window.

  The rest of the fixes are the usual minor sort with no significant
  security implications"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (25 commits)
  scsi: zero per-cmd private driver data for each MQ I/O
  scsi: csiostor: fix use after free in csio_hw_use_fwconfig()
  scsi: ufs: Clean up some rpm/spm level SysFS nodes upon remove
  scsi: lpfc: fix build issue if NVME_FC_TARGET is not defined
  scsi: lpfc: Fix NULL pointer dereference during PCI error recovery
  scsi: lpfc: update version to 11.2.0.14
  scsi: lpfc: Add MDS Diagnostic support.
  scsi: lpfc: Fix NVMEI's handling of NVMET's PRLI response attributes
  scsi: lpfc: Cleanup entry_repost settings on SLI4 queues
  scsi: lpfc: Fix debugfs root inode "lpfc" not getting deleted on driver unload.
  scsi: lpfc: Fix NVME I+T not registering NVME as a supported FC4 type
  scsi: lpfc: Added recovery logic for running out of NVMET IO context resources
  scsi: lpfc: Separate NVMET RQ buffer posting from IO resources SGL/iocbq/context
  scsi: lpfc: Separate NVMET data buffer pool fir ELS/CT.
  scsi: lpfc: Fix NMI watchdog assertions when running nvmet IOPS tests
  scsi: lpfc: Fix NVMEI driver not decrementing counter causing bad rport state.
  scsi: lpfc: Fix nvmet RQ resource needs for large block writes.
  scsi: lpfc: Adding additional stats counters for nvme.
  scsi: lpfc: Fix system crash when port is reset.
  scsi: lpfc: Fix used-RPI accounting problem.
  ...
2017-05-24 20:29:53 -07:00
Linus Torvalds
894e21642d Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
 "A small collection of fixes that should go into this cycle.

   - a pull request from Christoph for NVMe, which ended up being
     manually applied to avoid pulling in newer bits in master. Mostly
     fibre channel fixes from James, but also a few fixes from Jon and
     Vijay

   - a pull request from Konrad, with just a single fix for xen-blkback
     from Gustavo.

   - a fuseblk bdi fix from Jan, fixing a regression in this series with
     the dynamic backing devices.

   - a blktrace fix from Shaohua, replacing sscanf() with kstrtoull().

   - a request leak fix for drbd from Lars, fixing a regression in the
     last series with the kref changes. This will go to stable as well"

* 'for-linus' of git://git.kernel.dk/linux-block:
  nvmet: release the sq ref on rdma read errors
  nvmet-fc: remove target cpu scheduling flag
  nvme-fc: stop queues on error detection
  nvme-fc: require target or discovery role for fc-nvme targets
  nvme-fc: correct port role bits
  nvme: unmap CMB and remove sysfs file in reset path
  blktrace: fix integer parse
  fuseblk: Fix warning in super_setup_bdi_name()
  block: xen-blkback: add null check to avoid null pointer dereference
  drbd: fix request leak introduced by locking/atomic, kref: Kill kref_sub()
2017-05-20 16:12:30 -07:00
James Smart
4b8ba5fa52 nvmet-fc: remove target cpu scheduling flag
Remove NVMET_FCTGTFEAT_NEEDS_CMD_CPUSCHED. It's unnecessary.

Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2017-05-20 10:11:34 -06:00
James Smart
eeeb51d834 scsi: lpfc: fix build issue if NVME_FC_TARGET is not defined
fix build issue if NVME_FC_TARGET is not defined. noop the code.  The
code will never be invoked if target mode is not enabled.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-17 20:22:47 -04:00
Guilherme G. Piccoli
53cf29d3b1 scsi: lpfc: Fix NULL pointer dereference during PCI error recovery
Recent commit on patchset "lpfc updates for 11.2.0.14" fixed an issue
about dereferencing a NULL pointer on port reset. The specific commit,
named "lpfc: Fix system crash when port is reset.", is missing a check
against NULL pointer on lpfc_els_flush_cmd() though.

Since we destroy the queues on adapter resets, like in PCI error
recovery path, we need the validation present on this patch in order to
avoid a NULL pointer dereference when trying to flush commands of ELS
wq, after it has been destroyed (which would lead to a kernel oops).

Tested-by: Raphael Silva <raphasil@linux.vnet.ibm.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-17 20:19:23 -04:00
James Smart
2848e1d503 scsi: lpfc: update version to 11.2.0.14
Change driver version to 11.2.0.14.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-16 21:25:09 -04:00
James Smart
ae9e28f36a scsi: lpfc: Add MDS Diagnostic support.
Added code to support Cisco MDS loopback diagnostic. The diagnostics run
various loopbacks including one which loops-back frame through the
driver.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-16 21:24:47 -04:00
James Smart
dc53a61852 scsi: lpfc: Fix NVMEI's handling of NVMET's PRLI response attributes
Code review of NVMEI's FC_PORT_ROLE_NVME_DISCOVERY looked wrong.

Discussions with storage architecture team clarified NVMEI's audit of
the PRLI response port roles.  Following up discussion with code review
showed a few minor corrections were required - especially in
anticipation of NVME auto discovery.

During PRLI, NVMEI should sent prli_init - which it it does.  NVMET
should send prli_tgt and prli_disc - which it does.  When NVMEI receives
a PRLI Response now, it audits the incoming target bits and stores the
attributes in the corresponding NDLP.  Later, when NVMEI registers the
NVME rport, it uses the stored ndlp attributes to set the rport
port_roles correctly.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-16 21:24:17 -04:00
James Smart
64eb4dcb14 scsi: lpfc: Cleanup entry_repost settings on SLI4 queues
Too many work items being processed in IRQ context take a lot of CPU
time and cause problems.

With a recent change, we get out of the ISR after hitting entry_repost
work items on a queue. However, the actual values for entry repost are
still high. EQ is 128 and CQ is 128, this could translate into
processing 128 * 128 (16384) work items under IRQ context.

Set entry_repost in the actual queue creation routine now.  Limit EQ
repost to 8 and CQ repost to 64 to further limit the amount of time
spent in the IRQ.

Fix fof IRQ routines as well.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-16 21:23:42 -04:00
James Smart
667a766252 scsi: lpfc: Fix debugfs root inode "lpfc" not getting deleted on driver unload.
When unloading and reloading the driver, the driver fails to recreate
the lpfc root inode in the debugfs tree.

The driver is incorrectly removing the lpfc root inode in
lpfc_debugfs_terminate in the first driver instance that unloads and
then sets the lpfc_debugfs_root global parameter to NULL.  When the
final driver instance unloads, the debugfs calls quietly ignore the
remove on a NULL pointer.  The bug is that the debugfs_remove call
returns void so the driver doesn't know to correctly set the global
parameter to NULL.

Base the debugfs_remove of the lpfc_debugfs_root parameter on
lpfc_debugfs_hba_count because this parameter tracks the fnX instance
tracked per driver instance.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-16 21:23:13 -04:00
James Smart
82820f0cf1 scsi: lpfc: Fix NVME I+T not registering NVME as a supported FC4 type
When the driver send the RPA command, it does not send supported FC4
Type NVME to the management server.

Encode NVME (type x28) in the AttribEntry in the RPA command.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-16 21:22:52 -04:00
James Smart
a8cf5dfeb4 scsi: lpfc: Added recovery logic for running out of NVMET IO context resources
Previous logic would just drop the IO.

Added logic to queue the IO to wait for an IO context resource from an
IO thats already in progress.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-16 21:22:22 -04:00