Commit Graph

148 Commits

Author SHA1 Message Date
Gustavo A. R. Silva
df561f6688 treewide: Use fallthrough pseudo-keyword
Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-08-23 17:36:59 -05:00
Dick Kennedy
00081c5ca4 scsi: lpfc: Fix LUN loss after cable pull
On devices that support FCP sequence error recovery, which attempts to
preserve the devices login across link bounce, adisc is used for device
validation. Turns out the device fc4 type is cleared as part of the link
bounce, but the ADISC handling doesn't restore the FC4 support as it
normally would with a PRLI. This caused situations where the device wasn't
reregistered with the transport thus scan logic and LUN discovery never
kicked in.

In the ADISC completion handling, reset the fc4 type so that transport port
reregistration occurs with the remote port.

Link: https://lore.kernel.org/r/20200803210229.23063-8-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-04 20:56:57 -04:00
Lee Jones
22f8c07741 scsi: lpfc: Add description for lpfc_release_rpi()'s 'ndlpl param
Fixes the following W=1 kernel build warning(s):

 drivers/scsi/lpfc/lpfc_nportdisc.c:1079: warning: Function parameter or member 'ndlp' not described in 'lpfc_release_rpi'

Link: https://lore.kernel.org/r/20200723122446.1329773-20-lee.jones@linaro.org
Cc: James Smart <james.smart@broadcom.com>
Cc: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-07-24 22:32:04 -04:00
Dick Kennedy
372c187b8a scsi: lpfc: Add an internal trace log buffer
The current logging methods typically end up requesting a reproduction with
a different logging level set to figure out what happened. This was mainly
by design to not clutter the kernel log messages with things that were
typically not interesting and the messages themselves could cause other
issues.

When looking to make a better system, it was seen that in many cases when
more data was wanted was when another message, usually at KERN_ERR level,
was logged.  And in most cases, what the additional logging that was then
enabled was typically. Most of these areas fell into the discovery machine.

Based on this summary, the following design has been put in place: The
driver will maintain an internal log (256 elements of 256 bytes).  The
"additional logging" messages that are usually enabled in a reproduction
will be changed to now log all the time to the internal log.  A new logging
level is defined - LOG_TRACE_EVENT.  When this level is set (it is not by
default) and a message marked as KERN_ERR is logged, all the messages in
the internal log will be dumped to the kernel log before the KERN_ERR
message is logged.

There is a timestamp on each message added to the internal log. However,
this timestamp is not converted to wall time when logged. The value of the
timestamp is solely to give a crude time reference for the messages.

Link: https://lore.kernel.org/r/20200630215001.70793-14-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-07-02 23:06:49 -04:00
Dick Kennedy
9806c984d4 scsi: lpfc: Fix NVMe rport deregister and registration during ADISC
During driver unload/reload testing, the NVMe initiator would not
re-establish connectivity to NVMe controllers on reload.

The failing NVMe array supports concurrent FCP and NVMe operation via
different nport_id's. The array was repeatedly sending an ADISC every 2
seconds after PLOGI completed and while NVMe subsystems were executing
discovery. The target would continue this state for roughly 45 seconds.

The driver's current behavior on ADISC receipt is to validate a the ADISC
vs the device and issue a RESUME_RPI to restore transmission. The receipt
of the ADISC effectively caused a driver to take actions similar to a
logout and login for the remote port, causing the deregistration of the
nvme rport and a subsequent re-registration.  This caused a constant reset
and re-connect of the NVMe controller while this 45s window occurred. There
was no need for the state changes as ADISC does not change login state.

This patch corrects this behavior by validating if the remoteport is
already logged in (MAPPED) and when true, avoids the call to set the ndlp
state to MAPPED, which triggers the unreg/re-reg. Thus ADISC does not
change the login state of the node.

Link: https://lore.kernel.org/r/20200630215001.70793-5-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-07-02 23:06:35 -04:00
James Smart
4c2805aab5 lpfc: nvmet: Add support for NVME LS request hosthandle
As the nvmet layer does not have the concept of a remoteport object, which
can be used to identify the entity on the other end of the fabric that is
to receive an LS, the hosthandle was introduced.  The driver passes the
hosthandle, a value representative of the remote port, with a ls request
receive. The LS request will create the association.  The transport will
remember the hosthandle for the association, and if there is a need to
initiate a LS request to the remote port for the association, the
hosthandle will be used. When the driver loses connectivity with the
remote port, it needs to notify the transport that the hosthandle is no
longer valid, allowing the transport to terminate associations related to
the hosthandle.

This patch adds support to the driver for the hosthandle. The driver will
use the ndlp pointer of the remote port for the hosthandle in calls to
nvmet_fc_rcv_ls_req().  The discovery engine is updated to invalidate the
hosthandle whenever connectivity with the remote port is lost.

Signed-off-by: Paul Ely <paul.ely@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-09 16:18:34 -06:00
James Smart
2a1160a03a lpfc: Refactor lpfc nvme headers
A lot of files in lpfc include nvme headers, building up relationships that
require a file to change for its headers when there is no other change
necessary. It would be better to localize the nvme headers.

There is also no need for separate nvme (initiator) and nvmet (tgt)
header files.

Refactor the inclusion of nvme headers so that all nvme items are
included by lpfc_nvme.h

Merge lpfc_nvmet.h into lpfc_nvme.h so that there is a single header used
by both the nvme and nvmet sides. This prepares for structure sharing
between the two roles. Prep to add shared function prototypes for upcoming
shared routines.

Signed-off-by: Paul Ely <paul.ely@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-09 16:18:34 -06:00
YueHaibing
fdb827e4a3 scsi: lpfc: Make lpfc_defer_acc_rsp static
Fix sparse warning:

drivers/scsi/lpfc/lpfc_nportdisc.c:344:1: warning:
 symbol 'lpfc_defer_acc_rsp' was not declared. Should it be static?

Link: https://lore.kernel.org/r/20200107014956.41748-1-yuehaibing@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-01-10 01:23:47 -05:00
James Smart
be0709e449 scsi: lpfc: Fix incomplete NVME discovery when target
NVMe device re-discovery does not complete. Dev_loss_tmo messages seen on
initiator after recovery from a link disturbance.

The failing case is the following:

When the driver (as a NVME target) receives a PLOGI, the driver initiates
an "unreg rpi" mailbox command. While the mailbox command is in progress,
the driver requests that an ACC be sent to the initiator. The target's ACC
is received by the initiator and the initiator then transmits a PLOGI. The
driver receives the PLOGI prior to receiving the completion for the PLOGI
response WQE that sent the ACC. (Different delivery sources from the hw so
the race is very possible). Given the PLOGI is prior to the ACC completion
(signifying PLOGI exchange complete), the driver LS_RJT's the PRLI. The
"unreg rpi" mailbox then completes. Since PRLI has been received, the
driver transmits a PLOGI to restart discovery, which the initiator then
ACC's.  If the driver processes the (re)PLOGI ACC prior to the completing
the handling for the earlier ACC it sent the intiators original PLOGI,
there is no state change for completion of the (re)PLOGI. The ndlp remains
in "PLOGI Sent" and the initiator continues sending PRLI's which are
rejected by the target until timeout or retry is reached.

Fix by: When in target mode, defer sending an ACC for the received PLOGI
until unreg RPI completes.

Link: https://lore.kernel.org/r/20191218235808.31922-2-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-12-21 13:42:41 -05:00
Linus Torvalds
ef2cc88e2a SCSI misc on 20191130
This is mostly update of the usual drivers: aacraid, ufs, zfcp,
 NCR5380, lpfc, qla2xxx, smartpqi, hisi_sas, target, mpt3sas, pm80xx
 plus a whole load of minor updates and fixes.  The two major core
 changes are Al Viro's reworking of sg's handling of copy to/from user,
 Ming Lei's removal of the host busy counter to avoid contention in the
 multiqueue case and Damien Le Moal's fixing of residual tracking
 across error handling.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXeKvHCYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishQJMAQDAjlAi
 SNfbyndMqyf+rZGWufDI+43Up1VvW9GeWJHeDwEAxfO5XZsCks2uT8UxXhpEp9L7
 HkiUww3zbcgl0FWFkUM=
 =cdVU
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This is mostly update of the usual drivers: aacraid, ufs, zfcp,
  NCR5380, lpfc, qla2xxx, smartpqi, hisi_sas, target, mpt3sas, pm80xx
  plus a whole load of minor updates and fixes.

  The major core changes are Al Viro's reworking of sg's handling of
  copy to/from user, Ming Lei's removal of the host busy counter to
  avoid contention in the multiqueue case and Damien Le Moal's fixing of
  residual tracking across error handling"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (251 commits)
  scsi: bnx2fc: timeout calculation invalid for bnx2fc_eh_abort()
  scsi: target: core: Fix a pr_debug() argument
  scsi: iscsi: Don't send data to unbound connection
  scsi: target: iscsi: Wait for all commands to finish before freeing a session
  scsi: target: core: Release SPC-2 reservations when closing a session
  scsi: target: core: Document target_cmd_size_check()
  scsi: bnx2i: fix potential use after free
  Revert "scsi: qla2xxx: Fix memory leak when sending I/O fails"
  scsi: NCR5380: Add disconnect_mask module parameter
  scsi: NCR5380: Unconditionally clear ICR after do_abort()
  scsi: NCR5380: Call scsi_set_resid() on command completion
  scsi: scsi_debug: num_tgts must be >= 0
  scsi: lpfc: use hdwq assigned cpu for allocation
  scsi: arcmsr: fix indentation issues
  scsi: qla4xxx: fix double free bug
  scsi: pm80xx: Modified the logic to collect fatal dump
  scsi: pm80xx: Tie the interrupt name to the module instance
  scsi: pm80xx: Controller fatal error through sysfs
  scsi: pm80xx: Do not request 12G sas speeds
  scsi: pm80xx: Cleanup command when a reset times out
  ...
2019-12-02 13:37:02 -08:00
James Smart
69641627c6 scsi: lpfc: Sync with FC-NVMe-2 SLER change to require Conf with SLER
Prior to the last FC-NVME-2 draft, SLER and CONF were independent.  SLER
now requires CONF to be set.

Revise the NVME PRLI checking to look for both inorder to enable SLER.

Link: https://lore.kernel.org/r/20191105005708.7399-7-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06 00:04:04 -05:00
James Smart
feff8b3d84 scsi: lpfc: Fix SLI3 hba in loop mode not discovering devices
When operating in private loop mode, PLOGI exchanges are racing and the
driver tries to abort it's PLOGI. But the PLOGI abort ends up terminating
the login with the other end causing the other end to abort its PLOGI as
well. Discovery never fully completes.

Fix by disabling the PLOGI abort when private loop and letting the state
machine play out.

Link: https://lore.kernel.org/r/20191018211832.7917-5-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24 21:02:04 -04:00
Daniel Wagner
0fd103ccfe scsi: lpfc: Honor module parameter lpfc_use_adisc
The initial lpfc_desc_set_adisc implementation in commit
dea3101e0a ("lpfc: add Emulex FC driver version 8.0.28") enabled ADISC if

	cfg_use_adisc && RSCN_MODE && FCP_2_DEVICE

In commit 92d7f7b0cd ("[SCSI] lpfc: NPIV: add NPIV support on top of
SLI-3") this changed to

	(cfg_use_adisc && RSC_MODE) || FCP_2_DEVICE

and later in commit ffc954936b ("[SCSI] lpfc 8.3.13: FC Discovery Fixes
and enhancements.") to

	(cfg_use_adisc && RSC_MODE) || (FCP_2_DEVICE && FCP_TARGET)

A customer reports that after a devloss, an ADISC failure is logged. It
turns out the ADISC flag is set even the user explicitly set lpfc_use_adisc
= 0.

[Sat Dec 22 22:55:58 2018] lpfc 0000:82:00.0: 2:(0):0203 Devloss timeout on WWPN 50:01:43:80:12:8e:40:20 NPort x05df00 Data: x82000000 x8 xa
[Sat Dec 22 23:08:20 2018] lpfc 0000:82:00.0: 2:(0):2755 ADISC failure DID:05DF00 Status:x9/x70000

[mkp: fixed Hannes' email]

Fixes: 92d7f7b0cd ("[SCSI] lpfc: NPIV: add NPIV support on top of SLI-3")
Cc: Dick Kennedy <dick.kennedy@broadcom.com>
Cc: James Smart <james.smart@broadcom.com>
Link: https://lore.kernel.org/r/20191022072112.132268-1-dwagner@suse.de
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-22 22:30:27 -04:00
zhengbin
f7cb0d0945 scsi: lpfc: Make function lpfc_defer_pt2pt_acc static
Fix sparse warnings:

drivers/scsi/lpfc/lpfc_nportdisc.c:290:1: warning: symbol 'lpfc_defer_pt2pt_acc' was not declared. Should it be static?

Link: https://lore.kernel.org/r/1570183477-137273-1-git-send-email-zhengbin13@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Reviewed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-09 22:32:28 -04:00
James Smart
359e10f087 scsi: lpfc: Fix pt2pt discovery on SLI3 HBAs
After exchanging PLOGI on an SLI-3 adapter, the PRLI exchange failed.  Link
trace showed the port was assigned a non-zero n_port_id, but didn't use the
address on the PRLI. The assigned address is set on the port by the
CONFIG_LINK mailbox command. The driver responded to the PRLI before the
mailbox command completed. Thus the PRLI response used the old n_port_id.

Defer the PRLI response until CONFIG_LINK completes.

Link: https://lore.kernel.org/r/20190922035906.10977-2-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-30 22:07:08 -04:00
James Smart
0d8af09643 scsi: lpfc: Add NVMe sequence level error recovery support
FC-NVMe-2 added support for sequence level error recovery in the FC-NVME
protocol. This allows for the detection of errors and lost frames and
immediate retransmission of data to avoid exchange termination, which
escalates into NVMeoFC connection and association failures. A significant
RAS improvement.

The driver is modified to indicate support for SLER in the NVMe PRLI is
issues and to check for support in the PRLI response.  When both sides
support it, the driver will set a bit in the WQE to enable the recovery
behavior on the exchange. The adapter will take care of all detection and
retransmission.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-19 22:41:12 -04:00
James Smart
3235066449 scsi: lpfc: Migrate to %px and %pf in kernel print calls
In order to see real addresses, convert %p with %px for kernel addresses
and replace %p with %pf for functions.

While converting, standardize on "x%px" throughout (not %px or 0x%px).

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-19 22:41:11 -04:00
James Smart
26d824ca45 scsi: lpfc: Fix ADISC reception terminating login state if a NVME target
If a target issues an ADISC to the port and the target is a NVME target,
the driver is inadvertantly invalidating the login and marking the remote
port as logged out. Communication with the target is lost.

Revise the ADISC check so that FCP or NVME targets will be marked valid at
the end of ADISC processing.  Enhance logging to recognize condition
better.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-19 22:41:09 -04:00
James Smart
7f20c1cb23 scsi: lpfc: Fix discovery when target has no GID_FT information
Some remote ports may be slow in registering their GID_FT protocol
information with the fabric. If the remote port is an initiator, it may
send PLOGI to the port before the GID_FT logic is complete. Meaning, after
accepting the PLOGI, when the driver may see no response to the GID_FT that
is issued after the login to determine the protocols supported so that
proper PRLI's may be transmit. If the driver has no fc4 information, it
currently stops and the remote port is not discovered.

Fix by issuing a LOGO when there is no GID_FT information.  The LOGO
completion handling will attempt to re-login if the nport_id is still
present.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-19 22:41:09 -04:00
James Smart
57178b9275 scsi: lpfc: Fix port relogin failure due to GID_FT interaction
In cases of remote-port-side cable pull/replug, there happens to be a
target that upon replug will send the port a PLOGI, a PRLI, and a LOGO.
When this sequence is received by the driver, the PLOGI accepted and a
GFT_ID is issued to find the protocol support for the remote port. While
the GFT_ID is outstanding, a LOGO is received. The driver logs the remote
port out and unregisters the RPI and schedules a new PLOGI transmission.
However, the GFT_ID was not terminated. When it completed, the driver
attempted to transition the remote port to PRLI transmission, which cancels
the PLOGI scheduling. The PRLI transmit attempt is rejected by the adapter
as the remote port is not logged in. No retry is attempted as it's expected
the logout is noted and the supposedly scheduled PLOGI should address the
state. As there is no PLOGI, the remote port does not get re-discovered.

Fix by aborting the outstanding GFT_ID if the related remote port is logged
out.

Ensure a PRLI transmit attempt only occurs if the remote port is logging
in. This avoids the incorrect attempt while logged out.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-19 22:41:08 -04:00
Bart Van Assche
b27cbd5549 scsi: lpfc: Remove set-but-not-used variables
This patch does not change any functionality but avoids that the compiler
complains about set-but-not-used variables when building with W=1.

Cc: James Smart <james.smart@broadcom.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-03 23:11:36 -04:00
Bart Van Assche
cd05c155d7 scsi: lpfc: Annotate switch/case fall-through
This patch avoids that the compiler warns about missing fall-through
annotation when building with W=1.

Cc: James Smart <james.smart@broadcom.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-03 23:11:35 -04:00
Bart Van Assche
3999df75bc scsi: lpfc: Declare local functions static
This patch avoids that the compiler complains about missing declarations
when building with W=1.

Cc: James Smart <james.smart@broadcom.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-03 23:11:35 -04:00
James Smart
0d041215f0 scsi: lpfc: Update 12.2.0.0 file copyrights to 2019
For files modified as part of 12.2.0.0 patches, update copyright to 2019

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:29:50 -05:00
James Smart
f6e8479052 scsi: lpfc: Fix default driver parameter collision for allowing NPIV support
The conversion to enable SCSI and NVME fc4 support ran into an issue with
NPIV support. With NVME, NPIV is not currently supported, but with SCSI it
was. The driver reverted to its lowest setting meaning NPIV with SCSI was
not allowed.

Convert the NPIV checks and implementation so that SCSI can continue to
allow NPIV support.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:29:50 -05:00
James Smart
dea16bdae2 scsi: lpfc: Fix discovery failures during port failovers with lots of vports
The driver is getting hit with 100s of RSCNs during remote port address
changes. Each of those RSCN's ends up generating UNREG_RPI and REG_PRI
mailbox commands.  The discovery engine within the driver doesn't wait for
the mailbox command completions. Instead it sets state flags and moves
forward. At some point, there's a massive backlog of mailbox commands which
take time for the adapter to process. Additionally, it appears there were
duplicate events from the switch so the driver generated duplicate mailbox
commands for the same remote port.  During this window, failures on PLOGI
and PRLI ELS's are see as the adapter is rejecting them as they are for
remote ports that still have pending mailbox commands.

Streamline the discovery engine so that PLOGI log checks for outstanding
UNREG_RPIs and defer the processing until the commands complete. This
better synchronizes the ELS transmission vs the RPI registrations.

Filter out multiple UNREG_RPIs being queued up for the same remote port.

Beef up log messages in this area.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-07 22:35:32 -05:00
James Smart
3e1f071892 scsi: lpfc: refactor mailbox structure context fields
The driver data structure for managing a mailbox command contained two
context fields. Unfortunately, the context were considered "generic" to be
used at the whim of the command code.  Of course, one section of code used
fields this way, while another did it that way, and eventually there were
mixups.

Refactored the structure so that the generic contexts become a node context
and a buffer context and all code standardizes on their use.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-07 22:35:32 -05:00
James Smart
7ea92eb458 scsi: lpfc: Implement GID_PT on Nameserver query to support faster failover
The switches seem to respond faster to GID_PT vs GID_FT NameServer
queries.  Add support for GID_PT to be used over GID_FT to enable
faster storage failover detection. Includes addition of new module
parameter to select between GID_PT and GID_FT (GID_FT is default).

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-11-06 20:42:51 -05:00
James Smart
d83ca3ea83 scsi: lpfc: Correct loss of fc4 type on remote port address change
An address change for a remote port cause PRLI for the wrong protocol
to be sent.  The node copy done in the discovery code skipped copying
the fc4 protocols supported as well.

Fix the copy logic for the address change.  Beefed up log messages in
this area as well.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-11-06 20:42:51 -05:00
James Smart
30e196cace scsi: lpfc: Fix LOGO/PLOGI handling when triggerd by ABTS Timeout event
After a LOGO in response to an ABTS timeout, a PLOGI wasn't issued to
re-establish the login.  An nlp_type check in the LOGO completion
handler failed to restart discovery for NVME targets.  Revised the
nlp_type check for NVME as well as SCSI.

While reviewing the LOGO handling a few other issues were seen and
were addressed:

- Better lock synchronization around ndlp data types

- When the ABTS times out, unregister the RPI before sending the LOGO
  so that all local exchange contexts are cleared and nothing received
  while awaiting LOGO/PLOGI handling will be accepted.

- LOGO handling optimized to:
   Wait only R_A_TOV for a response.
   It doesn't need to be retried on timeout. If there wasn't a
     response, a PLOGI will be sent, thus an implicit logout
     applies as well when the other port sees it.
   If there is a response, any kind of response is considered "good"
     and the XRI quarantined for a exchange qualifier window.

- PLOGI is issued as soon a LOGO state is resolved.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-11-06 20:42:50 -05:00
James Smart
01a8aed6a0 scsi: lpfc: Fix GFT_ID and PRLI logic for RSCN
Driver only sends NVME PRLI to a device that also supports FCP.  This resuls
in remote ports that don't have fc_remote_ports created for them. The driver
is clearing the nlp_fc4_type for a ndlp at the wrong time.

Fix by moving the nlp_fc4_type clearing to the discovery engine in the
DEVICE_RECOVERY state. Also ensure that rport registration is done for all
nlp_fc4_types.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-09-11 20:37:32 -04:00
James Smart
2a5b7d626e scsi: lpfc: Limit tracking of tgt queue depth in fast path
Performance is affected when target queue depth is tracked.  An atomic
counter is incremented on the submission path which competes with it being
decremented on the completion path.  In addition, multiple CPUs can
simultaniously be manipulating this counter for the same ndlp.

Reduce the overhead by only performing the target increment/decrement when
the target queue depth is less than the overall adapter depth, thus is
actually meaningful.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-08-02 15:45:19 -04:00
James Smart
faa832e97a scsi: lpfc: Fix ELS abort on SLI-3 adapters
For ABORT_XRI_CN command, firmware identifies XRI to abort by IOTAG and RPI
combination. For ELS aborts, driver specifies IOTAG correctly but RPI is
not specified.

Fix by setting RPI in WQE.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-08-02 15:45:18 -04:00
James Smart
4ae2ebde31 scsi: lpfc: Revise copyright for new company language
Change references from "Broadcom Limited" to "Broadcom Inc." in the
copyright message. Update copyright duration if not yet updated for 2018.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-07-10 22:15:09 -04:00
James Smart
4d5e789a2e scsi: lpfc: correct oversubscription of nvme io requests for an adapter
Under large configurations, the driver would start to log message 6065 -
NVME out of buffers (exchanges).

The driver is using the ndlp cmd_qdepth value when determining the max
outstanding ios for an adapter. This value, by default, is set to 65536,
which exceeds the maximum exchange counts supported on an adapter. The ndlp
cmd_qdepth has no relevance and outstanding io count should be capped at
the max exchange count with IO requests beyond that level getting bounced
back with an EBUSY status so that they are retried by the block layer.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:33 -04:00
James Smart
118c0415ee scsi: lpfc: Fix multiple PRLI completion error path
Nodelist entry for SCSI array ends up in UNMAPPED state. This is due to
illegal discovery State machine transition because of two PRLIs and the
first one failing with LS_RJT. Also, the error path was designed
assuming the PRLIs complete in the order they were sent, FCP first, then
NVME. In a failing case, the array thinks about the first PRLI (FCP),
but issues LS_RJT for the 2nd PRLI immediately.

Fix PRLI completion error path for the ordering expectation.  Ensure the
discovery state machine update is not set until all outstanding PRLIs
are complete.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:34:00 -04:00
James Smart
0709263abe scsi: lpfc: Fix NVME Initiator FirstBurst
First Burst support was not properly indicated in NVMe PRLI.

Correct the bit position and the logic to check and set first burst support.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-12 21:55:23 -04:00
James Smart
128bddacc4 scsi: lpfc: Update 11.4.0.7 modified files for 2018 Copyright
Updated Copyright in files updated 11.4.0.7

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-02-12 11:43:24 -05:00
James Smart
a5ff06817e scsi: lpfc: Indicate CONF support in NVMe PRLI
Revise the NVME PRLI to indicate CONF support.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-02-12 11:43:23 -05:00
James Smart
e06351a002 scsi: lpfc: Fix issues connecting with nvme initiator
In the lpfc discovery engine, when as a nvme target, where the driver
was performing mailbox io with the adapter for port login when a NVME
PRLI is received from the host. Rather than queue and eventually get
back to sending a response after the mailbox traffic, the driver
rejected the io with an error response.

Turns out this particular initiator didn't like the rejection values
(unable to process command/command in progress) so it never attempted a
retry of the PRLI. Thus the host never established nvme connectivity
with the lpfc target.

By changing the rejection values (to Logical Busy/nothing more), the
initiator accepted the response and would retry the PRLI, resulting in
nvme connectivity.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-20 21:11:47 -05:00
James Smart
9de416ac67 scsi: lpfc: Fix SCSI LUN discovery when SCSI and NVME enabled
When enabled for both SCSI and NVME support, and connected pt2pt to a
SCSI only target, the driver nodelist entry for the remote port is left
in PRLI_ISSUE state and no SCSI LUNs are discovered. Works fine if only
configured for SCSI support.

Error was due to some of the prli points still reflecting the need to
send only 1 PRLI. On a lot of fabric configs, targets were NVME only,
which meant the fabric-reported protocol attributes were only telling
the driver one protocol or the other. Thus things worked fine. With
pt2pt, the driver must send a PRLI for both protocols as there are no
hints on what the target supports. Thus pt2pt targets were hitting the
multiple PRLI issues.

Complete the dual PRLI support. Track explicitly whether scsi (fcp) or
nvme prli's have been sent. Accurately track protocol support detected
on each node as reported by the fabric or probed by PRLI traffic.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-20 21:11:46 -05:00
James Smart
b95e29b75d scsi: lpfc: Fix receive PRLI handling
Handling a rcv'ed PRLI incorrectly can cause the ndlp to end up in the
wrong state or the driver to ACC and PRLI when it should send LS_RJT.

The cause was due to the driver not properly looking at the PRLI type
and taking the multiple protocol support into consideration.

Resolved by adding checks in the various PRLI receive points to validate
PRLI type and reject if not valid for the enabled protocols and mode
(host vs target).

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-20 21:11:45 -05:00
Dick Kennedy
1234a6d54f scsi: lpfc: Fix crash receiving ELS while detaching driver
The driver crashes when attempting to use a freed ndpl pointer.

The pci_remove_one handler runs on a separate kernel thread. The order
of the removal is starting by freeing all of the ndlps and then
disabling interrupts. In between these two events the driver can still
receive an ELS and process it. When it tries to use the ndlp pointer
will be NULL

Change the order of the pci_remove_one vs disable interrupts so that
interrupts are disabled before the ndlp's are freed.

Cc: <stable@vger.kernel.org> # 4.12+
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02 22:46:33 -04:00
Dick Kennedy
8db1c2b3e7 scsi: lpfc: Fix handling of FCP and NVME FC4 types in Pt2Pt topology
After link bounce in a NVME Pt2Pt config, the driver managed to map the
same nport twice, resulting in multiple device nodes for the same
namespace.

In Pt2Pt, the driver must send PRLI's for both (scsi) FCP and NVME
rather than using fabric aids. The driver was inconsistent on handling
various PRLI completions, especially rejects, which had reject codes
cross the different protocol PRLI completions.

Fixed to perform the following: if nvmet mode (fc port can only be a
nvme target) - rejects all unsolicitly FCP PRLI's. Never issues a FCP
PRLI.

The multiple protocol PRLI's are sent simultaneously. However, driver
will now only state transition after both PRLI's are complete. New flags
were added to aid tracking the responses from the different PRLI's.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:37 -04:00
Dick Kennedy
d2aa48761e scsi: lpfc: Fix rediscovery on switch blade pull
When the switch blade is pulled out then plugged back in, the driver
does not issue a PLOGI to the target

When the switch blade is pulled out, it does not reset the link. The
driver ends up issuing a LOGO to the target, and finally sees devloss.
Since the driver believes that a LOGO is outstanding, it does not issue
a PLOGI to the target upon link up

Correct by placing the ndlp in UNUSED state When devloss happens in
LOGO_ISSUE state.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:35 -04:00
Dick Kennedy
2877cbffb7 scsi: lpfc: Fix loop mode target discovery
The driver does not discover targets when in loop mode.

The NLP type is correctly getting set when a fabric connection is
detected but, not for loop. The unknown NLP type means that the driver
does not issue a PRLI when in loop topology. Thus target discovery
fails.

Fix by checking the topology during discovery.  If it is loop, set the
NLP FC4 type to FCP.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:35 -04:00
Guilherme G. Piccoli
7c9fdfb700 scsi: lpfc: Avoid NULL pointer dereference in lpfc_els_abort()
We might have a NULL pring in lpfc_els_abort(), for example on error
recovery path, since queues are destroyed during error recovery
mechanism.

In this case, we should just drop the abort since the queues will be
recreated anyway. This patch just verifies for NULL pointer and stop the
abortion of the queue in case of a NULL pring.

Also, this patch converts return type of lpfc_els_abort() from int to
void, since it's not checked anywhere.

Reported-by: Harsha Thyagaraja <hathyaga@in.ibm.com>
Reported-by: Naresh Bannoth <nbannoth@in.ibm.com>
Tested-by: Raphael Silva <raphasil@linux.vnet.ibm.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Acked-by: James Smart  <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-31 22:44:13 -04:00
James Smart
dc53a61852 scsi: lpfc: Fix NVMEI's handling of NVMET's PRLI response attributes
Code review of NVMEI's FC_PORT_ROLE_NVME_DISCOVERY looked wrong.

Discussions with storage architecture team clarified NVMEI's audit of
the PRLI response port roles.  Following up discussion with code review
showed a few minor corrections were required - especially in
anticipation of NVME auto discovery.

During PRLI, NVMEI should sent prli_init - which it it does.  NVMET
should send prli_tgt and prli_disc - which it does.  When NVMEI receives
a PRLI Response now, it audits the incoming target bits and stores the
attributes in the corresponding NDLP.  Later, when NVMEI registers the
NVME rport, it uses the stored ndlp attributes to set the rport
port_roles correctly.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-16 21:24:17 -04:00
James Smart
1c5b12f763 Fix implicit logo and RSCN handling for NVMET
NVMET didn't have any RSCN handling at all and
would not execute implicit LOGO when receiving a PLOGI
from an rport that NVMET had in state UNMAPPED.

Clean up the logic in lpfc_nlp_state_cleanup for
initiators (FCP and NVME). NVMET should not respond to
RSCN including allocating new ndlps so this code was
conditionalized when nvmet_support is true.  The check
for NLP_RCV_PLOGI in lpfc_setup_disc_node was moved
below the check for nvmet_support to allow the NVMET
to recover initiator nodes correctly.  The implicit
logo was introduced with lpfc_rcv_plogi when NVMET gets
a PLOGI on an ndlp in UNMAPPED state.  The RSCN handling
was modified to not respond to an RSCN in NVMET.  Instead
NVMET sends a GID_FT and determines if an NVMEP_INITIATOR
it has is UNMAPPED but no longer in the zone membership.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
2017-04-24 09:25:49 +02:00
James Smart
d080abe0a8 scsi: lpfc: Update copyrights
Update copyrights to 2017 for all files touched in this patch set

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-02-22 18:41:44 -05:00