Commit Graph

871202 Commits

Author SHA1 Message Date
James Smart
93a4d6f401 scsi: lpfc: Add registration for CPU Offline/Online events
The recent affinitization didn't address cpu offlining/onlining.  If an
interrupt vector is shared and the low order cpu owning the vector is
offlined, as interrupts are managed, the vector is taken offline. This
causes the other CPUs sharing the vector will hang as they can't get io
completions.

Correct by registering callbacks with the system for Offline/Online
events. When a cpu is taken offline, its eq, which is tied to an interrupt
vector is found. If the cpu is the "owner" of the vector and if the
eq/vector is shared by other CPUs, the eq is placed into a polled mode.
Additionally, code paths that perform io submission on the "sharing CPUs"
will check the eq state and poll for completion after submission of new io
to a wq that uses the eq.

Similarly, when a cpu comes back online and owns an offlined vector, the eq
is taken out of polled mode and rearmed to start driving interrupts for eq.

Link: https://lore.kernel.org/r/20191105005708.7399-9-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06 00:04:04 -05:00
James Smart
b9da814cd5 scsi: lpfc: Clarify FAWNN error message
Current message on FAWWN events is rather cryptic.

Expand the message to clarify its meaning.

Link: https://lore.kernel.org/r/20191105005708.7399-8-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06 00:04:04 -05:00
James Smart
69641627c6 scsi: lpfc: Sync with FC-NVMe-2 SLER change to require Conf with SLER
Prior to the last FC-NVME-2 draft, SLER and CONF were independent.  SLER
now requires CONF to be set.

Revise the NVME PRLI checking to look for both inorder to enable SLER.

Link: https://lore.kernel.org/r/20191105005708.7399-7-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06 00:04:04 -05:00
James Smart
dda5bdf074 scsi: lpfc: Fix dynamic fw log enablement check
The recently posted patch had a typo that incorrectly tested the receiving
function.

Fix the typo (change == to !=)

Fixes: 95bfc6d8ad ("scsi: lpfc: Make FW logging dynamically configurable")
Link: https://lore.kernel.org/r/20191105005708.7399-6-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06 00:04:04 -05:00
James Smart
2332e6e475 scsi: lpfc: Fix unexpected error messages during RSCN handling
During heavy RCN activity and log_verbose = 0 we see these messages:

  2754 PRLI failure DID:521245 Status:x9/xb2c00, data: x0
  0231 RSCN timeout Data: x0 x3
  0230 Unexpected timeout, hba link state x5

This is due to delayed RSCN activity.

Correct by avoiding the timeout thus the messages by restarting the
discovery timeout whenever an rscn is received.

Filter PRLI responses such that severity depends on whether expected for
the configuration or not. For example, PRLI errors on a fabric will be
informational (they are expected), but Point-to-Point errors are not
necessarily expected so they are raised to an error level.

Link: https://lore.kernel.org/r/20191105005708.7399-5-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06 00:04:04 -05:00
James Smart
6c1e803eac scsi: lpfc: Fix kernel crash at lpfc_nvme_info_show during remote port bounce
When reading sysfs nvme_info file while a remote port leaves and comes
back, a NULL pointer is encountered. The issue is due to ndlp list
corruption as the the nvme_info_show does not use the same lock as the rest
of the code.

Correct by removing the rcu_xxx_lock calls and replace by the host_lock and
phba->hbaLock spinlocks that are used by the rest of the driver.  Given
we're called from sysfs, we are safe to use _irq rather than _irqsave.

Link: https://lore.kernel.org/r/20191105005708.7399-4-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06 00:04:03 -05:00
James Smart
6bfb162082 scsi: lpfc: Fix configuration of BB credit recovery in service parameters
The driver today is reading service parameters from the firmware and then
overwriting the firmware-provided values with values of its own.  There are
some switch features that require preliminary FLOGI's that are
switch-specific and done prior to the actual fabric FLOGI for traffic.  The
fw will perform those FLOGIs and will revise the service parameters for the
features configured. As the driver later overwrites those values with its
own values, it misconfigures things like BBSCN use by doing so.

Correct by eliminating the driver-overwrite of firmware values. The driver
correctly re-reads the service parameters after each link up to obtain the
latest values from firmware.

Link: https://lore.kernel.org/r/20191105005708.7399-3-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06 00:04:03 -05:00
James Smart
7cfd5639d9 scsi: lpfc: Fix duplicate unreg_rpi error in port offline flow
If the driver receives a login that is later then LOGO'd by the remote port
(aka ndlp), the driver, upon the completion of the LOGO ACC transmission,
will logout the node and unregister the rpi that is being used for the
node.  As part of the unreg, the node's rpi value is replaced by the
LPFC_RPI_ALLOC_ERROR value.  If the port is subsequently offlined, the
offline walks the nodes and ensures they are logged out, which possibly
entails unreg'ing their rpi values.  This path does not validate the node's
rpi value, thus doesn't detect that it has been unreg'd already.  The
replaced rpi value is then used when accessing the rpi bitmask array which
tracks active rpi values.  As the LPFC_RPI_ALLOC_ERROR value is not a valid
index for the bitmask, it may fault the system.

Revise the rpi release code to detect when the rpi value is the replaced
RPI_ALLOC_ERROR value and ignore further release steps.

Link: https://lore.kernel.org/r/20191105005708.7399-2-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06 00:04:03 -05:00
Al Viro
1feefb7ec2 scsi: sg: sg_ioctl(): get rid of access_ok()
simply not needed there - neither sg_new_read() nor sg_new_write() need
it.

Link: https://lore.kernel.org/r/20191017193925.25539-8-viro@ZenIV.linux.org.uk
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06 00:04:03 -05:00
Al Viro
a64e5a8685 scsi: sg: sg_write(): get rid of access_ok()/__copy_from_user()/__get_user()
Just use plain copy_from_user() and get_user().  Note that while a
buf-derived pointer gets stored into ->dxferp, all places that actually use
the resulting value feed it either to import_iovec() or to
import_single_range(), and both will do validation.

Link: https://lore.kernel.org/r/20191017193925.25539-7-viro@ZenIV.linux.org.uk
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06 00:04:03 -05:00
Al Viro
c8c12792d5 scsi: sg: sg_read(): get rid of access_ok()/__copy_..._user()
Use copy_..._user() instead, both in sg_read() and in sg_read_oxfer().  And
don't open-code memdup_user()...

Link: https://lore.kernel.org/r/20191017193925.25539-6-viro@ZenIV.linux.org.uk
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06 00:04:03 -05:00
Al Viro
d9fc5617bc scsi: sg: sg_new_write(): don't bother with access_ok
... just use copy_from_user().  We copy only SZ_SG_IO_HDR bytes, so that
would, strictly speaking, loosen the check.  However, for call chains via
->write() the caller has actually checked the entire range and SG_IO passes
exactly SZ_SG_IO_HDR for count.  So no visible behaviour changes happen if
we check only what we really need for copyin.

Link: https://lore.kernel.org/r/20191017193925.25539-5-viro@ZenIV.linux.org.uk
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06 00:04:03 -05:00
Al Viro
c35a5cfb41 scsi: sg: sg_read(): simplify reading ->pack_id of userland sg_io_hdr_t
We don't need to allocate a temporary buffer and read the entire structure
in it, only to fetch a single field and free what we'd allocated.  Just use
get_user() and be done with it...

Link: https://lore.kernel.org/r/20191017193925.25539-4-viro@ZenIV.linux.org.uk
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06 00:04:03 -05:00
Al Viro
062c9d4527 scsi: sg: sg_write(): __get_user() can fail...
Link: https://lore.kernel.org/r/20191017193925.25539-3-viro@ZenIV.linux.org.uk
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06 00:04:03 -05:00
Al Viro
a62726cb9c scsi: sg: sg_new_write(): replace access_ok() + __copy_from_user() with copy_from_user()
Link: https://lore.kernel.org/r/20191017193925.25539-2-viro@ZenIV.linux.org.uk
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06 00:04:03 -05:00
Al Viro
a16a47416d scsi: sg: sg_ioctl(): fix copyout handling
First of all, __put_user() can fail with access_ok() succeeding.  And
access_ok() + __copy_to_user() is spelled copy_to_user()...

__put_user() *can* fail with access_ok() succeeding...

Link: https://lore.kernel.org/r/20191017193925.25539-1-viro@ZenIV.linux.org.uk
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06 00:04:03 -05:00
Pan Bian
ec990306f7 scsi: fnic: fix use after free
The memory chunk io_req is released by mempool_free. Accessing
io_req->start_time will result in a use after free bug. The variable
start_time is a backup of the timestamp. So, use start_time here to
avoid use after free.

Link: https://lore.kernel.org/r/1572881182-37664-1-git-send-email-bianpan2016@163.com
Signed-off-by: Pan Bian <bianpan2016@163.com>
Reviewed-by: Satish Kharat <satishkh@cisco.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06 00:04:02 -05:00
Bart Van Assche
b1335f5b04 scsi: core: scsi_trace: Use get_unaligned_be*()
This patch fixes an unintended sign extension on left shifts. From Colin
King: "Shifting a u8 left will cause the value to be promoted to an
integer. If the top bit of the u8 is set then the following conversion to
an u64 will sign extend the value causing the upper 32 bits to be set in
the result."

Fix this by using get_unaligned_be*() instead.

Fixes: bf81623542 ("[SCSI] add scsi trace core functions and put trace points")
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Douglas Gilbert <dgilbert@interlog.com>
Link: https://lore.kernel.org/r/20191101211447.187151-1-bvanassche@acm.org
Reported-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-01 20:28:03 -04:00
Saurav Girepunje
64dc4f346b scsi: csiostor: Return value not required for csio_dfs_destroy
Only csio_hw_free() calling csio_dfs_destroy() and it is not checking
return value. So remove the return from csio_dfs_destroy().

Link: https://lore.kernel.org/r/20191028194234.GA27848@saurav
Signed-off-by: Saurav Girepunje <saurav.girepunje@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-01 20:22:05 -04:00
Saurav Girepunje
75a740e6e8 scsi: csiostor: Fix NULL check before debugfs_remove_recursive
debugfs_remove_recursive() has taken the null pointer into account.  Remove
the null check before debugfs_remove_recursive().

Link: https://lore.kernel.org/r/20191026195625.GA22455@saurav
Signed-off-by: Saurav Girepunje <saurav.girepunje@gmail.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-01 20:19:11 -04:00
Saurav Girepunje
62fb8b34be scsi: pm8001: Fix Use plain integer as NULL pointer
Replace assignment of 0 to pointer with NULL assignment.

Link: https://lore.kernel.org/r/20191025135010.GA6191@saurav
Signed-off-by: Saurav Girepunje <saurav.girepunje@gmail.com>
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-01 20:16:35 -04:00
Ming Lei
6eb045e092 scsi: core: avoid host-wide host_busy counter for scsi_mq
It isn't necessary to check the host depth in scsi_queue_rq() any more
since it has been respected by blk-mq before calling scsi_queue_rq() via
getting driver tag.

Lots of LUNs may attach to same host and per-host IOPS may reach millions,
so we should avoid expensive atomic operations on the host-wide counter in
the IO path.

This patch implements scsi_host_busy() via blk_mq_tagset_busy_iter() with
one scsi command state for reading the count of busy IOs for scsi_mq.

It is observed that IOPS is increased by 15% in IO test on scsi_debug (32
LUNs, 32 submit queues, 1024 can_queue, libaio/dio) in a dual-socket
system.

Cc: Jens Axboe <axboe@kernel.dk>
Cc: Ewan D. Milne <emilne@redhat.com>
Cc: Omar Sandoval <osandov@fb.com>,
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>,
Cc: James Bottomley <james.bottomley@hansenpartnership.com>,
Cc: Christoph Hellwig <hch@lst.de>,
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Laurence Oberman <loberman@redhat.com>
Cc: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20191025065855.6309-1-ming.lei@redhat.com
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-01 20:12:50 -04:00
Bart Van Assche
7f674c38a3 scsi: ufs: Use enum dev_cmd_type where appropriate
Declare all variables that hold dev_cmd_type values as an enum instead of
as an int.

Cc: Yaniv Gardi <ygardi@codeaurora.org>
Cc: Subhash Jadavani <subhashj@codeaurora.org>
Cc: Stanley Chu <stanley.chu@mediatek.com>
Cc: Avri Altman <avri.altman@wdc.com>
Cc: Tomas Winkler <tomas.winkler@intel.com>
Link: https://lore.kernel.org/r/20191029230710.211926-3-bvanassche@acm.org
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-31 22:16:05 -04:00
Bart Van Assche
d0e9760de3 scsi: ufs: Fix kernel-doc warnings
Fix the following three kernel-doc warnings:

drivers/scsi/ufs/ufs_bsg.c:165: warning: Function parameter or member 'hba' not described in 'ufs_bsg_remove'
drivers/scsi/ufs/ufshcd.c:5789: warning: Function parameter or member 'cmd_type' not described in 'ufshcd_issue_devman_upiu_cmd'
drivers/scsi/ufs/ufshcd.c:5789: warning: Excess function parameter 'msgcode' description in 'ufshcd_issue_devman_upiu_cmd'

Cc: Yaniv Gardi <ygardi@codeaurora.org>
Cc: Subhash Jadavani <subhashj@codeaurora.org>
Cc: Stanley Chu <stanley.chu@mediatek.com>
Cc: Avri Altman <avri.altman@wdc.com>
Cc: Tomas Winkler <tomas.winkler@intel.com>
Link: https://lore.kernel.org/r/20191029230710.211926-2-bvanassche@acm.org
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-31 22:15:13 -04:00
Bean Huo
059efd847a scsi: ufs: delete redundant function ufshcd_def_desc_sizes()
There is no need to call ufshcd_def_desc_sizes() in ufshcd_init(), since
descriptor lengths will be checked and initialized later in
ufshcd_init_desc_sizes().

Fixes: a4b0e8a4e92b1b(scsi: ufs: Factor out ufshcd_read_desc_param)
Link: https://lore.kernel.org/r/BN7PR08MB5684A3ACE214C3D4792CE729DB610@BN7PR08MB5684.namprd08.prod.outlook.com
Signed-off-by: Bean Huo <beanhuo@micron.com>
Acked-by: Avri Altman <avri.altman.wdc.com>
Reviewed-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-31 22:13:37 -04:00
Steffen Maier
100843f176 scsi: zfcp: trace channel log even for FCP command responses
While v2.6.26 commit b75db73159 ("[SCSI] zfcp: Add qtcb dump to hba debug
trace") is right that we don't want to flood the (payload) trace ring
buffer, we don't trace successful FCP command responses by default.  So we
can include the channel log for problem determination with failed responses
of any FSF request type.

Fixes: b75db73159 ("[SCSI] zfcp: Add qtcb dump to hba debug trace")
Fixes: a54ca0f62f ("[SCSI] zfcp: Redesign of the debug tracing for HBA records.")
Cc: <stable@vger.kernel.org> #2.6.38+
Link: https://lore.kernel.org/r/e37597b5c4ae123aaa85fd86c23a9f71e994e4a9.1572018132.git.bblock@linux.ibm.com
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-28 22:16:15 -04:00
Steffen Maier
e76acc5194 scsi: zfcp: proper indentation to reduce confusion in zfcp_erp_required_act
No functional change.

The unary not operator only applies to the sub expression before the
logical or. So we return early if (not running) or failed.

Link: https://lore.kernel.org/r/df4f897f6e83eaa528465d0858d5a22daac47a2f.1572018132.git.bblock@linux.ibm.com
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-28 22:16:15 -04:00
Benjamin Block
48910f8c35 scsi: zfcp: move maximum age of diagnostic buffers into a per-adapter variable
Replace the static define (ZFCP_DIAG_MAX_AGE) with a per-adapter variable
(${adapter}->diagnostics->max_age). This new variable is exported via
sysfs, along with other, already existing adapter variables, and can both
be read and written. This way users can choose how much time should pass
between refreshes of diagnostic buffers. The default value for the age
remains to be five seconds.

By setting this new variable to 0, the caching of diagnostic buffers for
userspace accesses can also be completely removed.

All diagnostic buffers of a given adapter are subject to this setting in
the same way.

Link: https://lore.kernel.org/r/b1d0977cc884b16dd4ca6418e4320c56a4c31d63.1572018132.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-28 22:16:15 -04:00
Benjamin Block
8a72db70b5 scsi: zfcp: implicitly refresh config-data diagnostics when reading sysfs
Adds implicit updates of cached diagnostics via Exchange Config Data when
reading sysfs attributes interfacing them. Right now this only affects the
new B2B-Credit diagnostic attribute.

This uses the same mechanism previously also used for cached diagnostics
of Exchange Port Data.

Link: https://lore.kernel.org/r/60a94f55f2630b74b468fed5f39880208abb2679.1572018132.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-28 22:16:15 -04:00
Benjamin Block
5a2876f0d1 scsi: zfcp: introduce sysfs interface to read the local B2B-Credit
In addition to the diagnostic data from the local SFP transceiver this
patch adds an interface to read the advertised buffer-to-buffer credit from
the local FC_Port.

With this patch the userspace-interface will only read data stored in the
corresponding "diagnostic buffer" (that was stored during completion of a
previous Exchange Config Data command). Implicit updating will follow later
in this series.

Link: https://lore.kernel.org/r/8a53aef87b53c50cfb1a3425b799bacb6f82b832.1572018132.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-28 22:16:15 -04:00
Benjamin Block
8155eb0785 scsi: zfcp: implicitly refresh port-data diagnostics when reading sysfs
This patch adds implicit updates to the sysfs entries that read the
diagnostic data stored in the "caching buffer" for Exchange Port Data.

An update is triggered once the buffer is older than ZFCP_DIAG_MAX_AGE
milliseconds (5s). This entails sending an Exchange Port Data command to
the FCP-Channel, and during its ingress path updating the cached data and
the timestamp. To prevent multiple concurrent userspace-applications from
triggering this update in parallel we synchronize all of them using a
wait-queue (waiting threads are interruptible; the updating thread is not).

Link: https://lore.kernel.org/r/c145b5cfc99a63b6a018b1184fbd27bb09c955f5.1572018132.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-28 22:16:15 -04:00
Benjamin Block
6028f7c4cd scsi: zfcp: introduce sysfs interface for diagnostics of local SFP transceiver
This adds an interface to read the diagnostics of the local SFP transceiver
of an FCP-Channel from userspace. This comes in the form of new sysfs
entries that are attached to the CCW device representing the FCP
device. Each type of data gets its own sysfs entry; the whole collection of
entries is pooled into a new child-directory of the CCW device node:
"diagnostics".

Adds sysfs entries for:
 * sfp_invalid:    boolean value evaluating to whether the following 5
                   fields are invalid; {0, 1}; 1 - invalid
 * temperature:    transceiver temp.; unit 1/256°C;
                   range [-128°C, +128°C]
 * vcc:            supply voltage; unit 100μV; range [0, 6.55V]
 * tx_bias:        transmitter laser bias current; unit 2μA;
                   range [0, 131mA]
 * tx_power:       coupled TX output power; unit 0.1μW; range [0, 6.5mW]
 * rx_power:       received optical power; unit 0.1μW; range [0, 6.5mW]

 * optical_port:   boolean value evaluating to whether the FCP-Channel has
                   an optical port; {0, 1}; 1 - optical
 * fec_active:     boolean value evaluating to whether 16G FEC is active;
                   {0, 1}; 1 - active
 * port_tx_type:   nibble describing the port type; {0, 1, 2, 3};
                   0 - unknown,             1 - short wave,
                   2 - long wave LC 1310nm, 3 - long wave LL 1550nm
 * connector_type: two bits describing the connector type; {0, 1};
                   0 - unknown,             1 - SFP+

This is only supported if the FCP-Channel in turn supports reporting the
SFP Diagnostic Data, otherwise read() on these new entries will return
EOPNOTSUPP (this affects only adapters older than FICON Express8S, on
Mainframe generations older than z14). Other possible errors for read()
include ENOLINK, ENODEV and ENOMEM.

With this patch the userspace-interface will only read data stored in
the corresponding "diagnostic buffer" (that was stored during completion
of an previous Exchange Port Data command). Implicit updating will
follow later in this series.

Link: https://lore.kernel.org/r/1f9cce7c829c881e7d71a3f10c5b57f3dd84ab32.1572018132.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-28 22:16:15 -04:00
Benjamin Block
a10a61e807 scsi: zfcp: support retrieval of SFP Data via Exchange Port Data
A new FCP channel feature allows us to read the diagnostics from our local
SFP transceivers. To make use of that add a flag
(FSF_FEATURE_REQUEST_SFP_DATA) to the feature-set we request from the FCP
channel. Whether the channel actually implements this can be determined via
an other new flag (FSF_FEATURE_REPORT_SFP_DATA), that is set in the
adapter_features field of the adapter structure after Exchange Config Data
finished.

Also add the corresponding definitions in the QTCB Bottom for Exchange Port
Data. These new definitions are only valid, if FSF_FEATURE_REPORT_SFP_DATA
is set.

Link: https://lore.kernel.org/r/ee1eba4de71eb06b4d82207ad4f428429346156f.1572018132.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-28 22:16:15 -04:00
Benjamin Block
088210233e scsi: zfcp: add diagnostics buffer for exchange config data
In the same vein as the previous patch, add diagnostic data capture for the
Exchange Config Data command.

Link: https://lore.kernel.org/r/7d8ac0a6cad403fa8f8b888693476a84e80a277b.1572018131.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-28 22:16:14 -04:00
Benjamin Block
7e418833e6 scsi: zfcp: diagnostics buffer caching and use for exchange port data
The FCP channel exposes two central interfaces to receive information about
the local FCP-Adapter/-Port: Exchange Port and Exchange Config Data. Using
these commands can negatively impact the adapter if we allow them to be
sent at a very high rate.

The later parts of this patchset will introduce new user-interfaces to
receive more diagnostics from the adapter. To prevent any negative impact
from using those, this patch adds a simple caching-mechanism that will
prevent a malicious/faulty userspace-application from generating an
abnormal high amount of Exchange Port/Config Data traffic.

Relevant diagnostic data that is received via Exchange Config/Port Data is
cached in buffers associated with the corresponding adapter-struct.  Each
buffer is associated with a timestamp that signals how old the data is,
and, added via a following patch in this series, lets userspace-interfaces
determine when the data is too old and needs to be updated.

Buffer-updates are made during the normal response path of the
corresponding command. With this patch only the output of the Exchange Port
Data command is captured.

Link: https://lore.kernel.org/r/054ca020ce0a53dc0d9176428bea373898944e6a.1572018130.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-28 22:16:14 -04:00
Benjamin Block
92953c6e0a scsi: zfcp: signal incomplete or error for sync exchange config/port data
Adds a new FSF-Request status flag (ZFCP_STATUS_FSFREQ_XDATAINCOMPLETE)
that signal that the data received using Exchange Config Data or Exchange
Port Data was incomplete. This new flags is set in the respective handlers
during the response path.

With this patch, only the synchronous FSF-functions for each command got
support for the new flag, otherwise it is transparent.

Together with this new flag and already existing status flags the
synchronous FSF-functions are extended to now detect whether the received
data is complete, incomplete or completely invalid (this includes cases
where a command ran into a timeout). This is now signaled back to the
caller, where previously only failures on the request path would result in
a bad return-code.

For complete data the return-code remains 0. For incomplete data a new
return-code -EAGAIN is added to the function-interface. For completely
invalid data the already existing return-code -EIO is reused - formerly
this was used to signal failures on the request path.

Existing callers of the FSF-functions are adjusted so that they behave as
before for return-code 0 and -EAGAIN, to not change the user-interface. As
-EIO existed all along, it was already exposed to the user - and needed
handling - and will now also be exposed in this new special case.

Link: https://lore.kernel.org/r/e14f0702fa2b00a4d1f37c7981a13f2dd1ea2c83.1572018130.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-28 22:16:14 -04:00
YueHaibing
7b10db5552 scsi: lpfc: Make lpfc_debugfs_ras_log_data static
Fix sparse warning:

drivers/scsi/lpfc/lpfc_debugfs.c:2083:1: warning:
 symbol 'lpfc_debugfs_ras_log_data' was not declared. Should it be static?

Link: https://lore.kernel.org/r/20191028132556.16272-1-yuehaibing@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-28 22:02:28 -04:00
Saurav Girepunje
c3e5aac3e2 scsi: lpfc: Fix NULL check before mempool_destroy is not needed
mempool_destroy has taken null pointer check into account. Remove the
redundant check.

Link: https://lore.kernel.org/r/20191026194712.GA22249@saurav
Signed-off-by: Saurav Girepunje <saurav.girepunje@gmail.com>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-28 21:52:38 -04:00
James Smart
5792a0e816 scsi: lpfc: fix spelling error in MAGIC_NUMER_xxx
convert MAGIC_NUMER_xxx to MAGIC_NUMBER_xxx

Link: https://lore.kernel.org/r/20191025184342.6623-1-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-28 21:49:22 -04:00
James Smart
9e2edb41c3 scsi: lpfc: fix build error of lpfc_debugfs.c for vfree/vmalloc
lpfc_debufs.c was missing include of vmalloc.h when compiled on PPC.

Add missing header.

Link: https://lore.kernel.org/r/20191025182530.26653-1-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-28 21:48:37 -04:00
Luo Jiaxing
f873b66119 scsi: hisi_sas: Record the phy down event in debugfs
The number of phy down reflects the quality of the link between SAS
controller and disk. In order to allow the user to confirm the link quality
of the system, we record the number of phy down for each phy.

The user can check the current phy down count by reading the debugfs file
corresponding to the specific phy, or clear the phy down count by writing 0
to the debugfs file.

Link: https://lore.kernel.org/r/1571926105-74636-19-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24 21:31:15 -04:00
Luo Jiaxing
cabe7c10c9 scsi: hisi_sas: Delete the debugfs folder of hisi_sas when the probe fails
Although if the debugfs initialization fails, we will delete the debugfs
folder of hisi_sas, but we did not consider the scenario where debugfs was
successfully initialized, but the probe failed for other reasons. We found
out that hisi_sas folder is still remain after the probe failed.

When probe fail, we should delete debugfs folder to avoid the above issue.

Link: https://lore.kernel.org/r/1571926105-74636-18-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24 21:31:15 -04:00
Luo Jiaxing
8f6432986e scsi: hisi_sas: Add ability to have multiple debugfs dumps
We use the module parameter debugfs_dump_count to manage the upper limit of
the memory block for multiple dumps.

Link: https://lore.kernel.org/r/1571926105-74636-17-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24 21:31:15 -04:00
Luo Jiaxing
905ab01faf scsi: hisi_sas: Add module parameter for debugfs dump count
We still only use dump index #0 however.

Link: https://lore.kernel.org/r/1571926105-74636-16-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24 21:31:15 -04:00
Luo Jiaxing
a70e33eae3 scsi: hisi_sas: Allocate memory for multiple dumps of debugfs
We add multiple dumps for debugfs, but only allocate memory this time and
only dump #0.

Link: https://lore.kernel.org/r/1571926105-74636-15-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24 21:31:15 -04:00
Luo Jiaxing
357e4fc7a9 scsi: hisi_sas: Add debugfs file structure for ITCT cache
Create a file structure which was used to save the memory address for
ITCT cache at debugfs. This structure is bound to the corresponding debugfs
file, it can help callback function of debugfs file to get what it needs.

Link: https://lore.kernel.org/r/1571926105-74636-14-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24 21:31:14 -04:00
Luo Jiaxing
b714dd8f36 scsi: hisi_sas: Add debugfs file structure for IOST cache
Create a file structure which was used to save the memory address for IOST
cache at debugfs. This structure is bound to the corresponding debugfs
file, it can help callback function of debugfs file to get what it needs.

Link: https://lore.kernel.org/r/1571926105-74636-13-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24 21:31:14 -04:00
Luo Jiaxing
0161d55f23 scsi: hisi_sas: Add debugfs file structure for ITCT
Create a file structure which was used to save the memory address for ITCT
at debugfs. This structure is bound to the corresponding debugfs file, it
can help callback function of debugfs file to get what it needs.

Link: https://lore.kernel.org/r/1571926105-74636-12-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24 21:31:14 -04:00
Luo Jiaxing
e15f2e2dff scsi: hisi_sas: Add debugfs file structure for IOST
Create a file structure which was used to save the memory address for IOST
at debugfs. This structure is bound to the corresponding debugfs file, it
can help callback function of debugfs file to get what it needs.

Link: https://lore.kernel.org/r/1571926105-74636-11-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24 21:31:14 -04:00
Luo Jiaxing
1f66e1fd26 scsi: hisi_sas: Add debugfs file structure for port
Create a file structure which was used to save the memory address and phy
pointer for port at debugfs. This structure is bound to the corresponding
debugfs file, it can help callback function of debugfs file to get what it
need.

Link: https://lore.kernel.org/r/1571926105-74636-10-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24 21:31:14 -04:00