Commit Graph

565 Commits

Author SHA1 Message Date
James Smart
f5201f87cc scsi: lpfc: Fix duplicate wq_create_version check
During code reviews duplicate code sections were found to determine the WQ
Create version.  The duplication was potentially overriding logic that
validated page size.

Link: https://lore.kernel.org/r/20201020202719.54726-6-james.smart@broadcom.com
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-10-26 21:42:38 -04:00
James Smart
e7dab164a9 scsi: lpfc: Fix scheduling call while in softirq context in lpfc_unreg_rpi
The following call trace was seen during HBA reset testing:

BUG: scheduling while atomic: swapper/2/0/0x10000100
...
Call Trace:
dump_stack+0x19/0x1b
__schedule_bug+0x64/0x72
__schedule+0x782/0x840
__cond_resched+0x26/0x30
_cond_resched+0x3a/0x50
mempool_alloc+0xa0/0x170
lpfc_unreg_rpi+0x151/0x630 [lpfc]
lpfc_sli_abts_recover_port+0x171/0x190 [lpfc]
lpfc_sli4_abts_err_handler+0xb2/0x1f0 [lpfc]
lpfc_sli4_io_xri_aborted+0x256/0x300 [lpfc]
lpfc_sli4_sp_handle_abort_xri_wcqe.isra.51+0xa3/0x190 [lpfc]
lpfc_sli4_fp_handle_cqe+0x89/0x4d0 [lpfc]
__lpfc_sli4_process_cq+0xdb/0x2e0 [lpfc]
__lpfc_sli4_hba_process_cq+0x41/0x100 [lpfc]
lpfc_cq_poll_hdler+0x1a/0x30 [lpfc]
irq_poll_softirq+0xc7/0x100
__do_softirq+0xf5/0x280
call_softirq+0x1c/0x30
do_softirq+0x65/0xa0
irq_exit+0x105/0x110
do_IRQ+0x56/0xf0
common_interrupt+0x16a/0x16a

With the conversion to blk_io_poll for better interrupt latency in normal
cases, it introduced this code path, executed when I/O aborts or logouts
are seen, which attempts to allocate memory for a mailbox command to be
issued.  The allocation is GFP_KERNEL, thus it could attempt to sleep.

Fix by creating a work element that performs the event handling for the
remote port. This will have the mailbox commands and other items performed
in the work element, not the irq. A much better method as the "irq" routine
does not stall while performing all this deep handling code.

Ensure that allocation failures are handled and send LOGO on failure.

Additionally, enlarge the mailbox memory pool to reduce the possibility of
additional allocation in this path.

Link: https://lore.kernel.org/r/20201020202719.54726-3-james.smart@broadcom.com
Fixes: 317aeb83c9 ("scsi: lpfc: Add blk_io_poll support for latency improvment")
Cc: <stable@vger.kernel.org> # v5.9+
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-10-26 21:42:38 -04:00
James Smart
62e3a931db scsi: lpfc: Fix invalid sleeping context in lpfc_sli4_nvmet_alloc()
The following calltrace was seen:

BUG: sleeping function called from invalid context at mm/slab.h:494
...
Call Trace:
 dump_stack+0x9a/0xf0
 ___might_sleep.cold.63+0x13d/0x178
 slab_pre_alloc_hook+0x6a/0x90
 kmem_cache_alloc_trace+0x3a/0x2d0
 lpfc_sli4_nvmet_alloc+0x4c/0x280 [lpfc]
 lpfc_post_rq_buffer+0x2e7/0xa60 [lpfc]
 lpfc_sli4_hba_setup+0x6b4c/0xa4b0 [lpfc]
 lpfc_pci_probe_one_s4.isra.15+0x14f8/0x2280 [lpfc]
 lpfc_pci_probe_one+0x260/0x2880 [lpfc]
 local_pci_probe+0xd4/0x180
 work_for_cpu_fn+0x51/0xa0
 process_one_work+0x8f0/0x17b0
 worker_thread+0x536/0xb50
 kthread+0x30c/0x3d0
 ret_from_fork+0x3a/0x50

A prior patch introduced a spin_lock_irqsave(hbalock) in the
lpfc_post_rq_buffer() routine. Call trace is seen as the hbalock is held
with interrupts disabled during a GFP_KERNEL allocation in
lpfc_sli4_nvmet_alloc().

Fix by reordering locking so that hbalock not held when calling
sli4_nvmet_alloc() (aka rqb_buf_list()).

Link: https://lore.kernel.org/r/20201020202719.54726-2-james.smart@broadcom.com
Fixes: 411de511c6 ("scsi: lpfc: Fix RQ empty firmware trap")
Cc: <stable@vger.kernel.org> # v4.17+
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-10-26 21:42:37 -04:00
Tom Rix
170b7d2de2 scsi: Remove unneeded break statements
A break is not needed if it is preceded by a return or goto.

Link: https://lore.kernel.org/r/20201019142333.16584-1-trix@redhat.com
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-10-26 18:23:24 -04:00
Gustavo A. R. Silva
df561f6688 treewide: Use fallthrough pseudo-keyword
Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-08-23 17:36:59 -05:00
Dick Kennedy
24411fcd6f scsi: lpfc: Fix oops when unloading driver while running mds diags
While mds diagnostic tests are running, if the driver is requested to be
unloaded, oops or hangs are observed.  The driver doesn't terminate the
processing of diag frames when the unload is started. As such: oops may be
seen for __lpfc_sli_release_iocbq_s4 because ring memory is referenced that
was already freed; or hangs see in lpfc_nvme_wait_for_io_drain as ios no
longer complete.

If unloading, don't process diag frames. Just clean them up.

Link: https://lore.kernel.org/r/20200803210229.23063-5-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-04 20:56:57 -04:00
Ferruh Yigit
02e3e588f0 scsi: lpfc: Fix typo in comment for ULP
UPL -> ULP for "Upper Layer Protocol"

Link: https://lore.kernel.org/r/20200728145606.1601726-1-ferruh.yigit@intel.com
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-07-28 23:49:57 -04:00
Lee Jones
11d8e56bfd scsi: lpfc: Ensure variable has the same stipulations as code using it
'pg_addr' is only used when CONFIG_X86 is defined.  So only declare it if
CONFIG_X86 is defined.

Fixes the following W=1 kernel build warning(s):

 drivers/scsi/lpfc/lpfc_sli.c: In function ‘lpfc_wq_create’:
 drivers/scsi/lpfc/lpfc_sli.c:15813:16: warning: unused variable ‘pg_addr’ [-Wunused-variable]
 15813 | unsigned long pg_addr;
 | ^~~~~~~

Link: https://lore.kernel.org/r/20200721164148.2617584-37-lee.jones@linaro.org
Cc: James Smart <james.smart@broadcom.com>
Cc: Dick Kennedy <dick.kennedy@broadcom.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: linaro-mm-sig@lists.linaro.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-07-24 22:10:02 -04:00
Lee Jones
7af29d4553 scsi: lpfc: Fix-up around 120 documentation issues
Fixes the following W=1 kernel build warning(s):

 drivers/scsi/lpfc/lpfc_sli.c:257: warning: Function parameter or member 'mqe' not described in 'lpfc_sli4_mq_put'
 drivers/scsi/lpfc/lpfc_sli.c:257: warning: Excess function parameter 'wqe' description in 'lpfc_sli4_mq_put'
 drivers/scsi/lpfc/lpfc_sli.c:675: warning: Function parameter or member 'hq' not described in 'lpfc_sli4_rq_put'
 drivers/scsi/lpfc/lpfc_sli.c:675: warning: Function parameter or member 'dq' not described in 'lpfc_sli4_rq_put'
 drivers/scsi/lpfc/lpfc_sli.c:675: warning: Function parameter or member 'hrqe' not described in 'lpfc_sli4_rq_put'
 drivers/scsi/lpfc/lpfc_sli.c:675: warning: Function parameter or member 'drqe' not described in 'lpfc_sli4_rq_put'
 drivers/scsi/lpfc/lpfc_sli.c:675: warning: Excess function parameter 'q' description in 'lpfc_sli4_rq_put'
 drivers/scsi/lpfc/lpfc_sli.c:675: warning: Excess function parameter 'wqe' description in 'lpfc_sli4_rq_put'
 drivers/scsi/lpfc/lpfc_sli.c:738: warning: Function parameter or member 'hq' not described in 'lpfc_sli4_rq_release'
 drivers/scsi/lpfc/lpfc_sli.c:738: warning: Function parameter or member 'dq' not described in 'lpfc_sli4_rq_release'
 drivers/scsi/lpfc/lpfc_sli.c:738: warning: Excess function parameter 'q' description in 'lpfc_sli4_rq_release'
 drivers/scsi/lpfc/lpfc_sli.c:1021: warning: Function parameter or member 'xritag' not described in 'lpfc_test_rrq_active'
 drivers/scsi/lpfc/lpfc_sli.c:1132: warning: Function parameter or member 'piocbq' not described in '__lpfc_sli_get_els_sglq'
 drivers/scsi/lpfc/lpfc_sli.c:1132: warning: Excess function parameter 'piocb' description in '__lpfc_sli_get_els_sglq'
 drivers/scsi/lpfc/lpfc_sli.c:1207: warning: Function parameter or member 'piocbq' not described in '__lpfc_sli_get_nvmet_sglq'
 drivers/scsi/lpfc/lpfc_sli.c:1207: warning: Excess function parameter 'piocb' description in '__lpfc_sli_get_nvmet_sglq'
 drivers/scsi/lpfc/lpfc_sli.c:2243: warning: Function parameter or member 'rb_list' not described in 'lpfc_sli_hbqbuf_get'
 drivers/scsi/lpfc/lpfc_sli.c:2243: warning: Excess function parameter 'phba' description in 'lpfc_sli_hbqbuf_get'
 drivers/scsi/lpfc/lpfc_sli.c:2243: warning: Excess function parameter 'hbqno' description in 'lpfc_sli_hbqbuf_get'
 drivers/scsi/lpfc/lpfc_sli.c:2262: warning: Function parameter or member 'hrq' not described in 'lpfc_sli_rqbuf_get'
 drivers/scsi/lpfc/lpfc_sli.c:2262: warning: Excess function parameter 'hbqno' description in 'lpfc_sli_rqbuf_get'
 drivers/scsi/lpfc/lpfc_sli.c:3429: warning: Function parameter or member 't' not described in 'lpfc_poll_eratt'
 drivers/scsi/lpfc/lpfc_sli.c:3429: warning: Excess function parameter 'ptr' description in 'lpfc_poll_eratt'
 drivers/scsi/lpfc/lpfc_sli.c:4115: warning: Excess function parameter 'pring' description in 'lpfc_sli_abort_fcp_rings'
 drivers/scsi/lpfc/lpfc_sli.c:5331: warning: Excess function parameter 'mboxq' description in 'lpfc_sli4_read_fcoe_params'
 drivers/scsi/lpfc/lpfc_sli.c:5879: warning: Function parameter or member 'extnt_cnt' not described in 'lpfc_sli4_cfg_post_extnts'
 drivers/scsi/lpfc/lpfc_sli.c:5879: warning: Function parameter or member 'type' not described in 'lpfc_sli4_cfg_post_extnts'
 drivers/scsi/lpfc/lpfc_sli.c:5879: warning: Function parameter or member 'emb' not described in 'lpfc_sli4_cfg_post_extnts'
 drivers/scsi/lpfc/lpfc_sli.c:5879: warning: Function parameter or member 'mbox' not described in 'lpfc_sli4_cfg_post_extnts'
 drivers/scsi/lpfc/lpfc_sli.c:6459: warning: Function parameter or member 'pmb' not described in 'lpfc_sli4_ras_mbox_cmpl'
 drivers/scsi/lpfc/lpfc_sli.c:6459: warning: Excess function parameter 'pmboxq' description in 'lpfc_sli4_ras_mbox_cmpl'
 drivers/scsi/lpfc/lpfc_sli.c:6912: warning: Function parameter or member 'extnt_cnt' not described in 'lpfc_sli4_get_allocated_extnts'
 drivers/scsi/lpfc/lpfc_sli.c:6912: warning: Excess function parameter 'extnt_count' description in 'lpfc_sli4_get_allocated_extnts'
 drivers/scsi/lpfc/lpfc_sli.c:7064: warning: Excess function parameter 'pring' description in 'lpfc_sli4_repost_sgl_list'
 drivers/scsi/lpfc/lpfc_sli.c:7312: warning: Function parameter or member 'phba' not described in 'lpfc_init_idle_stat_hb'
 drivers/scsi/lpfc/lpfc_sli.c:8022: warning: Function parameter or member 't' not described in 'lpfc_mbox_timeout'
 drivers/scsi/lpfc/lpfc_sli.c:8022: warning: Excess function parameter 'ptr' description in 'lpfc_mbox_timeout'
 drivers/scsi/lpfc/lpfc_sli.c:8902: warning: Function parameter or member 'mboxq' not described in 'lpfc_sli_issue_mbox_s4'
 drivers/scsi/lpfc/lpfc_sli.c:8902: warning: Excess function parameter 'pmbox' description in 'lpfc_sli_issue_mbox_s4'
 drivers/scsi/lpfc/lpfc_sli.c:9413: warning: Function parameter or member 'piocbq' not described in 'lpfc_sli4_bpl2sgl'
 drivers/scsi/lpfc/lpfc_sli.c:9413: warning: Excess function parameter 'piocb' description in 'lpfc_sli4_bpl2sgl'
 drivers/scsi/lpfc/lpfc_sli.c:9518: warning: Function parameter or member 'iocbq' not described in 'lpfc_sli4_iocb2wqe'
 drivers/scsi/lpfc/lpfc_sli.c:9518: warning: Excess function parameter 'piocb' description in 'lpfc_sli4_iocb2wqe'
 drivers/scsi/lpfc/lpfc_sli.c:10212: warning: Function parameter or member 'phba' not described in '__lpfc_sli_issue_iocb'
 drivers/scsi/lpfc/lpfc_sli.c:10212: warning: Function parameter or member 'ring_number' not described in '__lpfc_sli_issue_iocb'
 drivers/scsi/lpfc/lpfc_sli.c:10212: warning: Function parameter or member 'piocb' not described in '__lpfc_sli_issue_iocb'
 drivers/scsi/lpfc/lpfc_sli.c:10212: warning: Function parameter or member 'flag' not described in '__lpfc_sli_issue_iocb'
 drivers/scsi/lpfc/lpfc_sli.c:10300: warning: Function parameter or member 'ring_number' not described in 'lpfc_sli_issue_iocb'
 drivers/scsi/lpfc/lpfc_sli.c:10300: warning: Excess function parameter 'pring' description in 'lpfc_sli_issue_iocb'
 drivers/scsi/lpfc/lpfc_sli.c:11807: warning: Function parameter or member 'cmd' not described in 'lpfc_sli_abort_taskmgmt'
 drivers/scsi/lpfc/lpfc_sli.c:11807: warning: Excess function parameter 'taskmgmt_cmd' description in 'lpfc_sli_abort_taskmgmt'
 drivers/scsi/lpfc/lpfc_sli.c:12067: warning: Function parameter or member 'ring_number' not described in 'lpfc_sli_issue_iocb_wait'
 drivers/scsi/lpfc/lpfc_sli.c:12067: warning: Excess function parameter 'pring' description in 'lpfc_sli_issue_iocb_wait'
 drivers/scsi/lpfc/lpfc_sli.c:12262: warning: Function parameter or member 'mbx_action' not described in 'lpfc_sli_mbox_sys_shutdown'
 drivers/scsi/lpfc/lpfc_sli.c:13219: warning: Function parameter or member 'irspiocbq' not described in 'lpfc_sli4_els_wcqe_to_rspiocbq'
 drivers/scsi/lpfc/lpfc_sli.c:13219: warning: Excess function parameter 'wcqe' description in 'lpfc_sli4_els_wcqe_to_rspiocbq'
 drivers/scsi/lpfc/lpfc_sli.c:13285: warning: Function parameter or member 'mcqe' not described in 'lpfc_sli4_sp_handle_async_event'
 drivers/scsi/lpfc/lpfc_sli.c:13285: warning: Excess function parameter 'cqe' description in 'lpfc_sli4_sp_handle_async_event'
 drivers/scsi/lpfc/lpfc_sli.c:13318: warning: Function parameter or member 'mcqe' not described in 'lpfc_sli4_sp_handle_mbox_event'
 drivers/scsi/lpfc/lpfc_sli.c:13318: warning: Excess function parameter 'cqe' description in 'lpfc_sli4_sp_handle_mbox_event'
 drivers/scsi/lpfc/lpfc_sli.c:13441: warning: Function parameter or member 'cq' not described in 'lpfc_sli4_sp_handle_mcqe'
 drivers/scsi/lpfc/lpfc_sli.c:13768: warning: Function parameter or member 'speq' not described in 'lpfc_sli4_sp_handle_eqe'
 drivers/scsi/lpfc/lpfc_sli.c:14126: warning: Function parameter or member 'cq' not described in 'lpfc_sli4_nvmet_handle_rcqe'
 drivers/scsi/lpfc/lpfc_sli.c:14235: warning: Function parameter or member 'cqe' not described in 'lpfc_sli4_fp_handle_cqe'
 drivers/scsi/lpfc/lpfc_sli.c:14235: warning: Excess function parameter 'eqe' description in 'lpfc_sli4_fp_handle_cqe'
 drivers/scsi/lpfc/lpfc_sli.c:14336: warning: Function parameter or member 'eq' not described in 'lpfc_sli4_hba_handle_eqe'
 drivers/scsi/lpfc/lpfc_sli.c:14808: warning: Function parameter or member 'entry_count' not described in 'lpfc_sli4_queue_alloc'
 drivers/scsi/lpfc/lpfc_sli.c:15185: warning: Function parameter or member 'type' not described in 'lpfc_cq_create'
 drivers/scsi/lpfc/lpfc_sli.c:15185: warning: Function parameter or member 'subtype' not described in 'lpfc_cq_create'
 drivers/scsi/lpfc/lpfc_sli.c:15333: warning: Function parameter or member 'type' not described in 'lpfc_cq_create_set'
 drivers/scsi/lpfc/lpfc_sli.c:15333: warning: Function parameter or member 'subtype' not described in 'lpfc_cq_create_set'
 drivers/scsi/lpfc/lpfc_sli.c:16063: warning: Function parameter or member 'subtype' not described in 'lpfc_rq_create'
 drivers/scsi/lpfc/lpfc_sli.c:16353: warning: Function parameter or member 'subtype' not described in 'lpfc_mrq_create'
 drivers/scsi/lpfc/lpfc_sli.c:16533: warning: Function parameter or member 'phba' not described in 'lpfc_eq_destroy'
 drivers/scsi/lpfc/lpfc_sli.c:16590: warning: Function parameter or member 'phba' not described in 'lpfc_cq_destroy'
 drivers/scsi/lpfc/lpfc_sli.c:16644: warning: Function parameter or member 'phba' not described in 'lpfc_mq_destroy'
 drivers/scsi/lpfc/lpfc_sli.c:16644: warning: Function parameter or member 'mq' not described in 'lpfc_mq_destroy'
 drivers/scsi/lpfc/lpfc_sli.c:16644: warning: Excess function parameter 'qm' description in 'lpfc_mq_destroy'
 drivers/scsi/lpfc/lpfc_sli.c:16698: warning: Function parameter or member 'phba' not described in 'lpfc_wq_destroy'
 drivers/scsi/lpfc/lpfc_sli.c:16754: warning: Function parameter or member 'phba' not described in 'lpfc_rq_destroy'
 drivers/scsi/lpfc/lpfc_sli.c:16754: warning: Function parameter or member 'hrq' not described in 'lpfc_rq_destroy'
 drivers/scsi/lpfc/lpfc_sli.c:16754: warning: Function parameter or member 'drq' not described in 'lpfc_rq_destroy'
 drivers/scsi/lpfc/lpfc_sli.c:16754: warning: Excess function parameter 'rq' description in 'lpfc_rq_destroy'
 drivers/scsi/lpfc/lpfc_sli.c:16940: warning: Function parameter or member 'xri' not described in '__lpfc_sli4_free_xri'
 drivers/scsi/lpfc/lpfc_sli.c:16955: warning: Function parameter or member 'xri' not described in 'lpfc_sli4_free_xri'
 drivers/scsi/lpfc/lpfc_sli.c:17002: warning: Function parameter or member 'post_cnt' not described in 'lpfc_sli4_post_sgl_list'
 drivers/scsi/lpfc/lpfc_sli.c:17002: warning: Excess function parameter 'count' description in 'lpfc_sli4_post_sgl_list'
 drivers/scsi/lpfc/lpfc_sli.c:17221: warning: Function parameter or member 'sb_count' not described in 'lpfc_sli4_post_io_sgl_list'
 drivers/scsi/lpfc/lpfc_sli.c:17451: warning: Function parameter or member 'did' not described in 'lpfc_fc_frame_to_vport'
 drivers/scsi/lpfc/lpfc_sli.c:17590: warning: Function parameter or member 'vport' not described in 'lpfc_fc_frame_add'
 drivers/scsi/lpfc/lpfc_sli.c:17817: warning: Function parameter or member 'vport' not described in 'lpfc_sli4_seq_abort_rsp'
 drivers/scsi/lpfc/lpfc_sli.c:17817: warning: Function parameter or member 'aborted' not described in 'lpfc_sli4_seq_abort_rsp'
 drivers/scsi/lpfc/lpfc_sli.c:17817: warning: Excess function parameter 'phba' description in 'lpfc_sli4_seq_abort_rsp'
 drivers/scsi/lpfc/lpfc_sli.c:18060: warning: Function parameter or member 'seq_dmabuf' not described in 'lpfc_prep_seq'
 drivers/scsi/lpfc/lpfc_sli.c:18060: warning: Excess function parameter 'dmabuf' description in 'lpfc_prep_seq'
 drivers/scsi/lpfc/lpfc_sli.c:18332: warning: Function parameter or member 'dmabuf' not described in 'lpfc_sli4_handle_received_buffer'
 drivers/scsi/lpfc/lpfc_sli.c:18655: warning: Function parameter or member 'rpi' not described in '__lpfc_sli4_free_rpi'
 drivers/scsi/lpfc/lpfc_sli.c:18683: warning: Function parameter or member 'rpi' not described in 'lpfc_sli4_free_rpi'
 drivers/scsi/lpfc/lpfc_sli.c:18714: warning: Function parameter or member 'ndlp' not described in 'lpfc_sli4_resume_rpi'
 drivers/scsi/lpfc/lpfc_sli.c:18714: warning: Function parameter or member 'cmpl' not described in 'lpfc_sli4_resume_rpi'
 drivers/scsi/lpfc/lpfc_sli.c:18714: warning: Function parameter or member 'arg' not described in 'lpfc_sli4_resume_rpi'
 drivers/scsi/lpfc/lpfc_sli.c:18714: warning: Excess function parameter 'phba' description in 'lpfc_sli4_resume_rpi'
 drivers/scsi/lpfc/lpfc_sli.c:19103: warning: Function parameter or member 'phba' not described in 'lpfc_check_next_fcf_pri_level'
 drivers/scsi/lpfc/lpfc_sli.c:19266: warning: Function parameter or member 'fcf_index' not described in 'lpfc_sli4_fcf_rr_index_set'
 drivers/scsi/lpfc/lpfc_sli.c:19295: warning: Function parameter or member 'fcf_index' not described in 'lpfc_sli4_fcf_rr_index_clear'
 drivers/scsi/lpfc/lpfc_sli.c:19331: warning: Function parameter or member 'mbox' not described in 'lpfc_mbx_cmpl_redisc_fcf_table'
 drivers/scsi/lpfc/lpfc_sli.c:20027: warning: Function parameter or member 'pwqeq' not described in 'lpfc_wqe_bpl2sgl'
 drivers/scsi/lpfc/lpfc_sli.c:20027: warning: Excess function parameter 'pwqe' description in 'lpfc_wqe_bpl2sgl'
 drivers/scsi/lpfc/lpfc_sli.c:20141: warning: Function parameter or member 'qp' not described in 'lpfc_sli4_issue_wqe'
 drivers/scsi/lpfc/lpfc_sli.c:20141: warning: Excess function parameter 'ring_number' description in 'lpfc_sli4_issue_wqe'
 drivers/scsi/lpfc/lpfc_sli.c:20434: warning: Function parameter or member 'qp' not described in '_lpfc_move_xri_pbl_to_pvt'
 drivers/scsi/lpfc/lpfc_sli.c:20552: warning: Function parameter or member 'hwqid' not described in 'lpfc_keep_pvt_pool_above_lowwm'
 drivers/scsi/lpfc/lpfc_sli.c:20552: warning: Excess function parameter 'qp' description in 'lpfc_keep_pvt_pool_above_lowwm'
 drivers/scsi/lpfc/lpfc_sli.c:20682: warning: Function parameter or member 'qp' not described in 'lpfc_get_io_buf_from_private_pool'

Link: https://lore.kernel.org/r/20200721164148.2617584-24-lee.jones@linaro.org
Cc: James Smart <james.smart@broadcom.com>
Cc: Dick Kennedy <dick.kennedy@broadcom.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: linaro-mm-sig@lists.linaro.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-07-24 22:10:00 -04:00
Lee Jones
3c1311ad83 scsi: lpfc: Remove unused variable 'pg_addr'
Fixes the following W=1 kernel build warning(s):

 drivers/scsi/lpfc/lpfc_sli.c: In function ‘lpfc_wq_create’:
 drivers/scsi/lpfc/lpfc_sli.c:15810:16: warning: variable ‘pg_addr’ set but not used [-Wunused-but-set-variable]
 15810 | unsigned long pg_addr;
 | ^~~~~~~

Link: https://lore.kernel.org/r/20200721164148.2617584-21-lee.jones@linaro.org
Cc: James Smart <james.smart@broadcom.com>
Cc: Dick Kennedy <dick.kennedy@broadcom.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: linaro-mm-sig@lists.linaro.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-07-24 22:10:00 -04:00
Colin Ian King
26e0b9aa35 scsi: lpfc: Fix inconsistent indenting
Fix smatch warning:

    drivers/scsi/lpfc/lpfc_sli.c:15156 lpfc_cq_poll_hdler() warn:
    inconsistent indenting

Link: https://lore.kernel.org/r/20200707150018.823350-1-colin.king@canonical.com
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-07-08 01:14:34 -04:00
Dick Kennedy
372c187b8a scsi: lpfc: Add an internal trace log buffer
The current logging methods typically end up requesting a reproduction with
a different logging level set to figure out what happened. This was mainly
by design to not clutter the kernel log messages with things that were
typically not interesting and the messages themselves could cause other
issues.

When looking to make a better system, it was seen that in many cases when
more data was wanted was when another message, usually at KERN_ERR level,
was logged.  And in most cases, what the additional logging that was then
enabled was typically. Most of these areas fell into the discovery machine.

Based on this summary, the following design has been put in place: The
driver will maintain an internal log (256 elements of 256 bytes).  The
"additional logging" messages that are usually enabled in a reproduction
will be changed to now log all the time to the internal log.  A new logging
level is defined - LOG_TRACE_EVENT.  When this level is set (it is not by
default) and a message marked as KERN_ERR is logged, all the messages in
the internal log will be dumped to the kernel log before the KERN_ERR
message is logged.

There is a timestamp on each message added to the internal log. However,
this timestamp is not converted to wall time when logged. The value of the
timestamp is solely to give a crude time reference for the messages.

Link: https://lore.kernel.org/r/20200630215001.70793-14-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-07-02 23:06:49 -04:00
Dick Kennedy
317aeb83c9 scsi: lpfc: Add blk_io_poll support for latency improvment
Although the existing implementation is very good at high I/O load, on
tests involving light load, especially on only a few hardware queues,
latency was a little higher than it can be due to using workqueue
scheduling. Other tasks in the system can delay handling.

Change the lower level to use irq_poll by default which uses a softirq for
I/O completion. This gives better latency as variance in when the cq is
processed is reduced over the workqueue interface. However, as high load is
better served by not being in softirq when the CPU is loaded, work queues
are still used under high I/O load.

Link: https://lore.kernel.org/r/20200630215001.70793-13-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-07-02 23:06:42 -04:00
Dick Kennedy
f0020e428a scsi: lpfc: Add support to display if adapter dumps are available
Currently, if there has been an issue whereby an adapter dump was taken,
there is nothing displayed to hint that it is present. Utilities must be
run and they must query for the status in order to then download the dump.

Add a message to the driver to query dump image presence when initializing
the SLI Port.

Link: https://lore.kernel.org/r/20200630215001.70793-12-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-07-02 23:06:41 -04:00
Dick Kennedy
28ed737440 scsi: lpfc: Fix language in 0373 message to reflect non-error message
Change vocabulary of 0373 log msg from "error" to "cmpl" The current
language of the 0373 message contains the word "error" which caused a
number of customers to inquire about the "error" and if it should be a
concern. It isn't an error, it's simply an io completion status.

Revise the message to replace the word "error" with "cmpl" for completion.

Link: https://lore.kernel.org/r/20200630215001.70793-10-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-07-02 23:06:39 -04:00
Dick Kennedy
86ee57a97a scsi: lpfc: Fix kdump hang on PPC
When the kdump kernel shuts down lpfc calls flush_work_queue on an
interrupt to schedule the cq handler. When there is only one CPU active on
the kdump kernel, it is possible for the work_on to get scheduled on a
non-active CPU causing it to never be scheduled.

When in the kdump environment, per-CPU affinity of cq's to cpus is not
necessary. In those cases, use a general queue_work rather than a
queue_work_on().

Link: https://lore.kernel.org/r/20200630215001.70793-9-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-07-02 23:06:38 -04:00
Dick Kennedy
9dace1fa91 scsi: lpfc: Fix stack trace seen while setting rrq active
Call traces have been observed running different tests that involve aborts
and setting the rrq active flag.  The lpfc_set_rrq_active routine is doing
a mempool_alloc under the soft_irq processing level. When the mempool needs
to get a new buffer from the free pool and has to wait for memory to become
free it will check the flags passed in on the alloc and dump the stack if
the thread is running in interrupt context.

Replace the GFP_KERNEL flag with GFP_ATOMIC so that the memory allocation
will not attempt to sleep if there is no mem available.

Link: https://lore.kernel.org/r/20200630215001.70793-7-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-07-02 23:06:37 -04:00
Dick Kennedy
d91e3abb68 scsi: lpfc: Fix oops due to overrun when reading SLI3 data
When using DUMP on SLI3 to read VPD and Port status data (config region
23), the adapter is overruning the kmalloc'd buffer causing havoc on other
consumers of the allocation pools.

Rework the loops processing the dump data and validate/size memory lengths
before performing bcopy.

Link: https://lore.kernel.org/r/20200630215001.70793-6-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-07-02 23:06:36 -04:00
Dick Kennedy
c93764a65b scsi: lpfc: Fix missing MDS functionality
Visual code inspection of the MDS implementation revealed two errors in
the driver:

 - The set features Feature Code had an incorrect value

 - The routine that classifies command type for cmd completions was missing
   the Send Frame definition. Send Frame is used for MDS driver loopback.

Link: https://lore.kernel.org/r/20200630215001.70793-3-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-07-02 23:06:34 -04:00
Linus Torvalds
818dbde78e SCSI misc on 20200605
This series consists of the usual driver updates (qla2xxx, ufs, zfcp,
 target, scsi_debug, lpfc, qedi, qedf, hisi_sas, mpt3sas) plus a host
 of other minor updates.  There are no major core changes in this
 series apart from a refactoring in scsi_lib.c.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXtq5QyYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishXyGAQCipTWx
 7kHKHZBCVTU133bADt3+SstLrAm8PKZEXMnP9wEAzu4QkkW8URxEDRrpu7qk5gbA
 9M/KyqvfRtTH7+BSK7M=
 =J6aO
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 :This series consists of the usual driver updates (qla2xxx, ufs, zfcp,
  target, scsi_debug, lpfc, qedi, qedf, hisi_sas, mpt3sas) plus a host
  of other minor updates.

  There are no major core changes in this series apart from a
  refactoring in scsi_lib.c"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (207 commits)
  scsi: ufs: ti-j721e-ufs: Fix unwinding of pm_runtime changes
  scsi: cxgb3i: Fix some leaks in init_act_open()
  scsi: ibmvscsi: Make some functions static
  scsi: iscsi: Fix deadlock on recovery path during GFP_IO reclaim
  scsi: ufs: Fix WriteBooster flush during runtime suspend
  scsi: ufs: Fix index of attributes query for WriteBooster feature
  scsi: ufs: Allow WriteBooster on UFS 2.2 devices
  scsi: ufs: Remove unnecessary memset for dev_info
  scsi: ufs-qcom: Fix scheduling while atomic issue
  scsi: mpt3sas: Fix reply queue count in non RDPQ mode
  scsi: lpfc: Fix lpfc_nodelist leak when processing unsolicited event
  scsi: target: tcmu: Fix a use after free in tcmu_check_expired_queue_cmd()
  scsi: vhost: Notify TCM about the maximum sg entries supported per command
  scsi: qla2xxx: Remove return value from qla_nvme_ls()
  scsi: qla2xxx: Remove an unused function
  scsi: iscsi: Register sysfs for iscsi workqueue
  scsi: scsi_debug: Parser tables and code interaction
  scsi: core: Refactor scsi_mq_setup_tags function
  scsi: core: Fix incorrect usage of shost_for_each_device
  scsi: qla2xxx: Fix endianness annotations in source files
  ...
2020-06-05 15:11:50 -07:00
James Smart
4e57e0b9f3 lpfc: fix axchg pointer reference after free and double frees
The axchg structure is a structure allocated early in the
lpfc_nvme_unsol_ls_handler() to represent the newly received exchange.
Upon error, the out_fail path in the routine unconditionally frees the
pointer, yet subsequently passes the pointer to the abort routine.
Additionally, the abort routine, lpfc_nvme_unsol_ls_issue_abort(), also
has a failure path that will attempt to delete the pointer on error.

Fix these errors by:
- Removing the unconditional free so that it stays valid if passed
  to the abort routine.
- Revise the abort routine to not free the pointer. Instead, return
  a success/failure status. Note: if success, the later completion of
  the abort frees the structure.
- Back in the unsol_ls_handler() error path, if the abort routine was
  skipped (thus no possible reference) or the abort routine returned
  error, free the pointer.

Fixes: 3a8070c567 ("lpfc: Refactor NVME LS receive handling")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-05-27 07:12:41 +02:00
James Smart
3a8070c567 lpfc: Refactor NVME LS receive handling
In preparation for supporting both intiator mode and target mode
receiving NVME LS's, commonize the existing NVME LS request receive
handling found in the base driver and in the nvmet side.

Using the original lpfc_nvmet_unsol_ls_event() and
lpfc_nvme_unsol_ls_buffer() routines as a templates, commonize the
reception of an NVME LS request. The common routine will validate the LS
request, that it was received from a logged-in node, and allocate a
lpfc_async_xchg_ctx that is used to manage the LS request. The role of
the port is then inspected to determine which handler is to receive the
LS - nvme or nvmet. As such, the nvmet handler is tied back in. A handler
is created in nvme and is stubbed out.

Signed-off-by: Paul Ely <paul.ely@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-09 16:18:34 -06:00
James Smart
7cacae2ad0 lpfc: Refactor nvmet_rcv_ctx to create lpfc_async_xchg_ctx
To support FC-NVME-2 support (actually FC-NVME (rev 1) with Ammendment 1),
both the nvme (host) and nvmet (controller/target) sides will need to be
able to receive LS requests.  Currently, this support is in the nvmet side
only. To prepare for both sides supporting LS receive, rename
lpfc_nvmet_rcv_ctx to lpfc_async_xchg_ctx and commonize the definition.

Signed-off-by: Paul Ely <paul.ely@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-09 16:18:34 -06:00
James Smart
2a1160a03a lpfc: Refactor lpfc nvme headers
A lot of files in lpfc include nvme headers, building up relationships that
require a file to change for its headers when there is no other change
necessary. It would be better to localize the nvme headers.

There is also no need for separate nvme (initiator) and nvmet (tgt)
header files.

Refactor the inclusion of nvme headers so that all nvme items are
included by lpfc_nvme.h

Merge lpfc_nvmet.h into lpfc_nvme.h so that there is a single header used
by both the nvme and nvmet sides. This prepares for structure sharing
between the two roles. Prep to add shared function prototypes for upcoming
shared routines.

Signed-off-by: Paul Ely <paul.ely@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-09 16:18:34 -06:00
Dick Kennedy
a7fc071ab5 scsi: lpfc: Fix noderef and address space warnings
Running make C=1 M=drivers/scsi/lpfc triggers sparse warnings

Correct the code generating the following errors:

 - Incompatible address space assignment without proper conversion.

 - Deference of usespace and per-cpu pointers.

Link: https://lore.kernel.org/r/20200501214310.91713-8-jsmart2021@gmail.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-05-07 22:47:26 -04:00
Dick Kennedy
88acb4d9ff scsi: lpfc: Remove unnecessary lockdep_assert_held calls
In an audit of lockdep calls in the driver, there are multiple lockdep
checks in successive calling layers. E.g. a routine checks, and then calls
a lower routine that also checks, and so on. Calling sequences result in
many redundant checks.

Refine the code to remove lower-level lockdep checks.  Update comments on
the lock, correcting a few places where lock object in comment was
incorrect.

Link: https://lore.kernel.org/r/20200501214310.91713-7-jsmart2021@gmail.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-05-07 22:47:24 -04:00
Dick Kennedy
164ba8d2df scsi: lpfc: Maintain atomic consistency of queue_claimed flag
A previous change introduced the atomic use of queue_claimed flag for eq's
and cq's.  The code works fine, but the clearing of the queue_claimed flag
is not atomic.

Change queue_claimed = 0 into xchg(&queue_claimed, 0) to be consistent for
change under atomicity.

Link: https://lore.kernel.org/r/20200501214310.91713-3-jsmart2021@gmail.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-05-07 22:47:21 -04:00
James Smart
0e75461a68 scsi: lpfc: Remove prototype FIPS/DSS options from SLI-3
During code review, identified dss feature that was a prototype only and
was never productized in SLI3. They shouldn't be there and prevents reuse
of the command areas.

Remove any code in the driver to deal with dss, including code to deal with
fips, which is associated with the dss feature.

Link: https://lore.kernel.org/r/20200322181304.37655-12-jsmart2021@gmail.com
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-03-29 18:10:58 -04:00
James Smart
1543af381e scsi: lpfc: Fix update of wq consumer index in lpfc_sli4_wq_release
The lpfc_sli4_wq_release() routine iterates for each interim value when
updating the wq consuemr index.  This wastes cycles and possibly confuses
things as thevalue itterates (and the modulo logic is being applied).

There's no reason for this. Just set it to the value from the hw.

Link: https://lore.kernel.org/r/20200322181304.37655-7-jsmart2021@gmail.com
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-03-26 23:15:10 -04:00
James Smart
4cd7089130 scsi: lpfc: Fix crash after handling a pci error
Injecting EEH on a 32GB card is causing kernel oops

The pci error handler is doing an IO flush and the offline code is also
doing an IO flush. When the 1st flush is complete the hdwq is destroyed
(freed), yet the second flush accesses the hdwq and crashes.

Added a check in lpfc_sli4_fush_io_rings to check both the HBA_IOQ_FLUSH
flag and the hdwq pointer to see if it is already set and not already
freed.

Link: https://lore.kernel.org/r/20200322181304.37655-6-jsmart2021@gmail.com
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-03-26 23:15:09 -04:00
James Smart
f861f59671 scsi: lpfc: Fix lockdep error - register non-static key
The following lockdep error was reported when unloading the lpfc driver:

  INFO: trying to register non-static key.
  the code is fine but needs lockdep annotation.
  turning off the locking correctness validator.
  ...
  Call Trace:
  dump_stack+0x96/0xe0
  register_lock_class+0x8b8/0x8c0
  ? lockdep_hardirqs_on+0x190/0x280
  ? is_dynamic_key+0x150/0x150
  ? wait_for_completion_interruptible+0x2a0/0x2a0
  ? wake_up_q+0xd0/0xd0
  __lock_acquire+0xda/0x21a0
  ? register_lock_class+0x8c0/0x8c0
  ? synchronize_rcu_expedited+0x500/0x500
  ? __call_rcu+0x850/0x850
  lock_acquire+0xf3/0x1f0
  ? del_timer_sync+0x5/0xb0
  del_timer_sync+0x3c/0xb0
  ? del_timer_sync+0x5/0xb0
  lpfc_pci_remove_one.cold.102+0x8b7/0x935 [lpfc]
  ...

Unloading the driver resulted in a call to del_timer_sync for the
cpuhp_poll_timer. However the call to setup the timer had never been made,
so the timer structures used by lockdep checking were not initialized.

Unconditionally call setup_timer for the cpuhp_poll_timer during driver
initialization. Calls to start the timer remain "as needed".

Link: https://lore.kernel.org/r/20200322181304.37655-3-jsmart2021@gmail.com
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-03-26 23:15:06 -04:00
James Smart
38503943c8 scsi: lpfc: Fix kasan slab-out-of-bounds error in lpfc_unreg_login
The following kasan bug was called out:

 BUG: KASAN: slab-out-of-bounds in lpfc_unreg_login+0x7c/0xc0 [lpfc]
 Read of size 2 at addr ffff889fc7c50a22 by task lpfc_worker_3/6676
 ...
 Call Trace:
 dump_stack+0x96/0xe0
 ? lpfc_unreg_login+0x7c/0xc0 [lpfc]
 print_address_description.constprop.6+0x1b/0x220
 ? lpfc_unreg_login+0x7c/0xc0 [lpfc]
 ? lpfc_unreg_login+0x7c/0xc0 [lpfc]
 __kasan_report.cold.9+0x37/0x7c
 ? lpfc_unreg_login+0x7c/0xc0 [lpfc]
 kasan_report+0xe/0x20
 lpfc_unreg_login+0x7c/0xc0 [lpfc]
 lpfc_sli_def_mbox_cmpl+0x334/0x430 [lpfc]
 ...

When processing the completion of a "Reg Rpi" login mailbox command in
lpfc_sli_def_mbox_cmpl, a call may be made to lpfc_unreg_login. The vpi is
extracted from the completing mailbox context and passed as an input for
the next. However, the vpi stored in the mailbox command context is an
absolute vpi, which for SLI4 represents both base + offset.  When used with
a non-zero base component, (function id > 0) this results in an
out-of-range access beyond the allocated phba->vpi_ids array.

Fix by subtracting the function's base value to get an accurate vpi number.

Link: https://lore.kernel.org/r/20200322181304.37655-2-jsmart2021@gmail.com
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-03-26 23:15:05 -04:00
James Smart
df3fe76658 scsi: lpfc: add RDF registration and Link Integrity FPIN logging
This patch modifies lpfc to register for Link Integrity events via the use
of an RDF ELS and to perform Link Integrity FPIN logging.

Specifically, the driver was modified to:

 - Format and issue the RDF ELS immediately following SCR registration.
   This registers the ability of the driver to receive FPIN ELS.

 - Adds decoding of the FPIN els into the received descriptors, with
   logging of the Link Integrity event information. After decoding, the ELS
   is delivered to the scsi fc transport to be delivered to any user-space
   applications.

 - To aid in logging, simple helpers were added to create enum to name
   string lookup functions that utilize the initialization helpers from the
   fc_els.h header.

 - Note: base header definitions for the ELS's don't populate the
   descriptor payloads. As such, lpfc creates it's own version of the
   structures, using the base definitions (mostly headers) and additionally
   declaring the descriptors that will complete the population of the ELS.

Link: https://lore.kernel.org/r/20200210173155.547-3-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-02-18 00:08:38 -05:00
James Smart
145e5a8a5c scsi: lpfc: Copyright updates for 12.6.0.4 patches
Update copyrights to 2020 for files modified in the 12.6.0.4 patch set.

Link: https://lore.kernel.org/r/20200128002312.16346-13-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-02-10 22:46:56 -05:00
James Smart
f6770e7d23 scsi: lpfc: Clean up hba max_lun_queue_depth checks
The current code does some odd +1 over maximum xri count checks and
requires that the lun_queue_count can't be bigger than maximum xri count
divided by 8. These items are bogus.

Clean the code up to cap lun_queue_count to maximum xri count.

Link: https://lore.kernel.org/r/20200128002312.16346-10-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-02-10 22:46:56 -05:00
James Smart
39c4f1a965 scsi: lpfc: Fix RQ buffer leakage when no IOCBs available
The driver is occasionally seeing the following SLI Port error, requiring
reset and reinit:

 Port Status Event: ... error 1=0x52004a01, error 2=0x218

The failure means an RQ timeout. That is, the adapter had received
asynchronous receive frames, ran out of buffer slots to place the frames,
and the driver did not replenish the buffer slots before a timeout
occurred. The driver should not be so slow in replenishing buffers that a
timeout can occur.

When the driver received all the frames of a sequence, it allocates an IOCB
to put the frames in. In a situation where there was no IOCB available for
the frame of a sequence, the RQ buffer corresponding to the first frame of
the sequence was not returned to the FW. Eventually, with enough traffic
encountering the situation, the timeout occurred.

Fix by releasing the buffer back to firmware whenever there is no IOCB for
the first frame.

[mkp: typo]

Link: https://lore.kernel.org/r/20200128002312.16346-2-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-02-10 22:46:55 -05:00
Martin K. Petersen
1c46a2cf2d block, scsi: final compat_ioctl cleanup
This series concludes the work I did for linux-5.5 on the compat_ioctl()
 cleanup, killing off fs/compat_ioctl.c and block/compat_ioctl.c by moving
 everything into drivers.
 
 Overall this would be a reduction both in complexity and line count, but
 as I'm also adding documentation the overall number of lines increases
 in the end.
 
 My plan was originally to keep the SCSI and block parts separate.
 This did not work easily because of interdependencies: I cannot
 do the final SCSI cleanup in a good way without first addressing the
 CDROM ioctls, so this is one series that I hope could be merged through
 either the block or the scsi git trees, or possibly both if you can
 pull in the same branch.
 
 The series comes in these steps:
 
 1. clean up the sg v3 interface as suggested by Linus. I have
    talked about this with Doug Gilbert as well, and he would
    rebase his sg v4 patches on top of "compat: scsi: sg: fix v3
    compat read/write interface"
 
 2. Actually moving handlers out of block/compat_ioctl.c and
    block/scsi_ioctl.c into drivers, mixed in with cleanup
    patches
 
 3. Document how to do this right. I keep getting asked about this,
    and it helps to point to some documentation file.
 
 The branch is based on another one that fixes a couple of bugs found
 during the creation of this series.
 
 Changes since v3:
   https://lore.kernel.org/lkml/20200102145552.1853992-1-arnd@arndb.de/
 
 - Move sr_compat_ioctl fixup to correct patch (Ben Hutchings)
 - Add Reviewed-by tags
 
 Changes since v2:
   https://lore.kernel.org/lkml/20191217221708.3730997-1-arnd@arndb.de/
 
 - Rebase to v5.5-rc4, which contains the earlier bugfixes
 - Fix sr_block_compat_ioctl() error handling bug found by
   Ben Hutchings
 - Fix idecd_locked_compat_ioctl() compat_ptr() bug
 - Don't try to handle HDIO_DRIVE_TASKFILE in drivers/ide
 - More documentation improvements
 
 Changes since v1:
   https://lore.kernel.org/lkml/20191211204306.1207817-1-arnd@arndb.de/
 
 - move out the bugfixes into a branch for itself
 - clean up scsi sg driver further as suggested by Christoph Hellwig
 - avoid some ifdefs by moving compat_ptr() out of asm/compat.h
 - split out the blkdev_compat_ptr_ioctl function; bug spotted by
   Ben Hutchings
 - Improve formatting of documentation
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJeDv8JAAoJEGCrR//JCVInh/oP/2BHdQvWONxwXXg2BLH7OJHm
 4PFoblxjNH/pwHm2PKh2uj8vUSgTHqID7NJChVgKZaZEJEJR7h26Sx60p+yTAepR
 /ysQiGameacJu2ZzKPYc4/S33Yu8cogQ5+DSz7mI9T5Yw0HSAE0JZ5xd9KIZ+/u8
 6k65ujd9kCxCgmtXrpx+7JFF0xb+urXKCvjdt2EfQ1ZmuMX5rDG/bTNg5JJ50shW
 vb7Z8hCpfW61ux8M/dgIh4WvUf0SA7FOy8WF1Km9gNhKGj41Arb2lmX1Jb4jDgjl
 DGsXQupyMVwigp5N37H3o1MamX/C8S49c16/zJQcJj64xX7WdxhE5kR8JIf+36Tf
 2l4wpaqVukXPvXkdv76Y472fKoOMZATF6kCoEPG3gXW9oxXDs5d2ofALfO3uNfLB
 PC4hzorw6bBlt67qAqERft2cxMMi9xSYfYZ8jD+eSF8WLL7xIcEazZqq8dKz7O00
 Qqx6+jzejT18av7cPfLjnupZg+mEcxDbPeuCgjrbhR8lcUI4DBu379RiTaQanvyR
 W00zwqCZWYnNJoha8u3AKsRcfL8eziF+/K9k+lCuhXeQBI4ipFJ03wAD4TWCigCS
 N7AikOdLzGVxE+2IfeCXPDKpdT6hFnjulnyDEgc/7jwHzcVF3MQBHwXiKhWHEUvT
 /AzAtKAiivp+uaMgAbzd
 =cddL
 -----END PGP SIGNATURE-----

Merge tag 'block-ioctl-cleanup-5.6' into 5.6/scsi-queue

Pull compat_ioctl cleanup from Arnd. Here's his description:

This series concludes the work I did for linux-5.5 on the compat_ioctl()
cleanup, killing off fs/compat_ioctl.c and block/compat_ioctl.c by moving
everything into drivers.

Overall this would be a reduction both in complexity and line count, but
as I'm also adding documentation the overall number of lines increases
in the end.

My plan was originally to keep the SCSI and block parts separate.
This did not work easily because of interdependencies: I cannot
do the final SCSI cleanup in a good way without first addressing the
CDROM ioctls, so this is one series that I hope could be merged through
either the block or the scsi git trees, or possibly both if you can
pull in the same branch.

The series comes in these steps:

1. clean up the sg v3 interface as suggested by Linus. I have
   talked about this with Doug Gilbert as well, and he would
   rebase his sg v4 patches on top of "compat: scsi: sg: fix v3
   compat read/write interface"

2. Actually moving handlers out of block/compat_ioctl.c and
   block/scsi_ioctl.c into drivers, mixed in with cleanup
   patches

3. Document how to do this right. I keep getting asked about this,
   and it helps to point to some documentation file.

The branch is based on another one that fixes a couple of bugs found
during the creation of this series.

Changes since v3:
  https://lore.kernel.org/lkml/20200102145552.1853992-1-arnd@arndb.de/

- Move sr_compat_ioctl fixup to correct patch (Ben Hutchings)
- Add Reviewed-by tags

Changes since v2:
  https://lore.kernel.org/lkml/20191217221708.3730997-1-arnd@arndb.de/

- Rebase to v5.5-rc4, which contains the earlier bugfixes
- Fix sr_block_compat_ioctl() error handling bug found by
  Ben Hutchings
- Fix idecd_locked_compat_ioctl() compat_ptr() bug
- Don't try to handle HDIO_DRIVE_TASKFILE in drivers/ide
- More documentation improvements

Changes since v1:
  https://lore.kernel.org/lkml/20191211204306.1207817-1-arnd@arndb.de/

- move out the bugfixes into a branch for itself
- clean up scsi sg driver further as suggested by Christoph Hellwig
- avoid some ifdefs by moving compat_ptr() out of asm/compat.h
- split out the blkdev_compat_ptr_ioctl function; bug spotted by
  Ben Hutchings
- Improve formatting of documentation

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-01-10 00:14:46 -05:00
James Smart
999fbbceb8 scsi: lpfc: Fix MDS Latency Diagnostics Err-drop rates
When running Cisco-MDS diagnostics which perform driver-level frame loop
back, the switch is reporting errors. Diagnostic has a limit on latency
that is not being met by the driver.

The requirement of Latency frames is that they should be responded back by
the host with a maximum delay of few hundreds of microseconds. If the
switch doesn't get response frames within this time frame, it fails the
test.

Test is failing as the lpfc-wq workqueue was overwhelmed by the packet rate
and in some cases, the work element yielded to other kernel elements.

To resolve, reduce the outstanding load allowed by the adapter. This
ensures the driver spends a reasonable amount of time doing loopback and
can do so such that latency values can be met.  Load is managed by reducing
the number of receive buffers posted such that the link can be
backpressured to reduce load.

Link: https://lore.kernel.org/r/20191218235808.31922-9-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-12-21 13:42:42 -05:00
James Smart
f3d0a8acc5 scsi: lpfc: Fix missing check for CSF in Write Object Mbox Rsp
When the WriteObject mailbox response has change_status set to is 0x2
(Firmware Reset) or 0x04 (Port Migration Reset), the CSF field should also
be checked to see if a fw reset is sufficient to enable all new features in
the updated firmware image. If not, a fw reset would start the new
firmware, but with a feature level equal to existing firmware.  To enable
the new features, a chip reset/pci slot reset would be required.

Check the CSF bit when change_status is 0x2 or 0x4 to know whether to
perform a pci bus reset or fw reset.

Link: https://lore.kernel.org/r/20191218235808.31922-4-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-12-21 13:42:42 -05:00
Colin Ian King
291c254845 scsi: lpfc: fix spelling mistakes of asynchronous
There are spelling mistakes of asynchronous in a lpfc_printf_log message
and comments. Fix these.

Link: https://lore.kernel.org/r/20191218084301.627555-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-12-19 18:47:07 -05:00
Linus Torvalds
ef2cc88e2a SCSI misc on 20191130
This is mostly update of the usual drivers: aacraid, ufs, zfcp,
 NCR5380, lpfc, qla2xxx, smartpqi, hisi_sas, target, mpt3sas, pm80xx
 plus a whole load of minor updates and fixes.  The two major core
 changes are Al Viro's reworking of sg's handling of copy to/from user,
 Ming Lei's removal of the host busy counter to avoid contention in the
 multiqueue case and Damien Le Moal's fixing of residual tracking
 across error handling.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXeKvHCYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishQJMAQDAjlAi
 SNfbyndMqyf+rZGWufDI+43Up1VvW9GeWJHeDwEAxfO5XZsCks2uT8UxXhpEp9L7
 HkiUww3zbcgl0FWFkUM=
 =cdVU
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This is mostly update of the usual drivers: aacraid, ufs, zfcp,
  NCR5380, lpfc, qla2xxx, smartpqi, hisi_sas, target, mpt3sas, pm80xx
  plus a whole load of minor updates and fixes.

  The major core changes are Al Viro's reworking of sg's handling of
  copy to/from user, Ming Lei's removal of the host busy counter to
  avoid contention in the multiqueue case and Damien Le Moal's fixing of
  residual tracking across error handling"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (251 commits)
  scsi: bnx2fc: timeout calculation invalid for bnx2fc_eh_abort()
  scsi: target: core: Fix a pr_debug() argument
  scsi: iscsi: Don't send data to unbound connection
  scsi: target: iscsi: Wait for all commands to finish before freeing a session
  scsi: target: core: Release SPC-2 reservations when closing a session
  scsi: target: core: Document target_cmd_size_check()
  scsi: bnx2i: fix potential use after free
  Revert "scsi: qla2xxx: Fix memory leak when sending I/O fails"
  scsi: NCR5380: Add disconnect_mask module parameter
  scsi: NCR5380: Unconditionally clear ICR after do_abort()
  scsi: NCR5380: Call scsi_set_resid() on command completion
  scsi: scsi_debug: num_tgts must be >= 0
  scsi: lpfc: use hdwq assigned cpu for allocation
  scsi: arcmsr: fix indentation issues
  scsi: qla4xxx: fix double free bug
  scsi: pm80xx: Modified the logic to collect fatal dump
  scsi: pm80xx: Tie the interrupt name to the module instance
  scsi: pm80xx: Controller fatal error through sysfs
  scsi: pm80xx: Do not request 12G sas speeds
  scsi: pm80xx: Cleanup command when a reset times out
  ...
2019-12-02 13:37:02 -08:00
James Smart
4583a4f66b scsi: lpfc: use hdwq assigned cpu for allocation
Looking at the recent conversion from smp_processor_id() to
raw_smp_processor_id(), realized that the allocation should be based on the
cpu the hdwq is bound to, not the executing cpu.

Revise to pull cpu number from the hdwq

Fixes: 765ab6cdac ("scsi: lpfc: Fix a kernel warning triggered by lpfc_get_sgl_per_hdwq()")
Link: https://lore.kernel.org/r/20191116003847.6141-1-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-19 21:37:34 -05:00
James Smart
d480e57809 scsi: lpfc: fix inlining of lpfc_sli4_cleanup_poll_list()
Compilation can fail due to having an inline function reference where the
function body is not present.

Fix by removing the inline tag.

Fixes: 93a4d6f401 ("scsi: lpfc: Add registration for CPU Offline/Online events")

Link: https://lore.kernel.org/r/20191111230401.12958-4-jsmart2021@gmail.com
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-12 22:21:33 -05:00
Bart Van Assche
765ab6cdac scsi: lpfc: Fix a kernel warning triggered by lpfc_get_sgl_per_hdwq()
Fix the following kernel bug report:

BUG: using smp_processor_id() in preemptible [00000000] code: systemd-udevd/954

Fixes: d79c9e9d4b ("scsi: lpfc: Support dynamic unbounded SGL lists on G7 hardware.")
Link: https://lore.kernel.org/r/20191107052158.25788-2-bvanassche@acm.org
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-08 21:28:23 -05:00
James Smart
171f6c4194 scsi: lpfc: Add enablement of multiple adapter dumps
Some adapters support the ability to hold multiple adapter dumps on the
adapter flash. Some adapters default to enabling this feature while others
default to single-dump.

Make support uniform by enabling dual dump by default.

Link: https://lore.kernel.org/r/20191105005708.7399-11-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06 00:04:04 -05:00
James Smart
93a4d6f401 scsi: lpfc: Add registration for CPU Offline/Online events
The recent affinitization didn't address cpu offlining/onlining.  If an
interrupt vector is shared and the low order cpu owning the vector is
offlined, as interrupts are managed, the vector is taken offline. This
causes the other CPUs sharing the vector will hang as they can't get io
completions.

Correct by registering callbacks with the system for Offline/Online
events. When a cpu is taken offline, its eq, which is tied to an interrupt
vector is found. If the cpu is the "owner" of the vector and if the
eq/vector is shared by other CPUs, the eq is placed into a polled mode.
Additionally, code paths that perform io submission on the "sharing CPUs"
will check the eq state and poll for completion after submission of new io
to a wq that uses the eq.

Similarly, when a cpu comes back online and owns an offlined vector, the eq
is taken out of polled mode and rearmed to start driving interrupts for eq.

Link: https://lore.kernel.org/r/20191105005708.7399-9-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06 00:04:04 -05:00
James Smart
7cfd5639d9 scsi: lpfc: Fix duplicate unreg_rpi error in port offline flow
If the driver receives a login that is later then LOGO'd by the remote port
(aka ndlp), the driver, upon the completion of the LOGO ACC transmission,
will logout the node and unregister the rpi that is being used for the
node.  As part of the unreg, the node's rpi value is replaced by the
LPFC_RPI_ALLOC_ERROR value.  If the port is subsequently offlined, the
offline walks the nodes and ensures they are logged out, which possibly
entails unreg'ing their rpi values.  This path does not validate the node's
rpi value, thus doesn't detect that it has been unreg'd already.  The
replaced rpi value is then used when accessing the rpi bitmask array which
tracks active rpi values.  As the LPFC_RPI_ALLOC_ERROR value is not a valid
index for the bitmask, it may fault the system.

Revise the rpi release code to detect when the rpi value is the replaced
RPI_ALLOC_ERROR value and ignore further release steps.

Link: https://lore.kernel.org/r/20191105005708.7399-2-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06 00:04:03 -05:00
James Smart
95bfc6d8ad scsi: lpfc: Make FW logging dynamically configurable
Currently, the FW logging facility is a load/boot time parameter which
requires the driver to be unloaded/reloaded or the system rebooted in order
to change its configuration.

Convert the logging facility to allow dynamic enablement and configuration.
Specifically:

 - Convert the feature so that it can be enabled dynamically via an
   attribute.  Additionally, the size of the buffer can be configured
   dynamically.

 - Add locks around states that now may be changing.

 - Tie the feature into debugfs so that the logs can be read at any time.

Link: https://lore.kernel.org/r/20191018211832.7917-12-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24 21:02:05 -04:00
James Smart
8156d378c4 scsi: lpfc: Revise interrupt coalescing for missing scenarios
The existing "auto eq delay" mechanism was sometimes skipping over an EQ,
not ramping the coalescing down under light load fast enough, and in other
cases never kicked in as cpu sharing by multiple vectors didn't quite add
up right.

Tweak the interrupt mechanism such that:

 - Add a flag to the EQ to force checking for colaescing values when being
   serviced in the interrupt handler.  The flag will be set by any CQ bound
   to the EQ whenever the number of CQ elements process in a single scan
   meets or exceeds the hardware queue notify level. E.g. there's a
   significant number of completions happening.

 - In the heartbeat work item that checks coalescing:

   - Replace the structure that was counting the number of EQs that
     interrupted on a single cpu with a new structure that looks at the EQ
     to see whether EQ currently has a coalescing value (thus it should be
     re-evaluate) or was marked by the new flag indicating heavy
     completions.

   - When a cpu, which may be servicing multiple vectors, had at least 1 EQ
     that should be checked, a new coalescing delay is calculated based on
     the number of interrupts that occurred on the cpu.

   - The new coalescing value is then applied to the EQs that had
     interrupted on the cpu.

Link: https://lore.kernel.org/r/20191018211832.7917-11-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24 21:02:05 -04:00
James Smart
324e1c4020 scsi: lpfc: Fix bad ndlp ptr in xri aborted handling
In cases where I/O may be aborted, such as driver unload or link bounces,
the system will crash based on a bad ndlp pointer.

Example:
  RIP: 0010:lpfc_sli4_abts_err_handler+0x15/0x140 [lpfc]
  ...
  lpfc_sli4_io_xri_aborted+0x20d/0x270 [lpfc]
  lpfc_sli4_sp_handle_abort_xri_wcqe.isra.54+0x84/0x170 [lpfc]
  lpfc_sli4_fp_handle_cqe+0xc2/0x480 [lpfc]
  __lpfc_sli4_process_cq+0xc6/0x230 [lpfc]
  __lpfc_sli4_hba_process_cq+0x29/0xc0 [lpfc]
  process_one_work+0x14c/0x390

Crash was caused by a bad ndlp address passed to I/O indicated by the XRI
aborted CQE.  The address was not NULL so the routine deferenced the ndlp
ptr. The bad ndlp also caused the lpfc_sli4_io_xri_aborted to call an
erroneous io handler.  Root cause for the bad ndlp was an lpfc_ncmd that
was aborted, put on the abort_io list, completed, taken off the abort_io
list, sent to lpfc_release_nvme_buf where it was put back on the abort_io
list because the lpfc_ncmd->flags setting LPFC_SBUF_XBUSY was not cleared
on the final completion.

Rework the exchange busy handling to ensure the flags are properly set for
both scsi and nvme.

Fixes: c490850a09 ("scsi: lpfc: Adapt partitioned XRI lists to efficient sharing")
Cc: <stable@vger.kernel.org> # v5.1+
Link: https://lore.kernel.org/r/20191018211832.7917-6-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24 21:02:04 -04:00