The current version number is 0.2. That driver version was assigned more
than nine years ago. A version number that is not updated while the driver
is updated is not useful. Hence remove the driver version number from the
UFS driver. See also commit e0eca63e34 ("[SCSI] ufs: Separate PCI code
into glue driver").
Link: https://lore.kernel.org/r/20220419225811.4127248-16-bvanassche@acm.org
Tested-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Pass the actual type to config_scaling_param callback as the third argment
instead of a void pointer. Remove a superfluous NULL pointer check from
ufs_qcom_config_scaling_param().
Link: https://lore.kernel.org/r/20220419225811.4127248-15-bvanassche@acm.org
Tested-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Make it easier to verify for humans that ufshcd_init_pwr_dev_param()
initializes all structure members. This patch does not change any
functionality.
Link: https://lore.kernel.org/r/20220419225811.4127248-14-bvanassche@acm.org
Tested-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Commit 5b44a07b6b ("scsi: ufs: Remove pre-defined initial voltage values
of device power") removed the code that uses the UFS_VREG_VCC* constants
and also the code that sets the min_uV and max_uV member variables. Hence
also remove these constants and that member variable.
Link: https://lore.kernel.org/r/20220419225811.4127248-13-bvanassche@acm.org
Tested-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
It is confusing that ufshcd_is_hba_active() returns 'true' if the HBA is
not active. Clear up this confusion by inverting the return value of
ufshcd_is_hba_active(). This patch does not change any functionality.
Link: https://lore.kernel.org/r/20220419225811.4127248-12-bvanassche@acm.org
Tested-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Declare the quirks array and also its 'model' member const to make it
explicit that these are not modified.
Link: https://lore.kernel.org/r/20220419225811.4127248-11-bvanassche@acm.org
Tested-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Since struct ufs_dev_fix contains quirk information, rename it into struct
ufs_dev_quirk.
Link: https://lore.kernel.org/r/20220419225811.4127248-10-bvanassche@acm.org
Tested-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Since these two macros reduce code readability, remove them.
Link: https://lore.kernel.org/r/20220419225811.4127248-9-bvanassche@acm.org
Tested-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Use get_unaligned_be16(...) instead of the equivalent but harder to read
be16_to_cpup((__be16 *)...).
Link: https://lore.kernel.org/r/20220419225811.4127248-8-bvanassche@acm.org
Tested-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
ufshcd_lrb.sense_buffer is NULL if ufshcd_lrb.cmd is NULL and
ufshcd_lrb.sense_buffer points at cmd->sense_buffer if ufshcd_lrb.cmd is
set. In other words, the ufshcd_lrb.sense_buffer member is identical to
cmd->sense_buffer. Hence this patch that removes the
ufshcd_lrb.sense_buffer structure member.
Link: https://lore.kernel.org/r/20220419225811.4127248-7-bvanassche@acm.org
Tested-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
ufshcd_lrb.sense_bufflen is set but never read. Hence remove this struct
member.
Link: https://lore.kernel.org/r/20220419225811.4127248-6-bvanassche@acm.org
Tested-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Reviewed-by: Keoseong Park <keosung.park@samsung.com>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Convert "if (expr) return true; else return false;" into "return expr;" if
either 'expr' is a boolean expression or the return type of the function is
'bool'.
Link: https://lore.kernel.org/r/20220419225811.4127248-5-bvanassche@acm.org
Tested-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Keoseong Park <keosung.park@samsung.com>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Remove "? true : false" if the preceding expression yields a boolean or if
the result of the expression is assigned to a boolean since in these two
cases the "? true : false" part is superfluous.
Link: https://lore.kernel.org/r/20220419225811.4127248-4-bvanassche@acm.org
Tested-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Declare this function static since it is only used inside the ufshcd.c
source file.
Link: https://lore.kernel.org/r/20220419225811.4127248-3-bvanassche@acm.org
Tested-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Change one occurrence of "adpater" into "adapter".
Link: https://lore.kernel.org/r/20220419225811.4127248-2-bvanassche@acm.org
Tested-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
UFS devices are expected to clear fDeviceInit flag in single digit
milliseconds. Current values of 5 to 10 millisecond sleep add to increased
latency during the initialization and resume path. This CL lowers the sleep
range to 500 to 1000 microseconds.
Link: https://lore.kernel.org/r/20220421002429.3136933-1-bvanassche@acm.org
Acked-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Konstantin Vyshetsky <vkon@google.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The replyPostRegisterIndex array of struct MPT3SAS_ADAPTER stores iomem
resource addresses. Fix its declaration to annotate it with __iomem to
avoid sparse warnings for writel() calls using the stored addresses.
Link: https://lore.kernel.org/r/20220307234854.148145-6-damien.lemoal@opensource.wdc.com
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In mpt3sas_scsih_event_callback(), fix a sparse warning when testing the
event log code value by replacing the use of a pointer to the address
storing the event log code with a log code local variable. Doing so,
le32_to_cpu() is used when the log code value is assigned, avoiding a
sparse warning.
Link: https://lore.kernel.org/r/20220307234854.148145-5-damien.lemoal@opensource.wdc.com
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The functions _base_readl_aero() and _base_readl() used for an adapter
base_readl() method are implemented using a regular readl() call which
internally performs a conversion to CPU endianness (le32_to_cpu()) of
the values read. The users of the ioc base_readl() method should thus
not convert again the values read using le16_to_cpu().
Fixing this removes sparse warnings.
Link: https://lore.kernel.org/r/20220307234854.148145-4-damien.lemoal@opensource.wdc.com
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
writel() internally executes cpu_to_le32() to convert the value being
written to little endian. The caller should thus not use this conversion
function for the value passed to writel(). Remove the cpu_to_le32() calls
in _base_put_smid_scsi_io_atomic(), _base_put_smid_fast_path_atomic(),
_base_put_smid_hi_priority_atomic() _base_put_smid_default_atomic() and
_base_handshake_req_reply_wait().
Link: https://lore.kernel.org/r/20220307234854.148145-3-damien.lemoal@opensource.wdc.com
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The TaskMID field of struct Mpi2SCSITaskManagementRequest_t is a 16-bit
little endian value. Fix the search loop in _ctl_set_task_mid() to add a
cpu_to_le16() conversion before checking the value of TaskMID to avoid
sparse warnings. While at it, simplify the search loop code to remove an
unnecessarily complicated if condition.
Link: https://lore.kernel.org/r/20220307234854.148145-2-damien.lemoal@opensource.wdc.com
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The maximum SCSI device queue depth of 1024 is not sufficient for RAID
volumes configured behind Broadcom RAID controllers. For a 16-drive RAID
volume with a device queue depth limit of 1024, only 64 I/Os (1024/16) can
be issued per drive. That is not sufficient to saturate the device.
Link: https://lore.kernel.org/r/20220414103601.140687-1-sumit.saxena@broadcom.com
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Replace 'if (!is_zero_ether_addr(mac))' with 'else' for simplification and
add curly brackets according to the kernel coding style:
"Do not unnecessarily use braces where a single statement will do."
...
"This does not apply if only one branch of a conditional statement is a
single statement; in the latter case use braces in both branches"
Please refer to:
https://www.kernel.org/doc/html/v5.17-rc8/process/coding-style.html
Link: https://lore.kernel.org/r/20220408081237.14037-1-hanyihao@vivo.com
Signed-off-by: Yihao Han <hanyihao@vivo.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Update copyrights to 2022 for files modified in the 14.2.0.2 patch set.
Link: https://lore.kernel.org/r/20220412222008.126521-27-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Update lpfc version to 14.2.0.2.
Link: https://lore.kernel.org/r/20220412222008.126521-26-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
ELS_ID field for ELS_REQUEST64_WQE is not filled out when FIP is not
supported by the HBA.
Move setting ELS_ID logic into __lpfc_sli_prep_els_req_rsp_s4(), and remove
ELS_ID FIP dependency logic from lpfc_sli_prep_wqe().
Introduce PLOGI ELS_ID and as a result update wqe_els_id_MASK because PLOGI
ELS_ID = 0x4 occupies up to 3 bits.
While in __lpfc_sli_prep_els_req_rsp_s4() routine, remove SLI3-isms.
Link: https://lore.kernel.org/r/20220412222008.126521-25-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
READ_STATUS tx/rx byte count fields are now expanded to 64 bit wide
counters. This patch updates logic for the READ_STATUS mbox command when
displaying tx_word and rx_word statistics in sysfs.
Link: https://lore.kernel.org/r/20220412222008.126521-24-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Do not rely on vendor version field of the CSPs to determine if we are in a
FA-PWWN environment. Instead, use the following procedure:
First, during HBA initialization, driver does a READ_CONFIG to determine if
FA-PWWN is configured on the HBA. A LPFC_FAWWPN_CONFIG hba_flag is set
accordingly.
Next, when the link comes up before the driver gets a link up event, the
firmware logs into the fabric with FA-PWWN. If the fabric port does not
support FA-PWWN, the driver will get a Misconfigured FA-WWN async event
before the link up. A LPFC_FAWWPN_FABRIC hba_flag will be set accordingly.
Finally, if the fabric supports FA-PWWN, the firmware will replace its CSPs
WWN with the Fabric Assigned ones. Then after link up, the driver will
retrieve the Fabric Assigned WWN when it does a READ_SPARAM mbox command.
Link: https://lore.kernel.org/r/20220412222008.126521-23-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The intention of this patch is to refactor mailbox memory allocation and
cleanup steps in one routine respectively to prevent memory leaks or memory
errors related to mailbox commands. There are trivial localized fixes as
well.
Provide lpfc_mbox_rsrc_prep() - this routine allocates the dmabuf and the
mbuf associated with it. It also catches allocation errors and returns
status.
Provide lpfc_mbox_rsrc_cleanup() - this routine verifies a dmabuf exists
and if so releases the associated mbuf and the dmabuf memory. It then sets
the ctx_buf to NULL and releases the mailbox memory to the mailbox pool.
Link: https://lore.kernel.org/r/20220412222008.126521-22-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The lpfc_iocbq data structure has void * pointers that are overloaded to be
as many as 8 different data types and the driver translates the void * by
casting. This patch removes the void * pointers by declaring the specific
types needed by the driver. It also expands the context_un to include more
seldom used pointer types to save structure bytes. It also groups the u8
types together to pack the 8 bytes needed. This work allows the lpfc_iocbq
data structure to be more strongly typed and keeps it from being allocated
from the 512 byte slab.
[mkp: rolled in zeroday fix]
Link: https://lore.kernel.org/r/20220412222008.126521-21-jsmart2021@gmail.com
Reported-by: kernel test robot <lkp@intel.com>
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
During an NVMe target reboot, the target may initialize itself as FCP only
during the first RSCN and shortly after trigger a second RSCN claiming NVMe
support. The timing of these RSCNs occur before FCP-PRLI for the first
RSCN completes leading discovery issues over NVMe.
Change RSCN and NVME-PRLI send logic based on a new FC_RSCN_MEMENTO flag
that signals when lpfc_end_rscn() is completed and serves as a memento that
discovery was started from RSCN.
Link: https://lore.kernel.org/r/20220412222008.126521-20-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add new FC-4 type 0x60 Application Services for fabric registration when
VMID is enabled.
Modified rft struture to indicate __be format. Removed redundant ipReg
variable as it was not used.
Link: https://lore.kernel.org/r/20220412222008.126521-19-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
FDMI FC-4 Active Type for vports mistakenly shows NVMe support.
Add a check to only set the NVMe support bit for the physical port.
Link: https://lore.kernel.org/r/20220412222008.126521-18-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Trunk port FDMI supported port speed shows single port supported speed
rather than the trunked port speed.
Modify supported port speed logic calculation during registration.
Link: https://lore.kernel.org/r/20220412222008.126521-17-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The following was seen with CMF enabled:
BUG: using smp_processor_id() in preemptible
code: systemd-udevd/31711
kernel: caller is lpfc_update_cmf_cmd+0x214/0x420 [lpfc]
kernel: CPU: 12 PID: 31711 Comm: systemd-udevd
kernel: Call Trace:
kernel: <TASK>
kernel: dump_stack_lvl+0x44/0x57
kernel: check_preemption_disabled+0xbf/0xe0
kernel: lpfc_update_cmf_cmd+0x214/0x420 [lpfc]
kernel: lpfc_nvme_fcp_io_submit+0x23b4/0x4df0 [lpfc]
this_cpu_ptr() calls smp_processor_id() in a preemptible context.
Fix by using per_cpu_ptr() with raw_smp_processor_id() instead.
Link: https://lore.kernel.org/r/20220412222008.126521-16-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
lpfc_cgn_calc_crc32() is returning 32 bits, and lpfc_cgn_update_stat() was
using u16 to store the crc32 value. Correct by redeclaring the local
variable to u32.
Link: https://lore.kernel.org/r/20220412222008.126521-15-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
lpfc_refresh_params() can be called for an async event handler. This could
potentially override the value initialized by lpfc_cmf_setup().
Move module parameter check to lpfc_refresh_params().
Link: https://lore.kernel.org/r/20220412222008.126521-14-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The command IOCB ndlp pointer is overwritten in lpfc_issue_els_rdf(), and
the original ndlp pointer is stored ahead of time.
This null ptr assignment can be safely removed.
Link: https://lore.kernel.org/r/20220412222008.126521-13-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In P2P topology, a target controller reboot sometimes results in not
reestablishing a login because the ndlp is stuck in LOGO state.
Fix by transitioning to NPR state if we get link down before LOGO
completes.
Link: https://lore.kernel.org/r/20220412222008.126521-12-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If lpfc_sli_issue_iocb() fails, then the fc_prli_sent is never decremented.
Move the fc_prli_sent++ to after a guaranteed IOCB submit.
Link: https://lore.kernel.org/r/20220412222008.126521-11-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There is a potential memory leak in lpfc_ignore_els_cmpl() and
lpfc_els_rsp_reject() that was allocated from NPIV PLOGI_RJT
(lpfc_rcv_plogi()'s login_mbox).
Check if cmdiocb->context_un.mbox was allocated in lpfc_ignore_els_cmpl(),
and then free it back to phba->mbox_mem_pool along with mbox->ctx_buf for
service parameters.
For lpfc_els_rsp_reject() failure, free both the ctx_buf for service
parameters and the login_mbox.
Link: https://lore.kernel.org/r/20220412222008.126521-10-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If lpfc_issue_els_flogi() fails and returns non-zero status, the node
reference count is decremented to trigger the release of the nodelist
structure. However, if there is a prior registration or dev-loss-evt work
pending, the node may be released prematurely. When dev-loss-evt
completes, the released node is referenced causing a use-after-free null
pointer dereference.
Similarly, when processing non-zero ELS PLOGI completion status in
lpfc_cmpl_els_plogi(), the ndlp flags are checked for a transport
registration before triggering node removal. If dev-loss-evt work is
pending, the node may be released prematurely and a subsequent call to
lpfc_dev_loss_tmo_handler() results in a use after free ndlp dereference.
Add test for pending dev-loss before decrementing the node reference count
for FLOGI, PLOGI, PRLI, and ADISC handling.
Link: https://lore.kernel.org/r/20220412222008.126521-9-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Previous topologies may no longer be in fabric mode, so clear FC_FABRIC in
fc_flag for every new FLOGI.
Link: https://lore.kernel.org/r/20220412222008.126521-8-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
During stress I/O tests with 500+ vports, hard LOCKUP call traces are
observed.
CPU A:
native_queued_spin_lock_slowpath+0x192
_raw_spin_lock_irqsave+0x32
lpfc_handle_fcp_err+0x4c6
lpfc_fcp_io_cmd_wqe_cmpl+0x964
lpfc_sli4_fp_handle_cqe+0x266
__lpfc_sli4_process_cq+0x105
__lpfc_sli4_hba_process_cq+0x3c
lpfc_cq_poll_hdler+0x16
irq_poll_softirq+0x76
__softirqentry_text_start+0xe4
irq_exit+0xf7
do_IRQ+0x7f
CPU B:
native_queued_spin_lock_slowpath+0x5b
_raw_spin_lock+0x1c
lpfc_abort_handler+0x13e
scmd_eh_abort_handler+0x85
process_one_work+0x1a7
worker_thread+0x30
kthread+0x112
ret_from_fork+0x1f
Diagram of lockup:
CPUA CPUB
---- ----
lpfc_cmd->buf_lock
phba->hbalock
lpfc_cmd->buf_lock
phba->hbalock
Fix by reordering the taking of the lpfc_cmd->buf_lock and phba->hbalock in
lpfc_abort_handler routine so that it tries to take the lpfc_cmd->buf_lock
first before phba->hbalock.
Link: https://lore.kernel.org/r/20220412222008.126521-7-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
During heavy I/O stress tests with 100+ vports and cable pulls, it may take
a while before the vport logs back into the fabric to resume I/O.
Currently, the driver immediately fails the I/O with DID_ERROR.
Change behavior to return DID_REQUEUE, and rely on SCSI layer's max retry
of 5 before erroring out the I/O.
Link: https://lore.kernel.org/r/20220412222008.126521-6-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
It's possible that the fcpCntl0 reserved field is allocated non-zero.
For certain target storage arrays this could cause problems expecting
reserved fields to be all zero.
SLI3 path already allocates fcp_cmnd buffer with dma_pool_zalloc() in
lpfc_new_scsi_buf_s3. The fcpCntl0 field itself is never proactively set
throughout the SCSI I/O path. Thus, we only change the SLI4 fcp_cmnd
buffer allocation to dma_pool_zalloc.
Link: https://lore.kernel.org/r/20220412222008.126521-5-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The lpfc_sli4_ras_setup() routine is only called from the
lpfc_pci_probe_one_s4() routine, which means diagnostic fw logging
initialization only occurs during probing.
Thus, any path involving a reset of the HBA that restarts the state of the
SLI port does not reinitialize diagnostic fw logging.
Move lpfc_sli4_ras_setup() into lpfc_sli4_hba_setup() so that the
LOWLEVEL_SET_DIAG_LOG_OPTIONS mailbox command can be sent after a function
reset.
Link: https://lore.kernel.org/r/20220412222008.126521-4-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In an attempt to log message 0126 with LOG_TRACE_EVENT, the following hard
lockup call trace hangs the system.
Call Trace:
_raw_spin_lock_irqsave+0x32/0x40
lpfc_dmp_dbg.part.32+0x28/0x220 [lpfc]
lpfc_cmpl_els_fdisc+0x145/0x460 [lpfc]
lpfc_sli_cancel_jobs+0x92/0xd0 [lpfc]
lpfc_els_flush_cmd+0x43c/0x670 [lpfc]
lpfc_els_flush_all_cmd+0x37/0x60 [lpfc]
lpfc_sli4_async_event_proc+0x956/0x1720 [lpfc]
lpfc_do_work+0x1485/0x1d70 [lpfc]
kthread+0x112/0x130
ret_from_fork+0x1f/0x40
Kernel panic - not syncing: Hard LOCKUP
The same CPU tries to claim the phba->port_list_lock twice.
Move the cfg_log_verbose checks as part of the lpfc_printf_vlog() and
lpfc_printf_log() macros before calling lpfc_dmp_dbg(). There is no need
to take the phba->port_list_lock within lpfc_dmp_dbg().
Link: https://lore.kernel.org/r/20220412222008.126521-3-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>