This fixes an issue hitting the BUG_ON() in ibmvfc_do_work(). When going
through a host action of IBMVFC_HOST_ACTION_RESET, we change the action to
IBMVFC_HOST_ACTION_TGT_DEL, then drop the host lock, and reset the CRQ,
which changes the host state to IBMVFC_NO_CRQ. If, prior to setting the
host state to IBMVFC_NO_CRQ, ibmvfc_init_host() is called, it can then end
up changing the host action to IBMVFC_HOST_ACTION_INIT. If we then change
the host state to IBMVFC_NO_CRQ, we will then hit the BUG_ON().
Make a couple of changes to avoid this. Leave the host action to be
IBMVFC_HOST_ACTION_RESET or IBMVFC_HOST_ACTION_REENABLE until after we drop
the host lock and reset or reenable the CRQ. Also harden the host state
machine to ensure we cannot leave the reset / reenable state until we've
finished processing the reset or reenable.
Link: https://lore.kernel.org/r/20210413001009.902400-1-tyreld@linux.ibm.com
Fixes: 73ee5d8672 ("[SCSI] ibmvfc: Fix soft lockup on resume")
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
[tyreld: added fixes tag]
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
[mkp: fix comment checkpatch warnings]
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Pull 5.12/scsi-fixes into the 5.13 SCSI tree to provide a baseline for
some UFS changes that would otherwise cause conflicts during the
merge.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
During MQ enablement of the ibmvfc driver ibmvfc_wait_for_ops() was
missed. This function is responsible for waiting on commands to complete
that match a certain criteria such as LUN or cancel key. The implementation
as is only scans the CRQ for events ignoring any sub-queues and as a result
will exit successfully without doing anything when operating in MQ
channelized mode.
Check the MQ and channel use flags to determine which queues are
applicable, and scan each queue accordingly. Note in MQ mode SCSI commands
are only issued down sub-queues and the CRQ is only used for driver
specific management commands. As such the CRQ events are ignored when
operating in MQ mode with channels.
Link: https://lore.kernel.org/r/20210319205029.312969-3-tyreld@linux.ibm.com
Fixes: 9000cb998b ("scsi: ibmvfc: Enable MQ and set reasonable defaults")
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
For various EH activities the ibmvfc driver uses ibmvfc_wait_for_ops() to
wait for the completion of commands that match a given criteria be it
cancel key, or specific LUN. With recent changes commands are completed
outside the lock in bulk by removing them from the sent list and adding
them to a private completion list. This introduces a potential race in
ibmvfc_wait_for_ops() since the criteria for a command to be outstanding is
no longer simply being on the sent list, but instead not being on the free
list.
Avoid this race by scanning the entire command event pool and checking that
any matching command that ibmvfc needs to wait on is not already on the
free list.
Link: https://lore.kernel.org/r/20210319205029.312969-2-tyreld@linux.ibm.com
Fixes: 1f4a4a1950 ("scsi: ibmvfc: Complete commands outside the host/queue lock")
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fixes the following W=1 kernel build warning(s):
drivers/scsi/ibmvscsi/ibmvfc.c:331: warning: Function parameter or member 'vhost' not described in 'ibmvfc_get_err_result'
drivers/scsi/ibmvscsi/ibmvfc.c:653: warning: Excess function parameter 'job_step' description in 'ibmvfc_del_tgt'
drivers/scsi/ibmvscsi/ibmvfc.c:773: warning: Function parameter or member 'queue' not described in 'ibmvfc_init_event_pool'
drivers/scsi/ibmvscsi/ibmvfc.c:773: warning: Function parameter or member 'size' not described in 'ibmvfc_init_event_pool'
drivers/scsi/ibmvscsi/ibmvfc.c:823: warning: Function parameter or member 'queue' not described in 'ibmvfc_free_event_pool'
drivers/scsi/ibmvscsi/ibmvfc.c:1413: warning: Function parameter or member 'vhost' not described in 'ibmvfc_gather_partition_info'
drivers/scsi/ibmvscsi/ibmvfc.c:1483: warning: Function parameter or member 'queue' not described in 'ibmvfc_get_event'
drivers/scsi/ibmvscsi/ibmvfc.c:1483: warning: Excess function parameter 'vhost' description in 'ibmvfc_get_event'
drivers/scsi/ibmvscsi/ibmvfc.c:1630: warning: Function parameter or member 't' not described in 'ibmvfc_timeout'
drivers/scsi/ibmvscsi/ibmvfc.c:1630: warning: Excess function parameter 'evt' description in 'ibmvfc_timeout'
drivers/scsi/ibmvscsi/ibmvfc.c:1893: warning: Function parameter or member 'shost' not described in 'ibmvfc_queuecommand'
drivers/scsi/ibmvscsi/ibmvfc.c:1893: warning: Excess function parameter 'done' description in 'ibmvfc_queuecommand'
drivers/scsi/ibmvscsi/ibmvfc.c:2324: warning: Function parameter or member 'rport' not described in 'ibmvfc_match_rport'
drivers/scsi/ibmvscsi/ibmvfc.c:2324: warning: Excess function parameter 'device' description in 'ibmvfc_match_rport'
drivers/scsi/ibmvscsi/ibmvfc.c:3133: warning: Function parameter or member 'evt_doneq' not described in 'ibmvfc_handle_crq'
drivers/scsi/ibmvscsi/ibmvfc.c:3317: warning: Excess function parameter 'reason' description in 'ibmvfc_change_queue_depth'
drivers/scsi/ibmvscsi/ibmvfc.c:3390: warning: Function parameter or member 'attr' not described in 'ibmvfc_show_log_level'
drivers/scsi/ibmvscsi/ibmvfc.c:3413: warning: Function parameter or member 'attr' not described in 'ibmvfc_store_log_level'
drivers/scsi/ibmvscsi/ibmvfc.c:3413: warning: Function parameter or member 'count' not described in 'ibmvfc_store_log_level'
drivers/scsi/ibmvscsi/ibmvfc.c:4121: warning: Function parameter or member 'done' not described in '__ibmvfc_tgt_get_implicit_logout_evt'
drivers/scsi/ibmvscsi/ibmvfc.c:4438: warning: Function parameter or member 't' not described in 'ibmvfc_adisc_timeout'
drivers/scsi/ibmvscsi/ibmvfc.c:4438: warning: Excess function parameter 'tgt' description in 'ibmvfc_adisc_timeout'
drivers/scsi/ibmvscsi/ibmvfc.c:4641: warning: Function parameter or member 'target' not described in 'ibmvfc_alloc_target'
drivers/scsi/ibmvscsi/ibmvfc.c:4641: warning: Excess function parameter 'scsi_id' description in 'ibmvfc_alloc_target'
drivers/scsi/ibmvscsi/ibmvfc.c:5068: warning: Function parameter or member 'evt' not described in 'ibmvfc_npiv_logout_done'
drivers/scsi/ibmvscsi/ibmvfc.c:5068: warning: Excess function parameter 'vhost' description in 'ibmvfc_npiv_logout_done'
Link: https://lore.kernel.org/r/20210317091230.2912389-35-lee.jones@linaro.org
Cc: Tyrel Datwyler <tyreld@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Brian King <brking@linux.vnet.ibm.com>
Cc: linux-scsi@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fixes the following W=1 kernel build warning(s):
drivers/scsi/ibmvscsi/ibmvscsi.c:143: warning: Function parameter or member 'hostdata' not described in 'ibmvscsi_release_crq_queue'
drivers/scsi/ibmvscsi/ibmvscsi.c:143: warning: Function parameter or member 'max_requests' not described in 'ibmvscsi_release_crq_queue'
drivers/scsi/ibmvscsi/ibmvscsi.c:143: warning: expecting prototype for release_crq_queue(). Prototype was for ibmvscsi_release_crq_queue() instead
drivers/scsi/ibmvscsi/ibmvscsi.c:286: warning: expecting prototype for reset_crq_queue(). Prototype was for ibmvscsi_reset_crq_queue() instead
drivers/scsi/ibmvscsi/ibmvscsi.c:328: warning: Function parameter or member 'max_requests' not described in 'ibmvscsi_init_crq_queue'
drivers/scsi/ibmvscsi/ibmvscsi.c:328: warning: expecting prototype for initialize_crq_queue(). Prototype was for ibmvscsi_init_crq_queue() instead
drivers/scsi/ibmvscsi/ibmvscsi.c:414: warning: expecting prototype for reenable_crq_queue(). Prototype was for ibmvscsi_reenable_crq_queue() instead
drivers/scsi/ibmvscsi/ibmvscsi.c:536: warning: expecting prototype for ibmvscsi_free(). Prototype was for free_event_struct() instead
drivers/scsi/ibmvscsi/ibmvscsi.c:558: warning: expecting prototype for get_evt_struct(). Prototype was for get_event_struct() instead
drivers/scsi/ibmvscsi/ibmvscsi.c:587: warning: Function parameter or member 'evt_struct' not described in 'init_event_struct'
drivers/scsi/ibmvscsi/ibmvscsi.c:587: warning: Excess function parameter 'evt' description in 'init_event_struct'
drivers/scsi/ibmvscsi/ibmvscsi.c:608: warning: Function parameter or member 'cmd' not described in 'set_srp_direction'
drivers/scsi/ibmvscsi/ibmvscsi.c:608: warning: Function parameter or member 'srp_cmd' not described in 'set_srp_direction'
drivers/scsi/ibmvscsi/ibmvscsi.c:608: warning: Function parameter or member 'numbuf' not described in 'set_srp_direction'
drivers/scsi/ibmvscsi/ibmvscsi.c:641: warning: Function parameter or member 'evt_struct' not described in 'unmap_cmd_data'
drivers/scsi/ibmvscsi/ibmvscsi.c:683: warning: Function parameter or member 'evt_struct' not described in 'map_sg_data'
drivers/scsi/ibmvscsi/ibmvscsi.c:757: warning: Function parameter or member 'evt_struct' not described in 'map_data_for_srp_cmd'
drivers/scsi/ibmvscsi/ibmvscsi.c:783: warning: Function parameter or member 'error_code' not described in 'purge_requests'
drivers/scsi/ibmvscsi/ibmvscsi.c:846: warning: Function parameter or member 't' not described in 'ibmvscsi_timeout'
drivers/scsi/ibmvscsi/ibmvscsi.c:846: warning: Excess function parameter 'evt_struct' description in 'ibmvscsi_timeout'
drivers/scsi/ibmvscsi/ibmvscsi.c:1043: warning: Function parameter or member 'cmnd' not described in 'ibmvscsi_queuecommand_lck'
drivers/scsi/ibmvscsi/ibmvscsi.c:1043: warning: expecting prototype for ibmvscsi_queue(). Prototype was for ibmvscsi_queuecommand_lck() instead
drivers/scsi/ibmvscsi/ibmvscsi.c:1351: warning: expecting prototype for init_host(). Prototype was for enable_fast_fail() instead
drivers/scsi/ibmvscsi/ibmvscsi.c:1464: warning: Function parameter or member 'hostdata' not described in 'init_adapter'
drivers/scsi/ibmvscsi/ibmvscsi.c:1475: warning: Function parameter or member 'evt_struct' not described in 'sync_completion'
drivers/scsi/ibmvscsi/ibmvscsi.c:1488: warning: Function parameter or member 'cmd' not described in 'ibmvscsi_eh_abort_handler'
drivers/scsi/ibmvscsi/ibmvscsi.c:1488: warning: expecting prototype for ibmvscsi_abort(). Prototype was for ibmvscsi_eh_abort_handler() instead
drivers/scsi/ibmvscsi/ibmvscsi.c:1627: warning: Function parameter or member 'cmd' not described in 'ibmvscsi_eh_device_reset_handler'
drivers/scsi/ibmvscsi/ibmvscsi.c:1893: warning: Excess function parameter 'reason' description in 'ibmvscsi_change_queue_depth'
drivers/scsi/ibmvscsi/ibmvscsi.c:2221: warning: Function parameter or member 'vdev' not described in 'ibmvscsi_probe'
drivers/scsi/ibmvscsi/ibmvscsi.c:2221: warning: Function parameter or member 'id' not described in 'ibmvscsi_probe'
drivers/scsi/ibmvscsi/ibmvscsi.c:2221: warning: expecting prototype for Called by bus code for each adapter(). Prototype was for ibmvscsi_probe() instead
drivers/scsi/ibmvscsi/ibmvscsi.c:2381: warning: cannot understand function prototype: 'const struct vio_device_id ibmvscsi_device_table[] = '
[mkp: fix checkpatch whitespace warning]
Link: https://lore.kernel.org/r/20210317091230.2912389-34-lee.jones@linaro.org
Cc: Tyrel Datwyler <tyreld@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Colin DeVilbiss <devilbis@us.ibm.com>
Cc: Santiago Leon <santil@us.ibm.com>
Cc: Dave Boutcher <sleddog@us.ibm.com>
Cc: linux-scsi@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The buffer for negotiating channel setup is DMA allocated at device probe
time. However, the remove path fails to free this allocation which will
prevent the hypervisor from releasing the virtual device in the case of a
hotplug remove.
Fix this issue by freeing the buffer allocation in ibmvfc_free_mem().
Link: https://lore.kernel.org/r/20210311012212.428068-1-tyreld@linux.ibm.com
Fixes: e95eef3fc0 ("scsi: ibmvfc: Implement channel enquiry and setup commands")
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
A live partition migration (LPM) results in a CRQ disconnect similar to a
hard reset. In this LPM case the hypervisor mostly preserves the CRQ
transport such that it simply needs to be reenabled. However, the
capabilities may have changed such as fewer channels, or no channels at
all. Further, its possible that there may be sub-CRQ support, but no
channel support. The CRQ reenable path currently doesn't take any of this
into consideration.
For simplicity release and reinitialize sub-CRQs during reenable, and set
do_enquiry and using_channels with the appropriate values to trigger
channel renegotiation.
Link: https://lore.kernel.org/r/20210302230543.9905-6-tyreld@linux.ibm.com
Fixes: 3034ebe263 ("scsi: ibmvfc: Add alloc/dealloc routines for SCSI Sub-CRQ Channels")
Reviewed-by: Brian King <brking@linux.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The H_FREE_SUB_CRQ hypercall can return a retry delay return code that
indicates the call needs to be retried after a specific amount of time
delay. The error path to free a sub-CRQ in case of a failure during channel
registration fails to capture the return code of H_FREE_SUB_CRQ which will
result in the delay loop being skipped in the case of a retry delay return
code.
Store the return code result of the H_FREE_SUB_CRQ call such that the
return code check in the delay loop evaluates a meaningful value. Also, use
the rtas_busy_delay() to check the rc value and delay for the appropriate
amount of time.
Link: https://lore.kernel.org/r/20210302230543.9905-5-tyreld@linux.ibm.com
Fixes: 39e461fddf ("scsi: ibmvfc: Map/request irq and register Sub-CRQ interrupt handler")
Reviewed-by: Brian King <brking@linux.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
A non-zero return code for H_REG_SUB_CRQ is currently treated as a failure
resulting in failing sub-CRQ setup. The case of H_CLOSED should not be
treated as a failure. This return code translates to a successful sub-CRQ
registration by the hypervisor, and is meant to communicate back that there
is currently no partner VIOS CRQ connection established as of yet. This is
a common occurrence during a disconnect where the client adapter can
possibly come back up prior to the partner adapter.
For non-zero return code from H_REG_SUB_CRQ treat a H_CLOSED as success so
that sub-CRQs are successfully setup.
Link: https://lore.kernel.org/r/20210302230543.9905-4-tyreld@linux.ibm.com
Fixes: 3034ebe263 ("scsi: ibmvfc: Add alloc/dealloc routines for SCSI Sub-CRQ Channels")
Reviewed-by: Brian King <brking@linux.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
A hard reset results in a complete transport disconnect such that the CRQ
connection with the partner VIOS is broken. This has the side effect of
also invalidating the associated sub-CRQs. The current code assumes that
the sub-CRQs are perserved resulting in a protocol violation after trying
to reconnect them with the VIOS. This introduces an infinite loop such that
the VIOS forces a disconnect after each subsequent attempt to re-register
with invalid handles.
Avoid the aforementioned issue by releasing the sub-CRQs prior to CRQ
disconnect, and driving a reinitialization of the sub-CRQs once a new CRQ
is registered with the hypervisor.
Link: https://lore.kernel.org/r/20210302230543.9905-3-tyreld@linux.ibm.com
Fixes: 3034ebe263 ("scsi: ibmvfc: Add alloc/dealloc routines for SCSI Sub-CRQ Channels")
Reviewed-by: Brian King <brking@linux.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If ibmvfc_init_sub_crqs() fails ibmvfc_probe() simply parrots registration
failure reported elsewhere, and futher vhost->scsi_scrq.scrq == NULL is
indication enough to the driver that it has no sub-CRQs available. The
mq_enabled check can also be moved into ibmvfc_init_sub_crqs() such that
each caller doesn't have to gate the call with a mq_enabled check. Finally,
in the case of sub-CRQ setup failure setting do_enquiry can be turned off
to putting the driver into single queue fallback mode.
The aforementioned changes also simplify the next patch in the series that
fixes a hard reset issue, by tying a sub-CRQ setup failure and do_enquiry
logic into ibmvfc_init_sub_crqs().
Link: https://lore.kernel.org/r/20210302230543.9905-2-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The UFS core has received a substantial rework this cycle. This in
turn has caused a merge conflict in linux-next. Merge 5.11/scsi-fixes
into 5.12/scsi-queue and resolve the conflict.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add the various module parameter toggles for adjusting the MQ
characteristics at boot/load time as well as a device attribute for
changing the client scsi channel request amount.
Link: https://lore.kernel.org/r/20210114203148.246656-22-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Turn on MQ by default and set sane values for the upper limit on hw queues
for the SCSI host, and number of hw SCSI channels to request from the
partner VIOS.
Link: https://lore.kernel.org/r/20210114203148.246656-21-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Grab the queue and list lock for each Sub-CRQ and add any uncompleted
events to the host purge list.
Link: https://lore.kernel.org/r/20210114203148.246656-20-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In general the client needs to send Cancel MADs and task management
commands down the same channel as the command(s) intended to cancel or
abort. The client assigns cancel keys per LUN and thus must send a Cancel
down each channel commands were submitted for that LUN. Further, the client
then must wait for those cancel completions prior to submitting a LUN RESET
or ABORT TASK SET.
Add a cancel rsp iu syncronization field to the ibmvfc_queue struct such
that the cancel routine can sync the cancel response to each queue that
requires a cancel command. Build a list of each cancel event sent and wait
for the completion of each submitted cancel.
Link: https://lore.kernel.org/r/20210114203148.246656-19-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add a helper routine for initializing a Cancel MAD. This will be useful for
a channelized client that needs to send Cancel commands down every channel
commands were sent for a particular LUN.
Link: https://lore.kernel.org/r/20210114203148.246656-18-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If the ibmvfc client adapter requests channels it must submit a number of
Sub-CRQ handles matching the number of channels being requested. The VIOS
in its response will overwrite the actual number of channel resources
allocated which may be less than what was requested. The client then must
store the VIOS Sub-CRQ handle for each queue. This VIOS handle is needed as
a parameter with h_send_sub_crq().
Link: https://lore.kernel.org/r/20210114203148.246656-17-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When the client has negotiated the use of channels all vfcFrames are
required to go down a Sub-CRQ channel or it is a protocoal violation. If
the adapter state is channelized submit vfcFrames to the appropriate
Sub-CRQ via the h_send_sub_crq() helper.
Link: https://lore.kernel.org/r/20210114203148.246656-16-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Extract the hwq id from a SCSI command and store it in the ibmvfc_event
structure to identify which Sub-CRQ to send the command down when channels
are being utilized.
Link: https://lore.kernel.org/r/20210114203148.246656-15-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Previous patches have plumbed the necessary Sub-CRQ interface and channel
negotiation MADs to fully channelize via hardware backed queues.
Advertise client support via NPIV Login capability IBMVFC_CAN_USE_CHANNELS
when the client bits have MQ enabled via vhost->mq_enabled, or when
channels were already in use during a subsequent NPIV Login. The later is
required because channel support is only renegotiated after a CRQ pair is
broken. Simple NPIV Logout/Logins require the client to continue to
advertise the channel capability until the CRQ pair between the client is
broken.
Link: https://lore.kernel.org/r/20210114203148.246656-14-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
New NPIV_ENQUIRY_CHANNEL and NPIV_SETUP_CHANNEL management datagrams (MADs)
were defined in a previous patchset. If the client advertises a desire to
use channels and the partner VIOS is channel capable then the client must
proceed with channel enquiry to determine the maximum number of channels
the VIOS is capable of providing, and registering SubCRQs via channel setup
with the VIOS immediately following NPIV Login. This handshaking should not
be performed for subsequent NPIV Logins unless the CRQ connection has been
reset.
Implement these two new MADs and issue them following a successful NPIV
login where the VIOS has set the SUPPORT_CHANNELS capability bit in the
NPIV Login response.
Link: https://lore.kernel.org/r/20210114203148.246656-13-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Create an irq mapping for the hw_irq number provided from phyp firmware.
Request an irq assigned our Sub-CRQ interrupt handler. Unmap these irqs at
Sub-CRQ teardown.
Link: https://lore.kernel.org/r/20210114203148.246656-12-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Simple handler that calls Sub-CRQ drain routine directly.
Link: https://lore.kernel.org/r/20210114203148.246656-11-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The logic for iterating over the Sub-CRQ responses is similiar to that of
the primary CRQ. Add the necessary handlers for processing those responses.
Link: https://lore.kernel.org/r/20210114203148.246656-10-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Each Sub-CRQ has its own interrupt. A hypercall is required to toggle the
IRQ state. Provide the necessary mechanism via a helper function.
Link: https://lore.kernel.org/r/20210114203148.246656-9-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Allocate a set of Sub-CRQs in advance. During channel setup the client and
VIOS negotiate the number of queues the VIOS supports and the number that
the client desires to request. Its possible that the final channel
resources allocated is less than requested, but the client is still
responsible for sending handles for every queue it is hoping for.
Also, provide deallocation cleanup routines.
Link: https://lore.kernel.org/r/20210114203148.246656-8-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Subordinate Command Response Queues (Sub CRQ) are used in conjunction with
the primary CRQ when more than one queue is needed by the virtual I/O
adapter. Recent phyp firmware versions support Sub CRQ's with ibmvfc
adapters. This feature is a prerequisite for supporting multiple hardware
backed submission queues in the vfc adapter.
The Sub CRQ command element differs from the standard CRQ in that it is
32bytes long as opposed to 16bytes for the latter. Despite this extra
16bytes the ibmvfc protocol will use the original CRQ command element
mapped to the first 16bytes of the Sub CRQ element initially.
Add definitions for the Sub CRQ command element and queue.
Link: https://lore.kernel.org/r/20210114203148.246656-7-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Sub-CRQs are registred with firmware via a hypercall. Abstract that
interface into a simpler helper function.
Link: https://lore.kernel.org/r/20210114203148.246656-6-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
With the upcoming addition of Sub-CRQs the event pool size may vary
per-queue.
Add a size parameter to ibmvfc_init_event_pool() such that different size
event pools can be requested by ibmvfc_alloc_queue().
Link: https://lore.kernel.org/r/20210114203148.246656-5-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The event pool and CRQ used to be separate entities of the adapter host
structure and as such were allocated and freed independently of each
other. Recent work as defined a generic queue structure with an event pool
specific to each queue. As such the event pool for each queue shouldn't be
allocated/freed independently, but instead performed as part of the queue
allocation/free routines.
Move the calls to ibmvfc_event_pool_{init|free} into
ibmvfc_{alloc|free}_queue respectively. The only functional change here is
that the CRQ cannot be released in ibmvfc_remove until after the event pool
has been successfully purged since releasing the queue will also free the
event pool.
Link: https://lore.kernel.org/r/20210114203148.246656-4-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The next patch in this series reworks the event pool allocation calls to
happen within the individual queue allocation routines instead of as
independent calls.
Move the init/free routines earlier in ibmvfc.c to prevent undefined
reference errors when calling these functions from the queue allocation
code. No functional change.
Link: https://lore.kernel.org/r/20210114203148.246656-3-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Introduce several new vhost fields for managing MQ state of the adapter as
well as initial defaults for MQ enablement.
Link: https://lore.kernel.org/r/20210114203148.246656-2-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
While testing live partition mobility, we have observed occasional crashes
of the Linux partition. What we've seen is that during the live migration,
for specific configurations with large amounts of memory, slow network
links, and workloads that are changing memory a lot, the partition can end
up being suspended for 30 seconds or longer. This resulted in the following
scenario:
CPU 0 CPU 1
------------------------------- ----------------------------------
scsi_queue_rq migration_store
-> blk_mq_start_request -> rtas_ibm_suspend_me
-> blk_add_timer -> on_each_cpu(rtas_percpu_suspend_me
_______________________________________V
|
V
-> IPI from CPU 1
-> rtas_percpu_suspend_me
-> __rtas_suspend_last_cpu
-- Linux partition suspended for > 30 seconds --
-> for_each_online_cpu(cpu)
plpar_hcall_norets(H_PROD
-> scsi_dispatch_cmd
-> scsi_times_out
-> scsi_abort_command
-> queue_delayed_work
-> ibmvfc_queuecommand_lck
-> ibmvfc_send_event
-> ibmvfc_send_crq
- returns H_CLOSED
<- returns SCSI_MLQUEUE_HOST_BUSY
-> __blk_mq_requeue_request
-> scmd_eh_abort_handler
-> scsi_try_to_abort_cmd
- returns SUCCESS
-> scsi_queue_insert
Normally, the SCMD_STATE_COMPLETE bit would protect against the command
completion and the timeout, but that doesn't work here, since we don't
check that at all in the SCSI_MLQUEUE_HOST_BUSY path.
In this case we end up calling scsi_queue_insert on a request that has
already been queued, or possibly even freed, and we crash.
The patch below simply increases the default I/O timeout to avoid this race
condition. This is also the timeout value that nearly all IBM SAN storage
recommends setting as the default value.
Link: https://lore.kernel.org/r/1610463998-19791-1-git-send-email-brking@linux.vnet.ibm.com
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The driver's queuecommand routine is still wrapped to hold the host lock
for the duration of the call. This will become problematic when moving to
multiple queues due to the lock contention preventing asynchronous
submissions to mulitple queues. There is no real legitimate reason to hold
the host lock, and previous patches have insured proper protection of
moving ibmvfc_event objects between free and sent lists.
Link: https://lore.kernel.org/r/20210106201835.1053593-6-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Drain the command queue and place all commands on a completion list.
Perform command completion on that list outside the host/queue locks.
Further, move purged command compeletions outside the host_lock as well.
Link: https://lore.kernel.org/r/20210106201835.1053593-5-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Define per-queue locks for protecting queue state and event pool sent/free
lists. The evt list lock is initially redundant but it allows the driver to
be modified in the follow-up patches to relax the queue locking around
submissions and completions.
Link: https://lore.kernel.org/r/20210106201835.1053593-4-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There is currently a single command event pool per host. In anticipation of
providing multiple queues add a per-queue event pool definition and
reimplement the existing CRQ to use its queue defined event pool for
command submission and completion.
Link: https://lore.kernel.org/r/20210106201835.1053593-3-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The primary and async CRQs are nearly identical outside of the format and
length of each message entry in the dma mapped page that represents the
queue data. These queues can be represented with a generic queue structure
that uses a union to differentiate between message format of the mapped
page.
This structure will further be leveraged in a followup patcheset that
introduces Sub-CRQs.
Link: https://lore.kernel.org/r/20210106201835.1053593-2-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Commit 2aa0102c66 ("scsi: ibmvfc: Use correlation token to tag commands")
sets the vfcFrame correlation token to the pointer handle of the associated
ibmvfc_event. However, that commit failed to cast the pointer to an
appropriate type which in this case is a u64. As such sparse warnings are
generated for both correlation token assignments.
ibmvfc.c:2375:36: sparse: incorrect type in argument 1 (different base types)
ibmvfc.c:2375:36: sparse: expected unsigned long long [usertype] val
ibmvfc.c:2375:36: sparse: got struct ibmvfc_event *[assigned] evt
Add the appropriate u64 casts when assigning an ibmvfc_event as a
correlation token.
Link: https://lore.kernel.org/r/20210106203721.1054693-1-tyreld@linux.ibm.com
Fixes: 2aa0102c66 ("scsi: ibmvfc: Use correlation token to tag commands")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This series consists of the usual driver updates (ufs, qla2xxx,
smartpqi, target, zfcp, fnic, mpt3sas, ibmvfc) plus a load of
cleanups, a major power management rework and a load of assorted minor
updates. There are a few core updates (formatting fixes being the big
one) but nothing major this cycle.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCX9o0KSYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishbOZAP9D5NTN
J7dJUo2MIMy84YBu+d9ag7yLlNiRWVY2yw5vHwD/Z7JjAVLwz/tzmyjU9//o2J6w
hwhOv6Uto89gLCWSEz8=
=KUPT
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"This consists of the usual driver updates (ufs, qla2xxx, smartpqi,
target, zfcp, fnic, mpt3sas, ibmvfc) plus a load of cleanups, a major
power management rework and a load of assorted minor updates.
There are a few core updates (formatting fixes being the big one) but
nothing major this cycle"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (279 commits)
scsi: mpt3sas: Update driver version to 36.100.00.00
scsi: mpt3sas: Handle trigger page after firmware update
scsi: mpt3sas: Add persistent MPI trigger page
scsi: mpt3sas: Add persistent SCSI sense trigger page
scsi: mpt3sas: Add persistent Event trigger page
scsi: mpt3sas: Add persistent Master trigger page
scsi: mpt3sas: Add persistent trigger pages support
scsi: mpt3sas: Sync time periodically between driver and firmware
scsi: qla2xxx: Update version to 10.02.00.104-k
scsi: qla2xxx: Fix device loss on 4G and older HBAs
scsi: qla2xxx: If fcport is undergoing deletion complete I/O with retry
scsi: qla2xxx: Fix the call trace for flush workqueue
scsi: qla2xxx: Fix flash update in 28XX adapters on big endian machines
scsi: qla2xxx: Handle aborts correctly for port undergoing deletion
scsi: qla2xxx: Fix N2N and NVMe connect retry failure
scsi: qla2xxx: Fix FW initialization error on big endian machines
scsi: qla2xxx: Fix crash during driver load on big endian machines
scsi: qla2xxx: Fix compilation issue in PPC systems
scsi: qla2xxx: Don't check for fw_started while posting NVMe command
scsi: qla2xxx: Tear down session if FW say it is down
...
The previous patch added support for the targetWWPN field in version 2 MADs
and vfcFrame structures.
Set the IBMVFC_CAN_SEND_VF_WWPN bit in our capabailites flag during NPIV
Login to inform the VIOS that this client supports the feature.
Link: https://lore.kernel.org/r/20201118011104.296999-7-tyreld@linux.ibm.com
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Several version 2 MADs and the version 2 vfcFrame structures introduced a
new targetWWPN field for better identification of a target over the
scsi_id.
Set this field and MAD versioning fields when the VIOS advertises the
IBMVFC_HANDLE_VF_WWPN capability.
Link: https://lore.kernel.org/r/20201118011104.296999-6-tyreld@linux.ibm.com
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The FC iu and response payloads are located at different offsets depending
on the ibmvfc_cmd version. This is a result of the version 2 vfcFrame
definition adding an extra 64bytes of reserved space to the structure prior
to the payloads.
Add helper routines to determine the current vfcFrame version and return a
pointer to the proper iu or response structure within that ibmvfc_cmd.
Link: https://lore.kernel.org/r/20201118011104.296999-5-tyreld@linux.ibm.com
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Testing the NPIV Login response capabilities is a long winded process of
dereferencing the vhost->login_buf->resp.capabilities field, then byte
swapping that value to host endian, and performing the bitwise test.
Currently we only ever check this in ibmvfc_cancel_all(), but follow-up
patches will need to regularly check for targetWWPN and channelization
support.
Add a helper to simplify checking various VIOS capabilities, namely
ibmvfc_check_caps().
Link: https://lore.kernel.org/r/20201118011104.296999-4-tyreld@linux.ibm.com
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Introduce a target_wwpn field to several MADs. Its possible that a SCSI ID
of a target can change due to some fabric changes. The WWPN of the SCSI
target provides a better way to identify the target. Also, add flags for
receiving MAD versioning information and advertising client support for
targetWWPN with the VIOS. This latter capability flag will be required for
future clients capable of requesting multiple hardware queues from the host
adapter.
Link: https://lore.kernel.org/r/20201118011104.296999-3-tyreld@linux.ibm.com
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>