Commit Graph

20876 Commits

Author SHA1 Message Date
Ahmed S. Darwish
093289e40b scsi: aic94xx: Switch back to original libsas event notifiers
libsas event notifiers required an extension where gfp_t flags must be
explicitly passed. For bisectability, a temporary _gfp() variant of such
functions were added. All call sites then got converted use the _gfp()
variants and explicitly pass GFP context. Having no callers left, the
original libsas notifiers were then modified to accept gfp_t flags by
default.

Switch back to the original libas API, while still passing GFP context.
The libsas _gfp() variants will be removed afterwards.

Link: https://lore.kernel.org/r/20210118100955.1761652-15-a.darwish@linutronix.de
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:31:09 -05:00
Ahmed S. Darwish
872a90b5b4 scsi: hisi_sas: Switch back to original libsas event notifiers
libsas event notifiers required an extension where gfp_t flags must be
explicitly passed. For bisectability, a temporary _gfp() variant of such
functions were added. All call sites then got converted use the _gfp()
variants and explicitly pass GFP context. Having no callers left, the
original libsas notifiers were then modified to accept gfp_t flags by
default.

Switch back to the original libas API, while still passing GFP context.
The libsas _gfp() variants will be removed afterwards.

Link: https://lore.kernel.org/r/20210118100955.1761652-14-a.darwish@linutronix.de
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:31:09 -05:00
Ahmed S. Darwish
5d6a75a1ed scsi: libsas: Add gfp_t flags parameter to event notifications
All call-sites of below libsas APIs:

  - sas_alloc_event()
  - sas_notify_port_event()
  - sas_notify_phy_event()

have been converted to use the _gfp()-suffixed version.  Modify the
original APIs above to take a gfp_t flags parameter by default.

For bisectability, call-sites will be modified again to use the original
libsas APIs (while passing gfp_t). The temporary _gfp()-suffixed versions
can then be removed.

Link: https://lore.kernel.org/r/20210118100955.1761652-13-a.darwish@linutronix.de
Cc: Jason Yan <yanaijie@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:31:09 -05:00
Ahmed S. Darwish
26c7efc3f9 scsi: hisi_sas: Pass gfp_t flags to libsas event notifiers
Use the new libsas event notifiers API, which requires callers to
explicitly pass the gfp_t memory allocation flags.

Below are the context analysis for modified functions:

=> hisi_sas_bytes_dmaed():

Since it is invoked from both process and atomic contexts, let its callers
pass the gfp_t flags:

  * hisi_sas_main.c:
  ------------------

    hisi_sas_phyup_work(): workqueue context
      -> hisi_sas_bytes_dmaed(..., GFP_KERNEL)

    hisi_sas_controller_reset_done(): has an msleep()
      -> hisi_sas_rescan_topology()
        -> hisi_sas_phy_down()
          -> hisi_sas_bytes_dmaed(..., GFP_KERNEL)

    hisi_sas_debug_I_T_nexus_reset(): calls wait_for_completion_timeout()
      -> hisi_sas_phy_down()
        -> hisi_sas_bytes_dmaed(..., GFP_KERNEL)

  * hisi_sas_v1_hw.c:
  -------------------

    int_abnormal_v1_hw(): irq handler
      -> hisi_sas_phy_down()
        -> hisi_sas_bytes_dmaed(..., GFP_ATOMIC)

  * hisi_sas_v[23]_hw.c:
  ----------------------

    int_phy_updown_v[23]_hw(): irq handler
      -> phy_down_v[23]_hw()
        -> hisi_sas_phy_down()
          -> hisi_sas_bytes_dmaed(..., GFP_ATOMIC)

=> int_bcast_v1_hw() and phy_bcast_v3_hw():

Both are invoked exclusively from irq handlers. Pass GFP_ATOMIC.

Link: https://lore.kernel.org/r/20210118100955.1761652-12-a.darwish@linutronix.de
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:31:08 -05:00
Ahmed S. Darwish
111d06ab77 scsi: aic94xx: Pass gfp_t flags to libsas event notifiers
Use the new libsas event notifiers API, which requires callers to
explicitly pass the gfp_t memory allocation flags.

Context analysis:

  aic94xx_hwi.c: asd_dl_tasklet_handler()
    -> asd_ascb::tasklet_complete()
    == escb_tasklet_complete()
      -> aic94xx_scb.c: asd_phy_event_tasklet()
      -> aic94xx_scb.c: asd_bytes_dmaed_tasklet()
      -> aic94xx_scb.c: asd_link_reset_err_tasklet()
      -> aic94xx_scb.c: asd_primitive_rcvd_tasklet()

All functions are invoked by escb_tasklet_complete(), which is invoked by
the tasklet handler. Pass GFP_ATOMIC.

Link: https://lore.kernel.org/r/20210118100955.1761652-11-a.darwish@linutronix.de
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:31:08 -05:00
Ahmed S. Darwish
cd4e817698 scsi: pm80xx: Pass gfp_t flags to libsas event notifiers
Use the new libsas event notifiers API, which requires callers to
explicitly pass the gfp_t memory allocation flags.

Call chain analysis, pm8001_hwi.c:

  pm8001_interrupt_handler_msix() || pm8001_interrupt_handler_intx() || pm8001_tasklet()
    -> PM8001_CHIP_DISP->isr() = pm80xx_chip_isr()
      -> process_oq [spin_lock_irqsave(&pm8001_ha->lock, ...)]
        -> process_one_iomb()
          -> mpi_hw_event()
            -> hw_event_sas_phy_up()
              -> pm8001_bytes_dmaed()
            -> hw_event_sata_phy_up
              -> pm8001_bytes_dmaed()

All functions are invoked by process_one_iomb(), which is invoked by the
interrupt service routine and the tasklet handler. A similar call chain is
also found at pm80xx_hwi.c. Pass GFP_ATOMIC.

For pm8001_sas.c, pm8001_phy_control() runs in task context as it calls
wait_for_completion() and msleep().  Pass GFP_KERNEL.

Link: https://lore.kernel.org/r/20210118100955.1761652-10-a.darwish@linutronix.de
Cc: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Reviewed-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:31:08 -05:00
Ahmed S. Darwish
19a39831ff scsi: libsas: Pass gfp_t flags to event notifiers
Use the new libsas event notifiers API, which requires callers to
explicitly pass the gfp_t memory allocation flags.

Context analysis:

  - sas_enable_revalidation(): process, acquires mutex
  - sas_resume_ha(): process, calls wait_event_timeout()

Link: https://lore.kernel.org/r/20210118100955.1761652-9-a.darwish@linutronix.de
Cc: Jason Yan <yanaijie@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:31:08 -05:00
Ahmed S. Darwish
71dca5539f scsi: isci: Pass gfp_t flags in isci_port_bc_change_received()
Use the new libsas event notifiers API, which requires callers to
explicitly pass the gfp_t memory allocation flags.

libsas sas_notify_port_event() is called from
isci_port_bc_change_received(). Below is the context analysis for all of
its call chains:

host.c: sci_controller_error_handler(): atomic, irq handler     (*)
OR host.c: sci_controller_completion_handler(), atomic, tasklet (*)
  -> sci_controller_process_completions()
    -> sci_controller_event_completion()
      -> phy.c: sci_phy_event_handler()
        -> port.c: sci_port_broadcast_change_received()
          -> isci_port_bc_change_received()

host.c: isci_host_init()                                        (@)
spin_lock_irq(isci_host::scic_lock)
  -> sci_controller_initialize(), atomic                        (*)
    -> port_config.c: sci_port_configuration_agent_initialize()
      -> sci_mpc_agent_validate_phy_configuration()
        -> port.c: sci_port_add_phy()
          -> sci_port_set_phy()
            -> phy.c: sci_phy_set_port()
              -> port.c: sci_port_broadcast_change_received()
                -> isci_port_bc_change_received()

port_config.c: apc_agent_timeout(), atomic, timer callback      (*)
  -> sci_apc_agent_configure_ports()
    -> port.c: sci_port_add_phy()
      -> sci_port_set_phy()
        -> phy.c: sci_phy_set_port()
          -> port.c: sci_port_broadcast_change_received()
            -> isci_port_bc_change_received()

phy.c: enter SCI state: *SCI_PHY_STOPPED*                       # Cont. from [1]
  -> sci_phy_stopped_state_enter()
    -> host.c: sci_controller_link_down()
      -> ->link_down_handler()
      == port_config.c: sci_apc_agent_link_down()
        -> port.c: sci_port_remove_phy()
          -> sci_port_clear_phy()
            -> phy.c: sci_phy_set_port()
              -> port.c: sci_port_broadcast_change_received()
                -> isci_port_bc_change_received()

phy.c: enter SCI state: *SCI_PHY_STARTING*                      # Cont. from [2]
  -> sci_phy_starting_state_enter()
    -> host.c: sci_controller_link_down()
      -> ->link_down_handler()
      == port_config.c: sci_apc_agent_link_down()
        -> port.c: sci_port_remove_phy()
          -> sci_port_clear_phy()
            -> phy.c: sci_phy_set_port()
              -> port.c: sci_port_broadcast_change_received()
                -> isci_port_bc_change_received()

[1] Call chains for entering state: *SCI_PHY_STOPPED*
-----------------------------------------------------

host.c: isci_host_init()                                        (@)
spin_lock_irq(isci_host::scic_lock)
  -> sci_controller_initialize(), atomic                        (*)
      -> phy.c: sci_phy_initialize()
        -> phy.c: sci_phy_link_layer_initialization()
          -> phy.c: sci_change_state(SCI_PHY_STOPPED)

init.c: PCI ->remove() || PM_OPS ->suspend,  process context    (+)
  -> host.c: isci_host_deinit()
    -> sci_controller_stop_phys()
      -> phy.c: sci_phy_stop()
	-> sci_change_state(SCI_PHY_STOPPED)

phy.c: isci_phy_control()
spin_lock_irqsave(isci_host::scic_lock, )
  -> sci_phy_stop(), atomic                                     (*)
    -> sci_change_state(SCI_PHY_STOPPED)

[2] Call chains for entering state: *SCI_PHY_STARTING*
------------------------------------------------------

phy.c: phy_sata_timeout(), atimer, timer callback               (*)
spin_lock_irqsave(isci_host::scic_lock, )
  -> sci_change_state(SCI_PHY_STARTING)

host.c: phy_startup_timeout(), atomic, timer callback           (*)
spin_lock_irqsave(isci_host::scic_lock, )
  -> sci_controller_start_next_phy()
    -> sci_phy_start()
      -> sci_change_state(SCI_PHY_STARTING)

host.c: isci_host_start()                                       (@)
spin_lock_irq(isci_host::scic_lock)
  -> sci_controller_start(), atomic                             (*)
    -> sci_controller_start_next_phy()
      -> sci_phy_start()
        -> sci_change_state(SCI_PHY_STARTING)

phy.c: Enter SCI state *SCI_PHY_SUB_FINAL*                      # Cont. from [2A]
  -> sci_change_state(SCI_PHY_SUB_FINAL)
    -> sci_phy_starting_final_substate_enter()
      -> sci_change_state(SCI_PHY_READY)
        -> Enter SCI state: *SCI_PHY_READY*
          -> sci_phy_ready_state_enter()
            -> host.c: sci_controller_link_up()
              -> sci_controller_start_next_phy()
                -> sci_phy_start()
                  -> sci_change_state(SCI_PHY_STARTING)

phy.c: sci_phy_event_handler(), atomic, discussed earlier       (*)
  -> sci_change_state(SCI_PHY_STARTING), 11 instances

port.c: isci_port_perform_hard_reset()
spin_lock_irqsave(isci_host::scic_lock, )
  -> port.c: sci_port_hard_reset(), atomic                      (*)
    -> phy.c: sci_phy_reset()
      -> sci_change_state(SCI_PHY_RESETTING)
        -> enter SCI PHY state: *SCI_PHY_RESETTING*
          -> sci_phy_resetting_state_enter()
            -> sci_change_state(SCI_PHY_STARTING)

[2A] Call chains for entering SCI state: *SCI_PHY_SUB_FINAL*
------------------------------------------------------------

host.c: power_control_timeout(), atomic, timer callback         (*)
spin_lock_irqsave(isci_host::scic_lock, )
  -> phy.c: sci_phy_consume_power_handler()
    -> phy.c: sci_change_state(SCI_PHY_SUB_FINAL)

host.c: sci_controller_error_handler(): atomic, irq handler     (*)
OR host.c: sci_controller_completion_handler(), atomic, tasklet (*)
  -> sci_controller_process_completions()
    -> sci_controller_unsolicited_frame()
      -> phy.c: sci_phy_frame_handler()
        -> sci_change_state(SCI_PHY_SUB_AWAIT_SAS_POWER)
          -> sci_phy_starting_await_sas_power_substate_enter()
            -> host.c: sci_controller_power_control_queue_insert()
              -> phy.c: sci_phy_consume_power_handler()
                -> sci_change_state(SCI_PHY_SUB_FINAL)
        -> sci_change_state(SCI_PHY_SUB_FINAL)
    -> sci_controller_event_completion()
      -> phy.c: sci_phy_event_handler()
        -> sci_phy_start_sata_link_training()
          -> sci_change_state(SCI_PHY_SUB_AWAIT_SATA_POWER)
            -> sci_phy_starting_await_sata_power_substate_enter
              -> host.c: sci_controller_power_control_queue_insert()
                -> phy.c: sci_phy_consume_power_handler()
                  -> sci_change_state(SCI_PHY_SUB_FINAL)

As can be seen from the "(*)" markers above, almost all the call-chains are
atomic. The only exception, marked with "(+)", is a PCI ->remove() and
PM_OPS ->suspend() cold path. Thus, pass GFP_ATOMIC to the libsas port
event notifier.

Note, the now-replaced libsas APIs used in_interrupt() to implicitly decide
which memory allocation type to use.  This was only partially correct, as
it fails to choose the correct GFP flags when just preemption or interrupts
are disabled. Such buggy code paths are marked with "(@)" in the call
chains above.

Link: https://lore.kernel.org/r/20210118100955.1761652-8-a.darwish@linutronix.de
Fixes: 1c393b970e ("scsi: libsas: Use dynamic alloced work to avoid sas event lost")
Cc: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:31:08 -05:00
Ahmed S. Darwish
5ce7902902 scsi: isci: Pass gfp_t flags in isci_port_link_up()
Use the new libsas event notifiers API, which requires callers to
explicitly pass the gfp_t memory allocation flags.

libsas sas_notify_port_event() is called from isci_port_link_up().  Below
is the context analysis for all of its call chains:

host.c: isci_host_init()                                        (@)
spin_lock_irq(isci_host::scic_lock)
  -> sci_controller_initialize(), atomic                        (*)
    -> port_config.c: sci_port_configuration_agent_initialize()
      -> sci_mpc_agent_validate_phy_configuration()
        -> port.c: sci_port_add_phy()
          -> sci_port_general_link_up_handler()
            -> sci_port_activate_phy()
              -> isci_port_link_up()

port_config.c: apc_agent_timeout(), atomic, timer callback      (*)
  -> sci_apc_agent_configure_ports()
    -> port.c: sci_port_add_phy()
      -> sci_port_general_link_up_handler()
        -> sci_port_activate_phy()
          -> isci_port_link_up()

phy.c: enter SCI state: *SCI_PHY_SUB_FINAL*                     # Cont. from [1]
  -> phy.c: sci_phy_starting_final_substate_enter()
    -> phy.c: sci_change_state(SCI_PHY_READY)
      -> enter SCI state: *SCI_PHY_READY*
        -> phy.c: sci_phy_ready_state_enter()
          -> host.c: sci_controller_link_up()
            -> .link_up_handler()
            == port_config.c: sci_apc_agent_link_up()
              -> port.c: sci_port_link_up()
                -> (continue at [A])
            == port_config.c: sci_mpc_agent_link_up()
	      -> port.c: sci_port_link_up()
                -> (continue at [A])

port_config.c: mpc_agent_timeout(), atomic, timer callback      (*)
spin_lock_irqsave(isci_host::scic_lock, )
  -> ->link_up_handler()
  == port_config.c: sci_apc_agent_link_up()
    -> port.c: sci_port_link_up()
      -> (continue at [A])
  == port_config.c: sci_mpc_agent_link_up()
    -> port.c: sci_port_link_up()
      -> (continue at [A])

[A] port.c: sci_port_link_up()
  -> sci_port_activate_phy()
    -> isci_port_link_up()
  -> sci_port_general_link_up_handler()
    -> sci_port_activate_phy()
      -> isci_port_link_up()

[1] Call chains for entering SCI state: *SCI_PHY_SUB_FINAL*
-----------------------------------------------------------

host.c: power_control_timeout(), atomic, timer callback         (*)
spin_lock_irqsave(isci_host::scic_lock, )
  -> phy.c: sci_phy_consume_power_handler()
    -> phy.c: sci_change_state(SCI_PHY_SUB_FINAL)

host.c: sci_controller_error_handler(): atomic, irq handler     (*)
OR host.c: sci_controller_completion_handler(), atomic, tasklet (*)
  -> sci_controller_process_completions()
    -> sci_controller_unsolicited_frame()
      -> phy.c: sci_phy_frame_handler()
        -> sci_change_state(SCI_PHY_SUB_AWAIT_SAS_POWER)
          -> sci_phy_starting_await_sas_power_substate_enter()
            -> host.c: sci_controller_power_control_queue_insert()
              -> phy.c: sci_phy_consume_power_handler()
                -> sci_change_state(SCI_PHY_SUB_FINAL)
        -> sci_change_state(SCI_PHY_SUB_FINAL)
    -> sci_controller_event_completion()
      -> phy.c: sci_phy_event_handler()
        -> sci_phy_start_sata_link_training()
          -> sci_change_state(SCI_PHY_SUB_AWAIT_SATA_POWER)
            -> sci_phy_starting_await_sata_power_substate_enter
              -> host.c: sci_controller_power_control_queue_insert()
                -> phy.c: sci_phy_consume_power_handler()
                  -> sci_change_state(SCI_PHY_SUB_FINAL)

As can be seen from the "(*)" markers above, all the call-chains are
atomic.  Pass GFP_ATOMIC to libsas port event notifier.

Note, the now-replaced libsas APIs used in_interrupt() to implicitly decide
which memory allocation type to use.  This was only partially correct, as
it fails to choose the correct GFP flags when just preemption or interrupts
are disabled. Such buggy code paths are marked with "(@)" in the call
chains above.

Link: https://lore.kernel.org/r/20210118100955.1761652-7-a.darwish@linutronix.de
Fixes: 1c393b970e ("scsi: libsas: Use dynamic alloced work to avoid sas event lost")
Cc: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:31:08 -05:00
Ahmed S. Darwish
885ab3b892 scsi: isci: Pass gfp_t flags in isci_port_link_down()
Use the new libsas event notifiers API, which requires callers to
explicitly pass the gfp_t memory allocation flags.

sas_notify_phy_event() is exclusively called by isci_port_link_down().
Below is the context analysis for all of its call chains:

port.c: port_timeout(), atomic, timer callback                  (*)
spin_lock_irqsave(isci_host::scic_lock, )
  -> port_state_machine_change(..., SCI_PORT_FAILED)
    -> enter SCI port state: *SCI_PORT_FAILED*
      -> sci_port_failed_state_enter()
        -> isci_port_hard_reset_complete()
          -> isci_port_link_down()

port.c: isci_port_perform_hard_reset()
spin_lock_irqsave(isci_host::scic_lock, )
  -> port.c: sci_port_hard_reset(), atomic                      (*)
    -> phy.c: sci_phy_reset()
      -> sci_change_state(SCI_PHY_RESETTING)
        -> enter SCI PHY state: *SCI_PHY_RESETTING*
          -> sci_phy_resetting_state_enter()
            -> port.c: sci_port_deactivate_phy()
	      -> isci_port_link_down()

port.c: enter SCI port state: *SCI_PORT_READY*                  # Cont. from [1]
  -> sci_port_ready_state_enter()
    -> isci_port_hard_reset_complete()
      -> isci_port_link_down()

phy.c: enter SCI state: *SCI_PHY_STOPPED*                       # Cont. from [2]
  -> sci_phy_stopped_state_enter()
    -> host.c: sci_controller_link_down()
      -> ->link_down_handler()
      == port_config.c: sci_apc_agent_link_down()
        -> port.c: sci_port_remove_phy()
          -> sci_port_deactivate_phy()
            -> isci_port_link_down()
      == port_config.c: sci_mpc_agent_link_down()
        -> port.c: sci_port_link_down()
          -> sci_port_deactivate_phy()
            -> isci_port_link_down()

phy.c: enter SCI state: *SCI_PHY_STARTING*                      # Cont. from [3]
  -> sci_phy_starting_state_enter()
    -> host.c: sci_controller_link_down()
      -> ->link_down_handler()
      == port_config.c: sci_apc_agent_link_down()
        -> port.c: sci_port_remove_phy()
          -> isci_port_link_down()
      == port_config.c: sci_mpc_agent_link_down()
        -> port.c: sci_port_link_down()
          -> sci_port_deactivate_phy()
            -> isci_port_link_down()

[1] Call chains for 'enter SCI port state: *SCI_PORT_READY*'
------------------------------------------------------------

host.c: isci_host_init()                                        (@)
spin_lock_irq(isci_host::scic_lock)
  -> sci_controller_initialize(), atomic                        (*)
    -> port_config.c: sci_port_configuration_agent_initialize()
      -> sci_mpc_agent_validate_phy_configuration()
        -> port.c: sci_port_add_phy()
          -> sci_port_general_link_up_handler()
            -> port_state_machine_change(, SCI_PORT_READY)
              -> enter port state *SCI_PORT_READY*

host.c: isci_host_start()                                       (@)
spin_lock_irq(isci_host::scic_lock)
  -> host.c: sci_controller_start(), atomic                     (*)
    -> host.c: sci_port_start()
      -> port.c: port_state_machine_change(, SCI_PORT_READY)
        -> enter port state *SCI_PORT_READY*

port_config.c: apc_agent_timeout(), atomic, timer callback      (*)
  -> sci_apc_agent_configure_ports()
    -> port.c: sci_port_add_phy()
      -> sci_port_general_link_up_handler()
        -> port_state_machine_change(, SCI_PORT_READY)
          -> enter port state *SCI_PORT_READY*

port_config.c: mpc_agent_timeout(), atomic, timer callback      (*)
spin_lock_irqsave(isci_host::scic_lock, )
  -> ->link_up_handler()
  == port.c: sci_apc_agent_link_up()
    -> sci_port_general_link_up_handler()
      -> port_state_machine_change(, SCI_PORT_READY)
        -> enter port state *SCI_PORT_READY*
  == port.c: sci_mpc_agent_link_up()
    -> port.c: sci_port_link_up()
      -> sci_port_general_link_up_handler()
        -> port_state_machine_change(, SCI_PORT_READY)
          -> enter port state *SCI_PORT_READY*

phy.c: enter SCI state: SCI_PHY_SUB_FINAL                       # Cont. from [1A]
  -> sci_phy_starting_final_substate_enter()
    -> sci_change_state(SCI_PHY_READY)
      -> enter SCI state: *SCI_PHY_READY*
        -> sci_phy_ready_state_enter()
          -> host.c: sci_controller_link_up()
            -> port_agent.link_up_handler()
            == port_config.c: sci_apc_agent_link_up()
              -> port.c: sci_port_link_up()
                -> sci_port_general_link_up_handler()
                  -> port_state_machine_change(, SCI_PORT_READY)
                    -> enter port state *SCI_PORT_READY*
            == port_config.c: sci_mpc_agent_link_up()
              -> port.c: sci_port_link_up()
                -> sci_port_general_link_up_handler()
                  -> port_state_machine_change(, SCI_PORT_READY)
                    -> enter port state *SCI_PORT_READY*

[1A] Call chains for entering SCI state: *SCI_PHY_SUB_FINAL*
------------------------------------------------------------

host.c: power_control_timeout(), atomic, timer callback         (*)
spin_lock_irqsave(isci_host::scic_lock, )
  -> phy.c: sci_phy_consume_power_handler()
    -> phy.c: sci_change_state(SCI_PHY_SUB_FINAL)

host.c: sci_controller_error_handler(): atomic, irq handler     (*)
OR host.c: sci_controller_completion_handler(), atomic, tasklet (*)
  -> sci_controller_process_completions()
    -> sci_controller_unsolicited_frame()
      -> phy.c: sci_phy_frame_handler()
        -> sci_change_state(SCI_PHY_SUB_AWAIT_SAS_POWER)
          -> sci_phy_starting_await_sas_power_substate_enter()
            -> host.c: sci_controller_power_control_queue_insert()
              -> phy.c: sci_phy_consume_power_handler()
                -> sci_change_state(SCI_PHY_SUB_FINAL)
        -> sci_change_state(SCI_PHY_SUB_FINAL)
    -> sci_controller_event_completion()
      -> phy.c: sci_phy_event_handler()
        -> sci_phy_start_sata_link_training()
          -> sci_change_state(SCI_PHY_SUB_AWAIT_SATA_POWER)
            -> sci_phy_starting_await_sata_power_substate_enter
              -> host.c: sci_controller_power_control_queue_insert()
                -> phy.c: sci_phy_consume_power_handler()
                  -> sci_change_state(SCI_PHY_SUB_FINAL)

[2] Call chains for entering state: *SCI_PHY_STOPPED*
-----------------------------------------------------

host.c: isci_host_init()                                        (@)
spin_lock_irq(isci_host::scic_lock)
  -> sci_controller_initialize(), atomic                        (*)
      -> phy.c: sci_phy_initialize()
        -> phy.c: sci_phy_link_layer_initialization()
          -> phy.c: sci_change_state(SCI_PHY_STOPPED)

init.c: PCI ->remove() || PM_OPS ->suspend,  process context    (+)
  -> host.c: isci_host_deinit()
    -> sci_controller_stop_phys()
      -> phy.c: sci_phy_stop()
	-> sci_change_state(SCI_PHY_STOPPED)

phy.c: isci_phy_control()
spin_lock_irqsave(isci_host::scic_lock, )
  -> sci_phy_stop(), atomic                                     (*)
    -> sci_change_state(SCI_PHY_STOPPED)

[3] Call chains for entering state: *SCI_PHY_STARTING*
------------------------------------------------------

phy.c: phy_sata_timeout(), atimer, timer callback               (*)
spin_lock_irqsave(isci_host::scic_lock, )
  -> sci_change_state(SCI_PHY_STARTING)

host.c: phy_startup_timeout(), atomic, timer callback           (*)
spin_lock_irqsave(isci_host::scic_lock, )
  -> sci_controller_start_next_phy()
    -> sci_phy_start()
      -> sci_change_state(SCI_PHY_STARTING)

host.c: isci_host_start()                                       (@)
spin_lock_irq(isci_host::scic_lock)
  -> sci_controller_start(), atomic                             (*)
    -> sci_controller_start_next_phy()
      -> sci_phy_start()
        -> sci_change_state(SCI_PHY_STARTING)

phy.c: Enter SCI state *SCI_PHY_SUB_FINAL*, atomic, check above (*)
  -> sci_change_state(SCI_PHY_SUB_FINAL)
    -> sci_phy_starting_final_substate_enter()
      -> sci_change_state(SCI_PHY_READY)
        -> Enter SCI state: *SCI_PHY_READY*
          -> sci_phy_ready_state_enter()
            -> host.c: sci_controller_link_up()
              -> sci_controller_start_next_phy()
                -> sci_phy_start()
                  -> sci_change_state(SCI_PHY_STARTING)

phy.c: sci_phy_event_handler(), atomic, discussed earlier       (*)
  -> sci_change_state(SCI_PHY_STARTING), 11 instances

phy.c: enter SCI state: *SCI_PHY_RESETTING*, atomic, discussed  (*)
  -> sci_phy_resetting_state_enter()
    -> sci_change_state(SCI_PHY_STARTING)

As can be seen from the "(*)" markers above, almost all the call-chains are
atomic. The only exception, marked with "(+)", is a PCI ->remove() and
PM_OPS ->suspend() cold path. Thus, pass GFP_ATOMIC to the libsas phy event
notifier.

Note, The now-replaced libsas APIs used in_interrupt() to implicitly decide
which memory allocation type to use.  This was only partially correct, as
it fails to choose the correct GFP flags when just preemption or interrupts
are disabled. Such buggy code paths are marked with "(@)" in the call
chains above.

Link: https://lore.kernel.org/r/20210118100955.1761652-6-a.darwish@linutronix.de
Fixes: 1c393b970e ("scsi: libsas: Use dynamic alloced work to avoid sas event lost")
Cc: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:31:07 -05:00
Ahmed S. Darwish
feb18e900f scsi: mvsas: Pass gfp_t flags to libsas event notifiers
mvsas calls the non _gfp version of the libsas event notifiers API, leading
to the buggy call chains below:

  mvsas/mv_sas.c: mvs_work_queue() [process context]
  spin_lock_irqsave(mvs_info::lock, )
    -> libsas/sas_event.c: sas_notify_phy_event()
      -> sas_alloc_event()
        -> in_interrupt() = false
          -> invalid GFP_KERNEL allocation
    -> libsas/sas_event.c: sas_notify_port_event()
      -> sas_alloc_event()
        -> in_interrupt() = false
          -> invalid GFP_KERNEL allocation

Use the new event notifiers API instead, which requires callers to
explicitly pass the gfp_t memory allocation flags.

Below are context analysis for the modified functions:

=> mvs_bytes_dmaed():

Since it is invoked from both process and atomic contexts, let its callers
pass the gfp_t flags. Call chains:

  scsi_scan.c: do_scsi_scan_host() [has msleep()]
    -> shost->hostt->scan_start()
    -> [mvsas/mv_init.c: Scsi_Host::scsi_host_template .scan_start = mvs_scan_start()]
    -> mvsas/mv_sas.c: mvs_scan_start()
      -> mvs_bytes_dmaed(..., GFP_KERNEL)

  mvsas/mv_sas.c: mvs_work_queue()
  spin_lock_irqsave(mvs_info::lock,)
    -> mvs_bytes_dmaed(..., GFP_ATOMIC)

  mvsas/mv_64xx.c: mvs_64xx_isr() || mvsas/mv_94xx.c: mvs_94xx_isr()
    -> mvsas/mv_chips.h: mvs_int_full()
      -> mvsas/mv_sas.c: mvs_int_port()
        -> mvs_bytes_dmaed(..., GFP_ATOMIC);

=> mvs_work_queue():

Invoked from process context, but it calls all the libsas event notifier
APIs under a spin_lock_irqsave(). Pass GFP_ATOMIC.

Link: https://lore.kernel.org/r/20210118100955.1761652-5-a.darwish@linutronix.de
Fixes: 1c393b970e ("scsi: libsas: Use dynamic alloced work to avoid sas event lost")
Cc: Jason Yan <yanaijie@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:31:07 -05:00
Ahmed S. Darwish
c2d0f1a65a scsi: libsas: Introduce a _gfp() variant of event notifiers
sas_alloc_event() uses in_interrupt() to decide which allocation should be
used.

The usage of in_interrupt() in drivers is phased out and Linus clearly
requested that code which changes behaviour depending on context should
either be separated or the context be conveyed in an argument passed by the
caller, which usually knows the context.

The in_interrupt() check is also only partially correct, because it fails
to choose the correct code path when just preemption or interrupts are
disabled. For example, as in the following call chain:

  mvsas/mv_sas.c: mvs_work_queue() [process context]
  spin_lock_irqsave(mvs_info::lock, )
    -> libsas/sas_event.c: sas_notify_phy_event()
      -> sas_alloc_event()
        -> in_interrupt() = false
          -> invalid GFP_KERNEL allocation
    -> libsas/sas_event.c: sas_notify_port_event()
      -> sas_alloc_event()
        -> in_interrupt() = false
          -> invalid GFP_KERNEL allocation

Introduce sas_alloc_event_gfp(), sas_notify_port_event_gfp(), and
sas_notify_phy_event_gfp(), which all behave like the non _gfp() variants
but use a caller-passed GFP mask for allocations.

For bisectability, all callers will be modified first to pass GFP context,
then the non _gfp() libsas API variants will be modified to take a gfp_t by
default.

Link: https://lore.kernel.org/r/20210118100955.1761652-4-a.darwish@linutronix.de
Fixes: 1c393b970e ("scsi: libsas: Use dynamic alloced work to avoid sas event lost")
Cc: Jason Yan <yanaijie@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:31:07 -05:00
John Garry
121181f3f8 scsi: libsas: Remove notifier indirection
LLDDs report events to libsas with .notify_port_event and .notify_phy_event
callbacks.

These callbacks are fixed and so there is no reason why the functions
cannot be called directly, so do that.

This neatens the code slightly, makes it more obvious, and reduces function
pointer usage, which is generally a good thing. Downside is that there are
2x more symbol exports.

[a.darwish@linutronix.de: Remove the now unused "sas_ha" local variables]

Link: https://lore.kernel.org/r/20210118100955.1761652-3-a.darwish@linutronix.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:31:07 -05:00
Hannes Reinecke
491152c7c3 scsi: ncr53c8xx: Use SAM status values
Use SAM status values instead of the driver-defined ones.  This also fixes
a potential bug as the driver-defined values declare 'COMMAND TERMINATED'
with a value of 0x20, whereas SCSI-II defines it with a value of 0x22.

Link: https://lore.kernel.org/r/20210113090500.129644-36-hare@suse.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:14:12 -05:00
Hannes Reinecke
aced5500ec scsi: advansys: Kill driver-defined status byte accessors
Replace the driver-defined status byte accessors with the mid-layer defined
ones.

Link: https://lore.kernel.org/r/20210113090500.129644-35-hare@suse.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:14:12 -05:00
Hannes Reinecke
6098c3005d scsi: qla2xxx: fc_remote_port_chkready() returns a SCSI result value
fc_remote_port_chkready() returns a SCSI result value, not the port
status. Fix the value returned when the remote port isn't set.

Link: https://lore.kernel.org/r/20210113090500.129644-34-hare@suse.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:14:12 -05:00
Hannes Reinecke
ecc751b27a scsi: storvsc: Return DID_ERROR for invalid commands
ILLEGAL_COMMAND is a sense code, not a driver byte.

Link: https://lore.kernel.org/r/20210113090500.129644-33-hare@suse.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:14:12 -05:00
Hannes Reinecke
88188179f3 scsi: ips: Use correct command completion on error
A non-zero queuecommand() return code means 'busy', i.e. the command hasn't
been submitted. So any command which should be failed need to be completed
via the ->scsi_done() callback with the appropriate result code set.

Link: https://lore.kernel.org/r/20210113090500.129644-32-hare@suse.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:14:12 -05:00
Hannes Reinecke
fc8e006c38 scsi: wd33c93: Use SCSI status
Use standard SCSI status and drop usage of the linux-specific ones.

Link: https://lore.kernel.org/r/20210113090500.129644-31-hare@suse.de
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:14:12 -05:00
Hannes Reinecke
809dadb15a scsi: esp_scsi: Do not set SCSI message byte
The message byte setting always devolves to COMMAND_COMPLETE so we can drop
setting the message byte in the SCSI result.

Link: https://lore.kernel.org/r/20210113090500.129644-30-hare@suse.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:14:11 -05:00
Hannes Reinecke
6b50529e2f scsi: esp_scsi: Use host byte as last argument to esp_cmd_is_done()
Just pass in the host byte to esp_cmd_is_done() and set the status or
message bytes if the host byte is DID_OK.

Link: https://lore.kernel.org/r/20210113090500.129644-29-hare@suse.de
Acked-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:14:11 -05:00
Hannes Reinecke
78c9efdd8d scsi: dpt_i2o: Use DID_ERROR instead of INITIATOR_ERROR message
Change the error code for an invalid SCSI opcode to DID_ERROR.
INITIATOR_ERROR is a scsi parallel message which doesn't apply for RAID
HBAs.

Link: https://lore.kernel.org/r/20210113090500.129644-27-hare@suse.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:14:11 -05:00
Hannes Reinecke
ddb99b1d1d scsi: mac53c94: Do not set invalid command result
CMD_ACCEPT_MSG is an internal definition and most certainly not a SCSI
status. As the latter gets set during command completion we can drop the
assignment here.

Link: https://lore.kernel.org/r/20210113090500.129644-26-hare@suse.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:14:11 -05:00
Hannes Reinecke
f3272258d7 scsi: atp870u: Use standard definitions
Use standard definitions for SCSI commands and return status instead of the
hardcoded values.

Link: https://lore.kernel.org/r/20210113090500.129644-25-hare@suse.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:14:11 -05:00
Hannes Reinecke
db83d8a5c8 scsi: ufs: ufshcd: Do not set COMMAND_COMPLETE
COMMAND_COMPLETE is defined as '0', so setting it is quite pointless.

Link: https://lore.kernel.org/r/20210113090500.129644-24-hare@suse.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:14:11 -05:00
Hannes Reinecke
7a64c81448 scsi: scsi_debug: Do not set COMMAND_COMPLETE
COMMAND_COMPLETE is defined as '0', so setting it is quite pointless.

Link: https://lore.kernel.org/r/20210113090500.129644-23-hare@suse.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:14:10 -05:00
Hannes Reinecke
9df17f4679 scsi: initio: Drop internal SCSI message definition
Use the standard SCSI message definitions instead of the driver-internal
ones.

Link: https://lore.kernel.org/r/20210113090500.129644-22-hare@suse.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:14:10 -05:00
Hannes Reinecke
9c2d267073 scsi: dc395x: Drop internal SCSI message definitions
Drop the internal SCSI message definitions and use the functions provided
by the SPI transport class.

Link: https://lore.kernel.org/r/20210113090500.129644-21-hare@suse.de
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:14:10 -05:00
Hannes Reinecke
d8cd784ff7 scsi: aic7xxx: aic79xx: Drop internal SCSI message definition
Use the standard SCSI message definitions instead of the driver-internal
ones.

Link: https://lore.kernel.org/r/20210113090500.129644-20-hare@suse.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:14:10 -05:00
Hannes Reinecke
1c9eb798d5 scsi: nsp_cs: Drop internal SCSI message definition
Use the standard SCSI message definitions instead of the driver-internal
ones.

Link: https://lore.kernel.org/r/20210113090500.129644-19-hare@suse.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:14:10 -05:00
Hannes Reinecke
8959e81cf4 scsi: stex: Do not set COMMAND_COMPLETE
COMMAND_COMPLETE is defined as '0', so setting it is quite pointless.

Link: https://lore.kernel.org/r/20210113090500.129644-18-hare@suse.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:14:10 -05:00
Hannes Reinecke
0e310ac4ef scsi: hpsa: Do not set COMMAND_COMPLETE
COMMAND_COMPLETE is defined as '0', and it is a SCSI parallel message to
boot. Drop the call to set_msg_byte().

Link: https://lore.kernel.org/r/20210113090500.129644-17-hare@suse.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:14:09 -05:00
Hannes Reinecke
cdec16c117 scsi: aacraid: Avoid setting message byte on completion
The aacraid controller is a RAID controller and the driver will never see
any SCSI messages. Plus it's quite pointless to set the message byte if the
host byte is already set, as the latter takes precedence during error
recovery.  Drop the message byte values for the final result.

Link: https://lore.kernel.org/r/20210113090500.129644-16-hare@suse.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:14:09 -05:00
Hannes Reinecke
35f1cad1f9 scsi: qla4xxx: Use standard SAM status definitions
Use standard SAM status definitions and drop the driver-defined ones.

Link: https://lore.kernel.org/r/20210113090500.129644-14-hare@suse.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:14:09 -05:00
Hannes Reinecke
f55475891e scsi: dc395: Drop private SAM status code definitions
We don't need to duplicate definitions from the common include files.

Link: https://lore.kernel.org/r/20210113090500.129644-13-hare@suse.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:14:09 -05:00
Hannes Reinecke
23d339f08f scsi: nsp32: Fixup status handling
SCp.status is always the SAM-defined status value, not the Linux
ones. Fixup the one wrong definition.

Link: https://lore.kernel.org/r/20210113090500.129644-12-hare@suse.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:14:09 -05:00
Hannes Reinecke
0eb198d2c3 scsi: acornscsi: Use standard defines
Use midlayer-defined values and drop the non-existing QUEUE_FULL case; we
are checking the SCSI messages in the switch statement, and QUEUE_FULL is a
SCSI status hence it can never occur here.

Link: https://lore.kernel.org/r/20210113090500.129644-11-hare@suse.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:14:09 -05:00
Hannes Reinecke
eb74b9322b scsi: bfa: Drop driver-defined SCSI status codes
Drop the driver-defined SCSI status codes and use the generic ones instead.

Link: https://lore.kernel.org/r/20210113090500.129644-10-hare@suse.de
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:14:08 -05:00
Hannes Reinecke
54c9f6fdef scsi: aic7xxx: aic79xx: Remove driver-defined SAM status definitions
Replace the driver-defined SAM status definitions with the standard
mid-layer defined ones.

Link: https://lore.kernel.org/r/20210113090500.129644-9-hare@suse.de
Reviewed-by: Bart van Assche <bvanassche@acm.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:14:08 -05:00
Hannes Reinecke
c23435dbc7 scsi: aic7xxx: aic79xx: Kill pointless forward declarations
Link: https://lore.kernel.org/r/20210113090500.129644-8-hare@suse.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:14:08 -05:00
Hannes Reinecke
7662d92374 scsi: aic7xxx: aic79xx: Whitespace cleanup
Link: https://lore.kernel.org/r/20210113090500.129644-7-hare@suse.de
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:14:08 -05:00
Hannes Reinecke
bcd5c59f21 scsi: atp870u: Whitespace cleanup
Link: https://lore.kernel.org/r/20210113090500.129644-6-hare@suse.de
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:14:08 -05:00
Hannes Reinecke
1789671ded scsi: 3w-sas: Whitespace cleanup
Link: https://lore.kernel.org/r/20210113090500.129644-5-hare@suse.de
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:14:08 -05:00
Hannes Reinecke
bf4eebbf53 scsi: 3w-9xxx: Whitespace cleanup
Link: https://lore.kernel.org/r/20210113090500.129644-4-hare@suse.de
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:14:07 -05:00
Hannes Reinecke
8148dfba29 scsi: 3w-xxxx: Whitespace cleanup
Link: https://lore.kernel.org/r/20210113090500.129644-3-hare@suse.de
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:14:07 -05:00
Hannes Reinecke
0653c358d2 scsi: Drop gdth driver
The gdth driver refers to a SCSI parallel, PCI-only HBA RAID adapter which
was manufactured by the now-defunct ICP Vortex company, later acquired by
Adaptec and superseded by the aacraid series of controllers.  The driver
itself would require a major overhaul before any modifications can be
attempted, but seeing that it's unlikely to have any users left it should
rather be removed completely.

Link: https://lore.kernel.org/r/20210113090500.129644-2-hare@suse.de
Cautiously-Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:14:07 -05:00
Stanley Chu
348e1bc5f4 scsi: ufs: Clean up and refactor clk-scaling feature
Manipulate clock scaling related stuff only if the host capability supports
clock scaling feature to avoid redundant code execution.

Link: https://lore.kernel.org/r/20210120150142.5049-4-stanley.chu@mediatek.com
Reviewed-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-20 22:29:38 -05:00
Stanley Chu
b058fa8682 scsi: ufs: Remove redundant null checking of devfreq instance
hba->devfreq is zero-initialized thus it is not required to check its
existence in ufshcd_add_lus() function which is invoked during
initialization only.

Link: https://lore.kernel.org/r/20210120150142.5049-3-stanley.chu@mediatek.com
Reviewed-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-20 22:29:38 -05:00
Stanley Chu
f9a7fa345a scsi: ufs: Refactor cancelling clkscaling works
Cancelling suspend_work and resume_work is only required while suspending
clk-scaling. Move these two invocations into ufshcd_suspend_clkscaling()
function.

Link: https://lore.kernel.org/r/20210120150142.5049-2-stanley.chu@mediatek.com
Reviewed-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-20 22:29:37 -05:00
Can Guo
b02d51afca Revert "Make sure clk scaling happens only when HBA is runtime ACTIVE"
Commit 73cc291c27 ("scsi: ufs: Make sure clk scaling happens only
when HBA is runtime ACTIVE") is no longer needed since commit
0e9d4ca43b ("scsi: ufs: Protect some contexts from unexpected clock
scaling") is a more mature fix to protect UFS LLD stability from clock
scaling invoked through sysfs nodes by users.

Link: https://lore.kernel.org/r/1611137065-14266-4-git-send-email-cang@codeaurora.org
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-20 22:23:56 -05:00
Can Guo
4543d9d782 scsi: ufs: Refactor ufshcd_init/exit_clk_scaling/gating()
ufshcd_hba_exit() is always called after ufshcd_exit_clk_scaling() and
ufshcd_exit_clk_gating(). Move ufshcd_exit_clk_scaling/gating() to
ufshcd_hba_exit(). Meanwhile, add dedicated functions to initialize
and remove sysfs nodes of clock scaling/gating to make the code more
readable. Overall functionality remains same.

Link: https://lore.kernel.org/r/1611137065-14266-3-git-send-email-cang@codeaurora.org
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-20 22:23:23 -05:00
Can Guo
0e9d4ca43b scsi: ufs: Protect some contexts from unexpected clock scaling
In contexts like suspend, shutdown, and error handling we need to
suspend devfreq to make sure these contexts won't be disturbed by
clock scaling.  However, suspending devfreq is not enough since users
can still trigger a clock scaling by manipulating the devfreq sysfs
nodes like min/max_freq and governor even after devfreq is
suspended. Moreover, mere suspending devfreq cannot synchroinze a
clock scaling which has already been invoked through these sysfs
nodes. Add one more flag in struct clk_scaling and wrap the entire
func ufshcd_devfreq_scale() with the clk_scaling_lock, so that we can
use this flag and clk_scaling_lock to control and synchronize clock
scaling invoked through devfreq sysfs nodes.

Link: https://lore.kernel.org/r/1611137065-14266-2-git-send-email-cang@codeaurora.org
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-20 22:22:02 -05:00
Bean Huo
4cd4899564 scsi: ufs: Group UFS WB related flags in struct ufs_dev_info
UFS device-related flags should be grouped in ufs_dev_info. Move wb_enabled
and wb_buf_flush_enabled out from struct ufs_hba, group them in struct
ufs_dev_info, and align the names of the structure members vertically.

Link: https://lore.kernel.org/r/20210119163847.20165-6-huobean@gmail.com
Reviewed-by: Can Guo <cang@codeaurora.org>
Acked-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-20 22:21:38 -05:00
Bean Huo
e8d0381394 scsi: ufs: Remove two WB related fields from struct ufs_dev_info
d_wb_alloc_units and d_ext_ufs_feature_sup are only used during WB probe.
They are used to confirm the condition that "if bWriteBoosterBufferType
is set to 01h but dNumSharedWriteBoosterBufferAllocUnits is set to zero,
the WriteBooster feature is disabled", and if UFS device supports WB.

No need to keep them after probing is complete.

Link: https://lore.kernel.org/r/20210119163847.20165-5-huobean@gmail.com
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-20 22:12:30 -05:00
Bean Huo
ae1ce1fc61 scsi: ufs: Update comment in the function ufshcd_wb_probe()
USFHCD supports both WriteBooster "LU dedicated buffer" mode and "shared
buffer" mode. Update the comment accordingly in the function
ufshcd_wb_probe().

Link: https://lore.kernel.org/r/20210119163847.20165-4-huobean@gmail.com
Reviewed-by: Can Guo <cang@codeaurora.org>
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-20 22:12:30 -05:00
Bean Huo
8e834ca551 scsi: ufs: Add "wb_on" sysfs node to control WB on/off
Currently UFS WriteBooster driver uses clock scaling up/down to set WB
on/off. For the platforms which don't support UFSHCD_CAP_CLK_SCALING, WB
will be always on. Provide a sysfs attribute to enable/disable WB during
runtime. Write 1/0 to "wb_on" sysfs node to enable/disable UFS WB.

Link: https://lore.kernel.org/r/20210119163847.20165-2-huobean@gmail.com
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-20 22:08:33 -05:00
Kiwoong Kim
f1ef9047aa scsi: ufs: ufs-exynos: Use UFSHCD_QUIRK_ALIGN_SG_WITH_PAGE_SIZE
Exynos needs scatterlist entries aligned to page size because it isn't
capable of transferring data contained in one DATA IN operation to seversal
areas in memory.

Link: https://lore.kernel.org/r/80d7e27d6ec537e650a6bd74897b6c60618efcdc.1611026909.git.kwmad.kim@samsung.com
Signed-off-by: Kiwoong Kim <kwmad.kim@samsung.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-20 21:54:58 -05:00
Kiwoong Kim
2b2bfc8aa5 scsi: ufs: Introduce a quirk to allow only page-aligned sg entries
Some SoCs require a single scatterlist entry for smaller than page size,
i.e. 4KB. When dispatching commands with more than one scatterlist entry
under 4KB in size the following behavior is observed:

A command to read a block range is dispatched with two scatterlist entries
that are named AAA and BBB. After dispatching, the host builds two PRDT
entries and during transmission, device sends just one DATA IN because
device doesn't care about host DMA. The host then transfers the combined
amount of data from start address of the area named AAA. As a consequence,
the area that follows AAA in memory would be corrupted.

    |<------------->|
    +-------+------------         +-------+
    +  AAA  + (corrupted)   ...   +  BBB  +
    +-------+------------         +-------+

To avoid this we need to enforce page size alignment for sg entries.

Link: https://lore.kernel.org/r/56dddef94f60bd9466fd77e69f64bbbd657ed2a1.1611026909.git.kwmad.kim@samsung.com
Signed-off-by: Kiwoong Kim <kwmad.kim@samsung.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-20 21:53:44 -05:00
Bean Huo
60ec37555d scsi: ufs: Delete redundant if statement in ufshcd_intr()
Once going into while-do loop, intr_status is already true, this
if-statement is redundant, remove it.

Link: https://lore.kernel.org/r/20210118201233.3043-1-huobean@gmail.com
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-20 21:45:20 -05:00
Enzo Matsumiya
aa2c24e7f4 scsi: qla2xxx: Fix description for parameter ql2xenforce_iocb_limit
Parameter ql2xenforce_iocb_limit is enabled by default.

Link: https://lore.kernel.org/r/20210118184922.23793-1-ematsumiya@suse.de
Fixes: 89c72f4245 ("scsi: qla2xxx: Add IOCB resource tracking")
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-20 21:43:31 -05:00
Colin Ian King
ff79acc49a scsi: ibmvfc: Fix spelling mistake "succeded" -> "succeeded"
There is a spelling mistake in a ibmvfc_dbg debug message. Fix it.

Link: https://lore.kernel.org/r/20210118111346.70798-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-20 21:42:20 -05:00
Christophe JAILLET
8e60a7deca scsi: pm80xx: Switch from 'pci_' to 'dma_' API
The wrappers in include/linux/pci-dma-compat.h should go away.

The patch has been generated with the coccinelle script below and has been
hand modified to replace GFP_ with a correct flag.  It has been compile
tested.

When memory is allocated in 'pm8001_init_ccb_tag()' GFP_KERNEL can be used
because this function already uses this flag a few lines above.

While at it, remove "pm80xx: " in a debug message. 'pm8001_dbg()' already
adds the driver name in the message.

@@
@@
-    PCI_DMA_BIDIRECTIONAL
+    DMA_BIDIRECTIONAL

@@
@@
-    PCI_DMA_TODEVICE
+    DMA_TO_DEVICE

@@
@@
-    PCI_DMA_FROMDEVICE
+    DMA_FROM_DEVICE

@@
@@
-    PCI_DMA_NONE
+    DMA_NONE

@@
expression e1, e2, e3;
@@
-    pci_alloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3;
@@
-    pci_zalloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3, e4;
@@
-    pci_free_consistent(e1, e2, e3, e4)
+    dma_free_coherent(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_single(e1, e2, e3, e4)
+    dma_map_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_single(e1, e2, e3, e4)
+    dma_unmap_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4, e5;
@@
-    pci_map_page(e1, e2, e3, e4, e5)
+    dma_map_page(&e1->dev, e2, e3, e4, e5)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_page(e1, e2, e3, e4)
+    dma_unmap_page(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_sg(e1, e2, e3, e4)
+    dma_map_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_sg(e1, e2, e3, e4)
+    dma_unmap_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
+    dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_device(e1, e2, e3, e4)
+    dma_sync_single_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
+    dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_device(e1, e2, e3, e4)
+    dma_sync_sg_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2;
@@
-    pci_dma_mapping_error(e1, e2)
+    dma_mapping_error(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_dma_mask(e1, e2)
+    dma_set_mask(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_consistent_dma_mask(e1, e2)
+    dma_set_coherent_mask(&e1->dev, e2)

Link: https://lore.kernel.org/r/20210117132445.562552-1-christophe.jaillet@wanadoo.fr
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-20 21:40:38 -05:00
Colin Ian King
7b382122d2 scsi: pm80xx: Clean up indentation of a code block
A block of code is indented one level too deeply, clean this up.

Link: https://lore.kernel.org/r/20210115095824.9170-1-colin.king@canonical.com
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Addresses-Coverity: ("Indentation does not match nesting level")
2021-01-20 21:38:56 -05:00
Martin K. Petersen
938a2fbefb Merge branch '5.11/scsi-fixes' into 5.12/scsi-queue
Pull in the 5.11 SCSI fixes branch to provide an updated baseline for
megaraid and hisi_sas. Both drivers received core changes in
v5.11-rc3.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-20 18:26:06 -05:00
Muneendra Kumar
7f3a79a7fd scsi: lpfc: Add support for eh_should_retry_cmd()
Add support for eh_should_retry_cmd callback in lpfc_template.

Link: https://lore.kernel.org/r/1609969748-17684-6-git-send-email-muneendra.kumar@broadcom.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Muneendra Kumar <muneendra.kumar@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-14 22:55:18 -05:00
Muneendra Kumar
afdd112694 scsi: scsi_transport_fc: Add store capability to rport port_state in sysfs
Add store capability to the rport port_state using sysfs under
fc_remote_ports/rport-*/port_state.

With this the user can move the port_state from Marginal->Online and
Online->Marginal.

 - Marginal: This interface will set SCMD_NORETRIES_ABORT bit in
   scmd->state for all the pending I/Os on the SCSI device associated with
   target port.

 - Online: This interface will clear SCMD_NORETRIES_ABORT bit in
   scmd->state for all the pending I/Os on the SCSI device associated with
   target port.

The following interface is provided to set the port state to Marginal and
Online respectively:

echo "Marginal" >> /sys/class/fc_remote_ports/rport-X\:Y-Z/port_state
echo "Online" >> /sys/class/fc_remote_ports/rport-X\:Y-Z/port_state

Link: https://lore.kernel.org/r/1609969748-17684-5-git-send-email-muneendra.kumar@broadcom.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Muneendra Kumar <muneendra.kumar@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-14 22:55:17 -05:00
Muneendra Kumar
02c66326dc scsi: scsi_transport_fc: Add a new rport state FC_PORTSTATE_MARGINAL
Add a new interface, fc_eh_should_retry_cmd(), which checks if the cmd
should be retried or not by checking the rport state. If the rport state is
marginal it returns false to make sure there won't be any retries on the
cmd.

Make the fc_remote_port_delete(), fc_user_scan_tgt(), and
fc_timeout_deleted_rport() functions handle the new rport state.

Link: https://lore.kernel.org/r/1609969748-17684-4-git-send-email-muneendra.kumar@broadcom.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Muneendra Kumar <muneendra.kumar@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-14 22:55:17 -05:00
Muneendra Kumar
60bee27ba2 scsi: core: No retries on abort success
Add a new optional routine, eh_should_retry_cmd(), in scsi_host_template
that allows the transport to decide if a cmd is retryable. Return true if
the transport is in a state the cmd should be retried on.

Update scmd_eh_abort_handler() and scsi_eh_flush_done_q() to both call
scsi_eh_should_retry_cmd() to check whether the command needs to be
retried.

The above changes were based on a patch by Mike Christie.

Link: https://lore.kernel.org/r/1609969748-17684-3-git-send-email-muneendra.kumar@broadcom.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Muneendra Kumar <muneendra.kumar@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-14 22:55:17 -05:00
Muneendra Kumar
962c8dcdd5 scsi: core: Add a new error code DID_TRANSPORT_MARGINAL in scsi.h
Add code in scsi_result_to_blk_status to translate a new error
DID_TRANSPORT_MARGINAL to the corresponding blk_status_t i.e
BLK_STS_TRANSPORT.

Add DID_TRANSPORT_MARGINAL case to scsi_decide_disposition().

Link: https://lore.kernel.org/r/1609969748-17684-2-git-send-email-muneendra.kumar@broadcom.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Muneendra Kumar <muneendra.kumar@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-14 22:55:17 -05:00
Tyrel Datwyler
032d190086 scsi: ibmvfc: Provide modules parameters for MQ settings
Add the various module parameter toggles for adjusting the MQ
characteristics at boot/load time as well as a device attribute for
changing the client scsi channel request amount.

Link: https://lore.kernel.org/r/20210114203148.246656-22-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-14 22:31:04 -05:00
Tyrel Datwyler
9000cb998b scsi: ibmvfc: Enable MQ and set reasonable defaults
Turn on MQ by default and set sane values for the upper limit on hw queues
for the SCSI host, and number of hw SCSI channels to request from the
partner VIOS.

Link: https://lore.kernel.org/r/20210114203148.246656-21-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-14 22:31:04 -05:00
Tyrel Datwyler
7eb3ccd884 scsi: ibmvfc: Purge SCSI channels after transport loss/reset
Grab the queue and list lock for each Sub-CRQ and add any uncompleted
events to the host purge list.

Link: https://lore.kernel.org/r/20210114203148.246656-20-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-14 22:31:04 -05:00
Tyrel Datwyler
a835f386f9 scsi: ibmvfc: Send Cancel MAD down each hw SCSI channel
In general the client needs to send Cancel MADs and task management
commands down the same channel as the command(s) intended to cancel or
abort. The client assigns cancel keys per LUN and thus must send a Cancel
down each channel commands were submitted for that LUN. Further, the client
then must wait for those cancel completions prior to submitting a LUN RESET
or ABORT TASK SET.

Add a cancel rsp iu syncronization field to the ibmvfc_queue struct such
that the cancel routine can sync the cancel response to each queue that
requires a cancel command. Build a list of each cancel event sent and wait
for the completion of each submitted cancel.

Link: https://lore.kernel.org/r/20210114203148.246656-19-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-14 22:31:04 -05:00
Tyrel Datwyler
a61236da7f scsi: ibmvfc: Add cancel mad initialization helper
Add a helper routine for initializing a Cancel MAD. This will be useful for
a channelized client that needs to send Cancel commands down every channel
commands were sent for a particular LUN.

Link: https://lore.kernel.org/r/20210114203148.246656-18-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-14 22:31:04 -05:00
Tyrel Datwyler
b88a5d9b7f scsi: ibmvfc: Register Sub-CRQ handles with VIOS during channel setup
If the ibmvfc client adapter requests channels it must submit a number of
Sub-CRQ handles matching the number of channels being requested. The VIOS
in its response will overwrite the actual number of channel resources
allocated which may be less than what was requested. The client then must
store the VIOS Sub-CRQ handle for each queue. This VIOS handle is needed as
a parameter with h_send_sub_crq().

Link: https://lore.kernel.org/r/20210114203148.246656-17-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-14 22:31:04 -05:00
Tyrel Datwyler
31750fbd7b scsi: ibmvfc: Send commands down HW Sub-CRQ when channelized
When the client has negotiated the use of channels all vfcFrames are
required to go down a Sub-CRQ channel or it is a protocoal violation. If
the adapter state is channelized submit vfcFrames to the appropriate
Sub-CRQ via the h_send_sub_crq() helper.

Link: https://lore.kernel.org/r/20210114203148.246656-16-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-14 22:31:03 -05:00
Tyrel Datwyler
cb72477be7 scsi: ibmvfc: Set and track hw queue in ibmvfc_event struct
Extract the hwq id from a SCSI command and store it in the ibmvfc_event
structure to identify which Sub-CRQ to send the command down when channels
are being utilized.

Link: https://lore.kernel.org/r/20210114203148.246656-15-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-14 22:29:38 -05:00
Tyrel Datwyler
c53408baa5 scsi: ibmvfc: Advertise client support for using hardware channels
Previous patches have plumbed the necessary Sub-CRQ interface and channel
negotiation MADs to fully channelize via hardware backed queues.

Advertise client support via NPIV Login capability IBMVFC_CAN_USE_CHANNELS
when the client bits have MQ enabled via vhost->mq_enabled, or when
channels were already in use during a subsequent NPIV Login. The later is
required because channel support is only renegotiated after a CRQ pair is
broken. Simple NPIV Logout/Logins require the client to continue to
advertise the channel capability until the CRQ pair between the client is
broken.

Link: https://lore.kernel.org/r/20210114203148.246656-14-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-14 22:27:45 -05:00
Tyrel Datwyler
e95eef3fc0 scsi: ibmvfc: Implement channel enquiry and setup commands
New NPIV_ENQUIRY_CHANNEL and NPIV_SETUP_CHANNEL management datagrams (MADs)
were defined in a previous patchset. If the client advertises a desire to
use channels and the partner VIOS is channel capable then the client must
proceed with channel enquiry to determine the maximum number of channels
the VIOS is capable of providing, and registering SubCRQs via channel setup
with the VIOS immediately following NPIV Login. This handshaking should not
be performed for subsequent NPIV Logins unless the CRQ connection has been
reset.

Implement these two new MADs and issue them following a successful NPIV
login where the VIOS has set the SUPPORT_CHANNELS capability bit in the
NPIV Login response.

Link: https://lore.kernel.org/r/20210114203148.246656-13-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-14 22:27:45 -05:00
Tyrel Datwyler
39e461fddf scsi: ibmvfc: Map/request irq and register Sub-CRQ interrupt handler
Create an irq mapping for the hw_irq number provided from phyp firmware.
Request an irq assigned our Sub-CRQ interrupt handler. Unmap these irqs at
Sub-CRQ teardown.

Link: https://lore.kernel.org/r/20210114203148.246656-12-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-14 22:27:44 -05:00
Tyrel Datwyler
80a9e8eaed scsi: ibmvfc: Define Sub-CRQ interrupt handler routine
Simple handler that calls Sub-CRQ drain routine directly.

Link: https://lore.kernel.org/r/20210114203148.246656-11-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-14 22:27:44 -05:00
Tyrel Datwyler
1d956ad853 scsi: ibmvfc: Add handlers to drain and complete Sub-CRQ responses
The logic for iterating over the Sub-CRQ responses is similiar to that of
the primary CRQ. Add the necessary handlers for processing those responses.

Link: https://lore.kernel.org/r/20210114203148.246656-10-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-14 22:27:44 -05:00
Tyrel Datwyler
d20046e64c scsi: ibmvfc: Add Sub-CRQ IRQ enable/disable routine
Each Sub-CRQ has its own interrupt. A hypercall is required to toggle the
IRQ state. Provide the necessary mechanism via a helper function.

Link: https://lore.kernel.org/r/20210114203148.246656-9-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-14 22:27:44 -05:00
Tyrel Datwyler
3034ebe263 scsi: ibmvfc: Add alloc/dealloc routines for SCSI Sub-CRQ Channels
Allocate a set of Sub-CRQs in advance. During channel setup the client and
VIOS negotiate the number of queues the VIOS supports and the number that
the client desires to request. Its possible that the final channel
resources allocated is less than requested, but the client is still
responsible for sending handles for every queue it is hoping for.

Also, provide deallocation cleanup routines.

Link: https://lore.kernel.org/r/20210114203148.246656-8-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-14 22:27:44 -05:00
Tyrel Datwyler
6d07f129dc scsi: ibmvfc: Add Subordinate CRQ definitions
Subordinate Command Response Queues (Sub CRQ) are used in conjunction with
the primary CRQ when more than one queue is needed by the virtual I/O
adapter. Recent phyp firmware versions support Sub CRQ's with ibmvfc
adapters. This feature is a prerequisite for supporting multiple hardware
backed submission queues in the vfc adapter.

The Sub CRQ command element differs from the standard CRQ in that it is
32bytes long as opposed to 16bytes for the latter. Despite this extra
16bytes the ibmvfc protocol will use the original CRQ command element
mapped to the first 16bytes of the Sub CRQ element initially.

Add definitions for the Sub CRQ command element and queue.

Link: https://lore.kernel.org/r/20210114203148.246656-7-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-14 22:27:44 -05:00
Tyrel Datwyler
9e6b6b81aa scsi: ibmvfc: Define hcall wrapper for registering a Sub-CRQ
Sub-CRQs are registred with firmware via a hypercall. Abstract that
interface into a simpler helper function.

Link: https://lore.kernel.org/r/20210114203148.246656-6-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-14 22:27:44 -05:00
Tyrel Datwyler
bb35ecb2a9 scsi: ibmvfc: Add size parameter to ibmvfc_init_event_pool()
With the upcoming addition of Sub-CRQs the event pool size may vary
per-queue.

Add a size parameter to ibmvfc_init_event_pool() such that different size
event pools can be requested by ibmvfc_alloc_queue().

Link: https://lore.kernel.org/r/20210114203148.246656-5-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-14 22:27:43 -05:00
Tyrel Datwyler
003d91a139 scsi: ibmvfc: Init/free event pool during queue allocation/free
The event pool and CRQ used to be separate entities of the adapter host
structure and as such were allocated and freed independently of each
other. Recent work as defined a generic queue structure with an event pool
specific to each queue. As such the event pool for each queue shouldn't be
allocated/freed independently, but instead performed as part of the queue
allocation/free routines.

Move the calls to ibmvfc_event_pool_{init|free} into
ibmvfc_{alloc|free}_queue respectively. The only functional change here is
that the CRQ cannot be released in ibmvfc_remove until after the event pool
has been successfully purged since releasing the queue will also free the
event pool.

Link: https://lore.kernel.org/r/20210114203148.246656-4-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-14 22:27:43 -05:00
Tyrel Datwyler
225acf5f1a scsi: ibmvfc: Move event pool init/free routines
The next patch in this series reworks the event pool allocation calls to
happen within the individual queue allocation routines instead of as
independent calls.

Move the init/free routines earlier in ibmvfc.c to prevent undefined
reference errors when calling these functions from the queue allocation
code. No functional change.

Link: https://lore.kernel.org/r/20210114203148.246656-3-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-14 22:27:43 -05:00
Tyrel Datwyler
6ae208e5d2 scsi: ibmvfc: Add vhost fields and defaults for MQ enablement
Introduce several new vhost fields for managing MQ state of the adapter as
well as initial defaults for MQ enablement.

Link: https://lore.kernel.org/r/20210114203148.246656-2-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-14 22:27:43 -05:00
Can Guo
9cd20d3f47 scsi: ufs: Protect PM ops and err_handler from user access through sysfs
User layer may access sysfs nodes when system PM ops or error handling is
running. This can cause various problems. Rename eh_sem to host_sem and use
it to protect PM ops and error handling from user layer intervention.

Link: https://lore.kernel.org/r/1610594010-7254-3-git-send-email-cang@codeaurora.org
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Acked-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-14 22:12:35 -05:00
Can Guo
fb7afe24ba scsi: ufs: Fix a possible NULL pointer issue
During system resume/suspend, hba could be NULL. In this case, do not touch
eh_sem.

Fixes: 88a92d6ae4 ("scsi: ufs: Serialize eh_work with system PM events and async scan")
Link: https://lore.kernel.org/r/1610594010-7254-2-git-send-email-cang@codeaurora.org
Acked-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-14 22:12:35 -05:00
Brian King
764907293e scsi: ibmvfc: Set default timeout to avoid crash during migration
While testing live partition mobility, we have observed occasional crashes
of the Linux partition. What we've seen is that during the live migration,
for specific configurations with large amounts of memory, slow network
links, and workloads that are changing memory a lot, the partition can end
up being suspended for 30 seconds or longer. This resulted in the following
scenario:

CPU 0                          CPU 1
-------------------------------  ----------------------------------
scsi_queue_rq                    migration_store
 -> blk_mq_start_request          -> rtas_ibm_suspend_me
  -> blk_add_timer                 -> on_each_cpu(rtas_percpu_suspend_me
              _______________________________________V
             |
             V
    -> IPI from CPU 1
     -> rtas_percpu_suspend_me
                                     -> __rtas_suspend_last_cpu

-- Linux partition suspended for > 30 seconds --
                                      -> for_each_online_cpu(cpu)
                                           plpar_hcall_norets(H_PROD
 -> scsi_dispatch_cmd
                                      -> scsi_times_out
                                       -> scsi_abort_command
                                        -> queue_delayed_work
  -> ibmvfc_queuecommand_lck
   -> ibmvfc_send_event
    -> ibmvfc_send_crq
     - returns H_CLOSED
   <- returns SCSI_MLQUEUE_HOST_BUSY
-> __blk_mq_requeue_request

                                      -> scmd_eh_abort_handler
                                       -> scsi_try_to_abort_cmd
                                         - returns SUCCESS
                                       -> scsi_queue_insert

Normally, the SCMD_STATE_COMPLETE bit would protect against the command
completion and the timeout, but that doesn't work here, since we don't
check that at all in the SCSI_MLQUEUE_HOST_BUSY path.

In this case we end up calling scsi_queue_insert on a request that has
already been queued, or possibly even freed, and we crash.

The patch below simply increases the default I/O timeout to avoid this race
condition. This is also the timeout value that nearly all IBM SAN storage
recommends setting as the default value.

Link: https://lore.kernel.org/r/1610463998-19791-1-git-send-email-brking@linux.vnet.ibm.com
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-14 22:02:59 -05:00
Bean Huo
b64750a1b6 scsi: ufs: Remove unnecessary devm_kfree()
The memory allocated with devm_kzalloc() is freed automatically no need to
explicitly call devm_kfree(). Delete it and save some instruction cycles.

Link: https://lore.kernel.org/r/20210112092128.19295-1-huobean@gmail.com
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-13 00:30:04 -05:00
YANG LI
af0c94afc0 scsi: lpfc: Simplify bool comparison
Fix the following coccicheck warning:

./drivers/scsi/lpfc/lpfc_bsg.c:5392:5-29: WARNING: Comparison to bool

Link: https://lore.kernel.org/r/1610439893-64872-1-git-send-email-abaci-bugfix@linux.alibaba.com
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: YANG LI <abaci-bugfix@linux.alibaba.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-13 00:28:22 -05:00
Jaegeuk Kim
a2fca52ee6 scsi: ufs: WB is only available on LUN #0 to #7
Kernel stack violation when getting unit_descriptor/wb_buf_alloc_units from
rpmb LUN. The reason is that the unit descriptor length is different per
LU.

The length of Normal LU is 45 while the one of rpmb LU is 35.

int ufshcd_read_desc_param(struct ufs_hba *hba, ...)
{
	param_offset=41;
	param_size=4;
	buff_len=45;
	...
	buff_len=35 by rpmb LU;

	if (is_kmalloc) {
		/* Make sure we don't copy more data than available */
		if (param_offset + param_size > buff_len)
			param_size = buff_len - param_offset;
			--> param_size = 250;
		memcpy(param_read_buf, &desc_buf[param_offset], param_size);
		--> memcpy(param_read_buf, desc_buf+41, 250);

[  141.868974][ T9174] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: wb_buf_alloc_units_show+0x11c/0x11c
	}
}

Link: https://lore.kernel.org/r/20210111095927.1830311-1-jaegeuk@kernel.org
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-13 00:27:46 -05:00
Nilesh Javali
dc0d9b12b8 scsi: qla2xxx: Update version to 10.02.00.105-k
Link: https://lore.kernel.org/r/20210111093134.1206-8-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-13 00:25:20 -05:00
Saurav Kashyap
ffa018e3a5 scsi: qla2xxx: Enable NVMe CONF (BIT_7) when enabling SLER
Enable NVMe confirmation bit in PRLI.

Link: https://lore.kernel.org/r/20210111093134.1206-7-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-13 00:25:20 -05:00
Quinn Tran
044c218b04 scsi: qla2xxx: Fix mailbox Ch erroneous error
Mailbox Ch/dump ram extend expects mb register 10 to be set. If not
set/clear, firmware can pick up garbage from previous invocation of this
mailbox. Example: mctp dump can set mb10.  On subsequent flash read which
use mailbox cmd Ch, mb10 can retain previous value.

Link: https://lore.kernel.org/r/20210111093134.1206-6-njavali@marvell.com
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-13 00:25:20 -05:00
Bikash Hazarika
a046585943 scsi: qla2xxx: Wait for ABTS response on I/O timeouts for NVMe
FW needs to wait for an ABTS response before completing the I/O.

Link: https://lore.kernel.org/r/20210111093134.1206-5-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Bikash Hazarika <bhazarika@marvell.com>
Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-13 00:25:20 -05:00
Saurav Kashyap
daaecb41a2 scsi: qla2xxx: Move some messages from debug to normal log level
This change will aid in debugging issues arising because of dropped frame,
DIF errors, queue full etc where debug level is not set.

Link: https://lore.kernel.org/r/20210111093134.1206-4-njavali@marvell.com
Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-13 00:25:20 -05:00
Saurav Kashyap
307862e669 scsi: qla2xxx: Add error counters to debugfs node
Display error counters via debugfs node.

Link: https://lore.kernel.org/r/20210111093134.1206-3-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-13 00:25:19 -05:00
Saurav Kashyap
dbf1f53cfd scsi: qla2xxx: Implementation to get and manage host, target stats and initiator port
This statistics will help in debugging process and checking specific error
counts. It also provides a capability to isolate the port or bring it out
of isolation.

Link: https://lore.kernel.org/r/20210111093134.1206-2-njavali@marvell.com
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-13 00:25:19 -05:00
YANG LI
ac341c2d2f scsi: qedf: Simplify bool comparison
Fix the following coccicheck warning:

./drivers/scsi/qedf/qedf_main.c:3716:5-31: WARNING: Comparison to bool

Link: https://lore.kernel.org/r/1610357368-62866-1-git-send-email-abaci-bugfix@linux.alibaba.com
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Acked-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: YANG LI <abaci-bugfix@linux.alibaba.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-13 00:15:13 -05:00
Sergey Shtylyov
e4da5feb09 scsi: aha1542: Fix multi-line comment style
Some comments in this driver don't comply with the preferred multi-line
comment style, as reported by 'scripts/checkpatch.pl':

WARNING: Block comments use * on subsequent lines
WARNING: Block comments use a trailing */ on a separate line

Fix those comments, along with the (unreported for some reason?) starts of
the multi-line comments not being /* on their own line...

Link: https://lore.kernel.org/r/08c231e5-d86f-9d0b-19ac-ad46fa0c0b58@omprussia.ru
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Sergey Shtylyov <s.shtylyov@omprussia.ru>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-13 00:14:07 -05:00
Sergey Shtylyov
6075416cc4 scsi: aha1542: Kill trailing whitespace
Some source lines (mostly the comments) in this driver end with spaces, as
reported by 'scripts/checkpatch.pl'. Trim these lines.

Link: https://lore.kernel.org/r/59829052-4932-4ea3-b504-857bbb19e6a0@omprussia.ru
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Sergey Shtylyov <s.shtylyov@omprussia.ru>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-13 00:14:07 -05:00
Sergey Shtylyov
5637d5b769 scsi: aha1542: Clarify 'struct ccb' comments
This driver's original authors did pretty bad job of documenting the
Command Control Block (CCB) structure -- especially its 2nd byte, where the
bit numbers were completely left out. Sync up the 'struct ccb' comments to
the Adaptec AHA-154xA manual.

Link: https://lore.kernel.org/r/17a7be14-a9d2-9822-bb3e-1d7385f486b0@omprussia.ru
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Sergey Shtylyov <s.shtylyov@omprussia.ru>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-13 00:14:07 -05:00
Avri Altman
fb475b74d6 scsi: ufs: A tad optimization in query upiu trace
Remove a redundant if clause in ufshcd_add_query_upiu_trace.

Link: https://lore.kernel.org/r/20210110084618.189371-1-avri.altman@wdc.com
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-13 00:11:11 -05:00
Vishakha Channapattan
4f608fbce5 scsi: pm80xx: Log SATA IOMB completion status on failure
Added a log message in SATA completion path to capture the status of failed
command. If the status does not match any expected status, another message
will be logged.

On IO failure with known status, the log message will be:

  [ 1712.951735] pm80xx0:: mpi_sata_completion 2269: IO failed device_id 16385 status 0x1 tag XX

If the firmware returns unexpected status, a message of the following
format will be logged:

  [ 1712.951735] pm80xx0:: mpi_sata_completion XXXX: Unknown status device_id XXXXX status 0xX tag XX

Link: https://lore.kernel.org/r/20210109123849.17098-8-Viswas.G@microchip.com
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Vishakha Channapattan <vishakhavc@google.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
Signed-off-by: Ashokkumar N <Ashokkumar.N@microchip.com>
Signed-off-by: Radha Ramachandran <radha@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-13 00:02:01 -05:00
Bhavesh Jashnani
6b2f2d05b5 scsi: pm80xx: Simultaneous poll for all FW readiness
In check_fw_ready() we first wait for ILA to come up and then we wait for
RAAE to come up and IOPs and so on. This is a sequential check.  Because of
this, ILA image seems to be not ready in the allocated time and so the
driver marks it as "not ready" and then moves on to other FW images.

ILA does become ready eventually, but is not checked again. The driver
concludes that FW is not ready when it actually is.

Instead of sequentially polling each image, we keep polling for all images
to be ready. The timeout for the polling has been set to the sum of what
was used for each individual image.

Link: https://lore.kernel.org/r/20210109123849.17098-7-Viswas.G@microchip.com
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Bhavesh Jashnani <bjashnani@google.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
Signed-off-by: Ashokkumar N <Ashokkumar.N@microchip.com>
Signed-off-by: Radha Ramachandran <radha@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-13 00:02:01 -05:00
Viswas G
ec2e7e1aff scsi: pm80xx: Fix driver fatal dump failure
The function pm80xx_get_fatal_dump() has two issues that result in the
fatal dump not being able to complete successfully.

 1. Trying to collect fatal_logs from the application fails because we are
    not shifting the MEMBASE-II register properly. Once we read 64K region
    of data we have to shift the MEMBASE-II register and read the next
    chunk. Only then would we be able to get complete data.

 2. If a timeout occurs, our application will get stuck.

Link: https://lore.kernel.org/r/20210109123849.17098-6-Viswas.G@microchip.com
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
Signed-off-by: Ashokkumar N <Ashokkumar.N@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-13 00:02:01 -05:00
akshatzen
5d28026891 scsi: pm80xx: Fix missing tag_free in NVMD DATA req
Tag was not freed in NVMD get/set data request failure scenario. This
caused a tag leak each time a request failed.

Link: https://lore.kernel.org/r/20210109123849.17098-5-Viswas.G@microchip.com
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: akshatzen <akshatzen@google.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
Signed-off-by: Radha Ramachandran <radha@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-13 00:02:01 -05:00
akshatzen
95652f98b1 scsi: pm80xx: Check main config table address
The driver initializes main configuration, general status, inbound queue
and outbound queue table addresses based on a value read from
MSGU_SCRATCH_PAD_0 register.

We should validate these addresses before dereferencing them.

Adds two validations:

 1. Check if main configuration table offset lies within the pcibar
    mapped

 2. Check if first dword of main configuration table reads "PMCS"

There are two calls to init_pci_device_addresses() done during
pm8001_pci_probe() in this sequence:

 1. First inside chip_soft_rst, where if init_pci_device_addresses fails we
    will go ahead assuming MPI state is not ready and reset the device as
    long as bootloader is okay.  This gives chance to second call of
    init_pci_device_addresses to set up the addresses after reset.

 2. The second call is via pm80xx_chip_init, after soft reset is done and
    firmware is checked to be ready. Once that is done we are safe to go
    ahead and initialize default table values and use them.

Tests:

 1. Enabled debugging logs and observed no issues during initialization,
    with a controller with no issues:

    pm80xx0:: pm8001_setup_msix 1034: pci_alloc_irq_vectors request ret:64 no of intr 64
    pm80xx0:: init_pci_device_addresses 917: Scratchpad 0 Offset: 0x2000 value 0x40002000
    pm80xx0:: init_pci_device_addresses 925: Scratchpad 0 PCI BAR: 0
    pm80xx0:: init_pci_device_addresses 952: VALID main config signature 0x53434d50
    pm80xx0:: init_pci_device_addresses 975: GST OFFSET 0xc4
    pm80xx0:: init_pci_device_addresses 978: INBND OFFSET 0x20000128
    pm80xx0:: init_pci_device_addresses 981: OBND OFFSET 0x24000928
    pm80xx0:: init_pci_device_addresses 984: IVT OFFSET 0x8001408
    pm80xx0:: init_pci_device_addresses 987: PSPA OFFSET 0x8001608
    pm80xx0:: init_pci_device_addresses 991: addr - main cfg (ptrval) general status (ptrval)
    pm80xx0:: init_pci_device_addresses 995: addr - inbnd (ptrval) obnd (ptrval)
    pm80xx0:: init_pci_device_addresses 999: addr - pspa (ptrval) ivt (ptrval)
    pm80xx0:: pm80xx_chip_soft_rst 1446: reset register before write : 0x0
    pm80xx0:: pm80xx_chip_soft_rst 1478: reset register after write 0x40
    pm80xx0:: pm80xx_chip_soft_rst 1544: SPCv soft reset Complete
    pm80xx0:: init_pci_device_addresses 917: Scratchpad 0 Offset: 0x2000 value 0x40002000
    pm80xx0:: init_pci_device_addresses 925: Scratchpad 0 PCI BAR: 0
    pm80xx0:: init_pci_device_addresses 952: VALID main config signature 0x53434d50
    pm80xx0:: init_pci_device_addresses 975: GST OFFSET 0xc4
    pm80xx0:: init_pci_device_addresses 978: INBND OFFSET 0x20000128
    pm80xx0:: init_pci_device_addresses 981: OBND OFFSET 0x24000928
    pm80xx0:: init_pci_device_addresses 984: IVT OFFSET 0x8001408
    pm80xx0:: init_pci_device_addresses 987: PSPA OFFSET 0x8001608
    pm80xx0:: init_pci_device_addresses 991: addr - main cfg (ptrval) general status (ptrval)
    pm80xx0:: init_pci_device_addresses 995: addr - inbnd (ptrval) obnd (ptrval)
    pm80xx0:: init_pci_device_addresses 999: addr - pspa (ptrval) ivt (ptrval)
    pm80xx0:: pm80xx_chip_init 1329: MPI initialize successful!

 2. Tested controller with firmware known to have initialization issue and
    observed no crashes with this fix:

    pm80xx 0000:01:00.0: pm80xx: driver version 0.1.38
    pm80xx 0000:01:00.0: Removing from 1:1 domain
    pm80xx 0000:01:00.0: Requesting non-1:1 mappings
    pm80xx0:: init_pci_device_addresses 948: BAD main config signature 0x0
    pm80xx0:: mpi_uninit_check 1365: Failed to init pci addresses
    pm80xx0:: pm80xx_chip_soft_rst 1435: MPI state is not ready scratch:0:8:62a01000:0
    pm80xx0:: pm80xx_chip_soft_rst 1518: Firmware is not ready!
    pm80xx0:: pm80xx_chip_soft_rst 1532: iButton Feature is not Available!!!
    pm80xx0:: pm80xx_chip_init 1301: Firmware is not ready!
    pm80xx0:: pm8001_pci_probe 1215: chip_init failed [ret: -16]
    pm80xx: probe of 0000:01:00.0 failed with error -16
    pm80xx 0000:07:00.0: pm80xx: driver version 0.1.38
    pm80xx 0000:07:00.0: Removing from 1:1 domain
    pm80xx 0000:07:00.0: Requesting non-1:1 mappings
    scsi host6: pm80xx
    pm80xx1:: pm8001_setup_sgpio 5568: failed sgpio_req timeout
    pm80xx1:: mpi_phy_start_resp 3447: phy start resp status:0x0, phyid:0x0
    pm80xx 0000:08:00.0: pm80xx: driver version 0.1.38
    pm80xx 0000:08:00.0: Removing from 1:1 domain
    pm80xx 0000:08:00.0: Requesting non-1:1 mappings

 3. Without this fix we observe crash on the same controller:

    pm80xx 0000:01:00.0: pm80xx: driver version 0.1.38
    pm80xx 0000:01:00.0: Removing from 1:1 domain
    pm80xx 0000:01:00.0: Requesting non-1:1 mappings
    [<ffffffffc0451b3b>] pm80xx_chip_soft_rst+0x6b/0x4c0 [pm80xx]
    [<ffffffffc043a933>] pm8001_pci_probe+0xa43/0x1630 [pm80xx]
    RIP: 0010:pm80xx_chip_soft_rst+0x71/0x4c0 [pm80xx]
    [<ffffffffc0451b3b>] ? pm80xx_chip_soft_rst+0x6b/0x4c0 [pm80xx]
    [<ffffffffc043a933>] pm8001_pci_probe+0xa43/0x1630 [pm80xx]
    pm80xx0:: mpi_uninit_check 1339: TIMEOUT:IBDB value/=2
    pm80xx0:: pm80xx_chip_soft_rst 1387: MPI state is not ready scratch:0:8:62a01000:0
    pm80xx0:: pm80xx_chip_soft_rst 1470: Firmware is not ready!
    pm80xx0:: pm80xx_chip_soft_rst 1484: iButton Feature is not Available!!!
    pm80xx0:: pm80xx_chip_init 1266: Firmware is not ready!
    pm80xx0:: pm8001_pci_probe 1207: chip_init failed [ret: -16]
    pm80xx: probe of 0000:01:00.0 failed with error -16

Link: https://lore.kernel.org/r/20210109123849.17098-4-Viswas.G@microchip.com
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: akshatzen <akshatzen@google.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
Signed-off-by: Radha Ramachandran <radha@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-13 00:02:01 -05:00
akshatzen
a961ea0afd scsi: pm80xx: Check for fatal error
When the controller runs into a fatal error, commands get stuck due to no
response. If the controller is in fatal error state, abort requests issued
to the controller get stuck too.

Check the controller state for fatal error conditions.

Link: https://lore.kernel.org/r/20210109123849.17098-3-Viswas.G@microchip.com
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: akshatzen <akshatzen@google.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
Signed-off-by: Radha Ramachandran <radha@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-13 00:02:01 -05:00
akshatzen
d71023af4b scsi: pm80xx: Do not busy wait in MPI init check
We do not need to busy wait during mpi_init_check() since it is not being
invoked in atomic context. mpi_init_check() is being called from
pm8001_pci_resume(), pm8001_pci_probe(). Hence we are replacing udelay with
msleep.

Link: https://lore.kernel.org/r/20210109123849.17098-2-Viswas.G@microchip.com
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: akshatzen <akshatzen@google.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
Signed-off-by: Radha Ramachandran <radha@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-13 00:02:01 -05:00
Ziqi Chen
b61d041413 scsi: ufs-qcom: Fix ufs RST_n spec violation
According to the spec (JESD220E chapter 7.2), while powering off/on the ufs
device, RST_n signal should be between VSS(Ground) and VCCQ/VCCQ2.

Link: https://lore.kernel.org/r/1610103385-45755-3-git-send-email-ziqichen@codeaurora.org
Acked-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Ziqi Chen <ziqichen@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-12 23:37:34 -05:00
Ziqi Chen
528db9e563 scsi: ufs: core: Fix ufs clk specs violation
According to the spec (JESD220E chapter 7.2), while powering off/on the ufs
device, REF_CLK signal should be between VSS(Ground) and VCCQ/VCCQ2.

Link: https://lore.kernel.org/r/1610103385-45755-2-git-send-email-ziqichen@codeaurora.org
Reviewed-by: Can Guo <cang@codeaurora.org>
Acked-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Ziqi Chen <ziqichen@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-12 23:37:34 -05:00
YANG LI
dc0bfdb563 scsi: isci: Remove the unneeded variable "status"
The variable 'status' is being initialized with SCI_SUCCESS and never
updated later with a new value. The initialization is redundant and can be
removed.

Link: https://lore.kernel.org/r/1609311860-102820-1-git-send-email-abaci-bugfix@linux.alibaba.com
Reported-by: Abaci <abaci@linux.alibaba.com>
Signed-off-by: YANG LI <abaci-bugfix@linux.alibaba.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-12 23:34:06 -05:00
Dinghao Liu
d6e3ae7672 scsi: fnic: Fix memleak in vnic_dev_init_devcmd2
When ioread32() returns 0xFFFFFFFF, we should execute cleanup functions
like other error handling paths before returning.

Link: https://lore.kernel.org/r/20201225083520.22015-1-dinghao.liu@zju.edu.cn
Acked-by: Karan Tilak Kumar <kartilak@cisco.com>
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-12 23:32:53 -05:00
Javed Hasan
b2b0f16fa6 scsi: libfc: Avoid invoking response handler twice if ep is already completed
A race condition exists between the response handler getting called because
of exchange_mgr_reset() (which clears out all the active XIDs) and the
response we get via an interrupt.

Sequence of events:

	 rport ba0200: Port timeout, state PLOGI
	 rport ba0200: Port entered PLOGI state from PLOGI state
	 xid 1052: Exchange timer armed : 20000 msecs      xid timer armed here
	 rport ba0200: Received LOGO request while in state PLOGI
	 rport ba0200: Delete port
	 rport ba0200: work event 3
	 rport ba0200: lld callback ev 3
	 bnx2fc: rport_event_hdlr: event = 3, port_id = 0xba0200
	 bnx2fc: ba0200 - rport not created Yet!!
	 /* Here we reset any outstanding exchanges before
	 freeing rport using the exch_mgr_reset() */
	 xid 1052: Exchange timer canceled
	 /* Here we got two responses for one xid */
	 xid 1052: invoking resp(), esb 20000000 state 3
	 xid 1052: invoking resp(), esb 20000000 state 3
	 xid 1052: fc_rport_plogi_resp() : ep->resp_active 2
	 xid 1052: fc_rport_plogi_resp() : ep->resp_active 2

Skip the response if the exchange is already completed.

Link: https://lore.kernel.org/r/20201215194731.2326-1-jhasan@marvell.com
Signed-off-by: Javed Hasan <jhasan@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-12 23:07:32 -05:00
Martin Wilck
72eeb7c715 scsi: scsi_transport_srp: Don't block target in failfast state
If the port is in SRP_RPORT_FAIL_FAST state when srp_reconnect_rport() is
entered, a transition to SDEV_BLOCK would be illegal, and a kernel WARNING
would be triggered. Skip scsi_target_block() in this case.

Link: https://lore.kernel.org/r/20210111142541.21534-1-mwilck@suse.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-12 22:56:49 -05:00
Adrian Hunter
b6cacaf204 scsi: ufs: ufs-debugfs: Add error counters
People testing have a need to know how many errors might be occurring over
time. Add error counters and expose them via debugfs.

A module initcall is used to create a debugfs root directory for
ufshcd-related items. In the case that modules are built-in, then
initialization is done in link order, so move ufshcd-core to the top of the
Makefile.

Link: https://lore.kernel.org/r/20210107072538.21782-1-adrian.hunter@intel.com
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-12 22:14:06 -05:00
Andrea Parri (Microsoft)
91b1b640b8 scsi: storvsc: Validate length of incoming packet in storvsc_on_channel_callback()
Check that the packet is of the expected size at least, don't copy data
past the packet.

Link: https://lore.kernel.org/r/20201217203321.4539-4-parri.andrea@gmail.com
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org
Reported-by: Saruhan Karademir <skarade@microsoft.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07 23:15:24 -05:00
Andrea Parri (Microsoft)
244808e030 scsi: storvsc: Resolve data race in storvsc_probe()
vmscsi_size_delta can be written concurrently by multiple instances of
storvsc_probe(), corresponding to multiple synthetic IDE/SCSI devices;
cf. storvsc_drv's probe_type == PROBE_PREFER_ASYNCHRONOUS.  Change the
global variable vmscsi_size_delta to per-synthetic-IDE/SCSI-device.

Link: https://lore.kernel.org/r/20201217203321.4539-3-parri.andrea@gmail.com
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org
Suggested-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07 23:15:00 -05:00
Andrea Parri (Microsoft)
ab548fd21e scsi: storvsc: Fix max_outstanding_req_per_channel for Win8 and newer
Current code overestimates the value of max_outstanding_req_per_channel for
Win8 and newer hosts, since vmscsi_size_delta is set to the initial value
of sizeof(vmscsi_win8_extension) rather than zero.  This may lead to wrong
decisions when using ring_avail_percent_lowater equals to zero.  The
estimate of max_outstanding_req_per_channel is 'exact' for Win7 and older
hosts.  A better choice, keeping the algorithm for the estimation simple,
is to err the other way around, i.e., to underestimate for Win7 and older
but to use the exact value for Win8 and newer.

Link: https://lore.kernel.org/r/20201217203321.4539-2-parri.andrea@gmail.com
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org
Suggested-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07 23:15:00 -05:00
James Smart
181dd9a4c2 scsi: lpfc: Update lpfc version to 12.8.0.7
Update lpfc version to 12.8.0.7

Link: https://lore.kernel.org/r/20210104180240.46824-16-jsmart2021@gmail.com
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07 23:02:37 -05:00
James Smart
0b3ad32e26 scsi: lpfc: Enhancements to LOG_TRACE_EVENT for better readability
While testing recent discovery node rework, several items were seen that
could be done better with respect to the new trace event logic.

1) in the following msg:
      kernel: lpfc 0000:44:00.0: start 35 end 35 cnt 0
   If cnt is zero in the 1st message, there is no reason to display the
   1st message, which is just giving start/end positioning.

   Fix by not displaying message if cnt is 0.

2) If the driver is loaded with module log verbosity off, and later a
   single NPIV host instance verbosity is enabled via sysfs, it enables
   messages on all instances. This is due to the trace log verbosity checks
   (lpfc_dmp_dbg) looking at the phba only. It should look at the phba and
   the vport.

   Fix by enabling a check on both phba and vport.

3) in the following messages:
       2904 Firmware Dump Image Present on Adapter
       2887 Reset Needed: Attempting Port Recovery...
   These messages are not necessary for the trace event log, which is
   primarily for discovery.

   Fix by changing log level on these 2 messages to LOG_SLI.

Link: https://lore.kernel.org/r/20210104180240.46824-15-jsmart2021@gmail.com
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07 23:02:37 -05:00
James Smart
a22d73b655 scsi: lpfc: Implement health checking when aborting I/O
Several errors have occurred where the adapter stops or fails but does not
raise the register values for the driver to detect failure. Thus driver is
unaware of the failure. The failure typically results in I/O timeouts, the
I/O timeout handler failing (after several seconds), and the error handler
escalating recovery policy and resulting in more errors. Eventually, the
driver is in a position where things have spiraled and it can't do recovery
because other recovery ops are still outstanding and it becomes unusable.

Resolve the situation by having the I/O timeout handler (actually a els,
SCSI I/O, NVMe ls, or NVMe I/O timeout), in addition to aborting the I/O,
perform a mailbox command and look for a response from the hardware.  If
the mailbox command fails, it will mark the adapter offline and then invoke
the adapter reset handler to clean up.

The new I/O timeout test will be limited to a test every 5s. If there are
multiple I/O timeouts concurrently, only the 1st I/O timeout will generate
the mailbox command. Further testing will only occur once a timeout occurs
after a 5s delay from the last mailbox command has expired.

Link: https://lore.kernel.org/r/20210104180240.46824-14-jsmart2021@gmail.com
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07 23:02:37 -05:00
James Smart
243156c010 scsi: lpfc: Fix crash when nvmet transport calls host_release
When lpfc is running in NVMET mode and supports the NVME-1 addendum
changes, a LIP on a bound NVME Initiator or lipping the lpfc NVMET's link
resulted in an Oops in lpfc_nvmet_host_release.

The fix requires lpfc NVMET to maintain an additional reference on any node
structure that acts as the hosthandle for the NVMET transport.  This
reference get is a one-time addition, is taken prior to the upcall of an
unsolicited LS_REQ, and is released when the NVMET transport releases the
hosthandle during the host_release downcall.

Link: https://lore.kernel.org/r/20210104180240.46824-13-jsmart2021@gmail.com
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07 23:02:37 -05:00
James Smart
ff8a44bff5 scsi: lpfc: Fix vport create logging
When with testing with large numbers of npiv vports and link bounces, the
driver is flooding the messages file, even with log_verbose = 0.

The new LOG_TRACE_EVENT messages are still generating events to the
messages files.

Fix by converting the vport create msg from LOG_TRACE_EVENT to LOG_VPORT.

Link: https://lore.kernel.org/r/20210104180240.46824-12-jsmart2021@gmail.com
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07 23:02:36 -05:00
James Smart
9ec58ec7d4 scsi: lpfc: Fix NVMe recovery after mailbox timeout
If a mailbox command times out, the SLI port is deemed in error and the
port is reset.  The HBA cleanup is not returning I/Os to the NVMe layer
before the port is unregistered. This is due to the HBA being marked
offline (!SLI_ACTIVE) and cleanup being done by the mailbox timeout handler
rather than an general adapter reset routine.  The mailbox timeout handler
mailbox handler only cleaned up SCSI I/Os.

Fix by reworking the mailbox handler to:

 - After handling the mailbox error, detect the board is already in
   failure (may be due to another error), and leave cleanup to the
   other handler.

 - If the mailbox command timeout is initial detector of the port error,
   continue with the board cleanup and marking the adapter offline
   (!SLI_ACTIVE). Remove the SCSI-only I/O cleanup routine. The generic
   reset adapter routine that is subsequently invoked, will clean up the
   I/Os.

 - Have the reset adapter routine flush all NVMe and SCSI I/Os if the
   adapter has been marked failed (!SLI_ACTIVE).

 - Rework the NVMe I/O terminate routine to take a status code to fail the
   I/O with and update so that cleaned up I/O calls the wqe completion
   routine. Currently it is bypassing the wqe cleanup and calling the NVMe
   I/O completion directly. The wqe completion routine will take care of
   data structure and node cleanup then call the NVMe I/O completion
   handler.

Link: https://lore.kernel.org/r/20210104180240.46824-11-jsmart2021@gmail.com
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07 23:02:36 -05:00
James Smart
31051249f1 scsi: lpfc: Fix target reset failing
Target reset is failed by the target as an invalid command.

The Target Reset TMF has been obsoleted in T10 for a while, but continues
to be used. On (newer) devices, the TMF is rejected causing the reset
handler to escalate to adapter resets.

Fix by having Target Reset TMF rejections be translated into a LOGO and
re-PLOGI with the target device. This provides the same semantic action
(although, if the device also supports nvme traffic, it will terminate nvme
traffic as well - but it's still recoverable).

Link: https://lore.kernel.org/r/20210104180240.46824-10-jsmart2021@gmail.com
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07 23:02:36 -05:00
James Smart
da09ae4864 scsi: lpfc: Fix error log messages being logged following SCSI task mgnt
A successful task mgmt command is logging errors, making it look like
problems were encountered.  This is due to log messages for the
device/target and bus reset handlers having the LOG_TRACE_EVENT flag set.

Fix by adjusting the event flag such that the call to the logging routine
only receives a LOG_TRACE_EVENT if a prior call actually failed.

Link: https://lore.kernel.org/r/20210104180240.46824-9-jsmart2021@gmail.com
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07 23:02:36 -05:00
James Smart
f0871ab68a scsi: lpfc: Prevent duplicate requests to unregister with cpuhp framework
In the lpfc offline routine, called for various reasons such as sysfs
attribute, driver unload, or port error, the driver is calling
__lpfc_cpuhp_remove() to destroy the hot plug data. If the offline routine
is called while the driver is in the process of being unloaded, a request
using lpfc_cpuhp_remove() is also made from lpfc_sli4_hba_unset(). The
cpuhp elements are no longer valid when the second removal request is made.

Fix by only calling the cpuhp removal once when the adapter is in the
process of unloading.

Link: https://lore.kernel.org/r/20210104180240.46824-8-jsmart2021@gmail.com
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07 23:02:36 -05:00
James Smart
3ba6216aad scsi: lpfc: Fix FW reset action if I/Os are outstanding
If the port is configured for NVME and has any outstanding IOs when a FW
reset is requesteed, outstanding I/Os are not properly cleaned up. This
causes the fw download request to fail.

Fix by clearing the LPFC_SLI_ACTIVE flag to signify the I/O must be
manually flushed by the driver on port reset.

Link: https://lore.kernel.org/r/20210104180240.46824-7-jsmart2021@gmail.com
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07 23:02:36 -05:00
James Smart
c33b160934 scsi: lpfc: Use the nvme-fc transport supplied timeout for LS requests
When lpfc generates a GEN_REQUEST wqe for the nvme LS (such as Create
Association), the timeout is set to R_A_TOV without regard to the timeout
value supplied by the nvme-fc transport. The driver should be setting the
timeout to the value passed into the routine. Additionally the caller
should be setting the timeout value to the value in the ls request set by
the nvme transport. Instead, it unconditionally is setting it to a driver
defined value.  So the driver actually overrode the value twice.

Fix by using the timeout provided to the routine, and for the caller, set
the timeout to the ls request timeout value.

Link: https://lore.kernel.org/r/20210104180240.46824-6-jsmart2021@gmail.com
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07 23:02:35 -05:00
James Smart
07aaefdf75 scsi: lpfc: Fix crash when a fabric node is released prematurely
The driver's management of the fabric controller (aka pseudo-scsi
initiator) node in SLI3 mode is causing this crash. The crash occurs
because of a node reference imbalance that frees the fabric controller node
while devloss is outstanding from the SCSI transport.  This is triggered by
an odd behavior where the switch reacts to a rejected RDP request with a
PLOGI and nothing else, not even a LOGO.  The driver ACKS the PLOGI and
after successfully registering the RPI, incorrectly registers the fabric
controller node because it has the NLP_FC4_FCP flag still set from the
fabric controller PRLI.  If a LIP is issued, the driver attempts to cleanup
on Link Up and ends up executing too many puts.

Fix by detecting the fabric node type and clearing out the nodes internal
flags that triggered a SCSI transport registration and subsequence dev_loss
event.  The driver cannot count on any persistence from fabric controller
nodes.

Link: https://lore.kernel.org/r/20210104180240.46824-5-jsmart2021@gmail.com
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07 23:02:35 -05:00
James Smart
ecf041fe98 scsi: lpfc: Refresh ndlp when a new PRLI is received in the PRLI issue state
Testing with target ports coming and going, the driver eventually reached a
state where it no longer discovered the target. When the driver has issued
a PRLI and receives a PRLI from the target, it is not properly updating the
node's initiator/target role flags. Thus, when a subsequent RSCN is
received for a target loss, the driver mis-identifies the target as an
initiator and does not initiate LUN scanning.

Fix by always refreshing the ndlp with the latest PRLI state information
whenever a PRLI is processed.  Also clear the ndlp flags when processing a
PLOGI so that there is no carry over through a re-login.

Link: https://lore.kernel.org/r/20210104180240.46824-4-jsmart2021@gmail.com
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07 23:02:35 -05:00
James Smart
d2f2547efd scsi: lpfc: Fix auto sli_mode and its effect on CONFIG_PORT for SLI3
A very long time ago, there was a feature: auto sli mode. It gave the user
the ability to auto select the SLI mode (SLI2 or SLI3) to run the port in,
or even force SLI2 mode if configured.  Because of the convoluted logic,
the CONFIG_PORT mbox command ends up being called 2 or 3 times. It should
have been called only once.  Additionally, the driver no longer supports
SLI-2, so only SLI-3 mode should be allowed.

The following changes were made:

 - Force module parameter to SLI3 only.

 - Rip out redundant CONFIG_PORT mbox commands.

 - Force CONFIG_PORT mbox command to be in beginning of enable ISR routine.

 - Added changes for offline to online behavior

Link: https://lore.kernel.org/r/20210104180240.46824-3-jsmart2021@gmail.com
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07 23:02:35 -05:00
James Smart
8e062ce305 scsi: lpfc: Fix PLOGI S_ID of 0 on pt2pt config
Under some pt2pt situations, the other end of the link may issue a LOGO
after successfully completing PLOGI and assigning addresses to the port.
Thus the driver may attempt a new PLOGI to re-create the login, but the
LOGO handling cleared the address back to 0. Once this happens, the other
end, which may be address 0, gets all confused and this cannot be resolved
without an administrative action to bounce the link.

Fix by assuming that address assignment only occurs on the 1st PLOGI after
link up, and regardless of login state, the address assignment sticks.  The
FC standards aren't particularly clear in this situation (it only describes
initial PLOGI), but there is nothing that contradicts this and behaviors on
the devices tested appears to conform to the understanding.

Thus, don't reset the port address to 0 as part of LOGO handling. Port
addresses will only reset on link down.

Link: https://lore.kernel.org/r/20210104180240.46824-2-jsmart2021@gmail.com
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07 23:02:35 -05:00
John Garry
3997e0fdd5 scsi: hisi_sas: Remove auto_affine_msi_experimental module_param
Now that the driver always uses managed interrupts, delete
auto_affine_msi_experimental module param.

Link: https://lore.kernel.org/r/1609763622-34119-2-git-send-email-john.garry@huawei.com
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07 22:52:46 -05:00
Jaegeuk Kim
eeb1b55b6e scsi: ufs: Fix tm request when non-fatal error happens
When non-fatal error like line-reset happens, ufshcd_err_handler() starts
to abort tasks by ufshcd_try_to_abort_task(). When it tries to issue a task
management request, we hit two warnings:

WARNING: CPU: 7 PID: 7 at block/blk-core.c:630 blk_get_request+0x68/0x70
WARNING: CPU: 4 PID: 157 at block/blk-mq-tag.c:82 blk_mq_get_tag+0x438/0x46c

After fixing the above warnings we hit another tm_cmd timeout which may be
caused by unstable controller state:

__ufshcd_issue_tm_cmd: task management cmd 0x80 timed-out

Then, ufshcd_err_handler() enters full reset, and kernel gets stuck. It
turned out ufshcd_print_trs() printed too many messages on console which
requires CPU locks. Likewise hba->silence_err_logs, we need to avoid too
verbose messages. This is actually not an error case.

Link: https://lore.kernel.org/r/20210107185316.788815-3-jaegeuk@kernel.org
Fixes: 69a6c269c0 ("scsi: ufs: Use blk_{get,put}_request() to allocate and free TMFs")
Reviewed-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07 22:50:48 -05:00
Jaegeuk Kim
4ee7ee530b scsi: ufs: Fix livelock of ufshcd_clear_ua_wluns()
When gate_work/ungate_work experience an error during hibern8_enter or exit
we can livelock:

 ufshcd_err_handler()
   ufshcd_scsi_block_requests()
   ufshcd_reset_and_restore()
     ufshcd_clear_ua_wluns() -> stuck
   ufshcd_scsi_unblock_requests()

In order to avoid this, ufshcd_clear_ua_wluns() can be called per recovery
flows such as suspend/resume, link_recovery, and error_handler.

Link: https://lore.kernel.org/r/20210107185316.788815-2-jaegeuk@kernel.org
Fixes: 1918651f2d ("scsi: ufs: Clear UAC for RPMB after ufshcd resets")
Reviewed-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07 22:50:48 -05:00
Bean Huo
d9edeb8b47 scsi: ufs: Replace sprintf and snprintf with sysfs_emit
sprintf and snprintf may cause output defect in sysfs content, it is better
to use new added sysfs_emit function which knows the size of the temporary
buffer.

Link: https://lore.kernel.org/r/20210106211541.23039-1-huobean@gmail.com
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Reviewed-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07 22:42:44 -05:00
Randy Dunlap
aaac0ea983 scsi: ufs: Fix all Kconfig help text indentation
Use consistent and expected indentation for all Kconfig text.

Link: https://lore.kernel.org/r/20210106205554.18082-1-rdunlap@infradead.org
Cc: Alim Akhtar <alim.akhtar@samsung.com>
Cc: Avri Altman <avri.altman@wdc.com>
Cc: linux-scsi@vger.kernel.org
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07 22:39:58 -05:00
Tyrel Datwyler
654080d02e scsi: ibmvfc: Relax locking around ibmvfc_queuecommand()
The driver's queuecommand routine is still wrapped to hold the host lock
for the duration of the call. This will become problematic when moving to
multiple queues due to the lock contention preventing asynchronous
submissions to mulitple queues. There is no real legitimate reason to hold
the host lock, and previous patches have insured proper protection of
moving ibmvfc_event objects between free and sent lists.

Link: https://lore.kernel.org/r/20210106201835.1053593-6-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07 22:37:13 -05:00
Tyrel Datwyler
1f4a4a1950 scsi: ibmvfc: Complete commands outside the host/queue lock
Drain the command queue and place all commands on a completion list.
Perform command completion on that list outside the host/queue locks.
Further, move purged command compeletions outside the host_lock as well.

Link: https://lore.kernel.org/r/20210106201835.1053593-5-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07 22:37:13 -05:00
Tyrel Datwyler
57e80e0bc1 scsi: ibmvfc: Define per-queue state/list locks
Define per-queue locks for protecting queue state and event pool sent/free
lists. The evt list lock is initially redundant but it allows the driver to
be modified in the follow-up patches to relax the queue locking around
submissions and completions.

Link: https://lore.kernel.org/r/20210106201835.1053593-4-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07 22:37:13 -05:00
Tyrel Datwyler
e4b26f3db8 scsi: ibmvfc: Make command event pool queue specific
There is currently a single command event pool per host. In anticipation of
providing multiple queues add a per-queue event pool definition and
reimplement the existing CRQ to use its queue defined event pool for
command submission and completion.

Link: https://lore.kernel.org/r/20210106201835.1053593-3-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07 22:37:12 -05:00
Tyrel Datwyler
f8968665af scsi: ibmvfc: Define generic queue structure for CRQs
The primary and async CRQs are nearly identical outside of the format and
length of each message entry in the dma mapped page that represents the
queue data. These queues can be represented with a generic queue structure
that uses a union to differentiate between message format of the mapped
page.

This structure will further be leveraged in a followup patcheset that
introduces Sub-CRQs.

Link: https://lore.kernel.org/r/20210106201835.1053593-2-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07 22:37:12 -05:00