Commit Graph

150 Commits

Author SHA1 Message Date
Xiang Chen
01d4e3a2fc scsi: hisi_sas: Some misc tidy-up
Do some minor tidy-up.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-12 21:30:12 -04:00
John Garry
c63b88ccff scsi: hisi_sas: Fix for setting the PHY linkrate when disconnected
In commit efdcad62e7 ("scsi: hisi_sas: Set PHY linkrate when
disconnected"), we use the sas_phy_data.enable flag to track whether the
PHY was enabled or not, so that we know if we should set the PHY negotiated
linkrate at SAS_LINK_RATE_UNKNOWN or SAS_PHY_DISABLED.

However, it is not proper to use sas_phy_data.enable, since it is only set
when libsas attempts to set the PHY disabled/enabled; hence, it may not
even have an initial value.

As a solution to this problem, introduce hisi_sas_phy.enable to track
whether the PHY is enabled or not, so that we can set the negotiated
linkrate properly when the PHY comes down.

Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-12 21:30:12 -04:00
Xiang Chen
a97fa58680 scsi: hisi_sas: add host reset interface for test
Add host reset interface to make it easier for testing the host reset
feature.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-12 21:30:11 -04:00
Xiang Chen
57dbb2b218 scsi: hisi_sas: Send HARD RESET to clear the previous affiliation of STP target port
If we exchange SAS expander from one SAS controller to other SAS controller
without powering it down, the STP target port will maintain previous
affiliation and reject all subsequent connection requests from other STP
initiator ports with OPEN_REJECT (STP RESOURCES BUSY).

To solve this issue, send HARD RESET to clear the previous affiliation of
STP target port according to SPL (chapter 6.19.4).

We (re-)introduce dev status flag to know if to sleep in NEXUS reset code
or not for remote PHYs. The idea is that if the device is being
initialised, we don't require the delay, and caller would wait for link to
be established, cf. sas_ata_hard_reset().

Co-developed-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-03-06 19:26:46 -05:00
Xiang Chen
4fefe5bbf5 scsi: hisi_sas: Use pci_irq_get_affinity() for v3 hw as experimental
For auto-control irq affinity mode, choose the dq to deliver IO according
to the current CPU.

Then it decreases the performance regression that fio and CQ interrupts are
processed on different node.

For user control irq affinity mode, keep it as before.

To realize it, also need to distinguish the usage of dq lock and sas_dev
lock.

We mark as experimental due to ongoing discussion on managed MSI IRQ
during hotplug:
https://marc.info/?l=linux-scsi&m=154876335707751&w=2

We're almost at the point where we can expose multiple queues to the upper
layer for SCSI MQ, but we need to sort out the per-HBA tags performance
issue.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-08 18:08:22 -05:00
John Garry
795f25a31b scsi: hisi_sas: Issue internal abort on all relevant queues
To support queue mapped to a CPU, it needs to be ensured that issuing an
internal abort is safe, in that it is guaranteed that an internal abort is
processed for a single IO or a device after all the relevant command(s)
which it is attempting to abort have been processed by the controller.

Currently we only deliver commands for any device on a single queue to
solve this problem, as we know that commands issued on the same queue will
be processed in order, and we will not have a scenario where the internal
abort is racing against a command(s) which it is trying to abort.

To enqueue commands on queue mapped to a CPU, choosing a queue for an
command is based on the associated queue for the current CPU, so this is
not safe for internal abort since it would definitely not be guaranteed
that commands for the command devices are issued on the same queue.

To solve this issue, we take a bludgeoning approach, and issue a separate
internal abort on any queue(s) relevant to the command or device, in that
we will be guaranteed that at least one of these internal aborts will be
received last in the controller.

So, for aborting a single command, we can just force the internal abort to
be issued on the same queue as the command which we are trying to abort.

For aborting all commands associated with a device, we issue a separate
internal abort on all relevant queues. Issuing multiple internal aborts in
this fashion would have not side affect.

Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-08 18:08:22 -05:00
Xiaofei Tan
b6c9b15e44 scsi: hisi_sas: Fix losing directly attached disk when hot-plug
Hot-plugging SAS wire of direct hard disk backplane may cause disk lost. We
have done this test with several types of SATA disk from different venders,
and only two models from Seagate has this problem, ST4000NM0035-1V4107 and
ST3000VM002-1ET166.

The root cause is that the disk doesn't send D2H frame after OOB finished.
SAS controller will issue phyup interrupt only when D2H frame is received,
otherwise, will be waiting there all the time.

When this issue happen, we can find the disk again with link reset.  To fix
this issue, we setup an timer after OOB finished. If the PHY is not up in
20s, do link reset. Notes: the 20s is an experience value.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-29 01:41:20 -05:00
Xiang Chen
ffb1c820b8 scsi: hisi_sas: remove the check of sas_dev status in hisi_sas_I_T_nexus_reset()
When issing a hardreset to a SATA device when running IO, it is possible
that abnormal CQs of the device are returned. Then enter error handler, it
doesn't enter function hisi_sas_abort_task() as there is no timeout IO, and
it doesn't set device as HISI_SAS_DEV_EH. So when hardreset by libata
later, it actually doesn't issue hardreset as there is a check to judge
whether device is in error.

For this situation, actually need to hardreset the device to recover.
So remove the check of sas_dev status in hisi_sas_I_T_nexus_reset().

Before we add the check to avoid the endless loop of reset for
directly-attached SATA device at probe time, actually we flutter it for
it, so it is not necessary to add the check now.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-29 01:41:20 -05:00
Xiang Chen
569eddcf3a scsi: hisi_sas: send primitive NOTIFY to SSP situation only
Send primitive NOTIFY to SSP situation only, or it causes underflow issue
when sending IO. Also rename hisi_sas_hw.sl_notify() to hisi_sas_hw.
sl_notify_ssp().

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-29 01:41:20 -05:00
Luo Jiaxing
49159a5e41 scsi: hisi_sas: Take debugfs snapshot for all regs
This patch takes snapshot for global regs, port regs, CQ, DQ, IOST, ITCT.

Add code for snapshot trig and generate dump directory.

Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-08 21:58:37 -05:00
Linus Torvalds
938edb8a31 SCSI misc on 20181224
This is mostly update of the usual drivers: smarpqi, lpfc, qedi,
 megaraid_sas, libsas, zfcp, mpt3sas, hisi_sas.  Additionally, we have
 a pile of annotation, unused variable and minor updates.  The big API
 change is the updates for Christoph's DMA rework which include
 removing the DISABLE_CLUSTERING flag.  And finally there are a couple
 of target tree updates.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXCEUNiYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishdjKAP9vrTTv
 qFaYmAoRSbPq9ZiixaXLMy0K/6o76Uay0gnBqgD/fgn3jg/KQ6alNaCjmfeV3wAj
 u1j3H7tha9j1it+4pUw=
 =GDa+
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This is mostly update of the usual drivers: smarpqi, lpfc, qedi,
  megaraid_sas, libsas, zfcp, mpt3sas, hisi_sas.

  Additionally, we have a pile of annotation, unused variable and minor
  updates.

  The big API change is the updates for Christoph's DMA rework which
  include removing the DISABLE_CLUSTERING flag.

  And finally there are a couple of target tree updates"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (259 commits)
  scsi: isci: request: mark expected switch fall-through
  scsi: isci: remote_node_context: mark expected switch fall-throughs
  scsi: isci: remote_device: Mark expected switch fall-throughs
  scsi: isci: phy: Mark expected switch fall-through
  scsi: iscsi: Capture iscsi debug messages using tracepoints
  scsi: myrb: Mark expected switch fall-throughs
  scsi: megaraid: fix out-of-bound array accesses
  scsi: mpt3sas: mpt3sas_scsih: Mark expected switch fall-through
  scsi: fcoe: remove set but not used variable 'port'
  scsi: smartpqi: call pqi_free_interrupts() in pqi_shutdown()
  scsi: smartpqi: fix build warnings
  scsi: smartpqi: update driver version
  scsi: smartpqi: add ofa support
  scsi: smartpqi: increase fw status register read timeout
  scsi: smartpqi: bump driver version
  scsi: smartpqi: add smp_utils support
  scsi: smartpqi: correct lun reset issues
  scsi: smartpqi: correct volume status
  scsi: smartpqi: do not offline disks for transient did no connect conditions
  scsi: smartpqi: allow for larger raid maps
  ...
2018-12-28 14:48:06 -08:00
Christoph Hellwig
2a3d4eb8e2 scsi: flip the default on use_clustering
Most SCSI drivers want to enable "clustering", that is merging of
segments so that they might span more than a single page.  Remove the
ENABLE_CLUSTERING define, and require drivers to explicitly set
DISABLE_CLUSTERING to disable this feature.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-18 23:13:12 -05:00
Xiang Chen
6db831f4ef scsi: hisi_sas: Make sg_tablesize consistent value
Sht->sg_tablesize is set in the driver, and it will be assigned to
shost->sg_tablesize in SCSI mid-layer. So it is not necessary to assign
shost->sg_table one more time in the driver.

In addition to the change, change each scsi_host_template.sg_tablesize
to HISI_SAS_SGE_PAGE_CNT instead of SG_ALL.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-12 21:23:17 -05:00
John Garry
735bcc77e6 scsi: hisi_sas: Fix warnings detected by sparse
This patchset fixes some warnings detected by the sparse tool, like these:
drivers/scsi/hisi_sas/hisi_sas_main.c:1469:52: warning: incorrect type in assignment (different base types)
drivers/scsi/hisi_sas/hisi_sas_main.c:1469:52:    expected unsigned short [unsigned] [assigned] [usertype] tag_of_task_to_be_managed
drivers/scsi/hisi_sas/hisi_sas_main.c:1469:52:    got restricted __le16 [usertype] <noident>
drivers/scsi/hisi_sas/hisi_sas_main.c:1723:52: warning: incorrect type in assignment (different base types)
drivers/scsi/hisi_sas/hisi_sas_main.c:1723:52:    expected unsigned short [unsigned] [assigned] [usertype] tag_of_task_to_be_managed
drivers/scsi/hisi_sas/hisi_sas_main.c:1723:52:    got restricted __le16 [usertype] <noident>

Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-12 21:23:16 -05:00
Xiang Chen
c3566f9a61 scsi: hisi_sas: Create separate host attributes per HBA
Currently all the three HBA (v1/v2/v3 HW) share the same host attributes.

To support each HBA having separate attributes in future, create per-HBA
attributes.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-11-15 14:37:05 -05:00
YueHaibing
e34ff8edca scsi: hisi_sas: Remove set but not used variable 'dq_list'
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/scsi/hisi_sas/hisi_sas_v1_hw.c: In function 'start_delivery_v1_hw':
drivers/scsi/hisi_sas/hisi_sas_v1_hw.c:907:20: warning:
 variable 'dq_list' set but not used [-Wunused-but-set-variable]

drivers/scsi/hisi_sas/hisi_sas_v2_hw.c: In function 'start_delivery_v2_hw':
drivers/scsi/hisi_sas/hisi_sas_v2_hw.c:1671:20: warning:
 variable 'dq_list' set but not used [-Wunused-but-set-variable]

drivers/scsi/hisi_sas/hisi_sas_v3_hw.c: In function 'start_delivery_v3_hw':
drivers/scsi/hisi_sas/hisi_sas_v3_hw.c:889:20: warning:
 variable 'dq_list' set but not used [-Wunused-but-set-variable]

It never used since introduction in commit
fa222db0b0 ("scsi: hisi_sas: Don't lock DQ for complete task sending")

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-11-05 22:43:13 -05:00
John Garry
fe5fb42de3 scsi: hisi_sas: Fix spin lock management in slot_index_alloc_quirk_v2_hw()
Currently a spin_unlock_irqrestore() call is missing on the error path,
so add it.

Reported-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-10-16 17:50:49 -04:00
Xiang Chen
784b46b7cb scsi: hisi_sas: Use block layer tag instead for IPTT
Currently we use the IPTT defined in LLDD to identify IOs. Actually for
IOs which are from the block layer, they have tags to identify them. So
for those IOs, use tag of the block layer directly, and for IOs which is
not from the block layer (such as internal IOs from libsas/LLDD), reserve
96 IPTTs for them.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-10-16 00:27:04 -04:00
Xiang Chen
3e178f3ecf scsi: hisi_sas: Free slot later in slot_complete_vx_hw()
If an SSP/SMP IO times out, it may be actually in reality be
simultaneously processing completion of the slot in
slot_complete_vx_hw().

Then if the slot is freed in slot_complete_vx_hw() (this IPTT is freed
and it may be re-used by other slot), and we may abort the wrong slot in
hisi_sas_abort_task().

So to solve the issue, free the slot after the check of
SAS_TASK_STATE_ABORTED in slot_complete_vx_hw().

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-10-16 00:27:04 -04:00
Xiang Chen
f4e34f2a5d scsi: hisi_sas: Add SATA FIS check for v3 hw
Add a check ERR bit of status to decide whether there is something wrong
with initial register-D2H FIS. If error exist, PHY link reset the channel
to restart OOB.

Directly call work HISI_PHYE_LINK_RESET replacing disable_phy_vx_hw() and
enable_phy_vx_hw().

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-07-19 21:57:40 -04:00
Xiaofei Tan
1c09b66316 scsi: hisi_sas: add memory barrier in task delivery function
In task start delivery function, we need to add a memory barrier to prevent
re-ordering of reading memory by hardware. Because the slot data is set in
task prepare function and it could be running in another CPU.

This patch adds an memory barrier after s->ready is read in the task start
delivery function, and uses WRITE_ONCE() in the places where s->ready is
set to ensure that the compiler does not re-order.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-07-19 21:57:40 -04:00
Xiaofei Tan
ed99e1d949 scsi: hisi_sas: Add a flag to filter PHY events during reset
During reset, we don't want PHY events reported to libsas for PHYs which
were previously attached prior to reset.

So check hisi_hba->flags for HISI_SAS_RESET_BIT to filter PHY events during
reset.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-06-19 22:02:25 -04:00
Xiang Chen
3e1fb1b8ab scsi: hisi_sas: Mark PHY as in reset for nexus reset
When issuing a nexus reset for directly attached device, we want to ignore
the PHY down events so libsas will not deform and reform the port.

In the case that the attached SAS changes for the reset, libsas will deform
and form a port.

For scenario that the PHY does not come up after a timeout period, then
report the PHY down to libsas.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:32 -04:00
Xiaofei Tan
b09fcd09e9 scsi: hisi_sas: Add v2 hw force PHY function for internal ATA command
This patch adds a force PHY function for internal ATA command for v2 hw.

Because there is an SoC bug in v2 hw, and need send an IO through each PHY
of a port to work around a bug which occurs after a controller reset.

This force PHY function will be used in the later patch.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:32 -04:00
Xiaofei Tan
78bd2b4f6e scsi: hisi_sas: Include TMF elements in struct hisi_sas_slot
In future scenarios we will want to use the TMF struct for more task types
than SSP.

As such, we can add struct hisi_sas_tmf_task directly into struct
hisi_sas_slot, and this will mean we can remove the TMF parameters from the
task prep functions.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:32 -04:00
Xiaofei Tan
a865ae14ff scsi: hisi_sas: Try wait commands before before controller reset
We may reset the controller in many scenarios, such as SCSI EH and HW
errors. There should be no IO which returns from target when SCSI EH is
active. But for other scenarios, there may be.  It is not necessary to make
such IOs fail.

This patch adds an function of trying to wait for any commands, or IO, to
complete before host reset. If no more CQ returned from host controller in
100ms, we assume no more IO can return, and then stop waiting. We wait 5s
at most.

The HW has a register CQE_SEND_CNT to indicate the total number of CQs that
has been reported to driver. We can use this register and it is reliable to
resd this register in such scenarios that require host reset.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:32 -04:00
Xiang Chen
235bfc7ff6 scsi: hisi_sas: Create a scsi_host_template per HW module
When a SCSI host is registered, the SCSI mid-layer takes a reference to a
module in Scsi_host.hostt.module. In doing this, we are prevented from
removing the driver module for the host in dangerous scenario, like when a
disk is mounted.

Currently there is only one scsi_host_template (sht) for all HW versions,
and this is the main.c module. So this means that we can possibly remove
the HW module in this dangerous scenario, as SCSI mid-layer is only
referencing the main.c module.

To fix this, create a sht per module, referencing that same module to
create the Scsi host.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:32 -04:00
John Garry
757db2dae2 scsi: hisi_sas: Introduce hisi_sas_phy_set_linkrate()
There is much common code and functionality between the HW versions to set
the PHY linkrate.

As such, this patch factors out the common code into a generic function
hisi_sas_phy_set_linkrate().

Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:31 -04:00
Xiang Chen
fa222db0b0 scsi: hisi_sas: Don't lock DQ for complete task sending
Currently we lock the DQ to protect whole delivery process.  So this
stops us building slots for the same queue in parallel, and can affect
performance.

To optimise it, only lock the DQ during special periods, specifically
when allocating a slot from the DQ and when delivering a slot to the HW.

This approach is now safe, thanks to the previous patches to ensure that
we always deliver a slot to the HW once allocated.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 11:22:09 -04:00
Xiang Chen
a2b3820bdd scsi: hisi_sas: make return type of prep functions void
Since the task prep functions now should not fail, adjust the return
types to void.

In addition, some checks in the task prep functions are relocated to the
main module; this is specifically the check for the number of elements
in an sg list exceeded the HW SGE limit.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 11:22:09 -04:00
Xiang Chen
7eee4b9218 scsi: hisi_sas: relocate smp sg map
Currently we use DQ lock to protect delivery of DQ entry one by one.

To optimise to allow more than one slot to be built for a single DQ in
parallel, we need to remove the DQ lock when preparing slots, prior to
delivery.

To achieve this, we rearrange the slot build order to ensure that once
we allocate a slot for a task, we do cannot fail to deliver the task.

In this patch, we rearrange the slot building for SMP tasks to ensure
that sg mapping part (which can fail) happens before we allocate the
slot in the DQ.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 11:22:09 -04:00
Xiang Chen
c2c1d9ded0 scsi: hisi_sas: update PHY linkrate after a controller reset
After the controller is reset, we currently may not honour the PHY max
linkrate set via sysfs, in that after a reset we always revert to max
linkrate of 12Gbps, ignoring the value set via sysfs.

This patch modifies to policy to set the programmed PHY linkrate,
honouring the max linkrate programmed via sysfs.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 01:10:44 -04:00
Xiang Chen
cd938e535e scsi: hisi_sas: check host frozen before calling "done" function
When the host is frozen in SCSI EH state, at any point after the LLDD
sets SAS_TASK_STATE_DONE for the sas_task task state, libsas may free
the task; see sas_scsi_find_task().

This puts the LLDD in a difficult position, in that once it sets
SAS_TASK_STATE_DONE for the task state it should not reference the
sas_task again. But the LLDD needs will check the sas_task indirectly in
calling task->task_done()->sas_scsi_task_done() or sas_ata_task_done()
(to check if the host is frozen state actually).

And the LLDD cannot set SAS_TASK_STATE_DONE for the task state after
task->task_done() is called (as the sas_task is free'd at this point).

This situation would seem to be a problem made by libsas.

To work around, check in the LLDD whether the host is in frozen state to
ensure it is ok to call task->task_done() function. If in the frozen
state, we rely on SCSI EH and libsas to free the sas_task directly.

We do not do this for the following IO types:

 - SMP - they are managed in libsas directly, outside SCSI EH
 - Any internally originated IO, for similar reason

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 01:10:44 -04:00
Xiang Chen
b81b6cce58 scsi: hisi_sas: Add some checks to avoid free'ing a sas_task twice
If the SCSI host enters EH, any pending IO will be processed by SCSI
EH. However it is possible that SCSI EH will try to abort the IO and
also at the same time the IO completes in the driver. In this situation
there is a small chance of freeing the sas_task twice.

Then if another IO re-uses freed sas_task before the second time of
free'ing sas_task, it is possible to free incorrect sas_task.

To avoid this situation, add some checks to increase reliability.  The
sas_task task state flag SAS_TASK_STATE_ABORTED is used to mutually
protect the LLDD and libsas freeing the task.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 01:10:43 -04:00
Xiang Chen
24cf43612d scsi: hisi_sas: optimise the usage of DQ locking
In the DQ tasklet processing it is not necessary to take the DQ lock, as
there is no contention between adding slots to the CQ and removing slots
from the matching DQ.

In addition, since we run each DQ in a separate tasklet context, there
would be no possible contention between DQ processing running for the
same queue in parallel.

It is still necessary to take hisi_hba lock when free'ing slots.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 01:10:43 -04:00
John Garry
381ed6c081 scsi: hisi_sas: print device id for errors
When we find an erroneous slot completion, to help aid debugging add the
device index to the current debug log.

Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:51 -04:00
Xiang Chen
5df41af4b1 scsi: hisi_sas: delete timer when removing hisi_sas driver
Delete timer for v1 and v3 hw when removing hisi_sas driver.

Signed-off-by: Xiang chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:51 -04:00
Xiang Chen
8b8d665315 scsi: hisi_sas: make SAS address of SATA disks unique
When directly connected with SATA disks in different SAS cores, fill SAS
address with scsi_host's id to make it's fake SAS address unique.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:51 -04:00
Xiang Chen
edafeef4f2 scsi: hisi_sas: Code cleanup and minor bug fixes
The patch does some code cleanup and fixes some small bugs:

- Correct return status of phy_up_v3_hw() and phy_bcast_v3_hw()
- Add static for function phy_get_max_linkrate_v3_hw()
- Change exception return status when no reset method
- Change magic value to ts->stat in slot_complete_vx_hw()
- Remove unnecessary check for dev_is_sata()
- Fix some issues of alignment and indents (Authored by Xiaofei Tan in
  another patch, but added here to be practical)

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-12 21:55:25 -04:00
Xiaofei Tan
0006ce29e8 scsi: hisi_sas: fix the issue of setting linkrate register
It is not right to set the register PROG_PHY_LINK_RATE while PHY is still
enabled. So if we want to change PHY linkrate, we need to disable PHY before
setting the register PROG_PHY_LINK_RATE, and then start-up PHY. This patch
is to fix this issue.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-12 21:55:24 -04:00
Xiaofei Tan
eba8c20c71 scsi: hisi_sas: fix the issue of link rate inconsistency
In sysfs, there are two files about minimum linkrate, and also two files for
maximum linkrate. Take maximum linkrate example, maximum_linkrate_hw is
read-only and indicated by the register HARD_PHY_LINKRATE, and
maximum_linkrate is read-write and corresponding to the register
PROG_PHY_LINK_RATE.

But in the function phy_up_v*_hw(), we get *_linkrate value from
HARD_PHY_LINKRATE. It is not right. This patch is to fix this issue.

Unreferenced PHY-interrupt enum is also removed for v3 hw.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-12 21:55:24 -04:00
Xiaofei Tan
67c2bf2331 scsi: hisi_sas: support the property of signal attenuation for v2 hw
The register SAS_PHY_CTRL is configured according to signal quality.  The
signal quality is calculated by signal attenuation of hardware physical
link. It may be different for different PCB layout.

So, in order to give better support to new board, this patch add support to
reading the devicetree property, "hisilicon,signal-attenuation".  Of course,
we still keep an default value in driver to adapt old board.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-12 21:55:24 -04:00
Xiaofei Tan
6379c56070 scsi: hisi_sas: directly attached disk LED feature for v2 hw
This patch implements LED feature of directly attached disk for v2 hw.
As libsas has provided an interface lldd_write_gpio() for this feature,
we just need realise the interface following SPGIO API.

We use an CPLD to finish the hardware part of this feature, and the base
address of CPLD should be configured through ACPI or DT tables.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-01-22 20:03:59 -05:00
chenxiang
468f4b8d07 scsi: hisi_sas: Change frame type for SET MAX commands
According to ATA protocol, SET MAX commands belong to different frame
types. So judge features field of SET MAX commands to decide which
frame type they belongs to.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-01-10 23:25:08 -05:00
Xiaofei Tan
057c3d1f07 scsi: hisi_sas: do link reset for some CHL_INT2 ints
We should do link reset of PHY when identify timeout or STP link timeout. They
are internal events of SOC and are notified to driver through interrupts of
CHL_INT2.

Besides, we should add an delay work to do link reset as it needs sleep. So,
this patch add an new PHY event HISI_PHYE_LINK_RESET for this.

Notes: v2 HW doesn't report the event of STP link timeout.  So, we only need
to handle event of identify timeout for v2 HW.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-14 21:25:03 -05:00
Xiaofei Tan
e537b62b07 scsi: hisi_sas: use an general way to delay PHY work
Use an general way to do delay work for a PHY. Then it will be easier to add
new delayed work for a PHY in future.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-14 21:25:03 -05:00
Xiaofei Tan
72f7fc3050 scsi: hisi_sas: add v2 hw port AXI error handling support
Add port AXI errors handling for v2 hw. We do host controller reset for such
errors.

Besides, change port muli-bits ECC error handling, and we should also do host
reset for such error. So, this patch put them in the same struct with port AXI
error.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-14 21:25:03 -05:00
Xiaofei Tan
f64715d283 scsi: hisi_sas: improve int_chnl_int_v2_hw() consistency with v3 hw
Change code format of int_chnl_int_v2_hw() to be consistent with v3 hw to
reduce an tag indent.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-14 21:25:02 -05:00
Xiang Chen
f1c8821145 scsi: hisi_sas: add some print to enhance debugging
Add some print at some places such as error info and cq of exception IO,
device found etc, and also adjust some log levels.

All this to assist debugging ability.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-14 21:25:02 -05:00
Xiaofei Tan
0258141aaa scsi: hisi_sas: relocate clearing ITCT and freeing device
In certain scenarios we may just want to clear the ITCT for a device, and not
free other resources like the SATA bitmap using in v2 hw.

To facilitate this, this patch relocates the code of clearing ITCT from
free_device() to a new hw interface clear_itct().  Then for some hw, we should
not realise free_device() if there's nothing left to do for it.

[mkp: typo]

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-14 21:25:02 -05:00