Add fatal error checking for the pm8001_phy_control() and
pm8001_lu_reset() functions.
Signed-off-by: Changyuan Lyu <changyuanl@google.com>
Signed-off-by: Pranav Prasad <pranavpp@google.com>
Link: https://lore.kernel.org/r/20230526235155.433243-1-pranavpp@google.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In commit 0b639decf6 ("scsi: pm8001: Modify task abort handling for SATA
task"), code was introduced to dereference "task" pointer in
pm8001_abort_task(). However there was a pre-existing later check for
"!task", which spooked the kernel test robot.
Function pm8001_abort_task() should never be passed NULL for "task"
pointer, so remove that check. Also remove the "unlikely" hint, as this is
not fastpath code.
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/1666781764-123090-1-git-send-email-john.garry@huawei.com
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The request associated with a SCSI command coming from the block layer has
a unique tag, so use that when possible for getting a CCB.
Unfortunately we don't support reserved commands in the SCSI midlayer yet,
so in the interim continue to manage those tags internally (along with
tags for private commands).
Signed-off-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/1666091763-11023-6-git-send-email-john.garry@huawei.com
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In commit 5a141315ed ("scsi: pm80xx: Increase the number of outstanding
I/O supported to 1024") the pm8001_ha->tags allocation was moved into
pm8001_init_ccb_tag(). This changed the execution order of allocation.
pm8001_tag_init() used to be called after the pm8001_ha->tags allocation
and now it is called before the allocation.
Before:
pm8001_pci_probe()
`--> pm8001_pci_alloc()
`--> pm8001_alloc()
`--> pm8001_ha->tags = kzalloc(...)
`--> pm8001_tag_init(pm8001_ha); // OK: tags are allocated
After:
pm8001_pci_probe()
`--> pm8001_pci_alloc()
| `--> pm8001_alloc()
| `--> pm8001_tag_init(pm8001_ha); // NOK: tags are not allocated
|
`--> pm8001_init_ccb_tag()
`--> pm8001_ha->tags = kzalloc(...) // today it is bitmap_zalloc()
Since pm8001_ha->tags_num is zero when pm8001_tag_init() is called it does
nothing. Tags memory is allocated with bitmap_zalloc() so there is no need
to manually clear each bit with pm8001_tag_free().
Reviewed-by: Changyuan Lyu <changyuanl@google.com>
Signed-off-by: Igor Pylypiv <ipylypiv@google.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/1666091763-11023-5-git-send-email-john.garry@huawei.com
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The attached phy id finding is open coded. Replace it with
sas_find_attached_phy_id(). To keep things consistent, the return value of
pm8001_dev_found_notify() is also changed to -ENODEV after calling
sas_find_attathed_phy_id() failed.
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Link: https://lore.kernel.org/r/20220928070130.3657183-4-yanaijie@huawei.com
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In commit c6b9ef5779 ("[SCSI] pm80xx: NCQ error handling changes") the
driver had support added to handle NCQ errors but much of what is done in
this handling is duplicated from the libata EH.
In that named commit we handle in 2x main steps:
a. Issue read log ext10 to examine and clear the errors
b. Issue SATA_ABORT all command
Indeed, in libata EH, we do similar to above:
a. ata_do_eh() -> ata_eh_autopsy() -> ata_eh_link_autopsy() ->
ata_eh_analyze_ncq_error() -> ata_eh_read_log_10h()
b. ata_do_eh() -> ata_eh_recover() which will issue a device soft reset
or hard reset
Since there is so much duplication, use sas_ata_device_link_abort() which
will abort all pending IOs and kick of ATA EH which will do the steps,
above.
However we will not follow the advisory to send the SATA_ABORT all command
after the autopsy in read log ext10. Indeed, in libsas EH, we already send
a per-task SATA_ABORT command, and this is prior to the ATA EH kicking in
and issuing the read log ext10 in the recovery process. I judge that this
is ok as the SATA_ABORT command does not actually send any protocol on the
link to abort I/O on the other side, so would not change any state on the
disk (for the read log ext10 command).
Signed-off-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/1665998435-199946-7-git-send-email-john.garry@huawei.com
Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Tested-by: Niklas Cassel <niklas.cassel@wdc.com> # pm80xx
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When we try to abort a SATA task, the CCB of the task which we are trying
to avoid may still complete. In this case, we should not touch the task
associated with that CCB as we can race with libsas freeing the last later
in sas_eh_handle_sas_errors() -> sas_eh_finish_cmd() for when
TASK_IS_ABORTED is returned from sas_scsi_find_task()
Signed-off-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/1665998435-199946-6-git-send-email-john.garry@huawei.com
Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Tested-by: Niklas Cassel <niklas.cassel@wdc.com> # pm80xx
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In pm8001_tag_alloc() we don't require atomic set_bit() as we are already
in atomic context. In pm8001_tag_free() we should use the same host
spinlock to protect clearing the tag (and then don't require the atomic
clear_bit()).
Link: https://lore.kernel.org/r/1654879602-33497-4-git-send-email-john.garry@huawei.com
Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
New special handling is added for SAS_PROTOCOL_INTERNAL_ABORT proto so that
we may use the common queue command API.
Link: https://lore.kernel.org/r/1647001432-239276-4-git-send-email-john.garry@huawei.com
Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The task argument of the pm8001_ccb_task_free() function can be inferred
from the ccb argument ccb_task field. So there is no need to have this
argument. Likewise, the ccb_index argument is always equal to the ccb tag
field and is not needed either. Remove both arguments and update all call
sites. The pm8001_ccb_task_free_done() helper is also modified to match
this change.
Link: https://lore.kernel.org/r/20220220031810.738362-30-damien.lemoal@opensource.wdc.com
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The main part of the pm8001_task_exec() function uses a do {} while(0) loop
that is useless and only makes the code harder to read. Remove this
loop. The unnecessary local variable t is also removed.
Additionally, avoid repeatedly declaring "struct task_status_struct *ts" to
handle error cases by declaring this variable for the entire function
scope. This allows simplifying the error cases, and together with the
addition of blank lines make the code more readable.
Finally, handling of the running_req counter is fixed to avoid decrementing
it without a corresponding incrementation in the case of an invalid task
protocol.
Link: https://lore.kernel.org/r/20220220031810.738362-29-damien.lemoal@opensource.wdc.com
Reviewed-by: John Garry <john.garry@huawei.com>
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Introduce the pm8001_ccb_alloc() and pm8001_ccb_free() helpers to replace
the typical code patterns:
res = pm8001_tag_alloc(pm8001_ha, &ccb_tag);
if (res)
...
ccb = &pm8001_ha->ccb_info[ccb_tag];
ccb->device = pm8001_ha_dev;
ccb->ccb_tag = ccb_tag;
ccb->task = task;
ccb->n_elem = 0;
and
ccb->task = NULL;
ccb->ccb_tag = PM8001_INVALID_TAG;
pm8001_tag_free(pm8001_ha, tag);
With the simpler function calls:
ccb = pm8001_ccb_alloc(pm8001_ha, pm8001_ha_dev, task);
if (!ccb)
...
and
pm8001_ccb_free(pm8001_ha, ccb);
The pm8001_ccb_alloc() helper ensures that all fields of the ccb info
structure for the newly allocated tag are all initialized, except the
buf_prd field. The pm8001_ccb_free() helper clears the initialized fields
and the ccb tag to ensure that iteration over the adapter ccb_info array
detects ccbs that are in use.
All call site of the pm8001_tag_alloc() function that use a ccb info
associated with an allocated tag are converted to use the new helpers.
Link: https://lore.kernel.org/r/20220220031810.738362-27-damien.lemoal@opensource.wdc.com
Reviewed-by: John Garry <john.garry@huawei.com>
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
To detect if a command is NCQ, there is no need to test all possible NCQ
command codes. Instead, use ata_is_ncq() to test the command protocol.
Link: https://lore.kernel.org/r/20220220031810.738362-26-damien.lemoal@opensource.wdc.com
Reviewed-by: John Garry <john.garry@huawei.com>
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Replace the goto statement in the for loop with "break" and remove the
ex_err label. Also fix long lines, identation and blank lines to make the
code more readable.
Link: https://lore.kernel.org/r/20220220031810.738362-25-damien.lemoal@opensource.wdc.com
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In pm8001_chip_set_dev_state_req(), pm8001_chip_fw_flash_update_req(),
pm80xx_chip_phy_ctl_req() and pm8001_chip_reg_dev_req() add missing calls
to pm8001_tag_free() to free the allocated tag when pm8001_mpi_build_cmd()
fails.
Similarly, in pm8001_exec_internal_task_abort(), if the chip ->task_abort
method fails, the tag allocated for the abort request task must be
freed. Add the missing call to pm8001_tag_free().
Link: https://lore.kernel.org/r/20220220031810.738362-22-damien.lemoal@opensource.wdc.com
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The function pm8001_tag_alloc() determines free tags using the function
find_first_zero_bit() which can return 0 when the first bit of the bitmap
being inspected is 0. As such, tag 0 is a valid tag value that should not
be dismissed as invalid. Fix the functions pm8001_work_fn(),
mpi_sata_completion(), pm8001_mpi_task_abort_resp() and
pm8001_open_reject_retry() to not dismiss 0 tags as invalid.
The value 0xffffffff is used for invalid tags for unused ccb information
structures. Add the macro definition PM8001_INVALID_TAG to define this
value.
Link: https://lore.kernel.org/r/20220220031810.738362-20-damien.lemoal@opensource.wdc.com
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Avoid the sparse warning "warning: cast removes address space '__iomem' of
expression" by declaring the qp pointer as "u32 __iomem *". Accordingly,
change the accesses to the qp array to use readl().
Link: https://lore.kernel.org/r/20220220031810.738362-3-damien.lemoal@opensource.wdc.com
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add a generic implementation of abort task TMF handler, and use in LLDDs.
With that, some LLDDs custom TMF functions can now be deleted.
Link: https://lore.kernel.org/r/1645112566-115804-18-git-send-email-john.garry@huawei.com
Tested-by: Yihang Li <liyihang6@hisilicon.com>
Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add a generic implementation of query task TMF handler, and use in LLDDs.
Link: https://lore.kernel.org/r/1645112566-115804-17-git-send-email-john.garry@huawei.com
Tested-by: Yihang Li <liyihang6@hisilicon.com>
Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add a generic implementation of LU reset TMF handler, and use in LLDDs.
Link: https://lore.kernel.org/r/1645112566-115804-16-git-send-email-john.garry@huawei.com
Tested-by: Yihang Li <liyihang6@hisilicon.com>
Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add a generic implementation of clear task set TMF handler, and use in
LLDDs.
Link: https://lore.kernel.org/r/1645112566-115804-15-git-send-email-john.garry@huawei.com
Tested-by: Yihang Li <liyihang6@hisilicon.com>
Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add a generic implementation of abort task set TMF handler, and use in
LLDDs.
Link: https://lore.kernel.org/r/1645112566-115804-14-git-send-email-john.garry@huawei.com
Tested-by: Yihang Li <liyihang6@hisilicon.com>
Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The hisi_sas and pm8001 TMF handlers have some special processing for when
the TMF is aborted, so add a callback and fill it in for those drivers.
Link: https://lore.kernel.org/r/1645112566-115804-13-git-send-email-john.garry@huawei.com
Tested-by: Yihang Li <liyihang6@hisilicon.com>
Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The pm8001 TMF handler has some special processing when the TMF completes,
so add a callback and fill it in for the pm8001 driver.
Link: https://lore.kernel.org/r/1645112566-115804-12-git-send-email-john.garry@huawei.com
Tested-by: Yihang Li <liyihang6@hisilicon.com>
Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add a pointer to a sas_tmf_task to the sas_task struct, as this will be
used when the common LLDD TMF code is factored out.
Also set it for the LLDDs to store per-sas_task TMF info.
Link: https://lore.kernel.org/r/1645112566-115804-9-git-send-email-john.garry@huawei.com
Tested-by: Yihang Li <liyihang6@hisilicon.com>
Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Some of the LLDDs which use libsas have their own definition of a struct
to hold TMF info, so add a common struct for libsas.
Also add an interim force phy id field for hisi_sas driver, which will be
removed once the STP "TMF" code is factored out.
Even though some LLDDs (pm8001) use a u32 for the tag, u16 will be adequate,
as that named driver only uses tags in range [0, 1024).
Link: https://lore.kernel.org/r/1645112566-115804-8-git-send-email-john.garry@huawei.com
Tested-by: Yihang Li <liyihang6@hisilicon.com>
Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This callback is never called, so remove support.
Link: https://lore.kernel.org/r/1645112566-115804-4-git-send-email-john.garry@huawei.com
Tested-by: Yihang Li <liyihang6@hisilicon.com>
Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently a use-after-free may occur if a TMF sas_task is aborted before we
handle the IO completion in mpi_ssp_completion(). The abort occurs due to
timeout.
When the timeout occurs, the SAS_TASK_STATE_ABORTED flag is set and the
sas_task is freed in pm8001_exec_internal_tmf_task().
However, if the I/O completion occurs later, the I/O completion still
thinks that the sas_task is available. Fix this by clearing the ccb->task
if the TMF times out - the I/O completion handler does nothing if this
pointer is cleared.
Link: https://lore.kernel.org/r/1643289172-165636-3-git-send-email-john.garry@huawei.com
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Error handling steps were not in sequence as per the programmers
manual. Expected sequence:
- PHY_DOWN (PORT_IN_RESET)
- PORT_RESET_TIMER_TMO
- Host aborts pending I/Os
- Host deregister the device
- Host sends HW_EVENT_PHY_DOWN ACK
Previously we were sending HW_EVENT_PHY_DOWN ACK first and then deregister
the device. Fix this to use the expected sequence.
Link: https://lore.kernel.org/r/20211228111753.10802-1-Ajish.Koshy@microchip.com
Signed-off-by: Ajish Koshy <Ajish.Koshy@microchip.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Tracepoints for tracking controller and ATA commands issued and completed.
Link: https://lore.kernel.org/r/20211115215750.131696-2-changyuanl@google.com
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Co-developed-by: Akshat Jain <akshatzen@google.com>
Signed-off-by: Akshat Jain <akshatzen@google.com>
Signed-off-by: Changyuan Lyu <changyuanl@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
During phyup event, the firmware provides the phy_id and port_id and driver
is supposed to use these during device handle registration. Previously the
driver was using the port id value from libsas during device handle
registration. Since id can be different from the one assigned by firmware,
this can lead to wrong device registration and drives not showing up.
Use firmware assigned port id during device registration.
Link: https://lore.kernel.org/r/20210906170404.5682-2-Ajish.Koshy@microchip.com
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Ajish Koshy <Ajish.Koshy@microchip.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The TMF timeout timer may trigger at the same time when the response from a
controller is being handled. When this happens the SAS task may get freed
before the response processing is finished.
Fix this by calling complete() only when SAS_TASK_STATE_DONE is not set.
A similar race condition was fixed in commit b90cd6f2b9 ("scsi: libsas:
fix a race condition when smp task timeout")
Link: https://lore.kernel.org/r/20210707185945.35559-1-ipylypiv@google.com
Reviewed-by: Vishakha Channapattan <vishakhavc@google.com>
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Igor Pylypiv <ipylypiv@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fix kernel-doc warnings then test again, wash, rinse, find more, then
repeat more/again.
Also fix spellos, some grammar, and some punctuation.
../drivers/scsi/pm8001/pm8001_ctl.c:557: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
** pm8001_ctl_fatal_log_show - fatal error logging
../drivers/scsi/pm8001/pm8001_ctl.c:577: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
** non_fatal_log_show - non fatal error logging
../drivers/scsi/pm8001/pm8001_ctl.c:622: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
** pm8001_ctl_gsm_log_show - gsm dump collection
Link: https://lore.kernel.org/r/20210708165723.8594-1-rdunlap@infradead.org
Cc: Jack Wang <jinpu.wang@cloud.ionos.com>
Cc: linux-scsi@vger.kernel.org
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This series consists of the usual driver updates (ufs, ibmvfc,
megaraid_sas, lpfc, elx, mpi3mr, qedi, iscsi, storvsc, mpt3sas) with
elx and mpi3mr being new drivers. The major core change is a rework
to drop the status byte handling macros and the old bit shifted
definitions and the rest of the updates are minor fixes.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYN7I6iYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishXpRAQCkngYZ
35yQrqOxgOk2pfrysE95tHrV1MfJm2U49NFTwAEAuZutEvBUTfBF+sbcJ06r6q7i
H0hkJN/Io7enFs5v3WA=
=zwIa
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"This series consists of the usual driver updates (ufs, ibmvfc,
megaraid_sas, lpfc, elx, mpi3mr, qedi, iscsi, storvsc, mpt3sas) with
elx and mpi3mr being new drivers.
The major core change is a rework to drop the status byte handling
macros and the old bit shifted definitions and the rest of the updates
are minor fixes"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (287 commits)
scsi: aha1740: Avoid over-read of sense buffer
scsi: arcmsr: Avoid over-read of sense buffer
scsi: ips: Avoid over-read of sense buffer
scsi: ufs: ufs-mediatek: Add missing of_node_put() in ufs_mtk_probe()
scsi: elx: libefc: Fix IRQ restore in efc_domain_dispatch_frame()
scsi: elx: libefc: Fix less than zero comparison of a unsigned int
scsi: elx: efct: Fix pointer error checking in debugfs init
scsi: elx: efct: Fix is_originator return code type
scsi: elx: efct: Fix link error for _bad_cmpxchg
scsi: elx: efct: Eliminate unnecessary boolean check in efct_hw_command_cancel()
scsi: elx: efct: Do not use id uninitialized in efct_lio_setup_session()
scsi: elx: efct: Fix error handling in efct_hw_init()
scsi: elx: efct: Remove redundant initialization of variable lun
scsi: elx: efct: Fix spelling mistake "Unexected" -> "Unexpected"
scsi: lpfc: Fix build error in lpfc_scsi.c
scsi: target: iscsi: Remove redundant continue statement
scsi: qla4xxx: Remove redundant continue statement
scsi: ppa: Switch to use module_parport_driver()
scsi: imm: Switch to use module_parport_driver()
scsi: mpt3sas: Fix error return value in _scsih_expander_add()
...
Fixes scripts/checkpatch.pl warning:
WARNING: Possible unnecessary 'out of memory' message
Remove it can help us save a bit of memory.
Also change the return error code from "-1" to "-ENOMEM".
Link: https://lore.kernel.org/r/20210610094605.16672-2-thunder.leizhen@huawei.com
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch prepares for converting SAM status codes into an enum. Without
this patch converting SAM status codes into an enumeration type would
trigger complaints about enum type mismatches for the SAS code.
Link: https://lore.kernel.org/r/20210524025457.11299-2-bvanassche@acm.org
Cc: Hannes Reinecke <hare@suse.com>
Cc: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Cc: Jason Yan <yanaijie@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When driver is loaded after rmmod some drives are not showing up during
discovery.
SATA drives are directly attached to the controller connected phys. During
device discovery, the IDENTIFY command (qc timeout (cmd 0xec)) is timing out
during revalidation. This will trigger abort from host side and controller
successfully aborts the command and returns success. Post this successful
abort response ATA library decides to mark the disk as NODEV.
To overcome this, inside pm8001_scan_start() after phy_start() call, add get
start response and wait for few milliseconds to trigger next phy start.
This millisecond delay will give sufficient time for the controller state
machine to accept next phy start.
Link: https://lore.kernel.org/r/20210505120103.24497-1-ajish.koshy@microchip.com
Signed-off-by: Ajish Koshy <ajish.koshy@microchip.com>
Signed-off-by: Viswas G <viswas.g@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When controller runs into fatal error, I/Os get stuck with no response,
handler event is defined to complete the pending I/Os (SAS task and
internal task) and also perform the cleanup for the drives.
Link: https://lore.kernel.org/r/20210415103352.3580-7-Viswas.G@microchip.com
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Ashokkumar N <Ashokkumar.N@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
checkpatch reports the following:
ERROR: space prohibited before that ',' (ctx:WxW)
+int pm8001_mpi_general_event(struct pm8001_hba_info *pm8001_ha , void *piomb);
Remove unnecessary whitespace.
Link: https://lore.kernel.org/r/1617886593-36421-2-git-send-email-luojiaxing@huawei.com
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: Jianqin Xie <xiejianqin@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fixes the following W=1 kernel build warning(s):
drivers/scsi/pm8001/pm8001_sas.c:989: warning: expecting prototype for and hard reset for(). Prototype was for pm8001_I_T_nexus_reset() instead
Link: https://lore.kernel.org/r/20210303144631.3175331-12-lee.jones@linaro.org
Cc: Jack Wang <jinpu.wang@cloud.ionos.com>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
libsas event notifiers required an extension where gfp_t flags must be
explicitly passed. For bisectability, a temporary _gfp() variant of such
functions were added. All call sites then got converted use the _gfp()
variants and explicitly pass GFP context. Having no callers left, the
original libsas notifiers were then modified to accept gfp_t flags by
default.
Switch back to the original libas API, while still passing GFP context.
The libsas _gfp() variants will be removed afterwards.
Link: https://lore.kernel.org/r/20210118100955.1761652-16-a.darwish@linutronix.de
Cc: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Reviewed-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Use the new libsas event notifiers API, which requires callers to
explicitly pass the gfp_t memory allocation flags.
Call chain analysis, pm8001_hwi.c:
pm8001_interrupt_handler_msix() || pm8001_interrupt_handler_intx() || pm8001_tasklet()
-> PM8001_CHIP_DISP->isr() = pm80xx_chip_isr()
-> process_oq [spin_lock_irqsave(&pm8001_ha->lock, ...)]
-> process_one_iomb()
-> mpi_hw_event()
-> hw_event_sas_phy_up()
-> pm8001_bytes_dmaed()
-> hw_event_sata_phy_up
-> pm8001_bytes_dmaed()
All functions are invoked by process_one_iomb(), which is invoked by the
interrupt service routine and the tasklet handler. A similar call chain is
also found at pm80xx_hwi.c. Pass GFP_ATOMIC.
For pm8001_sas.c, pm8001_phy_control() runs in task context as it calls
wait_for_completion() and msleep(). Pass GFP_KERNEL.
Link: https://lore.kernel.org/r/20210118100955.1761652-10-a.darwish@linutronix.de
Cc: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Reviewed-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
LLDDs report events to libsas with .notify_port_event and .notify_phy_event
callbacks.
These callbacks are fixed and so there is no reason why the functions
cannot be called directly, so do that.
This neatens the code slightly, makes it more obvious, and reduces function
pointer usage, which is generally a good thing. Downside is that there are
2x more symbol exports.
[a.darwish@linutronix.de: Remove the now unused "sas_ha" local variables]
Link: https://lore.kernel.org/r/20210118100955.1761652-3-a.darwish@linutronix.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When the controller runs into a fatal error, commands get stuck due to no
response. If the controller is in fatal error state, abort requests issued
to the controller get stuck too.
Check the controller state for fatal error conditions.
Link: https://lore.kernel.org/r/20210109123849.17098-3-Viswas.G@microchip.com
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: akshatzen <akshatzen@google.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
Signed-off-by: Radha Ramachandran <radha@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Make the pm8001_printk() macro take an explicit HBA instead of assuming the
existence of an unspecified pm8001_ha argument.
Miscellanea:
- Add pm8001_ha to the few uses of pm8001_printk()
- Add HBA to the pm8001_dbg macro call to pm8001_printk()
Link: https://lore.kernel.org/r/0e17a4c845f15e18f98b346ffb9b039584d21cdd.1605914030.git.joe@perches.com
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Every PM8001_<FOO>_DBG macro uses an internal call to pm8001_printk.
Convert all uses of:
PM8001_<FOO>_DBG(hba, pm8001_printk(fmt, ...))
to
pm8001_dbg(hba, <FOO>, fmt, ...)
so the visual complexity of each macro is reduced.
The repetitive macro definitions are converted to a single pm8001_dbg and
the level is concatenated using PM8001_##level##_LOGGING for the specific
level test.
Done with coccinelle, checkpatch and a little typing of the new macro
definition.
Miscellanea:
- Coalesce formats
- Realign arguments
- Add missing terminating newlines to formats
- Remove trailing spaces from formats
- Change defective loop with printk(KERN_INFO... to emit a 16 byte hex
block to %p16h
Link: https://lore.kernel.org/r/49f36a93af7752b613d03c89a87078243567fd9a.1605914030.git.joe@perches.com
Reported-by: kernel test robot <lkp@intel.com>
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>