linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-28 15:11:31 +00:00

Author	SHA1	Message	Date
Niklas Cassel	e5dd410acb	ata: libata: Clear DID_TIME_OUT for ATA PT commands with sense data When ata_qc_complete() schedules a command for EH using ata_qc_schedule_eh(), blk_abort_request() will be called, which leads to req->q->mq_ops->timeout() / scsi_timeout() being called. scsi_timeout(), if the LLDD has no abort handler (libata has no abort handler), will set host byte to DID_TIME_OUT, and then call scsi_eh_scmd_add() to add the command to EH. Thus, when commands first enter libata's EH strategy_handler, all the commands that have been added to EH will have DID_TIME_OUT set. libata has its own flag (AC_ERR_TIMEOUT), that it sets for commands that have not received a completion at the time of entering EH. Thus, libata doesn't really care about DID_TIME_OUT at all, and currently clears the host byte at the end of EH, in ata_scsi_qc_complete(), before scsi_eh_finish_cmd() is called. However, this clearing in ata_scsi_qc_complete() is currently only done for commands that are not ATA passthrough commands. Since the host byte is visible in the completion that we return to user space for ATA passthrough commands, for ATA passthrough commands that got completed via EH (commands with sense data), the user will incorrectly see: ATA pass-through(16): transport error: Host_status=0x03 [DID_TIME_OUT] Fix this by moving the clearing of the host byte (which is currently only done for commands that are not ATA passthrough commands) from ata_scsi_qc_complete() to the start of EH (regardless if the command is ATA passthrough or not). While at it, use the proper helper function to clear the host byte, rather than open coding the clearing. This will make sure that we: -Correctly clear DID_TIME_OUT for both ATA passthrough commands and commands that are not ATA passthrough commands. -Do not needlessly clear the host byte for commands that did not go via EH. ata_scsi_qc_complete() is called both for commands that are completed normally (without going via EH), and for commands that went via EH, however, only commands that went via EH will have DID_TIME_OUT set. Fixes: `24aeebbf8e` ("scsi: ata: libata: Change ata_eh_request_sense() to not set CHECK_CONDITION") Reported-by: Igor Pylypiv <ipylypiv@google.com> Closes: https://lore.kernel.org/linux-ide/ZttIN8He8TOZ7Lct@google.com/ Signed-off-by: Niklas Cassel <cassel@kernel.org> Tested-by: Igor Pylypiv <ipylypiv@google.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>	2024-09-11 08:03:43 +09:00
Damien Le Moal	5f8319c4b3	ata: libata: Introduce ata_dev_free_resources Introduce the function ata_dev_free_resources() to free the resources allocated to support a device features. For now, this function is reduced to calling zpodd_exit() for devices that have this feature enabled. ata_dev_free_resources() is called from ata_eh_dev_disable() as this function is always called for all devices attached to a port that is being detached and for devices that are being disabled due to being removed (detached) from the system or due to errors. With this change, the call to zpodd_exit() done in ata_port_detach() and ata_scsi_handle_link_detach() are removed as these functions remove all devices attached to the link or port using libata EH, thus resulting in ata_eh_dev_disable() being called and the zpodd_exit() function being executed. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Niklas Cassel <cassel@kernel.org>	2024-09-07 10:16:55 +09:00
Damien Le Moal	da65bbdd3b	ata: libata: Move sector_buf from struct ata_port to struct ata_device The 512B buffer sector_buf field of struct ata_port is used for scanning devices as well as during error recovery with ata EH. This buffer is thus useless if a port does not have a device connected to it. And also given that commands using this buffer are issued to devices, and not to ports, move this buffer definition from struct ata_port to struct ata_device. This change slightly increases system memory usage for systems using a port-multiplier as in that case we do not need a per-device buffer for scanning devices (PMP does not allow parallel scanning) nor for EH (as when entering EH we are guaranteed that all commands to all devices connected to the PMP have completed or have been aborted). However, this change reduces memory usage on systems that have many ports with only few devices rives connected, which is a much more common use case than the PMP use case. Suggested-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Niklas Cassel <cassel@kernel.org>	2024-09-07 10:16:55 +09:00
Damien Le Moal	10e807637f	ata: libata: Rename ata_eh_read_sense_success_ncq_log() The function ata_eh_read_sense_success_ncq_log() does more that just reading the sense data for successful NCQ commands log page as it also sets the sense data for all commands listed in the log page. Rename this function to ata_eh_get_ncq_success_sense() to better describe what the function does. Furthermore, since this function is only called from ata_eh_get_success_sense() in libata-eh.c, there is no need to export it and its declaration can be moved to drivers/ata/libata.h. To be consistent with this change, the function ata_eh_read_sense_success_non_ncq() is also renamed to ata_eh_get_non_ncq_success_sense(). Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Niklas Cassel <cassel@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de>	2024-09-07 10:16:55 +09:00
Niklas Cassel	9526dec226	ata: libata: Add helper ata_eh_decide_disposition() Every time I see libata code calling scsi_check_sense(), I get confused why the code path that is working fine for SCSI code, is not sufficient for libata code. The reason is that SCSI usually gets the sense data as part of the completion, and will thus automatically call scsi_check_sense(), which will set the SCSI ML byte (if any). However, for libata queued commands, we always need to fetch the sense data via SCSI EH, and thus do not get the luxury of having scsi_check_sense() called automatically. Add a new helper, ata_eh_decide_disposition(), that has a ata_eh_ prefix to more clearly highlight that this is only needed for code called by EH, while also having a similar name to scsi_decide_disposition(), such that it is easier to compare the libata code with the equivalent SCSI code. Also add a big kdoc comment explaining why this helper is called/needed in the first place. Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>	2024-08-29 10:06:11 +09:00
Damien Le Moal	0c76106cb9	scsi: sd: Fix TCG OPAL unlock on system resume Commit `3cc2ffe5c1` ("scsi: sd: Differentiate system and runtime start/stop management") introduced the manage_system_start_stop scsi_device flag to allow libata to indicate to the SCSI disk driver that nothing should be done when resuming a disk on system resume. This change turned the execution of sd_resume() into a no-op for ATA devices on system resume. While this solved deadlock issues during device resume, this change also wrongly removed the execution of opal_unlock_from_suspend(). As a result, devices with TCG OPAL locking enabled remain locked and inaccessible after a system resume from sleep. To fix this issue, introduce the SCSI driver resume method and implement it with the sd_resume() function calling opal_unlock_from_suspend(). The former sd_resume() function is renamed to sd_resume_common() and modified to call the new sd_resume() function. For non-ATA devices, this result in no functional changes. In order for libata to explicitly execute sd_resume() when a device is resumed during system restart, the function scsi_resume_device() is introduced. libata calls this function from the revalidation work executed on devie resume, a state that is indicated with the new device flag ATA_DFLAG_RESUMING. Doing so, locked TCG OPAL enabled devices are unlocked on resume, allowing normal operation. Fixes: `3cc2ffe5c1` ("scsi: sd: Differentiate system and runtime start/stop management") Link: https://bugzilla.kernel.org/show_bug.cgi?id=218538 Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Link: https://lore.kernel.org/r/20240319071209.1179257-1-dlemoal@kernel.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-03-25 15:46:12 -04:00
Damien Le Moal	54d7211da7	ata: libata-eh: Spinup disk on resume after revalidation Move the call to ata_dev_power_set_active() to transition a disk in standby power mode to the active power mode from ata_eh_revalidate_and_attach() before doing revalidation to the end of ata_eh_recover(), after the link speed for the device is reconfigured (if that was necessary). This is safer as this ensure that the VERIFY command executed to spinup the disk is executed with the drive properly reconfigured first. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>	2023-10-13 12:46:03 +09:00
Damien Le Moal	0fecb50891	ata: libata-eh: Reduce "disable device" message verbosity There is no point in warning about a device being disabled when we expect it to be, that is, on suspend, shutdown or when detaching the device. Suppress the message "disable device" for these cases by introducing the EH static function ata_eh_dev_disable() and by using it in ata_eh_unload() and ata_eh_detach_dev(). ata_dev_disable() code is modified to call this new function after printing the "disable device" message. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Tested-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-10-03 09:39:50 +09:00
Damien Le Moal	7f95731c74	ata: libata-eh: Improve reset error messages Some drives are really slow to spinup on resume, resulting is a very slow response to COMRESET and to error messages such as: ata1: COMRESET failed (errno=-16) ata1: link is slow to respond, please be patient (ready=0) ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300) ata1.00: configured for UDMA/133 Given that the slowness of the response is indicated with the message "link is slow to respond..." and that resets are retried until the device is detected as online after up to 1min (ata_eh_reset_timeouts), there is no point in printing the "COMRESET failed" error message. Let's not scare the user with non fatal errors and only warn about reset failures in ata_eh_reset() when all reset retries have been exhausted. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Tested-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-10-03 09:39:50 +09:00
Damien Le Moal	49728bdc70	ata: libata-eh: Fix compilation warning in ata_eh_link_report() The 6 bytes length of the tries_buf string in ata_eh_link_report() is too short and results in a gcc compilation warning with W-!: drivers/ata/libata-eh.c: In function ‘ata_eh_link_report’: drivers/ata/libata-eh.c:2371:59: warning: ‘%d’ directive output may be truncated writing between 1 and 11 bytes into a region of size 4 [-Wformat-truncation=] 2371 \| snprintf(tries_buf, sizeof(tries_buf), " t%d", \| ^~ drivers/ata/libata-eh.c:2371:56: note: directive argument in the range [-2147483648, 4] 2371 \| snprintf(tries_buf, sizeof(tries_buf), " t%d", \| ^~~~~~ drivers/ata/libata-eh.c:2371:17: note: ‘snprintf’ output between 4 and 14 bytes into a destination of size 6 2371 \| snprintf(tries_buf, sizeof(tries_buf), " t%d", \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2372 \| ap->eh_tries); \| ~~~~~~~~~~~~~ Avoid this warning by increasing the string size to 16B. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-09-28 21:24:18 +09:00
Damien Le Moal	aa3998dbeb	ata: libata-scsi: Disable scsi device manage_system_start_stop The introduction of a device link to create a consumer/supplier relationship between the scsi device of an ATA device and the ATA port of that ATA device fixes the ordering of system suspend and resume operations. For suspend, the scsi device is suspended first and the ata port after it. This is fine as this allows the synchronize cache and START STOP UNIT commands issued by the scsi disk driver to be executed before the ata port is disabled. For resume operations, the ata port is resumed first, followed by the scsi device. This allows having the request queue of the scsi device to be unfrozen after the ata port resume is scheduled in EH, thus avoiding to see new requests prematurely issued to the ATA device. Since libata sets manage_system_start_stop to 1, the scsi disk resume operation also results in issuing a START STOP UNIT command to the device being resumed so that the device exits standby power mode. However, restoring the ATA device to the active power mode must be synchronized with libata EH processing of the port resume operation to avoid either 1) seeing the start stop unit command being received too early when the port is not yet resumed and ready to accept commands, or after the port resume process issues commands such as IDENTIFY to revalidate the device. In this last case, the risk is that the device revalidation fails with timeout errors as the drive is still spun down. Commit `0a85890559` ("ata,scsi: do not issue START STOP UNIT on resume") disabled issuing the START STOP UNIT command to avoid issues with it. But this is incorrect as transitioning a device to the active power mode from the standby power mode set on suspend requires a media access command. The IDENTIFY, READ LOG and SET FEATURES commands executed in libata EH context triggered by the ata port resume operation may thus fail. Fix these synchronization issues is by handling a device power mode transitions for system suspend and resume directly in libata EH context, without relying on the scsi disk driver management triggered with the manage_system_start_stop flag. To do this, the following libata helper functions are introduced: 1) ata_dev_power_set_standby(): This function issues a STANDBY IMMEDIATE command to transitiom a device to the standby power mode. For HDDs, this spins down the disks. This function applies only to ATA and ZAC devices and does nothing otherwise. This function also does nothing for devices that have the ATA_FLAG_NO_POWEROFF_SPINDOWN or ATA_FLAG_NO_HIBERNATE_SPINDOWN flag set. For suspend, call ata_dev_power_set_standby() in ata_eh_handle_port_suspend() before the port is disabled and frozen. ata_eh_unload() is also modified to transition all enabled devices to the standby power mode when the system is shutdown or devices removed. 2) ata_dev_power_set_active() and This function applies to ATA or ZAC devices and issues a VERIFY command for 1 sector at LBA 0 to transition the device to the active power mode. For HDDs, since this function will complete only once the disk spin up. Its execution uses the same timeouts as for reset, to give the drive enough time to complete spinup without triggering a command timeout. For resume, call ata_dev_power_set_active() in ata_eh_revalidate_and_attach() after the port has been enabled and before any other command is issued to the device. With these changes, the manage_system_start_stop and no_start_on_resume scsi device flags do not need to be set in ata_scsi_dev_config(). The flag manage_runtime_start_stop is still set to allow the sd driver to spinup/spindown a disk through the sd runtime operations. Fixes: `0a85890559` ("ata,scsi: do not issue START STOP UNIT on resume") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-09-28 21:23:03 +09:00
Niklas Cassel	7a3bc2b398	ata: libata-eh: do not thaw the port twice in ata_eh_reset() commit `1e641060c4` ("libata: clear eh_info on reset completion") added a workaround that broke the retry mechanism in ATA EH. Tejun himself suggested to remove this workaround when it was identified to cause additional problems: https://lore.kernel.org/linux-ide/20110426135027.GI878@htj.dyndns.org/ He even said: "Hmm... it seems I wasn't thinking straight when I added that work around." https://lore.kernel.org/linux-ide/20110426155229.GM878@htj.dyndns.org/ While removing the workaround solved the issue, however, the workaround was kept to avoid "spurious hotplug events during reset", and instead another workaround was added on top of the existing workaround in commit `8c56cacc72` ("libata: fix unexpectedly frozen port after ata_eh_reset()"). Because these IRQs happened when the port was frozen, we know that they were actually a side effect of PxIS and IS.IPS(x) not being cleared before the COMRESET. This is now done in commit 94152042eaa9 ("ata: libahci: clear pending interrupt status"), so these workarounds can now be removed. Since commit `1e641060c4` ("libata: clear eh_info on reset completion") has now been reverted, the ATA EH retry mechanism is functional again, so there is once again no need to thaw the port more than once in ata_eh_reset(). This reverts "the workaround on top of the workaround" introduced in commit `8c56cacc72` ("libata: fix unexpectedly frozen port after ata_eh_reset()"). Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>	2023-09-16 21:11:28 +09:00
Niklas Cassel	80cc944eca	ata: libata-eh: do not clear ATA_PFLAG_EH_PENDING in ata_eh_reset() ata_scsi_port_error_handler() starts off by clearing ATA_PFLAG_EH_PENDING, before calling ap->ops->error_handler() (without holding the ap->lock). If an error IRQ is received while ap->ops->error_handler() is running, the irq handler will set ATA_PFLAG_EH_PENDING. Once ap->ops->error_handler() returns, ata_scsi_port_error_handler() checks if ATA_PFLAG_EH_PENDING is set, and if it is, another iteration of ATA EH is performed. The problem is that ATA_PFLAG_EH_PENDING is not only cleared by ata_scsi_port_error_handler(), it is also cleared by ata_eh_reset(). ata_eh_reset() is called by ap->ops->error_handler(). This additional clearing done by ata_eh_reset() breaks the whole retry logic in ata_scsi_port_error_handler(). Thus, if an error IRQ is received while ap->ops->error_handler() is running, the port will currently remain frozen and will never get re-enabled. The additional clearing in ata_eh_reset() was introduced in commit `1e641060c4` ("libata: clear eh_info on reset completion"). Looking at the original error report: https://marc.info/?l=linux-ide&m=124765325828495&w=2 We can see the following happening: [ 1.074659] ata3: XXX port freeze [ 1.074700] ata3: XXX hardresetting link, stopping engine [ 1.074746] ata3: XXX flipping SControl [ 1.411471] ata3: XXX irq_stat=400040 CONN\|PHY [ 1.411475] ata3: XXX port freeze [ 1.420049] ata3: XXX starting engine [ 1.420096] ata3: XXX rc=0, class=1 [ 1.420142] ata3: XXX clearing IRQs for thawing [ 1.420188] ata3: XXX port thawed [ 1.420234] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300) We are not supposed to be able to receive an error IRQ while the port is frozen (PxIE is set to 0, i.e. all IRQs for the port are disabled). AHCI 1.3.1 section 10.7.1.1 First Tier (IS Register) states: "Each bit location can be thought of as reporting a '1' if the virtual "interrupt line" for that port is indicating it wishes to generate an interrupt. That is, if a port has one or more interrupt status bit set, and the enables for those status bits are set, then this bit shall be set." Additionally, AHCI state P:ComInit clearly shows that the state machine will only jump to P:ComInitSetIS (which sets IS.IPS(x) to '1'), if PxIE.PCE is set to '1'. In our case, PxIE is set to 0, so IS.IPS(x) won't get set. So IS.IPS(x) only gets set if PxIS and PxIE is set. AHCI 1.3.1 section 10.7.1.1 First Tier (IS Register) also states: "The bits in this register are read/write clear. It is set by the level of the virtual interrupt line being a set, and cleared by a write of '1' from the software." So if IS.IPS(x) is set, you need to explicitly clear it by writing a 1 to IS.IPS(x) for that port. Since PxIE is cleared, the only way to get an interrupt while the port is frozen, is if IS.IPS(x) is set, and the only way IS.IPS(x) can be set when the port is frozen, is if it was set before the port was frozen. However, since commit `737dd811a3` ("ata: libahci: clear pending interrupt status"), we clear both PxIS and IS.IPS(x) after freezing the port, but before the COMRESET, so the problem that commit `1e641060c4` ("libata: clear eh_info on reset completion") fixed can no longer happen. Thus, revert commit `1e641060c4` ("libata: clear eh_info on reset completion"), so that the retry logic in ata_scsi_port_error_handler() works once again. (The retry logic is still needed, since we can still get an error IRQ _after_ the port has been thawed, but before ata_scsi_port_error_handler() takes the ap->lock in order to check if ATA_PFLAG_EH_PENDING is set.) Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>	2023-09-16 21:10:37 +09:00
Hannes Reinecke	ff8072d589	ata: libata: remove references to non-existing error_handler() With commit `65a15d6560` ("scsi: ipr: Remove SATA support") all libata drivers now have the error_handler() callback provided, so we can stop checking for non-existing error_handler callback. Signed-off-by: Hannes Reinecke <hare@suse.de> [niklas: fixed review comments, rebased, solved conflicts during rebase, fixed bug that unconditionally dumped all QCs, removed the now unused function ata_dump_status(), removed the now unreachable failure paths in atapi_qc_complete(), removed the non-EH function to request ATAPI sense] Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Jason Yan <yanaijie@huawei.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>	2023-08-02 17:45:10 +09:00
Sergey Shtylyov	ca02f22516	ata: libata-eh: fix reset timeout type ata_eh_reset_timeouts[] stores 'unsigned long' timeouts in ms, while ata_eh_reset() passes these values to ata_deadline() that takes just 'unsigned int timeout_msecs' parameter. Change the reset timeout table element's type to 'unsigned int' -- all timeouts fit into 'unsigned int' but we have to change ULONG_MAX to UINT_MAX... Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>	2023-08-02 17:37:06 +09:00
Linus Torvalds	ca7ce08d6a	SCSI misc on 20230629 Updates to the usual drivers (ufs, pm80xx, libata-scsi, smartpqi, lpfc, qla2xxx). We have a couple of major core changes impacting other systems: Command Duration Limits, which spills into block and ATA and block level Persistent Reservation Operations, which touches block, nvme, target and dm (both of which are added with merge commits containing a cover letter explaining what's going on). Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCZJ19cSYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishfZpAQCQBuWR ELcOhsaG5KzO6xLWcH8mjsOoxffKvazZjTKXlAD5ATEv7++E250oKS3t+yfjae5I Lc195MlDju85ItUQgfk= =U9ik -----END PGP SIGNATURE----- Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI updates from James Bottomley: "Updates to the usual drivers (ufs, pm80xx, libata-scsi, smartpqi, lpfc, qla2xxx). We have a couple of major core changes impacting other systems: - Command Duration Limits, which spills into block and ATA - block level Persistent Reservation Operations, which touches block, nvme, target and dm Both of these are added with merge commits containing a cover letter explaining what's going on" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (187 commits) scsi: core: Improve warning message in scsi_device_block() scsi: core: Replace scsi_target_block() with scsi_block_targets() scsi: core: Don't wait for quiesce in scsi_device_block() scsi: core: Don't wait for quiesce in scsi_stop_queue() scsi: core: Merge scsi_internal_device_block() and device_block() scsi: sg: Increase number of devices scsi: bsg: Increase number of devices scsi: qla2xxx: Remove unused nvme_ls_waitq wait queue scsi: ufs: ufs-pci: Add support for Intel Arrow Lake scsi: sd: sd_zbc: Use PAGE_SECTORS_SHIFT scsi: ufs: wb: Add explicit flush_threshold sysfs attribute scsi: ufs: ufs-qcom: Switch to the new ICE API scsi: ufs: dt-bindings: qcom: Add ICE phandle scsi: ufs: ufs-mediatek: Set UFSHCD_QUIRK_MCQ_BROKEN_RTC quirk scsi: ufs: ufs-mediatek: Set UFSHCD_QUIRK_MCQ_BROKEN_INTR quirk scsi: ufs: core: Add host quirk UFSHCD_QUIRK_MCQ_BROKEN_RTC scsi: ufs: core: Add host quirk UFSHCD_QUIRK_MCQ_BROKEN_INTR scsi: ufs: core: Remove dedicated hwq for dev command scsi: ufs: core: mcq: Fix the incorrect OCS value for the device command scsi: ufs: dt-bindings: samsung,exynos: Drop unneeded quotes ...	2023-06-30 11:57:07 -07:00
Linus Torvalds	1546cd4bfd	ata changes for 6.5-rc1 - Add support for the .remove_new callback to the ata_platform code to simplify device removal interface (Uwe). - Code simplification in ata_dev_revalidate() (Yahu) - Fix code indentation and coding style in the pata_parport protocol modules to avoid warnings from static code analyzers (me) - Clarify ata_eh_qc_retry() behavior with better comments (Niklas) - Simplify and improve ata_change_queue_depth() behavior to have a consistent behavior between libsas managed devices and libata managed devices (e.g. AHCI connected devices) (me). - Cleanup libata-scsi and libata-eh code to use the ata_ncq_enabled() and ata_ncq_supported() helpers instead of open coding flags tests (me) - Cleanup ahci_reset_controller() code (me). - Change the pata_octeon_cf and sata_svw drivers to use of_property_read_reg() to simplify the code (Rob, me). - Remove unnecessary include files from ahci_octeon driver (me) - Modify the DesignWare ahci dt bindings to add support for the Rockchip RK3588 AHCI (Sebastian). -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQSRPv8tYSvhwAzJdzjdoc3SxdoYdgUCZJourwAKCRDdoc3SxdoY dv43AQDzAFY0/0sjvqltGC31wRzzh/vEQFWsYt89Q4csMr4QgAEAkLO1gquH5/Wt sxnCLh1WdFqbyNy6xsw+CXrfeREGDgo= =IzEr -----END PGP SIGNATURE----- Merge tag 'ata-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata Pull ata updates from Damien Le Moal: - Add support for the .remove_new callback to the ata_platform code to simplify device removal interface (Uwe) - Code simplification in ata_dev_revalidate() (Yahu) - Fix code indentation and coding style in the pata_parport protocol modules to avoid warnings from static code analyzers (me) - Clarify ata_eh_qc_retry() behavior with better comments (Niklas) - Simplify and improve ata_change_queue_depth() behavior to have a consistent behavior between libsas managed devices and libata managed devices (e.g. AHCI connected devices) (me) - Cleanup libata-scsi and libata-eh code to use the ata_ncq_enabled() and ata_ncq_supported() helpers instead of open coding flags tests (me) - Cleanup ahci_reset_controller() code (me) - Change the pata_octeon_cf and sata_svw drivers to use of_property_read_reg() to simplify the code (Rob, me) - Remove unnecessary include files from ahci_octeon driver (me) - Modify the DesignWare ahci dt bindings to add support for the Rockchip RK3588 AHCI (Sebastian) * tag 'ata-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata: (29 commits) dt-bindings: phy: rockchip: rk3588 has two reset lines dt-bindings: ata: dwc-ahci: add Rockchip RK3588 dt-bindings: ata: dwc-ahci: add PHY clocks ata: ahci_octeon: Remove unnecessary include ata: pata_octeon_cf: Add missing header include ata: ahci: Cleanup ahci_reset_controller() ata: Use of_property_read_reg() to parse "reg" ata: libata-scsi: Use ata_ncq_supported in ata_scsi_dev_config() ata: libata-eh: Use ata_ncq_enabled() in ata_eh_speed_down() ata: libata-sata: Improve ata_change_queue_depth() ata: libata-sata: Simplify ata_change_queue_depth() ata: libata-eh: Clarify ata_eh_qc_retry() behavior at call site ata: pata_parport: Fix on26 module code indentation and style ata: pata_parport: Fix on20 module code indentation and style ata: pata_parport: Fix ktti module code indentation and style ata: pata_parport: Fix kbic module code indentation and style ata: pata_parport: Fix friq module code indentation and style ata: pata_parport: Fix fit3 module code indentation and style ata: pata_parport: Fix fit2 module code indentation and style ata: pata_parport: Fix epia module code indentation and style ...	2023-06-30 11:48:16 -07:00
Damien Le Moal	6aa0365a3c	ata: libata-scsi: Avoid deadlock on rescan after device resume When an ATA port is resumed from sleep, the port is reset and a power management request issued to libata EH to reset the port and rescanning the device(s) attached to the port. Device rescanning is done by scheduling an ata_scsi_dev_rescan() work, which will execute scsi_rescan_device(). However, scsi_rescan_device() takes the generic device lock, which is also taken by dpm_resume() when the SCSI device is resumed as well. If a device rescan execution starts before the completion of the SCSI device resume, the rcu locking used to refresh the cached VPD pages of the device, combined with the generic device locking from scsi_rescan_device() and from dpm_resume() can cause a deadlock. Avoid this situation by changing struct ata_port scsi_rescan_task to be a delayed work instead of a simple work_struct. ata_scsi_dev_rescan() is modified to check if the SCSI device associated with the ATA device that must be rescanned is not suspended. If the SCSI device is still suspended, ata_scsi_dev_rescan() returns early and reschedule itself for execution after an arbitrary delay of 5ms. Reported-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Reported-by: Joe Breuer <linux-kernel@jmbreuer.net> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217530 Fixes: `a19a93e4c6` ("scsi: core: pm: Rely on the device driver core for async power management") Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Tested-by: Joe Breuer <linux-kernel@jmbreuer.net>	2023-06-18 12:00:49 +09:00
Damien Le Moal	12980c1f2f	ata: libata-eh: Use ata_ncq_enabled() in ata_eh_speed_down() In ata_eh_speed_down(), instead of hard-coding the test on the device flags to detect if NCQ is supported and enabled, use ata_ncq_enabled(). Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: John Garry <john.g.garry@oracle.com>	2023-06-05 21:33:47 +09:00
Niklas Cassel	e4c26a1b74	ata: libata-eh: Clarify ata_eh_qc_retry() behavior at call site While the function documentation for ata_eh_qc_retry() is clear, from simply reading the single function that calls ata_eh_qc_retry(), it is not clear that ata_eh_qc_retry() might not retry the command. Add a comment in the single function that calls ata_eh_qc_retry() to clarify the behavior. [Damien] Added curly braces to "if () else" with multi-line comment. Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>	2023-06-01 09:13:22 +09:00
Niklas Cassel	18bd7718b5	scsi: ata: libata: Handle completion of CDL commands using policy 0xD A CDL timeout for policy 0xF is defined as a NCQ error, just with a CDL specific sk/asc/ascq in the sense data. Therefore, the existing code in libata does not need to be modified to handle a policy 0xF CDL timeout. For Command Duration Limits policy 0xD: The device shall complete the command without error with the additional sense code set to DATA CURRENTLY UNAVAILABLE. Since a CDL timeout for policy 0xD is not an error, we cannot use the NCQ Command Error log (10h). Instead, we need to read the Sense Data for Successful NCQ Commands log (0Fh). In the success case, just like in the error case, we cannot simply read a log page from the interrupt handler itself, since reading a log page involves sending a READ LOG DMA EXT or READ LOG EXT command. Therefore, we add a new EH action ATA_EH_GET_SUCCESS_SENSE. When a command completes without error, and when the ATA_SENSE bit is set, this new action is set as pending, and EH is scheduled. This way, similar to the NCQ error case, the log page will be read from EH context. An alternative would have been to add a new kthread or workqueue to handle this. However, extending EH can be done with minimal changes and avoids the need to synchronize a new kthread/workqueue with EH. Co-developed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Link: https://lore.kernel.org/r/20230511011356.227789-20-nks@flawful.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-22 17:05:20 -04:00
Niklas Cassel	24aeebbf8e	scsi: ata: libata: Change ata_eh_request_sense() to not set CHECK_CONDITION Currently, ata_eh_request_sense() unconditionally sets the scsicmd->result to SAM_STAT_CHECK_CONDITION. For Command Duration Limits policy 0xD: The device shall complete the command without error (SAM_STAT_GOOD) with the additional sense code set to DATA CURRENTLY UNAVAILABLE. It is perfectly fine to have sense data for a command that returned completion without error. In order to support for CDL policy 0xD, we have to remove this assumption that having sense data means that the command failed (SAM_STAT_CHECK_CONDITION). Change ata_eh_request_sense() to not set SAM_STAT_CHECK_CONDITION, and instead move the setting of SAM_STAT_CHECK_CONDITION to the single caller that wants SAM_STAT_CHECK_CONDITION set, that way ata_eh_request_sense() can be reused in a follow-up patch that adds support for CDL policy 0xD. The only caller of ata_eh_request_sense() is protected by: if (!(qc->flags & ATA_QCFLAG_SENSE_VALID)), so we can remove this duplicated check from ata_eh_request_sense() itself. Additionally, ata_eh_request_sense() is only called from ata_eh_analyze_tf(), which is only called when iteratating the QCs using ata_qc_for_each_raw(), which does not include the internal tag, so cmd can never be NULL (all non-internal commands have qc->scsicmd set), so remove the !cmd check as well. Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Link: https://lore.kernel.org/r/20230511011356.227789-14-nks@flawful.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-22 17:05:19 -04:00
Niklas Cassel	876293121f	ata: scsi: rename flag ATA_QCFLAG_FAILED to ATA_QCFLAG_EH The name ATA_QCFLAG_FAILED is misleading since it does not mean that a QC completed in error, or that it didn't complete at all. It means that libata decided to schedule EH for the QC, so the QC is now owned by the libata error handler (EH). The normal execution path is responsible for not accessing a QC owned by EH. libata core enforces the rule by returning NULL from ata_qc_from_tag() for QCs owned by EH. It is quite easy to mistake that a QC marked with ATA_QCFLAG_FAILED was an error. However, a QC that was actually an error is instead indicated by having qc->err_mask set. E.g. when we have a NCQ error, we abort all QCs, which currently will mark all QCs as ATA_QCFLAG_FAILED. However, it will only be a single QC that is an error (i.e. has qc->err_mask set). Rename ATA_QCFLAG_FAILED to ATA_QCFLAG_EH to more clearly highlight that this flag simply means that a QC is now owned by EH. This new name will not mislead to think that the QC was an error (which is instead indicated by having qc->err_mask set). This also makes it more obvious that the EH code skips all QCs that do not have ATA_QCFLAG_EH set (rather than ATA_QCFLAG_FAILED), since the EH code should simply only care about QCs that are owned by EH itself. Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>	2023-01-04 13:36:26 +09:00
Wenchao Hao	b83ad9eec3	ata: libata-eh: Cleanup ata_scsi_cmd_error_handler() If ap->ops->error_handler is NULL just return. This patch also fixes some comment style issue. Signed-off-by: Wenchao Hao <haowenchao@huawei.com> Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>	2023-01-04 13:25:28 +09:00
Niklas Cassel	3d8a3ae3d9	ata: libata: fix commands incorrectly not getting retried during NCQ error A NCQ error means that the device has aborted processing of all active commands. To get the single NCQ command that caused the NCQ error, host software has to read the NCQ error log, which also takes the device out of error state. When the device encounters a NCQ error, we receive an error interrupt from the HBA, and call ata_do_link_abort() to mark all outstanding commands on the link as ATA_QCFLAG_FAILED (which means that these commands are owned by libata EH), and then call ata_qc_complete() on them. ata_qc_complete() will call fill_result_tf() for all commands marked as ATA_QCFLAG_FAILED. The taskfile is simply the latest status/error as seen from the device's perspective. The taskfile will have ATA_ERR set in the status field and ATA_ABORTED set in the error field. When we fill the current taskfile values for all outstanding commands, that means that qc->result_tf will have ATA_ERR set for all commands owned by libata EH. When ata_eh_link_autopsy() later analyzes all commands owned by libata EH, it will call ata_eh_analyze_tf(), which will check if qc->result_tf has ATA_ERR set, if it does, it will set qc->err_mask (which marks the command as an error). When ata_eh_finish() later calls __ata_qc_complete() on all commands owned by libata EH, it will call qc->complete_fn() (ata_scsi_qc_complete()), ata_scsi_qc_complete() will call ata_gen_ata_sense() to generate sense data if qc->err_mask is set. This means that we will generate sense data for commands that should not have any sense data set. Having sense data set for the non-failed commands will cause SCSI to finish these commands instead of retrying them. While this incorrect behavior has existed for a long time, this first became a problem once we started reading the correct taskfile register in commit `4ba09d2026` ("ata: libahci: read correct status and error field for NCQ commands"). Before this commit, NCQ commands would read the taskfile values received from the last non-NCQ command completion, which most likely did not have ATA_ERR set, since the last non-NCQ command was most likely not an error. Fix this by changing ata_eh_analyze_ncq_error() to mark all non-failed commands as ATA_QCFLAG_RETRY, and change the loop in ata_eh_link_autopsy() to skip commands marked as ATA_QCFLAG_RETRY. While at it, make sure that we clear ATA_ERR and any error bits for all commands except the actual command that caused the NCQ error, so that no other libata code will be able to misinterpret these commands as errors. Fixes: `4ba09d2026` ("ata: libahci: read correct status and error field for NCQ commands") Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>	2022-11-19 09:41:52 +09:00
Niklas Cassel	4cb7c6f1ef	ata: make use of ata_port_is_frozen() helper Clean up the code by making use of the newly introduced ata_port_is_frozen() helper function. Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>	2022-10-18 13:53:27 +09:00
Niklas Cassel	013115d90e	ata: libata: fetch sense data for ATA devices supporting sense reporting Currently, the sense data reporting feature set is enabled for all ATA devices which supports the feature set (ata_id_has_sense_reporting()), see ata_dev_config_sense_reporting(). However, even if sense data reporting is enabled, and the device indicates that sense data is available, the sense data is only fetched for ATA ZAC devices. For regular ATA devices, the available sense data is never fetched, it is simply ignored. Instead, libata will use the ERROR + STATUS fields and map them to a very generic and reduced set of sense data, see ata_gen_ata_sense() and ata_to_sense_error(). When sense data reporting was first implemented, regular ATA devices did fetch the sense data from the device. However, this was restricted to only ATA ZAC devices in commit `ca156e006a` ("libata: don't request sense data on !ZAC ATA devices"). With recent changes related to sense data and NCQ autosense, we want to, once again, fetch the sense data for all ATA devices supporting sense reporting. ata_gen_ata_sense() should only be used for devices that don't support the sense data reporting feature set. hopefully the features will be more robust this time around. It is not just ZAC, many new ATA features, e.g. Command Duration Limits, relies on working NCQ autosense and sense data. Therefore, it is not really an option to avoid fetching the sense data forever. If we encounter a device that is misbehaving because the sense data is actually fetched, then that device should be quirked such that it never enables the sense data reporting feature set in the first place, since such a device is obviously not compliant with the specification. The order in which we will try to add sense data to a scsi_cmnd: 1) NCQ autosense (if supported) - ata_eh_analyze_ncq_error() 2) REQUEST SENSE DATA EXT (if supported) - ata_eh_request_sense() 3) error + status field translation - ata_gen_ata_sense(), called by ata_scsi_qc_complete() if neither 1) or 2) is supported. Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>	2022-10-17 11:31:52 +09:00
Niklas Cassel	4b89ad8e5e	ata: libata: only set sense valid flag if sense data is valid While this shouldn't be needed if all devices that claim that they support NCQ autosense (ata_id_has_ncq_autosense()) and/or the sense data reporting feature (ata_id_has_sense_reporting()), actually supported those features. However, there might be some old ATA devices that either have these bits set, even when they don't support those features, or they simply return malformed data when using those features. These devices should be quirked, but in order to try to minimize the impact for the users of these such devices, it was suggested by Damien Le Moal that it might be a good idea to sanity check the sense data received from the device. If the sense data looks bogus, then the sense data is never added to the scsi_cmnd command. Introduce a new function, ata_scsi_sense_is_valid(), and use it in all places where sense data is received from the device. Suggested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>	2022-10-17 11:31:52 +09:00
Niklas Cassel	461ec04067	ata: libata: clarify when ata_eh_request_sense() will be called ata_eh_request_sense() returns early when flag ATA_QCFLAG_SENSE_VALID is set. However, since the call to ata_eh_request_sense() is guarded by a ATA_SENSE bit conditional, the logical conclusion for the reader is that all checks are performed at the call site. Highlight the fact that the sense data will not be fetched if flag ATA_QCFLAG_SENSE_VALID is already set by adding an additional check to the existing guarding conditional. No functional change. Additionally, add a comment explaining that ata_eh_analyze_tf() will only fetch the sense data if: -It was a non-NCQ command that failed, or -It was a NCQ command that failed, but the sense data was not included in the NCQ command error log (i.e. NCQ autosense is not supported). Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>	2022-10-17 11:31:52 +09:00
Linus Torvalds	4078aa6850	ata changes for 6.1-rc1 * Print the timeout value for internal command failures due to a timeout (from Tomas). * Improve parameter names in ata_dev_set_feature() to clarify this function use (from Niklas). * Improve the ahci driver low power mode setting initialization to allow more flexibility for the user (from Rafael). * Several patches to remove redundant variables in libata-core, libata-eh and the pata_macio driver and to fix typos in comments (from Jinpeng, Shaomin, Ye). * Some code simplifications and macro renaming (for clarity) in various functions of libata-core (from me). * Add a missing check for a potential failure of sata_scr_read() in sata_print_link_status() (from Li). * Cleanup of libata Kconfig PATA_PLATFORM and PATA_OF_PLATFORM options (from Lukas). * Cleanups of ata dt-bindings and improvements of libahci_platform, ahci and libahci code (from Serge) * New driver for Synopsys AHCI SATA controllers, based of the generic ahci code (from Serge). One compilation warning fix is added for this driver (from me). * Several fixes to macros used to discover a drive capabilities to be consistent with the ACS specifications (from Niklas). * A couple of simplifcations to some libata functions, removing unnecessary arguments (from Niklas). * An improvements to libata-eh code to avoid unnecessary link reset when revalidating a drive after a failed command. In practice, this extra, unneeded reset, reset does not cause any arm beyond slightly slowing down error recovery (from Niklas). -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQSRPv8tYSvhwAzJdzjdoc3SxdoYdgUCYz0asgAKCRDdoc3SxdoY drHoAQCJhb6MuQHzbN/wR5cTGAfWXQJWBJx2mJr7oKJCrB34PwD/RzphcsuaXDta kwbTGlpitegByZTDKt9eMRLWmKgyngw= =CnJj -----END PGP SIGNATURE----- Merge tag 'ata-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata Pull ata updates from Damien Le Moal: - Print the timeout value for internal command failures due to a timeout (from Tomas) - Improve parameter names in ata_dev_set_feature() to clarify this function use (from Niklas) - Improve the ahci driver low power mode setting initialization to allow more flexibility for the user (from Rafael) - Several patches to remove redundant variables in libata-core, libata-eh and the pata_macio driver and to fix typos in comments (from Jinpeng, Shaomin, Ye) - Some code simplifications and macro renaming (for clarity) in various functions of libata-core (from me) - Add a missing check for a potential failure of sata_scr_read() in sata_print_link_status() (from Li) - Cleanup of libata Kconfig PATA_PLATFORM and PATA_OF_PLATFORM options (from Lukas) - Cleanups of ata dt-bindings and improvements of libahci_platform, ahci and libahci code (from Serge) - New driver for Synopsys AHCI SATA controllers, based of the generic ahci code (from Serge). One compilation warning fix is added for this driver (from me) - Several fixes to macros used to discover a drive capabilities to be consistent with the ACS specifications (from Niklas) - A couple of simplifcations to some libata functions, removing unnecessary arguments (from Niklas) - An improvements to libata-eh code to avoid unnecessary link reset when revalidating a drive after a failed command. In practice, this extra, unneeded reset, reset does not cause any arm beyond slightly slowing down error recovery (from Niklas) * tag 'ata-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata: (45 commits) ata: libata-eh: avoid needless hard reset when revalidating link ata: libata: drop superfluous ata_eh_analyze_tf() parameter ata: libata: drop superfluous ata_eh_request_sense() parameter ata: fix ata_id_has_dipm() ata: fix ata_id_has_ncq_autosense() ata: fix ata_id_has_devslp() ata: fix ata_id_sense_reporting_enabled() and ata_id_has_sense_reporting() ata: libata-eh: Remove the unneeded result variable ata: ahci_st: Enable compile test ata: ahci_st: Fix compilation warning MAINTAINERS: Add maintainers for DWC AHCI SATA driver ata: ahci-dwc: Add Baikal-T1 AHCI SATA interface support ata: ahci-dwc: Add platform-specific quirks support dt-bindings: ata: ahci: Add Baikal-T1 AHCI SATA controller DT schema ata: ahci: Add DWC AHCI SATA controller support ata: libahci_platform: Add function returning a clock-handle by id dt-bindings: ata: ahci: Add DWC AHCI SATA controller DT schema ata: ahci: Introduce firmware-specific caps initialization ata: ahci: Convert __ahci_port_base to accepting hpriv as arguments ata: libahci: Don't read AHCI version twice in the save-config method ...	2022-10-07 10:48:49 -07:00
Niklas Cassel	71d7b6e51a	ata: libata-eh: avoid needless hard reset when revalidating link Performing a revalidation on a AHCI controller supporting LPM, while using a lpm mode of e.g. med_power_with_dip (hipm + dipm) or medium_power (hipm), will currently always lead to a hard reset. The expected behavior is that a hard reset is only performed when revalidate fails, because the properties of the drive has changed. A revalidate performed after e.g. a NCQ error, or such a simple thing as disabling write-caching (hdparm -W 0 /dev/sda), should succeed on the first try (and should therefore not cause the link to be reset). This unwarranted hard reset happens because ata_phys_link_offline() returns true for a link that is in deep sleep. Thus the call to ata_phys_link_offline() in ata_eh_revalidate_and_attach() will cause the revalidation to fail, which causes ata_eh_handle_dev_fail() to be called, which will set ehc->i.action \|= ATA_EH_RESET, such that the link is reset before retrying revalidation. When the link is reset, the link is reestablished, so when ata_eh_revalidate_and_attach() is called the second time, directly after the link has been reset, ata_phys_link_offline() will return false, and the revalidation will succeed. Looking at "8.3.1.3 HBA Initiated" in the AHCI 1.3.1 specification, it is clear the when host software writes a new command to memory, by setting a bit in the PxCI/PxSACT HBA port registers, the HBA will automatically bring back the link before sending out the Command FIS. However, simply reading a SCR (like ata_phys_link_offline() does), will not cause the HBA to automatically bring back the link. As long as hipm is enabled, the HBA will put an idle link into deep sleep. Avoid this needless hard reset on revalidation by temporarily disabling hipm, by setting the LPM mode to ATA_LPM_MAX_POWER. After revalidation is complete, ata_eh_recover() will restore the link policy by setting the LPM mode to ap->target_lpm_policy. Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>	2022-09-24 15:31:00 +09:00
Niklas Cassel	e3b1fff6c0	ata: libata: drop superfluous ata_eh_analyze_tf() parameter The parameter can easily be derived from struct ata_queued_cmd. Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>	2022-09-21 11:18:36 +09:00
Niklas Cassel	b46c760e11	ata: libata: drop superfluous ata_eh_request_sense() parameter The parameter can easily be derived from struct ata_queued_cmd. Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>	2022-09-21 11:18:33 +09:00
ye xingchen	cb6e73aaad	ata: libata-eh: Remove the unneeded result variable Return the value ata_port_abort() directly instead of storing it in another redundant variable. Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: ye xingchen <ye.xingchen@zte.com.cn> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>	2022-09-21 11:17:05 +09:00
Damien Le Moal	d3122bf9aa	ata: libata-eh: Add missing command name Add the missing command name for ATA_CMD_NCQ_NON_DATA to ata_get_cmd_name(). Fixes: `661ce1f0c4` ("libata/libsas: Define ATA_CMD_NCQ_NON_DATA") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.de>	2022-08-16 05:42:51 +09:00
Sergey Shtylyov	e06233f937	ata: libata-eh: fix sloppy result type of ata_internal_cmd_timeout() ata_internal_cmd_timeout() returns unsigned long timeout in ms, however ata_exec_internal_sg() passes that timeout to msecs_to_jiffies() that takes just unsigned int. Change ata_internal_cmd_timeout()'s result type to unsigned int as well, also updating the struct ata_eh_cmd_timeout_ent and the command timeout tables -- all timeouts fit into unsigned int but we have to change ULONG_MAX to UINT_MAX... Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>	2022-06-20 08:21:57 +09:00
Sergey Shtylyov	afae461a3b	ata: libata-eh: fix sloppy result type of ata_eh_nr_in_flight() ata_eh_nr_in_flight() counts the # of the active tagged commands and thus cannot return a negative value but the result type is nevertheless int. Switching it to unsigned int (along with the local variables receiving the function's result) helps avoiding the sign extension instructions when comparing with or assigning to unsigned long ata_port::fastdrain_cnt and thus results in a more compact 64-bit code. Found by Linux Verification Center (linuxtesting.org) with the SVACE static analysis tool. [Damien] Fixed commit message. Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>	2022-06-20 08:16:09 +09:00
Sergey Shtylyov	efcef265fd	ata: add/use ata_taskfile::{error\|status} fields Add the explicit error and status register fields to 'struct ata_taskfile' using the anonymous unions ('struct ide_taskfile' had that for ages!) and update the libata taskfile code accordingly. There should be no object code changes resulting from that... Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>	2022-02-20 09:06:05 +09:00
Sergey Shtylyov	2a7b02ea7f	ata: libata-acpi: kill ata_acpi_on_suspend() Since the commit `c05e6ff035` ("libata-acpi: implement and use ata_acpi_init_gtm()") ata_acpi_on_suspend() just returns 0, so its call from ata_eh_handle_port_suspend() doesn't make sense anymore. Remove the function completely, at last... Found by Linux Verification Center (linuxtesting.org) with the SVACE static analysis tool. Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>	2022-02-02 16:56:49 +09:00
Hannes Reinecke	1c95a27c1e	ata: libata: drop ata_msg_drv() Callers are already protected by ata_dev_print_info(), so no need to have an additional configuration parameter here. Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>	2022-01-05 19:33:02 +09:00
Hannes Reinecke	d452090301	ata: libata: revamp ata_get_cmd_descript() Rename ata_get_cmd_descrip() to ata_get_cmd_name() and simplify it to return "unknown" instead of NULL. Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>	2022-01-05 19:33:01 +09:00
Hannes Reinecke	c318458c93	ata: libata: add tracepoints for ATA error handling Add tracepoints for ATA error handling. Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>	2022-01-05 19:33:01 +09:00
Hannes Reinecke	f8ec26d0f5	ata: libata: add reset tracepoints To follow the flow of control we should be using tracepoints, as they will tie in with the actual I/O flow and deliver a better overview about what it happening. This patch adds tracepoints for hard reset, soft reset, and postreset and adds them in the libata-eh control flow. With that we can drop the reset DPRINTK calls in the various drivers. Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>	2022-01-04 20:00:57 +09:00
Damien Le Moal	68dbbe7d5b	libata: fix read log timeout value Some ATA drives are very slow to respond to READ_LOG_EXT and READ_LOG_DMA_EXT commands issued from ata_dev_configure() when the device is revalidated right after resuming a system or inserting the ATA adapter driver (e.g. ahci). The default 5s timeout (ATA_EH_CMD_DFL_TIMEOUT) used for these commands is too short, causing errors during the device configuration. Ex: ... ata9: SATA max UDMA/133 abar m524288@0x9d200000 port 0x9d200400 irq 209 ata9: SATA link up 6.0 Gbps (SStatus 133 SControl 300) ata9.00: ATA-9: XXX XXXXXXXXXXXXXXX, XXXXXXXX, max UDMA/133 ata9.00: qc timeout (cmd 0x2f) ata9.00: Read log page 0x00 failed, Emask 0x4 ata9.00: Read log page 0x00 failed, Emask 0x40 ata9.00: NCQ Send/Recv Log not supported ata9.00: Read log page 0x08 failed, Emask 0x40 ata9.00: 27344764928 sectors, multi 16: LBA48 NCQ (depth 32), AA ata9.00: Read log page 0x00 failed, Emask 0x40 ata9.00: ATA Identify Device Log not supported ata9.00: failed to set xfermode (err_mask=0x40) ata9: SATA link up 6.0 Gbps (SStatus 133 SControl 300) ata9.00: configured for UDMA/133 ... The timeout error causes a soft reset of the drive link, followed in most cases by a successful revalidation as that give enough time to the drive to become fully ready to quickly process the read log commands. However, in some cases, this also fails resulting in the device being dropped. Fix this by using adding the ata_eh_revalidate_timeouts entries for the READ_LOG_EXT and READ_LOG_DMA_EXT commands. This defines a timeout increased to 15s, retriable one time. Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>	2021-11-11 10:55:32 +09:00
Bart Van Assche	c8329cd55b	scsi: ata: Use scsi_cmd_to_rq() instead of scsi_cmnd.request Prepare for removal of the request pointer by using scsi_cmd_to_rq() instead. This patch does not change any functionality. Link: https://lore.kernel.org/r/20210809230355.8186-8-bvanassche@acm.org Cc: Jens Axboe <axboe@kernel.dk> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Ming Lei <ming.lei@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-08-11 22:25:37 -04:00
Linus Torvalds	d72cd4ad41	SCSI misc on 20210428 This series consists of the usual driver updates (ufs, target, tcmu, smartpqi, lpfc, zfcp, qla2xxx, mpt3sas, pm80xx). The major core change is using a sbitmap instead of an atomic for queue tracking. Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYInvqCYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishYh2AP0SgqqL WYZRT2oiyBOKD28v+ceOSiXvgjPlqABwVMC0BAEAn29/wNCxyvzZ1k/b0iPJ4M+S klkSxLzXKQLzJBgdK5w= =p5B/ -----END PGP SIGNATURE----- Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI updates from James Bottomley: "This consists of the usual driver updates (ufs, target, tcmu, smartpqi, lpfc, zfcp, qla2xxx, mpt3sas, pm80xx). The major core change is using a sbitmap instead of an atomic for queue tracking" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (412 commits) scsi: target: tcm_fc: Fix a kernel-doc header scsi: target: Shorten ALUA error messages scsi: target: Fix two format specifiers scsi: target: Compare explicitly with SAM_STAT_GOOD scsi: sd: Introduce a new local variable in sd_check_events() scsi: dc395x: Open-code status_byte(u8) calls scsi: 53c700: Open-code status_byte(u8) calls scsi: smartpqi: Remove unused functions scsi: qla4xxx: Remove an unused function scsi: myrs: Remove unused functions scsi: myrb: Remove unused functions scsi: mpt3sas: Fix two kernel-doc headers scsi: fcoe: Suppress a compiler warning scsi: libfc: Fix a format specifier scsi: aacraid: Remove an unused function scsi: core: Introduce enum scsi_disposition scsi: core: Modify the scsi_send_eh_cmnd() return value for the SDEV_BLOCK case scsi: core: Rename scsi_softirq_done() into scsi_complete() scsi: core: Remove an incorrect comment scsi: core: Make the scsi_alloc_sgtables() documentation more accurate ...	2021-04-28 17:22:10 -07:00
Gustavo A. R. Silva	e06abcc68c	libata: Fix fall-through warnings for Clang In preparation to enable -Wimplicit-fallthrough for Clang, fix a warning by explicitly adding a break statement instead of letting the code fall through to the next case. Link: https://github.com/KSPP/linux/issues/115 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-04-20 14:23:17 -06:00
Bart Van Assche	b8e162f9e7	scsi: core: Introduce enum scsi_disposition Improve readability of the code in the SCSI core by introducing an enumeration type for the values used internally that decide how to continue processing a SCSI command. The eh_*_handler return values have not been changed because that would involve modifying all SCSI drivers. The output of the following command has been inspected to verify that no out-of-range values are assigned to a variable of type enum scsi_disposition: KCFLAGS=-Wassign-enum make CC=clang W=1 drivers/scsi/ Link: https://lore.kernel.org/r/20210415220826.29438-6-bvanassche@acm.org Cc: Christoph Hellwig <hch@lst.de> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Daniel Wagner <dwagner@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-15 22:44:40 -04:00
Mauro Carvalho Chehab	94bd5719e4	ata: fix some kernel-doc markups Some functions have different names between their prototypes and the kernel-doc markup. Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2020-10-23 12:20:32 -06:00
Gustavo A. R. Silva	df561f6688	treewide: Use fallthrough pseudo-keyword Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>	2020-08-23 17:36:59 -05:00

1 2 3 4 5 ...

340 Commits