DMA on this hardware is limited to dealing with a single page at a
time. Previously, the driver was set up accordingly to request single-page
DMA buffers, however that had the effect of generating a large number
of small MMC requests for data I/O.
Improve the driver to accept scatter-gather DMA buffers of larger sizes.
Iterate through those buffers a page at a time.
Testing with dd, this increases write performance from 2mb/sec to
10mb/sec, and increases read performance from 4mb/sec to 14mb/sec.
Signed-off-by: Daniel Drake <drake@endlessm.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
According to the AM654x Data Manual[1], the setup timing in lower speed
modes can only be met if the controller uses a falling edge data launch.
To ensure this, the HIGH_SPEED_ENA (HOST_CONTROL[2]) bit should be
cleared in default speed, SD high speed, MMC high speed, SDR12 and SDR25
speed modes.
Use the sdhci writeb callback to implement this condition.
[1] http://www.ti.com/lit/gpn/am6546 Section 5.10.5.16.1
Signed-off-by: Faiz Abbas <faiz_abbas@ti.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
The amount of changes to the memorystick subsystem are limited as of today.
However, I have a couple of times been funneling changes through my MMC
tree and it have turned out fine. So, I am here by volunteering to continue
doing this, by adding myself and the link to the MMC tree to the MEMSTICK
section.
Cc: Alex Dubov <oakad@yahoo.com>
Cc: Alex Dubov <alex.dubov@gmail.com>
Cc: Maxim Levitsky <maximlevitsky@gmail.com>
Cc: Kai-Heng Feng <kai.heng.feng@canonical.com>
Cc: Rui Feng <rui_feng@realsil.com.cn>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Maxim Levitsky <maximlevitsky@gmail.com>
The current SONY MEMORYSTICK CARD SUPPORT section is pointing to the TI
flash media memorystick driver, which is a bit confusing. Let's make this
more clear by moving this part into TI FLASH MEDIA INTERFACE DRIVER
section, but rename the section to TI FLASH MEDIA MEMORYSTICK/MMC DRIVERS,
as to make it more clear.
Finally, add Alex Dubov to the SONY MEMORYSTICK STANDARD SUPPORT, as I
believe that has been the intention.
Cc: Alex Dubov <oakad@yahoo.com>
Cc: Alex Dubov <alex.dubov@gmail.com>
Cc: Maxim Levitsky <maximlevitsky@gmail.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Reduce size of duplicated comments by switching to use SPDX identifier.
No functional change.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
- spaces surrounding arithmetic operators
- utilize full line limit
- drop extra spaces / TABs in variable definitions
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
For easy grepping on debug purposes join string literals back in
the messages.
No functional change.
While here, join list of module authors as well.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
The mmc pointer can't be NULL at ->remove(), drop the useless check.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Driver core sets it to NULL upon probe failure or release.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
This patch allows to get datactrl configuration specific
at variant. This introduce more flexibility on datactlr
value.
Signed-off-by: Ludovic Barre <ludovic.barre@st.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
This patch defines get_dctrl_cfg callback for sdmmc variant.
sdmmc variant has specific stm32 transfer modes.
sdmmc data transfer mode selection could be:
-Block data transfer ending on block count.
-SDIO multibyte data transfer.
-MMC Stream data transfer (not used).
-Block data transfer ending with STOP_TRANSMISSION command.
Signed-off-by: Ludovic Barre <ludovic.barre@st.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
This patch defines get_dctrl_cfg callback for qcom variant.
qcom variant has a specific block size definition.
Signed-off-by: Ludovic Barre <ludovic.barre@st.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
This patch adds get_datactrl_cfg callback in mmci_host_ops
to allow to get datactrl configuration specific at variant.
Common helper function is defined and could be call by variant.
Signed-off-by: Ludovic Barre <ludovic.barre@st.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Enable the DMA codepath for writes as well as reads.
This improves write speed from 1mb/sec to 2mb/sec (tested with dd).
The original ampe_stor vendor driver also uses DMA for writes.
Signed-off-by: Daniel Drake <drake@endlessm.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Direct commands (DCMDs) are an optional feature of eMMC 5.1's command
queue engine (CQE). The Arasan eMMC 5.1 controller uses the CQHCI,
which exposes a control register bit to enable the feature.
The current implementation sets this bit unconditionally.
This patch allows to suppress the feature activation,
by specifying the property disable-cqe-dcmd.
Signed-off-by: Christoph Muellner <christoph.muellner@theobroma-systems.com>
Signed-off-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Fixes: 84362d79f4 ("mmc: sdhci-of-arasan: Add CQHCI support for arasan,sdhci-5.1")
Cc: stable@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Add disable-cqe-dcmd as optional property for MMC hosts.
This property allows to disable or not enable the direct command
features of the command queue engine.
Signed-off-by: Christoph Muellner <christoph.muellner@theobroma-systems.com>
Signed-off-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
Fixes: 84362d79f4 ("mmc: sdhci-of-arasan: Add CQHCI support for arasan,sdhci-5.1")
Cc: stable@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Fix sparse warning:
drivers/mmc/host/sdhci-omap.c:788:6: warning:
symbol 'sdhci_omap_reset' was not declared. Should it be static?
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Faiz Abbas <faiz_abbas@ti.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tegra CQHCI/SDHCI design prevents write access to SDHCI block size
register when CQE is enabled and unhalted.
CQHCI driver enables CQE prior to invoking sdhci_cqe_enable which
violates this Tegra specific host requirement.
This patch fixes this by configuring sdhci block registers prior
to CQE unhalt.
This patch also has a fix for retry of unhalt due to known Tegra
specific CQE resume bug where first unhalt might not succeed when
clear all tasks is performed prior to resume and need a second unhalt.
This patch also includes CQE enable fix for CMD CRC errors that
happen with the specific sandisk emmc device when status command
is sent during the transfer of last data block due to marginal timing.
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Sowjanya Komatineni <skomatineni@nvidia.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
This patch adds define for CBC field mask of the register
CQHCI_SSC1.
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Sowjanya Komatineni <skomatineni@nvidia.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tegra186 CQHCI host has a known bug where CQHCI controller selects
DATA_PRESENT_SELECT bit to 1 for DCMDs with R1B response type and
since DCMD does not trigger any data transfer, DCMD task complete
happens leaving the DATA FSM of host controller in wait state for
the data.
This effects the data transfer tasks issued after the DCMDs with
R1b response type resulting in timeout.
SW WAR is to set CMD_TIMING to 1 in DCMD task descriptor. This bug
and SW WAR is applicable only for Tegra186 and not for Tegra194.
This patch implements this WAR thru NVQUIRK_CQHCI_DCMD_R1B_CMD_TIMING
for Tegra186 and also implements update_dcmd_desc of cqhci_host_ops
interface to set CMD_TIMING bit depending on the NVQUIRK.
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Ritesh Harjani <riteshh@codeaurora.org>
Signed-off-by: Sowjanya Komatineni <skomatineni@nvidia.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
This patch adds update_dcmd_desc interface to cqhci_host_ops to
allow hosts to update any of the DCMD task descriptor attributes
and parameters.
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Ritesh Harjani <riteshh@codeaurora.org>
Signed-off-by: Sowjanya Komatineni <skomatineni@nvidia.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
SDHCI controller of Tegra194 is similar to SDHCI controller in Tegra186.
This patch documents Tegra194 sdhci compatible string.
Signed-off-by: Sowjanya Komatineni <skomatineni@nvidia.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
This patch includes below HW tuning related fixes.
configures tuning parameters as per Tegra TRM
WAR fix for manual tap change
HW auto-tuning post process
As per Tegra TRM, SDR50 mode tuning execution takes upto maximum
of 256 tuning iterations and SDR104/HS200/HS400 modes tuning
execution takes upto maximum of 128 tuning iterations.
This patch programs tuning control register with maximum tuning
iterations needed based on the timing along with the start tap,
multiplier, and step size used by the HW tuning.
Tegra210 has a known issue of glitch on trimmer output when the
tap value is changed with the trimmer input clock running and the
WAR is to disable card clock before sending tuning command and
after sending tuning command wait for 1usec and issue SW reset
followed by enabling card clock.
This WAR is applicable when changing tap value manually as well.
Tegra SDHCI driver has this implemented correctly for manual tap
change but missing SW reset before enabling card clock during
sending tuning command.
Issuing SW reset during tuning command as a part of WAR and is
applicable in cases where tuning is performed with single step size
for more iterations. This patch includes this fix.
HW auto-tuning finds the best largest passing window and sets the
tap at the middle of the window. With some devices like sandisk
eMMC driving fast edges and due to high tap to tap delay in the
Tegra chipset, auto-tuning does not detect falling tap between the
valid windows resulting in a parital window or a merged window and
the best tap is set at the signal transition which is actually the
worst tap location.
Recommended SW solution is to detect if the best passing window
picked by the HW tuning is a partial or a merged window based on
min and max tap delays found from chip characterization across
PVT and perform tuning correction to pick the best tap.
This patch has implementation of this post HW tuning process for
the tegra hosts that support HW tuning through the callback function
tegra_sdhci_execute_hw_tuning and uses the tuned tap delay.
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Sowjanya Komatineni <skomatineni@nvidia.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
As per the Host Controller Standard Specification Version 4.20,
limitation of tuning iteration count is removed as PLL locking
time can be longer than UHS-1 tuning due to larger PVT fluctuation
and it will result in increase of tuning iteration to complete the
tuning.
This patch creates sdhci_host member tuning_loop_count to allow
hosts to specify maximum tuning iterations and also updates
execute_tuning to use this specified maximum tuning iteration count.
Default tuning_loop_count is set to same as existing loop count of
MAX_TUNING_LOOP which is 40 iterations.
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Sowjanya Komatineni <skomatineni@nvidia.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
ddr_signaling is set to true for DDR50 and DDR52 modes but is
not set back to false for other modes. This programs incorrect
host clock when mode change happens from DDR52/DDR50 to other
SDR or HS modes like incase of mmc_retune where it switches
from HS400 to HS DDR and then from HS DDR to HS mode and then
to HS200.
This patch fixes the ddr_signaling to set properly for non DDR
modes.
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Sowjanya Komatineni <skomatineni@nvidia.com>
Cc: stable@vger.kernel.org # v4.20 +
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
The CBSY flag should be proper before calling tmio_mmc_host_probe()
because this function will already use write16 which checks this bit.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
max_req_size is calculated by 'max_blk_size * max_blk_count' in the TMIO
core. So, specifying U32_MAX as max_blk_count will overflow this
calculation. It will cause no harm in practice because the immense high
number will overflow into another immense high number. However, it is
not good coding practice, so calculate max_blk_count so that
max_req_size will fit into unsigned int on ARM32/64.
Thanks to the Renesas BSP team for the bug report!
Reported-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
We will need it later for other calculations.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Mostly year updates, but one addition as well.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Reviewed-by: Niklas Söderlund <niklas.soderlund@ragnatech.se>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
According to the i.MX23/28 reference manuals both mmc interfaces support
the MMC_ERASE command. So enable this capability in the driver to allow
erase/discard/trim requests.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Reviewed-by: Fabio Estevam <festevam@gmail.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
In case spi_sync_locked fails, the fix reports the error and
returns the error code upstream.
Signed-off-by: Kangjie Lu <kjlu@umn.edu>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
For some controllers, in Present State Register, Data Line
Active bit is not reliable for commands (such as CMD6, CMD7,
CMD12, CMD28, CMD29, or CMD38) with busy signal. DLA affects
Command with Data Inhibit bit. Therefore, software driver
may not know the busy status in DLA/CDIHB.
Futunately MMC core driver has already polled card status
with CMD13 after sending any command with busy signal. So
we can just ignore CDIHB never released issue for such
controllers. This patch is to add a quirk to handle this.
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Signed-off-by: Yinbo Zhu <yinbo.zhu@nxp.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Invalid Transfer Complete (IRQSTAT[TC]) bit could be set during
multi-write operation even when the BLK_CNT in BLKATTR register
has not reached zero. Therefore, Transfer Complete might be
reported twice due to this erratum since a valid Transfer Complete
occurs when BLK_CNT reaches zero. This erratum is to fix this issue
Signed-off-by: Yinbo Zhu <yinbo.zhu@nxp.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
In the event of that any data error (like, IRQSTAT[DCE]) occurs
during an eSDHC data transaction where DMA is used for data
transfer to/from the system memory, setting the SYSCTL[RSTD]
register may cause a system hang. If software sets the register
SYSCTL[RSTD] to 1 for error recovery while DMA transferring is
not complete, eSDHC may hang the system bus. This happens because
the software register SYSCTL[RSTD] resets the DMA engine without
waiting for the completion of pending system transactions. This
erratum is to fix this issue.
Signed-off-by: Yinbo Zhu <yinbo.zhu@nxp.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
eSDHC-A001: The data timeout counter (SYSCTL[DTOCV]) is not
reliable for DTOCV values 0x4(2^17 SD clock), 0x8(2^21 SD clock),
and 0xC(2^25 SD clock). The data timeout counter can count from
2^13–2^27, but for values 2^17, 2^21, and 2^25, the timeout
counter counts for only 2^13 SD clocks.
A-008358: The data timeout counter value loaded into the timeout
counter is less than expected and can result into early timeout
error in case of eSDHC data transactions. The table below shows
the expected vs actual timeout period for different values of
SYSCTL[DTOCV]:
these two erratum has the same quirk to control it, and set
SDHCI_QUIRK_RESET_AFTER_REQUEST to fix above issue.
Signed-off-by: Yinbo Zhu <yinbo.zhu@nxp.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Software writing to the Transfer Type configuration register
(system clock domain) can cause a setup/hold violation in the
CRC flops (card clock domain), which can cause write accesses
to be sent with corrupt CRC values. This issue occurs only for
write preceded by read. this erratum is to fix this issue.
Signed-off-by: Yinbo Zhu <yinbo.zhu@nxp.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
This patch is to add erratum A011334 support in lx2160 2.0 SoC
Signed-off-by: Yinbo Zhu <yinbo.zhu@nxp.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
As mmci_variant_init() is a local function to mmci.c, let's convert it into
static.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
It's good practice to share functions via header files, rather than from
the c-files. Therefore, let's move sdmmc_variant_init() to mmci.h.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Ludovic Barre <ludovic.barre@st.com>
Tested-by: Ludovic Barre <ludovic.barre@st.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
It seems a bit silly to have a header file to share only the
qcom_variant_init() function. So, let's just drop it and move the
declaration of the function into the common mmci.h instead.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Having mmci_dmae_start() to invoke the shared function, dml_start_xfer(),
explicitly for the qcom variant isn't very nice. Let's clean up this code
by moving the qcom specific parts into the qcom ->dma_start() callback and
then drop dml_start_xfer() altogether.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
There's no point clearing the variant flag in case the qcom variant fails
to setup DMA. This is because if mmci_dma_setup() fails, then the use_dma
flag remains set to false, which leads to mmci using PIO mode and not DMA.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Some of the DMA functions are shared via mmci.h, however they are not
implemented unless CONFIG_DMA_ENGINE is set. Therefore, add that constraint
to the header file as well.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Merge page ref overflow branch.
Jann Horn reported that he can overflow the page ref count with
sufficient memory (and a filesystem that is intentionally extremely
slow).
Admittedly it's not exactly easy. To have more than four billion
references to a page requires a minimum of 32GB of kernel memory just
for the pointers to the pages, much less any metadata to keep track of
those pointers. Jann needed a total of 140GB of memory and a specially
crafted filesystem that leaves all reads pending (in order to not ever
free the page references and just keep adding more).
Still, we have a fairly straightforward way to limit the two obvious
user-controllable sources of page references: direct-IO like page
references gotten through get_user_pages(), and the splice pipe page
duplication. So let's just do that.
* branch page-refs:
fs: prevent page refcount overflow in pipe_buf_get
mm: prevent get_user_pages() from overflowing page refcount
mm: add 'try_get_page()' helper function
mm: make page ref count overflow check tighter and more explicit
Change pipe_buf_get() to return a bool indicating whether it succeeded
in raising the refcount of the page (if the thing in the pipe is a page).
This removes another mechanism for overflowing the page refcount. All
callers converted to handle a failure.
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If the page refcount wraps around past zero, it will be freed while
there are still four billion references to it. One of the possible
avenues for an attacker to try to make this happen is by doing direct IO
on a page multiple times. This patch makes get_user_pages() refuse to
take a new page reference if there are already more than two billion
references to the page.
Reported-by: Jann Horn <jannh@google.com>
Acked-by: Matthew Wilcox <willy@infradead.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>