The dma framework will calculate the dma channels chancnt, setting it
ourself is wrong.
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230521100252.3197-6-jszhang@kernel.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The bam_dma driver needs to know the number of channels and execution
environments (EEs) at probe time. If we are in full control of the BAM
controller this information can be obtained from the BAM identification
registers (BAM_REVISION/BAM_NUM_PIPES).
When the BAM is "controlled remotely" it is more complicated. The BAM
might not be on at probe time, so reading the registers could fail.
This is why the information must be added to the device tree in this
case, using "num-channels" and "qcom,num-ees".
However, there are also some BAM instances that are initialized by
something else but we still have a clock that allows to turn it on when
needed. This can be set up in the DT with "qcom,controlled-remotely"
and "clocks" and is already supported by the bam_dma driver. Examples
for this are the typical BLSP BAM instances on older SoCs, QPIC BAM
(for NAND) and the crypto BAM on some SoCs.
In this case, there is no need to read "num-channels" and
"qcom,num-ees" from the DT. The BAN can be turned on using the clock
so we can just read it from the BAM registers like in the normal case.
Check for the BAM clock earlier and skip reading "num-channels" and
"qcom,num-ees" if it is present to allow simplifying the DT description
a bit.
Signed-off-by: Stephan Gerhold <stephan@gerhold.net>
Reviewed-by: Bhupesh Sharma <bhupesh.sharma@linaro.org>
Link: https://lore.kernel.org/r/20230518-bamclk-dt-v2-1-a1a857b966ca@gerhold.net
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Add support for HDMA NATIVE, as long the IP design has set
the compatible register map parameter-HDMA_NATIVE,
which allows compatibility for native HDMA register configuration.
The HDMA Hyper-DMA IP is an enhancement of the eDMA embedded-DMA IP.
And the native HDMA registers are different from eDMA, so this patch
add support for HDMA NATIVE mode.
HDMA write and read channels operate independently to maximize
the performance of the HDMA read and write data transfer over
the link When you configure the HDMA with multiple read channels,
then it uses a round robin (RR) arbitration scheme to select
the next read channel to be serviced.The same applies when you
have multiple write channels.
The native HDMA driver also supports a maximum of 16 independent
channels (8 write + 8 read), which can run simultaneously.
Both SAR (Source Address Register) and DAR (Destination Address Register)
are aligned to byte.
Signed-off-by: Cai Huoqing <cai.huoqing@linux.dev>
Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Tested-by: Serge Semin <fancer.lancer@gmail.com>
Link: https://lore.kernel.org/r/20230520050854.73160-4-cai.huoqing@linux.dev
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The structure dw_edma_core_ops has a set of the pointers
abstracting out the DW eDMA vX and DW HDMA Native controllers.
And use dw_edma_v0_core_register to set up operation.
Signed-off-by: Cai Huoqing <cai.huoqing@linux.dev>
Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Tested-by: Serge Semin <fancer.lancer@gmail.com>
Link: https://lore.kernel.org/r/20230520050854.73160-3-cai.huoqing@linux.dev
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The dw_edma_core_ops structure contains a set of the operations:
device IRQ numbers getter, CPU/PCI address translation. Based on the
functions semantics the structure name "dw_edma_plat_ops" looks more
descriptive since indeed the operations are platform-specific. The
"dw_edma_core_ops" name shall be used for a structure with the IP-core
specific set of callbacks in order to abstract out DW eDMA and DW HDMA
setups. Such structure will be added in one of the next commit in the
framework of the set of changes adding the DW HDMA device support.
Anyway the renaming was necessary to distinguish two types of
the implementation callbacks:
1. DW eDMA/hDMA IP-core specific operations: device-specific CSR
setups in one or another aspect of the DMA-engine initialization.
2. DW eDMA/hDMA platform specific operations: the DMA device
environment configs like IRQs, address translation, etc.
Signed-off-by: Cai Huoqing <cai.huoqing@linux.dev>
Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Tested-by: Serge Semin <fancer.lancer@gmail.com>
Link: https://lore.kernel.org/r/20230520050854.73160-2-cai.huoqing@linux.dev
Signed-off-by: Vinod Koul <vkoul@kernel.org>
A fixup for a printk format string warning causes an out-of-bounds
variable access as the %pR string expects a struct resource instead of
a plain resource_size_t.
Change both to the special %pap and %pap helpers for these types.
Fixes: 5a1a3b9c19 ("dmaengine: ste_dma40: Get LCPA SRAM from SRAM node")
Fixes: ef1e1c41a1 ("dmaengine: ste_dma40: use correct print specfier for resource_size_t")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20230519093447.4097040-1-arnd@kernel.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
On s390 systems (aka mainframes), it has classic channel devices for
networking and permanent storage that are currently even more common
than PCI devices. Hence it could have a fully functional s390 kernel
with CONFIG_PCI=n, then the relevant iomem mapping functions
[including ioremap(), devm_ioremap(), etc.] are not available.
Here let QCOM_HIDMA depend on HAS_IOMEM so that it won't be built to
cause below compiling error if PCI is unset.
--------------------------------------------------------
ld: drivers/dma/qcom/hidma.o: in function `hidma_probe':
hidma.c:(.text+0x4b46): undefined reference to `devm_ioremap_resource'
ld: hidma.c:(.text+0x4b9e): undefined reference to `devm_ioremap_resource'
make[1]: *** [scripts/Makefile.vmlinux:35: vmlinux] Error 1
make: *** [Makefile:1264: vmlinux] Error 2
Signed-off-by: Baoquan He <bhe@redhat.com>
Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
Cc: Andy Gross <agross@kernel.org>
Cc: Bjorn Andersson <andersson@kernel.org>
Cc: Konrad Dybcio <konrad.dybcio@linaro.org>
Cc: Vinod Koul <vkoul@kernel.org>
Cc: linux-arm-msm@vger.kernel.org
Cc: dmaengine@vger.kernel.org
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@gmail.com>
Link: https://lore.kernel.org/r/20230506111628.712316-3-bhe@redhat.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
We should use %pR for printing resource_size_t, so update that fixing
the warning:
drivers/dma/ste_dma40.c:3556:25: warning: format specifies type 'unsigned int'
but the argument has type 'resource_size_t' (aka 'unsigned long long') [-Wformat]
Reported-by: kernel test robot <lkp@intel.com>
Fixes: 5a1a3b9c19 ("dmaengine: ste_dma40: Get LCPA SRAM from SRAM node")
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Link: https://lore.kernel.org/r/20230517064434.141091-1-vkoul@kernel.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
J721S2 has dedicated BCDMA instance for Camera Serial Interface RX
and TX. The BCDMA instance supports RX and TX channels but block copy
channels are not present, add support for the same.
Signed-off-by: Vaishnav Achath <vaishnav.a@ti.com>
Link: https://lore.kernel.org/r/20230505143929.28131-3-vaishnav.a@ti.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
This makes the probe() and its subfunction d40_hw_detect_init()
return proper error codes.
One effect of this is that deferred probe, e.g from the clock,
will start to work, would it happen. Also it is better design.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20230417-ux500-dma40-cleanup-v3-7-60bfa6785968@linaro.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
This switches the DMA40 driver to use a bunch of managed
resources and strip down the errorpath.
The result is pretty neat and makes the driver way more
readable.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20230417-ux500-dma40-cleanup-v3-6-60bfa6785968@linaro.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The OF platform data population function only wants to
use struct device *dev, so pass that instead.
This change makes the compiler realize that the local
platform data variable is unused, so drop that too.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20230417-ux500-dma40-cleanup-v3-5-60bfa6785968@linaro.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The Ux500 is device tree-only since ages. Delete the
platform data header and push it into or next to the driver
instead.
Drop the non-DT probe path since this will not happen.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20230417-ux500-dma40-cleanup-v3-4-60bfa6785968@linaro.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The &pdev->dev device pointer is used so many times in the
probe() and d40_hw_detect_init() functions that a local *dev
variable makes the code way easier to read.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20230417-ux500-dma40-cleanup-v3-3-60bfa6785968@linaro.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Instead of passing the reserved SRAM as a "reg" field
look for a phandle to the LCPA SRAM memory so we can
use the proper SRAM device tree bindings for the SRAM.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20230417-ux500-dma40-cleanup-v3-2-60bfa6785968@linaro.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
New support:
- Apple admac t8112 device support
- StarFive JH7110 DMA controller
Updates:
- Big pile of idxd updates to support IAA 2.0 device capabilities, DSA
2.0 Event Log and completion record faulting features and new DSA
operations
- at_xdmac supend & resume updates and driver code cleanup
- k3-udma supend & resume support
- k3-psil thread support for J784s4
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE+vs47OPLdNbVcHzyfBQHDyUjg0cFAmRSOJYACgkQfBQHDyUj
g0fmUQ//ZmrcNgbubI8evje8oovapLZUqdCgxoq9RbUbJ0TA6y7QCZ4ebkQYQymA
LCJSV+NpG3++gLXLmIoYWiWXCabHZWDPh5M+uUiRm51MV2AtKeHy7UzQolmOWb6E
d4JYf4HdGvzto7nMM7zo/PDaEDKdkostJNb7ZFYZrRqqCbSSSTJYmNGg3PfYRfaw
yJpQsOjizgboM8EciKbqLh9gX3+FfQGhKsnkaiGQBwtdZD29LHeC7UoQrslkH9uh
EiGTMYPKCZIAb+75G4WSO3sCNZcShQPEQt8w3SBP9X6eyaH75AZPymSRhF6O8gte
ecGW/mRzc7t8lOrRnzD1zvZ/4NgwlnxTQAl+561XwWjmZIW/Zd9dH8dHFzwS5alS
z7uVperSYXoEo//UmmwuhuZXlOWD4muOswWB4I5lTeHW97OUyccU5S3DhxYnffI6
P8qnGMmxLj8nsmk98T8TsOmzdb5txxSbPP+PumvDVF7g9LoFKfz8wDAZG3buTsJP
L/HR9sAFaFO0GrjoC7EsHJsiLauSzzcDnYIUJ5UIv/6Ogf2mUCrBhgwSmVhIeR26
hUzYAdpwmhrrVXtbMXuec7jAz62bqkACdJIWGd21t/FTzUNUaDv5OimiNEZncpnY
9q2cT9x94iHHVCl430jlIAOa/oF2AQpeqs4any7+OI2xwiJEA7g=
=Vqrw
-----END PGP SIGNATURE-----
Merge tag 'dmaengine-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine
Pull dmaengine updates from Vinod Koul:
"New support:
- Apple admac t8112 device support
- StarFive JH7110 DMA controller
Updates:
- Big pile of idxd updates to support IAA 2.0 device capabilities,
DSA 2.0 Event Log and completion record faulting features and
new DSA operations
- at_xdmac supend & resume updates and driver code cleanup
- k3-udma supend & resume support
- k3-psil thread support for J784s4"
* tag 'dmaengine-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (57 commits)
dmaengine: idxd: add per wq PRS disable
dmaengine: idxd: add pid to exported sysfs attribute for opened file
dmaengine: idxd: expose fault counters to sysfs
dmaengine: idxd: add a device to represent the file opened
dmaengine: idxd: add per file user counters for completion record faults
dmaengine: idxd: process batch descriptor completion record faults
dmaengine: idxd: add descs_completed field for completion record
dmaengine: idxd: process user page faults for completion record
dmaengine: idxd: add idxd_copy_cr() to copy user completion record during page fault handling
dmaengine: idxd: create kmem cache for event log fault items
dmaengine: idxd: add per DSA wq workqueue for processing cr faults
dmanegine: idxd: add debugfs for event log dump
dmaengine: idxd: add interrupt handling for event log
dmaengine: idxd: setup event log configuration
dmaengine: idxd: add event log size sysfs attribute
dmaengine: idxd: make misc interrupt one shot
dt-bindings: dma: snps,dw-axi-dmac: constrain the items of resets for JH7110 dma
dt-bindings: dma: Drop unneeded quotes
dmaengine: at_xdmac: align declaration of ret with the rest of variables
dmaengine: at_xdmac: add a warning message regarding for unpaused channels
...
Including:
- Convert to platform remove callback returning void
- Extend changing default domain to normal group
- Intel VT-d updates:
- Remove VT-d virtual command interface and IOASID
- Allow the VT-d driver to support non-PRI IOPF
- Remove PASID supervisor request support
- Various small and misc cleanups
- ARM SMMU updates:
- Device-tree binding updates:
* Allow Qualcomm GPU SMMUs to accept relevant clock properties
* Document Qualcomm 8550 SoC as implementing an MMU-500
* Favour new "qcom,smmu-500" binding for Adreno SMMUs
- Fix S2CR quirk detection on non-architectural Qualcomm SMMU
implementations
- Acknowledge SMMUv3 PRI queue overflow when consuming events
- Document (in a comment) why ATS is disabled for bypass streams
- AMD IOMMU updates:
- 5-level page-table support
- NUMA awareness for memory allocations
- Unisoc driver: Support for reattaching an existing domain
- Rockchip driver: Add missing set_platform_dma_ops callback
- Mediatek driver: Adjust the dma-ranges
- Various other small fixes and cleanups
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmRONeAACgkQK/BELZcB
GuPmpw/8C9ruxQ0JU5rcDBXQGvos4gMmxlbELMrBpbbiTtdb35xchpKfdhnECGIF
k2SrrcF40R/S82SyzNU/eZtGKirtcXvGFraUFgu/QdCcnnqpRHs+IJMXX2NJP+it
+0wO1uiInt3CN1ERcR4F31cDKiWjDG8bvQVE5LIyiy4KrIU5ld2G91Fkaa0R13Au
6H+/wKkcUC6OyaGE6wPx474xBkapT20vj5AIQuAWisXJJR0wbBon1sUTo/IRKsU+
IkNxH0W+1PNImJ+crAdf/nkOlyqoChY4ww6cm07LrOsBLIsX5bCqXfL4HvKthElD
MEgk2SN5kfjfR5Vf29W4hZVM1CT8VbhO41I7OzaZ6X6RU2PXoldPKlgKtZGeSKn1
9bcMpSgB0BtbttvBevSkxTo5KHFozXS2DG3DFoMB3yFMme8Th0LrhBZ9oB7NIPNw
ntMo4K75vviC6Vvzjy4Anj/+y+Zm3W6wDDP7F12O6WZLkK5s4hrSsHUm/MQnnKQP
muJlG870RnSl73xUQZe3cuBxktXuJ3EHqqYIPE0npzvauu8hhWcis3opf2Y+U2s8
aBCCIgp5kTKqjHLh2e4lNCKZf1/b/dhxRcRBQhpAIb8YsjMlIJyM+G8Jz6K6gBga
5Ld+68UQ3oHJwoLV1HCFN8jbpQ9KZn1s9+h3yrYjRAcLNiFb3nU=
=OvTo
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu updates from Joerg Roedel:
- Convert to platform remove callback returning void
- Extend changing default domain to normal group
- Intel VT-d updates:
- Remove VT-d virtual command interface and IOASID
- Allow the VT-d driver to support non-PRI IOPF
- Remove PASID supervisor request support
- Various small and misc cleanups
- ARM SMMU updates:
- Device-tree binding updates:
* Allow Qualcomm GPU SMMUs to accept relevant clock properties
* Document Qualcomm 8550 SoC as implementing an MMU-500
* Favour new "qcom,smmu-500" binding for Adreno SMMUs
- Fix S2CR quirk detection on non-architectural Qualcomm SMMU
implementations
- Acknowledge SMMUv3 PRI queue overflow when consuming events
- Document (in a comment) why ATS is disabled for bypass streams
- AMD IOMMU updates:
- 5-level page-table support
- NUMA awareness for memory allocations
- Unisoc driver: Support for reattaching an existing domain
- Rockchip driver: Add missing set_platform_dma_ops callback
- Mediatek driver: Adjust the dma-ranges
- Various other small fixes and cleanups
* tag 'iommu-updates-v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (82 commits)
iommu: Remove iommu_group_get_by_id()
iommu: Make iommu_release_device() static
iommu/vt-d: Remove BUG_ON in dmar_insert_dev_scope()
iommu/vt-d: Remove a useless BUG_ON(dev->is_virtfn)
iommu/vt-d: Remove BUG_ON in map/unmap()
iommu/vt-d: Remove BUG_ON when domain->pgd is NULL
iommu/vt-d: Remove BUG_ON in handling iotlb cache invalidation
iommu/vt-d: Remove BUG_ON on checking valid pfn range
iommu/vt-d: Make size of operands same in bitwise operations
iommu/vt-d: Remove PASID supervisor request support
iommu/vt-d: Use non-privileged mode for all PASIDs
iommu/vt-d: Remove extern from function prototypes
iommu/vt-d: Do not use GFP_ATOMIC when not needed
iommu/vt-d: Remove unnecessary checks in iopf disabling path
iommu/vt-d: Move PRI handling to IOPF feature path
iommu/vt-d: Move pfsid and ats_qdep calculation to device probe path
iommu/vt-d: Move iopf code from SVA to IOPF enabling path
iommu/vt-d: Allow SVA with device-specific IOPF
dmaengine: idxd: Add enable/disable device IOPF feature
arm64: dts: mt8186: Add dma-ranges for the parent "soc" node
...
The summary of the changes for this pull requests is:
* Song Liu's new struct module_memory replacement
* Nick Alcock's MODULE_LICENSE() removal for non-modules
* My cleanups and enhancements to reduce the areas where we vmalloc
module memory for duplicates, and the respective debug code which
proves the remaining vmalloc pressure comes from userspace.
Most of the changes have been in linux-next for quite some time except
the minor fixes I made to check if a module was already loaded
prior to allocating the final module memory with vmalloc and the
respective debug code it introduces to help clarify the issue. Although
the functional change is small it is rather safe as it can only *help*
reduce vmalloc space for duplicates and is confirmed to fix a bootup
issue with over 400 CPUs with KASAN enabled. I don't expect stable
kernels to pick up that fix as the cleanups would have also had to have
been picked up. Folks on larger CPU systems with modules will want to
just upgrade if vmalloc space has been an issue on bootup.
Given the size of this request, here's some more elaborate details
on this pull request.
The functional change change in this pull request is the very first
patch from Song Liu which replaces the struct module_layout with a new
struct module memory. The old data structure tried to put together all
types of supported module memory types in one data structure, the new
one abstracts the differences in memory types in a module to allow each
one to provide their own set of details. This paves the way in the
future so we can deal with them in a cleaner way. If you look at changes
they also provide a nice cleanup of how we handle these different memory
areas in a module. This change has been in linux-next since before the
merge window opened for v6.3 so to provide more than a full kernel cycle
of testing. It's a good thing as quite a bit of fixes have been found
for it.
Jason Baron then made dynamic debug a first class citizen module user by
using module notifier callbacks to allocate / remove module specific
dynamic debug information.
Nick Alcock has done quite a bit of work cross-tree to remove module
license tags from things which cannot possibly be module at my request
so to:
a) help him with his longer term tooling goals which require a
deterministic evaluation if a piece a symbol code could ever be
part of a module or not. But quite recently it is has been made
clear that tooling is not the only one that would benefit.
Disambiguating symbols also helps efforts such as live patching,
kprobes and BPF, but for other reasons and R&D on this area
is active with no clear solution in sight.
b) help us inch closer to the now generally accepted long term goal
of automating all the MODULE_LICENSE() tags from SPDX license tags
In so far as a) is concerned, although module license tags are a no-op
for non-modules, tools which would want create a mapping of possible
modules can only rely on the module license tag after the commit
8b41fc4454 ("kbuild: create modules.builtin without Makefile.modbuiltin
or tristate.conf"). Nick has been working on this *for years* and
AFAICT I was the only one to suggest two alternatives to this approach
for tooling. The complexity in one of my suggested approaches lies in
that we'd need a possible-obj-m and a could-be-module which would check
if the object being built is part of any kconfig build which could ever
lead to it being part of a module, and if so define a new define
-DPOSSIBLE_MODULE [0]. A more obvious yet theoretical approach I've
suggested would be to have a tristate in kconfig imply the same new
-DPOSSIBLE_MODULE as well but that means getting kconfig symbol names
mapping to modules always, and I don't think that's the case today. I am
not aware of Nick or anyone exploring either of these options. Quite
recently Josh Poimboeuf has pointed out that live patching, kprobes and
BPF would benefit from resolving some part of the disambiguation as
well but for other reasons. The function granularity KASLR (fgkaslr)
patches were mentioned but Joe Lawrence has clarified this effort has
been dropped with no clear solution in sight [1].
In the meantime removing module license tags from code which could never
be modules is welcomed for both objectives mentioned above. Some
developers have also welcomed these changes as it has helped clarify
when a module was never possible and they forgot to clean this up,
and so you'll see quite a bit of Nick's patches in other pull
requests for this merge window. I just picked up the stragglers after
rc3. LWN has good coverage on the motivation behind this work [2] and
the typical cross-tree issues he ran into along the way. The only
concrete blocker issue he ran into was that we should not remove the
MODULE_LICENSE() tags from files which have no SPDX tags yet, even if
they can never be modules. Nick ended up giving up on his efforts due
to having to do this vetting and backlash he ran into from folks who
really did *not understand* the core of the issue nor were providing
any alternative / guidance. I've gone through his changes and dropped
the patches which dropped the module license tags where an SPDX
license tag was missing, it only consisted of 11 drivers. To see
if a pull request deals with a file which lacks SPDX tags you
can just use:
./scripts/spdxcheck.py -f \
$(git diff --name-only commid-id | xargs echo)
You'll see a core module file in this pull request for the above,
but that's not related to his changes. WE just need to add the SPDX
license tag for the kernel/module/kmod.c file in the future but
it demonstrates the effectiveness of the script.
Most of Nick's changes were spread out through different trees,
and I just picked up the slack after rc3 for the last kernel was out.
Those changes have been in linux-next for over two weeks.
The cleanups, debug code I added and final fix I added for modules
were motivated by David Hildenbrand's report of boot failing on
a systems with over 400 CPUs when KASAN was enabled due to running
out of virtual memory space. Although the functional change only
consists of 3 lines in the patch "module: avoid allocation if module is
already present and ready", proving that this was the best we can
do on the modules side took quite a bit of effort and new debug code.
The initial cleanups I did on the modules side of things has been
in linux-next since around rc3 of the last kernel, the actual final
fix for and debug code however have only been in linux-next for about a
week or so but I think it is worth getting that code in for this merge
window as it does help fix / prove / evaluate the issues reported
with larger number of CPUs. Userspace is not yet fixed as it is taking
a bit of time for folks to understand the crux of the issue and find a
proper resolution. Worst come to worst, I have a kludge-of-concept [3]
of how to make kernel_read*() calls for modules unique / converge them,
but I'm currently inclined to just see if userspace can fix this
instead.
[0] https://lore.kernel.org/all/Y/kXDqW+7d71C4wz@bombadil.infradead.org/
[1] https://lkml.kernel.org/r/025f2151-ce7c-5630-9b90-98742c97ac65@redhat.com
[2] https://lwn.net/Articles/927569/
[3] https://lkml.kernel.org/r/20230414052840.1994456-3-mcgrof@kernel.org
-----BEGIN PGP SIGNATURE-----
iQJGBAABCgAwFiEENnNq2KuOejlQLZofziMdCjCSiKcFAmRG4m0SHG1jZ3JvZkBr
ZXJuZWwub3JnAAoJEM4jHQowkoinQ2oP/0xlvKwJg6Ey8fHZF0qv8VOskE80zoLF
hMazU3xfqLA+1TQvouW1YBxt3jwS3t1Ehs+NrV+nY9Yzcm0MzRX/n3fASJVe7nRr
oqWWQU+voYl5Pw1xsfdp6C8IXpBQorpYby3Vp0MAMoZyl2W2YrNo36NV488wM9KC
jD4HF5Z6xpnPSZTRR7AgW9mo7FdAtxPeKJ76Bch7lH8U6omT7n36WqTw+5B1eAYU
YTOvrjRs294oqmWE+LeebyiOOXhH/yEYx4JNQgCwPdxwnRiGJWKsk5va0hRApqF/
WW8dIqdEnjsa84lCuxnmWgbcPK8cgmlO0rT0DyneACCldNlldCW1LJ0HOwLk9pea
p3JFAsBL7TKue4Tos6I7/4rx1ufyBGGIigqw9/VX5g0Iif+3BhWnqKRfz+p9wiMa
Fl7cU6u7yC68CHu1HBSisK16cYMCPeOnTSd89upHj8JU/t74O6k/ARvjrQ9qmNUt
c5U+OY+WpNJ1nXQydhY/yIDhFdYg8SSpNuIO90r4L8/8jRQYXNG80FDd1UtvVDuy
eq0r2yZ8C0XHSlOT9QHaua/tWV/aaKtyC/c0hDRrigfUrq8UOlGujMXbUnrmrWJI
tLJLAc7ePWAAoZXGSHrt0U27l029GzLwRdKqJ6kkDANVnTeOdV+mmBg9zGh3/Mp6
agiwdHUMVN7X
=56WK
-----END PGP SIGNATURE-----
Merge tag 'modules-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux
Pull module updates from Luis Chamberlain:
"The summary of the changes for this pull requests is:
- Song Liu's new struct module_memory replacement
- Nick Alcock's MODULE_LICENSE() removal for non-modules
- My cleanups and enhancements to reduce the areas where we vmalloc
module memory for duplicates, and the respective debug code which
proves the remaining vmalloc pressure comes from userspace.
Most of the changes have been in linux-next for quite some time except
the minor fixes I made to check if a module was already loaded prior
to allocating the final module memory with vmalloc and the respective
debug code it introduces to help clarify the issue. Although the
functional change is small it is rather safe as it can only *help*
reduce vmalloc space for duplicates and is confirmed to fix a bootup
issue with over 400 CPUs with KASAN enabled. I don't expect stable
kernels to pick up that fix as the cleanups would have also had to
have been picked up. Folks on larger CPU systems with modules will
want to just upgrade if vmalloc space has been an issue on bootup.
Given the size of this request, here's some more elaborate details:
The functional change change in this pull request is the very first
patch from Song Liu which replaces the 'struct module_layout' with a
new 'struct module_memory'. The old data structure tried to put
together all types of supported module memory types in one data
structure, the new one abstracts the differences in memory types in a
module to allow each one to provide their own set of details. This
paves the way in the future so we can deal with them in a cleaner way.
If you look at changes they also provide a nice cleanup of how we
handle these different memory areas in a module. This change has been
in linux-next since before the merge window opened for v6.3 so to
provide more than a full kernel cycle of testing. It's a good thing as
quite a bit of fixes have been found for it.
Jason Baron then made dynamic debug a first class citizen module user
by using module notifier callbacks to allocate / remove module
specific dynamic debug information.
Nick Alcock has done quite a bit of work cross-tree to remove module
license tags from things which cannot possibly be module at my request
so to:
a) help him with his longer term tooling goals which require a
deterministic evaluation if a piece a symbol code could ever be
part of a module or not. But quite recently it is has been made
clear that tooling is not the only one that would benefit.
Disambiguating symbols also helps efforts such as live patching,
kprobes and BPF, but for other reasons and R&D on this area is
active with no clear solution in sight.
b) help us inch closer to the now generally accepted long term goal
of automating all the MODULE_LICENSE() tags from SPDX license tags
In so far as a) is concerned, although module license tags are a no-op
for non-modules, tools which would want create a mapping of possible
modules can only rely on the module license tag after the commit
8b41fc4454 ("kbuild: create modules.builtin without
Makefile.modbuiltin or tristate.conf").
Nick has been working on this *for years* and AFAICT I was the only
one to suggest two alternatives to this approach for tooling. The
complexity in one of my suggested approaches lies in that we'd need a
possible-obj-m and a could-be-module which would check if the object
being built is part of any kconfig build which could ever lead to it
being part of a module, and if so define a new define
-DPOSSIBLE_MODULE [0].
A more obvious yet theoretical approach I've suggested would be to
have a tristate in kconfig imply the same new -DPOSSIBLE_MODULE as
well but that means getting kconfig symbol names mapping to modules
always, and I don't think that's the case today. I am not aware of
Nick or anyone exploring either of these options. Quite recently Josh
Poimboeuf has pointed out that live patching, kprobes and BPF would
benefit from resolving some part of the disambiguation as well but for
other reasons. The function granularity KASLR (fgkaslr) patches were
mentioned but Joe Lawrence has clarified this effort has been dropped
with no clear solution in sight [1].
In the meantime removing module license tags from code which could
never be modules is welcomed for both objectives mentioned above. Some
developers have also welcomed these changes as it has helped clarify
when a module was never possible and they forgot to clean this up, and
so you'll see quite a bit of Nick's patches in other pull requests for
this merge window. I just picked up the stragglers after rc3. LWN has
good coverage on the motivation behind this work [2] and the typical
cross-tree issues he ran into along the way. The only concrete blocker
issue he ran into was that we should not remove the MODULE_LICENSE()
tags from files which have no SPDX tags yet, even if they can never be
modules. Nick ended up giving up on his efforts due to having to do
this vetting and backlash he ran into from folks who really did *not
understand* the core of the issue nor were providing any alternative /
guidance. I've gone through his changes and dropped the patches which
dropped the module license tags where an SPDX license tag was missing,
it only consisted of 11 drivers. To see if a pull request deals with a
file which lacks SPDX tags you can just use:
./scripts/spdxcheck.py -f \
$(git diff --name-only commid-id | xargs echo)
You'll see a core module file in this pull request for the above, but
that's not related to his changes. WE just need to add the SPDX
license tag for the kernel/module/kmod.c file in the future but it
demonstrates the effectiveness of the script.
Most of Nick's changes were spread out through different trees, and I
just picked up the slack after rc3 for the last kernel was out. Those
changes have been in linux-next for over two weeks.
The cleanups, debug code I added and final fix I added for modules
were motivated by David Hildenbrand's report of boot failing on a
systems with over 400 CPUs when KASAN was enabled due to running out
of virtual memory space. Although the functional change only consists
of 3 lines in the patch "module: avoid allocation if module is already
present and ready", proving that this was the best we can do on the
modules side took quite a bit of effort and new debug code.
The initial cleanups I did on the modules side of things has been in
linux-next since around rc3 of the last kernel, the actual final fix
for and debug code however have only been in linux-next for about a
week or so but I think it is worth getting that code in for this merge
window as it does help fix / prove / evaluate the issues reported with
larger number of CPUs. Userspace is not yet fixed as it is taking a
bit of time for folks to understand the crux of the issue and find a
proper resolution. Worst come to worst, I have a kludge-of-concept [3]
of how to make kernel_read*() calls for modules unique / converge
them, but I'm currently inclined to just see if userspace can fix this
instead"
Link: https://lore.kernel.org/all/Y/kXDqW+7d71C4wz@bombadil.infradead.org/ [0]
Link: https://lkml.kernel.org/r/025f2151-ce7c-5630-9b90-98742c97ac65@redhat.com [1]
Link: https://lwn.net/Articles/927569/ [2]
Link: https://lkml.kernel.org/r/20230414052840.1994456-3-mcgrof@kernel.org [3]
* tag 'modules-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux: (121 commits)
module: add debugging auto-load duplicate module support
module: stats: fix invalid_mod_bytes typo
module: remove use of uninitialized variable len
module: fix building stats for 32-bit targets
module: stats: include uapi/linux/module.h
module: avoid allocation if module is already present and ready
module: add debug stats to help identify memory pressure
module: extract patient module check into helper
modules/kmod: replace implementation with a semaphore
Change DEFINE_SEMAPHORE() to take a number argument
module: fix kmemleak annotations for non init ELF sections
module: Ignore L0 and rename is_arm_mapping_symbol()
module: Move is_arm_mapping_symbol() to module_symbol.h
module: Sync code of is_arm_mapping_symbol()
scripts/gdb: use mem instead of core_layout to get the module address
interconnect: remove module-related code
interconnect: remove MODULE_LICENSE in non-modules
zswap: remove MODULE_LICENSE in non-modules
zpool: remove MODULE_LICENSE in non-modules
x86/mm/dump_pagetables: remove MODULE_LICENSE in non-modules
...
Here is the large set of driver core changes for 6.4-rc1.
Once again, a busy development cycle, with lots of changes happening in
the driver core in the quest to be able to move "struct bus" and "struct
class" into read-only memory, a task now complete with these changes.
This will make the future rust interactions with the driver core more
"provably correct" as well as providing more obvious lifetime rules for
all busses and classes in the kernel.
The changes required for this did touch many individual classes and
busses as many callbacks were changed to take const * parameters
instead. All of these changes have been submitted to the various
subsystem maintainers, giving them plenty of time to review, and most of
them actually did so.
Other than those changes, included in here are a small set of other
things:
- kobject logging improvements
- cacheinfo improvements and updates
- obligatory fw_devlink updates and fixes
- documentation updates
- device property cleanups and const * changes
- firwmare loader dependency fixes.
All of these have been in linux-next for a while with no reported
problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZEp7Sw8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ykitQCfamUHpxGcKOAGuLXMotXNakTEsxgAoIquENm5
LEGadNS38k5fs+73UaxV
=7K4B
-----END PGP SIGNATURE-----
Merge tag 'driver-core-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the large set of driver core changes for 6.4-rc1.
Once again, a busy development cycle, with lots of changes happening
in the driver core in the quest to be able to move "struct bus" and
"struct class" into read-only memory, a task now complete with these
changes.
This will make the future rust interactions with the driver core more
"provably correct" as well as providing more obvious lifetime rules
for all busses and classes in the kernel.
The changes required for this did touch many individual classes and
busses as many callbacks were changed to take const * parameters
instead. All of these changes have been submitted to the various
subsystem maintainers, giving them plenty of time to review, and most
of them actually did so.
Other than those changes, included in here are a small set of other
things:
- kobject logging improvements
- cacheinfo improvements and updates
- obligatory fw_devlink updates and fixes
- documentation updates
- device property cleanups and const * changes
- firwmare loader dependency fixes.
All of these have been in linux-next for a while with no reported
problems"
* tag 'driver-core-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (120 commits)
device property: make device_property functions take const device *
driver core: update comments in device_rename()
driver core: Don't require dynamic_debug for initcall_debug probe timing
firmware_loader: rework crypto dependencies
firmware_loader: Strip off \n from customized path
zram: fix up permission for the hot_add sysfs file
cacheinfo: Add use_arch[|_cache]_info field/function
arch_topology: Remove early cacheinfo error message if -ENOENT
cacheinfo: Check cache properties are present in DT
cacheinfo: Check sib_leaf in cache_leaves_are_shared()
cacheinfo: Allow early level detection when DT/ACPI info is missing/broken
cacheinfo: Add arm64 early level initializer implementation
cacheinfo: Add arch specific early level initializer
tty: make tty_class a static const structure
driver core: class: remove struct class_interface * from callbacks
driver core: class: mark the struct class in struct class_interface constant
driver core: class: make class_register() take a const *
driver core: class: mark class_release() as taking a const *
driver core: remove incorrect comment for device_create*
MIPS: vpe-cmp: remove module owner pointer from struct class usage.
...
Since commit 8b41fc4454 ("kbuild: create modules.builtin without
Makefile.modbuiltin or tristate.conf"), MODULE_LICENSE declarations
are used to identify modules. As a consequence, uses of the macro
in non-modules will cause modprobe to misidentify their containing
object file as a module when it is not (false positives), and modprobe
might succeed rather than failing with a suitable error message.
So remove it in the files in this commit, none of which can be built as
modules.
Signed-off-by: Nick Alcock <nick.alcock@oracle.com>
Suggested-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: linux-modules@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Hitomi Hasegawa <hasegawa-hitomi@fujitsu.com>
Cc: Vinod Koul <vkoul@kernel.org>
Cc: dmaengine@vger.kernel.org
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Since commit 8b41fc4454 ("kbuild: create modules.builtin without
Makefile.modbuiltin or tristate.conf"), MODULE_LICENSE declarations
are used to identify modules. As a consequence, uses of the macro
in non-modules will cause modprobe to misidentify their containing
object file as a module when it is not (false positives), and modprobe
might succeed rather than failing with a suitable error message.
So remove it in the files in this commit, none of which can be built as
modules.
Signed-off-by: Nick Alcock <nick.alcock@oracle.com>
Suggested-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: linux-modules@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Hitomi Hasegawa <hasegawa-hitomi@fujitsu.com>
Cc: Vinod Koul <vkoul@kernel.org>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Philipp Zabel <p.zabel@pengutronix.de>
Cc: dmaengine@vger.kernel.org
Cc: linux-stm32@st-md-mailman.stormreply.com
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
The iommu subsystem requires IOMMU_DEV_FEAT_IOPF must be enabled before
and disabled after IOMMU_DEV_FEAT_SVA, if device's I/O page faults rely
on the IOMMU. Add explicit IOMMU_DEV_FEAT_IOPF enabling/disabling in this
driver.
At present, missing IOPF enabling/disabling doesn't cause any real issue,
because the IOMMU driver places the IOPF enabling/disabling in the path
of SVA feature handling. But this may change.
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20230324120234.313643-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Add sysfs knob for per wq Page Request Service disable. This knob
disables PRS support for the specific wq. When this bit is set,
it also overrides the wq's block on fault enabling.
Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-17-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Provide the pid of the application for the opened file. This allows the
monitor daemon to easily correlate which app opened the file and easily
kill the app by pid if that is desired action.
Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-16-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Expose cr_faults and cr_fault_failures counters to the user space. This
allows a user app to keep track of how many fault the application is
causing with the completion record (CR) and also the number of failures
of the CR writeback. Having a high number of cr_fault_failures is bad as
the app is submitting descriptors with the CR addresses that are bad. User
monitoring daemon may want to consider killing the application as it may be
malicious and attempting to flood the device event log.
Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-15-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Embed a struct device for the user file context in order to export sysfs
attributes related with the opened file. Tie the lifetime of the file
context to the device. The sysfs entry will be added under the char device.
Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-14-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Add counters per opened file for the char device in order to keep track how
many completion record faults occurred and how many of those faults failed
the writeback by the driver after attempt to fault in the page. The
counters are managed by xarray that associates the PASID with
struct idxd_user_context.
Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-13-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Add event log processing for faulting of user batch descriptor completion
record.
When encountering an event log entry for a page fault on a completion
record, the driver is expected to do the following:
1. If the "first error in batch" bit in event log entry error info is
set, discard any previously recorded errors associated with the
"batch identifier".
2. Fix the page fault according to the fault address in the event log. If
successful, write the completion record to the fault address in user space.
3. If an error is encountered while writing the completion record and it is
associated to a descriptor in the batch, the driver associates the error
with the batch identifier of the event log entry and tracks it until the
event log entry for the corresponding batch desc is encountered.
While processing an event log entry for a batch descriptor with error
indicating that one or more descs in the batch had event log entries,
the driver will do the following before writing the batch completion
record:
1. If the status field of the completion record is 0x1, the driver will
change it to error code 0x5 (one or more operations in batch completed
with status not successful) and changes the result field to 1.
2. If the status is error code 0x6 (page fault on batch descriptor list
address), change the result field to 1.
3. If status is any other value, the completion record is not changed.
4. Clear the recorded error in preparation for next batch with same batch
identifier.
The result field is for user software to determine whether to set the
"Batch Error" flag bit in the descriptor for continuation of partial
batch descriptor completion. See DSA spec 2.0 for additional information.
If no error has been recorded for the batch, the batch completion record is
written to user space as is.
Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-12-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
DSA supports page fault handling through PRS. However, the DMA engine
that's processing the descriptor is blocked until the PRS response is
received. Other workqueues sharing the engine are also blocked.
Page fault handing by the driver with PRS disabled can be used to
mitigate the stalling.
With PRS disabled while ATS remain enabled, DSA handles page faults on
a completion record by reporting an event in the event log. In this
instance, the descriptor is completed and the event log contains the
completion record address and the contents of the completion record. Add
support to the event log handling code to fault in the completion record
and copy the content of the completion record to user memory.
A bitmap is introduced to keep track of discarded event log entries. When
the user process initiates ->release() of the char device, it no longer is
interested in any remaining event log entries tied to the relevant wq and
PASID. The driver will mark the event log entry index in the bitmap. Upon
encountering the entries during processing, the event log handler will just
clear the bitmap bit and skip the entry rather than attempt to process the
event log entry.
Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-10-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Define idxd_copy_cr() to copy completion record to fault address in
user address that is found by work queue (wq) and PASID.
It will be used to write the user's completion record that the hardware
device is not able to write due to user completion record page fault.
An xarray is added to associate the PASID and mm with the
struct idxd_user_context so mm can be found by PASID and wq.
It is called when handling the completion record fault in a kernel thread
context. Switch to the mm using kthread_use_vm() and copy the
completion record to the mm via copy_to_user(). Once the copy is
completed, switch back to the current mm using kthread_unuse_mm().
Suggested-by: Christoph Hellwig <hch@infradead.org>
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Suggested-by: Tony Luck <tony.luck@intel.com>
Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-9-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Add a kmem cache per device for allocating event log fault context. The
context allows an event log entry to be copied and passed to a software
workqueue to be processed. Due to each device can have different sized
event log entry depending on device type, it's not possible to have a
global kmem cache.
Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-8-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Add a workqueue for user submitted completion record fault processing.
The workqueue creation and destruction lifetime will be tied to the user
sub-driver since it will only be used when the wq is a user type.
Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-7-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Add debugfs entry to dump the content of the event log for debugging. The
function will dump all non-zero entries in the event log. It will note
which entries are processed and which entries are still pending processing
at the time of the dump. The entries may not always be in chronological
order due to the log is a circular buffer.
Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-6-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
An event log interrupt is raised in the misc interrupt INTCAUSE register
when an event is written by the hardware. Add basic event log processing
support to the interrupt handler. The event log is a ring where the
hardware owns the tail and the software owns the head. The hardware will
advance the tail index when an additional event has been pushed to memory.
The software will process the log entry and then advances the head. The
log is full when (tail + 1) % log_size = head. The hardware will stop
writing when the log is full. The user is expected to create a log size
large enough to handle all the expected events.
Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-5-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Add setup of event log feature for supported device. Event log addresses
error reporting that was lacking in gen 1 DSA devices where a second error
event does not get reported when a first event is pending software
handling. The event log allows a circular buffer that the device can push
error events to. It is up to the user to create a large enough event log
ring in order to capture the expected events. The evl size can be set in
the device sysfs attribute. By default 64 entries are supported as minimal
when event log is enabled.
Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-4-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Add support for changing of the event log size. Event log is a
feature added to DSA 2.0 hardware to improve error reporting.
It supersedes the SWERROR register on DSA 1.0 hardware and hope
to prevent loss of reported errors.
The error log size determines how many error entries supported for
the device. It can be configured by the user via sysfs attribute.
Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-3-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Current code continuously processes the interrupt as long as the hardware
is setting the status bit. There's no reason to do that since the threaded
handler will get called again if another interrupt is asserted.
Also through testing, it has shown that if a misprogrammed (or malicious)
agent can continuously submit descriptors with bad completion record and
causes errors to be reported via the misc interrupt. Continuous processing
by the thread can cause software hang watchdog to kick off since the thread
isn't giving up the CPU.
Reported-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-2-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Align the declaration of ret in atmel_xdmac_resume() with the rest of
variables. Do this by adding ret to the line with declaration for i
variable.
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Link: https://lore.kernel.org/r/20230214151827.1050280-8-claudiu.beznea@microchip.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Do not global enable all the cyclic channels in at_xdmac_resume(). Instead
save the global status in at_xdmac_suspend() and re-enable the cyclic
channel only if it was active before suspend.
Fixes: e1f7c9eee7 ("dmaengine: at_xdmac: creation of the atmel eXtended DMA Controller driver")
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Link: https://lore.kernel.org/r/20230214151827.1050280-6-claudiu.beznea@microchip.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
In case the system suspends to a deep sleep state where power to DMA
controller is cut-off we need to restore the content of GRWS register.
This is a write only register and writing bit X tells the controller
to suspend read and write requests for channel X. Thus set GRWS before
restoring the content of GE (Global Enable) regiter.
Fixes: e1f7c9eee7 ("dmaengine: at_xdmac: creation of the atmel eXtended DMA Controller driver")
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Link: https://lore.kernel.org/r/20230214151827.1050280-5-claudiu.beznea@microchip.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
In case there are DMA channels not paused by consumers in suspend
process (valid on AT91 SoCs for serial driver when no_console_suspend) the
driver pauses them (using at_xdmac_device_pause() which is also the same
function called by dmaengine_pause()) and then in the resume process the
driver resumes them calling at_xdmac_device_resume() which is the same
function called by dmaengine_resume()). This is good for DMA channels
not paused by consumers but for drivers that calls
dmaengine_pause()/dmaegine_resume() on suspend/resume path this may lead to
DMA channel being enabled before the IP is enabled. For IPs that needs
strict ordering with regards to DMA channel enablement this will lead to
wrong behavior. To fix this add a new set of functions
at_xdmac_device_pause_internal()/at_xdmac_device_resume_internal() to be
called only on suspend/resume.
Fixes: e1f7c9eee7 ("dmaengine: at_xdmac: creation of the atmel eXtended DMA Controller driver")
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Link: https://lore.kernel.org/r/20230214151827.1050280-4-claudiu.beznea@microchip.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>