i.MX3x SoCs contain an Image Processing Unit, consisting of a Control
Module (CM), Display Interface (DI), Synchronous Display Controller (SDC),
Asynchronous Display Controller (ADC), Image Converter (IC), Post-Filter
(PF), Camera Sensor Interface (CSI), and an Image DMA Controller (IDMAC).
CM contains, among other blocks, an Interrupt Generator (IG) and a Clock
and Reset Control Unit (CRCU). This driver serves IDMAC and IG. They are
supported over dmaengine and irq-chip APIs respectively.
IDMAC is a specialised DMA controller, its DMA channels cannot be used for
general-purpose operations, even though it might be possible to configure
a memory-to-memory channel for memcpy operation. This driver will not work
with generic dmaengine clients, clients, wishing to use it must use
respective wrapper structures, they also must specify which channels they
require, as channels are hard-wired to specific IPU functions.
Acked-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Guennadi Liakhovetski <lg@denx.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
This adds a driver for the Synopsys DesignWare DMA controller (aka
DMACA on AVR32 systems.) This DMA controller can be found integrated
on the AT32AP7000 chip and is primarily meant for peripheral DMA
transfer, but can also be used for memory-to-memory transfers.
This patch is based on a driver from David Brownell which was based on
an older version of the DMA Engine framework. It also implements the
proposed extensions to the DMA Engine API for slave DMA operations.
The dmatest client shows no problems, but there may still be room for
improvement performance-wise. DMA slave transfer performance is
definitely "good enough"; reading 100 MiB from an SD card running at ~20
MHz yields ~7.2 MiB/s average transfer rate.
Full documentation for this controller can be found in the Synopsys
DW AHB DMAC Databook:
http://www.synopsys.com/designware/docs/iip/DW_ahb_dmac/latest/doc/dw_ahb_dmac_db.pdf
The controller has lots of implementation options, so it's usually a
good idea to check the data sheet of the chip it's intergrated on as
well. The AT32AP7000 data sheet can be found here:
http://www.atmel.com/dyn/products/datasheets.asp?family_id=682
Changes since v4:
* Use client_count instead of dma_chan_is_in_use()
* Add missing include
* Unmap buffers unless client told us not to
Changes since v3:
* Update to latest DMA engine and DMA slave APIs
* Embed the hw descriptor into the sw descriptor
* Clean up and update MODULE_DESCRIPTION, copyright date, etc.
Changes since v2:
* Dequeue all pending transfers in terminate_all()
* Rename dw_dmac.h -> dw_dmac_regs.h
* Define and use controller-specific dma_slave data
* Fix up a few outdated comments
* Define hardware registers as structs (doesn't generate better
code, unfortunately, but it looks nicer.)
* Get number of channels from platform_data instead of hardcoding it
based on CONFIG_WHATEVER_CPU.
* Give slave clients exclusive access to the channel
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>,
Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
This client tests DMA memcpy using various lengths and various offsets
into the source and destination buffers. It will initialize both
buffers with a repeatable pattern and verify that the DMA engine copies
the requested region and nothing more. It will also verify that the
bytes aren't swapped around, and that the source buffer isn't modified.
The dmatest module can be configured to test a specific device, a
specific channel. It can also test multiple channels at the same time,
and it can start multiple threads competing for the same channel.
Changes since v2:
* Support testing multiple channels at the same time
* Support testing with multiple threads competing for the same channel
* Use counting test patterns in order to catch byte ordering issues
Changes since v1:
* Remove extra dashes around "help"
* Remove "default n" from Kconfig
* Turn TEST_BUF_SIZE into a module parameter
* Return DMA_NAK instead of DMA_DUP
* Print unhandled events
* Support testing specific channels and devices
* Move to the end of the Makefile
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The XOR engine found in Marvell's SoCs and system controllers
provides XOR and DMA operation, iSCSI CRC32C calculation, memory
initialization, and memory ECC error cleanup operation support.
This driver implements the DMA engine API and supports the following
capabilities:
- memcpy
- xor
- memset
The XOR engine can be used by DMA engine clients implemented in the
kernel, one of those clients is the RAID module. In that case, I
observed 20% improvement in the raid5 write throughput, and 40%
decrease in the CPU utilization when doing array construction, those
results obtained on an 5182 running at 500Mhz.
When enabling the NET DMA client, the performance decreased, so
meanwhile it is recommended to keep this client off.
Signed-off-by: Saeed Bishara <saeed@marvell.com>
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: Nicolas Pitre <nico@marvell.com>
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The driver implements DMA engine API for Freescale MPC85xx DMA controller,
which could be used by devices in the silicon. The driver supports the
Basic mode of Freescale MPC85xx DMA controller. The MPC85xx processors
supported include MPC8540/60, MPC8555, MPC8548, MPC8641 and so on.
The MPC83xx(MPC8349, MPC8360) are also supported.
[kamalesh@linux.vnet.ibm.com: build fix]
[dan.j.williams@intel.com: merge mm fixes, rebase on async_tx-2.6.25]
Signed-off-by: Zhang Wei <wei.zhang@freescale.com>
Signed-off-by: Ebony Zhu <ebony.zhu@freescale.com>
Acked-by: Kumar Gala <galak@gate.crashing.org>
Cc: Shannon Nelson <shannon.nelson@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Add code to connect to the DCA driver and provide cpu tags for use by
drivers that would like to use Direct Cache Access hints.
[Adrian Bunk] Several Kconfig cleanup items
[Andrew Morten, Chris Leech] Fix for using cpu_physical_id() even when
built for uni-processor
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Split the general PCI startup from the DMA handling code in order to
prepare for adding support for DCA services and future versions of the
ioatdma device.
[Rusty Russell] Removal of __unsafe() usage.
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rename the ioatdma.c file in preparation for splitting into multiple files,
which will allow for easier adding new functionality.
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The Intel(R) IOP series of i/o processors integrate an Xscale core with
raid acceleration engines. The capabilities per platform are:
iop219:
(2) copy engines
iop321:
(2) copy engines
(1) xor and block fill engine
iop33x:
(2) copy and crc32c engines
(1) xor, xor zero sum, pq, pq zero sum, and block fill engine
iop34x (iop13xx):
(2) copy, crc32c, xor, xor zero sum, and block fill engines
(1) copy, crc32c, xor, xor zero sum, pq, pq zero sum, and block fill engine
The driver supports the features of the async_tx api:
* asynchronous notification of operation completion
* implicit (interupt triggered) handling of inter-channel transaction
dependencies
The driver adapts to the platform it is running by two methods.
1/ #include <asm/arch/adma.h> which defines the hardware specific
iop_chan_* and iop_desc_* routines as a series of static inline
functions
2/ The private platform data attached to the platform_device defines the
capabilities of the channels
20070626: Callbacks are run in a tasklet. Given the recent discussion on
LKML about killing tasklets in favor of workqueues I did a quick conversion
of the driver. Raid5 resync performance dropped from 50MB/s to 30MB/s, so
the tasklet implementation remains until a generic softirq interface is
available.
Changelog:
* fixed a slot allocation bug in do_iop13xx_adma_xor that caused too few
slots to be requested eventually leading to data corruption
* enabled the slot allocation routine to attempt to free slots before
returning -ENOMEM
* switched the cleanup routine to solely use the software chain and the
status register to determine if a descriptor is complete. This is
necessary to support other IOP engines that do not have status writeback
capability
* make the driver iop generic
* modified the allocation routines to understand allocating a group of
slots for a single operation
* added a null xor initialization operation for the xor only channel on
iop3xx
* support xor operations on buffers larger than the hardware maximum
* split the do_* routines into separate prep, src/dest set, submit stages
* added async_tx support (dependent operations initiation at cleanup time)
* simplified group handling
* added interrupt support (callbacks via tasklets)
* brought the pending depth inline with ioat (i.e. 4 descriptors)
* drop dma mapping methods, suggested by Chris Leech
* don't use inline in C files, Adrian Bunk
* remove static tasklet declarations
* make iop_adma_alloc_slots easier to read and remove chances for a
corrupted descriptor chain
* fix locking bug in iop_adma_alloc_chan_resources, Benjamin Herrenschmidt
* convert capabilities over to dma_cap_mask_t
* fixup sparse warnings
* add descriptor flush before iop_chan_enable
* checkpatch.pl fixes
* gpl v2 only correction
* move set_src, set_dest, submit to async_tx methods
* move group_list and phys to async_tx
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Provides for pinning user space pages in memory, copying to iovecs,
and copying from sk_buffs including fragmented and chained sk_buffs.
Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Provides an API for offloading memory copies to DMA devices
Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>