linux/arch/arm/lib
Russell King b4f656eea6 Pull branch 'for-rmk' of git://git.linaro.org/people/ardbiesheuvel/linux-arm into devel-stable
Comments from Ard Biesheuvel:

I have included two use cases that I have been using, XOR and RAID-6
checksumming. The former gets a 60% performance boost on the NEON, the
latter over 400%.

ARM: add support for kernel mode NEON

Adds kernel_neon_begin/end (renamed from kernel_vfp_begin/end in the
previous version to de-emphasize the VFP part as VFP code that needs
software assistance is not supported currently.)

Introduces <asm/neon.h> and the Kconfig symbol KERNEL_MODE_NEON. This
has been aligned with Catalin for arm64, so any NEON code that does
not use assembly but intrinsics or the GCC vectorizer (such as my
examples) can potentially be shared between arm and arm64 archs.

ARM: move VFP init to an earlier boot stage

This is needed so the NEON is enabled when the XOR and RAID-6 algo
boot time benchmarks are run.

ARM: be strict about FP exceptions in kernel mode

This adds a check to vfp_support_entry() to flag unsupported uses of
the NEON/VFP in kernel mode. FP exceptions (bounces) are flagged as
a bug because of their potentially intermittent nature.
Exceptions caused by the fact that kernel_neon_begin has not been
called are just routed through the undef handler.

ARM: crypto: add NEON accelerated XOR implementation

This is the xor_blocks() implementation built with -ftree-vectorize,
60% faster than optimized ARM code. It calls in_interrupt() to check
whether the NEON flavor can be used: this should really not be
necessary, but due to xor_blocks's quite generic nature, there is no
telling how exactly people may be using it in the real world.
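
The fallback pattern described above can be sketched in user space as
follows. This is only an illustration, not the code in xor-neon.c:
fake_in_interrupt, neon_begin(), neon_end(), xor_words() and xor_2() are
hypothetical stand-ins for in_interrupt(), kernel_neon_begin(),
kernel_neon_end() and the real XOR routines. In the kernel the two paths
are separate implementations; here they share one plain loop, which is
the kind of loop -ftree-vectorize is meant to handle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel helpers named in the commit text. */
static bool fake_in_interrupt;                  /* models in_interrupt() */
static int neon_depth;                          /* tracks begin/end pairing */
static void neon_begin(void) { neon_depth++; }  /* models kernel_neon_begin() */
static void neon_end(void)   { neon_depth--; }  /* models kernel_neon_end() */

/* Plain word-wise XOR loop: dst[i] ^= src[i]. */
static void xor_words(size_t n, unsigned long *dst, const unsigned long *src)
{
    for (size_t i = 0; i < n; i++)
        dst[i] ^= src[i];
}

/*
 * XOR p2 into p1. Returns true if the (simulated) NEON path was taken,
 * false if the scalar fallback ran because we are "in interrupt".
 */
static bool xor_2(size_t bytes, unsigned long *p1, const unsigned long *p2)
{
    size_t n = bytes / sizeof(unsigned long);

    assert(bytes % sizeof(unsigned long) == 0);

    if (fake_in_interrupt) {
        xor_words(n, p1, p2);   /* scalar fallback, NEON unit untouched */
        return false;
    }

    neon_begin();               /* claim the NEON unit */
    xor_words(n, p1, p2);       /* vectorizable loop would run here */
    neon_end();                 /* release it again */
    return true;
}
```

Both paths produce the same result; only the choice of routine depends
on the calling context.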

lib/raid6: add ARM-NEON accelerated syndrome calculation

This is a port of the RAID-6 checksumming code in altivec.uc to NEON
intrinsics. It is about 4x faster than the sequential code.
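
For reference, the syndrome calculation itself can be sketched in
portable C. This is a byte-at-a-time illustration under stated
assumptions, not the NEON code: gf_mul2() and raid6_syndrome() are
hypothetical names, and the accelerated version applies the same steps
to whole vectors at once. P is the XOR of all data blocks; Q is the sum
of 2^d * D_d in GF(2^8) with the polynomial 0x11d, evaluated by Horner's
rule from the highest disk down.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Multiply by 2 in GF(2^8), reducing by the polynomial 0x11d. */
static uint8_t gf_mul2(uint8_t v)
{
    return (uint8_t)((v << 1) ^ ((v & 0x80) ? 0x1d : 0));
}

/*
 * Compute the P (plain XOR) and Q (Reed-Solomon) syndromes over
 * 'ndisks' data blocks of 'bytes' bytes each.
 */
static void raid6_syndrome(int ndisks, size_t bytes,
                           const uint8_t *const *data,
                           uint8_t *p, uint8_t *q)
{
    assert(ndisks > 0);

    for (size_t i = 0; i < bytes; i++) {
        /* Start from the highest-numbered data disk. */
        uint8_t wp = data[ndisks - 1][i];
        uint8_t wq = wp;

        for (int d = ndisks - 2; d >= 0; d--) {
            wp ^= data[d][i];               /* P: running XOR */
            wq = gf_mul2(wq) ^ data[d][i];  /* Q: Horner step */
        }
        p[i] = wp;
        q[i] = wq;
    }
}
```

With three one-byte disks holding 0x01, 0x02 and 0x80, this gives
P = 0x83 and Q = 0x3f.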
2013-07-22 17:46:40 +01:00
..
ashldi3.S Thumb-2: Implement the unified arch/arm/lib functions 2009-07-24 12:32:57 +01:00
ashrdi3.S Thumb-2: Implement the unified arch/arm/lib functions 2009-07-24 12:32:57 +01:00
backtrace.S ARM: 7068/1: process: change from __backtrace to dump_stack in show_regs 2011-10-17 09:12:41 +01:00
bitops.h ARM: 7171/1: unwind: add unwind directives to bitops assembly macros 2011-11-26 21:58:53 +00:00
call_with_stack.S ARM: lib: add call_with_stack function for safely changing stack 2011-12-12 16:07:35 +00:00
changebit.S ARM: 7171/1: unwind: add unwind directives to bitops assembly macros 2011-11-26 21:58:53 +00:00
clear_user.S ARM: 6110/1: Fix Thumb-2 kernel builds when UACCESS_WITH_MEMCPY is enabled 2010-05-08 10:45:26 +01:00
clearbit.S ARM: 7171/1: unwind: add unwind directives to bitops assembly macros 2011-11-26 21:58:53 +00:00
copy_from_user.S ARM: fix build error in arch/arm/kernel/process.c 2010-04-21 08:45:21 +01:00
copy_page.S ARM: 5701/1: ARM: copy_page.S: take into account the size of the cache line 2009-09-15 22:07:02 +01:00
copy_template.S Thumb-2: Implement the unified arch/arm/lib functions 2009-07-24 12:32:57 +01:00
copy_to_user.S ARM: 6110/1: Fix Thumb-2 kernel builds when UACCESS_WITH_MEMCPY is enabled 2010-05-08 10:45:26 +01:00
csumipv6.S [ARM] 5227/1: Add the ENDPROC declarations to the .S files 2008-09-01 12:06:34 +01:00
csumpartial.S [ARM] 5227/1: Add the ENDPROC declarations to the .S files 2008-09-01 12:06:34 +01:00
csumpartialcopy.S [ARM] 5231/1: Do not save the frame pointer in the csum_partial_copy_* functions 2008-09-01 12:06:35 +01:00
csumpartialcopygeneric.S [ARM] 5227/1: Add the ENDPROC declarations to the .S files 2008-09-01 12:06:34 +01:00
csumpartialcopyuser.S ARM: Fix csum_partial_copy_from_user() 2010-07-26 12:18:16 +01:00
delay-loop.S ARM: 7452/1: delay: allow timer-based delay implementation to be selected 2012-07-09 17:42:23 +01:00
delay.c arm: delete __cpuinit/__CPUINIT usage from all ARM users 2013-07-14 19:36:52 -04:00
div64.S ARM: 7125/1: Add unwinding annotations for 64bit division functions 2011-10-17 09:13:42 +01:00
ecard.S ARM: remove unnecessary mach/hardware.h includes 2011-07-12 11:19:27 -05:00
findbit.S ARM: 6482/2: Fix find_next_zero_bit and related assembly 2010-11-24 20:17:46 +00:00
floppydma.S Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
getuser.S ARM: 7527/1: uaccess: explicitly check __user pointer when !CPU_USE_DOMAINS 2012-09-09 17:28:47 +01:00
io-acorn.S arch: remove direct definitions of KERN_<LEVEL> uses 2012-07-30 17:25:13 -07:00
io-readsb.S [ARM] 5227/1: Add the ENDPROC declarations to the .S files 2008-09-01 12:06:34 +01:00
io-readsl.S [ARM] 5227/1: Add the ENDPROC declarations to the .S files 2008-09-01 12:06:34 +01:00
io-readsw-armv3.S ARM: Bring back ARMv3 IO and user access code 2012-08-13 11:44:13 +01:00
io-readsw-armv4.S [ARM] 5227/1: Add the ENDPROC declarations to the .S files 2008-09-01 12:06:34 +01:00
io-shark.c [PATCH] ARM: 2815/1: Shark: new defconfig, fixes with __io and serial ports 2005-07-16 17:17:18 +01:00
io-writesb.S [ARM] 5227/1: Add the ENDPROC declarations to the .S files 2008-09-01 12:06:34 +01:00
io-writesl.S [ARM] 5227/1: Add the ENDPROC declarations to the .S files 2008-09-01 12:06:34 +01:00
io-writesw-armv3.S ARM: Bring back ARMv3 IO and user access code 2012-08-13 11:44:13 +01:00
io-writesw-armv4.S Thumb-2: Implement the unified arch/arm/lib functions 2009-07-24 12:32:57 +01:00
lib1funcs.S ARM: 6945/1: Add unwinding support for division functions 2011-05-27 22:56:53 +01:00
lshrdi3.S Thumb-2: Implement the unified arch/arm/lib functions 2009-07-24 12:32:57 +01:00
Makefile ARM: crypto: add NEON accelerated XOR implementation 2013-07-08 22:09:06 +01:00
memchr.S [ARM] 5227/1: Add the ENDPROC declarations to the .S files 2008-09-01 12:06:34 +01:00
memcpy.S Thumb-2: Implement the unified arch/arm/lib functions 2009-07-24 12:32:57 +01:00
memmove.S ARM: 6006/1: ARM: Use the correct NOP size in memmove for Thumb-2 kernel builds 2010-03-29 17:33:33 +01:00
memset.S ARM: 7670/1: fix the memset fix 2013-03-12 12:18:47 +00:00
memzero.S [ARM] 5227/1: Add the ENDPROC declarations to the .S files 2008-09-01 12:06:34 +01:00
muldi3.S [ARM] 5227/1: Add the ENDPROC declarations to the .S files 2008-09-01 12:06:34 +01:00
putuser.S ARM: 7527/1: uaccess: explicitly check __user pointer when !CPU_USE_DOMAINS 2012-09-09 17:28:47 +01:00
setbit.S ARM: 7171/1: unwind: add unwind directives to bitops assembly macros 2011-11-26 21:58:53 +00:00
strchr.S [ARM] 5227/1: Add the ENDPROC declarations to the .S files 2008-09-01 12:06:34 +01:00
strrchr.S [ARM] 5227/1: Add the ENDPROC declarations to the .S files 2008-09-01 12:06:34 +01:00
testchangebit.S ARM: 7171/1: unwind: add unwind directives to bitops assembly macros 2011-11-26 21:58:53 +00:00
testclearbit.S ARM: 7171/1: unwind: add unwind directives to bitops assembly macros 2011-11-26 21:58:53 +00:00
testsetbit.S ARM: 7171/1: unwind: add unwind directives to bitops assembly macros 2011-11-26 21:58:53 +00:00
uaccess_with_memcpy.c ARM: include linux/highmem.h in uaccess functions 2011-10-02 15:44:32 +02:00
uaccess.S ARM: Bring back ARMv3 IO and user access code 2012-08-13 11:44:13 +01:00
ucmpdi2.S [ARM] 5227/1: Add the ENDPROC declarations to the .S files 2008-09-01 12:06:34 +01:00
xor-neon.c ARM: crypto: add NEON accelerated XOR implementation 2013-07-08 22:09:06 +01:00