linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-28 15:11:31 +00:00

Anton Blanchard 15c2d45d17 powerpc: Add 64bit optimised memcmp

I noticed ksm spending quite a lot of time in memcmp on a large KVM box. The current memcmp loop is very unoptimised - byte at a time compares with no loop unrolling. We can do much much better.

Optimise the loop in a few ways:

- Unroll the byte at a time loop

- For large (at least 32 byte) comparisons that are also 8 byte aligned, use an unrolled modulo scheduled loop using 8 byte loads. This is similar to our glibc memcmp.

A simple microbenchmark testing 10000000 iterations of an 8192 byte memcmp was used to measure the performance:

baseline: 29.93 s
modified: 1.70 s

Just over 17x faster.

v2: Incorporated some suggestions from Segher:

- Use andi. instead of rdlicl.

- Convert bdnzt eq, to bdnz. It's just duplicating the earlier compare and was a relic from a previous version.

- Don't use cr5, we have plans to use that CR field for fast local atomics.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

2015-01-23 14:02:55 +11:00
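
The two-path strategy in the commit message above can be illustrated with a rough C sketch. This is not the code in memcmp_64.S, which is hand-written PowerPC assembly. The names memcmp_sketch, cmp_bytes, load64 and check_against_libc are hypothetical, the -1/0/1 return values are an assumption (only the sign matters to callers), and the modulo scheduling of the real loop is not reproduced here. To stay independent of byte order, the sketch only uses the 8 byte loads to detect a mismatch and then resolves the ordering byte by byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte at a time compare, unrolled by four. Used for short or
 * unaligned inputs and for the tail of the long path. */
static int cmp_bytes(const unsigned char *a, const unsigned char *b, size_t n)
{
    while (n >= 4) {
        if (a[0] != b[0]) return a[0] < b[0] ? -1 : 1;
        if (a[1] != b[1]) return a[1] < b[1] ? -1 : 1;
        if (a[2] != b[2]) return a[2] < b[2] ? -1 : 1;
        if (a[3] != b[3]) return a[3] < b[3] ? -1 : 1;
        a += 4; b += 4; n -= 4;
    }
    while (n--) {
        if (*a != *b) return *a < *b ? -1 : 1;
        a++; b++;
    }
    return 0;
}

/* 8 byte load via memcpy, which avoids aliasing problems in C. */
static uint64_t load64(const unsigned char *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical sketch: returns <0, 0 or >0 like memcmp. */
int memcmp_sketch(const void *s1, const void *s2, size_t n)
{
    const unsigned char *a = s1, *b = s2;

    /* Long path: at least 32 bytes and both pointers 8 byte aligned. */
    if (n >= 32 && (((uintptr_t)a | (uintptr_t)b) & 7) == 0) {
        while (n >= 32) {
            /* Four 8 byte compares per iteration (32 bytes). */
            if (load64(a) != load64(b))
                return cmp_bytes(a, b, 8);
            if (load64(a + 8) != load64(b + 8))
                return cmp_bytes(a + 8, b + 8, 8);
            if (load64(a + 16) != load64(b + 16))
                return cmp_bytes(a + 16, b + 16, 8);
            if (load64(a + 24) != load64(b + 24))
                return cmp_bytes(a + 24, b + 24, 8);
            a += 32; b += 32; n -= 32;
        }
    }
    /* Short path, unaligned inputs, or the remaining tail. */
    return cmp_bytes(a, b, n);
}

static int sign(int v) { return (v > 0) - (v < 0); }

/* Asserts that the sketch agrees in sign with the C library memcmp. */
void check_against_libc(const void *s1, const void *s2, size_t n)
{
    assert(sign(memcmp_sketch(s1, s2, n)) == sign(memcmp(s1, s2, n)));
}
```

Calling check_against_libc on a pair of buffers exercises whichever path their length and alignment select. Resolving a mismatching word with cmp_bytes keeps the sketch correct on both big and little endian hosts, at the cost of a few extra byte compares on the final word.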
alloc.c	powerpc: Remove more traces of bootmem	2014-11-19 21:41:51 +11:00
checksum_32.S
checksum_64.S	powerpc: Restore registers on error exit from csum_partial_copy_generic()	2013-10-03 17:22:42 +10:00
checksum_wrappers_64.c	powerpc: various straight conversions from module.h --> export.h	2011-10-31 19:30:44 -04:00
code-patching.c	powerpc: Move the patch_exception to a common place	2013-12-02 14:06:54 +11:00
copy_32.S	powerpc: Fix incorrect .stabs entry for copy_32.S	2010-09-02 14:07:34 +10:00
copypage_64.S	powerpc: Exported functions __clear_user and copy_page use r2 so need _GLOBAL_TOC()	2014-06-05 13:20:41 +10:00
copypage_power7.S	powerpc: Fix unsafe accesses to parameter area in ELFv2	2014-04-23 10:05:24 +10:00
copyuser_64.S	powerpc: Remove power3 from comments	2014-07-28 14:10:26 +10:00
copyuser_power7.S	powerpc: Fix comment typos 'CONFiG_ALTIVEC'	2014-10-29 14:41:49 +01:00
crtsavres.S	powerpc: Add vr save/restore functions	2014-01-15 13:46:43 +11:00
div64.S
feature-fixups-test.S	powerpc: Ensure the else case of feature sections will fit	2011-01-21 14:08:33 +11:00
feature-fixups.c	powerpc: Make a bunch of things static	2014-09-25 23:14:41 +10:00
hweight_64.S	powerpc: No need to use dot symbols when branching to a function	2014-04-23 10:05:16 +10:00
ldstfp.S	powerpc: Fixes for instructions not using correct register naming	2012-07-10 19:18:16 +10:00
locks.c	powerpc: Add smp_mb()s to arch_spin_unlock_wait()	2014-08-13 15:13:27 +10:00
Makefile	powerpc: Add 64bit optimised memcmp	2015-01-23 14:02:55 +11:00
mem_64.S	powerpc: use _GLOBAL_TOC for memmove	2014-07-22 15:56:04 +10:00
memcmp_64.S	powerpc: Add 64bit optimised memcmp	2015-01-23 14:02:55 +11:00
memcpy_64.S	Merge remote-tracking branch 'anton/abiv2' into next	2014-05-05 20:57:12 +10:00
memcpy_power7.S	powerpc: Fix comment typos 'CONFiG_ALTIVEC'	2014-10-29 14:41:49 +01:00
ppc_ksyms.c	powerpc: Move lib symbol exports into arch/powerpc/lib/ppc_ksyms.c	2014-09-25 23:14:39 +10:00
rheap.c	powerpc: various straight conversions from module.h --> export.h	2011-10-31 19:30:44 -04:00
sstep.c	powerpc: Fix compilation of emulate_step()	2014-11-12 15:54:29 +11:00
string_64.S	powerpc: Exported functions __clear_user and copy_page use r2 so need _GLOBAL_TOC()	2014-06-05 13:20:41 +10:00
string.S	powerpc: Add 64bit optimised memcmp	2015-01-23 14:02:55 +11:00
usercopy_64.c
vmx-helper.c	powerpc: POWER7 optimised copy_page using VMX and enhanced prefetch	2012-07-03 14:14:44 +10:00
xor_vmx.c	powerpc: Add VMX optimised xor for RAID5	2013-10-30 16:02:28 +11:00