linux/arch/x86
Johannes Goetzfried 4ea1277d30 crypto: cast6 - add x86_64/avx assembler implementation
This patch adds a x86_64/avx assembler implementation of the Cast6 block
cipher. The implementation processes eight blocks in parallel (two 4 block
chunk AVX operations). The table-lookups are done in general-purpose registers.
For small blocksizes the functions from the generic module are called. A good
performance increase is provided for blocksizes greater or equal to 128B.

Patch has been tested with tcrypt and automated filesystem tests.

Tcrypt benchmark results:

Intel Core i5-2500 CPU (fam:6, model:42, step:7)

cast6-avx-x86_64 vs. cast6-generic
128bit key:                                             (lrw:256bit)    (xts:256bit)
size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
16B     0.97x   1.00x   1.01x   1.01x   0.99x   0.97x   0.98x   1.01x   0.96x   0.98x
64B     0.98x   0.99x   1.02x   1.01x   0.99x   1.00x   1.01x   0.99x   1.00x   0.99x
256B    1.77x   1.84x   0.99x   1.85x   1.77x   1.77x   1.70x   1.74x   1.69x   1.72x
1024B   1.93x   1.95x   0.99x   1.96x   1.93x   1.93x   1.84x   1.85x   1.89x   1.87x
8192B   1.91x   1.95x   0.99x   1.97x   1.95x   1.91x   1.86x   1.87x   1.93x   1.90x

256bit key:                                             (lrw:384bit)    (xts:512bit)
size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
16B     0.97x   0.99x   1.02x   1.01x   0.98x   0.99x   1.00x   1.00x   0.98x   0.98x
64B     0.98x   0.99x   1.01x   1.00x   1.00x   1.00x   1.01x   1.01x   0.97x   1.00x
256B    1.77x   1.83x   1.00x   1.86x   1.79x   1.78x   1.70x   1.76x   1.71x   1.69x
1024B   1.92x   1.95x   0.99x   1.96x   1.93x   1.93x   1.83x   1.86x   1.89x   1.87x
8192B   1.94x   1.95x   0.99x   1.97x   1.95x   1.95x   1.87x   1.87x   1.93x   1.91x

Signed-off-by: Johannes Goetzfried <Johannes.Goetzfried@informatik.stud.uni-erlangen.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-08-01 17:47:30 +08:00
..
boot Merge branch 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-07-26 13:13:25 -07:00
configs x86/kconfig: Remove CONFIG_TR=y from the defconfigs 2012-03-24 08:18:03 +01:00
crypto crypto: cast6 - add x86_64/avx assembler implementation 2012-08-01 17:47:30 +08:00
ia32 x86, compat: Use test_thread_flag(TIF_IA32) in compat signal delivery 2012-06-14 18:16:04 -07:00
include/asm Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-07-31 15:34:13 -07:00
kernel Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-07-31 15:34:13 -07:00
kvm KVM updates for the 3.6 merge window 2012-07-24 12:01:20 -07:00
lguest lguest: Make sure interrupt is allocated ok by lguest_setup_irq 2012-01-12 15:44:47 +10:30
lib Merge branch 'x86/cpu' into perf/core 2012-07-05 21:12:11 +02:00
math-emu x86: Rename trap_no to trap_nr in thread_struct 2012-03-13 06:24:09 +01:00
mm Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-07-26 13:17:17 -07:00
net x86 bpf_jit: support BPF_S_ANC_ALU_XOR_X instruction 2012-06-06 09:42:44 -07:00
oprofile perf/x86/amd: Unify AMD's generic and family 15h pmus 2012-07-05 21:19:41 +02:00
pci Merge branch 'pci/myron-pcibios_setup' into next 2012-07-05 15:31:05 -06:00
platform Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-07-26 13:17:17 -07:00
power x86, kvm: Call restore_sched_clock_state() only after %gs is initialized 2012-04-02 13:53:00 +02:00
realmode x86-64, reboot: Be more paranoid in 64-bit reboot=bios 2012-06-21 10:25:03 -07:00
syscalls syscalls, x86: add __NR_kcmp syscall 2012-05-31 17:49:32 -07:00
tools x86/decoder: Fix bsr/bsf/jmpe decoding with operand-size prefix 2012-06-06 08:54:18 +02:00
um x86, um: Correct syscall table type attributes breaking gcc 4.8 2012-06-09 12:51:09 -07:00
vdso x86, cpu: Rename checking_wrmsrl() to wrmsrl_safe() 2012-06-07 13:32:04 -07:00
video x86: Use vga_default_device() when determining whether an fb is primary 2012-04-24 09:50:17 +01:00
xen Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-07-26 13:17:17 -07:00
.gitignore x86/kprobes: Add arch/x86/tools/insn_sanity to .gitignore 2012-01-16 08:21:59 +01:00
Kbuild x86, realmode: realmode.bin infrastructure 2012-05-08 11:41:48 -07:00
Kconfig ipc: use Kconfig options for __ARCH_WANT_[COMPAT_]IPC_PARSE_VERSION 2012-07-30 17:25:21 -07:00
Kconfig.cpu x86: Tighten dependencies of CPU_SUP_*_32 2012-03-08 10:57:34 +01:00
Kconfig.debug x86/tlb: add tlb_flushall_shift knob into debugfs 2012-06-27 19:29:10 -07:00
Makefile x86-64, gcc: Use -mpreferred-stack-boundary=3 if supported 2012-06-23 19:25:22 -07:00
Makefile_32.cpu
Makefile.um um: fix linker script generation 2012-04-09 13:59:00 -04:00