linux/arch/arm64/crypto
Ard Biesheuvel 11031c0d7d crypto: arm64/gcm-ce - implement 4 way interleave
To improve performance on cores with deep pipelines such as ThunderX2,
reimplement gcm(aes) using a 4-way interleave rather than the 2-way
interleave we use currently.

This comes down to a complete rewrite of the GCM part of the combined
GCM/GHASH driver, and instead of interleaving two invocations of AES
with the GHASH handling at the instruction level, the new version
uses a more coarse-grained approach where each chunk of 64 bytes is
encrypted first and then ghashed (or ghashed and then decrypted in
the converse case).
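
As a rough illustration of the ordering described above, the sketch below processes input in 64-byte chunks: on encryption each chunk is encrypted and the resulting ciphertext is then hashed, while on decryption the ciphertext is hashed before it is decrypted. This is not the driver's code. toy_keystream() and toy_hash() are hypothetical stand-ins for AES-CTR and GHASH with no cryptographic value, and all function names, the counter start value and the restriction to whole chunks are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLK   16         /* AES block size in bytes */
#define CHUNK (4 * BLK)  /* one 4-way unit: 64 bytes */

/* Hypothetical stand-in for one AES-CTR keystream block. NOT real AES. */
static void toy_keystream(uint8_t ks[BLK], uint32_t ctr)
{
    for (int i = 0; i < BLK; i++)
        ks[i] = (uint8_t)(ctr * 31u + (uint32_t)i * 7u + 1u);
}

/* Hypothetical stand-in for one GHASH update step. NOT real GHASH. */
static void toy_hash(uint8_t dg[BLK], const uint8_t blk[BLK])
{
    for (int i = 0; i < BLK; i++)
        dg[i] = (uint8_t)((dg[i] ^ blk[i]) * 3u + 1u);
}

/* XOR four consecutive blocks with four consecutive keystream blocks. */
static void crypt4(uint8_t *dst, const uint8_t *src, uint32_t ctr)
{
    for (int b = 0; b < 4; b++) {
        uint8_t ks[BLK];

        toy_keystream(ks, ctr + (uint32_t)b);
        for (int i = 0; i < BLK; i++)
            dst[b * BLK + i] = src[b * BLK + i] ^ ks[i];
    }
}

/* Fold four ciphertext blocks into the running digest. */
static void hash4(uint8_t dg[BLK], const uint8_t *ct)
{
    for (int b = 0; b < 4; b++)
        toy_hash(dg, ct + b * BLK);
}

/* Encrypt: each 64-byte chunk is encrypted first, then hashed. */
void toy_gcm_encrypt(uint8_t *dst, const uint8_t *src, size_t len,
                     uint8_t dg[BLK])
{
    uint32_t ctr = 1;

    assert(len % CHUNK == 0);  /* sketch handles whole chunks only */
    for (size_t off = 0; off < len; off += CHUNK, ctr += 4) {
        crypt4(dst + off, src + off, ctr);
        hash4(dg, dst + off);  /* hash the ciphertext just produced */
    }
}

/* Decrypt: each 64-byte chunk is hashed first, then decrypted. */
void toy_gcm_decrypt(uint8_t *dst, const uint8_t *src, size_t len,
                     uint8_t dg[BLK])
{
    uint32_t ctr = 1;

    assert(len % CHUNK == 0);
    for (size_t off = 0; off < len; off += CHUNK, ctr += 4) {
        hash4(dg, src + off);  /* hash before the ciphertext is consumed */
        crypt4(dst + off, src + off, ctr);
    }
}
```

Both directions hash the ciphertext, so a round trip over the same data with a zero-initialised digest yields the same digest on each side. Hashing before decrypting also keeps the sketch correct when dst and src point to the same buffer.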

The core NEON routine is now able to consume inputs of any size,
and tail blocks of less than 64 bytes are handled using overlapping
loads and stores, and processed by the same 4-way encryption and
hashing routines. This gets rid of most of the branches, and avoids
having to return to the C code to handle the tail block using a
stack buffer.
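
The overlapping load/store idea can be sketched in portable C as follows. This is not the NEON routine from the driver: it uses 16-byte windows rather than 64-byte chunks to stay short, works out of place only, and toy_ks_byte() is a hypothetical position-dependent keystream invented for the example. When the length is not a multiple of the window size, the final window is simply moved back so that it ends exactly at the end of the buffer. It then overlaps bytes that were already processed and recomputes the same values for them, so no partial-width path or bounce buffer is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLK 16  /* window size used by this sketch */

/* Hypothetical keystream byte for a given position. NOT AES-CTR. */
static uint8_t toy_ks_byte(size_t pos)
{
    return (uint8_t)(pos * 13u + 5u);
}

/* Process one full-width window starting at byte offset 'off'. */
static void xor_window(uint8_t *dst, const uint8_t *src, size_t off)
{
    uint8_t tmp[BLK];

    memcpy(tmp, src + off, BLK);   /* one full-width load */
    for (int i = 0; i < BLK; i++)
        tmp[i] ^= toy_ks_byte(off + (size_t)i);
    memcpy(dst + off, tmp, BLK);   /* one full-width store */
}

/*
 * XOR 'len' bytes of src into dst using only full-width windows.
 * Assumptions of this sketch: len >= BLK, and dst does not alias src
 * (an in-place version would apply the keystream twice to the overlap).
 */
void xor_overlapping_tail(uint8_t *dst, const uint8_t *src, size_t len)
{
    size_t off = 0;

    assert(len >= BLK);
    assert(dst != src);

    for (; off + BLK <= len; off += BLK)
        xor_window(dst, src, off);

    /* Tail: back the window up so it ends at the last byte. */
    if (off < len)
        xor_window(dst, src, len - BLK);
}
```

For example, with len = 40 the windows start at offsets 0, 16 and 24; bytes 24 to 31 are written twice with identical values, and nothing past byte 39 is touched.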

The table below compares the performance of the old driver and the new
one on various micro-architectures and running in various modes.

        |     AES-128      |     AES-192      |     AES-256      |
 #bytes | 512 | 1500 |  4k | 512 | 1500 |  4k | 512 | 1500 |  4k |
 -------+-----+------+-----+-----+------+-----+-----+------+-----+
    TX2 | 35% |  23% | 11% | 34% |  20% |  9% | 38% |  25% | 16% |
   EMAG | 11% |   6% |  3% | 12% |   4% |  2% | 11% |   4% |  2% |
    A72 |  8% |   5% | -4% |  9% |   4% | -5% |  7% |   4% | -5% |
    A53 | 11% |   6% | -1% | 10% |   8% | -1% | 10% |   8% | -2% |

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-10-05 01:04:31 +10:00
.gitignore crypto: arm64/sha2 - add generated .S files to .gitignore 2016-11-29 16:06:56 +08:00
aes-ce-ccm-core.S treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
aes-ce-ccm-glue.c crypto: arm64/aes-ccm - switch to AES library 2019-07-26 14:56:05 +10:00
aes-ce-core.S treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
aes-ce-glue.c crypto: arm64/aes-ce-cipher - use AES library as fallback 2019-07-26 14:58:09 +10:00
aes-ce-setkey.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
aes-ce.S crypto: arm64/aes-neonbs - implement ciphertext stealing for XTS 2019-09-09 17:35:39 +10:00
aes-cipher-core.S crypto: arm64/aes-cipher - switch to shared AES inverse Sbox 2019-07-26 14:58:37 +10:00
aes-cipher-glue.c crypto: arm64/aes-ce-cipher - use AES library as fallback 2019-07-26 14:58:09 +10:00
aes-glue.c crypto: arm64/aes-neonbs - implement ciphertext stealing for XTS 2019-09-09 17:35:39 +10:00
aes-modes.S crypto: arm64/aes-neonbs - implement ciphertext stealing for XTS 2019-09-09 17:35:39 +10:00
aes-neon.S crypto: arm64/aes-neonbs - implement ciphertext stealing for XTS 2019-09-09 17:35:39 +10:00
aes-neonbs-core.S crypto: arm64/aes-neonbs - replace tweak mask literal with composition 2019-09-09 17:35:28 +10:00
aes-neonbs-glue.c crypto: arm64/aes-neonbs - implement ciphertext stealing for XTS 2019-09-09 17:35:39 +10:00
chacha-neon-core.S crypto: arm64/chacha - fix hchacha_block_neon() for big endian 2019-02-28 14:37:48 +08:00
chacha-neon-glue.c crypto: chacha - constify ctx and iv arguments 2019-06-13 14:31:40 +08:00
crct10dif-ce-core.S crypto: arm64/crct10dif-ce - cleanup and optimizations 2019-02-08 15:29:48 +08:00
crct10dif-ce-glue.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
ghash-ce-core.S crypto: arm64/gcm-ce - implement 4 way interleave 2019-10-05 01:04:31 +10:00
ghash-ce-glue.c crypto: arm64/gcm-ce - implement 4 way interleave 2019-10-05 01:04:31 +10:00
Kconfig crypto: arm64/aes-ce-cipher - use AES library as fallback 2019-07-26 14:58:09 +10:00
Makefile treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
nh-neon-core.S crypto: arm64/nhpoly1305 - add NEON-accelerated NHPoly1305 2018-12-13 18:24:35 +08:00
nhpoly1305-neon-glue.c Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2019-05-06 20:15:06 -07:00
sha1-ce-core.S treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
sha1-ce-glue.c Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2019-07-08 20:57:08 -07:00
sha2-ce-core.S treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
sha2-ce-glue.c Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2019-07-08 20:57:08 -07:00
sha3-ce-core.S crypto: arm64/sha3-ce - yield NEON after every block of input 2018-05-12 00:13:11 +08:00
sha3-ce-glue.c crypto: arm64 - convert to use crypto_simd_usable() 2019-03-22 20:57:27 +08:00
sha256-core.S_shipped crypto: clarify licensing of OpenSSL asm code 2018-05-31 00:13:44 +08:00
sha256-glue.c crypto: arm64 - Rename functions to avoid conflict with crypto/sha256.h 2019-09-05 14:37:30 +10:00
sha512-armv8.pl crypto: clarify licensing of OpenSSL asm code 2018-05-31 00:13:44 +08:00
sha512-ce-core.S crypto: arm64/sha512-ce - yield NEON after every block of input 2018-05-12 00:13:12 +08:00
sha512-ce-glue.c crypto: arm64 - convert to use crypto_simd_usable() 2019-03-22 20:57:27 +08:00
sha512-core.S_shipped crypto: clarify licensing of OpenSSL asm code 2018-05-31 00:13:44 +08:00
sha512-glue.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
sm3-ce-core.S treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
sm3-ce-glue.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
sm4-ce-core.S crypto: arm64 - add support for SM4 encryption using special instructions 2018-05-05 14:52:53 +08:00
sm4-ce-glue.c crypto: arm64 - convert to use crypto_simd_usable() 2019-03-22 20:57:27 +08:00