linux/arch/x86/crypto
Tianjia Zhang 5b2efa2bb8 crypto: x86/sm4 - add AES-NI/AVX2/x86_64 implementation
Like the implementation of AESNI/AVX, this patch adds an accelerated
implementation of AESNI/AVX2. In terms of code implementation, by
reusing AESNI/AVX mode-related codes, the amount of code is greatly
reduced. From the benchmark data, it can be seen that when the block
size is 1024, compared to AVX acceleration, the performance achieved
by AVX2 has increased by about 70%, it is also 7.7 times of the pure
software implementation of sm4-generic.

The main algorithm implementation comes from SM4 AES-NI work by
libgcrypt and Markku-Juhani O. Saarinen at:
https://github.com/mjosaarinen/sm4ni

This optimization supports the four modes of SM4, ECB, CBC, CFB,
and CTR. Since CBC and CFB do not support multiple block parallel
encryption, the optimization effect is not obvious.

Benchmark on Intel i5-6200U 2.30GHz, performance data of three
implementation methods, pure software sm4-generic, aesni/avx
acceleration, and aesni/avx2 acceleration, the data comes from
the 218 mode and 518 mode of tcrypt. The abscissas are blocks of
different lengths. The data is tabulated and the unit is Mb/s:

block-size  |    16      64     128     256    1024    1420    4096
sm4-generic
    ECB enc | 60.94   70.41   72.27   73.02   73.87   73.58   73.59
    ECB dec | 61.87   70.53   72.15   73.09   73.89   73.92   73.86
    CBC enc | 56.71   66.31   68.05   69.84   70.02   70.12   70.24
    CBC dec | 54.54   65.91   68.22   69.51   70.63   70.79   70.82
    CFB enc | 57.21   67.24   69.10   70.25   70.73   70.52   71.42
    CFB dec | 57.22   64.74   66.31   67.24   67.40   67.64   67.58
    CTR enc | 59.47   68.64   69.91   71.02   71.86   71.61   71.95
    CTR dec | 59.94   68.77   69.95   71.00   71.84   71.55   71.95
sm4-aesni-avx
    ECB enc | 44.95  177.35  292.06  316.98  339.48  322.27  330.59
    ECB dec | 45.28  178.66  292.31  317.52  339.59  322.52  331.16
    CBC enc | 57.75   67.68   69.72   70.60   71.48   71.63   71.74
    CBC dec | 44.32  176.83  284.32  307.24  328.61  312.61  325.82
    CFB enc | 57.81   67.64   69.63   70.55   71.40   71.35   71.70
    CFB dec | 43.14  167.78  282.03  307.20  328.35  318.24  325.95
    CTR enc | 42.35  163.32  279.11  302.93  320.86  310.56  317.93
    CTR dec | 42.39  162.81  278.49  302.37  321.11  310.33  318.37
sm4-aesni-avx2
    ECB enc | 45.19  177.41  292.42  316.12  339.90  322.53  330.54
    ECB dec | 44.83  178.90  291.45  317.31  339.85  322.55  331.07
    CBC enc | 57.66   67.62   69.73   70.55   71.58   71.66   71.77
    CBC dec | 44.34  176.86  286.10  501.68  559.58  483.87  527.46
    CFB enc | 57.43   67.60   69.61   70.52   71.43   71.28   71.65
    CFB dec | 43.12  167.75  268.09  499.33  558.35  490.36  524.73
    CTR enc | 42.42  163.39  256.17  493.95  552.45  481.58  517.19
    CTR dec | 42.49  163.11  256.36  493.34  552.62  481.49  516.83

Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-08-27 16:30:18 +08:00
..
.gitignore .gitignore: add SPDX License Identifier 2020-03-25 11:50:48 +01:00
aegis128-aesni-asm.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00
aegis128-aesni-glue.c crypto: remove CRYPTO_TFM_RES_BAD_KEY_LEN 2020-01-09 11:30:53 +08:00
aes_ctrby8_avx-x86_64.S crypto: x86 - Remove include/asm/inst.h 2020-07-16 21:49:07 +10:00
aesni-intel_asm.S crypto: x86/aes-ni-xts - rewrite and drop indirections via glue helper 2021-01-08 15:39:47 +11:00
aesni-intel_avx-x86_64.S x86/crypto/aesni-intel_avx: Standardize stack alignment prologue 2021-04-19 12:36:34 -05:00
aesni-intel_glue.c crypto: x86/aes-ni - add missing error checks in XTS code 2021-07-23 14:49:18 +08:00
blake2s-core.S x86: remove always-defined CONFIG_AS_SSSE3 2020-04-09 00:01:59 +09:00
blake2s-glue.c crypto: blake2s - share the "shash" API boilerplate code 2021-01-03 08:41:38 +11:00
blowfish_glue.c crypto: x86/blowfish - drop CTR mode implementation 2021-01-14 17:10:29 +11:00
blowfish-x86_64-asm_64.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00
camellia_aesni_avx2_glue.c crypto: x86 - use local headers for x86 specific shared declarations 2021-01-14 17:10:30 +11:00
camellia_aesni_avx_glue.c crypto: x86 - use local headers for x86 specific shared declarations 2021-01-14 17:10:30 +11:00
camellia_glue.c crypto: x86 - use local headers for x86 specific shared declarations 2021-01-14 17:10:30 +11:00
camellia-aesni-avx2-asm_64.S x86/crypto/camellia-aesni-avx2: Unconditionally allocate stack buffer 2021-04-19 12:36:34 -05:00
camellia-aesni-avx-asm_64.S crypto: x86/camellia - drop CTR mode implementation 2021-01-14 17:10:28 +11:00
camellia-x86_64-asm_64.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00
camellia.h crypto: x86 - use local headers for x86 specific shared declarations 2021-01-14 17:10:30 +11:00
cast5_avx_glue.c crypto: x86/cast5 - drop dependency on glue helper 2021-01-14 17:10:29 +11:00
cast5-avx-x86_64-asm_64.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00
cast6_avx_glue.c crypto: x86/cast6 - drop dependency on glue helper 2021-01-14 17:10:29 +11:00
cast6-avx-x86_64-asm_64.S crypto: x86/cast6 - drop CTR mode implementation 2021-01-14 17:10:28 +11:00
chacha_glue.c crypto: algapi - Remove skbuff.h inclusion 2020-08-20 14:04:28 +10:00
chacha-avx2-x86_64.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00
chacha-avx512vl-x86_64.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00
chacha-ssse3-x86_64.S crypto: x86/chacha-sse3 - use unaligned loads for state array 2020-07-16 21:49:04 +10:00
crc32-pclmul_asm.S crypto: x86 - Remove include/asm/inst.h 2020-07-16 21:49:07 +10:00
crc32-pclmul_glue.c x86: Fix various typos in comments, take #2 2021-03-21 23:50:28 +01:00
crc32c-intel_glue.c crypto: x86/crc32c-intel - Use CRC32 mnemonic 2020-08-21 14:45:28 +10:00
crc32c-pcl-intel-asm_64.S x86/crypto/crc32c-pcl-intel: Standardize jump table 2021-04-19 12:36:34 -05:00
crct10dif-pcl-asm_64.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00
crct10dif-pclmul_glue.c crypto: Convert to new CPU match macros 2020-03-24 21:36:06 +01:00
curve25519-x86_64.c crypto: x86/curve25519 - fix cpu feature checking logic in mod_exit 2021-06-11 15:03:29 +08:00
des3_ede_glue.c crypto: x86/des - drop CTR mode implementation 2021-01-14 17:10:28 +11:00
des3_ede-asm_64.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00
ecb_cbc_helpers.h crypto: x86 - add some helper macros for ECB and CBC modes 2021-01-14 17:10:29 +11:00
ghash-clmulni-intel_asm.S crypto: x86 - Remove include/asm/inst.h 2020-07-16 21:49:07 +10:00
ghash-clmulni-intel_glue.c crypto: Convert to new CPU match macros 2020-03-24 21:36:06 +01:00
glue_helper-asm-avx2.S crypto: x86/glue-helper - drop CTR helper routines 2021-01-14 17:10:28 +11:00
glue_helper-asm-avx.S crypto: x86/glue-helper - drop CTR helper routines 2021-01-14 17:10:28 +11:00
Makefile crypto: x86/sm4 - add AES-NI/AVX2/x86_64 implementation 2021-08-27 16:30:18 +08:00
nh-avx2-x86_64.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00
nh-sse2-x86_64.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00
nhpoly1305-avx2-glue.c crypto: algapi - Remove skbuff.h inclusion 2020-08-20 14:04:28 +10:00
nhpoly1305-sse2-glue.c crypto: algapi - Remove skbuff.h inclusion 2020-08-20 14:04:28 +10:00
poly1305_glue.c crypto: poly1305 - fix poly1305_core_setkey() declaration 2021-04-02 18:28:12 +11:00
poly1305-x86_64-cryptogams.pl crypto: x86/poly1305 - Use TEST %reg,%reg instead of CMP $0,%reg 2020-12-04 18:13:16 +11:00
serpent_avx2_glue.c crypto: x86 - use local headers for x86 specific shared declarations 2021-01-14 17:10:30 +11:00
serpent_avx_glue.c crypto: x86 - use local headers for x86 specific shared declarations 2021-01-14 17:10:30 +11:00
serpent_sse2_glue.c crypto: x86 - use local headers for x86 specific shared declarations 2021-01-14 17:10:30 +11:00
serpent-avx2-asm_64.S crypto: x86/serpent - drop CTR mode implementation 2021-01-14 17:10:28 +11:00
serpent-avx-x86_64-asm_64.S crypto: x86/serpent - drop CTR mode implementation 2021-01-14 17:10:28 +11:00
serpent-avx.h crypto: x86 - use local headers for x86 specific shared declarations 2021-01-14 17:10:30 +11:00
serpent-sse2-i586-asm_32.S x86/asm/32: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 12:03:43 +02:00
serpent-sse2-x86_64-asm_64.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00
serpent-sse2.h crypto: x86 - use local headers for x86 specific shared declarations 2021-01-14 17:10:30 +11:00
sha1_avx2_x86_64_asm.S x86/crypto/sha1_avx2: Standardize stack alignment prologue 2021-04-19 12:36:35 -05:00
sha1_ni_asm.S x86/crypto/sha_ni: Standardize stack alignment prologue 2021-04-19 12:36:35 -05:00
sha1_ssse3_asm.S x86: remove always-defined CONFIG_AS_AVX 2020-04-09 00:01:59 +09:00
sha1_ssse3_glue.c crypto: sha - split sha.h into sha1.h and sha2.h 2020-11-20 14:45:33 +11:00
sha256_ni_asm.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00
sha256_ssse3_glue.c crypto: sha - split sha.h into sha1.h and sha2.h 2020-11-20 14:45:33 +11:00
sha256-avx2-asm.S x86/crypto/sha256-avx2: Standardize stack alignment prologue 2021-04-19 12:36:36 -05:00
sha256-avx-asm.S x86: remove always-defined CONFIG_AS_AVX 2020-04-09 00:01:59 +09:00
sha256-ssse3-asm.S crypto: x86/sha - Eliminate casts on asm implementations 2020-01-22 16:21:08 +08:00
sha512_ssse3_glue.c crypto: sha - split sha.h into sha1.h and sha2.h 2020-11-20 14:45:33 +11:00
sha512-avx2-asm.S x86/crypto/sha512-avx2: Standardize stack alignment prologue 2021-04-19 12:36:36 -05:00
sha512-avx-asm.S x86/crypto/sha512-avx: Standardize stack alignment prologue 2021-04-19 12:36:36 -05:00
sha512-ssse3-asm.S x86/crypto/sha512-ssse3: Standardize stack alignment prologue 2021-04-19 12:36:37 -05:00
sm4_aesni_avx2_glue.c crypto: x86/sm4 - add AES-NI/AVX2/x86_64 implementation 2021-08-27 16:30:18 +08:00
sm4_aesni_avx_glue.c crypto: x86/sm4 - export reusable AESNI/AVX functions 2021-08-27 16:30:18 +08:00
sm4-aesni-avx2-asm_64.S crypto: x86/sm4 - add AES-NI/AVX2/x86_64 implementation 2021-08-27 16:30:18 +08:00
sm4-aesni-avx-asm_64.S crypto: x86/sm4 - add AES-NI/AVX/x86_64 implementation 2021-07-30 10:58:31 +08:00
sm4-avx.h crypto: x86/sm4 - export reusable AESNI/AVX functions 2021-08-27 16:30:18 +08:00
twofish_avx_glue.c crypto: x86 - use local headers for x86 specific shared declarations 2021-01-14 17:10:30 +11:00
twofish_glue_3way.c x86: Fix various typos in comments 2021-03-18 15:31:53 +01:00
twofish_glue.c crypto: prefix module autoloading with "crypto-" 2014-11-24 22:43:57 +08:00
twofish-avx-x86_64-asm_64.S crypto: x86/twofish - drop CTR mode implementation 2021-01-14 17:10:28 +11:00
twofish-i586-asm_32.S x86/asm/32: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 12:03:43 +02:00
twofish-x86_64-asm_64-3way.S x86: Fix various typos in comments, take #2 2021-03-21 23:50:28 +01:00
twofish-x86_64-asm_64.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00
twofish.h crypto: x86 - use local headers for x86 specific shared declarations 2021-01-14 17:10:30 +11:00