Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto update from Herbert Xu:
 "API:
   - Restrict crypto_cipher to internal API users only.

  Algorithms:
   - Add x86 aesni acceleration for cts.
   - Improve x86 aesni acceleration for xts.
   - Remove x86 acceleration of some uncommon algorithms.
   - Remove RIPE-MD, Tiger and Salsa20.
   - Remove tnepres.
   - Add ARM acceleration for BLAKE2s and BLAKE2b.

  Drivers:
   - Add Keem Bay OCS HCU driver.
   - Add Marvell OcteonTX2 CPT PF driver.
   - Remove PicoXcell driver.
   - Remove mediatek driver"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (154 commits)
  hwrng: timeriomem - Use device-managed registration API
  crypto: hisilicon/qm - fix printing format issue
  crypto: hisilicon/qm - do not reset hardware when CE happens
  crypto: hisilicon/qm - update irqflag
  crypto: hisilicon/qm - fix the value of 'QM_SQC_VFT_BASE_MASK_V2'
  crypto: hisilicon/qm - fix request missing error
  crypto: hisilicon/qm - removing driver after reset
  crypto: octeontx2 - fix -Wpointer-bool-conversion warning
  crypto: hisilicon/hpre - enable Elliptic curve cryptography
  crypto: hisilicon - PASID fixed on Kunpeng 930
  crypto: hisilicon/qm - fix use of 'dma_map_single'
  crypto: hisilicon/hpre - tiny fix
  crypto: hisilicon/hpre - adapt the number of clusters
  crypto: cpt - remove casting dma_alloc_coherent
  crypto: keembay-ocs-aes - Fix 'q' assignment during CCM B0 generation
  crypto: xor - Fix typo of optimization
  hwrng: optee - Use device-managed registration API
  crypto: arm64/crc-t10dif - move NEON yield to C code
  crypto: arm64/aes-ce-mac - simplify NEON yield
  crypto: arm64/aes-neonbs - remove NEON yield calls
  ...
Linus Torvalds 2021-02-21 17:23:56 -08:00
commit 31caf8b2a8
207 changed files with 13973 additions and 15324 deletions


@ -174,7 +174,6 @@ Juha Yrjola <at solidboot.com>
Juha Yrjola <juha.yrjola@nokia.com>
Juha Yrjola <juha.yrjola@solidboot.com>
Julien Thierry <julien.thierry.kdev@gmail.com> <julien.thierry@arm.com>
Kamil Konieczny <k.konieczny@samsung.com> <k.konieczny@partner.samsung.com>
Kay Sievers <kay.sievers@vrfy.org>
Kees Cook <keescook@chromium.org> <kees.cook@canonical.com>
Kees Cook <keescook@chromium.org> <keescook@google.com>


@ -143,8 +143,8 @@ recalculate
journal_crypt:algorithm(:key) (the key is optional)
Encrypt the journal using given algorithm to make sure that the
attacker can't read the journal. You can use a block cipher here
(such as "cbc(aes)") or a stream cipher (for example "chacha20",
"salsa20" or "ctr(aes)").
(such as "cbc(aes)") or a stream cipher (for example "chacha20"
or "ctr(aes)").
The journal contains history of last writes to the block device,
an attacker reading the journal could see the last sector numbers


@ -28,8 +28,8 @@ Symmetric Key Cipher Request Handle
Single Block Cipher API
-----------------------
.. kernel-doc:: include/linux/crypto.h
.. kernel-doc:: include/crypto/internal/cipher.h
:doc: Single Block Cipher API
.. kernel-doc:: include/linux/crypto.h
.. kernel-doc:: include/crypto/internal/cipher.h
:functions: crypto_alloc_cipher crypto_free_cipher crypto_has_cipher crypto_cipher_blocksize crypto_cipher_setkey crypto_cipher_encrypt_one crypto_cipher_decrypt_one
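For illustration, a minimal sketch (not part of this commit) of what a remaining in-kernel user of the now-internal single block cipher API looks like after this change; the function name demo_encrypt_block is hypothetical, but the header and namespace import match what the drivers converted below use:

#include <crypto/internal/cipher.h>
#include <linux/err.h>
#include <linux/module.h>
#include <linux/types.h>

MODULE_IMPORT_NS(CRYPTO_INTERNAL);

static int demo_encrypt_block(const u8 *key, unsigned int keylen,
			      u8 *dst, const u8 *src)
{
	struct crypto_cipher *tfm;
	int err;

	tfm = crypto_alloc_cipher("aes", 0, 0);
	if (IS_ERR(tfm))
		return PTR_ERR(tfm);

	err = crypto_cipher_setkey(tfm, key, keylen);
	if (!err)
		crypto_cipher_encrypt_one(tfm, dst, src); /* one 16-byte block */

	crypto_free_cipher(tfm);
	return err;
}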


@ -0,0 +1,46 @@
# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
%YAML 1.2
---
$id: http://devicetree.org/schemas/crypto/intel,keembay-ocs-hcu.yaml#
$schema: http://devicetree.org/meta-schemas/core.yaml#
title: Intel Keem Bay OCS HCU Device Tree Bindings
maintainers:
- Declan Murphy <declan.murphy@intel.com>
- Daniele Alessandrelli <daniele.alessandrelli@intel.com>
description:
The Intel Keem Bay Offload and Crypto Subsystem (OCS) Hash Control Unit (HCU)
provides hardware-accelerated hashing and HMAC.
properties:
compatible:
const: intel,keembay-ocs-hcu
reg:
maxItems: 1
interrupts:
maxItems: 1
clocks:
maxItems: 1
required:
- compatible
- reg
- interrupts
- clocks
additionalProperties: false
examples:
- |
#include <dt-bindings/interrupt-controller/arm-gic.h>
crypto@3000b000 {
compatible = "intel,keembay-ocs-hcu";
reg = <0x3000b000 0x1000>;
interrupts = <GIC_SPI 121 IRQ_TYPE_LEVEL_HIGH>;
clocks = <&scmi_clk 94>;
};


@ -8,7 +8,6 @@ title: Samsung Exynos SoC SlimSSS (Slim Security SubSystem) module
maintainers:
- Krzysztof Kozlowski <krzk@kernel.org>
- Kamil Konieczny <k.konieczny@partner.samsung.com>
description: |+
The SlimSSS module in Exynos5433 SoC supports the following:


@ -8,7 +8,6 @@ title: Samsung Exynos SoC SSS (Security SubSystem) module
maintainers:
- Krzysztof Kozlowski <krzk@kernel.org>
- Kamil Konieczny <k.konieczny@partner.samsung.com>
description: |+
The SSS module in S5PV210 SoC supports the following:


@ -9032,6 +9032,17 @@ F: drivers/crypto/keembay/keembay-ocs-aes-core.c
F: drivers/crypto/keembay/ocs-aes.c
F: drivers/crypto/keembay/ocs-aes.h
INTEL KEEM BAY OCS HCU CRYPTO DRIVER
M: Daniele Alessandrelli <daniele.alessandrelli@intel.com>
M: Declan Murphy <declan.murphy@intel.com>
S: Maintained
F: Documentation/devicetree/bindings/crypto/intel,keembay-ocs-hcu.yaml
F: drivers/crypto/keembay/Kconfig
F: drivers/crypto/keembay/Makefile
F: drivers/crypto/keembay/keembay-ocs-hcu-core.c
F: drivers/crypto/keembay/ocs-hcu.c
F: drivers/crypto/keembay/ocs-hcu.h
INTEL MANAGEMENT ENGINE (mei)
M: Tomas Winkler <tomas.winkler@intel.com>
L: linux-kernel@vger.kernel.org
@ -15683,7 +15694,6 @@ F: drivers/media/i2c/s5k5baf.c
SAMSUNG S5P Security SubSystem (SSS) DRIVER
M: Krzysztof Kozlowski <krzk@kernel.org>
M: Vladimir Zapolskiy <vz@mleia.com>
M: Kamil Konieczny <k.konieczny@samsung.com>
L: linux-crypto@vger.kernel.org
L: linux-samsung-soc@vger.kernel.org
S: Maintained


@ -62,6 +62,25 @@ config CRYPTO_SHA512_ARM
SHA-512 secure hash standard (DFIPS 180-2) implemented
using optimized ARM assembler and NEON, when available.
config CRYPTO_BLAKE2S_ARM
tristate "BLAKE2s digest algorithm (ARM)"
select CRYPTO_ARCH_HAVE_LIB_BLAKE2S
help
BLAKE2s digest algorithm optimized with ARM scalar instructions. This
is faster than the generic implementations of BLAKE2s and BLAKE2b, but
slower than the NEON implementation of BLAKE2b. (There is no NEON
implementation of BLAKE2s, since NEON doesn't really help with it.)
config CRYPTO_BLAKE2B_NEON
tristate "BLAKE2b digest algorithm (ARM NEON)"
depends on KERNEL_MODE_NEON
select CRYPTO_BLAKE2B
help
BLAKE2b digest algorithm optimized with ARM NEON instructions.
On ARM processors that have NEON support but not the ARMv8
Crypto Extensions, typically this BLAKE2b implementation is
much faster than SHA-2 and slightly faster than SHA-1.
config CRYPTO_AES_ARM
tristate "Scalar AES cipher for ARM"
select CRYPTO_ALGAPI


@ -9,6 +9,8 @@ obj-$(CONFIG_CRYPTO_SHA1_ARM) += sha1-arm.o
obj-$(CONFIG_CRYPTO_SHA1_ARM_NEON) += sha1-arm-neon.o
obj-$(CONFIG_CRYPTO_SHA256_ARM) += sha256-arm.o
obj-$(CONFIG_CRYPTO_SHA512_ARM) += sha512-arm.o
obj-$(CONFIG_CRYPTO_BLAKE2S_ARM) += blake2s-arm.o
obj-$(CONFIG_CRYPTO_BLAKE2B_NEON) += blake2b-neon.o
obj-$(CONFIG_CRYPTO_CHACHA20_NEON) += chacha-neon.o
obj-$(CONFIG_CRYPTO_POLY1305_ARM) += poly1305-arm.o
obj-$(CONFIG_CRYPTO_NHPOLY1305_NEON) += nhpoly1305-neon.o
@ -29,6 +31,8 @@ sha256-arm-neon-$(CONFIG_KERNEL_MODE_NEON) := sha256_neon_glue.o
sha256-arm-y := sha256-core.o sha256_glue.o $(sha256-arm-neon-y)
sha512-arm-neon-$(CONFIG_KERNEL_MODE_NEON) := sha512-neon-glue.o
sha512-arm-y := sha512-core.o sha512-glue.o $(sha512-arm-neon-y)
blake2s-arm-y := blake2s-core.o blake2s-glue.o
blake2b-neon-y := blake2b-neon-core.o blake2b-neon-glue.o
sha1-arm-ce-y := sha1-ce-core.o sha1-ce-glue.o
sha2-arm-ce-y := sha2-ce-core.o sha2-ce-glue.o
aes-arm-ce-y := aes-ce-core.o aes-ce-glue.o


@ -9,6 +9,7 @@
#include <asm/simd.h>
#include <crypto/aes.h>
#include <crypto/ctr.h>
#include <crypto/internal/cipher.h>
#include <crypto/internal/simd.h>
#include <crypto/internal/skcipher.h>
#include <crypto/scatterwalk.h>
@ -23,6 +24,8 @@ MODULE_ALIAS_CRYPTO("cbc(aes)-all");
MODULE_ALIAS_CRYPTO("ctr(aes)");
MODULE_ALIAS_CRYPTO("xts(aes)");
MODULE_IMPORT_NS(CRYPTO_INTERNAL);
asmlinkage void aesbs_convert_key(u8 out[], u32 const rk[], int rounds);
asmlinkage void aesbs_ecb_encrypt(u8 out[], u8 const in[], u8 const rk[],


@ -0,0 +1,347 @@
/* SPDX-License-Identifier: GPL-2.0-or-later */
/*
* BLAKE2b digest algorithm, NEON accelerated
*
* Copyright 2020 Google LLC
*
* Author: Eric Biggers <ebiggers@google.com>
*/
#include <linux/linkage.h>
.text
.fpu neon
// The arguments to blake2b_compress_neon()
STATE .req r0
BLOCK .req r1
NBLOCKS .req r2
INC .req r3
// Pointers to the rotation tables
ROR24_TABLE .req r4
ROR16_TABLE .req r5
// The original stack pointer
ORIG_SP .req r6
// NEON registers which contain the message words of the current block.
// M_0-M_3 are occasionally used for other purposes too.
M_0 .req d16
M_1 .req d17
M_2 .req d18
M_3 .req d19
M_4 .req d20
M_5 .req d21
M_6 .req d22
M_7 .req d23
M_8 .req d24
M_9 .req d25
M_10 .req d26
M_11 .req d27
M_12 .req d28
M_13 .req d29
M_14 .req d30
M_15 .req d31
.align 4
// Tables for computing ror64(x, 24) and ror64(x, 16) using the vtbl.8
// instruction. This is the most efficient way to implement these
// rotation amounts with NEON. (On Cortex-A53 it's the same speed as
// vshr.u64 + vsli.u64, while on Cortex-A7 it's faster.)
.Lror24_table:
.byte 3, 4, 5, 6, 7, 0, 1, 2
.Lror16_table:
.byte 2, 3, 4, 5, 6, 7, 0, 1
// The BLAKE2b initialization vector
.Lblake2b_IV:
.quad 0x6a09e667f3bcc908, 0xbb67ae8584caa73b
.quad 0x3c6ef372fe94f82b, 0xa54ff53a5f1d36f1
.quad 0x510e527fade682d1, 0x9b05688c2b3e6c1f
.quad 0x1f83d9abfb41bd6b, 0x5be0cd19137e2179
// Execute one round of BLAKE2b by updating the state matrix v[0..15] in the
// NEON registers q0-q7. The message block is in q8..q15 (M_0-M_15). The stack
// pointer points to a 32-byte aligned buffer containing a copy of q8 and q9
// (M_0-M_3), so that they can be reloaded if they are used as temporary
// registers. The macro arguments s0-s15 give the order in which the message
// words are used in this round. 'final' is 1 if this is the final round.
.macro _blake2b_round s0, s1, s2, s3, s4, s5, s6, s7, \
s8, s9, s10, s11, s12, s13, s14, s15, final=0
// Mix the columns:
// (v[0], v[4], v[8], v[12]), (v[1], v[5], v[9], v[13]),
// (v[2], v[6], v[10], v[14]), and (v[3], v[7], v[11], v[15]).
// a += b + m[blake2b_sigma[r][2*i + 0]];
vadd.u64 q0, q0, q2
vadd.u64 q1, q1, q3
vadd.u64 d0, d0, M_\s0
vadd.u64 d1, d1, M_\s2
vadd.u64 d2, d2, M_\s4
vadd.u64 d3, d3, M_\s6
// d = ror64(d ^ a, 32);
veor q6, q6, q0
veor q7, q7, q1
vrev64.32 q6, q6
vrev64.32 q7, q7
// c += d;
vadd.u64 q4, q4, q6
vadd.u64 q5, q5, q7
// b = ror64(b ^ c, 24);
vld1.8 {M_0}, [ROR24_TABLE, :64]
veor q2, q2, q4
veor q3, q3, q5
vtbl.8 d4, {d4}, M_0
vtbl.8 d5, {d5}, M_0
vtbl.8 d6, {d6}, M_0
vtbl.8 d7, {d7}, M_0
// a += b + m[blake2b_sigma[r][2*i + 1]];
//
// M_0 got clobbered above, so we have to reload it if any of the four
// message words this step needs happens to be M_0. Otherwise we don't
// need to reload it here, as it will just get clobbered again below.
.if \s1 == 0 || \s3 == 0 || \s5 == 0 || \s7 == 0
vld1.8 {M_0}, [sp, :64]
.endif
vadd.u64 q0, q0, q2
vadd.u64 q1, q1, q3
vadd.u64 d0, d0, M_\s1
vadd.u64 d1, d1, M_\s3
vadd.u64 d2, d2, M_\s5
vadd.u64 d3, d3, M_\s7
// d = ror64(d ^ a, 16);
vld1.8 {M_0}, [ROR16_TABLE, :64]
veor q6, q6, q0
veor q7, q7, q1
vtbl.8 d12, {d12}, M_0
vtbl.8 d13, {d13}, M_0
vtbl.8 d14, {d14}, M_0
vtbl.8 d15, {d15}, M_0
// c += d;
vadd.u64 q4, q4, q6
vadd.u64 q5, q5, q7
// b = ror64(b ^ c, 63);
//
// This rotation amount isn't a multiple of 8, so it has to be
// implemented using a pair of shifts, which requires temporary
// registers. Use q8-q9 (M_0-M_3) for this, and reload them afterwards.
veor q8, q2, q4
veor q9, q3, q5
vshr.u64 q2, q8, #63
vshr.u64 q3, q9, #63
vsli.u64 q2, q8, #1
vsli.u64 q3, q9, #1
vld1.8 {q8-q9}, [sp, :256]
// Mix the diagonals:
// (v[0], v[5], v[10], v[15]), (v[1], v[6], v[11], v[12]),
// (v[2], v[7], v[8], v[13]), and (v[3], v[4], v[9], v[14]).
//
// There are two possible ways to do this: use 'vext' instructions to
// shift the rows of the matrix so that the diagonals become columns,
// and undo it afterwards; or just use 64-bit operations on 'd'
// registers instead of 128-bit operations on 'q' registers. We use the
// latter approach, as it performs much better on Cortex-A7.
// a += b + m[blake2b_sigma[r][2*i + 0]];
vadd.u64 d0, d0, d5
vadd.u64 d1, d1, d6
vadd.u64 d2, d2, d7
vadd.u64 d3, d3, d4
vadd.u64 d0, d0, M_\s8
vadd.u64 d1, d1, M_\s10
vadd.u64 d2, d2, M_\s12
vadd.u64 d3, d3, M_\s14
// d = ror64(d ^ a, 32);
veor d15, d15, d0
veor d12, d12, d1
veor d13, d13, d2
veor d14, d14, d3
vrev64.32 d15, d15
vrev64.32 d12, d12
vrev64.32 d13, d13
vrev64.32 d14, d14
// c += d;
vadd.u64 d10, d10, d15
vadd.u64 d11, d11, d12
vadd.u64 d8, d8, d13
vadd.u64 d9, d9, d14
// b = ror64(b ^ c, 24);
vld1.8 {M_0}, [ROR24_TABLE, :64]
veor d5, d5, d10
veor d6, d6, d11
veor d7, d7, d8
veor d4, d4, d9
vtbl.8 d5, {d5}, M_0
vtbl.8 d6, {d6}, M_0
vtbl.8 d7, {d7}, M_0
vtbl.8 d4, {d4}, M_0
// a += b + m[blake2b_sigma[r][2*i + 1]];
.if \s9 == 0 || \s11 == 0 || \s13 == 0 || \s15 == 0
vld1.8 {M_0}, [sp, :64]
.endif
vadd.u64 d0, d0, d5
vadd.u64 d1, d1, d6
vadd.u64 d2, d2, d7
vadd.u64 d3, d3, d4
vadd.u64 d0, d0, M_\s9
vadd.u64 d1, d1, M_\s11
vadd.u64 d2, d2, M_\s13
vadd.u64 d3, d3, M_\s15
// d = ror64(d ^ a, 16);
vld1.8 {M_0}, [ROR16_TABLE, :64]
veor d15, d15, d0
veor d12, d12, d1
veor d13, d13, d2
veor d14, d14, d3
vtbl.8 d12, {d12}, M_0
vtbl.8 d13, {d13}, M_0
vtbl.8 d14, {d14}, M_0
vtbl.8 d15, {d15}, M_0
// c += d;
vadd.u64 d10, d10, d15
vadd.u64 d11, d11, d12
vadd.u64 d8, d8, d13
vadd.u64 d9, d9, d14
// b = ror64(b ^ c, 63);
veor d16, d4, d9
veor d17, d5, d10
veor d18, d6, d11
veor d19, d7, d8
vshr.u64 q2, q8, #63
vshr.u64 q3, q9, #63
vsli.u64 q2, q8, #1
vsli.u64 q3, q9, #1
// Reloading q8-q9 can be skipped on the final round.
.if ! \final
vld1.8 {q8-q9}, [sp, :256]
.endif
.endm
//
// void blake2b_compress_neon(struct blake2b_state *state,
// const u8 *block, size_t nblocks, u32 inc);
//
// Only the first three fields of struct blake2b_state are used:
// u64 h[8]; (inout)
// u64 t[2]; (inout)
// u64 f[2]; (in)
//
.align 5
ENTRY(blake2b_compress_neon)
push {r4-r10}
// Allocate a 32-byte stack buffer that is 32-byte aligned.
mov ORIG_SP, sp
sub ip, sp, #32
bic ip, ip, #31
mov sp, ip
adr ROR24_TABLE, .Lror24_table
adr ROR16_TABLE, .Lror16_table
mov ip, STATE
vld1.64 {q0-q1}, [ip]! // Load h[0..3]
vld1.64 {q2-q3}, [ip]! // Load h[4..7]
.Lnext_block:
adr r10, .Lblake2b_IV
vld1.64 {q14-q15}, [ip] // Load t[0..1] and f[0..1]
vld1.64 {q4-q5}, [r10]! // Load IV[0..3]
vmov r7, r8, d28 // Copy t[0] to (r7, r8)
vld1.64 {q6-q7}, [r10] // Load IV[4..7]
adds r7, r7, INC // Increment counter
bcs .Lslow_inc_ctr
vmov.i32 d28[0], r7
vst1.64 {d28}, [ip] // Update t[0]
.Linc_ctr_done:
// Load the next message block and finish initializing the state matrix
// 'v'. Fortunately, there are exactly enough NEON registers to fit the
// entire state matrix in q0-q7 and the entire message block in q8-15.
//
// However, _blake2b_round also needs some extra registers for rotates,
// so we have to spill some registers. It's better to spill the message
// registers than the state registers, as the message doesn't change.
// Therefore we store a copy of the first 32 bytes of the message block
// (q8-q9) in an aligned buffer on the stack so that they can be
// reloaded when needed. (We could just reload directly from the
// message buffer, but it's faster to use aligned loads.)
vld1.8 {q8-q9}, [BLOCK]!
veor q6, q6, q14 // v[12..13] = IV[4..5] ^ t[0..1]
vld1.8 {q10-q11}, [BLOCK]!
veor q7, q7, q15 // v[14..15] = IV[6..7] ^ f[0..1]
vld1.8 {q12-q13}, [BLOCK]!
vst1.8 {q8-q9}, [sp, :256]
mov ip, STATE
vld1.8 {q14-q15}, [BLOCK]!
// Execute the rounds. Each round is provided the order in which it
// needs to use the message words.
_blake2b_round 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
_blake2b_round 14, 10, 4, 8, 9, 15, 13, 6, 1, 12, 0, 2, 11, 7, 5, 3
_blake2b_round 11, 8, 12, 0, 5, 2, 15, 13, 10, 14, 3, 6, 7, 1, 9, 4
_blake2b_round 7, 9, 3, 1, 13, 12, 11, 14, 2, 6, 5, 10, 4, 0, 15, 8
_blake2b_round 9, 0, 5, 7, 2, 4, 10, 15, 14, 1, 11, 12, 6, 8, 3, 13
_blake2b_round 2, 12, 6, 10, 0, 11, 8, 3, 4, 13, 7, 5, 15, 14, 1, 9
_blake2b_round 12, 5, 1, 15, 14, 13, 4, 10, 0, 7, 6, 3, 9, 2, 8, 11
_blake2b_round 13, 11, 7, 14, 12, 1, 3, 9, 5, 0, 15, 4, 8, 6, 2, 10
_blake2b_round 6, 15, 14, 9, 11, 3, 0, 8, 12, 2, 13, 7, 1, 4, 10, 5
_blake2b_round 10, 2, 8, 4, 7, 6, 1, 5, 15, 11, 9, 14, 3, 12, 13, 0
_blake2b_round 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
_blake2b_round 14, 10, 4, 8, 9, 15, 13, 6, 1, 12, 0, 2, 11, 7, 5, 3 \
final=1
// Fold the final state matrix into the hash chaining value:
//
// for (i = 0; i < 8; i++)
// h[i] ^= v[i] ^ v[i + 8];
//
vld1.64 {q8-q9}, [ip]! // Load old h[0..3]
veor q0, q0, q4 // v[0..1] ^= v[8..9]
veor q1, q1, q5 // v[2..3] ^= v[10..11]
vld1.64 {q10-q11}, [ip] // Load old h[4..7]
veor q2, q2, q6 // v[4..5] ^= v[12..13]
veor q3, q3, q7 // v[6..7] ^= v[14..15]
veor q0, q0, q8 // v[0..1] ^= h[0..1]
veor q1, q1, q9 // v[2..3] ^= h[2..3]
mov ip, STATE
subs NBLOCKS, NBLOCKS, #1 // nblocks--
vst1.64 {q0-q1}, [ip]! // Store new h[0..3]
veor q2, q2, q10 // v[4..5] ^= h[4..5]
veor q3, q3, q11 // v[6..7] ^= h[6..7]
vst1.64 {q2-q3}, [ip]! // Store new h[4..7]
// Advance to the next block, if there is one.
bne .Lnext_block // nblocks != 0?
mov sp, ORIG_SP
pop {r4-r10}
mov pc, lr
.Lslow_inc_ctr:
// Handle the case where the counter overflowed its low 32 bits, by
// carrying the overflow bit into the full 128-bit counter.
vmov r9, r10, d29
adcs r8, r8, #0
adcs r9, r9, #0
adc r10, r10, #0
vmov d28, r7, r8
vmov d29, r9, r10
vst1.64 {q14}, [ip] // Update t[0] and t[1]
b .Linc_ctr_done
ENDPROC(blake2b_compress_neon)
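As a side note, a small hypothetical C self-check of the byte-permutation trick described above (demo_ror24_is_byte_rotation is illustrative only): rotating a 64-bit word right by 24 bits permutes its little-endian bytes exactly as .Lror24_table specifies, which is why a single vtbl.8 lookup implements the rotation.

#include <asm/unaligned.h>
#include <linux/bitops.h>
#include <linux/types.h>

static bool demo_ror24_is_byte_rotation(u64 x)
{
	u8 in[8], out[8];
	int i;

	put_unaligned_le64(x, in);		/* in[i] = byte i of x */
	for (i = 0; i < 8; i++)			/* .Lror24_table: 3,4,5,6,7,0,1,2 */
		out[i] = in[(i + 3) % 8];
	return get_unaligned_le64(out) == ror64(x, 24);
}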


@ -0,0 +1,105 @@
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* BLAKE2b digest algorithm, NEON accelerated
*
* Copyright 2020 Google LLC
*/
#include <crypto/internal/blake2b.h>
#include <crypto/internal/hash.h>
#include <crypto/internal/simd.h>
#include <linux/module.h>
#include <linux/sizes.h>
#include <asm/neon.h>
#include <asm/simd.h>
asmlinkage void blake2b_compress_neon(struct blake2b_state *state,
const u8 *block, size_t nblocks, u32 inc);
static void blake2b_compress_arch(struct blake2b_state *state,
const u8 *block, size_t nblocks, u32 inc)
{
if (!crypto_simd_usable()) {
blake2b_compress_generic(state, block, nblocks, inc);
return;
}
do {
const size_t blocks = min_t(size_t, nblocks,
SZ_4K / BLAKE2B_BLOCK_SIZE);
kernel_neon_begin();
blake2b_compress_neon(state, block, blocks, inc);
kernel_neon_end();
nblocks -= blocks;
block += blocks * BLAKE2B_BLOCK_SIZE;
} while (nblocks);
}
static int crypto_blake2b_update_neon(struct shash_desc *desc,
const u8 *in, unsigned int inlen)
{
return crypto_blake2b_update(desc, in, inlen, blake2b_compress_arch);
}
static int crypto_blake2b_final_neon(struct shash_desc *desc, u8 *out)
{
return crypto_blake2b_final(desc, out, blake2b_compress_arch);
}
#define BLAKE2B_ALG(name, driver_name, digest_size) \
{ \
.base.cra_name = name, \
.base.cra_driver_name = driver_name, \
.base.cra_priority = 200, \
.base.cra_flags = CRYPTO_ALG_OPTIONAL_KEY, \
.base.cra_blocksize = BLAKE2B_BLOCK_SIZE, \
.base.cra_ctxsize = sizeof(struct blake2b_tfm_ctx), \
.base.cra_module = THIS_MODULE, \
.digestsize = digest_size, \
.setkey = crypto_blake2b_setkey, \
.init = crypto_blake2b_init, \
.update = crypto_blake2b_update_neon, \
.final = crypto_blake2b_final_neon, \
.descsize = sizeof(struct blake2b_state), \
}
static struct shash_alg blake2b_neon_algs[] = {
BLAKE2B_ALG("blake2b-160", "blake2b-160-neon", BLAKE2B_160_HASH_SIZE),
BLAKE2B_ALG("blake2b-256", "blake2b-256-neon", BLAKE2B_256_HASH_SIZE),
BLAKE2B_ALG("blake2b-384", "blake2b-384-neon", BLAKE2B_384_HASH_SIZE),
BLAKE2B_ALG("blake2b-512", "blake2b-512-neon", BLAKE2B_512_HASH_SIZE),
};
static int __init blake2b_neon_mod_init(void)
{
if (!(elf_hwcap & HWCAP_NEON))
return -ENODEV;
return crypto_register_shashes(blake2b_neon_algs,
ARRAY_SIZE(blake2b_neon_algs));
}
static void __exit blake2b_neon_mod_exit(void)
{
return crypto_unregister_shashes(blake2b_neon_algs,
ARRAY_SIZE(blake2b_neon_algs));
}
module_init(blake2b_neon_mod_init);
module_exit(blake2b_neon_mod_exit);
MODULE_DESCRIPTION("BLAKE2b digest algorithm, NEON accelerated");
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Eric Biggers <ebiggers@google.com>");
MODULE_ALIAS_CRYPTO("blake2b-160");
MODULE_ALIAS_CRYPTO("blake2b-160-neon");
MODULE_ALIAS_CRYPTO("blake2b-256");
MODULE_ALIAS_CRYPTO("blake2b-256-neon");
MODULE_ALIAS_CRYPTO("blake2b-384");
MODULE_ALIAS_CRYPTO("blake2b-384-neon");
MODULE_ALIAS_CRYPTO("blake2b-512");
MODULE_ALIAS_CRYPTO("blake2b-512-neon");


@ -0,0 +1,285 @@
/* SPDX-License-Identifier: GPL-2.0-or-later */
/*
* BLAKE2s digest algorithm, ARM scalar implementation
*
* Copyright 2020 Google LLC
*
* Author: Eric Biggers <ebiggers@google.com>
*/
#include <linux/linkage.h>
// Registers used to hold message words temporarily. There aren't
// enough ARM registers to hold the whole message block, so we have to
// load the words on-demand.
M_0 .req r12
M_1 .req r14
// The BLAKE2s initialization vector
.Lblake2s_IV:
.word 0x6A09E667, 0xBB67AE85, 0x3C6EF372, 0xA54FF53A
.word 0x510E527F, 0x9B05688C, 0x1F83D9AB, 0x5BE0CD19
.macro __ldrd a, b, src, offset
#if __LINUX_ARM_ARCH__ >= 6
ldrd \a, \b, [\src, #\offset]
#else
ldr \a, [\src, #\offset]
ldr \b, [\src, #\offset + 4]
#endif
.endm
.macro __strd a, b, dst, offset
#if __LINUX_ARM_ARCH__ >= 6
strd \a, \b, [\dst, #\offset]
#else
str \a, [\dst, #\offset]
str \b, [\dst, #\offset + 4]
#endif
.endm
// Execute a quarter-round of BLAKE2s by mixing two columns or two diagonals.
// (a0, b0, c0, d0) and (a1, b1, c1, d1) give the registers containing the two
// columns/diagonals. s0-s1 are the word offsets to the message words the first
// column/diagonal needs, and likewise s2-s3 for the second column/diagonal.
// M_0 and M_1 are free to use, and the message block can be found at sp + 32.
//
// Note that to save instructions, the rotations don't happen when the
// pseudocode says they should, but rather they are delayed until the values are
// used. See the comment above _blake2s_round().
.macro _blake2s_quarterround a0, b0, c0, d0, a1, b1, c1, d1, s0, s1, s2, s3
ldr M_0, [sp, #32 + 4 * \s0]
ldr M_1, [sp, #32 + 4 * \s2]
// a += b + m[blake2s_sigma[r][2*i + 0]];
add \a0, \a0, \b0, ror #brot
add \a1, \a1, \b1, ror #brot
add \a0, \a0, M_0
add \a1, \a1, M_1
// d = ror32(d ^ a, 16);
eor \d0, \a0, \d0, ror #drot
eor \d1, \a1, \d1, ror #drot
// c += d;
add \c0, \c0, \d0, ror #16
add \c1, \c1, \d1, ror #16
// b = ror32(b ^ c, 12);
eor \b0, \c0, \b0, ror #brot
eor \b1, \c1, \b1, ror #brot
ldr M_0, [sp, #32 + 4 * \s1]
ldr M_1, [sp, #32 + 4 * \s3]
// a += b + m[blake2s_sigma[r][2*i + 1]];
add \a0, \a0, \b0, ror #12
add \a1, \a1, \b1, ror #12
add \a0, \a0, M_0
add \a1, \a1, M_1
// d = ror32(d ^ a, 8);
eor \d0, \a0, \d0, ror#16
eor \d1, \a1, \d1, ror#16
// c += d;
add \c0, \c0, \d0, ror#8
add \c1, \c1, \d1, ror#8
// b = ror32(b ^ c, 7);
eor \b0, \c0, \b0, ror#12
eor \b1, \c1, \b1, ror#12
.endm
// Execute one round of BLAKE2s by updating the state matrix v[0..15]. v[0..9]
// are in r0..r9. The stack pointer points to 8 bytes of scratch space for
// spilling v[8..9], then to v[9..15], then to the message block. r10-r12 and
// r14 are free to use. The macro arguments s0-s15 give the order in which the
// message words are used in this round.
//
// All rotates are performed using the implicit rotate operand accepted by the
// 'add' and 'eor' instructions. This is faster than using explicit rotate
// instructions. To make this work, we allow the values in the second and last
// rows of the BLAKE2s state matrix (rows 'b' and 'd') to temporarily have the
// wrong rotation amount. The rotation amount is then fixed up just in time
// when the values are used. 'brot' is the number of bits the values in row 'b'
// need to be rotated right to arrive at the correct values, and 'drot'
// similarly for row 'd'. (brot, drot) start out as (0, 0) but we make it such
// that they end up as (7, 8) after every round.
.macro _blake2s_round s0, s1, s2, s3, s4, s5, s6, s7, \
s8, s9, s10, s11, s12, s13, s14, s15
// Mix first two columns:
// (v[0], v[4], v[8], v[12]) and (v[1], v[5], v[9], v[13]).
__ldrd r10, r11, sp, 16 // load v[12] and v[13]
_blake2s_quarterround r0, r4, r8, r10, r1, r5, r9, r11, \
\s0, \s1, \s2, \s3
__strd r8, r9, sp, 0
__strd r10, r11, sp, 16
// Mix second two columns:
// (v[2], v[6], v[10], v[14]) and (v[3], v[7], v[11], v[15]).
__ldrd r8, r9, sp, 8 // load v[10] and v[11]
__ldrd r10, r11, sp, 24 // load v[14] and v[15]
_blake2s_quarterround r2, r6, r8, r10, r3, r7, r9, r11, \
\s4, \s5, \s6, \s7
str r10, [sp, #24] // store v[14]
// v[10], v[11], and v[15] are used below, so no need to store them yet.
.set brot, 7
.set drot, 8
// Mix first two diagonals:
// (v[0], v[5], v[10], v[15]) and (v[1], v[6], v[11], v[12]).
ldr r10, [sp, #16] // load v[12]
_blake2s_quarterround r0, r5, r8, r11, r1, r6, r9, r10, \
\s8, \s9, \s10, \s11
__strd r8, r9, sp, 8
str r11, [sp, #28]
str r10, [sp, #16]
// Mix second two diagonals:
// (v[2], v[7], v[8], v[13]) and (v[3], v[4], v[9], v[14]).
__ldrd r8, r9, sp, 0 // load v[8] and v[9]
__ldrd r10, r11, sp, 20 // load v[13] and v[14]
_blake2s_quarterround r2, r7, r8, r10, r3, r4, r9, r11, \
\s12, \s13, \s14, \s15
__strd r10, r11, sp, 20
.endm
//
// void blake2s_compress_arch(struct blake2s_state *state,
// const u8 *block, size_t nblocks, u32 inc);
//
// Only the first three fields of struct blake2s_state are used:
// u32 h[8]; (inout)
// u32 t[2]; (inout)
// u32 f[2]; (in)
//
.align 5
ENTRY(blake2s_compress_arch)
push {r0-r2,r4-r11,lr} // keep this an even number
.Lnext_block:
// r0 is 'state'
// r1 is 'block'
// r3 is 'inc'
// Load and increment the counter t[0..1].
__ldrd r10, r11, r0, 32
adds r10, r10, r3
adc r11, r11, #0
__strd r10, r11, r0, 32
// _blake2s_round is very short on registers, so copy the message block
// to the stack to save a register during the rounds. This also has the
// advantage that misalignment only needs to be dealt with in one place.
sub sp, sp, #64
mov r12, sp
tst r1, #3
bne .Lcopy_block_misaligned
ldmia r1!, {r2-r9}
stmia r12!, {r2-r9}
ldmia r1!, {r2-r9}
stmia r12, {r2-r9}
.Lcopy_block_done:
str r1, [sp, #68] // Update message pointer
// Calculate v[8..15]. Push v[9..15] onto the stack, and leave space
// for spilling v[8..9]. Leave v[8..9] in r8-r9.
mov r14, r0 // r14 = state
adr r12, .Lblake2s_IV
ldmia r12!, {r8-r9} // load IV[0..1]
__ldrd r0, r1, r14, 40 // load f[0..1]
ldm r12, {r2-r7} // load IV[3..7]
eor r4, r4, r10 // v[12] = IV[4] ^ t[0]
eor r5, r5, r11 // v[13] = IV[5] ^ t[1]
eor r6, r6, r0 // v[14] = IV[6] ^ f[0]
eor r7, r7, r1 // v[15] = IV[7] ^ f[1]
push {r2-r7} // push v[9..15]
sub sp, sp, #8 // leave space for v[8..9]
// Load h[0..7] == v[0..7].
ldm r14, {r0-r7}
// Execute the rounds. Each round is provided the order in which it
// needs to use the message words.
.set brot, 0
.set drot, 0
_blake2s_round 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
_blake2s_round 14, 10, 4, 8, 9, 15, 13, 6, 1, 12, 0, 2, 11, 7, 5, 3
_blake2s_round 11, 8, 12, 0, 5, 2, 15, 13, 10, 14, 3, 6, 7, 1, 9, 4
_blake2s_round 7, 9, 3, 1, 13, 12, 11, 14, 2, 6, 5, 10, 4, 0, 15, 8
_blake2s_round 9, 0, 5, 7, 2, 4, 10, 15, 14, 1, 11, 12, 6, 8, 3, 13
_blake2s_round 2, 12, 6, 10, 0, 11, 8, 3, 4, 13, 7, 5, 15, 14, 1, 9
_blake2s_round 12, 5, 1, 15, 14, 13, 4, 10, 0, 7, 6, 3, 9, 2, 8, 11
_blake2s_round 13, 11, 7, 14, 12, 1, 3, 9, 5, 0, 15, 4, 8, 6, 2, 10
_blake2s_round 6, 15, 14, 9, 11, 3, 0, 8, 12, 2, 13, 7, 1, 4, 10, 5
_blake2s_round 10, 2, 8, 4, 7, 6, 1, 5, 15, 11, 9, 14, 3, 12, 13, 0
// Fold the final state matrix into the hash chaining value:
//
// for (i = 0; i < 8; i++)
// h[i] ^= v[i] ^ v[i + 8];
//
ldr r14, [sp, #96] // r14 = &h[0]
add sp, sp, #8 // v[8..9] are already loaded.
pop {r10-r11} // load v[10..11]
eor r0, r0, r8
eor r1, r1, r9
eor r2, r2, r10
eor r3, r3, r11
ldm r14, {r8-r11} // load h[0..3]
eor r0, r0, r8
eor r1, r1, r9
eor r2, r2, r10
eor r3, r3, r11
stmia r14!, {r0-r3} // store new h[0..3]
ldm r14, {r0-r3} // load old h[4..7]
pop {r8-r11} // load v[12..15]
eor r0, r0, r4, ror #brot
eor r1, r1, r5, ror #brot
eor r2, r2, r6, ror #brot
eor r3, r3, r7, ror #brot
eor r0, r0, r8, ror #drot
eor r1, r1, r9, ror #drot
eor r2, r2, r10, ror #drot
eor r3, r3, r11, ror #drot
add sp, sp, #64 // skip copy of message block
stm r14, {r0-r3} // store new h[4..7]
// Advance to the next block, if there is one. Note that if there are
// multiple blocks, then 'inc' (the counter increment amount) must be
// 64. So we can simply set it to 64 without re-loading it.
ldm sp, {r0, r1, r2} // load (state, block, nblocks)
mov r3, #64 // set 'inc'
subs r2, r2, #1 // nblocks--
str r2, [sp, #8]
bne .Lnext_block // nblocks != 0?
pop {r0-r2,r4-r11,pc}
// The next message block (pointed to by r1) isn't 4-byte aligned, so it
// can't be loaded using ldmia. Copy it to the stack buffer (pointed to
// by r12) using an alternative method. r2-r9 are free to use.
.Lcopy_block_misaligned:
mov r2, #64
1:
#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
ldr r3, [r1], #4
#else
ldrb r3, [r1, #0]
ldrb r4, [r1, #1]
ldrb r5, [r1, #2]
ldrb r6, [r1, #3]
add r1, r1, #4
orr r3, r3, r4, lsl #8
orr r3, r3, r5, lsl #16
orr r3, r3, r6, lsl #24
#endif
subs r2, r2, #4
str r3, [r12], #4
bne 1b
b .Lcopy_block_done
ENDPROC(blake2s_compress_arch)


@ -0,0 +1,78 @@
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* BLAKE2s digest algorithm, ARM scalar implementation
*
* Copyright 2020 Google LLC
*/
#include <crypto/internal/blake2s.h>
#include <crypto/internal/hash.h>
#include <linux/module.h>
/* defined in blake2s-core.S */
EXPORT_SYMBOL(blake2s_compress_arch);
static int crypto_blake2s_update_arm(struct shash_desc *desc,
const u8 *in, unsigned int inlen)
{
return crypto_blake2s_update(desc, in, inlen, blake2s_compress_arch);
}
static int crypto_blake2s_final_arm(struct shash_desc *desc, u8 *out)
{
return crypto_blake2s_final(desc, out, blake2s_compress_arch);
}
#define BLAKE2S_ALG(name, driver_name, digest_size) \
{ \
.base.cra_name = name, \
.base.cra_driver_name = driver_name, \
.base.cra_priority = 200, \
.base.cra_flags = CRYPTO_ALG_OPTIONAL_KEY, \
.base.cra_blocksize = BLAKE2S_BLOCK_SIZE, \
.base.cra_ctxsize = sizeof(struct blake2s_tfm_ctx), \
.base.cra_module = THIS_MODULE, \
.digestsize = digest_size, \
.setkey = crypto_blake2s_setkey, \
.init = crypto_blake2s_init, \
.update = crypto_blake2s_update_arm, \
.final = crypto_blake2s_final_arm, \
.descsize = sizeof(struct blake2s_state), \
}
static struct shash_alg blake2s_arm_algs[] = {
BLAKE2S_ALG("blake2s-128", "blake2s-128-arm", BLAKE2S_128_HASH_SIZE),
BLAKE2S_ALG("blake2s-160", "blake2s-160-arm", BLAKE2S_160_HASH_SIZE),
BLAKE2S_ALG("blake2s-224", "blake2s-224-arm", BLAKE2S_224_HASH_SIZE),
BLAKE2S_ALG("blake2s-256", "blake2s-256-arm", BLAKE2S_256_HASH_SIZE),
};
static int __init blake2s_arm_mod_init(void)
{
return IS_REACHABLE(CONFIG_CRYPTO_HASH) ?
crypto_register_shashes(blake2s_arm_algs,
ARRAY_SIZE(blake2s_arm_algs)) : 0;
}
static void __exit blake2s_arm_mod_exit(void)
{
if (IS_REACHABLE(CONFIG_CRYPTO_HASH))
crypto_unregister_shashes(blake2s_arm_algs,
ARRAY_SIZE(blake2s_arm_algs));
}
module_init(blake2s_arm_mod_init);
module_exit(blake2s_arm_mod_exit);
MODULE_DESCRIPTION("BLAKE2s digest algorithm, ARM scalar implementation");
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Eric Biggers <ebiggers@google.com>");
MODULE_ALIAS_CRYPTO("blake2s-128");
MODULE_ALIAS_CRYPTO("blake2s-128-arm");
MODULE_ALIAS_CRYPTO("blake2s-160");
MODULE_ALIAS_CRYPTO("blake2s-160-arm");
MODULE_ALIAS_CRYPTO("blake2s-224");
MODULE_ALIAS_CRYPTO("blake2s-224-arm");
MODULE_ALIAS_CRYPTO("blake2s-256");
MODULE_ALIAS_CRYPTO("blake2s-256-arm");


@ -24,6 +24,7 @@
#ifdef USE_V8_CRYPTO_EXTENSIONS
#define MODE "ce"
#define PRIO 300
#define STRIDE 5
#define aes_expandkey ce_aes_expandkey
#define aes_ecb_encrypt ce_aes_ecb_encrypt
#define aes_ecb_decrypt ce_aes_ecb_decrypt
@ -41,6 +42,7 @@ MODULE_DESCRIPTION("AES-ECB/CBC/CTR/XTS using ARMv8 Crypto Extensions");
#else
#define MODE "neon"
#define PRIO 200
#define STRIDE 4
#define aes_ecb_encrypt neon_aes_ecb_encrypt
#define aes_ecb_decrypt neon_aes_ecb_decrypt
#define aes_cbc_encrypt neon_aes_cbc_encrypt
@ -55,7 +57,7 @@ MODULE_DESCRIPTION("AES-ECB/CBC/CTR/XTS using ARMv8 Crypto Extensions");
#define aes_mac_update neon_aes_mac_update
MODULE_DESCRIPTION("AES-ECB/CBC/CTR/XTS using ARMv8 NEON");
#endif
#if defined(USE_V8_CRYPTO_EXTENSIONS) || !defined(CONFIG_CRYPTO_AES_ARM64_BS)
#if defined(USE_V8_CRYPTO_EXTENSIONS) || !IS_ENABLED(CONFIG_CRYPTO_AES_ARM64_BS)
MODULE_ALIAS_CRYPTO("ecb(aes)");
MODULE_ALIAS_CRYPTO("cbc(aes)");
MODULE_ALIAS_CRYPTO("ctr(aes)");
@ -87,7 +89,7 @@ asmlinkage void aes_cbc_cts_decrypt(u8 out[], u8 const in[], u32 const rk[],
int rounds, int bytes, u8 const iv[]);
asmlinkage void aes_ctr_encrypt(u8 out[], u8 const in[], u32 const rk[],
int rounds, int blocks, u8 ctr[]);
int rounds, int bytes, u8 ctr[], u8 finalbuf[]);
asmlinkage void aes_xts_encrypt(u8 out[], u8 const in[], u32 const rk1[],
int rounds, int bytes, u32 const rk2[], u8 iv[],
@ -103,7 +105,7 @@ asmlinkage void aes_essiv_cbc_decrypt(u8 out[], u8 const in[], u32 const rk1[],
int rounds, int blocks, u8 iv[],
u32 const rk2[]);
asmlinkage void aes_mac_update(u8 const in[], u32 const rk[], int rounds,
asmlinkage int aes_mac_update(u8 const in[], u32 const rk[], int rounds,
int blocks, u8 dg[], int enc_before,
int enc_after);
@ -448,34 +450,36 @@ static int ctr_encrypt(struct skcipher_request *req)
struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm);
int err, rounds = 6 + ctx->key_length / 4;
struct skcipher_walk walk;
int blocks;
err = skcipher_walk_virt(&walk, req, false);
while ((blocks = (walk.nbytes / AES_BLOCK_SIZE))) {
kernel_neon_begin();
aes_ctr_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->key_enc, rounds, blocks, walk.iv);
kernel_neon_end();
err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE);
}
if (walk.nbytes) {
u8 __aligned(8) tail[AES_BLOCK_SIZE];
while (walk.nbytes > 0) {
const u8 *src = walk.src.virt.addr;
unsigned int nbytes = walk.nbytes;
u8 *tdst = walk.dst.virt.addr;
u8 *tsrc = walk.src.virt.addr;
u8 *dst = walk.dst.virt.addr;
u8 buf[AES_BLOCK_SIZE];
unsigned int tail;
/*
* Tell aes_ctr_encrypt() to process a tail block.
*/
blocks = -1;
if (unlikely(nbytes < AES_BLOCK_SIZE))
src = memcpy(buf, src, nbytes);
else if (nbytes < walk.total)
nbytes &= ~(AES_BLOCK_SIZE - 1);
kernel_neon_begin();
aes_ctr_encrypt(tail, NULL, ctx->key_enc, rounds,
blocks, walk.iv);
aes_ctr_encrypt(dst, src, ctx->key_enc, rounds, nbytes,
walk.iv, buf);
kernel_neon_end();
crypto_xor_cpy(tdst, tsrc, tail, nbytes);
err = skcipher_walk_done(&walk, 0);
tail = nbytes % (STRIDE * AES_BLOCK_SIZE);
if (tail > 0 && tail < AES_BLOCK_SIZE)
/*
* The final partial block could not be returned using
* an overlapping store, so it was passed via buf[]
* instead.
*/
memcpy(dst + nbytes - tail, buf, tail);
err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
}
return err;
@ -650,7 +654,7 @@ static int __maybe_unused xts_decrypt(struct skcipher_request *req)
}
static struct skcipher_alg aes_algs[] = { {
#if defined(USE_V8_CRYPTO_EXTENSIONS) || !defined(CONFIG_CRYPTO_AES_ARM64_BS)
#if defined(USE_V8_CRYPTO_EXTENSIONS) || !IS_ENABLED(CONFIG_CRYPTO_AES_ARM64_BS)
.base = {
.cra_name = "__ecb(aes)",
.cra_driver_name = "__ecb-aes-" MODE,
@ -852,10 +856,17 @@ static void mac_do_update(struct crypto_aes_ctx *ctx, u8 const in[], int blocks,
int rounds = 6 + ctx->key_length / 4;
if (crypto_simd_usable()) {
int rem;
do {
kernel_neon_begin();
aes_mac_update(in, ctx->key_enc, rounds, blocks, dg, enc_before,
enc_after);
rem = aes_mac_update(in, ctx->key_enc, rounds, blocks,
dg, enc_before, enc_after);
kernel_neon_end();
in += (blocks - rem) * AES_BLOCK_SIZE;
blocks = rem;
enc_before = 0;
} while (blocks);
} else {
if (enc_before)
aes_encrypt(ctx, dg, dg);

View File

@ -321,42 +321,76 @@ AES_FUNC_END(aes_cbc_cts_decrypt)
/*
* aes_ctr_encrypt(u8 out[], u8 const in[], u8 const rk[], int rounds,
* int blocks, u8 ctr[])
* int bytes, u8 ctr[], u8 finalbuf[])
*/
AES_FUNC_START(aes_ctr_encrypt)
stp x29, x30, [sp, #-16]!
mov x29, sp
enc_prepare w3, x2, x6
enc_prepare w3, x2, x12
ld1 {vctr.16b}, [x5]
umov x6, vctr.d[1] /* keep swabbed ctr in reg */
rev x6, x6
cmn w6, w4 /* 32 bit overflow? */
bcs .Lctrloop
umov x12, vctr.d[1] /* keep swabbed ctr in reg */
rev x12, x12
.LctrloopNx:
subs w4, w4, #MAX_STRIDE
bmi .Lctr1x
add w7, w6, #1
add w7, w4, #15
sub w4, w4, #MAX_STRIDE << 4
lsr w7, w7, #4
mov w8, #MAX_STRIDE
cmp w7, w8
csel w7, w7, w8, lt
adds x12, x12, x7
mov v0.16b, vctr.16b
add w8, w6, #2
mov v1.16b, vctr.16b
add w9, w6, #3
mov v2.16b, vctr.16b
add w9, w6, #3
rev w7, w7
mov v3.16b, vctr.16b
rev w8, w8
ST5( mov v4.16b, vctr.16b )
mov v1.s[3], w7
rev w9, w9
ST5( add w10, w6, #4 )
mov v2.s[3], w8
ST5( rev w10, w10 )
mov v3.s[3], w9
ST5( mov v4.s[3], w10 )
ld1 {v5.16b-v7.16b}, [x1], #48 /* get 3 input blocks */
bcs 0f
.subsection 1
/* apply carry to outgoing counter */
0: umov x8, vctr.d[0]
rev x8, x8
add x8, x8, #1
rev x8, x8
ins vctr.d[0], x8
/* apply carry to N counter blocks for N := x12 */
adr x16, 1f
sub x16, x16, x12, lsl #3
br x16
hint 34 // bti c
mov v0.d[0], vctr.d[0]
hint 34 // bti c
mov v1.d[0], vctr.d[0]
hint 34 // bti c
mov v2.d[0], vctr.d[0]
hint 34 // bti c
mov v3.d[0], vctr.d[0]
ST5( hint 34 )
ST5( mov v4.d[0], vctr.d[0] )
1: b 2f
.previous
2: rev x7, x12
ins vctr.d[1], x7
sub x7, x12, #MAX_STRIDE - 1
sub x8, x12, #MAX_STRIDE - 2
sub x9, x12, #MAX_STRIDE - 3
rev x7, x7
rev x8, x8
mov v1.d[1], x7
rev x9, x9
ST5( sub x10, x12, #MAX_STRIDE - 4 )
mov v2.d[1], x8
ST5( rev x10, x10 )
mov v3.d[1], x9
ST5( mov v4.d[1], x10 )
tbnz w4, #31, .Lctrtail
ld1 {v5.16b-v7.16b}, [x1], #48
ST4( bl aes_encrypt_block4x )
ST5( bl aes_encrypt_block5x )
eor v0.16b, v5.16b, v0.16b
@ -368,47 +402,72 @@ ST5( ld1 {v5.16b-v6.16b}, [x1], #32 )
ST5( eor v4.16b, v6.16b, v4.16b )
st1 {v0.16b-v3.16b}, [x0], #64
ST5( st1 {v4.16b}, [x0], #16 )
add x6, x6, #MAX_STRIDE
rev x7, x6
ins vctr.d[1], x7
cbz w4, .Lctrout
b .LctrloopNx
.Lctr1x:
adds w4, w4, #MAX_STRIDE
beq .Lctrout
.Lctrloop:
mov v0.16b, vctr.16b
encrypt_block v0, w3, x2, x8, w7
adds x6, x6, #1 /* increment BE ctr */
rev x7, x6
ins vctr.d[1], x7
bcs .Lctrcarry /* overflow? */
.Lctrcarrydone:
subs w4, w4, #1
bmi .Lctrtailblock /* blocks <0 means tail block */
ld1 {v3.16b}, [x1], #16
eor v3.16b, v0.16b, v3.16b
st1 {v3.16b}, [x0], #16
bne .Lctrloop
.Lctrout:
st1 {vctr.16b}, [x5] /* return next CTR value */
ldp x29, x30, [sp], #16
ret
.Lctrtailblock:
st1 {v0.16b}, [x0]
.Lctrtail:
/* XOR up to MAX_STRIDE * 16 - 1 bytes of in/output with v0 ... v3/v4 */
mov x16, #16
ands x13, x4, #0xf
csel x13, x13, x16, ne
ST5( cmp w4, #64 - (MAX_STRIDE << 4) )
ST5( csel x14, x16, xzr, gt )
cmp w4, #48 - (MAX_STRIDE << 4)
csel x15, x16, xzr, gt
cmp w4, #32 - (MAX_STRIDE << 4)
csel x16, x16, xzr, gt
cmp w4, #16 - (MAX_STRIDE << 4)
ble .Lctrtail1x
adr_l x12, .Lcts_permute_table
add x12, x12, x13
ST5( ld1 {v5.16b}, [x1], x14 )
ld1 {v6.16b}, [x1], x15
ld1 {v7.16b}, [x1], x16
ST4( bl aes_encrypt_block4x )
ST5( bl aes_encrypt_block5x )
ld1 {v8.16b}, [x1], x13
ld1 {v9.16b}, [x1]
ld1 {v10.16b}, [x12]
ST4( eor v6.16b, v6.16b, v0.16b )
ST4( eor v7.16b, v7.16b, v1.16b )
ST4( tbl v3.16b, {v3.16b}, v10.16b )
ST4( eor v8.16b, v8.16b, v2.16b )
ST4( eor v9.16b, v9.16b, v3.16b )
ST5( eor v5.16b, v5.16b, v0.16b )
ST5( eor v6.16b, v6.16b, v1.16b )
ST5( tbl v4.16b, {v4.16b}, v10.16b )
ST5( eor v7.16b, v7.16b, v2.16b )
ST5( eor v8.16b, v8.16b, v3.16b )
ST5( eor v9.16b, v9.16b, v4.16b )
ST5( st1 {v5.16b}, [x0], x14 )
st1 {v6.16b}, [x0], x15
st1 {v7.16b}, [x0], x16
add x13, x13, x0
st1 {v9.16b}, [x13] // overlapping stores
st1 {v8.16b}, [x0]
b .Lctrout
.Lctrcarry:
umov x7, vctr.d[0] /* load upper word of ctr */
rev x7, x7 /* ... to handle the carry */
add x7, x7, #1
rev x7, x7
ins vctr.d[0], x7
b .Lctrcarrydone
.Lctrtail1x:
csel x0, x0, x6, eq // use finalbuf if less than a full block
ld1 {v5.16b}, [x1]
ST5( mov v3.16b, v4.16b )
encrypt_block v3, w3, x2, x8, w7
eor v5.16b, v5.16b, v3.16b
st1 {v5.16b}, [x0]
b .Lctrout
AES_FUNC_END(aes_ctr_encrypt)
@ -619,61 +678,47 @@ AES_FUNC_END(aes_xts_decrypt)
* int blocks, u8 dg[], int enc_before, int enc_after)
*/
AES_FUNC_START(aes_mac_update)
frame_push 6
mov x19, x0
mov x20, x1
mov x21, x2
mov x22, x3
mov x23, x4
mov x24, x6
ld1 {v0.16b}, [x23] /* get dg */
ld1 {v0.16b}, [x4] /* get dg */
enc_prepare w2, x1, x7
cbz w5, .Lmacloop4x
encrypt_block v0, w2, x1, x7, w8
.Lmacloop4x:
subs w22, w22, #4
subs w3, w3, #4
bmi .Lmac1x
ld1 {v1.16b-v4.16b}, [x19], #64 /* get next pt block */
ld1 {v1.16b-v4.16b}, [x0], #64 /* get next pt block */
eor v0.16b, v0.16b, v1.16b /* ..and xor with dg */
encrypt_block v0, w21, x20, x7, w8
encrypt_block v0, w2, x1, x7, w8
eor v0.16b, v0.16b, v2.16b
encrypt_block v0, w21, x20, x7, w8
encrypt_block v0, w2, x1, x7, w8
eor v0.16b, v0.16b, v3.16b
encrypt_block v0, w21, x20, x7, w8
encrypt_block v0, w2, x1, x7, w8
eor v0.16b, v0.16b, v4.16b
cmp w22, wzr
csinv x5, x24, xzr, eq
cmp w3, wzr
csinv x5, x6, xzr, eq
cbz w5, .Lmacout
encrypt_block v0, w21, x20, x7, w8
st1 {v0.16b}, [x23] /* return dg */
cond_yield_neon .Lmacrestart
encrypt_block v0, w2, x1, x7, w8
st1 {v0.16b}, [x4] /* return dg */
cond_yield .Lmacout, x7
b .Lmacloop4x
.Lmac1x:
add w22, w22, #4
add w3, w3, #4
.Lmacloop:
cbz w22, .Lmacout
ld1 {v1.16b}, [x19], #16 /* get next pt block */
cbz w3, .Lmacout
ld1 {v1.16b}, [x0], #16 /* get next pt block */
eor v0.16b, v0.16b, v1.16b /* ..and xor with dg */
subs w22, w22, #1
csinv x5, x24, xzr, eq
subs w3, w3, #1
csinv x5, x6, xzr, eq
cbz w5, .Lmacout
.Lmacenc:
encrypt_block v0, w21, x20, x7, w8
encrypt_block v0, w2, x1, x7, w8
b .Lmacloop
.Lmacout:
st1 {v0.16b}, [x23] /* return dg */
frame_pop
st1 {v0.16b}, [x4] /* return dg */
mov w0, w3
ret
.Lmacrestart:
ld1 {v0.16b}, [x23] /* get dg */
enc_prepare w21, x20, x0
b .Lmacloop4x
AES_FUNC_END(aes_mac_update)


@ -613,7 +613,6 @@ SYM_FUNC_END(aesbs_decrypt8)
st1 {\o7\().16b}, [x19], #16
cbz x23, 1f
cond_yield_neon
b 99b
1: frame_pop
@ -715,7 +714,6 @@ SYM_FUNC_START(aesbs_cbc_decrypt)
1: st1 {v24.16b}, [x24] // store IV
cbz x23, 2f
cond_yield_neon
b 99b
2: frame_pop
@ -801,7 +799,7 @@ SYM_FUNC_END(__xts_crypt8)
mov x23, x4
mov x24, x5
0: movi v30.2s, #0x1
movi v30.2s, #0x1
movi v25.2s, #0x87
uzp1 v30.4s, v30.4s, v25.4s
ld1 {v25.16b}, [x24]
@ -846,7 +844,6 @@ SYM_FUNC_END(__xts_crypt8)
cbz x23, 1f
st1 {v25.16b}, [x24]
cond_yield_neon 0b
b 99b
1: st1 {v25.16b}, [x24]
@ -889,7 +886,7 @@ SYM_FUNC_START(aesbs_ctr_encrypt)
cset x26, ne
add x23, x23, x26 // do one extra block if final
98: ldp x7, x8, [x24]
ldp x7, x8, [x24]
ld1 {v0.16b}, [x24]
CPU_LE( rev x7, x7 )
CPU_LE( rev x8, x8 )
@ -967,7 +964,6 @@ CPU_LE( rev x8, x8 )
st1 {v0.16b}, [x24]
cbz x23, .Lctr_done
cond_yield_neon 98b
b 99b
.Lctr_done:


@ -68,10 +68,10 @@
.text
.arch armv8-a+crypto
init_crc .req w19
buf .req x20
len .req x21
fold_consts_ptr .req x22
init_crc .req w0
buf .req x1
len .req x2
fold_consts_ptr .req x3
fold_consts .req v10
@ -257,12 +257,6 @@ CPU_LE( ext v12.16b, v12.16b, v12.16b, #8 )
.endm
.macro crc_t10dif_pmull, p
frame_push 4, 128
mov init_crc, w0
mov buf, x1
mov len, x2
__pmull_init_\p
// For sizes less than 256 bytes, we can't fold 128 bytes at a time.
@ -317,26 +311,7 @@ CPU_LE( ext v7.16b, v7.16b, v7.16b, #8 )
fold_32_bytes \p, v6, v7
subs len, len, #128
b.lt .Lfold_128_bytes_loop_done_\@
if_will_cond_yield_neon
stp q0, q1, [sp, #.Lframe_local_offset]
stp q2, q3, [sp, #.Lframe_local_offset + 32]
stp q4, q5, [sp, #.Lframe_local_offset + 64]
stp q6, q7, [sp, #.Lframe_local_offset + 96]
do_cond_yield_neon
ldp q0, q1, [sp, #.Lframe_local_offset]
ldp q2, q3, [sp, #.Lframe_local_offset + 32]
ldp q4, q5, [sp, #.Lframe_local_offset + 64]
ldp q6, q7, [sp, #.Lframe_local_offset + 96]
ld1 {fold_consts.2d}, [fold_consts_ptr]
__pmull_init_\p
__pmull_pre_\p fold_consts
endif_yield_neon
b .Lfold_128_bytes_loop_\@
.Lfold_128_bytes_loop_done_\@:
b.ge .Lfold_128_bytes_loop_\@
// Now fold the 112 bytes in v0-v6 into the 16 bytes in v7.
@ -453,7 +428,9 @@ CPU_LE( ext v0.16b, v0.16b, v0.16b, #8 )
// Final CRC value (x^16 * M(x)) mod G(x) is in low 16 bits of v0.
umov w0, v0.h[0]
frame_pop
.ifc \p, p8
ldp x29, x30, [sp], #16
.endif
ret
.Lless_than_256_bytes_\@:
@ -489,6 +466,8 @@ CPU_LE( ext v7.16b, v7.16b, v7.16b, #8 )
// Assumes len >= 16.
//
SYM_FUNC_START(crc_t10dif_pmull_p8)
stp x29, x30, [sp, #-16]!
mov x29, sp
crc_t10dif_pmull p8
SYM_FUNC_END(crc_t10dif_pmull_p8)


@ -37,9 +37,18 @@ static int crct10dif_update_pmull_p8(struct shash_desc *desc, const u8 *data,
u16 *crc = shash_desc_ctx(desc);
if (length >= CRC_T10DIF_PMULL_CHUNK_SIZE && crypto_simd_usable()) {
do {
unsigned int chunk = length;
if (chunk > SZ_4K + CRC_T10DIF_PMULL_CHUNK_SIZE)
chunk = SZ_4K;
kernel_neon_begin();
*crc = crc_t10dif_pmull_p8(*crc, data, length);
*crc = crc_t10dif_pmull_p8(*crc, data, chunk);
kernel_neon_end();
data += chunk;
length -= chunk;
} while (length);
} else {
*crc = crc_t10dif_generic(*crc, data, length);
}
@ -53,9 +62,18 @@ static int crct10dif_update_pmull_p64(struct shash_desc *desc, const u8 *data,
u16 *crc = shash_desc_ctx(desc);
if (length >= CRC_T10DIF_PMULL_CHUNK_SIZE && crypto_simd_usable()) {
do {
unsigned int chunk = length;
if (chunk > SZ_4K + CRC_T10DIF_PMULL_CHUNK_SIZE)
chunk = SZ_4K;
kernel_neon_begin();
*crc = crc_t10dif_pmull_p64(*crc, data, length);
*crc = crc_t10dif_pmull_p64(*crc, data, chunk);
kernel_neon_end();
data += chunk;
length -= chunk;
} while (length);
} else {
*crc = crc_t10dif_generic(*crc, data, length);
}
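For reference, a hypothetical user of the checksum this driver accelerates (demo_crc_t10dif is illustrative only); the crc_t10dif() library helper is typically backed by the "crct10dif" shash, which these PMULL routines implement on arm64:

#include <linux/crc-t10dif.h>
#include <linux/types.h>

static u16 demo_crc_t10dif(const void *data, size_t len)
{
	return crc_t10dif(data, len);
}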


@ -62,40 +62,34 @@
.endm
/*
* void sha1_ce_transform(struct sha1_ce_state *sst, u8 const *src,
* int sha1_ce_transform(struct sha1_ce_state *sst, u8 const *src,
* int blocks)
*/
SYM_FUNC_START(sha1_ce_transform)
frame_push 3
mov x19, x0
mov x20, x1
mov x21, x2
/* load round constants */
0: loadrc k0.4s, 0x5a827999, w6
loadrc k0.4s, 0x5a827999, w6
loadrc k1.4s, 0x6ed9eba1, w6
loadrc k2.4s, 0x8f1bbcdc, w6
loadrc k3.4s, 0xca62c1d6, w6
/* load state */
ld1 {dgav.4s}, [x19]
ldr dgb, [x19, #16]
ld1 {dgav.4s}, [x0]
ldr dgb, [x0, #16]
/* load sha1_ce_state::finalize */
ldr_l w4, sha1_ce_offsetof_finalize, x4
ldr w4, [x19, x4]
ldr w4, [x0, x4]
/* load input */
1: ld1 {v8.4s-v11.4s}, [x20], #64
sub w21, w21, #1
0: ld1 {v8.4s-v11.4s}, [x1], #64
sub w2, w2, #1
CPU_LE( rev32 v8.16b, v8.16b )
CPU_LE( rev32 v9.16b, v9.16b )
CPU_LE( rev32 v10.16b, v10.16b )
CPU_LE( rev32 v11.16b, v11.16b )
2: add t0.4s, v8.4s, k0.4s
1: add t0.4s, v8.4s, k0.4s
mov dg0v.16b, dgav.16b
add_update c, ev, k0, 8, 9, 10, 11, dgb
@ -126,25 +120,18 @@ CPU_LE( rev32 v11.16b, v11.16b )
add dgbv.2s, dgbv.2s, dg1v.2s
add dgav.4s, dgav.4s, dg0v.4s
cbz w21, 3f
if_will_cond_yield_neon
st1 {dgav.4s}, [x19]
str dgb, [x19, #16]
do_cond_yield_neon
cbz w2, 2f
cond_yield 3f, x5
b 0b
endif_yield_neon
b 1b
/*
* Final block: add padding and total bit count.
* Skip if the input size was not a round multiple of the block size,
* the padding is handled by the C code in that case.
*/
3: cbz x4, 4f
2: cbz x4, 3f
ldr_l w4, sha1_ce_offsetof_count, x4
ldr x4, [x19, x4]
ldr x4, [x0, x4]
movi v9.2d, #0
mov x8, #0x80000000
movi v10.2d, #0
@ -153,11 +140,11 @@ CPU_LE( rev32 v11.16b, v11.16b )
mov x4, #0
mov v11.d[0], xzr
mov v11.d[1], x7
b 2b
b 1b
/* store new state */
4: st1 {dgav.4s}, [x19]
str dgb, [x19, #16]
frame_pop
3: st1 {dgav.4s}, [x0]
str dgb, [x0, #16]
mov w0, w2
ret
SYM_FUNC_END(sha1_ce_transform)


@ -19,6 +19,7 @@
MODULE_DESCRIPTION("SHA1 secure hash using ARMv8 Crypto Extensions");
MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
MODULE_LICENSE("GPL v2");
MODULE_ALIAS_CRYPTO("sha1");
struct sha1_ce_state {
struct sha1_state sst;
@ -28,14 +29,22 @@ struct sha1_ce_state {
extern const u32 sha1_ce_offsetof_count;
extern const u32 sha1_ce_offsetof_finalize;
asmlinkage void sha1_ce_transform(struct sha1_ce_state *sst, u8 const *src,
asmlinkage int sha1_ce_transform(struct sha1_ce_state *sst, u8 const *src,
int blocks);
static void __sha1_ce_transform(struct sha1_state *sst, u8 const *src,
int blocks)
{
sha1_ce_transform(container_of(sst, struct sha1_ce_state, sst), src,
blocks);
while (blocks) {
int rem;
kernel_neon_begin();
rem = sha1_ce_transform(container_of(sst, struct sha1_ce_state,
sst), src, blocks);
kernel_neon_end();
src += (blocks - rem) * SHA1_BLOCK_SIZE;
blocks = rem;
}
}
const u32 sha1_ce_offsetof_count = offsetof(struct sha1_ce_state, sst.count);
@ -50,9 +59,7 @@ static int sha1_ce_update(struct shash_desc *desc, const u8 *data,
return crypto_sha1_update(desc, data, len);
sctx->finalize = 0;
kernel_neon_begin();
sha1_base_do_update(desc, data, len, __sha1_ce_transform);
kernel_neon_end();
return 0;
}
@ -72,11 +79,9 @@ static int sha1_ce_finup(struct shash_desc *desc, const u8 *data,
*/
sctx->finalize = finalize;
kernel_neon_begin();
sha1_base_do_update(desc, data, len, __sha1_ce_transform);
if (!finalize)
sha1_base_do_finalize(desc, __sha1_ce_transform);
kernel_neon_end();
return sha1_base_finish(desc, out);
}
@ -88,9 +93,7 @@ static int sha1_ce_final(struct shash_desc *desc, u8 *out)
return crypto_sha1_finup(desc, NULL, 0, out);
sctx->finalize = 0;
kernel_neon_begin();
sha1_base_do_finalize(desc, __sha1_ce_transform);
kernel_neon_end();
return sha1_base_finish(desc, out);
}


@ -76,36 +76,30 @@
*/
.text
SYM_FUNC_START(sha2_ce_transform)
frame_push 3
mov x19, x0
mov x20, x1
mov x21, x2
/* load round constants */
0: adr_l x8, .Lsha2_rcon
adr_l x8, .Lsha2_rcon
ld1 { v0.4s- v3.4s}, [x8], #64
ld1 { v4.4s- v7.4s}, [x8], #64
ld1 { v8.4s-v11.4s}, [x8], #64
ld1 {v12.4s-v15.4s}, [x8]
/* load state */
ld1 {dgav.4s, dgbv.4s}, [x19]
ld1 {dgav.4s, dgbv.4s}, [x0]
/* load sha256_ce_state::finalize */
ldr_l w4, sha256_ce_offsetof_finalize, x4
ldr w4, [x19, x4]
ldr w4, [x0, x4]
/* load input */
1: ld1 {v16.4s-v19.4s}, [x20], #64
sub w21, w21, #1
0: ld1 {v16.4s-v19.4s}, [x1], #64
sub w2, w2, #1
CPU_LE( rev32 v16.16b, v16.16b )
CPU_LE( rev32 v17.16b, v17.16b )
CPU_LE( rev32 v18.16b, v18.16b )
CPU_LE( rev32 v19.16b, v19.16b )
2: add t0.4s, v16.4s, v0.4s
1: add t0.4s, v16.4s, v0.4s
mov dg0v.16b, dgav.16b
mov dg1v.16b, dgbv.16b
@ -134,24 +128,18 @@ CPU_LE( rev32 v19.16b, v19.16b )
add dgbv.4s, dgbv.4s, dg1v.4s
/* handled all input blocks? */
cbz w21, 3f
if_will_cond_yield_neon
st1 {dgav.4s, dgbv.4s}, [x19]
do_cond_yield_neon
cbz w2, 2f
cond_yield 3f, x5
b 0b
endif_yield_neon
b 1b
/*
* Final block: add padding and total bit count.
* Skip if the input size was not a round multiple of the block size,
* the padding is handled by the C code in that case.
*/
3: cbz x4, 4f
2: cbz x4, 3f
ldr_l w4, sha256_ce_offsetof_count, x4
ldr x4, [x19, x4]
ldr x4, [x0, x4]
movi v17.2d, #0
mov x8, #0x80000000
movi v18.2d, #0
@ -160,10 +148,10 @@ CPU_LE( rev32 v19.16b, v19.16b )
mov x4, #0
mov v19.d[0], xzr
mov v19.d[1], x7
b 2b
b 1b
/* store new state */
4: st1 {dgav.4s, dgbv.4s}, [x19]
frame_pop
3: st1 {dgav.4s, dgbv.4s}, [x0]
mov w0, w2
ret
SYM_FUNC_END(sha2_ce_transform)


@ -19,6 +19,8 @@
MODULE_DESCRIPTION("SHA-224/SHA-256 secure hash using ARMv8 Crypto Extensions");
MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
MODULE_LICENSE("GPL v2");
MODULE_ALIAS_CRYPTO("sha224");
MODULE_ALIAS_CRYPTO("sha256");
struct sha256_ce_state {
struct sha256_state sst;
@ -28,14 +30,22 @@ struct sha256_ce_state {
extern const u32 sha256_ce_offsetof_count;
extern const u32 sha256_ce_offsetof_finalize;
asmlinkage void sha2_ce_transform(struct sha256_ce_state *sst, u8 const *src,
asmlinkage int sha2_ce_transform(struct sha256_ce_state *sst, u8 const *src,
int blocks);
static void __sha2_ce_transform(struct sha256_state *sst, u8 const *src,
int blocks)
{
sha2_ce_transform(container_of(sst, struct sha256_ce_state, sst), src,
blocks);
while (blocks) {
int rem;
kernel_neon_begin();
rem = sha2_ce_transform(container_of(sst, struct sha256_ce_state,
sst), src, blocks);
kernel_neon_end();
src += (blocks - rem) * SHA256_BLOCK_SIZE;
blocks = rem;
}
}
const u32 sha256_ce_offsetof_count = offsetof(struct sha256_ce_state,
@ -61,9 +71,7 @@ static int sha256_ce_update(struct shash_desc *desc, const u8 *data,
__sha256_block_data_order);
sctx->finalize = 0;
kernel_neon_begin();
sha256_base_do_update(desc, data, len, __sha2_ce_transform);
kernel_neon_end();
return 0;
}
@ -88,11 +96,9 @@ static int sha256_ce_finup(struct shash_desc *desc, const u8 *data,
*/
sctx->finalize = finalize;
kernel_neon_begin();
sha256_base_do_update(desc, data, len, __sha2_ce_transform);
if (!finalize)
sha256_base_do_finalize(desc, __sha2_ce_transform);
kernel_neon_end();
return sha256_base_finish(desc, out);
}
@ -106,9 +112,7 @@ static int sha256_ce_final(struct shash_desc *desc, u8 *out)
}
sctx->finalize = 0;
kernel_neon_begin();
sha256_base_do_finalize(desc, __sha2_ce_transform);
kernel_neon_end();
return sha256_base_finish(desc, out);
}


@ -37,20 +37,13 @@
.endm
/*
* sha3_ce_transform(u64 *st, const u8 *data, int blocks, int dg_size)
* int sha3_ce_transform(u64 *st, const u8 *data, int blocks, int dg_size)
*/
.text
SYM_FUNC_START(sha3_ce_transform)
frame_push 4
mov x19, x0
mov x20, x1
mov x21, x2
mov x22, x3
0: /* load state */
add x8, x19, #32
ld1 { v0.1d- v3.1d}, [x19]
/* load state */
add x8, x0, #32
ld1 { v0.1d- v3.1d}, [x0]
ld1 { v4.1d- v7.1d}, [x8], #32
ld1 { v8.1d-v11.1d}, [x8], #32
ld1 {v12.1d-v15.1d}, [x8], #32
@ -58,13 +51,13 @@ SYM_FUNC_START(sha3_ce_transform)
ld1 {v20.1d-v23.1d}, [x8], #32
ld1 {v24.1d}, [x8]
1: sub w21, w21, #1
0: sub w2, w2, #1
mov w8, #24
adr_l x9, .Lsha3_rcon
/* load input */
ld1 {v25.8b-v28.8b}, [x20], #32
ld1 {v29.8b-v31.8b}, [x20], #24
ld1 {v25.8b-v28.8b}, [x1], #32
ld1 {v29.8b-v31.8b}, [x1], #24
eor v0.8b, v0.8b, v25.8b
eor v1.8b, v1.8b, v26.8b
eor v2.8b, v2.8b, v27.8b
@ -73,10 +66,10 @@ SYM_FUNC_START(sha3_ce_transform)
eor v5.8b, v5.8b, v30.8b
eor v6.8b, v6.8b, v31.8b
tbnz x22, #6, 3f // SHA3-512
tbnz x3, #6, 2f // SHA3-512
ld1 {v25.8b-v28.8b}, [x20], #32
ld1 {v29.8b-v30.8b}, [x20], #16
ld1 {v25.8b-v28.8b}, [x1], #32
ld1 {v29.8b-v30.8b}, [x1], #16
eor v7.8b, v7.8b, v25.8b
eor v8.8b, v8.8b, v26.8b
eor v9.8b, v9.8b, v27.8b
@ -84,34 +77,34 @@ SYM_FUNC_START(sha3_ce_transform)
eor v11.8b, v11.8b, v29.8b
eor v12.8b, v12.8b, v30.8b
tbnz x22, #4, 2f // SHA3-384 or SHA3-224
tbnz x3, #4, 1f // SHA3-384 or SHA3-224
// SHA3-256
ld1 {v25.8b-v28.8b}, [x20], #32
ld1 {v25.8b-v28.8b}, [x1], #32
eor v13.8b, v13.8b, v25.8b
eor v14.8b, v14.8b, v26.8b
eor v15.8b, v15.8b, v27.8b
eor v16.8b, v16.8b, v28.8b
b 4f
b 3f
2: tbz x22, #2, 4f // bit 2 cleared? SHA-384
1: tbz x3, #2, 3f // bit 2 cleared? SHA-384
// SHA3-224
ld1 {v25.8b-v28.8b}, [x20], #32
ld1 {v29.8b}, [x20], #8
ld1 {v25.8b-v28.8b}, [x1], #32
ld1 {v29.8b}, [x1], #8
eor v13.8b, v13.8b, v25.8b
eor v14.8b, v14.8b, v26.8b
eor v15.8b, v15.8b, v27.8b
eor v16.8b, v16.8b, v28.8b
eor v17.8b, v17.8b, v29.8b
b 4f
b 3f
// SHA3-512
3: ld1 {v25.8b-v26.8b}, [x20], #16
2: ld1 {v25.8b-v26.8b}, [x1], #16
eor v7.8b, v7.8b, v25.8b
eor v8.8b, v8.8b, v26.8b
4: sub w8, w8, #1
3: sub w8, w8, #1
eor3 v29.16b, v4.16b, v9.16b, v14.16b
eor3 v26.16b, v1.16b, v6.16b, v11.16b
@ -190,33 +183,19 @@ SYM_FUNC_START(sha3_ce_transform)
eor v0.16b, v0.16b, v31.16b
cbnz w8, 4b
cbz w21, 5f
if_will_cond_yield_neon
add x8, x19, #32
st1 { v0.1d- v3.1d}, [x19]
st1 { v4.1d- v7.1d}, [x8], #32
st1 { v8.1d-v11.1d}, [x8], #32
st1 {v12.1d-v15.1d}, [x8], #32
st1 {v16.1d-v19.1d}, [x8], #32
st1 {v20.1d-v23.1d}, [x8], #32
st1 {v24.1d}, [x8]
do_cond_yield_neon
b 0b
endif_yield_neon
b 1b
cbnz w8, 3b
cond_yield 3f, x8
cbnz w2, 0b
/* save state */
5: st1 { v0.1d- v3.1d}, [x19], #32
st1 { v4.1d- v7.1d}, [x19], #32
st1 { v8.1d-v11.1d}, [x19], #32
st1 {v12.1d-v15.1d}, [x19], #32
st1 {v16.1d-v19.1d}, [x19], #32
st1 {v20.1d-v23.1d}, [x19], #32
st1 {v24.1d}, [x19]
frame_pop
3: st1 { v0.1d- v3.1d}, [x0], #32
st1 { v4.1d- v7.1d}, [x0], #32
st1 { v8.1d-v11.1d}, [x0], #32
st1 {v12.1d-v15.1d}, [x0], #32
st1 {v16.1d-v19.1d}, [x0], #32
st1 {v20.1d-v23.1d}, [x0], #32
st1 {v24.1d}, [x0]
mov w0, w2
ret
SYM_FUNC_END(sha3_ce_transform)


@ -23,8 +23,12 @@
MODULE_DESCRIPTION("SHA3 secure hash using ARMv8 Crypto Extensions");
MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
MODULE_LICENSE("GPL v2");
MODULE_ALIAS_CRYPTO("sha3-224");
MODULE_ALIAS_CRYPTO("sha3-256");
MODULE_ALIAS_CRYPTO("sha3-384");
MODULE_ALIAS_CRYPTO("sha3-512");
asmlinkage void sha3_ce_transform(u64 *st, const u8 *data, int blocks,
asmlinkage int sha3_ce_transform(u64 *st, const u8 *data, int blocks,
int md_len);
static int sha3_update(struct shash_desc *desc, const u8 *data,
@ -55,11 +59,15 @@ static int sha3_update(struct shash_desc *desc, const u8 *data,
blocks = len / sctx->rsiz;
len %= sctx->rsiz;
if (blocks) {
while (blocks) {
int rem;
kernel_neon_begin();
sha3_ce_transform(sctx->st, data, blocks, digest_size);
rem = sha3_ce_transform(sctx->st, data, blocks,
digest_size);
kernel_neon_end();
data += blocks * sctx->rsiz;
data += (blocks - rem) * sctx->rsiz;
blocks = rem;
}
}

View File

@ -107,23 +107,17 @@
*/
.text
SYM_FUNC_START(sha512_ce_transform)
frame_push 3
mov x19, x0
mov x20, x1
mov x21, x2
/* load state */
0: ld1 {v8.2d-v11.2d}, [x19]
ld1 {v8.2d-v11.2d}, [x0]
/* load first 4 round constants */
adr_l x3, .Lsha512_rcon
ld1 {v20.2d-v23.2d}, [x3], #64
/* load input */
1: ld1 {v12.2d-v15.2d}, [x20], #64
ld1 {v16.2d-v19.2d}, [x20], #64
sub w21, w21, #1
0: ld1 {v12.2d-v15.2d}, [x1], #64
ld1 {v16.2d-v19.2d}, [x1], #64
sub w2, w2, #1
CPU_LE( rev64 v12.16b, v12.16b )
CPU_LE( rev64 v13.16b, v13.16b )
@ -201,19 +195,12 @@ CPU_LE( rev64 v19.16b, v19.16b )
add v10.2d, v10.2d, v2.2d
add v11.2d, v11.2d, v3.2d
cond_yield 3f, x4
/* handled all input blocks? */
cbz w21, 3f
if_will_cond_yield_neon
st1 {v8.2d-v11.2d}, [x19]
do_cond_yield_neon
b 0b
endif_yield_neon
b 1b
cbnz w2, 0b
/* store new state */
3: st1 {v8.2d-v11.2d}, [x19]
frame_pop
3: st1 {v8.2d-v11.2d}, [x0]
mov w0, w2
ret
SYM_FUNC_END(sha512_ce_transform)

View File

@ -23,12 +23,28 @@
MODULE_DESCRIPTION("SHA-384/SHA-512 secure hash using ARMv8 Crypto Extensions");
MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
MODULE_LICENSE("GPL v2");
MODULE_ALIAS_CRYPTO("sha384");
MODULE_ALIAS_CRYPTO("sha512");
asmlinkage void sha512_ce_transform(struct sha512_state *sst, u8 const *src,
asmlinkage int sha512_ce_transform(struct sha512_state *sst, u8 const *src,
int blocks);
asmlinkage void sha512_block_data_order(u64 *digest, u8 const *src, int blocks);
static void __sha512_ce_transform(struct sha512_state *sst, u8 const *src,
int blocks)
{
while (blocks) {
int rem;
kernel_neon_begin();
rem = sha512_ce_transform(sst, src, blocks);
kernel_neon_end();
src += (blocks - rem) * SHA512_BLOCK_SIZE;
blocks = rem;
}
}
static void __sha512_block_data_order(struct sha512_state *sst, u8 const *src,
int blocks)
{
@ -38,45 +54,30 @@ static void __sha512_block_data_order(struct sha512_state *sst, u8 const *src,
static int sha512_ce_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
if (!crypto_simd_usable())
return sha512_base_do_update(desc, data, len,
__sha512_block_data_order);
kernel_neon_begin();
sha512_base_do_update(desc, data, len, sha512_ce_transform);
kernel_neon_end();
sha512_block_fn *fn = crypto_simd_usable() ? __sha512_ce_transform
: __sha512_block_data_order;
sha512_base_do_update(desc, data, len, fn);
return 0;
}
static int sha512_ce_finup(struct shash_desc *desc, const u8 *data,
unsigned int len, u8 *out)
{
if (!crypto_simd_usable()) {
if (len)
sha512_base_do_update(desc, data, len,
__sha512_block_data_order);
sha512_base_do_finalize(desc, __sha512_block_data_order);
return sha512_base_finish(desc, out);
}
sha512_block_fn *fn = crypto_simd_usable() ? __sha512_ce_transform
: __sha512_block_data_order;
kernel_neon_begin();
sha512_base_do_update(desc, data, len, sha512_ce_transform);
sha512_base_do_finalize(desc, sha512_ce_transform);
kernel_neon_end();
sha512_base_do_update(desc, data, len, fn);
sha512_base_do_finalize(desc, fn);
return sha512_base_finish(desc, out);
}
static int sha512_ce_final(struct shash_desc *desc, u8 *out)
{
if (!crypto_simd_usable()) {
sha512_base_do_finalize(desc, __sha512_block_data_order);
return sha512_base_finish(desc, out);
}
sha512_block_fn *fn = crypto_simd_usable() ? __sha512_ce_transform
: __sha512_block_data_order;
kernel_neon_begin();
sha512_base_do_finalize(desc, sha512_ce_transform);
kernel_neon_end();
sha512_base_do_finalize(desc, fn);
return sha512_base_finish(desc, out);
}
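
The arm64 hunks above all follow the same shape: the assembly transform now returns how many blocks it did not process (because cond_yield saw a pending reschedule), and the C glue loops until nothing is left, dropping the NEON context between passes. A minimal sketch of that pattern, using hypothetical names (asm_transform, BLOCK_SIZE) rather than any symbol from the patch:

	static void do_blocks(u64 *state, const u8 *data, int blocks)
	{
		while (blocks) {
			int rem;

			kernel_neon_begin();
			/* may stop early; returns the number of blocks still pending */
			rem = asm_transform(state, data, blocks);
			kernel_neon_end();

			data += (blocks - rem) * BLOCK_SIZE;
			blocks = rem;
		}
	}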

View File

@ -129,7 +129,7 @@ static int ppc_spe_sha256_update(struct shash_desc *desc, const u8 *data,
src += bytes;
len -= bytes;
};
}
memcpy((char *)sctx->buf, src, len);
return 0;

View File

@ -21,6 +21,7 @@
#include <crypto/algapi.h>
#include <crypto/ghash.h>
#include <crypto/internal/aead.h>
#include <crypto/internal/cipher.h>
#include <crypto/internal/skcipher.h>
#include <crypto/scatterwalk.h>
#include <linux/err.h>
@ -1055,3 +1056,4 @@ MODULE_ALIAS_CRYPTO("aes-all");
MODULE_DESCRIPTION("Rijndael (AES) Cipher Algorithm");
MODULE_LICENSE("GPL");
MODULE_IMPORT_NS(CRYPTO_INTERNAL);

View File

@ -4,8 +4,6 @@
OBJECT_FILES_NON_STANDARD := y
obj-$(CONFIG_CRYPTO_GLUE_HELPER_X86) += glue_helper.o
obj-$(CONFIG_CRYPTO_TWOFISH_586) += twofish-i586.o
twofish-i586-y := twofish-i586-asm_32.o twofish_glue.o
obj-$(CONFIG_CRYPTO_TWOFISH_X86_64) += twofish-x86_64.o

View File

@ -43,10 +43,6 @@
#ifdef __x86_64__
# constants in mergeable sections, linker can reorder and merge
.section .rodata.cst16.gf128mul_x_ble_mask, "aM", @progbits, 16
.align 16
.Lgf128mul_x_ble_mask:
.octa 0x00000000000000010000000000000087
.section .rodata.cst16.POLY, "aM", @progbits, 16
.align 16
POLY: .octa 0xC2000000000000000000000000000001
@ -146,7 +142,7 @@ ALL_F: .octa 0xffffffffffffffffffffffffffffffff
#define CTR %xmm11
#define INC %xmm12
#define GF128MUL_MASK %xmm10
#define GF128MUL_MASK %xmm7
#ifdef __x86_64__
#define AREG %rax
@ -2577,13 +2573,140 @@ SYM_FUNC_START(aesni_cbc_dec)
ret
SYM_FUNC_END(aesni_cbc_dec)
#ifdef __x86_64__
/*
* void aesni_cts_cbc_enc(struct crypto_aes_ctx *ctx, const u8 *dst, u8 *src,
* size_t len, u8 *iv)
*/
SYM_FUNC_START(aesni_cts_cbc_enc)
FRAME_BEGIN
#ifndef __x86_64__
pushl IVP
pushl LEN
pushl KEYP
pushl KLEN
movl (FRAME_OFFSET+20)(%esp), KEYP # ctx
movl (FRAME_OFFSET+24)(%esp), OUTP # dst
movl (FRAME_OFFSET+28)(%esp), INP # src
movl (FRAME_OFFSET+32)(%esp), LEN # len
movl (FRAME_OFFSET+36)(%esp), IVP # iv
lea .Lcts_permute_table, T1
#else
lea .Lcts_permute_table(%rip), T1
#endif
mov 480(KEYP), KLEN
movups (IVP), STATE
sub $16, LEN
mov T1, IVP
add $32, IVP
add LEN, T1
sub LEN, IVP
movups (T1), %xmm4
movups (IVP), %xmm5
movups (INP), IN1
add LEN, INP
movups (INP), IN2
pxor IN1, STATE
call _aesni_enc1
pshufb %xmm5, IN2
pxor STATE, IN2
pshufb %xmm4, STATE
add OUTP, LEN
movups STATE, (LEN)
movaps IN2, STATE
call _aesni_enc1
movups STATE, (OUTP)
#ifndef __x86_64__
popl KLEN
popl KEYP
popl LEN
popl IVP
#endif
FRAME_END
ret
SYM_FUNC_END(aesni_cts_cbc_enc)
/*
* void aesni_cts_cbc_dec(struct crypto_aes_ctx *ctx, const u8 *dst, u8 *src,
* size_t len, u8 *iv)
*/
SYM_FUNC_START(aesni_cts_cbc_dec)
FRAME_BEGIN
#ifndef __x86_64__
pushl IVP
pushl LEN
pushl KEYP
pushl KLEN
movl (FRAME_OFFSET+20)(%esp), KEYP # ctx
movl (FRAME_OFFSET+24)(%esp), OUTP # dst
movl (FRAME_OFFSET+28)(%esp), INP # src
movl (FRAME_OFFSET+32)(%esp), LEN # len
movl (FRAME_OFFSET+36)(%esp), IVP # iv
lea .Lcts_permute_table, T1
#else
lea .Lcts_permute_table(%rip), T1
#endif
mov 480(KEYP), KLEN
add $240, KEYP
movups (IVP), IV
sub $16, LEN
mov T1, IVP
add $32, IVP
add LEN, T1
sub LEN, IVP
movups (T1), %xmm4
movups (INP), STATE
add LEN, INP
movups (INP), IN1
call _aesni_dec1
movaps STATE, IN2
pshufb %xmm4, STATE
pxor IN1, STATE
add OUTP, LEN
movups STATE, (LEN)
movups (IVP), %xmm0
pshufb %xmm0, IN1
pblendvb IN2, IN1
movaps IN1, STATE
call _aesni_dec1
pxor IV, STATE
movups STATE, (OUTP)
#ifndef __x86_64__
popl KLEN
popl KEYP
popl LEN
popl IVP
#endif
FRAME_END
ret
SYM_FUNC_END(aesni_cts_cbc_dec)
.pushsection .rodata
.align 16
.Lcts_permute_table:
.byte 0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80
.byte 0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80
.byte 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07
.byte 0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f
.byte 0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80
.byte 0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80
#ifdef __x86_64__
.Lbswap_mask:
.byte 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0
#endif
.popsection
#ifdef __x86_64__
/*
* _aesni_inc_init: internal ABI
* setup registers used by _aesni_inc
@ -2696,6 +2819,14 @@ SYM_FUNC_START(aesni_ctr_enc)
ret
SYM_FUNC_END(aesni_ctr_enc)
#endif
.section .rodata.cst16.gf128mul_x_ble_mask, "aM", @progbits, 16
.align 16
.Lgf128mul_x_ble_mask:
.octa 0x00000000000000010000000000000087
.previous
/*
* _aesni_gf128mul_x_ble: internal ABI
* Multiply in GF(2^128) for XTS IVs
@ -2708,120 +2839,325 @@ SYM_FUNC_END(aesni_ctr_enc)
* CTR: == temporary value
*/
#define _aesni_gf128mul_x_ble() \
pshufd $0x13, IV, CTR; \
pshufd $0x13, IV, KEY; \
paddq IV, IV; \
psrad $31, CTR; \
pand GF128MUL_MASK, CTR; \
pxor CTR, IV;
psrad $31, KEY; \
pand GF128MUL_MASK, KEY; \
pxor KEY, IV;
/*
* void aesni_xts_crypt8(const struct crypto_aes_ctx *ctx, u8 *dst,
* const u8 *src, bool enc, le128 *iv)
* void aesni_xts_encrypt(const struct crypto_aes_ctx *ctx, u8 *dst,
* const u8 *src, unsigned int len, le128 *iv)
*/
SYM_FUNC_START(aesni_xts_crypt8)
SYM_FUNC_START(aesni_xts_encrypt)
FRAME_BEGIN
testb %cl, %cl
movl $0, %ecx
movl $240, %r10d
leaq _aesni_enc4, %r11
leaq _aesni_dec4, %rax
cmovel %r10d, %ecx
cmoveq %rax, %r11
#ifndef __x86_64__
pushl IVP
pushl LEN
pushl KEYP
pushl KLEN
movl (FRAME_OFFSET+20)(%esp), KEYP # ctx
movl (FRAME_OFFSET+24)(%esp), OUTP # dst
movl (FRAME_OFFSET+28)(%esp), INP # src
movl (FRAME_OFFSET+32)(%esp), LEN # len
movl (FRAME_OFFSET+36)(%esp), IVP # iv
movdqa .Lgf128mul_x_ble_mask, GF128MUL_MASK
#else
movdqa .Lgf128mul_x_ble_mask(%rip), GF128MUL_MASK
#endif
movups (IVP), IV
mov 480(KEYP), KLEN
addq %rcx, KEYP
.Lxts_enc_loop4:
sub $64, LEN
jl .Lxts_enc_1x
movdqa IV, STATE1
movdqu 0x00(INP), INC
pxor INC, STATE1
movdqu 0x00(INP), IN
pxor IN, STATE1
movdqu IV, 0x00(OUTP)
_aesni_gf128mul_x_ble()
movdqa IV, STATE2
movdqu 0x10(INP), INC
pxor INC, STATE2
movdqu 0x10(INP), IN
pxor IN, STATE2
movdqu IV, 0x10(OUTP)
_aesni_gf128mul_x_ble()
movdqa IV, STATE3
movdqu 0x20(INP), INC
pxor INC, STATE3
movdqu 0x20(INP), IN
pxor IN, STATE3
movdqu IV, 0x20(OUTP)
_aesni_gf128mul_x_ble()
movdqa IV, STATE4
movdqu 0x30(INP), INC
pxor INC, STATE4
movdqu 0x30(INP), IN
pxor IN, STATE4
movdqu IV, 0x30(OUTP)
CALL_NOSPEC r11
call _aesni_enc4
movdqu 0x00(OUTP), INC
pxor INC, STATE1
movdqu 0x00(OUTP), IN
pxor IN, STATE1
movdqu STATE1, 0x00(OUTP)
_aesni_gf128mul_x_ble()
movdqa IV, STATE1
movdqu 0x40(INP), INC
pxor INC, STATE1
movdqu IV, 0x40(OUTP)
movdqu 0x10(OUTP), INC
pxor INC, STATE2
movdqu 0x10(OUTP), IN
pxor IN, STATE2
movdqu STATE2, 0x10(OUTP)
_aesni_gf128mul_x_ble()
movdqa IV, STATE2
movdqu 0x50(INP), INC
pxor INC, STATE2
movdqu IV, 0x50(OUTP)
movdqu 0x20(OUTP), INC
pxor INC, STATE3
movdqu 0x20(OUTP), IN
pxor IN, STATE3
movdqu STATE3, 0x20(OUTP)
_aesni_gf128mul_x_ble()
movdqa IV, STATE3
movdqu 0x60(INP), INC
pxor INC, STATE3
movdqu IV, 0x60(OUTP)
movdqu 0x30(OUTP), INC
pxor INC, STATE4
movdqu 0x30(OUTP), IN
pxor IN, STATE4
movdqu STATE4, 0x30(OUTP)
_aesni_gf128mul_x_ble()
movdqa IV, STATE4
movdqu 0x70(INP), INC
pxor INC, STATE4
movdqu IV, 0x70(OUTP)
_aesni_gf128mul_x_ble()
add $64, INP
add $64, OUTP
test LEN, LEN
jnz .Lxts_enc_loop4
.Lxts_enc_ret_iv:
movups IV, (IVP)
CALL_NOSPEC r11
movdqu 0x40(OUTP), INC
pxor INC, STATE1
movdqu STATE1, 0x40(OUTP)
movdqu 0x50(OUTP), INC
pxor INC, STATE2
movdqu STATE2, 0x50(OUTP)
movdqu 0x60(OUTP), INC
pxor INC, STATE3
movdqu STATE3, 0x60(OUTP)
movdqu 0x70(OUTP), INC
pxor INC, STATE4
movdqu STATE4, 0x70(OUTP)
.Lxts_enc_ret:
#ifndef __x86_64__
popl KLEN
popl KEYP
popl LEN
popl IVP
#endif
FRAME_END
ret
SYM_FUNC_END(aesni_xts_crypt8)
.Lxts_enc_1x:
add $64, LEN
jz .Lxts_enc_ret_iv
sub $16, LEN
jl .Lxts_enc_cts4
.Lxts_enc_loop1:
movdqu (INP), STATE
pxor IV, STATE
call _aesni_enc1
pxor IV, STATE
_aesni_gf128mul_x_ble()
test LEN, LEN
jz .Lxts_enc_out
add $16, INP
sub $16, LEN
jl .Lxts_enc_cts1
movdqu STATE, (OUTP)
add $16, OUTP
jmp .Lxts_enc_loop1
.Lxts_enc_out:
movdqu STATE, (OUTP)
jmp .Lxts_enc_ret_iv
.Lxts_enc_cts4:
movdqa STATE4, STATE
sub $16, OUTP
.Lxts_enc_cts1:
#ifndef __x86_64__
lea .Lcts_permute_table, T1
#else
lea .Lcts_permute_table(%rip), T1
#endif
add LEN, INP /* rewind input pointer */
add $16, LEN /* # bytes in final block */
movups (INP), IN1
mov T1, IVP
add $32, IVP
add LEN, T1
sub LEN, IVP
add OUTP, LEN
movups (T1), %xmm4
movaps STATE, IN2
pshufb %xmm4, STATE
movups STATE, (LEN)
movups (IVP), %xmm0
pshufb %xmm0, IN1
pblendvb IN2, IN1
movaps IN1, STATE
pxor IV, STATE
call _aesni_enc1
pxor IV, STATE
movups STATE, (OUTP)
jmp .Lxts_enc_ret
SYM_FUNC_END(aesni_xts_encrypt)
/*
* void aesni_xts_decrypt(const struct crypto_aes_ctx *ctx, u8 *dst,
* const u8 *src, unsigned int len, le128 *iv)
*/
SYM_FUNC_START(aesni_xts_decrypt)
FRAME_BEGIN
#ifndef __x86_64__
pushl IVP
pushl LEN
pushl KEYP
pushl KLEN
movl (FRAME_OFFSET+20)(%esp), KEYP # ctx
movl (FRAME_OFFSET+24)(%esp), OUTP # dst
movl (FRAME_OFFSET+28)(%esp), INP # src
movl (FRAME_OFFSET+32)(%esp), LEN # len
movl (FRAME_OFFSET+36)(%esp), IVP # iv
movdqa .Lgf128mul_x_ble_mask, GF128MUL_MASK
#else
movdqa .Lgf128mul_x_ble_mask(%rip), GF128MUL_MASK
#endif
movups (IVP), IV
mov 480(KEYP), KLEN
add $240, KEYP
test $15, LEN
jz .Lxts_dec_loop4
sub $16, LEN
.Lxts_dec_loop4:
sub $64, LEN
jl .Lxts_dec_1x
movdqa IV, STATE1
movdqu 0x00(INP), IN
pxor IN, STATE1
movdqu IV, 0x00(OUTP)
_aesni_gf128mul_x_ble()
movdqa IV, STATE2
movdqu 0x10(INP), IN
pxor IN, STATE2
movdqu IV, 0x10(OUTP)
_aesni_gf128mul_x_ble()
movdqa IV, STATE3
movdqu 0x20(INP), IN
pxor IN, STATE3
movdqu IV, 0x20(OUTP)
_aesni_gf128mul_x_ble()
movdqa IV, STATE4
movdqu 0x30(INP), IN
pxor IN, STATE4
movdqu IV, 0x30(OUTP)
call _aesni_dec4
movdqu 0x00(OUTP), IN
pxor IN, STATE1
movdqu STATE1, 0x00(OUTP)
movdqu 0x10(OUTP), IN
pxor IN, STATE2
movdqu STATE2, 0x10(OUTP)
movdqu 0x20(OUTP), IN
pxor IN, STATE3
movdqu STATE3, 0x20(OUTP)
movdqu 0x30(OUTP), IN
pxor IN, STATE4
movdqu STATE4, 0x30(OUTP)
_aesni_gf128mul_x_ble()
add $64, INP
add $64, OUTP
test LEN, LEN
jnz .Lxts_dec_loop4
.Lxts_dec_ret_iv:
movups IV, (IVP)
.Lxts_dec_ret:
#ifndef __x86_64__
popl KLEN
popl KEYP
popl LEN
popl IVP
#endif
FRAME_END
ret
.Lxts_dec_1x:
add $64, LEN
jz .Lxts_dec_ret_iv
.Lxts_dec_loop1:
movdqu (INP), STATE
add $16, INP
sub $16, LEN
jl .Lxts_dec_cts1
pxor IV, STATE
call _aesni_dec1
pxor IV, STATE
_aesni_gf128mul_x_ble()
test LEN, LEN
jz .Lxts_dec_out
movdqu STATE, (OUTP)
add $16, OUTP
jmp .Lxts_dec_loop1
.Lxts_dec_out:
movdqu STATE, (OUTP)
jmp .Lxts_dec_ret_iv
.Lxts_dec_cts1:
movdqa IV, STATE4
_aesni_gf128mul_x_ble()
pxor IV, STATE
call _aesni_dec1
pxor IV, STATE
#ifndef __x86_64__
lea .Lcts_permute_table, T1
#else
lea .Lcts_permute_table(%rip), T1
#endif
add LEN, INP /* rewind input pointer */
add $16, LEN /* # bytes in final block */
movups (INP), IN1
mov T1, IVP
add $32, IVP
add LEN, T1
sub LEN, IVP
add OUTP, LEN
movups (T1), %xmm4
movaps STATE, IN2
pshufb %xmm4, STATE
movups STATE, (LEN)
movups (IVP), %xmm0
pshufb %xmm0, IN1
pblendvb IN2, IN1
movaps IN1, STATE
pxor STATE4, STATE
call _aesni_dec1
pxor STATE4, STATE
movups STATE, (OUTP)
jmp .Lxts_dec_ret
SYM_FUNC_END(aesni_xts_decrypt)
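
The _aesni_gf128mul_x_ble() macro used by the new XTS routines multiplies the current tweak by x in GF(2^128), treating the 16-byte block as a little-endian value and folding the carried-out top bit back in with the reduction constant 0x87. A standalone C sketch of the same operation (hypothetical helper, not part of the patch):

	#include <stdint.h>

	/* t[0] holds the low 64 bits of the tweak, t[1] the high 64 bits. */
	static void gf128mul_x_ble(uint64_t t[2])
	{
		uint64_t carry = t[1] >> 63;		/* bit 127 about to shift out */

		t[1] = (t[1] << 1) | (t[0] >> 63);
		t[0] = (t[0] << 1) ^ (carry * 0x87);	/* reduce mod x^128 + x^7 + x^2 + x + 1 */
	}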

View File

@ -31,11 +31,10 @@
#include <crypto/internal/aead.h>
#include <crypto/internal/simd.h>
#include <crypto/internal/skcipher.h>
#include <linux/jump_label.h>
#include <linux/workqueue.h>
#include <linux/spinlock.h>
#ifdef CONFIG_X86_64
#include <asm/crypto/glue_helper.h>
#endif
#include <linux/static_call.h>
#define AESNI_ALIGN 16
@ -93,62 +92,25 @@ asmlinkage void aesni_cbc_enc(struct crypto_aes_ctx *ctx, u8 *out,
const u8 *in, unsigned int len, u8 *iv);
asmlinkage void aesni_cbc_dec(struct crypto_aes_ctx *ctx, u8 *out,
const u8 *in, unsigned int len, u8 *iv);
asmlinkage void aesni_cts_cbc_enc(struct crypto_aes_ctx *ctx, u8 *out,
const u8 *in, unsigned int len, u8 *iv);
asmlinkage void aesni_cts_cbc_dec(struct crypto_aes_ctx *ctx, u8 *out,
const u8 *in, unsigned int len, u8 *iv);
#define AVX_GEN2_OPTSIZE 640
#define AVX_GEN4_OPTSIZE 4096
asmlinkage void aesni_xts_encrypt(const struct crypto_aes_ctx *ctx, u8 *out,
const u8 *in, unsigned int len, u8 *iv);
asmlinkage void aesni_xts_decrypt(const struct crypto_aes_ctx *ctx, u8 *out,
const u8 *in, unsigned int len, u8 *iv);
#ifdef CONFIG_X86_64
static void (*aesni_ctr_enc_tfm)(struct crypto_aes_ctx *ctx, u8 *out,
const u8 *in, unsigned int len, u8 *iv);
asmlinkage void aesni_ctr_enc(struct crypto_aes_ctx *ctx, u8 *out,
const u8 *in, unsigned int len, u8 *iv);
asmlinkage void aesni_xts_crypt8(const struct crypto_aes_ctx *ctx, u8 *out,
const u8 *in, bool enc, le128 *iv);
/* asmlinkage void aesni_gcm_enc()
* void *ctx, AES Key schedule. Starts on a 16 byte boundary.
* struct gcm_context_data. May be uninitialized.
* u8 *out, Ciphertext output. Encrypt in-place is allowed.
* const u8 *in, Plaintext input
* unsigned long plaintext_len, Length of data in bytes for encryption.
* u8 *iv, Pre-counter block j0: 12 byte IV concatenated with 0x00000001.
* 16-byte aligned pointer.
* u8 *hash_subkey, the Hash sub key input. Data starts on a 16-byte boundary.
* const u8 *aad, Additional Authentication Data (AAD)
* unsigned long aad_len, Length of AAD in bytes.
* u8 *auth_tag, Authenticated Tag output.
* unsigned long auth_tag_len), Authenticated Tag Length in bytes.
* Valid values are 16 (most likely), 12 or 8.
*/
asmlinkage void aesni_gcm_enc(void *ctx,
struct gcm_context_data *gdata, u8 *out,
const u8 *in, unsigned long plaintext_len, u8 *iv,
u8 *hash_subkey, const u8 *aad, unsigned long aad_len,
u8 *auth_tag, unsigned long auth_tag_len);
/* asmlinkage void aesni_gcm_dec()
* void *ctx, AES Key schedule. Starts on a 16 byte boundary.
* struct gcm_context_data. May be uninitialized.
* u8 *out, Plaintext output. Decrypt in-place is allowed.
* const u8 *in, Ciphertext input
* unsigned long ciphertext_len, Length of data in bytes for decryption.
* u8 *iv, Pre-counter block j0: 12 byte IV concatenated with 0x00000001.
* 16-byte aligned pointer.
* u8 *hash_subkey, the Hash sub key input. Data starts on a 16-byte boundary.
* const u8 *aad, Additional Authentication Data (AAD)
* unsigned long aad_len, Length of AAD in bytes. With RFC4106 this is going
* to be 8 or 12 bytes
* u8 *auth_tag, Authenticated Tag output.
* unsigned long auth_tag_len) Authenticated Tag Length in bytes.
* Valid values are 16 (most likely), 12 or 8.
*/
asmlinkage void aesni_gcm_dec(void *ctx,
struct gcm_context_data *gdata, u8 *out,
const u8 *in, unsigned long ciphertext_len, u8 *iv,
u8 *hash_subkey, const u8 *aad, unsigned long aad_len,
u8 *auth_tag, unsigned long auth_tag_len);
DEFINE_STATIC_CALL(aesni_ctr_enc_tfm, aesni_ctr_enc);
/* Scatter / Gather routines, with args similar to above */
asmlinkage void aesni_gcm_init(void *ctx,
@ -167,24 +129,6 @@ asmlinkage void aesni_gcm_finalize(void *ctx,
struct gcm_context_data *gdata,
u8 *auth_tag, unsigned long auth_tag_len);
static const struct aesni_gcm_tfm_s {
void (*init)(void *ctx, struct gcm_context_data *gdata, u8 *iv,
u8 *hash_subkey, const u8 *aad, unsigned long aad_len);
void (*enc_update)(void *ctx, struct gcm_context_data *gdata, u8 *out,
const u8 *in, unsigned long plaintext_len);
void (*dec_update)(void *ctx, struct gcm_context_data *gdata, u8 *out,
const u8 *in, unsigned long ciphertext_len);
void (*finalize)(void *ctx, struct gcm_context_data *gdata,
u8 *auth_tag, unsigned long auth_tag_len);
} *aesni_gcm_tfm;
static const struct aesni_gcm_tfm_s aesni_gcm_tfm_sse = {
.init = &aesni_gcm_init,
.enc_update = &aesni_gcm_enc_update,
.dec_update = &aesni_gcm_dec_update,
.finalize = &aesni_gcm_finalize,
};
asmlinkage void aes_ctr_enc_128_avx_by8(const u8 *in, u8 *iv,
void *keys, u8 *out, unsigned int num_bytes);
asmlinkage void aes_ctr_enc_192_avx_by8(const u8 *in, u8 *iv,
@ -214,25 +158,6 @@ asmlinkage void aesni_gcm_finalize_avx_gen2(void *ctx,
struct gcm_context_data *gdata,
u8 *auth_tag, unsigned long auth_tag_len);
asmlinkage void aesni_gcm_enc_avx_gen2(void *ctx,
struct gcm_context_data *gdata, u8 *out,
const u8 *in, unsigned long plaintext_len, u8 *iv,
const u8 *aad, unsigned long aad_len,
u8 *auth_tag, unsigned long auth_tag_len);
asmlinkage void aesni_gcm_dec_avx_gen2(void *ctx,
struct gcm_context_data *gdata, u8 *out,
const u8 *in, unsigned long ciphertext_len, u8 *iv,
const u8 *aad, unsigned long aad_len,
u8 *auth_tag, unsigned long auth_tag_len);
static const struct aesni_gcm_tfm_s aesni_gcm_tfm_avx_gen2 = {
.init = &aesni_gcm_init_avx_gen2,
.enc_update = &aesni_gcm_enc_update_avx_gen2,
.dec_update = &aesni_gcm_dec_update_avx_gen2,
.finalize = &aesni_gcm_finalize_avx_gen2,
};
/*
* asmlinkage void aesni_gcm_init_avx_gen4()
* gcm_data *my_ctx_data, context data
@ -256,24 +181,8 @@ asmlinkage void aesni_gcm_finalize_avx_gen4(void *ctx,
struct gcm_context_data *gdata,
u8 *auth_tag, unsigned long auth_tag_len);
asmlinkage void aesni_gcm_enc_avx_gen4(void *ctx,
struct gcm_context_data *gdata, u8 *out,
const u8 *in, unsigned long plaintext_len, u8 *iv,
const u8 *aad, unsigned long aad_len,
u8 *auth_tag, unsigned long auth_tag_len);
asmlinkage void aesni_gcm_dec_avx_gen4(void *ctx,
struct gcm_context_data *gdata, u8 *out,
const u8 *in, unsigned long ciphertext_len, u8 *iv,
const u8 *aad, unsigned long aad_len,
u8 *auth_tag, unsigned long auth_tag_len);
static const struct aesni_gcm_tfm_s aesni_gcm_tfm_avx_gen4 = {
.init = &aesni_gcm_init_avx_gen4,
.enc_update = &aesni_gcm_enc_update_avx_gen4,
.dec_update = &aesni_gcm_dec_update_avx_gen4,
.finalize = &aesni_gcm_finalize_avx_gen4,
};
static __ro_after_init DEFINE_STATIC_KEY_FALSE(gcm_use_avx);
static __ro_after_init DEFINE_STATIC_KEY_FALSE(gcm_use_avx2);
static inline struct
aesni_rfc4106_gcm_ctx *aesni_rfc4106_gcm_ctx_get(struct crypto_aead *tfm)
@ -374,16 +283,16 @@ static int ecb_encrypt(struct skcipher_request *req)
unsigned int nbytes;
int err;
err = skcipher_walk_virt(&walk, req, true);
err = skcipher_walk_virt(&walk, req, false);
kernel_fpu_begin();
while ((nbytes = walk.nbytes)) {
kernel_fpu_begin();
aesni_ecb_enc(ctx, walk.dst.virt.addr, walk.src.virt.addr,
nbytes & AES_BLOCK_MASK);
kernel_fpu_end();
nbytes &= AES_BLOCK_SIZE - 1;
err = skcipher_walk_done(&walk, nbytes);
}
kernel_fpu_end();
return err;
}
@ -396,16 +305,16 @@ static int ecb_decrypt(struct skcipher_request *req)
unsigned int nbytes;
int err;
err = skcipher_walk_virt(&walk, req, true);
err = skcipher_walk_virt(&walk, req, false);
kernel_fpu_begin();
while ((nbytes = walk.nbytes)) {
kernel_fpu_begin();
aesni_ecb_dec(ctx, walk.dst.virt.addr, walk.src.virt.addr,
nbytes & AES_BLOCK_MASK);
kernel_fpu_end();
nbytes &= AES_BLOCK_SIZE - 1;
err = skcipher_walk_done(&walk, nbytes);
}
kernel_fpu_end();
return err;
}
@ -418,16 +327,16 @@ static int cbc_encrypt(struct skcipher_request *req)
unsigned int nbytes;
int err;
err = skcipher_walk_virt(&walk, req, true);
err = skcipher_walk_virt(&walk, req, false);
kernel_fpu_begin();
while ((nbytes = walk.nbytes)) {
kernel_fpu_begin();
aesni_cbc_enc(ctx, walk.dst.virt.addr, walk.src.virt.addr,
nbytes & AES_BLOCK_MASK, walk.iv);
kernel_fpu_end();
nbytes &= AES_BLOCK_SIZE - 1;
err = skcipher_walk_done(&walk, nbytes);
}
kernel_fpu_end();
return err;
}
@ -440,36 +349,133 @@ static int cbc_decrypt(struct skcipher_request *req)
unsigned int nbytes;
int err;
err = skcipher_walk_virt(&walk, req, true);
err = skcipher_walk_virt(&walk, req, false);
kernel_fpu_begin();
while ((nbytes = walk.nbytes)) {
kernel_fpu_begin();
aesni_cbc_dec(ctx, walk.dst.virt.addr, walk.src.virt.addr,
nbytes & AES_BLOCK_MASK, walk.iv);
kernel_fpu_end();
nbytes &= AES_BLOCK_SIZE - 1;
err = skcipher_walk_done(&walk, nbytes);
}
kernel_fpu_end();
return err;
}
#ifdef CONFIG_X86_64
static void ctr_crypt_final(struct crypto_aes_ctx *ctx,
struct skcipher_walk *walk)
static int cts_cbc_encrypt(struct skcipher_request *req)
{
u8 *ctrblk = walk->iv;
u8 keystream[AES_BLOCK_SIZE];
u8 *src = walk->src.virt.addr;
u8 *dst = walk->dst.virt.addr;
unsigned int nbytes = walk->nbytes;
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct crypto_aes_ctx *ctx = aes_ctx(crypto_skcipher_ctx(tfm));
int cbc_blocks = DIV_ROUND_UP(req->cryptlen, AES_BLOCK_SIZE) - 2;
struct scatterlist *src = req->src, *dst = req->dst;
struct scatterlist sg_src[2], sg_dst[2];
struct skcipher_request subreq;
struct skcipher_walk walk;
int err;
aesni_enc(ctx, keystream, ctrblk);
crypto_xor_cpy(dst, keystream, src, nbytes);
skcipher_request_set_tfm(&subreq, tfm);
skcipher_request_set_callback(&subreq, skcipher_request_flags(req),
NULL, NULL);
crypto_inc(ctrblk, AES_BLOCK_SIZE);
if (req->cryptlen <= AES_BLOCK_SIZE) {
if (req->cryptlen < AES_BLOCK_SIZE)
return -EINVAL;
cbc_blocks = 1;
}
if (cbc_blocks > 0) {
skcipher_request_set_crypt(&subreq, req->src, req->dst,
cbc_blocks * AES_BLOCK_SIZE,
req->iv);
err = cbc_encrypt(&subreq);
if (err)
return err;
if (req->cryptlen == AES_BLOCK_SIZE)
return 0;
dst = src = scatterwalk_ffwd(sg_src, req->src, subreq.cryptlen);
if (req->dst != req->src)
dst = scatterwalk_ffwd(sg_dst, req->dst,
subreq.cryptlen);
}
/* handle ciphertext stealing */
skcipher_request_set_crypt(&subreq, src, dst,
req->cryptlen - cbc_blocks * AES_BLOCK_SIZE,
req->iv);
err = skcipher_walk_virt(&walk, &subreq, false);
if (err)
return err;
kernel_fpu_begin();
aesni_cts_cbc_enc(ctx, walk.dst.virt.addr, walk.src.virt.addr,
walk.nbytes, walk.iv);
kernel_fpu_end();
return skcipher_walk_done(&walk, 0);
}
static int cts_cbc_decrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct crypto_aes_ctx *ctx = aes_ctx(crypto_skcipher_ctx(tfm));
int cbc_blocks = DIV_ROUND_UP(req->cryptlen, AES_BLOCK_SIZE) - 2;
struct scatterlist *src = req->src, *dst = req->dst;
struct scatterlist sg_src[2], sg_dst[2];
struct skcipher_request subreq;
struct skcipher_walk walk;
int err;
skcipher_request_set_tfm(&subreq, tfm);
skcipher_request_set_callback(&subreq, skcipher_request_flags(req),
NULL, NULL);
if (req->cryptlen <= AES_BLOCK_SIZE) {
if (req->cryptlen < AES_BLOCK_SIZE)
return -EINVAL;
cbc_blocks = 1;
}
if (cbc_blocks > 0) {
skcipher_request_set_crypt(&subreq, req->src, req->dst,
cbc_blocks * AES_BLOCK_SIZE,
req->iv);
err = cbc_decrypt(&subreq);
if (err)
return err;
if (req->cryptlen == AES_BLOCK_SIZE)
return 0;
dst = src = scatterwalk_ffwd(sg_src, req->src, subreq.cryptlen);
if (req->dst != req->src)
dst = scatterwalk_ffwd(sg_dst, req->dst,
subreq.cryptlen);
}
/* handle ciphertext stealing */
skcipher_request_set_crypt(&subreq, src, dst,
req->cryptlen - cbc_blocks * AES_BLOCK_SIZE,
req->iv);
err = skcipher_walk_virt(&walk, &subreq, false);
if (err)
return err;
kernel_fpu_begin();
aesni_cts_cbc_dec(ctx, walk.dst.virt.addr, walk.src.virt.addr,
walk.nbytes, walk.iv);
kernel_fpu_end();
return skcipher_walk_done(&walk, 0);
}
#ifdef CONFIG_X86_64
static void aesni_ctr_enc_avx_tfm(struct crypto_aes_ctx *ctx, u8 *out,
const u8 *in, unsigned int len, u8 *iv)
{
@ -491,120 +497,36 @@ static int ctr_crypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct crypto_aes_ctx *ctx = aes_ctx(crypto_skcipher_ctx(tfm));
u8 keystream[AES_BLOCK_SIZE];
struct skcipher_walk walk;
unsigned int nbytes;
int err;
err = skcipher_walk_virt(&walk, req, true);
err = skcipher_walk_virt(&walk, req, false);
while ((nbytes = walk.nbytes) > 0) {
kernel_fpu_begin();
while ((nbytes = walk.nbytes) >= AES_BLOCK_SIZE) {
aesni_ctr_enc_tfm(ctx, walk.dst.virt.addr, walk.src.virt.addr,
nbytes & AES_BLOCK_MASK, walk.iv);
nbytes &= AES_BLOCK_SIZE - 1;
err = skcipher_walk_done(&walk, nbytes);
}
if (walk.nbytes) {
ctr_crypt_final(ctx, &walk);
err = skcipher_walk_done(&walk, 0);
if (nbytes & AES_BLOCK_MASK)
static_call(aesni_ctr_enc_tfm)(ctx, walk.dst.virt.addr,
walk.src.virt.addr,
nbytes & AES_BLOCK_MASK,
walk.iv);
nbytes &= ~AES_BLOCK_MASK;
if (walk.nbytes == walk.total && nbytes > 0) {
aesni_enc(ctx, keystream, walk.iv);
crypto_xor_cpy(walk.dst.virt.addr + walk.nbytes - nbytes,
walk.src.virt.addr + walk.nbytes - nbytes,
keystream, nbytes);
crypto_inc(walk.iv, AES_BLOCK_SIZE);
nbytes = 0;
}
kernel_fpu_end();
err = skcipher_walk_done(&walk, nbytes);
}
return err;
}
static int xts_aesni_setkey(struct crypto_skcipher *tfm, const u8 *key,
unsigned int keylen)
{
struct aesni_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
int err;
err = xts_verify_key(tfm, key, keylen);
if (err)
return err;
keylen /= 2;
/* first half of xts-key is for crypt */
err = aes_set_key_common(crypto_skcipher_tfm(tfm), ctx->raw_crypt_ctx,
key, keylen);
if (err)
return err;
/* second half of xts-key is for tweak */
return aes_set_key_common(crypto_skcipher_tfm(tfm), ctx->raw_tweak_ctx,
key + keylen, keylen);
}
static void aesni_xts_enc(const void *ctx, u8 *dst, const u8 *src, le128 *iv)
{
glue_xts_crypt_128bit_one(ctx, dst, src, iv, aesni_enc);
}
static void aesni_xts_dec(const void *ctx, u8 *dst, const u8 *src, le128 *iv)
{
glue_xts_crypt_128bit_one(ctx, dst, src, iv, aesni_dec);
}
static void aesni_xts_enc8(const void *ctx, u8 *dst, const u8 *src, le128 *iv)
{
aesni_xts_crypt8(ctx, dst, src, true, iv);
}
static void aesni_xts_dec8(const void *ctx, u8 *dst, const u8 *src, le128 *iv)
{
aesni_xts_crypt8(ctx, dst, src, false, iv);
}
static const struct common_glue_ctx aesni_enc_xts = {
.num_funcs = 2,
.fpu_blocks_limit = 1,
.funcs = { {
.num_blocks = 8,
.fn_u = { .xts = aesni_xts_enc8 }
}, {
.num_blocks = 1,
.fn_u = { .xts = aesni_xts_enc }
} }
};
static const struct common_glue_ctx aesni_dec_xts = {
.num_funcs = 2,
.fpu_blocks_limit = 1,
.funcs = { {
.num_blocks = 8,
.fn_u = { .xts = aesni_xts_dec8 }
}, {
.num_blocks = 1,
.fn_u = { .xts = aesni_xts_dec }
} }
};
static int xts_encrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct aesni_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
return glue_xts_req_128bit(&aesni_enc_xts, req, aesni_enc,
aes_ctx(ctx->raw_tweak_ctx),
aes_ctx(ctx->raw_crypt_ctx),
false);
}
static int xts_decrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct aesni_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
return glue_xts_req_128bit(&aesni_dec_xts, req, aesni_enc,
aes_ctx(ctx->raw_tweak_ctx),
aes_ctx(ctx->raw_crypt_ctx),
true);
}
static int
rfc4106_set_hash_subkey(u8 *hash_subkey, const u8 *key, unsigned int key_len)
{
@ -681,42 +603,35 @@ static int generic_gcmaes_set_authsize(struct crypto_aead *tfm,
static int gcmaes_crypt_by_sg(bool enc, struct aead_request *req,
unsigned int assoclen, u8 *hash_subkey,
u8 *iv, void *aes_ctx)
u8 *iv, void *aes_ctx, u8 *auth_tag,
unsigned long auth_tag_len)
{
struct crypto_aead *tfm = crypto_aead_reqtfm(req);
unsigned long auth_tag_len = crypto_aead_authsize(tfm);
const struct aesni_gcm_tfm_s *gcm_tfm = aesni_gcm_tfm;
struct gcm_context_data data AESNI_ALIGN_ATTR;
struct scatter_walk dst_sg_walk = {};
u8 databuf[sizeof(struct gcm_context_data) + (AESNI_ALIGN - 8)] __aligned(8);
struct gcm_context_data *data = PTR_ALIGN((void *)databuf, AESNI_ALIGN);
unsigned long left = req->cryptlen;
unsigned long len, srclen, dstlen;
struct scatter_walk assoc_sg_walk;
struct scatter_walk src_sg_walk;
struct scatterlist src_start[2];
struct scatterlist dst_start[2];
struct scatterlist *src_sg;
struct scatterlist *dst_sg;
u8 *src, *dst, *assoc;
struct skcipher_walk walk;
bool do_avx, do_avx2;
u8 *assocmem = NULL;
u8 authTag[16];
u8 *assoc;
int err;
if (!enc)
left -= auth_tag_len;
if (left < AVX_GEN4_OPTSIZE && gcm_tfm == &aesni_gcm_tfm_avx_gen4)
gcm_tfm = &aesni_gcm_tfm_avx_gen2;
if (left < AVX_GEN2_OPTSIZE && gcm_tfm == &aesni_gcm_tfm_avx_gen2)
gcm_tfm = &aesni_gcm_tfm_sse;
do_avx = (left >= AVX_GEN2_OPTSIZE);
do_avx2 = (left >= AVX_GEN4_OPTSIZE);
/* Linearize assoc, if not already linear */
if (req->src->length >= assoclen && req->src->length &&
(!PageHighMem(sg_page(req->src)) ||
req->src->offset + req->src->length <= PAGE_SIZE)) {
if (req->src->length >= assoclen && req->src->length) {
scatterwalk_start(&assoc_sg_walk, req->src);
assoc = scatterwalk_map(&assoc_sg_walk);
} else {
gfp_t flags = (req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP) ?
GFP_KERNEL : GFP_ATOMIC;
/* assoc can be any length, so must be on heap */
assocmem = kmalloc(assoclen, GFP_ATOMIC);
assocmem = kmalloc(assoclen, flags);
if (unlikely(!assocmem))
return -ENOMEM;
assoc = assocmem;
@ -724,62 +639,15 @@ static int gcmaes_crypt_by_sg(bool enc, struct aead_request *req,
scatterwalk_map_and_copy(assoc, req->src, 0, assoclen, 0);
}
if (left) {
src_sg = scatterwalk_ffwd(src_start, req->src, req->assoclen);
scatterwalk_start(&src_sg_walk, src_sg);
if (req->src != req->dst) {
dst_sg = scatterwalk_ffwd(dst_start, req->dst,
req->assoclen);
scatterwalk_start(&dst_sg_walk, dst_sg);
}
}
kernel_fpu_begin();
gcm_tfm->init(aes_ctx, &data, iv,
hash_subkey, assoc, assoclen);
if (req->src != req->dst) {
while (left) {
src = scatterwalk_map(&src_sg_walk);
dst = scatterwalk_map(&dst_sg_walk);
srclen = scatterwalk_clamp(&src_sg_walk, left);
dstlen = scatterwalk_clamp(&dst_sg_walk, left);
len = min(srclen, dstlen);
if (len) {
if (enc)
gcm_tfm->enc_update(aes_ctx, &data,
dst, src, len);
if (static_branch_likely(&gcm_use_avx2) && do_avx2)
aesni_gcm_init_avx_gen4(aes_ctx, data, iv, hash_subkey, assoc,
assoclen);
else if (static_branch_likely(&gcm_use_avx) && do_avx)
aesni_gcm_init_avx_gen2(aes_ctx, data, iv, hash_subkey, assoc,
assoclen);
else
gcm_tfm->dec_update(aes_ctx, &data,
dst, src, len);
}
left -= len;
scatterwalk_unmap(src);
scatterwalk_unmap(dst);
scatterwalk_advance(&src_sg_walk, len);
scatterwalk_advance(&dst_sg_walk, len);
scatterwalk_done(&src_sg_walk, 0, left);
scatterwalk_done(&dst_sg_walk, 1, left);
}
} else {
while (left) {
dst = src = scatterwalk_map(&src_sg_walk);
len = scatterwalk_clamp(&src_sg_walk, left);
if (len) {
if (enc)
gcm_tfm->enc_update(aes_ctx, &data,
src, src, len);
else
gcm_tfm->dec_update(aes_ctx, &data,
src, src, len);
}
left -= len;
scatterwalk_unmap(src);
scatterwalk_advance(&src_sg_walk, len);
scatterwalk_done(&src_sg_walk, 1, left);
}
}
gcm_tfm->finalize(aes_ctx, &data, authTag, auth_tag_len);
aesni_gcm_init(aes_ctx, data, iv, hash_subkey, assoc, assoclen);
kernel_fpu_end();
if (!assocmem)
@ -787,24 +655,58 @@ static int gcmaes_crypt_by_sg(bool enc, struct aead_request *req,
else
kfree(assocmem);
if (!enc) {
u8 authTagMsg[16];
err = enc ? skcipher_walk_aead_encrypt(&walk, req, false)
: skcipher_walk_aead_decrypt(&walk, req, false);
/* Copy out original authTag */
scatterwalk_map_and_copy(authTagMsg, req->src,
req->assoclen + req->cryptlen -
auth_tag_len,
auth_tag_len, 0);
while (walk.nbytes > 0) {
kernel_fpu_begin();
if (static_branch_likely(&gcm_use_avx2) && do_avx2) {
if (enc)
aesni_gcm_enc_update_avx_gen4(aes_ctx, data,
walk.dst.virt.addr,
walk.src.virt.addr,
walk.nbytes);
else
aesni_gcm_dec_update_avx_gen4(aes_ctx, data,
walk.dst.virt.addr,
walk.src.virt.addr,
walk.nbytes);
} else if (static_branch_likely(&gcm_use_avx) && do_avx) {
if (enc)
aesni_gcm_enc_update_avx_gen2(aes_ctx, data,
walk.dst.virt.addr,
walk.src.virt.addr,
walk.nbytes);
else
aesni_gcm_dec_update_avx_gen2(aes_ctx, data,
walk.dst.virt.addr,
walk.src.virt.addr,
walk.nbytes);
} else if (enc) {
aesni_gcm_enc_update(aes_ctx, data, walk.dst.virt.addr,
walk.src.virt.addr, walk.nbytes);
} else {
aesni_gcm_dec_update(aes_ctx, data, walk.dst.virt.addr,
walk.src.virt.addr, walk.nbytes);
}
kernel_fpu_end();
/* Compare generated tag with passed in tag. */
return crypto_memneq(authTagMsg, authTag, auth_tag_len) ?
-EBADMSG : 0;
err = skcipher_walk_done(&walk, 0);
}
/* Copy in the authTag */
scatterwalk_map_and_copy(authTag, req->dst,
req->assoclen + req->cryptlen,
auth_tag_len, 1);
if (err)
return err;
kernel_fpu_begin();
if (static_branch_likely(&gcm_use_avx2) && do_avx2)
aesni_gcm_finalize_avx_gen4(aes_ctx, data, auth_tag,
auth_tag_len);
else if (static_branch_likely(&gcm_use_avx) && do_avx)
aesni_gcm_finalize_avx_gen2(aes_ctx, data, auth_tag,
auth_tag_len);
else
aesni_gcm_finalize(aes_ctx, data, auth_tag, auth_tag_len);
kernel_fpu_end();
return 0;
}
@ -812,15 +714,47 @@ static int gcmaes_crypt_by_sg(bool enc, struct aead_request *req,
static int gcmaes_encrypt(struct aead_request *req, unsigned int assoclen,
u8 *hash_subkey, u8 *iv, void *aes_ctx)
{
return gcmaes_crypt_by_sg(true, req, assoclen, hash_subkey, iv,
aes_ctx);
struct crypto_aead *tfm = crypto_aead_reqtfm(req);
unsigned long auth_tag_len = crypto_aead_authsize(tfm);
u8 auth_tag[16];
int err;
err = gcmaes_crypt_by_sg(true, req, assoclen, hash_subkey, iv, aes_ctx,
auth_tag, auth_tag_len);
if (err)
return err;
scatterwalk_map_and_copy(auth_tag, req->dst,
req->assoclen + req->cryptlen,
auth_tag_len, 1);
return 0;
}
static int gcmaes_decrypt(struct aead_request *req, unsigned int assoclen,
u8 *hash_subkey, u8 *iv, void *aes_ctx)
{
return gcmaes_crypt_by_sg(false, req, assoclen, hash_subkey, iv,
aes_ctx);
struct crypto_aead *tfm = crypto_aead_reqtfm(req);
unsigned long auth_tag_len = crypto_aead_authsize(tfm);
u8 auth_tag_msg[16];
u8 auth_tag[16];
int err;
err = gcmaes_crypt_by_sg(false, req, assoclen, hash_subkey, iv, aes_ctx,
auth_tag, auth_tag_len);
if (err)
return err;
/* Copy out original auth_tag */
scatterwalk_map_and_copy(auth_tag_msg, req->src,
req->assoclen + req->cryptlen - auth_tag_len,
auth_tag_len, 0);
/* Compare generated tag with passed in tag. */
if (crypto_memneq(auth_tag_msg, auth_tag, auth_tag_len)) {
memzero_explicit(auth_tag, sizeof(auth_tag));
return -EBADMSG;
}
return 0;
}
static int helper_rfc4106_encrypt(struct aead_request *req)
@ -828,7 +762,8 @@ static int helper_rfc4106_encrypt(struct aead_request *req)
struct crypto_aead *tfm = crypto_aead_reqtfm(req);
struct aesni_rfc4106_gcm_ctx *ctx = aesni_rfc4106_gcm_ctx_get(tfm);
void *aes_ctx = &(ctx->aes_key_expanded);
u8 iv[16] __attribute__ ((__aligned__(AESNI_ALIGN)));
u8 ivbuf[16 + (AESNI_ALIGN - 8)] __aligned(8);
u8 *iv = PTR_ALIGN(&ivbuf[0], AESNI_ALIGN);
unsigned int i;
__be32 counter = cpu_to_be32(1);
@ -855,7 +790,8 @@ static int helper_rfc4106_decrypt(struct aead_request *req)
struct crypto_aead *tfm = crypto_aead_reqtfm(req);
struct aesni_rfc4106_gcm_ctx *ctx = aesni_rfc4106_gcm_ctx_get(tfm);
void *aes_ctx = &(ctx->aes_key_expanded);
u8 iv[16] __attribute__ ((__aligned__(AESNI_ALIGN)));
u8 ivbuf[16 + (AESNI_ALIGN - 8)] __aligned(8);
u8 *iv = PTR_ALIGN(&ivbuf[0], AESNI_ALIGN);
unsigned int i;
if (unlikely(req->assoclen != 16 && req->assoclen != 20))
@ -877,6 +813,128 @@ static int helper_rfc4106_decrypt(struct aead_request *req)
}
#endif
static int xts_aesni_setkey(struct crypto_skcipher *tfm, const u8 *key,
unsigned int keylen)
{
struct aesni_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
int err;
err = xts_verify_key(tfm, key, keylen);
if (err)
return err;
keylen /= 2;
/* first half of xts-key is for crypt */
err = aes_set_key_common(crypto_skcipher_tfm(tfm), ctx->raw_crypt_ctx,
key, keylen);
if (err)
return err;
/* second half of xts-key is for tweak */
return aes_set_key_common(crypto_skcipher_tfm(tfm), ctx->raw_tweak_ctx,
key + keylen, keylen);
}
static int xts_crypt(struct skcipher_request *req, bool encrypt)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct aesni_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
int tail = req->cryptlen % AES_BLOCK_SIZE;
struct skcipher_request subreq;
struct skcipher_walk walk;
int err;
if (req->cryptlen < AES_BLOCK_SIZE)
return -EINVAL;
err = skcipher_walk_virt(&walk, req, false);
if (unlikely(tail > 0 && walk.nbytes < walk.total)) {
int blocks = DIV_ROUND_UP(req->cryptlen, AES_BLOCK_SIZE) - 2;
skcipher_walk_abort(&walk);
skcipher_request_set_tfm(&subreq, tfm);
skcipher_request_set_callback(&subreq,
skcipher_request_flags(req),
NULL, NULL);
skcipher_request_set_crypt(&subreq, req->src, req->dst,
blocks * AES_BLOCK_SIZE, req->iv);
req = &subreq;
err = skcipher_walk_virt(&walk, req, false);
} else {
tail = 0;
}
kernel_fpu_begin();
/* calculate first value of T */
aesni_enc(aes_ctx(ctx->raw_tweak_ctx), walk.iv, walk.iv);
while (walk.nbytes > 0) {
int nbytes = walk.nbytes;
if (nbytes < walk.total)
nbytes &= ~(AES_BLOCK_SIZE - 1);
if (encrypt)
aesni_xts_encrypt(aes_ctx(ctx->raw_crypt_ctx),
walk.dst.virt.addr, walk.src.virt.addr,
nbytes, walk.iv);
else
aesni_xts_decrypt(aes_ctx(ctx->raw_crypt_ctx),
walk.dst.virt.addr, walk.src.virt.addr,
nbytes, walk.iv);
kernel_fpu_end();
err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
if (walk.nbytes > 0)
kernel_fpu_begin();
}
if (unlikely(tail > 0 && !err)) {
struct scatterlist sg_src[2], sg_dst[2];
struct scatterlist *src, *dst;
dst = src = scatterwalk_ffwd(sg_src, req->src, req->cryptlen);
if (req->dst != req->src)
dst = scatterwalk_ffwd(sg_dst, req->dst, req->cryptlen);
skcipher_request_set_crypt(req, src, dst, AES_BLOCK_SIZE + tail,
req->iv);
err = skcipher_walk_virt(&walk, &subreq, false);
if (err)
return err;
kernel_fpu_begin();
if (encrypt)
aesni_xts_encrypt(aes_ctx(ctx->raw_crypt_ctx),
walk.dst.virt.addr, walk.src.virt.addr,
walk.nbytes, walk.iv);
else
aesni_xts_decrypt(aes_ctx(ctx->raw_crypt_ctx),
walk.dst.virt.addr, walk.src.virt.addr,
walk.nbytes, walk.iv);
kernel_fpu_end();
err = skcipher_walk_done(&walk, 0);
}
return err;
}
static int xts_encrypt(struct skcipher_request *req)
{
return xts_crypt(req, true);
}
static int xts_decrypt(struct skcipher_request *req)
{
return xts_crypt(req, false);
}
static struct crypto_alg aesni_cipher_alg = {
.cra_name = "aes",
.cra_driver_name = "aes-aesni",
@ -928,6 +986,23 @@ static struct skcipher_alg aesni_skciphers[] = {
.setkey = aesni_skcipher_setkey,
.encrypt = cbc_encrypt,
.decrypt = cbc_decrypt,
}, {
.base = {
.cra_name = "__cts(cbc(aes))",
.cra_driver_name = "__cts-cbc-aes-aesni",
.cra_priority = 400,
.cra_flags = CRYPTO_ALG_INTERNAL,
.cra_blocksize = AES_BLOCK_SIZE,
.cra_ctxsize = CRYPTO_AES_CTX_SIZE,
.cra_module = THIS_MODULE,
},
.min_keysize = AES_MIN_KEY_SIZE,
.max_keysize = AES_MAX_KEY_SIZE,
.ivsize = AES_BLOCK_SIZE,
.walksize = 2 * AES_BLOCK_SIZE,
.setkey = aesni_skcipher_setkey,
.encrypt = cts_cbc_encrypt,
.decrypt = cts_cbc_decrypt,
#ifdef CONFIG_X86_64
}, {
.base = {
@ -946,6 +1021,7 @@ static struct skcipher_alg aesni_skciphers[] = {
.setkey = aesni_skcipher_setkey,
.encrypt = ctr_crypt,
.decrypt = ctr_crypt,
#endif
}, {
.base = {
.cra_name = "__xts(aes)",
@ -959,10 +1035,10 @@ static struct skcipher_alg aesni_skciphers[] = {
.min_keysize = 2 * AES_MIN_KEY_SIZE,
.max_keysize = 2 * AES_MAX_KEY_SIZE,
.ivsize = AES_BLOCK_SIZE,
.walksize = 2 * AES_BLOCK_SIZE,
.setkey = xts_aesni_setkey,
.encrypt = xts_encrypt,
.decrypt = xts_decrypt,
#endif
}
};
@ -985,7 +1061,8 @@ static int generic_gcmaes_encrypt(struct aead_request *req)
struct crypto_aead *tfm = crypto_aead_reqtfm(req);
struct generic_gcmaes_ctx *ctx = generic_gcmaes_ctx_get(tfm);
void *aes_ctx = &(ctx->aes_key_expanded);
u8 iv[16] __attribute__ ((__aligned__(AESNI_ALIGN)));
u8 ivbuf[16 + (AESNI_ALIGN - 8)] __aligned(8);
u8 *iv = PTR_ALIGN(&ivbuf[0], AESNI_ALIGN);
__be32 counter = cpu_to_be32(1);
memcpy(iv, req->iv, 12);
@ -1001,7 +1078,8 @@ static int generic_gcmaes_decrypt(struct aead_request *req)
struct crypto_aead *tfm = crypto_aead_reqtfm(req);
struct generic_gcmaes_ctx *ctx = generic_gcmaes_ctx_get(tfm);
void *aes_ctx = &(ctx->aes_key_expanded);
u8 iv[16] __attribute__ ((__aligned__(AESNI_ALIGN)));
u8 ivbuf[16 + (AESNI_ALIGN - 8)] __aligned(8);
u8 *iv = PTR_ALIGN(&ivbuf[0], AESNI_ALIGN);
memcpy(iv, req->iv, 12);
*((__be32 *)(iv+12)) = counter;
@ -1066,19 +1144,18 @@ static int __init aesni_init(void)
#ifdef CONFIG_X86_64
if (boot_cpu_has(X86_FEATURE_AVX2)) {
pr_info("AVX2 version of gcm_enc/dec engaged.\n");
aesni_gcm_tfm = &aesni_gcm_tfm_avx_gen4;
static_branch_enable(&gcm_use_avx);
static_branch_enable(&gcm_use_avx2);
} else
if (boot_cpu_has(X86_FEATURE_AVX)) {
pr_info("AVX version of gcm_enc/dec engaged.\n");
aesni_gcm_tfm = &aesni_gcm_tfm_avx_gen2;
static_branch_enable(&gcm_use_avx);
} else {
pr_info("SSE version of gcm_enc/dec engaged.\n");
aesni_gcm_tfm = &aesni_gcm_tfm_sse;
}
aesni_ctr_enc_tfm = aesni_ctr_enc;
if (boot_cpu_has(X86_FEATURE_AVX)) {
/* optimize performance of ctr mode encryption transform */
aesni_ctr_enc_tfm = aesni_ctr_enc_avx_tfm;
static_call_update(aesni_ctr_enc_tfm, aesni_ctr_enc_avx_tfm);
pr_info("AES CTR mode by8 optimization enabled\n");
}
#endif
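
cts_cbc_encrypt() and cts_cbc_decrypt() above split a request into a plain CBC pass over everything but the last two blocks, then hand the remaining one to two blocks to the ciphertext-stealing assembly. A small worked example of that split (illustrative only, with a local DIV_ROUND_UP definition):

	#include <stdio.h>

	#define AES_BLOCK_SIZE	16
	#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))

	int main(void)
	{
		unsigned int cryptlen = 36;	/* example request length in bytes */
		int cbc_blocks = DIV_ROUND_UP(cryptlen, AES_BLOCK_SIZE) - 2;
		unsigned int cts_len = cryptlen - cbc_blocks * AES_BLOCK_SIZE;

		/* prints: CBC part: 1 blocks, CTS part: 20 bytes */
		printf("CBC part: %d blocks, CTS part: %u bytes\n", cbc_blocks, cts_len);
		return 0;
	}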

View File

@ -58,138 +58,40 @@ void blake2s_compress_arch(struct blake2s_state *state,
}
EXPORT_SYMBOL(blake2s_compress_arch);
static int crypto_blake2s_setkey(struct crypto_shash *tfm, const u8 *key,
unsigned int keylen)
static int crypto_blake2s_update_x86(struct shash_desc *desc,
const u8 *in, unsigned int inlen)
{
struct blake2s_tfm_ctx *tctx = crypto_shash_ctx(tfm);
if (keylen == 0 || keylen > BLAKE2S_KEY_SIZE)
return -EINVAL;
memcpy(tctx->key, key, keylen);
tctx->keylen = keylen;
return 0;
return crypto_blake2s_update(desc, in, inlen, blake2s_compress_arch);
}
static int crypto_blake2s_init(struct shash_desc *desc)
static int crypto_blake2s_final_x86(struct shash_desc *desc, u8 *out)
{
struct blake2s_tfm_ctx *tctx = crypto_shash_ctx(desc->tfm);
struct blake2s_state *state = shash_desc_ctx(desc);
const int outlen = crypto_shash_digestsize(desc->tfm);
if (tctx->keylen)
blake2s_init_key(state, outlen, tctx->key, tctx->keylen);
else
blake2s_init(state, outlen);
return 0;
return crypto_blake2s_final(desc, out, blake2s_compress_arch);
}
static int crypto_blake2s_update(struct shash_desc *desc, const u8 *in,
unsigned int inlen)
{
struct blake2s_state *state = shash_desc_ctx(desc);
const size_t fill = BLAKE2S_BLOCK_SIZE - state->buflen;
if (unlikely(!inlen))
return 0;
if (inlen > fill) {
memcpy(state->buf + state->buflen, in, fill);
blake2s_compress_arch(state, state->buf, 1, BLAKE2S_BLOCK_SIZE);
state->buflen = 0;
in += fill;
inlen -= fill;
#define BLAKE2S_ALG(name, driver_name, digest_size) \
{ \
.base.cra_name = name, \
.base.cra_driver_name = driver_name, \
.base.cra_priority = 200, \
.base.cra_flags = CRYPTO_ALG_OPTIONAL_KEY, \
.base.cra_blocksize = BLAKE2S_BLOCK_SIZE, \
.base.cra_ctxsize = sizeof(struct blake2s_tfm_ctx), \
.base.cra_module = THIS_MODULE, \
.digestsize = digest_size, \
.setkey = crypto_blake2s_setkey, \
.init = crypto_blake2s_init, \
.update = crypto_blake2s_update_x86, \
.final = crypto_blake2s_final_x86, \
.descsize = sizeof(struct blake2s_state), \
}
if (inlen > BLAKE2S_BLOCK_SIZE) {
const size_t nblocks = DIV_ROUND_UP(inlen, BLAKE2S_BLOCK_SIZE);
/* Hash one less (full) block than strictly possible */
blake2s_compress_arch(state, in, nblocks - 1, BLAKE2S_BLOCK_SIZE);
in += BLAKE2S_BLOCK_SIZE * (nblocks - 1);
inlen -= BLAKE2S_BLOCK_SIZE * (nblocks - 1);
}
memcpy(state->buf + state->buflen, in, inlen);
state->buflen += inlen;
return 0;
}
static int crypto_blake2s_final(struct shash_desc *desc, u8 *out)
{
struct blake2s_state *state = shash_desc_ctx(desc);
blake2s_set_lastblock(state);
memset(state->buf + state->buflen, 0,
BLAKE2S_BLOCK_SIZE - state->buflen); /* Padding */
blake2s_compress_arch(state, state->buf, 1, state->buflen);
cpu_to_le32_array(state->h, ARRAY_SIZE(state->h));
memcpy(out, state->h, state->outlen);
memzero_explicit(state, sizeof(*state));
return 0;
}
static struct shash_alg blake2s_algs[] = {{
.base.cra_name = "blake2s-128",
.base.cra_driver_name = "blake2s-128-x86",
.base.cra_flags = CRYPTO_ALG_OPTIONAL_KEY,
.base.cra_ctxsize = sizeof(struct blake2s_tfm_ctx),
.base.cra_priority = 200,
.base.cra_blocksize = BLAKE2S_BLOCK_SIZE,
.base.cra_module = THIS_MODULE,
.digestsize = BLAKE2S_128_HASH_SIZE,
.setkey = crypto_blake2s_setkey,
.init = crypto_blake2s_init,
.update = crypto_blake2s_update,
.final = crypto_blake2s_final,
.descsize = sizeof(struct blake2s_state),
}, {
.base.cra_name = "blake2s-160",
.base.cra_driver_name = "blake2s-160-x86",
.base.cra_flags = CRYPTO_ALG_OPTIONAL_KEY,
.base.cra_ctxsize = sizeof(struct blake2s_tfm_ctx),
.base.cra_priority = 200,
.base.cra_blocksize = BLAKE2S_BLOCK_SIZE,
.base.cra_module = THIS_MODULE,
.digestsize = BLAKE2S_160_HASH_SIZE,
.setkey = crypto_blake2s_setkey,
.init = crypto_blake2s_init,
.update = crypto_blake2s_update,
.final = crypto_blake2s_final,
.descsize = sizeof(struct blake2s_state),
}, {
.base.cra_name = "blake2s-224",
.base.cra_driver_name = "blake2s-224-x86",
.base.cra_flags = CRYPTO_ALG_OPTIONAL_KEY,
.base.cra_ctxsize = sizeof(struct blake2s_tfm_ctx),
.base.cra_priority = 200,
.base.cra_blocksize = BLAKE2S_BLOCK_SIZE,
.base.cra_module = THIS_MODULE,
.digestsize = BLAKE2S_224_HASH_SIZE,
.setkey = crypto_blake2s_setkey,
.init = crypto_blake2s_init,
.update = crypto_blake2s_update,
.final = crypto_blake2s_final,
.descsize = sizeof(struct blake2s_state),
}, {
.base.cra_name = "blake2s-256",
.base.cra_driver_name = "blake2s-256-x86",
.base.cra_flags = CRYPTO_ALG_OPTIONAL_KEY,
.base.cra_ctxsize = sizeof(struct blake2s_tfm_ctx),
.base.cra_priority = 200,
.base.cra_blocksize = BLAKE2S_BLOCK_SIZE,
.base.cra_module = THIS_MODULE,
.digestsize = BLAKE2S_256_HASH_SIZE,
.setkey = crypto_blake2s_setkey,
.init = crypto_blake2s_init,
.update = crypto_blake2s_update,
.final = crypto_blake2s_final,
.descsize = sizeof(struct blake2s_state),
}};
static struct shash_alg blake2s_algs[] = {
BLAKE2S_ALG("blake2s-128", "blake2s-128-x86", BLAKE2S_128_HASH_SIZE),
BLAKE2S_ALG("blake2s-160", "blake2s-160-x86", BLAKE2S_160_HASH_SIZE),
BLAKE2S_ALG("blake2s-224", "blake2s-224-x86", BLAKE2S_224_HASH_SIZE),
BLAKE2S_ALG("blake2s-256", "blake2s-256-x86", BLAKE2S_256_HASH_SIZE),
};
static int __init blake2s_mod_init(void)
{

View File

@ -6,8 +6,6 @@
*
* CBC & ECB parts based on code (crypto/cbc.c,ecb.c) by:
* Copyright (c) 2006 Herbert Xu <herbert@gondor.apana.org.au>
* CTR part based on code (crypto/ctr.c) by:
* (C) Copyright IBM Corp. 2007 - Joy Latten <latten@us.ibm.com>
*/
#include <crypto/algapi.h>
@ -247,97 +245,6 @@ static int cbc_decrypt(struct skcipher_request *req)
return err;
}
static void ctr_crypt_final(struct bf_ctx *ctx, struct skcipher_walk *walk)
{
u8 *ctrblk = walk->iv;
u8 keystream[BF_BLOCK_SIZE];
u8 *src = walk->src.virt.addr;
u8 *dst = walk->dst.virt.addr;
unsigned int nbytes = walk->nbytes;
blowfish_enc_blk(ctx, keystream, ctrblk);
crypto_xor_cpy(dst, keystream, src, nbytes);
crypto_inc(ctrblk, BF_BLOCK_SIZE);
}
static unsigned int __ctr_crypt(struct bf_ctx *ctx, struct skcipher_walk *walk)
{
unsigned int bsize = BF_BLOCK_SIZE;
unsigned int nbytes = walk->nbytes;
u64 *src = (u64 *)walk->src.virt.addr;
u64 *dst = (u64 *)walk->dst.virt.addr;
u64 ctrblk = be64_to_cpu(*(__be64 *)walk->iv);
__be64 ctrblocks[4];
/* Process four block batch */
if (nbytes >= bsize * 4) {
do {
if (dst != src) {
dst[0] = src[0];
dst[1] = src[1];
dst[2] = src[2];
dst[3] = src[3];
}
/* create ctrblks for parallel encrypt */
ctrblocks[0] = cpu_to_be64(ctrblk++);
ctrblocks[1] = cpu_to_be64(ctrblk++);
ctrblocks[2] = cpu_to_be64(ctrblk++);
ctrblocks[3] = cpu_to_be64(ctrblk++);
blowfish_enc_blk_xor_4way(ctx, (u8 *)dst,
(u8 *)ctrblocks);
src += 4;
dst += 4;
} while ((nbytes -= bsize * 4) >= bsize * 4);
if (nbytes < bsize)
goto done;
}
/* Handle leftovers */
do {
if (dst != src)
*dst = *src;
ctrblocks[0] = cpu_to_be64(ctrblk++);
blowfish_enc_blk_xor(ctx, (u8 *)dst, (u8 *)ctrblocks);
src += 1;
dst += 1;
} while ((nbytes -= bsize) >= bsize);
done:
*(__be64 *)walk->iv = cpu_to_be64(ctrblk);
return nbytes;
}
static int ctr_crypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct bf_ctx *ctx = crypto_skcipher_ctx(tfm);
struct skcipher_walk walk;
unsigned int nbytes;
int err;
err = skcipher_walk_virt(&walk, req, false);
while ((nbytes = walk.nbytes) >= BF_BLOCK_SIZE) {
nbytes = __ctr_crypt(ctx, &walk);
err = skcipher_walk_done(&walk, nbytes);
}
if (nbytes) {
ctr_crypt_final(ctx, &walk);
err = skcipher_walk_done(&walk, 0);
}
return err;
}
static struct crypto_alg bf_cipher_alg = {
.cra_name = "blowfish",
.cra_driver_name = "blowfish-asm",
@ -384,20 +291,6 @@ static struct skcipher_alg bf_skcipher_algs[] = {
.setkey = blowfish_setkey_skcipher,
.encrypt = cbc_encrypt,
.decrypt = cbc_decrypt,
}, {
.base.cra_name = "ctr(blowfish)",
.base.cra_driver_name = "ctr-blowfish-asm",
.base.cra_priority = 300,
.base.cra_blocksize = 1,
.base.cra_ctxsize = sizeof(struct bf_ctx),
.base.cra_module = THIS_MODULE,
.min_keysize = BF_MIN_KEY_SIZE,
.max_keysize = BF_MAX_KEY_SIZE,
.ivsize = BF_BLOCK_SIZE,
.chunksize = BF_BLOCK_SIZE,
.setkey = blowfish_setkey_skcipher,
.encrypt = ctr_crypt,
.decrypt = ctr_crypt,
},
};

View File

@ -17,7 +17,6 @@
#include <linux/linkage.h>
#include <asm/frame.h>
#include <asm/nospec-branch.h>
#define CAMELLIA_TABLE_BYTE_LEN 272
@ -589,14 +588,6 @@ SYM_FUNC_END(roundsm16_x4_x5_x6_x7_x0_x1_x2_x3_y4_y5_y6_y7_y0_y1_y2_y3_ab)
.long 0x80808080
.long 0x80808080
/* For CTR-mode IV byteswap */
.Lbswap128_mask:
.byte 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0
/* For XTS mode IV generation */
.Lxts_gf128mul_and_shl1_mask:
.byte 0x87, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0
/*
* pre-SubByte transform
*
@ -998,292 +989,3 @@ SYM_FUNC_START(camellia_cbc_dec_16way)
FRAME_END
ret;
SYM_FUNC_END(camellia_cbc_dec_16way)
#define inc_le128(x, minus_one, tmp) \
vpcmpeqq minus_one, x, tmp; \
vpsubq minus_one, x, x; \
vpslldq $8, tmp, tmp; \
vpsubq tmp, x, x;
SYM_FUNC_START(camellia_ctr_16way)
/* input:
* %rdi: ctx, CTX
* %rsi: dst (16 blocks)
* %rdx: src (16 blocks)
* %rcx: iv (little endian, 128bit)
*/
FRAME_BEGIN
subq $(16 * 16), %rsp;
movq %rsp, %rax;
vmovdqa .Lbswap128_mask, %xmm14;
/* load IV and byteswap */
vmovdqu (%rcx), %xmm0;
vpshufb %xmm14, %xmm0, %xmm15;
vmovdqu %xmm15, 15 * 16(%rax);
vpcmpeqd %xmm15, %xmm15, %xmm15;
vpsrldq $8, %xmm15, %xmm15; /* low: -1, high: 0 */
/* construct IVs */
inc_le128(%xmm0, %xmm15, %xmm13);
vpshufb %xmm14, %xmm0, %xmm13;
vmovdqu %xmm13, 14 * 16(%rax);
inc_le128(%xmm0, %xmm15, %xmm13);
vpshufb %xmm14, %xmm0, %xmm13;
vmovdqu %xmm13, 13 * 16(%rax);
inc_le128(%xmm0, %xmm15, %xmm13);
vpshufb %xmm14, %xmm0, %xmm12;
inc_le128(%xmm0, %xmm15, %xmm13);
vpshufb %xmm14, %xmm0, %xmm11;
inc_le128(%xmm0, %xmm15, %xmm13);
vpshufb %xmm14, %xmm0, %xmm10;
inc_le128(%xmm0, %xmm15, %xmm13);
vpshufb %xmm14, %xmm0, %xmm9;
inc_le128(%xmm0, %xmm15, %xmm13);
vpshufb %xmm14, %xmm0, %xmm8;
inc_le128(%xmm0, %xmm15, %xmm13);
vpshufb %xmm14, %xmm0, %xmm7;
inc_le128(%xmm0, %xmm15, %xmm13);
vpshufb %xmm14, %xmm0, %xmm6;
inc_le128(%xmm0, %xmm15, %xmm13);
vpshufb %xmm14, %xmm0, %xmm5;
inc_le128(%xmm0, %xmm15, %xmm13);
vpshufb %xmm14, %xmm0, %xmm4;
inc_le128(%xmm0, %xmm15, %xmm13);
vpshufb %xmm14, %xmm0, %xmm3;
inc_le128(%xmm0, %xmm15, %xmm13);
vpshufb %xmm14, %xmm0, %xmm2;
inc_le128(%xmm0, %xmm15, %xmm13);
vpshufb %xmm14, %xmm0, %xmm1;
inc_le128(%xmm0, %xmm15, %xmm13);
vmovdqa %xmm0, %xmm13;
vpshufb %xmm14, %xmm0, %xmm0;
inc_le128(%xmm13, %xmm15, %xmm14);
vmovdqu %xmm13, (%rcx);
/* inpack16_pre: */
vmovq (key_table)(CTX), %xmm15;
vpshufb .Lpack_bswap, %xmm15, %xmm15;
vpxor %xmm0, %xmm15, %xmm0;
vpxor %xmm1, %xmm15, %xmm1;
vpxor %xmm2, %xmm15, %xmm2;
vpxor %xmm3, %xmm15, %xmm3;
vpxor %xmm4, %xmm15, %xmm4;
vpxor %xmm5, %xmm15, %xmm5;
vpxor %xmm6, %xmm15, %xmm6;
vpxor %xmm7, %xmm15, %xmm7;
vpxor %xmm8, %xmm15, %xmm8;
vpxor %xmm9, %xmm15, %xmm9;
vpxor %xmm10, %xmm15, %xmm10;
vpxor %xmm11, %xmm15, %xmm11;
vpxor %xmm12, %xmm15, %xmm12;
vpxor 13 * 16(%rax), %xmm15, %xmm13;
vpxor 14 * 16(%rax), %xmm15, %xmm14;
vpxor 15 * 16(%rax), %xmm15, %xmm15;
call __camellia_enc_blk16;
addq $(16 * 16), %rsp;
vpxor 0 * 16(%rdx), %xmm7, %xmm7;
vpxor 1 * 16(%rdx), %xmm6, %xmm6;
vpxor 2 * 16(%rdx), %xmm5, %xmm5;
vpxor 3 * 16(%rdx), %xmm4, %xmm4;
vpxor 4 * 16(%rdx), %xmm3, %xmm3;
vpxor 5 * 16(%rdx), %xmm2, %xmm2;
vpxor 6 * 16(%rdx), %xmm1, %xmm1;
vpxor 7 * 16(%rdx), %xmm0, %xmm0;
vpxor 8 * 16(%rdx), %xmm15, %xmm15;
vpxor 9 * 16(%rdx), %xmm14, %xmm14;
vpxor 10 * 16(%rdx), %xmm13, %xmm13;
vpxor 11 * 16(%rdx), %xmm12, %xmm12;
vpxor 12 * 16(%rdx), %xmm11, %xmm11;
vpxor 13 * 16(%rdx), %xmm10, %xmm10;
vpxor 14 * 16(%rdx), %xmm9, %xmm9;
vpxor 15 * 16(%rdx), %xmm8, %xmm8;
write_output(%xmm7, %xmm6, %xmm5, %xmm4, %xmm3, %xmm2, %xmm1, %xmm0,
%xmm15, %xmm14, %xmm13, %xmm12, %xmm11, %xmm10, %xmm9,
%xmm8, %rsi);
FRAME_END
ret;
SYM_FUNC_END(camellia_ctr_16way)
#define gf128mul_x_ble(iv, mask, tmp) \
vpsrad $31, iv, tmp; \
vpaddq iv, iv, iv; \
vpshufd $0x13, tmp, tmp; \
vpand mask, tmp, tmp; \
vpxor tmp, iv, iv;
.align 8
SYM_FUNC_START_LOCAL(camellia_xts_crypt_16way)
/* input:
* %rdi: ctx, CTX
* %rsi: dst (16 blocks)
* %rdx: src (16 blocks)
* %rcx: iv (t ⊕ αⁿ ∈ GF(2¹²⁸))
* %r8: index for input whitening key
* %r9: pointer to __camellia_enc_blk16 or __camellia_dec_blk16
*/
FRAME_BEGIN
subq $(16 * 16), %rsp;
movq %rsp, %rax;
vmovdqa .Lxts_gf128mul_and_shl1_mask, %xmm14;
/* load IV */
vmovdqu (%rcx), %xmm0;
vpxor 0 * 16(%rdx), %xmm0, %xmm15;
vmovdqu %xmm15, 15 * 16(%rax);
vmovdqu %xmm0, 0 * 16(%rsi);
/* construct IVs */
gf128mul_x_ble(%xmm0, %xmm14, %xmm15);
vpxor 1 * 16(%rdx), %xmm0, %xmm15;
vmovdqu %xmm15, 14 * 16(%rax);
vmovdqu %xmm0, 1 * 16(%rsi);
gf128mul_x_ble(%xmm0, %xmm14, %xmm15);
vpxor 2 * 16(%rdx), %xmm0, %xmm13;
vmovdqu %xmm0, 2 * 16(%rsi);
gf128mul_x_ble(%xmm0, %xmm14, %xmm15);
vpxor 3 * 16(%rdx), %xmm0, %xmm12;
vmovdqu %xmm0, 3 * 16(%rsi);
gf128mul_x_ble(%xmm0, %xmm14, %xmm15);
vpxor 4 * 16(%rdx), %xmm0, %xmm11;
vmovdqu %xmm0, 4 * 16(%rsi);
gf128mul_x_ble(%xmm0, %xmm14, %xmm15);
vpxor 5 * 16(%rdx), %xmm0, %xmm10;
vmovdqu %xmm0, 5 * 16(%rsi);
gf128mul_x_ble(%xmm0, %xmm14, %xmm15);
vpxor 6 * 16(%rdx), %xmm0, %xmm9;
vmovdqu %xmm0, 6 * 16(%rsi);
gf128mul_x_ble(%xmm0, %xmm14, %xmm15);
vpxor 7 * 16(%rdx), %xmm0, %xmm8;
vmovdqu %xmm0, 7 * 16(%rsi);
gf128mul_x_ble(%xmm0, %xmm14, %xmm15);
vpxor 8 * 16(%rdx), %xmm0, %xmm7;
vmovdqu %xmm0, 8 * 16(%rsi);
gf128mul_x_ble(%xmm0, %xmm14, %xmm15);
vpxor 9 * 16(%rdx), %xmm0, %xmm6;
vmovdqu %xmm0, 9 * 16(%rsi);
gf128mul_x_ble(%xmm0, %xmm14, %xmm15);
vpxor 10 * 16(%rdx), %xmm0, %xmm5;
vmovdqu %xmm0, 10 * 16(%rsi);
gf128mul_x_ble(%xmm0, %xmm14, %xmm15);
vpxor 11 * 16(%rdx), %xmm0, %xmm4;
vmovdqu %xmm0, 11 * 16(%rsi);
gf128mul_x_ble(%xmm0, %xmm14, %xmm15);
vpxor 12 * 16(%rdx), %xmm0, %xmm3;
vmovdqu %xmm0, 12 * 16(%rsi);
gf128mul_x_ble(%xmm0, %xmm14, %xmm15);
vpxor 13 * 16(%rdx), %xmm0, %xmm2;
vmovdqu %xmm0, 13 * 16(%rsi);
gf128mul_x_ble(%xmm0, %xmm14, %xmm15);
vpxor 14 * 16(%rdx), %xmm0, %xmm1;
vmovdqu %xmm0, 14 * 16(%rsi);
gf128mul_x_ble(%xmm0, %xmm14, %xmm15);
vpxor 15 * 16(%rdx), %xmm0, %xmm15;
vmovdqu %xmm15, 0 * 16(%rax);
vmovdqu %xmm0, 15 * 16(%rsi);
gf128mul_x_ble(%xmm0, %xmm14, %xmm15);
vmovdqu %xmm0, (%rcx);
/* inpack16_pre: */
vmovq (key_table)(CTX, %r8, 8), %xmm15;
vpshufb .Lpack_bswap, %xmm15, %xmm15;
vpxor 0 * 16(%rax), %xmm15, %xmm0;
vpxor %xmm1, %xmm15, %xmm1;
vpxor %xmm2, %xmm15, %xmm2;
vpxor %xmm3, %xmm15, %xmm3;
vpxor %xmm4, %xmm15, %xmm4;
vpxor %xmm5, %xmm15, %xmm5;
vpxor %xmm6, %xmm15, %xmm6;
vpxor %xmm7, %xmm15, %xmm7;
vpxor %xmm8, %xmm15, %xmm8;
vpxor %xmm9, %xmm15, %xmm9;
vpxor %xmm10, %xmm15, %xmm10;
vpxor %xmm11, %xmm15, %xmm11;
vpxor %xmm12, %xmm15, %xmm12;
vpxor %xmm13, %xmm15, %xmm13;
vpxor 14 * 16(%rax), %xmm15, %xmm14;
vpxor 15 * 16(%rax), %xmm15, %xmm15;
CALL_NOSPEC r9;
addq $(16 * 16), %rsp;
vpxor 0 * 16(%rsi), %xmm7, %xmm7;
vpxor 1 * 16(%rsi), %xmm6, %xmm6;
vpxor 2 * 16(%rsi), %xmm5, %xmm5;
vpxor 3 * 16(%rsi), %xmm4, %xmm4;
vpxor 4 * 16(%rsi), %xmm3, %xmm3;
vpxor 5 * 16(%rsi), %xmm2, %xmm2;
vpxor 6 * 16(%rsi), %xmm1, %xmm1;
vpxor 7 * 16(%rsi), %xmm0, %xmm0;
vpxor 8 * 16(%rsi), %xmm15, %xmm15;
vpxor 9 * 16(%rsi), %xmm14, %xmm14;
vpxor 10 * 16(%rsi), %xmm13, %xmm13;
vpxor 11 * 16(%rsi), %xmm12, %xmm12;
vpxor 12 * 16(%rsi), %xmm11, %xmm11;
vpxor 13 * 16(%rsi), %xmm10, %xmm10;
vpxor 14 * 16(%rsi), %xmm9, %xmm9;
vpxor 15 * 16(%rsi), %xmm8, %xmm8;
write_output(%xmm7, %xmm6, %xmm5, %xmm4, %xmm3, %xmm2, %xmm1, %xmm0,
%xmm15, %xmm14, %xmm13, %xmm12, %xmm11, %xmm10, %xmm9,
%xmm8, %rsi);
FRAME_END
ret;
SYM_FUNC_END(camellia_xts_crypt_16way)
SYM_FUNC_START(camellia_xts_enc_16way)
/* input:
* %rdi: ctx, CTX
* %rsi: dst (16 blocks)
* %rdx: src (16 blocks)
* %rcx: iv (t ⊕ αⁿ ∊ GF(2¹²⁸))
*/
xorl %r8d, %r8d; /* input whitening key, 0 for enc */
leaq __camellia_enc_blk16, %r9;
jmp camellia_xts_crypt_16way;
SYM_FUNC_END(camellia_xts_enc_16way)
SYM_FUNC_START(camellia_xts_dec_16way)
/* input:
* %rdi: ctx, CTX
* %rsi: dst (16 blocks)
* %rdx: src (16 blocks)
* %rcx: iv (t ⊕ αⁿ ∊ GF(2¹²⁸))
*/
cmpl $16, key_length(CTX);
movl $32, %r8d;
movl $24, %eax;
cmovel %eax, %r8d; /* input whitening key, last for dec */
leaq __camellia_dec_blk16, %r9;
jmp camellia_xts_crypt_16way;
SYM_FUNC_END(camellia_xts_dec_16way)


@ -7,7 +7,6 @@
#include <linux/linkage.h>
#include <asm/frame.h>
#include <asm/nospec-branch.h>
#define CAMELLIA_TABLE_BYTE_LEN 272
@ -625,16 +624,6 @@ SYM_FUNC_END(roundsm32_x4_x5_x6_x7_x0_x1_x2_x3_y4_y5_y6_y7_y0_y1_y2_y3_ab)
.section .rodata.cst16, "aM", @progbits, 16
.align 16
/* For CTR-mode IV byteswap */
.Lbswap128_mask:
.byte 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0
/* For XTS mode */
.Lxts_gf128mul_and_shl1_mask_0:
.byte 0x87, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0
.Lxts_gf128mul_and_shl1_mask_1:
.byte 0x0e, 1, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0
/*
* pre-SubByte transform
*
@ -1061,343 +1050,3 @@ SYM_FUNC_START(camellia_cbc_dec_32way)
FRAME_END
ret;
SYM_FUNC_END(camellia_cbc_dec_32way)
#define inc_le128(x, minus_one, tmp) \
vpcmpeqq minus_one, x, tmp; \
vpsubq minus_one, x, x; \
vpslldq $8, tmp, tmp; \
vpsubq tmp, x, x;
#define add2_le128(x, minus_one, minus_two, tmp1, tmp2) \
vpcmpeqq minus_one, x, tmp1; \
vpcmpeqq minus_two, x, tmp2; \
vpsubq minus_two, x, x; \
vpor tmp2, tmp1, tmp1; \
vpslldq $8, tmp1, tmp1; \
vpsubq tmp1, x, x;
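/*
 * For reference: inc_le128 increments a 128-bit little-endian counter held in
 * one vector lane, propagating the carry from the low into the high qword;
 * add2_le128 does the same for a step of two.  A scalar C sketch of inc_le128
 * (illustrative only; ctr[0] is the low half):
 *
 *	static inline void inc_le128_sketch(u64 ctr[2])
 *	{
 *		if (++ctr[0] == 0)
 *			ctr[1]++;
 *	}
 *
 * The vector form detects the wrap by comparing the low qword against -1
 * (vpcmpeqq), adds one by subtracting the all-ones constant (vpsubq), shifts
 * the compare result into the high-qword position (vpslldq) and applies the
 * carry with a second vpsubq.
 */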
SYM_FUNC_START(camellia_ctr_32way)
/* input:
* %rdi: ctx, CTX
* %rsi: dst (32 blocks)
* %rdx: src (32 blocks)
* %rcx: iv (little endian, 128bit)
*/
FRAME_BEGIN
vzeroupper;
movq %rsp, %r10;
cmpq %rsi, %rdx;
je .Lctr_use_stack;
/* dst can be used as temporary storage, src is not overwritten. */
movq %rsi, %rax;
jmp .Lctr_continue;
.Lctr_use_stack:
subq $(16 * 32), %rsp;
movq %rsp, %rax;
.Lctr_continue:
vpcmpeqd %ymm15, %ymm15, %ymm15;
vpsrldq $8, %ymm15, %ymm15; /* ab: -1:0 ; cd: -1:0 */
vpaddq %ymm15, %ymm15, %ymm12; /* ab: -2:0 ; cd: -2:0 */
/* load IV and byteswap */
vmovdqu (%rcx), %xmm0;
vmovdqa %xmm0, %xmm1;
inc_le128(%xmm0, %xmm15, %xmm14);
vbroadcasti128 .Lbswap128_mask, %ymm14;
vinserti128 $1, %xmm0, %ymm1, %ymm0;
vpshufb %ymm14, %ymm0, %ymm13;
vmovdqu %ymm13, 15 * 32(%rax);
/* construct IVs */
add2_le128(%ymm0, %ymm15, %ymm12, %ymm11, %ymm13); /* ab:le2 ; cd:le3 */
vpshufb %ymm14, %ymm0, %ymm13;
vmovdqu %ymm13, 14 * 32(%rax);
add2_le128(%ymm0, %ymm15, %ymm12, %ymm11, %ymm13);
vpshufb %ymm14, %ymm0, %ymm13;
vmovdqu %ymm13, 13 * 32(%rax);
add2_le128(%ymm0, %ymm15, %ymm12, %ymm11, %ymm13);
vpshufb %ymm14, %ymm0, %ymm13;
vmovdqu %ymm13, 12 * 32(%rax);
add2_le128(%ymm0, %ymm15, %ymm12, %ymm11, %ymm13);
vpshufb %ymm14, %ymm0, %ymm13;
vmovdqu %ymm13, 11 * 32(%rax);
add2_le128(%ymm0, %ymm15, %ymm12, %ymm11, %ymm13);
vpshufb %ymm14, %ymm0, %ymm10;
add2_le128(%ymm0, %ymm15, %ymm12, %ymm11, %ymm13);
vpshufb %ymm14, %ymm0, %ymm9;
add2_le128(%ymm0, %ymm15, %ymm12, %ymm11, %ymm13);
vpshufb %ymm14, %ymm0, %ymm8;
add2_le128(%ymm0, %ymm15, %ymm12, %ymm11, %ymm13);
vpshufb %ymm14, %ymm0, %ymm7;
add2_le128(%ymm0, %ymm15, %ymm12, %ymm11, %ymm13);
vpshufb %ymm14, %ymm0, %ymm6;
add2_le128(%ymm0, %ymm15, %ymm12, %ymm11, %ymm13);
vpshufb %ymm14, %ymm0, %ymm5;
add2_le128(%ymm0, %ymm15, %ymm12, %ymm11, %ymm13);
vpshufb %ymm14, %ymm0, %ymm4;
add2_le128(%ymm0, %ymm15, %ymm12, %ymm11, %ymm13);
vpshufb %ymm14, %ymm0, %ymm3;
add2_le128(%ymm0, %ymm15, %ymm12, %ymm11, %ymm13);
vpshufb %ymm14, %ymm0, %ymm2;
add2_le128(%ymm0, %ymm15, %ymm12, %ymm11, %ymm13);
vpshufb %ymm14, %ymm0, %ymm1;
add2_le128(%ymm0, %ymm15, %ymm12, %ymm11, %ymm13);
vextracti128 $1, %ymm0, %xmm13;
vpshufb %ymm14, %ymm0, %ymm0;
inc_le128(%xmm13, %xmm15, %xmm14);
vmovdqu %xmm13, (%rcx);
/* inpack32_pre: */
vpbroadcastq (key_table)(CTX), %ymm15;
vpshufb .Lpack_bswap, %ymm15, %ymm15;
vpxor %ymm0, %ymm15, %ymm0;
vpxor %ymm1, %ymm15, %ymm1;
vpxor %ymm2, %ymm15, %ymm2;
vpxor %ymm3, %ymm15, %ymm3;
vpxor %ymm4, %ymm15, %ymm4;
vpxor %ymm5, %ymm15, %ymm5;
vpxor %ymm6, %ymm15, %ymm6;
vpxor %ymm7, %ymm15, %ymm7;
vpxor %ymm8, %ymm15, %ymm8;
vpxor %ymm9, %ymm15, %ymm9;
vpxor %ymm10, %ymm15, %ymm10;
vpxor 11 * 32(%rax), %ymm15, %ymm11;
vpxor 12 * 32(%rax), %ymm15, %ymm12;
vpxor 13 * 32(%rax), %ymm15, %ymm13;
vpxor 14 * 32(%rax), %ymm15, %ymm14;
vpxor 15 * 32(%rax), %ymm15, %ymm15;
call __camellia_enc_blk32;
movq %r10, %rsp;
vpxor 0 * 32(%rdx), %ymm7, %ymm7;
vpxor 1 * 32(%rdx), %ymm6, %ymm6;
vpxor 2 * 32(%rdx), %ymm5, %ymm5;
vpxor 3 * 32(%rdx), %ymm4, %ymm4;
vpxor 4 * 32(%rdx), %ymm3, %ymm3;
vpxor 5 * 32(%rdx), %ymm2, %ymm2;
vpxor 6 * 32(%rdx), %ymm1, %ymm1;
vpxor 7 * 32(%rdx), %ymm0, %ymm0;
vpxor 8 * 32(%rdx), %ymm15, %ymm15;
vpxor 9 * 32(%rdx), %ymm14, %ymm14;
vpxor 10 * 32(%rdx), %ymm13, %ymm13;
vpxor 11 * 32(%rdx), %ymm12, %ymm12;
vpxor 12 * 32(%rdx), %ymm11, %ymm11;
vpxor 13 * 32(%rdx), %ymm10, %ymm10;
vpxor 14 * 32(%rdx), %ymm9, %ymm9;
vpxor 15 * 32(%rdx), %ymm8, %ymm8;
write_output(%ymm7, %ymm6, %ymm5, %ymm4, %ymm3, %ymm2, %ymm1, %ymm0,
%ymm15, %ymm14, %ymm13, %ymm12, %ymm11, %ymm10, %ymm9,
%ymm8, %rsi);
vzeroupper;
FRAME_END
ret;
SYM_FUNC_END(camellia_ctr_32way)
#define gf128mul_x_ble(iv, mask, tmp) \
vpsrad $31, iv, tmp; \
vpaddq iv, iv, iv; \
vpshufd $0x13, tmp, tmp; \
vpand mask, tmp, tmp; \
vpxor tmp, iv, iv;
#define gf128mul_x2_ble(iv, mask1, mask2, tmp0, tmp1) \
vpsrad $31, iv, tmp0; \
vpaddq iv, iv, tmp1; \
vpsllq $2, iv, iv; \
vpshufd $0x13, tmp0, tmp0; \
vpsrad $31, tmp1, tmp1; \
vpand mask2, tmp0, tmp0; \
vpshufd $0x13, tmp1, tmp1; \
vpxor tmp0, iv, iv; \
vpand mask1, tmp1, tmp1; \
vpxor tmp1, iv, iv;
.align 8
SYM_FUNC_START_LOCAL(camellia_xts_crypt_32way)
/* input:
* %rdi: ctx, CTX
* %rsi: dst (32 blocks)
* %rdx: src (32 blocks)
* %rcx: iv (t ⊕ αⁿ ∊ GF(2¹²⁸))
* %r8: index for input whitening key
* %r9: pointer to __camellia_enc_blk32 or __camellia_dec_blk32
*/
FRAME_BEGIN
vzeroupper;
subq $(16 * 32), %rsp;
movq %rsp, %rax;
vbroadcasti128 .Lxts_gf128mul_and_shl1_mask_0, %ymm12;
/* load IV and construct second IV */
vmovdqu (%rcx), %xmm0;
vmovdqa %xmm0, %xmm15;
gf128mul_x_ble(%xmm0, %xmm12, %xmm13);
vbroadcasti128 .Lxts_gf128mul_and_shl1_mask_1, %ymm13;
vinserti128 $1, %xmm0, %ymm15, %ymm0;
vpxor 0 * 32(%rdx), %ymm0, %ymm15;
vmovdqu %ymm15, 15 * 32(%rax);
vmovdqu %ymm0, 0 * 32(%rsi);
/* construct IVs */
gf128mul_x2_ble(%ymm0, %ymm12, %ymm13, %ymm14, %ymm15);
vpxor 1 * 32(%rdx), %ymm0, %ymm15;
vmovdqu %ymm15, 14 * 32(%rax);
vmovdqu %ymm0, 1 * 32(%rsi);
gf128mul_x2_ble(%ymm0, %ymm12, %ymm13, %ymm14, %ymm15);
vpxor 2 * 32(%rdx), %ymm0, %ymm15;
vmovdqu %ymm15, 13 * 32(%rax);
vmovdqu %ymm0, 2 * 32(%rsi);
gf128mul_x2_ble(%ymm0, %ymm12, %ymm13, %ymm14, %ymm15);
vpxor 3 * 32(%rdx), %ymm0, %ymm15;
vmovdqu %ymm15, 12 * 32(%rax);
vmovdqu %ymm0, 3 * 32(%rsi);
gf128mul_x2_ble(%ymm0, %ymm12, %ymm13, %ymm14, %ymm15);
vpxor 4 * 32(%rdx), %ymm0, %ymm11;
vmovdqu %ymm0, 4 * 32(%rsi);
gf128mul_x2_ble(%ymm0, %ymm12, %ymm13, %ymm14, %ymm15);
vpxor 5 * 32(%rdx), %ymm0, %ymm10;
vmovdqu %ymm0, 5 * 32(%rsi);
gf128mul_x2_ble(%ymm0, %ymm12, %ymm13, %ymm14, %ymm15);
vpxor 6 * 32(%rdx), %ymm0, %ymm9;
vmovdqu %ymm0, 6 * 32(%rsi);
gf128mul_x2_ble(%ymm0, %ymm12, %ymm13, %ymm14, %ymm15);
vpxor 7 * 32(%rdx), %ymm0, %ymm8;
vmovdqu %ymm0, 7 * 32(%rsi);
gf128mul_x2_ble(%ymm0, %ymm12, %ymm13, %ymm14, %ymm15);
vpxor 8 * 32(%rdx), %ymm0, %ymm7;
vmovdqu %ymm0, 8 * 32(%rsi);
gf128mul_x2_ble(%ymm0, %ymm12, %ymm13, %ymm14, %ymm15);
vpxor 9 * 32(%rdx), %ymm0, %ymm6;
vmovdqu %ymm0, 9 * 32(%rsi);
gf128mul_x2_ble(%ymm0, %ymm12, %ymm13, %ymm14, %ymm15);
vpxor 10 * 32(%rdx), %ymm0, %ymm5;
vmovdqu %ymm0, 10 * 32(%rsi);
gf128mul_x2_ble(%ymm0, %ymm12, %ymm13, %ymm14, %ymm15);
vpxor 11 * 32(%rdx), %ymm0, %ymm4;
vmovdqu %ymm0, 11 * 32(%rsi);
gf128mul_x2_ble(%ymm0, %ymm12, %ymm13, %ymm14, %ymm15);
vpxor 12 * 32(%rdx), %ymm0, %ymm3;
vmovdqu %ymm0, 12 * 32(%rsi);
gf128mul_x2_ble(%ymm0, %ymm12, %ymm13, %ymm14, %ymm15);
vpxor 13 * 32(%rdx), %ymm0, %ymm2;
vmovdqu %ymm0, 13 * 32(%rsi);
gf128mul_x2_ble(%ymm0, %ymm12, %ymm13, %ymm14, %ymm15);
vpxor 14 * 32(%rdx), %ymm0, %ymm1;
vmovdqu %ymm0, 14 * 32(%rsi);
gf128mul_x2_ble(%ymm0, %ymm12, %ymm13, %ymm14, %ymm15);
vpxor 15 * 32(%rdx), %ymm0, %ymm15;
vmovdqu %ymm15, 0 * 32(%rax);
vmovdqu %ymm0, 15 * 32(%rsi);
vextracti128 $1, %ymm0, %xmm0;
gf128mul_x_ble(%xmm0, %xmm12, %xmm15);
vmovdqu %xmm0, (%rcx);
/* inpack32_pre: */
vpbroadcastq (key_table)(CTX, %r8, 8), %ymm15;
vpshufb .Lpack_bswap, %ymm15, %ymm15;
vpxor 0 * 32(%rax), %ymm15, %ymm0;
vpxor %ymm1, %ymm15, %ymm1;
vpxor %ymm2, %ymm15, %ymm2;
vpxor %ymm3, %ymm15, %ymm3;
vpxor %ymm4, %ymm15, %ymm4;
vpxor %ymm5, %ymm15, %ymm5;
vpxor %ymm6, %ymm15, %ymm6;
vpxor %ymm7, %ymm15, %ymm7;
vpxor %ymm8, %ymm15, %ymm8;
vpxor %ymm9, %ymm15, %ymm9;
vpxor %ymm10, %ymm15, %ymm10;
vpxor %ymm11, %ymm15, %ymm11;
vpxor 12 * 32(%rax), %ymm15, %ymm12;
vpxor 13 * 32(%rax), %ymm15, %ymm13;
vpxor 14 * 32(%rax), %ymm15, %ymm14;
vpxor 15 * 32(%rax), %ymm15, %ymm15;
CALL_NOSPEC r9;
addq $(16 * 32), %rsp;
vpxor 0 * 32(%rsi), %ymm7, %ymm7;
vpxor 1 * 32(%rsi), %ymm6, %ymm6;
vpxor 2 * 32(%rsi), %ymm5, %ymm5;
vpxor 3 * 32(%rsi), %ymm4, %ymm4;
vpxor 4 * 32(%rsi), %ymm3, %ymm3;
vpxor 5 * 32(%rsi), %ymm2, %ymm2;
vpxor 6 * 32(%rsi), %ymm1, %ymm1;
vpxor 7 * 32(%rsi), %ymm0, %ymm0;
vpxor 8 * 32(%rsi), %ymm15, %ymm15;
vpxor 9 * 32(%rsi), %ymm14, %ymm14;
vpxor 10 * 32(%rsi), %ymm13, %ymm13;
vpxor 11 * 32(%rsi), %ymm12, %ymm12;
vpxor 12 * 32(%rsi), %ymm11, %ymm11;
vpxor 13 * 32(%rsi), %ymm10, %ymm10;
vpxor 14 * 32(%rsi), %ymm9, %ymm9;
vpxor 15 * 32(%rsi), %ymm8, %ymm8;
write_output(%ymm7, %ymm6, %ymm5, %ymm4, %ymm3, %ymm2, %ymm1, %ymm0,
%ymm15, %ymm14, %ymm13, %ymm12, %ymm11, %ymm10, %ymm9,
%ymm8, %rsi);
vzeroupper;
FRAME_END
ret;
SYM_FUNC_END(camellia_xts_crypt_32way)
SYM_FUNC_START(camellia_xts_enc_32way)
/* input:
* %rdi: ctx, CTX
* %rsi: dst (32 blocks)
* %rdx: src (32 blocks)
* %rcx: iv (t ⊕ αⁿ ∊ GF(2¹²⁸))
*/
xorl %r8d, %r8d; /* input whitening key, 0 for enc */
leaq __camellia_enc_blk32, %r9;
jmp camellia_xts_crypt_32way;
SYM_FUNC_END(camellia_xts_enc_32way)
SYM_FUNC_START(camellia_xts_dec_32way)
/* input:
* %rdi: ctx, CTX
* %rsi: dst (32 blocks)
* %rdx: src (32 blocks)
* %rcx: iv (t ⊕ αⁿ ∊ GF(2¹²⁸))
*/
cmpl $16, key_length(CTX);
movl $32, %r8d;
movl $24, %eax;
cmovel %eax, %r8d; /* input whitening key, last for dec */
leaq __camellia_dec_blk32, %r9;
jmp camellia_xts_crypt_32way;
SYM_FUNC_END(camellia_xts_dec_32way)


@ -19,18 +19,10 @@ struct camellia_ctx {
u32 key_length;
};
struct camellia_xts_ctx {
struct camellia_ctx tweak_ctx;
struct camellia_ctx crypt_ctx;
};
extern int __camellia_setkey(struct camellia_ctx *cctx,
const unsigned char *key,
unsigned int key_len);
extern int xts_camellia_setkey(struct crypto_skcipher *tfm, const u8 *key,
unsigned int keylen);
/* regular block cipher functions */
asmlinkage void __camellia_enc_blk(const void *ctx, u8 *dst, const u8 *src,
bool xor);
@ -46,13 +38,6 @@ asmlinkage void camellia_ecb_enc_16way(const void *ctx, u8 *dst, const u8 *src);
asmlinkage void camellia_ecb_dec_16way(const void *ctx, u8 *dst, const u8 *src);
asmlinkage void camellia_cbc_dec_16way(const void *ctx, u8 *dst, const u8 *src);
asmlinkage void camellia_ctr_16way(const void *ctx, u8 *dst, const u8 *src,
le128 *iv);
asmlinkage void camellia_xts_enc_16way(const void *ctx, u8 *dst, const u8 *src,
le128 *iv);
asmlinkage void camellia_xts_dec_16way(const void *ctx, u8 *dst, const u8 *src,
le128 *iv);
static inline void camellia_enc_blk(const void *ctx, u8 *dst, const u8 *src)
{
@ -78,14 +63,5 @@ static inline void camellia_enc_blk_xor_2way(const void *ctx, u8 *dst,
/* glue helpers */
extern void camellia_decrypt_cbc_2way(const void *ctx, u8 *dst, const u8 *src);
extern void camellia_crypt_ctr(const void *ctx, u8 *dst, const u8 *src,
le128 *iv);
extern void camellia_crypt_ctr_2way(const void *ctx, u8 *dst, const u8 *src,
le128 *iv);
extern void camellia_xts_enc(const void *ctx, u8 *dst, const u8 *src,
le128 *iv);
extern void camellia_xts_dec(const void *ctx, u8 *dst, const u8 *src,
le128 *iv);
#endif /* ASM_X86_CAMELLIA_H */


@ -5,16 +5,16 @@
* Copyright © 2013 Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
*/
#include <asm/crypto/camellia.h>
#include <asm/crypto/glue_helper.h>
#include <crypto/algapi.h>
#include <crypto/internal/simd.h>
#include <crypto/xts.h>
#include <linux/crypto.h>
#include <linux/err.h>
#include <linux/module.h>
#include <linux/types.h>
#include "camellia.h"
#include "ecb_cbc_helpers.h"
#define CAMELLIA_AESNI_PARALLEL_BLOCKS 16
#define CAMELLIA_AESNI_AVX2_PARALLEL_BLOCKS 32
@ -23,121 +23,6 @@ asmlinkage void camellia_ecb_enc_32way(const void *ctx, u8 *dst, const u8 *src);
asmlinkage void camellia_ecb_dec_32way(const void *ctx, u8 *dst, const u8 *src);
asmlinkage void camellia_cbc_dec_32way(const void *ctx, u8 *dst, const u8 *src);
asmlinkage void camellia_ctr_32way(const void *ctx, u8 *dst, const u8 *src,
le128 *iv);
asmlinkage void camellia_xts_enc_32way(const void *ctx, u8 *dst, const u8 *src,
le128 *iv);
asmlinkage void camellia_xts_dec_32way(const void *ctx, u8 *dst, const u8 *src,
le128 *iv);
static const struct common_glue_ctx camellia_enc = {
.num_funcs = 4,
.fpu_blocks_limit = CAMELLIA_AESNI_PARALLEL_BLOCKS,
.funcs = { {
.num_blocks = CAMELLIA_AESNI_AVX2_PARALLEL_BLOCKS,
.fn_u = { .ecb = camellia_ecb_enc_32way }
}, {
.num_blocks = CAMELLIA_AESNI_PARALLEL_BLOCKS,
.fn_u = { .ecb = camellia_ecb_enc_16way }
}, {
.num_blocks = 2,
.fn_u = { .ecb = camellia_enc_blk_2way }
}, {
.num_blocks = 1,
.fn_u = { .ecb = camellia_enc_blk }
} }
};
static const struct common_glue_ctx camellia_ctr = {
.num_funcs = 4,
.fpu_blocks_limit = CAMELLIA_AESNI_PARALLEL_BLOCKS,
.funcs = { {
.num_blocks = CAMELLIA_AESNI_AVX2_PARALLEL_BLOCKS,
.fn_u = { .ctr = camellia_ctr_32way }
}, {
.num_blocks = CAMELLIA_AESNI_PARALLEL_BLOCKS,
.fn_u = { .ctr = camellia_ctr_16way }
}, {
.num_blocks = 2,
.fn_u = { .ctr = camellia_crypt_ctr_2way }
}, {
.num_blocks = 1,
.fn_u = { .ctr = camellia_crypt_ctr }
} }
};
static const struct common_glue_ctx camellia_enc_xts = {
.num_funcs = 3,
.fpu_blocks_limit = CAMELLIA_AESNI_PARALLEL_BLOCKS,
.funcs = { {
.num_blocks = CAMELLIA_AESNI_AVX2_PARALLEL_BLOCKS,
.fn_u = { .xts = camellia_xts_enc_32way }
}, {
.num_blocks = CAMELLIA_AESNI_PARALLEL_BLOCKS,
.fn_u = { .xts = camellia_xts_enc_16way }
}, {
.num_blocks = 1,
.fn_u = { .xts = camellia_xts_enc }
} }
};
static const struct common_glue_ctx camellia_dec = {
.num_funcs = 4,
.fpu_blocks_limit = CAMELLIA_AESNI_PARALLEL_BLOCKS,
.funcs = { {
.num_blocks = CAMELLIA_AESNI_AVX2_PARALLEL_BLOCKS,
.fn_u = { .ecb = camellia_ecb_dec_32way }
}, {
.num_blocks = CAMELLIA_AESNI_PARALLEL_BLOCKS,
.fn_u = { .ecb = camellia_ecb_dec_16way }
}, {
.num_blocks = 2,
.fn_u = { .ecb = camellia_dec_blk_2way }
}, {
.num_blocks = 1,
.fn_u = { .ecb = camellia_dec_blk }
} }
};
static const struct common_glue_ctx camellia_dec_cbc = {
.num_funcs = 4,
.fpu_blocks_limit = CAMELLIA_AESNI_PARALLEL_BLOCKS,
.funcs = { {
.num_blocks = CAMELLIA_AESNI_AVX2_PARALLEL_BLOCKS,
.fn_u = { .cbc = camellia_cbc_dec_32way }
}, {
.num_blocks = CAMELLIA_AESNI_PARALLEL_BLOCKS,
.fn_u = { .cbc = camellia_cbc_dec_16way }
}, {
.num_blocks = 2,
.fn_u = { .cbc = camellia_decrypt_cbc_2way }
}, {
.num_blocks = 1,
.fn_u = { .cbc = camellia_dec_blk }
} }
};
static const struct common_glue_ctx camellia_dec_xts = {
.num_funcs = 3,
.fpu_blocks_limit = CAMELLIA_AESNI_PARALLEL_BLOCKS,
.funcs = { {
.num_blocks = CAMELLIA_AESNI_AVX2_PARALLEL_BLOCKS,
.fn_u = { .xts = camellia_xts_dec_32way }
}, {
.num_blocks = CAMELLIA_AESNI_PARALLEL_BLOCKS,
.fn_u = { .xts = camellia_xts_dec_16way }
}, {
.num_blocks = 1,
.fn_u = { .xts = camellia_xts_dec }
} }
};
static int camellia_setkey(struct crypto_skcipher *tfm, const u8 *key,
unsigned int keylen)
@ -147,45 +32,39 @@ static int camellia_setkey(struct crypto_skcipher *tfm, const u8 *key,
static int ecb_encrypt(struct skcipher_request *req)
{
return glue_ecb_req_128bit(&camellia_enc, req);
ECB_WALK_START(req, CAMELLIA_BLOCK_SIZE, CAMELLIA_AESNI_PARALLEL_BLOCKS);
ECB_BLOCK(CAMELLIA_AESNI_AVX2_PARALLEL_BLOCKS, camellia_ecb_enc_32way);
ECB_BLOCK(CAMELLIA_AESNI_PARALLEL_BLOCKS, camellia_ecb_enc_16way);
ECB_BLOCK(2, camellia_enc_blk_2way);
ECB_BLOCK(1, camellia_enc_blk);
ECB_WALK_END();
}
static int ecb_decrypt(struct skcipher_request *req)
{
return glue_ecb_req_128bit(&camellia_dec, req);
ECB_WALK_START(req, CAMELLIA_BLOCK_SIZE, CAMELLIA_AESNI_PARALLEL_BLOCKS);
ECB_BLOCK(CAMELLIA_AESNI_AVX2_PARALLEL_BLOCKS, camellia_ecb_dec_32way);
ECB_BLOCK(CAMELLIA_AESNI_PARALLEL_BLOCKS, camellia_ecb_dec_16way);
ECB_BLOCK(2, camellia_dec_blk_2way);
ECB_BLOCK(1, camellia_dec_blk);
ECB_WALK_END();
}
static int cbc_encrypt(struct skcipher_request *req)
{
return glue_cbc_encrypt_req_128bit(camellia_enc_blk, req);
CBC_WALK_START(req, CAMELLIA_BLOCK_SIZE, -1);
CBC_ENC_BLOCK(camellia_enc_blk);
CBC_WALK_END();
}
static int cbc_decrypt(struct skcipher_request *req)
{
return glue_cbc_decrypt_req_128bit(&camellia_dec_cbc, req);
}
static int ctr_crypt(struct skcipher_request *req)
{
return glue_ctr_req_128bit(&camellia_ctr, req);
}
static int xts_encrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct camellia_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
return glue_xts_req_128bit(&camellia_enc_xts, req, camellia_enc_blk,
&ctx->tweak_ctx, &ctx->crypt_ctx, false);
}
static int xts_decrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct camellia_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
return glue_xts_req_128bit(&camellia_dec_xts, req, camellia_enc_blk,
&ctx->tweak_ctx, &ctx->crypt_ctx, true);
CBC_WALK_START(req, CAMELLIA_BLOCK_SIZE, CAMELLIA_AESNI_PARALLEL_BLOCKS);
CBC_DEC_BLOCK(CAMELLIA_AESNI_AVX2_PARALLEL_BLOCKS, camellia_cbc_dec_32way);
CBC_DEC_BLOCK(CAMELLIA_AESNI_PARALLEL_BLOCKS, camellia_cbc_dec_16way);
CBC_DEC_BLOCK(2, camellia_decrypt_cbc_2way);
CBC_DEC_BLOCK(1, camellia_dec_blk);
CBC_WALK_END();
}
static struct skcipher_alg camellia_algs[] = {
@ -216,35 +95,6 @@ static struct skcipher_alg camellia_algs[] = {
.setkey = camellia_setkey,
.encrypt = cbc_encrypt,
.decrypt = cbc_decrypt,
}, {
.base.cra_name = "__ctr(camellia)",
.base.cra_driver_name = "__ctr-camellia-aesni-avx2",
.base.cra_priority = 500,
.base.cra_flags = CRYPTO_ALG_INTERNAL,
.base.cra_blocksize = 1,
.base.cra_ctxsize = sizeof(struct camellia_ctx),
.base.cra_module = THIS_MODULE,
.min_keysize = CAMELLIA_MIN_KEY_SIZE,
.max_keysize = CAMELLIA_MAX_KEY_SIZE,
.ivsize = CAMELLIA_BLOCK_SIZE,
.chunksize = CAMELLIA_BLOCK_SIZE,
.setkey = camellia_setkey,
.encrypt = ctr_crypt,
.decrypt = ctr_crypt,
}, {
.base.cra_name = "__xts(camellia)",
.base.cra_driver_name = "__xts-camellia-aesni-avx2",
.base.cra_priority = 500,
.base.cra_flags = CRYPTO_ALG_INTERNAL,
.base.cra_blocksize = CAMELLIA_BLOCK_SIZE,
.base.cra_ctxsize = sizeof(struct camellia_xts_ctx),
.base.cra_module = THIS_MODULE,
.min_keysize = 2 * CAMELLIA_MIN_KEY_SIZE,
.max_keysize = 2 * CAMELLIA_MAX_KEY_SIZE,
.ivsize = CAMELLIA_BLOCK_SIZE,
.setkey = xts_camellia_setkey,
.encrypt = xts_encrypt,
.decrypt = xts_decrypt,
},
};


@ -5,16 +5,16 @@
* Copyright © 2012-2013 Jussi Kivilinna <jussi.kivilinna@iki.fi>
*/
#include <asm/crypto/camellia.h>
#include <asm/crypto/glue_helper.h>
#include <crypto/algapi.h>
#include <crypto/internal/simd.h>
#include <crypto/xts.h>
#include <linux/crypto.h>
#include <linux/err.h>
#include <linux/module.h>
#include <linux/types.h>
#include "camellia.h"
#include "ecb_cbc_helpers.h"
#define CAMELLIA_AESNI_PARALLEL_BLOCKS 16
/* 16-way parallel cipher functions (avx/aes-ni) */
@ -27,120 +27,6 @@ EXPORT_SYMBOL_GPL(camellia_ecb_dec_16way);
asmlinkage void camellia_cbc_dec_16way(const void *ctx, u8 *dst, const u8 *src);
EXPORT_SYMBOL_GPL(camellia_cbc_dec_16way);
asmlinkage void camellia_ctr_16way(const void *ctx, u8 *dst, const u8 *src,
le128 *iv);
EXPORT_SYMBOL_GPL(camellia_ctr_16way);
asmlinkage void camellia_xts_enc_16way(const void *ctx, u8 *dst, const u8 *src,
le128 *iv);
EXPORT_SYMBOL_GPL(camellia_xts_enc_16way);
asmlinkage void camellia_xts_dec_16way(const void *ctx, u8 *dst, const u8 *src,
le128 *iv);
EXPORT_SYMBOL_GPL(camellia_xts_dec_16way);
void camellia_xts_enc(const void *ctx, u8 *dst, const u8 *src, le128 *iv)
{
glue_xts_crypt_128bit_one(ctx, dst, src, iv, camellia_enc_blk);
}
EXPORT_SYMBOL_GPL(camellia_xts_enc);
void camellia_xts_dec(const void *ctx, u8 *dst, const u8 *src, le128 *iv)
{
glue_xts_crypt_128bit_one(ctx, dst, src, iv, camellia_dec_blk);
}
EXPORT_SYMBOL_GPL(camellia_xts_dec);
static const struct common_glue_ctx camellia_enc = {
.num_funcs = 3,
.fpu_blocks_limit = CAMELLIA_AESNI_PARALLEL_BLOCKS,
.funcs = { {
.num_blocks = CAMELLIA_AESNI_PARALLEL_BLOCKS,
.fn_u = { .ecb = camellia_ecb_enc_16way }
}, {
.num_blocks = 2,
.fn_u = { .ecb = camellia_enc_blk_2way }
}, {
.num_blocks = 1,
.fn_u = { .ecb = camellia_enc_blk }
} }
};
static const struct common_glue_ctx camellia_ctr = {
.num_funcs = 3,
.fpu_blocks_limit = CAMELLIA_AESNI_PARALLEL_BLOCKS,
.funcs = { {
.num_blocks = CAMELLIA_AESNI_PARALLEL_BLOCKS,
.fn_u = { .ctr = camellia_ctr_16way }
}, {
.num_blocks = 2,
.fn_u = { .ctr = camellia_crypt_ctr_2way }
}, {
.num_blocks = 1,
.fn_u = { .ctr = camellia_crypt_ctr }
} }
};
static const struct common_glue_ctx camellia_enc_xts = {
.num_funcs = 2,
.fpu_blocks_limit = CAMELLIA_AESNI_PARALLEL_BLOCKS,
.funcs = { {
.num_blocks = CAMELLIA_AESNI_PARALLEL_BLOCKS,
.fn_u = { .xts = camellia_xts_enc_16way }
}, {
.num_blocks = 1,
.fn_u = { .xts = camellia_xts_enc }
} }
};
static const struct common_glue_ctx camellia_dec = {
.num_funcs = 3,
.fpu_blocks_limit = CAMELLIA_AESNI_PARALLEL_BLOCKS,
.funcs = { {
.num_blocks = CAMELLIA_AESNI_PARALLEL_BLOCKS,
.fn_u = { .ecb = camellia_ecb_dec_16way }
}, {
.num_blocks = 2,
.fn_u = { .ecb = camellia_dec_blk_2way }
}, {
.num_blocks = 1,
.fn_u = { .ecb = camellia_dec_blk }
} }
};
static const struct common_glue_ctx camellia_dec_cbc = {
.num_funcs = 3,
.fpu_blocks_limit = CAMELLIA_AESNI_PARALLEL_BLOCKS,
.funcs = { {
.num_blocks = CAMELLIA_AESNI_PARALLEL_BLOCKS,
.fn_u = { .cbc = camellia_cbc_dec_16way }
}, {
.num_blocks = 2,
.fn_u = { .cbc = camellia_decrypt_cbc_2way }
}, {
.num_blocks = 1,
.fn_u = { .cbc = camellia_dec_blk }
} }
};
static const struct common_glue_ctx camellia_dec_xts = {
.num_funcs = 2,
.fpu_blocks_limit = CAMELLIA_AESNI_PARALLEL_BLOCKS,
.funcs = { {
.num_blocks = CAMELLIA_AESNI_PARALLEL_BLOCKS,
.fn_u = { .xts = camellia_xts_dec_16way }
}, {
.num_blocks = 1,
.fn_u = { .xts = camellia_xts_dec }
} }
};
static int camellia_setkey(struct crypto_skcipher *tfm, const u8 *key,
unsigned int keylen)
{
@ -149,65 +35,36 @@ static int camellia_setkey(struct crypto_skcipher *tfm, const u8 *key,
static int ecb_encrypt(struct skcipher_request *req)
{
return glue_ecb_req_128bit(&camellia_enc, req);
ECB_WALK_START(req, CAMELLIA_BLOCK_SIZE, CAMELLIA_AESNI_PARALLEL_BLOCKS);
ECB_BLOCK(CAMELLIA_AESNI_PARALLEL_BLOCKS, camellia_ecb_enc_16way);
ECB_BLOCK(2, camellia_enc_blk_2way);
ECB_BLOCK(1, camellia_enc_blk);
ECB_WALK_END();
}
static int ecb_decrypt(struct skcipher_request *req)
{
return glue_ecb_req_128bit(&camellia_dec, req);
ECB_WALK_START(req, CAMELLIA_BLOCK_SIZE, CAMELLIA_AESNI_PARALLEL_BLOCKS);
ECB_BLOCK(CAMELLIA_AESNI_PARALLEL_BLOCKS, camellia_ecb_dec_16way);
ECB_BLOCK(2, camellia_dec_blk_2way);
ECB_BLOCK(1, camellia_dec_blk);
ECB_WALK_END();
}
static int cbc_encrypt(struct skcipher_request *req)
{
return glue_cbc_encrypt_req_128bit(camellia_enc_blk, req);
CBC_WALK_START(req, CAMELLIA_BLOCK_SIZE, -1);
CBC_ENC_BLOCK(camellia_enc_blk);
CBC_WALK_END();
}
static int cbc_decrypt(struct skcipher_request *req)
{
return glue_cbc_decrypt_req_128bit(&camellia_dec_cbc, req);
}
static int ctr_crypt(struct skcipher_request *req)
{
return glue_ctr_req_128bit(&camellia_ctr, req);
}
int xts_camellia_setkey(struct crypto_skcipher *tfm, const u8 *key,
unsigned int keylen)
{
struct camellia_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
int err;
err = xts_verify_key(tfm, key, keylen);
if (err)
return err;
/* first half of xts-key is for crypt */
err = __camellia_setkey(&ctx->crypt_ctx, key, keylen / 2);
if (err)
return err;
/* second half of xts-key is for tweak */
return __camellia_setkey(&ctx->tweak_ctx, key + keylen / 2, keylen / 2);
}
EXPORT_SYMBOL_GPL(xts_camellia_setkey);
static int xts_encrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct camellia_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
return glue_xts_req_128bit(&camellia_enc_xts, req, camellia_enc_blk,
&ctx->tweak_ctx, &ctx->crypt_ctx, false);
}
static int xts_decrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct camellia_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
return glue_xts_req_128bit(&camellia_dec_xts, req, camellia_enc_blk,
&ctx->tweak_ctx, &ctx->crypt_ctx, true);
CBC_WALK_START(req, CAMELLIA_BLOCK_SIZE, CAMELLIA_AESNI_PARALLEL_BLOCKS);
CBC_DEC_BLOCK(CAMELLIA_AESNI_PARALLEL_BLOCKS, camellia_cbc_dec_16way);
CBC_DEC_BLOCK(2, camellia_decrypt_cbc_2way);
CBC_DEC_BLOCK(1, camellia_dec_blk);
CBC_WALK_END();
}
static struct skcipher_alg camellia_algs[] = {
@ -238,36 +95,7 @@ static struct skcipher_alg camellia_algs[] = {
.setkey = camellia_setkey,
.encrypt = cbc_encrypt,
.decrypt = cbc_decrypt,
}, {
.base.cra_name = "__ctr(camellia)",
.base.cra_driver_name = "__ctr-camellia-aesni",
.base.cra_priority = 400,
.base.cra_flags = CRYPTO_ALG_INTERNAL,
.base.cra_blocksize = 1,
.base.cra_ctxsize = sizeof(struct camellia_ctx),
.base.cra_module = THIS_MODULE,
.min_keysize = CAMELLIA_MIN_KEY_SIZE,
.max_keysize = CAMELLIA_MAX_KEY_SIZE,
.ivsize = CAMELLIA_BLOCK_SIZE,
.chunksize = CAMELLIA_BLOCK_SIZE,
.setkey = camellia_setkey,
.encrypt = ctr_crypt,
.decrypt = ctr_crypt,
}, {
.base.cra_name = "__xts(camellia)",
.base.cra_driver_name = "__xts-camellia-aesni",
.base.cra_priority = 400,
.base.cra_flags = CRYPTO_ALG_INTERNAL,
.base.cra_blocksize = CAMELLIA_BLOCK_SIZE,
.base.cra_ctxsize = sizeof(struct camellia_xts_ctx),
.base.cra_module = THIS_MODULE,
.min_keysize = 2 * CAMELLIA_MIN_KEY_SIZE,
.max_keysize = 2 * CAMELLIA_MAX_KEY_SIZE,
.ivsize = CAMELLIA_BLOCK_SIZE,
.setkey = xts_camellia_setkey,
.encrypt = xts_encrypt,
.decrypt = xts_decrypt,
},
}
};
static struct simd_skcipher_alg *camellia_simd_algs[ARRAY_SIZE(camellia_algs)];


@ -14,8 +14,9 @@
#include <linux/module.h>
#include <linux/types.h>
#include <crypto/algapi.h>
#include <asm/crypto/camellia.h>
#include <asm/crypto/glue_helper.h>
#include "camellia.h"
#include "ecb_cbc_helpers.h"
/* regular block cipher functions */
asmlinkage void __camellia_enc_blk(const void *ctx, u8 *dst, const u8 *src,
@ -1262,129 +1263,47 @@ static int camellia_setkey_skcipher(struct crypto_skcipher *tfm, const u8 *key,
return camellia_setkey(&tfm->base, key, key_len);
}
void camellia_decrypt_cbc_2way(const void *ctx, u8 *d, const u8 *s)
void camellia_decrypt_cbc_2way(const void *ctx, u8 *dst, const u8 *src)
{
u128 *dst = (u128 *)d;
const u128 *src = (const u128 *)s;
u128 iv = *src;
u8 buf[CAMELLIA_BLOCK_SIZE];
const u8 *iv = src;
camellia_dec_blk_2way(ctx, (u8 *)dst, (u8 *)src);
u128_xor(&dst[1], &dst[1], &iv);
if (dst == src)
iv = memcpy(buf, iv, sizeof(buf));
camellia_dec_blk_2way(ctx, dst, src);
crypto_xor(dst + CAMELLIA_BLOCK_SIZE, iv, CAMELLIA_BLOCK_SIZE);
}
EXPORT_SYMBOL_GPL(camellia_decrypt_cbc_2way);
void camellia_crypt_ctr(const void *ctx, u8 *d, const u8 *s, le128 *iv)
{
be128 ctrblk;
u128 *dst = (u128 *)d;
const u128 *src = (const u128 *)s;
if (dst != src)
*dst = *src;
le128_to_be128(&ctrblk, iv);
le128_inc(iv);
camellia_enc_blk_xor(ctx, (u8 *)dst, (u8 *)&ctrblk);
}
EXPORT_SYMBOL_GPL(camellia_crypt_ctr);
void camellia_crypt_ctr_2way(const void *ctx, u8 *d, const u8 *s, le128 *iv)
{
be128 ctrblks[2];
u128 *dst = (u128 *)d;
const u128 *src = (const u128 *)s;
if (dst != src) {
dst[0] = src[0];
dst[1] = src[1];
}
le128_to_be128(&ctrblks[0], iv);
le128_inc(iv);
le128_to_be128(&ctrblks[1], iv);
le128_inc(iv);
camellia_enc_blk_xor_2way(ctx, (u8 *)dst, (u8 *)ctrblks);
}
EXPORT_SYMBOL_GPL(camellia_crypt_ctr_2way);
static const struct common_glue_ctx camellia_enc = {
.num_funcs = 2,
.fpu_blocks_limit = -1,
.funcs = { {
.num_blocks = 2,
.fn_u = { .ecb = camellia_enc_blk_2way }
}, {
.num_blocks = 1,
.fn_u = { .ecb = camellia_enc_blk }
} }
};
static const struct common_glue_ctx camellia_ctr = {
.num_funcs = 2,
.fpu_blocks_limit = -1,
.funcs = { {
.num_blocks = 2,
.fn_u = { .ctr = camellia_crypt_ctr_2way }
}, {
.num_blocks = 1,
.fn_u = { .ctr = camellia_crypt_ctr }
} }
};
static const struct common_glue_ctx camellia_dec = {
.num_funcs = 2,
.fpu_blocks_limit = -1,
.funcs = { {
.num_blocks = 2,
.fn_u = { .ecb = camellia_dec_blk_2way }
}, {
.num_blocks = 1,
.fn_u = { .ecb = camellia_dec_blk }
} }
};
static const struct common_glue_ctx camellia_dec_cbc = {
.num_funcs = 2,
.fpu_blocks_limit = -1,
.funcs = { {
.num_blocks = 2,
.fn_u = { .cbc = camellia_decrypt_cbc_2way }
}, {
.num_blocks = 1,
.fn_u = { .cbc = camellia_dec_blk }
} }
};
static int ecb_encrypt(struct skcipher_request *req)
{
return glue_ecb_req_128bit(&camellia_enc, req);
ECB_WALK_START(req, CAMELLIA_BLOCK_SIZE, -1);
ECB_BLOCK(2, camellia_enc_blk_2way);
ECB_BLOCK(1, camellia_enc_blk);
ECB_WALK_END();
}
static int ecb_decrypt(struct skcipher_request *req)
{
return glue_ecb_req_128bit(&camellia_dec, req);
ECB_WALK_START(req, CAMELLIA_BLOCK_SIZE, -1);
ECB_BLOCK(2, camellia_dec_blk_2way);
ECB_BLOCK(1, camellia_dec_blk);
ECB_WALK_END();
}
static int cbc_encrypt(struct skcipher_request *req)
{
return glue_cbc_encrypt_req_128bit(camellia_enc_blk, req);
CBC_WALK_START(req, CAMELLIA_BLOCK_SIZE, -1);
CBC_ENC_BLOCK(camellia_enc_blk);
CBC_WALK_END();
}
static int cbc_decrypt(struct skcipher_request *req)
{
return glue_cbc_decrypt_req_128bit(&camellia_dec_cbc, req);
}
static int ctr_crypt(struct skcipher_request *req)
{
return glue_ctr_req_128bit(&camellia_ctr, req);
CBC_WALK_START(req, CAMELLIA_BLOCK_SIZE, -1);
CBC_DEC_BLOCK(2, camellia_decrypt_cbc_2way);
CBC_DEC_BLOCK(1, camellia_dec_blk);
CBC_WALK_END();
}
static struct crypto_alg camellia_cipher_alg = {
@ -1433,20 +1352,6 @@ static struct skcipher_alg camellia_skcipher_algs[] = {
.setkey = camellia_setkey_skcipher,
.encrypt = cbc_encrypt,
.decrypt = cbc_decrypt,
}, {
.base.cra_name = "ctr(camellia)",
.base.cra_driver_name = "ctr-camellia-asm",
.base.cra_priority = 300,
.base.cra_blocksize = 1,
.base.cra_ctxsize = sizeof(struct camellia_ctx),
.base.cra_module = THIS_MODULE,
.min_keysize = CAMELLIA_MIN_KEY_SIZE,
.max_keysize = CAMELLIA_MAX_KEY_SIZE,
.ivsize = CAMELLIA_BLOCK_SIZE,
.chunksize = CAMELLIA_BLOCK_SIZE,
.setkey = camellia_setkey_skcipher,
.encrypt = ctr_crypt,
.decrypt = ctr_crypt,
}
};


@ -6,7 +6,6 @@
* <Johannes.Goetzfried@informatik.stud.uni-erlangen.de>
*/
#include <asm/crypto/glue_helper.h>
#include <crypto/algapi.h>
#include <crypto/cast5.h>
#include <crypto/internal/simd.h>
@ -15,6 +14,8 @@
#include <linux/module.h>
#include <linux/types.h>
#include "ecb_cbc_helpers.h"
#define CAST5_PARALLEL_BLOCKS 16
asmlinkage void cast5_ecb_enc_16way(struct cast5_ctx *ctx, u8 *dst,
@ -23,8 +24,6 @@ asmlinkage void cast5_ecb_dec_16way(struct cast5_ctx *ctx, u8 *dst,
const u8 *src);
asmlinkage void cast5_cbc_dec_16way(struct cast5_ctx *ctx, u8 *dst,
const u8 *src);
asmlinkage void cast5_ctr_16way(struct cast5_ctx *ctx, u8 *dst, const u8 *src,
__be64 *iv);
static int cast5_setkey_skcipher(struct crypto_skcipher *tfm, const u8 *key,
unsigned int keylen)
@ -32,272 +31,35 @@ static int cast5_setkey_skcipher(struct crypto_skcipher *tfm, const u8 *key,
return cast5_setkey(&tfm->base, key, keylen);
}
static inline bool cast5_fpu_begin(bool fpu_enabled, struct skcipher_walk *walk,
unsigned int nbytes)
{
return glue_fpu_begin(CAST5_BLOCK_SIZE, CAST5_PARALLEL_BLOCKS,
walk, fpu_enabled, nbytes);
}
static inline void cast5_fpu_end(bool fpu_enabled)
{
return glue_fpu_end(fpu_enabled);
}
static int ecb_crypt(struct skcipher_request *req, bool enc)
{
bool fpu_enabled = false;
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct cast5_ctx *ctx = crypto_skcipher_ctx(tfm);
struct skcipher_walk walk;
const unsigned int bsize = CAST5_BLOCK_SIZE;
unsigned int nbytes;
void (*fn)(struct cast5_ctx *ctx, u8 *dst, const u8 *src);
int err;
err = skcipher_walk_virt(&walk, req, false);
while ((nbytes = walk.nbytes)) {
u8 *wsrc = walk.src.virt.addr;
u8 *wdst = walk.dst.virt.addr;
fpu_enabled = cast5_fpu_begin(fpu_enabled, &walk, nbytes);
/* Process multi-block batch */
if (nbytes >= bsize * CAST5_PARALLEL_BLOCKS) {
fn = (enc) ? cast5_ecb_enc_16way : cast5_ecb_dec_16way;
do {
fn(ctx, wdst, wsrc);
wsrc += bsize * CAST5_PARALLEL_BLOCKS;
wdst += bsize * CAST5_PARALLEL_BLOCKS;
nbytes -= bsize * CAST5_PARALLEL_BLOCKS;
} while (nbytes >= bsize * CAST5_PARALLEL_BLOCKS);
if (nbytes < bsize)
goto done;
}
fn = (enc) ? __cast5_encrypt : __cast5_decrypt;
/* Handle leftovers */
do {
fn(ctx, wdst, wsrc);
wsrc += bsize;
wdst += bsize;
nbytes -= bsize;
} while (nbytes >= bsize);
done:
err = skcipher_walk_done(&walk, nbytes);
}
cast5_fpu_end(fpu_enabled);
return err;
}
static int ecb_encrypt(struct skcipher_request *req)
{
return ecb_crypt(req, true);
ECB_WALK_START(req, CAST5_BLOCK_SIZE, CAST5_PARALLEL_BLOCKS);
ECB_BLOCK(CAST5_PARALLEL_BLOCKS, cast5_ecb_enc_16way);
ECB_BLOCK(1, __cast5_encrypt);
ECB_WALK_END();
}
static int ecb_decrypt(struct skcipher_request *req)
{
return ecb_crypt(req, false);
ECB_WALK_START(req, CAST5_BLOCK_SIZE, CAST5_PARALLEL_BLOCKS);
ECB_BLOCK(CAST5_PARALLEL_BLOCKS, cast5_ecb_dec_16way);
ECB_BLOCK(1, __cast5_decrypt);
ECB_WALK_END();
}
static int cbc_encrypt(struct skcipher_request *req)
{
const unsigned int bsize = CAST5_BLOCK_SIZE;
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct cast5_ctx *ctx = crypto_skcipher_ctx(tfm);
struct skcipher_walk walk;
unsigned int nbytes;
int err;
err = skcipher_walk_virt(&walk, req, false);
while ((nbytes = walk.nbytes)) {
u64 *src = (u64 *)walk.src.virt.addr;
u64 *dst = (u64 *)walk.dst.virt.addr;
u64 *iv = (u64 *)walk.iv;
do {
*dst = *src ^ *iv;
__cast5_encrypt(ctx, (u8 *)dst, (u8 *)dst);
iv = dst;
src++;
dst++;
nbytes -= bsize;
} while (nbytes >= bsize);
*(u64 *)walk.iv = *iv;
err = skcipher_walk_done(&walk, nbytes);
}
return err;
}
static unsigned int __cbc_decrypt(struct cast5_ctx *ctx,
struct skcipher_walk *walk)
{
const unsigned int bsize = CAST5_BLOCK_SIZE;
unsigned int nbytes = walk->nbytes;
u64 *src = (u64 *)walk->src.virt.addr;
u64 *dst = (u64 *)walk->dst.virt.addr;
u64 last_iv;
/* Start of the last block. */
src += nbytes / bsize - 1;
dst += nbytes / bsize - 1;
last_iv = *src;
/* Process multi-block batch */
if (nbytes >= bsize * CAST5_PARALLEL_BLOCKS) {
do {
nbytes -= bsize * (CAST5_PARALLEL_BLOCKS - 1);
src -= CAST5_PARALLEL_BLOCKS - 1;
dst -= CAST5_PARALLEL_BLOCKS - 1;
cast5_cbc_dec_16way(ctx, (u8 *)dst, (u8 *)src);
nbytes -= bsize;
if (nbytes < bsize)
goto done;
*dst ^= *(src - 1);
src -= 1;
dst -= 1;
} while (nbytes >= bsize * CAST5_PARALLEL_BLOCKS);
}
/* Handle leftovers */
for (;;) {
__cast5_decrypt(ctx, (u8 *)dst, (u8 *)src);
nbytes -= bsize;
if (nbytes < bsize)
break;
*dst ^= *(src - 1);
src -= 1;
dst -= 1;
}
done:
*dst ^= *(u64 *)walk->iv;
*(u64 *)walk->iv = last_iv;
return nbytes;
CBC_WALK_START(req, CAST5_BLOCK_SIZE, -1);
CBC_ENC_BLOCK(__cast5_encrypt);
CBC_WALK_END();
}
static int cbc_decrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct cast5_ctx *ctx = crypto_skcipher_ctx(tfm);
bool fpu_enabled = false;
struct skcipher_walk walk;
unsigned int nbytes;
int err;
err = skcipher_walk_virt(&walk, req, false);
while ((nbytes = walk.nbytes)) {
fpu_enabled = cast5_fpu_begin(fpu_enabled, &walk, nbytes);
nbytes = __cbc_decrypt(ctx, &walk);
err = skcipher_walk_done(&walk, nbytes);
}
cast5_fpu_end(fpu_enabled);
return err;
}
static void ctr_crypt_final(struct skcipher_walk *walk, struct cast5_ctx *ctx)
{
u8 *ctrblk = walk->iv;
u8 keystream[CAST5_BLOCK_SIZE];
u8 *src = walk->src.virt.addr;
u8 *dst = walk->dst.virt.addr;
unsigned int nbytes = walk->nbytes;
__cast5_encrypt(ctx, keystream, ctrblk);
crypto_xor_cpy(dst, keystream, src, nbytes);
crypto_inc(ctrblk, CAST5_BLOCK_SIZE);
}
static unsigned int __ctr_crypt(struct skcipher_walk *walk,
struct cast5_ctx *ctx)
{
const unsigned int bsize = CAST5_BLOCK_SIZE;
unsigned int nbytes = walk->nbytes;
u64 *src = (u64 *)walk->src.virt.addr;
u64 *dst = (u64 *)walk->dst.virt.addr;
/* Process multi-block batch */
if (nbytes >= bsize * CAST5_PARALLEL_BLOCKS) {
do {
cast5_ctr_16way(ctx, (u8 *)dst, (u8 *)src,
(__be64 *)walk->iv);
src += CAST5_PARALLEL_BLOCKS;
dst += CAST5_PARALLEL_BLOCKS;
nbytes -= bsize * CAST5_PARALLEL_BLOCKS;
} while (nbytes >= bsize * CAST5_PARALLEL_BLOCKS);
if (nbytes < bsize)
goto done;
}
/* Handle leftovers */
do {
u64 ctrblk;
if (dst != src)
*dst = *src;
ctrblk = *(u64 *)walk->iv;
be64_add_cpu((__be64 *)walk->iv, 1);
__cast5_encrypt(ctx, (u8 *)&ctrblk, (u8 *)&ctrblk);
*dst ^= ctrblk;
src += 1;
dst += 1;
nbytes -= bsize;
} while (nbytes >= bsize);
done:
return nbytes;
}
static int ctr_crypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct cast5_ctx *ctx = crypto_skcipher_ctx(tfm);
bool fpu_enabled = false;
struct skcipher_walk walk;
unsigned int nbytes;
int err;
err = skcipher_walk_virt(&walk, req, false);
while ((nbytes = walk.nbytes) >= CAST5_BLOCK_SIZE) {
fpu_enabled = cast5_fpu_begin(fpu_enabled, &walk, nbytes);
nbytes = __ctr_crypt(&walk, ctx);
err = skcipher_walk_done(&walk, nbytes);
}
cast5_fpu_end(fpu_enabled);
if (walk.nbytes) {
ctr_crypt_final(&walk, ctx);
err = skcipher_walk_done(&walk, 0);
}
return err;
CBC_WALK_START(req, CAST5_BLOCK_SIZE, CAST5_PARALLEL_BLOCKS);
CBC_DEC_BLOCK(CAST5_PARALLEL_BLOCKS, cast5_cbc_dec_16way);
CBC_DEC_BLOCK(1, __cast5_decrypt);
CBC_WALK_END();
}
static struct skcipher_alg cast5_algs[] = {
@ -328,21 +90,6 @@ static struct skcipher_alg cast5_algs[] = {
.setkey = cast5_setkey_skcipher,
.encrypt = cbc_encrypt,
.decrypt = cbc_decrypt,
}, {
.base.cra_name = "__ctr(cast5)",
.base.cra_driver_name = "__ctr-cast5-avx",
.base.cra_priority = 200,
.base.cra_flags = CRYPTO_ALG_INTERNAL,
.base.cra_blocksize = 1,
.base.cra_ctxsize = sizeof(struct cast5_ctx),
.base.cra_module = THIS_MODULE,
.min_keysize = CAST5_MIN_KEY_SIZE,
.max_keysize = CAST5_MAX_KEY_SIZE,
.ivsize = CAST5_BLOCK_SIZE,
.chunksize = CAST5_BLOCK_SIZE,
.setkey = cast5_setkey_skcipher,
.encrypt = ctr_crypt,
.decrypt = ctr_crypt,
}
};


@ -212,8 +212,6 @@
.section .rodata.cst16, "aM", @progbits, 16
.align 16
.Lxts_gf128mul_and_shl1_mask:
.byte 0x87, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0
.Lbswap_mask:
.byte 3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8, 15, 14, 13, 12
.Lbswap128_mask:
@ -412,85 +410,3 @@ SYM_FUNC_START(cast6_cbc_dec_8way)
FRAME_END
ret;
SYM_FUNC_END(cast6_cbc_dec_8way)
SYM_FUNC_START(cast6_ctr_8way)
/* input:
* %rdi: ctx, CTX
* %rsi: dst
* %rdx: src
* %rcx: iv (little endian, 128bit)
*/
FRAME_BEGIN
pushq %r12;
pushq %r15
movq %rdi, CTX;
movq %rsi, %r11;
movq %rdx, %r12;
load_ctr_8way(%rcx, .Lbswap128_mask, RA1, RB1, RC1, RD1, RA2, RB2, RC2,
RD2, RX, RKR, RKM);
call __cast6_enc_blk8;
store_ctr_8way(%r12, %r11, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
popq %r15;
popq %r12;
FRAME_END
ret;
SYM_FUNC_END(cast6_ctr_8way)
SYM_FUNC_START(cast6_xts_enc_8way)
/* input:
* %rdi: ctx, CTX
* %rsi: dst
* %rdx: src
* %rcx: iv (t ⊕ αⁿ ∊ GF(2¹²⁸))
*/
FRAME_BEGIN
pushq %r15;
movq %rdi, CTX
movq %rsi, %r11;
/* regs <= src, dst <= IVs, regs <= regs xor IVs */
load_xts_8way(%rcx, %rdx, %rsi, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2,
RX, RKR, RKM, .Lxts_gf128mul_and_shl1_mask);
call __cast6_enc_blk8;
/* dst <= regs xor IVs(in dst) */
store_xts_8way(%r11, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
popq %r15;
FRAME_END
ret;
SYM_FUNC_END(cast6_xts_enc_8way)
SYM_FUNC_START(cast6_xts_dec_8way)
/* input:
* %rdi: ctx, CTX
* %rsi: dst
* %rdx: src
* %rcx: iv (t ⊕ αⁿ ∊ GF(2¹²⁸))
*/
FRAME_BEGIN
pushq %r15;
movq %rdi, CTX
movq %rsi, %r11;
/* regs <= src, dst <= IVs, regs <= regs xor IVs */
load_xts_8way(%rcx, %rdx, %rsi, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2,
RX, RKR, RKM, .Lxts_gf128mul_and_shl1_mask);
call __cast6_dec_blk8;
/* dst <= regs xor IVs(in dst) */
store_xts_8way(%r11, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
popq %r15;
FRAME_END
ret;
SYM_FUNC_END(cast6_xts_dec_8way)


@ -15,8 +15,8 @@
#include <crypto/algapi.h>
#include <crypto/cast6.h>
#include <crypto/internal/simd.h>
#include <crypto/xts.h>
#include <asm/crypto/glue_helper.h>
#include "ecb_cbc_helpers.h"
#define CAST6_PARALLEL_BLOCKS 8
@ -24,13 +24,6 @@ asmlinkage void cast6_ecb_enc_8way(const void *ctx, u8 *dst, const u8 *src);
asmlinkage void cast6_ecb_dec_8way(const void *ctx, u8 *dst, const u8 *src);
asmlinkage void cast6_cbc_dec_8way(const void *ctx, u8 *dst, const u8 *src);
asmlinkage void cast6_ctr_8way(const void *ctx, u8 *dst, const u8 *src,
le128 *iv);
asmlinkage void cast6_xts_enc_8way(const void *ctx, u8 *dst, const u8 *src,
le128 *iv);
asmlinkage void cast6_xts_dec_8way(const void *ctx, u8 *dst, const u8 *src,
le128 *iv);
static int cast6_setkey_skcipher(struct crypto_skcipher *tfm,
const u8 *key, unsigned int keylen)
@ -38,172 +31,35 @@ static int cast6_setkey_skcipher(struct crypto_skcipher *tfm,
return cast6_setkey(&tfm->base, key, keylen);
}
static void cast6_xts_enc(const void *ctx, u8 *dst, const u8 *src, le128 *iv)
{
glue_xts_crypt_128bit_one(ctx, dst, src, iv, __cast6_encrypt);
}
static void cast6_xts_dec(const void *ctx, u8 *dst, const u8 *src, le128 *iv)
{
glue_xts_crypt_128bit_one(ctx, dst, src, iv, __cast6_decrypt);
}
static void cast6_crypt_ctr(const void *ctx, u8 *d, const u8 *s, le128 *iv)
{
be128 ctrblk;
u128 *dst = (u128 *)d;
const u128 *src = (const u128 *)s;
le128_to_be128(&ctrblk, iv);
le128_inc(iv);
__cast6_encrypt(ctx, (u8 *)&ctrblk, (u8 *)&ctrblk);
u128_xor(dst, src, (u128 *)&ctrblk);
}
static const struct common_glue_ctx cast6_enc = {
.num_funcs = 2,
.fpu_blocks_limit = CAST6_PARALLEL_BLOCKS,
.funcs = { {
.num_blocks = CAST6_PARALLEL_BLOCKS,
.fn_u = { .ecb = cast6_ecb_enc_8way }
}, {
.num_blocks = 1,
.fn_u = { .ecb = __cast6_encrypt }
} }
};
static const struct common_glue_ctx cast6_ctr = {
.num_funcs = 2,
.fpu_blocks_limit = CAST6_PARALLEL_BLOCKS,
.funcs = { {
.num_blocks = CAST6_PARALLEL_BLOCKS,
.fn_u = { .ctr = cast6_ctr_8way }
}, {
.num_blocks = 1,
.fn_u = { .ctr = cast6_crypt_ctr }
} }
};
static const struct common_glue_ctx cast6_enc_xts = {
.num_funcs = 2,
.fpu_blocks_limit = CAST6_PARALLEL_BLOCKS,
.funcs = { {
.num_blocks = CAST6_PARALLEL_BLOCKS,
.fn_u = { .xts = cast6_xts_enc_8way }
}, {
.num_blocks = 1,
.fn_u = { .xts = cast6_xts_enc }
} }
};
static const struct common_glue_ctx cast6_dec = {
.num_funcs = 2,
.fpu_blocks_limit = CAST6_PARALLEL_BLOCKS,
.funcs = { {
.num_blocks = CAST6_PARALLEL_BLOCKS,
.fn_u = { .ecb = cast6_ecb_dec_8way }
}, {
.num_blocks = 1,
.fn_u = { .ecb = __cast6_decrypt }
} }
};
static const struct common_glue_ctx cast6_dec_cbc = {
.num_funcs = 2,
.fpu_blocks_limit = CAST6_PARALLEL_BLOCKS,
.funcs = { {
.num_blocks = CAST6_PARALLEL_BLOCKS,
.fn_u = { .cbc = cast6_cbc_dec_8way }
}, {
.num_blocks = 1,
.fn_u = { .cbc = __cast6_decrypt }
} }
};
static const struct common_glue_ctx cast6_dec_xts = {
.num_funcs = 2,
.fpu_blocks_limit = CAST6_PARALLEL_BLOCKS,
.funcs = { {
.num_blocks = CAST6_PARALLEL_BLOCKS,
.fn_u = { .xts = cast6_xts_dec_8way }
}, {
.num_blocks = 1,
.fn_u = { .xts = cast6_xts_dec }
} }
};
static int ecb_encrypt(struct skcipher_request *req)
{
return glue_ecb_req_128bit(&cast6_enc, req);
ECB_WALK_START(req, CAST6_BLOCK_SIZE, CAST6_PARALLEL_BLOCKS);
ECB_BLOCK(CAST6_PARALLEL_BLOCKS, cast6_ecb_enc_8way);
ECB_BLOCK(1, __cast6_encrypt);
ECB_WALK_END();
}
static int ecb_decrypt(struct skcipher_request *req)
{
return glue_ecb_req_128bit(&cast6_dec, req);
ECB_WALK_START(req, CAST6_BLOCK_SIZE, CAST6_PARALLEL_BLOCKS);
ECB_BLOCK(CAST6_PARALLEL_BLOCKS, cast6_ecb_dec_8way);
ECB_BLOCK(1, __cast6_decrypt);
ECB_WALK_END();
}
static int cbc_encrypt(struct skcipher_request *req)
{
return glue_cbc_encrypt_req_128bit(__cast6_encrypt, req);
CBC_WALK_START(req, CAST6_BLOCK_SIZE, -1);
CBC_ENC_BLOCK(__cast6_encrypt);
CBC_WALK_END();
}
static int cbc_decrypt(struct skcipher_request *req)
{
return glue_cbc_decrypt_req_128bit(&cast6_dec_cbc, req);
}
static int ctr_crypt(struct skcipher_request *req)
{
return glue_ctr_req_128bit(&cast6_ctr, req);
}
struct cast6_xts_ctx {
struct cast6_ctx tweak_ctx;
struct cast6_ctx crypt_ctx;
};
static int xts_cast6_setkey(struct crypto_skcipher *tfm, const u8 *key,
unsigned int keylen)
{
struct cast6_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
int err;
err = xts_verify_key(tfm, key, keylen);
if (err)
return err;
/* first half of xts-key is for crypt */
err = __cast6_setkey(&ctx->crypt_ctx, key, keylen / 2);
if (err)
return err;
/* second half of xts-key is for tweak */
return __cast6_setkey(&ctx->tweak_ctx, key + keylen / 2, keylen / 2);
}
static int xts_encrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct cast6_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
return glue_xts_req_128bit(&cast6_enc_xts, req, __cast6_encrypt,
&ctx->tweak_ctx, &ctx->crypt_ctx, false);
}
static int xts_decrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct cast6_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
return glue_xts_req_128bit(&cast6_dec_xts, req, __cast6_encrypt,
&ctx->tweak_ctx, &ctx->crypt_ctx, true);
CBC_WALK_START(req, CAST6_BLOCK_SIZE, CAST6_PARALLEL_BLOCKS);
CBC_DEC_BLOCK(CAST6_PARALLEL_BLOCKS, cast6_cbc_dec_8way);
CBC_DEC_BLOCK(1, __cast6_decrypt);
CBC_WALK_END();
}
static struct skcipher_alg cast6_algs[] = {
@ -234,35 +90,6 @@ static struct skcipher_alg cast6_algs[] = {
.setkey = cast6_setkey_skcipher,
.encrypt = cbc_encrypt,
.decrypt = cbc_decrypt,
}, {
.base.cra_name = "__ctr(cast6)",
.base.cra_driver_name = "__ctr-cast6-avx",
.base.cra_priority = 200,
.base.cra_flags = CRYPTO_ALG_INTERNAL,
.base.cra_blocksize = 1,
.base.cra_ctxsize = sizeof(struct cast6_ctx),
.base.cra_module = THIS_MODULE,
.min_keysize = CAST6_MIN_KEY_SIZE,
.max_keysize = CAST6_MAX_KEY_SIZE,
.ivsize = CAST6_BLOCK_SIZE,
.chunksize = CAST6_BLOCK_SIZE,
.setkey = cast6_setkey_skcipher,
.encrypt = ctr_crypt,
.decrypt = ctr_crypt,
}, {
.base.cra_name = "__xts(cast6)",
.base.cra_driver_name = "__xts-cast6-avx",
.base.cra_priority = 200,
.base.cra_flags = CRYPTO_ALG_INTERNAL,
.base.cra_blocksize = CAST6_BLOCK_SIZE,
.base.cra_ctxsize = sizeof(struct cast6_xts_ctx),
.base.cra_module = THIS_MODULE,
.min_keysize = 2 * CAST6_MIN_KEY_SIZE,
.max_keysize = 2 * CAST6_MAX_KEY_SIZE,
.ivsize = CAST6_BLOCK_SIZE,
.setkey = xts_cast6_setkey,
.encrypt = xts_encrypt,
.decrypt = xts_decrypt,
},
};


@ -6,8 +6,6 @@
*
* CBC & ECB parts based on code (crypto/cbc.c,ecb.c) by:
* Copyright (c) 2006 Herbert Xu <herbert@gondor.apana.org.au>
* CTR part based on code (crypto/ctr.c) by:
* (C) Copyright IBM Corp. 2007 - Joy Latten <latten@us.ibm.com>
*/
#include <crypto/algapi.h>
@ -253,94 +251,6 @@ static int cbc_decrypt(struct skcipher_request *req)
return err;
}
static void ctr_crypt_final(struct des3_ede_x86_ctx *ctx,
struct skcipher_walk *walk)
{
u8 *ctrblk = walk->iv;
u8 keystream[DES3_EDE_BLOCK_SIZE];
u8 *src = walk->src.virt.addr;
u8 *dst = walk->dst.virt.addr;
unsigned int nbytes = walk->nbytes;
des3_ede_enc_blk(ctx, keystream, ctrblk);
crypto_xor_cpy(dst, keystream, src, nbytes);
crypto_inc(ctrblk, DES3_EDE_BLOCK_SIZE);
}
static unsigned int __ctr_crypt(struct des3_ede_x86_ctx *ctx,
struct skcipher_walk *walk)
{
unsigned int bsize = DES3_EDE_BLOCK_SIZE;
unsigned int nbytes = walk->nbytes;
__be64 *src = (__be64 *)walk->src.virt.addr;
__be64 *dst = (__be64 *)walk->dst.virt.addr;
u64 ctrblk = be64_to_cpu(*(__be64 *)walk->iv);
__be64 ctrblocks[3];
/* Process four block batch */
if (nbytes >= bsize * 3) {
do {
/* create ctrblks for parallel encrypt */
ctrblocks[0] = cpu_to_be64(ctrblk++);
ctrblocks[1] = cpu_to_be64(ctrblk++);
ctrblocks[2] = cpu_to_be64(ctrblk++);
des3_ede_enc_blk_3way(ctx, (u8 *)ctrblocks,
(u8 *)ctrblocks);
dst[0] = src[0] ^ ctrblocks[0];
dst[1] = src[1] ^ ctrblocks[1];
dst[2] = src[2] ^ ctrblocks[2];
src += 3;
dst += 3;
} while ((nbytes -= bsize * 3) >= bsize * 3);
if (nbytes < bsize)
goto done;
}
/* Handle leftovers */
do {
ctrblocks[0] = cpu_to_be64(ctrblk++);
des3_ede_enc_blk(ctx, (u8 *)ctrblocks, (u8 *)ctrblocks);
dst[0] = src[0] ^ ctrblocks[0];
src += 1;
dst += 1;
} while ((nbytes -= bsize) >= bsize);
done:
*(__be64 *)walk->iv = cpu_to_be64(ctrblk);
return nbytes;
}
static int ctr_crypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct des3_ede_x86_ctx *ctx = crypto_skcipher_ctx(tfm);
struct skcipher_walk walk;
unsigned int nbytes;
int err;
err = skcipher_walk_virt(&walk, req, false);
while ((nbytes = walk.nbytes) >= DES3_EDE_BLOCK_SIZE) {
nbytes = __ctr_crypt(ctx, &walk);
err = skcipher_walk_done(&walk, nbytes);
}
if (nbytes) {
ctr_crypt_final(ctx, &walk);
err = skcipher_walk_done(&walk, 0);
}
return err;
}
static int des3_ede_x86_setkey(struct crypto_tfm *tfm, const u8 *key,
unsigned int keylen)
{
@ -428,20 +338,6 @@ static struct skcipher_alg des3_ede_skciphers[] = {
.setkey = des3_ede_x86_setkey_skcipher,
.encrypt = cbc_encrypt,
.decrypt = cbc_decrypt,
}, {
.base.cra_name = "ctr(des3_ede)",
.base.cra_driver_name = "ctr-des3_ede-asm",
.base.cra_priority = 300,
.base.cra_blocksize = 1,
.base.cra_ctxsize = sizeof(struct des3_ede_x86_ctx),
.base.cra_module = THIS_MODULE,
.min_keysize = DES3_EDE_KEY_SIZE,
.max_keysize = DES3_EDE_KEY_SIZE,
.ivsize = DES3_EDE_BLOCK_SIZE,
.chunksize = DES3_EDE_BLOCK_SIZE,
.setkey = des3_ede_x86_setkey_skcipher,
.encrypt = ctr_crypt,
.decrypt = ctr_crypt,
}
};


@ -0,0 +1,76 @@
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef _CRYPTO_ECB_CBC_HELPER_H
#define _CRYPTO_ECB_CBC_HELPER_H
#include <crypto/internal/skcipher.h>
#include <asm/fpu/api.h>
/*
* Mode helpers to instantiate parameterized skcipher ECB/CBC modes without
* having to rely on indirect calls and retpolines.
*/
#define ECB_WALK_START(req, bsize, fpu_blocks) do { \
void *ctx = crypto_skcipher_ctx(crypto_skcipher_reqtfm(req)); \
const int __bsize = (bsize); \
struct skcipher_walk walk; \
int err = skcipher_walk_virt(&walk, (req), false); \
while (walk.nbytes > 0) { \
unsigned int nbytes = walk.nbytes; \
bool do_fpu = (fpu_blocks) != -1 && \
nbytes >= (fpu_blocks) * __bsize; \
const u8 *src = walk.src.virt.addr; \
u8 *dst = walk.dst.virt.addr; \
u8 __maybe_unused buf[(bsize)]; \
if (do_fpu) kernel_fpu_begin()
#define CBC_WALK_START(req, bsize, fpu_blocks) \
ECB_WALK_START(req, bsize, fpu_blocks)
#define ECB_WALK_ADVANCE(blocks) do { \
dst += (blocks) * __bsize; \
src += (blocks) * __bsize; \
nbytes -= (blocks) * __bsize; \
} while (0)
#define ECB_BLOCK(blocks, func) do { \
while (nbytes >= (blocks) * __bsize) { \
(func)(ctx, dst, src); \
ECB_WALK_ADVANCE(blocks); \
} \
} while (0)
#define CBC_ENC_BLOCK(func) do { \
const u8 *__iv = walk.iv; \
while (nbytes >= __bsize) { \
crypto_xor_cpy(dst, src, __iv, __bsize); \
(func)(ctx, dst, dst); \
__iv = dst; \
ECB_WALK_ADVANCE(1); \
} \
memcpy(walk.iv, __iv, __bsize); \
} while (0)
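/*
 * CBC decryption below consumes each group of blocks front to back, so the
 * ciphertext block that must become the next IV is captured up front, copied
 * into the on-stack 'buf' when the walk is in place (dst == src), before the
 * decryption routine overwrites it.
 */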
#define CBC_DEC_BLOCK(blocks, func) do { \
while (nbytes >= (blocks) * __bsize) { \
const u8 *__iv = src + ((blocks) - 1) * __bsize; \
if (dst == src) \
__iv = memcpy(buf, __iv, __bsize); \
(func)(ctx, dst, src); \
crypto_xor(dst, walk.iv, __bsize); \
memcpy(walk.iv, __iv, __bsize); \
ECB_WALK_ADVANCE(blocks); \
} \
} while (0)
#define ECB_WALK_END() \
if (do_fpu) kernel_fpu_end(); \
err = skcipher_walk_done(&walk, nbytes); \
} \
return err; \
} while (0)
#define CBC_WALK_END() ECB_WALK_END()
#endif
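As an illustration of how a glue module uses these helpers (the cipher and function names below are placeholders, not real kernel symbols), an ECB encrypt path for a 16-byte-block cipher with an 8-way SIMD routine and a single-block fallback expands into a flat walk with direct calls:
/* example_cipher_enc_8way/_blk are hypothetical cipher routines */
static int example_ecb_encrypt(struct skcipher_request *req)
{
	ECB_WALK_START(req, 16, 8);
	ECB_BLOCK(8, example_cipher_enc_8way);	/* while at least 8 blocks remain */
	ECB_BLOCK(1, example_cipher_enc_blk);	/* one block at a time for the tail */
	ECB_WALK_END();
}
Passing -1 as the fpu_blocks argument, as the plain CBC encrypt paths do, skips kernel_fpu_begin()/kernel_fpu_end() for routines that never touch SIMD state.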


@ -34,107 +34,3 @@
vpxor (5*16)(src), x6, x6; \
vpxor (6*16)(src), x7, x7; \
store_8way(dst, x0, x1, x2, x3, x4, x5, x6, x7);
#define inc_le128(x, minus_one, tmp) \
vpcmpeqq minus_one, x, tmp; \
vpsubq minus_one, x, x; \
vpslldq $8, tmp, tmp; \
vpsubq tmp, x, x;
#define load_ctr_8way(iv, bswap, x0, x1, x2, x3, x4, x5, x6, x7, t0, t1, t2) \
vpcmpeqd t0, t0, t0; \
vpsrldq $8, t0, t0; /* low: -1, high: 0 */ \
vmovdqa bswap, t1; \
\
/* load IV and byteswap */ \
vmovdqu (iv), x7; \
vpshufb t1, x7, x0; \
\
/* construct IVs */ \
inc_le128(x7, t0, t2); \
vpshufb t1, x7, x1; \
inc_le128(x7, t0, t2); \
vpshufb t1, x7, x2; \
inc_le128(x7, t0, t2); \
vpshufb t1, x7, x3; \
inc_le128(x7, t0, t2); \
vpshufb t1, x7, x4; \
inc_le128(x7, t0, t2); \
vpshufb t1, x7, x5; \
inc_le128(x7, t0, t2); \
vpshufb t1, x7, x6; \
inc_le128(x7, t0, t2); \
vmovdqa x7, t2; \
vpshufb t1, x7, x7; \
inc_le128(t2, t0, t1); \
vmovdqu t2, (iv);
#define store_ctr_8way(src, dst, x0, x1, x2, x3, x4, x5, x6, x7) \
vpxor (0*16)(src), x0, x0; \
vpxor (1*16)(src), x1, x1; \
vpxor (2*16)(src), x2, x2; \
vpxor (3*16)(src), x3, x3; \
vpxor (4*16)(src), x4, x4; \
vpxor (5*16)(src), x5, x5; \
vpxor (6*16)(src), x6, x6; \
vpxor (7*16)(src), x7, x7; \
store_8way(dst, x0, x1, x2, x3, x4, x5, x6, x7);
#define gf128mul_x_ble(iv, mask, tmp) \
vpsrad $31, iv, tmp; \
vpaddq iv, iv, iv; \
vpshufd $0x13, tmp, tmp; \
vpand mask, tmp, tmp; \
vpxor tmp, iv, iv;
#define load_xts_8way(iv, src, dst, x0, x1, x2, x3, x4, x5, x6, x7, tiv, t0, \
t1, xts_gf128mul_and_shl1_mask) \
vmovdqa xts_gf128mul_and_shl1_mask, t0; \
\
/* load IV */ \
vmovdqu (iv), tiv; \
vpxor (0*16)(src), tiv, x0; \
vmovdqu tiv, (0*16)(dst); \
\
/* construct and store IVs, also xor with source */ \
gf128mul_x_ble(tiv, t0, t1); \
vpxor (1*16)(src), tiv, x1; \
vmovdqu tiv, (1*16)(dst); \
\
gf128mul_x_ble(tiv, t0, t1); \
vpxor (2*16)(src), tiv, x2; \
vmovdqu tiv, (2*16)(dst); \
\
gf128mul_x_ble(tiv, t0, t1); \
vpxor (3*16)(src), tiv, x3; \
vmovdqu tiv, (3*16)(dst); \
\
gf128mul_x_ble(tiv, t0, t1); \
vpxor (4*16)(src), tiv, x4; \
vmovdqu tiv, (4*16)(dst); \
\
gf128mul_x_ble(tiv, t0, t1); \
vpxor (5*16)(src), tiv, x5; \
vmovdqu tiv, (5*16)(dst); \
\
gf128mul_x_ble(tiv, t0, t1); \
vpxor (6*16)(src), tiv, x6; \
vmovdqu tiv, (6*16)(dst); \
\
gf128mul_x_ble(tiv, t0, t1); \
vpxor (7*16)(src), tiv, x7; \
vmovdqu tiv, (7*16)(dst); \
\
gf128mul_x_ble(tiv, t0, t1); \
vmovdqu tiv, (iv);
#define store_xts_8way(dst, x0, x1, x2, x3, x4, x5, x6, x7) \
vpxor (0*16)(dst), x0, x0; \
vpxor (1*16)(dst), x1, x1; \
vpxor (2*16)(dst), x2, x2; \
vpxor (3*16)(dst), x3, x3; \
vpxor (4*16)(dst), x4, x4; \
vpxor (5*16)(dst), x5, x5; \
vpxor (6*16)(dst), x6, x6; \
vpxor (7*16)(dst), x7, x7; \
store_8way(dst, x0, x1, x2, x3, x4, x5, x6, x7);

View File

@ -37,139 +37,3 @@
vpxor (5*32+16)(src), x6, x6; \
vpxor (6*32+16)(src), x7, x7; \
store_16way(dst, x0, x1, x2, x3, x4, x5, x6, x7);
#define inc_le128(x, minus_one, tmp) \
vpcmpeqq minus_one, x, tmp; \
vpsubq minus_one, x, x; \
vpslldq $8, tmp, tmp; \
vpsubq tmp, x, x;
#define add2_le128(x, minus_one, minus_two, tmp1, tmp2) \
vpcmpeqq minus_one, x, tmp1; \
vpcmpeqq minus_two, x, tmp2; \
vpsubq minus_two, x, x; \
vpor tmp2, tmp1, tmp1; \
vpslldq $8, tmp1, tmp1; \
vpsubq tmp1, x, x;
#define load_ctr_16way(iv, bswap, x0, x1, x2, x3, x4, x5, x6, x7, t0, t0x, t1, \
t1x, t2, t2x, t3, t3x, t4, t5) \
vpcmpeqd t0, t0, t0; \
vpsrldq $8, t0, t0; /* ab: -1:0 ; cd: -1:0 */ \
vpaddq t0, t0, t4; /* ab: -2:0 ; cd: -2:0 */\
\
/* load IV and byteswap */ \
vmovdqu (iv), t2x; \
vmovdqa t2x, t3x; \
inc_le128(t2x, t0x, t1x); \
vbroadcasti128 bswap, t1; \
vinserti128 $1, t2x, t3, t2; /* ab: le0 ; cd: le1 */ \
vpshufb t1, t2, x0; \
\
/* construct IVs */ \
add2_le128(t2, t0, t4, t3, t5); /* ab: le2 ; cd: le3 */ \
vpshufb t1, t2, x1; \
add2_le128(t2, t0, t4, t3, t5); \
vpshufb t1, t2, x2; \
add2_le128(t2, t0, t4, t3, t5); \
vpshufb t1, t2, x3; \
add2_le128(t2, t0, t4, t3, t5); \
vpshufb t1, t2, x4; \
add2_le128(t2, t0, t4, t3, t5); \
vpshufb t1, t2, x5; \
add2_le128(t2, t0, t4, t3, t5); \
vpshufb t1, t2, x6; \
add2_le128(t2, t0, t4, t3, t5); \
vpshufb t1, t2, x7; \
vextracti128 $1, t2, t2x; \
inc_le128(t2x, t0x, t3x); \
vmovdqu t2x, (iv);
#define store_ctr_16way(src, dst, x0, x1, x2, x3, x4, x5, x6, x7) \
vpxor (0*32)(src), x0, x0; \
vpxor (1*32)(src), x1, x1; \
vpxor (2*32)(src), x2, x2; \
vpxor (3*32)(src), x3, x3; \
vpxor (4*32)(src), x4, x4; \
vpxor (5*32)(src), x5, x5; \
vpxor (6*32)(src), x6, x6; \
vpxor (7*32)(src), x7, x7; \
store_16way(dst, x0, x1, x2, x3, x4, x5, x6, x7);
#define gf128mul_x_ble(iv, mask, tmp) \
vpsrad $31, iv, tmp; \
vpaddq iv, iv, iv; \
vpshufd $0x13, tmp, tmp; \
vpand mask, tmp, tmp; \
vpxor tmp, iv, iv;
#define gf128mul_x2_ble(iv, mask1, mask2, tmp0, tmp1) \
vpsrad $31, iv, tmp0; \
vpaddq iv, iv, tmp1; \
vpsllq $2, iv, iv; \
vpshufd $0x13, tmp0, tmp0; \
vpsrad $31, tmp1, tmp1; \
vpand mask2, tmp0, tmp0; \
vpshufd $0x13, tmp1, tmp1; \
vpxor tmp0, iv, iv; \
vpand mask1, tmp1, tmp1; \
vpxor tmp1, iv, iv;
#define load_xts_16way(iv, src, dst, x0, x1, x2, x3, x4, x5, x6, x7, tiv, \
tivx, t0, t0x, t1, t1x, t2, t2x, t3, \
xts_gf128mul_and_shl1_mask_0, \
xts_gf128mul_and_shl1_mask_1) \
vbroadcasti128 xts_gf128mul_and_shl1_mask_0, t1; \
\
/* load IV and construct second IV */ \
vmovdqu (iv), tivx; \
vmovdqa tivx, t0x; \
gf128mul_x_ble(tivx, t1x, t2x); \
vbroadcasti128 xts_gf128mul_and_shl1_mask_1, t2; \
vinserti128 $1, tivx, t0, tiv; \
vpxor (0*32)(src), tiv, x0; \
vmovdqu tiv, (0*32)(dst); \
\
/* construct and store IVs, also xor with source */ \
gf128mul_x2_ble(tiv, t1, t2, t0, t3); \
vpxor (1*32)(src), tiv, x1; \
vmovdqu tiv, (1*32)(dst); \
\
gf128mul_x2_ble(tiv, t1, t2, t0, t3); \
vpxor (2*32)(src), tiv, x2; \
vmovdqu tiv, (2*32)(dst); \
\
gf128mul_x2_ble(tiv, t1, t2, t0, t3); \
vpxor (3*32)(src), tiv, x3; \
vmovdqu tiv, (3*32)(dst); \
\
gf128mul_x2_ble(tiv, t1, t2, t0, t3); \
vpxor (4*32)(src), tiv, x4; \
vmovdqu tiv, (4*32)(dst); \
\
gf128mul_x2_ble(tiv, t1, t2, t0, t3); \
vpxor (5*32)(src), tiv, x5; \
vmovdqu tiv, (5*32)(dst); \
\
gf128mul_x2_ble(tiv, t1, t2, t0, t3); \
vpxor (6*32)(src), tiv, x6; \
vmovdqu tiv, (6*32)(dst); \
\
gf128mul_x2_ble(tiv, t1, t2, t0, t3); \
vpxor (7*32)(src), tiv, x7; \
vmovdqu tiv, (7*32)(dst); \
\
vextracti128 $1, tiv, tivx; \
gf128mul_x_ble(tivx, t1x, t2x); \
vmovdqu tivx, (iv);
#define store_xts_16way(dst, x0, x1, x2, x3, x4, x5, x6, x7) \
vpxor (0*32)(dst), x0, x0; \
vpxor (1*32)(dst), x1, x1; \
vpxor (2*32)(dst), x2, x2; \
vpxor (3*32)(dst), x3, x3; \
vpxor (4*32)(dst), x4, x4; \
vpxor (5*32)(dst), x5, x5; \
vpxor (6*32)(dst), x6, x6; \
vpxor (7*32)(dst), x7, x7; \
store_16way(dst, x0, x1, x2, x3, x4, x5, x6, x7);

View File

@ -1,381 +0,0 @@
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* Shared glue code for 128bit block ciphers
*
* Copyright © 2012-2013 Jussi Kivilinna <jussi.kivilinna@iki.fi>
*
* CBC & ECB parts based on code (crypto/cbc.c,ecb.c) by:
* Copyright (c) 2006 Herbert Xu <herbert@gondor.apana.org.au>
* CTR part based on code (crypto/ctr.c) by:
* (C) Copyright IBM Corp. 2007 - Joy Latten <latten@us.ibm.com>
*/
#include <linux/module.h>
#include <crypto/b128ops.h>
#include <crypto/gf128mul.h>
#include <crypto/internal/skcipher.h>
#include <crypto/scatterwalk.h>
#include <crypto/xts.h>
#include <asm/crypto/glue_helper.h>
int glue_ecb_req_128bit(const struct common_glue_ctx *gctx,
struct skcipher_request *req)
{
void *ctx = crypto_skcipher_ctx(crypto_skcipher_reqtfm(req));
const unsigned int bsize = 128 / 8;
struct skcipher_walk walk;
bool fpu_enabled = false;
unsigned int nbytes;
int err;
err = skcipher_walk_virt(&walk, req, false);
while ((nbytes = walk.nbytes)) {
const u8 *src = walk.src.virt.addr;
u8 *dst = walk.dst.virt.addr;
unsigned int func_bytes;
unsigned int i;
fpu_enabled = glue_fpu_begin(bsize, gctx->fpu_blocks_limit,
&walk, fpu_enabled, nbytes);
for (i = 0; i < gctx->num_funcs; i++) {
func_bytes = bsize * gctx->funcs[i].num_blocks;
if (nbytes < func_bytes)
continue;
/* Process multi-block batch */
do {
gctx->funcs[i].fn_u.ecb(ctx, dst, src);
src += func_bytes;
dst += func_bytes;
nbytes -= func_bytes;
} while (nbytes >= func_bytes);
if (nbytes < bsize)
break;
}
err = skcipher_walk_done(&walk, nbytes);
}
glue_fpu_end(fpu_enabled);
return err;
}
EXPORT_SYMBOL_GPL(glue_ecb_req_128bit);
int glue_cbc_encrypt_req_128bit(const common_glue_func_t fn,
struct skcipher_request *req)
{
void *ctx = crypto_skcipher_ctx(crypto_skcipher_reqtfm(req));
const unsigned int bsize = 128 / 8;
struct skcipher_walk walk;
unsigned int nbytes;
int err;
err = skcipher_walk_virt(&walk, req, false);
while ((nbytes = walk.nbytes)) {
const u128 *src = (u128 *)walk.src.virt.addr;
u128 *dst = (u128 *)walk.dst.virt.addr;
u128 *iv = (u128 *)walk.iv;
do {
u128_xor(dst, src, iv);
fn(ctx, (u8 *)dst, (u8 *)dst);
iv = dst;
src++;
dst++;
nbytes -= bsize;
} while (nbytes >= bsize);
*(u128 *)walk.iv = *iv;
err = skcipher_walk_done(&walk, nbytes);
}
return err;
}
EXPORT_SYMBOL_GPL(glue_cbc_encrypt_req_128bit);
int glue_cbc_decrypt_req_128bit(const struct common_glue_ctx *gctx,
struct skcipher_request *req)
{
void *ctx = crypto_skcipher_ctx(crypto_skcipher_reqtfm(req));
const unsigned int bsize = 128 / 8;
struct skcipher_walk walk;
bool fpu_enabled = false;
unsigned int nbytes;
int err;
err = skcipher_walk_virt(&walk, req, false);
while ((nbytes = walk.nbytes)) {
const u128 *src = walk.src.virt.addr;
u128 *dst = walk.dst.virt.addr;
unsigned int func_bytes, num_blocks;
unsigned int i;
u128 last_iv;
fpu_enabled = glue_fpu_begin(bsize, gctx->fpu_blocks_limit,
&walk, fpu_enabled, nbytes);
/* Start of the last block. */
src += nbytes / bsize - 1;
dst += nbytes / bsize - 1;
last_iv = *src;
for (i = 0; i < gctx->num_funcs; i++) {
num_blocks = gctx->funcs[i].num_blocks;
func_bytes = bsize * num_blocks;
if (nbytes < func_bytes)
continue;
/* Process multi-block batch */
do {
src -= num_blocks - 1;
dst -= num_blocks - 1;
gctx->funcs[i].fn_u.cbc(ctx, (u8 *)dst,
(const u8 *)src);
nbytes -= func_bytes;
if (nbytes < bsize)
goto done;
u128_xor(dst, dst, --src);
dst--;
} while (nbytes >= func_bytes);
}
done:
u128_xor(dst, dst, (u128 *)walk.iv);
*(u128 *)walk.iv = last_iv;
err = skcipher_walk_done(&walk, nbytes);
}
glue_fpu_end(fpu_enabled);
return err;
}
EXPORT_SYMBOL_GPL(glue_cbc_decrypt_req_128bit);
int glue_ctr_req_128bit(const struct common_glue_ctx *gctx,
struct skcipher_request *req)
{
void *ctx = crypto_skcipher_ctx(crypto_skcipher_reqtfm(req));
const unsigned int bsize = 128 / 8;
struct skcipher_walk walk;
bool fpu_enabled = false;
unsigned int nbytes;
int err;
err = skcipher_walk_virt(&walk, req, false);
while ((nbytes = walk.nbytes) >= bsize) {
const u128 *src = walk.src.virt.addr;
u128 *dst = walk.dst.virt.addr;
unsigned int func_bytes, num_blocks;
unsigned int i;
le128 ctrblk;
fpu_enabled = glue_fpu_begin(bsize, gctx->fpu_blocks_limit,
&walk, fpu_enabled, nbytes);
be128_to_le128(&ctrblk, (be128 *)walk.iv);
for (i = 0; i < gctx->num_funcs; i++) {
num_blocks = gctx->funcs[i].num_blocks;
func_bytes = bsize * num_blocks;
if (nbytes < func_bytes)
continue;
/* Process multi-block batch */
do {
gctx->funcs[i].fn_u.ctr(ctx, (u8 *)dst,
(const u8 *)src,
&ctrblk);
src += num_blocks;
dst += num_blocks;
nbytes -= func_bytes;
} while (nbytes >= func_bytes);
if (nbytes < bsize)
break;
}
le128_to_be128((be128 *)walk.iv, &ctrblk);
err = skcipher_walk_done(&walk, nbytes);
}
glue_fpu_end(fpu_enabled);
if (nbytes) {
le128 ctrblk;
u128 tmp;
be128_to_le128(&ctrblk, (be128 *)walk.iv);
memcpy(&tmp, walk.src.virt.addr, nbytes);
gctx->funcs[gctx->num_funcs - 1].fn_u.ctr(ctx, (u8 *)&tmp,
(const u8 *)&tmp,
&ctrblk);
memcpy(walk.dst.virt.addr, &tmp, nbytes);
le128_to_be128((be128 *)walk.iv, &ctrblk);
err = skcipher_walk_done(&walk, 0);
}
return err;
}
EXPORT_SYMBOL_GPL(glue_ctr_req_128bit);
static unsigned int __glue_xts_req_128bit(const struct common_glue_ctx *gctx,
void *ctx,
struct skcipher_walk *walk)
{
const unsigned int bsize = 128 / 8;
unsigned int nbytes = walk->nbytes;
u128 *src = walk->src.virt.addr;
u128 *dst = walk->dst.virt.addr;
unsigned int num_blocks, func_bytes;
unsigned int i;
/* Process multi-block batch */
for (i = 0; i < gctx->num_funcs; i++) {
num_blocks = gctx->funcs[i].num_blocks;
func_bytes = bsize * num_blocks;
if (nbytes >= func_bytes) {
do {
gctx->funcs[i].fn_u.xts(ctx, (u8 *)dst,
(const u8 *)src,
walk->iv);
src += num_blocks;
dst += num_blocks;
nbytes -= func_bytes;
} while (nbytes >= func_bytes);
if (nbytes < bsize)
goto done;
}
}
done:
return nbytes;
}
int glue_xts_req_128bit(const struct common_glue_ctx *gctx,
struct skcipher_request *req,
common_glue_func_t tweak_fn, void *tweak_ctx,
void *crypt_ctx, bool decrypt)
{
const bool cts = (req->cryptlen % XTS_BLOCK_SIZE);
const unsigned int bsize = 128 / 8;
struct skcipher_request subreq;
struct skcipher_walk walk;
bool fpu_enabled = false;
unsigned int nbytes, tail;
int err;
if (req->cryptlen < XTS_BLOCK_SIZE)
return -EINVAL;
if (unlikely(cts)) {
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
tail = req->cryptlen % XTS_BLOCK_SIZE + XTS_BLOCK_SIZE;
skcipher_request_set_tfm(&subreq, tfm);
skcipher_request_set_callback(&subreq,
crypto_skcipher_get_flags(tfm),
NULL, NULL);
skcipher_request_set_crypt(&subreq, req->src, req->dst,
req->cryptlen - tail, req->iv);
req = &subreq;
}
err = skcipher_walk_virt(&walk, req, false);
nbytes = walk.nbytes;
if (err)
return err;
/* set minimum length to bsize, for tweak_fn */
fpu_enabled = glue_fpu_begin(bsize, gctx->fpu_blocks_limit,
&walk, fpu_enabled,
nbytes < bsize ? bsize : nbytes);
/* calculate first value of T */
tweak_fn(tweak_ctx, walk.iv, walk.iv);
while (nbytes) {
nbytes = __glue_xts_req_128bit(gctx, crypt_ctx, &walk);
err = skcipher_walk_done(&walk, nbytes);
nbytes = walk.nbytes;
}
if (unlikely(cts)) {
u8 *next_tweak, *final_tweak = req->iv;
struct scatterlist *src, *dst;
struct scatterlist s[2], d[2];
le128 b[2];
dst = src = scatterwalk_ffwd(s, req->src, req->cryptlen);
if (req->dst != req->src)
dst = scatterwalk_ffwd(d, req->dst, req->cryptlen);
if (decrypt) {
next_tweak = memcpy(b, req->iv, XTS_BLOCK_SIZE);
gf128mul_x_ble(b, b);
} else {
next_tweak = req->iv;
}
skcipher_request_set_crypt(&subreq, src, dst, XTS_BLOCK_SIZE,
next_tweak);
err = skcipher_walk_virt(&walk, req, false) ?:
skcipher_walk_done(&walk,
__glue_xts_req_128bit(gctx, crypt_ctx, &walk));
if (err)
goto out;
scatterwalk_map_and_copy(b, dst, 0, XTS_BLOCK_SIZE, 0);
memcpy(b + 1, b, tail - XTS_BLOCK_SIZE);
scatterwalk_map_and_copy(b, src, XTS_BLOCK_SIZE,
tail - XTS_BLOCK_SIZE, 0);
scatterwalk_map_and_copy(b, dst, 0, tail, 1);
skcipher_request_set_crypt(&subreq, dst, dst, XTS_BLOCK_SIZE,
final_tweak);
err = skcipher_walk_virt(&walk, req, false) ?:
skcipher_walk_done(&walk,
__glue_xts_req_128bit(gctx, crypt_ctx, &walk));
}
out:
glue_fpu_end(fpu_enabled);
return err;
}
EXPORT_SYMBOL_GPL(glue_xts_req_128bit);
void glue_xts_crypt_128bit_one(const void *ctx, u8 *dst, const u8 *src,
le128 *iv, common_glue_func_t fn)
{
le128 ivblk = *iv;
/* generate next IV */
gf128mul_x_ble(iv, &ivblk);
/* CC <- T xor C */
u128_xor((u128 *)dst, (const u128 *)src, (u128 *)&ivblk);
/* PP <- D(Key2,CC) */
fn(ctx, dst, dst);
/* P <- T xor PP */
u128_xor((u128 *)dst, (u128 *)dst, (u128 *)&ivblk);
}
EXPORT_SYMBOL_GPL(glue_xts_crypt_128bit_one);
MODULE_LICENSE("GPL");
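
As background for the XTS helpers above (standard XTS notation, not taken from this patch), each block is processed as

    C_j = E_{K1}(P_j \oplus T_j) \oplus T_j, \qquad T_{j+1} = T_j \cdot \alpha \in GF(2^{128})

and decryption is identical with D_{K1} in place of E_{K1}; the multiplication by \alpha is the gf128mul_x_ble() step.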

View File

@ -18,10 +18,6 @@
.align 16
.Lbswap128_mask:
.byte 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0
.section .rodata.cst16.xts_gf128mul_and_shl1_mask, "aM", @progbits, 16
.align 16
.Lxts_gf128mul_and_shl1_mask:
.byte 0x87, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0
.text
@ -715,67 +711,3 @@ SYM_FUNC_START(serpent_cbc_dec_8way_avx)
FRAME_END
ret;
SYM_FUNC_END(serpent_cbc_dec_8way_avx)
SYM_FUNC_START(serpent_ctr_8way_avx)
/* input:
* %rdi: ctx, CTX
* %rsi: dst
* %rdx: src
* %rcx: iv (little endian, 128bit)
*/
FRAME_BEGIN
load_ctr_8way(%rcx, .Lbswap128_mask, RA1, RB1, RC1, RD1, RA2, RB2, RC2,
RD2, RK0, RK1, RK2);
call __serpent_enc_blk8_avx;
store_ctr_8way(%rdx, %rsi, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
FRAME_END
ret;
SYM_FUNC_END(serpent_ctr_8way_avx)
SYM_FUNC_START(serpent_xts_enc_8way_avx)
/* input:
* %rdi: ctx, CTX
* %rsi: dst
* %rdx: src
* %rcx: iv (t ⊕ αⁿ ∈ GF(2¹²⁸))
*/
FRAME_BEGIN
/* regs <= src, dst <= IVs, regs <= regs xor IVs */
load_xts_8way(%rcx, %rdx, %rsi, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2,
RK0, RK1, RK2, .Lxts_gf128mul_and_shl1_mask);
call __serpent_enc_blk8_avx;
/* dst <= regs xor IVs(in dst) */
store_xts_8way(%rsi, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
FRAME_END
ret;
SYM_FUNC_END(serpent_xts_enc_8way_avx)
SYM_FUNC_START(serpent_xts_dec_8way_avx)
/* input:
* %rdi: ctx, CTX
* %rsi: dst
* %rdx: src
* %rcx: iv (t ⊕ αⁿ ∈ GF(2¹²⁸))
*/
FRAME_BEGIN
/* regs <= src, dst <= IVs, regs <= regs xor IVs */
load_xts_8way(%rcx, %rdx, %rsi, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2,
RK0, RK1, RK2, .Lxts_gf128mul_and_shl1_mask);
call __serpent_dec_blk8_avx;
/* dst <= regs xor IVs(in dst) */
store_xts_8way(%rsi, RC1, RD1, RB1, RE1, RC2, RD2, RB2, RE2);
FRAME_END
ret;
SYM_FUNC_END(serpent_xts_dec_8way_avx)

View File

@ -0,0 +1,21 @@
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef ASM_X86_SERPENT_AVX_H
#define ASM_X86_SERPENT_AVX_H
#include <crypto/b128ops.h>
#include <crypto/serpent.h>
#include <linux/types.h>
struct crypto_skcipher;
#define SERPENT_PARALLEL_BLOCKS 8
asmlinkage void serpent_ecb_enc_8way_avx(const void *ctx, u8 *dst,
const u8 *src);
asmlinkage void serpent_ecb_dec_8way_avx(const void *ctx, u8 *dst,
const u8 *src);
asmlinkage void serpent_cbc_dec_8way_avx(const void *ctx, u8 *dst,
const u8 *src);
#endif

View File

@ -20,16 +20,6 @@
.Lbswap128_mask:
.byte 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0
.section .rodata.cst16.xts_gf128mul_and_shl1_mask_0, "aM", @progbits, 16
.align 16
.Lxts_gf128mul_and_shl1_mask_0:
.byte 0x87, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0
.section .rodata.cst16.xts_gf128mul_and_shl1_mask_1, "aM", @progbits, 16
.align 16
.Lxts_gf128mul_and_shl1_mask_1:
.byte 0x0e, 1, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0
.text
#define CTX %rdi
@ -734,80 +724,3 @@ SYM_FUNC_START(serpent_cbc_dec_16way)
FRAME_END
ret;
SYM_FUNC_END(serpent_cbc_dec_16way)
SYM_FUNC_START(serpent_ctr_16way)
/* input:
* %rdi: ctx, CTX
* %rsi: dst (16 blocks)
* %rdx: src (16 blocks)
* %rcx: iv (little endian, 128bit)
*/
FRAME_BEGIN
vzeroupper;
load_ctr_16way(%rcx, .Lbswap128_mask, RA1, RB1, RC1, RD1, RA2, RB2, RC2,
RD2, RK0, RK0x, RK1, RK1x, RK2, RK2x, RK3, RK3x, RNOT,
tp);
call __serpent_enc_blk16;
store_ctr_16way(%rdx, %rsi, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
vzeroupper;
FRAME_END
ret;
SYM_FUNC_END(serpent_ctr_16way)
SYM_FUNC_START(serpent_xts_enc_16way)
/* input:
* %rdi: ctx, CTX
* %rsi: dst (16 blocks)
* %rdx: src (16 blocks)
* %rcx: iv (t ⊕ αⁿ ∈ GF(2¹²⁸))
*/
FRAME_BEGIN
vzeroupper;
load_xts_16way(%rcx, %rdx, %rsi, RA1, RB1, RC1, RD1, RA2, RB2, RC2,
RD2, RK0, RK0x, RK1, RK1x, RK2, RK2x, RK3, RK3x, RNOT,
.Lxts_gf128mul_and_shl1_mask_0,
.Lxts_gf128mul_and_shl1_mask_1);
call __serpent_enc_blk16;
store_xts_16way(%rsi, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
vzeroupper;
FRAME_END
ret;
SYM_FUNC_END(serpent_xts_enc_16way)
SYM_FUNC_START(serpent_xts_dec_16way)
/* input:
* %rdi: ctx, CTX
* %rsi: dst (16 blocks)
* %rdx: src (16 blocks)
* %rcx: iv (t ⊕ αⁿ ∈ GF(2¹²⁸))
*/
FRAME_BEGIN
vzeroupper;
load_xts_16way(%rcx, %rdx, %rsi, RA1, RB1, RC1, RD1, RA2, RB2, RC2,
RD2, RK0, RK0x, RK1, RK1x, RK2, RK2x, RK3, RK3x, RNOT,
.Lxts_gf128mul_and_shl1_mask_0,
.Lxts_gf128mul_and_shl1_mask_1);
call __serpent_dec_blk16;
store_xts_16way(%rsi, RC1, RD1, RB1, RE1, RC2, RD2, RB2, RE2);
vzeroupper;
FRAME_END
ret;
SYM_FUNC_END(serpent_xts_dec_16way)

View File

@ -12,9 +12,9 @@
#include <crypto/algapi.h>
#include <crypto/internal/simd.h>
#include <crypto/serpent.h>
#include <crypto/xts.h>
#include <asm/crypto/glue_helper.h>
#include <asm/crypto/serpent-avx.h>
#include "serpent-avx.h"
#include "ecb_cbc_helpers.h"
#define SERPENT_AVX2_PARALLEL_BLOCKS 16
@ -23,158 +23,44 @@ asmlinkage void serpent_ecb_enc_16way(const void *ctx, u8 *dst, const u8 *src);
asmlinkage void serpent_ecb_dec_16way(const void *ctx, u8 *dst, const u8 *src);
asmlinkage void serpent_cbc_dec_16way(const void *ctx, u8 *dst, const u8 *src);
asmlinkage void serpent_ctr_16way(const void *ctx, u8 *dst, const u8 *src,
le128 *iv);
asmlinkage void serpent_xts_enc_16way(const void *ctx, u8 *dst, const u8 *src,
le128 *iv);
asmlinkage void serpent_xts_dec_16way(const void *ctx, u8 *dst, const u8 *src,
le128 *iv);
static int serpent_setkey_skcipher(struct crypto_skcipher *tfm,
const u8 *key, unsigned int keylen)
{
return __serpent_setkey(crypto_skcipher_ctx(tfm), key, keylen);
}
static const struct common_glue_ctx serpent_enc = {
.num_funcs = 3,
.fpu_blocks_limit = 8,
.funcs = { {
.num_blocks = 16,
.fn_u = { .ecb = serpent_ecb_enc_16way }
}, {
.num_blocks = 8,
.fn_u = { .ecb = serpent_ecb_enc_8way_avx }
}, {
.num_blocks = 1,
.fn_u = { .ecb = __serpent_encrypt }
} }
};
static const struct common_glue_ctx serpent_ctr = {
.num_funcs = 3,
.fpu_blocks_limit = 8,
.funcs = { {
.num_blocks = 16,
.fn_u = { .ctr = serpent_ctr_16way }
}, {
.num_blocks = 8,
.fn_u = { .ctr = serpent_ctr_8way_avx }
}, {
.num_blocks = 1,
.fn_u = { .ctr = __serpent_crypt_ctr }
} }
};
static const struct common_glue_ctx serpent_enc_xts = {
.num_funcs = 3,
.fpu_blocks_limit = 8,
.funcs = { {
.num_blocks = 16,
.fn_u = { .xts = serpent_xts_enc_16way }
}, {
.num_blocks = 8,
.fn_u = { .xts = serpent_xts_enc_8way_avx }
}, {
.num_blocks = 1,
.fn_u = { .xts = serpent_xts_enc }
} }
};
static const struct common_glue_ctx serpent_dec = {
.num_funcs = 3,
.fpu_blocks_limit = 8,
.funcs = { {
.num_blocks = 16,
.fn_u = { .ecb = serpent_ecb_dec_16way }
}, {
.num_blocks = 8,
.fn_u = { .ecb = serpent_ecb_dec_8way_avx }
}, {
.num_blocks = 1,
.fn_u = { .ecb = __serpent_decrypt }
} }
};
static const struct common_glue_ctx serpent_dec_cbc = {
.num_funcs = 3,
.fpu_blocks_limit = 8,
.funcs = { {
.num_blocks = 16,
.fn_u = { .cbc = serpent_cbc_dec_16way }
}, {
.num_blocks = 8,
.fn_u = { .cbc = serpent_cbc_dec_8way_avx }
}, {
.num_blocks = 1,
.fn_u = { .cbc = __serpent_decrypt }
} }
};
static const struct common_glue_ctx serpent_dec_xts = {
.num_funcs = 3,
.fpu_blocks_limit = 8,
.funcs = { {
.num_blocks = 16,
.fn_u = { .xts = serpent_xts_dec_16way }
}, {
.num_blocks = 8,
.fn_u = { .xts = serpent_xts_dec_8way_avx }
}, {
.num_blocks = 1,
.fn_u = { .xts = serpent_xts_dec }
} }
};
static int ecb_encrypt(struct skcipher_request *req)
{
return glue_ecb_req_128bit(&serpent_enc, req);
ECB_WALK_START(req, SERPENT_BLOCK_SIZE, SERPENT_PARALLEL_BLOCKS);
ECB_BLOCK(SERPENT_AVX2_PARALLEL_BLOCKS, serpent_ecb_enc_16way);
ECB_BLOCK(SERPENT_PARALLEL_BLOCKS, serpent_ecb_enc_8way_avx);
ECB_BLOCK(1, __serpent_encrypt);
ECB_WALK_END();
}
static int ecb_decrypt(struct skcipher_request *req)
{
return glue_ecb_req_128bit(&serpent_dec, req);
ECB_WALK_START(req, SERPENT_BLOCK_SIZE, SERPENT_PARALLEL_BLOCKS);
ECB_BLOCK(SERPENT_AVX2_PARALLEL_BLOCKS, serpent_ecb_dec_16way);
ECB_BLOCK(SERPENT_PARALLEL_BLOCKS, serpent_ecb_dec_8way_avx);
ECB_BLOCK(1, __serpent_decrypt);
ECB_WALK_END();
}
static int cbc_encrypt(struct skcipher_request *req)
{
return glue_cbc_encrypt_req_128bit(__serpent_encrypt, req);
CBC_WALK_START(req, SERPENT_BLOCK_SIZE, -1);
CBC_ENC_BLOCK(__serpent_encrypt);
CBC_WALK_END();
}
static int cbc_decrypt(struct skcipher_request *req)
{
return glue_cbc_decrypt_req_128bit(&serpent_dec_cbc, req);
}
static int ctr_crypt(struct skcipher_request *req)
{
return glue_ctr_req_128bit(&serpent_ctr, req);
}
static int xts_encrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct serpent_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
return glue_xts_req_128bit(&serpent_enc_xts, req,
__serpent_encrypt, &ctx->tweak_ctx,
&ctx->crypt_ctx, false);
}
static int xts_decrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct serpent_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
return glue_xts_req_128bit(&serpent_dec_xts, req,
__serpent_encrypt, &ctx->tweak_ctx,
&ctx->crypt_ctx, true);
CBC_WALK_START(req, SERPENT_BLOCK_SIZE, SERPENT_PARALLEL_BLOCKS);
CBC_DEC_BLOCK(SERPENT_AVX2_PARALLEL_BLOCKS, serpent_cbc_dec_16way);
CBC_DEC_BLOCK(SERPENT_PARALLEL_BLOCKS, serpent_cbc_dec_8way_avx);
CBC_DEC_BLOCK(1, __serpent_decrypt);
CBC_WALK_END();
}
static struct skcipher_alg serpent_algs[] = {
@ -205,35 +91,6 @@ static struct skcipher_alg serpent_algs[] = {
.setkey = serpent_setkey_skcipher,
.encrypt = cbc_encrypt,
.decrypt = cbc_decrypt,
}, {
.base.cra_name = "__ctr(serpent)",
.base.cra_driver_name = "__ctr-serpent-avx2",
.base.cra_priority = 600,
.base.cra_flags = CRYPTO_ALG_INTERNAL,
.base.cra_blocksize = 1,
.base.cra_ctxsize = sizeof(struct serpent_ctx),
.base.cra_module = THIS_MODULE,
.min_keysize = SERPENT_MIN_KEY_SIZE,
.max_keysize = SERPENT_MAX_KEY_SIZE,
.ivsize = SERPENT_BLOCK_SIZE,
.chunksize = SERPENT_BLOCK_SIZE,
.setkey = serpent_setkey_skcipher,
.encrypt = ctr_crypt,
.decrypt = ctr_crypt,
}, {
.base.cra_name = "__xts(serpent)",
.base.cra_driver_name = "__xts-serpent-avx2",
.base.cra_priority = 600,
.base.cra_flags = CRYPTO_ALG_INTERNAL,
.base.cra_blocksize = SERPENT_BLOCK_SIZE,
.base.cra_ctxsize = sizeof(struct serpent_xts_ctx),
.base.cra_module = THIS_MODULE,
.min_keysize = 2 * SERPENT_MIN_KEY_SIZE,
.max_keysize = 2 * SERPENT_MAX_KEY_SIZE,
.ivsize = SERPENT_BLOCK_SIZE,
.setkey = xts_serpent_setkey,
.encrypt = xts_encrypt,
.decrypt = xts_decrypt,
},
};

View File

@ -15,9 +15,9 @@
#include <crypto/algapi.h>
#include <crypto/internal/simd.h>
#include <crypto/serpent.h>
#include <crypto/xts.h>
#include <asm/crypto/glue_helper.h>
#include <asm/crypto/serpent-avx.h>
#include "serpent-avx.h"
#include "ecb_cbc_helpers.h"
/* 8-way parallel cipher functions */
asmlinkage void serpent_ecb_enc_8way_avx(const void *ctx, u8 *dst,
@ -32,191 +32,41 @@ asmlinkage void serpent_cbc_dec_8way_avx(const void *ctx, u8 *dst,
const u8 *src);
EXPORT_SYMBOL_GPL(serpent_cbc_dec_8way_avx);
asmlinkage void serpent_ctr_8way_avx(const void *ctx, u8 *dst, const u8 *src,
le128 *iv);
EXPORT_SYMBOL_GPL(serpent_ctr_8way_avx);
asmlinkage void serpent_xts_enc_8way_avx(const void *ctx, u8 *dst,
const u8 *src, le128 *iv);
EXPORT_SYMBOL_GPL(serpent_xts_enc_8way_avx);
asmlinkage void serpent_xts_dec_8way_avx(const void *ctx, u8 *dst,
const u8 *src, le128 *iv);
EXPORT_SYMBOL_GPL(serpent_xts_dec_8way_avx);
void __serpent_crypt_ctr(const void *ctx, u8 *d, const u8 *s, le128 *iv)
{
be128 ctrblk;
u128 *dst = (u128 *)d;
const u128 *src = (const u128 *)s;
le128_to_be128(&ctrblk, iv);
le128_inc(iv);
__serpent_encrypt(ctx, (u8 *)&ctrblk, (u8 *)&ctrblk);
u128_xor(dst, src, (u128 *)&ctrblk);
}
EXPORT_SYMBOL_GPL(__serpent_crypt_ctr);
void serpent_xts_enc(const void *ctx, u8 *dst, const u8 *src, le128 *iv)
{
glue_xts_crypt_128bit_one(ctx, dst, src, iv, __serpent_encrypt);
}
EXPORT_SYMBOL_GPL(serpent_xts_enc);
void serpent_xts_dec(const void *ctx, u8 *dst, const u8 *src, le128 *iv)
{
glue_xts_crypt_128bit_one(ctx, dst, src, iv, __serpent_decrypt);
}
EXPORT_SYMBOL_GPL(serpent_xts_dec);
static int serpent_setkey_skcipher(struct crypto_skcipher *tfm,
const u8 *key, unsigned int keylen)
{
return __serpent_setkey(crypto_skcipher_ctx(tfm), key, keylen);
}
int xts_serpent_setkey(struct crypto_skcipher *tfm, const u8 *key,
unsigned int keylen)
{
struct serpent_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
int err;
err = xts_verify_key(tfm, key, keylen);
if (err)
return err;
/* first half of xts-key is for crypt */
err = __serpent_setkey(&ctx->crypt_ctx, key, keylen / 2);
if (err)
return err;
/* second half of xts-key is for tweak */
return __serpent_setkey(&ctx->tweak_ctx, key + keylen / 2, keylen / 2);
}
EXPORT_SYMBOL_GPL(xts_serpent_setkey);
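As a hedged usage note (nothing below is part of this patch): callers supply a key of twice the cipher key length, and the split into crypt and tweak halves happens inside setkey as shown above. A minimal kernel-side sketch using only the generic skcipher API might look like:

#include <crypto/skcipher.h>
#include <linux/err.h>
#include <linux/random.h>

static int example_key_xts_serpent(void)
{
	struct crypto_skcipher *tfm;
	u8 key[64];	/* two 256-bit halves: crypt key + tweak key */
	int err;

	tfm = crypto_alloc_skcipher("xts(serpent)", 0, 0);
	if (IS_ERR(tfm))
		return PTR_ERR(tfm);

	get_random_bytes(key, sizeof(key));
	err = crypto_skcipher_setkey(tfm, key, sizeof(key));

	crypto_free_skcipher(tfm);
	return err;
}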
static const struct common_glue_ctx serpent_enc = {
.num_funcs = 2,
.fpu_blocks_limit = SERPENT_PARALLEL_BLOCKS,
.funcs = { {
.num_blocks = SERPENT_PARALLEL_BLOCKS,
.fn_u = { .ecb = serpent_ecb_enc_8way_avx }
}, {
.num_blocks = 1,
.fn_u = { .ecb = __serpent_encrypt }
} }
};
static const struct common_glue_ctx serpent_ctr = {
.num_funcs = 2,
.fpu_blocks_limit = SERPENT_PARALLEL_BLOCKS,
.funcs = { {
.num_blocks = SERPENT_PARALLEL_BLOCKS,
.fn_u = { .ctr = serpent_ctr_8way_avx }
}, {
.num_blocks = 1,
.fn_u = { .ctr = __serpent_crypt_ctr }
} }
};
static const struct common_glue_ctx serpent_enc_xts = {
.num_funcs = 2,
.fpu_blocks_limit = SERPENT_PARALLEL_BLOCKS,
.funcs = { {
.num_blocks = SERPENT_PARALLEL_BLOCKS,
.fn_u = { .xts = serpent_xts_enc_8way_avx }
}, {
.num_blocks = 1,
.fn_u = { .xts = serpent_xts_enc }
} }
};
static const struct common_glue_ctx serpent_dec = {
.num_funcs = 2,
.fpu_blocks_limit = SERPENT_PARALLEL_BLOCKS,
.funcs = { {
.num_blocks = SERPENT_PARALLEL_BLOCKS,
.fn_u = { .ecb = serpent_ecb_dec_8way_avx }
}, {
.num_blocks = 1,
.fn_u = { .ecb = __serpent_decrypt }
} }
};
static const struct common_glue_ctx serpent_dec_cbc = {
.num_funcs = 2,
.fpu_blocks_limit = SERPENT_PARALLEL_BLOCKS,
.funcs = { {
.num_blocks = SERPENT_PARALLEL_BLOCKS,
.fn_u = { .cbc = serpent_cbc_dec_8way_avx }
}, {
.num_blocks = 1,
.fn_u = { .cbc = __serpent_decrypt }
} }
};
static const struct common_glue_ctx serpent_dec_xts = {
.num_funcs = 2,
.fpu_blocks_limit = SERPENT_PARALLEL_BLOCKS,
.funcs = { {
.num_blocks = SERPENT_PARALLEL_BLOCKS,
.fn_u = { .xts = serpent_xts_dec_8way_avx }
}, {
.num_blocks = 1,
.fn_u = { .xts = serpent_xts_dec }
} }
};
static int ecb_encrypt(struct skcipher_request *req)
{
return glue_ecb_req_128bit(&serpent_enc, req);
ECB_WALK_START(req, SERPENT_BLOCK_SIZE, SERPENT_PARALLEL_BLOCKS);
ECB_BLOCK(SERPENT_PARALLEL_BLOCKS, serpent_ecb_enc_8way_avx);
ECB_BLOCK(1, __serpent_encrypt);
ECB_WALK_END();
}
static int ecb_decrypt(struct skcipher_request *req)
{
return glue_ecb_req_128bit(&serpent_dec, req);
ECB_WALK_START(req, SERPENT_BLOCK_SIZE, SERPENT_PARALLEL_BLOCKS);
ECB_BLOCK(SERPENT_PARALLEL_BLOCKS, serpent_ecb_dec_8way_avx);
ECB_BLOCK(1, __serpent_decrypt);
ECB_WALK_END();
}
static int cbc_encrypt(struct skcipher_request *req)
{
return glue_cbc_encrypt_req_128bit(__serpent_encrypt, req);
CBC_WALK_START(req, SERPENT_BLOCK_SIZE, -1);
CBC_ENC_BLOCK(__serpent_encrypt);
CBC_WALK_END();
}
static int cbc_decrypt(struct skcipher_request *req)
{
return glue_cbc_decrypt_req_128bit(&serpent_dec_cbc, req);
}
static int ctr_crypt(struct skcipher_request *req)
{
return glue_ctr_req_128bit(&serpent_ctr, req);
}
static int xts_encrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct serpent_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
return glue_xts_req_128bit(&serpent_enc_xts, req,
__serpent_encrypt, &ctx->tweak_ctx,
&ctx->crypt_ctx, false);
}
static int xts_decrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct serpent_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
return glue_xts_req_128bit(&serpent_dec_xts, req,
__serpent_encrypt, &ctx->tweak_ctx,
&ctx->crypt_ctx, true);
CBC_WALK_START(req, SERPENT_BLOCK_SIZE, SERPENT_PARALLEL_BLOCKS);
CBC_DEC_BLOCK(SERPENT_PARALLEL_BLOCKS, serpent_cbc_dec_8way_avx);
CBC_DEC_BLOCK(1, __serpent_decrypt);
CBC_WALK_END();
}
static struct skcipher_alg serpent_algs[] = {
@ -247,35 +97,6 @@ static struct skcipher_alg serpent_algs[] = {
.setkey = serpent_setkey_skcipher,
.encrypt = cbc_encrypt,
.decrypt = cbc_decrypt,
}, {
.base.cra_name = "__ctr(serpent)",
.base.cra_driver_name = "__ctr-serpent-avx",
.base.cra_priority = 500,
.base.cra_flags = CRYPTO_ALG_INTERNAL,
.base.cra_blocksize = 1,
.base.cra_ctxsize = sizeof(struct serpent_ctx),
.base.cra_module = THIS_MODULE,
.min_keysize = SERPENT_MIN_KEY_SIZE,
.max_keysize = SERPENT_MAX_KEY_SIZE,
.ivsize = SERPENT_BLOCK_SIZE,
.chunksize = SERPENT_BLOCK_SIZE,
.setkey = serpent_setkey_skcipher,
.encrypt = ctr_crypt,
.decrypt = ctr_crypt,
}, {
.base.cra_name = "__xts(serpent)",
.base.cra_driver_name = "__xts-serpent-avx",
.base.cra_priority = 500,
.base.cra_flags = CRYPTO_ALG_INTERNAL,
.base.cra_blocksize = SERPENT_BLOCK_SIZE,
.base.cra_ctxsize = sizeof(struct serpent_xts_ctx),
.base.cra_module = THIS_MODULE,
.min_keysize = 2 * SERPENT_MIN_KEY_SIZE,
.max_keysize = 2 * SERPENT_MAX_KEY_SIZE,
.ivsize = SERPENT_BLOCK_SIZE,
.setkey = xts_serpent_setkey,
.encrypt = xts_encrypt,
.decrypt = xts_decrypt,
},
};

View File

@ -10,8 +10,6 @@
*
* CBC & ECB parts based on code (crypto/cbc.c,ecb.c) by:
* Copyright (c) 2006 Herbert Xu <herbert@gondor.apana.org.au>
* CTR part based on code (crypto/ctr.c) by:
* (C) Copyright IBM Corp. 2007 - Joy Latten <latten@us.ibm.com>
*/
#include <linux/module.h>
@ -22,8 +20,9 @@
#include <crypto/b128ops.h>
#include <crypto/internal/simd.h>
#include <crypto/serpent.h>
#include <asm/crypto/serpent-sse2.h>
#include <asm/crypto/glue_helper.h>
#include "serpent-sse2.h"
#include "ecb_cbc_helpers.h"
static int serpent_setkey_skcipher(struct crypto_skcipher *tfm,
const u8 *key, unsigned int keylen)
@ -31,130 +30,46 @@ static int serpent_setkey_skcipher(struct crypto_skcipher *tfm,
return __serpent_setkey(crypto_skcipher_ctx(tfm), key, keylen);
}
static void serpent_decrypt_cbc_xway(const void *ctx, u8 *d, const u8 *s)
static void serpent_decrypt_cbc_xway(const void *ctx, u8 *dst, const u8 *src)
{
u128 ivs[SERPENT_PARALLEL_BLOCKS - 1];
u128 *dst = (u128 *)d;
const u128 *src = (const u128 *)s;
unsigned int j;
u8 buf[SERPENT_PARALLEL_BLOCKS - 1][SERPENT_BLOCK_SIZE];
const u8 *s = src;
for (j = 0; j < SERPENT_PARALLEL_BLOCKS - 1; j++)
ivs[j] = src[j];
serpent_dec_blk_xway(ctx, (u8 *)dst, (u8 *)src);
for (j = 0; j < SERPENT_PARALLEL_BLOCKS - 1; j++)
u128_xor(dst + (j + 1), dst + (j + 1), ivs + j);
if (dst == src)
s = memcpy(buf, src, sizeof(buf));
serpent_dec_blk_xway(ctx, dst, src);
crypto_xor(dst + SERPENT_BLOCK_SIZE, s, sizeof(buf));
}
static void serpent_crypt_ctr(const void *ctx, u8 *d, const u8 *s, le128 *iv)
{
be128 ctrblk;
u128 *dst = (u128 *)d;
const u128 *src = (const u128 *)s;
le128_to_be128(&ctrblk, iv);
le128_inc(iv);
__serpent_encrypt(ctx, (u8 *)&ctrblk, (u8 *)&ctrblk);
u128_xor(dst, src, (u128 *)&ctrblk);
}
static void serpent_crypt_ctr_xway(const void *ctx, u8 *d, const u8 *s,
le128 *iv)
{
be128 ctrblks[SERPENT_PARALLEL_BLOCKS];
u128 *dst = (u128 *)d;
const u128 *src = (const u128 *)s;
unsigned int i;
for (i = 0; i < SERPENT_PARALLEL_BLOCKS; i++) {
if (dst != src)
dst[i] = src[i];
le128_to_be128(&ctrblks[i], iv);
le128_inc(iv);
}
serpent_enc_blk_xway_xor(ctx, (u8 *)dst, (u8 *)ctrblks);
}
static const struct common_glue_ctx serpent_enc = {
.num_funcs = 2,
.fpu_blocks_limit = SERPENT_PARALLEL_BLOCKS,
.funcs = { {
.num_blocks = SERPENT_PARALLEL_BLOCKS,
.fn_u = { .ecb = serpent_enc_blk_xway }
}, {
.num_blocks = 1,
.fn_u = { .ecb = __serpent_encrypt }
} }
};
static const struct common_glue_ctx serpent_ctr = {
.num_funcs = 2,
.fpu_blocks_limit = SERPENT_PARALLEL_BLOCKS,
.funcs = { {
.num_blocks = SERPENT_PARALLEL_BLOCKS,
.fn_u = { .ctr = serpent_crypt_ctr_xway }
}, {
.num_blocks = 1,
.fn_u = { .ctr = serpent_crypt_ctr }
} }
};
static const struct common_glue_ctx serpent_dec = {
.num_funcs = 2,
.fpu_blocks_limit = SERPENT_PARALLEL_BLOCKS,
.funcs = { {
.num_blocks = SERPENT_PARALLEL_BLOCKS,
.fn_u = { .ecb = serpent_dec_blk_xway }
}, {
.num_blocks = 1,
.fn_u = { .ecb = __serpent_decrypt }
} }
};
static const struct common_glue_ctx serpent_dec_cbc = {
.num_funcs = 2,
.fpu_blocks_limit = SERPENT_PARALLEL_BLOCKS,
.funcs = { {
.num_blocks = SERPENT_PARALLEL_BLOCKS,
.fn_u = { .cbc = serpent_decrypt_cbc_xway }
}, {
.num_blocks = 1,
.fn_u = { .cbc = __serpent_decrypt }
} }
};
static int ecb_encrypt(struct skcipher_request *req)
{
return glue_ecb_req_128bit(&serpent_enc, req);
ECB_WALK_START(req, SERPENT_BLOCK_SIZE, SERPENT_PARALLEL_BLOCKS);
ECB_BLOCK(SERPENT_PARALLEL_BLOCKS, serpent_enc_blk_xway);
ECB_BLOCK(1, __serpent_encrypt);
ECB_WALK_END();
}
static int ecb_decrypt(struct skcipher_request *req)
{
return glue_ecb_req_128bit(&serpent_dec, req);
ECB_WALK_START(req, SERPENT_BLOCK_SIZE, SERPENT_PARALLEL_BLOCKS);
ECB_BLOCK(SERPENT_PARALLEL_BLOCKS, serpent_dec_blk_xway);
ECB_BLOCK(1, __serpent_decrypt);
ECB_WALK_END();
}
static int cbc_encrypt(struct skcipher_request *req)
{
return glue_cbc_encrypt_req_128bit(__serpent_encrypt,
req);
CBC_WALK_START(req, SERPENT_BLOCK_SIZE, -1);
CBC_ENC_BLOCK(__serpent_encrypt);
CBC_WALK_END();
}
static int cbc_decrypt(struct skcipher_request *req)
{
return glue_cbc_decrypt_req_128bit(&serpent_dec_cbc, req);
}
static int ctr_crypt(struct skcipher_request *req)
{
return glue_ctr_req_128bit(&serpent_ctr, req);
CBC_WALK_START(req, SERPENT_BLOCK_SIZE, SERPENT_PARALLEL_BLOCKS);
CBC_DEC_BLOCK(SERPENT_PARALLEL_BLOCKS, serpent_decrypt_cbc_xway);
CBC_DEC_BLOCK(1, __serpent_decrypt);
CBC_WALK_END();
}
static struct skcipher_alg serpent_algs[] = {
@ -185,21 +100,6 @@ static struct skcipher_alg serpent_algs[] = {
.setkey = serpent_setkey_skcipher,
.encrypt = cbc_encrypt,
.decrypt = cbc_decrypt,
}, {
.base.cra_name = "__ctr(serpent)",
.base.cra_driver_name = "__ctr-serpent-sse2",
.base.cra_priority = 400,
.base.cra_flags = CRYPTO_ALG_INTERNAL,
.base.cra_blocksize = 1,
.base.cra_ctxsize = sizeof(struct serpent_ctx),
.base.cra_module = THIS_MODULE,
.min_keysize = SERPENT_MIN_KEY_SIZE,
.max_keysize = SERPENT_MAX_KEY_SIZE,
.ivsize = SERPENT_BLOCK_SIZE,
.chunksize = SERPENT_BLOCK_SIZE,
.setkey = serpent_setkey_skcipher,
.encrypt = ctr_crypt,
.decrypt = ctr_crypt,
},
};

View File

@ -19,11 +19,6 @@
.Lbswap128_mask:
.byte 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0
.section .rodata.cst16.xts_gf128mul_and_shl1_mask, "aM", @progbits, 16
.align 16
.Lxts_gf128mul_and_shl1_mask:
.byte 0x87, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0
.text
/* structure of crypto context */
@ -379,78 +374,3 @@ SYM_FUNC_START(twofish_cbc_dec_8way)
FRAME_END
ret;
SYM_FUNC_END(twofish_cbc_dec_8way)
SYM_FUNC_START(twofish_ctr_8way)
/* input:
* %rdi: ctx, CTX
* %rsi: dst
* %rdx: src
* %rcx: iv (little endian, 128bit)
*/
FRAME_BEGIN
pushq %r12;
movq %rsi, %r11;
movq %rdx, %r12;
load_ctr_8way(%rcx, .Lbswap128_mask, RA1, RB1, RC1, RD1, RA2, RB2, RC2,
RD2, RX0, RX1, RY0);
call __twofish_enc_blk8;
store_ctr_8way(%r12, %r11, RC1, RD1, RA1, RB1, RC2, RD2, RA2, RB2);
popq %r12;
FRAME_END
ret;
SYM_FUNC_END(twofish_ctr_8way)
SYM_FUNC_START(twofish_xts_enc_8way)
/* input:
* %rdi: ctx, CTX
* %rsi: dst
* %rdx: src
* %rcx: iv (t ⊕ αⁿ ∈ GF(2¹²⁸))
*/
FRAME_BEGIN
movq %rsi, %r11;
/* regs <= src, dst <= IVs, regs <= regs xor IVs */
load_xts_8way(%rcx, %rdx, %rsi, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2,
RX0, RX1, RY0, .Lxts_gf128mul_and_shl1_mask);
call __twofish_enc_blk8;
/* dst <= regs xor IVs(in dst) */
store_xts_8way(%r11, RC1, RD1, RA1, RB1, RC2, RD2, RA2, RB2);
FRAME_END
ret;
SYM_FUNC_END(twofish_xts_enc_8way)
SYM_FUNC_START(twofish_xts_dec_8way)
/* input:
* %rdi: ctx, CTX
* %rsi: dst
* %rdx: src
* %rcx: iv (t ⊕ αⁿ ∈ GF(2¹²⁸))
*/
FRAME_BEGIN
movq %rsi, %r11;
/* regs <= src, dst <= IVs, regs <= regs xor IVs */
load_xts_8way(%rcx, %rdx, %rsi, RC1, RD1, RA1, RB1, RC2, RD2, RA2, RB2,
RX0, RX1, RY0, .Lxts_gf128mul_and_shl1_mask);
call __twofish_dec_blk8;
/* dst <= regs xor IVs(in dst) */
store_xts_8way(%r11, RA1, RB1, RC1, RD1, RA2, RB2, RC2, RD2);
FRAME_END
ret;
SYM_FUNC_END(twofish_xts_dec_8way)

View File

@ -17,9 +17,5 @@ asmlinkage void twofish_dec_blk_3way(const void *ctx, u8 *dst, const u8 *src);
/* helpers from twofish_x86_64-3way module */
extern void twofish_dec_blk_cbc_3way(const void *ctx, u8 *dst, const u8 *src);
extern void twofish_enc_blk_ctr(const void *ctx, u8 *dst, const u8 *src,
le128 *iv);
extern void twofish_enc_blk_ctr_3way(const void *ctx, u8 *dst, const u8 *src,
le128 *iv);
#endif /* ASM_X86_TWOFISH_H */

View File

@ -15,9 +15,9 @@
#include <crypto/algapi.h>
#include <crypto/internal/simd.h>
#include <crypto/twofish.h>
#include <crypto/xts.h>
#include <asm/crypto/glue_helper.h>
#include <asm/crypto/twofish.h>
#include "twofish.h"
#include "ecb_cbc_helpers.h"
#define TWOFISH_PARALLEL_BLOCKS 8
@ -26,13 +26,6 @@ asmlinkage void twofish_ecb_enc_8way(const void *ctx, u8 *dst, const u8 *src);
asmlinkage void twofish_ecb_dec_8way(const void *ctx, u8 *dst, const u8 *src);
asmlinkage void twofish_cbc_dec_8way(const void *ctx, u8 *dst, const u8 *src);
asmlinkage void twofish_ctr_8way(const void *ctx, u8 *dst, const u8 *src,
le128 *iv);
asmlinkage void twofish_xts_enc_8way(const void *ctx, u8 *dst, const u8 *src,
le128 *iv);
asmlinkage void twofish_xts_dec_8way(const void *ctx, u8 *dst, const u8 *src,
le128 *iv);
static int twofish_setkey_skcipher(struct crypto_skcipher *tfm,
const u8 *key, unsigned int keylen)
@ -45,171 +38,38 @@ static inline void twofish_enc_blk_3way(const void *ctx, u8 *dst, const u8 *src)
__twofish_enc_blk_3way(ctx, dst, src, false);
}
static void twofish_xts_enc(const void *ctx, u8 *dst, const u8 *src, le128 *iv)
{
glue_xts_crypt_128bit_one(ctx, dst, src, iv, twofish_enc_blk);
}
static void twofish_xts_dec(const void *ctx, u8 *dst, const u8 *src, le128 *iv)
{
glue_xts_crypt_128bit_one(ctx, dst, src, iv, twofish_dec_blk);
}
struct twofish_xts_ctx {
struct twofish_ctx tweak_ctx;
struct twofish_ctx crypt_ctx;
};
static int xts_twofish_setkey(struct crypto_skcipher *tfm, const u8 *key,
unsigned int keylen)
{
struct twofish_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
int err;
err = xts_verify_key(tfm, key, keylen);
if (err)
return err;
/* first half of xts-key is for crypt */
err = __twofish_setkey(&ctx->crypt_ctx, key, keylen / 2);
if (err)
return err;
/* second half of xts-key is for tweak */
return __twofish_setkey(&ctx->tweak_ctx, key + keylen / 2, keylen / 2);
}
static const struct common_glue_ctx twofish_enc = {
.num_funcs = 3,
.fpu_blocks_limit = TWOFISH_PARALLEL_BLOCKS,
.funcs = { {
.num_blocks = TWOFISH_PARALLEL_BLOCKS,
.fn_u = { .ecb = twofish_ecb_enc_8way }
}, {
.num_blocks = 3,
.fn_u = { .ecb = twofish_enc_blk_3way }
}, {
.num_blocks = 1,
.fn_u = { .ecb = twofish_enc_blk }
} }
};
static const struct common_glue_ctx twofish_ctr = {
.num_funcs = 3,
.fpu_blocks_limit = TWOFISH_PARALLEL_BLOCKS,
.funcs = { {
.num_blocks = TWOFISH_PARALLEL_BLOCKS,
.fn_u = { .ctr = twofish_ctr_8way }
}, {
.num_blocks = 3,
.fn_u = { .ctr = twofish_enc_blk_ctr_3way }
}, {
.num_blocks = 1,
.fn_u = { .ctr = twofish_enc_blk_ctr }
} }
};
static const struct common_glue_ctx twofish_enc_xts = {
.num_funcs = 2,
.fpu_blocks_limit = TWOFISH_PARALLEL_BLOCKS,
.funcs = { {
.num_blocks = TWOFISH_PARALLEL_BLOCKS,
.fn_u = { .xts = twofish_xts_enc_8way }
}, {
.num_blocks = 1,
.fn_u = { .xts = twofish_xts_enc }
} }
};
static const struct common_glue_ctx twofish_dec = {
.num_funcs = 3,
.fpu_blocks_limit = TWOFISH_PARALLEL_BLOCKS,
.funcs = { {
.num_blocks = TWOFISH_PARALLEL_BLOCKS,
.fn_u = { .ecb = twofish_ecb_dec_8way }
}, {
.num_blocks = 3,
.fn_u = { .ecb = twofish_dec_blk_3way }
}, {
.num_blocks = 1,
.fn_u = { .ecb = twofish_dec_blk }
} }
};
static const struct common_glue_ctx twofish_dec_cbc = {
.num_funcs = 3,
.fpu_blocks_limit = TWOFISH_PARALLEL_BLOCKS,
.funcs = { {
.num_blocks = TWOFISH_PARALLEL_BLOCKS,
.fn_u = { .cbc = twofish_cbc_dec_8way }
}, {
.num_blocks = 3,
.fn_u = { .cbc = twofish_dec_blk_cbc_3way }
}, {
.num_blocks = 1,
.fn_u = { .cbc = twofish_dec_blk }
} }
};
static const struct common_glue_ctx twofish_dec_xts = {
.num_funcs = 2,
.fpu_blocks_limit = TWOFISH_PARALLEL_BLOCKS,
.funcs = { {
.num_blocks = TWOFISH_PARALLEL_BLOCKS,
.fn_u = { .xts = twofish_xts_dec_8way }
}, {
.num_blocks = 1,
.fn_u = { .xts = twofish_xts_dec }
} }
};
static int ecb_encrypt(struct skcipher_request *req)
{
return glue_ecb_req_128bit(&twofish_enc, req);
ECB_WALK_START(req, TF_BLOCK_SIZE, TWOFISH_PARALLEL_BLOCKS);
ECB_BLOCK(TWOFISH_PARALLEL_BLOCKS, twofish_ecb_enc_8way);
ECB_BLOCK(3, twofish_enc_blk_3way);
ECB_BLOCK(1, twofish_enc_blk);
ECB_WALK_END();
}
static int ecb_decrypt(struct skcipher_request *req)
{
return glue_ecb_req_128bit(&twofish_dec, req);
ECB_WALK_START(req, TF_BLOCK_SIZE, TWOFISH_PARALLEL_BLOCKS);
ECB_BLOCK(TWOFISH_PARALLEL_BLOCKS, twofish_ecb_dec_8way);
ECB_BLOCK(3, twofish_dec_blk_3way);
ECB_BLOCK(1, twofish_dec_blk);
ECB_WALK_END();
}
static int cbc_encrypt(struct skcipher_request *req)
{
return glue_cbc_encrypt_req_128bit(twofish_enc_blk, req);
CBC_WALK_START(req, TF_BLOCK_SIZE, -1);
CBC_ENC_BLOCK(twofish_enc_blk);
CBC_WALK_END();
}
static int cbc_decrypt(struct skcipher_request *req)
{
return glue_cbc_decrypt_req_128bit(&twofish_dec_cbc, req);
}
static int ctr_crypt(struct skcipher_request *req)
{
return glue_ctr_req_128bit(&twofish_ctr, req);
}
static int xts_encrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct twofish_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
return glue_xts_req_128bit(&twofish_enc_xts, req, twofish_enc_blk,
&ctx->tweak_ctx, &ctx->crypt_ctx, false);
}
static int xts_decrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct twofish_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
return glue_xts_req_128bit(&twofish_dec_xts, req, twofish_enc_blk,
&ctx->tweak_ctx, &ctx->crypt_ctx, true);
CBC_WALK_START(req, TF_BLOCK_SIZE, TWOFISH_PARALLEL_BLOCKS);
CBC_DEC_BLOCK(TWOFISH_PARALLEL_BLOCKS, twofish_cbc_dec_8way);
CBC_DEC_BLOCK(3, twofish_dec_blk_cbc_3way);
CBC_DEC_BLOCK(1, twofish_dec_blk);
CBC_WALK_END();
}
static struct skcipher_alg twofish_algs[] = {
@ -240,35 +100,6 @@ static struct skcipher_alg twofish_algs[] = {
.setkey = twofish_setkey_skcipher,
.encrypt = cbc_encrypt,
.decrypt = cbc_decrypt,
}, {
.base.cra_name = "__ctr(twofish)",
.base.cra_driver_name = "__ctr-twofish-avx",
.base.cra_priority = 400,
.base.cra_flags = CRYPTO_ALG_INTERNAL,
.base.cra_blocksize = 1,
.base.cra_ctxsize = sizeof(struct twofish_ctx),
.base.cra_module = THIS_MODULE,
.min_keysize = TF_MIN_KEY_SIZE,
.max_keysize = TF_MAX_KEY_SIZE,
.ivsize = TF_BLOCK_SIZE,
.chunksize = TF_BLOCK_SIZE,
.setkey = twofish_setkey_skcipher,
.encrypt = ctr_crypt,
.decrypt = ctr_crypt,
}, {
.base.cra_name = "__xts(twofish)",
.base.cra_driver_name = "__xts-twofish-avx",
.base.cra_priority = 400,
.base.cra_flags = CRYPTO_ALG_INTERNAL,
.base.cra_blocksize = TF_BLOCK_SIZE,
.base.cra_ctxsize = sizeof(struct twofish_xts_ctx),
.base.cra_module = THIS_MODULE,
.min_keysize = 2 * TF_MIN_KEY_SIZE,
.max_keysize = 2 * TF_MAX_KEY_SIZE,
.ivsize = TF_BLOCK_SIZE,
.setkey = xts_twofish_setkey,
.encrypt = xts_encrypt,
.decrypt = xts_decrypt,
},
};

View File

@ -5,17 +5,16 @@
* Copyright (c) 2011 Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
*/
#include <asm/crypto/glue_helper.h>
#include <asm/crypto/twofish.h>
#include <crypto/algapi.h>
#include <crypto/b128ops.h>
#include <crypto/internal/skcipher.h>
#include <crypto/twofish.h>
#include <linux/crypto.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/types.h>
#include "twofish.h"
#include "ecb_cbc_helpers.h"
EXPORT_SYMBOL_GPL(__twofish_enc_blk_3way);
EXPORT_SYMBOL_GPL(twofish_dec_blk_3way);
@ -30,143 +29,48 @@ static inline void twofish_enc_blk_3way(const void *ctx, u8 *dst, const u8 *src)
__twofish_enc_blk_3way(ctx, dst, src, false);
}
static inline void twofish_enc_blk_xor_3way(const void *ctx, u8 *dst,
const u8 *src)
void twofish_dec_blk_cbc_3way(const void *ctx, u8 *dst, const u8 *src)
{
__twofish_enc_blk_3way(ctx, dst, src, true);
}
u8 buf[2][TF_BLOCK_SIZE];
const u8 *s = src;
void twofish_dec_blk_cbc_3way(const void *ctx, u8 *d, const u8 *s)
{
u128 ivs[2];
u128 *dst = (u128 *)d;
const u128 *src = (const u128 *)s;
if (dst == src)
s = memcpy(buf, src, sizeof(buf));
twofish_dec_blk_3way(ctx, dst, src);
crypto_xor(dst + TF_BLOCK_SIZE, s, sizeof(buf));
ivs[0] = src[0];
ivs[1] = src[1];
twofish_dec_blk_3way(ctx, (u8 *)dst, (u8 *)src);
u128_xor(&dst[1], &dst[1], &ivs[0]);
u128_xor(&dst[2], &dst[2], &ivs[1]);
}
EXPORT_SYMBOL_GPL(twofish_dec_blk_cbc_3way);
void twofish_enc_blk_ctr(const void *ctx, u8 *d, const u8 *s, le128 *iv)
{
be128 ctrblk;
u128 *dst = (u128 *)d;
const u128 *src = (const u128 *)s;
if (dst != src)
*dst = *src;
le128_to_be128(&ctrblk, iv);
le128_inc(iv);
twofish_enc_blk(ctx, (u8 *)&ctrblk, (u8 *)&ctrblk);
u128_xor(dst, dst, (u128 *)&ctrblk);
}
EXPORT_SYMBOL_GPL(twofish_enc_blk_ctr);
void twofish_enc_blk_ctr_3way(const void *ctx, u8 *d, const u8 *s, le128 *iv)
{
be128 ctrblks[3];
u128 *dst = (u128 *)d;
const u128 *src = (const u128 *)s;
if (dst != src) {
dst[0] = src[0];
dst[1] = src[1];
dst[2] = src[2];
}
le128_to_be128(&ctrblks[0], iv);
le128_inc(iv);
le128_to_be128(&ctrblks[1], iv);
le128_inc(iv);
le128_to_be128(&ctrblks[2], iv);
le128_inc(iv);
twofish_enc_blk_xor_3way(ctx, (u8 *)dst, (u8 *)ctrblks);
}
EXPORT_SYMBOL_GPL(twofish_enc_blk_ctr_3way);
static const struct common_glue_ctx twofish_enc = {
.num_funcs = 2,
.fpu_blocks_limit = -1,
.funcs = { {
.num_blocks = 3,
.fn_u = { .ecb = twofish_enc_blk_3way }
}, {
.num_blocks = 1,
.fn_u = { .ecb = twofish_enc_blk }
} }
};
static const struct common_glue_ctx twofish_ctr = {
.num_funcs = 2,
.fpu_blocks_limit = -1,
.funcs = { {
.num_blocks = 3,
.fn_u = { .ctr = twofish_enc_blk_ctr_3way }
}, {
.num_blocks = 1,
.fn_u = { .ctr = twofish_enc_blk_ctr }
} }
};
static const struct common_glue_ctx twofish_dec = {
.num_funcs = 2,
.fpu_blocks_limit = -1,
.funcs = { {
.num_blocks = 3,
.fn_u = { .ecb = twofish_dec_blk_3way }
}, {
.num_blocks = 1,
.fn_u = { .ecb = twofish_dec_blk }
} }
};
static const struct common_glue_ctx twofish_dec_cbc = {
.num_funcs = 2,
.fpu_blocks_limit = -1,
.funcs = { {
.num_blocks = 3,
.fn_u = { .cbc = twofish_dec_blk_cbc_3way }
}, {
.num_blocks = 1,
.fn_u = { .cbc = twofish_dec_blk }
} }
};
static int ecb_encrypt(struct skcipher_request *req)
{
return glue_ecb_req_128bit(&twofish_enc, req);
ECB_WALK_START(req, TF_BLOCK_SIZE, -1);
ECB_BLOCK(3, twofish_enc_blk_3way);
ECB_BLOCK(1, twofish_enc_blk);
ECB_WALK_END();
}
static int ecb_decrypt(struct skcipher_request *req)
{
return glue_ecb_req_128bit(&twofish_dec, req);
ECB_WALK_START(req, TF_BLOCK_SIZE, -1);
ECB_BLOCK(3, twofish_dec_blk_3way);
ECB_BLOCK(1, twofish_dec_blk);
ECB_WALK_END();
}
static int cbc_encrypt(struct skcipher_request *req)
{
return glue_cbc_encrypt_req_128bit(twofish_enc_blk, req);
CBC_WALK_START(req, TF_BLOCK_SIZE, -1);
CBC_ENC_BLOCK(twofish_enc_blk);
CBC_WALK_END();
}
static int cbc_decrypt(struct skcipher_request *req)
{
return glue_cbc_decrypt_req_128bit(&twofish_dec_cbc, req);
}
static int ctr_crypt(struct skcipher_request *req)
{
return glue_ctr_req_128bit(&twofish_ctr, req);
CBC_WALK_START(req, TF_BLOCK_SIZE, -1);
CBC_DEC_BLOCK(3, twofish_dec_blk_cbc_3way);
CBC_DEC_BLOCK(1, twofish_dec_blk);
CBC_WALK_END();
}
static struct skcipher_alg tf_skciphers[] = {
@ -195,20 +99,6 @@ static struct skcipher_alg tf_skciphers[] = {
.setkey = twofish_setkey_skcipher,
.encrypt = cbc_encrypt,
.decrypt = cbc_decrypt,
}, {
.base.cra_name = "ctr(twofish)",
.base.cra_driver_name = "ctr-twofish-3way",
.base.cra_priority = 300,
.base.cra_blocksize = 1,
.base.cra_ctxsize = sizeof(struct twofish_ctx),
.base.cra_module = THIS_MODULE,
.min_keysize = TF_MIN_KEY_SIZE,
.max_keysize = TF_MAX_KEY_SIZE,
.ivsize = TF_BLOCK_SIZE,
.chunksize = TF_BLOCK_SIZE,
.setkey = twofish_setkey_skcipher,
.encrypt = ctr_crypt,
.decrypt = ctr_crypt,
},
};

View File

@ -1,118 +0,0 @@
/* SPDX-License-Identifier: GPL-2.0 */
/*
* Shared glue code for 128bit block ciphers
*/
#ifndef _CRYPTO_GLUE_HELPER_H
#define _CRYPTO_GLUE_HELPER_H
#include <crypto/internal/skcipher.h>
#include <linux/kernel.h>
#include <asm/fpu/api.h>
#include <crypto/b128ops.h>
typedef void (*common_glue_func_t)(const void *ctx, u8 *dst, const u8 *src);
typedef void (*common_glue_cbc_func_t)(const void *ctx, u8 *dst, const u8 *src);
typedef void (*common_glue_ctr_func_t)(const void *ctx, u8 *dst, const u8 *src,
le128 *iv);
typedef void (*common_glue_xts_func_t)(const void *ctx, u8 *dst, const u8 *src,
le128 *iv);
struct common_glue_func_entry {
unsigned int num_blocks; /* number of blocks that @fn will process */
union {
common_glue_func_t ecb;
common_glue_cbc_func_t cbc;
common_glue_ctr_func_t ctr;
common_glue_xts_func_t xts;
} fn_u;
};
struct common_glue_ctx {
unsigned int num_funcs;
int fpu_blocks_limit; /* -1 means fpu not needed at all */
/*
* First funcs entry must have largest num_blocks and last funcs entry
* must have num_blocks == 1!
*/
struct common_glue_func_entry funcs[];
};
static inline bool glue_fpu_begin(unsigned int bsize, int fpu_blocks_limit,
struct skcipher_walk *walk,
bool fpu_enabled, unsigned int nbytes)
{
if (likely(fpu_blocks_limit < 0))
return false;
if (fpu_enabled)
return true;
/*
* Vector-registers are only used when chunk to be processed is large
* enough, so do not enable FPU until it is necessary.
*/
if (nbytes < bsize * (unsigned int)fpu_blocks_limit)
return false;
/* prevent sleeping if FPU is in use */
skcipher_walk_atomise(walk);
kernel_fpu_begin();
return true;
}
static inline void glue_fpu_end(bool fpu_enabled)
{
if (fpu_enabled)
kernel_fpu_end();
}
static inline void le128_to_be128(be128 *dst, const le128 *src)
{
dst->a = cpu_to_be64(le64_to_cpu(src->a));
dst->b = cpu_to_be64(le64_to_cpu(src->b));
}
static inline void be128_to_le128(le128 *dst, const be128 *src)
{
dst->a = cpu_to_le64(be64_to_cpu(src->a));
dst->b = cpu_to_le64(be64_to_cpu(src->b));
}
static inline void le128_inc(le128 *i)
{
u64 a = le64_to_cpu(i->a);
u64 b = le64_to_cpu(i->b);
b++;
if (!b)
a++;
i->a = cpu_to_le64(a);
i->b = cpu_to_le64(b);
}
extern int glue_ecb_req_128bit(const struct common_glue_ctx *gctx,
struct skcipher_request *req);
extern int glue_cbc_encrypt_req_128bit(const common_glue_func_t fn,
struct skcipher_request *req);
extern int glue_cbc_decrypt_req_128bit(const struct common_glue_ctx *gctx,
struct skcipher_request *req);
extern int glue_ctr_req_128bit(const struct common_glue_ctx *gctx,
struct skcipher_request *req);
extern int glue_xts_req_128bit(const struct common_glue_ctx *gctx,
struct skcipher_request *req,
common_glue_func_t tweak_fn, void *tweak_ctx,
void *crypt_ctx, bool decrypt);
extern void glue_xts_crypt_128bit_one(const void *ctx, u8 *dst,
const u8 *src, le128 *iv,
common_glue_func_t fn);
#endif /* _CRYPTO_GLUE_HELPER_H */

View File

@ -1,42 +0,0 @@
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef ASM_X86_SERPENT_AVX_H
#define ASM_X86_SERPENT_AVX_H
#include <crypto/b128ops.h>
#include <crypto/serpent.h>
#include <linux/types.h>
struct crypto_skcipher;
#define SERPENT_PARALLEL_BLOCKS 8
struct serpent_xts_ctx {
struct serpent_ctx tweak_ctx;
struct serpent_ctx crypt_ctx;
};
asmlinkage void serpent_ecb_enc_8way_avx(const void *ctx, u8 *dst,
const u8 *src);
asmlinkage void serpent_ecb_dec_8way_avx(const void *ctx, u8 *dst,
const u8 *src);
asmlinkage void serpent_cbc_dec_8way_avx(const void *ctx, u8 *dst,
const u8 *src);
asmlinkage void serpent_ctr_8way_avx(const void *ctx, u8 *dst, const u8 *src,
le128 *iv);
asmlinkage void serpent_xts_enc_8way_avx(const void *ctx, u8 *dst,
const u8 *src, le128 *iv);
asmlinkage void serpent_xts_dec_8way_avx(const void *ctx, u8 *dst,
const u8 *src, le128 *iv);
extern void __serpent_crypt_ctr(const void *ctx, u8 *dst, const u8 *src,
le128 *iv);
extern void serpent_xts_enc(const void *ctx, u8 *dst, const u8 *src, le128 *iv);
extern void serpent_xts_dec(const void *ctx, u8 *dst, const u8 *src, le128 *iv);
extern int xts_serpent_setkey(struct crypto_skcipher *tfm, const u8 *key,
unsigned int keylen);
#endif

View File

@ -210,11 +210,6 @@ config CRYPTO_SIMD
tristate
select CRYPTO_CRYPTD
config CRYPTO_GLUE_HELPER_X86
tristate
depends on X86
select CRYPTO_SKCIPHER
config CRYPTO_ENGINE
tristate
@ -822,19 +817,6 @@ config CRYPTO_MICHAEL_MIC
should not be used for other purposes because of the weakness
of the algorithm.
config CRYPTO_RMD128
tristate "RIPEMD-128 digest algorithm"
select CRYPTO_HASH
help
RIPEMD-128 (ISO/IEC 10118-3:2004).
RIPEMD-128 is a 128-bit cryptographic hash function. It should only
be used as a secure replacement for RIPEMD. For other use cases,
RIPEMD-160 should be used.
Developed by Hans Dobbertin, Antoon Bosselaers and Bart Preneel.
See <https://homes.esat.kuleuven.be/~bosselae/ripemd160.html>
config CRYPTO_RMD160
tristate "RIPEMD-160 digest algorithm"
select CRYPTO_HASH
@ -852,30 +834,6 @@ config CRYPTO_RMD160
Developed by Hans Dobbertin, Antoon Bosselaers and Bart Preneel.
See <https://homes.esat.kuleuven.be/~bosselae/ripemd160.html>
config CRYPTO_RMD256
tristate "RIPEMD-256 digest algorithm"
select CRYPTO_HASH
help
RIPEMD-256 is an optional extension of RIPEMD-128 with a
256 bit hash. It is intended for applications that require
longer hash-results, without needing a larger security level
(than RIPEMD-128).
Developed by Hans Dobbertin, Antoon Bosselaers and Bart Preneel.
See <https://homes.esat.kuleuven.be/~bosselae/ripemd160.html>
config CRYPTO_RMD320
tristate "RIPEMD-320 digest algorithm"
select CRYPTO_HASH
help
RIPEMD-320 is an optional extension of RIPEMD-160 with a
320 bit hash. It is intended for applications that require
longer hash-results, without needing a larger security level
(than RIPEMD-160).
Developed by Hans Dobbertin, Antoon Bosselaers and Bart Preneel.
See <https://homes.esat.kuleuven.be/~bosselae/ripemd160.html>
config CRYPTO_SHA1
tristate "SHA1 digest algorithm"
select CRYPTO_HASH
@ -1051,19 +1009,6 @@ config CRYPTO_STREEBOG
https://tc26.ru/upload/iblock/fed/feddbb4d26b685903faa2ba11aea43f6.pdf
https://tools.ietf.org/html/rfc6986
config CRYPTO_TGR192
tristate "Tiger digest algorithms"
select CRYPTO_HASH
help
Tiger hash algorithm 192, 160 and 128-bit hashes
Tiger is a hash function optimized for 64-bit processors while
still having decent performance on 32-bit processors.
Tiger was developed by Ross Anderson and Eli Biham.
See also:
<https://www.cs.technion.ac.il/~biham/Reports/Tiger/>.
config CRYPTO_WP512
tristate "Whirlpool digest algorithms"
select CRYPTO_HASH
@ -1133,7 +1078,6 @@ config CRYPTO_AES_NI_INTEL
select CRYPTO_LIB_AES
select CRYPTO_ALGAPI
select CRYPTO_SKCIPHER
select CRYPTO_GLUE_HELPER_X86 if 64BIT
select CRYPTO_SIMD
help
Use Intel AES-NI instructions for AES algorithm.
@ -1256,6 +1200,7 @@ config CRYPTO_BLOWFISH_X86_64
depends on X86 && 64BIT
select CRYPTO_SKCIPHER
select CRYPTO_BLOWFISH_COMMON
imply CRYPTO_CTR
help
Blowfish cipher algorithm (x86_64), by Bruce Schneier.
@ -1286,7 +1231,7 @@ config CRYPTO_CAMELLIA_X86_64
depends on X86 && 64BIT
depends on CRYPTO
select CRYPTO_SKCIPHER
select CRYPTO_GLUE_HELPER_X86
imply CRYPTO_CTR
help
Camellia cipher algorithm module (x86_64).
@ -1304,9 +1249,8 @@ config CRYPTO_CAMELLIA_AESNI_AVX_X86_64
depends on CRYPTO
select CRYPTO_SKCIPHER
select CRYPTO_CAMELLIA_X86_64
select CRYPTO_GLUE_HELPER_X86
select CRYPTO_SIMD
select CRYPTO_XTS
imply CRYPTO_XTS
help
Camellia cipher algorithm module (x86_64/AES-NI/AVX).
@ -1372,6 +1316,7 @@ config CRYPTO_CAST5_AVX_X86_64
select CRYPTO_CAST5
select CRYPTO_CAST_COMMON
select CRYPTO_SIMD
imply CRYPTO_CTR
help
The CAST5 encryption algorithm (synonymous with CAST-128) is
described in RFC2144.
@ -1393,9 +1338,9 @@ config CRYPTO_CAST6_AVX_X86_64
select CRYPTO_SKCIPHER
select CRYPTO_CAST6
select CRYPTO_CAST_COMMON
select CRYPTO_GLUE_HELPER_X86
select CRYPTO_SIMD
select CRYPTO_XTS
imply CRYPTO_XTS
imply CRYPTO_CTR
help
The CAST6 encryption algorithm (synonymous with CAST-256) is
described in RFC2612.
@ -1425,6 +1370,7 @@ config CRYPTO_DES3_EDE_X86_64
depends on X86 && 64BIT
select CRYPTO_SKCIPHER
select CRYPTO_LIB_DES
imply CRYPTO_CTR
help
Triple DES EDE (FIPS 46-3) algorithm.
@ -1454,18 +1400,6 @@ config CRYPTO_KHAZAD
See also:
<http://www.larc.usp.br/~pbarreto/KhazadPage.html>
config CRYPTO_SALSA20
tristate "Salsa20 stream cipher algorithm"
select CRYPTO_SKCIPHER
help
Salsa20 stream cipher algorithm.
Salsa20 is a stream cipher submitted to eSTREAM, the ECRYPT
Stream Cipher Project. See <https://www.ecrypt.eu.org/stream/>
The Salsa20 stream cipher algorithm is designed by Daniel J.
Bernstein <djb@cr.yp.to>. See <https://cr.yp.to/snuffle.html>
config CRYPTO_CHACHA20
tristate "ChaCha stream cipher algorithms"
select CRYPTO_LIB_CHACHA_GENERIC
@ -1526,8 +1460,7 @@ config CRYPTO_SERPENT
Serpent cipher algorithm, by Anderson, Biham & Knudsen.
Keys are allowed to be from 0 to 256 bits in length, in steps
of 8 bits. Also includes the 'Tnepres' algorithm, a reversed
variant of Serpent for compatibility with old kerneli.org code.
of 8 bits.
See also:
<https://www.cl.cam.ac.uk/~rja14/serpent.html>
@ -1536,9 +1469,9 @@ config CRYPTO_SERPENT_SSE2_X86_64
tristate "Serpent cipher algorithm (x86_64/SSE2)"
depends on X86 && 64BIT
select CRYPTO_SKCIPHER
select CRYPTO_GLUE_HELPER_X86
select CRYPTO_SERPENT
select CRYPTO_SIMD
imply CRYPTO_CTR
help
Serpent cipher algorithm, by Anderson, Biham & Knudsen.
@ -1555,9 +1488,9 @@ config CRYPTO_SERPENT_SSE2_586
tristate "Serpent cipher algorithm (i586/SSE2)"
depends on X86 && !64BIT
select CRYPTO_SKCIPHER
select CRYPTO_GLUE_HELPER_X86
select CRYPTO_SERPENT
select CRYPTO_SIMD
imply CRYPTO_CTR
help
Serpent cipher algorithm, by Anderson, Biham & Knudsen.
@ -1574,10 +1507,10 @@ config CRYPTO_SERPENT_AVX_X86_64
tristate "Serpent cipher algorithm (x86_64/AVX)"
depends on X86 && 64BIT
select CRYPTO_SKCIPHER
select CRYPTO_GLUE_HELPER_X86
select CRYPTO_SERPENT
select CRYPTO_SIMD
select CRYPTO_XTS
imply CRYPTO_XTS
imply CRYPTO_CTR
help
Serpent cipher algorithm, by Anderson, Biham & Knudsen.
@ -1675,6 +1608,7 @@ config CRYPTO_TWOFISH_586
depends on (X86 || UML_X86) && !64BIT
select CRYPTO_ALGAPI
select CRYPTO_TWOFISH_COMMON
imply CRYPTO_CTR
help
Twofish cipher algorithm.
@ -1691,6 +1625,7 @@ config CRYPTO_TWOFISH_X86_64
depends on (X86 || UML_X86) && 64BIT
select CRYPTO_ALGAPI
select CRYPTO_TWOFISH_COMMON
imply CRYPTO_CTR
help
Twofish cipher algorithm (x86_64).
@ -1708,7 +1643,6 @@ config CRYPTO_TWOFISH_X86_64_3WAY
select CRYPTO_SKCIPHER
select CRYPTO_TWOFISH_COMMON
select CRYPTO_TWOFISH_X86_64
select CRYPTO_GLUE_HELPER_X86
help
Twofish cipher algorithm (x86_64, 3-way parallel).
@ -1727,11 +1661,11 @@ config CRYPTO_TWOFISH_AVX_X86_64
tristate "Twofish cipher algorithm (x86_64/AVX)"
depends on X86 && 64BIT
select CRYPTO_SKCIPHER
select CRYPTO_GLUE_HELPER_X86
select CRYPTO_SIMD
select CRYPTO_TWOFISH_COMMON
select CRYPTO_TWOFISH_X86_64
select CRYPTO_TWOFISH_X86_64_3WAY
imply CRYPTO_XTS
help
Twofish cipher algorithm (x86_64/AVX).


@ -67,9 +67,7 @@ obj-$(CONFIG_CRYPTO_XCBC) += xcbc.o
obj-$(CONFIG_CRYPTO_NULL2) += crypto_null.o
obj-$(CONFIG_CRYPTO_MD4) += md4.o
obj-$(CONFIG_CRYPTO_MD5) += md5.o
obj-$(CONFIG_CRYPTO_RMD128) += rmd128.o
obj-$(CONFIG_CRYPTO_RMD160) += rmd160.o
obj-$(CONFIG_CRYPTO_RMD256) += rmd256.o
obj-$(CONFIG_CRYPTO_RMD320) += rmd320.o
obj-$(CONFIG_CRYPTO_SHA1) += sha1_generic.o
obj-$(CONFIG_CRYPTO_SHA256) += sha256_generic.o
@ -79,7 +77,6 @@ obj-$(CONFIG_CRYPTO_SM3) += sm3_generic.o
obj-$(CONFIG_CRYPTO_STREEBOG) += streebog_generic.o
obj-$(CONFIG_CRYPTO_WP512) += wp512.o
CFLAGS_wp512.o := $(call cc-option,-fno-schedule-insns) # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79149
obj-$(CONFIG_CRYPTO_TGR192) += tgr192.o
obj-$(CONFIG_CRYPTO_BLAKE2B) += blake2b_generic.o
obj-$(CONFIG_CRYPTO_BLAKE2S) += blake2s_generic.o
obj-$(CONFIG_CRYPTO_GF128MUL) += gf128mul.o
@ -141,7 +138,6 @@ obj-$(CONFIG_CRYPTO_TEA) += tea.o
obj-$(CONFIG_CRYPTO_KHAZAD) += khazad.o
obj-$(CONFIG_CRYPTO_ANUBIS) += anubis.o
obj-$(CONFIG_CRYPTO_SEED) += seed.o
obj-$(CONFIG_CRYPTO_SALSA20) += salsa20_generic.o
obj-$(CONFIG_CRYPTO_CHACHA20) += chacha_generic.o
obj-$(CONFIG_CRYPTO_POLY1305) += poly1305_generic.o
obj-$(CONFIG_CRYPTO_DEFLATE) += deflate.o


@ -32,6 +32,7 @@
#include <crypto/b128ops.h>
#include <crypto/chacha.h>
#include <crypto/internal/cipher.h>
#include <crypto/internal/hash.h>
#include <crypto/internal/poly1305.h>
#include <crypto/internal/skcipher.h>
@ -616,3 +617,4 @@ MODULE_DESCRIPTION("Adiantum length-preserving encryption mode");
MODULE_LICENSE("GPL v2");
MODULE_AUTHOR("Eric Biggers <ebiggers@google.com>");
MODULE_ALIAS_CRYPTO("adiantum");
MODULE_IMPORT_NS(CRYPTO_INTERNAL);


@ -7,6 +7,7 @@
* (C) Neil Horman <nhorman@tuxdriver.com>
*/
#include <crypto/internal/cipher.h>
#include <crypto/internal/rng.h>
#include <linux/err.h>
#include <linux/init.h>
@ -470,3 +471,4 @@ subsys_initcall(prng_mod_init);
module_exit(prng_mod_fini);
MODULE_ALIAS_CRYPTO("stdrng");
MODULE_ALIAS_CRYPTO("ansi_cprng");
MODULE_IMPORT_NS(CRYPTO_INTERNAL);


@ -1,55 +1,27 @@
// SPDX-License-Identifier: (GPL-2.0-only OR Apache-2.0)
/*
* BLAKE2b reference source code package - reference C implementations
* Generic implementation of the BLAKE2b digest algorithm. Based on the BLAKE2b
* reference implementation, but it has been heavily modified for use in the
* kernel. The reference implementation was:
*
* Copyright 2012, Samuel Neves <sneves@dei.uc.pt>. You may use this under the
* terms of the CC0, the OpenSSL Licence, or the Apache Public License 2.0, at
* your option. The terms of these licenses can be found at:
* Copyright 2012, Samuel Neves <sneves@dei.uc.pt>. You may use this under
* the terms of the CC0, the OpenSSL Licence, or the Apache Public License
* 2.0, at your option. The terms of these licenses can be found at:
*
* - CC0 1.0 Universal : http://creativecommons.org/publicdomain/zero/1.0
* - OpenSSL license : https://www.openssl.org/source/license.html
* - Apache 2.0 : https://www.apache.org/licenses/LICENSE-2.0
*
* More information about the BLAKE2 hash function can be found at
* https://blake2.net.
*
* Note: the original sources have been modified for inclusion in linux kernel
* in terms of coding style, using generic helpers and simplifications of error
* handling.
* More information about BLAKE2 can be found at https://blake2.net.
*/
#include <asm/unaligned.h>
#include <linux/module.h>
#include <linux/string.h>
#include <linux/kernel.h>
#include <linux/bitops.h>
#include <crypto/internal/blake2b.h>
#include <crypto/internal/hash.h>
#define BLAKE2B_160_DIGEST_SIZE (160 / 8)
#define BLAKE2B_256_DIGEST_SIZE (256 / 8)
#define BLAKE2B_384_DIGEST_SIZE (384 / 8)
#define BLAKE2B_512_DIGEST_SIZE (512 / 8)
enum blake2b_constant {
BLAKE2B_BLOCKBYTES = 128,
BLAKE2B_KEYBYTES = 64,
};
struct blake2b_state {
u64 h[8];
u64 t[2];
u64 f[2];
u8 buf[BLAKE2B_BLOCKBYTES];
size_t buflen;
};
static const u64 blake2b_IV[8] = {
0x6a09e667f3bcc908ULL, 0xbb67ae8584caa73bULL,
0x3c6ef372fe94f82bULL, 0xa54ff53a5f1d36f1ULL,
0x510e527fade682d1ULL, 0x9b05688c2b3e6c1fULL,
0x1f83d9abfb41bd6bULL, 0x5be0cd19137e2179ULL
};
static const u8 blake2b_sigma[12][16] = {
{ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 },
{ 14, 10, 4, 8, 9, 15, 13, 6, 1, 12, 0, 2, 11, 7, 5, 3 },
@ -95,8 +67,8 @@ static void blake2b_increment_counter(struct blake2b_state *S, const u64 inc)
G(r,7,v[ 3],v[ 4],v[ 9],v[14]); \
} while (0)
static void blake2b_compress(struct blake2b_state *S,
const u8 block[BLAKE2B_BLOCKBYTES])
static void blake2b_compress_one_generic(struct blake2b_state *S,
const u8 block[BLAKE2B_BLOCK_SIZE])
{
u64 m[16];
u64 v[16];
@ -108,14 +80,14 @@ static void blake2b_compress(struct blake2b_state *S,
for (i = 0; i < 8; ++i)
v[i] = S->h[i];
v[ 8] = blake2b_IV[0];
v[ 9] = blake2b_IV[1];
v[10] = blake2b_IV[2];
v[11] = blake2b_IV[3];
v[12] = blake2b_IV[4] ^ S->t[0];
v[13] = blake2b_IV[5] ^ S->t[1];
v[14] = blake2b_IV[6] ^ S->f[0];
v[15] = blake2b_IV[7] ^ S->f[1];
v[ 8] = BLAKE2B_IV0;
v[ 9] = BLAKE2B_IV1;
v[10] = BLAKE2B_IV2;
v[11] = BLAKE2B_IV3;
v[12] = BLAKE2B_IV4 ^ S->t[0];
v[13] = BLAKE2B_IV5 ^ S->t[1];
v[14] = BLAKE2B_IV6 ^ S->f[0];
v[15] = BLAKE2B_IV7 ^ S->f[1];
ROUND(0);
ROUND(1);
@ -139,159 +111,54 @@ static void blake2b_compress(struct blake2b_state *S,
#undef G
#undef ROUND
struct blake2b_tfm_ctx {
u8 key[BLAKE2B_KEYBYTES];
unsigned int keylen;
};
static int blake2b_setkey(struct crypto_shash *tfm, const u8 *key,
unsigned int keylen)
void blake2b_compress_generic(struct blake2b_state *state,
const u8 *block, size_t nblocks, u32 inc)
{
struct blake2b_tfm_ctx *tctx = crypto_shash_ctx(tfm);
do {
blake2b_increment_counter(state, inc);
blake2b_compress_one_generic(state, block);
block += BLAKE2B_BLOCK_SIZE;
} while (--nblocks);
}
EXPORT_SYMBOL(blake2b_compress_generic);
if (keylen == 0 || keylen > BLAKE2B_KEYBYTES)
return -EINVAL;
memcpy(tctx->key, key, keylen);
tctx->keylen = keylen;
return 0;
static int crypto_blake2b_update_generic(struct shash_desc *desc,
const u8 *in, unsigned int inlen)
{
return crypto_blake2b_update(desc, in, inlen, blake2b_compress_generic);
}
static int blake2b_init(struct shash_desc *desc)
static int crypto_blake2b_final_generic(struct shash_desc *desc, u8 *out)
{
struct blake2b_tfm_ctx *tctx = crypto_shash_ctx(desc->tfm);
struct blake2b_state *state = shash_desc_ctx(desc);
const int digestsize = crypto_shash_digestsize(desc->tfm);
return crypto_blake2b_final(desc, out, blake2b_compress_generic);
}
memset(state, 0, sizeof(*state));
memcpy(state->h, blake2b_IV, sizeof(state->h));
/* Parameter block is all zeros except index 0, no xor for 1..7 */
state->h[0] ^= 0x01010000 | tctx->keylen << 8 | digestsize;
if (tctx->keylen) {
/*
* Prefill the buffer with the key, next call to _update or
* _final will process it
*/
memcpy(state->buf, tctx->key, tctx->keylen);
state->buflen = BLAKE2B_BLOCKBYTES;
#define BLAKE2B_ALG(name, driver_name, digest_size) \
{ \
.base.cra_name = name, \
.base.cra_driver_name = driver_name, \
.base.cra_priority = 100, \
.base.cra_flags = CRYPTO_ALG_OPTIONAL_KEY, \
.base.cra_blocksize = BLAKE2B_BLOCK_SIZE, \
.base.cra_ctxsize = sizeof(struct blake2b_tfm_ctx), \
.base.cra_module = THIS_MODULE, \
.digestsize = digest_size, \
.setkey = crypto_blake2b_setkey, \
.init = crypto_blake2b_init, \
.update = crypto_blake2b_update_generic, \
.final = crypto_blake2b_final_generic, \
.descsize = sizeof(struct blake2b_state), \
}
return 0;
}
static int blake2b_update(struct shash_desc *desc, const u8 *in,
unsigned int inlen)
{
struct blake2b_state *state = shash_desc_ctx(desc);
const size_t left = state->buflen;
const size_t fill = BLAKE2B_BLOCKBYTES - left;
if (!inlen)
return 0;
if (inlen > fill) {
state->buflen = 0;
/* Fill buffer */
memcpy(state->buf + left, in, fill);
blake2b_increment_counter(state, BLAKE2B_BLOCKBYTES);
/* Compress */
blake2b_compress(state, state->buf);
in += fill;
inlen -= fill;
while (inlen > BLAKE2B_BLOCKBYTES) {
blake2b_increment_counter(state, BLAKE2B_BLOCKBYTES);
blake2b_compress(state, in);
in += BLAKE2B_BLOCKBYTES;
inlen -= BLAKE2B_BLOCKBYTES;
}
}
memcpy(state->buf + state->buflen, in, inlen);
state->buflen += inlen;
return 0;
}
static int blake2b_final(struct shash_desc *desc, u8 *out)
{
struct blake2b_state *state = shash_desc_ctx(desc);
const int digestsize = crypto_shash_digestsize(desc->tfm);
size_t i;
blake2b_increment_counter(state, state->buflen);
/* Set last block */
state->f[0] = (u64)-1;
/* Padding */
memset(state->buf + state->buflen, 0, BLAKE2B_BLOCKBYTES - state->buflen);
blake2b_compress(state, state->buf);
/* Avoid temporary buffer and switch the internal output to LE order */
for (i = 0; i < ARRAY_SIZE(state->h); i++)
__cpu_to_le64s(&state->h[i]);
memcpy(out, state->h, digestsize);
return 0;
}
static struct shash_alg blake2b_algs[] = {
{
.base.cra_name = "blake2b-160",
.base.cra_driver_name = "blake2b-160-generic",
.base.cra_priority = 100,
.base.cra_flags = CRYPTO_ALG_OPTIONAL_KEY,
.base.cra_blocksize = BLAKE2B_BLOCKBYTES,
.base.cra_ctxsize = sizeof(struct blake2b_tfm_ctx),
.base.cra_module = THIS_MODULE,
.digestsize = BLAKE2B_160_DIGEST_SIZE,
.setkey = blake2b_setkey,
.init = blake2b_init,
.update = blake2b_update,
.final = blake2b_final,
.descsize = sizeof(struct blake2b_state),
}, {
.base.cra_name = "blake2b-256",
.base.cra_driver_name = "blake2b-256-generic",
.base.cra_priority = 100,
.base.cra_flags = CRYPTO_ALG_OPTIONAL_KEY,
.base.cra_blocksize = BLAKE2B_BLOCKBYTES,
.base.cra_ctxsize = sizeof(struct blake2b_tfm_ctx),
.base.cra_module = THIS_MODULE,
.digestsize = BLAKE2B_256_DIGEST_SIZE,
.setkey = blake2b_setkey,
.init = blake2b_init,
.update = blake2b_update,
.final = blake2b_final,
.descsize = sizeof(struct blake2b_state),
}, {
.base.cra_name = "blake2b-384",
.base.cra_driver_name = "blake2b-384-generic",
.base.cra_priority = 100,
.base.cra_flags = CRYPTO_ALG_OPTIONAL_KEY,
.base.cra_blocksize = BLAKE2B_BLOCKBYTES,
.base.cra_ctxsize = sizeof(struct blake2b_tfm_ctx),
.base.cra_module = THIS_MODULE,
.digestsize = BLAKE2B_384_DIGEST_SIZE,
.setkey = blake2b_setkey,
.init = blake2b_init,
.update = blake2b_update,
.final = blake2b_final,
.descsize = sizeof(struct blake2b_state),
}, {
.base.cra_name = "blake2b-512",
.base.cra_driver_name = "blake2b-512-generic",
.base.cra_priority = 100,
.base.cra_flags = CRYPTO_ALG_OPTIONAL_KEY,
.base.cra_blocksize = BLAKE2B_BLOCKBYTES,
.base.cra_ctxsize = sizeof(struct blake2b_tfm_ctx),
.base.cra_module = THIS_MODULE,
.digestsize = BLAKE2B_512_DIGEST_SIZE,
.setkey = blake2b_setkey,
.init = blake2b_init,
.update = blake2b_update,
.final = blake2b_final,
.descsize = sizeof(struct blake2b_state),
}
BLAKE2B_ALG("blake2b-160", "blake2b-160-generic",
BLAKE2B_160_HASH_SIZE),
BLAKE2B_ALG("blake2b-256", "blake2b-256-generic",
BLAKE2B_256_HASH_SIZE),
BLAKE2B_ALG("blake2b-384", "blake2b-384-generic",
BLAKE2B_384_HASH_SIZE),
BLAKE2B_ALG("blake2b-512", "blake2b-512-generic",
BLAKE2B_512_HASH_SIZE),
};
static int __init blake2b_mod_init(void)


@ -1,149 +1,55 @@
// SPDX-License-Identifier: GPL-2.0 OR MIT
/*
* shash interface to the generic implementation of BLAKE2s
*
* Copyright (C) 2015-2019 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
*/
#include <crypto/internal/blake2s.h>
#include <crypto/internal/simd.h>
#include <crypto/internal/hash.h>
#include <linux/types.h>
#include <linux/jump_label.h>
#include <linux/kernel.h>
#include <linux/module.h>
static int crypto_blake2s_setkey(struct crypto_shash *tfm, const u8 *key,
unsigned int keylen)
static int crypto_blake2s_update_generic(struct shash_desc *desc,
const u8 *in, unsigned int inlen)
{
struct blake2s_tfm_ctx *tctx = crypto_shash_ctx(tfm);
if (keylen == 0 || keylen > BLAKE2S_KEY_SIZE)
return -EINVAL;
memcpy(tctx->key, key, keylen);
tctx->keylen = keylen;
return 0;
return crypto_blake2s_update(desc, in, inlen, blake2s_compress_generic);
}
static int crypto_blake2s_init(struct shash_desc *desc)
static int crypto_blake2s_final_generic(struct shash_desc *desc, u8 *out)
{
struct blake2s_tfm_ctx *tctx = crypto_shash_ctx(desc->tfm);
struct blake2s_state *state = shash_desc_ctx(desc);
const int outlen = crypto_shash_digestsize(desc->tfm);
if (tctx->keylen)
blake2s_init_key(state, outlen, tctx->key, tctx->keylen);
else
blake2s_init(state, outlen);
return 0;
return crypto_blake2s_final(desc, out, blake2s_compress_generic);
}
static int crypto_blake2s_update(struct shash_desc *desc, const u8 *in,
unsigned int inlen)
{
struct blake2s_state *state = shash_desc_ctx(desc);
const size_t fill = BLAKE2S_BLOCK_SIZE - state->buflen;
if (unlikely(!inlen))
return 0;
if (inlen > fill) {
memcpy(state->buf + state->buflen, in, fill);
blake2s_compress_generic(state, state->buf, 1, BLAKE2S_BLOCK_SIZE);
state->buflen = 0;
in += fill;
inlen -= fill;
#define BLAKE2S_ALG(name, driver_name, digest_size) \
{ \
.base.cra_name = name, \
.base.cra_driver_name = driver_name, \
.base.cra_priority = 100, \
.base.cra_flags = CRYPTO_ALG_OPTIONAL_KEY, \
.base.cra_blocksize = BLAKE2S_BLOCK_SIZE, \
.base.cra_ctxsize = sizeof(struct blake2s_tfm_ctx), \
.base.cra_module = THIS_MODULE, \
.digestsize = digest_size, \
.setkey = crypto_blake2s_setkey, \
.init = crypto_blake2s_init, \
.update = crypto_blake2s_update_generic, \
.final = crypto_blake2s_final_generic, \
.descsize = sizeof(struct blake2s_state), \
}
if (inlen > BLAKE2S_BLOCK_SIZE) {
const size_t nblocks = DIV_ROUND_UP(inlen, BLAKE2S_BLOCK_SIZE);
/* Hash one less (full) block than strictly possible */
blake2s_compress_generic(state, in, nblocks - 1, BLAKE2S_BLOCK_SIZE);
in += BLAKE2S_BLOCK_SIZE * (nblocks - 1);
inlen -= BLAKE2S_BLOCK_SIZE * (nblocks - 1);
}
memcpy(state->buf + state->buflen, in, inlen);
state->buflen += inlen;
return 0;
}
static int crypto_blake2s_final(struct shash_desc *desc, u8 *out)
{
struct blake2s_state *state = shash_desc_ctx(desc);
blake2s_set_lastblock(state);
memset(state->buf + state->buflen, 0,
BLAKE2S_BLOCK_SIZE - state->buflen); /* Padding */
blake2s_compress_generic(state, state->buf, 1, state->buflen);
cpu_to_le32_array(state->h, ARRAY_SIZE(state->h));
memcpy(out, state->h, state->outlen);
memzero_explicit(state, sizeof(*state));
return 0;
}
static struct shash_alg blake2s_algs[] = {{
.base.cra_name = "blake2s-128",
.base.cra_driver_name = "blake2s-128-generic",
.base.cra_flags = CRYPTO_ALG_OPTIONAL_KEY,
.base.cra_ctxsize = sizeof(struct blake2s_tfm_ctx),
.base.cra_priority = 200,
.base.cra_blocksize = BLAKE2S_BLOCK_SIZE,
.base.cra_module = THIS_MODULE,
.digestsize = BLAKE2S_128_HASH_SIZE,
.setkey = crypto_blake2s_setkey,
.init = crypto_blake2s_init,
.update = crypto_blake2s_update,
.final = crypto_blake2s_final,
.descsize = sizeof(struct blake2s_state),
}, {
.base.cra_name = "blake2s-160",
.base.cra_driver_name = "blake2s-160-generic",
.base.cra_flags = CRYPTO_ALG_OPTIONAL_KEY,
.base.cra_ctxsize = sizeof(struct blake2s_tfm_ctx),
.base.cra_priority = 200,
.base.cra_blocksize = BLAKE2S_BLOCK_SIZE,
.base.cra_module = THIS_MODULE,
.digestsize = BLAKE2S_160_HASH_SIZE,
.setkey = crypto_blake2s_setkey,
.init = crypto_blake2s_init,
.update = crypto_blake2s_update,
.final = crypto_blake2s_final,
.descsize = sizeof(struct blake2s_state),
}, {
.base.cra_name = "blake2s-224",
.base.cra_driver_name = "blake2s-224-generic",
.base.cra_flags = CRYPTO_ALG_OPTIONAL_KEY,
.base.cra_ctxsize = sizeof(struct blake2s_tfm_ctx),
.base.cra_priority = 200,
.base.cra_blocksize = BLAKE2S_BLOCK_SIZE,
.base.cra_module = THIS_MODULE,
.digestsize = BLAKE2S_224_HASH_SIZE,
.setkey = crypto_blake2s_setkey,
.init = crypto_blake2s_init,
.update = crypto_blake2s_update,
.final = crypto_blake2s_final,
.descsize = sizeof(struct blake2s_state),
}, {
.base.cra_name = "blake2s-256",
.base.cra_driver_name = "blake2s-256-generic",
.base.cra_flags = CRYPTO_ALG_OPTIONAL_KEY,
.base.cra_ctxsize = sizeof(struct blake2s_tfm_ctx),
.base.cra_priority = 200,
.base.cra_blocksize = BLAKE2S_BLOCK_SIZE,
.base.cra_module = THIS_MODULE,
.digestsize = BLAKE2S_256_HASH_SIZE,
.setkey = crypto_blake2s_setkey,
.init = crypto_blake2s_init,
.update = crypto_blake2s_update,
.final = crypto_blake2s_final,
.descsize = sizeof(struct blake2s_state),
}};
static struct shash_alg blake2s_algs[] = {
BLAKE2S_ALG("blake2s-128", "blake2s-128-generic",
BLAKE2S_128_HASH_SIZE),
BLAKE2S_ALG("blake2s-160", "blake2s-160-generic",
BLAKE2S_160_HASH_SIZE),
BLAKE2S_ALG("blake2s-224", "blake2s-224-generic",
BLAKE2S_224_HASH_SIZE),
BLAKE2S_ALG("blake2s-256", "blake2s-256-generic",
BLAKE2S_256_HASH_SIZE),
};
static int __init blake2s_mod_init(void)
{
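A note on the conversion above: the buffering and finalisation logic that used to live in this file has moved into shared helpers, crypto_blake2s_update() and crypto_blake2s_final(), which take the compress routine as a parameter; the generic wrappers simply pass blake2s_compress_generic. A minimal sketch of how an architecture-specific implementation could reuse the same glue, assuming a hypothetical blake2s_compress_arch() provided elsewhere (illustrative only, not part of this diff):
/*
 * Sketch only: reuse the shared shash glue with a different compress
 * routine. blake2s_compress_arch() is a hypothetical name, assumed to
 * have the same signature as blake2s_compress_generic().
 */
#include <crypto/internal/blake2s.h>
#include <crypto/internal/hash.h>
void blake2s_compress_arch(struct blake2s_state *state,
			   const u8 *block, size_t nblocks, u32 inc);
static int crypto_blake2s_update_arch(struct shash_desc *desc,
				      const u8 *in, unsigned int inlen)
{
	return crypto_blake2s_update(desc, in, inlen, blake2s_compress_arch);
}
static int crypto_blake2s_final_arch(struct shash_desc *desc, u8 *out)
{
	return crypto_blake2s_final(desc, out, blake2s_compress_arch);
}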


@ -14,7 +14,7 @@
#include <linux/init.h>
#include <linux/module.h>
#include <linux/mm.h>
#include <asm/byteorder.h>
#include <asm/unaligned.h>
#include <linux/crypto.h>
#include <linux/types.h>
#include <crypto/blowfish.h>
@ -36,12 +36,10 @@
static void bf_encrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
{
struct bf_ctx *ctx = crypto_tfm_ctx(tfm);
const __be32 *in_blk = (const __be32 *)src;
__be32 *const out_blk = (__be32 *)dst;
const u32 *P = ctx->p;
const u32 *S = ctx->s;
u32 yl = be32_to_cpu(in_blk[0]);
u32 yr = be32_to_cpu(in_blk[1]);
u32 yl = get_unaligned_be32(src);
u32 yr = get_unaligned_be32(src + 4);
ROUND(yr, yl, 0);
ROUND(yl, yr, 1);
@ -63,19 +61,17 @@ static void bf_encrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
yl ^= P[16];
yr ^= P[17];
out_blk[0] = cpu_to_be32(yr);
out_blk[1] = cpu_to_be32(yl);
put_unaligned_be32(yr, dst);
put_unaligned_be32(yl, dst + 4);
}
static void bf_decrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
{
struct bf_ctx *ctx = crypto_tfm_ctx(tfm);
const __be32 *in_blk = (const __be32 *)src;
__be32 *const out_blk = (__be32 *)dst;
const u32 *P = ctx->p;
const u32 *S = ctx->s;
u32 yl = be32_to_cpu(in_blk[0]);
u32 yr = be32_to_cpu(in_blk[1]);
u32 yl = get_unaligned_be32(src);
u32 yr = get_unaligned_be32(src + 4);
ROUND(yr, yl, 17);
ROUND(yl, yr, 16);
@ -97,8 +93,8 @@ static void bf_decrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
yl ^= P[1];
yr ^= P[0];
out_blk[0] = cpu_to_be32(yr);
out_blk[1] = cpu_to_be32(yl);
put_unaligned_be32(yr, dst);
put_unaligned_be32(yl, dst + 4);
}
static struct crypto_alg alg = {
@ -108,7 +104,6 @@ static struct crypto_alg alg = {
.cra_flags = CRYPTO_ALG_TYPE_CIPHER,
.cra_blocksize = BF_BLOCK_SIZE,
.cra_ctxsize = sizeof(struct bf_ctx),
.cra_alignmask = 3,
.cra_module = THIS_MODULE,
.cra_u = { .cipher = {
.cia_min_keysize = BF_MIN_KEY_SIZE,
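The blowfish conversion above sets the pattern repeated for camellia, cast5, cast6 and michael_mic further down: loads and stores through casted __be32/__le32 pointers are replaced with the unaligned accessors from <asm/unaligned.h>, which work for any buffer alignment and therefore let each algorithm drop its .cra_alignmask. A minimal side-by-side sketch of the two styles (illustrative only, not taken from any one file here):
#include <asm/byteorder.h>
#include <asm/unaligned.h>
#include <linux/types.h>
/* Old style: only safe when the crypto API pre-aligns the buffer,
 * which is what .cra_alignmask = 3 used to request. */
static inline u32 load_be32_aligned(const u8 *src)
{
	return be32_to_cpu(*(const __be32 *)src);
}
/* New style: correct for any alignment, so no cra_alignmask is needed. */
static inline u32 load_be32_unaligned(const u8 *src)
{
	return get_unaligned_be32(src);
}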


@ -9,14 +9,6 @@
* https://info.isl.ntt.co.jp/crypt/eng/camellia/specifications.html
*/
/*
*
* NOTE --- NOTE --- NOTE --- NOTE
* This implementation assumes that all memory addresses passed
* as parameters are four-byte aligned.
*
*/
#include <linux/crypto.h>
#include <linux/errno.h>
#include <linux/init.h>
@ -994,16 +986,14 @@ camellia_set_key(struct crypto_tfm *tfm, const u8 *in_key,
static void camellia_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
{
const struct camellia_ctx *cctx = crypto_tfm_ctx(tfm);
const __be32 *src = (const __be32 *)in;
__be32 *dst = (__be32 *)out;
unsigned int max;
u32 tmp[4];
tmp[0] = be32_to_cpu(src[0]);
tmp[1] = be32_to_cpu(src[1]);
tmp[2] = be32_to_cpu(src[2]);
tmp[3] = be32_to_cpu(src[3]);
tmp[0] = get_unaligned_be32(in);
tmp[1] = get_unaligned_be32(in + 4);
tmp[2] = get_unaligned_be32(in + 8);
tmp[3] = get_unaligned_be32(in + 12);
if (cctx->key_length == 16)
max = 24;
@ -1013,25 +1003,23 @@ static void camellia_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
camellia_do_encrypt(cctx->key_table, tmp, max);
/* do_encrypt returns 0,1 swapped with 2,3 */
dst[0] = cpu_to_be32(tmp[2]);
dst[1] = cpu_to_be32(tmp[3]);
dst[2] = cpu_to_be32(tmp[0]);
dst[3] = cpu_to_be32(tmp[1]);
put_unaligned_be32(tmp[2], out);
put_unaligned_be32(tmp[3], out + 4);
put_unaligned_be32(tmp[0], out + 8);
put_unaligned_be32(tmp[1], out + 12);
}
static void camellia_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
{
const struct camellia_ctx *cctx = crypto_tfm_ctx(tfm);
const __be32 *src = (const __be32 *)in;
__be32 *dst = (__be32 *)out;
unsigned int max;
u32 tmp[4];
tmp[0] = be32_to_cpu(src[0]);
tmp[1] = be32_to_cpu(src[1]);
tmp[2] = be32_to_cpu(src[2]);
tmp[3] = be32_to_cpu(src[3]);
tmp[0] = get_unaligned_be32(in);
tmp[1] = get_unaligned_be32(in + 4);
tmp[2] = get_unaligned_be32(in + 8);
tmp[3] = get_unaligned_be32(in + 12);
if (cctx->key_length == 16)
max = 24;
@ -1041,10 +1029,10 @@ static void camellia_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
camellia_do_decrypt(cctx->key_table, tmp, max);
/* do_decrypt returns 0,1 swapped with 2,3 */
dst[0] = cpu_to_be32(tmp[2]);
dst[1] = cpu_to_be32(tmp[3]);
dst[2] = cpu_to_be32(tmp[0]);
dst[3] = cpu_to_be32(tmp[1]);
put_unaligned_be32(tmp[2], out);
put_unaligned_be32(tmp[3], out + 4);
put_unaligned_be32(tmp[0], out + 8);
put_unaligned_be32(tmp[1], out + 12);
}
static struct crypto_alg camellia_alg = {
@ -1054,7 +1042,6 @@ static struct crypto_alg camellia_alg = {
.cra_flags = CRYPTO_ALG_TYPE_CIPHER,
.cra_blocksize = CAMELLIA_BLOCK_SIZE,
.cra_ctxsize = sizeof(struct camellia_ctx),
.cra_alignmask = 3,
.cra_module = THIS_MODULE,
.cra_u = {
.cipher = {


@ -13,7 +13,7 @@
*/
#include <asm/byteorder.h>
#include <asm/unaligned.h>
#include <linux/init.h>
#include <linux/crypto.h>
#include <linux/module.h>
@ -302,8 +302,6 @@ static const u32 sb8[256] = {
void __cast5_encrypt(struct cast5_ctx *c, u8 *outbuf, const u8 *inbuf)
{
const __be32 *src = (const __be32 *)inbuf;
__be32 *dst = (__be32 *)outbuf;
u32 l, r, t;
u32 I; /* used by the Fx macros */
u32 *Km;
@ -315,8 +313,8 @@ void __cast5_encrypt(struct cast5_ctx *c, u8 *outbuf, const u8 *inbuf)
/* (L0,R0) <-- (m1...m64). (Split the plaintext into left and
* right 32-bit halves L0 = m1...m32 and R0 = m33...m64.)
*/
l = be32_to_cpu(src[0]);
r = be32_to_cpu(src[1]);
l = get_unaligned_be32(inbuf);
r = get_unaligned_be32(inbuf + 4);
/* (16 rounds) for i from 1 to 16, compute Li and Ri as follows:
* Li = Ri-1;
@ -347,8 +345,8 @@ void __cast5_encrypt(struct cast5_ctx *c, u8 *outbuf, const u8 *inbuf)
/* c1...c64 <-- (R16,L16). (Exchange final blocks L16, R16 and
* concatenate to form the ciphertext.) */
dst[0] = cpu_to_be32(r);
dst[1] = cpu_to_be32(l);
put_unaligned_be32(r, outbuf);
put_unaligned_be32(l, outbuf + 4);
}
EXPORT_SYMBOL_GPL(__cast5_encrypt);
@ -359,8 +357,6 @@ static void cast5_encrypt(struct crypto_tfm *tfm, u8 *outbuf, const u8 *inbuf)
void __cast5_decrypt(struct cast5_ctx *c, u8 *outbuf, const u8 *inbuf)
{
const __be32 *src = (const __be32 *)inbuf;
__be32 *dst = (__be32 *)outbuf;
u32 l, r, t;
u32 I;
u32 *Km;
@ -369,8 +365,8 @@ void __cast5_decrypt(struct cast5_ctx *c, u8 *outbuf, const u8 *inbuf)
Km = c->Km;
Kr = c->Kr;
l = be32_to_cpu(src[0]);
r = be32_to_cpu(src[1]);
l = get_unaligned_be32(inbuf);
r = get_unaligned_be32(inbuf + 4);
if (!(c->rr)) {
t = l; l = r; r = t ^ F1(r, Km[15], Kr[15]);
@ -391,8 +387,8 @@ void __cast5_decrypt(struct cast5_ctx *c, u8 *outbuf, const u8 *inbuf)
t = l; l = r; r = t ^ F2(r, Km[1], Kr[1]);
t = l; l = r; r = t ^ F1(r, Km[0], Kr[0]);
dst[0] = cpu_to_be32(r);
dst[1] = cpu_to_be32(l);
put_unaligned_be32(r, outbuf);
put_unaligned_be32(l, outbuf + 4);
}
EXPORT_SYMBOL_GPL(__cast5_decrypt);
@ -513,7 +509,6 @@ static struct crypto_alg alg = {
.cra_flags = CRYPTO_ALG_TYPE_CIPHER,
.cra_blocksize = CAST5_BLOCK_SIZE,
.cra_ctxsize = sizeof(struct cast5_ctx),
.cra_alignmask = 3,
.cra_module = THIS_MODULE,
.cra_u = {
.cipher = {


@ -10,7 +10,7 @@
*/
#include <asm/byteorder.h>
#include <asm/unaligned.h>
#include <linux/init.h>
#include <linux/crypto.h>
#include <linux/module.h>
@ -172,16 +172,14 @@ static inline void QBAR(u32 *block, const u8 *Kr, const u32 *Km)
void __cast6_encrypt(const void *ctx, u8 *outbuf, const u8 *inbuf)
{
const struct cast6_ctx *c = ctx;
const __be32 *src = (const __be32 *)inbuf;
__be32 *dst = (__be32 *)outbuf;
u32 block[4];
const u32 *Km;
const u8 *Kr;
block[0] = be32_to_cpu(src[0]);
block[1] = be32_to_cpu(src[1]);
block[2] = be32_to_cpu(src[2]);
block[3] = be32_to_cpu(src[3]);
block[0] = get_unaligned_be32(inbuf);
block[1] = get_unaligned_be32(inbuf + 4);
block[2] = get_unaligned_be32(inbuf + 8);
block[3] = get_unaligned_be32(inbuf + 12);
Km = c->Km[0]; Kr = c->Kr[0]; Q(block, Kr, Km);
Km = c->Km[1]; Kr = c->Kr[1]; Q(block, Kr, Km);
@ -196,10 +194,10 @@ void __cast6_encrypt(const void *ctx, u8 *outbuf, const u8 *inbuf)
Km = c->Km[10]; Kr = c->Kr[10]; QBAR(block, Kr, Km);
Km = c->Km[11]; Kr = c->Kr[11]; QBAR(block, Kr, Km);
dst[0] = cpu_to_be32(block[0]);
dst[1] = cpu_to_be32(block[1]);
dst[2] = cpu_to_be32(block[2]);
dst[3] = cpu_to_be32(block[3]);
put_unaligned_be32(block[0], outbuf);
put_unaligned_be32(block[1], outbuf + 4);
put_unaligned_be32(block[2], outbuf + 8);
put_unaligned_be32(block[3], outbuf + 12);
}
EXPORT_SYMBOL_GPL(__cast6_encrypt);
@ -211,16 +209,14 @@ static void cast6_encrypt(struct crypto_tfm *tfm, u8 *outbuf, const u8 *inbuf)
void __cast6_decrypt(const void *ctx, u8 *outbuf, const u8 *inbuf)
{
const struct cast6_ctx *c = ctx;
const __be32 *src = (const __be32 *)inbuf;
__be32 *dst = (__be32 *)outbuf;
u32 block[4];
const u32 *Km;
const u8 *Kr;
block[0] = be32_to_cpu(src[0]);
block[1] = be32_to_cpu(src[1]);
block[2] = be32_to_cpu(src[2]);
block[3] = be32_to_cpu(src[3]);
block[0] = get_unaligned_be32(inbuf);
block[1] = get_unaligned_be32(inbuf + 4);
block[2] = get_unaligned_be32(inbuf + 8);
block[3] = get_unaligned_be32(inbuf + 12);
Km = c->Km[11]; Kr = c->Kr[11]; Q(block, Kr, Km);
Km = c->Km[10]; Kr = c->Kr[10]; Q(block, Kr, Km);
@ -235,10 +231,10 @@ void __cast6_decrypt(const void *ctx, u8 *outbuf, const u8 *inbuf)
Km = c->Km[1]; Kr = c->Kr[1]; QBAR(block, Kr, Km);
Km = c->Km[0]; Kr = c->Kr[0]; QBAR(block, Kr, Km);
dst[0] = cpu_to_be32(block[0]);
dst[1] = cpu_to_be32(block[1]);
dst[2] = cpu_to_be32(block[2]);
dst[3] = cpu_to_be32(block[3]);
put_unaligned_be32(block[0], outbuf);
put_unaligned_be32(block[1], outbuf + 4);
put_unaligned_be32(block[2], outbuf + 8);
put_unaligned_be32(block[3], outbuf + 12);
}
EXPORT_SYMBOL_GPL(__cast6_decrypt);
@ -254,7 +250,6 @@ static struct crypto_alg alg = {
.cra_flags = CRYPTO_ALG_TYPE_CIPHER,
.cra_blocksize = CAST6_BLOCK_SIZE,
.cra_ctxsize = sizeof(struct cast6_ctx),
.cra_alignmask = 3,
.cra_module = THIS_MODULE,
.cra_u = {
.cipher = {


@ -6,6 +6,7 @@
*/
#include <crypto/algapi.h>
#include <crypto/internal/cipher.h>
#include <crypto/internal/skcipher.h>
#include <linux/err.h>
#include <linux/init.h>


@ -6,6 +6,7 @@
*/
#include <crypto/internal/aead.h>
#include <crypto/internal/cipher.h>
#include <crypto/internal/hash.h>
#include <crypto/internal/skcipher.h>
#include <crypto/scatterwalk.h>
@ -954,3 +955,4 @@ MODULE_ALIAS_CRYPTO("ccm_base");
MODULE_ALIAS_CRYPTO("rfc4309");
MODULE_ALIAS_CRYPTO("ccm");
MODULE_ALIAS_CRYPTO("cbcmac");
MODULE_IMPORT_NS(CRYPTO_INTERNAL);


@ -20,6 +20,7 @@
*/
#include <crypto/algapi.h>
#include <crypto/internal/cipher.h>
#include <crypto/internal/skcipher.h>
#include <linux/err.h>
#include <linux/init.h>
@ -250,3 +251,4 @@ module_exit(crypto_cfb_module_exit);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("CFB block cipher mode of operation");
MODULE_ALIAS_CRYPTO("cfb");
MODULE_IMPORT_NS(CRYPTO_INTERNAL);


@ -9,6 +9,7 @@
*/
#include <crypto/algapi.h>
#include <crypto/internal/cipher.h>
#include <linux/kernel.h>
#include <linux/crypto.h>
#include <linux/errno.h>
@ -53,7 +54,7 @@ int crypto_cipher_setkey(struct crypto_cipher *tfm,
return cia->cia_setkey(crypto_cipher_tfm(tfm), key, keylen);
}
EXPORT_SYMBOL_GPL(crypto_cipher_setkey);
EXPORT_SYMBOL_NS_GPL(crypto_cipher_setkey, CRYPTO_INTERNAL);
static inline void cipher_crypt_one(struct crypto_cipher *tfm,
u8 *dst, const u8 *src, bool enc)
@ -81,11 +82,11 @@ void crypto_cipher_encrypt_one(struct crypto_cipher *tfm,
{
cipher_crypt_one(tfm, dst, src, true);
}
EXPORT_SYMBOL_GPL(crypto_cipher_encrypt_one);
EXPORT_SYMBOL_NS_GPL(crypto_cipher_encrypt_one, CRYPTO_INTERNAL);
void crypto_cipher_decrypt_one(struct crypto_cipher *tfm,
u8 *dst, const u8 *src)
{
cipher_crypt_one(tfm, dst, src, false);
}
EXPORT_SYMBOL_GPL(crypto_cipher_decrypt_one);
EXPORT_SYMBOL_NS_GPL(crypto_cipher_decrypt_one, CRYPTO_INTERNAL);
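With crypto_cipher_setkey(), crypto_cipher_encrypt_one() and crypto_cipher_decrypt_one() now exported via EXPORT_SYMBOL_NS_GPL() into the CRYPTO_INTERNAL namespace, any module that still uses struct crypto_cipher has to include <crypto/internal/cipher.h> and declare MODULE_IMPORT_NS(CRYPTO_INTERNAL), which is exactly what the surrounding hunks add to adiantum, ccm, cmac, drbg, essiv and the other remaining users. A minimal sketch of what such a user looks like after this change (the module itself is hypothetical, not part of this series):
#include <crypto/internal/cipher.h>
#include <linux/err.h>
#include <linux/module.h>
static int __init cipher_ns_example_init(void)
{
	static const u8 key[16]; /* all-zero demo key */
	struct crypto_cipher *tfm;
	u8 block[16] = { 0 };
	int err;
	tfm = crypto_alloc_cipher("aes", 0, 0);
	if (IS_ERR(tfm))
		return PTR_ERR(tfm);
	/* setkey/encrypt_one resolve from the CRYPTO_INTERNAL namespace,
	 * imported below. */
	err = crypto_cipher_setkey(tfm, key, sizeof(key));
	if (!err)
		crypto_cipher_encrypt_one(tfm, block, block);
	crypto_free_cipher(tfm);
	return err;
}
module_init(cipher_ns_example_init);
MODULE_DESCRIPTION("Hypothetical crypto_cipher user");
MODULE_LICENSE("GPL");
MODULE_IMPORT_NS(CRYPTO_INTERNAL);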


@ -11,6 +11,7 @@
* Author: Kazunori Miyazawa <miyazawa@linux-ipv6.org>
*/
#include <crypto/internal/cipher.h>
#include <crypto/internal/hash.h>
#include <linux/err.h>
#include <linux/kernel.h>
@ -313,3 +314,4 @@ module_exit(crypto_cmac_module_exit);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("CMAC keyed hash algorithm");
MODULE_ALIAS_CRYPTO("cmac");
MODULE_IMPORT_NS(CRYPTO_INTERNAL);


@ -7,6 +7,7 @@
#include <crypto/algapi.h>
#include <crypto/ctr.h>
#include <crypto/internal/cipher.h>
#include <crypto/internal/skcipher.h>
#include <linux/err.h>
#include <linux/init.h>
@ -358,3 +359,4 @@ MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("CTR block cipher mode of operation");
MODULE_ALIAS_CRYPTO("rfc3686");
MODULE_ALIAS_CRYPTO("ctr");
MODULE_IMPORT_NS(CRYPTO_INTERNAL);


@ -98,6 +98,7 @@
*/
#include <crypto/drbg.h>
#include <crypto/internal/cipher.h>
#include <linux/kernel.h>
/***************************************************************
@ -2161,3 +2162,4 @@ MODULE_DESCRIPTION("NIST SP800-90A Deterministic Random Bit Generator (DRBG) "
CRYPTO_DRBG_HMAC_STRING
CRYPTO_DRBG_CTR_STRING);
MODULE_ALIAS_CRYPTO("stdrng");
MODULE_IMPORT_NS(CRYPTO_INTERNAL);


@ -6,6 +6,7 @@
*/
#include <crypto/algapi.h>
#include <crypto/internal/cipher.h>
#include <crypto/internal/skcipher.h>
#include <linux/err.h>
#include <linux/init.h>

View File

@ -67,6 +67,9 @@ int crypto_ecdh_decode_key(const char *buf, unsigned int len,
if (secret.type != CRYPTO_KPP_SECRET_TYPE_ECDH)
return -EINVAL;
if (unlikely(len < secret.len))
return -EINVAL;
ptr = ecdh_unpack_data(&params->curve_id, ptr, sizeof(params->curve_id));
ptr = ecdh_unpack_data(&params->key_size, ptr, sizeof(params->key_size));
if (secret.len != crypto_ecdh_key_len(params))


@ -30,6 +30,7 @@
#include <crypto/authenc.h>
#include <crypto/internal/aead.h>
#include <crypto/internal/cipher.h>
#include <crypto/internal/hash.h>
#include <crypto/internal/skcipher.h>
#include <crypto/scatterwalk.h>
@ -643,3 +644,4 @@ module_exit(essiv_module_exit);
MODULE_DESCRIPTION("ESSIV skcipher/aead wrapper for block encryption");
MODULE_LICENSE("GPL v2");
MODULE_ALIAS_CRYPTO("essiv");
MODULE_IMPORT_NS(CRYPTO_INTERNAL);


@ -396,7 +396,6 @@ static struct crypto_alg fcrypt_alg = {
.cra_blocksize = 8,
.cra_ctxsize = sizeof(struct fcrypt_ctx),
.cra_module = THIS_MODULE,
.cra_alignmask = 3,
.cra_u = { .cipher = {
.cia_min_keysize = 8,
.cia_max_keysize = 8,


@ -85,6 +85,7 @@
#include <linux/crypto.h>
#include <linux/scatterlist.h>
#include <crypto/scatterwalk.h>
#include <crypto/internal/cipher.h>
#include <crypto/internal/skcipher.h>
struct crypto_kw_block {
@ -316,3 +317,4 @@ MODULE_LICENSE("Dual BSD/GPL");
MODULE_AUTHOR("Stephan Mueller <smueller@chronox.de>");
MODULE_DESCRIPTION("Key Wrapping (RFC3394 / NIST SP800-38F)");
MODULE_ALIAS_CRYPTO("kw");
MODULE_IMPORT_NS(CRYPTO_INTERNAL);


@ -7,7 +7,7 @@
* Copyright (c) 2004 Jouni Malinen <j@w1.fi>
*/
#include <crypto/internal/hash.h>
#include <asm/byteorder.h>
#include <asm/unaligned.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/string.h>
@ -19,7 +19,7 @@ struct michael_mic_ctx {
};
struct michael_mic_desc_ctx {
u8 pending[4];
__le32 pending;
size_t pending_len;
u32 l, r;
@ -60,13 +60,12 @@ static int michael_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
struct michael_mic_desc_ctx *mctx = shash_desc_ctx(desc);
const __le32 *src;
if (mctx->pending_len) {
int flen = 4 - mctx->pending_len;
if (flen > len)
flen = len;
memcpy(&mctx->pending[mctx->pending_len], data, flen);
memcpy((u8 *)&mctx->pending + mctx->pending_len, data, flen);
mctx->pending_len += flen;
data += flen;
len -= flen;
@ -74,23 +73,21 @@ static int michael_update(struct shash_desc *desc, const u8 *data,
if (mctx->pending_len < 4)
return 0;
src = (const __le32 *)mctx->pending;
mctx->l ^= le32_to_cpup(src);
mctx->l ^= le32_to_cpu(mctx->pending);
michael_block(mctx->l, mctx->r);
mctx->pending_len = 0;
}
src = (const __le32 *)data;
while (len >= 4) {
mctx->l ^= le32_to_cpup(src++);
mctx->l ^= get_unaligned_le32(data);
michael_block(mctx->l, mctx->r);
data += 4;
len -= 4;
}
if (len > 0) {
mctx->pending_len = len;
memcpy(mctx->pending, src, len);
memcpy(&mctx->pending, data, len);
}
return 0;
@ -100,8 +97,7 @@ static int michael_update(struct shash_desc *desc, const u8 *data,
static int michael_final(struct shash_desc *desc, u8 *out)
{
struct michael_mic_desc_ctx *mctx = shash_desc_ctx(desc);
u8 *data = mctx->pending;
__le32 *dst = (__le32 *)out;
u8 *data = (u8 *)&mctx->pending;
/* Last block and padding (0x5a, 4..7 x 0) */
switch (mctx->pending_len) {
@ -123,8 +119,8 @@ static int michael_final(struct shash_desc *desc, u8 *out)
/* l ^= 0; */
michael_block(mctx->l, mctx->r);
dst[0] = cpu_to_le32(mctx->l);
dst[1] = cpu_to_le32(mctx->r);
put_unaligned_le32(mctx->l, out);
put_unaligned_le32(mctx->r, out + 4);
return 0;
}
@ -135,13 +131,11 @@ static int michael_setkey(struct crypto_shash *tfm, const u8 *key,
{
struct michael_mic_ctx *mctx = crypto_shash_ctx(tfm);
const __le32 *data = (const __le32 *)key;
if (keylen != 8)
return -EINVAL;
mctx->l = le32_to_cpu(data[0]);
mctx->r = le32_to_cpu(data[1]);
mctx->l = get_unaligned_le32(key);
mctx->r = get_unaligned_le32(key + 4);
return 0;
}
@ -156,7 +150,6 @@ static struct shash_alg alg = {
.cra_name = "michael_mic",
.cra_driver_name = "michael_mic-generic",
.cra_blocksize = 8,
.cra_alignmask = 3,
.cra_ctxsize = sizeof(struct michael_mic_ctx),
.cra_module = THIS_MODULE,
}


@ -8,6 +8,7 @@
*/
#include <crypto/algapi.h>
#include <crypto/internal/cipher.h>
#include <crypto/internal/skcipher.h>
#include <linux/err.h>
#include <linux/init.h>
@ -102,3 +103,4 @@ module_exit(crypto_ofb_module_exit);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("OFB block cipher mode of operation");
MODULE_ALIAS_CRYPTO("ofb");
MODULE_IMPORT_NS(CRYPTO_INTERNAL);


@ -10,6 +10,7 @@
*/
#include <crypto/algapi.h>
#include <crypto/internal/cipher.h>
#include <crypto/internal/skcipher.h>
#include <linux/err.h>
#include <linux/init.h>
@ -191,3 +192,4 @@ module_exit(crypto_pcbc_module_exit);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("PCBC block cipher mode of operation");
MODULE_ALIAS_CRYPTO("pcbc");
MODULE_IMPORT_NS(CRYPTO_INTERNAL);


@ -6,29 +6,15 @@
#ifndef _CRYPTO_RMD_H
#define _CRYPTO_RMD_H
#define RMD128_DIGEST_SIZE 16
#define RMD128_BLOCK_SIZE 64
#define RMD160_DIGEST_SIZE 20
#define RMD160_BLOCK_SIZE 64
#define RMD256_DIGEST_SIZE 32
#define RMD256_BLOCK_SIZE 64
#define RMD320_DIGEST_SIZE 40
#define RMD320_BLOCK_SIZE 64
/* initial values */
#define RMD_H0 0x67452301UL
#define RMD_H1 0xefcdab89UL
#define RMD_H2 0x98badcfeUL
#define RMD_H3 0x10325476UL
#define RMD_H4 0xc3d2e1f0UL
#define RMD_H5 0x76543210UL
#define RMD_H6 0xfedcba98UL
#define RMD_H7 0x89abcdefUL
#define RMD_H8 0x01234567UL
#define RMD_H9 0x3c2d1e0fUL
/* constants */
#define RMD_K1 0x00000000UL


@ -1,323 +0,0 @@
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* Cryptographic API.
*
* RIPEMD-128 - RACE Integrity Primitives Evaluation Message Digest.
*
* Based on the reference implementation by Antoon Bosselaers, ESAT-COSIC
*
* Copyright (c) 2008 Adrian-Ken Rueegsegger <ken@codelabs.ch>
*/
#include <crypto/internal/hash.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/mm.h>
#include <linux/types.h>
#include <asm/byteorder.h>
#include "ripemd.h"
struct rmd128_ctx {
u64 byte_count;
u32 state[4];
__le32 buffer[16];
};
#define K1 RMD_K1
#define K2 RMD_K2
#define K3 RMD_K3
#define K4 RMD_K4
#define KK1 RMD_K6
#define KK2 RMD_K7
#define KK3 RMD_K8
#define KK4 RMD_K1
#define F1(x, y, z) (x ^ y ^ z) /* XOR */
#define F2(x, y, z) (z ^ (x & (y ^ z))) /* x ? y : z */
#define F3(x, y, z) ((x | ~y) ^ z)
#define F4(x, y, z) (y ^ (z & (x ^ y))) /* z ? x : y */
#define ROUND(a, b, c, d, f, k, x, s) { \
(a) += f((b), (c), (d)) + le32_to_cpup(&(x)) + (k); \
(a) = rol32((a), (s)); \
}
static void rmd128_transform(u32 *state, const __le32 *in)
{
u32 aa, bb, cc, dd, aaa, bbb, ccc, ddd;
/* Initialize left lane */
aa = state[0];
bb = state[1];
cc = state[2];
dd = state[3];
/* Initialize right lane */
aaa = state[0];
bbb = state[1];
ccc = state[2];
ddd = state[3];
/* round 1: left lane */
ROUND(aa, bb, cc, dd, F1, K1, in[0], 11);
ROUND(dd, aa, bb, cc, F1, K1, in[1], 14);
ROUND(cc, dd, aa, bb, F1, K1, in[2], 15);
ROUND(bb, cc, dd, aa, F1, K1, in[3], 12);
ROUND(aa, bb, cc, dd, F1, K1, in[4], 5);
ROUND(dd, aa, bb, cc, F1, K1, in[5], 8);
ROUND(cc, dd, aa, bb, F1, K1, in[6], 7);
ROUND(bb, cc, dd, aa, F1, K1, in[7], 9);
ROUND(aa, bb, cc, dd, F1, K1, in[8], 11);
ROUND(dd, aa, bb, cc, F1, K1, in[9], 13);
ROUND(cc, dd, aa, bb, F1, K1, in[10], 14);
ROUND(bb, cc, dd, aa, F1, K1, in[11], 15);
ROUND(aa, bb, cc, dd, F1, K1, in[12], 6);
ROUND(dd, aa, bb, cc, F1, K1, in[13], 7);
ROUND(cc, dd, aa, bb, F1, K1, in[14], 9);
ROUND(bb, cc, dd, aa, F1, K1, in[15], 8);
/* round 2: left lane */
ROUND(aa, bb, cc, dd, F2, K2, in[7], 7);
ROUND(dd, aa, bb, cc, F2, K2, in[4], 6);
ROUND(cc, dd, aa, bb, F2, K2, in[13], 8);
ROUND(bb, cc, dd, aa, F2, K2, in[1], 13);
ROUND(aa, bb, cc, dd, F2, K2, in[10], 11);
ROUND(dd, aa, bb, cc, F2, K2, in[6], 9);
ROUND(cc, dd, aa, bb, F2, K2, in[15], 7);
ROUND(bb, cc, dd, aa, F2, K2, in[3], 15);
ROUND(aa, bb, cc, dd, F2, K2, in[12], 7);
ROUND(dd, aa, bb, cc, F2, K2, in[0], 12);
ROUND(cc, dd, aa, bb, F2, K2, in[9], 15);
ROUND(bb, cc, dd, aa, F2, K2, in[5], 9);
ROUND(aa, bb, cc, dd, F2, K2, in[2], 11);
ROUND(dd, aa, bb, cc, F2, K2, in[14], 7);
ROUND(cc, dd, aa, bb, F2, K2, in[11], 13);
ROUND(bb, cc, dd, aa, F2, K2, in[8], 12);
/* round 3: left lane */
ROUND(aa, bb, cc, dd, F3, K3, in[3], 11);
ROUND(dd, aa, bb, cc, F3, K3, in[10], 13);
ROUND(cc, dd, aa, bb, F3, K3, in[14], 6);
ROUND(bb, cc, dd, aa, F3, K3, in[4], 7);
ROUND(aa, bb, cc, dd, F3, K3, in[9], 14);
ROUND(dd, aa, bb, cc, F3, K3, in[15], 9);
ROUND(cc, dd, aa, bb, F3, K3, in[8], 13);
ROUND(bb, cc, dd, aa, F3, K3, in[1], 15);
ROUND(aa, bb, cc, dd, F3, K3, in[2], 14);
ROUND(dd, aa, bb, cc, F3, K3, in[7], 8);
ROUND(cc, dd, aa, bb, F3, K3, in[0], 13);
ROUND(bb, cc, dd, aa, F3, K3, in[6], 6);
ROUND(aa, bb, cc, dd, F3, K3, in[13], 5);
ROUND(dd, aa, bb, cc, F3, K3, in[11], 12);
ROUND(cc, dd, aa, bb, F3, K3, in[5], 7);
ROUND(bb, cc, dd, aa, F3, K3, in[12], 5);
/* round 4: left lane */
ROUND(aa, bb, cc, dd, F4, K4, in[1], 11);
ROUND(dd, aa, bb, cc, F4, K4, in[9], 12);
ROUND(cc, dd, aa, bb, F4, K4, in[11], 14);
ROUND(bb, cc, dd, aa, F4, K4, in[10], 15);
ROUND(aa, bb, cc, dd, F4, K4, in[0], 14);
ROUND(dd, aa, bb, cc, F4, K4, in[8], 15);
ROUND(cc, dd, aa, bb, F4, K4, in[12], 9);
ROUND(bb, cc, dd, aa, F4, K4, in[4], 8);
ROUND(aa, bb, cc, dd, F4, K4, in[13], 9);
ROUND(dd, aa, bb, cc, F4, K4, in[3], 14);
ROUND(cc, dd, aa, bb, F4, K4, in[7], 5);
ROUND(bb, cc, dd, aa, F4, K4, in[15], 6);
ROUND(aa, bb, cc, dd, F4, K4, in[14], 8);
ROUND(dd, aa, bb, cc, F4, K4, in[5], 6);
ROUND(cc, dd, aa, bb, F4, K4, in[6], 5);
ROUND(bb, cc, dd, aa, F4, K4, in[2], 12);
/* round 1: right lane */
ROUND(aaa, bbb, ccc, ddd, F4, KK1, in[5], 8);
ROUND(ddd, aaa, bbb, ccc, F4, KK1, in[14], 9);
ROUND(ccc, ddd, aaa, bbb, F4, KK1, in[7], 9);
ROUND(bbb, ccc, ddd, aaa, F4, KK1, in[0], 11);
ROUND(aaa, bbb, ccc, ddd, F4, KK1, in[9], 13);
ROUND(ddd, aaa, bbb, ccc, F4, KK1, in[2], 15);
ROUND(ccc, ddd, aaa, bbb, F4, KK1, in[11], 15);
ROUND(bbb, ccc, ddd, aaa, F4, KK1, in[4], 5);
ROUND(aaa, bbb, ccc, ddd, F4, KK1, in[13], 7);
ROUND(ddd, aaa, bbb, ccc, F4, KK1, in[6], 7);
ROUND(ccc, ddd, aaa, bbb, F4, KK1, in[15], 8);
ROUND(bbb, ccc, ddd, aaa, F4, KK1, in[8], 11);
ROUND(aaa, bbb, ccc, ddd, F4, KK1, in[1], 14);
ROUND(ddd, aaa, bbb, ccc, F4, KK1, in[10], 14);
ROUND(ccc, ddd, aaa, bbb, F4, KK1, in[3], 12);
ROUND(bbb, ccc, ddd, aaa, F4, KK1, in[12], 6);
/* round 2: right lane */
ROUND(aaa, bbb, ccc, ddd, F3, KK2, in[6], 9);
ROUND(ddd, aaa, bbb, ccc, F3, KK2, in[11], 13);
ROUND(ccc, ddd, aaa, bbb, F3, KK2, in[3], 15);
ROUND(bbb, ccc, ddd, aaa, F3, KK2, in[7], 7);
ROUND(aaa, bbb, ccc, ddd, F3, KK2, in[0], 12);
ROUND(ddd, aaa, bbb, ccc, F3, KK2, in[13], 8);
ROUND(ccc, ddd, aaa, bbb, F3, KK2, in[5], 9);
ROUND(bbb, ccc, ddd, aaa, F3, KK2, in[10], 11);
ROUND(aaa, bbb, ccc, ddd, F3, KK2, in[14], 7);
ROUND(ddd, aaa, bbb, ccc, F3, KK2, in[15], 7);
ROUND(ccc, ddd, aaa, bbb, F3, KK2, in[8], 12);
ROUND(bbb, ccc, ddd, aaa, F3, KK2, in[12], 7);
ROUND(aaa, bbb, ccc, ddd, F3, KK2, in[4], 6);
ROUND(ddd, aaa, bbb, ccc, F3, KK2, in[9], 15);
ROUND(ccc, ddd, aaa, bbb, F3, KK2, in[1], 13);
ROUND(bbb, ccc, ddd, aaa, F3, KK2, in[2], 11);
/* round 3: right lane */
ROUND(aaa, bbb, ccc, ddd, F2, KK3, in[15], 9);
ROUND(ddd, aaa, bbb, ccc, F2, KK3, in[5], 7);
ROUND(ccc, ddd, aaa, bbb, F2, KK3, in[1], 15);
ROUND(bbb, ccc, ddd, aaa, F2, KK3, in[3], 11);
ROUND(aaa, bbb, ccc, ddd, F2, KK3, in[7], 8);
ROUND(ddd, aaa, bbb, ccc, F2, KK3, in[14], 6);
ROUND(ccc, ddd, aaa, bbb, F2, KK3, in[6], 6);
ROUND(bbb, ccc, ddd, aaa, F2, KK3, in[9], 14);
ROUND(aaa, bbb, ccc, ddd, F2, KK3, in[11], 12);
ROUND(ddd, aaa, bbb, ccc, F2, KK3, in[8], 13);
ROUND(ccc, ddd, aaa, bbb, F2, KK3, in[12], 5);
ROUND(bbb, ccc, ddd, aaa, F2, KK3, in[2], 14);
ROUND(aaa, bbb, ccc, ddd, F2, KK3, in[10], 13);
ROUND(ddd, aaa, bbb, ccc, F2, KK3, in[0], 13);
ROUND(ccc, ddd, aaa, bbb, F2, KK3, in[4], 7);
ROUND(bbb, ccc, ddd, aaa, F2, KK3, in[13], 5);
/* round 4: right lane */
ROUND(aaa, bbb, ccc, ddd, F1, KK4, in[8], 15);
ROUND(ddd, aaa, bbb, ccc, F1, KK4, in[6], 5);
ROUND(ccc, ddd, aaa, bbb, F1, KK4, in[4], 8);
ROUND(bbb, ccc, ddd, aaa, F1, KK4, in[1], 11);
ROUND(aaa, bbb, ccc, ddd, F1, KK4, in[3], 14);
ROUND(ddd, aaa, bbb, ccc, F1, KK4, in[11], 14);
ROUND(ccc, ddd, aaa, bbb, F1, KK4, in[15], 6);
ROUND(bbb, ccc, ddd, aaa, F1, KK4, in[0], 14);
ROUND(aaa, bbb, ccc, ddd, F1, KK4, in[5], 6);
ROUND(ddd, aaa, bbb, ccc, F1, KK4, in[12], 9);
ROUND(ccc, ddd, aaa, bbb, F1, KK4, in[2], 12);
ROUND(bbb, ccc, ddd, aaa, F1, KK4, in[13], 9);
ROUND(aaa, bbb, ccc, ddd, F1, KK4, in[9], 12);
ROUND(ddd, aaa, bbb, ccc, F1, KK4, in[7], 5);
ROUND(ccc, ddd, aaa, bbb, F1, KK4, in[10], 15);
ROUND(bbb, ccc, ddd, aaa, F1, KK4, in[14], 8);
/* combine results */
ddd += cc + state[1]; /* final result for state[0] */
state[1] = state[2] + dd + aaa;
state[2] = state[3] + aa + bbb;
state[3] = state[0] + bb + ccc;
state[0] = ddd;
}
static int rmd128_init(struct shash_desc *desc)
{
struct rmd128_ctx *rctx = shash_desc_ctx(desc);
rctx->byte_count = 0;
rctx->state[0] = RMD_H0;
rctx->state[1] = RMD_H1;
rctx->state[2] = RMD_H2;
rctx->state[3] = RMD_H3;
memset(rctx->buffer, 0, sizeof(rctx->buffer));
return 0;
}
static int rmd128_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
struct rmd128_ctx *rctx = shash_desc_ctx(desc);
const u32 avail = sizeof(rctx->buffer) - (rctx->byte_count & 0x3f);
rctx->byte_count += len;
/* Enough space in buffer? If so copy and we're done */
if (avail > len) {
memcpy((char *)rctx->buffer + (sizeof(rctx->buffer) - avail),
data, len);
goto out;
}
memcpy((char *)rctx->buffer + (sizeof(rctx->buffer) - avail),
data, avail);
rmd128_transform(rctx->state, rctx->buffer);
data += avail;
len -= avail;
while (len >= sizeof(rctx->buffer)) {
memcpy(rctx->buffer, data, sizeof(rctx->buffer));
rmd128_transform(rctx->state, rctx->buffer);
data += sizeof(rctx->buffer);
len -= sizeof(rctx->buffer);
}
memcpy(rctx->buffer, data, len);
out:
return 0;
}
/* Add padding and return the message digest. */
static int rmd128_final(struct shash_desc *desc, u8 *out)
{
struct rmd128_ctx *rctx = shash_desc_ctx(desc);
u32 i, index, padlen;
__le64 bits;
__le32 *dst = (__le32 *)out;
static const u8 padding[64] = { 0x80, };
bits = cpu_to_le64(rctx->byte_count << 3);
/* Pad out to 56 mod 64 */
index = rctx->byte_count & 0x3f;
padlen = (index < 56) ? (56 - index) : ((64+56) - index);
rmd128_update(desc, padding, padlen);
/* Append length */
rmd128_update(desc, (const u8 *)&bits, sizeof(bits));
/* Store state in digest */
for (i = 0; i < 4; i++)
dst[i] = cpu_to_le32p(&rctx->state[i]);
/* Wipe context */
memset(rctx, 0, sizeof(*rctx));
return 0;
}
static struct shash_alg alg = {
.digestsize = RMD128_DIGEST_SIZE,
.init = rmd128_init,
.update = rmd128_update,
.final = rmd128_final,
.descsize = sizeof(struct rmd128_ctx),
.base = {
.cra_name = "rmd128",
.cra_driver_name = "rmd128-generic",
.cra_blocksize = RMD128_BLOCK_SIZE,
.cra_module = THIS_MODULE,
}
};
static int __init rmd128_mod_init(void)
{
return crypto_register_shash(&alg);
}
static void __exit rmd128_mod_fini(void)
{
crypto_unregister_shash(&alg);
}
subsys_initcall(rmd128_mod_init);
module_exit(rmd128_mod_fini);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Adrian-Ken Rueegsegger <ken@codelabs.ch>");
MODULE_DESCRIPTION("RIPEMD-128 Message Digest");
MODULE_ALIAS_CRYPTO("rmd128");


@ -1,342 +0,0 @@
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* Cryptographic API.
*
* RIPEMD-256 - RACE Integrity Primitives Evaluation Message Digest.
*
* Based on the reference implementation by Antoon Bosselaers, ESAT-COSIC
*
* Copyright (c) 2008 Adrian-Ken Rueegsegger <ken@codelabs.ch>
*/
#include <crypto/internal/hash.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/mm.h>
#include <linux/types.h>
#include <asm/byteorder.h>
#include "ripemd.h"
struct rmd256_ctx {
u64 byte_count;
u32 state[8];
__le32 buffer[16];
};
#define K1 RMD_K1
#define K2 RMD_K2
#define K3 RMD_K3
#define K4 RMD_K4
#define KK1 RMD_K6
#define KK2 RMD_K7
#define KK3 RMD_K8
#define KK4 RMD_K1
#define F1(x, y, z) (x ^ y ^ z) /* XOR */
#define F2(x, y, z) (z ^ (x & (y ^ z))) /* x ? y : z */
#define F3(x, y, z) ((x | ~y) ^ z)
#define F4(x, y, z) (y ^ (z & (x ^ y))) /* z ? x : y */
#define ROUND(a, b, c, d, f, k, x, s) { \
(a) += f((b), (c), (d)) + le32_to_cpup(&(x)) + (k); \
(a) = rol32((a), (s)); \
}
static void rmd256_transform(u32 *state, const __le32 *in)
{
u32 aa, bb, cc, dd, aaa, bbb, ccc, ddd;
/* Initialize left lane */
aa = state[0];
bb = state[1];
cc = state[2];
dd = state[3];
/* Initialize right lane */
aaa = state[4];
bbb = state[5];
ccc = state[6];
ddd = state[7];
/* round 1: left lane */
ROUND(aa, bb, cc, dd, F1, K1, in[0], 11);
ROUND(dd, aa, bb, cc, F1, K1, in[1], 14);
ROUND(cc, dd, aa, bb, F1, K1, in[2], 15);
ROUND(bb, cc, dd, aa, F1, K1, in[3], 12);
ROUND(aa, bb, cc, dd, F1, K1, in[4], 5);
ROUND(dd, aa, bb, cc, F1, K1, in[5], 8);
ROUND(cc, dd, aa, bb, F1, K1, in[6], 7);
ROUND(bb, cc, dd, aa, F1, K1, in[7], 9);
ROUND(aa, bb, cc, dd, F1, K1, in[8], 11);
ROUND(dd, aa, bb, cc, F1, K1, in[9], 13);
ROUND(cc, dd, aa, bb, F1, K1, in[10], 14);
ROUND(bb, cc, dd, aa, F1, K1, in[11], 15);
ROUND(aa, bb, cc, dd, F1, K1, in[12], 6);
ROUND(dd, aa, bb, cc, F1, K1, in[13], 7);
ROUND(cc, dd, aa, bb, F1, K1, in[14], 9);
ROUND(bb, cc, dd, aa, F1, K1, in[15], 8);
/* round 1: right lane */
ROUND(aaa, bbb, ccc, ddd, F4, KK1, in[5], 8);
ROUND(ddd, aaa, bbb, ccc, F4, KK1, in[14], 9);
ROUND(ccc, ddd, aaa, bbb, F4, KK1, in[7], 9);
ROUND(bbb, ccc, ddd, aaa, F4, KK1, in[0], 11);
ROUND(aaa, bbb, ccc, ddd, F4, KK1, in[9], 13);
ROUND(ddd, aaa, bbb, ccc, F4, KK1, in[2], 15);
ROUND(ccc, ddd, aaa, bbb, F4, KK1, in[11], 15);
ROUND(bbb, ccc, ddd, aaa, F4, KK1, in[4], 5);
ROUND(aaa, bbb, ccc, ddd, F4, KK1, in[13], 7);
ROUND(ddd, aaa, bbb, ccc, F4, KK1, in[6], 7);
ROUND(ccc, ddd, aaa, bbb, F4, KK1, in[15], 8);
ROUND(bbb, ccc, ddd, aaa, F4, KK1, in[8], 11);
ROUND(aaa, bbb, ccc, ddd, F4, KK1, in[1], 14);
ROUND(ddd, aaa, bbb, ccc, F4, KK1, in[10], 14);
ROUND(ccc, ddd, aaa, bbb, F4, KK1, in[3], 12);
ROUND(bbb, ccc, ddd, aaa, F4, KK1, in[12], 6);
/* Swap contents of "a" registers */
swap(aa, aaa);
/* round 2: left lane */
ROUND(aa, bb, cc, dd, F2, K2, in[7], 7);
ROUND(dd, aa, bb, cc, F2, K2, in[4], 6);
ROUND(cc, dd, aa, bb, F2, K2, in[13], 8);
ROUND(bb, cc, dd, aa, F2, K2, in[1], 13);
ROUND(aa, bb, cc, dd, F2, K2, in[10], 11);
ROUND(dd, aa, bb, cc, F2, K2, in[6], 9);
ROUND(cc, dd, aa, bb, F2, K2, in[15], 7);
ROUND(bb, cc, dd, aa, F2, K2, in[3], 15);
ROUND(aa, bb, cc, dd, F2, K2, in[12], 7);
ROUND(dd, aa, bb, cc, F2, K2, in[0], 12);
ROUND(cc, dd, aa, bb, F2, K2, in[9], 15);
ROUND(bb, cc, dd, aa, F2, K2, in[5], 9);
ROUND(aa, bb, cc, dd, F2, K2, in[2], 11);
ROUND(dd, aa, bb, cc, F2, K2, in[14], 7);
ROUND(cc, dd, aa, bb, F2, K2, in[11], 13);
ROUND(bb, cc, dd, aa, F2, K2, in[8], 12);
/* round 2: right lane */
ROUND(aaa, bbb, ccc, ddd, F3, KK2, in[6], 9);
ROUND(ddd, aaa, bbb, ccc, F3, KK2, in[11], 13);
ROUND(ccc, ddd, aaa, bbb, F3, KK2, in[3], 15);
ROUND(bbb, ccc, ddd, aaa, F3, KK2, in[7], 7);
ROUND(aaa, bbb, ccc, ddd, F3, KK2, in[0], 12);
ROUND(ddd, aaa, bbb, ccc, F3, KK2, in[13], 8);
ROUND(ccc, ddd, aaa, bbb, F3, KK2, in[5], 9);
ROUND(bbb, ccc, ddd, aaa, F3, KK2, in[10], 11);
ROUND(aaa, bbb, ccc, ddd, F3, KK2, in[14], 7);
ROUND(ddd, aaa, bbb, ccc, F3, KK2, in[15], 7);
ROUND(ccc, ddd, aaa, bbb, F3, KK2, in[8], 12);
ROUND(bbb, ccc, ddd, aaa, F3, KK2, in[12], 7);
ROUND(aaa, bbb, ccc, ddd, F3, KK2, in[4], 6);
ROUND(ddd, aaa, bbb, ccc, F3, KK2, in[9], 15);
ROUND(ccc, ddd, aaa, bbb, F3, KK2, in[1], 13);
ROUND(bbb, ccc, ddd, aaa, F3, KK2, in[2], 11);
/* Swap contents of "b" registers */
swap(bb, bbb);
/* round 3: left lane */
ROUND(aa, bb, cc, dd, F3, K3, in[3], 11);
ROUND(dd, aa, bb, cc, F3, K3, in[10], 13);
ROUND(cc, dd, aa, bb, F3, K3, in[14], 6);
ROUND(bb, cc, dd, aa, F3, K3, in[4], 7);
ROUND(aa, bb, cc, dd, F3, K3, in[9], 14);
ROUND(dd, aa, bb, cc, F3, K3, in[15], 9);
ROUND(cc, dd, aa, bb, F3, K3, in[8], 13);
ROUND(bb, cc, dd, aa, F3, K3, in[1], 15);
ROUND(aa, bb, cc, dd, F3, K3, in[2], 14);
ROUND(dd, aa, bb, cc, F3, K3, in[7], 8);
ROUND(cc, dd, aa, bb, F3, K3, in[0], 13);
ROUND(bb, cc, dd, aa, F3, K3, in[6], 6);
ROUND(aa, bb, cc, dd, F3, K3, in[13], 5);
ROUND(dd, aa, bb, cc, F3, K3, in[11], 12);
ROUND(cc, dd, aa, bb, F3, K3, in[5], 7);
ROUND(bb, cc, dd, aa, F3, K3, in[12], 5);
/* round 3: right lane */
ROUND(aaa, bbb, ccc, ddd, F2, KK3, in[15], 9);
ROUND(ddd, aaa, bbb, ccc, F2, KK3, in[5], 7);
ROUND(ccc, ddd, aaa, bbb, F2, KK3, in[1], 15);
ROUND(bbb, ccc, ddd, aaa, F2, KK3, in[3], 11);
ROUND(aaa, bbb, ccc, ddd, F2, KK3, in[7], 8);
ROUND(ddd, aaa, bbb, ccc, F2, KK3, in[14], 6);
ROUND(ccc, ddd, aaa, bbb, F2, KK3, in[6], 6);
ROUND(bbb, ccc, ddd, aaa, F2, KK3, in[9], 14);
ROUND(aaa, bbb, ccc, ddd, F2, KK3, in[11], 12);
ROUND(ddd, aaa, bbb, ccc, F2, KK3, in[8], 13);
ROUND(ccc, ddd, aaa, bbb, F2, KK3, in[12], 5);
ROUND(bbb, ccc, ddd, aaa, F2, KK3, in[2], 14);
ROUND(aaa, bbb, ccc, ddd, F2, KK3, in[10], 13);
ROUND(ddd, aaa, bbb, ccc, F2, KK3, in[0], 13);
ROUND(ccc, ddd, aaa, bbb, F2, KK3, in[4], 7);
ROUND(bbb, ccc, ddd, aaa, F2, KK3, in[13], 5);
/* Swap contents of "c" registers */
swap(cc, ccc);
/* round 4: left lane */
ROUND(aa, bb, cc, dd, F4, K4, in[1], 11);
ROUND(dd, aa, bb, cc, F4, K4, in[9], 12);
ROUND(cc, dd, aa, bb, F4, K4, in[11], 14);
ROUND(bb, cc, dd, aa, F4, K4, in[10], 15);
ROUND(aa, bb, cc, dd, F4, K4, in[0], 14);
ROUND(dd, aa, bb, cc, F4, K4, in[8], 15);
ROUND(cc, dd, aa, bb, F4, K4, in[12], 9);
ROUND(bb, cc, dd, aa, F4, K4, in[4], 8);
ROUND(aa, bb, cc, dd, F4, K4, in[13], 9);
ROUND(dd, aa, bb, cc, F4, K4, in[3], 14);
ROUND(cc, dd, aa, bb, F4, K4, in[7], 5);
ROUND(bb, cc, dd, aa, F4, K4, in[15], 6);
ROUND(aa, bb, cc, dd, F4, K4, in[14], 8);
ROUND(dd, aa, bb, cc, F4, K4, in[5], 6);
ROUND(cc, dd, aa, bb, F4, K4, in[6], 5);
ROUND(bb, cc, dd, aa, F4, K4, in[2], 12);
/* round 4: right lane */
ROUND(aaa, bbb, ccc, ddd, F1, KK4, in[8], 15);
ROUND(ddd, aaa, bbb, ccc, F1, KK4, in[6], 5);
ROUND(ccc, ddd, aaa, bbb, F1, KK4, in[4], 8);
ROUND(bbb, ccc, ddd, aaa, F1, KK4, in[1], 11);
ROUND(aaa, bbb, ccc, ddd, F1, KK4, in[3], 14);
ROUND(ddd, aaa, bbb, ccc, F1, KK4, in[11], 14);
ROUND(ccc, ddd, aaa, bbb, F1, KK4, in[15], 6);
ROUND(bbb, ccc, ddd, aaa, F1, KK4, in[0], 14);
ROUND(aaa, bbb, ccc, ddd, F1, KK4, in[5], 6);
ROUND(ddd, aaa, bbb, ccc, F1, KK4, in[12], 9);
ROUND(ccc, ddd, aaa, bbb, F1, KK4, in[2], 12);
ROUND(bbb, ccc, ddd, aaa, F1, KK4, in[13], 9);
ROUND(aaa, bbb, ccc, ddd, F1, KK4, in[9], 12);
ROUND(ddd, aaa, bbb, ccc, F1, KK4, in[7], 5);
ROUND(ccc, ddd, aaa, bbb, F1, KK4, in[10], 15);
ROUND(bbb, ccc, ddd, aaa, F1, KK4, in[14], 8);
/* Swap contents of "d" registers */
swap(dd, ddd);
/* combine results */
state[0] += aa;
state[1] += bb;
state[2] += cc;
state[3] += dd;
state[4] += aaa;
state[5] += bbb;
state[6] += ccc;
state[7] += ddd;
}
static int rmd256_init(struct shash_desc *desc)
{
struct rmd256_ctx *rctx = shash_desc_ctx(desc);
rctx->byte_count = 0;
rctx->state[0] = RMD_H0;
rctx->state[1] = RMD_H1;
rctx->state[2] = RMD_H2;
rctx->state[3] = RMD_H3;
rctx->state[4] = RMD_H5;
rctx->state[5] = RMD_H6;
rctx->state[6] = RMD_H7;
rctx->state[7] = RMD_H8;
memset(rctx->buffer, 0, sizeof(rctx->buffer));
return 0;
}
static int rmd256_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
struct rmd256_ctx *rctx = shash_desc_ctx(desc);
const u32 avail = sizeof(rctx->buffer) - (rctx->byte_count & 0x3f);
rctx->byte_count += len;
/* Enough space in buffer? If so copy and we're done */
if (avail > len) {
memcpy((char *)rctx->buffer + (sizeof(rctx->buffer) - avail),
data, len);
goto out;
}
memcpy((char *)rctx->buffer + (sizeof(rctx->buffer) - avail),
data, avail);
rmd256_transform(rctx->state, rctx->buffer);
data += avail;
len -= avail;
while (len >= sizeof(rctx->buffer)) {
memcpy(rctx->buffer, data, sizeof(rctx->buffer));
rmd256_transform(rctx->state, rctx->buffer);
data += sizeof(rctx->buffer);
len -= sizeof(rctx->buffer);
}
memcpy(rctx->buffer, data, len);
out:
return 0;
}
/* Add padding and return the message digest. */
static int rmd256_final(struct shash_desc *desc, u8 *out)
{
struct rmd256_ctx *rctx = shash_desc_ctx(desc);
u32 i, index, padlen;
__le64 bits;
__le32 *dst = (__le32 *)out;
static const u8 padding[64] = { 0x80, };
bits = cpu_to_le64(rctx->byte_count << 3);
/* Pad out to 56 mod 64 */
index = rctx->byte_count & 0x3f;
padlen = (index < 56) ? (56 - index) : ((64+56) - index);
rmd256_update(desc, padding, padlen);
/* Append length */
rmd256_update(desc, (const u8 *)&bits, sizeof(bits));
/* Store state in digest */
for (i = 0; i < 8; i++)
dst[i] = cpu_to_le32p(&rctx->state[i]);
/* Wipe context */
memset(rctx, 0, sizeof(*rctx));
return 0;
}
static struct shash_alg alg = {
.digestsize = RMD256_DIGEST_SIZE,
.init = rmd256_init,
.update = rmd256_update,
.final = rmd256_final,
.descsize = sizeof(struct rmd256_ctx),
.base = {
.cra_name = "rmd256",
.cra_driver_name = "rmd256-generic",
.cra_blocksize = RMD256_BLOCK_SIZE,
.cra_module = THIS_MODULE,
}
};
static int __init rmd256_mod_init(void)
{
return crypto_register_shash(&alg);
}
static void __exit rmd256_mod_fini(void)
{
crypto_unregister_shash(&alg);
}
subsys_initcall(rmd256_mod_init);
module_exit(rmd256_mod_fini);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Adrian-Ken Rueegsegger <ken@codelabs.ch>");
MODULE_DESCRIPTION("RIPEMD-256 Message Digest");
MODULE_ALIAS_CRYPTO("rmd256");

View File

@@ -1,391 +0,0 @@
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* Cryptographic API.
*
* RIPEMD-320 - RACE Integrity Primitives Evaluation Message Digest.
*
* Based on the reference implementation by Antoon Bosselaers, ESAT-COSIC
*
* Copyright (c) 2008 Adrian-Ken Rueegsegger <ken@codelabs.ch>
*/
#include <crypto/internal/hash.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/mm.h>
#include <linux/types.h>
#include <asm/byteorder.h>
#include "ripemd.h"
struct rmd320_ctx {
u64 byte_count;
u32 state[10];
__le32 buffer[16];
};
#define K1 RMD_K1
#define K2 RMD_K2
#define K3 RMD_K3
#define K4 RMD_K4
#define K5 RMD_K5
#define KK1 RMD_K6
#define KK2 RMD_K7
#define KK3 RMD_K8
#define KK4 RMD_K9
#define KK5 RMD_K1
#define F1(x, y, z) (x ^ y ^ z) /* XOR */
#define F2(x, y, z) (z ^ (x & (y ^ z))) /* x ? y : z */
#define F3(x, y, z) ((x | ~y) ^ z)
#define F4(x, y, z) (y ^ (z & (x ^ y))) /* z ? x : y */
#define F5(x, y, z) (x ^ (y | ~z))
#define ROUND(a, b, c, d, e, f, k, x, s) { \
(a) += f((b), (c), (d)) + le32_to_cpup(&(x)) + (k); \
(a) = rol32((a), (s)) + (e); \
(c) = rol32((c), 10); \
}
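The comments next to the boolean functions above describe F2 as "x ? y : z" and F4 as "z ? x : y", i.e. bitwise multiplexers. A quick standalone check of those identities, outside the kernel sources (userspace C):

/*
 * Standalone check that F2/F4 above really are bitwise selects:
 * F2(x, y, z) picks y where x has 1-bits and z elsewhere; F4 selects on z.
 */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define F2(x, y, z)	((z) ^ ((x) & ((y) ^ (z))))
#define F4(x, y, z)	((y) ^ ((z) & ((x) ^ (y))))

static uint32_t mux(uint32_t sel, uint32_t a, uint32_t b)
{
	return (sel & a) | (~sel & b);		/* bitwise "sel ? a : b" */
}

int main(void)
{
	uint32_t x = 0xdeadbeef, y = 0x01234567, z = 0x89abcdef;

	assert(F2(x, y, z) == mux(x, y, z));	/* x ? y : z */
	assert(F4(x, y, z) == mux(z, x, y));	/* z ? x : y */
	printf("mux identities hold\n");
	return 0;
}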
static void rmd320_transform(u32 *state, const __le32 *in)
{
u32 aa, bb, cc, dd, ee, aaa, bbb, ccc, ddd, eee;
/* Initialize left lane */
aa = state[0];
bb = state[1];
cc = state[2];
dd = state[3];
ee = state[4];
/* Initialize right lane */
aaa = state[5];
bbb = state[6];
ccc = state[7];
ddd = state[8];
eee = state[9];
/* round 1: left lane */
ROUND(aa, bb, cc, dd, ee, F1, K1, in[0], 11);
ROUND(ee, aa, bb, cc, dd, F1, K1, in[1], 14);
ROUND(dd, ee, aa, bb, cc, F1, K1, in[2], 15);
ROUND(cc, dd, ee, aa, bb, F1, K1, in[3], 12);
ROUND(bb, cc, dd, ee, aa, F1, K1, in[4], 5);
ROUND(aa, bb, cc, dd, ee, F1, K1, in[5], 8);
ROUND(ee, aa, bb, cc, dd, F1, K1, in[6], 7);
ROUND(dd, ee, aa, bb, cc, F1, K1, in[7], 9);
ROUND(cc, dd, ee, aa, bb, F1, K1, in[8], 11);
ROUND(bb, cc, dd, ee, aa, F1, K1, in[9], 13);
ROUND(aa, bb, cc, dd, ee, F1, K1, in[10], 14);
ROUND(ee, aa, bb, cc, dd, F1, K1, in[11], 15);
ROUND(dd, ee, aa, bb, cc, F1, K1, in[12], 6);
ROUND(cc, dd, ee, aa, bb, F1, K1, in[13], 7);
ROUND(bb, cc, dd, ee, aa, F1, K1, in[14], 9);
ROUND(aa, bb, cc, dd, ee, F1, K1, in[15], 8);
/* round 1: right lane */
ROUND(aaa, bbb, ccc, ddd, eee, F5, KK1, in[5], 8);
ROUND(eee, aaa, bbb, ccc, ddd, F5, KK1, in[14], 9);
ROUND(ddd, eee, aaa, bbb, ccc, F5, KK1, in[7], 9);
ROUND(ccc, ddd, eee, aaa, bbb, F5, KK1, in[0], 11);
ROUND(bbb, ccc, ddd, eee, aaa, F5, KK1, in[9], 13);
ROUND(aaa, bbb, ccc, ddd, eee, F5, KK1, in[2], 15);
ROUND(eee, aaa, bbb, ccc, ddd, F5, KK1, in[11], 15);
ROUND(ddd, eee, aaa, bbb, ccc, F5, KK1, in[4], 5);
ROUND(ccc, ddd, eee, aaa, bbb, F5, KK1, in[13], 7);
ROUND(bbb, ccc, ddd, eee, aaa, F5, KK1, in[6], 7);
ROUND(aaa, bbb, ccc, ddd, eee, F5, KK1, in[15], 8);
ROUND(eee, aaa, bbb, ccc, ddd, F5, KK1, in[8], 11);
ROUND(ddd, eee, aaa, bbb, ccc, F5, KK1, in[1], 14);
ROUND(ccc, ddd, eee, aaa, bbb, F5, KK1, in[10], 14);
ROUND(bbb, ccc, ddd, eee, aaa, F5, KK1, in[3], 12);
ROUND(aaa, bbb, ccc, ddd, eee, F5, KK1, in[12], 6);
/* Swap contents of "a" registers */
swap(aa, aaa);
/* round 2: left lane" */
ROUND(ee, aa, bb, cc, dd, F2, K2, in[7], 7);
ROUND(dd, ee, aa, bb, cc, F2, K2, in[4], 6);
ROUND(cc, dd, ee, aa, bb, F2, K2, in[13], 8);
ROUND(bb, cc, dd, ee, aa, F2, K2, in[1], 13);
ROUND(aa, bb, cc, dd, ee, F2, K2, in[10], 11);
ROUND(ee, aa, bb, cc, dd, F2, K2, in[6], 9);
ROUND(dd, ee, aa, bb, cc, F2, K2, in[15], 7);
ROUND(cc, dd, ee, aa, bb, F2, K2, in[3], 15);
ROUND(bb, cc, dd, ee, aa, F2, K2, in[12], 7);
ROUND(aa, bb, cc, dd, ee, F2, K2, in[0], 12);
ROUND(ee, aa, bb, cc, dd, F2, K2, in[9], 15);
ROUND(dd, ee, aa, bb, cc, F2, K2, in[5], 9);
ROUND(cc, dd, ee, aa, bb, F2, K2, in[2], 11);
ROUND(bb, cc, dd, ee, aa, F2, K2, in[14], 7);
ROUND(aa, bb, cc, dd, ee, F2, K2, in[11], 13);
ROUND(ee, aa, bb, cc, dd, F2, K2, in[8], 12);
/* round 2: right lane */
ROUND(eee, aaa, bbb, ccc, ddd, F4, KK2, in[6], 9);
ROUND(ddd, eee, aaa, bbb, ccc, F4, KK2, in[11], 13);
ROUND(ccc, ddd, eee, aaa, bbb, F4, KK2, in[3], 15);
ROUND(bbb, ccc, ddd, eee, aaa, F4, KK2, in[7], 7);
ROUND(aaa, bbb, ccc, ddd, eee, F4, KK2, in[0], 12);
ROUND(eee, aaa, bbb, ccc, ddd, F4, KK2, in[13], 8);
ROUND(ddd, eee, aaa, bbb, ccc, F4, KK2, in[5], 9);
ROUND(ccc, ddd, eee, aaa, bbb, F4, KK2, in[10], 11);
ROUND(bbb, ccc, ddd, eee, aaa, F4, KK2, in[14], 7);
ROUND(aaa, bbb, ccc, ddd, eee, F4, KK2, in[15], 7);
ROUND(eee, aaa, bbb, ccc, ddd, F4, KK2, in[8], 12);
ROUND(ddd, eee, aaa, bbb, ccc, F4, KK2, in[12], 7);
ROUND(ccc, ddd, eee, aaa, bbb, F4, KK2, in[4], 6);
ROUND(bbb, ccc, ddd, eee, aaa, F4, KK2, in[9], 15);
ROUND(aaa, bbb, ccc, ddd, eee, F4, KK2, in[1], 13);
ROUND(eee, aaa, bbb, ccc, ddd, F4, KK2, in[2], 11);
/* Swap contents of "b" registers */
swap(bb, bbb);
/* round 3: left lane" */
ROUND(dd, ee, aa, bb, cc, F3, K3, in[3], 11);
ROUND(cc, dd, ee, aa, bb, F3, K3, in[10], 13);
ROUND(bb, cc, dd, ee, aa, F3, K3, in[14], 6);
ROUND(aa, bb, cc, dd, ee, F3, K3, in[4], 7);
ROUND(ee, aa, bb, cc, dd, F3, K3, in[9], 14);
ROUND(dd, ee, aa, bb, cc, F3, K3, in[15], 9);
ROUND(cc, dd, ee, aa, bb, F3, K3, in[8], 13);
ROUND(bb, cc, dd, ee, aa, F3, K3, in[1], 15);
ROUND(aa, bb, cc, dd, ee, F3, K3, in[2], 14);
ROUND(ee, aa, bb, cc, dd, F3, K3, in[7], 8);
ROUND(dd, ee, aa, bb, cc, F3, K3, in[0], 13);
ROUND(cc, dd, ee, aa, bb, F3, K3, in[6], 6);
ROUND(bb, cc, dd, ee, aa, F3, K3, in[13], 5);
ROUND(aa, bb, cc, dd, ee, F3, K3, in[11], 12);
ROUND(ee, aa, bb, cc, dd, F3, K3, in[5], 7);
ROUND(dd, ee, aa, bb, cc, F3, K3, in[12], 5);
/* round 3: right lane */
ROUND(ddd, eee, aaa, bbb, ccc, F3, KK3, in[15], 9);
ROUND(ccc, ddd, eee, aaa, bbb, F3, KK3, in[5], 7);
ROUND(bbb, ccc, ddd, eee, aaa, F3, KK3, in[1], 15);
ROUND(aaa, bbb, ccc, ddd, eee, F3, KK3, in[3], 11);
ROUND(eee, aaa, bbb, ccc, ddd, F3, KK3, in[7], 8);
ROUND(ddd, eee, aaa, bbb, ccc, F3, KK3, in[14], 6);
ROUND(ccc, ddd, eee, aaa, bbb, F3, KK3, in[6], 6);
ROUND(bbb, ccc, ddd, eee, aaa, F3, KK3, in[9], 14);
ROUND(aaa, bbb, ccc, ddd, eee, F3, KK3, in[11], 12);
ROUND(eee, aaa, bbb, ccc, ddd, F3, KK3, in[8], 13);
ROUND(ddd, eee, aaa, bbb, ccc, F3, KK3, in[12], 5);
ROUND(ccc, ddd, eee, aaa, bbb, F3, KK3, in[2], 14);
ROUND(bbb, ccc, ddd, eee, aaa, F3, KK3, in[10], 13);
ROUND(aaa, bbb, ccc, ddd, eee, F3, KK3, in[0], 13);
ROUND(eee, aaa, bbb, ccc, ddd, F3, KK3, in[4], 7);
ROUND(ddd, eee, aaa, bbb, ccc, F3, KK3, in[13], 5);
/* Swap contents of "c" registers */
swap(cc, ccc);
/* round 4: left lane" */
ROUND(cc, dd, ee, aa, bb, F4, K4, in[1], 11);
ROUND(bb, cc, dd, ee, aa, F4, K4, in[9], 12);
ROUND(aa, bb, cc, dd, ee, F4, K4, in[11], 14);
ROUND(ee, aa, bb, cc, dd, F4, K4, in[10], 15);
ROUND(dd, ee, aa, bb, cc, F4, K4, in[0], 14);
ROUND(cc, dd, ee, aa, bb, F4, K4, in[8], 15);
ROUND(bb, cc, dd, ee, aa, F4, K4, in[12], 9);
ROUND(aa, bb, cc, dd, ee, F4, K4, in[4], 8);
ROUND(ee, aa, bb, cc, dd, F4, K4, in[13], 9);
ROUND(dd, ee, aa, bb, cc, F4, K4, in[3], 14);
ROUND(cc, dd, ee, aa, bb, F4, K4, in[7], 5);
ROUND(bb, cc, dd, ee, aa, F4, K4, in[15], 6);
ROUND(aa, bb, cc, dd, ee, F4, K4, in[14], 8);
ROUND(ee, aa, bb, cc, dd, F4, K4, in[5], 6);
ROUND(dd, ee, aa, bb, cc, F4, K4, in[6], 5);
ROUND(cc, dd, ee, aa, bb, F4, K4, in[2], 12);
/* round 4: right lane */
ROUND(ccc, ddd, eee, aaa, bbb, F2, KK4, in[8], 15);
ROUND(bbb, ccc, ddd, eee, aaa, F2, KK4, in[6], 5);
ROUND(aaa, bbb, ccc, ddd, eee, F2, KK4, in[4], 8);
ROUND(eee, aaa, bbb, ccc, ddd, F2, KK4, in[1], 11);
ROUND(ddd, eee, aaa, bbb, ccc, F2, KK4, in[3], 14);
ROUND(ccc, ddd, eee, aaa, bbb, F2, KK4, in[11], 14);
ROUND(bbb, ccc, ddd, eee, aaa, F2, KK4, in[15], 6);
ROUND(aaa, bbb, ccc, ddd, eee, F2, KK4, in[0], 14);
ROUND(eee, aaa, bbb, ccc, ddd, F2, KK4, in[5], 6);
ROUND(ddd, eee, aaa, bbb, ccc, F2, KK4, in[12], 9);
ROUND(ccc, ddd, eee, aaa, bbb, F2, KK4, in[2], 12);
ROUND(bbb, ccc, ddd, eee, aaa, F2, KK4, in[13], 9);
ROUND(aaa, bbb, ccc, ddd, eee, F2, KK4, in[9], 12);
ROUND(eee, aaa, bbb, ccc, ddd, F2, KK4, in[7], 5);
ROUND(ddd, eee, aaa, bbb, ccc, F2, KK4, in[10], 15);
ROUND(ccc, ddd, eee, aaa, bbb, F2, KK4, in[14], 8);
/* Swap contents of "d" registers */
swap(dd, ddd);
/* round 5: left lane" */
ROUND(bb, cc, dd, ee, aa, F5, K5, in[4], 9);
ROUND(aa, bb, cc, dd, ee, F5, K5, in[0], 15);
ROUND(ee, aa, bb, cc, dd, F5, K5, in[5], 5);
ROUND(dd, ee, aa, bb, cc, F5, K5, in[9], 11);
ROUND(cc, dd, ee, aa, bb, F5, K5, in[7], 6);
ROUND(bb, cc, dd, ee, aa, F5, K5, in[12], 8);
ROUND(aa, bb, cc, dd, ee, F5, K5, in[2], 13);
ROUND(ee, aa, bb, cc, dd, F5, K5, in[10], 12);
ROUND(dd, ee, aa, bb, cc, F5, K5, in[14], 5);
ROUND(cc, dd, ee, aa, bb, F5, K5, in[1], 12);
ROUND(bb, cc, dd, ee, aa, F5, K5, in[3], 13);
ROUND(aa, bb, cc, dd, ee, F5, K5, in[8], 14);
ROUND(ee, aa, bb, cc, dd, F5, K5, in[11], 11);
ROUND(dd, ee, aa, bb, cc, F5, K5, in[6], 8);
ROUND(cc, dd, ee, aa, bb, F5, K5, in[15], 5);
ROUND(bb, cc, dd, ee, aa, F5, K5, in[13], 6);
/* round 5: right lane */
ROUND(bbb, ccc, ddd, eee, aaa, F1, KK5, in[12], 8);
ROUND(aaa, bbb, ccc, ddd, eee, F1, KK5, in[15], 5);
ROUND(eee, aaa, bbb, ccc, ddd, F1, KK5, in[10], 12);
ROUND(ddd, eee, aaa, bbb, ccc, F1, KK5, in[4], 9);
ROUND(ccc, ddd, eee, aaa, bbb, F1, KK5, in[1], 12);
ROUND(bbb, ccc, ddd, eee, aaa, F1, KK5, in[5], 5);
ROUND(aaa, bbb, ccc, ddd, eee, F1, KK5, in[8], 14);
ROUND(eee, aaa, bbb, ccc, ddd, F1, KK5, in[7], 6);
ROUND(ddd, eee, aaa, bbb, ccc, F1, KK5, in[6], 8);
ROUND(ccc, ddd, eee, aaa, bbb, F1, KK5, in[2], 13);
ROUND(bbb, ccc, ddd, eee, aaa, F1, KK5, in[13], 6);
ROUND(aaa, bbb, ccc, ddd, eee, F1, KK5, in[14], 5);
ROUND(eee, aaa, bbb, ccc, ddd, F1, KK5, in[0], 15);
ROUND(ddd, eee, aaa, bbb, ccc, F1, KK5, in[3], 13);
ROUND(ccc, ddd, eee, aaa, bbb, F1, KK5, in[9], 11);
ROUND(bbb, ccc, ddd, eee, aaa, F1, KK5, in[11], 11);
/* Swap contents of "e" registers */
swap(ee, eee);
/* combine results */
state[0] += aa;
state[1] += bb;
state[2] += cc;
state[3] += dd;
state[4] += ee;
state[5] += aaa;
state[6] += bbb;
state[7] += ccc;
state[8] += ddd;
state[9] += eee;
}
static int rmd320_init(struct shash_desc *desc)
{
struct rmd320_ctx *rctx = shash_desc_ctx(desc);
rctx->byte_count = 0;
rctx->state[0] = RMD_H0;
rctx->state[1] = RMD_H1;
rctx->state[2] = RMD_H2;
rctx->state[3] = RMD_H3;
rctx->state[4] = RMD_H4;
rctx->state[5] = RMD_H5;
rctx->state[6] = RMD_H6;
rctx->state[7] = RMD_H7;
rctx->state[8] = RMD_H8;
rctx->state[9] = RMD_H9;
memset(rctx->buffer, 0, sizeof(rctx->buffer));
return 0;
}
static int rmd320_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
struct rmd320_ctx *rctx = shash_desc_ctx(desc);
const u32 avail = sizeof(rctx->buffer) - (rctx->byte_count & 0x3f);
rctx->byte_count += len;
/* Enough space in buffer? If so copy and we're done */
if (avail > len) {
memcpy((char *)rctx->buffer + (sizeof(rctx->buffer) - avail),
data, len);
goto out;
}
memcpy((char *)rctx->buffer + (sizeof(rctx->buffer) - avail),
data, avail);
rmd320_transform(rctx->state, rctx->buffer);
data += avail;
len -= avail;
while (len >= sizeof(rctx->buffer)) {
memcpy(rctx->buffer, data, sizeof(rctx->buffer));
rmd320_transform(rctx->state, rctx->buffer);
data += sizeof(rctx->buffer);
len -= sizeof(rctx->buffer);
}
memcpy(rctx->buffer, data, len);
out:
return 0;
}
/* Add padding and return the message digest. */
static int rmd320_final(struct shash_desc *desc, u8 *out)
{
struct rmd320_ctx *rctx = shash_desc_ctx(desc);
u32 i, index, padlen;
__le64 bits;
__le32 *dst = (__le32 *)out;
static const u8 padding[64] = { 0x80, };
bits = cpu_to_le64(rctx->byte_count << 3);
/* Pad out to 56 mod 64 */
index = rctx->byte_count & 0x3f;
padlen = (index < 56) ? (56 - index) : ((64+56) - index);
rmd320_update(desc, padding, padlen);
/* Append length */
rmd320_update(desc, (const u8 *)&bits, sizeof(bits));
/* Store state in digest */
for (i = 0; i < 10; i++)
dst[i] = cpu_to_le32p(&rctx->state[i]);
/* Wipe context */
memset(rctx, 0, sizeof(*rctx));
return 0;
}
static struct shash_alg alg = {
.digestsize = RMD320_DIGEST_SIZE,
.init = rmd320_init,
.update = rmd320_update,
.final = rmd320_final,
.descsize = sizeof(struct rmd320_ctx),
.base = {
.cra_name = "rmd320",
.cra_driver_name = "rmd320-generic",
.cra_blocksize = RMD320_BLOCK_SIZE,
.cra_module = THIS_MODULE,
}
};
static int __init rmd320_mod_init(void)
{
return crypto_register_shash(&alg);
}
static void __exit rmd320_mod_fini(void)
{
crypto_unregister_shash(&alg);
}
subsys_initcall(rmd320_mod_init);
module_exit(rmd320_mod_fini);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Adrian-Ken Rueegsegger <ken@codelabs.ch>");
MODULE_DESCRIPTION("RIPEMD-320 Message Digest");
MODULE_ALIAS_CRYPTO("rmd320");

View File

@@ -1,212 +0,0 @@
/*
* Salsa20: Salsa20 stream cipher algorithm
*
* Copyright (c) 2007 Tan Swee Heng <thesweeheng@gmail.com>
*
* Derived from:
* - salsa20.c: Public domain C code by Daniel J. Bernstein <djb@cr.yp.to>
*
* Salsa20 is a stream cipher candidate in eSTREAM, the ECRYPT Stream
* Cipher Project. It is designed by Daniel J. Bernstein <djb@cr.yp.to>.
* More information about eSTREAM and Salsa20 can be found here:
* https://www.ecrypt.eu.org/stream/
* https://cr.yp.to/snuffle.html
*
* This program is free software; you can redistribute it and/or modify it
* under the terms of the GNU General Public License as published by the Free
* Software Foundation; either version 2 of the License, or (at your option)
* any later version.
*
*/
#include <asm/unaligned.h>
#include <crypto/internal/skcipher.h>
#include <linux/module.h>
#define SALSA20_IV_SIZE 8
#define SALSA20_MIN_KEY_SIZE 16
#define SALSA20_MAX_KEY_SIZE 32
#define SALSA20_BLOCK_SIZE 64
struct salsa20_ctx {
u32 initial_state[16];
};
static void salsa20_block(u32 *state, __le32 *stream)
{
u32 x[16];
int i;
memcpy(x, state, sizeof(x));
for (i = 0; i < 20; i += 2) {
x[ 4] ^= rol32((x[ 0] + x[12]), 7);
x[ 8] ^= rol32((x[ 4] + x[ 0]), 9);
x[12] ^= rol32((x[ 8] + x[ 4]), 13);
x[ 0] ^= rol32((x[12] + x[ 8]), 18);
x[ 9] ^= rol32((x[ 5] + x[ 1]), 7);
x[13] ^= rol32((x[ 9] + x[ 5]), 9);
x[ 1] ^= rol32((x[13] + x[ 9]), 13);
x[ 5] ^= rol32((x[ 1] + x[13]), 18);
x[14] ^= rol32((x[10] + x[ 6]), 7);
x[ 2] ^= rol32((x[14] + x[10]), 9);
x[ 6] ^= rol32((x[ 2] + x[14]), 13);
x[10] ^= rol32((x[ 6] + x[ 2]), 18);
x[ 3] ^= rol32((x[15] + x[11]), 7);
x[ 7] ^= rol32((x[ 3] + x[15]), 9);
x[11] ^= rol32((x[ 7] + x[ 3]), 13);
x[15] ^= rol32((x[11] + x[ 7]), 18);
x[ 1] ^= rol32((x[ 0] + x[ 3]), 7);
x[ 2] ^= rol32((x[ 1] + x[ 0]), 9);
x[ 3] ^= rol32((x[ 2] + x[ 1]), 13);
x[ 0] ^= rol32((x[ 3] + x[ 2]), 18);
x[ 6] ^= rol32((x[ 5] + x[ 4]), 7);
x[ 7] ^= rol32((x[ 6] + x[ 5]), 9);
x[ 4] ^= rol32((x[ 7] + x[ 6]), 13);
x[ 5] ^= rol32((x[ 4] + x[ 7]), 18);
x[11] ^= rol32((x[10] + x[ 9]), 7);
x[ 8] ^= rol32((x[11] + x[10]), 9);
x[ 9] ^= rol32((x[ 8] + x[11]), 13);
x[10] ^= rol32((x[ 9] + x[ 8]), 18);
x[12] ^= rol32((x[15] + x[14]), 7);
x[13] ^= rol32((x[12] + x[15]), 9);
x[14] ^= rol32((x[13] + x[12]), 13);
x[15] ^= rol32((x[14] + x[13]), 18);
}
for (i = 0; i < 16; i++)
stream[i] = cpu_to_le32(x[i] + state[i]);
if (++state[8] == 0)
state[9]++;
}
static void salsa20_docrypt(u32 *state, u8 *dst, const u8 *src,
unsigned int bytes)
{
__le32 stream[SALSA20_BLOCK_SIZE / sizeof(__le32)];
while (bytes >= SALSA20_BLOCK_SIZE) {
salsa20_block(state, stream);
crypto_xor_cpy(dst, src, (const u8 *)stream,
SALSA20_BLOCK_SIZE);
bytes -= SALSA20_BLOCK_SIZE;
dst += SALSA20_BLOCK_SIZE;
src += SALSA20_BLOCK_SIZE;
}
if (bytes) {
salsa20_block(state, stream);
crypto_xor_cpy(dst, src, (const u8 *)stream, bytes);
}
}
static void salsa20_init(u32 *state, const struct salsa20_ctx *ctx,
const u8 *iv)
{
memcpy(state, ctx->initial_state, sizeof(ctx->initial_state));
state[6] = get_unaligned_le32(iv + 0);
state[7] = get_unaligned_le32(iv + 4);
}
static int salsa20_setkey(struct crypto_skcipher *tfm, const u8 *key,
unsigned int keysize)
{
static const char sigma[16] = "expand 32-byte k";
static const char tau[16] = "expand 16-byte k";
struct salsa20_ctx *ctx = crypto_skcipher_ctx(tfm);
const char *constants;
if (keysize != SALSA20_MIN_KEY_SIZE &&
keysize != SALSA20_MAX_KEY_SIZE)
return -EINVAL;
ctx->initial_state[1] = get_unaligned_le32(key + 0);
ctx->initial_state[2] = get_unaligned_le32(key + 4);
ctx->initial_state[3] = get_unaligned_le32(key + 8);
ctx->initial_state[4] = get_unaligned_le32(key + 12);
if (keysize == 32) { /* recommended */
key += 16;
constants = sigma;
} else { /* keysize == 16 */
constants = tau;
}
ctx->initial_state[11] = get_unaligned_le32(key + 0);
ctx->initial_state[12] = get_unaligned_le32(key + 4);
ctx->initial_state[13] = get_unaligned_le32(key + 8);
ctx->initial_state[14] = get_unaligned_le32(key + 12);
ctx->initial_state[0] = get_unaligned_le32(constants + 0);
ctx->initial_state[5] = get_unaligned_le32(constants + 4);
ctx->initial_state[10] = get_unaligned_le32(constants + 8);
ctx->initial_state[15] = get_unaligned_le32(constants + 12);
/* space for the nonce; it will be overridden for each request */
ctx->initial_state[6] = 0;
ctx->initial_state[7] = 0;
/* initial block number */
ctx->initial_state[8] = 0;
ctx->initial_state[9] = 0;
return 0;
}
static int salsa20_crypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
const struct salsa20_ctx *ctx = crypto_skcipher_ctx(tfm);
struct skcipher_walk walk;
u32 state[16];
int err;
err = skcipher_walk_virt(&walk, req, false);
salsa20_init(state, ctx, req->iv);
while (walk.nbytes > 0) {
unsigned int nbytes = walk.nbytes;
if (nbytes < walk.total)
nbytes = round_down(nbytes, walk.stride);
salsa20_docrypt(state, walk.dst.virt.addr, walk.src.virt.addr,
nbytes);
err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
}
return err;
}
static struct skcipher_alg alg = {
.base.cra_name = "salsa20",
.base.cra_driver_name = "salsa20-generic",
.base.cra_priority = 100,
.base.cra_blocksize = 1,
.base.cra_ctxsize = sizeof(struct salsa20_ctx),
.base.cra_module = THIS_MODULE,
.min_keysize = SALSA20_MIN_KEY_SIZE,
.max_keysize = SALSA20_MAX_KEY_SIZE,
.ivsize = SALSA20_IV_SIZE,
.chunksize = SALSA20_BLOCK_SIZE,
.setkey = salsa20_setkey,
.encrypt = salsa20_crypt,
.decrypt = salsa20_crypt,
};
static int __init salsa20_generic_mod_init(void)
{
return crypto_register_skcipher(&alg);
}
static void __exit salsa20_generic_mod_fini(void)
{
crypto_unregister_skcipher(&alg);
}
subsys_initcall(salsa20_generic_mod_init);
module_exit(salsa20_generic_mod_fini);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION ("Salsa20 stream cipher algorithm");
MODULE_ALIAS_CRYPTO("salsa20");
MODULE_ALIAS_CRYPTO("salsa20-generic");

View File

@@ -5,17 +5,12 @@
* Serpent Cipher Algorithm.
*
* Copyright (C) 2002 Dag Arne Osvik <osvik@ii.uib.no>
* 2003 Herbert Valerio Riedel <hvr@gnu.org>
*
* Added tnepres support:
* Ruben Jesus Garcia Hernandez <ruben@ugr.es>, 18.10.2004
* Based on code by hvr
*/
#include <linux/init.h>
#include <linux/module.h>
#include <linux/errno.h>
#include <asm/byteorder.h>
#include <asm/unaligned.h>
#include <linux/crypto.h>
#include <linux/types.h>
#include <crypto/serpent.h>
@@ -453,19 +448,12 @@ void __serpent_encrypt(const void *c, u8 *dst, const u8 *src)
{
const struct serpent_ctx *ctx = c;
const u32 *k = ctx->expkey;
const __le32 *s = (const __le32 *)src;
__le32 *d = (__le32 *)dst;
u32 r0, r1, r2, r3, r4;
/*
* Note: The conversions between u8* and u32* might cause trouble
* on architectures with stricter alignment rules than x86
*/
r0 = le32_to_cpu(s[0]);
r1 = le32_to_cpu(s[1]);
r2 = le32_to_cpu(s[2]);
r3 = le32_to_cpu(s[3]);
r0 = get_unaligned_le32(src);
r1 = get_unaligned_le32(src + 4);
r2 = get_unaligned_le32(src + 8);
r3 = get_unaligned_le32(src + 12);
K(r0, r1, r2, r3, 0);
S0(r0, r1, r2, r3, r4); LK(r2, r1, r3, r0, r4, 1);
@@ -501,10 +489,10 @@ void __serpent_encrypt(const void *c, u8 *dst, const u8 *src)
S6(r0, r1, r3, r2, r4); LK(r3, r4, r1, r2, r0, 31);
S7(r3, r4, r1, r2, r0); K(r0, r1, r2, r3, 32);
d[0] = cpu_to_le32(r0);
d[1] = cpu_to_le32(r1);
d[2] = cpu_to_le32(r2);
d[3] = cpu_to_le32(r3);
put_unaligned_le32(r0, dst);
put_unaligned_le32(r1, dst + 4);
put_unaligned_le32(r2, dst + 8);
put_unaligned_le32(r3, dst + 12);
}
EXPORT_SYMBOL_GPL(__serpent_encrypt);
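The rewrite above swaps the old __le32 pointer casts for get_unaligned_le32()/put_unaligned_le32(), so the generic Serpent code no longer assumes that src and dst are 4-byte aligned. For readers outside the kernel, an unaligned little-endian load is just a byte-wise assembly of the word; a small userspace illustration (load_le32() is a stand-in for the kernel helper, not kernel code):

/*
 * Sketch: what an unaligned little-endian 32-bit load boils down to.
 * The kernel's get_unaligned_le32() is the portable way to do this;
 * the byte-wise form below is an equivalent userspace illustration.
 */
#include <stdint.h>
#include <stdio.h>

static uint32_t load_le32(const unsigned char *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

int main(void)
{
	unsigned char buf[8] = { 0x00, 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77 };

	/*
	 * Works at any offset, including the odd one below; casting buf + 1
	 * to a uint32_t pointer instead would be undefined behaviour and can
	 * fault on strict-alignment architectures.
	 */
	printf("aligned:   0x%08x\n", load_le32(buf));
	printf("unaligned: 0x%08x\n", load_le32(buf + 1));
	return 0;
}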
@@ -519,14 +507,12 @@ void __serpent_decrypt(const void *c, u8 *dst, const u8 *src)
{
const struct serpent_ctx *ctx = c;
const u32 *k = ctx->expkey;
const __le32 *s = (const __le32 *)src;
__le32 *d = (__le32 *)dst;
u32 r0, r1, r2, r3, r4;
r0 = le32_to_cpu(s[0]);
r1 = le32_to_cpu(s[1]);
r2 = le32_to_cpu(s[2]);
r3 = le32_to_cpu(s[3]);
r0 = get_unaligned_le32(src);
r1 = get_unaligned_le32(src + 4);
r2 = get_unaligned_le32(src + 8);
r3 = get_unaligned_le32(src + 12);
K(r0, r1, r2, r3, 32);
SI7(r0, r1, r2, r3, r4); KL(r1, r3, r0, r4, r2, 31);
@@ -562,10 +548,10 @@ void __serpent_decrypt(const void *c, u8 *dst, const u8 *src)
SI1(r3, r1, r2, r0, r4); KL(r4, r1, r2, r0, r3, 1);
SI0(r4, r1, r2, r0, r3); K(r2, r3, r1, r4, 0);
d[0] = cpu_to_le32(r2);
d[1] = cpu_to_le32(r3);
d[2] = cpu_to_le32(r1);
d[3] = cpu_to_le32(r4);
put_unaligned_le32(r2, dst);
put_unaligned_le32(r3, dst + 4);
put_unaligned_le32(r1, dst + 8);
put_unaligned_le32(r4, dst + 12);
}
EXPORT_SYMBOL_GPL(__serpent_decrypt);
@@ -576,66 +562,13 @@ static void serpent_decrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
__serpent_decrypt(ctx, dst, src);
}
static int tnepres_setkey(struct crypto_tfm *tfm, const u8 *key,
unsigned int keylen)
{
u8 rev_key[SERPENT_MAX_KEY_SIZE];
int i;
for (i = 0; i < keylen; ++i)
rev_key[keylen - i - 1] = key[i];
return serpent_setkey(tfm, rev_key, keylen);
}
static void tnepres_encrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
{
const u32 * const s = (const u32 * const)src;
u32 * const d = (u32 * const)dst;
u32 rs[4], rd[4];
rs[0] = swab32(s[3]);
rs[1] = swab32(s[2]);
rs[2] = swab32(s[1]);
rs[3] = swab32(s[0]);
serpent_encrypt(tfm, (u8 *)rd, (u8 *)rs);
d[0] = swab32(rd[3]);
d[1] = swab32(rd[2]);
d[2] = swab32(rd[1]);
d[3] = swab32(rd[0]);
}
static void tnepres_decrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
{
const u32 * const s = (const u32 * const)src;
u32 * const d = (u32 * const)dst;
u32 rs[4], rd[4];
rs[0] = swab32(s[3]);
rs[1] = swab32(s[2]);
rs[2] = swab32(s[1]);
rs[3] = swab32(s[0]);
serpent_decrypt(tfm, (u8 *)rd, (u8 *)rs);
d[0] = swab32(rd[3]);
d[1] = swab32(rd[2]);
d[2] = swab32(rd[1]);
d[3] = swab32(rd[0]);
}
static struct crypto_alg srp_algs[2] = { {
static struct crypto_alg srp_alg = {
.cra_name = "serpent",
.cra_driver_name = "serpent-generic",
.cra_priority = 100,
.cra_flags = CRYPTO_ALG_TYPE_CIPHER,
.cra_blocksize = SERPENT_BLOCK_SIZE,
.cra_ctxsize = sizeof(struct serpent_ctx),
.cra_alignmask = 3,
.cra_module = THIS_MODULE,
.cra_u = { .cipher = {
.cia_min_keysize = SERPENT_MIN_KEY_SIZE,
@@ -643,38 +576,23 @@ static struct crypto_alg srp_algs[2] = { {
.cia_setkey = serpent_setkey,
.cia_encrypt = serpent_encrypt,
.cia_decrypt = serpent_decrypt } }
}, {
.cra_name = "tnepres",
.cra_driver_name = "tnepres-generic",
.cra_flags = CRYPTO_ALG_TYPE_CIPHER,
.cra_blocksize = SERPENT_BLOCK_SIZE,
.cra_ctxsize = sizeof(struct serpent_ctx),
.cra_alignmask = 3,
.cra_module = THIS_MODULE,
.cra_u = { .cipher = {
.cia_min_keysize = SERPENT_MIN_KEY_SIZE,
.cia_max_keysize = SERPENT_MAX_KEY_SIZE,
.cia_setkey = tnepres_setkey,
.cia_encrypt = tnepres_encrypt,
.cia_decrypt = tnepres_decrypt } }
} };
};
static int __init serpent_mod_init(void)
{
return crypto_register_algs(srp_algs, ARRAY_SIZE(srp_algs));
return crypto_register_alg(&srp_alg);
}
static void __exit serpent_mod_fini(void)
{
crypto_unregister_algs(srp_algs, ARRAY_SIZE(srp_algs));
crypto_unregister_alg(&srp_alg);
}
subsys_initcall(serpent_mod_init);
module_exit(serpent_mod_fini);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Serpent and tnepres (kerneli compatible serpent reversed) Cipher Algorithm");
MODULE_DESCRIPTION("Serpent Cipher Algorithm");
MODULE_AUTHOR("Dag Arne Osvik <osvik@ii.uib.no>");
MODULE_ALIAS_CRYPTO("tnepres");
MODULE_ALIAS_CRYPTO("serpent");
MODULE_ALIAS_CRYPTO("serpent-generic");

View File

@@ -10,6 +10,7 @@
*/
#include <crypto/internal/aead.h>
#include <crypto/internal/cipher.h>
#include <crypto/internal/skcipher.h>
#include <crypto/scatterwalk.h>
#include <linux/bug.h>
@@ -490,12 +491,6 @@ int skcipher_walk_virt(struct skcipher_walk *walk,
}
EXPORT_SYMBOL_GPL(skcipher_walk_virt);
void skcipher_walk_atomise(struct skcipher_walk *walk)
{
walk->flags &= ~SKCIPHER_WALK_SLEEP;
}
EXPORT_SYMBOL_GPL(skcipher_walk_atomise);
int skcipher_walk_async(struct skcipher_walk *walk,
struct skcipher_request *req)
{
@@ -986,3 +981,4 @@ EXPORT_SYMBOL_GPL(skcipher_alloc_instance_simple);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Symmetric key cipher type");
MODULE_IMPORT_NS(CRYPTO_INTERNAL);

View File

@@ -70,8 +70,8 @@ static const char *check[] = {
"des", "md5", "des3_ede", "rot13", "sha1", "sha224", "sha256", "sm3",
"blowfish", "twofish", "serpent", "sha384", "sha512", "md4", "aes",
"cast6", "arc4", "michael_mic", "deflate", "crc32c", "tea", "xtea",
"khazad", "wp512", "wp384", "wp256", "tnepres", "xeta", "fcrypt",
"camellia", "seed", "salsa20", "rmd128", "rmd160", "rmd256", "rmd320",
"khazad", "wp512", "wp384", "wp256", "xeta", "fcrypt",
"camellia", "seed", "rmd160",
"lzo", "lzo-rle", "cts", "sha3-224", "sha3-256", "sha3-384",
"sha3-512", "streebog256", "streebog512",
NULL
@@ -199,8 +199,8 @@ static int test_mb_aead_jiffies(struct test_mb_aead_data *data, int enc,
goto out;
}
pr_cont("%d operations in %d seconds (%ld bytes)\n",
bcount * num_mb, secs, (long)bcount * blen * num_mb);
pr_cont("%d operations in %d seconds (%llu bytes)\n",
bcount * num_mb, secs, (u64)bcount * blen * num_mb);
out:
kfree(rc);
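The pr_cont() changes in this and the following hunks replace a (long) cast and %ld with (u64) and %llu. The point, presumably, is that on 32-bit builds bcount * blen * num_mb is evaluated in 32-bit arithmetic and can overflow, while casting the first operand to u64 widens the whole multiplication. A userspace illustration of the difference (values are merely plausible, not measured):

/* Sketch: why the byte count is now computed and printed as a u64. */
#include <stdint.h>
#include <stdio.h>

int main(void)
{
	int bcount = 400000, blen = 4096, num_mb = 8;	/* plausible tcrypt-style numbers */

	uint32_t narrow = (uint32_t)bcount * blen * num_mb;	/* wraps modulo 2^32 */
	uint64_t wide   = (uint64_t)bcount * blen * num_mb;	/* exact */

	printf("32-bit arithmetic: %u bytes\n", narrow);
	printf("64-bit arithmetic: %llu bytes\n", (unsigned long long)wide);
	return 0;
}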
@@ -471,8 +471,8 @@ static int test_aead_jiffies(struct aead_request *req, int enc,
return ret;
}
printk("%d operations in %d seconds (%ld bytes)\n",
bcount, secs, (long)bcount * blen);
pr_cont("%d operations in %d seconds (%llu bytes)\n",
bcount, secs, (u64)bcount * blen);
return 0;
}
@@ -764,8 +764,8 @@ static int test_mb_ahash_jiffies(struct test_mb_ahash_data *data, int blen,
goto out;
}
pr_cont("%d operations in %d seconds (%ld bytes)\n",
bcount * num_mb, secs, (long)bcount * blen * num_mb);
pr_cont("%d operations in %d seconds (%llu bytes)\n",
bcount * num_mb, secs, (u64)bcount * blen * num_mb);
out:
kfree(rc);
@@ -1201,8 +1201,8 @@ static int test_mb_acipher_jiffies(struct test_mb_skcipher_data *data, int enc,
goto out;
}
pr_cont("%d operations in %d seconds (%ld bytes)\n",
bcount * num_mb, secs, (long)bcount * blen * num_mb);
pr_cont("%d operations in %d seconds (%llu bytes)\n",
bcount * num_mb, secs, (u64)bcount * blen * num_mb);
out:
kfree(rc);
@@ -1441,8 +1441,8 @@ static int test_acipher_jiffies(struct skcipher_request *req, int enc,
return ret;
}
pr_cont("%d operations in %d seconds (%ld bytes)\n",
bcount, secs, (long)bcount * blen);
pr_cont("%d operations in %d seconds (%llu bytes)\n",
bcount, secs, (u64)bcount * blen);
return 0;
}
@@ -1806,27 +1806,11 @@ static int do_test(const char *alg, u32 type, u32 mask, int m, u32 num_mb)
ret += tcrypt_test("wp256");
break;
case 25:
ret += tcrypt_test("ecb(tnepres)");
break;
case 26:
ret += tcrypt_test("ecb(anubis)");
ret += tcrypt_test("cbc(anubis)");
break;
case 27:
ret += tcrypt_test("tgr192");
break;
case 28:
ret += tcrypt_test("tgr160");
break;
case 29:
ret += tcrypt_test("tgr128");
break;
case 30:
ret += tcrypt_test("ecb(xeta)");
break;
@@ -1847,10 +1831,6 @@ static int do_test(const char *alg, u32 type, u32 mask, int m, u32 num_mb)
ret += tcrypt_test("sha224");
break;
case 34:
ret += tcrypt_test("salsa20");
break;
case 35:
ret += tcrypt_test("gcm(aes)");
break;
@@ -1867,22 +1847,10 @@ static int do_test(const char *alg, u32 type, u32 mask, int m, u32 num_mb)
ret += tcrypt_test("cts(cbc(aes))");
break;
case 39:
ret += tcrypt_test("rmd128");
break;
case 40:
ret += tcrypt_test("rmd160");
break;
case 41:
ret += tcrypt_test("rmd256");
break;
case 42:
ret += tcrypt_test("rmd320");
break;
case 43:
ret += tcrypt_test("ecb(seed)");
break;
@@ -1955,10 +1923,6 @@ static int do_test(const char *alg, u32 type, u32 mask, int m, u32 num_mb)
ret += tcrypt_test("xcbc(aes)");
break;
case 107:
ret += tcrypt_test("hmac(rmd128)");
break;
case 108:
ret += tcrypt_test("hmac(rmd160)");
break;
@@ -2181,11 +2145,6 @@ static int do_test(const char *alg, u32 type, u32 mask, int m, u32 num_mb)
speed_template_32_48_64);
break;
case 206:
test_cipher_speed("salsa20", ENCRYPT, sec, NULL, 0,
speed_template_16_32);
break;
case 207:
test_cipher_speed("ecb(serpent)", ENCRYPT, sec, NULL, 0,
speed_template_16_32);
@@ -2393,38 +2352,14 @@ static int do_test(const char *alg, u32 type, u32 mask, int m, u32 num_mb)
test_hash_speed("wp512", sec, generic_hash_speed_template);
if (mode > 300 && mode < 400) break;
fallthrough;
case 310:
test_hash_speed("tgr128", sec, generic_hash_speed_template);
if (mode > 300 && mode < 400) break;
fallthrough;
case 311:
test_hash_speed("tgr160", sec, generic_hash_speed_template);
if (mode > 300 && mode < 400) break;
fallthrough;
case 312:
test_hash_speed("tgr192", sec, generic_hash_speed_template);
if (mode > 300 && mode < 400) break;
fallthrough;
case 313:
test_hash_speed("sha224", sec, generic_hash_speed_template);
if (mode > 300 && mode < 400) break;
fallthrough;
case 314:
test_hash_speed("rmd128", sec, generic_hash_speed_template);
if (mode > 300 && mode < 400) break;
fallthrough;
case 315:
test_hash_speed("rmd160", sec, generic_hash_speed_template);
if (mode > 300 && mode < 400) break;
fallthrough;
case 316:
test_hash_speed("rmd256", sec, generic_hash_speed_template);
if (mode > 300 && mode < 400) break;
fallthrough;
case 317:
test_hash_speed("rmd320", sec, generic_hash_speed_template);
if (mode > 300 && mode < 400) break;
fallthrough;
case 318:
klen = 16;
test_hash_speed("ghash", sec, generic_hash_speed_template);
@@ -2517,38 +2452,14 @@ static int do_test(const char *alg, u32 type, u32 mask, int m, u32 num_mb)
test_ahash_speed("wp512", sec, generic_hash_speed_template);
if (mode > 400 && mode < 500) break;
fallthrough;
case 410:
test_ahash_speed("tgr128", sec, generic_hash_speed_template);
if (mode > 400 && mode < 500) break;
fallthrough;
case 411:
test_ahash_speed("tgr160", sec, generic_hash_speed_template);
if (mode > 400 && mode < 500) break;
fallthrough;
case 412:
test_ahash_speed("tgr192", sec, generic_hash_speed_template);
if (mode > 400 && mode < 500) break;
fallthrough;
case 413:
test_ahash_speed("sha224", sec, generic_hash_speed_template);
if (mode > 400 && mode < 500) break;
fallthrough;
case 414:
test_ahash_speed("rmd128", sec, generic_hash_speed_template);
if (mode > 400 && mode < 500) break;
fallthrough;
case 415:
test_ahash_speed("rmd160", sec, generic_hash_speed_template);
if (mode > 400 && mode < 500) break;
fallthrough;
case 416:
test_ahash_speed("rmd256", sec, generic_hash_speed_template);
if (mode > 400 && mode < 500) break;
fallthrough;
case 417:
test_ahash_speed("rmd320", sec, generic_hash_speed_template);
if (mode > 400 && mode < 500) break;
fallthrough;
case 418:
test_ahash_speed("sha3-224", sec, generic_hash_speed_template);
if (mode > 400 && mode < 500) break;

View File

@@ -33,10 +33,13 @@
#include <crypto/akcipher.h>
#include <crypto/kpp.h>
#include <crypto/acompress.h>
#include <crypto/internal/cipher.h>
#include <crypto/internal/simd.h>
#include "internal.h"
MODULE_IMPORT_NS(CRYPTO_INTERNAL);
static bool notests;
module_param(notests, bool, 0644);
MODULE_PARM_DESC(notests, "disable crypto self-tests");
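The two lines added above, the <crypto/internal/cipher.h> include and MODULE_IMPORT_NS(CRYPTO_INTERNAL), are what let this file keep linking against the crypto_cipher symbols now that they are exported into the CRYPTO_INTERNAL namespace. A rough sketch of the same pattern in a hypothetical out-of-tree user (illustrative only, not from this patch set):

/*
 * Hypothetical minimal user of the internal crypto_cipher API.
 * The namespace import is the piece this hunk adds to the self-test code.
 */
#include <crypto/internal/cipher.h>
#include <linux/err.h>
#include <linux/module.h>

MODULE_IMPORT_NS(CRYPTO_INTERNAL);	/* required to use the namespaced cipher symbols */

static int __init cipher_user_init(void)
{
	static const u8 key[16] = { 0 };
	u8 block[16] = { 0 };
	struct crypto_cipher *tfm;
	int err;

	tfm = crypto_alloc_cipher("aes", 0, 0);
	if (IS_ERR(tfm))
		return PTR_ERR(tfm);

	err = crypto_cipher_setkey(tfm, key, sizeof(key));
	if (!err)
		crypto_cipher_encrypt_one(tfm, block, block);	/* one 16-byte block */

	crypto_free_cipher(tfm);
	return err;
}

static void __exit cipher_user_exit(void)
{
}

module_init(cipher_user_init);
module_exit(cipher_user_exit);
MODULE_LICENSE("GPL");

Without the import, modpost complains that the module uses symbols from a namespace it does not import.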
@@ -4873,12 +4876,6 @@ static const struct alg_test_desc alg_test_descs[] = {
.suite = {
.cipher = __VECS(tea_tv_template)
}
}, {
.alg = "ecb(tnepres)",
.test = alg_test_skcipher,
.suite = {
.cipher = __VECS(tnepres_tv_template)
}
}, {
.alg = "ecb(twofish)",
.test = alg_test_skcipher,
@@ -4954,12 +4951,6 @@ static const struct alg_test_desc alg_test_descs[] = {
.suite = {
.hash = __VECS(hmac_md5_tv_template)
}
}, {
.alg = "hmac(rmd128)",
.test = alg_test_hash,
.suite = {
.hash = __VECS(hmac_rmd128_tv_template)
}
}, {
.alg = "hmac(rmd160)",
.test = alg_test_hash,
@@ -5272,30 +5263,12 @@ static const struct alg_test_desc alg_test_descs[] = {
.aad_iv = 1,
}
}
}, {
.alg = "rmd128",
.test = alg_test_hash,
.suite = {
.hash = __VECS(rmd128_tv_template)
}
}, {
.alg = "rmd160",
.test = alg_test_hash,
.suite = {
.hash = __VECS(rmd160_tv_template)
}
}, {
.alg = "rmd256",
.test = alg_test_hash,
.suite = {
.hash = __VECS(rmd256_tv_template)
}
}, {
.alg = "rmd320",
.test = alg_test_hash,
.suite = {
.hash = __VECS(rmd320_tv_template)
}
}, {
.alg = "rsa",
.test = alg_test_akcipher,
@@ -5303,12 +5276,6 @@ static const struct alg_test_desc alg_test_descs[] = {
.suite = {
.akcipher = __VECS(rsa_tv_template)
}
}, {
.alg = "salsa20",
.test = alg_test_skcipher,
.suite = {
.cipher = __VECS(salsa20_stream_tv_template)
}
}, {
.alg = "sha1",
.test = alg_test_hash,
@@ -5396,24 +5363,6 @@ static const struct alg_test_desc alg_test_descs[] = {
.suite = {
.hash = __VECS(streebog512_tv_template)
}
}, {
.alg = "tgr128",
.test = alg_test_hash,
.suite = {
.hash = __VECS(tgr128_tv_template)
}
}, {
.alg = "tgr160",
.test = alg_test_hash,
.suite = {
.hash = __VECS(tgr160_tv_template)
}
}, {
.alg = "tgr192",
.test = alg_test_hash,
.suite = {
.hash = __VECS(tgr192_tv_template)
}
}, {
.alg = "vmac64(aes)",
.test = alg_test_hash,

File diff suppressed because it is too large

View File

@@ -1,682 +0,0 @@
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* Cryptographic API.
*
* Tiger hashing Algorithm
*
* Copyright (C) 1998 Free Software Foundation, Inc.
*
* The Tiger algorithm was developed by Ross Anderson and Eli Biham.
* It was optimized for 64-bit processors while still delievering
* decent performance on 32 and 16-bit processors.
*
* This version is derived from the GnuPG implementation and the
* Tiger-Perl interface written by Rafael Sevilla
*
* Adapted for Linux Kernel Crypto by Aaron Grothe
* ajgrothe@yahoo.com, February 22, 2005
*/
#include <crypto/internal/hash.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/mm.h>
#include <linux/types.h>
#include <asm/byteorder.h>
#include <asm/unaligned.h>
#define TGR192_DIGEST_SIZE 24
#define TGR160_DIGEST_SIZE 20
#define TGR128_DIGEST_SIZE 16
#define TGR192_BLOCK_SIZE 64
struct tgr192_ctx {
u64 a, b, c;
u8 hash[64];
int count;
u32 nblocks;
};
static const u64 sbox1[256] = {
0x02aab17cf7e90c5eULL, 0xac424b03e243a8ecULL, 0x72cd5be30dd5fcd3ULL,
0x6d019b93f6f97f3aULL, 0xcd9978ffd21f9193ULL, 0x7573a1c9708029e2ULL,
0xb164326b922a83c3ULL, 0x46883eee04915870ULL, 0xeaace3057103ece6ULL,
0xc54169b808a3535cULL, 0x4ce754918ddec47cULL, 0x0aa2f4dfdc0df40cULL,
0x10b76f18a74dbefaULL, 0xc6ccb6235ad1ab6aULL, 0x13726121572fe2ffULL,
0x1a488c6f199d921eULL, 0x4bc9f9f4da0007caULL, 0x26f5e6f6e85241c7ULL,
0x859079dbea5947b6ULL, 0x4f1885c5c99e8c92ULL, 0xd78e761ea96f864bULL,
0x8e36428c52b5c17dULL, 0x69cf6827373063c1ULL, 0xb607c93d9bb4c56eULL,
0x7d820e760e76b5eaULL, 0x645c9cc6f07fdc42ULL, 0xbf38a078243342e0ULL,
0x5f6b343c9d2e7d04ULL, 0xf2c28aeb600b0ec6ULL, 0x6c0ed85f7254bcacULL,
0x71592281a4db4fe5ULL, 0x1967fa69ce0fed9fULL, 0xfd5293f8b96545dbULL,
0xc879e9d7f2a7600bULL, 0x860248920193194eULL, 0xa4f9533b2d9cc0b3ULL,
0x9053836c15957613ULL, 0xdb6dcf8afc357bf1ULL, 0x18beea7a7a370f57ULL,
0x037117ca50b99066ULL, 0x6ab30a9774424a35ULL, 0xf4e92f02e325249bULL,
0x7739db07061ccae1ULL, 0xd8f3b49ceca42a05ULL, 0xbd56be3f51382f73ULL,
0x45faed5843b0bb28ULL, 0x1c813d5c11bf1f83ULL, 0x8af0e4b6d75fa169ULL,
0x33ee18a487ad9999ULL, 0x3c26e8eab1c94410ULL, 0xb510102bc0a822f9ULL,
0x141eef310ce6123bULL, 0xfc65b90059ddb154ULL, 0xe0158640c5e0e607ULL,
0x884e079826c3a3cfULL, 0x930d0d9523c535fdULL, 0x35638d754e9a2b00ULL,
0x4085fccf40469dd5ULL, 0xc4b17ad28be23a4cULL, 0xcab2f0fc6a3e6a2eULL,
0x2860971a6b943fcdULL, 0x3dde6ee212e30446ULL, 0x6222f32ae01765aeULL,
0x5d550bb5478308feULL, 0xa9efa98da0eda22aULL, 0xc351a71686c40da7ULL,
0x1105586d9c867c84ULL, 0xdcffee85fda22853ULL, 0xccfbd0262c5eef76ULL,
0xbaf294cb8990d201ULL, 0xe69464f52afad975ULL, 0x94b013afdf133e14ULL,
0x06a7d1a32823c958ULL, 0x6f95fe5130f61119ULL, 0xd92ab34e462c06c0ULL,
0xed7bde33887c71d2ULL, 0x79746d6e6518393eULL, 0x5ba419385d713329ULL,
0x7c1ba6b948a97564ULL, 0x31987c197bfdac67ULL, 0xde6c23c44b053d02ULL,
0x581c49fed002d64dULL, 0xdd474d6338261571ULL, 0xaa4546c3e473d062ULL,
0x928fce349455f860ULL, 0x48161bbacaab94d9ULL, 0x63912430770e6f68ULL,
0x6ec8a5e602c6641cULL, 0x87282515337ddd2bULL, 0x2cda6b42034b701bULL,
0xb03d37c181cb096dULL, 0xe108438266c71c6fULL, 0x2b3180c7eb51b255ULL,
0xdf92b82f96c08bbcULL, 0x5c68c8c0a632f3baULL, 0x5504cc861c3d0556ULL,
0xabbfa4e55fb26b8fULL, 0x41848b0ab3baceb4ULL, 0xb334a273aa445d32ULL,
0xbca696f0a85ad881ULL, 0x24f6ec65b528d56cULL, 0x0ce1512e90f4524aULL,
0x4e9dd79d5506d35aULL, 0x258905fac6ce9779ULL, 0x2019295b3e109b33ULL,
0xf8a9478b73a054ccULL, 0x2924f2f934417eb0ULL, 0x3993357d536d1bc4ULL,
0x38a81ac21db6ff8bULL, 0x47c4fbf17d6016bfULL, 0x1e0faadd7667e3f5ULL,
0x7abcff62938beb96ULL, 0xa78dad948fc179c9ULL, 0x8f1f98b72911e50dULL,
0x61e48eae27121a91ULL, 0x4d62f7ad31859808ULL, 0xeceba345ef5ceaebULL,
0xf5ceb25ebc9684ceULL, 0xf633e20cb7f76221ULL, 0xa32cdf06ab8293e4ULL,
0x985a202ca5ee2ca4ULL, 0xcf0b8447cc8a8fb1ULL, 0x9f765244979859a3ULL,
0xa8d516b1a1240017ULL, 0x0bd7ba3ebb5dc726ULL, 0xe54bca55b86adb39ULL,
0x1d7a3afd6c478063ULL, 0x519ec608e7669eddULL, 0x0e5715a2d149aa23ULL,
0x177d4571848ff194ULL, 0xeeb55f3241014c22ULL, 0x0f5e5ca13a6e2ec2ULL,
0x8029927b75f5c361ULL, 0xad139fabc3d6e436ULL, 0x0d5df1a94ccf402fULL,
0x3e8bd948bea5dfc8ULL, 0xa5a0d357bd3ff77eULL, 0xa2d12e251f74f645ULL,
0x66fd9e525e81a082ULL, 0x2e0c90ce7f687a49ULL, 0xc2e8bcbeba973bc5ULL,
0x000001bce509745fULL, 0x423777bbe6dab3d6ULL, 0xd1661c7eaef06eb5ULL,
0xa1781f354daacfd8ULL, 0x2d11284a2b16affcULL, 0xf1fc4f67fa891d1fULL,
0x73ecc25dcb920adaULL, 0xae610c22c2a12651ULL, 0x96e0a810d356b78aULL,
0x5a9a381f2fe7870fULL, 0xd5ad62ede94e5530ULL, 0xd225e5e8368d1427ULL,
0x65977b70c7af4631ULL, 0x99f889b2de39d74fULL, 0x233f30bf54e1d143ULL,
0x9a9675d3d9a63c97ULL, 0x5470554ff334f9a8ULL, 0x166acb744a4f5688ULL,
0x70c74caab2e4aeadULL, 0xf0d091646f294d12ULL, 0x57b82a89684031d1ULL,
0xefd95a5a61be0b6bULL, 0x2fbd12e969f2f29aULL, 0x9bd37013feff9fe8ULL,
0x3f9b0404d6085a06ULL, 0x4940c1f3166cfe15ULL, 0x09542c4dcdf3defbULL,
0xb4c5218385cd5ce3ULL, 0xc935b7dc4462a641ULL, 0x3417f8a68ed3b63fULL,
0xb80959295b215b40ULL, 0xf99cdaef3b8c8572ULL, 0x018c0614f8fcb95dULL,
0x1b14accd1a3acdf3ULL, 0x84d471f200bb732dULL, 0xc1a3110e95e8da16ULL,
0x430a7220bf1a82b8ULL, 0xb77e090d39df210eULL, 0x5ef4bd9f3cd05e9dULL,
0x9d4ff6da7e57a444ULL, 0xda1d60e183d4a5f8ULL, 0xb287c38417998e47ULL,
0xfe3edc121bb31886ULL, 0xc7fe3ccc980ccbefULL, 0xe46fb590189bfd03ULL,
0x3732fd469a4c57dcULL, 0x7ef700a07cf1ad65ULL, 0x59c64468a31d8859ULL,
0x762fb0b4d45b61f6ULL, 0x155baed099047718ULL, 0x68755e4c3d50baa6ULL,
0xe9214e7f22d8b4dfULL, 0x2addbf532eac95f4ULL, 0x32ae3909b4bd0109ULL,
0x834df537b08e3450ULL, 0xfa209da84220728dULL, 0x9e691d9b9efe23f7ULL,
0x0446d288c4ae8d7fULL, 0x7b4cc524e169785bULL, 0x21d87f0135ca1385ULL,
0xcebb400f137b8aa5ULL, 0x272e2b66580796beULL, 0x3612264125c2b0deULL,
0x057702bdad1efbb2ULL, 0xd4babb8eacf84be9ULL, 0x91583139641bc67bULL,
0x8bdc2de08036e024ULL, 0x603c8156f49f68edULL, 0xf7d236f7dbef5111ULL,
0x9727c4598ad21e80ULL, 0xa08a0896670a5fd7ULL, 0xcb4a8f4309eba9cbULL,
0x81af564b0f7036a1ULL, 0xc0b99aa778199abdULL, 0x959f1ec83fc8e952ULL,
0x8c505077794a81b9ULL, 0x3acaaf8f056338f0ULL, 0x07b43f50627a6778ULL,
0x4a44ab49f5eccc77ULL, 0x3bc3d6e4b679ee98ULL, 0x9cc0d4d1cf14108cULL,
0x4406c00b206bc8a0ULL, 0x82a18854c8d72d89ULL, 0x67e366b35c3c432cULL,
0xb923dd61102b37f2ULL, 0x56ab2779d884271dULL, 0xbe83e1b0ff1525afULL,
0xfb7c65d4217e49a9ULL, 0x6bdbe0e76d48e7d4ULL, 0x08df828745d9179eULL,
0x22ea6a9add53bd34ULL, 0xe36e141c5622200aULL, 0x7f805d1b8cb750eeULL,
0xafe5c7a59f58e837ULL, 0xe27f996a4fb1c23cULL, 0xd3867dfb0775f0d0ULL,
0xd0e673de6e88891aULL, 0x123aeb9eafb86c25ULL, 0x30f1d5d5c145b895ULL,
0xbb434a2dee7269e7ULL, 0x78cb67ecf931fa38ULL, 0xf33b0372323bbf9cULL,
0x52d66336fb279c74ULL, 0x505f33ac0afb4eaaULL, 0xe8a5cd99a2cce187ULL,
0x534974801e2d30bbULL, 0x8d2d5711d5876d90ULL, 0x1f1a412891bc038eULL,
0xd6e2e71d82e56648ULL, 0x74036c3a497732b7ULL, 0x89b67ed96361f5abULL,
0xffed95d8f1ea02a2ULL, 0xe72b3bd61464d43dULL, 0xa6300f170bdc4820ULL,
0xebc18760ed78a77aULL
};
static const u64 sbox2[256] = {
0xe6a6be5a05a12138ULL, 0xb5a122a5b4f87c98ULL, 0x563c6089140b6990ULL,
0x4c46cb2e391f5dd5ULL, 0xd932addbc9b79434ULL, 0x08ea70e42015aff5ULL,
0xd765a6673e478cf1ULL, 0xc4fb757eab278d99ULL, 0xdf11c6862d6e0692ULL,
0xddeb84f10d7f3b16ULL, 0x6f2ef604a665ea04ULL, 0x4a8e0f0ff0e0dfb3ULL,
0xa5edeef83dbcba51ULL, 0xfc4f0a2a0ea4371eULL, 0xe83e1da85cb38429ULL,
0xdc8ff882ba1b1ce2ULL, 0xcd45505e8353e80dULL, 0x18d19a00d4db0717ULL,
0x34a0cfeda5f38101ULL, 0x0be77e518887caf2ULL, 0x1e341438b3c45136ULL,
0xe05797f49089ccf9ULL, 0xffd23f9df2591d14ULL, 0x543dda228595c5cdULL,
0x661f81fd99052a33ULL, 0x8736e641db0f7b76ULL, 0x15227725418e5307ULL,
0xe25f7f46162eb2faULL, 0x48a8b2126c13d9feULL, 0xafdc541792e76eeaULL,
0x03d912bfc6d1898fULL, 0x31b1aafa1b83f51bULL, 0xf1ac2796e42ab7d9ULL,
0x40a3a7d7fcd2ebacULL, 0x1056136d0afbbcc5ULL, 0x7889e1dd9a6d0c85ULL,
0xd33525782a7974aaULL, 0xa7e25d09078ac09bULL, 0xbd4138b3eac6edd0ULL,
0x920abfbe71eb9e70ULL, 0xa2a5d0f54fc2625cULL, 0xc054e36b0b1290a3ULL,
0xf6dd59ff62fe932bULL, 0x3537354511a8ac7dULL, 0xca845e9172fadcd4ULL,
0x84f82b60329d20dcULL, 0x79c62ce1cd672f18ULL, 0x8b09a2add124642cULL,
0xd0c1e96a19d9e726ULL, 0x5a786a9b4ba9500cULL, 0x0e020336634c43f3ULL,
0xc17b474aeb66d822ULL, 0x6a731ae3ec9baac2ULL, 0x8226667ae0840258ULL,
0x67d4567691caeca5ULL, 0x1d94155c4875adb5ULL, 0x6d00fd985b813fdfULL,
0x51286efcb774cd06ULL, 0x5e8834471fa744afULL, 0xf72ca0aee761ae2eULL,
0xbe40e4cdaee8e09aULL, 0xe9970bbb5118f665ULL, 0x726e4beb33df1964ULL,
0x703b000729199762ULL, 0x4631d816f5ef30a7ULL, 0xb880b5b51504a6beULL,
0x641793c37ed84b6cULL, 0x7b21ed77f6e97d96ULL, 0x776306312ef96b73ULL,
0xae528948e86ff3f4ULL, 0x53dbd7f286a3f8f8ULL, 0x16cadce74cfc1063ULL,
0x005c19bdfa52c6ddULL, 0x68868f5d64d46ad3ULL, 0x3a9d512ccf1e186aULL,
0x367e62c2385660aeULL, 0xe359e7ea77dcb1d7ULL, 0x526c0773749abe6eULL,
0x735ae5f9d09f734bULL, 0x493fc7cc8a558ba8ULL, 0xb0b9c1533041ab45ULL,
0x321958ba470a59bdULL, 0x852db00b5f46c393ULL, 0x91209b2bd336b0e5ULL,
0x6e604f7d659ef19fULL, 0xb99a8ae2782ccb24ULL, 0xccf52ab6c814c4c7ULL,
0x4727d9afbe11727bULL, 0x7e950d0c0121b34dULL, 0x756f435670ad471fULL,
0xf5add442615a6849ULL, 0x4e87e09980b9957aULL, 0x2acfa1df50aee355ULL,
0xd898263afd2fd556ULL, 0xc8f4924dd80c8fd6ULL, 0xcf99ca3d754a173aULL,
0xfe477bacaf91bf3cULL, 0xed5371f6d690c12dULL, 0x831a5c285e687094ULL,
0xc5d3c90a3708a0a4ULL, 0x0f7f903717d06580ULL, 0x19f9bb13b8fdf27fULL,
0xb1bd6f1b4d502843ULL, 0x1c761ba38fff4012ULL, 0x0d1530c4e2e21f3bULL,
0x8943ce69a7372c8aULL, 0xe5184e11feb5ce66ULL, 0x618bdb80bd736621ULL,
0x7d29bad68b574d0bULL, 0x81bb613e25e6fe5bULL, 0x071c9c10bc07913fULL,
0xc7beeb7909ac2d97ULL, 0xc3e58d353bc5d757ULL, 0xeb017892f38f61e8ULL,
0xd4effb9c9b1cc21aULL, 0x99727d26f494f7abULL, 0xa3e063a2956b3e03ULL,
0x9d4a8b9a4aa09c30ULL, 0x3f6ab7d500090fb4ULL, 0x9cc0f2a057268ac0ULL,
0x3dee9d2dedbf42d1ULL, 0x330f49c87960a972ULL, 0xc6b2720287421b41ULL,
0x0ac59ec07c00369cULL, 0xef4eac49cb353425ULL, 0xf450244eef0129d8ULL,
0x8acc46e5caf4deb6ULL, 0x2ffeab63989263f7ULL, 0x8f7cb9fe5d7a4578ULL,
0x5bd8f7644e634635ULL, 0x427a7315bf2dc900ULL, 0x17d0c4aa2125261cULL,
0x3992486c93518e50ULL, 0xb4cbfee0a2d7d4c3ULL, 0x7c75d6202c5ddd8dULL,
0xdbc295d8e35b6c61ULL, 0x60b369d302032b19ULL, 0xce42685fdce44132ULL,
0x06f3ddb9ddf65610ULL, 0x8ea4d21db5e148f0ULL, 0x20b0fce62fcd496fULL,
0x2c1b912358b0ee31ULL, 0xb28317b818f5a308ULL, 0xa89c1e189ca6d2cfULL,
0x0c6b18576aaadbc8ULL, 0xb65deaa91299fae3ULL, 0xfb2b794b7f1027e7ULL,
0x04e4317f443b5bebULL, 0x4b852d325939d0a6ULL, 0xd5ae6beefb207ffcULL,
0x309682b281c7d374ULL, 0xbae309a194c3b475ULL, 0x8cc3f97b13b49f05ULL,
0x98a9422ff8293967ULL, 0x244b16b01076ff7cULL, 0xf8bf571c663d67eeULL,
0x1f0d6758eee30da1ULL, 0xc9b611d97adeb9b7ULL, 0xb7afd5887b6c57a2ULL,
0x6290ae846b984fe1ULL, 0x94df4cdeacc1a5fdULL, 0x058a5bd1c5483affULL,
0x63166cc142ba3c37ULL, 0x8db8526eb2f76f40ULL, 0xe10880036f0d6d4eULL,
0x9e0523c9971d311dULL, 0x45ec2824cc7cd691ULL, 0x575b8359e62382c9ULL,
0xfa9e400dc4889995ULL, 0xd1823ecb45721568ULL, 0xdafd983b8206082fULL,
0xaa7d29082386a8cbULL, 0x269fcd4403b87588ULL, 0x1b91f5f728bdd1e0ULL,
0xe4669f39040201f6ULL, 0x7a1d7c218cf04adeULL, 0x65623c29d79ce5ceULL,
0x2368449096c00bb1ULL, 0xab9bf1879da503baULL, 0xbc23ecb1a458058eULL,
0x9a58df01bb401eccULL, 0xa070e868a85f143dULL, 0x4ff188307df2239eULL,
0x14d565b41a641183ULL, 0xee13337452701602ULL, 0x950e3dcf3f285e09ULL,
0x59930254b9c80953ULL, 0x3bf299408930da6dULL, 0xa955943f53691387ULL,
0xa15edecaa9cb8784ULL, 0x29142127352be9a0ULL, 0x76f0371fff4e7afbULL,
0x0239f450274f2228ULL, 0xbb073af01d5e868bULL, 0xbfc80571c10e96c1ULL,
0xd267088568222e23ULL, 0x9671a3d48e80b5b0ULL, 0x55b5d38ae193bb81ULL,
0x693ae2d0a18b04b8ULL, 0x5c48b4ecadd5335fULL, 0xfd743b194916a1caULL,
0x2577018134be98c4ULL, 0xe77987e83c54a4adULL, 0x28e11014da33e1b9ULL,
0x270cc59e226aa213ULL, 0x71495f756d1a5f60ULL, 0x9be853fb60afef77ULL,
0xadc786a7f7443dbfULL, 0x0904456173b29a82ULL, 0x58bc7a66c232bd5eULL,
0xf306558c673ac8b2ULL, 0x41f639c6b6c9772aULL, 0x216defe99fda35daULL,
0x11640cc71c7be615ULL, 0x93c43694565c5527ULL, 0xea038e6246777839ULL,
0xf9abf3ce5a3e2469ULL, 0x741e768d0fd312d2ULL, 0x0144b883ced652c6ULL,
0xc20b5a5ba33f8552ULL, 0x1ae69633c3435a9dULL, 0x97a28ca4088cfdecULL,
0x8824a43c1e96f420ULL, 0x37612fa66eeea746ULL, 0x6b4cb165f9cf0e5aULL,
0x43aa1c06a0abfb4aULL, 0x7f4dc26ff162796bULL, 0x6cbacc8e54ed9b0fULL,
0xa6b7ffefd2bb253eULL, 0x2e25bc95b0a29d4fULL, 0x86d6a58bdef1388cULL,
0xded74ac576b6f054ULL, 0x8030bdbc2b45805dULL, 0x3c81af70e94d9289ULL,
0x3eff6dda9e3100dbULL, 0xb38dc39fdfcc8847ULL, 0x123885528d17b87eULL,
0xf2da0ed240b1b642ULL, 0x44cefadcd54bf9a9ULL, 0x1312200e433c7ee6ULL,
0x9ffcc84f3a78c748ULL, 0xf0cd1f72248576bbULL, 0xec6974053638cfe4ULL,
0x2ba7b67c0cec4e4cULL, 0xac2f4df3e5ce32edULL, 0xcb33d14326ea4c11ULL,
0xa4e9044cc77e58bcULL, 0x5f513293d934fcefULL, 0x5dc9645506e55444ULL,
0x50de418f317de40aULL, 0x388cb31a69dde259ULL, 0x2db4a83455820a86ULL,
0x9010a91e84711ae9ULL, 0x4df7f0b7b1498371ULL, 0xd62a2eabc0977179ULL,
0x22fac097aa8d5c0eULL
};
static const u64 sbox3[256] = {
0xf49fcc2ff1daf39bULL, 0x487fd5c66ff29281ULL, 0xe8a30667fcdca83fULL,
0x2c9b4be3d2fcce63ULL, 0xda3ff74b93fbbbc2ULL, 0x2fa165d2fe70ba66ULL,
0xa103e279970e93d4ULL, 0xbecdec77b0e45e71ULL, 0xcfb41e723985e497ULL,
0xb70aaa025ef75017ULL, 0xd42309f03840b8e0ULL, 0x8efc1ad035898579ULL,
0x96c6920be2b2abc5ULL, 0x66af4163375a9172ULL, 0x2174abdcca7127fbULL,
0xb33ccea64a72ff41ULL, 0xf04a4933083066a5ULL, 0x8d970acdd7289af5ULL,
0x8f96e8e031c8c25eULL, 0xf3fec02276875d47ULL, 0xec7bf310056190ddULL,
0xf5adb0aebb0f1491ULL, 0x9b50f8850fd58892ULL, 0x4975488358b74de8ULL,
0xa3354ff691531c61ULL, 0x0702bbe481d2c6eeULL, 0x89fb24057deded98ULL,
0xac3075138596e902ULL, 0x1d2d3580172772edULL, 0xeb738fc28e6bc30dULL,
0x5854ef8f63044326ULL, 0x9e5c52325add3bbeULL, 0x90aa53cf325c4623ULL,
0xc1d24d51349dd067ULL, 0x2051cfeea69ea624ULL, 0x13220f0a862e7e4fULL,
0xce39399404e04864ULL, 0xd9c42ca47086fcb7ULL, 0x685ad2238a03e7ccULL,
0x066484b2ab2ff1dbULL, 0xfe9d5d70efbf79ecULL, 0x5b13b9dd9c481854ULL,
0x15f0d475ed1509adULL, 0x0bebcd060ec79851ULL, 0xd58c6791183ab7f8ULL,
0xd1187c5052f3eee4ULL, 0xc95d1192e54e82ffULL, 0x86eea14cb9ac6ca2ULL,
0x3485beb153677d5dULL, 0xdd191d781f8c492aULL, 0xf60866baa784ebf9ULL,
0x518f643ba2d08c74ULL, 0x8852e956e1087c22ULL, 0xa768cb8dc410ae8dULL,
0x38047726bfec8e1aULL, 0xa67738b4cd3b45aaULL, 0xad16691cec0dde19ULL,
0xc6d4319380462e07ULL, 0xc5a5876d0ba61938ULL, 0x16b9fa1fa58fd840ULL,
0x188ab1173ca74f18ULL, 0xabda2f98c99c021fULL, 0x3e0580ab134ae816ULL,
0x5f3b05b773645abbULL, 0x2501a2be5575f2f6ULL, 0x1b2f74004e7e8ba9ULL,
0x1cd7580371e8d953ULL, 0x7f6ed89562764e30ULL, 0xb15926ff596f003dULL,
0x9f65293da8c5d6b9ULL, 0x6ecef04dd690f84cULL, 0x4782275fff33af88ULL,
0xe41433083f820801ULL, 0xfd0dfe409a1af9b5ULL, 0x4325a3342cdb396bULL,
0x8ae77e62b301b252ULL, 0xc36f9e9f6655615aULL, 0x85455a2d92d32c09ULL,
0xf2c7dea949477485ULL, 0x63cfb4c133a39ebaULL, 0x83b040cc6ebc5462ULL,
0x3b9454c8fdb326b0ULL, 0x56f56a9e87ffd78cULL, 0x2dc2940d99f42bc6ULL,
0x98f7df096b096e2dULL, 0x19a6e01e3ad852bfULL, 0x42a99ccbdbd4b40bULL,
0xa59998af45e9c559ULL, 0x366295e807d93186ULL, 0x6b48181bfaa1f773ULL,
0x1fec57e2157a0a1dULL, 0x4667446af6201ad5ULL, 0xe615ebcacfb0f075ULL,
0xb8f31f4f68290778ULL, 0x22713ed6ce22d11eULL, 0x3057c1a72ec3c93bULL,
0xcb46acc37c3f1f2fULL, 0xdbb893fd02aaf50eULL, 0x331fd92e600b9fcfULL,
0xa498f96148ea3ad6ULL, 0xa8d8426e8b6a83eaULL, 0xa089b274b7735cdcULL,
0x87f6b3731e524a11ULL, 0x118808e5cbc96749ULL, 0x9906e4c7b19bd394ULL,
0xafed7f7e9b24a20cULL, 0x6509eadeeb3644a7ULL, 0x6c1ef1d3e8ef0edeULL,
0xb9c97d43e9798fb4ULL, 0xa2f2d784740c28a3ULL, 0x7b8496476197566fULL,
0x7a5be3e6b65f069dULL, 0xf96330ed78be6f10ULL, 0xeee60de77a076a15ULL,
0x2b4bee4aa08b9bd0ULL, 0x6a56a63ec7b8894eULL, 0x02121359ba34fef4ULL,
0x4cbf99f8283703fcULL, 0x398071350caf30c8ULL, 0xd0a77a89f017687aULL,
0xf1c1a9eb9e423569ULL, 0x8c7976282dee8199ULL, 0x5d1737a5dd1f7abdULL,
0x4f53433c09a9fa80ULL, 0xfa8b0c53df7ca1d9ULL, 0x3fd9dcbc886ccb77ULL,
0xc040917ca91b4720ULL, 0x7dd00142f9d1dcdfULL, 0x8476fc1d4f387b58ULL,
0x23f8e7c5f3316503ULL, 0x032a2244e7e37339ULL, 0x5c87a5d750f5a74bULL,
0x082b4cc43698992eULL, 0xdf917becb858f63cULL, 0x3270b8fc5bf86ddaULL,
0x10ae72bb29b5dd76ULL, 0x576ac94e7700362bULL, 0x1ad112dac61efb8fULL,
0x691bc30ec5faa427ULL, 0xff246311cc327143ULL, 0x3142368e30e53206ULL,
0x71380e31e02ca396ULL, 0x958d5c960aad76f1ULL, 0xf8d6f430c16da536ULL,
0xc8ffd13f1be7e1d2ULL, 0x7578ae66004ddbe1ULL, 0x05833f01067be646ULL,
0xbb34b5ad3bfe586dULL, 0x095f34c9a12b97f0ULL, 0x247ab64525d60ca8ULL,
0xdcdbc6f3017477d1ULL, 0x4a2e14d4decad24dULL, 0xbdb5e6d9be0a1eebULL,
0x2a7e70f7794301abULL, 0xdef42d8a270540fdULL, 0x01078ec0a34c22c1ULL,
0xe5de511af4c16387ULL, 0x7ebb3a52bd9a330aULL, 0x77697857aa7d6435ULL,
0x004e831603ae4c32ULL, 0xe7a21020ad78e312ULL, 0x9d41a70c6ab420f2ULL,
0x28e06c18ea1141e6ULL, 0xd2b28cbd984f6b28ULL, 0x26b75f6c446e9d83ULL,
0xba47568c4d418d7fULL, 0xd80badbfe6183d8eULL, 0x0e206d7f5f166044ULL,
0xe258a43911cbca3eULL, 0x723a1746b21dc0bcULL, 0xc7caa854f5d7cdd3ULL,
0x7cac32883d261d9cULL, 0x7690c26423ba942cULL, 0x17e55524478042b8ULL,
0xe0be477656a2389fULL, 0x4d289b5e67ab2da0ULL, 0x44862b9c8fbbfd31ULL,
0xb47cc8049d141365ULL, 0x822c1b362b91c793ULL, 0x4eb14655fb13dfd8ULL,
0x1ecbba0714e2a97bULL, 0x6143459d5cde5f14ULL, 0x53a8fbf1d5f0ac89ULL,
0x97ea04d81c5e5b00ULL, 0x622181a8d4fdb3f3ULL, 0xe9bcd341572a1208ULL,
0x1411258643cce58aULL, 0x9144c5fea4c6e0a4ULL, 0x0d33d06565cf620fULL,
0x54a48d489f219ca1ULL, 0xc43e5eac6d63c821ULL, 0xa9728b3a72770dafULL,
0xd7934e7b20df87efULL, 0xe35503b61a3e86e5ULL, 0xcae321fbc819d504ULL,
0x129a50b3ac60bfa6ULL, 0xcd5e68ea7e9fb6c3ULL, 0xb01c90199483b1c7ULL,
0x3de93cd5c295376cULL, 0xaed52edf2ab9ad13ULL, 0x2e60f512c0a07884ULL,
0xbc3d86a3e36210c9ULL, 0x35269d9b163951ceULL, 0x0c7d6e2ad0cdb5faULL,
0x59e86297d87f5733ULL, 0x298ef221898db0e7ULL, 0x55000029d1a5aa7eULL,
0x8bc08ae1b5061b45ULL, 0xc2c31c2b6c92703aULL, 0x94cc596baf25ef42ULL,
0x0a1d73db22540456ULL, 0x04b6a0f9d9c4179aULL, 0xeffdafa2ae3d3c60ULL,
0xf7c8075bb49496c4ULL, 0x9cc5c7141d1cd4e3ULL, 0x78bd1638218e5534ULL,
0xb2f11568f850246aULL, 0xedfabcfa9502bc29ULL, 0x796ce5f2da23051bULL,
0xaae128b0dc93537cULL, 0x3a493da0ee4b29aeULL, 0xb5df6b2c416895d7ULL,
0xfcabbd25122d7f37ULL, 0x70810b58105dc4b1ULL, 0xe10fdd37f7882a90ULL,
0x524dcab5518a3f5cULL, 0x3c9e85878451255bULL, 0x4029828119bd34e2ULL,
0x74a05b6f5d3ceccbULL, 0xb610021542e13ecaULL, 0x0ff979d12f59e2acULL,
0x6037da27e4f9cc50ULL, 0x5e92975a0df1847dULL, 0xd66de190d3e623feULL,
0x5032d6b87b568048ULL, 0x9a36b7ce8235216eULL, 0x80272a7a24f64b4aULL,
0x93efed8b8c6916f7ULL, 0x37ddbff44cce1555ULL, 0x4b95db5d4b99bd25ULL,
0x92d3fda169812fc0ULL, 0xfb1a4a9a90660bb6ULL, 0x730c196946a4b9b2ULL,
0x81e289aa7f49da68ULL, 0x64669a0f83b1a05fULL, 0x27b3ff7d9644f48bULL,
0xcc6b615c8db675b3ULL, 0x674f20b9bcebbe95ULL, 0x6f31238275655982ULL,
0x5ae488713e45cf05ULL, 0xbf619f9954c21157ULL, 0xeabac46040a8eae9ULL,
0x454c6fe9f2c0c1cdULL, 0x419cf6496412691cULL, 0xd3dc3bef265b0f70ULL,
0x6d0e60f5c3578a9eULL
};
static const u64 sbox4[256] = {
0x5b0e608526323c55ULL, 0x1a46c1a9fa1b59f5ULL, 0xa9e245a17c4c8ffaULL,
0x65ca5159db2955d7ULL, 0x05db0a76ce35afc2ULL, 0x81eac77ea9113d45ULL,
0x528ef88ab6ac0a0dULL, 0xa09ea253597be3ffULL, 0x430ddfb3ac48cd56ULL,
0xc4b3a67af45ce46fULL, 0x4ececfd8fbe2d05eULL, 0x3ef56f10b39935f0ULL,
0x0b22d6829cd619c6ULL, 0x17fd460a74df2069ULL, 0x6cf8cc8e8510ed40ULL,
0xd6c824bf3a6ecaa7ULL, 0x61243d581a817049ULL, 0x048bacb6bbc163a2ULL,
0xd9a38ac27d44cc32ULL, 0x7fddff5baaf410abULL, 0xad6d495aa804824bULL,
0xe1a6a74f2d8c9f94ULL, 0xd4f7851235dee8e3ULL, 0xfd4b7f886540d893ULL,
0x247c20042aa4bfdaULL, 0x096ea1c517d1327cULL, 0xd56966b4361a6685ULL,
0x277da5c31221057dULL, 0x94d59893a43acff7ULL, 0x64f0c51ccdc02281ULL,
0x3d33bcc4ff6189dbULL, 0xe005cb184ce66af1ULL, 0xff5ccd1d1db99beaULL,
0xb0b854a7fe42980fULL, 0x7bd46a6a718d4b9fULL, 0xd10fa8cc22a5fd8cULL,
0xd31484952be4bd31ULL, 0xc7fa975fcb243847ULL, 0x4886ed1e5846c407ULL,
0x28cddb791eb70b04ULL, 0xc2b00be2f573417fULL, 0x5c9590452180f877ULL,
0x7a6bddfff370eb00ULL, 0xce509e38d6d9d6a4ULL, 0xebeb0f00647fa702ULL,
0x1dcc06cf76606f06ULL, 0xe4d9f28ba286ff0aULL, 0xd85a305dc918c262ULL,
0x475b1d8732225f54ULL, 0x2d4fb51668ccb5feULL, 0xa679b9d9d72bba20ULL,
0x53841c0d912d43a5ULL, 0x3b7eaa48bf12a4e8ULL, 0x781e0e47f22f1ddfULL,
0xeff20ce60ab50973ULL, 0x20d261d19dffb742ULL, 0x16a12b03062a2e39ULL,
0x1960eb2239650495ULL, 0x251c16fed50eb8b8ULL, 0x9ac0c330f826016eULL,
0xed152665953e7671ULL, 0x02d63194a6369570ULL, 0x5074f08394b1c987ULL,
0x70ba598c90b25ce1ULL, 0x794a15810b9742f6ULL, 0x0d5925e9fcaf8c6cULL,
0x3067716cd868744eULL, 0x910ab077e8d7731bULL, 0x6a61bbdb5ac42f61ULL,
0x93513efbf0851567ULL, 0xf494724b9e83e9d5ULL, 0xe887e1985c09648dULL,
0x34b1d3c675370cfdULL, 0xdc35e433bc0d255dULL, 0xd0aab84234131be0ULL,
0x08042a50b48b7eafULL, 0x9997c4ee44a3ab35ULL, 0x829a7b49201799d0ULL,
0x263b8307b7c54441ULL, 0x752f95f4fd6a6ca6ULL, 0x927217402c08c6e5ULL,
0x2a8ab754a795d9eeULL, 0xa442f7552f72943dULL, 0x2c31334e19781208ULL,
0x4fa98d7ceaee6291ULL, 0x55c3862f665db309ULL, 0xbd0610175d53b1f3ULL,
0x46fe6cb840413f27ULL, 0x3fe03792df0cfa59ULL, 0xcfe700372eb85e8fULL,
0xa7be29e7adbce118ULL, 0xe544ee5cde8431ddULL, 0x8a781b1b41f1873eULL,
0xa5c94c78a0d2f0e7ULL, 0x39412e2877b60728ULL, 0xa1265ef3afc9a62cULL,
0xbcc2770c6a2506c5ULL, 0x3ab66dd5dce1ce12ULL, 0xe65499d04a675b37ULL,
0x7d8f523481bfd216ULL, 0x0f6f64fcec15f389ULL, 0x74efbe618b5b13c8ULL,
0xacdc82b714273e1dULL, 0xdd40bfe003199d17ULL, 0x37e99257e7e061f8ULL,
0xfa52626904775aaaULL, 0x8bbbf63a463d56f9ULL, 0xf0013f1543a26e64ULL,
0xa8307e9f879ec898ULL, 0xcc4c27a4150177ccULL, 0x1b432f2cca1d3348ULL,
0xde1d1f8f9f6fa013ULL, 0x606602a047a7ddd6ULL, 0xd237ab64cc1cb2c7ULL,
0x9b938e7225fcd1d3ULL, 0xec4e03708e0ff476ULL, 0xfeb2fbda3d03c12dULL,
0xae0bced2ee43889aULL, 0x22cb8923ebfb4f43ULL, 0x69360d013cf7396dULL,
0x855e3602d2d4e022ULL, 0x073805bad01f784cULL, 0x33e17a133852f546ULL,
0xdf4874058ac7b638ULL, 0xba92b29c678aa14aULL, 0x0ce89fc76cfaadcdULL,
0x5f9d4e0908339e34ULL, 0xf1afe9291f5923b9ULL, 0x6e3480f60f4a265fULL,
0xeebf3a2ab29b841cULL, 0xe21938a88f91b4adULL, 0x57dfeff845c6d3c3ULL,
0x2f006b0bf62caaf2ULL, 0x62f479ef6f75ee78ULL, 0x11a55ad41c8916a9ULL,
0xf229d29084fed453ULL, 0x42f1c27b16b000e6ULL, 0x2b1f76749823c074ULL,
0x4b76eca3c2745360ULL, 0x8c98f463b91691bdULL, 0x14bcc93cf1ade66aULL,
0x8885213e6d458397ULL, 0x8e177df0274d4711ULL, 0xb49b73b5503f2951ULL,
0x10168168c3f96b6bULL, 0x0e3d963b63cab0aeULL, 0x8dfc4b5655a1db14ULL,
0xf789f1356e14de5cULL, 0x683e68af4e51dac1ULL, 0xc9a84f9d8d4b0fd9ULL,
0x3691e03f52a0f9d1ULL, 0x5ed86e46e1878e80ULL, 0x3c711a0e99d07150ULL,
0x5a0865b20c4e9310ULL, 0x56fbfc1fe4f0682eULL, 0xea8d5de3105edf9bULL,
0x71abfdb12379187aULL, 0x2eb99de1bee77b9cULL, 0x21ecc0ea33cf4523ULL,
0x59a4d7521805c7a1ULL, 0x3896f5eb56ae7c72ULL, 0xaa638f3db18f75dcULL,
0x9f39358dabe9808eULL, 0xb7defa91c00b72acULL, 0x6b5541fd62492d92ULL,
0x6dc6dee8f92e4d5bULL, 0x353f57abc4beea7eULL, 0x735769d6da5690ceULL,
0x0a234aa642391484ULL, 0xf6f9508028f80d9dULL, 0xb8e319a27ab3f215ULL,
0x31ad9c1151341a4dULL, 0x773c22a57bef5805ULL, 0x45c7561a07968633ULL,
0xf913da9e249dbe36ULL, 0xda652d9b78a64c68ULL, 0x4c27a97f3bc334efULL,
0x76621220e66b17f4ULL, 0x967743899acd7d0bULL, 0xf3ee5bcae0ed6782ULL,
0x409f753600c879fcULL, 0x06d09a39b5926db6ULL, 0x6f83aeb0317ac588ULL,
0x01e6ca4a86381f21ULL, 0x66ff3462d19f3025ULL, 0x72207c24ddfd3bfbULL,
0x4af6b6d3e2ece2ebULL, 0x9c994dbec7ea08deULL, 0x49ace597b09a8bc4ULL,
0xb38c4766cf0797baULL, 0x131b9373c57c2a75ULL, 0xb1822cce61931e58ULL,
0x9d7555b909ba1c0cULL, 0x127fafdd937d11d2ULL, 0x29da3badc66d92e4ULL,
0xa2c1d57154c2ecbcULL, 0x58c5134d82f6fe24ULL, 0x1c3ae3515b62274fULL,
0xe907c82e01cb8126ULL, 0xf8ed091913e37fcbULL, 0x3249d8f9c80046c9ULL,
0x80cf9bede388fb63ULL, 0x1881539a116cf19eULL, 0x5103f3f76bd52457ULL,
0x15b7e6f5ae47f7a8ULL, 0xdbd7c6ded47e9ccfULL, 0x44e55c410228bb1aULL,
0xb647d4255edb4e99ULL, 0x5d11882bb8aafc30ULL, 0xf5098bbb29d3212aULL,
0x8fb5ea14e90296b3ULL, 0x677b942157dd025aULL, 0xfb58e7c0a390acb5ULL,
0x89d3674c83bd4a01ULL, 0x9e2da4df4bf3b93bULL, 0xfcc41e328cab4829ULL,
0x03f38c96ba582c52ULL, 0xcad1bdbd7fd85db2ULL, 0xbbb442c16082ae83ULL,
0xb95fe86ba5da9ab0ULL, 0xb22e04673771a93fULL, 0x845358c9493152d8ULL,
0xbe2a488697b4541eULL, 0x95a2dc2dd38e6966ULL, 0xc02c11ac923c852bULL,
0x2388b1990df2a87bULL, 0x7c8008fa1b4f37beULL, 0x1f70d0c84d54e503ULL,
0x5490adec7ece57d4ULL, 0x002b3c27d9063a3aULL, 0x7eaea3848030a2bfULL,
0xc602326ded2003c0ULL, 0x83a7287d69a94086ULL, 0xc57a5fcb30f57a8aULL,
0xb56844e479ebe779ULL, 0xa373b40f05dcbce9ULL, 0xd71a786e88570ee2ULL,
0x879cbacdbde8f6a0ULL, 0x976ad1bcc164a32fULL, 0xab21e25e9666d78bULL,
0x901063aae5e5c33cULL, 0x9818b34448698d90ULL, 0xe36487ae3e1e8abbULL,
0xafbdf931893bdcb4ULL, 0x6345a0dc5fbbd519ULL, 0x8628fe269b9465caULL,
0x1e5d01603f9c51ecULL, 0x4de44006a15049b7ULL, 0xbf6c70e5f776cbb1ULL,
0x411218f2ef552bedULL, 0xcb0c0708705a36a3ULL, 0xe74d14754f986044ULL,
0xcd56d9430ea8280eULL, 0xc12591d7535f5065ULL, 0xc83223f1720aef96ULL,
0xc3a0396f7363a51fULL
};
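/* One Tiger round: fold the 64-bit message word X into the state (A, B, C)
 * via the four 256-entry S-boxes, then scale B by the pass multiplier MUL.
 */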
static void tgr192_round(u64 * ra, u64 * rb, u64 * rc, u64 x, int mul)
{
u64 a = *ra;
u64 b = *rb;
u64 c = *rc;
c ^= x;
a -= sbox1[c & 0xff] ^ sbox2[(c >> 16) & 0xff]
^ sbox3[(c >> 32) & 0xff] ^ sbox4[(c >> 48) & 0xff];
b += sbox4[(c >> 8) & 0xff] ^ sbox3[(c >> 24) & 0xff]
^ sbox2[(c >> 40) & 0xff] ^ sbox1[(c >> 56) & 0xff];
b *= mul;
*ra = a;
*rb = b;
*rc = c;
}
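/* One pass is eight rounds, one per message word, rotating the roles of the
 * three state words between rounds; MUL is 5, 7 or 9 depending on the pass.
 */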
static void tgr192_pass(u64 * ra, u64 * rb, u64 * rc, u64 * x, int mul)
{
u64 a = *ra;
u64 b = *rb;
u64 c = *rc;
tgr192_round(&a, &b, &c, x[0], mul);
tgr192_round(&b, &c, &a, x[1], mul);
tgr192_round(&c, &a, &b, x[2], mul);
tgr192_round(&a, &b, &c, x[3], mul);
tgr192_round(&b, &c, &a, x[4], mul);
tgr192_round(&c, &a, &b, x[5], mul);
tgr192_round(&a, &b, &c, x[6], mul);
tgr192_round(&b, &c, &a, x[7], mul);
*ra = a;
*rb = b;
*rc = c;
}
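/* Key schedule: derive the next eight message words in place from the
 * previous ones between passes.
 */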
static void tgr192_key_schedule(u64 * x)
{
x[0] -= x[7] ^ 0xa5a5a5a5a5a5a5a5ULL;
x[1] ^= x[0];
x[2] += x[1];
x[3] -= x[2] ^ ((~x[1]) << 19);
x[4] ^= x[3];
x[5] += x[4];
x[6] -= x[5] ^ ((~x[4]) >> 23);
x[7] ^= x[6];
x[0] += x[7];
x[1] -= x[0] ^ ((~x[7]) << 19);
x[2] ^= x[1];
x[3] += x[2];
x[4] -= x[3] ^ ((~x[2]) >> 23);
x[5] ^= x[4];
x[6] += x[5];
x[7] -= x[6] ^ 0x0123456789abcdefULL;
}
/****************
 * Transform the 64-byte message block DATA (eight 64-bit words).
 */
static void tgr192_transform(struct tgr192_ctx *tctx, const u8 * data)
{
u64 a, b, c, aa, bb, cc;
u64 x[8];
int i;
for (i = 0; i < 8; i++)
x[i] = get_unaligned_le64(data + i * sizeof(__le64));
/* save */
a = aa = tctx->a;
b = bb = tctx->b;
c = cc = tctx->c;
tgr192_pass(&a, &b, &c, x, 5);
tgr192_key_schedule(x);
tgr192_pass(&c, &a, &b, x, 7);
tgr192_key_schedule(x);
tgr192_pass(&b, &c, &a, x, 9);
/* feedforward */
a ^= aa;
b -= bb;
c += cc;
/* store */
tctx->a = a;
tctx->b = b;
tctx->c = c;
}
static int tgr192_init(struct shash_desc *desc)
{
struct tgr192_ctx *tctx = shash_desc_ctx(desc);
tctx->a = 0x0123456789abcdefULL;
tctx->b = 0xfedcba9876543210ULL;
tctx->c = 0xf096a5b4c3b2e187ULL;
tctx->nblocks = 0;
tctx->count = 0;
return 0;
}
/* Update the message digest with the LEN bytes of data in INBUF. */
static int tgr192_update(struct shash_desc *desc, const u8 *inbuf,
unsigned int len)
{
struct tgr192_ctx *tctx = shash_desc_ctx(desc);
if (tctx->count == 64) { /* flush the buffer */
tgr192_transform(tctx, tctx->hash);
tctx->count = 0;
tctx->nblocks++;
}
if (!inbuf) {
return 0;
}
if (tctx->count) {
for (; len && tctx->count < 64; len--) {
tctx->hash[tctx->count++] = *inbuf++;
}
tgr192_update(desc, NULL, 0);
if (!len) {
return 0;
}
}
while (len >= 64) {
tgr192_transform(tctx, inbuf);
tctx->count = 0;
tctx->nblocks++;
len -= 64;
inbuf += 64;
}
for (; len && tctx->count < 64; len--) {
tctx->hash[tctx->count++] = *inbuf++;
}
return 0;
}
/* Finalize the computation and write the 192-bit digest to OUT. */
static int tgr192_final(struct shash_desc *desc, u8 * out)
{
struct tgr192_ctx *tctx = shash_desc_ctx(desc);
__be64 *dst = (__be64 *)out;
__be64 *be64p;
__le32 *le32p;
u32 t, msb, lsb;
tgr192_update(desc, NULL, 0); /* flush */
msb = 0;
t = tctx->nblocks;
if ((lsb = t << 6) < t) { /* multiply by 64 to make a byte count */
msb++;
}
msb += t >> 26;
t = lsb;
if ((lsb = t + tctx->count) < t) { /* add the count */
msb++;
}
t = lsb;
if ((lsb = t << 3) < t) { /* multiply by 8 to make a bit count */
msb++;
}
msb += t >> 29;
if (tctx->count < 56) { /* enough room */
tctx->hash[tctx->count++] = 0x01; /* pad */
while (tctx->count < 56) {
tctx->hash[tctx->count++] = 0; /* pad */
}
} else { /* need one extra block */
tctx->hash[tctx->count++] = 0x01; /* pad character */
while (tctx->count < 64) {
tctx->hash[tctx->count++] = 0;
}
tgr192_update(desc, NULL, 0); /* flush */
memset(tctx->hash, 0, 56); /* fill next block with zeroes */
}
/* append the 64 bit count */
le32p = (__le32 *)&tctx->hash[56];
le32p[0] = cpu_to_le32(lsb);
le32p[1] = cpu_to_le32(msb);
tgr192_transform(tctx, tctx->hash);
be64p = (__be64 *)tctx->hash;
dst[0] = be64p[0] = cpu_to_be64(tctx->a);
dst[1] = be64p[1] = cpu_to_be64(tctx->b);
dst[2] = be64p[2] = cpu_to_be64(tctx->c);
return 0;
}
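/* TGR160 and TGR128 are plain truncations of the 192-bit Tiger digest. */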
static int tgr160_final(struct shash_desc *desc, u8 * out)
{
u8 D[64];
tgr192_final(desc, D);
memcpy(out, D, TGR160_DIGEST_SIZE);
memzero_explicit(D, TGR192_DIGEST_SIZE);
return 0;
}
static int tgr128_final(struct shash_desc *desc, u8 * out)
{
u8 D[64];
tgr192_final(desc, D);
memcpy(out, D, TGR128_DIGEST_SIZE);
memzero_explicit(D, TGR192_DIGEST_SIZE);
return 0;
}
static struct shash_alg tgr_algs[3] = { {
.digestsize = TGR192_DIGEST_SIZE,
.init = tgr192_init,
.update = tgr192_update,
.final = tgr192_final,
.descsize = sizeof(struct tgr192_ctx),
.base = {
.cra_name = "tgr192",
.cra_driver_name = "tgr192-generic",
.cra_blocksize = TGR192_BLOCK_SIZE,
.cra_module = THIS_MODULE,
}
}, {
.digestsize = TGR160_DIGEST_SIZE,
.init = tgr192_init,
.update = tgr192_update,
.final = tgr160_final,
.descsize = sizeof(struct tgr192_ctx),
.base = {
.cra_name = "tgr160",
.cra_driver_name = "tgr160-generic",
.cra_blocksize = TGR192_BLOCK_SIZE,
.cra_module = THIS_MODULE,
}
}, {
.digestsize = TGR128_DIGEST_SIZE,
.init = tgr192_init,
.update = tgr192_update,
.final = tgr128_final,
.descsize = sizeof(struct tgr192_ctx),
.base = {
.cra_name = "tgr128",
.cra_driver_name = "tgr128-generic",
.cra_blocksize = TGR192_BLOCK_SIZE,
.cra_module = THIS_MODULE,
}
} };
static int __init tgr192_mod_init(void)
{
return crypto_register_shashes(tgr_algs, ARRAY_SIZE(tgr_algs));
}
static void __exit tgr192_mod_fini(void)
{
crypto_unregister_shashes(tgr_algs, ARRAY_SIZE(tgr_algs));
}
MODULE_ALIAS_CRYPTO("tgr192");
MODULE_ALIAS_CRYPTO("tgr160");
MODULE_ALIAS_CRYPTO("tgr128");
subsys_initcall(tgr192_mod_init);
module_exit(tgr192_mod_fini);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Tiger Message Digest Algorithm");

View File

@ -24,7 +24,7 @@
* Third Edition.
*/
#include <asm/byteorder.h>
#include <asm/unaligned.h>
#include <crypto/twofish.h>
#include <linux/module.h>
#include <linux/init.h>
@ -83,11 +83,11 @@
* whitening subkey number m. */
#define INPACK(n, x, m) \
x = le32_to_cpu(src[n]) ^ ctx->w[m]
x = get_unaligned_le32(in + (n) * 4) ^ ctx->w[m]
#define OUTUNPACK(n, x, m) \
x ^= ctx->w[m]; \
dst[n] = cpu_to_le32(x)
put_unaligned_le32(x, out + (n) * 4)
@ -95,8 +95,6 @@
static void twofish_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
{
struct twofish_ctx *ctx = crypto_tfm_ctx(tfm);
const __le32 *src = (const __le32 *)in;
__le32 *dst = (__le32 *)out;
/* The four 32-bit chunks of the text. */
u32 a, b, c, d;
@ -132,8 +130,6 @@ static void twofish_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
static void twofish_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
{
struct twofish_ctx *ctx = crypto_tfm_ctx(tfm);
const __le32 *src = (const __le32 *)in;
__le32 *dst = (__le32 *)out;
/* The four 32-bit chunks of the text. */
u32 a, b, c, d;
@ -172,7 +168,6 @@ static struct crypto_alg alg = {
.cra_flags = CRYPTO_ALG_TYPE_CIPHER,
.cra_blocksize = TF_BLOCK_SIZE,
.cra_ctxsize = sizeof(struct twofish_ctx),
.cra_alignmask = 3,
.cra_module = THIS_MODULE,
.cra_u = { .cipher = {
.cia_min_keysize = TF_MIN_KEY_SIZE,

View File

@ -36,6 +36,7 @@
#include <linux/scatterlist.h>
#include <asm/byteorder.h>
#include <crypto/scatterwalk.h>
#include <crypto/internal/cipher.h>
#include <crypto/internal/hash.h>
/*
@ -693,3 +694,4 @@ module_exit(vmac_module_exit);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("VMAC hash algorithm");
MODULE_ALIAS_CRYPTO("vmac64");
MODULE_IMPORT_NS(CRYPTO_INTERNAL);

View File

@ -6,6 +6,7 @@
* Kazunori Miyazawa <miyazawa@linux-ipv6.org>
*/
#include <crypto/internal/cipher.h>
#include <crypto/internal/hash.h>
#include <linux/err.h>
#include <linux/kernel.h>
@ -272,3 +273,4 @@ module_exit(crypto_xcbc_module_exit);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("XCBC keyed hash algorithm");
MODULE_ALIAS_CRYPTO("xcbc");
MODULE_IMPORT_NS(CRYPTO_INTERNAL);
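As in vmac above, xcbc gains the internal cipher header and the CRYPTO_INTERNAL namespace import because crypto_cipher is now an internal-only API. A sketch (hypothetical module, illustrative names) of what any remaining in-tree crypto_cipher user needs after this change:
#include <crypto/internal/cipher.h>	/* crypto_cipher declarations now live here */
#include <linux/err.h>
#include <linux/module.h>
static struct crypto_cipher *example_tfm;
static int __init example_init(void)
{
	example_tfm = crypto_alloc_cipher("aes", 0, 0);
	return PTR_ERR_OR_ZERO(example_tfm);
}
static void __exit example_exit(void)
{
	if (!IS_ERR_OR_NULL(example_tfm))
		crypto_free_cipher(example_tfm);
}
module_init(example_init);
module_exit(example_exit);
MODULE_LICENSE("GPL");
/* Without this, the crypto_cipher_* exports are not resolved for the module. */
MODULE_IMPORT_NS(CRYPTO_INTERNAL);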

Some files were not shown because too many files have changed in this diff.