Commit Graph

1308868 Commits

Author SHA1 Message Date
Ard Biesheuvel
fcf27785ae crypto: arm/crct10dif - Use existing mov_l macro instead of __adrl
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-11-15 19:52:51 +08:00
Ard Biesheuvel
779cee8209 crypto: arm64/crct10dif - Remove remaining 64x64 PMULL fallback code
The only remaining user of the fallback implementation of 64x64
polynomial multiplication using 8x8 PMULL instructions is the final
reduction from a 16 byte vector to a 16-bit CRC.

The fallback code is complicated and messy, and this reduction has
little impact on the overall performance, so instead, let's calculate
the final CRC by passing the 16 byte vector to the generic CRC-T10DIF
implementation when running the fallback version.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-11-15 19:52:51 +08:00
Ard Biesheuvel
67dfb1b73f crypto: arm64/crct10dif - Use faster 16x64 bit polynomial multiply
The CRC-T10DIF implementation for arm64 has a version that uses 8x8
polynomial multiplication, for cores that lack the crypto extensions,
which cover the 64x64 polynomial multiplication instruction that the
algorithm was built around.

This fallback version rather naively adopted the 64x64 polynomial
multiplication algorithm that I ported from ARM for the GHASH driver,
which needs 8 PMULL8 instructions to implement one PMULL64. This is
reasonable, given that each 8-bit vector element needs to be multiplied
with each element in the other vector, producing 8 vectors with partial
results that need to be combined to yield the correct result.

However, most PMULL64 invocations in the CRC-T10DIF code involve
multiplication by a pair of 16-bit folding coefficients, and so all the
partial results from higher order bytes will be zero, and there is no
need to calculate them to begin with.

Then, the CRC-T10DIF algorithm always XORs the output values of the
PMULL64 instructions being issued in pairs, and so there is no need to
faithfully implement each individual PMULL64 instruction, as long as
XORing the results pairwise produces the expected result.

Implementing these improvements results in a speedup of 3.3x on low-end
platforms such as Raspberry Pi 4 (Cortex-A72)

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-11-15 19:52:51 +08:00
Ard Biesheuvel
7048c21e6b crypto: arm64/crct10dif - Remove obsolete chunking logic
This is a partial revert of commit fc754c024a, which moved the logic
into C code which ensures that kernel mode NEON code does not hog the
CPU for too long.

This is no longer needed now that kernel mode NEON no longer disables
preemption, so we can drop this.

Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-11-15 19:52:51 +08:00
Chen Ridong
19630cf572 crypto: bcm - add error check in the ahash_hmac_init function
The ahash_init functions may return fails. The ahash_hmac_init should
not return ok when ahash_init returns error. For an example, ahash_init
will return -ENOMEM when allocation memory is error.

Fixes: 9d12ba86f8 ("crypto: brcm - Add Broadcom SPU driver")
Signed-off-by: Chen Ridong <chenridong@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-11-15 19:52:51 +08:00
Chen Ridong
b64140c74e crypto: caam - add error check to caam_rsa_set_priv_key_form
The caam_rsa_set_priv_key_form did not check for memory allocation errors.
Add the checks to the caam_rsa_set_priv_key_form functions.

Fixes: 52e26d77b8 ("crypto: caam - add support for RSA key form 2")
Signed-off-by: Chen Ridong <chenridong@huawei.com>
Reviewed-by: Gaurav Jain <gaurav.jain@nxp.com>
Reviewed-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-11-15 19:52:51 +08:00
Markus Mayer
35b2237f27 hwrng: bcm74110 - Add Broadcom BCM74110 RNG driver
Add a driver for the random number generator present on the Broadcom
BCM74110 SoC.

Signed-off-by: Markus Mayer <mmayer@broadcom.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-11-10 11:51:26 +08:00
Markus Mayer
c0559d2456 dt-bindings: rng: add binding for BCM74110 RNG
Add a binding for the random number generator used on the BCM74110.

Signed-off-by: Markus Mayer <mmayer@broadcom.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-11-10 11:50:54 +08:00
Zicheng Qu
e45f0ab6ee padata: Clean up in padata_do_multithreaded()
In commit 24cc57d8fa ("padata: Honor the caller's alignment in case of
chunk_size 0"), the line 'ps.chunk_size = max(ps.chunk_size, 1ul)' was
added, making 'ps.chunk_size = 1U' redundant and never executed.

Signed-off-by: Zicheng Qu <quzicheng@huawei.com>
Acked-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-11-10 11:50:54 +08:00
Li Huafei
a10549fcce crypto: inside-secure - Fix the return value of safexcel_xcbcmac_cra_init()
The commit 320406cb60 ("crypto: inside-secure - Replace generic aes
with libaes") replaced crypto_alloc_cipher() with kmalloc(), but did not
modify the handling of the return value. When kmalloc() returns NULL,
PTR_ERR_OR_ZERO(NULL) returns 0, but in fact, the memory allocation has
failed, and -ENOMEM should be returned.

Fixes: 320406cb60 ("crypto: inside-secure - Replace generic aes with libaes")
Signed-off-by: Li Huafei <lihuafei1@huawei.com>
Acked-by: Antoine Tenart <atenart@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-11-10 11:50:54 +08:00
Wang Hai
d8920a722a crypto: qat - Fix missing destroy_workqueue in adf_init_aer()
The adf_init_aer() won't destroy device_reset_wq when alloc_workqueue()
for device_sriov_wq failed. Add destroy_workqueue for device_reset_wq to
fix this issue.

Fixes: 4469f9b234 ("crypto: qat - re-enable sriov after pf reset")
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-11-10 11:50:54 +08:00
Lukas Wunner
a03a728e37 crypto: rsassa-pkcs1 - Reinstate support for legacy protocols
Commit 1e562deace ("crypto: rsassa-pkcs1 - Migrate to sig_alg backend")
enforced that rsassa-pkcs1 sign/verify operations specify a hash
algorithm.  That is necessary because per RFC 8017 sec 8.2, a hash
algorithm identifier must be prepended to the hash before generating or
verifying the signature ("Full Hash Prefix").

However the commit went too far in that it changed user space behavior:
KEYCTL_PKEY_QUERY system calls now return -EINVAL unless they specify a
hash algorithm.  Intel Wireless Daemon (iwd) is one application issuing
such system calls (for EAP-TLS).

Closer analysis of the Embedded Linux Library (ell) used by iwd reveals
that the problem runs even deeper:  When iwd uses TLS 1.1 or earlier, it
not only queries for keys, but performs sign/verify operations without
specifying a hash algorithm.  These legacy TLS versions concatenate an
MD5 to a SHA-1 hash and omit the Full Hash Prefix:

https://git.kernel.org/pub/scm/libs/ell/ell.git/tree/ell/tls-suites.c#n97

TLS 1.1 was deprecated in 2021 by RFC 8996, but removal of support was
inadvertent in this case.  It probably should be coordinated with iwd
maintainers first.

So reinstate support for such legacy protocols by defaulting to hash
algorithm "none" which uses an empty Full Hash Prefix.

If it is later on decided to remove TLS 1.1 support but still allow
KEYCTL_PKEY_QUERY without a hash algorithm, that can be achieved by
reverting the present commit and replacing it with the following patch:

https://lore.kernel.org/r/ZxalYZwH5UiGX5uj@wunner.de/

It's worth noting that Python's cryptography library gained support for
such legacy use cases very recently, so they do seem to still be a thing.
The Python developers identified IKE version 1 as another protocol
omitting the Full Hash Prefix:

https://github.com/pyca/cryptography/issues/10226
https://github.com/pyca/cryptography/issues/5495

The author of those issues, Zoltan Kelemen, spent considerable effort
searching for test vectors but only found one in a 2019 blog post by
Kevin Jones.  Add it to testmgr.h to verify correctness of this feature.

Examination of wpa_supplicant as well as various IKE daemons (libreswan,
strongswan, isakmpd, raccoon) has determined that none of them seems to
use the kernel's Key Retention Service, so iwd is the only affected user
space application known so far.

Fixes: 1e562deace ("crypto: rsassa-pkcs1 - Migrate to sig_alg backend")
Reported-by: Klara Modin <klarasmodin@gmail.com>
Tested-by: Klara Modin <klarasmodin@gmail.com>
Closes: https://lore.kernel.org/r/2ed09a22-86c0-4cf0-8bda-ef804ccb3413@gmail.com/
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-11-10 11:50:54 +08:00
Weili Qian
c418ba6bac crypto: hisilicon/qm - disable same error report before resetting
If an error indicating that the device needs to be reset is reported,
disable the error reporting before device reset is complete,
enable the error reporting after the reset is complete to prevent
the same error from being reported repeatedly.

Fixes: eaebf4c3b1 ("crypto: hisilicon - Unify hardware error init/uninit into QM")
Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-11-02 18:23:25 +08:00
Qi Tao
2a69297eed crypto: hisilicon - support querying the capability register
Query the capability register status of accelerator devices
(SEC, HPRE and ZIP) through the debugfs interface, for example:
cat cap_regs. The purpose is to improve the robustness and
locability of hardware devices and drivers.

Signed-off-by: Qi Tao <taoqi10@huawei.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-11-02 18:23:24 +08:00
Dr. David Alan Gilbert
acb0ed8432 crypto: asymmetric_keys - Remove unused functions
encrypt_blob(), decrypt_blob() and create_signature() were some of the
functions added in 2018 by
commit 5a30771832 ("KEYS: Provide missing asymmetric key subops for new
key type ops [ver #2]")
however, they've not been used.

Remove them.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-11-02 18:23:24 +08:00
Uwe Kleine-König
d11c8b87a3 hwrng: drivers - Switch back to struct platform_driver::remove()
After commit 0edb555a65 ("platform: Make platform_driver::remove()
return void") .remove() is (again) the right callback to implement for
platform drivers.

Convert all platform drivers below drivers/char/hw_random to use
.remove(), with the eventual goal to drop struct
platform_driver::remove_new(). As .remove() and .remove_new() have the
same prototypes, conversion is done by just changing the structure
member name in the driver initializer.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-11-02 18:23:24 +08:00
Ovidiu Panait
d186faa307 crypto: starfive - remove unneeded crypto_engine_stop() call
The explicit crypto_engine_stop() call is not needed, as it is already
called internally by crypto_engine_exit().

Signed-off-by: Ovidiu Panait <ovidiu.panait.oss@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-28 18:33:11 +08:00
Ovidiu Panait
6ef46fec41 crypto: tegra - remove unneeded crypto_engine_stop() call
The explicit crypto_engine_stop() call is not needed, as it is already
called internally by crypto_engine_exit().

Signed-off-by: Ovidiu Panait <ovidiu.panait.oss@gmail.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-28 18:33:11 +08:00
Eric Biggers
4964a1d91c crypto: api - move crypto_simd_disabled_for_test to lib
Move crypto_simd_disabled_for_test to lib/ so that crypto_simd_usable()
can be used by library code.

This was discussed previously
(https://lore.kernel.org/linux-crypto/20220716062920.210381-4-ebiggers@kernel.org/)
but was not done because there was no use case yet.  However, this is
now needed for the arm64 CRC32 library code.

Tested with:
    export ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu-
    echo CONFIG_CRC32=y > .config
    echo CONFIG_MODULES=y >> .config
    echo CONFIG_CRYPTO=m >> .config
    echo CONFIG_DEBUG_KERNEL=y >> .config
    echo CONFIG_CRYPTO_MANAGER_DISABLE_TESTS=n >> .config
    echo CONFIG_CRYPTO_MANAGER_EXTRA_TESTS=y >> .config
    make olddefconfig
    make -j$(nproc)

Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-28 18:33:11 +08:00
Everest K.C
53d91ca76b crypto: cavium - Fix the if condition to exit loop after timeout
The while loop breaks in the first run because of incorrect
if condition. It also causes the statements after the if to
appear dead.
Fix this by changing the condition from if(timeout--) to
if(!timeout--).

This bug was reported by Coverity Scan.
Report:
CID 1600859: (#1 of 1): Logically dead code (DEADCODE)
dead_error_line: Execution cannot reach this statement: udelay(30UL);

Fixes: 9e2c7d9994 ("crypto: cavium - Add Support for Octeon-tx CPT Engine")
Signed-off-by: Everest K.C. <everestkc@everestkc.com.np>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-28 18:33:11 +08:00
Yuvaraj Ranganathan
7a42b7b930 dt-bindings: crypto: qcom-qce: document the SA8775P crypto engine
Document the crypto engine on the SA8775P Platform.

Signed-off-by: Yuvaraj Ranganathan <quic_yrangana@quicinc.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-28 18:33:10 +08:00
Christian Marangi
e53ca8efcc hwrng: airoha - add support for Airoha EN7581 TRNG
Add support for Airoha TRNG. The Airoha SoC provide a True RNG module
that can output 4 bytes of raw data at times.

The module makes use of various noise source to provide True Random
Number Generation.

On probe the module is reset to operate Health Test and verify correct
execution of it.

The module can also provide DRBG function but the execution mode is
mutually exclusive, running as TRNG doesn't permit to also run it as
DRBG.

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Martin Kaiser <martin@kaiser.cx>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-28 18:33:10 +08:00
Christian Marangi
7705fe6eb5 dt-bindings: rng: add support for Airoha EN7581 TRNG
Add support for Airoha EN7581 True Random Number generator.

This module can generate up to 4bytes of raw data at times and support
self health test at startup. The module gets noise for randomness from
various source from ADC, AP, dedicated clocks and other devices attached
to the SoC producing true random numbers.

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-28 18:33:10 +08:00
WangYuli
2ab74b57ba crypto: qat - Fix typo "accelaration"
There is a spelling mistake of 'accelaration' in comments which
should be 'acceleration'.

Signed-off-by: WangYuli <wangyuli@uniontech.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-28 18:33:10 +08:00
Eric Biggers
7cc26d4a5f crypto: x86/aegis128 - remove unneeded RETs
Remove returns that are immediately followed by another return.

Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-28 18:33:10 +08:00
Eric Biggers
a09be0354b crypto: x86/aegis128 - remove unneeded FRAME_BEGIN and FRAME_END
Stop using FRAME_BEGIN and FRAME_END in the AEGIS assembly functions,
since all these functions are now leaf functions.  This eliminates some
unnecessary instructions.

Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-28 18:33:10 +08:00
Eric Biggers
a0927a03e7 crypto: x86/aegis128 - take advantage of block-aligned len
Update a caller of aegis128_aesni_ad() to round down the length to a
block boundary.  After that, aegis128_aesni_ad(), aegis128_aesni_enc(),
and aegis128_aesni_dec() are only passed whole blocks.  Update the
assembly code to take advantage of that, which eliminates some unneeded
instructions.  For aegis128_aesni_enc() and aegis128_aesni_dec(), the
length is also always nonzero, so stop checking for zero length.

Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-28 18:33:10 +08:00
Eric Biggers
933e897431 crypto: x86/aegis128 - optimize partial block handling using SSE4.1
Optimize the code that loads and stores partial blocks, taking advantage
of SSE4.1.  The code is adapted from that in aes-gcm-aesni-x86_64.S.

Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-28 18:33:10 +08:00
Eric Biggers
8da94b300f crypto: x86/aegis128 - improve assembly function prototypes
Adjust the prototypes of the AEGIS assembly functions:

- Use proper types instead of 'void *', when applicable.

- Move the length parameter to after the buffers it describes rather
  than before, to match the usual convention.  Also shorten its name to
  just len (which is the name used in the assembly code).

- Declare register aliases at the beginning of each function rather than
  once per file.  This was necessary because len was moved, but also it
  allows adding some aliases where raw registers were used before.

- Put assoclen and cryptlen in the correct order when declaring the
  finalization function in the .c file.

- Remove the unnecessary "crypto_" prefix.

Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-28 18:33:10 +08:00
Eric Biggers
af2aff7caf crypto: x86/aegis128 - optimize length block preparation using SSE4.1
Start using SSE4.1 instructions in the AES-NI AEGIS code, with the first
use case being preparing the length block in fewer instructions.

In practice this does not reduce the set of CPUs on which the code can
run, because all Intel and AMD CPUs with AES-NI also have SSE4.1.

Upgrade the existing SSE2 feature check to SSE4.1, though it seems this
check is not strictly necessary; the aesni-intel module has been getting
away with using SSE4.1 despite checking for AES-NI only.

Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-28 18:33:10 +08:00
Eric Biggers
595bca25a6 crypto: x86/aegis128 - don't bother with special code for aligned data
Remove the AEGIS assembly code paths that were "optimized" to operate on
16-byte aligned data using movdqa, and instead just use the code paths
that use movdqu and can handle data with any alignment.

This does not reduce performance.  movdqa is basically a historical
artifact; on aligned data, movdqu and movdqa have had the same
performance since Intel Nehalem (2008) and AMD Bulldozer (2011).  And
code that requires AES-NI cannot run on CPUs older than those anyway.

Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-28 18:33:10 +08:00
Eric Biggers
b8d2e7bac3 crypto: x86/aegis128 - eliminate some indirect calls
Instead of using a struct of function pointers to decide whether to call
the encryption or decryption assembly functions, use a conditional
branch on a bool.  Force-inline the functions to avoid actually
generating the branch.  This improves performance slightly since
indirect calls are slow.  Remove the now-unnecessary CFI stubs.

Note that just force-inlining the existing functions might cause the
compiler to optimize out the indirect branches, but that would not be a
reliable way to do it and the CFI stubs would still be required.

Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-28 18:33:10 +08:00
Eric Biggers
ebb445f5e7 crypto: x86/aegis128 - remove no-op init and exit functions
Don't bother providing empty stubs for the init and exit methods in
struct aead_alg, since they are optional anyway.

Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-28 18:33:10 +08:00
Eric Biggers
3b2f2d22fb crypto: x86/aegis128 - access 32-bit arguments as 32-bit
Fix the AEGIS assembly code to access 'unsigned int' arguments as 32-bit
values instead of 64-bit, since the upper bits of the corresponding
64-bit registers are not guaranteed to be zero.

Note: there haven't been any reports of this bug actually causing
incorrect behavior.  Neither gcc nor clang guarantee zero-extension to
64 bits, but zero-extension is likely to happen in practice because most
instructions that operate on 32-bit registers zero-extend to 64 bits.

Fixes: 1d373d4e8e ("crypto: x86 - Add optimized AEGIS implementations")
Cc: stable@vger.kernel.org
Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-28 18:33:10 +08:00
Ard Biesheuvel
16739efac6 crypto: crc32c - Provide crc32c-arch driver for accelerated library code
crc32c-generic is currently backed by the architecture's CRC-32c library
code, which may offer a variety of implementations depending on the
capabilities of the platform. These are not covered by the crypto
subsystem's fuzz testing capabilities because crc32c-generic is the
reference driver that the fuzzing logic uses as a source of truth.

Fix this by providing a crc32c-arch implementation which is based on the
arch library code if available, and modify crc32c-generic so it is
always based on the generic C implementation. If the arch has no CRC-32c
library code, this change does nothing.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-28 18:33:10 +08:00
Ard Biesheuvel
a37e55791f crypto: crc32 - Provide crc32-arch driver for accelerated library code
crc32-generic is currently backed by the architecture's CRC-32 library
code, which may offer a variety of implementations depending on the
capabilities of the platform. These are not covered by the crypto
subsystem's fuzz testing capabilities because crc32-generic is the
reference driver that the fuzzing logic uses as a source of truth.

Fix this by providing a crc32-arch implementation which is based on the
arch library code if available, and modify crc32-generic so it is
always based on the generic C implementation. If the arch has no CRC-32
library code, this change does nothing.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-28 18:33:10 +08:00
Thorsten Blum
a1ba22921e crypto: drbg - Use str_true_false() and str_enabled_disabled() helpers
Remove hard-coded strings by using the helper functions str_true_false()
and str_enabled_disabled().

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-28 18:33:10 +08:00
Gatien Chevallier
5a61fd622b hwrng: stm32 - update STM32MP15 RNG max clock frequency
RNG max clock frequency can be updated to 48MHz for stm32mp1x
platforms according to the latest specifications.

Signed-off-by: Gatien Chevallier <gatien.chevallier@foss.st.com>
Reviewed-by: Marek Vasut <marex@denx.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-28 18:33:10 +08:00
Gatien Chevallier
842285d4ce hwrng: stm32 - implement support for STM32MP25x platforms
Implement the support for STM32MP25x platforms. On this platform, a
security clock is shared between some hardware blocks. For the RNG,
it is the RNG kernel clock. Therefore, the gate is no more shared
between the RNG bus and kernel clocks as on STM32MP1x platforms and
the bus clock has to be managed on its own.

Signed-off-by: Gatien Chevallier <gatien.chevallier@foss.st.com>
Reviewed-by: Marek Vasut <marex@denx.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-28 18:33:09 +08:00
Gatien Chevallier
4eb10daba8 dt-bindings: rng: add st,stm32mp25-rng support
Add RNG STM32MP25x platforms compatible. Update the clock
properties management to support all versions.

Signed-off-by: Gatien Chevallier <gatien.chevallier@foss.st.com>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-28 18:33:09 +08:00
Colin Ian King
7b90df7818 crypto: tegra - remove redundant error check on ret
Currently there is an unnecessary error check on ret without a proceeding
assignment to ret that needs checking. The check is redundant and can be
removed.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Acked-by: Akhil R <akhilrajeev@nvidia.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-28 18:33:09 +08:00
Vishal Chourasia
69b0620727 crypto: nx - Fix invalid wait context during kexec reboot
nx842_remove() call of_reconfig_notifier_unregister while holding the
devdata_spinlock. This could lead to an invalid wait context error during
kexec reboot, as of_reconfig_notifier_unregister tries to acquire a read-write
semaphore (check logs) while holding a spinlock.

Move the of_reconfig_notifier_unregister() call before acquiring the
spinlock to prevent this race condition invalid wait contexts during system
shutdown or kexec operations.

Log:

[ BUG: Invalid wait context ]
6.11.0-test2-10547-g684a64bf32b6-dirty #79 Not tainted
-----------------------------
kexec/61926 is trying to lock:
c000000002d8b590 ((of_reconfig_chain).rwsem){++++}-{4:4}, at: blocking_notifier_chain_unregister+0x44/0xa0
other info that might help us debug this:
context-{5:5}
4 locks held by kexec/61926:
 #0: c000000002926c70 (system_transition_mutex){+.+.}-{4:4}, at: __do_sys_reboot+0xf8/0x2e0
 #1: c00000000291af30 (&dev->mutex){....}-{4:4}, at: device_shutdown+0x160/0x310
 #2: c000000051011938 (&dev->mutex){....}-{4:4}, at: device_shutdown+0x174/0x310
 #3: c000000002d88070 (devdata_mutex){....}-{3:3}, at: nx842_remove+0xac/0x1bc
stack backtrace:
CPU: 2 UID: 0 PID: 61926 Comm: kexec Not tainted 6.11.0-test2-10547-g684a64bf32b6-dirty #79
Hardware name: IBM,9080-HEX POWER10 (architected) 0x800200 0xf000006 of:IBM,FW1060.00 (NH1060_012) hv:phyp pSeries
Call Trace:
[c0000000bb577400] [c000000001239704] dump_stack_lvl+0xc8/0x130 (unreliable)
[c0000000bb577440] [c000000000248398] __lock_acquire+0xb68/0xf00
[c0000000bb577550] [c000000000248820] lock_acquire.part.0+0xf0/0x2a0
[c0000000bb577670] [c00000000127faa0] down_write+0x70/0x1e0
[c0000000bb5776b0] [c0000000001acea4] blocking_notifier_chain_unregister+0x44/0xa0
[c0000000bb5776e0] [c000000000e2312c] of_reconfig_notifier_unregister+0x2c/0x40
[c0000000bb577700] [c000000000ded24c] nx842_remove+0x148/0x1bc
[c0000000bb577790] [c00000000011a114] vio_bus_remove+0x54/0xc0
[c0000000bb5777c0] [c000000000c1a44c] device_shutdown+0x20c/0x310
[c0000000bb577850] [c0000000001b0ab4] kernel_restart_prepare+0x54/0x70
[c0000000bb577870] [c000000000308718] kernel_kexec+0xa8/0x110
[c0000000bb5778e0] [c0000000001b1144] __do_sys_reboot+0x214/0x2e0
[c0000000bb577a40] [c000000000032f98] system_call_exception+0x148/0x310
[c0000000bb577e50] [c00000000000cedc] system_call_vectored_common+0x15c/0x2ec
--- interrupt: 3000 at 0x7fffa07e7df8
NIP:  00007fffa07e7df8 LR: 00007fffa07e7df8 CTR: 0000000000000000
REGS: c0000000bb577e80 TRAP: 3000   Not tainted  (6.11.0-test2-10547-g684a64bf32b6-dirty)
MSR:  800000000280f033   CR: 48022484  XER: 00000000
IRQMASK: 0
GPR00: 0000000000000058 00007ffff961f1e0 00007fffa08f7100 fffffffffee1dead
GPR04: 0000000028121969 0000000045584543 0000000000000000 0000000000000003
GPR08: 0000000000000003 0000000000000000 0000000000000000 0000000000000000
GPR12: 0000000000000000 00007fffa0a9b360 ffffffffffffffff 0000000000000000
GPR16: 0000000000000001 0000000000000002 0000000000000001 0000000000000001
GPR20: 000000011710f520 0000000000000000 0000000000000000 0000000000000001
GPR24: 0000000129be0480 0000000000000003 0000000000000003 00007ffff961f2b0
GPR28: 00000001170f2d30 00000001170f2d28 00007fffa08f18d0 0000000129be04a0
NIP [00007fffa07e7df8] 0x7fffa07e7df8
LR [00007fffa07e7df8] 0x7fffa07e7df8
--- interrupt: 3000

Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Vishal Chourasia <vishalc@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-28 18:32:40 +08:00
Vishal Chourasia
bdd9155560 crypto: nx - Rename devdata_mutex to devdata_spinlock
Rename devdata_mutex to devdata_spinlock to accurately reflect its
implementation as a spinlock.

[1] v1 https://lore.kernel.org/all/ZwyqD-w5hEhrnqTB@linux.ibm.com

Signed-off-by: Vishal Chourasia <vishalc@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-28 18:32:36 +08:00
Yi Yang
662f2f13e6 crypto: pcrypt - Call crypto layer directly when padata_do_parallel() return -EBUSY
Since commit 8f4f68e788 ("crypto: pcrypt - Fix hungtask for
PADATA_RESET"), the pcrypt encryption and decryption operations return
-EAGAIN when the CPU goes online or offline. In alg_test(), a WARN is
generated when pcrypt_aead_decrypt() or pcrypt_aead_encrypt() returns
-EAGAIN, the unnecessary panic will occur when panic_on_warn set 1.
Fix this issue by calling crypto layer directly without parallelization
in that case.

Fixes: 8f4f68e788 ("crypto: pcrypt - Fix hungtask for PADATA_RESET")
Signed-off-by: Yi Yang <yiyang13@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-28 18:32:36 +08:00
Christophe JAILLET
288e37216f crypto: qat - Constify struct pm_status_row
'struct pm_status_row' are not modified in this driver.

Constifying this structure moves some data to a read-only section, so
increases overall security.

Update the prototype of some functions accordingly.

On a x86_64, with allmodconfig, as an example:
Before:
======
   text	   data	    bss	    dec	    hex	filename
   4400	   1059	      0	   5459	   1553	drivers/crypto/intel/qat/qat_common/adf_gen4_pm_debugfs.o

After:
=====
   text	   data	    bss	    dec	    hex	filename
   5216	    243	      0	   5459	   1553	drivers/crypto/intel/qat/qat_common/adf_gen4_pm_debugfs.o

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-28 18:32:36 +08:00
Rob Herring (Arm)
c4fdae903b dt-bindings: rng: Add Marvell Armada RNG support
The Marvell Armada RNG uses the same IP as TI from Inside Secure and is
already using the binding. The only missing part is the
"marvell,armada-8k-rng" compatible string.

Rename the binding to inside-secure,safexcel-eip76.yaml to better
reflect it is multi-vendor, licensed IP and to follow the naming
convention using compatible string.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-28 18:32:36 +08:00
Lukas Wunner
91790c7a35 crypto: ecdsa - Update Kconfig help text for NIST P521
Commit a7d45ba77d ("crypto: ecdsa - Register NIST P521 and extend test
suite") added support for ECDSA signature verification using NIST P521,
but forgot to amend the Kconfig help text.  Fix it.

Fixes: a7d45ba77d ("crypto: ecdsa - Register NIST P521 and extend test suite")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-28 18:32:28 +08:00
Lukas Wunner
b358f23ab1 crypto: sig - Fix oops on KEYCTL_PKEY_QUERY for RSA keys
Commit a2471684da ("crypto: ecdsa - Move X9.62 signature size
calculation into template") introduced ->max_size() and ->digest_size()
callbacks to struct sig_alg.  They return an algorithm's maximum
signature size and digest size, respectively.

For algorithms which lack these callbacks, crypto_register_sig() was
amended to use the ->key_size() callback instead.

However the commit neglected to also amend sig_register_instance().
As a result, the ->max_size() and ->digest_size() callbacks remain NULL
pointers if instances do not define them.  A KEYCTL_PKEY_QUERY system
call results in an oops for such instances:

  BUG: kernel NULL pointer dereference, address: 0000000000000000
  Call Trace:
  software_key_query+0x169/0x370
  query_asymmetric_key+0x67/0x90
  keyctl_pkey_query+0x86/0x120
  __do_sys_keyctl+0x428/0x480
  do_syscall_64+0x4b/0x110

The only instances affected by this are "pkcs1(rsa, ...)".

Fix by moving the callback checks from crypto_register_sig() to
sig_prepare_alg(), which is also invoked by sig_register_instance().
Change the return type of sig_prepare_alg() from void to int to be able
to return errors.  This matches other algorithm types, see e.g.
aead_prepare_alg() or ahash_prepare_alg().

Fixes: a2471684da ("crypto: ecdsa - Move X9.62 signature size calculation into template")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-26 14:41:59 +08:00
Eric Biggers
84dd048cf8 crypto: x86/crc32c - eliminate jump table and excessive unrolling
crc32c-pcl-intel-asm_64.S has a loop with 1 to 127 iterations fully
unrolled and uses a jump table to jump into the correct location.  This
optimization is misguided, as it bloats the binary code size and
introduces an indirect call.  x86_64 CPUs can predict loops well, so it
is fine to just use a loop instead.  Loop bookkeeping instructions can
compete with the crc instructions for the ALUs, but this is easily
mitigated by unrolling the loop by a smaller amount, such as 4 times.

Therefore, re-roll the loop and make related tweaks to the code.

This reduces the binary code size of crc_pclmul() from 4546 bytes to 418
bytes, a 91% reduction.  In general it also makes the code faster, with
some large improvements seen when retpoline is enabled.

More detailed performance results are shown below.  They are given as
percent improvement in throughput (negative means regressed) for CPU
microarchitecture vs. input length in bytes.  E.g. an improvement from
40 GB/s to 50 GB/s would be listed as 25%.

Table 1: Results with retpoline enabled (the default):

                       |   512 |   833 |  1024 |  2000 |  3173 |  4096 |
  ---------------------+-------+-------+-------+------ +-------+-------+
  Intel Haswell        | 35.0% | 20.7% | 17.8% |  9.7% | -0.2% |  4.4% |
  Intel Emerald Rapids | 66.8% | 45.2% | 36.3% | 19.3% |  0.0% |  5.4% |
  AMD Zen 2            | 29.5% | 17.2% | 13.5% |  8.6% | -0.5% |  2.8% |

Table 2: Results with retpoline disabled:

                       |   512 |   833 |  1024 |  2000 |  3173 |  4096 |
  ---------------------+-------+-------+-------+------ +-------+-------+
  Intel Haswell        |  3.3% |  4.8% |  4.5% |  0.9% | -2.9% |  0.3% |
  Intel Emerald Rapids |  7.5% |  6.4% |  5.2% |  2.3% | -0.0% |  0.6% |
  AMD Zen 2            | 11.8% |  1.4% |  0.2% |  1.3% | -0.9% | -0.2% |

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-26 14:41:59 +08:00
Eric Biggers
eebcadfa21 crypto: x86/crc32c - access 32-bit arguments as 32-bit
Fix crc32c-pcl-intel-asm_64.S to access 32-bit arguments as 32-bit
values instead of 64-bit, since the upper bits of the corresponding
64-bit registers are not guaranteed to be zero.  Also update the type of
the length argument to be unsigned int rather than int, as the assembly
code treats it as unsigned.

Note: there haven't been any reports of this bug actually causing
incorrect behavior.  Neither gcc nor clang guarantee zero-extension to
64 bits, but zero-extension is likely to happen in practice because most
instructions that operate on 32-bit registers zero-extend to 64 bits.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-26 14:41:59 +08:00