Commit Graph

Danny Tsen
41a6437ab4 crypto: p10-aes-gcm - Supporting functions for ghash
This perl code is taken from the OpenSSL project. It adds the
gcm_init_htable function, which is used in the p10-aes-gcm-glue.c code to
initialize the hash table, and gcm_hash_p8, which is used to hash
encrypted data blocks.
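
A rough illustration of how glue code might bind to these routines; the
prototypes and the helper below are assumptions for illustration, not
the exact declarations from p10-aes-gcm-glue.c:

	/* Assumed assembly entry points (illustrative signatures only). */
	asmlinkage void gcm_init_htable(unsigned char htable[],
					unsigned char Xi[]);
	asmlinkage void gcm_hash_p8(unsigned char *Xi, unsigned char *htable,
				    const unsigned char *in, unsigned int len);

	/* Precompute the hash table from the GHASH state at setkey time,
	 * then hash ciphertext blocks through it during the GCM operation. */
	static void p10_ghash_example(unsigned char *htable, unsigned char *Xi,
				      const unsigned char *ct, unsigned int len)
	{
		gcm_init_htable(htable, Xi);
		gcm_hash_p8(Xi, htable, ct, len);
	}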

Signed-off-by: Danny Tsen <dtsen@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-01-13 12:10:18 +08:00
Danny Tsen
3b47eccaaf crypto: p10-aes-gcm - Supporting functions for AES
This code is taken from CRYPTOGAMs[1]. The following functions are used:
aes_p8_set_encrypt_key is used to generate the AES round keys, and
aes_p8_encrypt is used to encrypt a single block.
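
A minimal sketch of how the two routines combine; the struct layout and
the helper are illustrative assumptions (the prototypes follow the shape
used by the existing vmx driver):

	struct aes_key {
		u8 key[AES_MAX_KEYLENGTH];	/* expanded round keys */
		int rounds;
	};

	int aes_p8_set_encrypt_key(const u8 *userKey, const int bits,
				   struct aes_key *key);
	void aes_p8_encrypt(const u8 *in, u8 *out, const struct aes_key *key);

	/* Expand a 128-bit user key once, then encrypt one 16-byte block. */
	static void encrypt_one_block(const u8 *user_key, const u8 *in, u8 *out)
	{
		struct aes_key rk;

		aes_p8_set_encrypt_key(user_key, 128, &rk);
		aes_p8_encrypt(in, out, &rk);
	}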

Signed-off-by: Danny Tsen <dtsen@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-01-13 12:10:18 +08:00
Danny Tsen
ca68a96c37 crypto: p10-aes-gcm - An accelerated AES/GCM stitched implementation
Improve the overall performance of AES/GCM encrypt and decrypt
operations on Power10 and later CPUs.

Signed-off-by: Danny Tsen <dtsen@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-01-13 12:10:18 +08:00
Danny Tsen
cc40379b6e crypto: p10-aes-gcm - Glue code for AES/GCM stitched implementation
Signed-off-by: Danny Tsen <dtsen@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-01-13 12:10:17 +08:00
Danny Tsen
3c657e8689 crypto: p10-aes-gcm - Update Kconfig and Makefile
Define CRYPTO_P10_AES_GCM in Kconfig to support the stitched AES/GCM
implementation for Power10 and later CPUs.

Add a new module driver, p10-aes-gcm-crypto.
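
A hypothetical skeleton of what registering such a module driver can
look like; the gcm_aes_alg name and the ISA 3.1 feature check are
assumptions, not the driver's actual code:

	static int __init p10_aes_gcm_init(void)
	{
		/* Only register on Power10 (ISA 3.1) and later CPUs. */
		if (!cpu_has_feature(CPU_FTR_ARCH_31))
			return -ENODEV;
		return crypto_register_aead(&gcm_aes_alg);
	}

	static void __exit p10_aes_gcm_exit(void)
	{
		crypto_unregister_aead(&gcm_aes_alg);
	}

	module_init(p10_aes_gcm_init);
	module_exit(p10_aes_gcm_exit);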

Signed-off-by: Danny Tsen <dtsen@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-01-13 12:09:27 +08:00
Taehee Yoo
c970d42001 crypto: x86/aria - implement aria-avx512
The aria-avx512 implementation uses AVX512 and GFNI. It supports 64-way
parallel processing, so the byteslicing code is changed to support
64-way parallelism. Some aria-avx2 functions, such as encrypt() and
decrypt(), are exported for reuse.

AVX and AVX2 provide only 16 registers, so state has to be stored to and
loaded from memory. AVX512 provides 32 registers, so the s-box layer
needs no store/load at all. This removes the store/load overhead in the
s-box layer and makes the code much simpler.
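
To illustrate the register-pressure point outside the kernel's assembly,
here is a hedged userspace sketch using AVX-512/GFNI intrinsics: 64
bytes of bytesliced state fit in one zmm register, and with 32 of them
the s-box layer can run entirely register-resident.

	#include <immintrin.h>

	/* Illustrative only: one GFNI affine transform covers all 64 byte
	 * lanes at once; no stores/loads are needed around the s-box. */
	static inline __m512i sbox_affine_64way(__m512i state, __m512i matrix)
	{
		return _mm512_gf2p8affine_epi64_epi8(state, matrix, 0);
	}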

Benchmark with modprobe tcrypt mode=610 num_mb=8192, i3-12100:

ARIA-AVX512 (128-bit and 256-bit)
    testing speed of multibuffer ecb(aria) (ecb-aria-avx512) encryption
tcrypt: 1 operation in 1504 cycles (1024 bytes)
tcrypt: 1 operation in 4595 cycles (4096 bytes)
tcrypt: 1 operation in 1763 cycles (1024 bytes)
tcrypt: 1 operation in 5540 cycles (4096 bytes)
    testing speed of multibuffer ecb(aria) (ecb-aria-avx512) decryption
tcrypt: 1 operation in 1502 cycles (1024 bytes)
tcrypt: 1 operation in 4615 cycles (4096 bytes)
tcrypt: 1 operation in 1759 cycles (1024 bytes)
tcrypt: 1 operation in 5554 cycles (4096 bytes)

ARIA-AVX2 with GFNI (128-bit and 256-bit)
    testing speed of multibuffer ecb(aria) (ecb-aria-avx2) encryption
tcrypt: 1 operation in 2003 cycles (1024 bytes)
tcrypt: 1 operation in 5867 cycles (4096 bytes)
tcrypt: 1 operation in 2358 cycles (1024 bytes)
tcrypt: 1 operation in 7295 cycles (4096 bytes)
    testing speed of multibuffer ecb(aria) (ecb-aria-avx2) decryption
tcrypt: 1 operation in 2004 cycles (1024 bytes)
tcrypt: 1 operation in 5956 cycles (4096 bytes)
tcrypt: 1 operation in 2409 cycles (1024 bytes)
tcrypt: 1 operation in 7564 cycles (4096 bytes)

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-01-06 17:15:47 +08:00
Taehee Yoo
37d8d3ae7a crypto: x86/aria - implement aria-avx2
The aria-avx2 implementation uses AVX2, AES-NI, and GFNI. It supports
32-way parallel processing, so the byteslicing code is changed to
support 32-way parallelism. Some aria-avx functions, such as encrypt()
and decrypt(), are exported for reuse.

There are two main pieces of logic, the s-box layer and the diffusion
layer. This code is the same as in the aria-avx implementation, but some
instructions are exchanged because they do not support 256-bit
registers. AES-NI likewise does not support 256-bit registers, so
aesenclast and aesdeclast are used twice, like below:
	vextracti128 $1, ymm0, xmm6;
	vaesenclast xmm7, xmm0, xmm0;
	vaesenclast xmm7, xmm6, xmm6;
	vinserti128 $1, xmm6, ymm0, ymm0;
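
For reference, an equivalent C intrinsics sketch of the four
instructions above (illustrative, not the kernel code):

	#include <immintrin.h>

	/* AES-NI only operates on 128-bit xmm registers, so the 256-bit
	 * state is split, run through aesenclast twice, and reassembled. */
	static inline __m256i aesenclast_256(__m256i state, __m128i rk)
	{
		__m128i lo = _mm256_castsi256_si128(state);
		__m128i hi = _mm256_extracti128_si256(state, 1);

		lo = _mm_aesenclast_si128(lo, rk);
		hi = _mm_aesenclast_si128(hi, rk);

		return _mm256_inserti128_si256(_mm256_castsi128_si256(lo),
					       hi, 1);
	}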

Benchmark with modprobe tcrypt mode=610 num_mb=8192, i3-12100:

ARIA-AVX2 with GFNI (128-bit and 256-bit)
    testing speed of multibuffer ecb(aria) (ecb-aria-avx2) encryption
tcrypt: 1 operation in 2003 cycles (1024 bytes)
tcrypt: 1 operation in 5867 cycles (4096 bytes)
tcrypt: 1 operation in 2358 cycles (1024 bytes)
tcrypt: 1 operation in 7295 cycles (4096 bytes)
    testing speed of multibuffer ecb(aria) (ecb-aria-avx2) decryption
tcrypt: 1 operation in 2004 cycles (1024 bytes)
tcrypt: 1 operation in 5956 cycles (4096 bytes)
tcrypt: 1 operation in 2409 cycles (1024 bytes)
tcrypt: 1 operation in 7564 cycles (4096 bytes)

ARIA-AVX with GFNI (128-bit and 256-bit)
    testing speed of multibuffer ecb(aria) (ecb-aria-avx) encryption
tcrypt: 1 operation in 2761 cycles (1024 bytes)
tcrypt: 1 operation in 9390 cycles (4096 bytes)
tcrypt: 1 operation in 3401 cycles (1024 bytes)
tcrypt: 1 operation in 11876 cycles (4096 bytes)
    testing speed of multibuffer ecb(aria) (ecb-aria-avx) decryption
tcrypt: 1 operation in 2735 cycles (1024 bytes)
tcrypt: 1 operation in 9424 cycles (4096 bytes)
tcrypt: 1 operation in 3369 cycles (1024 bytes)
tcrypt: 1 operation in 11954 cycles (4096 bytes)

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-01-06 17:15:47 +08:00
Taehee Yoo
35344cf30f crypto: x86/aria - do not use magic number offsets of aria_ctx
The aria-avx assembly code accesses members of aria_ctx using
magic-number offsets. If the layout of struct aria_ctx is changed
carelessly, aria-avx will stop working. So we need to ensure that the
members of aria_ctx are accessed with correct offset values, not magic
numbers.

This adds ARIA_CTX_enc_key, ARIA_CTX_dec_key, and ARIA_CTX_rounds to
asm-offsets.c so that correct offset definitions are generated. The
aria-avx assembly code can then access the members of aria_ctx safely
through these definitions.
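
A sketch of what the asm-offsets.c additions look like; the OFFSET()
macro from <linux/kbuild.h> records offsetof() values that the build
turns into assembler-visible #defines. The aria_ctx member names below
are assumptions inferred from the generated symbol names:

	#include <linux/kbuild.h>
	#include <crypto/aria.h>

	static void __used aria_ctx_offsets(void)
	{
		/* Each line yields e.g. "#define ARIA_CTX_enc_key <offset>". */
		OFFSET(ARIA_CTX_enc_key, aria_ctx, enc_key);
		OFFSET(ARIA_CTX_dec_key, aria_ctx, dec_key);
		OFFSET(ARIA_CTX_rounds, aria_ctx, rounds);
	}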

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-01-06 17:15:47 +08:00
Taehee Yoo
8e7d7ce2e3 crypto: x86/aria - add keystream array into request ctx
The AVX-accelerated aria module used a local keystream array, but the
keystream array is too big for the stack. So put the keystream array
into the request ctx instead.
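
A hedged sketch of the pattern (the names are illustrative): reserve
per-request space for the keystream through the skcipher reqsize, then
fetch it from the request instead of declaring it on the stack.

	struct aria_avx_request_ctx {
		u8 keystream[ARIA_AESNI_PARALLEL_BLOCKS * ARIA_BLOCK_SIZE];
	};

	static int aria_avx_init_tfm(struct crypto_skcipher *tfm)
	{
		crypto_skcipher_set_reqsize(tfm,
					    sizeof(struct aria_avx_request_ctx));
		return 0;
	}

	static int aria_avx_crypt(struct skcipher_request *req)
	{
		struct aria_avx_request_ctx *req_ctx = skcipher_request_ctx(req);

		/* ... fill req_ctx->keystream, no large stack array ... */
		return 0;
	}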

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-01-06 17:15:47 +08:00
David Rientjes
91dfd98216 crypto: ccp - Avoid page allocation failure warning for SEV_GET_ID2
For SEV_GET_ID2, the user-provided length has no specified limit because
the length of the ID may change in the future. The kernel memory
allocation, however, is implicitly limited to 4MB on x86 by the page
allocator; beyond that, the kzalloc() will fail.

When this happens, it is best not to spam the kernel log with the
warning. Simply fail the allocation and return -ENOMEM to the user.
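
The essence of the fix, sketched with assumed variable names: pass
__GFP_NOWARN so an oversized request fails quietly instead of triggering
a page-allocation warning.

	id_blob = kzalloc(input.length, GFP_KERNEL | __GFP_NOWARN);
	if (!id_blob)
		return -ENOMEM;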

Fixes: d6112ea0cb ("crypto: ccp - introduce SEV_GET_ID2 command")
Reported-by: Andy Nguyen <theflow@google.com>
Reported-by: Peter Gonda <pgonda@google.com>
Suggested-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-01-06 17:15:47 +08:00
Herbert Xu
8e613cec25 crypto: talitos - Remove GFP_DMA and add DMA alignment padding
GFP_DMA does not guarantee that the returned memory is aligned
for DMA.  It should be removed where it is superfluous.

However, kmalloc may start returning DMA-unaligned memory in future
so fix this by adding the alignment by hand.
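
One common way to add the alignment by hand, as a sketch: over-allocate
by the DMA cache alignment and round the pointer up, keeping the raw
pointer around for kfree().

	int align = dma_get_cache_alignment();
	void *raw = kmalloc(len + align - 1, GFP_KERNEL);
	void *buf = raw ? PTR_ALIGN(raw, align) : NULL;

	/* ... DMA through buf; kfree(raw) when done ... */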

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-01-06 17:15:47 +08:00
Herbert Xu
199354d7fb crypto: caam - Remove GFP_DMA and add DMA alignment padding
GFP_DMA does not guarantee that the returned memory is aligned
for DMA.  It should be removed where it is superfluous.

However, kmalloc may start returning DMA-unaligned memory in future
so fix this by adding the alignment by hand.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-01-06 17:15:47 +08:00
Nicolai Stange
c27b2d2012 crypto: testmgr - allow ecdsa-nist-p256 and -p384 in FIPS mode
The kernel provides implementations of the NIST ECDSA signature
verification primitives. For key sizes of 256 and 384 bits,
respectively, they are approved and can be enabled in FIPS mode. Do so.
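
In testmgr, FIPS approval is a per-entry flag in alg_test_descs[]; a
sketch of the shape of such an entry (see crypto/testmgr.c for the real
ones):

	{
		.alg = "ecdsa-nist-p256",
		.test = alg_test_akcipher,
		.fips_allowed = 1,
		.suite = {
			.akcipher = __VECS(ecdsa_nist_p256_tv_template)
		}
	},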

Signed-off-by: Nicolai Stange <nstange@suse.de>
Signed-off-by: Vladis Dronov <vdronov@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-01-06 17:15:47 +08:00
Nicolai Stange
2912eb9b17 crypto: testmgr - disallow plain ghash in FIPS mode
ghash may be used only as part of the gcm(aes) construction in FIPS
mode. Since commit d6097b8d5d ("crypto: api - allow algs only in
specific constructions in FIPS mode") there is support for using spawns
which are by themselves marked as non-approved from within approved
template instantiations. So simply mark plain ghash as non-approved in
testmgr to block any attempt at direct instantiation in FIPS mode.

Signed-off-by: Nicolai Stange <nstange@suse.de>
Signed-off-by: Vladis Dronov <vdronov@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-01-06 17:15:47 +08:00
Nicolai Stange
1ce94a8c2c crypto: testmgr - disallow plain cbcmac(aes) in FIPS mode
cbcmac(aes) may be used only as part of the ccm(aes) construction in
FIPS mode. Since commit d6097b8d5d ("crypto: api - allow algs only in
specific constructions in FIPS mode") there is support for using spawns
which are by themselves marked as non-approved from within approved
template instantiations. So simply mark plain cbcmac(aes) as
non-approved in testmgr to block any attempt at direct instantiation in
FIPS mode.

Signed-off-by: Nicolai Stange <nstange@suse.de>
Signed-off-by: Vladis Dronov <vdronov@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-01-06 17:15:47 +08:00
Vladis Dronov
b6f5278003 crypto: s390/aes - drop redundant xts key check
xts_fallback_setkey() in xts_aes_set_key() will now enforce the key size
rule in FIPS mode when setting up the fallback algorithm keys, which
makes the check in xts_aes_set_key() redundant or unreachable. So just
drop this check.

xts_fallback_setkey() now reaches the key size check in
xts_verify_key():

xts_fallback_setkey()
  crypto_skcipher_setkey() [ skcipher_setkey_unaligned() ]
    cipher->setkey() { .setkey = xts_setkey }
      xts_setkey()
        xts_verify_key()

Signed-off-by: Vladis Dronov <vdronov@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-01-06 17:15:47 +08:00
Vladis Dronov
0ee433676e crypto: xts - drop xts_check_key()
xts_check_key() is obsoleted by xts_verify_key(). Over time XTS crypto
drivers adopted the newer xts_verify_key() variant, but xts_check_key()
is still used by a number of drivers. Switch drivers to use the newer
xts_verify_key() and make a couple of cleanups. This allows us to drop
xts_check_key() completely and avoid redundancy.

Signed-off-by: Vladis Dronov <vdronov@redhat.com>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Nicolai Stange <nstange@suse.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-01-06 17:15:47 +08:00
Nicolai Stange
1c4428b295 crypto: xts - restrict key lengths to approved values in FIPS mode
According to FIPS 140-3 IG C.I., only (total) key lengths of either 256
bits or 512 bits are allowed with xts(aes). Make xts_verify_key() reject
anything else in FIPS mode.

As xts(aes) is the only approved xts() template instantiation in FIPS mode,
the new restriction implemented in xts_verify_key() effectively only
applies to this particular construction.
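
The restriction reduces to a short check in xts_verify_key(); a sketch,
counting the combined XTS key in bytes (2 x 128 bits = 32 bytes, 2 x 256
bits = 64 bytes):

	if (fips_enabled && keylen != 32 && keylen != 64)
		return -EINVAL;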

Signed-off-by: Nicolai Stange <nstange@suse.de>
Signed-off-by: Vladis Dronov <vdronov@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-01-06 17:15:46 +08:00
Herbert Xu
39a76cf1f5 crypto: sun8i-ss - Remove GFP_DMA and add DMA alignment padding
GFP_DMA does not guarantee that the returned memory is aligned
for DMA.  In fact for sun8i-ss it is superfluous and can be removed.

However, kmalloc may start returning DMA-unaligned memory in future
so fix this by adding the alignment by hand.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Tested-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Acked-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-01-06 17:15:46 +08:00
Herbert Xu
4f289826fe crypto: caam - Avoid GCC memset bug warning
Certain versions of gcc don't like the memcpy with a NULL dst (which
only happens with a zero length). This only happens when debugging is
enabled, so add an if clause to work around these warnings.

A similar warning used to be generated by sparse but that was
fixed years ago.
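
The workaround amounts to guarding the copy, sketched below with
illustrative names: for a zero length the call is skipped entirely, so
gcc never sees the possibly-NULL destination.

	if (len)
		memcpy(dst, src, len);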

Link: https://lore.kernel.org/lkml/202210290446.qBayTfzl-lkp@intel.com
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Kees Cook <keescook@chromium.org>
Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Tested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-01-06 17:15:46 +08:00
Herbert Xu
7361d1bc30 lib/mpi: Fix buffer overrun when SG is too long
The helper mpi_read_raw_from_sgl sets the number of entries in the SG
list according to nbytes. However, if the last entry in the SG list
contains more data than nbytes, then the copy may overrun the buffer,
because only enough memory for nbytes was allocated.
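
The general shape of such a fix, as a hedged sketch with illustrative
names: clamp what is consumed from each SG entry to the remaining byte
budget so the copy can never outrun the nbytes-sized buffer.

	len = min_t(unsigned int, sg->length, nbytes);
	memcpy(buf + copied, sg_virt(sg), len);
	copied += len;
	nbytes -= len;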

Fixes: 2d4d1eea54 ("lib/mpi: Add mpi sgl helpers")
Reported-by: Roberto Sassu <roberto.sassu@huaweicloud.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-01-06 17:15:46 +08:00
Herbert Xu
e20d5a22bd crypto: lib/blake2s - Split up test function to halve stack usage
Reduce the stack usage further by splitting up the test function.

Also squash blocks and unaligned_blocks into one array.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-12-30 22:56:27 +08:00
Lukas Bulwahn
49bc6a7786 crypto: ux500 - update debug config after ux500 cryp driver removal
Commit 453de3eb08 ("crypto: ux500/cryp - delete driver") removes the
config CRYPTO_DEV_UX500_CRYP, but leaves an obsolete reference in the
dependencies of config CRYPTO_DEV_UX500_DEBUG.

Remove that obsolete reference, and adjust the description while at it.

Fixes: 453de3eb08 ("crypto: ux500/cryp - delete driver")
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-12-30 22:56:27 +08:00
Arnd Bergmann
8031d1f678 crypto: wp512 - disable kmsan checks in wp512_process_buffer()
The memory sanitizer causes excessive register spills in this function:

crypto/wp512.c:782:13: error: stack frame size (2104) exceeds limit (2048) in 'wp512_process_buffer' [-Werror,-Wframe-larger-than]

Assume that this one is safe, and mark it as needing no checks to
get the stack usage back down to the normal level.
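
The mechanism is the __no_kmsan_checks function attribute; a sketch of
its application (the function body itself is unchanged):

	static __no_kmsan_checks void wp512_process_buffer(struct wp512_ctx *wctx)
	{
		/* ... unchanged Whirlpool round computations ... */
	}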

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-12-30 22:56:27 +08:00
Ard Biesheuvel
aa9695157f crypto: scatterwalk - use kmap_local() not kmap_atomic()
kmap_atomic() is used to create short-lived mappings of pages that may
not be accessible via the kernel direct map. This is only needed on
32-bit architectures that implement CONFIG_HIGHMEM, but it can be used
on 64-bit and other architectures too, where the returned mapping is
simply the kernel direct address of the page.

However, kmap_atomic() does not support migration on CONFIG_HIGHMEM
configurations, due to the use of per-CPU kmap slots, and so it disables
preemption on all architectures, not just the 32-bit ones. This implies
that all scatterwalk based crypto routines essentially execute with
preemption disabled all the time, which is less than ideal.

So let's switch scatterwalk_map/_unmap and the shash/ahash routines to
kmap_local() instead, which serves a similar purpose, but without the
resulting impact on preemption on architectures that have no need for
CONFIG_HIGHMEM.
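
A sketch of the substitution in scatterwalk_map(), which should be
representative of the change: kmap_local_page() provides the same
short-lived mapping but leaves preemption enabled.

	static inline void *scatterwalk_map(struct scatter_walk *walk)
	{
		return kmap_local_page(scatterwalk_page(walk)) +
		       offset_in_page(walk->offset);
	}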

Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "Elliott, Robert (Servers)" <elliott@hpe.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-12-30 22:56:27 +08:00