Commit Graph

334371 Commits

Author SHA1 Message Date
Jussi Kivilinna 044ab52578 crypto: cast5/cast6 - move lookup tables to shared module
CAST5 and CAST6 both use same lookup tables, which can be moved shared module
'cast_common'.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-12-06 17:16:26 +08:00
Shan Wei f0fcf2002b padata: use __this_cpu_read per-cpu helper
For bottom halves off, __this_cpu_read is better.

Signed-off-by: Shan Wei <davidshan@tencent.com>
Reviewed-by: Christoph Lameter <cl@linux.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-12-06 17:16:23 +08:00
Sachin Kamat a465348ff5 crypto: s5p-sss - Fix compilation error
struct s3c2410_dma_client gets defined multiple times as it is defined
in more than one header file. Changing it at the header file level causes
many more build breakages as they are interdependent in a complex way.
Hence fixing this problem by using the mach version of the header file.

Without this patch, following build error is observed:
arch/arm/plat-samsung/include/plat/dma-pl330.h:106:27: error:
redefinition of struct s3c2410_dma_client

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-11-09 17:32:33 +08:00
Axel Lin 14198dd64b crypto: picoxcell - Add terminating entry for platform_device_id table
The platform_device_id table is supposed to be zero-terminated.

Signed-off-by: Axel Lin <axel.lin@ingics.com>
Acked-by: Jamie Iles <jamie@jamieiles.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-11-09 17:32:33 +08:00
Sebastian Andrzej Siewior d87d77128f crypto: omap-aes - select BLKCIPHER2
without it:
|drivers/built-in.o:(.data+0x14588): undefined reference to `crypto_ablkcipher_type'
|drivers/built-in.o:(.data+0x14668): undefined reference to `crypto_ablkcipher_type'

Not sure when this broke.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-11-09 17:32:32 +08:00
Jussi Kivilinna d9b1d2e7e1 crypto: camellia - add AES-NI/AVX/x86_64 assembler implementation of camellia cipher
This patch adds AES-NI/AVX/x86_64 assembler implementation of Camellia block
cipher. Implementation process data in sixteen block chunks, which are
byte-sliced and AES SubBytes is reused for Camellia s-box with help of pre-
and post-filtering.

Patch has been tested with tcrypt and automated filesystem tests.

tcrypt test results:

Intel Core i5-2450M:

camellia-aesni-avx vs camellia-asm-x86_64-2way:
128bit key:                                             (lrw:256bit)    (xts:256bit)
size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
16B     0.98x   0.96x   0.99x   0.96x   0.96x   0.95x   0.95x   0.94x   0.97x   0.98x
64B     0.99x   0.98x   1.00x   0.98x   0.98x   0.99x   0.98x   0.93x   0.99x   0.98x
256B    2.28x   2.28x   1.01x   2.29x   2.25x   2.24x   1.96x   1.97x   1.91x   1.90x
1024B   2.57x   2.56x   1.00x   2.57x   2.51x   2.53x   2.19x   2.17x   2.19x   2.22x
8192B   2.49x   2.49x   1.00x   2.53x   2.48x   2.49x   2.17x   2.17x   2.22x   2.22x

256bit key:                                             (lrw:384bit)    (xts:512bit)
size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
16B     0.97x   0.98x   0.99x   0.97x   0.97x   0.96x   0.97x   0.98x   0.98x   0.99x
64B     1.00x   1.00x   1.01x   0.99x   0.98x   0.99x   0.99x   0.99x   0.99x   0.99x
256B    2.37x   2.37x   1.01x   2.39x   2.35x   2.33x   2.10x   2.11x   1.99x   2.02x
1024B   2.58x   2.60x   1.00x   2.58x   2.56x   2.56x   2.28x   2.29x   2.28x   2.29x
8192B   2.50x   2.52x   1.00x   2.56x   2.51x   2.51x   2.24x   2.25x   2.26x   2.29x

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-11-09 17:32:32 +08:00
Jussi Kivilinna cf582cceda crypto: camellia-x86_64 - share common functions and move structures and function definitions to header file
Prepare camellia-x86_64 functions to be reused from AVX/AESNI implementation
module.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-11-09 17:32:31 +08:00
Jussi Kivilinna bf9c518186 crypto: tcrypt - add async speed test for camellia cipher
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-11-09 17:32:28 +08:00
Wei Yongjun d48e366e6e crypto: tegra-aes - fix error-valued pointer dereference
clk_put(dd->aes_clk) will dereference an error-valued pointer since the
dd->aes_clk is a ERR_PTR() value. The correct check is call clk_put()
if !IS_ERR(dd->aes_clk).

dpatch engine is used to auto generate this patch.
(https://github.com/weiyj/dpatch)

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-11-09 17:32:27 +08:00
Wei Yongjun 3200da8d9a crypto: tegra - fix missing unlock on error case
Add the missing unlock on the error handling path in function
tegra_aes_get_random() and tegra_aes_rng_reset().

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-10-24 21:10:56 +08:00
Jussi Kivilinna c12ab20b16 crypto: cast5/avx - avoid using temporary stack buffers
Introduce new assembler functions to avoid use temporary stack buffers in glue
code. This also allows use of vector instructions for xoring output in CTR and
CBC modes and construction of IVs for CTR mode.

ECB mode sees ~0.5% decrease in speed because added one extra function
call. CBC mode decryption and CTR mode benefit from vector operations
and gain ~5%.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-10-24 21:10:55 +08:00
Jussi Kivilinna facd416fbc crypto: serpent/avx - avoid using temporary stack buffers
Introduce new assembler functions to avoid use temporary stack buffers in glue
code. This also allows use of vector instructions for xoring output in CTR and
CBC modes and construction of IVs for CTR mode.

ECB mode sees ~0.5% decrease in speed because added one extra function
call. CBC mode decryption and CTR mode benefit from vector operations
and gain ~3%.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-10-24 21:10:55 +08:00
Jussi Kivilinna 8435a3c300 crypto: twofish/avx - avoid using temporary stack buffers
Introduce new assembler functions to avoid use temporary stack buffers in glue
code. This also allows use of vector instructions for xoring output in CTR and
CBC modes and construction of IVs for CTR mode.

ECB mode sees ~0.2% decrease in speed because added one extra function
call. CBC mode decryption and CTR mode benefit from vector operations
and gain ~3%.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-10-24 21:10:55 +08:00
Jussi Kivilinna cba1cce054 crypto: cast6/avx - avoid using temporary stack buffers
Introduce new assembler functions to avoid use temporary stack buffers in
glue code. This also allows use of vector instructions for xoring output
in CTR and CBC modes and construction of IVs for CTR mode.

ECB mode sees ~0.5% decrease in speed because added one extra function
call. CBC mode decryption and CTR mode benefit from vector operations
and gain ~2%.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-10-24 21:10:54 +08:00
Jussi Kivilinna 58990986f1 crypto: x86/glue_helper - use le128 instead of u128 for CTR mode
'u128' currently used for CTR mode is on little-endian 'long long' swapped
and would require extra swap operations by SSE/AVX code. Use of le128
instead of u128 allows IV calculations to be done with vector registers
easier.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-10-24 21:10:54 +08:00
Jussi Kivilinna e080b17a8c crypto: testmgr - add new larger DES3_EDE testvectors
Most DES3_EDE testvectors are short and do not test parallelised codepaths
well. Add larger testvectors to test large crypto operations and to test
multi-page crypto with DES3_EDE.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-10-24 21:10:52 +08:00
Jussi Kivilinna 8163fc30d1 crypto: testmgr - add new larger DES testvectors
Most DES testvectors are short and do not test parallelised codepaths
well. Add larger testvectors to test large crypto operations and to test
multi-page crypto with DES.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-10-24 21:10:52 +08:00
Jussi Kivilinna c3b9e8f6a4 crypto: testmgr - add new larger AES testvectors
Most AES testvectors are short and do not test parallelised codepaths
well. Add larger testvectors to test large crypto operations and to test
multi-page crypto with AES.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-10-24 21:10:52 +08:00
Jussi Kivilinna 9f28e97d1c crypto: testmgr - expand serpent test vectors
AVX2 implementation of serpent cipher processes 16 blocks parallel, so
we need to make test vectors larger to check parallel code paths.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-10-24 21:10:51 +08:00
Jussi Kivilinna 963ae397f3 crypto: testmgr - expand blowfish test vectors
AVX2 implementation of blowfish cipher processes 32 blocks parallel, so
we need to make test vectors larger to check parallel code paths.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-10-24 21:10:51 +08:00
Jussi Kivilinna be6314b4cc crypto: testmgr - expand camellia test vectors
AVX/AES-NI implementation of camellia cipher processes 16 blocks
parallel, so we need to make test vectors larger to check parallel
code paths.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-10-24 21:10:51 +08:00
Salman Qazi ba1ee07090 crypto: vmac - Make VMAC work when blocks aren't aligned
VMAC implementation, as it is, does not work with blocks that
are not multiples of 128-bytes.  Furthermore, this is a problem
when using the implementation on scatterlists, even
when the complete plain text is 128-byte multiple, as the pieces
that get passed to vmac_update can be pretty much any size.

I also added test cases for unaligned blocks.

Signed-off-by: Salman Qazi <sqazi@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-10-15 22:33:20 +08:00
Wei Yongjun 7291a932c6 crypto: talitos - convert to use be16_add_cpu()
Convert cpu_to_be16(be16_to_cpu(E1) + E2) to use be16_add_cpu().

dpatch engine is used to auto generate this patch.
(https://github.com/weiyj/dpatch)

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-10-15 22:18:26 +08:00
Tim Chen e3899e4df0 crypto: tcrypt - Added speed test in tcrypt for crc32c
This patch adds a test case in tcrypt to perform speed test for
crc32c checksum calculation.

Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-10-15 22:18:25 +08:00
Tim Chen 6a8ce1ef39 crypto: crc32c - Optimize CRC32C calculation with PCLMULQDQ instruction
This patch adds the crc_pcl function that calculates CRC32C checksum using the
PCLMULQDQ instruction on processors that support this feature. This will
provide speedup over using CRC32 instruction only.
The usage of PCLMULQDQ necessitate the invocation of kernel_fpu_begin and
kernel_fpu_end and incur some overhead.  So the new crc_pcl function is only
invoked for buffer size of 512 bytes or more.  Larger sized
buffers will expect to see greater speedup.  This feature is best used coupled
with eager_fpu which reduces the kernel_fpu_begin/end overhead.  For
buffer size of 1K the speedup is around 1.6x and for buffer size greater than
4K, the speedup is around 3x compared to original implementation in crc32c-intel
module. Test was performed on Sandy Bridge based platform with constant frequency
set for cpu.

A white paper detailing the algorithm can be found here:
http://download.intel.com/design/intarch/papers/323405.pdf

Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-10-15 22:18:24 +08:00