Commit Graph

601777 Commits

Author SHA1 Message Date
Herbert Xu d56d72c6a0 KEYS: Use skcipher for big keys
This patch replaces use of the obsolete blkcipher with skcipher.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: David Howells <dhowells@redhat.com>
2016-06-24 21:24:58 +08:00
Bin Liu 85e0687f8f crypto: omap-sham - set sw fallback to 240 bytes
Adds software fallback support for small crypto requests. In these cases,
it is undesirable to use DMA, as setting it up itself is rather heavy
operation. Gives about 40% extra performance in ipsec usecase.

Signed-off-by: Bin Liu <b-liu@ti.com>
[t-kristo@ti.com: dropped the extra traces, updated some comments
 on the code]
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-06-24 21:24:58 +08:00
Lokesh Vutla b973eaab68 crypto: omap - do not call dmaengine_terminate_all
The extra call to dmaengine_terminate_all is not needed, as the DMA
is not running at this point. This improves performance slightly.

Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-06-24 21:24:57 +08:00
Tero Kristo 65e7a549af crypto: omap-sham - change queue size from 1 to 10
Change crypto queue size from 1 to 10 for omap SHA driver. This should
allow clients to enqueue requests more effectively to avoid serializing
whole crypto sequences, giving extra performance.

Signed-off-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-06-24 21:24:57 +08:00
Tero Kristo e93f767bec crypto: omap-sham - use runtime_pm autosuspend for clock handling
Calling runtime PM API for every block causes serious performance hit to
crypto operations that are done on a long buffer. As crypto is performed
on a page boundary, encrypting large buffers can cause a series of crypto
operations divided by page. The runtime PM API is also called those many
times.

Convert the driver to use runtime_pm autosuspend instead, with a default
timeout value of 1 second. This results in upto ~50% speedup.

Signed-off-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-06-24 21:24:57 +08:00
Salvatore Benedetto 3c4b23901a crypto: ecdh - Add ECDH software support
* Implement ECDH under kpp API
 * Provide ECC software support for curve P-192 and
   P-256.
 * Add kpp test for ECDH with data generated by OpenSSL

Signed-off-by: Salvatore Benedetto <salvatore.benedetto@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-06-23 18:29:57 +08:00
Salvatore Benedetto 802c7f1c84 crypto: dh - Add DH software implementation
* Implement MPI based Diffie-Hellman under kpp API
 * Test provided uses data generad by OpenSSL

Signed-off-by: Salvatore Benedetto <salvatore.benedetto@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-06-23 18:29:56 +08:00
Salvatore Benedetto 4e5f2c4007 crypto: kpp - Key-agreement Protocol Primitives API (KPP)
Add key-agreement protocol primitives (kpp) API which allows to
implement primitives required by protocols such as DH and ECDH.
The API is composed mainly by the following functions
 * set_secret() - It allows the user to set his secret, also
   referred to as his private key, along with the parameters
   known to both parties involved in the key-agreement session.
 * generate_public_key() - It generates the public key to be sent to
   the other counterpart involved in the key-agreement session. The
   function has to be called after set_params() and set_secret()
 * generate_secret() - It generates the shared secret for the session

Other functions such as init() and exit() are provided for allowing
cryptographic hardware to be inizialized properly before use

Signed-off-by: Salvatore Benedetto <salvatore.benedetto@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-06-23 18:29:56 +08:00
Megha Dey 331bf739c4 crypto: sha1-mb - async implementation for sha1-mb
Herbert wants the sha1-mb algorithm to have an async implementation:
https://lkml.org/lkml/2016/4/5/286.
Currently, sha1-mb uses an async interface for the outer algorithm
and a sync interface for the inner algorithm. This patch introduces
a async interface for even the inner algorithm.

Signed-off-by: Megha Dey <megha.dey@linux.intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-06-23 18:29:55 +08:00
Herbert Xu 820573ebd6 crypto: ghash-ce - Fix cryptd reordering
This patch fixes an old bug where requests can be reordered because
some are processed by cryptd while others are processed directly
in softirq context.

The fix is to always postpone to cryptd if there are currently
requests outstanding from the same tfm.

This patch also removes the redundant use of cryptd in the async
init function as init never touches the FPU.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-06-23 18:29:54 +08:00
Herbert Xu 7271b33cb8 crypto: ghash-clmulni - Fix cryptd reordering
This patch fixes an old bug where requests can be reordered because
some are processed by cryptd while others are processed directly
in softirq context.

The fix is to always postpone to cryptd if there are currently
requests outstanding from the same tfm.

This patch also removes the redundant use of cryptd in the async
init function as init never touches the FPU.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-06-23 18:29:53 +08:00
Herbert Xu 88407a39b5 crypto: ablk_helper - Fix cryptd reordering
This patch fixes an old bug where requests can be reordered because
some are processed by cryptd while others are processed directly
in softirq context.

The fix is to always postpone to cryptd if there are currently
requests outstanding from the same tfm.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-06-23 18:29:53 +08:00
Herbert Xu 38b2f68b42 crypto: aesni - Fix cryptd reordering problem on gcm
This patch fixes an old bug where gcm requests can be reordered
because some are processed by cryptd while others are processed
directly in softirq context.

The fix is to always postpone to cryptd if there are currently
requests outstanding from the same tfm.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-06-23 18:29:52 +08:00
Herbert Xu 81760ea6a9 crypto: cryptd - Add helpers to check whether a tfm is queued
This patch adds helpers to check whether a given tfm is currently
queued.  This is meant to be used by ablk_helper and similar
entities to ensure that no reordering is introduced because of
requests queued in cryptd with respect to requests being processed
in softirq context.

The per-cpu queue length limit is also increased to 1000 in line
with network limits.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-06-23 18:29:52 +08:00
Romain Perier 47a1f0b2d1 crypto: marvell - Increase the size of the crypto queue
Now that crypto requests are chained together at the DMA level, we
increase the size of the crypto queue for each engine. The result is
that as the backlog list is reached later, it does not stop the crypto
stack from sending asychronous requests, so more cryptographic tasks
are processed by the engines.

Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-06-23 18:29:52 +08:00
Romain Perier 85030c5168 crypto: marvell - Add support for chaining crypto requests in TDMA mode
The Cryptographic Engines and Security Accelerators (CESA) supports the
Multi-Packet Chain Mode. With this mode enabled, multiple tdma requests
can be chained and processed by the hardware without software
intervention. This mode was already activated, however the crypto
requests were not chained together. By doing so, we reduce significantly
the number of IRQs. Instead of being interrupted at the end of each
crypto request, we are interrupted at the end of the last cryptographic
request processed by the engine.

This commits re-factorizes the code, changes the code architecture and
adds the required data structures to chain cryptographic requests
together before sending them to an engine (stopped or possibly already
running).

Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-06-23 18:29:51 +08:00
Romain Perier bf8f91e711 crypto: marvell - Add load balancing between engines
This commits adds support for fine grained load balancing on
multi-engine IPs. The engine is pre-selected based on its current load
and on the weight of the crypto request that is about to be processed.
The global crypto queue is also moved to each engine. These changes are
required to allow chaining crypto requests at the DMA level. By using
a crypto queue per engine, we make sure that we keep the state of the
tdma chain synchronized with the crypto queue. We also reduce contention
on 'cesa_dev->lock' and improve parallelism.

Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-06-23 18:29:51 +08:00
Romain Perier 2786cee8e5 crypto: marvell - Move SRAM I/O operations to step functions
Currently the crypto requests were sent to engines sequentially.
This commit moves the SRAM I/O operations from the prepare to the step
functions. It provides flexibility for future works and allow to prepare
a request while the engine is running.

Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-06-23 18:17:25 +08:00
Romain Perier 1bf6682cb3 crypto: marvell - Add a complete operation for async requests
So far, the 'process' operation was used to check if the current request
was correctly handled by the engine, if it was the case it copied
information from the SRAM to the main memory. Now, we split this
operation. We keep the 'process' operation, which still checks if the
request was correctly handled by the engine or not, then we add a new
operation for completion. The 'complete' method copies the content of
the SRAM to memory. This will soon become useful if we want to call
the process and the complete operations from different locations
depending on the type of the request (different cleanup logic).

Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-06-23 18:17:24 +08:00
Romain Perier 53da740fed crypto: marvell - Move tdma chain out of mv_cesa_tdma_req and remove it
Currently, the only way to access the tdma chain is to use the 'req'
union from a mv_cesa_{ablkcipher,ahash}. This will soon become a problem
if we want to handle the TDMA chaining vs standard/non-DMA processing in
a generic way (with generic functions at the cesa.c level detecting
whether the request should be queued at the DMA level or not). Hence the
decision to move the chain field a the mv_cesa_req level at the expense
of adding 2 void * fields to all request contexts (including non-DMA
ones) and to remove the type completly. To limit the overhead, we get
rid of the type field, which can now be deduced from the req->chain.first
value. Once these changes are done the union is no longer needed, so
remove it and move mv_cesa_ablkcipher_std_req and mv_cesa_req
to mv_cesa_ablkcipher_req directly. There are also no needs to keep the
'base' field into the union of mv_cesa_ahash_req, so move it into the
upper structure.

Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-06-23 18:17:23 +08:00
Romain Perier bac8e805a3 crypto: marvell - Copy IV vectors by DMA transfers for acipher requests
Add a TDMA descriptor at the end of the request for copying the
output IV vector via a DMA transfer. This is a good way for offloading
as much as processing as possible to the DMA and the crypto engine.
This is also required for processing multiple cipher requests
in chained mode, otherwise the content of the IV vector would be
overwritten by the last processed request.

Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-06-23 18:14:03 +08:00
Romain Perier b99acf79a1 crypto: marvell - Fix wrong type check in dma functions
So far, the way that the type of a TDMA operation was checked was wrong.
We have to use the type mask in order to get the right part of the flag
containing the type of the operation.

Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-06-23 18:14:01 +08:00
Romain Perier f62830886f crypto: marvell - Check engine is not already running when enabling a req
Add a BUG_ON() call when the driver tries to launch a crypto request
while the engine is still processing the previous one. This replaces
a silent system hang by a verbose kernel panic with the associated
backtrace to let the user know that something went wrong in the CESA
driver.

Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-06-23 18:14:00 +08:00
Romain Perier e26df73f80 crypto: marvell - Add a macro constant for the size of the crypto queue
Adding a macro constant to be used for the size of the crypto queue,
instead of using a numeric value directly. It will be easier to
maintain in case we add more than one crypto queue of the same size.

Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-06-23 18:13:59 +08:00
Herbert Xu 7ea0da1d75 crypto: chacha20-simd - Use generic code for small requests
On 16-byte requests the optimised version is actually slower than
the generic code, so we should simply use that instead.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Cheers,
2016-06-23 18:13:58 +08:00