Commit Graph

494 Commits

Author SHA1 Message Date
Dan Williams f9dd213437 Merge branch 'md-raid6-accel' into ioat3.2
Conflicts:
	include/linux/dmaengine.h
2009-09-08 17:42:29 -07:00
Dan Williams cb3c82992f async_tx: raid6 recovery self test
Port drivers/md/raid6test/test.c to use the async raid6 recovery
routines.  This is meant as a unit test for raid6 acceleration drivers.  In
addition to the 16-drive test case this implements tests for the 4-disk and
5-disk special cases (dma devices can not generically handle less than 2
sources), and adds a test for the D+Q case.

Reviewed-by: Andre Noll <maan@systemlinux.org>
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-08-29 19:09:28 -07:00
Dan Williams 0a82a6239b async_tx: add support for asynchronous RAID6 recovery operations
async_raid6_2data_recov() recovers two data disk failures

 async_raid6_datap_recov() recovers a data disk and the P disk

These routines are a port of the synchronous versions found in
drivers/md/raid6recov.c.  The primary difference is breaking out the xor
operations into separate calls to async_xor.  Two helper routines are
introduced to perform scalar multiplication where needed.
async_sum_product() multiplies two sources by scalar coefficients and
then sums (xor) the result.  async_mult() simply multiplies a single
source by a scalar.

This implemention also includes, in contrast to the original
synchronous-only code, special case handling for the 4-disk and 5-disk
array cases.  In these situations the default N-disk algorithm will
present 0-source or 1-source operations to dma devices.  To cover for
dma devices where the minimum source count is 2 we implement 4-disk and
5-disk handling in the recovery code.

[ Impact: asynchronous raid6 recovery routines for 2data and datap cases ]

Cc: Yuri Tikhonov <yur@emcraft.com>
Cc: Ilya Yanok <yanok@emcraft.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: David Woodhouse <David.Woodhouse@intel.com>
Reviewed-by: Andre Noll <maan@systemlinux.org>
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-08-29 19:09:27 -07:00
Dan Williams b2f46fd8ef async_tx: add support for asynchronous GF multiplication
[ Based on an original patch by Yuri Tikhonov ]

This adds support for doing asynchronous GF multiplication by adding
two additional functions to the async_tx API:

 async_gen_syndrome() does simultaneous XOR and Galois field
    multiplication of sources.

 async_syndrome_val() validates the given source buffers against known P
    and Q values.

When a request is made to run async_pq against more than the hardware
maximum number of supported sources we need to reuse the previous
generated P and Q values as sources into the next operation.  Care must
be taken to remove Q from P' and P from Q'.  For example to perform a 5
source pq op with hardware that only supports 4 sources at a time the
following approach is taken:

p, q = PQ(src0, src1, src2, src3, COEF({01}, {02}, {04}, {08}))
p', q' = PQ(p, q, q, src4, COEF({00}, {01}, {00}, {10}))

p' = p + q + q + src4 = p + src4
q' = {00}*p + {01}*q + {00}*q + {10}*src4 = q + {10}*src4

Note: 4 is the minimum acceptable maxpq otherwise we punt to
synchronous-software path.

The DMA_PREP_CONTINUE flag indicates to the driver to reuse p and q as
sources (in the above manner) and fill the remaining slots up to maxpq
with the new sources/coefficients.

Note1: Some devices have native support for P+Q continuation and can skip
this extra work.  Devices with this capability can advertise it with
dma_set_maxpq.  It is up to each driver how to handle the
DMA_PREP_CONTINUE flag.

Note2: The api supports disabling the generation of P when generating Q,
this is ignored by the synchronous path but is implemented by some dma
devices to save unnecessary writes.  In this case the continuation
algorithm is simplified to only reuse Q as a source.

Cc: H. Peter Anvin <hpa@zytor.com>
Cc: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Yuri Tikhonov <yur@emcraft.com>
Signed-off-by: Ilya Yanok <yanok@emcraft.com>
Reviewed-by: Andre Noll <maan@systemlinux.org>
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-08-29 19:09:27 -07:00
Dan Williams 95475e5711 async_tx: remove walk of tx->parent chain in dma_wait_for_async_tx
We currently walk the parent chain when waiting for a given tx to
complete however this walk may race with the driver cleanup routine.
The routines in async_raid6_recov.c may fall back to the synchronous
path at any point so we need to be prepared to call async_tx_quiesce()
(which calls  dma_wait_for_async_tx).  To remove the ->parent walk we
guarantee that every time a dependency is attached ->issue_pending() is
invoked, then we can simply poll the initial descriptor until
completion.

This also allows for a lighter weight 'issue pending' implementation as
there is no longer a requirement to iterate through all the channels'
->issue_pending() routines as long as operations have been submitted in
an ordered chain.  async_tx_issue_pending() is added for this case.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-08-29 19:09:27 -07:00
Dan Williams af1f951eb6 async_tx: kill needless module_{init|exit}
If module_init and module_exit are nops then neither need to be defined.

[ Impact: pure cleanup ]

Reviewed-by: Andre Noll <maan@systemlinux.org>
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-08-29 19:09:26 -07:00
Dan Williams ad283ea4a3 async_tx: add sum check flags
Replace the flat zero_sum_result with a collection of flags to contain
the P (xor) zero-sum result, and the soon to be utilized Q (raid6 reed
solomon syndrome) zero-sum result.  Use the SUM_CHECK_ namespace instead
of DMA_ since these flags will be used on non-dma-zero-sum enabled
platforms.

Reviewed-by: Andre Noll <maan@systemlinux.org>
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-08-29 19:09:26 -07:00
Dan Williams 04ce9ab385 async_xor: permit callers to pass in a 'dma/page scribble' region
async_xor() needs space to perform dma and page address conversions.  In
most cases the code can simply reuse the struct page * array because the
size of the native pointer matches the size of a dma/page address.  In
order to support archs where sizeof(dma_addr_t) is larger than
sizeof(struct page *), or to preserve the input parameters, we utilize a
memory region passed in by the caller.

Since the code is now prepared to handle the case where it cannot
perform address conversions on the stack, we no longer need the
!HIGHMEM64G dependency in drivers/dma/Kconfig.

[ Impact: don't clobber input buffers for address conversions ]

Reviewed-by: Andre Noll <maan@systemlinux.org>
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-06-03 14:22:28 -07:00
Dan Williams a08abd8ca8 async_tx: structify submission arguments, add scribble
Prepare the api for the arrival of a new parameter, 'scribble'.  This
will allow callers to identify scratchpad memory for dma address or page
address conversions.  As this adds yet another parameter, take this
opportunity to convert the common submission parameters (flags,
dependency, callback, and callback argument) into an object that is
passed by reference.

Also, take this opportunity to fix up the kerneldoc and add notes about
the relevant ASYNC_TX_* flags for each routine.

[ Impact: moves api pass-by-value parameters to a pass-by-reference struct ]

Signed-off-by: Andre Noll <maan@systemlinux.org>
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-06-03 14:07:35 -07:00
Dan Williams 88ba2aa586 async_tx: kill ASYNC_TX_DEP_ACK flag
In support of inter-channel chaining async_tx utilizes an ack flag to
gate whether a dependent operation can be chained to another.  While the
flag is not set the chain can be considered open for appending.  Setting
the ack flag closes the chain and flags the descriptor for garbage
collection.  The ASYNC_TX_DEP_ACK flag essentially means "close the
chain after adding this dependency".  Since each operation can only have
one child the api now implicitly sets the ack flag at dependency
submission time.  This removes an unnecessary management burden from
clients of the api.

[ Impact: clean up and enforce one dependency per operation ]

Reviewed-by: Andre Noll <maan@systemlinux.org>
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-06-03 14:07:34 -07:00
Herbert Xu d315a0e09f crypto: hash - Fix handling of sg entry that crosses page boundary
A quirk that we've always supported is having an sg entry that's
bigger than a page, or more generally an sg entry that crosses
page boundaries.  Even though it would be better to explicitly have
to sg entries for this, we need to support it for the existing users,
in particular, IPsec.

The new ahash sg walking code did try to handle this, but there was
a bug where we didn't increment the page so kept on walking on the
first page over an dover again.

This patch fixes it.

Tested-by: Martin Willi <martin@strongswan.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2009-05-31 23:09:22 +10:00
Linus Torvalds cd208bcc7c Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: padlock - Revert aes-all alias to aes
  crypto: api - Fix algorithm module auto-loading
  crypto: eseqiv - Fix IV generation for sync algorithms
  crypto: ixp4xx - check firmware for crypto support
2009-05-17 15:48:05 -07:00
Herbert Xu 37fc334cc8 crypto: api - Fix algorithm module auto-loading
The commit a760a6656e (crypto:
api - Fix module load deadlock with fallback algorithms) broke
the auto-loading of algorithms that require fallbacks.  The
problem is that the fallback mask check is missing an and which
cauess bits that should be considered to interfere with the
result.

Reported-by: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2009-04-21 13:27:16 +08:00
Steffen Klassert abe5fa7899 crypto: eseqiv - Fix IV generation for sync algorithms
If crypto_ablkcipher_encrypt() returns synchronous,
eseqiv_complete2() is called even if req->giv is already the
pointer to the generated IV. The generated IV is overwritten
with some random data in this case. This patch fixes this by
calling eseqiv_complete2() just if the generated IV has to be
copied to req->giv.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2009-04-15 20:45:03 +08:00
Dan Williams 099f53cb50 async_tx: rename zero_sum to val
'zero_sum' does not properly describe the operation of generating parity
and checking that it validates against an existing buffer.  Change the
name of the operation to 'val' (for 'validate').  This is in
anticipation of the p+q case where it is a requirement to identify the
target parity buffers separately from the source buffers, because the
target parity buffers will not have corresponding pq coefficients.

Reviewed-by: Andre Noll <maan@systemlinux.org>
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-04-08 14:28:37 -07:00
Dan Williams fd74ea6588 Merge branch 'dmaengine' into async-tx-raid6 2009-04-08 14:28:13 -07:00
Linus Torvalds 133e2a3164 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx:
  dma: Add SoF and EoF debugging to ipu_idmac.c, minor cleanup
  dw_dmac: add cyclic API to DW DMA driver
  dmaengine: Add privatecnt to revert DMA_PRIVATE property
  dmatest: add dma interrupts and callbacks
  dmatest: add xor test
  dmaengine: allow dma support for async_tx to be toggled
  async_tx: provide __async_inline for HAS_DMA=n archs
  dmaengine: kill some unused headers
  dmaengine: initialize tx_list in dma_async_tx_descriptor_init
  dma: i.MX31 IPU DMA robustness improvements
  dma: improve section assignment in i.MX31 IPU DMA driver
  dma: ipu_idmac driver cosmetic clean-up
  dmaengine: fail device registration if channel registration fails
2009-04-03 12:13:45 -07:00
Linus Torvalds c54c4dec61 Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: ixp4xx - Fix handling of chained sg buffers
  crypto: shash - Fix unaligned calculation with short length
  hwrng: timeriomem - Use phys address rather than virt
2009-04-03 09:45:53 -07:00
Linus Torvalds 223cdea4c4 Merge branch 'for-linus' of git://neil.brown.name/md
* 'for-linus' of git://neil.brown.name/md: (53 commits)
  md/raid5 revise rules for when to update metadata during reshape
  md/raid5: minor code cleanups in make_request.
  md: remove CONFIG_MD_RAID_RESHAPE config option.
  md/raid5: be more careful about write ordering when reshaping.
  md: don't display meaningless values in sysfs files resync_start and sync_speed
  md/raid5: allow layout and chunksize to be changed on active array.
  md/raid5: reshape using largest of old and new chunk size
  md/raid5: prepare for allowing reshape to change layout
  md/raid5: prepare for allowing reshape to change chunksize.
  md/raid5: clearly differentiate 'before' and 'after' stripes during reshape.
  Documentation/md.txt update
  md: allow number of drives in raid5 to be reduced
  md/raid5: change reshape-progress measurement to cope with reshaping backwards.
  md: add explicit method to signal the end of a reshape.
  md/raid5: enhance raid5_size to work correctly with negative delta_disks
  md/raid5: drop qd_idx from r6_state
  md/raid6: move raid6 data processing to raid6_pq.ko
  md: raid5 run(): Fix max_degraded for raid level 4.
  md: 'array_size' sysfs attribute
  md: centralize ->array_sectors modifications
  ...
2009-04-03 09:08:19 -07:00
NeilBrown bff61975b3 md: move lots of #include lines out of .h files and into .c
This makes the includes more explicit, and is preparation for moving
md_k.h to drivers/md/md.h

Remove include/raid/md.h as its only remaining use was to #include
other files.

Signed-off-by: NeilBrown <neilb@suse.de>
2009-03-31 14:33:13 +11:00
Yehuda Sadeh f4f689933c crypto: shash - Fix unaligned calculation with short length
When the total length is shorter than the calculated number of unaligned bytes, the call to shash->update breaks. For example, calling crc32c on unaligned buffer with length of 1 can result in a system crash.

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2009-03-27 13:03:51 +08:00
Linus Torvalds 562f477a54 Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (29 commits)
  crypto: sha512-s390 - Add missing block size
  hwrng: timeriomem - Breaks an allyesconfig build on s390:
  nlattr: Fix build error with NET off
  crypto: testmgr - add zlib test
  crypto: zlib - New zlib crypto module, using pcomp
  crypto: testmgr - Add support for the pcomp interface
  crypto: compress - Add pcomp interface
  netlink: Move netlink attribute parsing support to lib
  crypto: Fix dead links
  hwrng: timeriomem - New driver
  crypto: chainiv - Use kcrypto_wq instead of keventd_wq
  crypto: cryptd - Per-CPU thread implementation based on kcrypto_wq
  crypto: api - Use dedicated workqueue for crypto subsystem
  crypto: testmgr - Test skciphers with no IVs
  crypto: aead - Avoid infinite loop when nivaead fails selftest
  crypto: skcipher - Avoid infinite loop when cipher fails selftest
  crypto: api - Fix crypto_alloc_tfm/create_create_tfm return convention
  crypto: api - crypto_alg_mod_lookup either tested or untested
  crypto: amcc - Add crypt4xx driver
  crypto: ansi_cprng - Add maintainer
  ...
2009-03-26 11:04:34 -07:00
Dan Williams 729b5d1b8e dmaengine: allow dma support for async_tx to be toggled
Provide a config option for blocking the allocation of dma channels to
the async_tx api.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-03-25 09:13:25 -07:00
Dan Williams 06164f3194 async_tx: provide __async_inline for HAS_DMA=n archs
To allow an async_tx routine to be compiled away on HAS_DMA=n arch it
needs to be declared __always_inline otherwise the compiler may emit
code and cause a link error.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-03-25 09:13:25 -07:00
Geert Uytterhoeven 0c01aed50d crypto: testmgr - add zlib test
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2009-03-04 15:42:15 +08:00