Commit Graph

34 Commits

Author SHA1 Message Date
Fabian Frederick 38b4fe5fcc lib/crc32.c: remove unnecessary __constant
Use cpu_to_le32 instead of __constant_cpu_to_le32.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:54:19 -07:00
Daniel Borkmann 165148396d lib: crc32: reduce number of cases for crc32{, c}_combine
We can safely reduce the number of test cases by a tenth.
There is no particular need to run as many as we're running
now for crc32{,c}_combine, that gives us still ~8000 tests
we're doing if people run kernels with crc selftests enabled
which is perfectly fine.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 15:27:08 -05:00
Daniel Borkmann cc0ac19995 lib: crc32: conditionally resched when running testcases
Fengguang reports that when crc32 selftests are running on startup, on
some e.g. 32bit systems, we can get a CPU stall like "INFO: rcu_sched
self-detected stall on CPU { 0} (t=2101 jiffies g=4294967081 c=4294967080
q=41)". As this is not intended, add a cond_resched() at the end of a
test case to fix it. Introduced by efba721f63 ("lib: crc32: add test cases
for crc32{, c}_combine routines").

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 15:27:08 -05:00
Daniel Borkmann efba721f63 lib: crc32: add test cases for crc32{, c}_combine routines
We already have 100 test cases for crcs itself, so split the test
buffer with a-prio known checksums, and test crc of two blocks
against crc of the whole block for the same results.

Output/result with CONFIG_CRC32_SELFTEST=y:

  [    2.687095] crc32: CRC_LE_BITS = 64, CRC_BE BITS = 64
  [    2.687097] crc32: self tests passed, processed 225944 bytes in 278177 nsec
  [    2.687383] crc32c: CRC_LE_BITS = 64
  [    2.687385] crc32c: self tests passed, processed 225944 bytes in 141708 nsec
  [    7.336771] crc32_combine: 113072 self tests passed
  [   12.050479] crc32c_combine: 113072 self tests passed
  [   17.633089] alg: No test for crc32 (crc32-pclmul)

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-03 23:04:56 -05:00
Daniel Borkmann 6e95fcaa42 lib: crc32: add functionality to combine two crc32{, c}s in GF(2)
This patch adds a combinator to merge two or more crc32{,c}s
into a new one. This is useful for checksum computations of
fragmented skbs that use crc32/crc32c as checksums.

The arithmetics for combining both in the GF(2) was taken and
slightly modified from zlib. Only passing two crcs is insufficient
as two crcs and the length of the second piece is needed for
merging. The code is made generic, so that only polynomials
need to be passed for crc32_le resp. crc32c_le.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-03 23:04:56 -05:00
Daniel Borkmann d921e049a0 lib: crc32: clean up spacing in test cases
This is nothing more but a whitepace cleanup, as 80 chars is not a
hard but soft limit, and otherwise makes the test cases array really
look ugly. So fix it up.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-03 23:04:56 -05:00
Gu Zheng f2e1d2ac34 lib/crc32: update the comments of crc32_{be,le}_generic()
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-11 15:58:38 -07:00
Joe Mario 8f243af42a sections: fix const sections for crc32 table
Fix the const sections for the code generated by crc32 table.  There's
no ro version of the cacheline aligned section, so we cannot put in
const data without a conflict Just don't make the crc tables const for
now.

[ak@linux.intel.com: some fixes and new description]
[akpm@linux-foundation.org: checkpatch fixes]
Signed-off-by: Joe Mario <jmario@redhat.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-06 03:04:46 +09:00
Thiago Rafael Becker 49ac572b93 lib/crc32.c: fix unused variables warnings
Variables t4, t5, t6 and t7 are only used when CRC_LE_BITS != 32.  Fix
the following compilation warnings:

  lib/crc32.c: In function 'crc32_body':
  lib/crc32.c:77:55: warning: unused variable 't7'
  lib/crc32.c:77:41: warning: unused variable 't6'
  lib/crc32.c:77:27: warning: unused variable 't5'
  lib/crc32.c:77:13: warning: unused variable 't4'

Signed-off-by: Thiago Rafael Becker <trbecker@trbecker.org>
Cc: "Darrick J. Wong" <djwong@us.ibm.com>
Cc: Bob Pearson <rpearson@systemfabricworks.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-07-30 17:25:17 -07:00
Darrick J. Wong 577eba9e22 crc32: add self-test code for crc32c
Add self-test code for crc32c.

Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Cc: Bob Pearson <rpearson@systemfabricworks.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-23 16:58:38 -07:00
Darrick J. Wong 46c5801eaf crc32: bolt on crc32c
Reuse the existing crc32 code to stamp out a crc32c implementation.

Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Bob Pearson <rpearson@systemfabricworks.com>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-23 16:58:38 -07:00
Bob Pearson 78dff41897 crc32: add note about this patchset to crc32.c
Add a comment at the top of crc32.c

[djwong@us.ibm.com: Minor changelog tweaks]
Signed-off-by: Bob Pearson <rpearson@systemfabricworks.com>
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-23 16:58:38 -07:00
Bob Pearson 0292c497b6 crc32: optimize loop counter for x86
Add two changes that improve the performance of x86 systems

1. replace main loop with incrementing counter this change improves
   the performance of the selftest by about 5-6% on Nehalem CPUs.  The
   apparent reason is that the compiler can use the loop index to perform
   an indexed memory access.  This is reported to make the performance of
   PowerPC CPUs to get worse.

2. replace the rem_len loop with incrementing counter this change
   improves the performance of the selftest, which has more than the usual
   number of occurances, by about 1-2% on x86 CPUs.  In actual work loads
   the length is most often a multiple of 4 bytes and this code does not
   get executed as often if at all.  Again this change is reported to make
   the performance of PowerPC get worse.

[djwong@us.ibm.com: Minor changelog tweaks]
Signed-off-by: Bob Pearson <rpearson@systemfabricworks.com>
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-23 16:58:37 -07:00
Bob Pearson 324eb0f17d crc32: add slice-by-8 algorithm to existing code
Add slicing-by-8 algorithm to the existing slicing-by-4 algorithm.  This
consists of:

- extend largest BITS size from 32 to 64
- extend tables from tab[4][256] to up to tab[8][256]
- Add code for inner loop.

[djwong@us.ibm.com: Minor changelog tweaks]
Signed-off-by: Bob Pearson <rpearson@systemfabricworks.com>
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-23 16:58:37 -07:00
Bob Pearson 9a1dbf6a29 crc32: make CRC_*_BITS definition correspond to actual bit counts
crc32.c provides a choice of one of several algorithms for computing the
LSB and LSB versions of the CRC32 checksum based on the parameters
CRC_LE_BITS and CRC_BE_BITS.

In the original version the values 1, 2, 4 and 8 respectively selected
versions of the alrogithm that computed the crc 1, 2, 4 and 32 bits as a
time.

This patch series adds a new version that computes the CRC 64 bits at a
time.  To make things easier to understand the parameter has been
reinterpreted to actually stand for the number of bits processed in each
step of the algorithm so that the old value 8 has been replaced with the
value 32.

This also allows us to add in a widely used crc algorithm that computes
the crc 8 bits at a time called the Sarwate algorithm.

[djwong@us.ibm.com: Minor changelog tweaks]
Signed-off-by: Bob Pearson <rpearson@systemfabricworks.com>
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-23 16:58:37 -07:00
Bob Pearson ce4320ddda crc32: fix mixing of endian-specific types
crc32.c in its original version freely mixed u32, __le32 and __be32 types
which caused warnings from sparse with __CHECK_ENDIAN__.  This patch fixes
these by forcing the types to u32.

[djwong@us.ibm.com: Minor changelog tweaks]
Signed-off-by: Bob Pearson <rpearson@systemfabricworks.com>
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-23 16:58:37 -07:00
Bob Pearson 60e58d5c9d crc32: miscellaneous cleanups
Misc cleanup of lib/crc32.c and related files.

- remove unnecessary header files.

- straighten out some convoluted ifdef's

- rewrite some references to 2 dimensional arrays as 1 dimensional
  arrays to make them correct.  I.e.  replace tab[i] with tab[0][i].

- a few trivial whitespace changes

- fix a warning in gen_crc32tables.c caused by a mismatch in the type of
  the pointer passed to output table.  Since the table is only used at
  kernel compile time, it is simpler to make the table big enough to hold
  the largest column size used.  One cannot make the column size smaller
  in output_table because it has to be used by both the le and be tables
  and they can have different column sizes.

[djwong@us.ibm.com: Minor changelog tweaks]
Signed-off-by: Bob Pearson <rpearson@systemfabricworks.com>
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-23 16:58:37 -07:00
Bob Pearson 3863ef31dc crc32: simplify unit test code
Replace the unit test provided in crc32.c, which doesn't have a makefile
and doesn't compile with current headers, with a simpler self test
routine that also gives a measure of performance and runs at module init
time.  The self test option can be enabled through a configuration
option CONFIG_CRC32_SELFTEST.

The test stresses the pre and post loops and is thus not very realistic
since actual uses will likely have addresses and lengths that are at
least 4 byte aligned.  However, the main loop is long enough so that the
performance is dominated by that loop.

The expected values for crc32_le and crc32_be were generated with the
original version of crc32.c using CRC_BITS_LE = 8 and CRC_BITS_BE = 8.
These values were then used to check all the values of the BITS
parameters in both the original and new versions.

The performance results show some variability from run to run in spite
of attempts to both warm the cache and reduce the amount of OS noise by
limiting interrutps during the test.  To get comparable results and to
analyse options wrt performance the best time reported over a small
sample of runs has been taken.

[djwong@us.ibm.com: Minor changelog tweaks]
Signed-off-by: Bob Pearson <rpearson@systemfabricworks.com>
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-23 16:58:37 -07:00
Bob Pearson fbedceb100 crc32: move long comment about crc32 fundamentals to Documentation/
Move a long comment from lib/crc32.c to Documentation/crc32.txt where it
will more likely get read.

Edited the resulting document to add an explanation of the slicing-by-n
algorithm.

[djwong@us.ibm.com: minor changelog tweaks]
[akpm@linux-foundation.org: fix typo, per George]
Signed-off-by: George Spelvin <linux@horizon.com>
Signed-off-by: Bob Pearson <rpearson@systemfabricworks.com>
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-23 16:58:37 -07:00
Bob Pearson e30c7a8fcf crc32: remove two instances of trailing whitespaces
This patchset (re)uses Bob Pearson's crc32 slice-by-8 code to stamp out
a software crc32c implementation.  It removes the crc32c implementation
in crypto/ in favor of using the stamped-out one in lib/.  There is also
a change to Kconfig so that the kernel builder can pick an
implementation best suited for the hardware.

The motivation for this patchset is that I am working on adding full
metadata checksumming to ext4.  As far as performance impact of adding
checksumming goes, I see nearly no change with a standard mail server
ffsb simulation.  On a test that involves only file creation and
deletion and extent tree writes, I see a drop of about 50 pcercent with
the current kernel crc32c implementation; this improves to a drop of
about 20 percent with the enclosed crc32c code.

When metadata is usually a small fraction of total IO, this new
implementation doesn't help much because metadata is usually a small
fraction of total IO.  However, when we are doing IO that is almost all
metadata (such as rm -rf'ing a tree), then this patch speeds up the
operation substantially.

Incidentally, given that iscsi, sctp, and btrfs also use crc32c, this
patchset should improve their speed as well.  I have not yet quantified
that, however.  This latest submission combines Bob's patches from late
August 2011 with mine so that they can be one coherent patch set.
Please excuse my inability to combine some of the patches; I've been
advised to leave Bob's patches alone and build atop them instead.  :/

Since the last posting, I've also collected some crc32c test results on
a bunch of different x86/powerpc/sparc platforms.  The results can be
viewed here: http://goo.gl/sgt3i ; the "crc32-kern-le" and "crc32c"
columns describe the performance of the kernel's current crc32 and
crc32c software implementations.  The "crc32c-by8-le" column shows
crc32c performance with this patchset applied.  I expect crc32
performance to be roughly the same.

The two _boost columns at the right side of the spreadsheet shows how much
faster the new implementation is over the old one.  As you can see, crc32
rises substantially, and crc32c experiences a huge increase.

This patch:

- remove trailing whitespace from lib/crc32.c
- remove trailing whitespace from lib/crc32defs.h

[djwong@us.ibm.com: changelog tweaks]
Signed-off-by: Bob Pearson <rpearson@systemfabricworks.com>
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-23 16:58:37 -07:00
Joakim Tjernlund 5742332dea crc32: optimize inner loop
Taking a pointer reference to each row in the crc table matrix, one can
reduce the inner loop with a few insn's

Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Cc: Bob Pearson <rpearson@systemfabricworks.com>
Cc: Frank Zago <fzago@systemfabricworks.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-10 16:30:51 -08:00
Arun Sharma 60063497a9 atomic: use <linux/atomic.h>
This allows us to move duplicated code in <asm/atomic.h>
(atomic_inc_not_zero() for now) to <linux/atomic.h>

Signed-off-by: Arun Sharma <asharma@fb.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-26 16:49:47 -07:00
Andrew Morton 0d2daf5cc8 revert "crc32: use __BYTE_ORDER macro for endian detection"
It doesn't work on big-endian - those architectures don't define
__LITTLE_ENDIAN.

Cc: Joakim Tjernlund <joakim.tjernlund@transmode.se>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-26 08:19:23 -07:00
Joakim Tjernlund 4762bbc1a3 crc32: use __BYTE_ORDER macro for endian detection.
Since crc32.c contains a nifty test program that can be executed in user
space, make sure endian detection works reliably in user space too.

Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-25 08:07:06 -07:00
Joakim Tjernlund 836e2af925 crc32: major optimization
Precompute more crc32 values(0xcc00, 0xcc0000 and 0xcc000000) into tables.
 This increases the table size from 1KB to 4KB but the performance benfit
makes it worth it:

28% faster on MPC8321, 266 MHz
2x faster on Core 2 Duo, 3.1GHz

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-25 08:07:06 -07:00