[PATCH] optimize hweight64 for x86_64

Based on patch from David Rientjes <rientjes@google.com>, but
changed by AK.

Optimizes the 64-bit Hamming weight (hweight64) for x86_64 processors,
assuming they have fast multiplication.  Uses five fewer bit operations
than the generic hweight64.  A benchmark on one EM64T machine showed a
~25% speedup over 2^24 consecutive calls.
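
As a rough illustration of the idea, here is a minimal sketch, not the
code from this patch. It assumes the usual parallel bit-count reduction
to per-byte counts, followed by either a shift-and-add fold or a single
multiply. The function names hweight64_shift and hweight64_mul are
hypothetical and chosen only for this example.

```c
#include <stdint.h>

/* Hypothetical name: final fold done with shifts and adds. */
static unsigned int hweight64_shift(uint64_t w)
{
	/* Reduce to 2-bit, then 4-bit, then 8-bit partial counts. */
	w -= (w >> 1) & 0x5555555555555555ULL;
	w  = (w & 0x3333333333333333ULL) + ((w >> 2) & 0x3333333333333333ULL);
	w  = (w + (w >> 4)) & 0x0f0f0f0f0f0f0f0fULL;
	/* Each byte now holds 0..8; fold the bytes into the low byte. */
	w += w >> 8;
	w += w >> 16;
	w += w >> 32;
	return (unsigned int)(w & 0xff);
}

/* Hypothetical name: final fold done with one multiplication. */
static unsigned int hweight64_mul(uint64_t w)
{
	w -= (w >> 1) & 0x5555555555555555ULL;
	w  = (w & 0x3333333333333333ULL) + ((w >> 2) & 0x3333333333333333ULL);
	w  = (w + (w >> 4)) & 0x0f0f0f0f0f0f0f0fULL;
	/*
	 * Multiplying by 0x0101...01 sums all eight bytes into the top
	 * byte; the total is at most 64, so it cannot overflow a byte.
	 */
	return (unsigned int)((w * 0x0101010101010101ULL) >> 56);
}
```

Both variants return the same count for every input. The multiply
variant pays off only where a 64-bit multiply is cheap, which is the
assumption this patch makes for x86_64.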

Define a new ARCH_HAS_FAST_MULTIPLIER that can be set by other
architectures that can also multiply fast.
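
One way generic code could key off such a define is sketched below.
This is an assumption about the shape of the selection, not the actual
change to the generic hweight code. The helper fold_byte_counts is a
hypothetical name, and the macro is defined locally here only so that
the example is self-contained.

```c
#include <stdint.h>

/* Defined here for the sketch; an arch header would normally set it. */
#define ARCH_HAS_FAST_MULTIPLIER 1

/*
 * Hypothetical helper: sum the eight bytes of v, where each byte is
 * assumed to hold a partial bit count in the range 0..8.
 */
static unsigned int fold_byte_counts(uint64_t v)
{
#ifdef ARCH_HAS_FAST_MULTIPLIER
	/* One multiply gathers the sum of all bytes in the top byte. */
	return (unsigned int)((v * 0x0101010101010101ULL) >> 56);
#else
	/* Fallback for architectures without a fast multiplier. */
	v += v >> 8;
	v += v >> 16;
	v += v >> 32;
	return (unsigned int)(v & 0xff);
#endif
}
```

An architecture that does not define the macro gets the shift-and-add
path and sees no change in behaviour.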

Signed-off-by: Andi Kleen <ak@suse.de>
Author: Andi Kleen
Date: 2006-09-26 10:52:38 +02:00
Committer: Andi Kleen
Parent: 8380aabb99
Commit: 0136611c62
2 changed files with 10 additions and 2 deletions
@@ -399,6 +399,8 @@ static __inline__ int fls(int x)
 	return r+1;
 }
+#define ARCH_HAS_FAST_MULTIPLIER 1
 #include <asm-generic/bitops/hweight.h>
 #endif /* __KERNEL__ */