Commit Graph

3241 Commits

Author SHA1 Message Date
Linus Torvalds
e2fdae7e7c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc
Pull sparc updates from David Miller:
 "The PowerPC folks have a really nice scalable IOMMU pool allocator
  that we wanted to make use of for sparc.  So here we have a series
  that abstracts out their code into a common layer that anyone can make
  use of.

  Sparc is converted, and the PowerPC folks have reviewed and ACK'd this
  series and plan to convert PowerPC over as well"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  iommu-common: Fix PARISC compile-time warnings
  sparc: Make LDC use common iommu poll management functions
  sparc: Make sparc64 use scalable lib/iommu-common.c functions
  sparc: Break up monolithic iommu table/lock into finer graularity pools and lock
2015-04-17 16:19:26 -04:00
Sowmini Varadhan
cb97201cb0 iommu-common: Fix PARISC compile-time warnings
Fixes warnings due to
- no DMA_ERROR_CODE on PARISC,
- sizeof (unsigned long) == 4 bytes on PARISC.

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-17 15:24:36 -04:00
Linus Torvalds
54e514b91b Merge branch 'akpm' (patches from Andrew)
Merge third patchbomb from Andrew Morton:

 - various misc things

 - a couple of lib/ optimisations

 - provide DIV_ROUND_CLOSEST_ULL()

 - checkpatch updates

 - rtc tree

 - befs, nilfs2, hfs, hfsplus, fatfs, adfs, affs, bfs

 - ptrace fixes

 - fork() fixes

 - seccomp cleanups

 - more mmap_sem hold time reductions from Davidlohr

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (138 commits)
  proc: show locks in /proc/pid/fdinfo/X
  docs: add missing and new /proc/PID/status file entries, fix typos
  drivers/rtc/rtc-at91rm9200.c: make IO endian agnostic
  Documentation/spi/spidev_test.c: fix warning
  drivers/rtc/rtc-s5m.c: allow usage on device type different than main MFD type
  .gitignore: ignore *.tar
  MAINTAINERS: add Mediatek SoC mailing list
  tomoyo: reduce mmap_sem hold for mm->exe_file
  powerpc/oprofile: reduce mmap_sem hold for exe_file
  oprofile: reduce mmap_sem hold for mm->exe_file
  mips: ip32: add platform data hooks to use DS1685 driver
  lib/Kconfig: fix up HAVE_ARCH_BITREVERSE help text
  x86: switch to using asm-generic for seccomp.h
  sparc: switch to using asm-generic for seccomp.h
  powerpc: switch to using asm-generic for seccomp.h
  parisc: switch to using asm-generic for seccomp.h
  mips: switch to using asm-generic for seccomp.h
  microblaze: use asm-generic for seccomp.h
  arm: use asm-generic for seccomp.h
  seccomp: allow COMPAT sigreturn overrides
  ...
2015-04-17 09:04:38 -04:00
Andrew Morton
9e522c0d28 lib/Kconfig: fix up HAVE_ARCH_BITREVERSE help text
Cc: Yalin Wang <yalin.wang@sonymobile.com>
Cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-17 09:04:10 -04:00
Sergey Senozhatsky
534b483a86 cpumask: don't perform while loop in cpumask_next_and()
cpumask_next_and() is looking for cpumask_next() in src1 in a loop and
tests if found cpu is also present in src2. remove that loop, perform
cpumask_and() of src1 and src2 first and use that new mask to find
cpumask_next().

Apart from removing while loop, ./bloat-o-meter on x86_64 shows
add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-8 (-8)
function                                     old     new   delta
cpumask_next_and                              62      54      -8

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Amir Vadai <amirv@mellanox.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-17 09:04:08 -04:00
Yury Norov
2afe27c718 lib/bitmap.c: bitmap_[empty,full]: remove code duplication
bitmap_empty() has its own implementation.  But it's clearly as simple as:

	find_first_bit(src, nbits) == nbits

The same is true for 'bitmap_full'.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
Cc: George Spelvin <linux@horizon.com>
Cc: Alexey Klimov <klimov.linux@gmail.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-17 09:03:56 -04:00
Rasmus Villemoes
675cf53c1d lib/vsprintf.c: improve put_dec_trunc8 slightly
I hadn't had enough coffee when I wrote this. Currently, the final
increment of buf depends on the value loaded from the table, and
causes gcc to emit a cmov immediately before the return. It is smarter
to let it depend on r, since the increment can then be computed in
parallel with the final load/store pair. It also shaves 16 bytes of
.text.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Tejun Heo <tj@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-17 09:03:55 -04:00
Sebastian Ott
a7a2c02a40 lib/dma-debug: fix bucket_find_contain()
bucket_find_contain() will search the bucket list for a dma_debug_entry.
When the entry isn't found it needs to search other buckets too, since
only the start address of a dma range is hashed (which might be in a
different bucket).

A copy of the dma_debug_entry is used to get the previous hash bucket
but when its list is searched the original dma_debug_entry is to be used
not its modified copy.

This fixes false "device driver tries to sync DMA memory it has not allocated"
warnings.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Horia Geanta <horia.geanta@freescale.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-17 09:03:54 -04:00
Rasmus Villemoes
7c43d9a30c lib/vsprintf.c: even faster binary to decimal conversion
The most expensive part of decimal conversion is the divisions by 10
(albeit done using reciprocal multiplication with appropriately chosen
constants).  I decided to see if one could eliminate around half of
these multiplications by emitting two digits at a time, at the cost of a
200 byte lookup table, and it does indeed seem like there is something
to be gained, especially on 64 bits.  Microbenchmarking shows
improvements ranging from -50% (for numbers uniformly distributed in [0,
2^64-1]) to -25% (for numbers heavily biased toward the smaller end, a
more realistic distribution).

On a larger scale, perf shows that top, one of the big consumers of /proc
data, uses 0.5-1.0% fewer cpu cycles.

I had to jump through some hoops to get the 32 bit code to compile and run
on my 64 bit machine, so I'm not sure how relevant these numbers are, but
just for comparison the microbenchmark showed improvements between -30%
and -10%.

The bloat-o-meter costs are around 150 bytes (the generated code is a
little smaller, so it's not the full 200 bytes) on both 32 and 64 bit.
I'm aware that extra cache misses won't show up in a microbenchmark as
used above, but on the other hand decimal conversions often happen in bulk
(for example in the case of top).

I have of course tested that the new code generates the same output as the
old, for both the first and last 1e10 numbers in [0,2^64-1] and 4e9
'random' numbers in-between.

Test and verification code on github: https://github.com/Villemoes/dec.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Tested-by: Jeff Epler <jepler@unpythonic.net>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-17 09:03:54 -04:00
Yury Norov
840620a159 lib: rename lib/find_next_bit.c to lib/find_bit.c
This file contains implementation for all find_*_bit{,_le}
So giving it more generic name looks reasonable.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
Reviewed-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: George Spelvin <linux@horizon.com>
Cc: Alexey Klimov <klimov.linux@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Thomas Graf <tgraf@suug.ch>
Cc: Valentin Rothberg <valentinrothberg@gmail.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-17 09:03:54 -04:00
Yury Norov
8f6f19dd51 lib: move find_last_bit to lib/find_next_bit.c
Currently all 'find_*_bit' family is located in lib/find_next_bit.c,
except 'find_last_bit', which is in lib/find_last_bit.c. It seems,
there's no major benefit to have it separated.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
Reviewed-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: George Spelvin <linux@horizon.com>
Cc: Alexey Klimov <klimov.linux@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Thomas Graf <tgraf@suug.ch>
Cc: Valentin Rothberg <valentinrothberg@gmail.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-17 09:03:54 -04:00
Yury Norov
2c57a0e233 lib: find_*_bit reimplementation
This patchset does rework to find_bit function family to achieve better
performance, and decrease size of text.  All rework is done in patch 1.
Patches 2 and 3 are about code moving and renaming.

It was boot-tested on x86_64 and MIPS (big-endian) machines.
Performance tests were ran on userspace with code like this:

	/* addr[] is filled from /dev/urandom */
	start = clock();
	while (ret < nbits)
		ret = find_next_bit(addr, nbits, ret + 1);

	end = clock();
	printf("%ld\t", (unsigned long) end - start);

On Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz measurements are: (for
find_next_bit, nbits is 8M, for find_first_bit - 80K)

	find_next_bit:		find_first_bit:
	new	current		new	current
	26932	43151		14777	14925
	26947	43182		14521	15423
	26507	43824		15053	14705
	27329	43759		14473	14777
	26895	43367		14847	15023
	26990	43693		15103	15163
	26775	43299		15067	15232
	27282	42752		14544	15121
	27504	43088		14644	14858
	26761	43856		14699	15193
	26692	43075		14781	14681
	27137	42969		14451	15061
	...			...

find_next_bit performance gain is 35-40%;
find_first_bit - no measurable difference.

On ARM machine, there is arch-specific implementation for find_bit.

Thanks a lot to George Spelvin and Rasmus Villemoes for hints and
helpful discussions.

This patch (of 3):

New implementations takes less space in source file (see diffstat) and in
object.  For me it's 710 vs 453 bytes of text.  It also shows better
performance.

find_last_bit description fixed due to obvious typo.

[akpm@linux-foundation.org: include linux/bitmap.h, per Rasmus]
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Reviewed-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: George Spelvin <linux@horizon.com>
Cc: Alexey Klimov <klimov.linux@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Thomas Graf <tgraf@suug.ch>
Cc: Valentin Rothberg <valentinrothberg@gmail.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-17 09:03:53 -04:00
Linus Torvalds
7d69cff26c Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
 "This is the usual grab bag of driver updates (lpfc, qla2xxx, storvsc,
  aacraid, ipr) plus an assortment of minor updates.  There's also a
  major update to aic1542 which moves the driver into this millenium"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (106 commits)
  change SCSI Maintainer email
  sd, mmc, virtio_blk, string_helpers: fix block size units
  ufs: add support to allow non standard behaviours (quirks)
  ufs-qcom: save controller revision info in internal structure
  qla2xxx: Update driver version to 8.07.00.18-k
  qla2xxx: Restore physical port WWPN only, when port down detected for FA-WWPN port.
  qla2xxx: Fix virtual port configuration, when switch port is disabled/enabled.
  qla2xxx: Prevent multiple firmware dump collection for ISP27XX.
  qla2xxx: Disable Interrupt handshake for ISP27XX.
  qla2xxx: Add debugging info for MBX timeout.
  qla2xxx: Add serdes read/write support for ISP27XX
  qla2xxx: Add udev notification to save fw dump for ISP27XX
  qla2xxx: Add message for sucessful FW dump collected for ISP27XX.
  qla2xxx: Add support to load firmware from file for ISP 26XX/27XX.
  qla2xxx: Fix beacon blink for ISP27XX.
  qla2xxx: Increase the wait time for firmware to be ready for P3P.
  qla2xxx: Fix crash due to wrong casting of reg for ISP27XX.
  qla2xxx: Fix warnings reported by static checker.
  lpfc: Update version to 10.5.0.0 for upstream patch set
  lpfc: Update copyright to 2015
  ...
2015-04-16 19:02:04 -04:00
Sowmini Varadhan
10b88a4b17 sparc: Break up monolithic iommu table/lock into finer graularity pools and lock
Investigation of multithreaded iperf experiments on an ethernet
interface show the iommu->lock as the hottest lock identified by
lockstat, with something of the order of  21M contentions out of
27M acquisitions, and an average wait time of 26 us for the lock.
This is not efficient. A more scalable design is to follow the ppc
model, where the iommu_table has multiple pools, each stretching
over a segment of the map, and with a separate lock for each pool.
This model allows for better parallelization of the iommu map search.

This patch adds the iommu range alloc/free function infrastructure.

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-16 12:44:55 -07:00
Linus Torvalds
eea3a00264 Merge branch 'akpm' (patches from Andrew)
Merge second patchbomb from Andrew Morton:

 - the rest of MM

 - various misc bits

 - add ability to run /sbin/reboot at reboot time

 - printk/vsprintf changes

 - fiddle with seq_printf() return value

* akpm: (114 commits)
  parisc: remove use of seq_printf return value
  lru_cache: remove use of seq_printf return value
  tracing: remove use of seq_printf return value
  cgroup: remove use of seq_printf return value
  proc: remove use of seq_printf return value
  s390: remove use of seq_printf return value
  cris fasttimer: remove use of seq_printf return value
  cris: remove use of seq_printf return value
  openrisc: remove use of seq_printf return value
  ARM: plat-pxa: remove use of seq_printf return value
  nios2: cpuinfo: remove use of seq_printf return value
  microblaze: mb: remove use of seq_printf return value
  ipc: remove use of seq_printf return value
  rtc: remove use of seq_printf return value
  power: wakeup: remove use of seq_printf return value
  x86: mtrr: if: remove use of seq_printf return value
  linux/bitmap.h: improve BITMAP_{LAST,FIRST}_WORD_MASK
  MAINTAINERS: CREDITS: remove Stefano Brivio from B43
  .mailmap: add Ricardo Ribalda
  CREDITS: add Ricardo Ribalda Delgado
  ...
2015-04-15 16:39:15 -07:00
Joe Perches
d50f8f8d91 lru_cache: remove use of seq_printf return value
The seq_printf return value, because it's frequently misused,
will eventually be converted to void.

See: commit 1f33c41c03 ("seq_file: Rename seq_overflow() to
     seq_has_overflowed() and make public")

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-15 16:35:25 -07:00
Rasmus Villemoes
41416f2330 lib/string_helpers.c: change semantics of string_escape_mem
The current semantics of string_escape_mem are inadequate for one of its
current users, vsnprintf().  If that is to honour its contract, it must
know how much space would be needed for the entire escaped buffer, and
string_escape_mem provides no way of obtaining that (short of allocating a
large enough buffer (~4 times input string) to let it play with, and
that's definitely a big no-no inside vsnprintf).

So change the semantics for string_escape_mem to be more snprintf-like:
Return the size of the output that would be generated if the destination
buffer was big enough, but of course still only write to the part of dst
it is allowed to, and (contrary to snprintf) don't do '\0'-termination.
It is then up to the caller to detect whether output was truncated and to
append a '\0' if desired.  Also, we must output partial escape sequences,
otherwise a call such as snprintf(buf, 3, "%1pE", "\123") would cause
printf to write a \0 to buf[2] but leaving buf[0] and buf[1] with whatever
they previously contained.

This also fixes a bug in the escaped_string() helper function, which used
to unconditionally pass a length of "end-buf" to string_escape_mem();
since the latter doesn't check osz for being insanely large, it would
happily write to dst.  For example, kasprintf(GFP_KERNEL, "something and
then %pE", ...); is an easy way to trigger an oops.

In test-string_helpers.c, the -ENOMEM test is replaced with testing for
getting the expected return value even if the buffer is too small.  We
also ensure that nothing is written (by relying on a NULL pointer deref)
if the output size is 0 by passing NULL - this has to work for
kasprintf("%pE") to work.

In net/sunrpc/cache.c, I think qword_add still has the same semantics.
Someone should definitely double-check this.

In fs/proc/array.c, I made the minimum possible change, but longer-term it
should stop poking around in seq_file internals.

[andriy.shevchenko@linux.intel.com: simplify qword_add]
[andriy.shevchenko@linux.intel.com: add missed curly braces]
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-15 16:35:24 -07:00
Rasmus Villemoes
3aeddc7d66 lib/string_helpers.c: refactor string_escape_mem
When printf is given the format specifier %pE, it needs a way of obtaining
the total output size that would be generated if the buffer was large
enough, and string_escape_mem doesn't easily provide that.  This is a
refactorization of string_escape_mem in preparation of changing its
external API to provide that information.

The somewhat ugly early returns and subsequent seemingly redundant
conditionals are to make the following patch touch as little as possible
in string_helpers.c while still preserving the current behaviour of never
outputting partial escape sequences.  That behaviour must also change for
%pE to work as one expects from every other printf specifier.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-15 16:35:23 -07:00
Rasmus Villemoes
9c98f23596 lib/vsprintf.c: fix potential NULL deref in hex_string
The helper hex_string() is broken in two ways.  First, it doesn't
increment buf regardless of whether there is room to print, so callers
such as kasprintf() that try to probe the correct storage to allocate will
get a too small return value.  But even worse, kasprintf() (and likely
anyone else trying to find the size of the result) pass NULL for buf and 0
for size, so we also have end == NULL.  But this means that the end-1 in
hex_string() is (char*)-1, so buf < end-1 is true and we get a NULL
pointer deref.  I double-checked this with a trivial kernel module that
just did a kasprintf(GFP_KERNEL, "%14ph", "CrashBoomBang").

Nobody seems to be using %ph with kasprintf, but we might as well fix it
before it hits someone.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-15 16:35:23 -07:00
Geert Uytterhoeven
900cca2944 lib/vsprintf: add %pC{,n,r} format specifiers for clocks
Add format specifiers for printing struct clk:
  - '%pC' or '%pCn': name (Common Clock Framework) or address (legacy
    clock framework) of the clock,
  - '%pCr': rate of the clock.

[akpm@linux-foundation.org: omit code if !CONFIG_HAVE_CLK]
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mike Turquette <mturquette@linaro.org>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-15 16:35:23 -07:00
Rasmus Villemoes
d1c1b12137 lib/vsprintf.c: another small hack
Making ZEROPAD == '0'-' ', we can eliminate a few more instructions.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-15 16:35:23 -07:00
Rasmus Villemoes
3ea8d440a8 lib/vsprintf.c: eliminate duplicate hex string array
gcc doesn't merge or overlap const char[] objects with identical contents
(probably language lawyers would also insist that these things have
different addresses), but there's no reason to have the string
"0123456789ABCDEF" occur in multiple places.  hex_asc_upper is declared in
kernel.h and defined in lib/hexdump.c, which is unconditionally compiled
in.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-15 16:35:23 -07:00
Rasmus Villemoes
e26c12c777 lib/vsprintf.c: reduce stack use in number()
At least since the initial git commit, when base was passed as a separate
parameter, number() has only been called with bases 8, 10 and 16.  I'm
guessing that 66 was to accommodate 64 0/1, a sign and a '\0', but the
buffer is only used for the actual digits.  Octal digits carry 3 bits of
information, so 24 is enough.  Spell that 3*sizeof(num) so one less place
needs to be changed should long long ever be 128 bits.  Also remove the
commented-out code that would handle an arbitrary base.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-15 16:35:23 -07:00
Rasmus Villemoes
51be17dfff lib/vsprintf.c: eliminate some branches
Since FORMAT_TYPE_INT is simply 1 more than FORMAT_TYPE_UINT, and
similarly for BYTE/UBYTE, SHORT/USHORT, LONG/ULONG, we can eliminate a few
instructions by making SIGN have the value 1 instead of 2, and then use
arithmetic instead of branches for computing the right spec->type.  It's a
little hacky, but certainly in the same spirit as SMALL needing to have
the value 0x20.  For example for the spec->qualifier == 'l' case, gcc now
generates

     75e:       0f b6 53 01             movzbl 0x1(%rbx),%edx
     762:       83 e2 01                and    $0x1,%edx
     765:       83 c2 09                add    $0x9,%edx
     768:       88 13                   mov    %dl,(%rbx)

instead of

     763:       0f b6 53 01             movzbl 0x1(%rbx),%edx
     767:       83 e2 02                and    $0x2,%edx
     76a:       80 fa 01                cmp    $0x1,%dl
     76d:       19 d2                   sbb    %edx,%edx
     76f:       83 c2 0a                add    $0xa,%edx
     772:       88 13                   mov    %dl,(%rbx)

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-15 16:35:23 -07:00
Andi Kleen
c79574abe2 lib/test-hexdump.c: fix initconst confusion
const char *...[] is not const, but an array of pointer to const.  So
these arrays cannot be __initconst, but must be __initdata

This fixes section conflicts with LTO.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-15 16:35:22 -07:00