Randy Dunlap noticed that the recent comment clarifications from Andrew
had somehow gotten duplicated. Quoth Andrew: "hm, that could have been
some late-night reject-fixing."
Fix it up.
Cc: From: Andrew Morton <akpm@linux-foundation.org>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: (32 commits)
[PATCH] ocfs2: zero_user_page conversion
ocfs2: Support xfs style space reservation ioctls
ocfs2: support for removing file regions
ocfs2: update truncate handling of partial clusters
ocfs2: btree support for removal of arbirtrary extents
ocfs2: Support creation of unwritten extents
ocfs2: support writing of unwritten extents
ocfs2: small cleanup of ocfs2_write_begin_nolock()
ocfs2: btree changes for unwritten extents
ocfs2: abstract btree growing calls
ocfs2: use all extent block suballocators
ocfs2: plug truncate into cached dealloc routines
ocfs2: simplify deallocation locking
ocfs2: harden buffer check during mapping of page blocks
ocfs2: shared writeable mmap
ocfs2: factor out write aops into nolock variants
ocfs2: rework ocfs2_buffered_write_cluster()
ocfs2: take ip_alloc_sem during entire truncate
ocfs2: Add "preferred slot" mount option
[KJ PATCH] Replacing memset(<addr>,0,PAGE_SIZE) with clear_page() in fs/ocfs2/dlm/dlmrecovery.c
...
* 'bsg' of git://git.kernel.dk/data/git/linux-2.6-block: (25 commits)
bsg: Kconfig updates
bsg: add SCSI transport-level request support
bsg: add bidi support
add a struct request pointer to the request structure
bsg: fix the deadlock on discarding done commands
bsg: fix a blocking read bug
bsg: minor bug fixes
improve bsg device allocation
bind bsg to all SCSI devices
bsg: bind bsg to request_queue instead of gendisk
bsg: add a request_queue argument to scsi_cmd_ioctl()
bsg: simplify __bsg_alloc_command failpath
bsg: add cheasy error checks for sysfs stuff
Add queue resizing support
Replace s32, u32 and u64 with __s32, __u32 and __u64 in bsg.h for userspace
bsg: silence a bogus gcc warning
bsg: style cleanup
bsg: use u32 etc instead of uint32_t
bsg: add SG_IO to SG v4
bsg: replace SG v3 with SG v4
...
* 'for-linus' of git://git.kernel.dk/data/git/linux-2.6-block:
splice: direct splicing updates ppos twice
more ACSI removal
umem: Fix match of pci_ids in umem driver
umem: Remove references to dead CONFIG_MM_MAP_MEMORY variable
remove the documentation for the legacy CDROM drivers
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6: (26 commits)
[SPARC64]: Fix UP build.
[SPARC64]: dr-cpu unconfigure support.
[SERIAL]: Fix console write locking in sparc drivers.
[SPARC64]: Give more accurate errors in dr_cpu_configure().
[SPARC64]: Clear cpu_{core,sibling}_map[] in smp_fill_in_sib_core_maps()
[SPARC64]: Fix leak when DR added cpu does not bootup.
[SPARC64]: Add ->set_affinity IRQ handlers.
[SPARC64]: Process dr-cpu events in a kthread instead of workqueue.
[SPARC64]: More sensible udelay implementation.
[SPARC64]: SMP build fixes.
[SPARC64]: mdesc.c needs linux/mm.h
[SPARC64]: Fix build regressions added by dr-cpu changes.
[SPARC64]: Unconditionally register vio_bus_type.
[SPARC64]: Initial LDOM cpu hotplug support.
[SPARC64]: Fix setting of variables in LDOM guest.
[SPARC64]: Fix MD property lifetime bugs.
[SPARC64]: Abstract out mdesc accesses for better MD update handling.
[SPARC64]: Use more mearningful names for IRQ registry.
[SPARC64]: Initial domain-services driver.
[SPARC64]: Export powerd facilities for external entities.
...
* master.kernel.org:/pub/scm/linux/kernel/git/lethal/sh-2.6: (68 commits)
sh: sh-rtc support for SH7709.
sh: Revert __xdiv64_32 size change.
sh: Update r7785rp defconfig.
sh: Export div symbols for GCC 4.2 and ST GCC.
sh: fix race in parallel out-of-tree build
sh: Kill off dead mach.c for hp6xx.
sh: hd64461.h cleanup and added comments.
sh: Update the alignment when 4K stacks are used.
sh: Add a .bss.page_aligned section for 4K stacks.
sh: Don't let SH-4A clobber SH-4 CFLAGS.
sh: Add parport stub for SuperIO ports.
sh: Drop -Wa,-dsp for DSP tuning.
sh: Update dreamcast defconfig.
fb: pvr2fb: A few more __devinit annotations for PCI.
fb: pvr2fb: Fix up section mismatch warnings.
sh: Select IPR-IRQ for SH7091.
sh: Correct __xdiv64_32/div64_32 return value size.
sh: Fix timer-tmu build for SH-3.
sh: Add cpu and mach links to CLEAN_FILES.
sh: Preliminary support for the SH-X3 CPU.
...
FAT12 entry is 12bits, so it needs 2 phase to update the value. And
writer and reader access it without any lock, so reader can get the
half updated value.
This fixes the long standing race condition by adding a global
spinlock to only FAT12 for avoiding any impact against FAT16/32.
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
sparc64:
drivers/sbus/char/cpwatchdog.c: In function `wd_toggleintr':
drivers/sbus/char/cpwatchdog.c:523: error: implicit declaration of function `readb'
drivers/sbus/char/cpwatchdog.c:533: error: implicit declaration of function `writeb'
drivers/sbus/char/cpwatchdog.c: In function `wd_pingtimer':
drivers/sbus/char/cpwatchdog.c:545: error: implicit declaration of function `readw'
drivers/sbus/char/cpwatchdog.c: In function `wd_starttimer':
drivers/sbus/char/cpwatchdog.c:584: error: implicit declaration of function `writew'
drivers/sbus/char/cpwatchdog.c: In function `wd_init':
drivers/sbus/char/cpwatchdog.c:767: error: implicit declaration of function `ioremap'
drivers/sbus/char/cpwatchdog.c:767: warning: assignment makes pointer from integer without a cast
drivers/sbus/char/cpwatchdog.c: In function `wd_cleanup':
drivers/sbus/char/cpwatchdog.c:849: error: implicit declaration of function `iounmap'
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is a patch that speeds up statfs. It is very simple - the "overhead"
calculation, which takes a huge amount of time for large filesystems, never
changes unless the size of the filesystem itself changes. That means we can
store it in memory and only recalculate if the filesystem has been resized
(almost never).
It also fixes a minor problem that we never update the on-disk superblock free
blocks/inodes counts until the filesystem is unmounted. While not fatal, we
may as well update that on disk when we have the information, and it makes
things like debugfs and dumpe2fs report a bit more accurate info.
Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Andreas Dilger <adilger@clusterfs.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is a patch that speeds up statfs. It is very simple - the "overhead"
calculation, which takes a huge amount of time for large filesystems, never
changes unless the size of the filesystem itself changes. That means we can
store it in memory and only recalculate if the filesystem has been resized
(almost never).
It also fixes a minor problem that we never update the on-disk superblock free
blocks/inodes counts until the filesystem is unmounted. While not fatal, we
may as well update that on disk when we have the information, and it makes
things like debugfs and dumpe2fs report a bit more accurate info.
Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Andreas Dilger <adilger@clusterfs.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is a patch that speeds up statfs. It is very simple - the "overhead"
calculation, which takes a huge amount of time for large filesystems, never
changes unless the size of the filesystem itself changes. That means we can
store it in memory and only recalculate if the filesystem has been resized
(almost never).
It also fixes a minor problem that we never update the on-disk superblock free
blocks/inodes counts until the filesystem is unmounted. While not fatal, we
may as well update that on disk when we have the information, and it makes
things like debugfs and dumpe2fs report a bit more accurate info.
Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Andreas Dilger <adilger@clusterfs.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Optimize integer-to-string conversion in vsprintf.c for base 10. This is
by far the most used conversion, and in some use cases it impacts
performance. For example, top reads /proc/$PID/stat for every process, and
with 4000 processes decimal conversion alone takes noticeable time.
Using code from
http://www.cs.uiowa.edu/~jones/bcd/decimal.html
(with permission from the author, Douglas W. Jones)
binary-to-decimal-string conversion is done in groups of five digits at
once, using only additions/subtractions/shifts (with -O2; -Os throws in
some multiply instructions).
On i386 arch gcc 4.1.2 -O2 generates ~500 bytes of code.
This patch is run tested. Userspace benchmark/test is also attached.
I tested it on PIII and AMD64 and new code is generally ~2.5 times
faster. On AMD64:
# ./vsprintf_verify-O2
Original decimal conv: .......... 151 ns per iteration
Patched decimal conv: .......... 62 ns per iteration
Testing correctness
12895992590592 ok... [Ctrl-C]
# ./vsprintf_verify-O2
Original decimal conv: .......... 151 ns per iteration
Patched decimal conv: .......... 62 ns per iteration
Testing correctness
26025406464 ok... [Ctrl-C]
More realistic test: top from busybox project was modified to
report how many us it took to scan /proc (this does not account
any processing done after that, like sorting process list),
and then I test it with 4000 processes:
#!/bin/sh
i=4000
while test $i != 0; do
sleep 30 &
let i--
done
busybox top -b -n3 >/dev/null
on unpatched kernel:
top: 4120 processes took 102864 microseconds to scan
top: 4120 processes took 91757 microseconds to scan
top: 4120 processes took 92517 microseconds to scan
top: 4120 processes took 92581 microseconds to scan
on patched kernel:
top: 4120 processes took 75460 microseconds to scan
top: 4120 processes took 66451 microseconds to scan
top: 4120 processes took 67267 microseconds to scan
top: 4120 processes took 67618 microseconds to scan
The speedup comes from much faster generation of /proc/PID/stat
by sprintf() calls inside the kernel.
Signed-off-by: Douglas W Jones <jones@cs.uiowa.edu>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* There is no point in having full "0...9a...z" constant vector,
if we use only "0...9a...f" (and "x" for "0x").
* Post-decrement usually needs a few more instructions, so use
pre decrement instead where makes sense:
- while (i < precision--) {
+ while (i <= --precision) {
* if base != 10 (=> base 8 or 16), we can avoid using division
in a loop and use mask/shift, obtaining much faster conversion.
(More complex optimization for base 10 case is in the second patch).
Overall, size vsprintf.o shows ~80 bytes smaller text section
with this patch applied.
Signed-off-by: Douglas W Jones <jones@cs.uiowa.edu>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In drivers/pnp/isapnp/core.c::isapnp_read_tag() there is a test of 'type'
being == 0 a bit down in the function. That test doesn't make any sense.
If 'type' could indeed be NULL, then the test happens way too late as we'd
already have tried to dereference the pointer earlier and looking at the
callers it also turns out that there is no way type can ever actually be
NULL.
So the test is completely pointless and should just be removed.
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>