Commit Graph

4856 Commits

Author SHA1 Message Date
Mike Galbraith
b0aa51b999 sched: minor fast-path overhead reduction
Greetings,

103638d added a bit of avoidable overhead to the fast-path.

Use sysctl_sched_min_granularity instead of sched_slice() to restrict buddy wakeups.

Signed-off-by: Mike Galbraith <efault@gmx.de>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-17 15:36:58 +02:00
Peter Zijlstra
b968905292 sched: fix the wrong mask_len, cleanup
Clean up the division in show_schedstat().

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-17 13:05:22 +02:00
Miao Xie
c851c8676b sched: fix the wrong mask_len
If NR_CPUS isn't a multiple of 32, we get a truncated string of sched
domains by catting /proc/schedstat. This is caused by the wrong mask_len.

This patch fixes it.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-17 12:26:33 +02:00
Ingo Molnar
0f1f6dec95 Merge branch 'linus' into sched/urgent 2008-10-17 12:25:43 +02:00
Linus Torvalds
8cde1ad668 Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched_clock: prevent scd->clock from moving backwards
2008-10-16 15:38:48 -07:00
Linus Torvalds
1c95e1b690 Fix kernel/softirq.c printk format warning properly
This fixes the broken 77af7e3403
("softirq, warning fix: correct a format to avoid a warning") fix
correctly.

The type of a pointer subtraction is not "int", nor is it "long".  It
can be either (or something else).  It's "ptrdiff_t", and the printk
format for it is "%td".

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16 15:32:46 -07:00
Linus Torvalds
e533b22705 Merge branch 'core-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  do_generic_file_read: s/EINTR/EIO/ if lock_page_killable() fails
  softirq, warning fix: correct a format to avoid a warning
  softirqs, debug: preemption check
  x86, pci-hotplug, calgary / rio: fix EBDA ioremap()
  IO resources, x86: ioremap sanity check to catch mapping requests exceeding, fix
  IO resources, x86: ioremap sanity check to catch mapping requests exceeding the BAR sizes
  softlockup: Documentation/sysctl/kernel.txt: fix softlockup_thresh description
  dmi scan: warn about too early calls to dmi_check_system()
  generic: redefine resource_size_t as phys_addr_t
  generic: make PFN_PHYS explicitly return phys_addr_t
  generic: add phys_addr_t for holding physical addresses
  softirq: allocate less vectors
  IO resources: fix/remove printk
  printk: robustify printk, update comment
  printk: robustify printk, fix #2
  printk: robustify printk, fix
  printk: robustify printk

Fixed up conflicts in:
	arch/powerpc/include/asm/types.h
	arch/powerpc/platforms/Kconfig.cputype
manually.
2008-10-16 15:17:40 -07:00
Linus Torvalds
c813b4e16e Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (46 commits)
  UIO: Fix mapping of logical and virtual memory
  UIO: add automata sercos3 pci card support
  UIO: Change driver name of uio_pdrv
  UIO: Add alignment warnings for uio-mem
  Driver core: add bus_sort_breadthfirst() function
  NET: convert the phy_device file to use bus_find_device_by_name
  kobject: Cleanup kobject_rename and !CONFIG_SYSFS
  kobject: Fix kobject_rename and !CONFIG_SYSFS
  sysfs: Make dir and name args to sysfs_notify() const
  platform: add new device registration helper
  sysfs: use ilookup5() instead of ilookup5_nowait()
  PNP: create device attributes via default device attributes
  Driver core: make bus_find_device_by_name() more robust
  usb: turn dev_warn+WARN_ON combos into dev_WARN
  debug: use dev_WARN() rather than WARN_ON() in device_pm_add()
  debug: Introduce a dev_WARN() function
  sysfs: fix deadlock
  device model: Do a quickcheck for driver binding before doing an expensive check
  Driver core: Fix cleanup in device_create_vargs().
  Driver core: Clarify device cleanup.
  ...
2008-10-16 12:40:26 -07:00
Linus Torvalds
c8d8a2321f Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
  module: remove CONFIG_KMOD in comment after #endif
  remove CONFIG_KMOD from fs
  remove CONFIG_KMOD from drivers

Manually fix conflict due to include cleanups in drivers/md/md.c
2008-10-16 12:38:34 -07:00
Adrian Bunk
2b252c5411 make kprobes.c:kretprobe_table_lock() static
Make the needlessly global kretprobe_table_lock() static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16 11:21:52 -07:00
Bjorn Helgaas
c26ec88ea8 resources: tidy __request_region()
No functional change.  Just return NULL for kzalloc failure immediately,
rather than wrapping the whole function body in the body of an "if".

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16 11:21:51 -07:00
Thomas Petazzoni
ebf3f09c63 Configure out AIO support
This patchs adds the CONFIG_AIO option which allows to remove support
for asynchronous I/O operations, that are not necessarly used by
applications, particularly on embedded devices. As this is a
size-reduction option, it depends on CONFIG_EMBEDDED. It allows to
save ~7 kilobytes of kernel code/data:

   text	   data	    bss	    dec	    hex	filename
1115067	 119180	 217088	1451335	 162547	vmlinux
1108025	 119048	 217088	1444161	 160941	vmlinux.new
  -7042    -132       0   -7174   -1C06 +/-

This patch has been originally written by Matt Mackall
<mpm@selenic.com>, and is part of the Linux Tiny project.

[randy.dunlap@oracle.com: build fix]
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: Zach Brown <zach.brown@oracle.com>
Signed-off-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16 11:21:51 -07:00
Alexey Dobriyan
f221e726bf sysctl: simplify ->strategy
name and nlen parameters passed to ->strategy hook are unused, remove
them.  In general ->strategy hook should know what it's doing, and don't
do something tricky for which, say, pointer to original userspace array
may be needed (name).

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net> [ networking bits ]
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Matt Mackall <mpm@selenic.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16 11:21:47 -07:00
Christoph Hellwig
b418da16dd compat: generic compat get/settimeofday
Nothing arch specific in get/settimeofday.  The details of the timeval
conversion varied a little from arch to arch, but all with the same
results.

Also add an extern declaration for sys_tz to linux/time.h because externs
in .c files are fowned upon.  I'll kill the externs in various other files
in a sparate patch.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: David S. Miller <davem@davemloft.net> [ sparc bits ]
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Acked-by: Kyle McMartin <kyle@mcmartin.ca>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Grant Grundler <grundler@parisc-linux.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16 11:21:33 -07:00
Andi Kleen
20036fdcaf Add kerneldoc documentation for new printk format extensions
Add documentation in kerneldoc for new printk format extensions

This patch documents the new %pS/%pF options in printk in kernel doc.

Hope I didn't miss any other extension.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16 11:21:32 -07:00
Randy Dunlap
6d5cd6effe taint: fix kernel-doc
Move print_tainted() kernel-doc to avoid the following error:

Error(/var/linsrc/mmotm-2008-1002-1617//kernel/panic.c:155): cannot understand prototype: 'struct tnt '

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16 11:21:32 -07:00
Francois Cami
e1f8e87449 Remove Andrew Morton's old email accounts
People can use the real name an an index into MAINTAINERS to find the
current email address.

Signed-off-by: Francois Cami <francois.cami@free.fr>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16 11:21:32 -07:00
WANG Cong
7968b3d9a6 kernel/kallsyms.c: fix double return
Commit 6dd06c9fbe ("module: make
module_address_lookup safe") introduced double returns in the function
kallsyms_lookup(), it's weird.  The second one should be removed.

Signed-off-by: WANG Cong <wangcong@zeuux.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16 11:21:32 -07:00
Andrew Morton
9679e4dd62 kernel/sys.c: improve code generation
utsname() is quite expensive to calculate.  Cache it in a local.

          text    data     bss     dec     hex filename
before:  11136     720      16   11872    2e60 kernel/sys.o
after:   11096     720      16   11832    2e38 kernel/sys.o

Acked-by: Vegard Nossum <vegard.nossum@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: "Serge E. Hallyn" <serue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16 11:21:31 -07:00
Vegard Nossum
8798881507 utsname: completely overwrite prior information
On sethostname() and setdomainname(), previous information may be retained
if it was longer than than the new hostname/domainname.

This can be demonstrated trivially by calling sethostname() first with a
long name, then with a short name, and then calling uname() to retrieve
the full buffer that contains the hostname (and possibly parts of the old
hostname), one just has to look past the terminating zero.

I don't know if we should really care that much (hence the RFC); the only
scenarios I can possibly think of is administrator putting something
sensitive in the hostname (or domain name) by accident, and changing it
back will not undo the mistake entirely, though it's not like we can
recover gracefully from "rm -rf /" either...  The other scenario is
namespaces (CLONE_NEWUTS) where some information may be unintentionally
"inherited" from the previous namespace (a program wants to hide the
original name and does clone + sethostname, but some information is still
left).

I think the patch may be defended on grounds of the principle of least
surprise.  But I am not adamant :-)

(I guess the question now is whether userspace should be able to
write embedded NULs into the buffer or not...)

At least the observation has been made and the patch has been presented.

Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: "Serge E. Hallyn" <serue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16 11:21:31 -07:00
Dave Hansen
22b8ce9470 profiling: dynamically enable readprofile at runtime
Way too often, I have a machine that exhibits some kind of crappy
behavior.  The CPU looks wedged in the kernel or it is spending way too
much system time and I wonder what is responsible.

I try to run readprofile.  But, of course, Ubuntu doesn't enable it by
default.  Dang!

The reason we boot-time enable it is that it takes a big bufffer that we
generally can only bootmem alloc.  But, does it hurt to at least try and
runtime-alloc it?

To use:
echo 2 > /sys/kernel/profile

Then run readprofile like normal.

This should fix the compile issue with allmodconfig.  I've compile-tested
on a bunch more configs now including a few more architectures.

Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16 11:21:31 -07:00
Adam Tkac
0c2d64fb6c rlimit: permit setting RLIMIT_NOFILE to RLIM_INFINITY
When a process wants to set the limit of open files to RLIM_INFINITY it
gets EPERM even if it has CAP_SYS_RESOURCE capability.

For example, BIND does:

...
#elif defined(NR_OPEN) && defined(__linux__)
        /*
         * Some Linux kernels don't accept RLIM_INFINIT; the maximum
         * possible value is the NR_OPEN defined in linux/fs.h.
         */
        if (resource == isc_resource_openfiles && rlim_value == RLIM_INFINITY) {
                rl.rlim_cur = rl.rlim_max = NR_OPEN;
                unixresult = setrlimit(unixresource, &rl);
                if (unixresult == 0)
                        return (ISC_R_SUCCESS);
        }
#elif ...

If we allow setting RLIMIT_NOFILE to RLIM_INFINITY we increase portability
- you don't have to check if OS is linux and then use different schema for
limits.

The spec says "Specifying RLIM_INFINITY as any resource limit value on a
successful call to setrlimit() shall inhibit enforcement of that resource
limit." and we're presently not doing that.

Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16 11:21:31 -07:00
Andi Kleen
25ddbb18aa Make the taint flags reliable
It's somewhat unlikely that it happens, but right now a race window
between interrupts or machine checks or oopses could corrupt the tainted
bitmap because it is modified in a non atomic fashion.

Convert the taint variable to an unsigned long and use only atomic bit
operations on it.

Unfortunately this means the intvec sysctl functions cannot be used on it
anymore.

It turned out the taint sysctl handler could actually be simplified a bit
(since it only increases capabilities) so this patch actually removes
code.

[akpm@linux-foundation.org: remove unneeded include]
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16 11:21:31 -07:00
Jan Beulich
9ba16087d9 Kconfig: eliminate "def_bool n" constructs
Using "def_bool n" is pointless, simply using bool here appears more
appropriate.

Further, retaining such options that don't have a prompt and aren't
selected by anything seems also at least questionable.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16 11:21:31 -07:00
Tejun Heo
a25d644fc0 wait: kill is_sync_wait()
is_sync_wait() is used to distinguish between sync and async waits.
Basically sync waits are the ones initialized with init_waitqueue_entry()
and async ones with init_waitqueue_func_entry().  The sync/async
distinction is used only in prepare_to_wait[_exclusive]() and its only
function is to skip setting the current task state if the wait is async.
This has a few problems.

* No one uses it.  None of func_entry users use prepare_to_wait()
  functions, so the code path never gets executed.

* The distinction is bogus.  Maybe back when func_entry is used only
  by aio but it's now also used by epoll and in future possibly by 9p
  and poll/select.

* Taking @state as argument and ignoring it silenly depending on how
  @wait is initialized is just a bad error-prone API.

* It prevents func_entry waits from using wait->private for no good
  reason.

This patch kills is_sync_wait() and the associated code paths from
prepare_to_wait[_exclusive]().  As there was no user of these code paths,
this patch doesn't cause any behavior difference.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16 11:21:31 -07:00