Commit Graph

3648 Commits

Author SHA1 Message Date
Vegard Nossum
129f1d2c53 timer_list: Fix printk format strings
This makes sure printk format strings contain no more than a single
line.

Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-29 09:39:38 +01:00
Adrian Bunk
64e38eb082 clockevents: unexport tick_nohz_get_sleep_length
This patch removes the unused 
EXPORT_SYMBOL_GPL(tick_nohz_get_sleep_length).

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-29 09:39:38 +01:00
Gautham R Shenoy
17aacfb9cd lockdep: fix a typo in the __lock_acquire comment
Fix a typo in the __lock_acquire comment.

Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-28 20:47:01 +01:00
Peter Zijlstra
ab63a633cf sched: fix unconditional irq lock
Lockdep noticed that this lock can also be taken from hardirq context, and can
thus not unconditionally disable/enable irqs.

 WARNING: at kernel/lockdep.c:2033 trace_hardirqs_on()
  [show_trace_log_lvl+26/48] show_trace_log_lvl+0x1a/0x30
  [show_trace+18/32] show_trace+0x12/0x20
  [dump_stack+22/32] dump_stack+0x16/0x20
  [trace_hardirqs_on+405/416] trace_hardirqs_on+0x195/0x1a0
  [_read_unlock_irq+34/48] _read_unlock_irq+0x22/0x30
  [sched_debug_show+2615/4224] sched_debug_show+0xa37/0x1080
  [show_state_filter+326/368] show_state_filter+0x146/0x170
  [sysrq_handle_showstate+10/16] sysrq_handle_showstate+0xa/0x10
  [__handle_sysrq+123/288] __handle_sysrq+0x7b/0x120
  [handle_sysrq+40/64] handle_sysrq+0x28/0x40
  [kbd_event+1045/1680] kbd_event+0x415/0x690
  [input_pass_event+206/208] input_pass_event+0xce/0xd0
  [input_handle_event+170/928] input_handle_event+0xaa/0x3a0
  [input_event+95/112] input_event+0x5f/0x70
  [atkbd_interrupt+434/1456] atkbd_interrupt+0x1b2/0x5b0
  [serio_interrupt+59/128] serio_interrupt+0x3b/0x80
  [i8042_interrupt+263/576] i8042_interrupt+0x107/0x240
  [handle_IRQ_event+40/96] handle_IRQ_event+0x28/0x60
  [handle_edge_irq+175/320] handle_edge_irq+0xaf/0x140
  [do_IRQ+64/128] do_IRQ+0x40/0x80
  [common_interrupt+46/52] common_interrupt+0x2e/0x34
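
The hazard described above can be modelled in user space. The sketch below is an illustration only: a single flag stands in for the CPU's interrupt-enable state, and every name in it is hypothetical rather than a kernel API. It contrasts an unlock that always enables interrupts with one that restores the state saved at lock time, which is one common alternative; the message does not say which form the patch uses.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: one flag stands in for the interrupt-enable state. */
static bool irqs_enabled = true;

/* Unconditional pairing: lock disables, unlock always enables. */
static void model_lock_irq(void)   { irqs_enabled = false; }
static void model_unlock_irq(void) { irqs_enabled = true; }

/* Save/restore pairing: unlock puts back whatever the caller had. */
static bool model_lock_irqsave(void)
{
	bool flags = irqs_enabled;

	irqs_enabled = false;
	return flags;
}

static void model_unlock_irqrestore(bool flags)
{
	irqs_enabled = flags;
}

/* Caller is in (modelled) hardirq context, so interrupts start disabled. */
static bool state_after_unconditional(void)
{
	irqs_enabled = false;
	model_lock_irq();
	model_unlock_irq();
	return irqs_enabled;	/* interrupts were re-enabled behind the caller's back */
}

static bool state_after_save_restore(void)
{
	bool flags;

	irqs_enabled = false;
	flags = model_lock_irqsave();
	model_unlock_irqrestore(flags);
	return irqs_enabled;	/* caller's disabled state is preserved */
}

/* Self-check of the two outcomes described in the comments above. */
static void check_model(void)
{
	assert(state_after_unconditional() == true);
	assert(state_after_save_restore() == false);
}
```

Calling check_model() from a main() exercises both paths; only the unconditional pairing changes the state the caller started with.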

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-25 14:02:45 +02:00
Peter Williams
681f3e6854 sched: isolate SMP balancing code a bit more
At the moment, a lot of load balancing code that is irrelevant to non
SMP systems gets included during non SMP builds.

This patch addresses this issue and reduces the binary size on non
SMP systems:

   text    data     bss     dec     hex filename
  10983      28    1192   12203    2fab sched.o.before
  10739      28    1192   11959    2eb7 sched.o.after

Signed-off-by: Peter Williams <pwil3058@bigpond.net.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-24 18:23:51 +02:00
Peter Williams
e1d1484f72 sched: reduce balance-tasks overhead
At the moment, balance_tasks() provides low level functionality for both
move_tasks() and move_one_task() (indirectly) via the load_balance()
function (in the sched_class interface) which also provides dual
functionality.  This dual functionality complicates the interfaces and
internal mechanisms and adds to the run time overhead of operations that
are called with two run queue locks held.

This patch addresses this issue and reduces the overhead of these
operations.

Signed-off-by: Peter Williams <pwil3058@bigpond.net.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-24 18:23:51 +02:00
Adrian Bunk
a0f846aa76 sched: make cpu_shares_{show,store}() static
cpu_shares_{show,store}() can become static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-24 18:23:50 +02:00
Paul Menage
2b01dfe372 sched: clean up some control group code
- replace "cont" with "cgrp" in a few places in the CFS cgroup code, 
- use write_uint rather than write for cpu.shares write function

Signed-off-by: Paul Menage <menage@google.com>
Acked-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-24 18:23:50 +02:00
Mel Gorman
b3da2a73ff sched: document profile=sleep requiring CONFIG_SCHEDSTATS
profile=sleep only works if CONFIG_SCHEDSTATS is set. This patch notes
the limitation in Documentation/kernel-parameters.txt and prints a
warning at boot-time if profile=sleep is used without CONFIG_SCHEDSTATS.

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-24 18:23:50 +02:00
Satyam Sharma
838225b48e sched: use show_regs() to improve __schedule_bug() output
A full register dump along with stack backtrace would make the
"scheduling while atomic" message more helpful. Use show_regs() instead
of dump_stack() for this. We already know we're atomic in here (that is
why this function was called) so show_regs()'s atomicity expectations
are guaranteed.

Also, modify the output of the "BUG: scheduling while atomic:" header a
bit to keep task->comm and task->pid together and preempt_count() after
them.

Signed-off-by: Satyam Sharma <satyam@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-24 18:23:50 +02:00
Ingo Molnar
4dcf6aff02 sched: clean up sched_domain_debug()
clean up sched_domain_debug().

this also shrinks the code a bit:

   text    data     bss     dec     hex filename
  50474    4306     480   55260    d7dc sched.o.before
  50404    4306     480   55190    d796 sched.o.after

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-24 18:23:48 +02:00
Ingo Molnar
b15136e949 sched: fix fastcall mismatch in completion APIs
Jeff Dike noticed that wait_for_completion_interruptible()'s prototype
had a mismatched fastcall.

Fix this by removing the fastcall attributes from all the completion APIs.

Found-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-24 18:23:48 +02:00
Milton Miller
7378547f2c sched: fix sched_domain sysctl registration again
Commit 029190c515 (cpuset sched_load_balance flag) was not tested with
SCHED_DEBUG enabled as committed: it dereferences NULL when used, and
it reordered the sysctl registration so that it never shows any domains
or their tunables.

Fixes:

1) restore arch_init_sched_domains ordering
	we can't walk the domains before we build them

	presently we register cpus with empty directories (no domain
	directories or files).

2) make unregister_sched_domain_sysctl do nothing when already unregistered
	detach_destroy_domains is now called one set of cpus at a time
	unregister_sysctl_table dereferences NULL if called with a NULL pointer.

	While the function would always dereference null if called
	twice, in the previous code it was always called once and then
	was followed by a register.  So only the hidden bug of the
	sysctl_root_table not being allocated followed by an attempt to
	free it would have shown the error.

3) always call unregister and register in partition_sched_domains
	The code is "smart" about unregistering only needed domains.
	Since we aren't guaranteed any calls to unregister, always 
	unregister.   Without calling register on the way out we
	will not have a table or any sysctl tree.

4) warn if register is called without unregistering
	The previous table memory is lost, leaving pointers to the
	later freed memory in sysctl and leaking the memory of the
	tables.
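
Items 2 and 4 above amount to a guarded register/unregister pair. A minimal user-space sketch of that pattern follows, assuming a single global table pointer; the struct, the function names and the warning counter are hypothetical stand-ins, not the scheduler's actual code. Refusing the second register is a choice made for this sketch, since the message only says that the case is warned about.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a registered sysctl table. */
struct model_table {
	int entries;
};

static struct model_table *registered;	/* NULL when nothing is registered */
static int warnings;			/* counts "register without unregister" */

static int model_register(int entries)
{
	if (registered) {
		/* Item 4: flag the misuse instead of silently losing the old table. */
		warnings++;
		return -1;
	}
	registered = malloc(sizeof(*registered));
	if (!registered)
		return -1;
	registered->entries = entries;
	return 0;
}

static void model_unregister(void)
{
	/* Item 2: do nothing when already unregistered. */
	if (!registered)
		return;
	free(registered);
	registered = NULL;
}

/* Walks through the call orders discussed above; returns the warning count. */
static int demo_sequence(void)
{
	int first, second;

	model_unregister();		/* safe before any register */
	first = model_register(3);
	second = model_register(3);	/* flagged: table already registered */
	model_unregister();
	model_unregister();		/* second call is a no-op */

	assert(first == 0);
	assert(second == -1);
	return warnings;
}
```

With the guard in place, unregister can be called unconditionally on every path, which is what item 3 relies on.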

Before this patch on a 2-core 4-thread box compiled for SMT and NUMA,
the domains appear empty (there are actually 3 levels per cpu).  And as
soon as there are two domains, a null pointer is dereferenced (the frame
marked "unreliable" in this case is stack garbage):

bu19a:~# ls -R /proc/sys/kernel/sched_domain/
/proc/sys/kernel/sched_domain/:
cpu0  cpu1  cpu2  cpu3

/proc/sys/kernel/sched_domain/cpu0:

/proc/sys/kernel/sched_domain/cpu1:

/proc/sys/kernel/sched_domain/cpu2:

/proc/sys/kernel/sched_domain/cpu3:

bu19a:~# mkdir /dev/cpuset
bu19a:~# mount -tcpuset cpuset /dev/cpuset/
bu19a:~# cd /dev/cpuset/
bu19a:/dev/cpuset# echo 0 > sched_load_balance 
bu19a:/dev/cpuset# mkdir one
bu19a:/dev/cpuset# echo 1 > one/cpus               
bu19a:/dev/cpuset# echo 0 > one/sched_load_balance 
Unable to handle kernel paging request for data at address 0x00000018
Faulting instruction address: 0xc00000000006b608
NIP: c00000000006b608 LR: c00000000006b604 CTR: 0000000000000000
REGS: c000000018d973f0 TRAP: 0300   Not tainted  (2.6.23-bml)
MSR: 9000000000009032 <EE,ME,IR,DR>  CR: 28242442  XER: 00000000
DAR: 0000000000000018, DSISR: 0000000040000000
TASK = c00000001912e340[1987] 'bash' THREAD: c000000018d94000 CPU: 2
..
NIP [c00000000006b608] .unregister_sysctl_table+0x38/0x110
LR [c00000000006b604] .unregister_sysctl_table+0x34/0x110
Call Trace:
[c000000018d97670] [c000000007017270] 0xc000000007017270 (unreliable)
[c000000018d97720] [c000000000058710] .detach_destroy_domains+0x30/0xb0
[c000000018d977b0] [c00000000005cf1c] .partition_sched_domains+0x1bc/0x230
[c000000018d97870] [c00000000009fdc4] .rebuild_sched_domains+0xb4/0x4c0
[c000000018d97970] [c0000000000a02e8] .update_flag+0x118/0x170
[c000000018d97a80] [c0000000000a1768] .cpuset_common_file_write+0x568/0x820
[c000000018d97c00] [c00000000009d95c] .cgroup_file_write+0x7c/0x180
[c000000018d97cf0] [c0000000000e76b8] .vfs_write+0xe8/0x1b0
[c000000018d97d90] [c0000000000e810c] .sys_write+0x4c/0x90
[c000000018d97e30] [c00000000000852c] syscall_exit+0x0/0x40

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-24 18:23:48 +02:00
Jeff Garzik
3bdf590eac cgroup: kill unused variable
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2007-10-23 21:28:39 -04:00
Herbert Xu
a98ce5c6fe Fix synchronize_irq races with IRQ handler
As it is, some callers of synchronize_irq rely on memory barriers
to provide synchronisation against the IRQ handlers.  For example,
the tg3 driver does

	tp->irq_sync = 1;
	smp_mb();
	synchronize_irq();

and then in the IRQ handler:

	if (!tp->irq_sync)
		netif_rx_schedule(dev, &tp->napi);

Unfortunately memory barriers only work well when they come in
pairs.  Because we don't actually have memory barriers on the
IRQ path, the memory barrier before the synchronize_irq() doesn't
actually protect us.

In particular, synchronize_irq() may return followed by the
result of netif_rx_schedule being made visible.

This patch (mostly written by Linus) fixes this by using spin
locks instead of memory barriers on the synchronize_irq() path.
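
The idea of taking a lock on both paths can be illustrated with a user-space analogy. The sketch below is not the kernel patch: it uses a pthread mutex in place of a spin lock, a thread in place of the IRQ handler, and hypothetical names throughout. Unlike the tg3 snippet above, the flag is also set under the lock here, which keeps the model free of data races in portable C.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t desc_lock = PTHREAD_MUTEX_INITIALIZER;
static bool irq_sync;	/* protected by desc_lock in this model */
static int scheduled;	/* protected by desc_lock; counts handler work */

/* Stand-in for the IRQ handler: check and side effect happen under the lock. */
static void model_handler(void)
{
	pthread_mutex_lock(&desc_lock);
	if (!irq_sync)
		scheduled++;
	pthread_mutex_unlock(&desc_lock);
}

static void *handler_thread(void *arg)
{
	(void)arg;
	for (int i = 0; i < 100000; i++)
		model_handler();
	return NULL;
}

/* Stand-in for "set the flag, then synchronize": returns the amount of
 * work that had been scheduled up to the synchronization point. */
static int model_quiesce(void)
{
	int snapshot;

	pthread_mutex_lock(&desc_lock);
	irq_sync = true;
	snapshot = scheduled;
	pthread_mutex_unlock(&desc_lock);
	return snapshot;
}

/* Returns 1 if no handler work appeared after quiescing, -1 on setup failure. */
static int run_model(void)
{
	pthread_t t;
	int snapshot;

	if (pthread_create(&t, NULL, handler_thread, NULL) != 0)
		return -1;
	snapshot = model_quiesce();
	pthread_join(t, NULL);
	assert(snapshot >= 0);
	return scheduled == snapshot;
}
```

Because every handler invocation runs either entirely before or entirely after the critical section in model_quiesce(), no increment can become visible after it returns. The sketch needs to be built with thread support, for example with -pthread.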

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-23 09:01:31 -07:00
Randy Dunlap
481968f44e auditsc: fix kernel-doc param warnings
Fix kernel-doc for auditsc parameter changes.

Warning(linux-2.6.23-git17//kernel/auditsc.c:1623): No description found for parameter 'dentry'
Warning(linux-2.6.23-git17//kernel/auditsc.c:1666): No description found for parameter 'dentry'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-22 19:40:02 -07:00
Linus Torvalds
0fd56c7033 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm:
  KVM: Use new smp_call_function_mask() in kvm_flush_remote_tlbs()
  sched: don't clear PF_VCPU in scheduler
  KVM: Improve local apic timer wraparound handling
  KVM: Fix local apic timer divide by zero
  KVM: Move kvm_guest_exit() after local_irq_enable()
  KVM: x86 emulator: fix access registers for instructions with ModR/M byte and Mod = 3
  KVM: VMX: Force vm86 mode if setting flags during real mode
  KVM: x86 emulator: implement 'movnti mem, reg'
  KVM: VMX: Reset mmu context when entering real mode
  KVM: VMX: Handle NMIs before enabling interrupts and preemption
  KVM: MMU: Set shadow pte atomically in mmu_pte_write_zap_pte()
  KVM: x86 emulator: fix repne/repnz decoding
  KVM: x86 emulator: fix merge screwup due to emulator split
2007-10-22 19:24:17 -07:00
Eric W. Biederman
5081dba658 Fix appletalk sysctl entry name
Gabriel C reported that modprobing appletalk on current git gives a
warning in dmesg :

   "sysctl table check failed: /net/appletalk .3.7 procname does not match binary path procname"

Oops.  My apologies, it appears I made a mistake when creating my table
to check up on sysctl values.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Tested-by: Gabriel C <nix.or.die@googlemail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-22 19:15:59 -07:00
Laurent Vivier
83d87d1673 sched: don't clear PF_VCPU in scheduler
KVM clears it by itself now, and for s390 this is plain wrong.

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-22 12:03:29 +02:00
Al Viro
74c3cbe33b [PATCH] audit: watching subtrees
New kind of audit rule predicates: "object is visible in given subtree".
The part that can be sanely implemented, that is.  Limitations:
	* if you have a hardlink from outside of the tree, you'd better watch
	  it too (or just watch the object itself, obviously)
	* if you mount something under a watched tree, tell audit
	  that the new chunk should be added to watched subtrees
	* if you umount something in a watched tree and it's still mounted
	  elsewhere, you will get matches on events happening there.  A new
	  command tells audit to recalculate the trees, trimming such sources
	  of false positives.

Note that it's _not_ about path - if something is mounted in several places
(multiple mounts, bindings, different namespaces, etc.), the match does
_not_ depend on which one we are using for access.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2007-10-21 02:37:45 -04:00
Al Viro
5a190ae697 [PATCH] pass dentry to audit_inode()/audit_inode_child()
makes the caller simpler *and* allows scanning ancestors

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2007-10-21 02:37:18 -04:00
Linus Torvalds
c00046c279 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: (74 commits)
  fix do_sys_open() prototype
  sysfs: trivial: fix sysfs_create_file kerneldoc spelling mistake
  Documentation: Fix typo in SubmitChecklist.
  Typo: depricated -> deprecated
  Add missing profile=kvm option to Documentation/kernel-parameters.txt
  fix typo about TBI in e1000 comment
  proc.txt: Add /proc/stat field
  small documentation fixes
  Fix compiler warning in smount example program from sharedsubtree.txt
  docs/sysfs: add missing word to sysfs attribute explanation
  documentation/ext3: grammar fixes
  Documentation/java.txt: typo and grammar fixes
  Documentation/filesystems/vfs.txt: typo fix
  include/asm-*/system.h: remove unused set_rmb(), set_wmb() macros
  trivial copy_data_pages() tidy up
  Fix typo in arch/x86/kernel/tsc_32.c
  file link fix for Pegasus USB net driver help
  remove unused return within void return function
  Typo fixes retrun -> return
  x86 hpet.h: remove broken links
  ...
2007-10-19 20:36:17 -07:00
Eric W. Biederman
c1cb8e48bd sysctl: Don't compile sysctl_check when !CONFIG_SYSCTL
Weird, I thought I had written the makefile so this would be handled.  Oh
well, this should fix it.

Sorry about that.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-and-tested-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-19 18:04:22 -07:00
Fengguang Wu
df7c487250 trivial copy_data_pages() tidy up
Change the loop style of copy_data_pages() to remove a duplicate condition.

Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
2007-10-20 02:26:04 +02:00
Uwe Kleine-König
6506f2aa66 fix comment: unlock_hrtimer_base is the counterpart of lock_hrtimer_base
Signed-off-by: Uwe Kleine-König <ukleinek@informatik.uni-freiburg.de>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
2007-10-20 01:56:53 +02:00