Commit Graph

50 Commits

Author SHA1 Message Date
Ingo Molnar
486fdae214 sched: build fix
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-19 19:45:01 +02:00
Peter Zijlstra
d19ca30874 sched: debug: add some debug code to handle the full hierarchy
Add some extra debug output so we can get a better overview of the
full hierarchy.

We print the cgroup path after each cfs_rq, so we can see what group
we're looking at.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-19 19:45:00 +02:00
Ingo Molnar
50df5d6aea sched: remove sysctl_sched_batch_wakeup_granularity
it's unused.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-19 19:44:57 +02:00
Ingo Molnar
4ae7d5cefd sched: improve affine wakeups
improve affine wakeups. Maintain the 'overlap' metric based on CFS's
sum_exec_runtime - which means the amount of time a task executes
after it wakes up some other task.

Use the 'overlap' for the wakeup decisions: if the 'overlap' is short,
it means there's strong workload coupling between this task and the
woken up task. If the 'overlap' is large then the workload is decoupled
and the scheduler will move them to separate CPUs more easily.

( Also slightly move the preempt_check within try_to_wake_up() - this has
  no effect on functionality but allows 'early wakeups' (for still-on-rq
  tasks) to be correctly accounted as well.)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-03-19 04:27:53 +01:00
Arjan van de Ven
6d082592b6 sched: keep total / count stats in addition to the max for
Right now, the linux kernel (with scheduler statistics enabled) keeps track
of the maximum time a process is waiting to be scheduled. While the maximum
is a very useful metric, tracking average and total is equally useful
(at least for latencytop) to figure out the accumulated effect of scheduler
delays. The accumulated effect is important to judge the performance impact
of scheduler tuning/behavior.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-25 21:08:35 +01:00
Guillaume Chazarain
cc203d2422 sched: monitor clock underflows in /proc/sched_debug
We monitor clock overflows, let's also monitor clock underflows.

Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-25 21:08:34 +01:00
Ingo Molnar
90b2628f1f sched: fix gcc warnings
Meelis Roos reported these warnings on sparc64:

  CC      kernel/sched.o
  In file included from kernel/sched.c:879:
  kernel/sched_debug.c: In function 'nsec_high':
  kernel/sched_debug.c:38: warning: comparison of distinct pointer types lacks a cast

the debug check in do_div() is over-eager here, because the long long
is always positive in these places. Mark this by casting them to
unsigned long long.

no change in code output:

   text    data     bss     dec     hex filename
  51471    6582     376   58429    e43d sched.o.before
  51471    6582     376   58429    e43d sched.o.after

  md5:
   7f7729c111f185bf3ccea4d542abc049  sched.o.before.asm
   7f7729c111f185bf3ccea4d542abc049  sched.o.after.asm

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-12-30 17:24:35 +01:00
Ingo Molnar
c1a89740da sched: clean up overlong line in kernel/sched_debug.c
clean up overlong line in kernel/sched_debug.c.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-11-28 15:52:56 +01:00
Ingo Molnar
f7b9329e55 sched: bump version of kernel/sched_debug.c
bump version of kernel/sched_debug.c and remove CFS version
information from it.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-11-26 21:21:49 +01:00
Peter Zijlstra
b2be5e96dc sched: reintroduce the sched_min_granularity tunable
we lost the sched_min_granularity tunable to a clever optimization
that uses the sched_latency/min_granularity ratio - but the ratio
is quite unintuitive to users and can also crash the kernel if the
ratio is set to 0. So reintroduce the min_granularity tunable,
while keeping the ratio maintained internally.

no functionality changed.

[ mingo@elte.hu: some fixlets. ]

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-11-09 22:39:37 +01:00
Peter Zijlstra
ab63a633cf sched: fix unconditional irq lock
Lockdep noticed that this lock can also be taken from hardirq context, and can
thus not unconditionally disable/enable irqs.

 WARNING: at kernel/lockdep.c:2033 trace_hardirqs_on()
  [show_trace_log_lvl+26/48] show_trace_log_lvl+0x1a/0x30
  [show_trace+18/32] show_trace+0x12/0x20
  [dump_stack+22/32] dump_stack+0x16/0x20
  [trace_hardirqs_on+405/416] trace_hardirqs_on+0x195/0x1a0
  [_read_unlock_irq+34/48] _read_unlock_irq+0x22/0x30
  [sched_debug_show+2615/4224] sched_debug_show+0xa37/0x1080
  [show_state_filter+326/368] show_state_filter+0x146/0x170
  [sysrq_handle_showstate+10/16] sysrq_handle_showstate+0xa/0x10
  [__handle_sysrq+123/288] __handle_sysrq+0x7b/0x120
  [handle_sysrq+40/64] handle_sysrq+0x28/0x40
  [kbd_event+1045/1680] kbd_event+0x415/0x690
  [input_pass_event+206/208] input_pass_event+0xce/0xd0
  [input_handle_event+170/928] input_handle_event+0xaa/0x3a0
  [input_event+95/112] input_event+0x5f/0x70
  [atkbd_interrupt+434/1456] atkbd_interrupt+0x1b2/0x5b0
  [serio_interrupt+59/128] serio_interrupt+0x3b/0x80
  [i8042_interrupt+263/576] i8042_interrupt+0x107/0x240
  [handle_IRQ_event+40/96] handle_IRQ_event+0x28/0x60
  [handle_edge_irq+175/320] handle_edge_irq+0xaf/0x140
  [do_IRQ+64/128] do_IRQ+0x40/0x80
  [common_interrupt+46/52] common_interrupt+0x2e/0x34

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-25 14:02:45 +02:00
Ken Chen
480b9434c5 sched: reduce schedstat variable overhead a bit
schedstat is useful in investigating CPU scheduler behavior.  Ideally,
I think it is beneficial to have it on all the time.  However, the
cost of turning it on in production system is quite high, largely due
to number of events it collects and also due to its large memory
footprint.

Most of the fields probably don't need to be full 64-bit on 64-bit
arch.  Rolling over 4 billion events will most like take a long time
and user space tool can be made to accommodate that.  I'm proposing
kernel to cut back most of variable width on 64-bit system.  (note,
the following patch doesn't affect 32-bit system).

Signed-off-by: Ken Chen <kenchen@google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-18 21:32:56 +02:00
Arjan van de Ven
0dbee3a6b0 Make scheduler debug file operations const
In general, struct file_operations are const in the kernel, to not have
false cacheline sharing and to catch bugs at compiletime with accidental
writes to them. The new scheduler code introduces a new non-const one;
fix this up.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-15 17:00:19 +02:00
Ingo Molnar
cc367732ff sched: debug, improve migration statistics
add new migration statistics when SCHED_DEBUG and SCHEDSTATS
is enabled. Available in /proc/<PID>/sched.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-15 17:00:18 +02:00
Ingo Molnar
2d92f22784 sched: debug: increase width of debug line
increase width of debug line - in preparation of more debugging info.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-15 17:00:18 +02:00
Dhaval Giani
5cb350baf5 sched: group scheduling, sysfs tunables
Add tunables in sysfs to modify a user's cpu share.

A directory is created in sysfs for each new user in the system.

	/sys/kernel/uids/<uid>/cpu_share

Reading this file returns the cpu shares granted for the user.
Writing into this file modifies the cpu share for the user. Only an
administrator is allowed to modify a user's cpu share.

Ex:
	# cd /sys/kernel/uids/
	# cat 512/cpu_share
	1024
	# echo 2048 > 512/cpu_share
	# cat 512/cpu_share
	2048
	#

Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-15 17:00:14 +02:00
Ingo Molnar
4cf86d77f5 sched: cleanup: rename task_grp to task_group
cleanup: rename task_grp to task_group. No need to save two characters
and 'grp' is annoying to read.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-15 17:00:14 +02:00
Srivatsa Vaddagiri
fb615581c7 sched: group scheduler, fix coding style issues
Fix coding style issues reported by Randy Dunlap and others

Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-15 17:00:12 +02:00
Peter Zijlstra
5f6d858ecc sched: speed up and simplify vslice calculations
speed up and simplify vslice calculations.

[ From: Mike Galbraith <efault@gmx.de>: build fix ]

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-15 17:00:12 +02:00
Ingo Molnar
2d72376b3a sched: clean up schedstats, cnt -> count
rename all 'cnt' fields and variables to the less yucky 'count' name.

yuckage noticed by Andrew Morton.

no change in code, other than the /proc/sched_debug bkl_count string got
a bit larger:

   text    data     bss     dec     hex filename
  38236    3506      24   41766    a326 sched.o.before
  38240    3506      24   41770    a32a sched.o.after

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-15 17:00:12 +02:00
Peter Zijlstra
ddc9729750 sched debug: check spread
debug feature: check how well we schedule within a reasonable
vruntime 'spread' range. (note that CPU overload can increase
the spread, so this is not a hard condition, but normal loads
should be within the spread.)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
2007-10-15 17:00:10 +02:00
Ingo Molnar
d822ceceda sched debug: more width for parameter printouts
more width for parameter printouts in /proc/sched_debug.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-15 17:00:10 +02:00
Ingo Molnar
1aa4731eff sched debug: print settings
print the current value of all tunables in /proc/sched_debug output.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-15 17:00:10 +02:00
S.Caglar Onur
fdd71d132b sched debug: BKL usage statistics, fix
build fix for the SCHED_DEBUG && !SCHEDSTATS case.

Signed-off-by: S.Ceglar Onur <caglar@pardus.org.tr>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-15 17:00:10 +02:00
Ingo Molnar
b8efb56172 sched debug: BKL usage statistics
add per task and per rq BKL usage statistics.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-15 17:00:10 +02:00