Yuyang Du
7ea241afbf
sched/fair: Clean up load average references
...
For cfs_rq, we have load.weight, runnable_load_avg, and load_avg.
Clean up how they are used:
- First, since group sched_entities already largely use load_avg, expand its
use to all cases.
- Second, for CPU-wide load balancing, keep using runnable_load_avg in all
cases, which is the same as before this series.
Signed-off-by: Yuyang Du <yuyang.du@intel.com >
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Mike Galbraith <efault@gmx.de >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: arjan@linux.intel.com
Cc: bsegall@google.com
Cc: dietmar.eggemann@arm.com
Cc: fengguang.wu@intel.com
Cc: len.brown@intel.com
Cc: morten.rasmussen@arm.com
Cc: pjt@google.com
Cc: rafael.j.wysocki@intel.com
Cc: umgwanakikbuti@gmail.com
Cc: vincent.guittot@linaro.org
Link: http://lkml.kernel.org/r/1436918682-4971-8-git-send-email-yuyang.du@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2015-08-03 12:24:32 +02:00
Yuyang Du
139622343e
sched/fair: Provide runnable_load_avg back to cfs_rq
...
The cfs_rq's load_avg is composed of runnable_load_avg and blocked_load_avg.
Before this series, sometimes runnable_load_avg was used and sometimes
load_avg. Completely replacing all uses of runnable_load_avg with load_avg
may be too big a leap, as the concern is that blocked_load_avg would overrate
the load. Therefore, bring runnable_load_avg back.
The new cfs_rq runnable_load_avg is improved to be updated with all of the
runnable sched_entities at the same time, so the problem of one sched_entity
being up to date while the others are stale is solved.
Signed-off-by: Yuyang Du <yuyang.du@intel.com >
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Mike Galbraith <efault@gmx.de >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: arjan@linux.intel.com
Cc: bsegall@google.com
Cc: dietmar.eggemann@arm.com
Cc: fengguang.wu@intel.com
Cc: len.brown@intel.com
Cc: morten.rasmussen@arm.com
Cc: pjt@google.com
Cc: rafael.j.wysocki@intel.com
Cc: umgwanakikbuti@gmail.com
Cc: vincent.guittot@linaro.org
Link: http://lkml.kernel.org/r/1436918682-4971-7-git-send-email-yuyang.du@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2015-08-03 12:24:31 +02:00
Yuyang Du
1269557889
sched/fair: Remove task and group entity load when they are dead
...
When a task exits or a group is destroyed, the entity's load should be
removed from its parent cfs_rq's load. Otherwise, it will take time
for the parent cfs_rq to decay the dead entity's load to 0, which
is not desired.
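A minimal sketch of how such removal can be done; the removed_load_avg /
removed_util_avg accumulator names and the helper name are assumptions based
on this series, not the literal patch:

/*
 * Sketch only: without holding the rq lock we cannot subtract from the
 * parent cfs_rq directly, so record the amount to remove and fold it in
 * on the next cfs_rq average update.
 */
void remove_entity_load_avg(struct sched_entity *se)
{
	struct cfs_rq *cfs_rq = cfs_rq_of(se);

	atomic_long_add(se->avg.load_avg, &cfs_rq->removed_load_avg);
	atomic_long_add(se->avg.util_avg, &cfs_rq->removed_util_avg);
}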
Signed-off-by: Yuyang Du <yuyang.du@intel.com >
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Mike Galbraith <efault@gmx.de >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: arjan@linux.intel.com
Cc: bsegall@google.com
Cc: dietmar.eggemann@arm.com
Cc: fengguang.wu@intel.com
Cc: len.brown@intel.com
Cc: morten.rasmussen@arm.com
Cc: pjt@google.com
Cc: rafael.j.wysocki@intel.com
Cc: umgwanakikbuti@gmail.com
Cc: vincent.guittot@linaro.org
Link: http://lkml.kernel.org/r/1436918682-4971-6-git-send-email-yuyang.du@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2015-08-03 12:24:30 +02:00
Yuyang Du
540247fb5d
sched/fair: Init cfs_rq's sched_entity load average
...
The runnable load and utilization averages of a cfs_rq's sched_entity
were not initialized. As is done for a task, give a new cfs_rq's sched_entity
start values that weight its load heavily during its infancy.
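A sketch of what such initialization might look like; the sched_avg field
names follow this series, but the exact start values shown are assumptions:

/* Sketch: start a new entity "heavy", at its full weight, so it is not
 * treated as (nearly) zero load while its history builds up. */
void init_entity_runnable_average(struct sched_entity *se)
{
	struct sched_avg *sa = &se->avg;

	sa->last_update_time = 0;
	sa->load_avg = scale_load_down(se->load.weight);
	sa->load_sum = sa->load_avg * LOAD_AVG_MAX;
	sa->util_avg = scale_load_down(SCHED_LOAD_SCALE);
	sa->util_sum = sa->util_avg * LOAD_AVG_MAX;
}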
Signed-off-by: Yuyang Du <yuyang.du@intel.com >
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Mike Galbraith <efault@gmx.de >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: arjan@linux.intel.com
Cc: bsegall@google.com
Cc: dietmar.eggemann@arm.com
Cc: fengguang.wu@intel.com
Cc: len.brown@intel.com
Cc: morten.rasmussen@arm.com
Cc: pjt@google.com
Cc: rafael.j.wysocki@intel.com
Cc: umgwanakikbuti@gmail.com
Cc: vincent.guittot@linaro.org
Link: http://lkml.kernel.org/r/1436918682-4971-5-git-send-email-yuyang.du@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2015-08-03 12:24:29 +02:00
Vincent Guittot
6c1d47c082
sched/fair: Implement update_blocked_averages() for CONFIG_FAIR_GROUP_SCHED=n
...
The load and the utilization of idle CPUs must be updated periodically in
order to decay the blocked part.
If CONFIG_FAIR_GROUP_SCHED is not set, the load and utilization of idle
CPUs are not decayed and stay at the values set before the CPU became idle.
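A sketch of the !CONFIG_FAIR_GROUP_SCHED variant being added, assuming the
update_cfs_rq_load_avg() helper from this series:

static inline void update_blocked_averages(int cpu)
{
	struct rq *rq = cpu_rq(cpu);
	struct cfs_rq *cfs_rq = &rq->cfs;
	unsigned long flags;

	raw_spin_lock_irqsave(&rq->lock, flags);
	update_rq_clock(rq);
	/* decay the (blocked) load and utilization of the idle CPU */
	update_cfs_rq_load_avg(cfs_rq_clock_task(cfs_rq), cfs_rq);
	raw_spin_unlock_irqrestore(&rq->lock, flags);
}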
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org >
Signed-off-by: Yuyang Du <yuyang.du@intel.com >
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Mike Galbraith <efault@gmx.de >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: arjan@linux.intel.com
Cc: bsegall@google.com
Cc: dietmar.eggemann@arm.com
Cc: fengguang.wu@intel.com
Cc: len.brown@intel.com
Cc: morten.rasmussen@arm.com
Cc: pjt@google.com
Cc: rafael.j.wysocki@intel.com
Cc: umgwanakikbuti@gmail.com
Link: http://lkml.kernel.org/r/1436918682-4971-4-git-send-email-yuyang.du@intel.com
[ Fixed up the SOB chain. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2015-08-03 12:24:28 +02:00
Yuyang Du
9d89c257df
sched/fair: Rewrite runnable load and utilization average tracking
...
The idea of runnable load average (let runnable time contribute to weight)
was proposed by Paul Turner and Ben Segall, and it is still followed by
this rewrite. This rewrite aims to solve the following issues:
1. cfs_rq's load average (namely runnable_load_avg and blocked_load_avg) is
updated at the granularity of one entity at a time, which results in the
cfs_rq's load average being stale or partially updated: at any time, only
one entity is up to date while all other entities are effectively lagging
behind. This is undesirable.
To illustrate, if we have n runnable entities in the cfs_rq, as time
elapses, they certainly become outdated:
t0: cfs_rq { e1_old, e2_old, ..., en_old }
and when we update:
t1: update e1, then we have cfs_rq { e1_new, e2_old, ..., en_old }
t2: update e2, then we have cfs_rq { e1_old, e2_new, ..., en_old }
...
We solve this by combining all runnable entities' load averages together
in the cfs_rq's avg, and updating the cfs_rq's avg as a whole. This is based
on the fact that if we regard the update as a function, then:
w * update(e) = update(w * e) and
update(e1) + update(e2) = update(e1 + e2), then
w1 * update(e1) + w2 * update(e2) = update(w1 * e1 + w2 * e2)
therefore, by this rewrite, we have an entirely updated cfs_rq at the
time we update it:
t1: update cfs_rq { e1_new, e2_new, ..., en_new }
t2: update cfs_rq { e1_new, e2_new, ..., en_new }
...
2. cfs_rq's load average is different between top rq->cfs_rq and other
task_group's per CPU cfs_rqs in whether or not blocked_load_average
contributes to the load.
The basic idea behind runnable load average (the same for utilization)
is that the blocked state is taken into account as opposed to only
accounting for the currently runnable state. Therefore, the average
should include both the runnable/running and blocked load averages.
This rewrite does that.
In addition, we also combine runnable/running and blocked averages
of all entities into the cfs_rq's average, and update it together at
once. This is based on the fact that:
update(runnable) + update(blocked) = update(runnable + blocked)
This significantly reduces the code as we don't need to separately
maintain/update runnable/running load and blocked load.
3. How task_group entities' share is calculated is complex and imprecise.
We reduce the complexity in this rewrite to allow a very simple rule:
the task_group's load_avg is aggregated from its per CPU cfs_rqs's
load_avgs. Then group entity's weight is simply proportional to its
own cfs_rq's load_avg / task_group's load_avg. To illustrate,
if a task_group has { cfs_rq1, cfs_rq2, ..., cfs_rqn }, then,
task_group_avg = cfs_rq1_avg + cfs_rq2_avg + ... + cfs_rqn_avg, then
cfs_rqx's entity's share = cfs_rqx_avg / task_group_avg * task_group's share
To sum up, this rewrite in principle is equivalent to the current one, but
fixes the issues described above. Turns out, it significantly reduces the
code complexity and hence increases clarity and efficiency. In addition,
the new averages are more smooth/continuous (no spurious spikes and valleys)
and updated more consistently and quickly to reflect the load dynamics.
As a result, we have less load tracking overhead, better performance,
and especially better power efficiency due to more balanced load.
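As a toy numeric illustration of the linearity argument in point 1 and the
share rule in point 3 (plain user-space C, not kernel code; the decay factor
merely approximates the fixed-point y with y^32 = 0.5):

#include <stdio.h>

#define Y 0.978572			/* approx. per-period decay, y^32 ~= 0.5 */

static double decay(double v, int periods)
{
	while (periods--)
		v *= Y;
	return v;
}

int main(void)
{
	double e[3] = { 512.0, 1024.0, 2048.0 };	/* entity load averages */
	double sum = e[0] + e[1] + e[2];		/* aggregated cfs_rq average */
	int p = 7;

	/* update(e1) + update(e2) + update(e3) ... */
	double per_entity = decay(e[0], p) + decay(e[1], p) + decay(e[2], p);
	/* ... equals update(e1 + e2 + e3): one update keeps them all current */
	double aggregated = decay(sum, p);
	printf("per-entity sum: %.3f, aggregated: %.3f\n", per_entity, aggregated);

	/* point 3: group entity share proportional to cfs_rq_avg / tg_avg */
	double tg_shares = 1024.0, tg_load_avg = sum;
	printf("cfs_rq1 entity share: %.3f\n", e[0] / tg_load_avg * tg_shares);
	return 0;
}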
Signed-off-by: Yuyang Du <yuyang.du@intel.com >
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Mike Galbraith <efault@gmx.de >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: arjan@linux.intel.com
Cc: bsegall@google.com
Cc: dietmar.eggemann@arm.com
Cc: fengguang.wu@intel.com
Cc: len.brown@intel.com
Cc: morten.rasmussen@arm.com
Cc: pjt@google.com
Cc: rafael.j.wysocki@intel.com
Cc: umgwanakikbuti@gmail.com
Cc: vincent.guittot@linaro.org
Link: http://lkml.kernel.org/r/1436918682-4971-3-git-send-email-yuyang.du@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2015-08-03 12:21:29 +02:00
Yuyang Du
cd126afe83
sched/fair: Remove rq's runnable avg
...
The current rq->avg has not been used at all since it was merged into the
kernel, and the code is in the scheduler's hot path, so remove it.
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com >
Signed-off-by: Yuyang Du <yuyang.du@intel.com >
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Mike Galbraith <efault@gmx.de >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: arjan@linux.intel.com
Cc: bsegall@google.com
Cc: fengguang.wu@intel.com
Cc: len.brown@intel.com
Cc: morten.rasmussen@arm.com
Cc: pjt@google.com
Cc: rafael.j.wysocki@intel.com
Cc: umgwanakikbuti@gmail.com
Cc: vincent.guittot@linaro.org
Link: http://lkml.kernel.org/r/1436918682-4971-2-git-send-email-yuyang.du@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2015-08-03 12:21:28 +02:00
Oleg Nesterov
d308b9f1e4
stop_machine: Remove cpu_stop_work's from list in cpu_stop_park()
...
cpu_stop_park() does cpu_stop_signal_done() but leaves the work on
stopper->works. The owner of this work can free/reuse this memory
right after that and corrupt the list, so if this CPU becomes online
again cpu_stopper_thread() will crash.
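A sketch of the fix's idea (simplified, not the literal patch): unlink each
pending work under stopper->lock before signalling it done, so the owner is
free to reuse or release the cpu_stop_work immediately afterwards:

spin_lock_irqsave(&stopper->lock, flags);
list_for_each_entry_safe(work, tmp, &stopper->works, list) {
	list_del_init(&work->list);			/* detach first ...	 */
	cpu_stop_signal_done(work->done, false);	/* ... then signal owner */
}
stopper->enabled = false;
spin_unlock_irqrestore(&stopper->lock, flags);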
Signed-off-by: Oleg Nesterov <oleg@redhat.com >
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Mike Galbraith <efault@gmx.de >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Tejun Heo <tj@kernel.org >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: dave@stgolabs.net
Cc: der.herr@hofr.at
Cc: paulmck@linux.vnet.ibm.com
Cc: riel@redhat.com
Cc: viro@ZenIV.linux.org.uk
Link: http://lkml.kernel.org/r/20150630012958.GA23944@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2015-08-03 12:21:28 +02:00
Oleg Nesterov
9a301f22fa
stop_machine: Use 'cpu_stop_fn_t' where possible
...
Cosmetic, but 'cpu_stop_fn_t' actually makes the code more readable and
it doesn't break cscope. And most of the declarations already use it.
Signed-off-by: Oleg Nesterov <oleg@redhat.com >
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Mike Galbraith <efault@gmx.de >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Tejun Heo <tj@kernel.org >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: dave@stgolabs.net
Cc: der.herr@hofr.at
Cc: paulmck@linux.vnet.ibm.com
Cc: riel@redhat.com
Cc: viro@ZenIV.linux.org.uk
Link: http://lkml.kernel.org/r/20150630012955.GA23937@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2015-08-03 12:21:27 +02:00
Oleg Nesterov
7eeb088e72
stop_machine: Unexport __stop_machine()
...
The only caller outside of stop_machine.c is _cpu_down(), it can use
stop_machine(). get_online_cpus() is fine under cpu_hotplug_begin().
Signed-off-by: Oleg Nesterov <oleg@redhat.com >
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Mike Galbraith <efault@gmx.de >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Tejun Heo <tj@kernel.org >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: dave@stgolabs.net
Cc: der.herr@hofr.at
Cc: paulmck@linux.vnet.ibm.com
Cc: riel@redhat.com
Cc: viro@ZenIV.linux.org.uk
Link: http://lkml.kernel.org/r/20150630012951.GA23934@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2015-08-03 12:21:26 +02:00
Oleg Nesterov
b377c2a089
stop_machine: Don't do for_each_cpu() twice in queue_stop_cpus_work()
...
queue_stop_cpus_work() can do everything in one for_each_cpu() loop.
Signed-off-by: Oleg Nesterov <oleg@redhat.com >
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Mike Galbraith <efault@gmx.de >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Tejun Heo <tj@kernel.org >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: dave@stgolabs.net
Cc: der.herr@hofr.at
Cc: paulmck@linux.vnet.ibm.com
Cc: riel@redhat.com
Cc: viro@ZenIV.linux.org.uk
Link: http://lkml.kernel.org/r/20150630012948.GA23927@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2015-08-03 12:21:26 +02:00
Oleg Nesterov
02cb7aa923
stop_machine: Move 'cpu_stopper_task' and 'stop_cpus_work' into 'struct cpu_stopper'
...
Multiple DEFINE_PER_CPU()'s do not make sense; move all the per-CPU
variables into 'struct cpu_stopper'.
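A sketch of the consolidated layout this results in (field names assumed
from the description above):

struct cpu_stopper {
	struct task_struct	*thread;	/* was: cpu_stopper_task */
	spinlock_t		lock;
	bool			enabled;	/* is this stopper enabled? */
	struct list_head	works;		/* list of pending works */
	struct cpu_stop_work	stop_work;	/* was: stop_cpus_work */
};
static DEFINE_PER_CPU(struct cpu_stopper, cpu_stopper);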
Signed-off-by: Oleg Nesterov <oleg@redhat.com >
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Mike Galbraith <efault@gmx.de >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Tejun Heo <tj@kernel.org >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: dave@stgolabs.net
Cc: der.herr@hofr.at
Cc: paulmck@linux.vnet.ibm.com
Cc: riel@redhat.com
Cc: viro@ZenIV.linux.org.uk
Link: http://lkml.kernel.org/r/20150630012944.GA23924@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2015-08-03 12:21:25 +02:00
Konstantin Khlebnikov
fe32d3cd5e
sched/preempt: Fix cond_resched_lock() and cond_resched_softirq()
...
These functions check should_resched() before unlocking the spinlock /
enabling bottom halves: preempt_count is always non-zero at that point, so
should_resched() always returns false, and cond_resched_lock() worked only
if spin_needbreak was set.
This patch adds a "preempt_offset" argument to should_resched(), along with
preempt_count offset constants for it:
PREEMPT_DISABLE_OFFSET - offset after preempt_disable()
PREEMPT_LOCK_OFFSET    - offset after spin_lock()
SOFTIRQ_DISABLE_OFFSET - offset after local_bh_disable()
SOFTIRQ_LOCK_OFFSET    - offset after spin_lock_bh()
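A sketch of the resulting check and one caller, assuming the generic
preempt_count() implementation (details vary per architecture):

static __always_inline bool should_resched(int preempt_offset)
{
	/* resched only when the count matches the expected offset */
	return unlikely(preempt_count() == preempt_offset &&
			tif_need_resched());
}

/* e.g. cond_resched_lock() checks against the offset added by spin_lock(): */
if (should_resched(PREEMPT_LOCK_OFFSET)) {
	spin_unlock(lock);
	preempt_schedule_common();
	spin_lock(lock);
}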
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru >
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Cc: Alexander Graf <agraf@suse.de >
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com >
Cc: David Vrabel <david.vrabel@citrix.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Mike Galbraith <efault@gmx.de >
Cc: Paul Mackerras <paulus@samba.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Thomas Gleixner <tglx@linutronix.de >
Fixes: bdb4380658 ("sched: Extract the basic add/sub preempt_count modifiers")
Link: http://lkml.kernel.org/r/20150715095204.12246.98268.stgit@buzz
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2015-08-03 12:21:24 +02:00
Konstantin Khlebnikov
c56dadf397
sched/preempt, powerpc, kvm: Use need_resched() instead of should_resched()
...
should_resched() is equivalent to (!preempt_count() && need_resched()).
In a preemptible kernel, preempt_count is non-zero here because of vc->lock,
so should_resched() can never trigger; check need_resched() directly instead.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru >
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Cc: Alexander Graf <agraf@suse.de >
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com >
Cc: David Vrabel <david.vrabel@citrix.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Mike Galbraith <efault@gmx.de >
Cc: Paul Mackerras <paulus@samba.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Thomas Gleixner <tglx@linutronix.de >
Link: http://lkml.kernel.org/r/20150715095203.12246.72922.stgit@buzz
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2015-08-03 12:21:24 +02:00
Konstantin Khlebnikov
0fa2f5cb2b
sched/preempt, xen: Use need_resched() instead of should_resched()
...
This code is used only when CONFIG_PREEMPT=n and only in non-atomic context:
xen_in_preemptible_hcall is set only in privcmd_ioctl_hypercall().
Thus preempt_count is zero and should_resched() is equal to need_resched().
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru >
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Cc: Alexander Graf <agraf@suse.de >
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com >
Cc: David Vrabel <david.vrabel@citrix.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Mike Galbraith <efault@gmx.de >
Cc: Paul Mackerras <paulus@samba.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Thomas Gleixner <tglx@linutronix.de >
Link: http://lkml.kernel.org/r/20150715095201.12246.49283.stgit@buzz
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2015-08-03 12:21:23 +02:00
Mike Galbraith
63b0e9edce
sched/fair: Beef up wake_wide()
...
Josef Bacik reported that Facebook sees better performance with their
1:N load (1 dispatch/node, N workers/node) when carrying an old patch
to try very hard to wake to an idle CPU. While looking at wake_wide(),
I noticed that it doesn't pay attention to the wakeup of a many-partner
waker, returning 1 only when waking one of its many partners.
Correct that, letting explicit domain flags override the heuristic.
While at it, adjust the task_struct bits; we don't need a 64-bit counter.
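A sketch of the reworked heuristic under those assumptions (wakee_flips
counters on both tasks compared against the LLC size; names follow the
surrounding scheduler code but this is not the literal patch):

static int wake_wide(struct task_struct *p)
{
	unsigned int master = current->wakee_flips;
	unsigned int slave  = p->wakee_flips;
	int factor = this_cpu_read(sd_llc_size);

	/* look at the relationship from both ends, not only the waker's */
	if (master < slave)
		swap(master, slave);
	if (slave < factor || master < slave * factor)
		return 0;	/* an affine wakeup is fine */
	return 1;		/* wake wide: go look for an idle CPU */
}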
Tested-by: Josef Bacik <jbacik@fb.com >
Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com >
[ Tidy things up. ]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Mike Galbraith <efault@gmx.de >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: kernel-team<Kernel-team@fb.com >
Cc: morten.rasmussen@arm.com
Cc: riel@redhat.com
Link: http://lkml.kernel.org/r/1436888390.7983.49.camel@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2015-08-03 12:21:23 +02:00
Peter Zijlstra
fbd705a0c6
sched: Introduce the 'trace_sched_waking' tracepoint
...
Mathieu reported that since 317f394160 ("sched: Move the second half
of ttwu() to the remote cpu") trace_sched_wakeup() can happen out of
context of the waker.
This is a problem when you want to analyse wakeup paths because it is
now very hard to correlate the wakeup event to whoever issued the
wakeup.
OTOH trace_sched_wakeup() is issued at the point where we set
p->state = TASK_RUNNING, which is right where we hand the task off to
the scheduler, so this is an important point when looking at
scheduling behaviour: up to here it has been the wakeup path, everything
hereafter is due to scheduler policy.
To bridge this gap, introduce a second tracepoint: trace_sched_waking.
It is guaranteed to be called in the waker context.
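The resulting ordering, roughly (a simplified fragment, not the literal code):

/* in try_to_wake_up(), still in the waker's context: */
trace_sched_waking(p);

/* possibly later, on the remote CPU, where the task is handed to the
 * scheduler: */
p->state = TASK_RUNNING;
trace_sched_wakeup(p);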
Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com >
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Cc: Francis Giraldeau <francis.giraldeau@gmail.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Mike Galbraith <efault@gmx.de >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Steven Rostedt <rostedt@goodmis.org >
Cc: Thomas Gleixner <tglx@linutronix.de >
Link: http://lkml.kernel.org/r/20150609091336.GQ3644@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2015-08-03 12:21:22 +02:00
Peter Zijlstra
9d7fb04276
sched/cputime: Guarantee stime + utime == rtime
...
While the current code guarantees monotonicity for stime and utime
independently of one another, it does not guarantee that the sum of
both is equal to the total time we started out with.
This confuses tools (and people) that look at this sum, like top, which
will report >100% usage followed by a matching period of 0%.
Rework the code to provide both individual monotonicity and a coherent
sum.
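A self-contained sketch of the adjustment logic described above (a plain C
model, not the kernel code; prev holds the values reported last time):

#include <stdint.h>

struct prev_cputime { uint64_t utime, stime; };

static void cputime_adjust_sketch(uint64_t utime, uint64_t stime, uint64_t rtime,
				  struct prev_cputime *prev,
				  uint64_t *ut, uint64_t *st)
{
	/* never report less than before, even if rtime went "backwards" */
	if (prev->utime + prev->stime >= rtime)
		goto out;

	if (stime + utime == 0) {
		stime = rtime;				/* nothing to scale */
	} else {
		/* split rtime proportionally to the stime:utime ratio;
		 * 128-bit intermediate (GCC/Clang extension) avoids overflow */
		stime = (unsigned __int128)rtime * stime / (stime + utime);
		if (stime < prev->stime)		/* keep stime monotonic */
			stime = prev->stime;
	}
	utime = rtime - stime;				/* sum == rtime */
	if (utime < prev->utime) {			/* keep utime monotonic */
		utime = prev->utime;
		stime = rtime - utime;
	}
	prev->stime = stime;
	prev->utime = utime;
out:
	*ut = prev->utime;
	*st = prev->stime;
}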
Suggested-by: Fredrik Markstrom <fredrik.markstrom@gmail.com >
Reported-by: Fredrik Markstrom <fredrik.markstrom@gmail.com >
Tested-by: Fredrik Markstrom <fredrik.markstrom@gmail.com >
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Cc: Frederic Weisbecker <fweisbec@gmail.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Mike Galbraith <efault@gmx.de >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Rik van Riel <riel@redhat.com >
Cc: Stanislaw Gruszka <sgruszka@redhat.com >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: jason.low2@hp.com
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2015-08-03 12:21:21 +02:00
Markus Elfring
781b020342
sched, sysctl: Delete an unnecessary check before unregister_sysctl_table()
...
The unregister_sysctl_table() function tests whether its argument is NULL
and then returns immediately. Thus the test around the call is not needed.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net >
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Mike Galbraith <efault@gmx.de >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Thomas Gleixner <tglx@linutronix.de >
Link: http://lkml.kernel.org/r/5597877E.3060503@users.sourceforge.net
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2015-08-03 12:21:21 +02:00
Xunlei Pang
3fe33bcdd3
sched/deadline: Remove a redundant condition from task_woken_dl()
...
'p' has been already queued at this point, so "!task_running(rq, p)"
and "p->nr_cpus_allowed > 1" imply that "has_pushable_dl_tasks(rq)"
is true, so it can be removed.
Signed-off-by: Xunlei Pang <pang.xunlei@linaro.org >
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Acked-by: Steven Rostedt <rostedt@goodmis.org >
Cc: Juri Lelli <juri.lelli@gmail.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Mike Galbraith <efault@gmx.de >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Thomas Gleixner <tglx@linutronix.de >
Link: http://lkml.kernel.org/r/1435995563-3723-2-git-send-email-xlpang@126.com
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2015-08-03 12:21:20 +02:00
Xunlei Pang
8fd373548e
sched/rt: Remove a redundant condition from task_woken_rt()
...
'p' has been already queued at this point, so "!task_running(rq, p)"
and "p->nr_cpus_allowed > 1" imply that "has_pushable_tasks(rq)" is
true, so it can be removed.
Signed-off-by: Xunlei Pang <pang.xunlei@linaro.org >
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Acked-by: Steven Rostedt <rostedt@goodmis.org >
Cc: Juri Lelli <juri.lelli@gmail.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Mike Galbraith <efault@gmx.de >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Thomas Gleixner <tglx@linutronix.de >
Link: http://lkml.kernel.org/r/1435995563-3723-1-git-send-email-xlpang@126.com
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2015-08-03 12:21:19 +02:00
Yuyang Du
985d3a4c11
sched/fair: Avoid pulling all tasks in idle balancing
...
In idle balancing, where a CPU going idle pulls tasks from another CPU,
a livelock may happen if the pulling CPU takes all tasks from the other
CPU, making that one idle in turn, and this iterates. So just avoid
pulling all of the tasks.
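The guard amounts to something like the following in the task-detach loop
(a sketch; the exact placement and field names are assumptions):

/* when balancing because this CPU is going idle, never empty the source
 * runqueue completely, or the two CPUs can ping-pong tasks forever */
if (env->idle != CPU_NOT_IDLE && env->src_rq->nr_running <= 1)
	break;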
Reported-by: Rabin Vincent <rabin.vincent@axis.com >
Signed-off-by: Yuyang Du <yuyang.du@intel.com >
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Cc: Ben Segall <bsegall@google.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Mike Galbraith <efault@gmx.de >
Cc: Mike Galbraith <umgwanakikbuti@gmail.com >
Cc: Morten Rasmussen <morten.rasmussen@arm.com >
Cc: Paul Turner <pjt@google.com >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Thomas Gleixner <tglx@linutronix.de >
Link: http://lkml.kernel.org/r/20150705221151.GF5197@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2015-08-03 12:21:19 +02:00
Lucas Stach
63caae8480
sched/idle: Move latency tracing stop/start calls deeper inside the idle loop
...
Make sure to stop tracing only once we are past a point where
all latency tracing events have been processed (irqs are not
enabled again). This has the slight advantage of capturing more
latency related events in the idle path, but most importantly it
makes sure that latency tracing doesn't get re-enabled
inadvertently when new events are coming in.
This makes the irqsoff latency tracer useful again, as we stop
capturing CPU sleep time as IRQ latency.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de >
Cc: Daniel Lezcano <daniel.lezcano@linaro.org >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Rafael J. Wysocki <rjw@rjwysocki.net >
Cc: Steven Rostedt <rostedt@goodmis.org >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: kernel@pengutronix.de
Cc: patchwork-lst@pengutronix.de
Link: http://lkml.kernel.org/r/1437410090-3747-1-git-send-email-l.stach@pengutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2015-07-21 08:18:51 +02:00
Byungchul Park
399595f248
sched/fair: Fix a comment reflecting function name change
...
update_cfs_rq_load_contribution() was changed to
__update_cfs_rq_tg_load_contrib() - sync up the comment in
calc_tg_weight() too.
Signed-off-by: Byungchul Park <byungchul.park@lge.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Thomas Gleixner <tglx@linutronix.de >
Link: http://lkml.kernel.org/r/1436187062-19658-1-git-send-email-byungchul.park@lge.com
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2015-07-07 08:46:11 +02:00
Boqun Feng
8e2b0bf397
sched/fair: Clean up the __sched_period() code
...
Since commit:
4bf0b77158 ("sched: remove do_div() from __sched_slice()")
... the logic of __sched_period() can be implemented as a single if-else
without any local variables, so this patch cleans it up with an if-else
statement, which expresses the function's logic straightforwardly.
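The cleaned-up function then reads roughly as follows (a sketch of the
resulting shape, not necessarily the literal patch):

static u64 __sched_period(unsigned long nr_running)
{
	if (unlikely(nr_running > sched_nr_latency))
		return nr_running * sysctl_sched_min_granularity;
	else
		return sysctl_sched_latency;
}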
Signed-off-by: Boqun Feng <boqun.feng@gmail.com >
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com >
Cc: Thomas Gleixner <tglx@linutronix.de >
Link: http://lkml.kernel.org/r/1435847152-29543-1-git-send-email-boqun.feng@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org >
2015-07-07 08:46:10 +02:00