Commit Graph

91 Commits

Author SHA1 Message Date
Dave Jones
ddad65df00 [CPUFREQ] Fix some more CPU hotplug locking.
Lukewarm IQ detected in hotplug locking
BUG: warning at kernel/cpu.c:38/lock_cpu_hotplug()
[<b0134a42>] lock_cpu_hotplug+0x42/0x65
[<b02f8af1>] cpufreq_update_policy+0x25/0xad
[<b0358756>] kprobe_flush_task+0x18/0x40
[<b0355aab>] schedule+0x63f/0x68b
[<b01377c2>] __link_module+0x0/0x1f
[<b0119e7d>] __cond_resched+0x16/0x34
[<b03560bf>] cond_resched+0x26/0x31
[<b0355b0e>] wait_for_completion+0x17/0xb1
[<f965c547>] cpufreq_stat_cpu_callback+0x13/0x20 [cpufreq_stats]
[<f9670074>] cpufreq_stats_init+0x74/0x8b [cpufreq_stats]
[<b0137872>] sys_init_module+0x91/0x174
[<b0102c81>] sysenter_past_esp+0x56/0x79

As there are other places that call cpufreq_update_policy without
the hotplug lock, it seems better to keep the hotplug locking
at the lower level for the time being until this is revamped.

Signed-off-by: Dave Jones <davej@redhat.com>
2006-09-22 19:15:23 -04:00
Dave Jones
3906f4edee [CPUFREQ] Fix sparse warning in ondemand
drivers/cpufreq/cpufreq_ondemand.c:323:2: warning: Using plain integer as NULL pointer

Signed-off-by: Dave Jones <davej@redhat.com>
2006-09-05 17:15:47 -04:00
Adrian Bunk
b5ecf60fe6 [CPUFREQ] make drivers/cpufreq/cpufreq_ondemand.c:powersave_bias_target() static
This patch makes the needlessly global powersave_bias_target() static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Dave Jones <davej@redhat.com>
2006-08-14 01:18:54 -04:00
Alexey Starikovskiy
05ca0350e8 [CPUFREQ][2/2] ondemand: updated add powersave_bias tunable
ondemand selects the minimum frequency that can retire
a workload with negligible idle time -- ideally resulting in the highest
performance/power efficiency with negligible performance impact.

But on some systems and some workloads, this algorithm
is more performance biased than necessary, and
de-tuning it a bit to allow some performance impact
can save measurable power.

This patch adds a "powersave_bias" tunable to ondemand
to allow it to reduce its target frequency by a specified percent.

By default, the powersave_bias is 0 and has no effect.
powersave_bias is in units of 0.1%, so it has an effective range
of 1 through 1000, resulting in 0.1% to 100% impact.

In practice, users will not be able to detect a difference between
0.1% increments, but 1.0% increments turned out to be too large.
Also, the max value of 1000 (100%) would simply peg the system
in its deepest power saving P-state, unless the processor really has
a hardware P-state at 0Hz:-)

For example, If ondemand requests 2.0GHz based on utilization,
and powersave_bias=100, this code will knock 10% off the target
and seek  a target of 1.8GHz instead of 2.0GHz until the
next sampling.  If 1.8 is an exact match with an hardware frequency
we use it, otherwise we average our time between the frequency
next higher than 1.8 and next lower than 1.8.

Note that a user or administrative program can change powersave_bias
at run-time depending on how they expect the system to be used.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi at intel.com>
Signed-off-by: Alexey Starikovskiy <alexey.y.starikovskiy at intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2006-08-11 17:59:57 -04:00
Alexey Starikovskiy
1ce28d6b19 [CPUFREQ][1/2] ondemand: updated tune for hardware coordination
Try to make dbs_check_cpu() call on all CPUs at the same jiffy.
This will help when multiple cores share P-states via Hardware Coordination.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi at intel.com>
Signed-off-by: Alexey Starikovskiy <alexey.y.starikovskiy at intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2006-08-11 17:59:56 -04:00
Dave Jones
cd87847979 [CPUFREQ] Fix typo.
Signed-off-by: Dave Jones <davej@redhat.com>
2006-08-11 17:59:28 -04:00
Jeremy Fitzhardinge
ea71497020 [CPUFREQ] [2/2] demand load governor modules.
Demand-load cpufreq governor modules if needed.

Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Dave Jones <davej@redhat.com>
2006-07-31 18:37:06 -04:00
Jeremy Fitzhardinge
3bcb09a356 [CPUFREQ] [1/2] add __find_governor helper and clean up some error handling.
Adds a __find_governor() helper function to look up a governor by
name.  Also restructures some error handling to conform to the
"single-exit" model which is generally preferred for kernel code.

Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Dave Jones <davej@redhat.com>
2006-07-31 18:37:06 -04:00
Mattia Dongili
9c9a43ed27 [CPUFREQ] return error when failing to set minfreq
I just stumbled on this bug/feature, this is how to reproduce it:

# echo 450000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq
# echo 450000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq
# echo powersave > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
# cpufreq-info -p
450000 450000 powersave
# echo 1800000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq ; echo $?
0
# cpufreq-info -p
450000 450000 powersave

Here it is. The kernel refuses to set a min_freq higher than the
max_freq but it allows a max_freq lower than min_freq (lowering min_freq
also).

This behaviour is pretty straightforward (but undocumented) and it
doesn't return an error altough failing to accomplish the requested
action (set min_freq).
The problem (IMO) is basically that userspace is not allowed to set a
full policy atomically while the kernel always does that thus it must
enforce an ordering on operations.

The attached patch returns -EINVAL if trying to increase frequencies
starting from scaling_min_freq and documents the correct ordering of writes.

Signed-off-by: Mattia Dongili <malattia@linux.it>
Signed-off-by: Dominik Brodowski <linux at dominikbrodowski.net>
Signed-off-by: Dave Jones <davej@redhat.com>

--
2006-07-31 18:37:05 -04:00
Arjan van de Ven
153d7f3fca [PATCH] Reorganize the cpufreq cpu hotplug locking to not be totally bizare
The patch below moves the cpu hotplugging higher up in the cpufreq
layering; this is needed to avoid recursive taking of the cpu hotplug
lock and to otherwise detangle the mess.

The new rules are:
1. you must do lock_cpu_hotplug() around the following functions:
   __cpufreq_driver_target
   __cpufreq_governor (for CPUFREQ_GOV_LIMITS operation only)
   __cpufreq_set_policy
2. governer methods (.governer) must NOT take the lock_cpu_hotplug()
   lock in any way; they are called with the lock taken already
3. if your governer spawns a thread that does things, like calling
   __cpufreq_driver_target, your thread must honor rule #1.
4. the policy lock and other cpufreq internal locks nest within
   the lock_cpu_hotplug() lock.

I'm not entirely happy about how the __cpufreq_governor rule ended up
(conditional locking rule depending on the argument) but basically all
callers pass this as a constant so it's not too horrible.

The patch also removes the cpufreq_governor() function since during the
locking audit it turned out to be entirely unused (so no need to fix it)

The patch works on my testbox, but it could use more testing
(otoh... it can't be much worse than the current code)

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-26 07:21:40 -07:00
Linus Torvalds
2cd7cbdf4b [cpufreq] ondemand: make shutdown sequence more robust
Shutting down the ondemand policy was fraught with potential
problems, causing issues for SMP suspend (which wants to hot-
unplug) all but the last CPU.

This should fix at least the worst problems (divide-by-zero
and infinite wait for the workqueue to shut down).

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-23 12:05:00 -07:00
Dave Jones
a496e25dfb [PATCH] Fix cpufreq vs hotplug lockdep recursion.
[ There's some not quite baked bits in cpufreq-git right now
  so sending this on as a patch instead ]

On Thu, 2006-07-06 at 07:58 -0700, Tom London wrote:

> After installing .2356 I get this each time I boot:
> =======================================================
> [ INFO: possible circular locking dependency detected ]
> -------------------------------------------------------
> S06cpuspeed/1620 is trying to acquire lock:
>  (dbs_mutex){--..}, at: [<c060d6bb>] mutex_lock+0x21/0x24
>
> but task is already holding lock:
>  (cpucontrol){--..}, at: [<c060d6bb>] mutex_lock+0x21/0x24
>
> which lock already depends on the new lock.
>

make sure the cpu hotplug recursive mutex (yuck) is taken early in the
cpufreq codepaths to avoid a AB-BA deadlock.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-07 09:46:45 -07:00
Linus Torvalds
ca78f6baca Merge master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq
* master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq:
  Move workqueue exports to where the functions are defined.
  [CPUFREQ] Misc cleanups in ondemand.
  [CPUFREQ] Make ondemand sampling per CPU and remove the mutex usage in sampling path.
  [CPUFREQ] Add queue_delayed_work_on() interface for workqueues.
  [CPUFREQ] Remove slowdown from ondemand sampling path.
2006-07-04 14:00:26 -07:00
Jörn Engel
6ab3d5624e Remove obsolete #include <linux/config.h>
Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-30 19:25:36 +02:00
Venkatesh Pallipadi
ffac80e925 [CPUFREQ] Misc cleanups in ondemand.
Misc cleanups in ondemand. Should have zero functional impact.
Also adding Alexey as author.

Signed-off-by: Alexey Starikovskiy <alexey.y.starikovskiy@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2006-06-30 01:36:40 -04:00
Venkatesh Pallipadi
2f8a835c70 [CPUFREQ] Make ondemand sampling per CPU and remove the mutex usage in sampling path.
Make ondemand sampling per CPU and remove the mutex usage in sampling path.

Signed-off-by: Alexey Starikovskiy <alexey.y.starikovskiy@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2006-06-30 01:33:31 -04:00
Venkatesh Pallipadi
ccb2fe209d [CPUFREQ] Remove slowdown from ondemand sampling path.
Remove slowdown from ondemand sampling path. This reduces the code path length
in dbs_check_cpu() by half. slowdown was not used by ondemand by default.
If there are any user level tools that were using this tunable, they
may report error now.

Signed-off-by: Alexey Starikovskiy <alexey.y.starikovskiy@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2006-06-30 01:29:47 -04:00
Chandra Seetharaman
74b85f3790 [PATCH] cpu hotplug: make cpu_notifier related notifier blocks __cpuinit only
Make notifier_blocks associated with cpu_notifier as __cpuinitdata.

__cpuinitdata makes sure that the data is init time only unless
CONFIG_HOTPLUG_CPU is defined.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:41 -07:00
Chandra Seetharaman
65edc68c34 [PATCH] cpu hotplug: make [un]register_cpu_notifier init time only
CPUs come online only at init time (unless CONFIG_HOTPLUG_CPU is defined).
So, cpu_notifier functionality need to be available only at init time.

This patch makes register_cpu_notifier() available only at init time, unless
CONFIG_HOTPLUG_CPU is defined.

This patch exports register_cpu_notifier() and unregister_cpu_notifier() only
if CONFIG_HOTPLUG_CPU is defined.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:41 -07:00
Chandra Seetharaman
9c7b216d23 [PATCH] cpu hotplug: revert init patch submitted for 2.6.17
In 2.6.17, there was a problem with cpu_notifiers and XFS.  I provided a
band-aid solution to solve that problem.  In the process, i undid all the
changes you both were making to ensure that these notifiers were available
only at init time (unless CONFIG_HOTPLUG_CPU is defined).

We deferred the real fix to 2.6.18.  Here is a set of patches that fixes the
XFS problem cleanly and makes the cpu notifiers available only at init time
(unless CONFIG_HOTPLUG_CPU is defined).

If CONFIG_HOTPLUG_CPU is defined then cpu notifiers are available at run
time.

This patch reverts the notifier_call changes made in 2.6.17

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:40 -07:00
Andrew Morton
138a0128c0 [PATCH] cpufreq build fix
drivers/cpufreq/cpufreq_ondemand.c: In function 'do_dbs_timer':
drivers/cpufreq/cpufreq_ondemand.c:374: warning: implicit declaration of function 'lock_cpu_hotplug'
drivers/cpufreq/cpufreq_ondemand.c:381: warning: implicit declaration of function 'unlock_cpu_hotplug'
drivers/cpufreq/cpufreq_conservative.c: In function 'do_dbs_timer':
drivers/cpufreq/cpufreq_conservative.c:425: warning: implicit declaration of function 'lock_cpu_hotplug'
drivers/cpufreq/cpufreq_conservative.c:432: warning: implicit declaration of function 'unlock_cpu_hotplug'

Cc: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 08:47:27 -07:00
Venkatesh Pallipadi
4ec223d02f [CPUFREQ] Fix ondemand vs suspend deadlock
Rootcaused the bug to a deadlock in cpufreq and ondemand. Due to non-existent
ordering between cpu_hotplug lock and dbs_mutex. Basically a race condition
between cpu_down() and do_dbs_timer().

cpu_down() flow:
* cpu_down() call for CPU 1
* Takes hot plug lock
* Calls pre down notifier
*     cpufreq notifier handler calls cpufreq_driver_target() which takes
      cpu_hotplug lock again. OK as cpu_hotplug lock is recursive in same
      process context
* CPU 1 goes down
* Calls post down notifier
*     cpufreq notifier handler calls ondemand event stop which takes dbs_mutex

So, cpu_hotplug lock is taken before dbs_mutex in this flow.

do_dbs_timer is triggerred by a periodic timer event.
It first takes dbs_mutex and then takes cpu_hotplug lock in
cpufreq_driver_target().
Note the reverse order here compared to above. So, if this timer event happens
at right moment during cpu_down, system will deadlok.

Attached patch fixes the issue for both ondemand and conservative.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2006-06-21 18:30:26 -04:00
Jan Beulich
b10eec2246 [CPUFREQ] cpufreq core {d,}printk adjustments
Remove KERN_* suffixes from some cpufreq driver's dprintk-s.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2006-06-04 19:47:38 -04:00
Dave Jones
484944a5b0 [CPUFREQ] Remove more freq_table reinitialisations.
Signed-off-by: Dave Jones <davej@redhat.com>
2006-05-30 18:09:31 -04:00
Dave Jones
5557976ca9 [CPUFREQ] Fix another redundant initialisation in freq_table
Signed-off-by: Dave Jones <davej@redhat.com>
2006-05-30 17:59:48 -04:00