Commit Graph

53 Commits

Author SHA1 Message Date
Linus Torvalds 3c00303206 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
  cpuidle: Single/Global registration of idle states
  cpuidle: Split cpuidle_state structure and move per-cpu statistics fields
  cpuidle: Remove CPUIDLE_FLAG_IGNORE and dev->prepare()
  cpuidle: Move dev->last_residency update to driver enter routine; remove dev->last_state
  ACPI: Fix CONFIG_ACPI_DOCK=n compiler warning
  ACPI: Export FADT pm_profile integer value to userspace
  thermal: Prevent polling from happening during system suspend
  ACPI: Drop ACPI_NO_HARDWARE_INIT
  ACPI atomicio: Convert width in bits to bytes in __acpi_ioremap_fast()
  PNPACPI: Simplify disabled resource registration
  ACPI: Fix possible recursive locking in hwregs.c
  ACPI: use kstrdup()
  mrst pmu: update comment
  tools/power turbostat: less verbose debugging
2011-11-07 10:13:52 -08:00
Deepthi Dharwar 46bcfad7a8 cpuidle: Single/Global registration of idle states
This patch makes the cpuidle_states structure global (single copy)
instead of per-cpu. The statistics needed on per-cpu basis
by the governor are kept per-cpu. This simplifies the cpuidle
subsystem as state registration is done by single cpu only.
Having single copy of cpuidle_states saves memory. Rare case
of asymmetric C-states can be handled within the cpuidle driver
and architectures such as POWER do not have asymmetric C-states.

Having single/global registration of all the idle states,
dynamic C-state transitions on x86 are handled by
the boot cpu. Here, the boot cpu  would disable all the devices,
re-populate the states and later enable all the devices,
irrespective of the cpu that would receive the notification first.

Reference:
https://lkml.org/lkml/2011/4/25/83

Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Trinabh Gupta <g.trinabh@gmail.com>
Tested-by: Jean Pihet <j-pihet@ti.com>
Reviewed-by: Kevin Hilman <khilman@ti.com>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-11-06 21:13:58 -05:00
Deepthi Dharwar 4202735e8a cpuidle: Split cpuidle_state structure and move per-cpu statistics fields
This is the first step towards global registration of cpuidle
states. The statistics used primarily by the governor are per-cpu
and have to be split from rest of the fields inside cpuidle_state,
which would be made global i.e. single copy. The driver_data field
is also per-cpu and moved.

Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Trinabh Gupta <g.trinabh@gmail.com>
Tested-by: Jean Pihet <j-pihet@ti.com>
Reviewed-by: Kevin Hilman <khilman@ti.com>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-11-06 21:13:49 -05:00
Deepthi Dharwar e978aa7d7d cpuidle: Move dev->last_residency update to driver enter routine; remove dev->last_state
Cpuidle governor only suggests the state to enter using the
governor->select() interface, but allows the low level driver to
override the recommended state. The actual entered state
may be different because of software or hardware demotion. Software
demotion is done by the back-end cpuidle driver and can be accounted
correctly. Current cpuidle code uses last_state field to capture the
actual state entered and based on that updates the statistics for the
state entered.

Ideally the driver enter routine should update the counters,
and it should return the state actually entered rather than the time
spent there. The generic cpuidle code should simply handle where
the counters live in the sysfs namespace, not updating the counters.

Reference:
https://lkml.org/lkml/2011/3/25/52

Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Trinabh Gupta <g.trinabh@gmail.com>
Tested-by: Jean Pihet <j-pihet@ti.com>
Reviewed-by: Kevin Hilman <khilman@ti.com>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-11-06 21:13:30 -05:00
Paul Gortmaker 7c52d55170 x86: fix up files really needing to include module.h
These files aren't just exporting symbols -- they are also defining
a MODULE_LICENSE etc. so give them the full module.h file.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:30:36 -04:00
Thomas Renninger 15e123e5d7 intel_idle: Rename cpuidle states
Userspace apps might have to cut off parts off the
idle state name for display reasons.
Switch NHM-C1 to C1-NHM (and others) so that a cut off
name is unique and makes sense to the user.

Signed-off-by: Thomas Renninger <trenn@suse.de>
CC: lenb@kernel.org
Signed-off-by: Len Brown <len.brown@intel.com>
2011-02-28 11:04:45 -05:00
Len Brown bfb53ccf1c intel_idle: disable Atom/Lincroft HW C-state auto-demotion
Just as we had to disable auto-demotion for NHM/WSM,
we need to do the same for Atom (Lincroft version).

In particular, auto-demotion will prevent Lincroft
from entering the S0i3 idle power saving state.

https://bugzilla.kernel.org/show_bug.cgi?id=25252

Signed-off-by: Len Brown <len.brown@intel.com>
2011-02-17 17:08:48 -05:00
Len Brown 14796fca2b intel_idle: disable NHM/WSM HW C-state auto-demotion
Hardware C-state auto-demotion is a mechanism where the HW overrides
the OS C-state request, instead demoting to a shallower state,
which is less expensive, but saves less power.

Modern Linux should generally get exactly the states it requests.
In particular, when a CPU is taken off-line, it must not be demoted, else
it can prevent the entire package from reaching deep C-states.

https://bugzilla.kernel.org/show_bug.cgi?id=25252

Signed-off-by: Len Brown <len.brown@intel.com>
2011-02-17 17:08:46 -05:00
Shaohua Li ec30f343d6 fix a shutdown regression in intel_idle
Fix a shutdown regression caused by 2a2d31c8dc ("intel_idle: open
broadcast clock event").  The clockevent framework can automatically
shutdown broadcast timers for hotremove CPUs.  And we get a shutdown
regression when we shutdown broadcast timer for hot remove CPU, so just
delete some code.

Also fix some section mismatch.

Reported-by: Ari Savolainen <ari.m.savolainen@gmail.com>
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Tested-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-01-25 05:57:34 +10:00
Len Brown 43952886f0 Merge branch 'cpuidle-perf-events' into idle-test 2011-01-12 18:06:19 -05:00
Len Brown 56dbed129d Merge branch 'linus' into idle-test 2011-01-12 18:06:06 -05:00
Thomas Renninger f77cfe4ea2 cpuidle/x86/perf: fix power:cpu_idle double end events and throw cpu_idle events from the cpuidle layer
Currently intel_idle and acpi_idle driver show double cpu_idle "exit idle"
events -> this patch fixes it and makes cpu_idle events throwing less complex.

It also introduces cpu_idle events for all architectures which use
the cpuidle subsystem, namely:
  - arch/arm/mach-at91/cpuidle.c
  - arch/arm/mach-davinci/cpuidle.c
  - arch/arm/mach-kirkwood/cpuidle.c
  - arch/arm/mach-omap2/cpuidle34xx.c
  - arch/drivers/acpi/processor_idle.c (for all cases, not only mwait)
  - arch/x86/kernel/process.c (did throw events before, but was a mess)
  - drivers/idle/intel_idle.c (did throw events before)

Convention should be:
Fire cpu_idle events inside the current pm_idle function (not somewhere
down the the callee tree) to keep things easy.

Current possible pm_idle functions in X86:
c1e_idle, poll_idle, cpuidle_idle_call, mwait_idle, default_idle
-> this is really easy is now.

This affects userspace:
The type field of the cpu_idle power event can now direclty get
mapped to:
/sys/devices/system/cpu/cpuX/cpuidle/stateX/{name,desc,usage,time,...}
instead of throwing very CPU/mwait specific values.
This change is not visible for the intel_idle driver.
For the acpi_idle driver it should only be visible if the vendor
misses out C-states in his BIOS.
Another (perf timechart) patch reads out cpuidle info of cpu_idle
events from:
/sys/.../cpuidle/stateX/*, then the cpuidle events are mapped
to the correct C-/cpuidle state again, even if e.g. vendors miss
out C-states in their BIOS and for example only export C1 and C3.
-> everything is fine.

Signed-off-by: Thomas Renninger <trenn@suse.de>
CC: Robert Schoene <robert.schoene@tu-dresden.de>
CC: Jean Pihet <j-pihet@ti.com>
CC: Arjan van de Ven <arjan@linux.intel.com>
CC: Ingo Molnar <mingo@elte.hu>
CC: Frederic Weisbecker <fweisbec@gmail.com>
CC: linux-pm@lists.linux-foundation.org
CC: linux-acpi@vger.kernel.org
CC: linux-kernel@vger.kernel.org
CC: linux-perf-users@vger.kernel.org
CC: linux-omap@vger.kernel.org
Signed-off-by: Len Brown <len.brown@intel.com>
2011-01-12 18:05:16 -05:00
Shaohua Li 2a2d31c8dc intel_idle: open broadcast clock event
Intel_idle driver uses CLOCK_EVT_NOTIFY_BROADCAST_ENTER
CLOCK_EVT_NOTIFY_BROADCAST_EXIT
for broadcast clock events. The _ENTER/_EXIT doesn't really open broadcast clock
events, please see processor_idle.c for an example. In some situation, this will
cause boot hang, because some CPUs enters idle but local APIC timer stalls.

Reported-and-tested-by: Yan Zheng <zheng.z.yan@intel.com>
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
cc: stable@kernel.org
Signed-off-by: Len Brown <len.brown@intel.com>
2011-01-12 12:47:34 -05:00
Len Brown 956d033fb2 cpuidle: CPUIDLE_FLAG_TLB_FLUSHED is specific to intel_idle
Signed-off-by: Len Brown <len.brown@intel.com>
2011-01-12 12:47:33 -05:00
Thomas Renninger d18960494f ACPI, intel_idle: Cleanup idle= internal variables
Having four variables for the same thing:
  idle_halt, idle_nomwait, force_mwait and boot_option_idle_overrides
is rather confusing and unnecessary complex.

if idle= boot param is passed, only set up one variable:
boot_option_idle_overrides

Introduces following functional changes/fixes:
  - intel_idle driver does not register if any idle=xy
    boot param is passed.
  - processor_idle.c will also not register a cpuidle driver
    and get active if idle=halt is passed.
    Before a cpuidle driver with one (C1, halt) state got registered
    Now the default_idle function will be used which finally uses
    the same idle call to enter sleep state (safe_halt()), but
    without registering a whole cpuidle driver.

That means idle= param will always avoid cpuidle drivers to register
with one exception (same behavior as before):
idle=nomwait
may still register acpi_idle cpuidle driver, but C1 will not use
mwait, but hlt. This can be a workaround for IO based deeper sleep
states where C1 mwait causes problems.

Signed-off-by: Thomas Renninger <trenn@suse.de>
cc: x86@kernel.org
Signed-off-by: Len Brown <len.brown@intel.com>
2011-01-12 12:47:30 -05:00
Len Brown ddbd550d50 intel_idle: update Sandy Bridge core C-state residency targets
Signed-off-by: Len Brown <len.brown@intel.com>
2011-01-12 12:47:29 -05:00
Thomas Renninger 25e41933b5 perf: Clean up power events by introducing new, more generic ones
Add these new power trace events:

 power:cpu_idle
 power:cpu_frequency
 power:machine_suspend

The old C-state/idle accounting events:
  power:power_start
  power:power_end

Have now a replacement (but we are still keeping the old
tracepoints for compatibility):

  power:cpu_idle

and
  power:power_frequency

is replaced with:
  power:cpu_frequency

power:machine_suspend is newly introduced.

Jean Pihet has a patch integrated into the generic layer
(kernel/power/suspend.c) which will make use of it.

the type= field got removed from both, it was never
used and the type is differed by the event type itself.

perf timechart userspace tool gets adjusted in a separate patch.

Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Jean Pihet <jean.pihet@newoldbits.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: rjw@sisk.pl
LKML-Reference: <1294073445-14812-3-git-send-email-trenn@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <1290072314-31155-2-git-send-email-trenn@suse.de>
2011-01-04 08:16:54 +01:00
Thomas Renninger 61a0d49c33 perf: Do not export power_frequency, but power_start event
power_frequency moved to drivers/cpufreq/cpufreq.c which has
to be compiled in, no need to export it.

intel_idle can a be module though...

Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Jean Pihet <jean.pihet@newoldbits.com>
Cc: Jean Pihet <j-pihet@ti.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: rjw@sisk.pl
LKML-Reference: <1294073445-14812-2-git-send-email-trenn@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <1290072314-31155-2-git-send-email-trenn@suse.de>
2011-01-04 08:16:54 +01:00
Len Brown 56b9aea3b7 intel_idle: recognize ARAT on WSM-EX
We erroneously ignored the Always Running APIC Timer on WSM-EX.
Move the check for ARAT down so that it can apply to any/all models.

Signed-off-by: Len Brown <len.brown@intel.com>
2010-12-02 01:19:32 -05:00
Linus Torvalds 27afe58fe6 Merge branch 'idle-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6
* 'idle-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6:
  intel_idle: do not use the LAPIC timer for ATOM C2
  intel_idle: add initial Sandy Bridge support
  acpi_idle: delete bogus data from cpuidle_state.power_usage
  intel_idle: delete bogus data from cpuidle_state.power_usage
  intel_idle: simplify test for leave_mm()
2010-10-26 17:28:07 -07:00
Len Brown c25d29952b intel_idle: do not use the LAPIC timer for ATOM C2
If we use the LAPIC timer during ATOM C2 on
some nvidia chisets, the system stalls.

https://bugzilla.kernel.org/show_bug.cgi?id=21032

Signed-off-by: Len Brown <len.brown@intel.com>
2010-10-26 15:45:17 -04:00
Len Brown 00527cc6bb Merge branch 'intel_idle+snb' into idle-release
Signed-off-by: Len Brown <len.brown@intel.com>
2010-10-23 02:35:23 -04:00
Len Brown d13780d439 intel_idle: add initial Sandy Bridge support
Signed-off-by: Len Brown <len.brown@intel.com>
2010-10-23 02:32:08 -04:00
Linus Torvalds 092e0e7e52 Merge branch 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl
* 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl:
  vfs: make no_llseek the default
  vfs: don't use BKL in default_llseek
  llseek: automatically add .llseek fop
  libfs: use generic_file_llseek for simple_attr
  mac80211: disallow seeks in minstrel debug code
  lirc: make chardev nonseekable
  viotape: use noop_llseek
  raw: use explicit llseek file operations
  ibmasmfs: use generic_file_llseek
  spufs: use llseek in all file operations
  arm/omap: use generic_file_llseek in iommu_debug
  lkdtm: use generic_file_llseek in debugfs
  net/wireless: use generic_file_llseek in debugfs
  drm: use noop_llseek
2010-10-22 10:52:56 -07:00
Linus Torvalds 2a8b67fb72 Merge branch 'x86-idle-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-idle-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, hotplug: In the MWAIT case of play_dead, CLFLUSH the cache line
  x86, hotplug: Move WBINVD back outside the play_dead loop
  x86, hotplug: Use mwait to offline a processor, fix the legacy case
  x86, mwait: Move mwait constants to a common header file
2010-10-21 13:45:38 -07:00