If the machine is booted without any cpu_idle driver set
(b/c disable_cpuidle() has been called) we should follow
other users of cpu_idle API and check the return value
for NULL before using it.
Reported-and-tested-by: Mark van Dijk <mark@internecto.net>
Suggested-by: Jan Beulich <JBeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Pull ACPI & power management update from Len Brown:
"Re-write of the turbostat tool.
lower overhead was necessary for measuring very large system when
they are very idle.
IVB support in intel_idle
It's what I run on my IVB, others should be able to also:-)
ACPICA core update
We have found some bugs due to divergence between Linux and the
upstream ACPICA base. Most of these patches are to reduce that
divergence to reduce the risk of future bugs.
Some cpuidle updates, mostly for non-Intel
More will be coming, as they depend on this part.
Some thermal management changes needed by non-ACPI systems.
Some _OST (OS Status Indication) updates for hot ACPI hot-plug."
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: (51 commits)
Thermal: Documentation update
Thermal: Add Hysteresis attributes
Thermal: Make Thermal trip points writeable
ACPI/AC: prevent OOPS on some boxes due to missing check power_supply_register() return value check
tools/power: turbostat: fix large c1% issue
tools/power: turbostat v2 - re-write for efficiency
ACPICA: Update to version 20120711
ACPICA: AcpiSrc: Fix some translation issues for Linux conversion
ACPICA: Update header files copyrights to 2012
ACPICA: Add new ACPI table load/unload external interfaces
ACPICA: Split file: tbxface.c -> tbxfload.c
ACPICA: Add PCC address space to space ID decode function
ACPICA: Fix some comment fields
ACPICA: Table manager: deploy new firmware error/warning interfaces
ACPICA: Add new interfaces for BIOS(firmware) errors and warnings
ACPICA: Split exception code utilities to a new file, utexcep.c
ACPI: acpi_pad: tune round_robin_time
ACPICA: Update to version 20120620
ACPICA: Add support for implicit notify on multiple devices
ACPICA: Update comments; no functional change
...
When the system is booted with some cpus offline, the idle
driver is not initialized. When a cpu is set online, the
acpi code call the intel idle init function. Unfortunately
this code introduce a dependency between intel_idle and acpi.
This patch is intended to remove this dependency by using the
notifier of intel_idle. This patch has the benefit of
encapsulating the intel_idle driver and remove some exported
functions.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Many users of debugfs copy the implementation of default_open() when
they want to support a custom read/write function op. This leads to a
proliferation of the default_open() implementation across the entire
tree.
Now that the common implementation has been consolidated into libfs we
can replace all the users of this function with simple_open().
This replacement was done with the following semantic patch:
<smpl>
@ open @
identifier open_f != simple_open;
identifier i, f;
@@
-int open_f(struct inode *i, struct file *f)
-{
(
-if (i->i_private)
-f->private_data = i->i_private;
|
-f->private_data = i->i_private;
)
-return 0;
-}
@ has_open depends on open @
identifier fops;
identifier open.open_f;
@@
struct file_operations fops = {
...
-.open = open_f,
+.open = simple_open,
...
};
</smpl>
[akpm@linux-foundation.org: checkpatch fixes]
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Julia Lawall <Julia.Lawall@lip6.fr>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit b66b8b9a4a ('intel-idle: convert
to x86_cpu_id auto probing') added a distinction between Nehalem and
Westemere processors and changed auto_demotion_disable_flags for the
former to 0. This was not explained in the commit message, so change
it back.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Acked-by: Thomas Renninger <trenn@suse.de>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit b66b8b9a4a ('intel-idle: convert
to x86_cpu_id auto probing') put two entries for model 0x2f
(Westmere-EX Xeon) in the device ID table and left out model 0x2e
(Nehalem-EX Xeon).
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Acked-by: Thomas Renninger <trenn@suse.de>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This was done to resolve a merge and build problem with the
drivers/acpi/processor_driver.c file.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
With this it should be automatically loaded on suitable systems by
udev.
The old switch () is replaced with a table based approach, this
also cleans up the code.
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Renninger <trenn@suse.de>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Delay the setting up of features (cpuidle, throttling by calling
acpi_processor_start()) to the time when the hotplugged
core got onlined the first time and got fully
initialized.
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Len Brown <len.brown@intel.com>
kvm -cpu host passes the original cpuid info to the guest.
Latest kvm version seem to return true for mwait_leaf cpuid
function on recent Intel CPUs. But it does not return mwait
C-states (mwait_substates), instead zero is returned.
While real CPUs seem to always return non-zero values, the intel
idle driver should not get active in kvm (mwait_substates == 0)
case and bail out.
Otherwise a Null pointer exception will happen later when the
cpuidle subsystem tries to get active:
[0.984807] BUG: unable to handle kernel NULL pointer dereference at (null)
[0.984807] IP: [<(null)>] (null)
...
[0.984807][<ffffffff8143cf34>] ? cpuidle_idle_call+0xb4/0x340
[0.984807][<ffffffff8159e7bc>] ? __atomic_notifier_call_chain+0x4c/0x70
[0.984807][<ffffffff81001198>] ? cpu_idle+0x78/0xd0
Reference:
https://bugzilla.novell.com/show_bug.cgi?id=726296
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Renninger <trenn@suse.de>
CC: Bruno Friedmann <bruno@ioda-net.ch>
Signed-off-by: Len Brown <len.brown@intel.com>
Fix the following warning:
drivers/idle/intel_idle.c: In function 'intel_idle_cpuidle_devices_init':
drivers/idle/intel_idle.c:518:5: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
By making get_driver_data() return a long instead of an int.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
irq disabling happens earlier in process_32.c:cpu_idle. Basically,
cpuidle_state->enter is called, cpu irq is disabled. cpuidle_state->enter
would turn on irq when exiting.
intel_idle doesn't follow this assumption. Although it doesn't cause real
issue, it misleads developers. Remove the call to local_irq_disable() at
entry.
[akpm@linux-foundation.org: add comment]
Signed-off-by: Mingming Zhang <mingmingx.zhang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
smp_call_function() only lets all other CPUs execute a specific function,
while we expect all CPUs do in intel_idle. Without the fix, we could have
one cpu which has auto_demotion enabled or has no broadcast timer setup.
Usually we don't see impact because auto demotion just harms power and the
intel_idle init is called in CPU 0, where boradcast timer delivers
interrupt, but this still could be a problem.
Cc: stable@vger.kernel.org
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
cpuidle: Single/Global registration of idle states
cpuidle: Split cpuidle_state structure and move per-cpu statistics fields
cpuidle: Remove CPUIDLE_FLAG_IGNORE and dev->prepare()
cpuidle: Move dev->last_residency update to driver enter routine; remove dev->last_state
ACPI: Fix CONFIG_ACPI_DOCK=n compiler warning
ACPI: Export FADT pm_profile integer value to userspace
thermal: Prevent polling from happening during system suspend
ACPI: Drop ACPI_NO_HARDWARE_INIT
ACPI atomicio: Convert width in bits to bytes in __acpi_ioremap_fast()
PNPACPI: Simplify disabled resource registration
ACPI: Fix possible recursive locking in hwregs.c
ACPI: use kstrdup()
mrst pmu: update comment
tools/power turbostat: less verbose debugging
This patch makes the cpuidle_states structure global (single copy)
instead of per-cpu. The statistics needed on per-cpu basis
by the governor are kept per-cpu. This simplifies the cpuidle
subsystem as state registration is done by single cpu only.
Having single copy of cpuidle_states saves memory. Rare case
of asymmetric C-states can be handled within the cpuidle driver
and architectures such as POWER do not have asymmetric C-states.
Having single/global registration of all the idle states,
dynamic C-state transitions on x86 are handled by
the boot cpu. Here, the boot cpu would disable all the devices,
re-populate the states and later enable all the devices,
irrespective of the cpu that would receive the notification first.
Reference:
https://lkml.org/lkml/2011/4/25/83
Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Trinabh Gupta <g.trinabh@gmail.com>
Tested-by: Jean Pihet <j-pihet@ti.com>
Reviewed-by: Kevin Hilman <khilman@ti.com>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Len Brown <len.brown@intel.com>
This is the first step towards global registration of cpuidle
states. The statistics used primarily by the governor are per-cpu
and have to be split from rest of the fields inside cpuidle_state,
which would be made global i.e. single copy. The driver_data field
is also per-cpu and moved.
Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Trinabh Gupta <g.trinabh@gmail.com>
Tested-by: Jean Pihet <j-pihet@ti.com>
Reviewed-by: Kevin Hilman <khilman@ti.com>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Cpuidle governor only suggests the state to enter using the
governor->select() interface, but allows the low level driver to
override the recommended state. The actual entered state
may be different because of software or hardware demotion. Software
demotion is done by the back-end cpuidle driver and can be accounted
correctly. Current cpuidle code uses last_state field to capture the
actual state entered and based on that updates the statistics for the
state entered.
Ideally the driver enter routine should update the counters,
and it should return the state actually entered rather than the time
spent there. The generic cpuidle code should simply handle where
the counters live in the sysfs namespace, not updating the counters.
Reference:
https://lkml.org/lkml/2011/3/25/52
Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Trinabh Gupta <g.trinabh@gmail.com>
Tested-by: Jean Pihet <j-pihet@ti.com>
Reviewed-by: Kevin Hilman <khilman@ti.com>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Len Brown <len.brown@intel.com>
These files aren't just exporting symbols -- they are also defining
a MODULE_LICENSE etc. so give them the full module.h file.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Userspace apps might have to cut off parts off the
idle state name for display reasons.
Switch NHM-C1 to C1-NHM (and others) so that a cut off
name is unique and makes sense to the user.
Signed-off-by: Thomas Renninger <trenn@suse.de>
CC: lenb@kernel.org
Signed-off-by: Len Brown <len.brown@intel.com>
Just as we had to disable auto-demotion for NHM/WSM,
we need to do the same for Atom (Lincroft version).
In particular, auto-demotion will prevent Lincroft
from entering the S0i3 idle power saving state.
https://bugzilla.kernel.org/show_bug.cgi?id=25252
Signed-off-by: Len Brown <len.brown@intel.com>