Commit Graph

811 Commits

Author SHA1 Message Date
Rafael J. Wysocki
75c0758137 acpi-cpufreq: Fail initialization if driver cannot be registered
Make acpi_cpufreq_init() return error codes when the driver cannot be
registered so that the module doesn't stay useless in memory and so
that acpi_cpufreq_exit() doesn't attempt to unregister things that
have never been registered when the module is unloaded.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2013-10-25 16:22:47 +02:00
Dirk Brandewie
7244cb62d9 intel_pstate: Correct calculation of min pstate value
The minimum pstate is supposed to be a percentage of the maximum P
state available. Calculate min using the max pstate and not the
current max, which may have been limited by the user.

Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-10-22 01:16:39 +02:00
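The effect described in 7244cb62d9 can be sketched with plain integer arithmetic. The helper name and the sample values are hypothetical and chosen only for illustration; they are not taken from intel_pstate.c.

```c
/* Illustrative sketch only; the name and values are hypothetical. */

/* Minimum P state expressed as a percentage of a reference P state. */
static int min_pstate_from(int reference_pstate, int min_perf_pct)
{
    return reference_pstate * min_perf_pct / 100;
}
```

With a hardware maximum of 30 and a 50% minimum, min_pstate_from(30, 50) gives 15. If the reference were instead a user-limited maximum of 20, the result would drop to 10, which is the kind of drift the commit message describes.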
Brennan Shacklett
d253d2a526 intel_pstate: Improve accuracy by not truncating until final result
This patch addresses Bug 60727
(https://bugzilla.kernel.org/show_bug.cgi?id=60727),
which was caused by the truncation of intermediate values in the
calculations. The truncation made the code consistently underestimate
the current cpu frequency; specifically, 100% cpu utilization was
truncated down to the setpoint of 97%. This patch fixes the problem by
keeping the results of all intermediate calculations as fixed point
numbers rather than scaling them back and forth between integers and
fixed point.

References: https://bugzilla.kernel.org/show_bug.cgi?id=60727
Signed-off-by: Brennan Shacklett <bpshacklett@gmail.com>
Acked-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-10-22 01:15:38 +02:00
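The truncation problem described in d253d2a526 can be illustrated with a small userspace sketch. The 8 fractional bits, the helper names and the two-step calculation are assumptions made for this example; they are not a copy of the driver's code.

```c
#include <stdint.h>

/* Assumption for this sketch: 8 fractional bits. */
#define FRAC_BITS 8

static int64_t int_tofp(int64_t x) { return x << FRAC_BITS; }
static int64_t fp_toint(int64_t x) { return x >> FRAC_BITS; }
static int64_t mul_fp(int64_t x, int64_t y) { return (x * y) >> FRAC_BITS; }
static int64_t div_fp(int64_t x, int64_t y) { return (x << FRAC_BITS) / y; }

/* Truncating variant: every intermediate result is cut back to an integer. */
static int64_t scaled_pct_truncating(int64_t busy, int64_t total,
                                     int64_t num, int64_t den)
{
    int64_t pct = 100 * busy / total;   /* e.g. 99.5 becomes 99 */
    return pct * num / den;
}

/* Fixed-point variant: convert back to an integer only once, at the end. */
static int64_t scaled_pct_fixed_point(int64_t busy, int64_t total,
                                      int64_t num, int64_t den)
{
    int64_t pct = div_fp(int_tofp(100 * busy), int_tofp(total));
    int64_t ratio = div_fp(int_tofp(num), int_tofp(den));
    return fp_toint(mul_fp(pct, ratio));
}
```

For busy=199, total=200 and a scaling ratio of 3/2, the exact value is 149.25. The truncating variant returns 148, while the fixed-point variant returns 149.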
Charles Keepax
0e8244322b cpufreq: s3c64xx: Rename index to driver_data
The index field of cpufreq_frequency_table has been renamed to
driver_data by commit 5070158 (cpufreq: rename index as driver_data
in cpufreq_frequency_table).

This patch updates the s3c64xx driver to match.

Signed-off-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Cc: 3.11+ <stable@vger.kernel.org> # 3.11+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-10-16 23:53:38 +02:00
Rafael J. Wysocki
09c87e2f79 intel_pstate: Fix type mismatch warning
The expression in line 398 of intel_pstate.c causes the following
warning to be emitted:

drivers/cpufreq/intel_pstate.c:398:3: warning: left shift count >= width of type

which happens because unsigned long is 32-bit on some architectures.

Fix that by using a helper u64 variable and simplify the code
slightly.

Tested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-10-16 22:59:33 +02:00
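The kind of fix described in 09c87e2f79 (widening to a 64-bit helper variable before shifting) can be sketched as follows. The function name and parameters are hypothetical; the actual expression in intel_pstate.c is not reproduced here.

```c
#include <stdint.h>

/*
 * Hypothetical helper: shift a sample left by up to 63 bits.
 * Shifting an unsigned long directly would be undefined where
 * unsigned long is 32 bits wide and the shift count is 32 or more.
 */
static uint64_t shift_sample(unsigned long sample, unsigned int shift)
{
    uint64_t wide = sample;   /* widen first so the shift is done in 64 bits */

    return wide << shift;     /* caller must keep shift below 64 */
}
```

Because the shift operates on a uint64_t, the result is the same whether unsigned long is 32 or 64 bits wide.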
Dirk Brandewie
52e0a509e5 cpufreq / intel_pstate: Fix max_perf_pct on resume
If the system is suspended while max_perf_pct is less than 100 percent
or no_turbo is set, policy->{min,max} will be set incorrectly with scaled
values, which turns the scaled values into hard limits.

References: https://bugzilla.kernel.org/show_bug.cgi?id=61241
Reported-by: Patrick Bartels <petzicus@googlemail.com>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Cc: 3.9+ <stable@vger.kernel.org> # 3.9+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-10-16 01:41:46 +02:00
Srinivas Pandruvada
1ccf7a1cda intel_pstate: fix no_turbo
Even when the no_turbo sysfs attribute is set, some P states in the
turbo region are still observed. This patch sets the IDA Engage bit
when no_turbo is set to explicitly disengage turbo.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-10-01 22:51:11 +02:00
Philipp Zabel
43c638e3dd cpufreq: cpufreq-cpu0: NULL is a valid regulator, part 2
Since the patch "cpufreq: cpufreq-cpu0: NULL is a valid regulator", cpu_reg
contains an error value if the regulator is not set, instead of NULL.
Accordingly, fix the remaining check for non-NULL cpu_reg.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-30 20:08:59 +02:00
Sachin Kamat
bb25f13aed cpufreq: SPEAr: Fix incorrect variable type
'clk_round_rate' returns a negative error code upon failure. This
will never get detected by unsigned 'newfreq'. Make it signed.

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-30 20:05:43 +02:00
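The bug class fixed in bb25f13aed can be reproduced in a few lines of userspace C. fake_round_rate() below is a hypothetical stand-in for a call that returns a negative error code on failure; it is not the clk API.

```c
#include <errno.h>

/* Hypothetical stand-in: returns a negative errno value on failure. */
static long fake_round_rate(long requested_hz)
{
    if (requested_hz <= 0)
        return -EINVAL;
    return requested_hz;
}

/* Buggy pattern: an unsigned variable can never compare below zero. */
static int failure_seen_unsigned(long requested_hz)
{
    unsigned long newfreq = (unsigned long)fake_round_rate(requested_hz);

    return newfreq < 0;   /* always 0, so the error goes unnoticed */
}

/* Fixed pattern: a signed variable preserves the negative error code. */
static int failure_seen_signed(long requested_hz)
{
    long newfreq = fake_round_rate(requested_hz);

    return newfreq < 0;
}
```

Some compilers warn that the unsigned comparison is always false, which is a useful hint that the variable type is wrong.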
Sachin Kamat
116decb7e4 cpufreq: exynos5440: Fix potential NULL pointer dereference
If 'dvfs_info' is NULL (due to devm_kzalloc failure) the failure
error message would try to dereference it. Use 'pdev' instead.

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-25 03:25:58 +02:00
Viresh Kumar
26ca869434 cpufreq: check cpufreq driver is valid and cpufreq isn't disabled in cpufreq_get()
cpufreq_get() can be called from external drivers which might not be aware
whether a cpufreq driver is registered. So we should check, at the beginning
of cpufreq_get(), that a cpufreq driver is registered and that cpufreq is
not disabled.

Otherwise the call to lock_policy_rwsem_read() might hit BUG_ON(!policy).

Reported-and-tested-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-25 03:24:02 +02:00
Yinghai Lu
8a61e12e84 acpi-cpufreq: skip loading acpi_cpufreq after intel_pstate
If the hw supports intel_pstate and acpi_cpufreq, intel_pstate will
get loaded first.

acpi_cpufreq_init() will call acpi_cpufreq_early_init(),
which will allocate perf data and initialize it in the ACPI core
(covering all CPUs). But later it will free that data, because
cpufreq_register_driver(acpi_cpufreq) will fail as intel_pstate is
already registered.

Use cpufreq_get_current_driver() to check if we can skip the
acpi_cpufreq loading.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-25 03:19:09 +02:00
Yinghai Lu
4dea5806d3 cpufreq: return EEXIST instead of EBUSY for second registering
On systems that support intel_pstate, acpi_cpufreq fails to load, and
udev keeps trying until the trace gets filled up and the kernel crashes.

The root cause is that the driver returns ret from cpufreq_register_driver():
when some other driver has already taken over, it returns EBUSY and
then udev keeps trying ...

cpufreq_register_driver() should return EEXIST instead so that the
system can boot without appending intel_pstate=disable and still use
intel_pstate.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-20 00:37:10 +02:00
Sudeep KarkadaNagesha
b494b48dac cpufreq: imx6q-cpufreq: assign cpu_dev correctly to cpu0 device
Commit cdc58d602d "cpufreq: imx6q-cpufreq:
remove device tree parsing for cpu nodes" assumed the pdev->dev is set to
cpu0 device in the platform code. But it actually points to the virtual
cpufreq-cpu0 platform device which is not present in the device tree.
Most of the information needed by cpufreq is stored in cpu0 DT node.
So cpu_dev must point to cpu0 device.

This patch fixes the wrong assignment to cpu_dev.

Reported-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Tested-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-19 03:53:43 +02:00
Sudeep KarkadaNagesha
e1825b2530 cpufreq: cpufreq-cpu0: assign cpu_dev correctly to cpu0 device
Commit f837a9b5ab "cpufreq: cpufreq-cpu0:
remove device tree parsing for cpu nodes" assumed the pdev->dev is set to
cpu0 device in the platform code. But it actually points to the virtual
cpufreq-cpu0 platform device which is not present in the device tree.
Most of the information needed by cpufreq is stored in cpu0 DT node.
So cpu_dev must point to cpu0 device.

This patch fixes the wrong assignment to cpu_dev.

Reported-and-tested-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Cc: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-19 03:53:43 +02:00
Viresh Kumar
8efd57657d cpufreq: unlock correct rwsem while updating policy->cpu
Current code looks like this:

        WARN_ON(lock_policy_rwsem_write(cpu));
        update_policy_cpu(policy, new_cpu);
        unlock_policy_rwsem_write(cpu);

{lock|unlock}_policy_rwsem_write(cpu) takes/releases policy->cpu's rwsem.
Because cpu is changing with the call to update_policy_cpu(), the
unlock_policy_rwsem_write() will release the incorrect lock.

The right solution would be to release the same lock as was taken earlier.
Also, update_policy_cpu() was called from cpufreq_add_dev() without any locks,
so it's better to move this locking inside update_policy_cpu().

This patch fixes a regression introduced in 3.12 by commit f9ba680d23
(cpufreq: Extract the handover of policy cpu to a helper function).

Reported-and-tested-by: Jon Medhurst <tixy@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-18 00:01:52 +02:00
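The locking mistake described in 8efd57657d can be modelled in userspace with one flag per CPU standing in for each rwsem. All names here (demo_policy, demo_lock, handover_buggy, handover_fixed) are hypothetical, and the sketch only shows the general idea of releasing the lock that was actually taken.

```c
#define DEMO_NR_CPUS 4

struct demo_policy {
    int cpu;                          /* CPU that currently owns the policy */
};

static int lock_held[DEMO_NR_CPUS];   /* 1 while that CPU's demo lock is held */

static void demo_lock(int cpu)   { lock_held[cpu] = 1; }
static void demo_unlock(int cpu) { lock_held[cpu] = 0; }

/* Buggy pattern: the lock is looked up via policy->cpu on both sides. */
static void handover_buggy(struct demo_policy *p, int new_cpu)
{
    demo_lock(p->cpu);
    p->cpu = new_cpu;
    demo_unlock(p->cpu);   /* releases new_cpu's lock, not the one taken */
}

/* Fixed pattern: remember which lock was taken and release that one. */
static void handover_fixed(struct demo_policy *p, int new_cpu)
{
    int last_cpu = p->cpu;

    demo_lock(last_cpu);
    p->cpu = new_cpu;
    demo_unlock(last_cpu);
}
```

After handover_buggy() moves a policy from CPU 0 to CPU 1, the flag for CPU 0 stays set. After handover_fixed(), no flag remains set.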
Viresh Kumar
9c8f1ee40b cpufreq: Clear policy->cpus bits in __cpufreq_remove_dev_finish()
This broke after a recent change "cedb70a cpufreq: Split __cpufreq_remove_dev()
into two parts" from Srivatsa.

Consider a scenario where we have two CPUs in a policy (0 & 1) and we are
removing CPU 1. On the call to __cpufreq_remove_dev_prepare() we have cleared 1
from policy->cpus and now on a call to __cpufreq_remove_dev_finish() we read
cpumask_weight of policy->cpus, which will come as 1 and this code will behave
as if we are removing the last CPU from policy :)

Fix it by clearing the CPU mask in __cpufreq_remove_dev_finish() instead of
__cpufreq_remove_dev_prepare().

Tested-by: Stephen Warren <swarren@wwwdotorg.org>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-18 00:01:27 +02:00
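The two-CPU scenario from 9c8f1ee40b can be replayed with a plain bitmask in place of policy->cpus. The helper names are hypothetical, and the prepare/finish split is collapsed into single functions for brevity.

```c
/* Count the set bits in a small CPU mask. */
static int mask_weight(unsigned int mask)
{
    int n = 0;

    while (mask) {
        n += mask & 1u;
        mask >>= 1;
    }
    return n;
}

/* Buggy order: the bit is cleared (as in "prepare") before the weight is read. */
static int looks_like_last_cpu_buggy(unsigned int *cpus, int cpu)
{
    *cpus &= ~(1u << cpu);
    return mask_weight(*cpus) == 1;
}

/* Fixed order: the weight is read first, then the bit is cleared. */
static int looks_like_last_cpu_fixed(unsigned int *cpus, int cpu)
{
    int weight = mask_weight(*cpus);

    *cpus &= ~(1u << cpu);
    return weight == 1;
}
```

With CPUs 0 and 1 in the mask (0x3) and CPU 1 being removed, the buggy order reports "last CPU" even though CPU 0 is still present; the fixed order does not.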
Rafael J. Wysocki
f1728fd159 Merge branch 'pm-cpufreq'
* pm-cpufreq:
  cpufreq: Acquire the lock in cpufreq_policy_restore() for reading
  cpufreq: Prevent problems in update_policy_cpu() if last_cpu == new_cpu
  cpufreq: Restructure if/else block to avoid unintended behavior
  cpufreq: Fix crash in cpufreq-stats during suspend/resume
2013-09-12 13:04:11 +02:00
Lan Tianyu
44871c9c7f cpufreq: Acquire the lock in cpufreq_policy_restore() for reading
In cpufreq_policy_restore(), the policy saved before system suspend is
read from the per-CPU cpufreq_cpu_data_fallback. It's a read operation
rather than a write one, so take the lock for reading there.

Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-11 23:30:03 +02:00
Srivatsa S. Bhat
cb38ed5cf1 cpufreq: Prevent problems in update_policy_cpu() if last_cpu == new_cpu
If update_policy_cpu() is invoked with the existing policy->cpu itself
as the new-cpu parameter, then a lot of things can go terribly wrong.

In its present form, update_policy_cpu() always assumes that the new-cpu
is different from policy->cpu and invokes other functions to perform their
respective updates. And those functions implement the actual update like
this:

per_cpu(..., new_cpu) = per_cpu(..., last_cpu);
per_cpu(..., last_cpu) = NULL;

Thus, when new_cpu == last_cpu, the final NULL assignment makes the per-cpu
references vanish into thin air! (memory leak). From there, it leads to more
problems: cpufreq_stats_create_table() now doesn't find the per-cpu reference
and hence tries to create a new sysfs-group; but sysfs already had created
the group earlier, so it complains that it cannot create a duplicate filename.
In short, the repercussions of a rather innocuous invocation of
update_policy_cpu() can turn out to be pretty nasty.

Ideally update_policy_cpu() should handle this situation (new == last)
gracefully, and not lead to such severe problems. So fix it by adding an
appropriate check.

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Tested-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-11 23:29:57 +02:00
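The move-then-clear pattern from cb38ed5cf1 and the effect of an equality check can be sketched with an ordinary array standing in for the per-cpu references. The array and function names are hypothetical.

```c
#include <stddef.h>

#define DEMO_CPUS 4

/* Stand-in for a per-cpu reference, one slot per CPU. */
static const char *stats_ref[DEMO_CPUS];

/* Unchecked move: loses the reference when last_cpu == new_cpu. */
static void move_ref_unchecked(int last_cpu, int new_cpu)
{
    stats_ref[new_cpu] = stats_ref[last_cpu];
    stats_ref[last_cpu] = NULL;
}

/* Checked move: nothing to do when the CPU does not change. */
static void move_ref_checked(int last_cpu, int new_cpu)
{
    if (last_cpu == new_cpu)
        return;
    move_ref_unchecked(last_cpu, new_cpu);
}
```

Calling move_ref_unchecked(2, 2) leaves stats_ref[2] set to NULL, while move_ref_checked(2, 2) keeps the reference intact.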
Srivatsa S. Bhat
61173f256a cpufreq: Restructure if/else block to avoid unintended behavior
In __cpufreq_remove_dev_prepare(), the code which decides whether to remove
the sysfs link or nominate a new policy cpu, is governed by an if/else block
with a rather complex set of conditionals. Worse, they harbor a subtlety
which leads to certain unintended behavior.

The code looks like this:

        if (cpu != policy->cpu && !frozen) {
                sysfs_remove_link(&dev->kobj, "cpufreq");
        } else if (cpus > 1) {
                new_cpu = cpufreq_nominate_new_policy_cpu(...);
                ...
                update_policy_cpu(..., new_cpu);
        }

The original intention was:
If the CPU going offline is not policy->cpu, just remove the link.
On the other hand, if the CPU going offline is the policy->cpu itself,
hand over the policy->cpu job to some other surviving CPU in that policy.

But because the 'if' condition also includes the 'frozen' check, now there
are *two* possibilities by which we can enter the 'else' block:

1. cpu == policy->cpu (intended)
2. cpu != policy->cpu && frozen (unintended)

Due to the second (unintended) scenario, we end up spuriously nominating
a CPU as the policy->cpu, even when the existing policy->cpu is alive and
well. This can cause problems further down the line, especially when we end
up nominating the same policy->cpu as the new one (i.e., old == new),
because it totally confuses update_policy_cpu().

To avoid this mess, restructure the if/else block to only do what was
originally intended, and thus prevent any unwelcome surprises.

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Tested-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-11 23:29:57 +02:00
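The two ways into the 'else' block described in 61173f256a can be checked with a small decision function. The restructured variant below is one possible arrangement consistent with the stated intention; it is a sketch with hypothetical names, not a copy of the patch.

```c
enum demo_action { DEMO_NONE, DEMO_REMOVE_LINK, DEMO_NOMINATE };

/* Original shape: 'frozen' is part of the first condition. */
static enum demo_action action_original(int cpu, int policy_cpu,
                                        int frozen, int cpus)
{
    if (cpu != policy_cpu && !frozen)
        return DEMO_REMOVE_LINK;
    else if (cpus > 1)
        return DEMO_NOMINATE;   /* also reached when cpu != policy_cpu && frozen */
    return DEMO_NONE;
}

/* Restructured shape: only the policy CPU itself can trigger a nomination. */
static enum demo_action action_restructured(int cpu, int policy_cpu,
                                            int frozen, int cpus)
{
    if (cpu != policy_cpu) {
        if (!frozen)
            return DEMO_REMOVE_LINK;
    } else if (cpus > 1) {
        return DEMO_NOMINATE;
    }
    return DEMO_NONE;
}
```

For a frozen offline of CPU 1 while CPU 0 is the policy CPU, action_original() returns DEMO_NOMINATE (the unintended case), whereas action_restructured() returns DEMO_NONE.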
Srivatsa S. Bhat
0d66b91ebf cpufreq: Fix crash in cpufreq-stats during suspend/resume
Stephen Warren reported that the cpufreq-stats code hits a NULL pointer
dereference during the second attempt to suspend a system. He also
pin-pointed the problem to commit 5302c3f "cpufreq: Perform light-weight
init/teardown during suspend/resume".

That commit actually ensured that the cpufreq-stats table and the
cpufreq-stats sysfs entries are *not* torn down (i.e., not freed) during
suspend/resume, which makes it all the more surprising. However, it turns
out that the root cause is not that we access already freed memory, but
that the reference to the allocated memory gets moved around and we lose
track of that during resume, leading to the reported crash in a subsequent
suspend attempt.

In the suspend path, during CPU offline, the value of policy->cpu is updated
by choosing one of the surviving CPUs in that policy, as long as there is
at least one CPU in that policy. And cpufreq_stats_update_policy_cpu() is
invoked to update the reference to the stats structure by assigning it to
the new CPU. However, in the resume path, during CPU online, we end up
assigning a fresh CPU as the policy->cpu, without letting cpufreq-stats
know about this. Thus the reference to the stats structure remains
(incorrectly) associated with the old CPU. So, in a subsequent suspend attempt,
during CPU offline, we end up accessing an incorrect location to get the
stats structure, which eventually leads to the NULL pointer dereference.

Fix this by letting cpufreq-stats know about the update of the policy->cpu
during CPU online in the resume path. (Also, move the update_policy_cpu()
function higher up in the file, so that __cpufreq_add_dev() can invoke
it).

Reported-and-tested-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-11 23:29:57 +02:00
Rafael J. Wysocki
0df03a30c3 Merge branch 'pm-cpufreq'
* pm-cpufreq:
  intel_pstate: Add Haswell CPU models
  Revert "cpufreq: make sure frequency transitions are serialized"
  cpufreq: Use signed type for 'ret' variable, to store negative error values
  cpufreq: Remove temporary fix for race between CPU hotplug and sysfs-writes
  cpufreq: Synchronize the cpufreq store_*() routines with CPU hotplug
  cpufreq: Invoke __cpufreq_remove_dev_finish() after releasing cpu_hotplug.lock
  cpufreq: Split __cpufreq_remove_dev() into two parts
  cpufreq: Fix wrong time unit conversion
  cpufreq: serialize calls to __cpufreq_governor()
  cpufreq: don't allow governor limits to be changed when it is disabled
2013-09-11 15:23:15 +02:00
Nell Hardcastle
6cdcdb7937 intel_pstate: Add Haswell CPU models
Enable the intel_pstate driver for Haswell CPUs. One missing Ivy Bridge
model (0x3E) is also included. Models referenced from
tools/power/x86/turbostat/turbostat.c:has_nehalem_turbo_ratio_limit

Signed-off-by: Nell Hardcastle <nell@spicious.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-10 23:10:39 +02:00
Rafael J. Wysocki
798282a871 Revert "cpufreq: make sure frequency transitions are serialized"
Commit 7c30ed5 (cpufreq: make sure frequency transitions are
serialized) attempted to serialize frequency transitions by
adding checks to the CPUFREQ_PRECHANGE and CPUFREQ_POSTCHANGE
notifications.  However, it assumed that the notifications will
always originate from the driver's .target() callback, but they
also can be triggered by cpufreq_out_of_sync() and that leads to
warnings like this on some systems:

 WARNING: CPU: 0 PID: 14543 at drivers/cpufreq/cpufreq.c:317
 __cpufreq_notify_transition+0x238/0x260()
 In middle of another frequency transition

accompanied by a call trace similar to this one:

 [<ffffffff81720daa>] dump_stack+0x46/0x58
 [<ffffffff8106534c>] warn_slowpath_common+0x8c/0xc0
 [<ffffffff815b8560>] ? acpi_cpufreq_target+0x320/0x320
 [<ffffffff81065436>] warn_slowpath_fmt+0x46/0x50
 [<ffffffff815b1ec8>] __cpufreq_notify_transition+0x238/0x260
 [<ffffffff815b33be>] cpufreq_notify_transition+0x3e/0x70
 [<ffffffff815b345d>] cpufreq_out_of_sync+0x6d/0xb0
 [<ffffffff815b370c>] cpufreq_update_policy+0x10c/0x160
 [<ffffffff815b3760>] ? cpufreq_update_policy+0x160/0x160
 [<ffffffff81413813>] cpufreq_set_cur_state+0x8c/0xb5
 [<ffffffff814138df>] processor_set_cur_state+0xa3/0xcf
 [<ffffffff8158e13c>] thermal_cdev_update+0x9c/0xb0
 [<ffffffff8159046a>] step_wise_throttle+0x5a/0x90
 [<ffffffff8158e21f>] handle_thermal_trip+0x4f/0x140
 [<ffffffff8158e377>] thermal_zone_device_update+0x57/0xa0
 [<ffffffff81415b36>] acpi_thermal_check+0x2e/0x30
 [<ffffffff81415ca0>] acpi_thermal_notify+0x40/0xdc
 [<ffffffff813e7dbd>] acpi_device_notify+0x19/0x1b
 [<ffffffff813f8241>] acpi_ev_notify_dispatch+0x41/0x5c
 [<ffffffff813e3fbe>] acpi_os_execute_deferred+0x25/0x32
 [<ffffffff81081060>] process_one_work+0x170/0x4a0
 [<ffffffff81082121>] worker_thread+0x121/0x390
 [<ffffffff81082000>] ? manage_workers.isra.20+0x170/0x170
 [<ffffffff81088fe0>] kthread+0xc0/0xd0
 [<ffffffff81088f20>] ? flush_kthread_worker+0xb0/0xb0
 [<ffffffff8173582c>] ret_from_fork+0x7c/0xb0
 [<ffffffff81088f20>] ? flush_kthread_worker+0xb0/0xb0

For this reason, revert commit 7c30ed5 along with the fix 266c13d
(cpufreq: Fix serialization of frequency transitions) on top of it
and we will revisit the serialization problem later.

Reported-by: Alessandro Bono <alessandro.bono@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-10 02:54:50 +02:00