The file permissions of cpufreq per-cpu sysfs files are not preserved
across suspend/resume because we internally go through the CPU
Hotplug path which reinitializes the file permissions on CPU online.
But the user is not supposed to know that we are using CPU hotplug
internally within suspend/resume (IOW, the kernel should not silently
wreck the user-set file permissions across a suspend cycle).
Therefore, we need to preserve the file permissions as they are
across suspend/resume.
The simplest way to achieve that is to just not touch the sysfs files
at all - ie., just ignore the CPU hotplug notifications in the
suspend/resume path (_FROZEN) in the cpufreq hotplug callback.
Reported-by: Robert Jarzmik <robert.jarzmik@intel.com>
Reported-by: Durgadoss R <durgadoss.r@intel.com>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
We must call __cpufreq_governor(data, CPUFREQ_GOV_POLICY_EXIT) before
calling cpufreq_cpu_put(data), so that policy kobject have valid
fields. Otherwise, removing last online cpu of policy->cpus causes
this crash for ondemand/conservative governor.
[<c00fb076>] (sysfs_find_dirent+0xe/0xa8) from [<c00fb1bd>] (sysfs_get_dirent+0x21/0x58)
[<c00fb1bd>] (sysfs_get_dirent+0x21/0x58) from [<c00fc259>] (sysfs_remove_group+0x85/0xbc)
[<c00fc259>] (sysfs_remove_group+0x85/0xbc) from [<c02faad9>] (cpufreq_governor_dbs+0x369/0x4a0)
[<c02faad9>] (cpufreq_governor_dbs+0x369/0x4a0) from [<c02f66d7>] (__cpufreq_governor+0x2b/0x8c)
[<c02f66d7>] (__cpufreq_governor+0x2b/0x8c) from [<c02f6893>] (__cpufreq_remove_dev.isra.12+0x15b/0x250)
[<c02f6893>] (__cpufreq_remove_dev.isra.12+0x15b/0x250) from [<c03e91c7>] (cpufreq_cpu_callback+0x2f/0x3c)
[<c03e91c7>] (cpufreq_cpu_callback+0x2f/0x3c) from [<c0036fe1>] (notifier_call_chain+0x45/0x54)
[<c0036fe1>] (notifier_call_chain+0x45/0x54) from [<c001e611>] (__cpu_notify+0x1d/0x34)
[<c001e611>] (__cpu_notify+0x1d/0x34) from [<c03e5833>] (_cpu_down+0x63/0x1ac)
[<c03e5833>] (_cpu_down+0x63/0x1ac) from [<c03e5997>] (cpu_down+0x1b/0x30)
[<c03e5997>] (cpu_down+0x1b/0x30) from [<c03e60eb>] (store_online+0x27/0x54)
[<c03e60eb>] (store_online+0x27/0x54) from [<c0295629>] (dev_attr_store+0x11/0x18)
[<c0295629>] (dev_attr_store+0x11/0x18) from [<c00f9edd>] (sysfs_write_file+0xed/0x114)
[<c00f9edd>] (sysfs_write_file+0xed/0x114) from [<c00b42a9>] (vfs_write+0x65/0xd8)
[<c00b42a9>] (vfs_write+0x65/0xd8) from [<c00b4523>] (sys_write+0x2f/0x50)
[<c00b4523>] (sys_write+0x2f/0x50) from [<c000cdc1>] (ret_fast_syscall+0x1/0x52)
Of course this only impacted drivers which have
have_governor_per_policy set to true. i.e. big LITTLE cpufreq driver.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit 5800043 (cpufreq: convert cpufreq_driver to using RCU) causes
the following call trace to be spit on boot:
BUG: sleeping function called from invalid context at /scratch/rafael/work/linux-pm/mm/slab.c:3179
in_atomic(): 0, irqs_disabled(): 0, pid: 292, name: systemd-udevd
2 locks held by systemd-udevd/292:
#0: (subsys mutex){+.+.+.}, at: [<ffffffff8146851a>] subsys_interface_register+0x4a/0xe0
#1: (rcu_read_lock){.+.+.+}, at: [<ffffffff81538210>] cpufreq_add_dev_interface+0x60/0x5e0
Pid: 292, comm: systemd-udevd Not tainted 3.9.0-rc8+ #323
Call Trace:
[<ffffffff81072c90>] __might_sleep+0x140/0x1f0
[<ffffffff811581c2>] kmem_cache_alloc+0x42/0x2b0
[<ffffffff811e7179>] sysfs_new_dirent+0x59/0x130
[<ffffffff811e63cb>] sysfs_add_file_mode+0x6b/0x110
[<ffffffff81538210>] ? cpufreq_add_dev_interface+0x60/0x5e0
[<ffffffff810a3254>] ? __lock_is_held+0x54/0x80
[<ffffffff811e647d>] sysfs_add_file+0xd/0x10
[<ffffffff811e6541>] sysfs_create_file+0x21/0x30
[<ffffffff81538280>] cpufreq_add_dev_interface+0xd0/0x5e0
[<ffffffff81538210>] ? cpufreq_add_dev_interface+0x60/0x5e0
[<ffffffffa000337f>] ? acpi_processor_get_platform_limit+0x32/0xbb [processor]
[<ffffffffa022f540>] ? do_drv_write+0x70/0x70 [acpi_cpufreq]
[<ffffffff810a3254>] ? __lock_is_held+0x54/0x80
[<ffffffff8106c97e>] ? up_read+0x1e/0x40
[<ffffffff8106e632>] ? __blocking_notifier_call_chain+0x72/0xc0
[<ffffffff81538dbd>] cpufreq_add_dev+0x62d/0xae0
[<ffffffff815389b8>] ? cpufreq_add_dev+0x228/0xae0
[<ffffffff81468569>] subsys_interface_register+0x99/0xe0
[<ffffffffa014d000>] ? 0xffffffffa014cfff
[<ffffffff81535d5d>] cpufreq_register_driver+0x9d/0x200
[<ffffffffa014d000>] ? 0xffffffffa014cfff
[<ffffffffa014d0e9>] acpi_cpufreq_init+0xe9/0x1000 [acpi_cpufreq]
[<ffffffff810002fa>] do_one_initcall+0x11a/0x170
[<ffffffff810b4b87>] load_module+0x1cf7/0x2920
[<ffffffff81322580>] ? ddebug_proc_open+0xb0/0xb0
[<ffffffff816baee0>] ? retint_restore_args+0xe/0xe
[<ffffffff810b5887>] sys_init_module+0xd7/0x120
[<ffffffff816bb6d2>] system_call_fastpath+0x16/0x1b
which is quite obvious, because that commit put (multiple instances
of) sysfs_create_file() under rcu_read_lock()/rcu_read_unlock(),
although sysfs_create_file() may cause memory to be allocated with
GFP_KERNEL and that may sleep, which is not permitted in RCU read
critical section.
Revert the buggy commit altogether along with some changes on top
of it.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Some cpufreq drivers implement their own governor and so don't need
us to call generic governors interface via __cpufreq_governor(). Few
recent commits haven't obeyed this law well and we saw some
regressions.
This patch is an attempt to fix the above issue.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reported-and-tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Tested-by: Dirk Brandewie <dirk.brandewie@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
__cpufreq_governor() must be called with a correct policy->cpus mask.
In __cpufreq_remove_dev() we initially clear policy->cpus with
cpumask_clear_cpu() and then call
__cpufreq_governor(policy, CPUFREQ_GOV_POLICY_EXIT). If the governor
is doing some per-cpu stuff in EXIT callback, this can create
uncertain behavior.
Generic governors in drivers/cpufreq/ doesn't do any per-cpu stuff
in EXIT callback and so we don't face any issues currently. But its
better to keep the code clean, so we don't face any issues in future.
Now, we call cpumask_clear_cpu() only when multiple cpus are managed
by policy.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
We eventually would like to remove the rwlock cpufreq_driver_lock or
convert it back to a spinlock and protect the read sections with RCU.
The first step in that direction is to make cpufreq_driver use RCU.
I don't see an easy wasy to protect the cpufreq_cpu_data structure
with RCU, so I am leaving it with the rwlock for now since under
certain configs __cpufreq_cpu_get is a hot spot with 256+ cores.
[rjw: Subject, changelog, white space]
Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
policy->cpus contains all online cpus that have single shared clock line. And
their frequencies are always updated together.
Many SMP system's cpufreq drivers take care of this in individual drivers but
the best place for this code is in cpufreq core.
This patch modifies cpufreq_notify_transition() to notify frequency change for
all cpus in policy->cpus and hence updates all users of this API.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Stephen Warren <swarren@nvidia.com>
Tested-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Currently, there can't be multiple instances of single governor_type.
If we have a multi-package system, where we have multiple instances
of struct policy (per package), we can't have multiple instances of
same governor. i.e. We can't have multiple instances of ondemand
governor for multiple packages.
Governors directory in sysfs is created at /sys/devices/system/cpu/cpufreq/
governor-name/. Which again reflects that there can be only one
instance of a governor_type in the system.
This is a bottleneck for multicluster system, where we want different
packages to use same governor type, but with different tunables.
This patch uses the infrastructure provided by earlier patch and
implements init/exit routines for ondemand and conservative
governors.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Currently, there can't be multiple instances of single governor_type.
If we have a multi-package system, where we have multiple instances
of struct policy (per package), we can't have multiple instances of
same governor. i.e. We can't have multiple instances of ondemand
governor for multiple packages.
Governors directory in sysfs is created at /sys/devices/system/cpu/cpufreq/
governor-name/. Which again reflects that there can be only one
instance of a governor_type in the system.
This is a bottleneck for multicluster system, where we want different
packages to use same governor type, but with different tunables.
This patch is inclined towards providing this infrastructure. Because
we are required to allocate governor's resources dynamically now, we
must do it at policy creation and end. And so got
CPUFREQ_GOV_POLICY_INIT/EXIT.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This eliminates the contention I am seeing in __cpufreq_cpu_get.
It also nicely stages the lock to be replaced by the rcu.
Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* pm-cpufreq: (55 commits)
cpufreq / intel_pstate: Fix 32 bit build
cpufreq: conservative: Fix typos in comments
cpufreq: ondemand: Fix typos in comments
cpufreq: exynos: simplify .init() for setting policy->cpus
cpufreq: kirkwood: Add a cpufreq driver for Marvell Kirkwood SoCs
cpufreq/x86: Add P-state driver for sandy bridge.
cpufreq_stats: do not remove sysfs files if frequency table is not present
cpufreq: Do not track governor name for scaling drivers with internal governors.
cpufreq: Only call cpufreq_out_of_sync() for driver that implement cpufreq_driver.target()
cpufreq: Retrieve current frequency from scaling drivers with internal governors
cpufreq: Fix locking issues
cpufreq: Create a macro for unlock_policy_rwsem{read,write}
cpufreq: Remove unused HOTPLUG_CPU code
cpufreq: governors: Fix WARN_ON() for multi-policy platforms
cpufreq: ondemand: Replace down_differential tuner with adj_up_threshold
cpufreq / stats: Get rid of CPUFREQ_STATDEVICE_ATTR
cpufreq: Don't check cpu_online(policy->cpu)
cpufreq: add imx6q-cpufreq driver
cpufreq: Don't remove sysfs link for policy->cpu
cpufreq: Remove unnecessary use of policy->shared_type
...
Scaling drivers that implement internal governors do not have governor
structures assocaited with them. Only track the name of the governor
associated with the CPU if the driver does not implement
cpufreq_driver.setpolicy()
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Scaling drivers that implement cpufreq_driver.setpolicy() have
internal governors that do not signal changes via
cpufreq_notify_transition() so the frequncy in the policy will almost
certainly be different than the current frequncy. Only call
cpufreq_out_of_sync() when the underlying driver implements
cpufreq_driver.target()
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Scaling drivers that implement the cpufreq_driver.setpolicy() versus
the cpufreq_driver.target() interface do not set policy->cur.
Normally policy->cur is set during the call to cpufreq_driver.target()
when the frequnecy request is made by the governor.
If the scaling driver implements cpufreq_driver.setpolicy() and
cpufreq_driver.get() interfaces use cpufreq_driver.get() to retrieve
the current frequency.
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
cpufreq core uses two locks:
- cpufreq_driver_lock: General lock for driver and cpufreq_cpu_data array.
- cpu_policy_rwsemfix locking: per CPU reader-writer semaphore designed to cure
all cpufreq/hotplug/workqueue/etc related lock issues.
These locks were not used properly and are placed against their principle
(present before their definition) at various places. This patch is an attempt to
fix their use.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
On the lines of macro: lock_policy_rwsem, we can create another macro for
unlock_policy_rwsem. Lets do it.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Because the sibling cpu of any online cpu is identified very early in
cpufreq_add_dev(), below code is never executed. And so can be removed.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
On multi-policy systems there is a single instance of governor for both the
policies (if same governor is chosen for both policies). With the code update
from following patches:
8eeed09 cpufreq: governors: Get rid of dbs_data->enable field
b394058 cpufreq: governors: Reset tunables only for cpufreq_unregister_governor()
We are creating/removing sysfs directory of governor for for every call to
GOV_START and STOP. This would fail for multi-policy system as there is a
per-policy call to START/STOP.
This patch reuses the governor->initialized variable to detect total users of
governor.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
policy->cpu or cpus in policy->cpus can't be offline anymore. And so we don't
need to check if they are online or not.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
"cpufreq" directory in policy->cpu is never created using
sysfs_create_link(), but using kobject_init_and_add(). And so we
shouldn't call sysfs_remove_link() for policy->cpu(). sysfs stuff
for policy->cpu is automatically removed when we call kobject_put()
for dying policy.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Dirk Brandewie <dirk.brandewie@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Currently, whenever governor->governor() is called for CPUFRREQ_GOV_START event
we reset few tunables of governor. Which isn't correct, as this routine is
called for every cpu hot-[un]plugging event. We should actually be resetting
these only when the governor module is removed and re-installed.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Currently cpufreq_add_dev() firsts allocates policy, calls
driver->init() and then checks if this CPU is already managed or not.
And if it is already managed, its policy is freed.
We can save all this if we somehow know that CPU is managed or not in
advance. policy->related_cpus contains the list of all valid sibling
CPUs of policy->cpu. We can check this to see if the current CPU is
already managed.
From now on, platforms don't really need to set related_cpus from
their init() routines, as the same work is done by core too.
If a platform driver needs to set the related_cpus mask with some
additional CPUs, other than CPUs present in policy->cpus, they are
free to do it, though, as we don't override anything.
[rjw: Changelog]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This reverts commit 956f339 "cpufreq: Don't use cpu removed during
cpufreq_driver_unregister".
With the addition of the following commit, this change/variable is not
required any more:
commit b9ba2725343ae57add3f324dfa5074167f48de96
Author: Viresh Kumar <viresh.kumar@linaro.org>
Date: Mon Jan 14 13:23:03 2013 +0000
cpufreq: Simplify __cpufreq_remove_dev()
[rjw: Subject and changelog]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add a helper function to return cpufreq_driver->name.
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>