At boot up, CPUFreq core performs a sanity check to see if the system is
running at a frequency defined in the frequency table of the CPU. If so,
we try to find a valid frequency (lowest frequency greater than the
currently programmed frequency) from the table and set it. When the call
reaches dev_pm_opp_set_rate(), it calls _find_freq_ceil(opp_table,
&old_freq) to find the previously configured OPP and this call also
updates the old_freq. This eventually sets the old_freq == freq (new
target requested by cpufreq core) and we skip updating the performance
state in this case.
Fix this by also updating the performance state when the old_freq ==
freq.
Fixes: ca1b5d77b1 ("OPP: Configure all required OPPs")
Cc: v5.0 <stable@vger.kernel.org> # v5.0
Reported-by: Niklas Cassel <niklas.cassel@linaro.org>
Tested-by: Jorge Ramirez-Ortiz <jorge.ramirez-ortiz@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
We seem to rely on the number of phandles specified in the
'required-opps' property to identify cases where a device is
associated with multiple power domains and hence would have
multiple virtual devices that have to be dealt with.
In cases where we do have devices with multiple power domains
but with only one of them being scalable, this logic seems to
fail.
Instead read the number of power domains from DT to identify
such cases.
Signed-off-by: Rajendra Nayak <rnayak@codeaurora.org>
Reviewed-by: Stephen Boyd <sboyd@kernel.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Pull power management updates from Rafael Wysocki:
"These are PM-runtime framework changes to use ktime instead of jiffies
for accounting, new PM core flag to mark devices that don't need any
form of power management, cpuidle updates including driver API
documentation and a new governor, cpufreq updates including a new
driver for Armada 8K, thermal cleanups and more, some energy-aware
scheduling (EAS) enabling changes, new chips support in the intel_idle
and RAPL drivers and assorted cleanups in some other places.
Specifics:
- Update the PM-runtime framework to use ktime instead of jiffies for
accounting (Thara Gopinath, Vincent Guittot)
- Optimize the autosuspend code in the PM-runtime framework somewhat
(Ladislav Michl)
- Add a PM core flag to mark devices that don't need any form of
power management (Sudeep Holla)
- Introduce driver API documentation for cpuidle and add a new
cpuidle governor for tickless systems (Rafael Wysocki)
- Add Jacobsville support to the intel_idle driver (Zhang Rui)
- Clean up a cpuidle core header file and the cpuidle-dt and ACPI
processor-idle drivers (Yangtao Li, Joseph Lo, Yazen Ghannam)
- Add new cpufreq driver for Armada 8K (Gregory Clement)
- Fix and clean up cpufreq core (Rafael Wysocki, Viresh Kumar, Amit
Kucheria)
- Add support for light-weight tear-down and bring-up of CPUs to the
cpufreq core and use it in the cpufreq-dt driver (Viresh Kumar)
- Fix cpu_cooling Kconfig dependencies, add support for CPU cooling
auto-registration to the cpufreq core and use it in multiple
cpufreq drivers (Amit Kucheria)
- Fix some minor issues and do some cleanups in the davinci,
e_powersaver, ap806, s5pv210, qcom and kryo cpufreq drivers
(Bartosz Golaszewski, Gustavo Silva, Julia Lawall, Paweł Chmiel,
Taniya Das, Viresh Kumar)
- Add a Hisilicon CPPC quirk to the cppc_cpufreq driver (Xiongfeng
Wang)
- Clean up the intel_pstate and acpi-cpufreq drivers (Erwan Velu,
Rafael Wysocki)
- Clean up multiple cpufreq drivers (Yangtao Li)
- Update cpufreq-related MAINTAINERS entries (Baruch Siach, Lukas
Bulwahn)
- Add support for exposing the Energy Model via debugfs and make
multiple cpufreq drivers register an Energy Model to support
energy-aware scheduling (Quentin Perret, Dietmar Eggemann, Matthias
Kaehlcke)
- Add Ice Lake mobile and Jacobsville support to the Intel RAPL
power-capping driver (Gayatri Kammela, Zhang Rui)
- Add a power estimation helper to the operating performance points
(OPP) framework and clean up a core function in it (Quentin Perret,
Viresh Kumar)
- Make minor improvements in the generic power domains (genpd), OPP
and system suspend frameworks and in the PM core (Aditya Pakki,
Douglas Anderson, Greg Kroah-Hartman, Rafael Wysocki, Yangtao Li)"
* tag 'pm-5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (80 commits)
cpufreq: kryo: Release OPP tables on module removal
cpufreq: ap806: add missing of_node_put after of_device_is_available
cpufreq: acpi-cpufreq: Report if CPU doesn't support boost technologies
cpufreq: Pass updated policy to driver ->setpolicy() callback
cpufreq: Fix two debug messages in cpufreq_set_policy()
cpufreq: Reorder and simplify cpufreq_update_policy()
cpufreq: Add kerneldoc comments for two core functions
PM / core: Add support to skip power management in device/driver model
cpufreq: intel_pstate: Rework iowait boosting to be less aggressive
cpufreq: intel_pstate: Eliminate intel_pstate_get_base_pstate()
cpufreq: intel_pstate: Avoid redundant initialization of local vars
powercap/intel_rapl: add Ice Lake mobile
ACPI / processor: Set P_LVL{2,3} idle state descriptions
cpufreq / cppc: Work around for Hisilicon CPPC cpufreq
ACPI / CPPC: Add a helper to get desired performance
cpufreq: davinci: move configuration to include/linux/platform_data
cpufreq: speedstep: convert BUG() to BUG_ON()
cpufreq: powernv: fix missing check of return value in init_powernv_pstates()
cpufreq: longhaul: remove unneeded semicolon
cpufreq: pcc-cpufreq: remove unneeded semicolon
..
Qualcomm ARM Based Driver Updates for v5.1
* Add Qualcomm RPMh power domain driver and related changes
* Fix issues with sleep/wake sets and batch API in RPMh
* Update MAINTAINERS Qualcomm entry
* Fixup RMTFS-mem sysfs and uevents
* Fix error handling in GSBI
* Add SMD-RPM compatible entry for SDM660
* tag 'qcom-drivers-for-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/agross/linux:
soc: qcom: smd-rpm: Add sdm660 compatible
soc: qcom: gsbi: Fix error handling in gsbi_probe()
soc: qcom: rpmh: Avoid accessing freed memory from batch API
drivers: qcom: rpmh: avoid sending sleep/wake sets immediately
soc: qcom: rmtfs-mem: Make sysfs attributes world-readable
soc: qcom: rmtfs-mem: Add class to enable uevents
soc: qcom: update config dependencies for QCOM_RPMPD
soc: qcom: rpmpd: Drop family A RPM dependency
MAINTAINERS: update list of qcom drivers
soc: qcom: rpmhpd: Mark mx as a parent for cx
soc: qcom: rpmhpd: Add RPMh power domain driver
soc: qcom: rpmpd: Add support for get/set performance state
soc: qcom: rpmpd: Add a Power domain driver to model corners
dt-bindings: power: Add qcom rpm power domain driver bindings
OPP: Add support for parsing the 'opp-level' property
dt-bindings: opp: Introduce opp-level bindings
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Pull operating performance points (OPP) framework updates for v5.1
from Viresh Kumar:
"This pull request contains following changes:
- Introduced new OPP helper for power-estimation and used it in
several cpufreq drivers (Quentin Perret, Matthias Kaehlcke, Dietmar
Eggemann, and Yangtao Li).
- OPP Debugfs cleanup (Greg KH).
- OPP core cleanup (Viresh Kumar)."
* 'opp/linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm:
cpufreq: OMAP: Register an Energy Model
cpufreq: imx6q: Register an Energy Model
opp: no need to check return value of debugfs_create functions
cpufreq: mediatek: Register an Energy Model
cpufreq: scmi: Register an Energy Model
cpufreq: arm_big_little: Register an Energy Model
cpufreq: scpi: Register an Energy Model
cpufreq: dt: Register an Energy Model
PM / OPP: Introduce a power estimation helper
PM / OPP: Remove unused parameter of _generic_set_opp_clk_only()
The Energy Model (EM) framework provides an API to let drivers register
the active power of CPUs. The drivers are expected to provide a callback
method which estimates the power consumed by a CPU at each available
performance levels. How exactly this should be implemented, however,
depends on the platform.
On some systems, PM_OPP knows the voltage and frequency at which CPUs
can run. When coupled with the CPU 'capacitance' (as provided by the
'dynamic-power-coefficient' devicetree binding), it is possible to
estimate the dynamic power consumption of a CPU as P = C * V^2 * f, with
C its capacitance and V and f respectively the voltage and frequency of
the OPP. The Intelligent Power Allocator (IPA) thermal governor already
implements that estimation method, in the thermal framework.
However, this power estimation method can be applied to any platform
where all the parameters are known (C, V and f), and not only those
suffering thermal issues. As such, the code implementing this feature
can be re-used to also populate the EM framework now used by EAS.
As a first step, introduce in PM_OPP a helper function which CPUFreq
drivers can use to register into the EM framework. This duplicates the
power estimation done in IPA until it can be migrated to using the EM
framework. This will be done later, once the EM framework has support
for at least all platforms currently supported by IPA.
Signed-off-by: Quentin Perret <quentin.perret@arm.com>
Tested-by: Matthias Kaehlcke <mka@chromium.org>
Reviewed-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
The previous frequency value isn't getting used in the routine
_generic_set_opp_clk_only(), drop it.
Reviewed-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Now that the OPP bindings are updated to include an optional
'opp-level' property, add support to parse it from device tree
and store it as part of dev_pm_opp structure.
Also add and export an helper 'dev_pm_opp_get_level()' that can be
used to get the level value read from device tree when present.
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rajendra Nayak <rnayak@codeaurora.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Andy Gross <andy.gross@linaro.org>
Since the commit 2a4eb7358a "OPP: Don't remove dynamic OPPs from
_dev_pm_opp_remove_table()", dynamically created OPP aren't
automatically removed anymore by dev_pm_opp_cpumask_remove_table(). This
affects the scpi and scmi cpufreq drivers which no longer free OPPs on
failures or on invocations of the policy->exit() callback.
Create a generic OPP helper dev_pm_opp_remove_all_dynamic() which can be
called from these drivers instead of dev_pm_opp_cpumask_remove_table().
In dev_pm_opp_remove_all_dynamic(), we need to make sure that the
opp_list isn't getting accessed simultaneously from other parts of the
OPP core while the helper is freeing dynamic OPPs, i.e. we can't drop
the opp_table->lock while traversing through the OPP list. And to
accomplish that, this patch also creates _opp_kref_release_unlocked()
which can be called from this new helper with the opp_table lock already
held.
Cc: 4.20 <stable@vger.kernel.org> # v4.20
Reported-by: Valentin Schneider <valentin.schneider@arm.com>
Fixes: 2a4eb7358a "OPP: Don't remove dynamic OPPs from _dev_pm_opp_remove_table()"
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Valentin Schneider <valentin.schneider@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
of_get_required_opp_performance_state() returns 0 on errors currently
and a positive performance state otherwise. Since 0 is a valid
performance state (representing off), it would be better if this routine
returns negative values on error.
That will also make it behave similar to
dev_pm_opp_xlate_performance_state(), which also returns performance
states and returns negative values on error. Change the return type of
the function to "int" in order to return negative values.
This doesn't have any users for now and so no other part of the kernel
will be impacted with this change.
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
dev_pm_genpd_set_performance_state() needs to handle performance state
propagation going forward. Currently this routine only gets the required
performance state of the device's genpd as an argument, but it doesn't
know how to translate that to master genpd(s) of the device's genpd.
Introduce a new helper dev_pm_opp_xlate_performance_state() which will
be used to translate from performance state of a device (or genpd
sub-domain) to another device (or master genpd).
Normally the src_table (of genpd sub-domain) will have the
"required_opps" property set to point to one of the OPPs in the
dst_table (of master genpd), but in some cases the genpd and its master
have one to one mapping of performance states and so none of them have
the "required-opps" property set. Return the performance state of the
src_table as it is in such cases.
Tested-by: Rajendra Nayak <rnayak@codeaurora.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
There is one case where we may end up with no "supply" directory for the
OPPs in debugfs. That happens when the OPP core isn't managing the
regulators for the device and the device's OPP do have microvolt
property. It happens because the opp_table->regulator_count remains set
to 0 and the debugfs routines don't add any supply directory in such a
case.
This commit fixes that by setting opp_table->regulator_count to 1 in
that particular case. But to make everything work nicely and not break
other parts of the core, regulator_count is defined as "int" now instead
of "unsigned int" and it can have different special values now. It is
set to -1 initially to mark it "uninitialized" and later only we set it
to 0 or positive values after checking how many supplies are there.
This also helps in finding the bugs where only few of the OPPs have the
"opp-microvolt" property set and not all.
Fixes: 1fae788ed6 ("PM / OPP: Don't create debugfs "supply-0" directory unnecessarily")
Reported-by: Quentin Perret <quentin.perret@arm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
The value of opp_table->regulator_count is not very consistent right now
and it may end up being 0 while we do have a "opp-microvolt" property in
the OPP table. It was kept that way as we used to check if any
regulators are set with the OPP core for a device or not using value of
regulator_count.
Lets use opp_table->regulators for that purpose as the meaning of
regulator_count is going to change in the later patches.
Reported-by: Quentin Perret <quentin.perret@arm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
We currently return error if more than one phandle is present in the
"operating-points-v2" property, which is incorrect. We only want to
check the count of phandles here and set index to 0 if only one phandle
is present.
Fix it.
Fixes: 5ed4cecd75 ("OPP: Pass OPP table to _of_add_opp_table_v{1|2}()")
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Remove .owner field if calls are used which set it automatically
Generated by: scripts/coccinelle/api/platform_no_drv_owner.cocci
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
_get_optimal_vdd_voltage call provides new_supply_vbb->u_volt
as the reference voltage while it should be really new_supply_vdd->u_volt.
Cc: 4.16+ <stable@vger.kernel.org> # v4.16+
Fixes: 9a835fa6e4 ("PM / OPP: Add ti-opp-supply driver")
Signed-off-by: Keerthy <j-keerthy@ti.com>
Acked-by: Dave Gerlach <d-gerlach@ti.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
The voltage range (min, max) provided in the device tree is from
the data manual and is pretty big, catering to a wide range of devices.
On a i2c read/write failure the regulator_set_voltage_triplet function
falls back to set voltage between min and max. The min value from Device
Tree can be lesser than the optimal value and in that case that can lead
to a hang or crash. Hence set the u_volt_min dynamically to the optimal
voltage value.
Cc: 4.16+ <stable@vger.kernel.org> # v4.16+
Fixes: 9a835fa6e4 ("PM / OPP: Add ti-opp-supply driver")
Signed-off-by: Keerthy <j-keerthy@ti.com>
Acked-by: Dave Gerlach <d-gerlach@ti.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
The OPP core already has the performance state values for each of the
genpd's OPPs and there is no need to call the genpd callback again to
get the performance state for the case where the end device doesn't have
an OPP table and has the "required-opps" property directly in its node.
This commit renames of_genpd_opp_to_performance_state() as
of_get_required_opp_performance_state() and moves it to the OPP core, as
it is all about OPP stuff now.
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Now that all the infrastructure is in place to support multiple required
OPPs, lets switch over to using it.
A new internal routine _set_required_opps() takes care of updating
performance state for all the required OPPs. With this the performance
state updates are supported even when the end device needs to configure
regulators as well, that wasn't the case earlier.
The pstates were earlier stored in the end device's OPP structures, that
also changes now as those values are stored in the genpd's OPP
structures. And so we switch over to using
pm_genpd_opp_to_performance_state() instead of
of_genpd_opp_to_performance_state() to get performance state for the
genpd OPPs.
The routine _generic_set_opp_domain() is not required anymore and is
removed.
On errors we don't try to recover by reverting to old settings as things
are really complex now and the calls here should never really fail
unless there is a bug. There is no point increasing the complexity, for
code which will never be executed.
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Multiple generic power domains for a consumer device are supported with
the help of virtual devices, which are created for each consumer device
- genpd pair. These are the device structures which are attached to the
power domain and are required by the OPP core to set the performance
state of the genpd.
The helpers added by this commit are required to be called once for each
of these virtual devices. These are required only if multiple domains
are available for a device, otherwise the actual device structure will
be used instead by the OPP core.
The new helpers also support the complex cases where the consumer device
wouldn't always require all the domains. For example, a camera may
require only one power domain during normal operations but two during
high resolution operations. The consumer driver can call
dev_pm_opp_put_genpd_virt_dev(high_resolution_genpd_virt_dev) if it is
currently operating in the normal mode and doesn't have any performance
requirements from the genpd which manages high resolution power
requirements. The consumer driver can later call
dev_pm_opp_set_genpd_virt_dev(high_resolution_genpd_virt_dev) once it
switches back to the high resolution mode.
The new helpers differ from other OPP set/put helpers as the new ones
can be called with OPPs initialized for the table as we may need to call
them on the fly because of the complex case explained above. For this
reason it is possible that the genpd virt_dev structure may be used in
parallel while the new helpers are running and a new mutex is added to
protect against that. We didn't use the existing opp_table->lock mutex
as that is widely used in the OPP core and we will need this lock in the
dev_pm_opp_set_rate() helper while changing OPP and we need to make sure
there is not much contention while doing that as that's the hotpath.
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
An earlier commit populated the OPP tables from the "required-opps"
property, this commit populates the individual OPPs. This is repeated
for each OPP in the OPP table and these populated OPPs will be used by
later commits.
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>