With property "opp-supported-hw" introduced, the OPP table
in DT could be a large OPP table and ONLY a subset of OPPs
are available, based on the version of the hardware running
on. That introduces restriction of using "opp-suspend"
property to define the suspend OPP, as we are NOT sure if the
OPP containing "opp-suspend" property is available for the
hardware running on, and the of opp core does NOT allow multiple
suspend OPPs defined in DT OPP table.
To eliminate this restrition, make of opp core allow multiple
suspend OPPs defined in DT, and pick the OPP with highest rate
and with "opp-suspend" property present to be suspend OPP, it
can speed up the suspend/resume process.
Signed-off-by: Anson Huang <Anson.Huang@nxp.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Add enable regulators to dev_pm_opp_set_regulators() and disable
regulators to dev_pm_opp_put_regulators(). Even if bootloader
leaves regulators enabled, they should be enabled in kernel in
order to increase the reference count.
Signed-off-by: Kamil Konieczny <k.konieczny@partner.samsung.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
The list_kref was added for static OPPs and to track their users. The
kref is initialized while the static OPPs are added, but removed
unconditionally even if the static OPPs were never added. This causes
refcount mismatch warnings currently.
Fix that by always initializing the kref when the OPP table is first
initialized. The refcount is later incremented only for the second user
onwards.
Fixes: d0e8ae6c26 ("OPP: Create separate kref for static OPPs list")
Reported-by: Rajendra Nayak <rnayak@codeaurora.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Since the performance states in the OPP table are unique, implement a
dev_pm_opp_find_level_exact() in order to be able to fetch a specific OPP.
Signed-off-by: Niklas Cassel <niklas.cassel@linaro.org>
[ Viresh: Updated commit log ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
The cpufreq drivers don't need to do runtime PM operations on the
virtual devices returned by dev_pm_domain_attach_by_name() and so the
virtual devices weren't shared with the callers of
dev_pm_opp_attach_genpd() earlier.
But the IO device drivers would want to do that. This patch updates the
prototype of dev_pm_opp_attach_genpd() to accept another argument to
return the pointer to the array of genpd virtual devices.
Reported-by: Rajendra Nayak <rnayak@codeaurora.org>
Tested-by: Rajendra Nayak <rnayak@codeaurora.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
A device may have multiple power-domains and not all of them may be
scalable (i.e. support performance states). But
dev_pm_opp_attach_genpd() doesn't take that into account currently.
Fix that by not verifying the names argument with "power-domain-names"
DT property and finding the index into the required-opps array. The
names argument will anyway get verified later on when we call
dev_pm_domain_attach_by_name().
Fixes: 6319aee10e ("opp: Attach genpds to devices from within OPP core")
Reported-by: Rajendra Nayak <rnayak@codeaurora.org>
Tested-by: Rajendra Nayak <rnayak@codeaurora.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Pull operating performance points (OPP) framework changes for v5.3
from Viresh Kumar:
"This pull request contains:
- OPP core changes to support a wider range of devices, like IO
devices (Rajendra Nayak and Stehpen Boyd).
- Fixes around genpd_virt_devs (Viresh Kumar).
- Fix for platform with set_opp() callback (Dmitry Osipenko)."
* 'opp/linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm:
opp: Don't use IS_ERR on invalid supplies
opp: Make dev_pm_opp_set_rate() handle freq = 0 to drop performance votes
opp: Don't overwrite rounded clk rate
opp: Allocate genpd_virt_devs from dev_pm_opp_attach_genpd()
opp: Attach genpds to devices from within OPP core
_set_opp_custom() receives a set of OPP supplies as its arguments and
the caller of it passes NULL when the supplies are not valid. But
_set_opp_custom(), by mistake, checks for error by performing
IS_ERR(old_supply) on it which will always evaluate to false.
The problem was spotted during of testing of upcoming update for the
NVIDIA Tegra CPUFreq driver.
Cc: stable <stable@vger.kernel.org>
Fixes: 7e535993fa ("OPP: Separate out custom OPP handler specific code")
Reported-by: Marc Dietrich <marvin24@gmx.de>
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
[ Viresh: Massaged changelog ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Based on 2 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation #
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 4122 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Enrico Weigelt <info@metux.net>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
For devices with performance state, we use dev_pm_opp_set_rate() to set
the appropriate clk rate and the performance state.
We do need a way to remove the performance state vote when we idle the
device and turn the clocks off. Use dev_pm_opp_set_rate() with freq = 0
to achieve this.
Signed-off-by: Rajendra Nayak <rnayak@codeaurora.org>
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
[ Viresh: Updated _set_required_opps() to handle the !opp case ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
The OPP table normally contains 'fmax' values corresponding to the
voltage or performance levels of each OPP, but we don't necessarily want
all the devices to run at fmax all the time. Running at fmax makes sense
for devices like CPU/GPU, which have a finite amount of work to do and
since a specific amount of energy is consumed at an OPP, its better to
run at the highest possible frequency for that voltage value.
On the other hand, we have IO devices which need to run at specific
frequencies only for their proper functioning, instead of maximum
possible frequency.
The OPP core currently roundup to the next possible OPP for a frequency
and select the fmax value. To support the IO devices by the OPP core,
lets do the roundup to fetch the voltage or performance state values,
but not use the OPP frequency value. Rather use the value returned by
clk_round_rate().
The current user, cpufreq, of dev_pm_opp_set_rate() already does the
rounding to the next OPP before calling this routine and it won't
have any side affects because of this change.
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Rajendra Nayak <rnayak@codeaurora.org>
[ Viresh: Massaged changelog, added comment and use temp_opp variable
instead ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Convert the PM documents to ReST, in order to allow them to
build with Sphinx.
The conversion is actually:
- add blank lines and indentation in order to identify paragraphs;
- fix tables markups;
- add some lists markups;
- mark literal blocks;
- adjust title markups.
At its new index.rst, let's add a :orphan: while this is not linked to
the main index.rst file, in order to avoid build warnings.
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Mark Brown <broonie@kernel.org>
Acked-by: Srivatsa S. Bhat (VMware) <srivatsa@csail.mit.edu>
Add SPDX license identifiers to all Make/Kconfig files which:
- Have no license information of any form
These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:
GPL-2.0-only
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently the space for the array of virtual devices is allocated along
with the OPP table, but that isn't going to work well from now onwards.
For single power domain case, a driver can either use the original
device structure for setting the performance state (if genpd attached
with dev_pm_domain_attach()) or use the virtual device structure (if
genpd attached with dev_pm_domain_attach_by_name(), which returns the
virtual device) and so we can't know in advance if we are going to need
genpd_virt_devs array or not.
Lets delay the allocation a bit and do it along with
dev_pm_opp_attach_genpd() rather. The deallocation is done from
dev_pm_opp_detach_genpd().
Tested-by: Niklas Cassel <niklas.cassel@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
The OPP core requires the virtual device pointers to set performance
state on behalf of the device, for the multiple power domain case. The
genpd API (dev_pm_domain_attach_by_name()) has evolved now to support
even the single power domain case and that lets us add common code for
handling both the cases more efficiently.
The virtual device structure returned by dev_pm_domain_attach_by_name()
isn't normally used by the cpufreq drivers as they don't manage power
on/off of the domains and so is only useful for the OPP core.
This patch moves all the complexity into the OPP core to make the end
drivers simple. The earlier APIs dev_pm_opp_{set|put}_genpd_virt_dev()
are reworked into dev_pm_opp_{attach|detach}_genpd(). The new helper
dev_pm_opp_attach_genpd() accepts a NULL terminated array of strings
which contains names of all the genpd's to attach. It then attaches all
the domains and saves the pointers to the virtual devices. The other
helper undo the work done by this helper.
Tested-by: Niklas Cassel <niklas.cassel@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
This patch introduces a new helper routine in the OPP core, which
returns the OPP with the highest frequency which has voltage less than
or equal to the target voltage passed to the helper.
Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
[ Viresh: Massaged the commit log and renamed the helper with some
cleanups. ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
At boot up, CPUFreq core performs a sanity check to see if the system is
running at a frequency defined in the frequency table of the CPU. If so,
we try to find a valid frequency (lowest frequency greater than the
currently programmed frequency) from the table and set it. When the call
reaches dev_pm_opp_set_rate(), it calls _find_freq_ceil(opp_table,
&old_freq) to find the previously configured OPP and this call also
updates the old_freq. This eventually sets the old_freq == freq (new
target requested by cpufreq core) and we skip updating the performance
state in this case.
Fix this by also updating the performance state when the old_freq ==
freq.
Fixes: ca1b5d77b1 ("OPP: Configure all required OPPs")
Cc: v5.0 <stable@vger.kernel.org> # v5.0
Reported-by: Niklas Cassel <niklas.cassel@linaro.org>
Tested-by: Jorge Ramirez-Ortiz <jorge.ramirez-ortiz@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
We seem to rely on the number of phandles specified in the
'required-opps' property to identify cases where a device is
associated with multiple power domains and hence would have
multiple virtual devices that have to be dealt with.
In cases where we do have devices with multiple power domains
but with only one of them being scalable, this logic seems to
fail.
Instead read the number of power domains from DT to identify
such cases.
Signed-off-by: Rajendra Nayak <rnayak@codeaurora.org>
Reviewed-by: Stephen Boyd <sboyd@kernel.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Pull power management updates from Rafael Wysocki:
"These are PM-runtime framework changes to use ktime instead of jiffies
for accounting, new PM core flag to mark devices that don't need any
form of power management, cpuidle updates including driver API
documentation and a new governor, cpufreq updates including a new
driver for Armada 8K, thermal cleanups and more, some energy-aware
scheduling (EAS) enabling changes, new chips support in the intel_idle
and RAPL drivers and assorted cleanups in some other places.
Specifics:
- Update the PM-runtime framework to use ktime instead of jiffies for
accounting (Thara Gopinath, Vincent Guittot)
- Optimize the autosuspend code in the PM-runtime framework somewhat
(Ladislav Michl)
- Add a PM core flag to mark devices that don't need any form of
power management (Sudeep Holla)
- Introduce driver API documentation for cpuidle and add a new
cpuidle governor for tickless systems (Rafael Wysocki)
- Add Jacobsville support to the intel_idle driver (Zhang Rui)
- Clean up a cpuidle core header file and the cpuidle-dt and ACPI
processor-idle drivers (Yangtao Li, Joseph Lo, Yazen Ghannam)
- Add new cpufreq driver for Armada 8K (Gregory Clement)
- Fix and clean up cpufreq core (Rafael Wysocki, Viresh Kumar, Amit
Kucheria)
- Add support for light-weight tear-down and bring-up of CPUs to the
cpufreq core and use it in the cpufreq-dt driver (Viresh Kumar)
- Fix cpu_cooling Kconfig dependencies, add support for CPU cooling
auto-registration to the cpufreq core and use it in multiple
cpufreq drivers (Amit Kucheria)
- Fix some minor issues and do some cleanups in the davinci,
e_powersaver, ap806, s5pv210, qcom and kryo cpufreq drivers
(Bartosz Golaszewski, Gustavo Silva, Julia Lawall, Paweł Chmiel,
Taniya Das, Viresh Kumar)
- Add a Hisilicon CPPC quirk to the cppc_cpufreq driver (Xiongfeng
Wang)
- Clean up the intel_pstate and acpi-cpufreq drivers (Erwan Velu,
Rafael Wysocki)
- Clean up multiple cpufreq drivers (Yangtao Li)
- Update cpufreq-related MAINTAINERS entries (Baruch Siach, Lukas
Bulwahn)
- Add support for exposing the Energy Model via debugfs and make
multiple cpufreq drivers register an Energy Model to support
energy-aware scheduling (Quentin Perret, Dietmar Eggemann, Matthias
Kaehlcke)
- Add Ice Lake mobile and Jacobsville support to the Intel RAPL
power-capping driver (Gayatri Kammela, Zhang Rui)
- Add a power estimation helper to the operating performance points
(OPP) framework and clean up a core function in it (Quentin Perret,
Viresh Kumar)
- Make minor improvements in the generic power domains (genpd), OPP
and system suspend frameworks and in the PM core (Aditya Pakki,
Douglas Anderson, Greg Kroah-Hartman, Rafael Wysocki, Yangtao Li)"
* tag 'pm-5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (80 commits)
cpufreq: kryo: Release OPP tables on module removal
cpufreq: ap806: add missing of_node_put after of_device_is_available
cpufreq: acpi-cpufreq: Report if CPU doesn't support boost technologies
cpufreq: Pass updated policy to driver ->setpolicy() callback
cpufreq: Fix two debug messages in cpufreq_set_policy()
cpufreq: Reorder and simplify cpufreq_update_policy()
cpufreq: Add kerneldoc comments for two core functions
PM / core: Add support to skip power management in device/driver model
cpufreq: intel_pstate: Rework iowait boosting to be less aggressive
cpufreq: intel_pstate: Eliminate intel_pstate_get_base_pstate()
cpufreq: intel_pstate: Avoid redundant initialization of local vars
powercap/intel_rapl: add Ice Lake mobile
ACPI / processor: Set P_LVL{2,3} idle state descriptions
cpufreq / cppc: Work around for Hisilicon CPPC cpufreq
ACPI / CPPC: Add a helper to get desired performance
cpufreq: davinci: move configuration to include/linux/platform_data
cpufreq: speedstep: convert BUG() to BUG_ON()
cpufreq: powernv: fix missing check of return value in init_powernv_pstates()
cpufreq: longhaul: remove unneeded semicolon
cpufreq: pcc-cpufreq: remove unneeded semicolon
..
Qualcomm ARM Based Driver Updates for v5.1
* Add Qualcomm RPMh power domain driver and related changes
* Fix issues with sleep/wake sets and batch API in RPMh
* Update MAINTAINERS Qualcomm entry
* Fixup RMTFS-mem sysfs and uevents
* Fix error handling in GSBI
* Add SMD-RPM compatible entry for SDM660
* tag 'qcom-drivers-for-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/agross/linux:
soc: qcom: smd-rpm: Add sdm660 compatible
soc: qcom: gsbi: Fix error handling in gsbi_probe()
soc: qcom: rpmh: Avoid accessing freed memory from batch API
drivers: qcom: rpmh: avoid sending sleep/wake sets immediately
soc: qcom: rmtfs-mem: Make sysfs attributes world-readable
soc: qcom: rmtfs-mem: Add class to enable uevents
soc: qcom: update config dependencies for QCOM_RPMPD
soc: qcom: rpmpd: Drop family A RPM dependency
MAINTAINERS: update list of qcom drivers
soc: qcom: rpmhpd: Mark mx as a parent for cx
soc: qcom: rpmhpd: Add RPMh power domain driver
soc: qcom: rpmpd: Add support for get/set performance state
soc: qcom: rpmpd: Add a Power domain driver to model corners
dt-bindings: power: Add qcom rpm power domain driver bindings
OPP: Add support for parsing the 'opp-level' property
dt-bindings: opp: Introduce opp-level bindings
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Pull operating performance points (OPP) framework updates for v5.1
from Viresh Kumar:
"This pull request contains following changes:
- Introduced new OPP helper for power-estimation and used it in
several cpufreq drivers (Quentin Perret, Matthias Kaehlcke, Dietmar
Eggemann, and Yangtao Li).
- OPP Debugfs cleanup (Greg KH).
- OPP core cleanup (Viresh Kumar)."
* 'opp/linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm:
cpufreq: OMAP: Register an Energy Model
cpufreq: imx6q: Register an Energy Model
opp: no need to check return value of debugfs_create functions
cpufreq: mediatek: Register an Energy Model
cpufreq: scmi: Register an Energy Model
cpufreq: arm_big_little: Register an Energy Model
cpufreq: scpi: Register an Energy Model
cpufreq: dt: Register an Energy Model
PM / OPP: Introduce a power estimation helper
PM / OPP: Remove unused parameter of _generic_set_opp_clk_only()
The Energy Model (EM) framework provides an API to let drivers register
the active power of CPUs. The drivers are expected to provide a callback
method which estimates the power consumed by a CPU at each available
performance levels. How exactly this should be implemented, however,
depends on the platform.
On some systems, PM_OPP knows the voltage and frequency at which CPUs
can run. When coupled with the CPU 'capacitance' (as provided by the
'dynamic-power-coefficient' devicetree binding), it is possible to
estimate the dynamic power consumption of a CPU as P = C * V^2 * f, with
C its capacitance and V and f respectively the voltage and frequency of
the OPP. The Intelligent Power Allocator (IPA) thermal governor already
implements that estimation method, in the thermal framework.
However, this power estimation method can be applied to any platform
where all the parameters are known (C, V and f), and not only those
suffering thermal issues. As such, the code implementing this feature
can be re-used to also populate the EM framework now used by EAS.
As a first step, introduce in PM_OPP a helper function which CPUFreq
drivers can use to register into the EM framework. This duplicates the
power estimation done in IPA until it can be migrated to using the EM
framework. This will be done later, once the EM framework has support
for at least all platforms currently supported by IPA.
Signed-off-by: Quentin Perret <quentin.perret@arm.com>
Tested-by: Matthias Kaehlcke <mka@chromium.org>
Reviewed-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
The previous frequency value isn't getting used in the routine
_generic_set_opp_clk_only(), drop it.
Reviewed-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>