RAPL MSRs and handling for Cannon Lake are similar to Sky Lake
and Kaby Lake.
Signed-off-by: Joe Konno <joe.konno@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The trailing semicolon is an empty statement that does no operation.
Removing it since it doesn't do anything.
Signed-off-by: Luis de Bethencourt <luisbg@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
PL1 and PL2 could be throlled or de-throttled by
Thermal management to control SOC temperature.
However, currently, their value will be reset to default value
after once system suspend and resume.
Add pm_notifier to save PL1, PL2 before system suspect and restore
PL1, PL2 after system resume.
Signed-off-by: Zhen Han <zhen.han@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Simplify powercap_init() by reducing the number of redundant
assignments in it.
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
[ rjw: Subject+changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Fixes wrong bits shift operation in the rapl_write_data_raw function, which
might cause overridding bits outside of the mask.
For example, writing new TIME_WINDOW1 value can override POWER_LIMIT1.
Signed-off-by: Adam Lessnau <adam.lessnau@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
In the current code we accidentally return the successful result from
idr_alloc() instead of a negative error pointer. The caller is looking
for an error pointer and so it treats the returned value as a valid
pointer.
This one might be a bit serious because if it lets people get around the
kernel's protection for remapping NULL. I'm not sure.
Fixes: 75d2364ea0 (PowerCap: Add class driver)
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit e1399ba20e ("powercap / RAPL: handle missing MSRs") added
contraint_to_pl() function to return index into an array. But it
can potentially return -EINVAL if powercap layer sends an out of
range constraint ID. This patch adds sanity check.
Unnecessary RAPL domain pointer check is removed since it must be
initialized before calling rapl_unit_xlate().
Fixes: e1399ba20e ("powercap / RAPL: handle missing MSRs")
Reported-by: Odzioba, Lukasz <lukasz.odzioba@intel.com>
Reported-by: Koss, Marcin <marcin.koss@intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The ability of the CPU hotplug code to stop online/offline at each step
makes it necessary to track the activated CPUs in a package directly,
because outerwise a CPU offline callback can find CPUs which have already
executed the offline callback, but are not yet marked offline in the
topology mask. That could make such a CPU the package leader and in case
that CPU goes fully offline leave the package lead orphaned.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The whole init/exit code is a duplicate of the cpuhotplug code. So we can
just let the hotplug code do the actual work of setting up and tearing down
the domains.
This also restores the package hardware when a package is removed during
hotplug instead of leaving it in the last configured state and only reset
it when the driver is removed.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Install the callbacks via the state machine as a first step. The init/exit
code is a duplicate of the hotplug code. This is cleaned up in a
consecutive patch.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
If rapl_package_register_powercap() fails in rapl_add_package() the
function happily returns 0.
Capture the error code and propagate it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The domain data of packages is only updated at init time, but new packages
created by hotplug miss that treatment.
Add it there and remove the global update at init time, because it's now
obsolete.
The more interesting question is why rapl_update_domain_data() exists at
all as nothing ever uses that data.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
It's confirmed that RAPL works as expected on Ivy Bridge servers.
Tested against processor: Intel(R) Xeon(R) CPU E5-2697 v2 @2.70GHz
Signed-off-by: Xiaolong Wang <xiaolong.wang@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Some RAPL MSRs may not exist on some CPUs, we need to continue
the topology detection and enumerate what is available.
This patch handles the missing MSRs, then reports to the powercap
layer only the features available.
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Since the RAPL interface is not architectual, its enumeration depends
on poking MSRs instead of using the CPUID method.
In KVM guests, the RAPL driver probe will fail and emit the following
message for every CPU: "no valid rapl domains found in package"
This patch converts the warning to a debug message only (still return
-ENODEV so that RAPL does not run in KVM guests).
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
[ rjw: Subject & changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Pull power management updates from Rafael Wysocki:
"The majority of changes go into the cpufreq subsystem this time.
To me, quite obviously, the biggest ticket item is the new "schedutil"
governor. Interestingly enough, it's the first new cpufreq governor
since the beginning of the git era (except for some out-of-the-tree
ones).
There are two main differences between it and the existing governors.
First, it uses the information provided by the scheduler directly for
making its decisions, so it doesn't have to track anything by itself.
Second, it can invoke drivers (supporting that feature) to adjust CPU
performance right away without having to spawn work items to be
executed in process context or similar. Currently, the acpi-cpufreq
driver is the only one supporting that mode of operation, but then it
is used on a large number of systems.
The "schedutil" governor as included here is very simple and mostly
regarded as a foundation for future work on the integration of the
scheduler with CPU power management (in fact, there is work in
progress on top of it already). Nevertheless it works and the
preliminary results obtained with it are encouraging.
There also is some consolidation of CPU frequency management for ARM
platforms that can add their machine IDs the the new stub dt-platdev
driver now and that will take care of creating the requisite platform
device for cpufreq-dt, so it is not necessary to do that in platform
code any more. Several ARM platforms are switched over to using this
generic mechanism.
In addition to that, the intel_pstate driver is now going to respect
CPU frequency limits set by the platform firmware (or a BMC) and
provided via the ACPI _PPC object.
The devfreq subsystem is getting a new "passive" governor for SoCs
subsystems that will depend on somebody else to manage their voltage
rails and its support for Samsung Exynos SoCs is consolidated.
The rest is support for new hardware (Intel Broxton support in
intel_idle for one example), bug fixes, optimizations and cleanups in
a number of places.
Specifics:
- New cpufreq "schedutil" governor (making decisions based on CPU
utilization information provided by the scheduler and capable of
switching CPU frequencies right away if the underlying driver
supports that) and support for fast frequency switching in the
acpi-cpufreq driver (Rafael Wysocki)
- Consolidation of CPU frequency management on ARM platforms allowing
them to get rid of some platform-specific boilerplate code if they
are going to use the cpufreq-dt driver (Viresh Kumar, Finley Xiao,
Marc Gonzalez)
- Support for ACPI _PPC and CPU frequency limits in the intel_pstate
driver (Srinivas Pandruvada)
- Fixes and cleanups in the cpufreq core and generic governor code
(Rafael Wysocki, Sai Gurrappadi)
- intel_pstate driver optimizations and cleanups (Rafael Wysocki,
Philippe Longepe, Chen Yu, Joe Perches)
- cpufreq powernv driver fixes and cleanups (Akshay Adiga, Shilpasri
Bhat)
- cpufreq qoriq driver fixes and cleanups (Jia Hongtao)
- ACPI cpufreq driver cleanups (Viresh Kumar)
- Assorted cpufreq driver updates (Ashwin Chaugule, Geliang Tang,
Javier Martinez Canillas, Paul Gortmaker, Sudeep Holla)
- Assorted cpufreq fixes and cleanups (Joe Perches, Arnd Bergmann)
- Fixes and cleanups in the OPP (Operating Performance Points)
framework, mostly related to OPP sharing, and reorganization of
OF-dependent code in it (Viresh Kumar, Arnd Bergmann, Sudeep Holla)
- New "passive" governor for devfreq (for SoC subsystems that will
rely on someone else for the management of their power resources)
and consolidation of devfreq support for Exynos platforms, coding
style and typo fixes for devfreq (Chanwoo Choi, MyungJoo Ham)
- PM core fixes and cleanups, mostly to make it work better with the
generic power domains (genpd) framework, and updates for that
framework (Ulf Hansson, Thierry Reding, Colin Ian King)
- Intel Broxton support for the intel_idle driver (Len Brown)
- cpuidle core optimization and fix (Daniel Lezcano, Dave Gerlach)
- ARM cpuidle cleanups (Jisheng Zhang)
- Intel Kabylake support for the RAPL power capping driver (Jacob
Pan)
- AVS (Adaptive Voltage Switching) rockchip-io driver update (Heiko
Stuebner)
- Updates for the cpupower tool (Arjun Sreedharan, Colin Ian King,
Mattia Dongili, Thomas Renninger)"
* tag 'pm-4.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (112 commits)
intel_pstate: Clean up get_target_pstate_use_performance()
intel_pstate: Use sample.core_avg_perf in get_avg_pstate()
intel_pstate: Clarify average performance computation
intel_pstate: Avoid unnecessary synchronize_sched() during initialization
cpufreq: schedutil: Make default depend on CONFIG_SMP
cpufreq: powernv: del_timer_sync when global and local pstate are equal
cpufreq: powernv: Move smp_call_function_any() out of irq safe block
intel_pstate: Clean up intel_pstate_get()
cpufreq: schedutil: Make it depend on CONFIG_SMP
cpufreq: governor: Fix handling of special cases in dbs_update()
PM / OPP: Move CONFIG_OF dependent code in a separate file
cpufreq: intel_pstate: Ignore _PPC processing under HWP
cpufreq: arm_big_little: use generic OPP functions for {init, free}_opp_table
PM / OPP: add non-OF versions of dev_pm_opp_{cpumask_, }remove_table
cpufreq: tango: Use generic platdev driver
PM / OPP: pass cpumask by reference
cpufreq: Fix GOV_LIMITS handling for the userspace governor
cpupower: fix potential memory leak
PM / devfreq: style/typo fixes
PM / devfreq: exynos: Add the detailed correlation for Exynos5422 bus
..