Commit Graph

74 Commits

Author SHA1 Message Date
Lifeng Zheng
03d8b4e762 cpufreq: CPPC: Fix wrong max_freq in policy initialization
In policy initialization, policy->max and policy->cpuinfo.max_freq are
always set to the value calculated from caps->nominal_perf.

This will cause the frequency stay on base frequency even if the policy
is already boosted when a CPU is going online.

Fix this by using policy->boost_enabled to determine which value should
be set.

Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/20250117101457.1530653-4-zhenglifeng1@huawei.com
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-01-23 21:06:33 +01:00
Frederic Weisbecker
b04e317b52 treewide: Introduce kthread_run_worker[_on_cpu]()
kthread_create() creates a kthread without running it yet. kthread_run()
creates a kthread and runs it.

On the other hand, kthread_create_worker() creates a kthread worker and
runs it.

This difference in behaviours is confusing. Also there is no way to
create a kthread worker and affine it using kthread_bind_mask() or
kthread_affine_preferred() before starting it.

Consolidate the behaviours and introduce kthread_run_worker[_on_cpu]()
that behaves just like kthread_run(). kthread_create_worker[_on_cpu]()
will now only create a kthread worker without starting it.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
2025-01-08 18:15:03 +01:00
Rafael J. Wysocki
baf4ae8038 Merge tag 'cpufreq-arm-updates-6.13' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/vireshk/pm
Merge ARM cpufreq updates for 6.13 from Viresh Kumar:

"- Add virtual cpufreq driver for guest kernels (David Dai).

 - Minor cleanup to various cpufreq drivers (Andy Shevchenko, Dhruva
   Gole, Jie Zhan, Jinjie Ruan, Shuosheng Huang, Sibi Sankar, and Yuan
   Can).

 - Revert "cpufreq: brcmstb-avs-cpufreq: Fix initial command check"
   (Colin Ian King).

 - Improve DT bindings for qcom-hw driver (Dmitry Baryshkov, Konrad
   Dybcio, and Nikunj Kela)."

* tag 'cpufreq-arm-updates-6.13' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/vireshk/pm:
  arm64: dts: qcom: sc8180x: Add a SoC-specific compatible to cpufreq-hw
  dt-bindings: cpufreq: cpufreq-qcom-hw: Add SC8180X compatible
  cpufreq: sun50i: add a100 cpufreq support
  cpufreq: mediatek-hw: Fix wrong return value in mtk_cpufreq_get_cpu_power()
  cpufreq: CPPC: Fix wrong return value in cppc_get_cpu_power()
  cpufreq: CPPC: Fix wrong return value in cppc_get_cpu_cost()
  cpufreq: loongson3: Check for error code from devm_mutex_init() call
  cpufreq: scmi: Fix cleanup path when boost enablement fails
  cpufreq: CPPC: Fix possible null-ptr-deref for cppc_get_cpu_cost()
  cpufreq: CPPC: Fix possible null-ptr-deref for cpufreq_cpu_get_raw()
  Revert "cpufreq: brcmstb-avs-cpufreq: Fix initial command check"
  dt-bindings: cpufreq: cpufreq-qcom-hw: Add SAR2130P compatible
  cpufreq: add virtual-cpufreq driver
  dt-bindings: cpufreq: add virtual cpufreq device
  cpufreq: loongson2: Unregister platform_driver on failure
  cpufreq: ti-cpufreq: Remove revision offsets in AM62 family
  cpufreq: ti-cpufreq: Allow backward compatibility for efuse syscon
  cppc_cpufreq: Remove HiSilicon CPPC workaround
  cppc_cpufreq: Use desired perf if feedback ctrs are 0 or unchanged
  dt-bindings: cpufreq: qcom-hw: document support for SA8255p
2024-11-19 21:35:14 +01:00
Jinjie Ruan
b51eb0874d cpufreq: CPPC: Fix wrong return value in cppc_get_cpu_power()
cppc_get_cpu_power() return 0 if the policy is NULL. Then in
em_create_perf_table(), the later zero check for power is not valid
as power is uninitialized. As Quentin pointed out, kernel energy model
core check the return value of active_power() first, so if the callback
failed it should tell the core. So return -EINVAL to fix it.

Fixes: a78e720756 ("cpufreq: CPPC: Fix possible null-ptr-deref for cpufreq_cpu_get_raw()")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Suggested-by: Quentin Perret <qperret@google.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-11-11 09:45:16 +05:30
Jinjie Ruan
be392aa80f cpufreq: CPPC: Fix wrong return value in cppc_get_cpu_cost()
cppc_get_cpu_cost() return 0 if the policy is NULL. Then in
em_compute_costs(), the later zero check for cost is not valid
as cost is uninitialized. As Quentin pointed out, kernel energy model
core check the return value of get_cost() first, so if the callback
failed it should tell the core. Return -EINVAL to fix it.

Fixes: 1a1374bb8c ("cpufreq: CPPC: Fix possible null-ptr-deref for cppc_get_cpu_cost()")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/all/c4765377-7830-44c2-84fa-706b6e304e10@stanley.mountain/
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Suggested-by: Quentin Perret <qperret@google.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-11-11 09:45:16 +05:30
Jinjie Ruan
1a1374bb8c cpufreq: CPPC: Fix possible null-ptr-deref for cppc_get_cpu_cost()
cpufreq_cpu_get_raw() may return NULL if the cpu is not in
policy->cpus cpu mask and it will cause null pointer dereference,
so check NULL for cppc_get_cpu_cost().

Fixes: 740fcdc2c2 ("cpufreq: CPPC: Register EM based on efficiency class information")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-10-30 14:38:40 +05:30
Jinjie Ruan
a78e720756 cpufreq: CPPC: Fix possible null-ptr-deref for cpufreq_cpu_get_raw()
cpufreq_cpu_get_raw() may return NULL if the cpu is not in
policy->cpus cpu mask and it will cause null pointer dereference.

Fixes: 740fcdc2c2 ("cpufreq: CPPC: Register EM based on efficiency class information")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-10-30 08:48:36 +05:30
Jie Zhan
ea1829d4d4 cppc_cpufreq: Remove HiSilicon CPPC workaround
Since commit 6c8d750f97 ("cpufreq / cppc: Work around for Hisilicon CPPC
cpufreq"), we introduce a workround for HiSilicon platforms that do not
support performance feedback counters, whereas they can get the actual
frequency from the desired perf register.  Later on, FIE is disabled in
that workaround as well.

Now the workround can be handled by the common code.  Desired perf would be
read and converted to frequency if feedback counters don't change.  FIE
would be disabled if the CPPC regs are in PCC region.

Hence, the workaround is no longer needed and can be safely removed, in an
effort to consolidate the driver procedure.

Signed-off-by: Jie Zhan <zhanjie9@hisilicon.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Huisong Li <lihuisong@huawei.com>
[ Viresh: Move fie_disabled withing CONFIG option to fix warning ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-10-03 11:09:44 +05:30
Jie Zhan
c471956319 cppc_cpufreq: Use desired perf if feedback ctrs are 0 or unchanged
The CPPC performance feedback counters could be 0 or unchanged when the
target cpu is in a low-power idle state, e.g. power-gated or clock-gated.

When the counters are 0, cppc_cpufreq_get_rate() returns 0 KHz, which makes
cpufreq_online() get a false error and fail to generate a cpufreq policy.

When the counters are unchanged, the existing cppc_perf_from_fbctrs()
returns a cached desired perf, but some platforms may update the real
frequency back to the desired perf reg.

For the above cases in cppc_cpufreq_get_rate(), get the latest desired perf
from the CPPC reg to reflect the frequency because some platforms may
update the actual frequency back there; if failed, use the cached desired
perf.

Fixes: 6a4fec4f6d ("cpufreq: cppc: cppc_cpufreq_get_rate() returns zero in all error cases.")
Signed-off-by: Jie Zhan <zhanjie9@hisilicon.com>
Reviewed-by: Zeng Heng <zengheng4@huawei.com>
Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com>
Reviewed-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-10-03 11:07:53 +05:30
Al Viro
5f60d5f6bb move asm/unaligned.h to linux/unaligned.h
asm/unaligned.h is always an include of asm-generic/unaligned.h;
might as well move that thing to linux/unaligned.h and include
that - there's nothing arch-specific in that header.

auto-generated by the following:

for i in `git grep -l -w asm/unaligned.h`; do
	sed -i -e "s/asm\/unaligned.h/linux\/unaligned.h/" $i
done
for i in `git grep -l -w asm-generic/unaligned.h`; do
	sed -i -e "s/asm-generic\/unaligned.h/linux\/unaligned.h/" $i
done
git mv include/asm-generic/unaligned.h include/linux/unaligned.h
git mv tools/include/asm-generic/unaligned.h tools/include/linux/unaligned.h
sed -i -e "/unaligned.h/d" include/asm-generic/Kbuild
sed -i -e "s/__ASM_GENERIC/__LINUX/" include/linux/unaligned.h tools/include/linux/unaligned.h
2024-10-02 17:23:23 -04:00
Christian Loehle
4eb71e3b45 cpufreq/cppc: Use NSEC_PER_MSEC for deadline task
Convert the cppc deadline task attributes to use the available
definitions to make them more readable.
No functional change.

Signed-off-by: Christian Loehle <christian.loehle@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Juri Lelli <juri.lelli@redhat.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20240813144348.1180344-4-christian.loehle@arm.com
2024-09-11 11:25:10 +02:00
Lizhe
b4b1ddc9df cpufreq: Make cpufreq_driver->exit() return void
The cpufreq core doesn't check the return type of the exit() callback
and there is not much the core can do on failures at that point. Just
drop the returned value and make it return void.

Signed-off-by: Lizhe <sensor1010@163.com>
[ Viresh: Reworked the patches to fix all missing changes together. ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> # Mediatek
Acked-by: Sudeep Holla <sudeep.holla@arm.com> # scpi, scmi, vexpress
Acked-by: Mario Limonciello <mario.limonciello@amd.com> # amd
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> # bmips
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Acked-by: Kevin Hilman <khilman@baylibre.com> # omap
2024-07-09 08:45:30 +05:30
Riwen Lu
90e4ed6bb0 cpufreq/cppc: Don't compare desired_perf in target()
There is a corner case where the desired_perf is exactly same as the old
perf, but the actual current freq is not.

This happens during S3 while the cpufreq governor is set to powersave.
During cpufreq resume process, the booting CPU's new_freq obtained via
.get() is the highest frequency, while the policy->cur and
cpu->perf_ctrls.desired_perf are set to the lowest level (powersave
governor). This causes the warning: "CPU frequency out of sync:", and
the cpufreq core sets policy->cur to new_freq.

Then the governor->limits() calls cppc_cpufreq_set_target() to
configures the CPU frequency and returns directly because the
desired_perf converted from target_freq is same as the
cpu->perf_ctrls.desired_perf and both are the lowest_perf.

Since target_freq and policy->cur have been already compared in
__cpufreq_driver_target(), there's no need to compare them again here.

Drop the comparison.

Signed-off-by: Riwen Lu <luriwen@kylinos.cn>
[ Viresh: Updated commit message / subject ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-06-13 15:12:57 +05:30
Aleksandr Mishin
cf7de25878 cppc_cpufreq: Fix possible null pointer dereference
cppc_cpufreq_get_rate() and hisi_cppc_cpufreq_get_rate() can be called from
different places with various parameters. So cpufreq_cpu_get() can return
null as 'policy' in some circumstances.
Fix this bug by adding null return check.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: a28b2bfc09 ("cppc_cpufreq: replace per-cpu data array with a list")
Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-04-19 11:59:16 +05:30
Vincent Guittot
50b813b147 cpufreq/cppc: Move and rename cppc_cpufreq_{perf_to_khz|khz_to_perf}()
Move and rename cppc_cpufreq_perf_to_khz() and cppc_cpufreq_khz_to_perf() to
use them outside cppc_cpufreq in topology_init_cpu_capacity_cppc().

Modify the interface to use struct cppc_perf_caps *caps instead of
struct cppc_cpudata *cpu_data as we only use the fields of cppc_perf_caps.

cppc_cpufreq was converting the lowest and nominal freq from MHz to kHz
before using them. We move this conversion inside cppc_perf_to_khz and
cppc_khz_to_perf to make them generic and usable outside cppc_cpufreq.

No functional change

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Pierre Gondois <pierre.gondois@arm.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://lore.kernel.org/r/20231211104855.558096-6-vincent.guittot@linaro.org
2023-12-23 15:52:35 +01:00
Liao Chang
e613d8cff5 cpufreq: cppc: Set fie_disabled to FIE_DISABLED if fails to create kworker_fie
The function cppc_freq_invariance_init() may failed to create
kworker_fie, make it more robust by setting fie_disabled to FIE_DISBALED
to prevent an invalid pointer dereference in kthread_destroy_worker(),
which called from cppc_freq_invariance_exit().

Signed-off-by: Liao Chang <liaochang1@huawei.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2023-08-17 14:24:58 +05:30
Liao Chang
6a4fec4f6d cpufreq: cppc: cppc_cpufreq_get_rate() returns zero in all error cases.
The cpufreq framework used to use the zero of return value to reflect
the cppc_cpufreq_get_rate() had failed to get current frequecy and treat
all positive integer to be succeed. Since cppc_get_perf_ctrs() returns a
negative integer in error case, so it is better to convert the value to
zero as the return value of cppc_cpufreq_get_rate().

Signed-off-by: Liao Chang <liaochang1@huawei.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2023-08-17 14:12:11 +05:30
Pierre Gondois
f5f94b9c8b cpufreq: CPPC: Add u64 casts to avoid overflowing
The fields of the _CPC object are unsigned 32-bits values.
To avoid overflows while using _CPC's values, add 'u64' casts.

Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2022-12-27 08:27:14 +05:30
Jeremy Linton
ae2df912d1 ACPI: CPPC: Disable FIE if registers in PCC regions
PCC regions utilize a mailbox to set/retrieve register values used by
the CPPC code. This is fine as long as the operations are
infrequent. With the FIE code enabled though the overhead can range
from 2-11% of system CPU overhead (ex: as measured by top) on Arm
based machines.

So, before enabling FIE assure none of the registers used by
cppc_get_perf_ctrs() are in the PCC region. Finally, add a module
parameter which can override the PCC region detection at boot or
module reload.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-09-24 18:43:46 +02:00
Perry Yuan
a2a9d18500 ACPI: CPPC: Add ACPI disabled check to acpi_cpc_valid()
Make acpi_cpc_valid() check if ACPI is disabled, so that its callers
don't need to check that separately.  This will also cause the AMD
pstate driver to refuse to load right away when ACPI is disabled.

Also update the warning message in amd_pstate_init() to mention the
ACPI disabled case for completeness.

Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
[ rjw: Subject edits, new changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-08-25 13:55:17 +02:00
Pierre Gondois
da4363457f cpufreq: CPPC: Fix unused-function warning
Building the cppc_cpufreq driver with for arm64 with
CONFIG_ENERGY_MODEL=n triggers the following warnings:
 drivers/cpufreq/cppc_cpufreq.c:550:12: error: ‘cppc_get_cpu_cost’ defined but not used
[-Werror=unused-function]
   550 | static int cppc_get_cpu_cost(struct device *cpu_dev, unsigned long KHz,
       |            ^~~~~~~~~~~~~~~~~
 drivers/cpufreq/cppc_cpufreq.c:481:12: error: ‘cppc_get_cpu_power’ defined but not used
[-Werror=unused-function]
   481 | static int cppc_get_cpu_power(struct device *cpu_dev,
       |            ^~~~~~~~~~~~~~~~~~

Move the Energy Model related functions into specific guards.
This allows to fix the warning and prevent doing extra work
when the Energy Model is not present.

Fixes: 740fcdc2c2 ("cpufreq: CPPC: Register EM based on efficiency class information")
Reported-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Tested-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-05-30 15:33:42 +02:00
Zheng Bin
a3f083e04a cpufreq: CPPC: Fix build error without CONFIG_ACPI_CPPC_CPUFREQ_FIE
If CONFIG_ACPI_CPPC_CPUFREQ_FIE is not set, building fails:

drivers/cpufreq/cppc_cpufreq.c: In function ‘populate_efficiency_class’:
drivers/cpufreq/cppc_cpufreq.c:584:2: error: ‘cppc_cpufreq_driver’ undeclared (first use in this function); did you mean ‘cpufreq_driver’?
  cppc_cpufreq_driver.register_em = cppc_cpufreq_register_em;
  ^~~~~~~~~~~~~~~~~~~
  cpufreq_driver

Make declare of cppc_cpufreq_driver out of CONFIG_ACPI_CPPC_CPUFREQ_FIE
to fix this.

Fixes: 740fcdc2c2 ("cpufreq: CPPC: Register EM based on efficiency class information")
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-05-30 15:31:28 +02:00
Pierre Gondois
2d41dc2380 cpufreq: CPPC: Enable dvfs_possible_from_any_cpu
The communication mean of the _CPC desired performance can be
PCC, System Memory, System IO, or Functional Fixed Hardware (FFH).

PCC, SystemMemory and SystemIo address spaces are available from any
CPU. Thus, dvfs_possible_from_any_cpu should be enabled in such case.
For FFH, let the FFH implementation do smp_call_function_*() calls.

Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-05-19 19:45:34 +02:00
Pierre Gondois
3cc30dd00a cpufreq: CPPC: Enable fast_switch
The communication mean of the _CPC desired performance can be
PCC, System Memory, System IO, or Functional Fixed Hardware.

commit b7898fda5b ("cpufreq: Support for fast frequency switching")
fast_switching is 'for switching CPU frequencies from interrupt
context'.
Writes to SystemMemory and SystemIo are fast and suitable this.
This is not the case for PCC and might not be the case for FFH.

Enable fast_switching for the cppc_cpufreq driver in above cases.

Add cppc_allow_fast_switch() to check the desired performance
register address space and set fast_switching accordingly.

Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-05-19 19:45:34 +02:00
Pierre Gondois
740fcdc2c2 cpufreq: CPPC: Register EM based on efficiency class information
Performance states and energy consumption values are not advertised
in ACPI. In the GicC structure of the MADT table, the "Processor
Power Efficiency Class field" (called efficiency class from now)
allows to describe the relative energy efficiency of CPUs.

To leverage the EM and EAS, the CPPC driver creates a set of
artificial performance states and registers them in the Energy Model
(EM), such as:
- Every 20 capacity unit, a performance state is created.
- The energy cost of each performance state gradually increases.
No power value is generated as only the cost is used in the EM.

During task placement, a task can raise the frequency of its whole
pd. This can make EAS place a task on a pd with CPUs that are
individually less energy efficient.
As cost values are artificial, and to place tasks on CPUs with the
lower efficiency class, a gap in cost values is generated for adjacent
efficiency classes.
E.g.:
- efficiency class = 0, capacity is in [0-1024], so cost values
  are in [0: 51] (one performance state every 20 capacity unit)
- efficiency class = 1, capacity is in [0-1024], cost values
  are in [1*gap+0: 1*gap+51].

The value of the cost gap is chosen to absorb a the energy of 4 CPUs
at their maximum capacity. This means that between:
1- a pd of 4 CPUs, each of them being used at almost their full
   capacity. Their efficiency class is N.
2- a CPU using almost none of its capacity. Its efficiency class is
   N+1
EAS will choose the first option.

This patch also populates the (struct cpufreq_driver).register_em
callback if the valid efficiency_class ACPI values are provided.

Signed-off-by: Pierre Gondois <Pierre.Gondois@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-05-06 21:01:17 +02:00