Commit Graph

866 Commits

Author SHA1 Message Date
Linus Torvalds
f55b0671e3 Merge tag 'pm-6.14-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more power management updates from Rafael Wysocki:
 "These are mostly fixes on top of the previously merged power
  management material with the addition of some teo cpuidle governor
  updates, some of which may also be regarded as fixes:

   - Add missing error handling for syscore_suspend() to the hibernation
     core code (Wentao Liang)

   - Revert a commit that added unused macros (Andy Shevchenko)

   - Synchronize the runtime PM status of devices that were runtime-
     suspended before a system-wide suspend and need to be resumed
     during the subsequent system-wide resume transition (Rafael
     Wysocki)

   - Clean up the teo cpuidle governor and make the handling of short
     idle intervals in it consistent regardless of the properties of
     idle states supplied by the cpuidle driver (Rafael Wysocki)

   - Fix some boost-related issues in cpufreq (Lifeng Zheng)

   - Fix build issues in the s3c64xx and airoha cpufreq drivers (Viresh
     Kumar)

   - Remove unconditional binding of schedutil governor kthreads to the
     affected CPUs if the cpufreq driver indicates that updates can
     happen from any CPU (Christian Loehle)"

* tag 'pm-6.14-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM: sleep: core: Synchronize runtime PM status of parents and children
  cpufreq: airoha: Depends on OF
  PM: Revert "Add EXPORT macros for exporting PM functions"
  PM: hibernate: Add error handling for syscore_suspend()
  cpufreq/schedutil: Only bind threads if needed
  cpufreq: ACPI: Remove set_boost in acpi_cpufreq_cpu_init()
  cpufreq: CPPC: Fix wrong max_freq in policy initialization
  cpufreq: Introduce a more generic way to set default per-policy boost flag
  cpufreq: Fix re-boost issue after hotplugging a CPU
  cpufreq: s3c64xx: Fix compilation warning
  cpuidle: teo: Skip sleep length computation for low latency constraints
  cpuidle: teo: Replace time_span_ns with a flag
  cpuidle: teo: Simplify handling of total events count
  cpuidle: teo: Skip getting the sleep length if wakeups are very frequent
  cpuidle: teo: Simplify counting events used for tick management
  cpuidle: teo: Clarify two code comments
  cpuidle: teo: Drop local variable prev_intercept_idx
  cpuidle: teo: Combine candidate state index checks against 0
  cpuidle: teo: Reorder candidate state index checks
  cpuidle: teo: Rearrange idle state lookup code
2025-01-30 15:10:34 -08:00
Linus Torvalds
68732c0bf9 Merge tag 'pmdomain-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm
Pull pmdomain updates from Ulf Hansson:
 "pmdomain core:
   - Add support for naming idlestates through DT

  pmdomain providers:
   - arm: Explicitly request the current state at init for the SCMI PM
     domain
   - mediatek: Add Airoha CPU PM Domain support for CPU frequency
     scaling
   - ti: Add per-device latency constraint management to the ti_sci PM
     domain

  cpuidle-psci:
   - Enable system-wakeup through GENPD_FLAG_ACTIVE_WAKEUP"

* tag 'pmdomain-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm:
  pmdomain: airoha: Fix compilation error with Clang-20 and Thumb2 mode
  pmdomain: arm: scmi_pm_domain: Send an explicit request to set the current state
  pmdomain: airoha: Add Airoha CPU PM Domain support
  pmdomain: ti_sci: handle wake IRQs for IO daisy chain wakeups
  pmdomain: ti_sci: add wakeup constraint management
  pmdomain: ti_sci: add per-device latency constraint management
  pmdomain: imx-gpcv2: Suppress bind attrs
  pmdomain: imx8m[p]-blk-ctrl: Suppress bind attrs
  pmdomain: core: Support naming idle states
  dt-bindings: power: domain-idle-state: Allow idle-state-name
  cpuidle: psci: Activate GENPD_FLAG_ACTIVE_WAKEUP with OSI
2025-01-24 07:43:35 -08:00
Rafael J. Wysocki
16c8d7586c cpuidle: teo: Skip sleep length computation for low latency constraints
If the idle state exit latency constraint is sufficiently low, it
is better to avoid the additional latency related to calling
tick_nohz_get_sleep_length().  It is also not necessary to compute
the sleep length in that case because shallow idle state selection
will be forced then regardless of the recent wakeup history.

Accordingly, skip the sleep length computation and subsequent
checks of the exit latency constraint is low enough.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Christian Loehle <christian.loehle@arm.com>
Link: https://patch.msgid.link/6122398.lOV4Wx5bFT@rjwysocki.net
2025-01-20 17:19:53 +01:00
Rafael J. Wysocki
65e18e6544 cpuidle: teo: Replace time_span_ns with a flag
After recent updates, the time_span_ns field in struct teo_cpu has
become an indicator on whether or not the most recent wakeup has been
"genuine" which may as well be indicated by a bool field without
calling local_clock(), so update the code accordingly.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Christian Loehle <christian.loehle@arm.com>
Tested-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
Tested-by: Christian Loehle <christian.loehle@arm.com>
Link: https://patch.msgid.link/6010475.MhkbZ0Pkbq@rjwysocki.net
2025-01-20 17:18:02 +01:00
Rafael J. Wysocki
ddcfa79646 cpuidle: teo: Simplify handling of total events count
Instead of computing the total events count from scratch every time,
decay it and add a PULSE value to it in teo_update().

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Christian Loehle <christian.loehle@arm.com>
Tested-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
Tested-by: Christian Loehle <christian.loehle@arm.com>
Link: https://patch.msgid.link/9388883.CDJkKcVGEf@rjwysocki.net
2025-01-20 17:17:56 +01:00
Rafael J. Wysocki
13ed5c4a6d cpuidle: teo: Skip getting the sleep length if wakeups are very frequent
Commit 6da8f9ba5a ("cpuidle: teo: Skip tick_nohz_get_sleep_length()
call in some cases") attempted to reduce the governor overhead in some
cases by making it avoid obtaining the sleep length (the time till the
next timer event) which may be costly.

Among other things, after the above commit, tick_nohz_get_sleep_length()
was not called any more when idle state 0 was to be returned, which
turned out to be problematic and the previous behavior in that respect
was restored by commit 4b20b07ce7 ("cpuidle: teo: Don't count non-
existent intercepts").

However, commit 6da8f9ba5a also caused the governor to avoid calling
tick_nohz_get_sleep_length() on systems where idle state 0 is a "polling"
one (that is, it is not really an idle state, but a loop continuously
executed by the CPU) when the target residency of the idle state to be
returned was low enough, so there was no practical need to refine the
idle state selection in any way.  This change was not removed by the
other commit, so now on systems where idle state 0 is a "polling" one,
tick_nohz_get_sleep_length() is called when idle state 0 is to be
returned, but it is not called when a deeper idle state with
sufficiently low target residency is to be returned.  That is arguably
confusing and inconsistent.

Moreover, there is no specific reason why the behavior in question
should depend on whether or not idle state 0 is a "polling" one.

One way to address this would be to make the governor always call
tick_nohz_get_sleep_length() to obtain the sleep length, but that would
effectively mean reverting commit 6da8f9ba5a and restoring the latency
issue that was the reason for doing it.  This approach is thus not
particularly attractive.

To address it differently, notice that if a CPU is woken up very often,
this is not likely to be caused by timers in the first place (user space
has a default timer slack of 50 us and there are relatively few timers
with a deadline shorter than several microseconds in the kernel) and
even if it were the case, the potential benefit from using a deep idle
state would then be questionable for latency reasons.  Therefore, if the
majority of CPU wakeups occur within several microseconds, it can be
assumed that all wakeups in that range are non-timer and the sleep
length need not be determined.

Accordingly, introduce a new metric for counting wakeups with the
measured idle duration below RESIDENCY_THRESHOLD_NS and modify the idle
state selection to skip the tick_nohz_get_sleep_length() invocation if
idle state 0 has been selected or the target residency of the candidate
idle state is below RESIDENCY_THRESHOLD_NS and the value of the new
metric is at least 1/2 of the total event count.

Since the above requires the measured idle duration to be determined
every time, except for the cases when one of the safety nets has
triggered in which the wakeup is counted as a hit in the deepest
idle state idle residency range, update the handling of those cases
to avoid skipping the idle duration computation when the CPU wakeup
is "genuine".

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/3851791.kQq0lBPeGt@rjwysocki.net
Tested-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
Tested-by: Christian Loehle <christian.loehle@arm.com>
Reviewed-by: Christian Loehle <christian.loehle@arm.com>
[ rjw: Renamed a struct field ]
[ rjw: Fixed typo in the subject and one in a comment ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-01-20 17:17:02 +01:00
Rafael J. Wysocki
d619b5cc67 cpuidle: teo: Simplify counting events used for tick management
Replace the tick_hits metric with a new tick_intercepts one that can be
used directly when deciding whether or not to stop the scheduler tick
and update the governor functional description accordingly.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Christian Loehle <christian.loehle@arm.com>
Tested-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
Tested-by: Christian Loehle <christian.loehle@arm.com>
Link: https://patch.msgid.link/1987985.PYKUYFuaPT@rjwysocki.net
2025-01-20 17:16:53 +01:00
Rafael J. Wysocki
e24f8a55de cpuidle: teo: Clarify two code comments
Rewrite two code comments suposed to explain its behavior that are too
concise or not sufficiently clear.

No functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Christian Loehle <christian.loehle@arm.com>
Tested-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
Tested-by: Christian Loehle <christian.loehle@arm.com>
Link: https://patch.msgid.link/8472971.T7Z3S40VBb@rjwysocki.net
[ rjw: Fixed 2 typos in new comments ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-01-20 17:16:45 +01:00
Rafael J. Wysocki
b9a6af26bd cpuidle: teo: Drop local variable prev_intercept_idx
Local variable prev_intercept_idx in teo_select() is redundant because
it cannot be 0 when candidate state index is 0.

The prev_intercept_idx value is the index of the deepest enabled idle
state, so if it is 0, state 0 is the deepest enabled idle state, in
which case it must be the only enabled idle state, but then teo_select()
would have returned early before initializing prev_intercept_idx.

Thus prev_intercept_idx must be nonzero and the check of it against 0
always passes, so it can be dropped altogether.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Christian Loehle <christian.loehle@arm.com>
Tested-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
Tested-by: Christian Loehle <christian.loehle@arm.com>
Link: https://patch.msgid.link/3327997.aeNJFYEL58@rjwysocki.net
[ rjw: Fixed typo in the changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-01-20 17:16:37 +01:00
Rafael J. Wysocki
ea185406d1 cpuidle: teo: Combine candidate state index checks against 0
There are two candidate state index checks against 0 in teo_select()
that need not be separate, so combine them and update comments around
them.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Christian Loehle <christian.loehle@arm.com>
Tested-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
Tested-by: Christian Loehle <christian.loehle@arm.com>
Link: https://patch.msgid.link/13676346.uLZWGnKmhe@rjwysocki.net
2025-01-20 17:16:30 +01:00
Rafael J. Wysocki
92ce5c07b7 cpuidle: teo: Reorder candidate state index checks
Since constraint_idx may be 0, the candidate state index may change to 0
after assigning constraint_idx to it, so first check if it is greater
than constraint_idx (and update it if so) and then check it against 0.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Christian Loehle <christian.loehle@arm.com>
Tested-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
Tested-by: Christian Loehle <christian.loehle@arm.com>
Link: https://patch.msgid.link/1907276.tdWV9SEqCh@rjwysocki.net
2025-01-20 17:16:16 +01:00
Rafael J. Wysocki
425b753645 cpuidle: teo: Rearrange idle state lookup code
Rearrange code in the idle state lookup loop in teo_select() to make it
somewhat easier to follow and update comments around it.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Christian Loehle <christian.loehle@arm.com>
Tested-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
Tested-by: Christian Loehle <christian.loehle@arm.com>
Link: https://patch.msgid.link/4619938.LvFx2qVVIh@rjwysocki.net
2025-01-20 17:15:44 +01:00
Rafael J. Wysocki
5a597a19a2 cpuidle: teo: Update documentation after previous changes
After previous changes, the description of the teo governor in the
documentation comment does not match the code any more, so update it
as appropriate.

Fixes: 4499143980 ("cpuidle: teo: Remove recent intercepts metric")
Fixes: 2662342079 ("cpuidle: teo: Gather statistics regarding whether or not to stop the tick")
Fixes: 6da8f9ba5a ("cpuidle: teo: Skip tick_nohz_get_sleep_length() call in some cases")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Christian Loehle <christian.loehle@arm.com>
Link: https://patch.msgid.link/6120335.lOV4Wx5bFT@rjwysocki.net
[ rjw: Corrected 3 typos found by Christian ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-01-13 20:46:27 +01:00
Javier Carrasco
7e25044b80 cpuidle: riscv-sbi: fix device node release in early exit of for_each_possible_cpu
The 'np' device_node is initialized via of_cpu_device_node_get(), which
requires explicit calls to of_node_put() when it is no longer required
to avoid leaking the resource.

Instead of adding the missing calls to of_node_put() in all execution
paths, use the cleanup attribute for 'np' by means of the __free()
macro, which automatically calls of_node_put() when the variable goes
out of scope. Given that 'np' is only used within the
for_each_possible_cpu(), reduce its scope to release the nood after
every iteration of the loop.

Fixes: 6abf32f1d9 ("cpuidle: Add RISC-V SBI CPU idle driver")
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Link: https://lore.kernel.org/r/20241116-cpuidle-riscv-sbi-cleanup-v3-1-a3a46372ce08@gmail.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-08 10:40:14 -08:00
Patrick Delaunay
bfd5859709 cpuidle: psci: Activate GENPD_FLAG_ACTIVE_WAKEUP with OSI
Set GENPD_FLAG_ACTIVE_WAKEUP flag for domain psci cpuidle when OSI
is activated, then when a device is set as the wake-up source using
device_set_wakeup_path, the PSCI power domain could be retained to allow
so that the associated device can wake up the system.

With this flag, for S2IDLE system-wide suspend, the wake-up path is
managed in each device driver and is tested in the power framework:
a PSCI domain is only turned off when GENPD_FLAG_ACTIVE_WAKEUP is enabled
and the associated device is not in the wake-up path, so PSCI CPUIdle
selects the lowest level in the PSCI topology according to the wake-up
path.

This patch is a preliminary step to support PSCI OSI on the STM32MP25
platform with the D1 domain (power-domain-cluster) for the A35 cortex
cluster and for the associated peripherals including EXTI1 which manages
the wake-up interrupts for domain D1.

Signed-off-by: Patrick Delaunay <patrick.delaunay@foss.st.com>
Message-ID: <20241119141827.1.I6129b16ec6b558efc1707861db87e55bf7022f62@changeid>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2024-12-10 11:57:32 +01:00
Linus Torvalds
e70140ba0d Get rid of 'remove_new' relic from platform driver struct
The continual trickle of small conversion patches is grating on me, and
is really not helping.  Just get rid of the 'remove_new' member
function, which is just an alias for the plain 'remove', and had a
comment to that effect:

  /*
   * .remove_new() is a relic from a prototype conversion of .remove().
   * New drivers are supposed to implement .remove(). Once all drivers are
   * converted to not use .remove_new any more, it will be dropped.
   */

This was just a tree-wide 'sed' script that replaced '.remove_new' with
'.remove', with some care taken to turn a subsequent tab into two tabs
to make things line up.

I did do some minimal manual whitespace adjustment for places that used
spaces to line things up.

Then I just removed the old (sic) .remove_new member function, and this
is the end result.  No more unnecessary conversion noise.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-12-01 15:12:43 -08:00
Linus Torvalds
91dbbe6c9f Merge tag 'riscv-for-linus-6.13-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-v updates from Palmer Dabbelt:

 - Support for pointer masking in userspace

 - Support for probing vector misaligned access performance

 - Support for qspinlock on systems with Zacas and Zabha

* tag 'riscv-for-linus-6.13-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (38 commits)
  RISC-V: Remove unnecessary include from compat.h
  riscv: Fix default misaligned access trap
  riscv: Add qspinlock support
  dt-bindings: riscv: Add Ziccrse ISA extension description
  riscv: Add ISA extension parsing for Ziccrse
  asm-generic: ticket-lock: Add separate ticket-lock.h
  asm-generic: ticket-lock: Reuse arch_spinlock_t of qspinlock
  riscv: Implement xchg8/16() using Zabha
  riscv: Implement arch_cmpxchg128() using Zacas
  riscv: Improve zacas fully-ordered cmpxchg()
  riscv: Implement cmpxchg8/16() using Zabha
  dt-bindings: riscv: Add Zabha ISA extension description
  riscv: Implement cmpxchg32/64() using Zacas
  riscv: Do not fail to build on byte/halfword operations with Zawrs
  riscv: Move cpufeature.h macros into their own header
  KVM: riscv: selftests: Add Smnpm and Ssnpm to get-reg-list test
  RISC-V: KVM: Allow Smnpm and Ssnpm extensions for guests
  riscv: hwprobe: Export the Supm ISA extension
  riscv: selftests: Add a pointer masking test
  riscv: Allow ptrace control of the tagged address ABI
  ...
2024-11-27 11:19:09 -08:00
Linus Torvalds
42d9e8b7cc Merge tag 'powerpc-6.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:

 - Rework kfence support for the HPT MMU to work on systems with >= 16TB
   of RAM.

 - Remove the powerpc "maple" platform, used by the "Yellow Dog
   Powerstation".

 - Add support for DYNAMIC_FTRACE_WITH_CALL_OPS,
   DYNAMIC_FTRACE_WITH_DIRECT_CALLS & BPF Trampolines.

 - Add support for running KVM nested guests on Power11.

 - Other small features, cleanups and fixes.

Thanks to Amit Machhiwal, Arnd Bergmann, Christophe Leroy, Costa
Shulyupin, David Hunter, David Wang, Disha Goel, Gautam Menghani, Geert
Uytterhoeven, Hari Bathini, Julia Lawall, Kajol Jain, Keith Packard,
Lukas Bulwahn, Madhavan Srinivasan, Markus Elfring, Michal Suchanek,
Ming Lei, Mukesh Kumar Chaurasiya, Nathan Chancellor, Naveen N Rao,
Nicholas Piggin, Nysal Jan K.A, Paulo Miguel Almeida, Pavithra Prakash,
Ritesh Harjani (IBM), Rob Herring (Arm), Sachin P Bappalige, Shen
Lichuan, Simon Horman, Sourabh Jain, Thomas Weißschuh, Thorsten Blum,
Thorsten Leemhuis, Venkat Rao Bagalkote, Zhang Zekun, and zhang jiao.

* tag 'powerpc-6.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (89 commits)
  EDAC/powerpc: Remove PPC_MAPLE drivers
  powerpc/perf: Add per-task/process monitoring to vpa_pmu driver
  powerpc/kvm: Add vpa latency counters to kvm_vcpu_arch
  docs: ABI: sysfs-bus-event_source-devices-vpa-pmu: Document sysfs event format entries for vpa_pmu
  powerpc/perf: Add perf interface to expose vpa counters
  MAINTAINERS: powerpc: Mark Maddy as "M"
  powerpc/Makefile: Allow overriding CPP
  powerpc-km82xx.c: replace of_node_put() with __free
  ps3: Correct some typos in comments
  powerpc/kexec: Fix return of uninitialized variable
  macintosh: Use common error handling code in via_pmu_led_init()
  powerpc/powermac: Use of_property_match_string() in pmac_has_backlight_type()
  powerpc: remove dead config options for MPC85xx platform support
  powerpc/xive: Use cpumask_intersects()
  selftests/powerpc: Remove the path after initialization.
  powerpc/xmon: symbol lookup length fixed
  powerpc/ep8248e: Use %pa to format resource_size_t
  powerpc/ps3: Reorganize kerneldoc parameter names
  KVM: PPC: Book3S HV: Fix kmv -> kvm typo
  powerpc/sstep: make emulate_vsx_load and emulate_vsx_store static
  ...
2024-11-23 10:44:31 -08:00
Rafael J. Wysocki
f65ee094ed cpuidle: Do not return from cpuidle_play_dead() on callback failures
If the :enter_dead() idle state callback fails for a certain state,
there may be still a shallower state for which it will work.

Because the only caller of cpuidle_play_dead(), native_play_dead(),
falls back to hlt_play_dead() if it returns an error, it should
better try all of the idle states for which :enter_dead() is present
before failing, so change it accordingly.

Also notice that the :enter_dead() state callback is not expected
to return on success (the CPU should be "dead" then), so make
cpuidle_play_dead() ignore its return value.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-by: Mario Limonciello <mario.limonciello@amd.com> # 6.12-rc7
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://patch.msgid.link/3318440.aeNJFYEL58@rjwysocki.net
2024-11-19 21:46:51 +01:00
Michael Ellerman
b23b9edf64 powerpc/machdep: Drop include of dma-mapping.h
Drop the include of dma-mapping.h in machdep.h, replace it with forward
declarations of struct device and struct pci_dev, and include time64.h
and page.h which are required for time64_t and pgprot_t respectively.

Add direct includes of some other headers to some files that were
getting them via machdep.h.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241009051826.132805-2-mpe@ellerman.id.au
2024-10-29 23:01:05 +11:00
Shen Lichuan
ee702fdaf1 cpuidle: Correct some typos in comments
Fixed some confusing typos that were currently identified with codespell,
the details are as follows:

 -in the code comments:
  drivers/cpuidle/cpuidle-arm.c:142: registeration ==> registration
  drivers/cpuidle/cpuidle-qcom-spm.c:51: accidently ==> accidentally
  drivers/cpuidle/cpuidle.c:409: dependant ==> dependent
  drivers/cpuidle/driver.c:264: occuring ==> occurring
  drivers/cpuidle/driver.c:299: occuring ==> occurring

Signed-off-by: Shen Lichuan <shenlichuan@vivo.com>
Link: https://patch.msgid.link/20240927081018.8608-1-shenlichuan@vivo.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-10-18 18:44:32 +02:00
Palmer Dabbelt
7727020695 Merge patch series "cpuidle: riscv-sbi: Allow cpuidle pd used by other devices"
Nick Hu <nick.hu@sifive.com> says:

Add this patchset so the devices that inside the cpu/cluster power domain
can use the cpuidle pd to register the genpd notifier to handle the PM
when cpu/cluster is going to enter a deeper sleep state.

* b4-shazam-merge:
  cpuidle: riscv-sbi: Add cpuidle_disabled() check
  cpuidle: riscv-sbi: Move sbi_cpuidle_init to arch_initcall

Link: https://lore.kernel.org/r/20240814054434.3563453-1-nick.hu@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-10-17 11:34:46 -07:00
Nick Hu
27b4d6aa29 cpuidle: riscv-sbi: Add cpuidle_disabled() check
The consumer devices that inside the cpu/cluster power domain may register
the genpd notifier where their power domains point to the pd nodes under
'/cpus/power-domains'. If the cpuidle.off==1, the genpd notifier will fail
due to sbi_cpuidle_pd_allow_domain_state is not set. We also need the
sbi_cpuidle_cpuhp_up/down to invoke the callbacks. Therefore adding a
cpuidle_disabled() check before cpuidle_register() to address the issue.

Signed-off-by: Nick Hu <nick.hu@sifive.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20240814054434.3563453-3-nick.hu@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-10-17 11:34:43 -07:00
Nick Hu
f8a23e3b79 cpuidle: riscv-sbi: Move sbi_cpuidle_init to arch_initcall
Move the sbi_cpuidle_init to the arch_initcall to prevent the consumer
devices from being deferred.

Signed-off-by: Nick Hu <nick.hu@sifive.com>
Link: https://lore.kernel.org/lkml/CAKddAkAOUJSnM=Px-YO=U6pis_7mODHZbmYqcgEzXikriqYvXQ@mail.gmail.com/
Suggested-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20240814054434.3563453-2-nick.hu@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-10-17 11:34:42 -07:00
Christian Loehle
38f83090f5 cpuidle: menu: Remove iowait influence
Remove CPU iowaiters influence on idle state selection.

Remove the menu notion of performance multiplier which increased with
the number of tasks that went to iowait sleep on this CPU and haven't
woken up yet.

Relying on iowait for cpuidle is problematic for a few reasons:

 1. There is no guarantee that an iowaiting task will wake up on the
    same CPU.

 2. The task being in iowait says nothing about the idle duration, we
    could be selecting shallower states for a long time.

 3. The task being in iowait doesn't always imply a performance hit
    with increased latency.

 4. If there is such a performance hit, the number of iowaiting tasks
    doesn't directly correlate.

 5. The definition of iowait altogether is vague at best, it is
    sprinkled across kernel code.

Signed-off-by: Christian Loehle <christian.loehle@arm.com>
Link: https://patch.msgid.link/20240905092645.2885200-2-christian.loehle@arm.com
[ rjw: Minor edits in the changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-09-30 17:00:55 +02:00