Add <asm/smp.h> for cpuid_to_hartid_map etc.
This is needed for both SMP and non-SMP builds, but not having it
causes a build error for non-SMP:
drivers/cpuidle/cpuidle-riscv-sbi.c: In function 'sbi_cpuidle_init_cpu':
drivers/cpuidle/cpuidle-riscv-sbi.c:350:26: error: implicit declaration of function 'cpuid_to_hartid_map' [-Werror=implicit-function-declaration]
Fixes: 6abf32f1d9 ("cpuidle: Add RISC-V SBI CPU idle driver")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
This series adds RISC-V CPU Idle support using SBI HSM suspend function.
The RISC-V SBI CPU idle driver added by this series is highly inspired
from the ARM PSCI CPU idle driver.
Special thanks Sandeep Tripathy for providing early feeback on SBI HSM
support in all above projects (RISC-V SBI specification, OpenSBI, and
Linux RISC-V).
* palmer/riscv-idle:
RISC-V: Enable RISC-V SBI CPU Idle driver for QEMU virt machine
dt-bindings: Add common bindings for ARM and RISC-V idle states
cpuidle: Add RISC-V SBI CPU idle driver
cpuidle: Factor-out power domain related code from PSCI domain driver
RISC-V: Add SBI HSM suspend related defines
RISC-V: Add arch functions for non-retentive suspend entry/exit
RISC-V: Rename relocate() and make it global
RISC-V: Enable CPU_IDLE drivers
Pull ARM driver updates from Arnd Bergmann:
"There are a few separately maintained driver subsystems that we merge
through the SoC tree, notable changes are:
- Memory controller updates, mainly for Tegra and Mediatek SoCs, and
clarifications for the memory controller DT bindings
- SCMI firmware interface updates, in particular a new transport
based on OPTEE and support for atomic operations.
- Cleanups to the TEE subsystem, refactoring its memory management
For SoC specific drivers without a separate subsystem, changes include
- Smaller updates and fixes for TI, AT91/SAMA5, Qualcomm and NXP
Layerscape SoCs.
- Driver support for Microchip SAMA5D29, Tesla FSD, Renesas RZ/G2L,
and Qualcomm SM8450.
- Better power management on Mediatek MT81xx, NXP i.MX8MQ and older
NVIDIA Tegra chips"
* tag 'arm-drivers-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (154 commits)
ARM: spear: fix typos in comments
soc/microchip: fix invalid free in mpfs_sys_controller_delete
soc: s4: Add support for power domains controller
dt-bindings: power: add Amlogic s4 power domains bindings
ARM: at91: add support in soc driver for new SAMA5D29
soc: mediatek: mmsys: add sw0_rst_offset in mmsys driver data
dt-bindings: memory: renesas,rpc-if: Document RZ/V2L SoC
memory: emif: check the pointer temp in get_device_details()
memory: emif: Add check for setup_interrupts
dt-bindings: arm: mediatek: mmsys: add support for MT8186
dt-bindings: mediatek: add compatible for MT8186 pwrap
soc: mediatek: pwrap: add pwrap driver for MT8186 SoC
soc: mediatek: mmsys: add mmsys reset control for MT8186
soc: mediatek: mtk-infracfg: Disable ACP on MT8192
soc: ti: k3-socinfo: Add AM62x JTAG ID
soc: mediatek: add MTK mutex support for MT8186
soc: mediatek: mmsys: add mt8186 mmsys routing table
soc: mediatek: pm-domains: Add support for mt8186
dt-bindings: power: Add MT8186 power domains
soc: mediatek: pm-domains: Add support for mt8195
...
The RISC-V SBI HSM extension provides HSM suspend call which can
be used by Linux RISC-V to enter platform specific low-power state.
This patch adds a CPU idle driver based on RISC-V SBI calls which
will populate idle states from device tree and use SBI calls to
entry these idle states.
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Acked-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
The generic power domain related code in PSCI domain driver is largely
independent of PSCI and can be shared with RISC-V SBI domain driver
hence we factor-out this code into dt_idle_genpd.c and dt_idle_genpd.h.
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Call cpuidle_poll_state_init() only if it is needed to avoid doing
useless work.
Signed-off-by: Li RongQing <lirongqing@baidu.com>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
qcom_scm_set_cold/warm_boot_addr() currently take a cpumask parameter,
but it's not very useful because at the end we always set the same entry
address for all CPUs. This also allows speeding up probe of
cpuidle-qcom-spm a bit because only one SCM call needs to be made to
the TrustZone firmware, instead of one per CPU.
The main reason for this change is that it allows implementing the
"multi-cluster" variant of the set_boot_addr() call more easily
without having to rely on functions that break in certain build
configurations or that are not exported to modules.
Signed-off-by: Stephan Gerhold <stephan@gerhold.net>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20211201130505.257379-4-stephan@gerhold.net
At the moment, the "qcom-spm-cpuidle" platform device is always created,
even if none of the CPUs is actually managed by the SPM. On non-qcom
platforms this will result in infinite probe-deferral due to the
failing qcom_scm_is_available() call.
To avoid this, look through the CPU DT nodes and check if there is
actually any CPU managed by a SPM (as indicated by the qcom,saw property).
It should also be available because e.g. MSM8916 has qcom,saw defined
but it's typically not enabled with ARM64/PSCI firmwares.
This is needed in preparation of a follow-up change that calls
qcom_scm_set_warm_boot_addr() a single time before registering any
cpuidle drivers. Otherwise this call might be made even on devices
that have this driver enabled but actually make use of PSCI.
Fixes: 60f3692b5f ("cpuidle: qcom_spm: Detach state machine from main SPM handling")
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lore.kernel.org/r/86e3e09f-a8d7-3dff-3fc6-ddd7d30c5d78@samsung.com/
Signed-off-by: Stephan Gerhold <stephan@gerhold.net>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20211201130505.257379-2-stephan@gerhold.net
There are currently 2 ways to create a set of sysfs files for a
kobj_type, through the default_attrs field, and the default_groups
field. Move the cpuidle sysfs code to use default_groups field which
has been the preferred way since aa30f47cf6 ("kobject: Add support for
default attribute groups to kobj_type") so that we can soon get rid of
the obsolete default_attrs field.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Fix function name in sysfs.c kernel-doc comment
to remove a warning found by running scripts/kernel-doc,
which is caused by using 'make W=1'.
drivers/cpuidle/sysfs.c:512: warning: expecting prototype for
cpuidle_remove_driver_sysfs(). Prototype was for
cpuidle_remove_state_sysfs() instead
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
[ rjw: Subject edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The word `these' in a comment is repeated, so drop one.
Signed-off-by: Jason Wang <wangborong@cdjrlc.com>
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Pull ARM SoC driver updates from Arnd Bergmann:
"These are all the driver updates for SoC specific drivers. There are a
couple of subsystems with individual maintainers picking up their
patches here:
- The reset controller subsystem add support for a few new SoC
variants to existing drivers, along with other minor improvements
- The OP-TEE subsystem gets a driver for the ARM FF-A transport
- The memory controller subsystem has improvements for Tegra,
Mediatek, Renesas, Freescale and Broadcom specific drivers.
- The tegra cpuidle driver changes get merged through this tree this
time. There are only minor changes, but they depend on other tegra
driver updates here.
- The ep93xx platform finally moves to using the drivers/clk/
subsystem, moving the code out of arch/arm in the process. This
depends on a small sound driver change that is included here as
well.
- There are some minor updates for Qualcomm and Tegra specific
firmware drivers.
The other driver updates are mainly for drivers/soc, which contains a
mixture of vendor specific drivers that don't really fit elsewhere:
- Mediatek drivers gain more support for MT8192, with new support for
hw-mutex and mmsys routing, plus support for reset lines in the
mmsys driver.
- Qualcomm gains a new "sleep stats" driver, and support for the
"Generic Packet Router" in the APR driver.
- There is a new user interface for routing the UARTS on ASpeed BMCs,
something that apparently nobody else has needed so far.
- More drivers can now be built as loadable modules, in particular
for Broadcom and Samsung platforms.
- Lots of improvements to the TI sysc driver for better
suspend/resume support"
Finally, there are lots of minor cleanups and new device IDs for
amlogic, renesas, tegra, qualcomm, mediateka, samsung, imx,
layerscape, allwinner, broadcom, and omap"
* tag 'drivers-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (179 commits)
optee: Fix spelling mistake "reclain" -> "reclaim"
Revert "firmware: qcom: scm: Add support for MC boot address API"
qcom: spm: allow compile-testing
firmware: arm_ffa: Remove unused 'compat_version' variable
soc: samsung: exynos-chipid: add exynosautov9 SoC support
firmware: qcom: scm: Don't break compile test on non-ARM platforms
soc: qcom: smp2p: Add of_node_put() before goto
soc: qcom: apr: Add of_node_put() before return
soc: qcom: qcom_stats: Fix client votes offset
soc: qcom: rpmhpd: fix sm8350_mxc's peer domain
dt-bindings: arm: cpus: Document qcom,msm8916-smp enable-method
ARM: qcom: Add qcom,msm8916-smp enable-method identical to MSM8226
firmware: qcom: scm: Add support for MC boot address API
soc: qcom: spm: Add 8916 SPM register data
dt-bindings: soc: qcom: spm: Document qcom,msm8916-saw2-v3.0-cpu
soc: qcom: socinfo: Add PM8150C and SMB2351 models
firmware: qcom_scm: Fix error retval in __qcom_scm_is_call_available()
soc: aspeed: Add UART routing support
soc: fsl: dpio: rename the enqueue descriptor variable
soc: fsl: dpio: use an explicit NULL instead of 0
...
Qualcomm driver updates for v5.16
This drops the use of power-domains for exposing the load_state from the
QMP driver to clients, to avoid issues related to system suspend.
SMP2P becomes wakeup capable, to allow dying remoteprocs to wake up
Linux from suspend to perform recovery.
It adds RPM power-domain support for SM6350 and MSM8953 and base RPM
support for MSM8953 and QCM2290.
It adds support for MSM8996, SDM630 and SDM660 in the SPM driver, which
will enable the introduction of proper voltage scaling of the CPU
subsystem.
Support for releasing secondary CPUs on MSM8226 is introduced.
The Asynchronous Packet Router (APR) driver is extended to support the
new Generic Packet Router (GPR) variant, which is used to communicate
with the firmware in the new AudioReach audio driver.
Lastly it transitions a number of drivers to safer string functions, as
well as switching things to use devm_platform_ioremap_resource().
* tag 'qcom-drivers-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux: (40 commits)
soc: qcom: apr: Add GPR support
soc: dt-bindings: qcom: add gpr bindings
soc: qcom: apr: make code more reuseable
soc: dt-bindings: qcom: apr: deprecate qcom,apr-domain property
soc: dt-bindings: qcom: apr: convert to yaml
dt-bindings: soc: qcom: aoss: Delete unused power-domain definitions
dt-bindings: msm/dp: Remove aoss-qmp header
soc: qcom: aoss: Drop power domain support
dt-bindings: soc: qcom: aoss: Drop the load state power-domain
soc: qcom: smp2p: Add wakeup capability to SMP2P IRQ
dt-bindings: power: rpmpd: Add SM6350 to rpmpd binding
dt-bindings: soc: qcom: aoss: Add SM6350 compatible
soc: qcom: llcc: Disable MMUHWT retention
soc: qcom: smd-rpm: Add QCM2290 compatible
dt-bindings: soc: qcom: smd-rpm: Add QCM2290 compatible
firmware: qcom_scm: Add compatible for MSM8953 SoC
dt-bindings: firmware: qcom-scm: Document msm8953 bindings
soc: qcom: pdr: Prefer strscpy over strcpy
soc: qcom: rpmh-rsc: Make use of the helper function devm_platform_ioremap_resource_byname()
soc: qcom: gsbi: Make use of the helper function devm_platform_ioremap_resource()
...
Link: https://lore.kernel.org/r/20211012173442.1017010-1-bjorn.andersson@linaro.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Check whether PMC is ready before proceeding with the cpuidle registration.
This fixes racing with the PMC driver probe order, which results in a
disabled deepest CC6 idling state if cpuidle driver is probed before the
PMC.
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Commit c343bf1ba5 ("cpuidle: Fix three reference count leaks")
fixes the cleanup of kobjects; however, it removes kfree() calls
altogether, leading to memory leaks.
Fix those and also defer the initialization of dev->kobj_dev until
after the error check, so that we do not end up with a dangling
pointer.
Fixes: c343bf1ba5 ("cpuidle: Fix three reference count leaks")
Signed-off-by: Anel Orazgaliyeva <anelkz@amazon.de>
Suggested-by: Aman Priyadarshi <apeureka@amazon.de>
[ rjw: Subject edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
In commit a871be6b8e ("cpuidle: Convert Qualcomm SPM driver to a generic
CPUidle driver") the SPM driver has been converted to a
generic CPUidle driver: that was mainly made to simplify the
driver and that was a great accomplishment;
Though, at that time, this driver was only applicable to ARM 32-bit SoCs,
lacking logic about the handling of newer generation SAW.
In preparation for the enablement of SPM features on AArch64/ARM64,
split the cpuidle-qcom-spm driver in two: the CPUIdle related
state machine (currently used only on ARM SoCs) stays there, while
the SPM communication handling lands back in soc/qcom/spm.c and
also making sure to not discard the simplifications that were
introduced in the aforementioned commit.
Since now the "two drivers" are split, the SCM dependency in the
main SPM handling is gone and for this reason it was also possible
to move the SPM initialization early: this will also make sure that
whenever the SAW CPUIdle driver is getting initialized, the SPM
driver will be ready to do the job.
Please note that the anticipation of the SPM initialization was
also done to optimize the boot times on platforms that have their
CPU/L2 idle states managed by other means (such as PSCI), while
needing SAW initialization for other purposes, like AVS control.
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@somainline.org>
Reviewed-by: Stephan Gerhold <stephan@gerhold.net>
Tested-by: Stephan Gerhold <stephan@gerhold.net>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20210729155609.608159-2-angelogioacchino.delregno@somainline.org
Pull MFD updates from Lee Jones:
"Core Frameworks:
- Add support for registering devices via MFD cells to Simple MFD (I2C)
New Drivers:
- Add support for Renesas Synchronization Management Unit (SMU)
New Device Support:
- Add support for N5010 to Intel M10 BMC
- Add support for Cannon Lake to Intel LPSS ACPI
- Add support for Samsung SSG{1,2} to ST-Ericsson's U8500 family
- Add support for TQMx110EB and TQMxE40x to TQ-Systems PLD TQMx86
New Functionality:
- Add support for GPIO to Intel LPC ICH
- Add support for Reset to Texas Instruments TPS65086
Fix-ups:
- Trivial, sorting, whitespace, renaming, etc; mt6360-core, db8500-prcmu-regs, tqmx86
- Device Tree fiddling; syscon, axp20x, qcom,pm8008, ti,tps65086, brcm,cru
- Use proper APIs for IRQ map resolution; ab8500-core, stmpe, tc3589x, wm8994-irq
- Pass 'supplied-from' property through axp288_fuel_gauge via swnode
- Remove unused file entry; MAINTAINERS
- Make interrupt line optional; tps65086
- Rename db8500-cpuidle driver symbol; db8500-prcmu
- Remove support for unused hardware; tqmx86
- Provide a standard LPC clock frequency for unknown boards; tqmx86
- Remove unused code; ti_am335x_tscadc
- Use of_iomap() instead of ioremap(); syscon
Bug Fixes:
- Clear GPIO IRQ resource flags when no IRQ is set; tqmx86
- Fix incorrect/misleading frequencies; db8500-prcmu
- Mitigate namespace clash with other GPIOBASE users"
* tag 'mfd-next-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (31 commits)
mfd: lpc_sch: Rename GPIOBASE to prevent build error
mfd: syscon: Use of_iomap() instead of ioremap()
dt-bindings: mfd: Add Broadcom CRU
mfd: ti_am335x_tscadc: Delete superfluous error message
mfd: tqmx86: Assume 24MHz LPC clock for unknown boards
mfd: tqmx86: Add support for TQ-Systems DMI IDs
mfd: tqmx86: Add support for TQMx110EB and TQMxE40x
mfd: tqmx86: Fix typo in "platform"
mfd: tqmx86: Remove incorrect TQMx90UC board ID
mfd: tqmx86: Clear GPIO IRQ resource when no IRQ is set
mfd: simple-mfd-i2c: Add support for registering devices via MFD cells
mfd/cpuidle: ux500: Rename driver symbol
mfd: tps65086: Add cell entry for reset driver
mfd: tps65086: Make interrupt line optional
dt-bindings: mfd: Convert tps65086.txt to YAML
MAINTAINERS: Adjust ARM/NOMADIK/Ux500 ARCHITECTURES to file renaming
mfd: db8500-prcmu: Handle missing FW variant
mfd: db8500-prcmu: Rename register header
mfd: axp20x: Add supplied-from property to axp288_fuel_gauge cell
mfd: Don't use irq_create_mapping() to resolve a mapping
...
Pull powerpc updates from Michael Ellerman:
- Convert pseries & powernv to use MSI IRQ domains.
- Rework the pseries CPU numbering so that CPUs that are removed, and
later re-added, are given a CPU number on the same node as
previously, when possible.
- Add support for a new more flexible device-tree format for specifying
NUMA distances.
- Convert powerpc to GENERIC_PTDUMP.
- Retire sbc8548 and sbc8641d board support.
- Various other small features and fixes.
Thanks to Alexey Kardashevskiy, Aneesh Kumar K.V, Anton Blanchard,
Cédric Le Goater, Christophe Leroy, Emmanuel Gil Peyrot, Fabiano Rosas,
Fangrui Song, Finn Thain, Gautham R. Shenoy, Hari Bathini, Joel
Stanley, Jordan Niethe, Kajol Jain, Laurent Dufour, Leonardo Bras, Lukas
Bulwahn, Marc Zyngier, Masahiro Yamada, Michal Suchanek, Nathan
Chancellor, Nicholas Piggin, Parth Shah, Paul Gortmaker, Pratik R.
Sampat, Randy Dunlap, Sebastian Andrzej Siewior, Srikar Dronamraju, Wan
Jiabing, Xiongwei Song, and Zheng Yongjun.
* tag 'powerpc-5.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (154 commits)
powerpc/bug: Cast to unsigned long before passing to inline asm
powerpc/ptdump: Fix generic ptdump for 64-bit
KVM: PPC: Fix clearing never mapped TCEs in realmode
powerpc/pseries/iommu: Rename "direct window" to "dma window"
powerpc/pseries/iommu: Make use of DDW for indirect mapping
powerpc/pseries/iommu: Find existing DDW with given property name
powerpc/pseries/iommu: Update remove_dma_window() to accept property name
powerpc/pseries/iommu: Reorganize iommu_table_setparms*() with new helper
powerpc/pseries/iommu: Add ddw_property_create() and refactor enable_ddw()
powerpc/pseries/iommu: Allow DDW windows starting at 0x00
powerpc/pseries/iommu: Add ddw_list_new_entry() helper
powerpc/pseries/iommu: Add iommu_pseries_alloc_table() helper
powerpc/kernel/iommu: Add new iommu_table_in_use() helper
powerpc/pseries/iommu: Replace hard-coded page shift
powerpc/numa: Update cpu_cpu_map on CPU online/offline
powerpc/numa: Print debug statements only when required
powerpc/numa: convert printk to pr_xxx
powerpc/numa: Drop dbg in favour of pr_debug
powerpc/smp: Enable CACHE domain for shared processor
powerpc/smp: Update cpu_core_map on all PowerPc systems
...
The PRCMU driver defines this as a DT node but there are no bindings
for it and it needs no data from the device tree. Just spawn the
device directly in the same way as the watchdog.
Name it "db8500-cpuidle" since there are no ambitions to support any
more SoCs than this one.
This rids this annoying boot message:
[ 0.032610] cpuidle-dbx500: Failed to locate of_node [id: 0]
However I think the device still spawns and work just fine, despite
not finding a device tree node.
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: linux-pm@vger.kernel.org
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
After commit 7cbd631d4dec ("cpuidle: pseries: Fixup CEDE0 latency only
for POWER10 onwards"), pseries_idle_probe() is no longer inlined when
compiling with clang, which causes a modpost warning:
WARNING: modpost: vmlinux.o(.text+0xc86a54): Section mismatch in
reference from the function pseries_idle_probe() to the function
.init.text:fixup_cede0_latency()
The function pseries_idle_probe() references
the function __init fixup_cede0_latency().
This is often because pseries_idle_probe lacks a __init
annotation or the annotation of fixup_cede0_latency is wrong.
pseries_idle_probe() is a non-init function, which calls
fixup_cede0_latency(), which is an init function, explaining the
mismatch. pseries_idle_probe() is only called from
pseries_processor_idle_init(), which is an init function, so mark
pseries_idle_probe() as __init so there is no more warning.
Fixes: 054e44ba99 ("cpuidle: pseries: Add function to parse extended CEDE records")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210803211547.1093820-1-nathan@kernel.org
Rename two local variables in teo_select() so that their names better
reflect their purpose.
No functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
There are three mistakes in the loop in teo_select() that is looking
for an alternative candidate idle state. First, it should walk all
of the idle states shallower than the current candidate one,
including all of the disabled ones, but it terminates after the first
enabled idle state. Second, it should not terminate its last step
if idle state 0 is disabled (which is related to the first issue).
Finally, it may return the current alternative candidate idle state
prematurely if the time span criterion is not met by the idle state
under consideration at the moment.
To address the issues mentioned above, make the loop in question walk
all of the idle states shallower than the current candidate idle state
all the way down to idle state 0 and rearrange the checks in it.
Fixes: 77577558f2 ("cpuidle: teo: Rework most recent idle duration values treatment")
Reported-by: Doug Smythies <dsmythies@telus.net>
Tested-by: Doug Smythies <dsmythies@telus.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Currently in fixup_cede0_latency() code, we perform the fixup the
CEDE(0) exit latency value only if minimum advertized extended CEDE
latency values are less than 10us. This was done so as to not break
the expected behaviour on POWER8 platforms where the advertised
latency was higher than the default 10us, which would delay the SMT
folding on the core.
However, after the earlier patch "cpuidle/pseries: Fixup CEDE0 latency
only for POWER10 onwards", we can be sure that the fixup of CEDE0
latency is going to happen only from POWER10 onwards. Hence
unconditionally use the minimum exit latency provided by the platform.
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1626676399-15975-3-git-send-email-ego@linux.vnet.ibm.com
Commit d947fb4c96 ("cpuidle: pseries: Fixup exit latency for
CEDE(0)") sets the exit latency of CEDE(0) based on the latency values
of the Extended CEDE states advertised by the platform
On POWER9 LPARs, the firmwares advertise a very low value of 2us for
CEDE1 exit latency on a Dedicated LPAR. The latency advertized by the
PHYP hypervisor corresponds to the latency required to wakeup from the
underlying hardware idle state. However the wakeup latency from the
LPAR perspective should include
1. The time taken to transition the CPU from the Hypervisor into the
LPAR post wakeup from platform idle state
2. Time taken to send the IPI from the source CPU (waker) to the idle
target CPU (wakee).
1. can be measured via timer idle test, where we queue a timer, say
for 1ms, and enter the CEDE state. When the timer fires, in the timer
handler we compute how much extra timer over the expected 1ms have we
consumed. On a a POWER9 LPAR the numbers are
CEDE latency measured using a timer (numbers in ns)
N Min Median Avg 90%ile 99%ile Max Stddev
400 2601 5677 5668.74 5917 6413 9299 455.01
1. and 2. combined can be determined by an IPI latency test where we
send an IPI to an idle CPU and in the handler compute the time
difference between when the IPI was sent and when the handler ran. We
see the following numbers on POWER9 LPAR.
CEDE latency measured using an IPI (numbers in ns)
N Min Median Avg 90%ile 99%ile Max Stddev
400 711 7564 7369.43 8559 9514 9698 1200.01
Suppose, we consider the 99th percentile latency value measured using
the IPI to be the wakeup latency, the value would be 9.5us This is in
the ballpark of the default value of 10us.
Hence, use the exit latency of CEDE(0) based on the latency values
advertized by platform only from POWER10 onwards. The values
advertized on POWER10 platforms is more realistic and informed by the
latency measurements. For earlier platforms stick to the default value
of 10us. The fix was suggested by Michael Ellerman.
Fixes: d947fb4c96 ("cpuidle: pseries: Fixup exit latency for CEDE(0)")
Reported-by: Enrico Joedecke <joedecke@de.ibm.com>
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1626676399-15975-2-git-send-email-ego@linux.vnet.ibm.com