On RK3399 platforms, power domains are managed mostly by the kernel
(drivers/soc/rockchip/pm_domains.c), but there are a few exceptions
where ARM Trusted Firmware has to be involved:
(1) system suspend/resume
(2) DRAM DVFS (a.k.a., "ddrfreq")
Exception (1) does not cause much conflict, since the kernel has
quiesced itself by the time we make the relevant PSCI call.
Exception (2) can cause conflict, because of two actions:
(a) ARM Trusted Firmware needs to read/modify/write the PMU_BUS_IDLE_REQ
register to idle the memory controller domain; the kernel driver
also has to touch this register for other domains.
(b) ARM Trusted Firmware needs to manage the clocks associated with
these domains.
To elaborate on (b): idling a power domain has always required ungating
an array of clocks; see this old explanation from Rockchip:
https://lore.kernel.org/linux-arm-kernel/54503C19.9060607@rock-chips.com/
Historically, ARM Trusted Firmware has avoided this issue by using a
special PMU_CRU_GATEDIS_CON0 register -- this register ungates all the
necessary clocks -- when idling the memory controller. Unfortunately,
we've found that this register is not 100% sufficient; it does not turn
the relevant PLLs on [0].
So it's possible to trigger issues with something like the following:
1. enable a power domain (e.g., RK3399_PD_VDU) -- kernel will
temporarily enable relevant clocks/PLLs, then turn them back off
2. a PLL (e.g., PLL_NPLL) is part of the clock tree for
RK3399_PD_VDU's clocks but otherwise unused; NPLL is disabled
3. perform a ddrfreq transition (rk3399_dmcfreq_target() -> ...
drivers/clk/rockchip/clk-ddr.c / ROCKCHIP_SIP_DRAM_FREQ)
4. ARM Trusted Firmware unagates VDU clocks (via PMU_CRU_GATEDIS_CON0)
5. ARM Trusted firmware idles the memory controller domain
6. Step 5 waits on the VDU domain/clocks, but NPLL is still off
i.e., we hang the system.
So for (b), we need to at a minimum manage the relevant PLLs on behalf
of firmware. It's easier to simply manage the whole clock tree, in a
similar way we do in rockchip_pd_power().
For (a), we need to provide mutual exclusion betwen rockchip_pd_power()
and firmware. To resolve that, we simply grab the PMU mutex and release
it when ddrfreq is done.
The Chromium OS kernel has been carrying versions of part of this hack
for a while, based on some new custom notifiers [1]. I've rewritten as a
simple function call between the drivers, which is OK because:
* the PMU driver isn't enabled, and we don't have this problem at all
(the firmware should have left us in an OK state, and there are no
runtime conflicts); or
* the PMU driver is present, and is a single instance.
And the power-domain driver cannot be removed, so there's no lifetime
management to worry about.
For completeness, there's a 'dmc_pmu_mutex' to guard (likely
theoretical?) probe()-time races. It's OK for the memory controller
driver to start running before the PMU, because the PMU will avoid any
critical actions during the block() sequence.
[0] The RK3399 TRM for PMU_CRU_GATEDIS_CON0 only talks about ungating
clocks. Based on experimentation, we've found that it does not power
up the necessary PLLs.
[1] CHROMIUM: soc: rockchip: power-domain: Add notifier to dmc driver
https://chromium-review.googlesource.com/q/I242dbd706d352f74ff706f5cbf42ebb92f9bcc60
Notably, the Chromium solution only handled conflict (a), not (b).
In practice, item (b) wasn't a problem in many cases because we
never managed to fully power off PLLs. Now that the (upstream) video
decoder driver performs runtime clock management, we often power off
NPLL.
Signed-off-by: Brian Norris <briannorris@chromium.org>
Tested-by: Peter Geis <pgwipeout@gmail.com>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
This static struct can get reused if the device gets removed/reprobed,
and that causes use-after-free in its ->freq_table.
Let's just move the struct to our dynamic allocation.
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
We want to keep the idle time fixed, so compute based on the current DDR
frequency.
The old properties were deprecated and never used, so we can safely drop
them from the driver.
This is a rewritten version of work by Lin Huang <hl@rock-chips.com>.
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Implement the newly-defined properties to allow disabling certain
power-saving-at-idle features at higher frequencies.
This is a rewritten version of work by Lin Huang <hl@rock-chips.com>.
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
We're going to add new usages, and it's cleaner to work with macros
instead of comments and magic numbers.
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
All of these properties are initialized by ARM Trusted Firmware, and
have been since the early days of this chip. It's redundant (and
possibly wrong) to do this here now. What's more, there seems to be some
confusion about the units and some of the definitions of this timing
struct: the DT docs say MHz for many of these, but downstream users were
in Hz (and therefore, the ATF interface was Hz). Also, the in-driver
usage for some of these (e.g., for comparing to target frequency) were
in Hz too. So doubly wrong.
We can avoid thinking about who got the right units by dropping the
unnecessary code and properties. They are marked deprecated in the
binding schema.
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
These properties are:
* undocumented
* directly representing software properties, not hardware properties
* unused (no in-tree users, yet; this IP block has so far only been used
in downstream kernels)
Let's just stick the values that downstream users have been using
directly in the driver and call it a day.
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
DDR DVFS tuning has found that several power-saving features don't have
good tradeoffs at higher frequencies -- at higher frequencies, we'll see
glitches or other errors. Provide tuning controls so these can be
disabled at higher OPPs, and left active only at the lower ones.
Signed-off-by: Brian Norris <briannorris@chromium.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
It's inefficient to use the same number of cycles for all OPPs, since
lower frequencies make for longer idle times. Let's specify the idle
time instead, so software can pick the optimal number of cycles on its
own.
NB: these bindings aren't used anywhere yet.
Signed-off-by: Brian Norris <briannorris@chromium.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
The driver and all downstream device trees [1] are using Hz units, but
the document claims MHz. DRAM frequency for these systems can't possibly
exceed 2^32-1 Hz, so the choice of unit doesn't really matter than much.
Rather than add unnecessary risk in getting the units wrong, let's just
go with the unofficial convention and make the docs match reality.
A sub-1MHz frequency is extremely unlikely, so include a minimum in the
schema, to help catch anybody who might have believed this was MHz.
[1] And notably, also those trying to upstream them:
https://lore.kernel.org/lkml/20210308233858.24741-3-daniel.lezcano@linaro.org/
Signed-off-by: Brian Norris <briannorris@chromium.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Acked-by: Chanwoo Choi <cw00.choi@samsung.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
These DRAM configuration properties are all handled in ARM Trusted
Firmware (and have been since the early days of this SoC), and there are
no in-tree users of the DMC binding yet. It's better to just defer to
firmware instead of maintaining this large list of properties.
There's also some confusion about units: many of these are specified in
MHz, but the downstream users and driver code are treating them as Hz, I
believe. Rather than straighten all that out, I just drop them.
Signed-off-by: Brian Norris <briannorris@chromium.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
I want to add, deprecate, and bugfix some properties, as well as add the
first users. This is easier with a proper schema.
The transformation is mostly straightforward, plus a few notable tweaks:
* Renamed rockchip,dram_speed_bin to rockchip,ddr3_speed_bin. The
driver code and the example matched, but the description was
different. I went with the implementation. Note that this property is
also slated for deprecation/deletion in the subsequent patches.
* Drop upthreshold and downdifferential properties from the example.
These were undocumented (so, wouldn't pass validation), but were
representing software properties (governor tweaks). I drop them from
the driver in subsequent patches.
* Rename clock from pclk_ddr_mon to dmc_clk. The driver, DT example,
and all downstream users matched -- the binding definition was the
exception. Anyway, "dmc_clk" is a more appropriately generic name.
* Choose a better filename and location (this is a memory controller).
Signed-off-by: Brian Norris <briannorris@chromium.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Pull more tracing updates from Steven Rostedt:
- Rename the staging files to give them some meaning. Just
stage1,stag2,etc, does not show what they are for
- Check for NULL from allocation in bootconfig
- Hold event mutex for dyn_event call in user events
- Mark user events to broken (to work on the API)
- Remove eBPF updates from user events
- Remove user events from uapi header to keep it from being installed.
- Move ftrace_graph_is_dead() into inline as it is called from hot
paths and also convert it into a static branch.
* tag 'trace-v5.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Move user_events.h temporarily out of include/uapi
ftrace: Make ftrace_graph_is_dead() a static branch
tracing: Set user_events to BROKEN
tracing/user_events: Remove eBPF interfaces
tracing/user_events: Hold event_mutex during dyn_event_add
proc: bootconfig: Add null pointer check
tracing: Rename the staging files for trace_events
Pull clk fix from Stephen Boyd:
"A single revert to fix a boot regression seen when clk_put() started
dropping rate range requests. It's best to keep various systems
booting so we'll kick this out and try again next time"
* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
Revert "clk: Drop the rate range on clk_put()"
Pull x86 fixes from Thomas Gleixner:
"A set of x86 fixes and updates:
- Make the prctl() for enabling dynamic XSTATE components correct so
it adds the newly requested feature to the permission bitmap
instead of overwriting it. Add a selftest which validates that.
- Unroll string MMIO for encrypted SEV guests as the hypervisor
cannot emulate it.
- Handle supervisor states correctly in the FPU/XSTATE code so it
takes the feature set of the fpstate buffer into account. The
feature sets can differ between host and guest buffers. Guest
buffers do not contain supervisor states. So far this was not an
issue, but with enabling PASID it needs to be handled in the buffer
offset calculation and in the permission bitmaps.
- Avoid a gazillion of repeated CPUID invocations in by caching the
values early in the FPU/XSTATE code.
- Enable CONFIG_WERROR in x86 defconfig.
- Make the X86 defconfigs more useful by adapting them to Y2022
reality"
* tag 'x86-urgent-2022-04-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/fpu/xstate: Consolidate size calculations
x86/fpu/xstate: Handle supervisor states in XSTATE permissions
x86/fpu/xsave: Handle compacted offsets correctly with supervisor states
x86/fpu: Cache xfeature flags from CPUID
x86/fpu/xsave: Initialize offset/size cache early
x86/fpu: Remove unused supervisor only offsets
x86/fpu: Remove redundant XCOMP_BV initialization
x86/sev: Unroll string mmio with CC_ATTR_GUEST_UNROLL_STRING_IO
x86/config: Make the x86 defconfigs a bit more usable
x86/defconfig: Enable WERROR
selftests/x86/amx: Update the ARCH_REQ_XCOMP_PERM test
x86/fpu/xstate: Fix the ARCH_REQ_XCOMP_PERM implementation
Pull RT signal fix from Thomas Gleixner:
"Revert the RT related signal changes. They need to be reworked and
generalized"
* tag 'core-urgent-2022-04-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
Revert "signal, x86: Delay calling signals in atomic on RT enabled kernels"
Pull more dma-mapping updates from Christoph Hellwig:
- fix a regression in dma remap handling vs AMD memory encryption (me)
- finally kill off the legacy PCI DMA API (Christophe JAILLET)
* tag 'dma-mapping-5.18-1' of git://git.infradead.org/users/hch/dma-mapping:
dma-mapping: move pgprot_decrypted out of dma_pgprot
PCI/doc: cleanup references to the legacy PCI DMA API
PCI: Remove the deprecated "pci-dma-compat.h" API
Pull more perf tools updates from Arnaldo Carvalho de Melo:
- Avoid SEGV if core.cpus isn't set in 'perf stat'.
- Stop depending on .git files for building PERF-VERSION-FILE, used in
'perf --version', fixing some perf tools build scenarios.
- Convert tracepoint.py example to python3.
- Update UAPI header copies from the kernel sources: socket,
mman-common, msr-index, KVM, i915 and cpufeatures.
- Update copy of libbpf's hashmap.c.
- Directly return instead of using local ret variable in
evlist__create_syswide_maps(), found by coccinelle.
* tag 'perf-tools-for-v5.18-2022-04-02' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
perf python: Convert tracepoint.py example to python3
perf evlist: Directly return instead of using local ret variable
perf cpumap: More cpu map reuse by merge.
perf cpumap: Add is_subset function
perf evlist: Rename cpus to user_requested_cpus
perf tools: Stop depending on .git files for building PERF-VERSION-FILE
tools headers cpufeatures: Sync with the kernel sources
tools headers UAPI: Sync drm/i915_drm.h with the kernel sources
tools headers UAPI: Sync linux/kvm.h with the kernel sources
tools kvm headers arm64: Update KVM headers from the kernel sources
tools arch x86: Sync the msr-index.h copy with the kernel sources
tools headers UAPI: Sync asm-generic/mman-common.h with the kernel
perf beauty: Update copy of linux/socket.h with the kernel sources
perf tools: Update copy of libbpf's hashmap.c
perf stat: Avoid SEGV if core.cpus isn't set
Pull Kbuild fixes from Masahiro Yamada:
- Fix empty $(PYTHON) expansion.
- Fix UML, which got broken by the attempt to suppress Clang warnings.
- Fix warning message in modpost.
* tag 'kbuild-fixes-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
modpost: restore the warning message for missing symbol versions
Revert "um: clang: Strip out -mno-global-merge from USER_CFLAGS"
kbuild: Remove '-mno-global-merge'
kbuild: fix empty ${PYTHON} in scripts/link-vmlinux.sh
kconfig: remove stale comment about removed kconfig_print_symbol()
Pull MIPS fixes from Thomas Bogendoerfer:
- build fix for gpio
- fix crc32 build problems
- check for failed memory allocations
* tag 'mips_5.18_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
MIPS: crypto: Fix CRC32 code
MIPS: rb532: move GPIOD definition into C-files
MIPS: lantiq: check the return value of kzalloc()
mips: sgi-ip22: add a check for the return of kzalloc()