Commit Graph

1645 Commits

Author SHA1 Message Date
Linus Torvalds f689b742f2 Merge tag 'powerpc-4.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
 "Core:
   - Ground work for the new Power9 MMU from Aneesh Kumar K.V
   - Optimise FP/VMX/VSX context switching from Anton Blanchard

  Misc:
   - Various cleanups from Krzysztof Kozlowski, John Ogness, Rashmica
     Gupta, Russell Currey, Gavin Shan, Daniel Axtens, Michael Neuling,
     Andrew Donnellan
   - Allow wrapper to work on non-english system from Laurent Vivier
   - Add rN aliases to the pt_regs_offset table from Rashmica Gupta
   - Fix module autoload for rackmeter & axonram drivers from Luis de
     Bethencourt
   - Include KVM guest test in all interrupt vectors from Paul Mackerras
   - Fix DSCR inheritance over fork() from Anton Blanchard
   - Make value-returning atomics & {cmp}xchg* & their atomic_ versions
     fully ordered from Boqun Feng
   - Print MSR TM bits in oops messages from Michael Neuling
   - Add TM signal return & invalid stack selftests from Michael Neuling
   - Limit EPOW reset event warnings from Vipin K Parashar
   - Remove the Cell QPACE code from Rashmica Gupta
   - Append linux_banner to exception information in xmon from Rashmica
     Gupta
   - Add selftest to check if VSRs are corrupted from Rashmica Gupta
   - Remove broken GregorianDay() from Daniel Axtens
   - Import Anton's context_switch2 benchmark into selftests from
     Michael Ellerman
   - Add selftest script to test HMI functionality from Daniel Axtens
   - Remove obsolete OPAL v2 support from Stewart Smith
   - Make enter_rtas() private from Michael Ellerman
   - PPR exception cleanups from Michael Ellerman
   - Add page soft dirty tracking from Laurent Dufour
   - Add support for Nvlink NPUs from Alistair Popple
   - Add support for kexec on 476fpe from Alistair Popple
   - Enable kernel CPU dlpar from sysfs from Nathan Fontenot
   - Copy only required pieces of the mm_context_t to the paca from
     Michael Neuling
   - Add a kmsg_dumper that flushes OPAL console output on panic from
     Russell Currey
   - Implement save_stack_trace_regs() to enable kprobe stack tracing
     from Steven Rostedt
   - Add HWCAP bits for Power9 from Michael Ellerman
   - Fix _PAGE_PTE breaking swapoff from Aneesh Kumar K.V
   - Fix _PAGE_SWP_SOFT_DIRTY breaking swapoff from Hugh Dickins
   - scripts/recordmcount.pl: support data in text section on powerpc
     from Ulrich Weigand
   - Handle R_PPC64_ENTRY relocations in modules from Ulrich Weigand

  cxl:
   - cxl: Fix possible idr warning when contexts are released from
     Vaibhav Jain
   - cxl: use correct operator when writing pcie config space values
     from Andrew Donnellan
   - cxl: Fix DSI misses when the context owning task exits from Vaibhav
     Jain
   - cxl: fix build for GCC 4.6.x from Brian Norris
   - cxl: use -Werror only with CONFIG_PPC_WERROR from Brian Norris
   - cxl: Enable PCI device ID for future IBM CXL adapter from Uma
     Krishnan

  Freescale:
   - Freescale updates from Scott: Highlights include moving QE code out
     of arch/powerpc (to be shared with arm), device tree updates, and
     minor fixes"

* tag 'powerpc-4.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (149 commits)
  powerpc/module: Handle R_PPC64_ENTRY relocations
  scripts/recordmcount.pl: support data in text section on powerpc
  powerpc/powernv: Fix OPAL_CONSOLE_FLUSH prototype and usages
  powerpc/mm: fix _PAGE_SWP_SOFT_DIRTY breaking swapoff
  powerpc/mm: Fix _PAGE_PTE breaking swapoff
  cxl: Enable PCI device ID for future IBM CXL adapter
  cxl: use -Werror only with CONFIG_PPC_WERROR
  cxl: fix build for GCC 4.6.x
  powerpc: Add HWCAP bits for Power9
  powerpc/powernv: Reserve PE#0 on NPU
  powerpc/powernv: Change NPU PE# assignment
  powerpc/powernv: Fix update of NVLink DMA mask
  powerpc/powernv: Remove misleading comment in pci.c
  powerpc: Implement save_stack_trace_regs() to enable kprobe stack tracing
  powerpc: Fix build break due to paca mm_context_t changes
  cxl: Fix DSI misses when the context owning task exits
  MAINTAINERS: Update Scott Wood's e-mail address
  powerpc/powernv: Fix minor off-by-one error in opal_mce_check_early_recovery()
  powerpc: Fix style of self-test config prompts
  powerpc/powernv: Only delay opal_rtc_read() retry when necessary
  ...
2016-01-15 13:18:47 -08:00
Andrzej Hajda 929ca89c30 cpufreq-dt: fix handling regulator_get_voltage() result
The function can return negative values so it should be assigned
to signed type.

The problem has been detected using proposed semantic patch
scripts/coccinelle/tests/unsigned_lesser_than_zero.cocci.

Link: http://permalink.gmane.org/gmane.linux.kernel/2038576
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-01-05 02:01:38 +01:00
Chen Yu 0df35026c6 cpufreq: governor: Fix negative idle_time when configured with CONFIG_HZ_PERIODIC
It is reported that, with CONFIG_HZ_PERIODIC=y cpu stays at the
lowest frequency even if the usage goes to 100%, neither ondemand
nor conservative governor works, however performance and
userspace work as expected. If set with CONFIG_NO_HZ_FULL=y,
everything goes well.

This problem is caused by improper calculation of the idle_time
when the load is extremely high(near 100%). Firstly, cpufreq_governor
uses get_cpu_idle_time to get the total idle time for specific cpu, then:

1.If the system is configured with CONFIG_NO_HZ_FULL, the idle time is
  returned by ktime_get, which is always increasing, it's OK.
2.However, if the system is configured with CONFIG_HZ_PERIODIC,
  get_cpu_idle_time might not guarantee to be always increasing,
  because it will leverage get_cpu_idle_time_jiffy to calculate the
  idle_time, consider the following scenario:

At T1:
idle_tick_1 = total_tick_1 - user_tick_1

sample period(80ms)...

At T2: ( T2 = T1 + 80ms):
idle_tick_2 = total_tick_2 - user_tick_2

Currently the algorithm is using (idle_tick_2 - idle_tick_1) to
get the delta idle_time during the past sample period, however
it CAN NOT guarantee that idle_tick_2 >= idle_tick_1, especially
when cpu load is high.
(Yes, total_tick_2 >= total_tick_1, and user_tick_2 >= user_tick_1,
but how about idle_tick_2 and idle_tick_1? No guarantee.)
So governor might get a negative value of idle_time during the past
sample period, which might mislead the system that the idle time is
very big(converted to unsigned int), and the busy time is nearly zero,
which causes the governor to always choose the lowest cpufreq,
then cause this problem.

In theory there are two solutions:

1.The logic should not rely on the idle tick during every sample period,
  but be based on the busy tick directly, as this is how 'top' is
  implemented.

2.Or the logic must make sure that the idle_time is strictly increasing
  during each sample period, then there would be no negative idle_time
  anymore. This solution requires minimum modification to current code
  and this patch uses method 2.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=69821
Reported-by: Jan Fikar <j.fikar@gmail.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-01-05 02:01:32 +01:00
Pi-Cheng Chen a889331d75 cpufreq: mt8173: migrate to use operating-points-v2 bindings
Modify mt8173-cpufreq driver to get OPP-sharing information and set up
OPP table provided by operating-points-v2 bindings.

Signed-off-by: Pi-Cheng Chen <pi-cheng.chen@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-01-02 00:49:33 +01:00
Rafael J. Wysocki 7a6c79f2fe cpufreq: Simplify core code related to boost support
Notice that the boost_supported field in struct cpufreq_driver is
redundant, because the driver's ->set_boost callback may be left
unset if "boost" is not supported.  Moreover, the only driver
populating the ->set_boost callback is acpi_cpufreq, so make it
avoid populating that callback if "boost" is not supported, rework
the core to check ->set_boost instead of boost_supported to
verify "boost" support and drop boost_supported which isn't
used any more.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-01-01 03:49:51 +01:00
Rafael J. Wysocki 17135782b8 cpufreq: acpi-cpufreq: Simplify boost-related code
The store_boost() routine is only used by store_cpb(), so move
the code from it directly to that function and rename _store_boost()
to set_boost() to make its name reflect the name of the driver
callback pointing to it.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-01-01 03:49:51 +01:00
Rafael J. Wysocki 41669da030 cpufreq: Make cpufreq_boost_supported() static
cpufreq_boost_supported() is not used outside of cpufreq.c, so make
it static.

While at it, refactor it as a one-liner (which it really is).

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-01-01 03:49:51 +01:00
Markus Elfring d23f8cadf9 blackfin-cpufreq: Mark cpu_set_cclk() as static
The cpu_set_cclk() function was only used in a single source file so far.
Indicate this setting also by the corresponding linkage specifier.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-12-28 01:51:36 +01:00
Markus Elfring a5810b4f07 blackfin-cpufreq: Change return type of cpu_set_cclk() to that of clk_set_rate()
The return type "unsigned long" was used by the cpu_set_cclk() function
while the type "int" is provided by the clk_set_rate() function.
Let us make this usage consistent.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-12-28 01:51:35 +01:00
Rafael J. Wysocki 9ad55cd9e6 Merge back earlier cpufreq material for v4.5. 2015-12-28 01:34:35 +01:00
Dan Carpenter a7def561c2 cpufreq: scpi-cpufreq: signedness bug in scpi_get_dvfs_info()
The "domain" variable needs to be signed for the error handling to work.

Fixes: 8def31034d (cpufreq: arm_big_little: add SCPI interface driver)
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-12-24 02:11:37 +01:00
Rafael J. Wysocki 4157c2fc84 Merge back earlier cpufreq material for v4.5. 2015-12-21 03:15:15 +01:00
Stewart Smith e4d54f71d2 powerpc/powernv: remove FW_FEATURE_OPALv3 and just use FW_FEATURE_OPAL
Long ago, only in the lab, there was OPALv1 and OPALv2. Now there is
just OPALv3, with nobody ever expecting anything on pre-OPALv3 to
be cared about or supported by mainline kernels.

So, let's remove FW_FEATURE_OPALv3 and instead use FW_FEATURE_OPAL
exclusively.

Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:40:54 +11:00
Rafael J. Wysocki f1b9fc591e Merge branches 'powercap', 'pm-cpufreq' and 'pm-domains'
* powercap:
  powercap / RAPL: fix BIOS lock check

* pm-cpufreq:
  cpufreq: intel_pstate: Minor cleanup for FRAC_BITS
  cpufreq: tegra: add regulator dependency for T124

* pm-domains:
  PM / Domains: Allow runtime PM callbacks to be re-used during system PM
2015-12-14 22:58:57 +01:00
Linus Torvalds 097b285d32 Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Arnd Bergmann:
 "Here are a bunch of small bug fixes for various ARM platforms, nothing
  really sticks out this week, most of either fixes bugs in code that
  was just added in 4.4, or that has been broken for many years without
  anyone noticing.

  at91/sama5d2:
   - fix sama5de hardware setup of sd/mmc interface
   - proper selection of pinctrl drivers.  PIO4 is necessary for sama5d2

  berlin:
   - fix incorrect clock input for SDIO

  exynos:
   - Fix potential NULL pointer dereference in Exynos PMU driver.

  imx:
   - Fix vf610 SAI clock configuration bug which is discovered by the
     newly added master mode support in SAI audio driver.
   - Fix buggy L2 cache latency values in vf610 device trees, which may
     cause system hang when cpu runs at a higher frequency.

  ixp4xx:
   - fix prototypes for readl/writel functions

  ls2080a:
   - use little-endian register access for GPIO and SDHCI

  omap:
   - Fix clock source for ARM TWD and global timers on am437x
   - Always select REGULATOR_FIXED_VOLTAGE for omap2+ instead of when
     MACH_OMAP3_PANDORA is selected
   - Fix SPI DMA handles for dm816x as only some were mapped
   - Fix up mbox cells for dm816x to make mailbox usable

  pxa:
   - use PWM lookup table for all ezx machines

  s3c24xx:
   - Remove incorrect __init annotation from s3c24xx cpufreq driver
     structures.

  versatile:
   - fix PCI IRQ mapping on Versatile PB"

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ls2080a/dts: Add little endian property for GPIO IP block
  dt-bindings: define little-endian property for QorIQ GPIO
  ARM64: dts: ls2080a: fix eSDHC endianness
  ARM: dts: vf610: use reset values for L2 cache latencies
  ARM: pxa: use PWM lookup table for all machines
  ARM: dts: berlin: add 2nd clock for BG2Q sdhci0 and sdhci1
  ARM: dts: berlin: correct BG2Q's sdhci2 2nd clock
  ARM: dts: am4372: fix clock source for arm twd and global timers
  ARM: at91: fix pinctrl driver selection
  ARM: at91/dt: add always-on to 1.8V regulator
  ARM: dts: vf610: fix clock definition for SAI2
  ARM: imx: clk-vf610: fix SAI clock tree
  ARM: ixp4xx: fix read{b,w,l} return types
  irqchip/versatile-fpga: Fix PCI IRQ mapping on Versatile PB
  ARM: OMAP2+: enable REGULATOR_FIXED_VOLTAGE
  ARM: dts: add dm816x missing spi DT dma handles
  ARM: dts: add dm816x missing #mbox-cells
  cpufreq: s3c24xx: Do not mark s3c2410_plls_add as __init
  ARM: EXYNOS: Fix potential NULL pointer access in exynos_sys_powerdown_conf
2015-12-12 16:43:44 -08:00
Lee Jones ab0ea257fc cpufreq: st: Provide runtime initialised driver for ST's platforms
The bootloader is charged with the responsibility to provide platform
specific Dynamic Voltage and Frequency Scaling (DVFS) information via
Device Tree.  This driver takes the supplied configuration and
registers it with the new generic OPP framework, to then be used with
CPUFreq.

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-12-12 02:55:21 +01:00
Prarit Bhargava 88b7b7c0c2 cpufreq: intel_pstate: Minor cleanup for FRAC_BITS
785ee27 ("cpufreq: intel_pstate: Fix limits->max_perf rounding error")
hardcodes the value of FRAC_BITS.  This patch fixes that minor issue.

Fixes: 785ee27881 (cpufreq: intel_pstate: Fix limits->max_perf rounding error)
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-12-12 02:28:19 +01:00
Pi-Cheng Chen 89b56047f6 cpufreq: mt8173: Move resources allocation into ->probe()
Since the return value of ->init() of cpufreq driver is not propagated
to the device driver model now, move resources allocation into
->probe() to handle -EPROBE_DEFER properly.

Signed-off-by: Pi-Cheng Chen <pi-cheng.chen@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-12-12 02:27:16 +01:00
Arnd Bergmann b5832e4b62 cpufreq: tegra: add regulator dependency for T124
This driver is the only one that calls regulator_sync_voltage(), but it
can currently be built with CONFIG_REGULATOR disabled, producing
this build error:

drivers/cpufreq/tegra124-cpufreq.c: In function 'tegra124_cpu_switch_to_pllx':
drivers/cpufreq/tegra124-cpufreq.c:68:2: error: implicit declaration of function 'regulator_sync_voltage' [-Werror=implicit-function-declaration]
  regulator_sync_voltage(priv->vdd_cpu_reg);

My first attempt was to implement a helper for this function
for regulator_sync_voltage, but Mark Brown explained:

   We don't do this for *all* regulator API functions - there's some where
   using them strongly suggests that there is actually a dependency on
   the regulator API.  This does seem like it might be falling into the
   specialist category [...]
   Looking at the code I'm pretty unclear on what the authors think the
   use of _sync_voltage() is doing in the first place so it may be even
   better to just remove the call.  It seems to have been included in the
   first commit so there's not changelog explaining things and there's
   no comment either.  I'd *expect* it to be a noop as far as I can see.

This adds the dependency to make the driver always build successfully
or not be enabled at all. Alternatively, we could investigate if the
driver should stop calling regulator_sync_voltage instead.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-12-12 02:25:56 +01:00
Arnd Bergmann 789d73b384 Merge tag 'samsung-fixes-4.4' of https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux into fixes
Merge "Fixes for Exynos" from Krzysztof Kozlowski:

1. Fix potential NULL pointer dereference in Exynos PMU driver.
2. Remove incorrect __init annotation from s3c24xx cpufreq driver
   structures.

* tag 'samsung-fixes-4.4' of https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux:
  cpufreq: s3c24xx: Do not mark s3c2410_plls_add as __init
  ARM: EXYNOS: Fix potential NULL pointer access in exynos_sys_powerdown_conf
2015-12-11 00:19:37 +01:00
Philippe Longepe 63d1d656a5 cpufreq: intel_pstate: Account for IO wait time
In cases where we have many IOs, the global load becomes low and the
load algorithm will decrease the requested P-State. Because of that,
the IOs overheads will increase and impact the IO performances.

To improve IO bound work, we can count the io-wait time as busy time
in calculating CPU busy.

This change uses get_cpu_iowait_time_us() to obtain the IO wait time value
and converts time into number of cycles spent waiting on IO at the TSC
rate. At the moment, this trick is only used for Atom.

Signed-off-by: Philippe Longepe <philippe.longepe@intel.com>
Signed-off-by: Stephane Gasparini <stephane.gasparini@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-12-10 01:17:40 +01:00
Philippe Longepe e70eed2b64 cpufreq: intel_pstate: Account for non C0 time
The current function to calculate cpu utilization uses the average P-state
ratio (APerf/Mperf) scaled by the ratio of the current P-state to the
max available non-turbo one. This leads to an overestimation of
utilization which causes higher-performance P-states to be selected more
often and that leads to increased energy consumption.

This is a problem for low-power systems, so it is better to use a
different utilization calculation algorithm for them.

Namely, the Percent Busy value (or load) can be estimated as the ratio of the
MPERF counter that runs at a constant rate only during active periods (C0) to
the time stamp counter (TSC) that also runs (at the same rate) during idle.
That is:

Percent Busy = 100 * (delta_mperf / delta_tsc)

Use this algorithm for platforms with SoCs based on the Airmont and Silvermont
Atom cores.

Signed-off-by: Philippe Longepe <philippe.longepe@intel.com>
Signed-off-by: Stephane Gasparini <stephane.gasparini@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-12-10 01:17:40 +01:00
Philippe Longepe 157386b6fc cpufreq: intel_pstate: Configurable algorithm to get target pstate
Target systems using different cpus have different power and performance
requirements. They may use different algorithms to get the next P-state
based on their power or performance preference.

For example, power-constrained systems may not want to use
high-performance P-states as aggressively as a full-size desktop or a
server platform. A server platform may want to run close to the max to
achieve better performance, while laptop-like systems may prefer
sacrificing performance for longer battery lifes.

For the above reasons, modify intel_pstate to allow the target P-state
selection algorithm to be depend on the CPU ID.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Philippe Longepe <philippe.longepe@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-12-10 01:17:40 +01:00
Pi-Cheng Chen 40be4c3ccb cpufreq: mt8173: check return value of regulator_get_voltage() call
Sometimes regulator_get_voltage() call returns negative values for
reasons(e.g. underlying I2C bus timeout). Add check for the return
values and fail out early.

Signed-off-by: Pi-Cheng Chen <pi-cheng.chen@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-12-10 00:27:33 +01:00
Pi-Cheng Chen 93625d52e7 cpufreq: mt8173: remove redundant regulator_get_voltage() call
Remove redundant regulator_get_voltage() call to get Vsram value
since it will be obtained later at the beginning of voltage tracking
loop.

Signed-off-by: Pi-Cheng Chen <pi-cheng.chen@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-12-10 00:27:33 +01:00