Commit Graph

467 Commits

Author SHA1 Message Date
Linus Torvalds d691b7e7d1 Merge tag 'powerpc-4.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
 "Highlights include:

   - Support for STRICT_KERNEL_RWX on 64-bit server CPUs.

   - Platform support for FSP2 (476fpe) board

   - Enable ZONE_DEVICE on 64-bit server CPUs.

   - Generic & powerpc spin loop primitives to optimise busy waiting

   - Convert VDSO update function to use new update_vsyscall() interface

   - Optimisations to hypercall/syscall/context-switch paths

   - Improvements to the CPU idle code on Power8 and Power9.

  As well as many other fixes and improvements.

  Thanks to: Akshay Adiga, Andrew Donnellan, Andrew Jeffery, Anshuman
  Khandual, Anton Blanchard, Balbir Singh, Benjamin Herrenschmidt,
  Christophe Leroy, Christophe Lombard, Colin Ian King, Dan Carpenter,
  Gautham R. Shenoy, Hari Bathini, Ian Munsie, Ivan Mikhaylov, Javier
  Martinez Canillas, Madhavan Srinivasan, Masahiro Yamada, Matt Brown,
  Michael Neuling, Michal Suchanek, Murilo Opsfelder Araujo, Naveen N.
  Rao, Nicholas Piggin, Oliver O'Halloran, Paul Mackerras, Pavel Machek,
  Russell Currey, Santosh Sivaraj, Stephen Rothwell, Thiago Jung
  Bauermann, Yang Li"

* tag 'powerpc-4.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (158 commits)
  powerpc/Kconfig: Enable STRICT_KERNEL_RWX for some configs
  powerpc/mm/radix: Implement STRICT_RWX/mark_rodata_ro() for Radix
  powerpc/mm/hash: Implement mark_rodata_ro() for hash
  powerpc/vmlinux.lds: Align __init_begin to 16M
  powerpc/lib/code-patching: Use alternate map for patch_instruction()
  powerpc/xmon: Add patch_instruction() support for xmon
  powerpc/kprobes/optprobes: Use patch_instruction()
  powerpc/kprobes: Move kprobes over to patch_instruction()
  powerpc/mm/radix: Fix execute permissions for interrupt_vectors
  powerpc/pseries: Fix passing of pp0 in updatepp() and updateboltedpp()
  powerpc/64s: Blacklist rtas entry/exit from kprobes
  powerpc/64s: Blacklist functions invoked on a trap
  powerpc/64s: Un-blacklist system_call() from kprobes
  powerpc/64s: Move system_call() symbol to just after setting MSR_EE
  powerpc/64s: Blacklist system_call() and system_call_common() from kprobes
  powerpc/64s: Convert .L__replay_interrupt_return to a local label
  powerpc64/elfv1: Only dereference function descriptor for non-text symbols
  cxl: Export library to support IBM XSL
  powerpc/dts: Use #include "..." to include local DT
  powerpc/perf/hv-24x7: Aggregate result elements on POWER9 SMT8
  ...
2017-07-07 13:55:45 -07:00
Linus Torvalds 408c9861c6 Merge tag 'pm-4.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
 "The big ticket items here are the rework of suspend-to-idle in order
  to add proper support for power button wakeup from it on recent Dell
  laptops and the rework of interfaces exporting the current CPU
  frequency on x86.

  In addition to that, support for a few new pieces of hardware is
  added, the PCI/ACPI device wakeup infrastructure is simplified
  significantly and the wakeup IRQ framework is fixed to unbreak the IRQ
  bus locking infrastructure.

  Also, there are some functional improvements for intel_pstate, tools
  updates and small fixes and cleanups all over.

  Specifics:

   - Rework suspend-to-idle to allow it to take wakeup events signaled
     by the EC into account on ACPI-based platforms in order to properly
     support power button wakeup from suspend-to-idle on recent Dell
     laptops (Rafael Wysocki).

     That includes the core suspend-to-idle code rework, support for the
     Low Power S0 _DSM interface, and support for the ACPI INT0002
     Virtual GPIO device from Hans de Goede (required for USB keyboard
     wakeup from suspend-to-idle to work on some machines).

   - Stop trying to export the current CPU frequency via /proc/cpuinfo
     on x86 as that is inaccurate and confusing (Len Brown).

   - Rework the way in which the current CPU frequency is exported by
     the kernel (over the cpufreq sysfs interface) on x86 systems with
     the APERF and MPERF registers by always using values read from
     these registers, when available, to compute the current frequency
     regardless of which cpufreq driver is in use (Len Brown).

   - Rework the PCI/ACPI device wakeup infrastructure to remove the
     questionable and artificial distinction between "devices that can
     wake up the system from sleep states" and "devices that can
     generate wakeup signals in the working state" from it, which allows
     the code to be simplified quite a bit (Rafael Wysocki).

   - Fix the wakeup IRQ framework by making it use SRCU instead of RCU
     which doesn't allow sleeping in the read-side critical sections,
     but which in turn is expected to be allowed by the IRQ bus locking
     infrastructure (Thomas Gleixner).

   - Modify some computations in the intel_pstate driver to avoid
     rounding errors resulting from them (Srinivas Pandruvada).

   - Reduce the overhead of the intel_pstate driver in the HWP
     (hardware-managed P-states) mode and when the "performance" P-state
     selection algorithm is in use by making it avoid registering
     scheduler callbacks in those cases (Len Brown).

   - Rework the energy_performance_preference sysfs knob in intel_pstate
     by changing the values that correspond to different symbolic hint
     names used by it (Len Brown).

   - Make it possible to use more than one cpuidle driver at the same
     time on ARM (Daniel Lezcano).

   - Make it possible to prevent the cpuidle menu governor from using
     the 0 state by disabling it via sysfs (Nicholas Piggin).

   - Add support for FFH (Fixed Functional Hardware) MWAIT in ACPI C1 on
     AMD systems (Yazen Ghannam).

   - Make the CPPC cpufreq driver take the lowest nonlinear performance
     information into account (Prashanth Prakash).

   - Add support for hi3660 to the cpufreq-dt driver, fix the imx6q
     driver and clean up the sfi, exynos5440 and intel_pstate drivers
     (Colin Ian King, Krzysztof Kozlowski, Octavian Purdila, Rafael
     Wysocki, Tao Wang).

   - Fix a few minor issues in the generic power domains (genpd)
     framework and clean it up somewhat (Krzysztof Kozlowski, Mikko
     Perttunen, Viresh Kumar).

   - Fix a couple of minor issues in the operating performance points
     (OPP) framework and clean it up somewhat (Viresh Kumar).

   - Fix a CONFIG dependency in the hibernation core and clean it up
     slightly (Balbir Singh, Arvind Yadav, BaoJun Luo).

   - Add rk3228 support to the rockchip-io adaptive voltage scaling
     (AVS) driver (David Wu).

   - Fix an incorrect bit shift operation in the RAPL power capping
     driver (Adam Lessnau).

   - Add support for the EPP field in the HWP (hardware managed
     P-states) control register, HWP.EPP, to the x86_energy_perf_policy
     tool and update msr-index.h with HWP.EPP values (Len Brown).

   - Fix some minor issues in the turbostat tool (Len Brown).

   - Add support for AMD family 0x17 CPUs to the cpupower tool and fix a
     minor issue in it (Sherry Hurwitz).

   - Assorted cleanups, mostly related to the constification of some
     data structures (Arvind Yadav, Joe Perches, Kees Cook, Krzysztof
     Kozlowski)"

* tag 'pm-4.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (69 commits)
  cpufreq: Update scaling_cur_freq documentation
  cpufreq: intel_pstate: Clean up after performance governor changes
  PM: hibernate: constify attribute_group structures.
  cpuidle: menu: allow state 0 to be disabled
  intel_idle: Use more common logging style
  PM / Domains: Fix missing default_power_down_ok comment
  PM / Domains: Fix unsafe iteration over modified list of domains
  PM / Domains: Fix unsafe iteration over modified list of domain providers
  PM / Domains: Fix unsafe iteration over modified list of device links
  PM / Domains: Handle safely genpd_syscore_switch() call on non-genpd device
  PM / Domains: Call driver's noirq callbacks
  PM / core: Drop run_wake flag from struct dev_pm_info
  PCI / PM: Simplify device wakeup settings code
  PCI / PM: Drop pme_interrupt flag from struct pci_dev
  ACPI / PM: Consolidate device wakeup settings code
  ACPI / PM: Drop run_wake from struct acpi_device_wakeup_flags
  PM / QoS: constify *_attribute_group.
  PM / AVS: rockchip-io: add io selectors and supplies for rk3228
  powercap/RAPL: prevent overridding bits outside of the mask
  PM / sysfs: Constify attribute groups
  ...
2017-07-04 13:39:41 -07:00
Nicholas Piggin 3ed09c9458 cpuidle: menu: allow state 0 to be disabled
The menu driver does not allow state0 to be disabled completely.
If it is disabled but other enabled states don't meet latency
requirements, it is still used.

Fix this by starting with the first enabled idle state. Fall back
to state 0 if no idle states are enabled (arguably this should be
-EINVAL if it is attempted, but this is the minimal fix).

Acked-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-06-29 22:59:17 +02:00
Nicholas Piggin 7ded429152 cpuidle: powerpc: no memory barrier after break from idle
A memory barrier is not required after the task wakes up,
only if we clear the polling flag before waking. The case
where we have work to do is the important one, so optimise
for it.

Reviewed-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-28 13:08:12 +10:00
Nicholas Piggin 624e46d035 cpuidle: powerpc: read mostly for common globals
Ensure these don't get put into bouncing cachelines.

Reviewed-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-28 13:08:12 +10:00
Nicholas Piggin 3fc5ee927f cpuidle: powerpc: cpuidle set polling before enabling irqs
local_irq_enable can cause interrupts to be taken which could
take significant amount of processing time. The idle process
should set its polling flag before this, so another process that
wakes it during this time will not have to send an IPI.

Expand the TIF_POLLING_NRFLAG coverage to as large as possible.

Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-28 13:08:11 +10:00
Daniel Lezcano d50a7d8acd ARM: cpuidle: Support asymmetric idle definition
Some hardware have clusters with different idle states. The current code does
not support this and fails as it expects all the idle states to be identical.

Because of this, the Mediatek mtk8173 had to create the same idle state for a
big.Little system and now the Hisilicon 960 is facing the same situation.

Solve this by simply assuming the multiple driver will be needed for all the
platforms using the ARM generic cpuidle driver which makes sense because of the
different topologies we can support with a single kernel for ARM32 or ARM64.

Every CPU has its own driver, so every single CPU can specify in the DT the
idle states.

This simple approach allows to support the future dynamIQ system, current SMP
and HMP.

Tested on:
 - 96boards: Hikey 620
 - 96boards: Hikey 960
 - 96boards: dragonboard410c
 - Mediatek 8173

Tested-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-06-24 01:51:00 +02:00
Ingo Molnar 902b319413 Merge branch 'WIP.sched/core' into sched/core
Conflicts:
	kernel/sched/Makefile

Pick up the waitqueue related renames - it didn't get much feedback,
so it appears to be uncontroversial. Famous last words? ;-)

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-20 12:28:21 +02:00
Nicholas Piggin 2201f994a5 powerpc/64s/idle: Move soft interrupt mask logic into C code
This simplifies the asm and fixes irq-off tracing over sleep
instructions.

Also move powersave_nap check for POWER8 into C code, and move
PSSCR register value calculation for POWER9 into C.

Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-19 19:46:26 +10:00
Rafael J. Wysocki f63e4f7d41 Merge branches 'pm-cpufreq', 'pm-cpuidle' and 'pm-devfreq'
* pm-cpufreq:
  cpufreq: conservative: Allow down_threshold to take values from 1 to 10
  Revert "cpufreq: schedutil: Reduce frequencies slower"

* pm-cpuidle:
  cpuidle: dt: Add missing 'of_node_put()'

* pm-devfreq:
  PM / devfreq: exynos-ppmu: Staticize event list
  PM / devfreq: exynos-ppmu: Handle return value of clk_prepare_enable
  PM / devfreq: exynos-nocp: Handle return value of clk_prepare_enable
2017-06-15 01:51:33 +02:00
Christophe Jaillet b2cdd8e1b5 cpuidle: dt: Add missing 'of_node_put()'
'of_node_put()' should be called on pointer returned by
'of_parse_phandle()' when done. In this function this is done in all path
except this 'continue', so add it.

Fixes: 97735da074 (drivers: cpuidle: Add status property to ARM idle states)
Signed-off-by: Christophe Jaillet <christophe.jaillet@wanadoo.fr>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-06-12 14:36:13 +02:00
Gautham R. Shenoy f9122ee4f5 cpuidle-powernv: Allow Deep stop states that don't stop time
The current code in the cpuidle-powernv intialization only allows deep
stop states (indicated by OPAL_PM_STOP_INST_DEEP) which lose timebase
(indicated by OPAL_PM_TIMEBASE_STOP). This assumption goes back to
POWER8 time where deep states used to lose the timebase. However, on
POWER9, we do have stop states that are deep (they lose hypervisor
state) but retain the timebase.

Fix the initialization code in the cpuidle-powernv driver to allow
such deep states.

Further, there is a bug in cpuidle-powernv driver with
CONFIG_TICK_ONESHOT=n where we end up incrementing the nr_idle_states
even if a platform idle state which loses time base was not added to
the cpuidle table.

Fix this by ensuring that the nr_idle_states variable gets incremented
only when the platform idle state was added to the cpuidle table.

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-05-30 14:59:51 +10:00
Peter Zijlstra f9fccdb9ef cpuidle: Fix idle time tracking
Ville reported that on his Core2, which has TSC stop in idle, we would
always report very short idle durations. He tracked this down to
commit:

  e93e59ce5b ("cpuidle: Replace ktime_get() with local_clock()")

which replaces ktime_get() with local_clock().

Add a sched_clock_idle_wakeup_event() call, which will re-sync the
clock with ktime_get_ns() when TSC is unstable and no-op otherwise.

Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Fixes: e93e59ce5b ("cpuidle: Replace ktime_get() with local_clock()")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-05-15 10:15:20 +02:00
Rafael J. Wysocki 80449d89d4 Merge branches 'pm-domains', 'pm-cpuidle', 'pm-sleep' and 'powercap'
* pm-domains:
  PM / Domains: Add DT file to MAINTAINERS
  PM / Domains: Fix DT example

* pm-cpuidle:
  x86/intel_idle: add Gemini Lake support
  cpuidle: check dev before usage in cpuidle_use_deepest_state()

* pm-sleep:
  ACPI / sleep: Ignore spurious SCI wakeups from suspend-to-idle
  PM / wakeup: Integrate mechanism to abort transitions in progress

* powercap:
  powercap: intel_rapl: Add support for Gemini Lake
2017-05-09 23:21:46 +02:00
Li, Fei 41dc750ea6 cpuidle: check dev before usage in cpuidle_use_deepest_state()
In case of there is no cpuidle devices registered, dev will be null, and
panic will be triggered like below;
In this patch, add checking of dev before usage, like that done in
cpuidle_idle_call.

Panic without fix:
[  184.961328] BUG: unable to handle kernel NULL pointer dereference at
  (null)
[  184.961328] IP: cpuidle_use_deepest_state+0x30/0x60
...
[  184.961328]  play_idle+0x8d/0x210
[  184.961328]  ? __schedule+0x359/0x8e0
[  184.961328]  ? _raw_spin_unlock_irqrestore+0x28/0x50
[  184.961328]  ? kthread_queue_delayed_work+0x41/0x80
[  184.961328]  clamp_idle_injection_func+0x64/0x1e0

Fixes: bb8313b603 (cpuidle: Allow enforcing deepest idle state selection)
Signed-off-by: Li, Fei <fei.li@intel.com>
Tested-by: Shi, Feng <fengx.shi@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: 4.10+ <stable@vger.kernel.org> # 4.10+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-05-01 23:17:20 +02:00
Rafael J. Wysocki 060d0fbb43 Merge branches 'pm-cpuidle', 'pm-core', 'pm-domains', 'pm-avs' and 'pm-devfreq'
* pm-cpuidle:
  cpuidle: powernv: Avoid a branch in the core snooze_loop() loop
  cpuidle: powernv: Don't continually set thread priority in snooze_loop()
  cpuidle: powernv: Don't bounce between low and very low thread priority
  cpuidle: cpuidle-cps: remove unused variable
  powernv-cpuidle: Validate DT property array size

* pm-core:
  PM / runtime: Document autosuspend-helper side effects
  PM / runtime: Fix autosuspend documentation

* pm-domains:
  PM / Domains: Ignore domain-idle-states that are not compatible
  PM / Domains: Don't warn about IRQ safe device for an always on PM domain
  PM / Domains: Respect errors from genpd's ->power_off() callback
  PM / Domains: Enable users of genpd to specify always on PM domains
  PM / Domains: Clean up code validating genpd's status
  PM / Domain: remove conditional from error case

* pm-avs:
  PM / AVS: rockchip-io: add io selectors and supplies for rk3328

* pm-devfreq:
  PM / devfreq: Move struct devfreq_governor to devfreq directory
2017-04-28 23:15:34 +02:00
Anton Blanchard 0baa91cb73 cpuidle: powernv: Avoid a branch in the core snooze_loop() loop
When in the snooze_loop() we want to take up the least amount of
resources. On my version of gcc (6.3), we end up with an extra
branch because it predicts snooze_timeout_en to be false, whereas it
is almost always true.

Use likely() to avoid the branch and be a little nicer to the
other non idle threads on the core.

Signed-off-by: Anton Blanchard <anton@samba.org>
Reviewed-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-04-19 23:17:19 +02:00
Anton Blanchard 26eb48a9fa cpuidle: powernv: Don't continually set thread priority in snooze_loop()
The powerpc64 kernel exception handlers have preserved thread priorities
for a long time now, so there is no need to continually set it.

Just set it once on entry and once exit.

Signed-off-by: Anton Blanchard <anton@samba.org>
Reviewed-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-04-19 23:17:18 +02:00
Anton Blanchard 79b578111f cpuidle: powernv: Don't bounce between low and very low thread priority
The core of snooze_loop() continually bounces between low and very
low thread priority. Changing thread priorities is an expensive
operation that can negatively impact other threads on a core.

All CPUs that can run PowerNV support very low priority, so we can
avoid the change completely.

Signed-off-by: Anton Blanchard <anton@samba.org>
Reviewed-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-04-19 23:17:18 +02:00
Marcin Nowakowski 02018b3929 cpuidle: cpuidle-cps: remove unused variable
'core' in cps_cpuidle_init has never been used and is unnecessary, so
remove the dead code.

Signed-off-by: Marcin Nowakowski <marcin.nowakowski@imgtec.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-04-19 23:04:54 +02:00
Vaidyanathan Srinivasan 293d264f13 cpuidle: powernv: Pass correct drv->cpumask for registration
drv->cpumask defaults to cpu_possible_mask in __cpuidle_driver_init().
On PowerNV platform cpu_present could be less than cpu_possible in cases
where firmware detects the cpu, but it is not available to the OS.  When
CONFIG_HOTPLUG_CPU=n, such cpus are not hotplugable at runtime and hence
we skip creating cpu_device.

This breaks cpuidle on powernv where register_cpu() is not called for
cpus in cpu_possible_mask that cannot be hot-added at runtime.

Trying cpuidle_register_device() on cpu without cpu_device will cause
crash like this:

cpu 0xf: Vector: 380 (Data SLB Access) at [c000000ff1503490]
    pc: c00000000022c8bc: string+0x34/0x60
    lr: c00000000022ed78: vsnprintf+0x284/0x42c
    sp: c000000ff1503710
   msr: 9000000000009033
   dar: 6000000060000000
  current = 0xc000000ff1480000
  paca    = 0xc00000000fe82d00   softe: 0        irq_happened: 0x01
    pid   = 1, comm = swapper/8
Linux version 4.11.0-rc2 (sv@sagarika) (gcc version 4.9.4
(Buildroot 2017.02-00004-gc28573e) ) #15 SMP Fri Mar 17 19:32:02 IST 2017
enter ? for help
[link register   ] c00000000022ed78 vsnprintf+0x284/0x42c
[c000000ff1503710] c00000000022ebb8 vsnprintf+0xc4/0x42c (unreliable)
[c000000ff1503800] c00000000022ef40 vscnprintf+0x20/0x44
[c000000ff1503830] c0000000000ab61c vprintk_emit+0x94/0x2cc
[c000000ff15038a0] c0000000000acc9c vprintk_func+0x60/0x74
[c000000ff15038c0] c000000000619694 printk+0x38/0x4c
[c000000ff15038e0] c000000000224950 kobject_get+0x40/0x60
[c000000ff1503950] c00000000022507c kobject_add_internal+0x60/0x2c4
[c000000ff15039e0] c000000000225350 kobject_init_and_add+0x70/0x78
[c000000ff1503a60] c00000000053c288 cpuidle_add_sysfs+0x9c/0xe0
[c000000ff1503ae0] c00000000053aeac cpuidle_register_device+0xd4/0x12c
[c000000ff1503b30] c00000000053b108 cpuidle_register+0x98/0xcc
[c000000ff1503bc0] c00000000085eaf0 powernv_processor_idle_init+0x140/0x1e0
[c000000ff1503c60] c00000000000cd60 do_one_initcall+0xc0/0x15c
[c000000ff1503d20] c000000000833e84 kernel_init_freeable+0x1a0/0x25c
[c000000ff1503dc0] c00000000000d478 kernel_init+0x24/0x12c
[c000000ff1503e30] c00000000000b564 ret_from_kernel_thread+0x5c/0x78

This patch fixes the bug by passing correct cpumask from
powernv-cpuidle driver.

Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
[ rjw: Comment massage ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-03-29 22:55:36 +02:00
Gautham R. Shenoy ecad4502d0 powernv-cpuidle: Validate DT property array size
The various properties associated with powernv idle states such as
names, flags, residency-ns, latencies-ns, psscr, psscr-mask are
exposed in the device-tree as property arrays such the pointwise
entries in each of these arrays correspond to the properties of the
same idle state.

This patch validates that the lengths of the property arrays are the
same. If there is a mismatch, the patch will ensure that we bail out
and not expose the platform idle states via cpuidle.

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Reviewed-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-03-29 00:14:00 +02:00
Vaidyanathan Srinivasan ad0a45fd9c cpuidle: Validate cpu_dev in cpuidle_add_sysfs()
If a given cpu is not in cpu_present and cpu hotplug
is disabled, arch can skip setting up the cpu_dev.

Arch cpuidle driver should pass correct cpu mask
for registration, but failing to do so by the driver
causes error to propagate and crash like this:

[   30.076045] Unable to handle kernel paging request for data at address 0x00000048
[   30.076100] Faulting instruction address: 0xc0000000007b2f30
cpu 0x4d: Vector: 300 (Data Access) at [c000003feb18b670]
    pc: c0000000007b2f30: kobject_get+0x20/0x70
    lr: c0000000007b3c94: kobject_add_internal+0x54/0x3f0
    sp: c000003feb18b8f0
   msr: 9000000000009033
   dar: 48
 dsisr: 40000000
  current = 0xc000003fd2ed8300
  paca    = 0xc00000000fbab500   softe: 0        irq_happened: 0x01
    pid   = 1, comm = swapper/0
Linux version 4.11.0-rc2-svaidy+ (sv@sagarika) (gcc version 6.2.0
20161005 (Ubuntu 6.2.0-5ubuntu12) ) #10 SMP Sun Mar 19 00:08:09 IST 2017
enter ? for help
[c000003feb18b960] c0000000007b3c94 kobject_add_internal+0x54/0x3f0
[c000003feb18b9f0] c0000000007b43a4 kobject_init_and_add+0x64/0xa0
[c000003feb18ba70] c000000000e284f4 cpuidle_add_sysfs+0xb4/0x130
[c000003feb18baf0] c000000000e26038 cpuidle_register_device+0x118/0x1c0
[c000003feb18bb30] c000000000e26c48 cpuidle_register+0x78/0x120
[c000003feb18bbc0] c00000000168fd9c powernv_processor_idle_init+0x110/0x1c4
[c000003feb18bc40] c00000000000cff8 do_one_initcall+0x68/0x1d0
[c000003feb18bd00] c0000000016242f4 kernel_init_freeable+0x280/0x360
[c000003feb18bdc0] c00000000000d864 kernel_init+0x24/0x160
[c000003feb18be30] c00000000000b4e8 ret_from_kernel_thread+0x5c/0x74

Validating cpu_dev fixes the crash and reports correct error message like:

[   30.163506] Failed to register cpuidle device for cpu136
[   30.173329] Registration of powernv driver failed.

Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
[ rjw: Comment massage ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-03-21 22:26:37 +01:00
Linus Torvalds 1827adb11a Merge branch 'WIP.sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull sched.h split-up from Ingo Molnar:
 "The point of these changes is to significantly reduce the
  <linux/sched.h> header footprint, to speed up the kernel build and to
  have a cleaner header structure.

  After these changes the new <linux/sched.h>'s typical preprocessed
  size goes down from a previous ~0.68 MB (~22K lines) to ~0.45 MB (~15K
  lines), which is around 40% faster to build on typical configs.

  Not much changed from the last version (-v2) posted three weeks ago: I
  eliminated quirks, backmerged fixes plus I rebased it to an upstream
  SHA1 from yesterday that includes most changes queued up in -next plus
  all sched.h changes that were pending from Andrew.

  I've re-tested the series both on x86 and on cross-arch defconfigs,
  and did a bisectability test at a number of random points.

  I tried to test as many build configurations as possible, but some
  build breakage is probably still left - but it should be mostly
  limited to architectures that have no cross-compiler binaries
  available on kernel.org, and non-default configurations"

* 'WIP.sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (146 commits)
  sched/headers: Clean up <linux/sched.h>
  sched/headers: Remove #ifdefs from <linux/sched.h>
  sched/headers: Remove the <linux/topology.h> include from <linux/sched.h>
  sched/headers, hrtimer: Remove the <linux/wait.h> include from <linux/hrtimer.h>
  sched/headers, x86/apic: Remove the <linux/pm.h> header inclusion from <asm/apic.h>
  sched/headers, timers: Remove the <linux/sysctl.h> include from <linux/timer.h>
  sched/headers: Remove <linux/magic.h> from <linux/sched/task_stack.h>
  sched/headers: Remove <linux/sched.h> from <linux/sched/init.h>
  sched/core: Remove unused prefetch_stack()
  sched/headers: Remove <linux/rculist.h> from <linux/sched.h>
  sched/headers: Remove the 'init_pid_ns' prototype from <linux/sched.h>
  sched/headers: Remove <linux/signal.h> from <linux/sched.h>
  sched/headers: Remove <linux/rwsem.h> from <linux/sched.h>
  sched/headers: Remove the runqueue_is_locked() prototype
  sched/headers: Remove <linux/sched.h> from <linux/sched/hotplug.h>
  sched/headers: Remove <linux/sched.h> from <linux/sched/debug.h>
  sched/headers: Remove <linux/sched.h> from <linux/sched/nohz.h>
  sched/headers: Remove <linux/sched.h> from <linux/sched/stat.h>
  sched/headers: Remove the <linux/gfp.h> include from <linux/sched.h>
  sched/headers: Remove <linux/rtmutex.h> from <linux/sched.h>
  ...
2017-03-03 10:16:38 -08:00
Linus Torvalds 080e4168c0 Merge tag 'pm-extra-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more power management updates deom Rafael Wysocki:
 "These fix two bugs introduced by recent power management updates (in
  the cpuidle menu governor and intel_pstate) and a few other issues,
  clean up things and remove unused code.

  Specifics:

   - Fix for a cpuidle menu governor problem that started to take an
     unnecessary spinlock after one of the recent updates and that did
     not play well with the RT patch (Rafael Wysocki).

   - Fix for the new intel_pstate operation mode switching feature added
     recently that did not reinitialize P-state limits properly when
     switching operation modes (Rafael Wysocki).

   - Removal of unused global notifiers from the PM QoS framework
     (Viresh Kumar).

   - Generic power domains framework update to make it handle
     asynchronous invocations of PM callbacks in the "noirq" phases of
     system suspend/hibernation correctly (Ulf Hansson).

   - Two hibernation core cleanups (Rafael Wysocki).

   - intel_idle cleanup related to the sysfs interface (Len Brown).

   - Off-by-one bug fix in the OPP (Operating Performance Points)
     framework (Andrzej Hajda).

   - OPP framework's documentation fix (Viresh Kumar).

   - cpufreq qoriq driver cleanup (Tang Yuantian).

   - Fixes for typos in comments in the device runtime PM framework
     (Christophe Jaillet)"

* tag 'pm-extra-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM / OPP: Documentation: Fix opp-microvolt in examples
  intel_idle: stop exposing platform acronyms in sysfs
  cpufreq: intel_pstate: Fix limits issue with operation mode switching
  PM / hibernate: Define pr_fmt() and use pr_*() instead of printk()
  PM / hibernate: Untangle power_down()
  cpuidle: menu: Avoid taking spinlock for accessing QoS values
  PM / QoS: Remove global notifiers
  PM / runtime: Fix some typos
  cpufreq: qoriq: clean up unused code
  PM / OPP: fix off-by-one bug in dev_pm_opp_get_max_volt_latency loop
  PM / Domains: Power off masters immediately in the power off sequence
  PM / Domains: Rename is_async to one_dev_on for genpd_power_off()
  PM / Domains: Move genpd_power_off() above genpd_power_on()
2017-03-02 17:33:52 -08:00