Commit Graph

868 Commits

Author SHA1 Message Date
Dirk Behme 87260d3f7a thermal: rcar_thermal: Fix priv->zone error handling
In case thermal_zone_xxx_register() returns an error, priv->zone
isn't NULL any more, but contains the error code.

This is passed to thermal_zone_device_unregister(), then. This checks
for priv->zone being NULL, but the error code is != NULL. So it works
with the error code as a pointer. Crashing immediately.

To fix this, reset priv->zone to NULL before entering
rcar_gen3_thermal_remove().

Signed-off-by: Dirk Behme <dirk.behme@de.bosch.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-09-06 20:46:06 +08:00
Corentin LABBE 829bc78aa7 thermal: imx: fix a possible NULL dereference
of_match_device could return NULL, and so cause a NULL pointer
dereference later at line 472:
data->socdata = of_id->data;

For fixing this problem, we use of_device_get_match_data(), this will
simplify the code a little by using a standard function for
getting the match data.

Reported-by: coverity (CID 1324128)
Signed-off-by: LABBE Corentin <clabbe.montjoie@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-08-19 21:34:08 +08:00
Markus Elfring 57027db00d Thermal-INT3406: Delete owner assignment
The field "owner" is set by core. Thus delete an extra initialisation.

Generated by: scripts/coccinelle/api/platform_no_drv_owner.cocci
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-08-19 21:32:48 +08:00
Brendan Jackman a305a4387a thermal: cpu_cooling: Fix NULL dereference in cpufreq_state2power
Currently all CPU cooling devices share a
`struct thermal_cooling_device_ops` instance. The thermal core uses the
presence of functions in this struct to determine if a cooling device
has a power model (see cdev_is_power_actor). cpu_cooling.c adds the
power model functions to the shared struct when a device is registered
with a power model.

Therefore, if a CPU cooling device is registered using
[of_]cpufreq_power_cooling_register, _all_ devices will be determined to
have a power model, including any registered with
[of_]cpufreq_cooling_register. This can result in cpufreq_state2power
being called on a device where dyn_power_table is NULL.

With this commit, instead of having a shared thermal_cooling_device_ops
which is mutated, we have two versions: one with the power functions and
one without.

Signed-off-by: Brendan Jackman <brendan.jackman@arm.com>
Cc: Amit Daniel Kachhap <amit.kachhap@gmail.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Javi Merino <javi.merino@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-08-19 21:32:18 +08:00
Zhang Rui 1577ddfac7 Merge branches 'thermal-intel' and 'thermal-core' into next 2016-08-08 10:59:35 +08:00
Wei Yongjun 165989a5b6 thermal: clock_cooling: Fix missing mutex_init()
The driver allocates the mutex but not initialize it.
Use mutex_init() on it to initialize it correctly.

This is detected by Coccinelle semantic patch.

Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-08-08 10:57:39 +08:00
Kuninori Morimoto f4c592439b thermal: hwmon: EXPORT_SYMBOL_GPL for thermal hwmon sysfs
thermal_add_hwmon_sysfs()/thermal_remove_hwmon_sysfs() need
EXPORT_SYMBOL_GPL(). Otherwise we will have ERROR

>> ERROR: "thermal_remove_hwmon_sysfs" [drivers/thermal/rcar_thermal.ko] undefined!
>> ERROR: "thermal_add_hwmon_sysfs" [drivers/thermal/rcar_thermal.ko] undefined!

Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-08-08 10:57:39 +08:00
Michele Di Giorgio d0b7306d20 thermal: fix race condition when updating cooling device
When multiple thermal zones are bound to the same cooling device, multiple
kernel threads may want to update the cooling device state by calling
thermal_cdev_update(). Having cdev not protected by a mutex can lead to a race
condition. Consider the following situation with two kernel threads k1 and k2:

	    Thread k1				Thread k2
                                    ||
                                    ||  call thermal_cdev_update()
                                    ||      ...
                                    ||      set_cur_state(cdev, target);
    call power_actor_set_power()    ||
        ...                         ||
        instance->target = state;   ||
        cdev->updated = false;      ||
                                    ||      cdev->updated = true;
                                    ||      // completes execution
    call thermal_cdev_update()      ||
        // cdev->updated == true    ||
        return;                     ||
                                    \/
                                    time

k2 has already looped through the thermal instances looking for the deepest
cooling device state and is preempted right before setting cdev->updated to
true. Now, k1 runs, modifies the thermal instance state and sets cdev->updated
to false. Then, k1 is preempted and k2 continues the execution by setting
cdev->updated to true, therefore preventing k1 from performing the update.
Notice that this is not an issue if k2 looks at the instance->target modified by
k1 "after" it is assigned by k1. In fact, in this case the update will happen
anyway and k1 can safely return immediately from thermal_cdev_update().

This may lead to a situation where a thermal governor never updates the cooling
device. For example, this is the case for the step_wise governor: when calling
the function thermal_zone_trip_update(), the governor may always get a new state
equal to the old one (which, however, wasn't notified to the cooling device) and
will therefore skip the update.

CC: Zhang Rui <rui.zhang@intel.com>
CC: Eduardo Valentin <edubezval@gmail.com>
CC: Peter Feuerer <peter@piie.net>
Reported-by: Toby Huang <toby.huang@arm.com>
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-08-08 10:57:39 +08:00
Petr Mladek 70c50ee72e thermal/powerclamp: Prevent division by zero when counting interval
I have got a zero division error when disabling the forced
idle injection from the intel powerclamp. I did

  echo 0 >/sys/class/thermal/cooling_device48/cur_state

and got

[  986.072632] divide error: 0000 [#1] PREEMPT SMP
[  986.078989] Modules linked in:
[  986.083618] CPU: 17 PID: 24967 Comm: kidle_inject/17 Not tainted 4.7.0-1-default+ #3055
[  986.093781] Hardware name: Intel Corporation S2600CP/S2600CP, BIOS RMLSDP.86I.R3.27.D685.1305151734 05/15/2013
[  986.106227] task: ffff880430e1c080 task.stack: ffff880427ef0000
[  986.114122] RIP: 0010:[<ffffffff81794859>]  [<ffffffff81794859>] clamp_thread+0x1d9/0x600
[  986.124609] RSP: 0018:ffff880427ef3e20  EFLAGS: 00010246
[  986.131860] RAX: 0000000000000258 RBX: 0000000000000006 RCX: 0000000000000001
[  986.141179] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000018
[  986.150478] RBP: ffff880427ef3ec8 R08: ffff880427ef0000 R09: 0000000000000002
[  986.159779] R10: 0000000000003df2 R11: 0000000000000018 R12: 0000000000000002
[  986.169089] R13: 0000000000000000 R14: ffff880427ef0000 R15: ffff880427ef0000
[  986.178388] FS:  0000000000000000(0000) GS:ffff880435940000(0000) knlGS:0000000000000000
[  986.188785] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  986.196559] CR2: 00007f1d0caf0000 CR3: 0000000002006000 CR4: 00000000001406e0
[  986.205909] Stack:
[  986.209524]  ffff8802be897b00 ffff880430e1c080 0000000000000011 0000006a35959780
[  986.219236]  0000000000000011 ffff880427ef0008 0000000000000000 ffff8804359503d0
[  986.228966]  0000000100029d93 ffffffff81794140 0000000000000000 ffffffff05000011
[  986.238686] Call Trace:
[  986.242825]  [<ffffffff81794140>] ? pkg_state_counter+0x80/0x80
[  986.250866]  [<ffffffff81794680>] ? powerclamp_set_cur_state+0x180/0x180
[  986.259797]  [<ffffffff8111d1a9>] kthread+0xc9/0xe0
[  986.266682]  [<ffffffff8193d69f>] ret_from_fork+0x1f/0x40
[  986.274142]  [<ffffffff8111d0e0>] ? kthread_create_on_node+0x180/0x180
[  986.282869] Code: d1 ea 48 89 d6 80 3d 6a d0 d4 00 00 ba 64 00 00 00 89 d8 41 0f 45 f5 0f af c2 42 8d 14 2e be 31 00 00 00 83 fa 31 0f 42 f2 31 d2 <f7> f6 48 8b 15 9e 07 87 00 48 8b 3d 97 07 87 00 48 63 f0 83 e8
[  986.307806] RIP  [<ffffffff81794859>] clamp_thread+0x1d9/0x600
[  986.315871]  RSP <ffff880427ef3e20>

RIP points to the following lines:

	compensation = get_compensation(target_ratio);
	interval = duration_jiffies*100/(target_ratio+compensation);

A solution would be to switch the following two commands in
powerclamp_set_cur_state():

	set_target_ratio = 0;
	end_power_clamp();

But I think that the zero division might happen also when target_ratio
is non-zero because the compensation might be negative. Therefore
we also check the sum of target_ratio and compensation explicitly.

Also the compensated_ratio variable is always set. Therefore there
is no need to initialize it.

Signed-off-by: Petr Mladek <pmladek@suse.com>
Acked-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-08-08 10:54:55 +08:00
Srinivas Pandruvada 176b1ec213 thermal: intel_pch_thermal: Add suspend/resume callback
Added suspend/resume callback to disable/enable PCH thermal sensor
respectively. If the sensor is enabled by the BIOS, then the sensor status
will not be changed during suspend/resume.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-08-08 10:54:55 +08:00
Rafael J. Wysocki ae892d150f Merge branch 'x86/cpu' 2016-06-13 23:49:23 +02:00
Rafael J. Wysocki b77b565108 Merge branch 'x86/cpu' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into x86/cpu
Pull recent changes related to x86 CPU model representations from tip.
2016-06-13 23:48:23 +02:00
Rafael J. Wysocki bb4b9933e2 Merge back earlier cpufreq changes for v4.8. 2016-06-13 23:33:17 +02:00
Linus Torvalds 57120fac12 Merge branch 'for-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux
Pull thermal management fixes from Zhang Rui:

 - fix an ordering issue in cpu cooling that cooling device is
   registered before it's ready (freq_table being populated).
   (Lukasz Luba)

 - fix a missing comment update (Caesar Wang)

* 'for-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
  thermal: add the note for set_trip_temp
  thermal: cpu_cooling: fix improper order during initialization
2016-06-12 06:30:39 -07:00
Viresh Kumar f8bfc116ca cpufreq: Remove cpufreq_frequency_get_table()
Most of the callers of cpufreq_frequency_get_table() already have the
pointer to a valid 'policy' structure and they don't really need to go
through the per-cpu variable first and then a check to validate the
frequency, in order to find the freq-table for the policy.

Directly use the policy->freq_table field instead for them.

Only one user of that API is left after above changes, cpu_cooling.c and
it accesses the freq_table in a racy way as the policy can get freed in
between.

Fix it by using cpufreq_cpu_get() properly.

Since there are no more users of cpufreq_frequency_get_table() left, get
rid of it.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Javi Merino <javi.merino@arm.com> (cpu_cooling.c)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-06-09 00:58:05 +02:00
Dave Hansen ce53da02eb x86, thermal: Clean up and fix CPU model detection for intel_soc_dts_thermal
The X86_FAMILY_ANY in here is bogus.  "BYT" and model 0x37 are
family-6 only.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave@sr71.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: jacob.jun.pan@intel.com
Cc: linux-pm@vger.kernel.org
Link: http://lkml.kernel.org/r/20160603001952.9B6E114D@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-08 13:03:26 +02:00
Rafael J. Wysocki 60c07f80b0 Merge branches 'acpica-fixes', 'acpi-video' and 'acpi-processor'
* acpica-fixes:
  ACPICA / Hardware: Fix old register check in acpi_hw_get_access_bit_width()

* acpi-video:
  ACPI / Thermal / video: fix max_level incorrect value

* acpi-processor:
  ACPI / processor: Avoid reserving IO regions too early
2016-06-03 22:35:05 +02:00
Lukasz Luba f840ab18bd thermal: cpu_cooling: fix improper order during initialization
The freq_table array is not populated before calling
thermal_of_cooling_register. The code which populates the freq table was
introduced in commit f6859014.
This should be done before registering new thermal cooling device.
The log shows effects of this wrong decision.
[    2.172614] cpu cpu1: Failed to get voltage for frequency 1984518656000: -34
[    2.220863] cpu cpu0: Failed to get voltage for frequency 1984524416000: -34

Cc: <stable@vger.kernel.org> # 3.19+
Fixes: f6859014c7 ("thermal: cpu_cooling: Store frequencies in descending order")
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Javi Merino <javi.merino@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-06-01 22:21:19 +08:00
Aaron Lu 9f9cd7ee2c ACPI / Thermal / video: fix max_level incorrect value
commit 059500940d (ACPI/video: export acpi_video_get_levels)
mistakenly dropped the correct value of max_level and that caused the
set_level function following failed and the acpi_video backlight interface
didn't get created. Fix this by passing back the correct max_level value.

While at it, also fix the param used in acpi_video_device_lcd_query_levels
where acpi_handle is expected but acpi_video_device is passed.

Fixes: 059500940d (ACPI/video: export acpi_video_get_levels)
Reported-and-tested-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Signed-off-by: Aaron Lu <aaron.lu@intel.com>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-05-30 13:53:09 +02:00
Linus Torvalds bfb764440d Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux
Pull thermal management updates from Zhang Rui:

 - Introduce generic ADC thermal driver, based on OF thermal (Laxman
   Dewangan)

 - Introduce new thermal driver for Tango chips (Marc Gonzalez)

 - Rockchip driver support for RK3399, RK3366, and some fixes (Caesar
   Wang, Elaine Zhang and Shawn Lin)

 - Add CPU power cooling model to Mediatek thermal driver (Dawei Chien)

 - Wider usage of dev_thermal_zone_of_sensor_register (Eduardo Valentin)

 - TI thermal driver gained a new maintainer (Keerthy).

 - Enabled powerclamp driver by checking CPU feature and package cstate
   counter instead of CPU whitelist (Jacob Pan)

 - Various fixes on thermal governor, OF thermal, Tegra, and RCAR

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux: (50 commits)
  thermal: tango: initialize TEMPSI_CFG
  thermal: rockchip: use the usleep_range instead of udelay
  thermal: rockchip: add the notes for better reading
  thermal: rockchip: Support RK3366 SoCs in the thermal driver
  thermal: rockchip: handle the power sequence for tsadc controller
  thermal: rockchip: update the tsadc table for rk3399
  thermal: rockchip: fixes the code_to_temp for tsadc driver
  thermal: rockchip: disable thermal->clk in err case
  thermal: tegra: add Tegra132 specific SOC_THERM driver
  thermal: fix ptr_ret.cocci warnings
  thermal: mediatek: Add cpu dynamic power cooling model.
  thermal: generic-adc: Add ADC based thermal sensor driver
  thermal: generic-adc: Add DT binding for ADC based thermal sensor
  thermal: tegra: fix static checker warning
  thermal: tegra: mark PM functions __maybe_unused
  thermal: add temperature sensor support for tango SoC
  thermal: hisilicon: fix IRQ imbalance enabling
  thermal: hisilicon: support to use any sensor
  thermal: rcar: Remove binding docs for r8a7794
  thermal: tegra: add PM support
  ...
2016-05-26 09:23:43 -07:00
Zhang Rui 88ac99063e Merge branches 'thermal-core', 'thermal-intel' and 'thermal-soc' into next 2016-05-18 15:37:08 +08:00
Marc Gonzalez 431c30f74c thermal: tango: initialize TEMPSI_CFG
TEMPSI_CFG is not equal to 0 at reset. It must be initialized.

Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2016-05-17 07:28:33 -07:00
Caesar Wang 2fe5c1b045 thermal: rockchip: use the usleep_range instead of udelay
Documentation/timers/timers-howto.txt recommends to use
usleep_range on delays > 10usec.
The usleep_range indeed reduces CPU load, since the udelay will busy wait
for enough loop cycles to achieve the desired delay.

Fixes commit b06c52db39fd ("thermal: rockchip:
handle the power sequence for tsadc controller").

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Cc: Heiko Stuebner <heiko@sntech.de>
Suggested-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Caesar Wang <wxt@rock-chips.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2016-05-17 07:28:33 -07:00
Caesar Wang 678065d5b7 thermal: rockchip: add the notes for better reading
To update the notes for keeping in mind that quickly in case
someone re-read this driver in the future.

Signed-off-by: Caesar Wang <wxt@rock-chips.com>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Cc: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2016-05-17 07:28:33 -07:00
Elaine Zhang 1cd6026937 thermal: rockchip: Support RK3366 SoCs in the thermal driver
The RK3366 SoCs have two Temperature Sensors, channel 0 is for CPU
channel 1 is for GPU.

Signed-off-by: Elaine Zhang <zhangqing@rock-chips.com>
Signed-off-by: Caesar Wang <wxt@rock-chips.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2016-05-17 07:28:32 -07:00