Commit Graph

482099 Commits

Author SHA1 Message Date
Viresh Kumar 6ce4184d03 PM / OPP: do error handling at the bottom of dev_pm_opp_add_dynamic()
This makes it less error prone and moves common resource deallocation at a
single place.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-12-10 22:18:34 +01:00
Viresh Kumar 07cce74a7b PM / OPP: handle allocation of device_opp in a separate routine
Get the 'device_opp' allocation code into a separate routine to keep only the
necessary part in dev_pm_opp_add_dynamic().

Also do s/sizeof(struct device_opp)/sizeof(*dev_opp) and remove the print
message on kzalloc() failure as checkpatch warns for that.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-12-10 22:18:34 +01:00
Viresh Kumar 29df0ee1b1 PM / OPP: reuse find_device_opp() instead of duplicating code
Reuse find_device_opp() in opp_set_availability() instead of duplicating code.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-12-10 22:18:34 +01:00
Viresh Kumar 86453b473b PM / OPP: Staticize __dev_pm_opp_remove()
Its a local routine and need not be accessible outside of opp.c.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-12-10 22:18:34 +01:00
Viresh Kumar 1c6a662f89 PM / OPP: replace kfree with kfree_rcu while freeing 'struct device_opp'
Somehow one of the instance of freeing resources failed to use kfree_rcu() and
used kfree() instead. This might cause problems as the node might be referenced
by readers under rcu locks and we must wait for the rcu grace period as well.

While we are at it, also update comment over 'struct device_opp' to mention why
we are waiting for both rcu and srcu grace periods.

Fixes: 129eec55df (PM / OPP Introduce APIs to remove OPPs)
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-12-10 22:18:33 +01:00
Viresh Kumar 2a6127d037 PM / OPP: remove double calls to find_device_opp()
By mistake we called find_device_opp() twice in of_free_opp_table(), fix it.

Generated diff doesn't show the problem well and so here is the code snippet:

void of_free_opp_table(struct device *dev)
{
	struct device_opp *dev_opp = find_device_opp(dev);
	struct dev_pm_opp *opp, *tmp;

	/* Check for existing list for 'dev' */
	dev_opp = find_device_opp(dev);

	...
}

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-12-10 02:39:12 +01:00
Viresh Kumar 7284a00f17 PM / OPP: set new_opp->dev_opp to a valid dev_opp
We find/allocate dev_opp after using its value to fill new_opp->dev_opp right
now. Move this to a later point where dev_opp is valid.

Fixes: a7470db6fe (PM / OPP don't match for existing OPPs when list is empty)
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-12-10 02:32:51 +01:00
Viresh Kumar b4037aaa58 PM / OPP replace kfree_rcu() with call_srcu() in opp_set_availability()
This existed before we introduced call_srcu() in opp layer to synchronize with
srcu_notifier_call_chain() while removing OPPs. And is a potential bug which
wasn't noticed earlier.

Let fix it as well by using the right API to free OPP.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-11-29 23:53:20 +01:00
Viresh Kumar 129eec55df PM / OPP Introduce APIs to remove OPPs
OPPs are created statically (from DT) or dynamically. Currently we don't free
OPPs that are created statically, when the module unloads. And so if the module
is inserted back again, we get warning for duplicate OPPs as the same were
already present.

Also, there might be a need to remove dynamic OPPs in future and so API for that
is also added.

This patch adds helper APIs to remove/free existing static and dynamic OPPs.

Because the OPPs are used both under RCU and SRCU, we have to wait for grace
period of both. And so are using kfree_rcu() from within call_srcu().

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-11-29 23:53:20 +01:00
Viresh Kumar 38393409da PM / OPP mark OPPs as 'static' or 'dynamic'
Static OPPs are the ones created from Device Tree entries and dynamic are the
ones created at runtime by calling dev_pm_opp_add().

There is a need to distinguish them as we need to free static OPPs from cpufreq
drivers when they are removed.

So, add another field 'dynamic' in 'struct dev_pm_opp' to keep this information.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-11-29 23:53:20 +01:00
Viresh Kumar a7470db6fe PM / OPP don't match for existing OPPs when list is empty
OPP list is guaranteed to be empty when 'dev_opp' is created. And so we don't
need to run the comparison loop with existing OPPs.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-11-29 23:53:20 +01:00
Viresh Kumar cd1a068a52 PM / OPP rename 'head' as 'rcu_head' or 'srcu_head' based on its type
Both 'struct dev_pm_opp' and 'struct device_opp' have member with name 'head'
but with different types. This leads to confusion while reading the code.

Name them 'rcu_head' and 'srcu_head'.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-11-29 23:53:20 +01:00
Linus Torvalds 5d01410fe4 Linux 3.18-rc6 2014-11-23 15:25:20 -08:00
Andy Lutomirski 82975bc6a6 uprobes, x86: Fix _TIF_UPROBE vs _TIF_NOTIFY_RESUME
x86 call do_notify_resume on paranoid returns if TIF_UPROBE is set but
not on non-paranoid returns.  I suspect that this is a mistake and that
the code only works because int3 is paranoid.

Setting _TIF_NOTIFY_RESUME in the uprobe code was probably a workaround
for the x86 bug.  With that bug fixed, we can remove _TIF_NOTIFY_RESUME
from the uprobes code.

Reported-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-11-23 14:25:28 -08:00
Thomas Gleixner 90e362f4a7 sched: Provide update_curr callbacks for stop/idle scheduling classes
Chris bisected a NULL pointer deference in task_sched_runtime() to
commit 6e998916df 'sched/cputime: Fix clock_nanosleep()/clock_gettime()
inconsistency'.

Chris observed crashes in atop or other /proc walking programs when he
started fork bombs on his machine.  He assumed that this is a new exit
race, but that does not make any sense when looking at that commit.

What's interesting is that, the commit provides update_curr callbacks
for all scheduling classes except stop_task and idle_task.

While nothing can ever hit that via the clock_nanosleep() and
clock_gettime() interfaces, which have been the target of the commit in
question, the author obviously forgot that there are other code paths
which invoke task_sched_runtime()

do_task_stat(()
 thread_group_cputime_adjusted()
   thread_group_cputime()
     task_cputime()
       task_sched_runtime()
        if (task_current(rq, p) && task_on_rq_queued(p)) {
          update_rq_clock(rq);
          up->sched_class->update_curr(rq);
        }

If the stats are read for a stomp machine task, aka 'migration/N' and
that task is current on its cpu, this will happily call the NULL pointer
of stop_task->update_curr.  Ooops.

Chris observation that this happens faster when he runs the fork bomb
makes sense as the fork bomb will kick migration threads more often so
the probability to hit the issue will increase.

Add the missing update_curr callbacks to the scheduler classes stop_task
and idle_task.  While idle tasks cannot be monitored via /proc we have
other means to hit the idle case.

Fixes: 6e998916df 'sched/cputime: Fix clock_nanosleep()/clock_gettime() inconsistency'
Reported-by: Chris Mason <clm@fb.com>
Reported-and-tested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-11-23 14:14:40 -08:00
Linus Torvalds 00c89b2f11 Merge branch 'x86-traps' (trap handling from Andy Lutomirski)
Merge x86-64 iret fixes from Andy Lutomirski:
 "This addresses the following issues:

   - an unrecoverable double-fault triggerable with modify_ldt.
   - invalid stack usage in espfix64 failed IRET recovery from IST
     context.
   - invalid stack usage in non-espfix64 failed IRET recovery from IST
     context.

  It also makes a good but IMO scary change: non-espfix64 failed IRET
  will now report the correct error.  Hopefully nothing depended on the
  old incorrect behavior, but maybe Wine will get confused in some
  obscure corner case"

* emailed patches from Andy Lutomirski <luto@amacapital.net>:
  x86_64, traps: Rework bad_iret
  x86_64, traps: Stop using IST for #SS
  x86_64, traps: Fix the espfix64 #DF fixup and rewrite it in C
2014-11-23 13:56:55 -08:00
Andy Lutomirski b645af2d59 x86_64, traps: Rework bad_iret
It's possible for iretq to userspace to fail.  This can happen because
of a bad CS, SS, or RIP.

Historically, we've handled it by fixing up an exception from iretq to
land at bad_iret, which pretends that the failed iret frame was really
the hardware part of #GP(0) from userspace.  To make this work, there's
an extra fixup to fudge the gs base into a usable state.

This is suboptimal because it loses the original exception.  It's also
buggy because there's no guarantee that we were on the kernel stack to
begin with.  For example, if the failing iret happened on return from an
NMI, then we'll end up executing general_protection on the NMI stack.
This is bad for several reasons, the most immediate of which is that
general_protection, as a non-paranoid idtentry, will try to deliver
signals and/or schedule from the wrong stack.

This patch throws out bad_iret entirely.  As a replacement, it augments
the existing swapgs fudge into a full-blown iret fixup, mostly written
in C.  It's should be clearer and more correct.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-11-23 13:56:19 -08:00
Andy Lutomirski 6f442be2fb x86_64, traps: Stop using IST for #SS
On a 32-bit kernel, this has no effect, since there are no IST stacks.

On a 64-bit kernel, #SS can only happen in user code, on a failed iret
to user space, a canonical violation on access via RSP or RBP, or a
genuine stack segment violation in 32-bit kernel code.  The first two
cases don't need IST, and the latter two cases are unlikely fatal bugs,
and promoting them to double faults would be fine.

This fixes a bug in which the espfix64 code mishandles a stack segment
violation.

This saves 4k of memory per CPU and a tiny bit of code.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-11-23 13:56:19 -08:00
Andy Lutomirski af726f21ed x86_64, traps: Fix the espfix64 #DF fixup and rewrite it in C
There's nothing special enough about the espfix64 double fault fixup to
justify writing it in assembly.  Move it to C.

This also fixes a bug: if the double fault came from an IST stack, the
old asm code would return to a partially uninitialized stack frame.

Fixes: 3891a04aaf
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-11-23 13:56:18 -08:00
Linus Torvalds 27946315d2 Merge tag 'armsoc-for-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
 "A collection of fixes this week:

   - A set of clock fixes for shmobile platforms
   - A fix for tegra that moves serial port labels to be per board.
     We're choosing to merge this for 3.18 because the labels will start
     being parsed in 3.19, and without this change serial port numbers
     that used to be stable since the dawn of time will change numbers.
   - A few other DT tweaks for Tegra.
   - A fix for multi_v7_defconfig that makes it stop spewing cpufreq
     errors on Arndale (Exynos)"

* tag 'armsoc-for-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: multi_v7_defconfig: fix failure setting CPU voltage by enabling dependent I2C controller
  ARM: tegra: roth: Fix SD card VDD_IO regulator
  ARM: tegra: Remove eMMC vmmc property for roth/tn7
  ARM: dts: tegra: move serial aliases to per-board
  ARM: tegra: Add serial port labels to Tegra124 DT
  ARM: shmobile: kzm9g legacy: Set i2c clks_per_count to 2
  ARM: shmobile: r8a7740 dtsi: Correct IIC0 parent clock
  ARM: shmobile: r8a7790: Fix SD3CKCR address to device tree
  ARM: shmobile: r8a7740 legacy: Correct IIC0 parent clock
  ARM: shmobile: r8a7740 legacy: Add missing INTCA clock for irqpin module
  ARM: shmobile: r8a7790: Fix SD3CKCR address
  ARM: dts: sun6i: Re-parent ahb1_mux to pll6 as required by dma controller
2014-11-23 11:46:01 -08:00
Linus Torvalds 9f2e0f6370 Merge branch 'for-3.18-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
Pull percpu fix from Tejun Heo:
 "This contains one patch to fix a race condition which can lead to
  percpu_ref using a percpu pointer which is corrupted with a set DEAD
  bit.  The bug was introduced while separating out the ATOMIC mode flag
  from the DEAD flag.  The fix is pretty straight forward.

  I just committed the patch to the percpu tree but am sending out the
  pull request early as I'll be on vacation for a week.  The patch
  should be fairly safe and while the latency will be higher I'll be
  checking emails"

* 'for-3.18-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
  percpu-ref: fix DEAD flag contamination of percpu pointer
2014-11-23 11:33:49 -08:00
Linus Torvalds d038a63ace Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs deadlock fix from Chris Mason:
 "This has a fix for a long standing deadlock that we've been trying to
  nail down for a while.  It ended up being a bad interaction with the
  fair reader/writer locks and the order btrfs reacquires locks in the
  btree"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  btrfs: fix lockups from btrfs_clear_path_blocking
2014-11-23 11:16:36 -08:00
Tejun Heo 4aab3b5b3c percpu-ref: fix DEAD flag contamination of percpu pointer
While decoupling ATOMIC and DEAD flags, f47ad45784 ("percpu_ref:
decouple switching to percpu mode and reinit") updated
__ref_is_percpu() so that it only tests ATOMIC flag to determine
whether the ref is in percpu mode or not; however, while DEAD implies
ATOMIC, the two flags are set separately during percpu_ref_kill() and
if __ref_is_percpu() races percpu_ref_kill(), it may see DEAD w/o
ATOMIC.  Because __ref_is_percpu() returns @ref->percpu_count_ptr
value verbatim as the percpu pointer after testing ATOMIC, the pointer
may now be contaminated with the DEAD flag.

This can be fixed by clearing the flag bits before returning the
pointer which was the fix proposed by Shaohua; however, as DEAD
implies ATOMIC, we can just test for both flags at once and avoid the
explicit masking.

Update __ref_is_percpu() so that it tests that both ATOMIC and DEAD
are clear before returning @ref->percpu_count_ptr as the percpu
pointer.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-and-Reviewed-by: Shaohua Li <shli@kernel.org>
Link: http://lkml.kernel.org/r/995deb699f5b873c45d667df4add3b06f73c2c25.1416638887.git.shli@kernel.org
Fixes: f47ad45784 ("percpu_ref: decouple switching to percpu mode and reinit")
2014-11-23 12:36:06 -05:00
Linus Torvalds cb95413971 Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fix from Thomas Gleixner:
 "A single bugfix for an init order problem in the sun4i subarch
  clockevents code"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  clockevent: sun4i: Fix race condition in the probe code
2014-11-22 14:33:11 -08:00
Linus Torvalds ecde00642c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fixes from Al Viro:
 "Assorted fixes, most in overlayfs land"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  ovl: ovl_dir_fsync() cleanup
  ovl: update MAINTAINERS
  ovl: pass dentry into ovl_dir_read_merged()
  ovl: use lockless_dereference() for upperdentry
  ovl: allow filenames with comma
  ovl: fix race in private xattr checks
  ovl: fix remove/copy-up race
  ovl: rename filesystem type to "overlay"
  isofs: avoid unused function warning
  vfs: fix reference leak in d_prune_aliases()
2014-11-22 14:15:27 -08:00