Commit Graph

205 Commits

Author SHA1 Message Date
Paul E. McKenney
78e691f4ae Merge branches 'doc.2015.01.07a', 'fixes.2015.01.15a', 'preempt.2015.01.06a', 'srcu.2015.01.06a', 'stall.2015.01.16a' and 'torture.2015.01.11a' into HEAD
doc.2015.01.07a: Documentation updates.
fixes.2015.01.15a: Miscellaneous fixes.
preempt.2015.01.06a: Changes to handling of lists of preempted tasks.
srcu.2015.01.06a: SRCU updates.
stall.2015.01.16a: RCU CPU stall-warning updates and fixes.
torture.2015.01.11a: RCU torture-test updates and fixes.
2015-01-15 23:34:34 -08:00
Paul E. McKenney
5cd37193ce rcu: Make cond_resched_rcu_qs() apply to normal RCU flavors
Although cond_resched_rcu_qs() only applies to TASKS_RCU, it is used
in places where it would be useful for it to apply to the normal RCU
flavors, rcu_preempt, rcu_sched, and rcu_bh.  This is especially the
case for workloads that aggressively overload the system, particularly
those that generate large numbers of RCU updates on systems running
NO_HZ_FULL CPUs.  This commit therefore communicates quiescent states
from cond_resched_rcu_qs() to the normal RCU flavors.

Note that it is unfortunately necessary to leave the old ->passed_quiesce
mechanism in place to allow quiescent states that apply to only one
flavor to be recorded.  (Yes, we could decrement ->rcu_qs_ctr_snap in
that case, but that is not so good for debugging of RCU internals.)
In addition, if one of the RCU flavor's grace period has stalled, this
will invoke rcu_momentary_dyntick_idle(), resulting in a heavy-weight
quiescent state visible from other CPUs.

Reported-by: Sasha Levin <sasha.levin@oracle.com>
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: Merge commit from Sasha Levin fixing a bug where __this_cpu()
  was used in preemptible code. ]
2015-01-15 23:33:14 -08:00
Pranith Kumar
ac59853c06 rcupdate: Replace smp_read_barrier_depends() with lockless_dereference()
Recently lockless_dereference() was added which can be used in place of
hard-coding smp_read_barrier_depends(). The following PATCH makes the change.

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-06 11:01:11 -08:00
Ingo Molnar
d360b78f99 Merge branch 'rcu/next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu
Pull RCU updates from Paul E. McKenney:

 - Streamline RCU's use of per-CPU variables, shifting from "cpu"
   arguments to functions to "this_"-style per-CPU variable accessors.

 - Signal-handling RCU updates.

 - Real-time updates.

 - Torture-test updates.

 - Miscellaneous fixes.

 - Documentation updates.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-11-20 08:57:58 +01:00
Paul E. McKenney
9ea6c58856 Merge branches 'torture.2014.11.03a', 'cpu.2014.11.03a', 'doc.2014.11.13a', 'fixes.2014.11.13a', 'signal.2014.10.29a' and 'rt.2014.10.29a' into HEAD
cpu.2014.11.03a: Changes for per-CPU variables.
doc.2014.11.13a: Documentation updates.
fixes.2014.11.13a: Miscellaneous fixes.
signal.2014.10.29a: Signal changes.
rt.2014.10.29a: Real-time changes.
torture.2014.11.03a: torture-test changes.
2014-11-13 10:39:04 -08:00
Oleg Nesterov
ce36f2f3eb rcu: More info about potential deadlocks with rcu_read_unlock()
The comment above rcu_read_unlock() explains the potential deadlock
if the caller holds one of the locks taken by rt_mutex_unlock() paths,
but it is not clear from this documentation that any lock which can
be taken from interrupt can lead to deadlock as well and we need to
take rt_mutex_lock() into account too.

The problem is that rt_mutex_lock() takes wait_lock without disabling
irqs, and thus an interrupt taking some LOCK can obviously race with
rcu_read_unlock_special() called with the same LOCK held.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-11-13 10:35:40 -08:00
Paul E. McKenney
b6331ae8af rcu: Optimize cond_resched_rcu_qs()
The current implementation of cond_resched_rcu_qs() can invoke
rcu_note_voluntary_context_switch() twice in the should_resched()
case, once via the call to __schedule() and once directly.  However, as
noted by Joe Lawrence in a patch to the team subsystem, cond_resched()
returns an indication as to whether or not the call to __schedule()
actually happened.  This commit therefore changes cond_resched_rcu_qs()
so as to invoke rcu_note_voluntary_context_switch() only when __schedule()
was not called.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-11-13 10:35:39 -08:00
Pranith Kumar
1a6c9b2675 rcu: Add sparse check for RCU_INIT_POINTER()
Add a sparse check when RCU_INIT_POINTER() is used to assign a non __rcu
annotated pointer.

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-11-13 10:35:38 -08:00
Paul E. McKenney
aa6da5140b rcu: Remove "cpu" argument to rcu_needs_cpu()
The "cpu" argument to rcu_needs_cpu() is always the current CPU, so drop
it.  This in turn allows the "cpu" argument to rcu_cpu_has_callbacks()
to be removed, which allows the uses of "cpu" in both functions to be
replaced with a this_cpu_ptr().  Again, the anticipated cross-CPU uses
of these functions has been replaced by NO_HZ_FULL.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
2014-11-03 19:20:43 -08:00
Paul E. McKenney
c3377c2da6 rcu: Remove "cpu" argument to rcu_check_callbacks()
The "cpu" argument was kept around on the off-chance that RCU might
offload scheduler-clock interrupts.  However, this offload approach
has been replaced by NO_HZ_FULL, which offloads -all- RCU processing
from qualifying CPUs.  It is therefore time to remove the "cpu" argument
to rcu_check_callbacks(), which this commit does.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
2014-11-03 19:20:11 -08:00
Pranith Kumar
28f6569ab7 rcu: Remove redundant TREE_PREEMPT_RCU config option
PREEMPT_RCU and TREE_PREEMPT_RCU serve the same function after
TINY_PREEMPT_RCU has been removed. This patch removes TREE_PREEMPT_RCU
and uses PREEMPT_RCU config option in its place.

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-10-29 10:20:05 -07:00
Paul E. McKenney
54ef6df3f3 rcu: Provide counterpart to rcu_dereference() for non-RCU situations
Although rcu_dereference() and friends can be used in situations where
object lifetimes are being managed by something other than RCU, the
resulting sparse and lockdep-RCU noise can be annoying.  This commit
therefore supplies a lockless_dereference(), which provides the
protection for dereferences without the RCU-related debugging noise.

Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-10-28 18:25:17 -04:00
Paul E. McKenney
59da22a020 rcutorture: Rename rcutorture_runnable parameter
This commit changes rcutorture_runnable to torture_runnable, which is
consistent with the names of the other parameters and is a bit shorter
as well.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-09-16 13:41:44 -07:00
Paul E. McKenney
96b4672703 Merge branch 'rcu-tasks.2014.09.10a' into HEAD
rcu-tasks.2014.09.10a: Add RCU-tasks flavor of RCU.
2014-09-16 10:10:44 -07:00
Paul E. McKenney
e98d06dd6c Merge branches 'doc.2014.09.07a', 'fixes.2014.09.10a', 'nocb-nohz.2014.09.16b' and 'torture.2014.09.07a' into HEAD
doc.2014.09.07a: Documentation updates.
fixes.2014.09.10a: Miscellaneous fixes.
nocb-nohz.2014.09.16b: No-CBs CPUs and NO_HZ_FULL updates.
torture.2014.09.07a: Torture-test updates.
2014-09-16 10:08:34 -07:00
Paul E. McKenney
f4579fc57c rcu: Fix attempt to avoid unsolicited offloading of callbacks
Commit b58cc46c5f (rcu: Don't offload callbacks unless specifically
requested) failed to adjust the callback lists of the CPUs that are
known to be no-CBs CPUs only because they are also nohz_full= CPUs.
This failure can result in callbacks that are posted during early boot
getting stranded on nxtlist for CPUs whose no-CBs property becomes
apparent late, and there can also be spurious warnings about offline
CPUs posting callbacks.

This commit fixes these problems by adding an early-boot rcu_init_nohz()
that properly initializes the no-CBs CPUs.

Note that kernels built with CONFIG_RCU_NOCB_CPU_ALL=y or with
CONFIG_RCU_NOCB_CPU=n do not exhibit this bug.  Neither do kernels
booted without the nohz_full= boot parameter.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
Tested-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2014-09-16 10:07:59 -07:00
Paul E. McKenney
284a8c93af rcu: Per-CPU operation cleanups to rcu_*_qs() functions
The rcu_bh_qs(), rcu_preempt_qs(), and rcu_sched_qs() functions use
old-style per-CPU variable access and write to ->passed_quiesce even
if it is already set.  This commit therefore updates to use the new-style
per-CPU variable access functions and avoids the spurious writes.
This commit also eliminates the "cpu" argument to these functions because
they are always invoked on the indicated CPU.

Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-09-07 16:27:35 -07:00
Paul E. McKenney
01a8133034 rcu: Remove redundant preempt_disable() from rcu_note_voluntary_context_switch()
In theory, synchronize_sched() requires a read-side critical section
to order against.  In practice, preemption can be thought of as
being disabled across every machine instruction, at least for those
machine instructions that are not in the idle loop and not on offline
CPUs.  So this commit removes the redundant preempt_disable() from
rcu_note_voluntary_context_switch().

Please note that the single instruction in question is the store of
zero to ->rcu_tasks_holdout.  The "if" is simply a performance optimization
that avoids unnecessary stores.  To see this, keep in mind that both
the "if" condition and the store are in a quiescent state.  Therefore,
even if the task is preempted for a full grace period (presumably due
to its having done a context switch beforehand), the store will be
recording a legitimate quiescent state.

Reported-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

Conflicts:
	include/linux/rcupdate.h
2014-09-07 16:27:32 -07:00
Paul E. McKenney
69c604557c rcutorture: Add torture tests for RCU-tasks
This commit adds torture tests for RCU-tasks.  It also fixes a bug that
would segfault for an RCU flavor lacking a callback-barrier function.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2014-09-07 16:27:24 -07:00
Paul E. McKenney
3f95aa81d2 rcu: Make TASKS_RCU handle tasks that are almost done exiting
Once a task has passed exit_notify() in the do_exit() code path, it
is no longer on the task lists, and is therefore no longer visible
to rcu_tasks_kthread().  This means that an almost-exited task might
be preempted while within a trampoline, and this task won't be waited
on by rcu_tasks_kthread().  This commit fixes this bug by adding an
srcu_struct.  An exiting task does srcu_read_lock() just before calling
exit_notify(), and does the corresponding srcu_read_unlock() after
doing the final preempt_disable().  This means that rcu_tasks_kthread()
can do synchronize_srcu() to wait for all mostly-exited tasks to reach
their final preempt_disable() region, and then use synchronize_sched()
to wait for those tasks to finish exiting.

Reported-by: Oleg Nesterov <oleg@redhat.com>
Suggested-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-09-07 16:27:22 -07:00
Paul E. McKenney
53c6d4edf8 rcu: Add synchronous grace-period waiting for RCU-tasks
It turns out to be easier to add the synchronous grace-period waiting
functions to RCU-tasks than to work around their absense in rcutorture,
so this commit adds them.  The key point is that the existence of
call_rcu_tasks() means that rcutorture needs an rcu_barrier_tasks().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-09-07 16:27:21 -07:00
Paul E. McKenney
bde6c3aa99 rcu: Provide cond_resched_rcu_qs() to force quiescent states in long loops
RCU-tasks requires the occasional voluntary context switch
from CPU-bound in-kernel tasks.  In some cases, this requires
instrumenting cond_resched().  However, there is some reluctance
to countenance unconditionally instrumenting cond_resched() (see
http://lwn.net/Articles/603252/), so this commit creates a separate
cond_resched_rcu_qs() that may be used in place of cond_resched() in
locations prone to long-duration in-kernel looping.

This commit currently instruments only RCU-tasks.  Future possibilities
include also instrumenting RCU, RCU-bh, and RCU-sched in order to reduce
IPI usage.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-09-07 16:27:20 -07:00
Paul E. McKenney
8315f42295 rcu: Add call_rcu_tasks()
This commit adds a new RCU-tasks flavor of RCU, which provides
call_rcu_tasks().  This RCU flavor's quiescent states are voluntary
context switch (not preemption!) and userspace execution (not the idle
loop -- use some sort of schedule_on_each_cpu() if you need to handle the
idle tasks.  Note that unlike other RCU flavors, these quiescent states
occur in tasks, not necessarily CPUs.  Includes fixes from Steven Rostedt.

This RCU flavor is assumed to have very infrequent latency-tolerant
updaters.  This assumption permits significant simplifications, including
a single global callback list protected by a single global lock, along
with a single task-private linked list containing all tasks that have not
yet passed through a quiescent state.  If experience shows this assumption
to be incorrect, the required additional complexity will be added.

Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-09-07 16:27:19 -07:00
Oleg Nesterov
85b39d305b rcu: Uninline rcu_read_lock_held()
This commit uninlines rcu_read_lock_held(). According to "size vmlinux"
this saves 28549 in .text:

	- 5541731 3014560 14757888 23314179
	+ 5513182 3026848 14757888 23297918

Note: it looks as if the data grows by 12288 bytes but this is not true,
it does not actually grow. But .data starts with ALIGN(THREAD_SIZE) and
since .text shrinks the padding grows, and thus .data grows too as it
seen by /bin/size. diff System.map:

	- ffffffff81510000 D _sdata
	- ffffffff81510000 D init_thread_union
	+ ffffffff81509000 D _sdata
	+ ffffffff8150c000 D init_thread_union

Perhaps we can change vmlinux.lds.S to .data itself, so that /bin/size
can't "wrongly" report that .data grows if .text shinks.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-09-07 16:18:13 -07:00
Pranith Kumar
521d24ee59 rcu: Return bool type in rcu_lockdep_current_cpu_online()
Return true instead of 1 in rcu_lockdep_current_cpu_online() as this
has bool as return type.

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-09-07 16:18:11 -07:00