Commit Graph

66 Commits

Author SHA1 Message Date
Paul E. McKenney
dd56af42bd rcu: Eliminate deadlock between CPU hotplug and expedited grace periods
Currently, the expedited grace-period primitives do get_online_cpus().
This greatly simplifies their implementation, but means that calls
to them holding locks that are acquired by CPU-hotplug notifiers (to
say nothing of calls to these primitives from CPU-hotplug notifiers)
can deadlock.  But this is starting to become inconvenient, as can be
seen here: https://lkml.org/lkml/2014/8/5/754.  The problem in this
case is that some developers need to acquire a mutex from a CPU-hotplug
notifier, but also need to hold it across a synchronize_rcu_expedited().
As noted above, this currently results in deadlock.

This commit avoids the deadlock and retains the simplicity by creating
a try_get_online_cpus(), which returns false if the get_online_cpus()
reference count could not immediately be incremented.  If a call to
try_get_online_cpus() returns true, the expedited primitives operate as
before.  If a call returns false, the expedited primitives fall back to
normal grace-period operations.  This falling back of course results in
increased grace-period latency, but only during times when CPU hotplug
operations are actually in flight.  The effect should therefore be
negligible during normal operation.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Tested-by: Lan Tianyu <tianyu.lan@intel.com>
2014-09-18 16:22:27 -07:00
Paul E. McKenney
96b4672703 Merge branch 'rcu-tasks.2014.09.10a' into HEAD
rcu-tasks.2014.09.10a: Add RCU-tasks flavor of RCU.
2014-09-16 10:10:44 -07:00
Paul E. McKenney
e98d06dd6c Merge branches 'doc.2014.09.07a', 'fixes.2014.09.10a', 'nocb-nohz.2014.09.16b' and 'torture.2014.09.07a' into HEAD
doc.2014.09.07a: Documentation updates.
fixes.2014.09.10a: Miscellaneous fixes.
nocb-nohz.2014.09.16b: No-CBs CPUs and NO_HZ_FULL updates.
torture.2014.09.07a: Torture-test updates.
2014-09-16 10:08:34 -07:00
Paul E. McKenney
c847f14217 rcu: Avoid misordering in nocb_leader_wait()
The NOCB follower wakeup ordering depends on the store to the tail
pointer happening before the wakeup.  However, because atomic_long_add()
does not return a value, it does not provide ordering guarantees, and
the locking in wake_up() only guarantees that the store will happen
before the unlock, which might be too late.  Even though this is only a
theoretical issue, this commit adds a smp_mb__after_atomic() after the
final atomic_long_add() to provide the needed ordering guarantee.

Reported-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2014-09-16 10:08:03 -07:00
Paul E. McKenney
1772947bd0 rcu: Handle NOCB callbacks from irq-disabled idle code
If an RCU callback is queued on a no-CBs CPU from idle code with irqs
disabled, and if that CPU stays idle forever after, the callback will
never be invoked.  This commit therefore adds a check for this situation
in ____call_rcu_nocb(), invoking the RCU core solely for the purpose
of the ensuing return-to-idle transition.  (If the CPU doesn't return
to idle, the next scheduling-clock interrupt will fix things up.)

Reported-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2014-09-16 10:08:03 -07:00
Paul E. McKenney
39953dfd40 rcu: Avoid misordering in __call_rcu_nocb_enqueue()
The NOCB leader wakeup ordering depends on the store to the header
happening before the check for the leader already being awake.  However,
because atomic_long_add() does not return a value, it does not provide
ordering guarantees, the incorrect comment in wake_nocb_leader()
notwithstanding.  This commit therefore adds a smp_mb__after_atomic()
after the final atomic_long_add() to provide the needed ordering
guarantee.

Reported-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2014-09-16 10:08:03 -07:00
Paul E. McKenney
663e131090 rcu: Don't track sysidle state if no nohz_full= CPUs
If there are no nohz_full= CPUs, then there is currently no reason to
track sysidle state.  This commit therefore short-circuits this state
tracking if !tick_nohz_full_enabled().

Note that these checks will need to be revisited if nohz_full= state
can ever be changed at runtime.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Tested-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2014-09-16 10:08:02 -07:00
Paul E. McKenney
417e8d2655 rcu: Eliminate redundant rcu_sysidle_state variable
Now that we have rcu_state_p, which references rcu_preempt_state for
TREE_PREEMPT_RCU and rcu_sched_state for TREE_RCU, we don't need a
separate rcu_sysidle_state variable.  This commit therefore eliminates
rcu_preempt_state in favor of rcu_state_p.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Tested-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2014-09-16 10:08:02 -07:00
Pranith Kumar
22c2f66961 rcu: Check for have_rcu_nocb_mask instead of rcu_nocb_mask
If we configure a kernel with CONFIG_NOCB_CPU=y, CONFIG_RCU_NOCB_CPU_NONE=y and
CONFIG_CPUMASK_OFFSTACK=n and do not pass in a rcu_nocb= boot parameter, the
cpumask rcu_nocb_mask can be garbage instead of NULL.

Hence this commit replaces checks for rcu_nocb_mask == NULL with a check for
have_rcu_nocb_mask.

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2014-09-16 10:08:02 -07:00
Paul E. McKenney
35ce7f29a4 rcu: Create rcuo kthreads only for onlined CPUs
RCU currently uses for_each_possible_cpu() to spawn rcuo kthreads,
which can result in more rcuo kthreads than one would expect, for
example, derRichard reported 64 CPUs worth of rcuo kthreads on an
8-CPU image.  This commit therefore creates rcuo kthreads only for
those CPUs that actually come online.

This was reported by derRichard on the OFTC IRC network.

Reported-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Tested-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2014-09-16 10:08:02 -07:00
Paul E. McKenney
9386c0b75d rcu: Rationalize kthread spawning
Currently, RCU spawns kthreads from several different early_initcall()
functions.  Although this has served RCU well for quite some time,
as more kthreads are added a more deterministic approach is required.
This commit therefore causes all of RCU's early-boot kthreads to be
spawned from a single early_initcall() function.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Tested-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2014-09-16 10:08:01 -07:00
Pranith Kumar
f4aa84ba24 rcu: Return false instead of 0 in rcu_nocb_adopt_orphan_cbs()
Return false instead of 0 in rcu_nocb_adopt_orphan_cbs() as this has
bool as return type.

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2014-09-16 10:08:01 -07:00
Pranith Kumar
4afc7e269b rcu: Use false for return in __call_rcu_nocb()
Return false instead of 0 in __call_rcu_nocb() as this has bool as
return type.

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2014-09-16 10:08:01 -07:00
Pranith Kumar
0a9e1e111b rcu: Use true/false for return in rcu_nocb_adopt_orphan_cbs()
Return true/false in rcu_nocb_adopt_orphan_cbs() instead of 0/1 as
this function has return type of bool.

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2014-09-16 10:08:00 -07:00
Pranith Kumar
c271d3a957 rcu: Use true/false for return in __call_rcu_nocb()
Return true/false instead of 0/1 in __call_rcu_nocb() as this returns a
bool type.

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2014-09-16 10:08:00 -07:00
Pranith Kumar
949cccdbe6 rcu: Check the return value of zalloc_cpumask_var()
This commit checks the return value of the zalloc_cpumask_var() used for
allocating cpumask for rcu_nocb_mask.

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2014-09-16 10:08:00 -07:00
Paul E. McKenney
f4579fc57c rcu: Fix attempt to avoid unsolicited offloading of callbacks
Commit b58cc46c5f (rcu: Don't offload callbacks unless specifically
requested) failed to adjust the callback lists of the CPUs that are
known to be no-CBs CPUs only because they are also nohz_full= CPUs.
This failure can result in callbacks that are posted during early boot
getting stranded on nxtlist for CPUs whose no-CBs property becomes
apparent late, and there can also be spurious warnings about offline
CPUs posting callbacks.

This commit fixes these problems by adding an early-boot rcu_init_nohz()
that properly initializes the no-CBs CPUs.

Note that kernels built with CONFIG_RCU_NOCB_CPU_ALL=y or with
CONFIG_RCU_NOCB_CPU=n do not exhibit this bug.  Neither do kernels
booted without the nohz_full= boot parameter.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
Tested-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2014-09-16 10:07:59 -07:00
Paul E. McKenney
284a8c93af rcu: Per-CPU operation cleanups to rcu_*_qs() functions
The rcu_bh_qs(), rcu_preempt_qs(), and rcu_sched_qs() functions use
old-style per-CPU variable access and write to ->passed_quiesce even
if it is already set.  This commit therefore updates to use the new-style
per-CPU variable access functions and avoids the spurious writes.
This commit also eliminates the "cpu" argument to these functions because
they are always invoked on the indicated CPU.

Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-09-07 16:27:35 -07:00
Paul E. McKenney
1d082fd061 rcu: Remove local_irq_disable() in rcu_preempt_note_context_switch()
The rcu_preempt_note_context_switch() function is on a scheduling fast
path, so it would be good to avoid disabling irqs.  The reason that irqs
are disabled is to synchronize process-level and irq-handler access to
the task_struct ->rcu_read_unlock_special bitmask.  This commit therefore
makes ->rcu_read_unlock_special instead be a union of bools with a short
allowing single-access checks in RCU's __rcu_read_unlock().  This results
in the process-level and irq-handler accesses being simple loads and
stores, so that irqs need no longer be disabled.  This commit therefore
removes the irq disabling from rcu_preempt_note_context_switch().

Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-09-07 16:27:34 -07:00
Paul E. McKenney
176f8f7a52 rcu: Make TASKS_RCU handle nohz_full= CPUs
Currently TASKS_RCU would ignore a CPU running a task in nohz_full=
usermode execution.  There would be neither a context switch nor a
scheduling-clock interrupt to tell TASKS_RCU that the task in question
had passed through a quiescent state.  The grace period would therefore
extend indefinitely.  This commit therefore makes RCU's dyntick-idle
subsystem record the task_struct structure of the task that is running
in dyntick-idle mode on each CPU.  The TASKS_RCU grace period can
then access this information and record a quiescent state on
behalf of any CPU running in dyntick-idle usermode.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-09-07 16:27:30 -07:00
Paul E. McKenney
bde6c3aa99 rcu: Provide cond_resched_rcu_qs() to force quiescent states in long loops
RCU-tasks requires the occasional voluntary context switch
from CPU-bound in-kernel tasks.  In some cases, this requires
instrumenting cond_resched().  However, there is some reluctance
to countenance unconditionally instrumenting cond_resched() (see
http://lwn.net/Articles/603252/), so this commit creates a separate
cond_resched_rcu_qs() that may be used in place of cond_resched() in
locations prone to long-duration in-kernel looping.

This commit currently instruments only RCU-tasks.  Future possibilities
include also instrumenting RCU, RCU-bh, and RCU-sched in order to reduce
IPI usage.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-09-07 16:27:20 -07:00
Paul E. McKenney
73a860cd58 rcu: Replace flush_signals() with WARN_ON(signal_pending())
Currently, when RCU awakens from a wait_event_interruptible() that
might have awakened prematurely, it does a flush_signals(). This is
done on the off-chance that someone figured out how to deliver a signal
to a kthread, which is supposed to be impossible.  Given that this
is supposed to be impossible, this commit changes the flush_signals()
calls into WARN_ON(signal_pending()).

Reported-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-09-07 16:18:20 -07:00
Paul E. McKenney
9fdd3bc900 rcu: Break more call_rcu() deadlock involving scheduler and perf
Commit 96d3fd0d31 (rcu: Break call_rcu() deadlock involving scheduler
and perf) covered the case where __call_rcu_nocb_enqueue() needs to wake
the rcuo kthread due to the queue being initially empty, but did not
do anything for the case where the queue was overflowing.  This commit
therefore also defers wakeup for the overflow case.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-09-07 16:18:17 -07:00
Pranith Kumar
d0bc90fd37 rcu: Return bool type for rcu_try_advance_all_cbs()
Return a bool type instead of 0 in rcu_try_advance_all_cbs().

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-09-07 16:18:10 -07:00
Pranith Kumar
bf33eb1aef rcu: Fix sparse warning about rcu_batches_completed_preempt() being non-static
fix sparse warning about rcu_batches_completed_preempt() being non-static by
marking it as static

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-09-07 16:18:08 -07:00