SRCU uses per-CPU variables, and DEFINE_STATIC_SRCU() uses a static
per-CPU variable. However, per-CPU variables have significant
restrictions, for example, names of per-CPU variables must be globally
unique, even if declared static. These restrictions carry over to
DEFINE_STATIC_SRCU(), and this commit therefore documents these
restrictions.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Suggested-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Tejun Heo <tj@kernel.org>
Currently, __srcu_read_lock() cannot be invoked from restricted
environments because it contains calls to preempt_disable() and
preempt_enable(), both of which can invoke lockdep, which is a bad
idea in some restricted execution modes. This commit therefore moves
the preempt_disable() and preempt_enable() from __srcu_read_lock()
to srcu_read_lock(). It also inserts the preempt_disable() and
preempt_enable() around the call to __srcu_read_lock() in do_exit().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
The rcu_dereference_check() family of primitives evaluates the RCU
lockdep expression first, and only then evaluates the expression passed
in. This works fine normally, but can potentially fail in environments
(such as NMI handlers) where lockdep cannot be invoked. The problem is
that even if the expression passed in is "1", the compiler would need to
prove that the RCU lockdep expression (rcu_read_lock_held(), for example)
is free of side effects in order to be able to elide it. Given that
rcu_read_lock_held() is sometimes separately compiled, the compiler cannot
always use this optimization.
This commit therefore reverse the order of evaluation, so that the
expression passed in is evaluated first, and the RCU lockdep expression is
evaluated only if the passed-in expression evaluated to false, courtesy
of the C-language short-circuit boolean evaluation rules. This compells
the compiler to forego executing the RCU lockdep expression in cases
where the passed-in expression evaluates to "1" at compile time, so that
(for example) rcu_dereference_raw() can be guaranteed to execute safely
within an NMI handler.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
The DEFINE_SRCU() and DEFINE_STATIC_SRCU() definitions are quite
similar, so this commit combines them, saving a bit of code and removing
redundancy.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
When rcutorture used only the low-order 32 bits of the grace-period
number, it was not a problem for SRCU to use a 32-bit completed field.
However, rcutorture now uses the full 64 bits on 64-bit systems, so
this commit converts SRCU's ->completed field to unsigned long so as to
provide 64 bits on 64-bit systems.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
All of the RCU source files have the usual GPL header, which contains a
long-obsolete postal address for FSF. To avoid the need to track the
FSF office's movements, this commit substitutes the URL where GPL may
be found.
Reported-by: Greg KH <gregkh@linuxfoundation.org>
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
srcu read lock/unlock include a full memory barrier
but that's an implementation detail.
Add an API for make memory fencing explicit for
users that need this barrier, to make sure we
can change it as needed without breaking all users.
Acked-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
These interfaces never did get used, so this commit removes them,
their rcutorture tests, and documentation referencing them.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
SRCU has its own statemachine and no longer relies on normal RCU.
Its read-side critical section can now be used by an offline CPU, so this
commit removes the check and the comments, reverting the SRCU portion
of ff195cb6 (rcu: Warn when srcu_read_lock() is used in an extended
quiescent state).
It also makes the codes match the comments in whatisRCU.txt:
g. Do you need read-side critical sections that are respected
even though they are in the middle of the idle loop, during
user-mode execution, or on an offlined CPU? If so, SRCU is the
only choice that will work for you.
[ paulmck: There is at least one remaining issue, namely use of lockdep
with tracing enabled. ]
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
SRCU has its own statemachine and no longer relies on normal RCU.
Its read-side critical section can now be used by an offline CPU, so this
commit removes the check and the comments, reverting the SRCU portion
of c0d6d01b (rcu: Check for illegal use of RCU from offlined CPUs).
It also makes the code match the comments in whatisRCU.txt:
g. Do you need read-side critical sections that are respected
even though they are in the middle of the idle loop, during
user-mode execution, or on an offlined CPU? If so, SRCU is the
only choice that will work for you.
[ paulmck: There is at least one remaining issue, namely use of lockdep
with tracing enabled. ]
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
In old days, we had two different API sets for dynamic-allocated per-CPU
data and DEFINE_PER_CPU()-defined per_cpu data, and because SRCU used
dynamic-allocated per-CPU data, its srcu_struct structures cannot be
declared statically. This commit therefore introduces DEFINE_SRCU()
and DEFINE_STATIC_SRCU() to allow statically declared SRCU structures,
using the new static per-CPU interfaces.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: Updated for __DELAYED_WORK_INITIALIZER() added argument,
fixed whitespace issue. ]
Because process_srcu() will be used in DEFINE_SRCU(), which is a macro
that could be expanded pretty much anywhere, it can no longer be static.
Note that process_srcu() is still internal to srcu.h.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Lai Jiangshan rewrote SRCU, so this commit ensures that he gets his
proper share of blame^Wcredit.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
This commit implements an SRCU state machine in support of call_srcu().
The state machine is preemptible, light-weight, and single-threaded,
minimizing synchronization overhead. In particular, there is no longer
any need for synchronize_srcu() to be guarded by a mutex.
Expedited processing is handled, at least in the absence of concurrent
grace-period operations on that same srcu_struct structure, by having
the synchronize_srcu_expedited() thread take on the role of the
workqueue thread for one iteration.
There is a reasonable probability that a given SRCU callback will
be invoked on the same CPU that registered it, however, there is no
guarantee. Concurrent SRCU grace-period primitives can cause callbacks
to be executed elsewhere, even in absence of CPU-hotplug operations.
Callbacks execute in process context, but under the influence of
local_bh_disable(), so it is illegal to sleep in an SRCU callback
function.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The old srcu_barrier() macro is now unused. This commit removes it so
that it may be used for the SRCU flavor of rcu_barrier(), which will in
turn be needed to allow the upcoming call_srcu() to be used from within
modules.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
This commit implements a variant of Peter's algorithm, which may be found
at https://lkml.org/lkml/2012/2/1/119.
o Make the checking lock-free to enable parallel checking.
Parallel checking is required when (1) the original checking
task is preempted for a long time, (2) sychronize_srcu_expedited()
starts during an ongoing SRCU grace period, or (3) we wish to
avoid acquiring a lock.
o Since the checking is lock-free, we avoid a mutex in state machine
for call_srcu().
o Remove the SRCU_REF_MASK and remove the coupling with the flipping.
This might allow us to remove the preempt_disable() in future
versions, though such removal will need great care because it
rescinds the one-old-reader-per-CPU guarantee.
o Remove a smp_mb(), simplify the comments and make the smp_mb() pairs
more intuitive.
Inspired-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The purpose of the upper bit of SRCU's per-CPU counters is to guarantee
that no reasonable series of srcu_read_lock() and srcu_read_unlock()
operations can return the value of the counter to its original value.
This guarantee is require only after the index has been switched to
the other set of counters, so at most one srcu_read_lock() can affect
a given CPU's counter. The number of srcu_read_unlock() operations
on a given counter is limited to the number of tasks in the system,
which given the Linux kernel's current structure is limited to far less
than 2^30 on 32-bit systems and far less than 2^62 on 64-bit systems.
(Something about a limited number of bytes in the kernel's address space.)
Therefore, if srcu_read_lock() increments the upper bits, then
srcu_read_unlock() need not do so. In this case, an srcu_read_lock() and
an srcu_read_unlock() will flip the lower bit of the upper field of the
counter. An unreasonably large additional number of srcu_read_unlock()
operations would be required to return the counter to its initial value,
thus preserving the guarantee.
This commit takes this approach, which further allows it to shrink
the size of the upper field to one bit, making the number of
srcu_read_unlock() operations required to return the counter to its
initial value even more unreasonable than before.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The current implementation of synchronize_srcu_expedited() can cause
severe OS jitter due to its use of synchronize_sched(), which in turn
invokes try_stop_cpus(), which causes each CPU to be sent an IPI.
This can result in severe performance degradation for real-time workloads
and especially for short-interation-length HPC workloads. Furthermore,
because only one instance of try_stop_cpus() can be making forward progress
at a given time, only one instance of synchronize_srcu_expedited() can
make forward progress at a time, even if they are all operating on
distinct srcu_struct structures.
This commit, inspired by an earlier implementation by Peter Zijlstra
(https://lkml.org/lkml/2012/1/31/211) and by further offline discussions,
takes a strictly algorithmic bits-in-memory approach. This has the
disadvantage of requiring one explicit memory-barrier instruction in
each of srcu_read_lock() and srcu_read_unlock(), but on the other hand
completely dispenses with OS jitter and furthermore allows SRCU to be
used freely by CPUs that RCU believes to be idle or offline.
The update-side implementation handles the single read-side memory
barrier by rechecking the per-CPU counters after summing them and
by running through the update-side state machine twice.
This implementation has passed moderate rcutorture testing on both
x86 and Power. Also updated to use this_cpu_ptr() instead of per_cpu_ptr(),
as suggested by Peter Zijlstra.
Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
The WARN_ON_ONCE() in rcu_lock_acquire() results in infinite recursion
on S390, and also doesn't print very much information. Remove this.
Updated patch to add lockdep-RCU assertions to RCU's read-side primitives.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Although it is legal to use RCU during early boot, it is anything
but legal to use RCU at runtime from an offlined CPU. After all, RCU
explicitly ignores offlined CPUs. This commit therefore adds checks
for runtime use of RCU from offlined CPUs.
These checks are not perfect, in particular, they can be subverted
through use of things like rcu_dereference_raw(). Note that it is not
possible to put checks in rcu_read_lock() and friends due to the fact
that these primitives are used in code that might be used under either
RCU or lock-based protection, which means that checking rcu_read_lock()
gets you fat piles of false positives.
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The intent is that a given RCU read-side critical section be confined
to a single context. For example, it is illegal to invoke rcu_read_lock()
in an exception handler and then invoke rcu_read_unlock() from the
context of the task that received the exception.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The RCU implementations, including SRCU, are designed to be used in a
lock-like fashion, so that the read-side lock and unlock primitives must
execute in the same context for any given read-side critical section.
This constraint is enforced by lockdep-RCU. However, there is a need
to enter an SRCU read-side critical section within the context of an
exception and then exit in the context of the task that encountered the
exception. The cost of this capability is that the read-side operations
incur the overhead of disabling interrupts.
Note that although the current implementation allows a given read-side
critical section to be entered by one task and then exited by another, all
known possible implementations that allow this have scalability problems.
Therefore, a given read-side critical section must be exited by the same
task that entered it, though perhaps from an interrupt or exception
handler running within that task's context. But if you are thinking
in terms of interrupt handlers, make sure that you have considered the
possibility of threaded interrupt handlers.
Credit goes to Peter Zijlstra for suggesting use of the existing _raw
suffix to indicate disabling lockdep over the earlier "bulkref" names.
Requested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
A common debug_lockdep_rcu_enabled() function is used to check whether
RCU lockdep splats should be reported, but srcu_read_lock() does not
use it. This commit therefore brings srcu_read_lock_held() up to date.
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Catch SRCU up to the other variants of RCU by making PROVE_RCU
complain if either srcu_read_lock() or srcu_read_lock_held() are
used from within RCU-idle mode.
Frederic reworked this to allow for the new versions of his patches
that check for extended quiescent states.
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
It is illegal to wait for an SRCU grace period while within the
corresponding flavor of SRCU read-side critical section. Therefore,
this commit updates the srcu_read_lock() docbook accordingly.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>