Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU updates from Ingo Molnar:
"Main changes:
- Torture-test changes, including refactoring of rcutorture and
introduction of a vestigial locktorture.
- Real-time latency fixes.
- Documentation updates.
- Miscellaneous fixes"
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (77 commits)
rcu: Provide grace-period piggybacking API
rcu: Ensure kernel/rcu/rcu.h can be sourced/used stand-alone
rcu: Fix sparse warning for rcu_expedited from kernel/ksysfs.c
notifier: Substitute rcu_access_pointer() for rcu_dereference_raw()
Documentation/memory-barriers.txt: Clarify release/acquire ordering
rcutorture: Save kvm.sh output to log
rcutorture: Add a lock_busted to test the test
rcutorture: Place kvm-test-1-run.sh output into res directory
rcutorture: Rename TREE_RCU-Kconfig.txt
locktorture: Add kvm-recheck.sh plug-in for locktorture
rcutorture: Gracefully handle NULL cleanup hooks
locktorture: Add vestigial locktorture configuration
rcutorture: Introduce "rcu" directory level underneath configs
rcutorture: Rename kvm-test-1-rcu.sh
rcutorture: Remove RCU dependencies from ver_functions.sh API
rcutorture: Create CFcommon file for common Kconfig parameters
rcutorture: Create config files for scripted test-the-test testing
rcutorture: Add an rcu_busted to test the test
locktorture: Add a lock-torture kernel module
rcutorture: Abstract kvm-recheck.sh
...
+125
-24
@@ -31,6 +31,14 @@ has lapsed, so this approach may be used in non-GPL software, if desired.
 (In contrast, implementation of RCU is permitted only in software licensed
 under either GPL or LGPL. Sorry!!!)
 
+In 1987, Rashid et al. described lazy TLB-flush [RichardRashid87a].
+At first glance, this has nothing to do with RCU, but nevertheless
+this paper helped inspire the update-side batching used in the later
+RCU implementation in DYNIX/ptx. In 1988, Barbara Liskov published
+a description of Argus that noted that use of out-of-date values can
+be tolerated in some situations. Thus, this paper provides some early
+theoretical justification for use of stale data.
+
 In 1990, Pugh [Pugh90] noted that explicitly tracking which threads
 were reading a given data structure permitted deferred free to operate
 in the presence of non-terminating threads. However, this explicit
@@ -41,11 +49,11 @@ providing a fine-grained locking design, however, it would be interesting
 to see how much of the performance advantage reported in 1990 remains
 today.
 
-At about this same time, Adams [Adams91] described ``chaotic relaxation'',
-where the normal barriers between successive iterations of convergent
-numerical algorithms are relaxed, so that iteration $n$ might use
-data from iteration $n-1$ or even $n-2$. This introduces error,
-which typically slows convergence and thus increases the number of
+At about this same time, Andrews [Andrews91textbook] described ``chaotic
+relaxation'', where the normal barriers between successive iterations
+of convergent numerical algorithms are relaxed, so that iteration $n$
+might use data from iteration $n-1$ or even $n-2$. This introduces
+error, which typically slows convergence and thus increases the number of
 iterations required. However, this increase is sometimes more than made
 up for by a reduction in the number of expensive barrier operations,
 which are otherwise required to synchronize the threads at the end
@@ -55,7 +63,8 @@ is thus inapplicable to most data structures in operating-system kernels.
 
 In 1992, Henry (now Alexia) Massalin completed a dissertation advising
 parallel programmers to defer processing when feasible to simplify
-synchronization. RCU makes extremely heavy use of this advice.
+synchronization [HMassalinPhD]. RCU makes extremely heavy use of
+this advice.
 
 In 1993, Jacobson [Jacobson93] verbally described what is perhaps the
 simplest deferred-free technique: simply waiting a fixed amount of time
@@ -90,27 +99,29 @@ mechanism, which is quite similar to RCU [Gamsa99]. These operating
 systems made pervasive use of RCU in place of "existence locks", which
 greatly simplifies locking hierarchies and helps avoid deadlocks.
 
-2001 saw the first RCU presentation involving Linux [McKenney01a]
-at OLS. The resulting abundance of RCU patches was presented the
-following year [McKenney02a], and use of RCU in dcache was first
-described that same year [Linder02a].
+The year 2000 saw an email exchange that would likely have
+led to yet another independent invention of something like RCU
+[RustyRussell2000a,RustyRussell2000b]. Instead, 2001 saw the first
+RCU presentation involving Linux [McKenney01a] at OLS. The resulting
+abundance of RCU patches was presented the following year [McKenney02a],
+and use of RCU in dcache was first described that same year [Linder02a].
 
 Also in 2002, Michael [Michael02b,Michael02a] presented "hazard-pointer"
 techniques that defer the destruction of data structures to simplify
 non-blocking synchronization (wait-free synchronization, lock-free
 synchronization, and obstruction-free synchronization are all examples of
-non-blocking synchronization). In particular, this technique eliminates
-locking, reduces contention, reduces memory latency for readers, and
-parallelizes pipeline stalls and memory latency for writers. However,
-these techniques still impose significant read-side overhead in the
-form of memory barriers. Researchers at Sun worked along similar lines
-in the same timeframe [HerlihyLM02]. These techniques can be thought
-of as inside-out reference counts, where the count is represented by the
-number of hazard pointers referencing a given data structure rather than
-the more conventional counter field within the data structure itself.
-The key advantage of inside-out reference counts is that they can be
-stored in immortal variables, thus allowing races between access and
-deletion to be avoided.
+non-blocking synchronization). The corresponding journal article appeared
+in 2004 [MagedMichael04a]. This technique eliminates locking, reduces
+contention, reduces memory latency for readers, and parallelizes pipeline
+stalls and memory latency for writers. However, these techniques still
+impose significant read-side overhead in the form of memory barriers.
+Researchers at Sun worked along similar lines in the same timeframe
+[HerlihyLM02]. These techniques can be thought of as inside-out reference
+counts, where the count is represented by the number of hazard pointers
+referencing a given data structure rather than the more conventional
+counter field within the data structure itself. The key advantage
+of inside-out reference counts is that they can be stored in immortal
+variables, thus allowing races between access and deletion to be avoided.
 
 By the same token, RCU can be thought of as a "bulk reference count",
 where some form of reference counter covers all reference by a given CPU
@@ -123,8 +134,10 @@ can be thought of in other terms as well.
 
 In 2003, the K42 group described how RCU could be used to create
 hot-pluggable implementations of operating-system functions [Appavoo03a].
-Later that year saw a paper describing an RCU implementation of System
-V IPC [Arcangeli03], and an introduction to RCU in Linux Journal
+Later that year saw a paper describing an RCU implementation
+of System V IPC [Arcangeli03] (following up on a suggestion by
+Hugh Dickins [Dickins02a] and an implementation by Mingming Cao
+[MingmingCao2002IPCRCU]), and an introduction to RCU in Linux Journal
 [McKenney03a].
 
 2004 has seen a Linux-Journal article on use of RCU in dcache
@@ -383,6 +396,21 @@ for Programming Languages and Operating Systems}"
 }
 }
 
+@phdthesis{HMassalinPhD
+,author="H. Massalin"
+,title="Synthesis: An Efficient Implementation of Fundamental Operating
+System Services"
+,school="Columbia University"
+,address="New York, NY"
+,year="1992"
+,annotation={
+Mondo optimizing compiler.
+Wait-free stuff.
+Good advice: defer work to avoid synchronization. See page 90
+(PDF page 106), Section 5.4, fourth bullet point.
+}
+}
+
 @unpublished{Jacobson93
 ,author="Van Jacobson"
 ,title="Avoid Read-Side Locking Via Delayed Free"
@@ -671,6 +699,20 @@ Orran Krieger and Rusty Russell and Dipankar Sarma and Maneesh Soni"
 [Viewed October 18, 2004]"
 }
 
+@conference{Michael02b
+,author="Maged M. Michael"
+,title="High Performance Dynamic Lock-Free Hash Tables and List-Based Sets"
+,Year="2002"
+,Month="August"
+,booktitle="{Proceedings of the 14\textsuperscript{th} Annual ACM
+Symposium on Parallel
+Algorithms and Architecture}"
+,pages="73-82"
+,annotation={
+Like the title says...
+}
+}
+
 @Conference{Linder02a
 ,Author="Hanna Linder and Dipankar Sarma and Maneesh Soni"
 ,Title="Scalability of the Directory Entry Cache"
@@ -727,6 +769,24 @@ Andrea Arcangeli and Andi Kleen and Orran Krieger and Rusty Russell"
 }
 }
 
+@conference{Michael02a
+,author="Maged M. Michael"
+,title="Safe Memory Reclamation for Dynamic Lock-Free Objects Using Atomic
+Reads and Writes"
+,Year="2002"
+,Month="August"
+,booktitle="{Proceedings of the 21\textsuperscript{st} Annual ACM
+Symposium on Principles of Distributed Computing}"
+,pages="21-30"
+,annotation={
+Each thread keeps an array of pointers to items that it is
+currently referencing. Sort of an inside-out garbage collection
+mechanism, but one that requires the accessing code to explicitly
+state its needs. Also requires read-side memory barriers on
+most architectures.
+}
+}
+
 @unpublished{Dickins02a
 ,author="Hugh Dickins"
 ,title="Use RCU for System-V IPC"
@@ -735,6 +795,17 @@ Andrea Arcangeli and Andi Kleen and Orran Krieger and Rusty Russell"
 ,note="private communication"
 }
 
+@InProceedings{HerlihyLM02
+,author={Maurice Herlihy and Victor Luchangco and Mark Moir}
+,title="The Repeat Offender Problem: A Mechanism for Supporting Dynamic-Sized,
+Lock-Free Data Structures"
+,booktitle={Proceedings of 16\textsuperscript{th} International
+Symposium on Distributed Computing}
+,year=2002
+,month="October"
+,pages="339-353"
+}
+
 @unpublished{Sarma02b
 ,Author="Dipankar Sarma"
 ,Title="Some dcache\_rcu benchmark numbers"
@@ -749,6 +820,19 @@ Andrea Arcangeli and Andi Kleen and Orran Krieger and Rusty Russell"
 }
 }
 
+@unpublished{MingmingCao2002IPCRCU
+,Author="Mingming Cao"
+,Title="[PATCH]updated ipc lock patch"
+,month="October"
+,year="2002"
+,note="Available:
+\url{https://lkml.org/lkml/2002/10/24/262}
+[Viewed February 15, 2014]"
+,annotation={
+Mingming Cao's patch to introduce RCU to SysV IPC.
+}
+}
+
 @unpublished{LinusTorvalds2003a
 ,Author="Linus Torvalds"
 ,Title="Re: {[PATCH]} small fixes in brlock.h"
@@ -982,6 +1066,23 @@ Realtime Applications"
 }
 }
 
+@article{MagedMichael04a
+,author="Maged M. Michael"
+,title="Hazard Pointers: Safe Memory Reclamation for Lock-Free Objects"
+,Year="2004"
+,Month="June"
+,journal="IEEE Transactions on Parallel and Distributed Systems"
+,volume="15"
+,number="6"
+,pages="491-504"
+,url="Available:
+\url{http://www.research.ibm.com/people/m/michael/ieeetpds-2004.pdf}
+[Viewed March 1, 2005]"
+,annotation={
+New canonical hazard-pointer citation.
+}
+}
+
 @phdthesis{PaulEdwardMcKenneyPhD
 ,author="Paul E. McKenney"
 ,title="Exploiting Deferred Destruction:
@@ -256,10 +256,10 @@ over a rather long period of time, but improvements are always welcome!
 variations on this theme.
 
 b. Limiting update rate. For example, if updates occur only
-once per hour, then no explicit rate limiting is required,
-unless your system is already badly broken. The dcache
-subsystem takes this approach -- updates are guarded
-by a global lock, limiting their rate.
+once per hour, then no explicit rate limiting is
+required, unless your system is already badly broken.
+Older versions of the dcache subsystem take this approach,
+guarding updates with a global lock, limiting their rate.
 
 c. Trusted update -- if updates can only be done manually by
 superuser or some other trusted user, then it might not
@@ -268,7 +268,8 @@ over a rather long period of time, but improvements are always welcome!
 the machine.
 
 d. Use call_rcu_bh() rather than call_rcu(), in order to take
-advantage of call_rcu_bh()'s faster grace periods.
+advantage of call_rcu_bh()'s faster grace periods. (This
+is only a partial solution, though.)
 
 e. Periodically invoke synchronize_rcu(), permitting a limited
 number of updates per grace period.
@@ -276,6 +277,13 @@ over a rather long period of time, but improvements are always welcome!
 The same cautions apply to call_rcu_bh(), call_rcu_sched(),
 call_srcu(), and kfree_rcu().
 
+Note that although these primitives do take action to avoid memory
+exhaustion when any given CPU has too many callbacks, a determined
+user could still exhaust memory. This is especially the case
+if a system with a large number of CPUs has been configured to
+offload all of its RCU callbacks onto a single CPU, or if the
+system has relatively little free memory.
+
 9. All RCU list-traversal primitives, which include
 rcu_dereference(), list_for_each_entry_rcu(), and
 list_for_each_safe_rcu(), must be either within an RCU read-side
@@ -162,7 +162,18 @@ Purpose: Execute workqueue requests
 To reduce its OS jitter, do any of the following:
 1. Run your workload at a real-time priority, which will allow
 preempting the kworker daemons.
-2. Do any of the following needed to avoid jitter that your
+2. A given workqueue can be made visible in the sysfs filesystem
+by passing the WQ_SYSFS to that workqueue's alloc_workqueue().
+Such a workqueue can be confined to a given subset of the
+CPUs using the /sys/devices/virtual/workqueue/*/cpumask sysfs
+files. The set of WQ_SYSFS workqueues can be displayed using
+"ls sys/devices/virtual/workqueue". That said, the workqueues
+maintainer would like to caution people against indiscriminately
+sprinkling WQ_SYSFS across all the workqueues. The reason for
+caution is that it is easy to add WQ_SYSFS, but because sysfs is
+part of the formal user/kernel API, it can be nearly impossible
+to remove it, even if its addition was a mistake.
+3. Do any of the following needed to avoid jitter that your
 application cannot tolerate:
 a. Build your kernel with CONFIG_SLUB=y rather than
 CONFIG_SLAB=y, thus avoiding the slab allocator's periodic
@@ -608,26 +608,30 @@ as follows:
 b = p; /* BUG: Compiler can reorder!!! */
 do_something();
 
-The solution is again ACCESS_ONCE(), which preserves the ordering between
-the load from variable 'a' and the store to variable 'b':
+The solution is again ACCESS_ONCE() and barrier(), which preserves the
+ordering between the load from variable 'a' and the store to variable 'b':
 
 q = ACCESS_ONCE(a);
 if (q) {
+barrier();
 ACCESS_ONCE(b) = p;
 do_something();
 } else {
+barrier();
 ACCESS_ONCE(b) = p;
 do_something_else();
 }
 
-You could also use barrier() to prevent the compiler from moving
-the stores to variable 'b', but barrier() would not prevent the
-compiler from proving to itself that a==1 always, so ACCESS_ONCE()
-is also needed.
+The initial ACCESS_ONCE() is required to prevent the compiler from
+proving the value of 'a', and the pair of barrier() invocations are
+required to prevent the compiler from pulling the two identical stores
+to 'b' out from the legs of the "if" statement.
 
 It is important to note that control dependencies absolutely require a
 a conditional. For example, the following "optimized" version of
-the above example breaks ordering:
+the above example breaks ordering, which is why the barrier() invocations
+are absolutely required if you have identical stores in both legs of
+the "if" statement:
 
 q = ACCESS_ONCE(a);
 ACCESS_ONCE(b) = p; /* BUG: No ordering vs. load from a!!! */
@@ -643,9 +647,11 @@ It is of course legal for the prior load to be part of the conditional,
 for example, as follows:
 
 if (ACCESS_ONCE(a) > 0) {
+barrier();
 ACCESS_ONCE(b) = q / 2;
 do_something();
 } else {
+barrier();
 ACCESS_ONCE(b) = q / 3;
 do_something_else();
 }
@@ -659,9 +665,11 @@ the needed conditional. For example:
 
 q = ACCESS_ONCE(a);
 if (q % MAX) {
+barrier();
 ACCESS_ONCE(b) = p;
 do_something();
 } else {
+barrier();
 ACCESS_ONCE(b) = p;
 do_something_else();
 }
@@ -723,8 +731,13 @@ In summary:
 use smb_rmb(), smp_wmb(), or, in the case of prior stores and
 later loads, smp_mb().
 
+(*) If both legs of the "if" statement begin with identical stores
+to the same variable, a barrier() statement is required at the
+beginning of each leg of the "if" statement.
+
 (*) Control dependencies require at least one run-time conditional
-between the prior load and the subsequent store. If the compiler
+between the prior load and the subsequent store, and this
+conditional must involve the prior load. If the compiler
 is able to optimize the conditional away, it will have also
 optimized away the ordering. Careful use of ACCESS_ONCE() can
 help to preserve the needed conditional.
@@ -1249,6 +1262,23 @@ The ACCESS_ONCE() function can prevent any number of optimizations that,
 while perfectly safe in single-threaded code, can be fatal in concurrent
 code. Here are some examples of these sorts of optimizations:
 
+(*) The compiler is within its rights to reorder loads and stores
+to the same variable, and in some cases, the CPU is within its
+rights to reorder loads to the same variable. This means that
+the following code:
+
+a[0] = x;
+a[1] = x;
+
+Might result in an older value of x stored in a[1] than in a[0].
+Prevent both the compiler and the CPU from doing this as follows:
+
+a[0] = ACCESS_ONCE(x);
+a[1] = ACCESS_ONCE(x);
+
+In short, ACCESS_ONCE() provides cache coherence for accesses from
+multiple CPUs to a single variable.
+
 (*) The compiler is within its rights to merge successive loads from
 the same variable. Such merging can cause the compiler to "optimize"
 the following code:
@@ -1644,12 +1674,12 @@ for each construct. These operations all imply certain barriers:
 Memory operations issued after the ACQUIRE will be completed after the
 ACQUIRE operation has completed.
 
-Memory operations issued before the ACQUIRE may be completed after the
-ACQUIRE operation has completed. An smp_mb__before_spinlock(), combined
-with a following ACQUIRE, orders prior loads against subsequent stores and
-stores and prior stores against subsequent stores. Note that this is
-weaker than smp_mb()! The smp_mb__before_spinlock() primitive is free on
-many architectures.
+Memory operations issued before the ACQUIRE may be completed after
+the ACQUIRE operation has completed. An smp_mb__before_spinlock(),
+combined with a following ACQUIRE, orders prior loads against
+subsequent loads and stores and also orders prior stores against
+subsequent stores. Note that this is weaker than smp_mb()! The
+smp_mb__before_spinlock() primitive is free on many architectures.
 
 (2) RELEASE operation implication:
 
@@ -1694,24 +1724,21 @@ may occur as:
 
 ACQUIRE M, STORE *B, STORE *A, RELEASE M
 
-This same reordering can of course occur if the lock's ACQUIRE and RELEASE are
-to the same lock variable, but only from the perspective of another CPU not
-holding that lock.
+When the ACQUIRE and RELEASE are a lock acquisition and release,
+respectively, this same reordering can occur if the lock's ACQUIRE and
+RELEASE are to the same lock variable, but only from the perspective of
+another CPU not holding that lock. In short, a ACQUIRE followed by an
+RELEASE may -not- be assumed to be a full memory barrier.
 
-In short, a RELEASE followed by an ACQUIRE may -not- be assumed to be a full
-memory barrier because it is possible for a preceding RELEASE to pass a
-later ACQUIRE from the viewpoint of the CPU, but not from the viewpoint
-of the compiler. Note that deadlocks cannot be introduced by this
-interchange because if such a deadlock threatened, the RELEASE would
-simply complete.
-
-If it is necessary for a RELEASE-ACQUIRE pair to produce a full barrier, the
-ACQUIRE can be followed by an smp_mb__after_unlock_lock() invocation. This
-will produce a full barrier if either (a) the RELEASE and the ACQUIRE are
-executed by the same CPU or task, or (b) the RELEASE and ACQUIRE act on the
-same variable. The smp_mb__after_unlock_lock() primitive is free on many
-architectures. Without smp_mb__after_unlock_lock(), the critical sections
-corresponding to the RELEASE and the ACQUIRE can cross:
+Similarly, the reverse case of a RELEASE followed by an ACQUIRE does not
+imply a full memory barrier. If it is necessary for a RELEASE-ACQUIRE
+pair to produce a full barrier, the ACQUIRE can be followed by an
+smp_mb__after_unlock_lock() invocation. This will produce a full barrier
+if either (a) the RELEASE and the ACQUIRE are executed by the same
+CPU or task, or (b) the RELEASE and ACQUIRE act on the same variable.
+The smp_mb__after_unlock_lock() primitive is free on many architectures.
+Without smp_mb__after_unlock_lock(), the CPU's execution of the critical
+sections corresponding to the RELEASE and the ACQUIRE can cross, so that:
 
 *A = a;
 RELEASE M
@@ -1722,7 +1749,36 @@ could occur as:
 
 ACQUIRE N, STORE *B, STORE *A, RELEASE M
 
-With smp_mb__after_unlock_lock(), they cannot, so that:
+It might appear that this reordering could introduce a deadlock.
+However, this cannot happen because if such a deadlock threatened,
+the RELEASE would simply complete, thereby avoiding the deadlock.
+
+Why does this work?
+
+One key point is that we are only talking about the CPU doing
+the reordering, not the compiler. If the compiler (or, for
+that matter, the developer) switched the operations, deadlock
+-could- occur.
+
+But suppose the CPU reordered the operations. In this case,
+the unlock precedes the lock in the assembly code. The CPU
+simply elected to try executing the later lock operation first.
+If there is a deadlock, this lock operation will simply spin (or
+try to sleep, but more on that later). The CPU will eventually
+execute the unlock operation (which preceded the lock operation
+in the assembly code), which will unravel the potential deadlock,
+allowing the lock operation to succeed.
+
+But what if the lock is a sleeplock? In that case, the code will
+try to enter the scheduler, where it will eventually encounter
+a memory barrier, which will force the earlier unlock operation
+to complete, again unraveling the deadlock. There might be
+a sleep-unlock race, but the locking primitive needs to resolve
+such races properly in any case.
+
+With smp_mb__after_unlock_lock(), the two critical sections cannot overlap.
+For example, with the following code, the store to *A will always be
+seen by other CPUs before the store to *B:
 
 *A = a;
 RELEASE M
@@ -1730,13 +1786,18 @@ With smp_mb__after_unlock_lock(), they cannot, so that:
 smp_mb__after_unlock_lock();
 *B = b;
 
-will always occur as either of the following:
+The operations will always occur in one of the following orders:
 
-STORE *A, RELEASE, ACQUIRE, STORE *B
-STORE *A, ACQUIRE, RELEASE, STORE *B
+STORE *A, RELEASE, ACQUIRE, smp_mb__after_unlock_lock(), STORE *B
+STORE *A, ACQUIRE, RELEASE, smp_mb__after_unlock_lock(), STORE *B
+ACQUIRE, STORE *A, RELEASE, smp_mb__after_unlock_lock(), STORE *B
 
 If the RELEASE and ACQUIRE were instead both operating on the same lock
-variable, only the first of these two alternatives can occur.
+variable, only the first of these alternatives can occur. In addition,
+the more strongly ordered systems may rule out some of the above orders.
+But in any case, as noted earlier, the smp_mb__after_unlock_lock()
+ensures that the store to *A will always be seen as happening before
+the store to *B.
 
 Locks and semaphores may not provide any guarantee of ordering on UP compiled
 systems, and so cannot be counted on in such a situation to actually achieve
@@ -2757,7 +2818,7 @@ in that order, but, without intervention, the sequence may have almost any
 combination of elements combined or discarded, provided the program's view of
 the world remains consistent. Note that ACCESS_ONCE() is -not- optional
 in the above example, as there are architectures where a given CPU might
-interchange successive loads to the same location. On such architectures,
+reorder successive loads to the same location. On such architectures,
 ACCESS_ONCE() does whatever is necessary to prevent this, for example, on
 Itanium the volatile casts used by ACCESS_ONCE() cause GCC to emit the
 special ld.acq and st.rel instructions that prevent such reordering.
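The control-dependency pattern from the hunks above can be sketched in userspace as follows. This is only a minimal illustration under stated assumptions: ACCESS_ONCE() and barrier() are stand-in definitions that follow the volatile-cast and empty-asm idioms (GCC or Clang extensions), and store_after_check() is a hypothetical helper that does not appear in the patch. The sketch exercises only the compiler-level constraints; it says nothing about CPU-level ordering, which still depends on the architecture.

```c
#include <assert.h>

/* Userspace stand-ins for the kernel macros (assumption: GCC or Clang). */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))
#define barrier() __asm__ __volatile__("" : : : "memory")

static int a;	/* variable loaded before the conditional */
static int b;	/* variable stored in both legs of the "if" */

/*
 * Hypothetical helper: loads 'a' once, then stores 'p' to 'b' in
 * whichever leg is taken.  Returns 1 for the "then" leg, 0 otherwise.
 */
static int store_after_check(int p)
{
	int q = ACCESS_ONCE(a);	/* compiler cannot assume a value for 'a' */

	if (q) {
		barrier();	/* keeps the identical store inside this leg */
		ACCESS_ONCE(b) = p;
		return 1;
	} else {
		barrier();	/* likewise for the other leg */
		ACCESS_ONCE(b) = p;
		return 0;
	}
}
```

Calling store_after_check(5) with a set to 1 takes the first leg and leaves b equal to 5; with a set to 0 the second leg runs instead. Without the two barrier() calls, a compiler would be free to hoist the identical stores above the "if", which is the hazard the updated text describes.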
@@ -497,7 +497,7 @@ repeat:
 error = fd;
 #if 1
 /* Sanity check */
-if (rcu_dereference_raw(fdt->fd[fd]) != NULL) {
+if (rcu_access_pointer(fdt->fd[fd]) != NULL) {
 printk(KERN_WARNING "alloc_fd: slot %d not NULL!\n", fd);
 rcu_assign_pointer(fdt->fd[fd], NULL);
 }
@@ -247,9 +247,10 @@ static inline void list_splice_init_rcu(struct list_head *list,
 * primitives such as list_add_rcu() as long as it's guarded by rcu_read_lock().
 */
 #define list_entry_rcu(ptr, type, member) \
-({typeof (*ptr) __rcu *__ptr = (typeof (*ptr) __rcu __force *)ptr; \
-container_of((typeof(ptr))rcu_dereference_raw(__ptr), type, member); \
-})
+({ \
+typeof(*ptr) __rcu *__ptr = (typeof(*ptr) __rcu __force *)ptr; \
+container_of((typeof(ptr))rcu_dereference_raw(__ptr), type, member); \
+})
 
 /**
 * Where are list_empty_rcu() and list_first_entry_rcu()?
@@ -285,11 +286,11 @@ static inline void list_splice_init_rcu(struct list_head *list,
 * primitives such as list_add_rcu() as long as it's guarded by rcu_read_lock().
 */
 #define list_first_or_null_rcu(ptr, type, member) \
-({struct list_head *__ptr = (ptr); \
-struct list_head *__next = ACCESS_ONCE(__ptr->next); \
-likely(__ptr != __next) ? \
-list_entry_rcu(__next, type, member) : NULL; \
-})
+({ \
+struct list_head *__ptr = (ptr); \
+struct list_head *__next = ACCESS_ONCE(__ptr->next); \
+likely(__ptr != __next) ? list_entry_rcu(__next, type, member) : NULL; \
+})
 
 /**
 * list_for_each_entry_rcu - iterate over rcu list of given type
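The shape of the reformatted list_first_or_null_rcu() macro can be tried out in userspace with a simplified sketch. Everything below is an assumption made for illustration: first_or_null(), struct item, list_init() and list_add_head() are hypothetical names, the __rcu annotations and rcu_dereference_raw() are dropped, and the code is single-threaded, so it shows only the "read ->next once, then compare against the head" logic and not any RCU protection. It relies on GCC/Clang statement expressions.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Userspace stand-ins (assumption: GCC or Clang extensions available). */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified analogue of list_first_or_null_rcu(), without RCU checks. */
#define first_or_null(ptr, type, member) \
({ \
	struct list_head *__ptr = (ptr); \
	struct list_head *__next = ACCESS_ONCE(__ptr->next); \
	__ptr != __next ? container_of(__next, type, member) : NULL; \
})

/* Hypothetical element type embedding a list node. */
struct item {
	int value;
	struct list_head node;
};

/* Make 'h' an empty circular list. */
static void list_init(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Insert 'n' immediately after the head 'h'. */
static void list_add_head(struct list_head *n, struct list_head *h)
{
	n->next = h->next;
	n->prev = h;
	h->next->prev = n;
	h->next = n;
}
```

On an empty list the head's ->next points back at the head, so first_or_null() yields NULL; after list_add_head() it yields a pointer to the enclosing struct item. Reading ->next exactly once is what lets the emptiness test and the returned entry agree with each other.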
+48
-46
@@ -12,8 +12,8 @@
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
-* along with this program; if not, write to the Free Software
-* Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+* along with this program; if not, you can access it online at
+* http://www.gnu.org/licenses/gpl-2.0.html.
 *
 * Copyright IBM Corporation, 2001
 *
@@ -44,7 +44,9 @@
 #include <linux/debugobjects.h>
 #include <linux/bug.h>
 #include <linux/compiler.h>
+#include <asm/barrier.h>
 
+extern int rcu_expedited; /* for sysctl */
 #ifdef CONFIG_RCU_TORTURE_TEST
 extern int rcutorture_runnable; /* for sysctl */
 #endif /* #ifdef CONFIG_RCU_TORTURE_TEST */
@@ -479,11 +481,9 @@ static inline void rcu_preempt_sleep_check(void)
 do { \
 rcu_preempt_sleep_check(); \
 rcu_lockdep_assert(!lock_is_held(&rcu_bh_lock_map), \
-"Illegal context switch in RCU-bh" \
-" read-side critical section"); \
+"Illegal context switch in RCU-bh read-side critical section"); \
 rcu_lockdep_assert(!lock_is_held(&rcu_sched_lock_map), \
-"Illegal context switch in RCU-sched"\
-" read-side critical section"); \
+"Illegal context switch in RCU-sched read-side critical section"); \
 } while (0)
 
 #else /* #ifdef CONFIG_PROVE_RCU */
@@ -510,43 +510,40 @@ static inline void rcu_preempt_sleep_check(void)
 #endif /* #else #ifdef __CHECKER__ */
 
 #define __rcu_access_pointer(p, space) \
-({ \
-typeof(*p) *_________p1 = (typeof(*p)*__force )ACCESS_ONCE(p); \
-rcu_dereference_sparse(p, space); \
-((typeof(*p) __force __kernel *)(_________p1)); \
-})
+({ \
+typeof(*p) *_________p1 = (typeof(*p) *__force)ACCESS_ONCE(p); \
+rcu_dereference_sparse(p, space); \
+((typeof(*p) __force __kernel *)(_________p1)); \
+})
 #define __rcu_dereference_check(p, c, space) \
-({ \
-typeof(*p) *_________p1 = (typeof(*p)*__force )ACCESS_ONCE(p); \
-rcu_lockdep_assert(c, "suspicious rcu_dereference_check()" \
-" usage"); \
-rcu_dereference_sparse(p, space); \
-smp_read_barrier_depends(); \
-((typeof(*p) __force __kernel *)(_________p1)); \
-})
+({ \
+typeof(*p) *_________p1 = (typeof(*p) *__force)ACCESS_ONCE(p); \
+rcu_lockdep_assert(c, "suspicious rcu_dereference_check() usage"); \
+rcu_dereference_sparse(p, space); \
+smp_read_barrier_depends(); /* Dependency order vs. p above. */ \
+((typeof(*p) __force __kernel *)(_________p1)); \
+})
 #define __rcu_dereference_protected(p, c, space) \
-({ \
-rcu_lockdep_assert(c, "suspicious rcu_dereference_protected()" \
-" usage"); \
-rcu_dereference_sparse(p, space); \
-((typeof(*p) __force __kernel *)(p)); \
-})
+({ \
+rcu_lockdep_assert(c, "suspicious rcu_dereference_protected() usage"); \
+rcu_dereference_sparse(p, space); \
+((typeof(*p) __force __kernel *)(p)); \
+})
 
 #define __rcu_access_index(p, space) \
-({ \
-typeof(p) _________p1 = ACCESS_ONCE(p); \
-rcu_dereference_sparse(p, space); \
-(_________p1); \
-})
+({ \
+typeof(p) _________p1 = ACCESS_ONCE(p); \
+rcu_dereference_sparse(p, space); \
+(_________p1); \
+})
 #define __rcu_dereference_index_check(p, c) \
-({ \
-typeof(p) _________p1 = ACCESS_ONCE(p); \
-rcu_lockdep_assert(c, \
-"suspicious rcu_dereference_index_check()" \
-" usage"); \
-smp_read_barrier_depends(); \
-(_________p1); \
-})
+({ \
+typeof(p) _________p1 = ACCESS_ONCE(p); \
+rcu_lockdep_assert(c, \
+"suspicious rcu_dereference_index_check() usage"); \
+smp_read_barrier_depends(); /* Dependency order vs. p above. */ \
+(_________p1); \
+})
 
 /**
 * RCU_INITIALIZER() - statically initialize an RCU-protected global variable
@@ -585,12 +582,7 @@ static inline void rcu_preempt_sleep_check(void)
|
||||
* please be careful when making changes to rcu_assign_pointer() and the
|
||||
* other macros that it invokes.
|
||||
*/
|
||||
#define rcu_assign_pointer(p, v) \
|
||||
do { \
|
||||
smp_wmb(); \
|
||||
ACCESS_ONCE(p) = RCU_INITIALIZER(v); \
|
||||
} while (0)
|
||||
|
||||
#define rcu_assign_pointer(p, v) smp_store_release(&p, RCU_INITIALIZER(v))
|
||||
|
||||
/**
|
||||
* rcu_access_pointer() - fetch RCU pointer with no dereferencing
|
||||
@@ -1015,11 +1007,21 @@ static inline notrace void rcu_read_unlock_sched_notrace(void)
|
||||
#define kfree_rcu(ptr, rcu_head) \
|
||||
__kfree_rcu(&((ptr)->rcu_head), offsetof(typeof(*(ptr)), rcu_head))
|
||||
|
||||
#ifdef CONFIG_RCU_NOCB_CPU
|
||||
#if defined(CONFIG_TINY_RCU) || defined(CONFIG_RCU_NOCB_CPU_ALL)
|
||||
static inline int rcu_needs_cpu(int cpu, unsigned long *delta_jiffies)
|
||||
{
|
||||
*delta_jiffies = ULONG_MAX;
|
||||
return 0;
|
||||
}
|
||||
#endif /* #if defined(CONFIG_TINY_RCU) || defined(CONFIG_RCU_NOCB_CPU_ALL) */
|
||||
|
||||
#if defined(CONFIG_RCU_NOCB_CPU_ALL)
|
||||
static inline bool rcu_is_nocb_cpu(int cpu) { return true; }
|
||||
#elif defined(CONFIG_RCU_NOCB_CPU)
|
||||
bool rcu_is_nocb_cpu(int cpu);
|
||||
#else
|
||||
static inline bool rcu_is_nocb_cpu(int cpu) { return false; }
|
||||
#endif /* #else #ifdef CONFIG_RCU_NOCB_CPU */
|
||||
#endif
|
||||
|
||||
|
||||
/* Only for use by adaptive-ticks code. */
|
||||
|
||||
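Review note on the rcu_assign_pointer() hunk above: both the old smp_wmb() plus ACCESS_ONCE() pair and the new smp_store_release() aim to make the initialization of the pointed-to data visible before the pointer itself. A rough userspace analogue, sketched with C11 atomics rather than the kernel primitives, might look like the following. The names struct cfg, global_cfg, publish_cfg() and read_cfg_value() are hypothetical, and the acquire load is only a stronger stand-in for the dependency ordering that the rcu_dereference() family relies on.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical payload, not taken from the kernel sources. */
struct cfg {
	int value;
};

/* Shared pointer that readers pick up; NULL until something is published. */
static _Atomic(struct cfg *) global_cfg;

/* Writer side: initialize the payload first, then publish with release semantics. */
static void publish_cfg(struct cfg *p, int value)
{
	p->value = value;
	atomic_store_explicit(&global_cfg, p, memory_order_release);
}

/* Reader side: returns fallback if nothing has been published yet. */
static int read_cfg_value(int fallback)
{
	struct cfg *p = atomic_load_explicit(&global_cfg, memory_order_acquire);

	return p ? p->value : fallback;
}
```

A reader that observes the non-NULL pointer is then guaranteed to observe the value stored before the release store, which is the property the one-line macro in the hunk is meant to preserve.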
+12
-8
@@ -12,8 +12,8 @@
|
||||
* GNU General Public License for more details.
|
||||
*
|
||||
* You should have received a copy of the GNU General Public License
|
||||
* along with this program; if not, write to the Free Software
|
||||
* Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
|
||||
* along with this program; if not, you can access it online at
|
||||
* http://www.gnu.org/licenses/gpl-2.0.html.
|
||||
*
|
||||
* Copyright IBM Corporation, 2008
|
||||
*
|
||||
@@ -27,6 +27,16 @@
|
||||
|
||||
#include <linux/cache.h>
|
||||
|
||||
static inline unsigned long get_state_synchronize_rcu(void)
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
|
||||
static inline void cond_synchronize_rcu(unsigned long oldstate)
|
||||
{
|
||||
might_sleep();
|
||||
}
|
||||
|
||||
static inline void rcu_barrier_bh(void)
|
||||
{
|
||||
wait_rcu_gp(call_rcu_bh);
|
||||
@@ -68,12 +78,6 @@ static inline void kfree_call_rcu(struct rcu_head *head,
|
||||
call_rcu(head, func);
|
||||
}
|
||||
|
||||
static inline int rcu_needs_cpu(int cpu, unsigned long *delta_jiffies)
|
||||
{
|
||||
*delta_jiffies = ULONG_MAX;
|
||||
return 0;
|
||||
}
|
||||
|
||||
static inline void rcu_note_context_switch(int cpu)
|
||||
{
|
||||
rcu_sched_qs(cpu);
|
||||
|
||||
@@ -12,8 +12,8 @@
|
||||
* GNU General Public License for more details.
|
||||
*
|
||||
* You should have received a copy of the GNU General Public License
|
||||
* along with this program; if not, write to the Free Software
|
||||
* Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
|
||||
* along with this program; if not, you can access it online at
|
||||
* http://www.gnu.org/licenses/gpl-2.0.html.
|
||||
*
|
||||
* Copyright IBM Corporation, 2008
|
||||
*
|
||||
@@ -31,7 +31,9 @@
|
||||
#define __LINUX_RCUTREE_H
|
||||
|
||||
void rcu_note_context_switch(int cpu);
|
||||
#ifndef CONFIG_RCU_NOCB_CPU_ALL
|
||||
int rcu_needs_cpu(int cpu, unsigned long *delta_jiffies);
|
||||
#endif /* #ifndef CONFIG_RCU_NOCB_CPU_ALL */
|
||||
void rcu_cpu_stall_reset(void);
|
||||
|
||||
/*
|
||||
@@ -74,6 +76,8 @@ static inline void synchronize_rcu_bh_expedited(void)
|
||||
void rcu_barrier(void);
|
||||
void rcu_barrier_bh(void);
|
||||
void rcu_barrier_sched(void);
|
||||
unsigned long get_state_synchronize_rcu(void);
|
||||
void cond_synchronize_rcu(unsigned long oldstate);
|
||||
|
||||
extern unsigned long rcutorture_testseq;
|
||||
extern unsigned long rcutorture_vernum;
|
||||
|
||||
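Review note on the get_state_synchronize_rcu() and cond_synchronize_rcu() declarations above (the "grace-period piggybacking API" from the commit list): the idea, as far as the names and signatures suggest, is to take a snapshot of the grace-period state and later wait only if no full grace period has elapsed since that snapshot. The toy model below is a single-threaded sketch under that assumption. Every toy_ name is hypothetical, the two counters are not the kernel's real state, and counter wraparound is ignored.

```c
#include <assert.h>

/* Toy grace-period counters; hypothetical stand-ins for RCU's internal state. */
static unsigned long toy_gp_started;
static unsigned long toy_gp_completed;
static unsigned long toy_full_waits;	/* How many times we really waited. */

/* Toy synchronize: start a new grace period and run it to completion. */
static void toy_synchronize(void)
{
	toy_gp_started++;
	toy_gp_completed = toy_gp_started;
	toy_full_waits++;
}

/* Snapshot the number of the most recently started grace period. */
static unsigned long toy_get_state(void)
{
	return toy_gp_started;
}

/* Wait only if no grace period has started and completed since the snapshot. */
static void toy_cond_synchronize(unsigned long oldstate)
{
	assert(toy_gp_completed <= toy_gp_started);
	if (toy_gp_completed <= oldstate)
		toy_synchronize();
}
```

In this model a second toy_cond_synchronize() call with the same snapshot returns immediately, because the grace period forced by the first call already covers it.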
@@ -12,8 +12,8 @@
|
||||
* GNU General Public License for more details.
|
||||
*
|
||||
* You should have received a copy of the GNU General Public License
|
||||
* along with this program; if not, write to the Free Software
|
||||
* Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
|
||||
* along with this program; if not, you can access it online at
|
||||
* http://www.gnu.org/licenses/gpl-2.0.html.
|
||||
*
|
||||
* Copyright (C) IBM Corporation, 2006
|
||||
* Copyright (C) Fujitsu, 2012
|
||||
|
||||
@@ -0,0 +1,100 @@
|
||||
/*
|
||||
* Common functions for in-kernel torture tests.
|
||||
*
|
||||
* This program is free software; you can redistribute it and/or modify
|
||||
* it under the terms of the GNU General Public License as published by
|
||||
* the Free Software Foundation; either version 2 of the License, or
|
||||
* (at your option) any later version.
|
||||
*
|
||||
* This program is distributed in the hope that it will be useful,
|
||||
* but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||
* GNU General Public License for more details.
|
||||
*
|
||||
* You should have received a copy of the GNU General Public License
|
||||
* along with this program; if not, you can access it online at
|
||||
* http://www.gnu.org/licenses/gpl-2.0.html.
|
||||
*
|
||||
* Copyright IBM Corporation, 2014
|
||||
*
|
||||
* Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
||||
*/
|
||||
|
||||
#ifndef __LINUX_TORTURE_H
|
||||
#define __LINUX_TORTURE_H
|
||||
|
||||
#include <linux/types.h>
|
||||
#include <linux/cache.h>
|
||||
#include <linux/spinlock.h>
|
||||
#include <linux/threads.h>
|
||||
#include <linux/cpumask.h>
|
||||
#include <linux/seqlock.h>
|
||||
#include <linux/lockdep.h>
|
||||
#include <linux/completion.h>
|
||||
#include <linux/debugobjects.h>
|
||||
#include <linux/bug.h>
|
||||
#include <linux/compiler.h>
|
||||
|
||||
/* Definitions for a non-string torture-test module parameter. */
|
||||
#define torture_param(type, name, init, msg) \
|
||||
static type name = init; \
|
||||
module_param(name, type, 0444); \
|
||||
MODULE_PARM_DESC(name, msg);
|
||||
|
||||
#define TORTURE_FLAG "-torture:"
|
||||
#define TOROUT_STRING(s) \
|
||||
pr_alert("%s" TORTURE_FLAG s "\n", torture_type)
|
||||
#define VERBOSE_TOROUT_STRING(s) \
|
||||
do { if (verbose) pr_alert("%s" TORTURE_FLAG " %s\n", torture_type, s); } while (0)
|
||||
#define VERBOSE_TOROUT_ERRSTRING(s) \
|
||||
do { if (verbose) pr_alert("%s" TORTURE_FLAG "!!! %s\n", torture_type, s); } while (0)
|
||||
|
||||
/* Definitions for a non-string torture-test module parameter. */
|
||||
#define torture_parm(type, name, init, msg) \
|
||||
static type name = init; \
|
||||
module_param(name, type, 0444); \
|
||||
MODULE_PARM_DESC(name, msg);
|
||||
|
||||
/* Definitions for online/offline exerciser. */
|
||||
int torture_onoff_init(long ooholdoff, long oointerval);
|
||||
char *torture_onoff_stats(char *page);
|
||||
bool torture_onoff_failures(void);
|
||||
|
||||
/* Low-rider random number generator. */
|
||||
struct torture_random_state {
|
||||
unsigned long trs_state;
|
||||
long trs_count;
|
||||
};
|
||||
#define DEFINE_TORTURE_RANDOM(name) struct torture_random_state name = { 0, 0 }
|
||||
unsigned long torture_random(struct torture_random_state *trsp);
|
||||
|
||||
/* Task shuffler, which causes CPUs to occasionally go idle. */
|
||||
void torture_shuffle_task_register(struct task_struct *tp);
|
||||
int torture_shuffle_init(long shuffint);
|
||||
|
||||
/* Test auto-shutdown handling. */
|
||||
void torture_shutdown_absorb(const char *title);
|
||||
int torture_shutdown_init(int ssecs, void (*cleanup)(void));
|
||||
|
||||
/* Task stuttering, which forces load/no-load transitions. */
|
||||
void stutter_wait(const char *title);
|
||||
int torture_stutter_init(int s);
|
||||
|
||||
/* Initialization and cleanup. */
|
||||
void torture_init_begin(char *ttype, bool v, int *runnable);
|
||||
void torture_init_end(void);
|
||||
bool torture_cleanup(void);
|
||||
bool torture_must_stop(void);
|
||||
bool torture_must_stop_irq(void);
|
||||
void torture_kthread_stopping(char *title);
|
||||
int _torture_create_kthread(int (*fn)(void *arg), void *arg, char *s, char *m,
|
||||
char *f, struct task_struct **tp);
|
||||
void _torture_stop_kthread(char *m, struct task_struct **tp);
|
||||
|
||||
#define torture_create_kthread(n, arg, tp) \
|
||||
_torture_create_kthread(n, (arg), #n, "Creating " #n " task", \
|
||||
"Failed to create " #n, &(tp))
|
||||
#define torture_stop_kthread(n, tp) \
|
||||
_torture_stop_kthread("Stopping " #n " task", &(tp))
|
||||
|
||||
#endif /* __LINUX_TORTURE_H */
|
||||
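Review note on the torture_create_kthread() and torture_stop_kthread() macros above: they use the preprocessor's # operator to turn the function name into the thread name and into the progress messages. A small userspace sketch of that stringification pattern follows. demo_create_task(), demo_create() and demo_writer() are hypothetical stand-ins, and the helper simply runs the function inline instead of creating a thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for _torture_create_kthread(): records its messages. */
static char last_name[64];
static char last_msg[64];

static int demo_create_task(int (*fn)(void *), void *arg,
			    const char *name, const char *msg)
{
	snprintf(last_name, sizeof(last_name), "%s", name);
	snprintf(last_msg, sizeof(last_msg), "%s", msg);
	return fn(arg);	/* Run inline; a real helper would spawn a thread. */
}

/* #n turns the function name into a string literal at compile time. */
#define demo_create(n, arg) \
	demo_create_task(n, (arg), #n, "Creating " #n " task")

static int demo_writer(void *arg)
{
	return *(int *)arg;
}
```

Adjacent string literals are concatenated by the compiler, so "Creating " #n " task" becomes a single message without any run-time formatting.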
@@ -93,6 +93,7 @@ obj-$(CONFIG_PADATA) += padata.o
|
||||
obj-$(CONFIG_CRASH_DUMP) += crash_dump.o
|
||||
obj-$(CONFIG_JUMP_LABEL) += jump_label.o
|
||||
obj-$(CONFIG_CONTEXT_TRACKING) += context_tracking.o
|
||||
obj-$(CONFIG_TORTURE_TEST) += torture.o
|
||||
|
||||
$(obj)/configs.o: $(obj)/config_data.h
|
||||
|
||||
|
||||
@@ -19,6 +19,8 @@
|
||||
#include <linux/sched.h>
|
||||
#include <linux/capability.h>
|
||||
|
||||
#include <linux/rcupdate.h> /* rcu_expedited */
|
||||
|
||||
#define KERNEL_ATTR_RO(_name) \
|
||||
static struct kobj_attribute _name##_attr = __ATTR_RO(_name)
|
||||
|
||||
|
||||
@@ -23,3 +23,4 @@ obj-$(CONFIG_DEBUG_SPINLOCK) += spinlock_debug.o
|
||||
obj-$(CONFIG_RWSEM_GENERIC_SPINLOCK) += rwsem-spinlock.o
|
||||
obj-$(CONFIG_RWSEM_XCHGADD_ALGORITHM) += rwsem-xadd.o
|
||||
obj-$(CONFIG_PERCPU_RWSEM) += percpu-rwsem.o
|
||||
obj-$(CONFIG_LOCK_TORTURE_TEST) += locktorture.o
|
||||
|
||||
@@ -0,0 +1,452 @@
|
||||
/*
|
||||
* Module-based torture test facility for locking
|
||||
*
|
||||
* This program is free software; you can redistribute it and/or modify
|
||||
* it under the terms of the GNU General Public License as published by
|
||||
* the Free Software Foundation; either version 2 of the License, or
|
||||
* (at your option) any later version.
|
||||
*
|
||||
* This program is distributed in the hope that it will be useful,
|
||||
* but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||
* GNU General Public License for more details.
|
||||
*
|
||||
* You should have received a copy of the GNU General Public License
|
||||
* along with this program; if not, you can access it online at
|
||||
* http://www.gnu.org/licenses/gpl-2.0.html.
|
||||
*
|
||||
* Copyright (C) IBM Corporation, 2014
|
||||
*
|
||||
* Author: Paul E. McKenney <paulmck@us.ibm.com>
|
||||
* Based on kernel/rcu/torture.c.
|
||||
*/
|
||||
#include <linux/types.h>
|
||||
#include <linux/kernel.h>
|
||||
#include <linux/init.h>
|
||||
#include <linux/module.h>
|
||||
#include <linux/kthread.h>
|
||||
#include <linux/err.h>
|
||||
#include <linux/spinlock.h>
|
||||
#include <linux/smp.h>
|
||||
#include <linux/interrupt.h>
|
||||
#include <linux/sched.h>
|
||||
#include <linux/atomic.h>
|
||||
#include <linux/bitops.h>
|
||||
#include <linux/completion.h>
|
||||
#include <linux/moduleparam.h>
|
||||
#include <linux/percpu.h>
|
||||
#include <linux/notifier.h>
|
||||
#include <linux/reboot.h>
|
||||
#include <linux/freezer.h>
|
||||
#include <linux/cpu.h>
|
||||
#include <linux/delay.h>
|
||||
#include <linux/stat.h>
|
||||
#include <linux/slab.h>
|
||||
#include <linux/trace_clock.h>
|
||||
#include <asm/byteorder.h>
|
||||
#include <linux/torture.h>
|
||||
|
||||
MODULE_LICENSE("GPL");
|
||||
MODULE_AUTHOR("Paul E. McKenney <paulmck@us.ibm.com>");
|
||||
|
||||
torture_param(int, nwriters_stress, -1,
|
||||
"Number of write-locking stress-test threads");
|
||||
torture_param(int, onoff_holdoff, 0, "Time after boot before CPU hotplugs (s)");
|
||||
torture_param(int, onoff_interval, 0,
|
||||
"Time between CPU hotplugs (s), 0=disable");
|
||||
torture_param(int, shuffle_interval, 3,
|
||||
"Number of jiffies between shuffles, 0=disable");
|
||||
torture_param(int, shutdown_secs, 0, "Shutdown time (j), <= zero to disable.");
|
||||
torture_param(int, stat_interval, 60,
|
||||
"Number of seconds between stats printk()s");
|
||||
torture_param(int, stutter, 5, "Number of jiffies to run/halt test, 0=disable");
|
||||
torture_param(bool, verbose, true,
|
||||
"Enable verbose debugging printk()s");
|
||||
|
||||
static char *torture_type = "spin_lock";
|
||||
module_param(torture_type, charp, 0444);
|
||||
MODULE_PARM_DESC(torture_type,
|
||||
"Type of lock to torture (spin_lock, spin_lock_irq, ...)");
|
||||
|
||||
static atomic_t n_lock_torture_errors;
|
||||
|
||||
static struct task_struct *stats_task;
|
||||
static struct task_struct **writer_tasks;
|
||||
|
||||
static int nrealwriters_stress;
|
||||
static bool lock_is_write_held;
|
||||
|
||||
struct lock_writer_stress_stats {
|
||||
long n_write_lock_fail;
|
||||
long n_write_lock_acquired;
|
||||
};
|
||||
static struct lock_writer_stress_stats *lwsa;
|
||||
|
||||
#if defined(MODULE) || defined(CONFIG_LOCK_TORTURE_TEST_RUNNABLE)
|
||||
#define LOCKTORTURE_RUNNABLE_INIT 1
|
||||
#else
|
||||
#define LOCKTORTURE_RUNNABLE_INIT 0
|
||||
#endif
|
||||
int locktorture_runnable = LOCKTORTURE_RUNNABLE_INIT;
|
||||
module_param(locktorture_runnable, int, 0444);
|
||||
MODULE_PARM_DESC(locktorture_runnable, "Start locktorture at boot");
|
||||
|
||||
/* Forward reference. */
|
||||
static void lock_torture_cleanup(void);
|
||||
|
||||
/*
|
||||
* Operations vector for selecting different types of tests.
|
||||
*/
|
||||
struct lock_torture_ops {
|
||||
void (*init)(void);
|
||||
int (*writelock)(void);
|
||||
void (*write_delay)(struct torture_random_state *trsp);
|
||||
void (*writeunlock)(void);
|
||||
unsigned long flags;
|
||||
const char *name;
|
||||
};
|
||||
|
||||
static struct lock_torture_ops *cur_ops;
|
||||
|
||||
/*
|
||||
* Definitions for lock torture testing.
|
||||
*/
|
||||
|
||||
static int torture_lock_busted_write_lock(void)
|
||||
{
|
||||
return 0; /* BUGGY, do not use in real life!!! */
|
||||
}
|
||||
|
||||
static void torture_lock_busted_write_delay(struct torture_random_state *trsp)
|
||||
{
|
||||
const unsigned long longdelay_us = 100;
|
||||
|
||||
/* We want a long delay occasionally to force massive contention. */
|
||||
if (!(torture_random(trsp) %
|
||||
(nrealwriters_stress * 2000 * longdelay_us)))
|
||||
mdelay(longdelay_us);
|
||||
#ifdef CONFIG_PREEMPT
|
||||
if (!(torture_random(trsp) % (nrealwriters_stress * 20000)))
|
||||
preempt_schedule(); /* Allow test to be preempted. */
|
||||
#endif
|
||||
}
|
||||
|
||||
static void torture_lock_busted_write_unlock(void)
|
||||
{
|
||||
/* BUGGY, do not use in real life!!! */
|
||||
}
|
||||
|
||||
static struct lock_torture_ops lock_busted_ops = {
|
||||
.writelock = torture_lock_busted_write_lock,
|
||||
.write_delay = torture_lock_busted_write_delay,
|
||||
.writeunlock = torture_lock_busted_write_unlock,
|
||||
.name = "lock_busted"
|
||||
};
|
||||
|
||||
static DEFINE_SPINLOCK(torture_spinlock);
|
||||
|
||||
static int torture_spin_lock_write_lock(void) __acquires(torture_spinlock)
|
||||
{
|
||||
spin_lock(&torture_spinlock);
|
||||
return 0;
|
||||
}
|
||||
|
||||
static void torture_spin_lock_write_delay(struct torture_random_state *trsp)
|
||||
{
|
||||
const unsigned long shortdelay_us = 2;
|
||||
const unsigned long longdelay_us = 100;
|
||||
|
||||
/* We want a short delay mostly to emulate likely code, and
|
||||
* we want a long delay occasionally to force massive contention.
|
||||
*/
|
||||
if (!(torture_random(trsp) %
|
||||
(nrealwriters_stress * 2000 * longdelay_us)))
|
||||
mdelay(longdelay_us);
|
||||
if (!(torture_random(trsp) %
|
||||
(nrealwriters_stress * 2 * shortdelay_us)))
|
||||
udelay(shortdelay_us);
|
||||
#ifdef CONFIG_PREEMPT
|
||||
if (!(torture_random(trsp) % (nrealwriters_stress * 20000)))
|
||||
preempt_schedule(); /* Allow test to be preempted. */
|
||||
#endif
|
||||
}
|
||||
|
||||
static void torture_spin_lock_write_unlock(void) __releases(torture_spinlock)
|
||||
{
|
||||
spin_unlock(&torture_spinlock);
|
||||
}
|
||||
|
||||
static struct lock_torture_ops spin_lock_ops = {
|
||||
.writelock = torture_spin_lock_write_lock,
|
||||
.write_delay = torture_spin_lock_write_delay,
|
||||
.writeunlock = torture_spin_lock_write_unlock,
|
||||
.name = "spin_lock"
|
||||
};
|
||||
|
||||
static int torture_spin_lock_write_lock_irq(void)
|
||||
__acquires(torture_spinlock_irq)
|
||||
{
|
||||
unsigned long flags;
|
||||
|
||||
spin_lock_irqsave(&torture_spinlock, flags);
|
||||
cur_ops->flags = flags;
|
||||
return 0;
|
||||
}
|
||||
|
||||
static void torture_lock_spin_write_unlock_irq(void)
|
||||
__releases(torture_spinlock)
|
||||
{
|
||||
spin_unlock_irqrestore(&torture_spinlock, cur_ops->flags);
|
||||
}
|
||||
|
||||
static struct lock_torture_ops spin_lock_irq_ops = {
|
||||
.writelock = torture_spin_lock_write_lock_irq,
|
||||
.write_delay = torture_spin_lock_write_delay,
|
||||
.writeunlock = torture_lock_spin_write_unlock_irq,
|
||||
.name = "spin_lock_irq"
|
||||
};
|
||||
|
||||
/*
|
||||
* Lock torture writer kthread. Repeatedly acquires and releases
|
||||
* the lock, checking for duplicate acquisitions.
|
||||
*/
|
||||
static int lock_torture_writer(void *arg)
|
||||
{
|
||||
struct lock_writer_stress_stats *lwsp = arg;
|
||||
static DEFINE_TORTURE_RANDOM(rand);
|
||||
|
||||
VERBOSE_TOROUT_STRING("lock_torture_writer task started");
|
||||
set_user_nice(current, 19);
|
||||
|
||||
do {
|
||||
schedule_timeout_uninterruptible(1);
|
||||
cur_ops->writelock();
|
||||
if (WARN_ON_ONCE(lock_is_write_held))
|
||||
lwsp->n_write_lock_fail++;
|
||||
lock_is_write_held = 1;
|
||||
lwsp->n_write_lock_acquired++;
|
||||
cur_ops->write_delay(&rand);
|
||||
lock_is_write_held = 0;
|
||||
cur_ops->writeunlock();
|
||||
stutter_wait("lock_torture_writer");
|
||||
} while (!torture_must_stop());
|
||||
torture_kthread_stopping("lock_torture_writer");
|
||||
return 0;
|
||||
}
|
||||
|
||||
/*
|
||||
* Create an lock-torture-statistics message in the specified buffer.
|
||||
*/
|
||||
static void lock_torture_printk(char *page)
|
||||
{
|
||||
bool fail = 0;
|
||||
int i;
|
||||
long max = 0;
|
||||
long min = lwsa[0].n_write_lock_acquired;
|
||||
long long sum = 0;
|
||||
|
||||
for (i = 0; i < nrealwriters_stress; i++) {
|
||||
if (lwsa[i].n_write_lock_fail)
|
||||
fail = true;
|
||||
sum += lwsa[i].n_write_lock_acquired;
|
||||
if (max < lwsa[i].n_write_lock_fail)
|
||||
max = lwsa[i].n_write_lock_fail;
|
||||
if (min > lwsa[i].n_write_lock_fail)
|
||||
min = lwsa[i].n_write_lock_fail;
|
||||
}
|
||||
page += sprintf(page, "%s%s ", torture_type, TORTURE_FLAG);
|
||||
page += sprintf(page,
|
||||
"Writes: Total: %lld Max/Min: %ld/%ld %s Fail: %d %s\n",
|
||||
sum, max, min, max / 2 > min ? "???" : "",
|
||||
fail, fail ? "!!!" : "");
|
||||
if (fail)
|
||||
atomic_inc(&n_lock_torture_errors);
|
||||
}
|
||||
|
||||
/*
|
||||
* Print torture statistics. Caller must ensure that there is only one
|
||||
* call to this function at a given time!!! This is normally accomplished
|
||||
* by relying on the module system to only have one copy of the module
|
||||
* loaded, and then by giving the lock_torture_stats kthread full control
|
||||
* (or the init/cleanup functions when lock_torture_stats thread is not
|
||||
* running).
|
||||
*/
|
||||
static void lock_torture_stats_print(void)
|
||||
{
|
||||
int size = nrealwriters_stress * 200 + 8192;
|
||||
char *buf;
|
||||
|
||||
buf = kmalloc(size, GFP_KERNEL);
|
||||
if (!buf) {
|
||||
pr_err("lock_torture_stats_print: Out of memory, need: %d",
|
||||
size);
|
||||
return;
|
||||
}
|
||||
lock_torture_printk(buf);
|
||||
pr_alert("%s", buf);
|
||||
kfree(buf);
|
||||
}
|
||||
|
||||
/*
|
||||
* Periodically prints torture statistics, if periodic statistics printing
|
||||
* was specified via the stat_interval module parameter.
|
||||
*
|
||||
* No need to worry about fullstop here, since this one doesn't reference
|
||||
* volatile state or register callbacks.
|
||||
*/
|
||||
static int lock_torture_stats(void *arg)
|
||||
{
|
||||
VERBOSE_TOROUT_STRING("lock_torture_stats task started");
|
||||
do {
|
||||
schedule_timeout_interruptible(stat_interval * HZ);
|
||||
lock_torture_stats_print();
|
||||
torture_shutdown_absorb("lock_torture_stats");
|
||||
} while (!torture_must_stop());
|
||||
torture_kthread_stopping("lock_torture_stats");
|
||||
return 0;
|
||||
}
|
||||
|
||||
static inline void
|
||||
lock_torture_print_module_parms(struct lock_torture_ops *cur_ops,
|
||||
const char *tag)
|
||||
{
|
||||
pr_alert("%s" TORTURE_FLAG
|
||||
"--- %s: nwriters_stress=%d stat_interval=%d verbose=%d shuffle_interval=%d stutter=%d shutdown_secs=%d onoff_interval=%d onoff_holdoff=%d\n",
|
||||
torture_type, tag, nrealwriters_stress, stat_interval, verbose,
|
||||
shuffle_interval, stutter, shutdown_secs,
|
||||
onoff_interval, onoff_holdoff);
|
||||
}
|
||||
|
||||
static void lock_torture_cleanup(void)
|
||||
{
|
||||
int i;
|
||||
|
||||
if (torture_cleanup())
|
||||
return;
|
||||
|
||||
if (writer_tasks) {
|
||||
for (i = 0; i < nrealwriters_stress; i++)
|
||||
torture_stop_kthread(lock_torture_writer,
|
||||
writer_tasks[i]);
|
||||
kfree(writer_tasks);
|
||||
writer_tasks = NULL;
|
||||
}
|
||||
|
||||
torture_stop_kthread(lock_torture_stats, stats_task);
|
||||
lock_torture_stats_print(); /* -After- the stats thread is stopped! */
|
||||
|
||||
if (atomic_read(&n_lock_torture_errors))
|
||||
lock_torture_print_module_parms(cur_ops,
|
||||
"End of test: FAILURE");
|
||||
else if (torture_onoff_failures())
|
||||
lock_torture_print_module_parms(cur_ops,
|
||||
"End of test: LOCK_HOTPLUG");
|
||||
else
|
||||
lock_torture_print_module_parms(cur_ops,
|
||||
"End of test: SUCCESS");
|
||||
}
|
||||
|
||||
static int __init lock_torture_init(void)
|
||||
{
|
||||
int i;
|
||||
int firsterr = 0;
|
||||
static struct lock_torture_ops *torture_ops[] = {
|
||||
&lock_busted_ops, &spin_lock_ops, &spin_lock_irq_ops,
|
||||
};
|
||||
|
||||
torture_init_begin(torture_type, verbose, &locktorture_runnable);
|
||||
|
||||
/* Process args and tell the world that the torturer is on the job. */
|
||||
for (i = 0; i < ARRAY_SIZE(torture_ops); i++) {
|
||||
cur_ops = torture_ops[i];
|
||||
if (strcmp(torture_type, cur_ops->name) == 0)
|
||||
break;
|
||||
}
|
||||
if (i == ARRAY_SIZE(torture_ops)) {
|
||||
pr_alert("lock-torture: invalid torture type: \"%s\"\n",
|
||||
torture_type);
|
||||
pr_alert("lock-torture types:");
|
||||
for (i = 0; i < ARRAY_SIZE(torture_ops); i++)
|
||||
pr_alert(" %s", torture_ops[i]->name);
|
||||
pr_alert("\n");
|
||||
torture_init_end();
|
||||
return -EINVAL;
|
||||
}
|
||||
if (cur_ops->init)
|
||||
cur_ops->init(); /* no "goto unwind" prior to this point!!! */
|
||||
|
||||
if (nwriters_stress >= 0)
|
||||
nrealwriters_stress = nwriters_stress;
|
||||
else
|
||||
nrealwriters_stress = 2 * num_online_cpus();
|
||||
lock_torture_print_module_parms(cur_ops, "Start of test");
|
||||
|
||||
/* Initialize the statistics so that each run gets its own numbers. */
|
||||
|
||||
lock_is_write_held = 0;
|
||||
lwsa = kmalloc(sizeof(*lwsa) * nrealwriters_stress, GFP_KERNEL);
|
||||
if (lwsa == NULL) {
|
||||
VERBOSE_TOROUT_STRING("lwsa: Out of memory");
|
||||
firsterr = -ENOMEM;
|
||||
goto unwind;
|
||||
}
|
||||
for (i = 0; i < nrealwriters_stress; i++) {
|
||||
lwsa[i].n_write_lock_fail = 0;
|
||||
lwsa[i].n_write_lock_acquired = 0;
|
||||
}
|
||||
|
||||
/* Start up the kthreads. */
|
||||
|
||||
if (onoff_interval > 0) {
|
||||
firsterr = torture_onoff_init(onoff_holdoff * HZ,
|
||||
onoff_interval * HZ);
|
||||
if (firsterr)
|
||||
goto unwind;
|
||||
}
|
||||
if (shuffle_interval > 0) {
|
||||
firsterr = torture_shuffle_init(shuffle_interval);
|
||||
if (firsterr)
|
||||
goto unwind;
|
||||
}
|
||||
if (shutdown_secs > 0) {
|
||||
firsterr = torture_shutdown_init(shutdown_secs,
|
||||
lock_torture_cleanup);
|
||||
if (firsterr)
|
||||
goto unwind;
|
||||
}
|
||||
if (stutter > 0) {
|
||||
firsterr = torture_stutter_init(stutter);
|
||||
if (firsterr)
|
||||
goto unwind;
|
||||
}
|
||||
|
||||
writer_tasks = kzalloc(nrealwriters_stress * sizeof(writer_tasks[0]),
|
||||
GFP_KERNEL);
|
||||
if (writer_tasks == NULL) {
|
||||
VERBOSE_TOROUT_ERRSTRING("writer_tasks: Out of memory");
|
||||
firsterr = -ENOMEM;
|
||||
goto unwind;
|
||||
}
|
||||
for (i = 0; i < nrealwriters_stress; i++) {
|
||||
firsterr = torture_create_kthread(lock_torture_writer, &lwsa[i],
|
||||
writer_tasks[i]);
|
||||
if (firsterr)
|
||||
goto unwind;
|
||||
}
|
||||
if (stat_interval > 0) {
|
||||
firsterr = torture_create_kthread(lock_torture_stats, NULL,
|
||||
stats_task);
|
||||
if (firsterr)
|
||||
goto unwind;
|
||||
}
|
||||
torture_init_end();
|
||||
return 0;
|
||||
|
||||
unwind:
|
||||
torture_init_end();
|
||||
lock_torture_cleanup();
|
||||
return firsterr;
|
||||
}
|
||||
|
||||
module_init(lock_torture_init);
|
||||
module_exit(lock_torture_cleanup);
|
||||
+1
-1
@@ -309,7 +309,7 @@ int __blocking_notifier_call_chain(struct blocking_notifier_head *nh,
|
||||
* racy then it does not matter what the result of the test
|
||||
* is, we re-check the list after having taken the lock anyway:
|
||||
*/
|
||||
if (rcu_dereference_raw(nh->head)) {
|
||||
if (rcu_access_pointer(nh->head)) {
|
||||
down_read(&nh->rwsem);
|
||||
ret = notifier_call_chain(&nh->head, val, v, nr_to_call,
|
||||
nr_calls);
|
||||
|
||||
+1
-1
@@ -1,5 +1,5 @@
|
||||
obj-y += update.o srcu.o
|
||||
obj-$(CONFIG_RCU_TORTURE_TEST) += torture.o
|
||||
obj-$(CONFIG_RCU_TORTURE_TEST) += rcutorture.o
|
||||
obj-$(CONFIG_TREE_RCU) += tree.o
|
||||
obj-$(CONFIG_TREE_PREEMPT_RCU) += tree.o
|
||||
obj-$(CONFIG_TREE_RCU_TRACE) += tree_trace.o
|
||||
|
||||
+3
-4
@@ -12,8 +12,8 @@
|
||||
* GNU General Public License for more details.
|
||||
*
|
||||
* You should have received a copy of the GNU General Public License
|
||||
* along with this program; if not, write to the Free Software
|
||||
* Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
|
||||
* along with this program; if not, you can access it online at
|
||||
* http://www.gnu.org/licenses/gpl-2.0.html.
|
||||
*
|
||||
* Copyright IBM Corporation, 2011
|
||||
*
|
||||
@@ -23,6 +23,7 @@
|
||||
#ifndef __LINUX_RCU_H
|
||||
#define __LINUX_RCU_H
|
||||
|
||||
#include <trace/events/rcu.h>
|
||||
#ifdef CONFIG_RCU_TRACE
|
||||
#define RCU_TRACE(stmt) stmt
|
||||
#else /* #ifdef CONFIG_RCU_TRACE */
|
||||
@@ -116,8 +117,6 @@ static inline bool __rcu_reclaim(const char *rn, struct rcu_head *head)
|
||||
}
|
||||
}
|
||||
|
||||
extern int rcu_expedited;
|
||||
|
||||
#ifdef CONFIG_RCU_STALL_COMMON
|
||||
|
||||
extern int rcu_cpu_stall_suppress;
|
||||
|
||||
File diff suppressed because it is too large
+5
-6
@@ -12,8 +12,8 @@
|
||||
* GNU General Public License for more details.
|
||||
*
|
||||
* You should have received a copy of the GNU General Public License
|
||||
* along with this program; if not, write to the Free Software
|
||||
* Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
|
||||
* along with this program; if not, you can access it online at
|
||||
* http://www.gnu.org/licenses/gpl-2.0.html.
|
||||
*
|
||||
* Copyright (C) IBM Corporation, 2006
|
||||
* Copyright (C) Fujitsu, 2012
|
||||
@@ -36,8 +36,6 @@
|
||||
#include <linux/delay.h>
|
||||
#include <linux/srcu.h>
|
||||
|
||||
#include <trace/events/rcu.h>
|
||||
|
||||
#include "rcu.h"
|
||||
|
||||
/*
|
||||
@@ -398,7 +396,7 @@ void call_srcu(struct srcu_struct *sp, struct rcu_head *head,
|
||||
rcu_batch_queue(&sp->batch_queue, head);
|
||||
if (!sp->running) {
|
||||
sp->running = true;
|
||||
schedule_delayed_work(&sp->work, 0);
|
||||
queue_delayed_work(system_power_efficient_wq, &sp->work, 0);
|
||||
}
|
||||
spin_unlock_irqrestore(&sp->queue_lock, flags);
|
||||
}
|
||||
@@ -674,7 +672,8 @@ static void srcu_reschedule(struct srcu_struct *sp)
|
||||
}
|
||||
|
||||
if (pending)
|
||||
schedule_delayed_work(&sp->work, SRCU_INTERVAL);
|
||||
queue_delayed_work(system_power_efficient_wq,
|
||||
&sp->work, SRCU_INTERVAL);
|
||||
}
|
||||
|
||||
/*
|
||||
|
||||