Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu

Pull RCU updates from Paul E. McKenney:

  - Documentation updates.

  - Changes permitting use of call_rcu() and friends very early in
    boot, for example, before rcu_init() is invoked.

  - Miscellaneous fixes.

  - Add in-kernel API to enable and disable expediting of normal RCU
    grace periods.

  - Improve RCU's handling of (hotplug-) outgoing CPUs.

    Note: ARM support is lagging a bit here, and these improved
    diagnostics might generate (harmless) splats.

  - NO_HZ_FULL_SYSIDLE fixes.

  - Tiny RCU updates to make it more tiny.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Committed by Ingo Molnar on 2015-03-27 10:04:06 +01:00
30 changed files with 948 additions and 437 deletions
+23 -22
@@ -201,11 +201,11 @@ These routines add 1 and subtract 1, respectively, from the given
atomic_t and return the new counter value after the operation is
performed.
Unlike the above routines, it is required that explicit memory
barriers are performed before and after the operation. It must be
done such that all memory operations before and after the atomic
operation calls are strongly ordered with respect to the atomic
operation itself.
Unlike the above routines, it is required that these primitives
include explicit memory barriers that are performed before and after
the operation. It must be done such that all memory operations before
and after the atomic operation calls are strongly ordered with respect
to the atomic operation itself.
For example, it should behave as if a smp_mb() call existed both
before and after the atomic operation.
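To make the implied barriers concrete, here is a minimal sketch of the kind of code that relies on them (the 'obj' structure and its fields are hypothetical, for illustration only):

	obj->ready = 1;			  /* plain store preceding the atomic op */
	r = atomic_inc_return(&obj->cnt); /* acts as if bracketed by smp_mb()   */
	obj->stamp = r;			  /* cannot be reordered before the op   */

A reader that observes the incremented counter (through a suitably paired barrier) is therefore also guaranteed to observe the store to obj->ready.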
@@ -233,21 +233,21 @@ These two routines increment and decrement by 1, respectively, the
given atomic counter. They return a boolean indicating whether the
resulting counter value was zero or not.
It requires explicit memory barrier semantics around the operation as
above.
Again, these primitives provide explicit memory barrier semantics around
the atomic operation.
int atomic_sub_and_test(int i, atomic_t *v);
This is identical to atomic_dec_and_test() except that an explicit
decrement is given instead of the implicit "1". It requires explicit
memory barrier semantics around the operation.
decrement is given instead of the implicit "1". This primitive must
provide explicit memory barrier semantics around the operation.
int atomic_add_negative(int i, atomic_t *v);
The given increment is added to the given atomic counter value. A
boolean is returned which indicates whether the resulting counter value
is negative. It requires explicit memory barrier semantics around the
operation.
The given increment is added to the given atomic counter value. A boolean
is returned which indicates whether the resulting counter value is negative.
This primitive must provide explicit memory barrier semantics around
the operation.
Then:
@@ -257,7 +257,7 @@ This performs an atomic exchange operation on the atomic variable v, setting
the given new value. It returns the old value that the atomic variable v had
just before the operation.
atomic_xchg requires explicit memory barriers around the operation.
atomic_xchg must provide explicit memory barriers around the operation.
int atomic_cmpxchg(atomic_t *v, int old, int new);
@@ -266,7 +266,7 @@ with the given old and new values. Like all atomic_xxx operations,
atomic_cmpxchg will only satisfy its atomicity semantics as long as all
other accesses of *v are performed through atomic_xxx operations.
atomic_cmpxchg requires explicit memory barriers around the operation.
atomic_cmpxchg must provide explicit memory barriers around the operation.
The semantics for atomic_cmpxchg are the same as those defined for 'cas'
below.
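To illustrate these semantics, a typical compare-and-swap retry loop looks roughly like the following (a sketch; the saturation bound SAT_MAX is invented for the example):

	int old, new;

	do {
		old = atomic_read(&v);
		new = old < SAT_MAX ? old + 1 : old;  /* saturating increment */
	} while (atomic_cmpxchg(&v, old, new) != old);

Note that, per the caveat above, every other access to v must also go through atomic_xxx operations for this loop to be sound.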
@@ -279,8 +279,8 @@ If the atomic value v is not equal to u, this function adds a to v, and
returns non zero. If v is equal to u then it returns zero. This is done as
an atomic operation.
atomic_add_unless requires explicit memory barriers around the operation
unless it fails (returns 0).
atomic_add_unless must provide explicit memory barriers around the
operation unless it fails (returns 0).
atomic_inc_not_zero, equivalent to atomic_add_unless(v, 1, 0)
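A common use of atomic_add_unless()/atomic_inc_not_zero() is taking a reference only when the object is still live, for example during an RCU-protected lookup (a sketch; the table and field names are hypothetical):

	rcu_read_lock();
	obj = rcu_dereference(table[idx]);
	if (obj && !atomic_inc_not_zero(&obj->refcnt))
		obj = NULL;	/* refcount already zero: object being torn down */
	rcu_read_unlock();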
@@ -460,9 +460,9 @@ the return value into an int. There are other places where things
like this occur as well.
These routines, like the atomic_t counter operations returning values,
require explicit memory barrier semantics around their execution. All
memory operations before the atomic bit operation call must be made
visible globally before the atomic bit operation is made visible.
must provide explicit memory barrier semantics around their execution.
All memory operations before the atomic bit operation call must be
made visible globally before the atomic bit operation is made visible.
Likewise, the atomic bit operation must be visible globally before any
subsequent memory operation is made visible. For example:
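The file's own example is cut off by this view; the guarantee described above amounts to something like the following sketch (fields hypothetical):

	obj->dead = 1;				/* globally visible before ...      */
	if (test_and_set_bit(0, &obj->flags))	/* ... this atomic bit operation    */
		return;
	obj->cleanup = 1;			/* ... which is visible before this */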
@@ -536,8 +536,9 @@ except that two underscores are prefixed to the interface name.
These non-atomic variants also do not require any special memory
barrier semantics.
The routines xchg() and cmpxchg() need the same exact memory barriers
as the atomic and bit operations returning values.
The routines xchg() and cmpxchg() must provide the same exact
memory-barrier semantics as the atomic and bit operations returning
values.
Spinlocks and rwlocks have memory barrier expectations as well.
The rule to follow is simple:
+15 -5
@@ -2968,6 +2968,12 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
Set maximum number of finished RCU callbacks to
process in one batch.
rcutree.gp_init_delay= [KNL]
Set the number of jiffies to delay each step of
RCU grace-period initialization. This only has
effect when CONFIG_RCU_TORTURE_TEST_SLOW_INIT is
set.
rcutree.rcu_fanout_leaf= [KNL]
Increase the number of CPUs assigned to each
leaf rcu_node structure. Useful for very large
@@ -2991,11 +2997,15 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
value is one, and maximum value is HZ.
rcutree.kthread_prio= [KNL,BOOT]
Set the SCHED_FIFO priority of the RCU
per-CPU kthreads (rcuc/N). This value is also
used for the priority of the RCU boost threads
(rcub/N). Valid values are 1-99 and the default
is 1 (the least-favored priority).
Set the SCHED_FIFO priority of the RCU per-CPU
kthreads (rcuc/N). This value is also used for
the priority of the RCU boost threads (rcub/N)
and for the RCU grace-period kthreads (rcu_bh,
rcu_preempt, and rcu_sched). If RCU_BOOST is
set, valid values are 1-99 and the default is 1
(the least-favored priority). Otherwise, when
RCU_BOOST is not set, valid values are 0-99 and
the default is zero (non-realtime operation).
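As an illustration (not part of the file), a boot command line combining two of the parameters shown in these hunks might read:

	rcutree.kthread_prio=10 rcutree.rcu_fanout_leaf=32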
rcutree.rcu_nocb_leader_stride= [KNL]
Set the number of NOCB kthread groups, which
+21 -13
@@ -190,20 +190,24 @@ To reduce its OS jitter, do any of the following:
on each CPU, including cs_dbs_timer() and od_dbs_timer().
WARNING: Please check your CPU specifications to
make sure that this is safe on your particular system.
d. It is not possible to entirely get rid of OS jitter
from vmstat_update() on CONFIG_SMP=y systems, but you
can decrease its frequency by writing a large value
to /proc/sys/vm/stat_interval. The default value is
HZ, for an interval of one second. Of course, larger
values will make your virtual-memory statistics update
more slowly. Of course, you can also run your workload
at a real-time priority, thus preempting vmstat_update(),
d. As of v3.18, Christoph Lameter's on-demand vmstat workers
commit prevents OS jitter due to vmstat_update() on
CONFIG_SMP=y systems. Before v3.18, it was not possible
to entirely get rid of the OS jitter, but you can
decrease its frequency by writing a large value to
/proc/sys/vm/stat_interval. The default value is HZ,
for an interval of one second. Of course, larger values
will make your virtual-memory statistics update more
slowly. Of course, you can also run your workload at
a real-time priority, thus preempting vmstat_update(),
but if your workload is CPU-bound, this is a bad idea.
However, there is an RFC patch from Christoph Lameter
(based on an earlier one from Gilad Ben-Yossef) that
reduces or even eliminates vmstat overhead for some
workloads at https://lkml.org/lkml/2013/9/4/379.
e. If running on high-end powerpc servers, build with
e. Boot with "elevator=noop" to avoid workqueue use by
the block layer.
f. If running on high-end powerpc servers, build with
CONFIG_PPC_RTAS_DAEMON=n. This prevents the RTAS
daemon from running on each CPU every second or so.
(This will require editing Kconfig files and will defeat
@@ -211,12 +215,12 @@ To reduce its OS jitter, do any of the following:
due to the rtas_event_scan() function.
WARNING: Please check your CPU specifications to
make sure that this is safe on your particular system.
f. If running on Cell Processor, build your kernel with
g. If running on Cell Processor, build your kernel with
CBE_CPUFREQ_SPU_GOVERNOR=n to avoid OS jitter from
spu_gov_work().
WARNING: Please check your CPU specifications to
make sure that this is safe on your particular system.
g. If running on PowerMAC, build your kernel with
h. If running on PowerMAC, build your kernel with
CONFIG_PMAC_RACKMETER=n to disable the CPU-meter,
avoiding OS jitter from rackmeter_do_timer().
@@ -258,8 +262,12 @@ Purpose: Detect software lockups on each CPU.
To reduce its OS jitter, do at least one of the following:
1. Build with CONFIG_LOCKUP_DETECTOR=n, which will prevent these
kthreads from being created in the first place.
2. Echo a zero to /proc/sys/kernel/watchdog to disable the
2. Boot with "nosoftlockup=0", which will also prevent these kthreads
from being created. Other related watchdog and softlockup boot
parameters may be found in Documentation/kernel-parameters.txt
and Documentation/watchdog/watchdog-parameters.txt.
3. Echo a zero to /proc/sys/kernel/watchdog to disable the
watchdog timer.
3. Echo a large number to /proc/sys/kernel/watchdog_thresh in
4. Echo a large number to /proc/sys/kernel/watchdog_thresh in
order to reduce the frequency of OS jitter due to the watchdog
timer down to a level that is acceptable for your workload.
+29 -13
@@ -592,9 +592,9 @@ See also the subsection on "Cache Coherency" for a more thorough example.
CONTROL DEPENDENCIES
--------------------
A control dependency requires a full read memory barrier, not simply a data
dependency barrier to make it work correctly. Consider the following bit of
code:
A load-load control dependency requires a full read memory barrier, not
simply a data dependency barrier to make it work correctly. Consider the
following bit of code:
q = ACCESS_ONCE(a);
if (q) {
@@ -615,14 +615,15 @@ case what's actually required is:
}
However, stores are not speculated. This means that ordering -is- provided
in the following example:
for load-store control dependencies, as in the following example:
q = ACCESS_ONCE(a);
if (q) {
ACCESS_ONCE(b) = p;
}
Please note that ACCESS_ONCE() is not optional! Without the
Control dependencies pair normally with other types of barriers.
That said, please note that ACCESS_ONCE() is not optional! Without the
ACCESS_ONCE(), the compiler might combine the load from 'a' with other loads from
'a', and the store to 'b' with other stores to 'b', with possible highly
counterintuitive effects on ordering.
@@ -813,6 +814,8 @@ In summary:
barrier() can help to preserve your control dependency. Please
see the Compiler Barrier section for more information.
(*) Control dependencies pair normally with other types of barriers.
(*) Control dependencies do -not- provide transitivity. If you
need transitivity, use smp_mb().
@@ -823,14 +826,14 @@ SMP BARRIER PAIRING
When dealing with CPU-CPU interactions, certain types of memory barrier should
always be paired. A lack of appropriate pairing is almost certainly an error.
General barriers pair with each other, though they also pair with
most other types of barriers, albeit without transitivity. An acquire
barrier pairs with a release barrier, but both may also pair with other
barriers, including of course general barriers. A write barrier pairs
with a data dependency barrier, an acquire barrier, a release barrier,
a read barrier, or a general barrier. Similarly a read barrier or a
data dependency barrier pairs with a write barrier, an acquire barrier,
a release barrier, or a general barrier:
General barriers pair with each other, though they also pair with most
other types of barriers, albeit without transitivity. An acquire barrier
pairs with a release barrier, but both may also pair with other barriers,
including of course general barriers. A write barrier pairs with a data
dependency barrier, a control dependency, an acquire barrier, a release
barrier, a read barrier, or a general barrier. Similarly a read barrier,
control dependency, or a data dependency barrier pairs with a write
barrier, an acquire barrier, a release barrier, or a general barrier:
CPU 1 CPU 2
=============== ===============
@@ -850,6 +853,19 @@ Or:
<data dependency barrier>
y = *x;
Or even:
CPU 1                 CPU 2
===============       ===============================
r1 = ACCESS_ONCE(y);
<general barrier>
ACCESS_ONCE(x) = 1;   if (r2 = ACCESS_ONCE(x)) {
                         <implicit control dependency>
                         ACCESS_ONCE(y) = 1;
                      }

assert(r1 == 0 || r2 == 0);
Basically, the read barrier always has to be there, even though it can be of
the "weaker" type.
+3 -7
@@ -158,13 +158,9 @@ not come for free:
to the need to inform kernel subsystems (such as RCU) about
the change in mode.
3. POSIX CPU timers on adaptive-tick CPUs may miss their deadlines
(perhaps indefinitely) because they currently rely on
scheduling-tick interrupts. This will likely be fixed in
one of two ways: (1) Prevent CPUs with POSIX CPU timers from
entering adaptive-tick mode, or (2) Use hrtimers or other
adaptive-ticks-immune mechanism to cause the POSIX CPU timer to
fire properly.
3. POSIX CPU timers prevent CPUs from entering adaptive-tick mode.
Real-time applications needing to take actions based on CPU time
consumption need to use other means of doing so.
4. If there are more perf events pending than the hardware can
accommodate, they are normally round-robined so as to collect
+2 -4
@@ -413,16 +413,14 @@ int __cpu_disable(void)
return 0;
}
static DECLARE_COMPLETION(cpu_killed);
int __cpu_die(unsigned int cpu)
{
return wait_for_completion_timeout(&cpu_killed, 5000);
return cpu_wait_death(cpu, 5);
}
void cpu_die(void)
{
complete(&cpu_killed);
(void)cpu_report_death();
atomic_dec(&init_mm.mm_users);
atomic_dec(&init_mm.mm_count);
+2 -3
@@ -261,7 +261,6 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
}
#ifdef CONFIG_HOTPLUG_CPU
static DECLARE_COMPLETION(cpu_killed);
/*
* __cpu_disable runs on the processor to be shutdown.
@@ -299,7 +298,7 @@ int __cpu_disable(void)
*/
void __cpu_die(unsigned int cpu)
{
if (!wait_for_completion_timeout(&cpu_killed, msecs_to_jiffies(1)))
if (!cpu_wait_death(cpu, 1))
pr_err("CPU%u: unable to kill\n", cpu);
}
@@ -314,7 +313,7 @@ void cpu_die(void)
local_irq_disable();
idle_task_exit();
complete(&cpu_killed);
(void)cpu_report_death();
asm ("XOR TXENABLE, D0Re0,D0Re0\n");
}
-2
@@ -34,8 +34,6 @@ extern int _debug_hotplug_cpu(int cpu, int action);
#endif
#endif
DECLARE_PER_CPU(int, cpu_state);
int mwait_usable(const struct cpuinfo_x86 *);
#endif /* _ASM_X86_CPU_H */
+1 -1
@@ -150,12 +150,12 @@ static inline void arch_send_call_function_ipi_mask(const struct cpumask *mask)
}
void cpu_disable_common(void);
void cpu_die_common(unsigned int cpu);
void native_smp_prepare_boot_cpu(void);
void native_smp_prepare_cpus(unsigned int max_cpus);
void native_smp_cpus_done(unsigned int max_cpus);
int native_cpu_up(unsigned int cpunum, struct task_struct *tidle);
int native_cpu_disable(void);
int common_cpu_die(unsigned int cpu);
void native_cpu_die(unsigned int cpu);
void native_play_dead(void);
void play_dead_common(void);
+18 -21
@@ -77,9 +77,6 @@
#include <asm/realmode.h>
#include <asm/misc.h>
/* State of each CPU */
DEFINE_PER_CPU(int, cpu_state) = { 0 };
/* Number of siblings per CPU package */
int smp_num_siblings = 1;
EXPORT_SYMBOL(smp_num_siblings);
@@ -257,7 +254,7 @@ static void notrace start_secondary(void *unused)
lock_vector_lock();
set_cpu_online(smp_processor_id(), true);
unlock_vector_lock();
per_cpu(cpu_state, smp_processor_id()) = CPU_ONLINE;
cpu_set_state_online(smp_processor_id());
x86_platform.nmi_init();
/* enable local interrupts */
@@ -948,7 +945,10 @@ int native_cpu_up(unsigned int cpu, struct task_struct *tidle)
*/
mtrr_save_state();
per_cpu(cpu_state, cpu) = CPU_UP_PREPARE;
/* x86 CPUs take themselves offline, so delayed offline is OK. */
err = cpu_check_up_prepare(cpu);
if (err && err != -EBUSY)
return err;
/* the FPU context is blank, nobody can own it */
__cpu_disable_lazy_restore(cpu);
@@ -1191,7 +1191,7 @@ void __init native_smp_prepare_boot_cpu(void)
switch_to_new_gdt(me);
/* already set me in cpu_online_mask in boot_cpu_init() */
cpumask_set_cpu(me, cpu_callout_mask);
per_cpu(cpu_state, me) = CPU_ONLINE;
cpu_set_state_online(me);
}
void __init native_smp_cpus_done(unsigned int max_cpus)
@@ -1318,14 +1318,10 @@ static void __ref remove_cpu_from_maps(int cpu)
numa_remove_cpu(cpu);
}
static DEFINE_PER_CPU(struct completion, die_complete);
void cpu_disable_common(void)
{
int cpu = smp_processor_id();
init_completion(&per_cpu(die_complete, smp_processor_id()));
remove_siblinginfo(cpu);
/* It's now safe to remove this processor from the online map */
@@ -1349,24 +1345,27 @@ int native_cpu_disable(void)
return 0;
}
void cpu_die_common(unsigned int cpu)
int common_cpu_die(unsigned int cpu)
{
wait_for_completion_timeout(&per_cpu(die_complete, cpu), HZ);
}
int ret = 0;
void native_cpu_die(unsigned int cpu)
{
/* We don't do anything here: idle task is faking death itself. */
cpu_die_common(cpu);
/* They ack this in play_dead() by setting CPU_DEAD */
if (per_cpu(cpu_state, cpu) == CPU_DEAD) {
if (cpu_wait_death(cpu, 5)) {
if (system_state == SYSTEM_RUNNING)
pr_info("CPU %u is now offline\n", cpu);
} else {
pr_err("CPU %u didn't die...\n", cpu);
ret = -1;
}
return ret;
}
void native_cpu_die(unsigned int cpu)
{
common_cpu_die(cpu);
}
void play_dead_common(void)
@@ -1375,10 +1374,8 @@ void play_dead_common(void)
reset_lazy_tlbstate();
amd_e400_remove_cpu(raw_smp_processor_id());
mb();
/* Ack it */
__this_cpu_write(cpu_state, CPU_DEAD);
complete(&per_cpu(die_complete, smp_processor_id()));
(void)cpu_report_death();
/*
* With physical CPU hotplug, we should halt the cpu
+25 -21
@@ -90,14 +90,10 @@ static void cpu_bringup(void)
set_cpu_online(cpu, true);
this_cpu_write(cpu_state, CPU_ONLINE);
wmb();
cpu_set_state_online(cpu); /* Implies full memory barrier. */
/* We can take interrupts now: we're officially "up". */
local_irq_enable();
wmb(); /* make sure everything is out */
}
/*
@@ -459,7 +455,13 @@ static int xen_cpu_up(unsigned int cpu, struct task_struct *idle)
xen_setup_timer(cpu);
xen_init_lock_cpu(cpu);
per_cpu(cpu_state, cpu) = CPU_UP_PREPARE;
/*
* PV VCPUs are always successfully taken down (see 'while' loop
* in xen_cpu_die()), so -EBUSY is an error.
*/
rc = cpu_check_up_prepare(cpu);
if (rc)
return rc;
/* make sure interrupts start blocked */
per_cpu(xen_vcpu, cpu)->evtchn_upcall_mask = 1;
@@ -479,10 +481,8 @@ static int xen_cpu_up(unsigned int cpu, struct task_struct *idle)
rc = HYPERVISOR_vcpu_op(VCPUOP_up, cpu, NULL);
BUG_ON(rc);
while(per_cpu(cpu_state, cpu) != CPU_ONLINE) {
while (cpu_report_state(cpu) != CPU_ONLINE)
HYPERVISOR_sched_op(SCHEDOP_yield, NULL);
barrier();
}
return 0;
}
@@ -511,11 +511,11 @@ static void xen_cpu_die(unsigned int cpu)
schedule_timeout(HZ/10);
}
cpu_die_common(cpu);
xen_smp_intr_free(cpu);
xen_uninit_lock_cpu(cpu);
xen_teardown_timer(cpu);
if (common_cpu_die(cpu) == 0) {
xen_smp_intr_free(cpu);
xen_uninit_lock_cpu(cpu);
xen_teardown_timer(cpu);
}
}
static void xen_play_dead(void) /* used only with HOTPLUG_CPU */
@@ -747,6 +747,16 @@ static void __init xen_hvm_smp_prepare_cpus(unsigned int max_cpus)
static int xen_hvm_cpu_up(unsigned int cpu, struct task_struct *tidle)
{
int rc;
/*
* This can happen if CPU was offlined earlier and
* offlining timed out in common_cpu_die().
*/
if (cpu_report_state(cpu) == CPU_DEAD_FROZEN) {
xen_smp_intr_free(cpu);
xen_uninit_lock_cpu(cpu);
}
/*
* xen_smp_intr_init() needs to run before native_cpu_up()
* so that IPI vectors are set up on the booting CPU before
@@ -768,12 +778,6 @@ static int xen_hvm_cpu_up(unsigned int cpu, struct task_struct *tidle)
return rc;
}
static void xen_hvm_cpu_die(unsigned int cpu)
{
xen_cpu_die(cpu);
native_cpu_die(cpu);
}
void __init xen_hvm_smp_init(void)
{
if (!xen_have_vector_callback)
@@ -781,7 +785,7 @@ void __init xen_hvm_smp_init(void)
smp_ops.smp_prepare_cpus = xen_hvm_smp_prepare_cpus;
smp_ops.smp_send_reschedule = xen_smp_send_reschedule;
smp_ops.cpu_up = xen_hvm_cpu_up;
smp_ops.cpu_die = xen_hvm_cpu_die;
smp_ops.cpu_die = xen_cpu_die;
smp_ops.send_call_func_ipi = xen_smp_send_call_function_ipi;
smp_ops.send_call_func_single_ipi = xen_smp_send_call_function_single_ipi;
smp_ops.smp_prepare_boot_cpu = xen_smp_prepare_boot_cpu;
+14
@@ -95,6 +95,10 @@ enum {
* Called on the new cpu, just before
* enabling interrupts. Must not sleep,
* must not fail */
#define CPU_DYING_IDLE 0x000B /* CPU (unsigned)v dying, reached
* idle loop. */
#define CPU_BROKEN 0x000C /* CPU (unsigned)v did not die properly,
* perhaps due to preemption. */
/* Used for CPU hotplug events occurring while tasks are frozen due to a suspend
* operation in progress
@@ -271,4 +275,14 @@ void arch_cpu_idle_enter(void);
void arch_cpu_idle_exit(void);
void arch_cpu_idle_dead(void);
DECLARE_PER_CPU(bool, cpu_dead_idle);
int cpu_report_state(int cpu);
int cpu_check_up_prepare(int cpu);
void cpu_set_state_online(int cpu);
#ifdef CONFIG_HOTPLUG_CPU
bool cpu_wait_death(unsigned int cpu, int seconds);
bool cpu_report_death(void);
#endif /* #ifdef CONFIG_HOTPLUG_CPU */
#endif /* _LINUX_CPU_H_ */
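Matching the arch conversions earlier in this diff, the intended split of the new hotplug API looks roughly like the following (a sketch with generic placeholder hooks, not any particular arch's code):

	/* On a surviving CPU, in the arch's __cpu_die() hook: */
	int __cpu_die(unsigned int cpu)
	{
		if (!cpu_wait_death(cpu, 5))	/* wait up to five seconds */
			return -1;		/* dying CPU timed out */
		return 0;
	}

	/* On the dying CPU, as its last act before halting: */
	void cpu_die(void)
	{
		(void)cpu_report_death();	/* wakes cpu_wait_death() above */
		/* ... arch-specific low-power halt ... */
	}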
+6 -1
@@ -531,8 +531,13 @@ do { \
# define might_lock_read(lock) do { } while (0)
#endif
#ifdef CONFIG_PROVE_RCU
#ifdef CONFIG_LOCKDEP
void lockdep_rcu_suspicious(const char *file, const int line, const char *s);
#else
static inline void
lockdep_rcu_suspicious(const char *file, const int line, const char *s)
{
}
#endif
#endif /* __LINUX_LOCKDEP_H */
+36 -4
@@ -48,6 +48,26 @@
extern int rcu_expedited; /* for sysctl */
#ifdef CONFIG_TINY_RCU
/* Tiny RCU doesn't expedite, as its purpose in life is instead to be tiny. */
static inline bool rcu_gp_is_expedited(void) /* Internal RCU use. */
{
return false;
}
static inline void rcu_expedite_gp(void)
{
}
static inline void rcu_unexpedite_gp(void)
{
}
#else /* #ifdef CONFIG_TINY_RCU */
bool rcu_gp_is_expedited(void); /* Internal RCU use. */
void rcu_expedite_gp(void);
void rcu_unexpedite_gp(void);
#endif /* #else #ifdef CONFIG_TINY_RCU */
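These new calls nest, as the rcutorture changes later in this diff exercise; a minimal usage sketch (hypothetical caller):

	rcu_expedite_gp();	/* subsequent normal grace periods are expedited */
	synchronize_rcu();	/* completes quickly, as if expedited */
	rcu_unexpedite_gp();	/* must balance rcu_expedite_gp(); calls may nest */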
enum rcutorture_type {
RCU_FLAVOR,
RCU_BH_FLAVOR,
@@ -195,6 +215,15 @@ void call_rcu_sched(struct rcu_head *head,
void synchronize_sched(void);
/*
* Structure allowing asynchronous waiting on RCU.
*/
struct rcu_synchronize {
struct rcu_head head;
struct completion completion;
};
void wakeme_after_rcu(struct rcu_head *head);
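Publishing this pair lets callers wait synchronously for a grace period, the pattern that srcu.c previously open-coded (see its removal near the end of this diff); roughly:

	struct rcu_synchronize rcu;

	init_rcu_head_on_stack(&rcu.head);
	init_completion(&rcu.completion);
	call_rcu(&rcu.head, wakeme_after_rcu);	/* callback calls complete()  */
	wait_for_completion(&rcu.completion);	/* blocks for a grace period  */
	destroy_rcu_head_on_stack(&rcu.head);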
/**
* call_rcu_tasks() - Queue an RCU callback for invocation after a task-based grace period
* @head: structure to be used for queueing the RCU updates.
@@ -258,6 +287,7 @@ static inline int rcu_preempt_depth(void)
/* Internal to kernel */
void rcu_init(void);
void rcu_end_inkernel_boot(void);
void rcu_sched_qs(void);
void rcu_bh_qs(void);
void rcu_check_callbacks(int user);
@@ -266,6 +296,8 @@ void rcu_idle_enter(void);
void rcu_idle_exit(void);
void rcu_irq_enter(void);
void rcu_irq_exit(void);
int rcu_cpu_notify(struct notifier_block *self,
unsigned long action, void *hcpu);
#ifdef CONFIG_RCU_STALL_COMMON
void rcu_sysrq_start(void);
@@ -720,7 +752,7 @@ static inline void rcu_preempt_sleep_check(void)
* annotated as __rcu.
*/
#define rcu_dereference_check(p, c) \
__rcu_dereference_check((p), rcu_read_lock_held() || (c), __rcu)
__rcu_dereference_check((p), (c) || rcu_read_lock_held(), __rcu)
/**
* rcu_dereference_bh_check() - rcu_dereference_bh with debug checking
@@ -730,7 +762,7 @@ static inline void rcu_preempt_sleep_check(void)
* This is the RCU-bh counterpart to rcu_dereference_check().
*/
#define rcu_dereference_bh_check(p, c) \
__rcu_dereference_check((p), rcu_read_lock_bh_held() || (c), __rcu)
__rcu_dereference_check((p), (c) || rcu_read_lock_bh_held(), __rcu)
/**
* rcu_dereference_sched_check() - rcu_dereference_sched with debug checking
@@ -740,7 +772,7 @@ static inline void rcu_preempt_sleep_check(void)
* This is the RCU-sched counterpart to rcu_dereference_check().
*/
#define rcu_dereference_sched_check(p, c) \
__rcu_dereference_check((p), rcu_read_lock_sched_held() || (c), \
__rcu_dereference_check((p), (c) || rcu_read_lock_sched_held(), \
__rcu)
#define rcu_dereference_raw(p) rcu_dereference_check(p, 1) /*@@@ needed? @@@*/
@@ -933,9 +965,9 @@ static inline void rcu_read_unlock(void)
{
rcu_lockdep_assert(rcu_is_watching(),
"rcu_read_unlock() used illegally while idle");
rcu_lock_release(&rcu_lock_map);
__release(RCU);
__rcu_read_unlock();
rcu_lock_release(&rcu_lock_map); /* Keep acq info for rls diags. */
}
/**
+1 -1
@@ -182,7 +182,7 @@ static inline int srcu_read_lock_held(struct srcu_struct *sp)
* lockdep_is_held() calls.
*/
#define srcu_dereference_check(p, sp, c) \
__rcu_dereference_check((p), srcu_read_lock_held(sp) || (c), __rcu)
__rcu_dereference_check((p), (c) || srcu_read_lock_held(sp), __rcu)
/**
* srcu_dereference - fetch SRCU-protected pointer for later dereferencing
+13
@@ -791,6 +791,19 @@ config RCU_NOCB_CPU_ALL
endchoice
config RCU_EXPEDITE_BOOT
bool
default n
help
This option enables expedited grace periods at boot time,
as if rcu_expedite_gp() had been invoked early in boot.
The corresponding rcu_unexpedite_gp() is invoked from
rcu_end_inkernel_boot(), which is intended to be invoked
at the end of the kernel-only boot sequence, just before
init is exec'ed.
Accept the default if unsure.
endmenu # "RCU Subsystem"
config BUILD_BIN2C
+3 -1
@@ -408,8 +408,10 @@ static int __ref _cpu_down(unsigned int cpu, int tasks_frozen)
*
* Wait for the stop thread to go away.
*/
while (!idle_cpu(cpu))
while (!per_cpu(cpu_dead_idle, cpu))
cpu_relax();
smp_mb(); /* Read from cpu_dead_idle before __cpu_die(). */
per_cpu(cpu_dead_idle, cpu) = false;
/* This actually kills the CPU. */
__cpu_die(cpu);
+26 -1
@@ -853,6 +853,8 @@ rcu_torture_fqs(void *arg)
static int
rcu_torture_writer(void *arg)
{
bool can_expedite = !rcu_gp_is_expedited();
int expediting = 0;
unsigned long gp_snap;
bool gp_cond1 = gp_cond, gp_exp1 = gp_exp, gp_normal1 = gp_normal;
bool gp_sync1 = gp_sync;
@@ -865,9 +867,15 @@ rcu_torture_writer(void *arg)
int nsynctypes = 0;
VERBOSE_TOROUT_STRING("rcu_torture_writer task started");
pr_alert("%s" TORTURE_FLAG
" Grace periods expedited from boot/sysfs for %s,\n",
torture_type, cur_ops->name);
pr_alert("%s" TORTURE_FLAG
" Testing of dynamic grace-period expediting diabled.\n",
torture_type);
/* Initialize synctype[] array. If none set, take default. */
if (!gp_cond1 && !gp_exp1 && !gp_normal1 && !gp_sync)
if (!gp_cond1 && !gp_exp1 && !gp_normal1 && !gp_sync1)
gp_cond1 = gp_exp1 = gp_normal1 = gp_sync1 = true;
if (gp_cond1 && cur_ops->get_state && cur_ops->cond_sync)
synctype[nsynctypes++] = RTWS_COND_GET;
@@ -949,9 +957,26 @@ rcu_torture_writer(void *arg)
}
}
rcutorture_record_progress(++rcu_torture_current_version);
/* Cycle through nesting levels of rcu_expedite_gp() calls. */
if (can_expedite &&
!(torture_random(&rand) & 0xff & (!!expediting - 1))) {
WARN_ON_ONCE(expediting == 0 && rcu_gp_is_expedited());
if (expediting >= 0)
rcu_expedite_gp();
else
rcu_unexpedite_gp();
if (++expediting > 3)
expediting = -expediting;
}
rcu_torture_writer_state = RTWS_STUTTER;
stutter_wait("rcu_torture_writer");
} while (!torture_must_stop());
/* Reset expediting back to unexpedited. */
if (expediting > 0)
expediting = -expediting;
while (can_expedite && expediting++ < 0)
rcu_unexpedite_gp();
WARN_ON_ONCE(can_expedite && rcu_gp_is_expedited());
rcu_torture_writer_state = RTWS_STOPPING;
torture_kthread_stopping("rcu_torture_writer");
return 0;
+1 -18
@@ -402,23 +402,6 @@ void call_srcu(struct srcu_struct *sp, struct rcu_head *head,
}
EXPORT_SYMBOL_GPL(call_srcu);
struct rcu_synchronize {
struct rcu_head head;
struct completion completion;
};
/*
* Awaken the corresponding synchronize_srcu() instance now that a
* grace period has elapsed.
*/
static void wakeme_after_rcu(struct rcu_head *head)
{
struct rcu_synchronize *rcu;
rcu = container_of(head, struct rcu_synchronize, head);
complete(&rcu->completion);
}
static void srcu_advance_batches(struct srcu_struct *sp, int trycount);
static void srcu_reschedule(struct srcu_struct *sp);
@@ -507,7 +490,7 @@ static void __synchronize_srcu(struct srcu_struct *sp, int trycount)
*/
void synchronize_srcu(struct srcu_struct *sp)
{
__synchronize_srcu(sp, rcu_expedited
__synchronize_srcu(sp, rcu_gp_is_expedited()
? SYNCHRONIZE_SRCU_EXP_TRYCOUNT
: SYNCHRONIZE_SRCU_TRYCOUNT);
}
+1 -13
@@ -103,8 +103,7 @@ EXPORT_SYMBOL(__rcu_is_watching);
static int rcu_qsctr_help(struct rcu_ctrlblk *rcp)
{
RCU_TRACE(reset_cpu_stall_ticks(rcp));
if (rcp->rcucblist != NULL &&
rcp->donetail != rcp->curtail) {
if (rcp->donetail != rcp->curtail) {
rcp->donetail = rcp->curtail;
return 1;
}
@@ -169,17 +168,6 @@ static void __rcu_process_callbacks(struct rcu_ctrlblk *rcp)
unsigned long flags;
RCU_TRACE(int cb_count = 0);
/* If no RCU callbacks ready to invoke, just return. */
if (&rcp->rcucblist == rcp->donetail) {
RCU_TRACE(trace_rcu_batch_start(rcp->name, 0, 0, -1));
RCU_TRACE(trace_rcu_batch_end(rcp->name, 0,
!!ACCESS_ONCE(rcp->rcucblist),
need_resched(),
is_idle_task(current),
false));
return;
}
/* Move the ready-to-invoke callbacks to a local list. */
local_irq_save(flags);
RCU_TRACE(trace_rcu_batch_start(rcp->name, 0, rcp->qlen, -1));

Some files were not shown because too many files have changed in this diff.