Commit Graph

85 Commits

Author SHA1 Message Date
Joel Fernandes (Google)
189a6883dc rcu: Remove kfree_call_rcu_nobatch()
Now that the kfree_rcu() special-casing has been removed from tree RCU,
this commit removes kfree_call_rcu_nobatch() since it is no longer needed.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24 10:24:31 -08:00
Byungchul Park
a35d16905e rcu: Add basic support for kfree_rcu() batching
Recently a discussion about stability and performance of a system
involving a high rate of kfree_rcu() calls surfaced on the list [1]
which led to another discussion how to prepare for this situation.

This patch adds basic batching support for kfree_rcu(). It is "basic"
because we do none of the slab management, dynamic allocation, code
moving or any of the other things, some of which previous attempts did
[2]. These fancier improvements can be follow-up patches and there are
different ideas being discussed in those regards. This is an effort to
start simple, and build up from there. In the future, an extension to
use kfree_bulk and possibly per-slab batching could be done to further
improve performance due to cache-locality and slab-specific bulk free
optimizations. By using an array of pointers, the worker thread
processing the work would need to read lesser data since it does not
need to deal with large rcu_head(s) any longer.

Torture tests follow in the next patch and show improvements of around
5x reduction in number of  grace periods on a 16 CPU system. More
details and test data are in that patch.

There is an implication with rcu_barrier() with this patch. Since the
kfree_rcu() calls can be batched, and may not be handed yet to the RCU
machinery in fact, the monitor may not have even run yet to do the
queue_rcu_work(), there seems no easy way of implementing rcu_barrier()
to wait for those kfree_rcu()s that are already made. So this means a
kfree_rcu() followed by an rcu_barrier() does not imply that memory will
be freed once rcu_barrier() returns.

Another implication is higher active memory usage (although not
run-away..) until the kfree_rcu() flooding ends, in comparison to
without batching. More details about this are in the second patch which
adds an rcuperf test.

Finally, in the near future we will get rid of kfree_rcu() special casing
within RCU such as in rcu_do_batch and switch everything to just
batching. Currently we don't do that since timer subsystem is not yet up
and we cannot schedule the kfree_rcu() monitor as the timer subsystem's
lock are not initialized. That would also mean getting rid of
kfree_call_rcu_nobatch() entirely.

[1] http://lore.kernel.org/lkml/20190723035725-mutt-send-email-mst@kernel.org
[2] https://lkml.org/lkml/2017/12/19/824

Cc: kernel-team@android.com
Cc: kernel-team@lge.com
Co-developed-by: Byungchul Park <byungchul.park@lge.com>
Signed-off-by: Byungchul Park <byungchul.park@lge.com>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
[ paulmck: Applied 0day and Paul Walmsley feedback on ->monitor_todo. ]
[ paulmck: Make it work during early boot. ]
[ paulmck: Add a crude early boot self-test. ]
[ paulmck: Style adjustments and experimental docbook structure header. ]
Link: https://lore.kernel.org/lkml/alpine.DEB.2.21.9999.1908161931110.32497@viisi.sifive.com/T/#me9956f66cb611b95d26ae92700e1d901f46e8c59
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24 10:17:03 -08:00
Paul E. McKenney
79ba7ff5a9 rcutorture: Emulate dyntick aspect of userspace nohz_full sojourn
During an actual call_rcu() flood, there would be frequent trips to
userspace (in-kernel call_rcu() floods must be otherwise housebroken).
Userspace execution on nohz_full CPUs implies an RCU dyntick idle/not-idle
transition pair, so this commit adds emulation of that pair.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2019-10-05 10:46:05 -07:00
Christoph Hellwig
24691069a3 rcu: Don't include <linux/ktime.h> in rcutiny.h
The kbuild reported a built failure due to a header loop when RCUTINY is
enabled with my pending riscv-nommu port.  Switch rcutiny.h to only
include the minimal required header to get HZ instead.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-08-26 16:27:01 -07:00
Paul E. McKenney
6c44212736 linux/rcutiny: Convert to SPDX license identifier
Replace the license boiler plate with a SPDX license identifier.
While in the area, update an email address.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
[ paulmck: Update .h SPDX format per Joe Perches. ]
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2019-02-09 08:45:46 -08:00
Paul E. McKenney
b56ada1209 Merge branches 'doc.2018.08.30a', 'dynticks.2018.08.30b', 'srcu.2018.08.30b' and 'torture.2018.08.29a' into HEAD
doc.2018.08.30a: Documentation updates
dynticks.2018.08.30b: RCU flavor consolidation updates and cleanups
srcu.2018.08.30b: SRCU updates
torture.2018.08.29a: Torture-test updates
2018-08-30 16:12:53 -07:00
Paul E. McKenney
31ab604bf3 rcu: Remove unused rcu_dynticks_snap() from Tiny RCU
The rcu_dynticks_snap() function is defined in include/linux/rcutiny.h,
but is no longer used by Tiny RCU.  This commit therefore removes it.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:46 -07:00
Paul E. McKenney
a8bb74acd8 rcu: Consolidate RCU-sched update-side function definitions
This commit saves a few lines by consolidating the RCU-sched function
definitions at the end of include/linux/rcupdate.h.  This consolidation
also makes it easier to remove them all when the time comes.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:26 -07:00
Paul E. McKenney
4c7e9c1434 rcu: Consolidate RCU-bh update-side function definitions
This commit saves a few lines by consolidating the RCU-bh function
definitions at the end of include/linux/rcupdate.h.  This consolidation
also makes it easier to remove them all when the time comes.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:03:25 -07:00
Paul E. McKenney
709fdce754 rcu: Express Tiny RCU updates in terms of RCU rather than RCU-sched
This commit renames Tiny RCU functions so that the lowest level of
functionality is RCU (e.g., synchronize_rcu()) rather than RCU-sched
(e.g., synchronize_sched()).  This provides greater naming compatibility
with Tree RCU, which will in turn permit more LoC removal once
the RCU-sched and RCU-bh update-side API is removed.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: Fix Tiny call_rcu()'s EXPORT_SYMBOL() in response to a bug
  report from kbuild test robot. ]
2018-08-30 16:02:46 -07:00
Paul E. McKenney
45975c7d21 rcu: Define RCU-sched API in terms of RCU for Tree RCU PREEMPT builds
Now that RCU-preempt knows about preemption disabling, its implementation
of synchronize_rcu() works for synchronize_sched(), and likewise for the
other RCU-sched update-side API members.  This commit therefore confines
the RCU-sched update-side code to CONFIG_PREEMPT=n builds, and defines
RCU-sched's update-side API members in terms of those of RCU-preempt.

This means that any given build of the Linux kernel has only one
update-side flavor of RCU, namely RCU-preempt for CONFIG_PREEMPT=y builds
and RCU-sched for CONFIG_PREEMPT=n builds.  This in turn means that kernels
built with CONFIG_RCU_NOCB_CPU=y have only one rcuo kthread per CPU.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
2018-08-30 16:02:45 -07:00
Paul E. McKenney
65cfe3583b rcu: Define RCU-bh update API in terms of RCU
Now that the main RCU API knows about softirq disabling and softirq's
quiescent states, the RCU-bh update code can be dispensed with.
This commit therefore removes the RCU-bh update-side implementation and
defines RCU-bh's update-side API in terms of that of either RCU-preempt or
RCU-sched, depending on the setting of the CONFIG_PREEMPT Kconfig option.

In kernels built with CONFIG_RCU_NOCB_CPU=y this has the knock-on effect
of reducing by one the number of rcuo kthreads per CPU.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:02:40 -07:00
Paul E. McKenney
d28139c4e9 rcu: Apply RCU-bh QSes to RCU-sched and RCU-preempt when safe
One necessary step towards consolidating the three flavors of RCU is to
make sure that the resulting consolidated "one flavor to rule them all"
correctly handles networking denial-of-service attacks.  One thing that
allows RCU-bh to do so is that __do_softirq() invokes rcu_bh_qs() every
so often, and so something similar has to happen for consolidated RCU.

This must be done carefully.  For example, if a preemption-disabled
region of code takes an interrupt which does softirq processing before
returning, consolidated RCU must ignore the resulting rcu_bh_qs()
invocations -- preemption is still disabled, and that means an RCU
reader for the consolidated flavor.

This commit therefore creates a new rcu_softirq_qs() that is called only
from the ksoftirqd task, thus avoiding the interrupted-a-preempted-region
problem.  This new rcu_softirq_qs() function invokes rcu_sched_qs(),
rcu_preempt_qs(), and rcu_preempt_deferred_qs().  The latter call handles
any deferred quiescent states.

Note that __do_softirq() still invokes rcu_bh_qs().  It will continue to
do so until a later stage of cleanup when the RCU-bh flavor is removed.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: Fix !SMP issue located by kbuild test robot. ]
2018-08-30 16:02:38 -07:00
Paul E. McKenney
3e31009898 rcu: Defer reporting RCU-preempt quiescent states when disabled
This commit defers reporting of RCU-preempt quiescent states at
rcu_read_unlock_special() time when any of interrupts, softirq, or
preemption are disabled.  These deferred quiescent states are reported
at a later RCU_SOFTIRQ, context switch, idle entry, or CPU-hotplug
offline operation.  Of course, if another RCU read-side critical
section has started in the meantime, the reporting of the quiescent
state will be further deferred.

This also means that disabling preemption, interrupts, and/or
softirqs will act as an RCU-preempt read-side critical section.
This is enforced by checking preempt_count() as needed.

Some special cases must be handled on an ad-hoc basis, for example,
context switch is a quiescent state even though both the scheduler and
do_exit() disable preemption.  In these cases, additional calls to
rcu_preempt_deferred_qs() override the preemption disabling.  Similar
logic overrides disabled interrupts in rcu_preempt_check_callbacks()
because in this case the quiescent state happened just before the
corresponding scheduling-clock interrupt.

In theory, this change lifts a long-standing restriction that required
that if interrupts were disabled across a call to rcu_read_unlock()
that the matching rcu_read_lock() also be contained within that
interrupts-disabled region of code.  Because the reporting of the
corresponding RCU-preempt quiescent state is now deferred until
after interrupts have been enabled, it is no longer possible for this
situation to result in deadlocks involving the scheduler's runqueue and
priority-inheritance locks.  This may allow some code simplification that
might reduce interrupt latency a bit.  Unfortunately, in practice this
would also defer deboosting a low-priority task that had been subjected
to RCU priority boosting, so real-time-response considerations might
well force this restriction to remain in place.

Because RCU-preempt grace periods are now blocked not only by RCU
read-side critical sections, but also by disabling of interrupts,
preemption, and softirqs, it will be possible to eliminate RCU-bh and
RCU-sched in favor of RCU-preempt in CONFIG_PREEMPT=y kernels.  This may
require some additional plumbing to provide the network denial-of-service
guarantees that have been traditionally provided by RCU-bh.  Once these
are in place, CONFIG_PREEMPT=n kernels will be able to fold RCU-bh
into RCU-sched.  This would mean that all kernels would have but
one flavor of RCU, which would open the door to significant code
cleanup.

Moving to a single flavor of RCU would also have the beneficial effect
of reducing the NOCB kthreads by at least a factor of two.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: Apply rcu_read_unlock_special() preempt_count() feedback
  from Joel Fernandes. ]
[ paulmck: Adjust rcu_eqs_enter() call to rcu_preempt_deferred_qs() in
  response to bug reports from kbuild test robot. ]
[ paulmck: Fix bug located by kbuild test robot involving recursion
  via rcu_preempt_deferred_qs(). ]
2018-08-30 16:02:34 -07:00
Paul E. McKenney
1b27291b1e rcutorture: Add forward-progress tests for RCU grace periods
This commit adds a kthread that loops going into and out of RCU
read-side critical sections, but also including a cond_resched(),
optionally guarded by a check of need_resched(), in that same loop.
This commit relies solely on rcu_torture_writer() progress to judge
the forward progress of grace periods.

Note that Tasks RCU and SRCU are exempted from forward-progress testing
due their (intentionally) less-robust forward-progress guarantees.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-29 09:20:48 -07:00
Paul E. McKenney
6f56f714db rcu: Improve RCU-tasks naming and comments
The naming and comments associated with some RCU-tasks code make
the faulty assumption that context switches due to cond_resched()
are voluntary.  As several people pointed out, this is not the case.
This commit therefore updates function names and comments to better
reflect current reality.

Reported-by: Byungchul Park <byungchul.park@lge.com>
Reported-by: Joel Fernandes <joel@joelfernandes.org>
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:39:15 -07:00
Peter Zijlstra
f64c6013a2 rcu/x86: Provide early rcu_cpu_starting() callback
The x86/mtrr code does horrific things because hardware. It uses
stop_machine_from_inactive_cpu(), which does a wakeup (of the stopper
thread on another CPU), which uses RCU, all before the CPU is onlined.

RCU complains about this, because wakeups use RCU and RCU does
(rightfully) not consider offline CPUs for grace-periods.

Fix this by initializing RCU way early in the MTRR case.

Tested-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: Add !SMP support, per 0day Test Robot report. ]
2018-05-22 16:12:26 -07:00
Paul E. McKenney
844ccdd7dc rcu: Eliminate rcu_irq_enter_disabled()
Now that the irq path uses the rcu_nmi_{enter,exit}() algorithm,
rcu_irq_enter() and rcu_irq_exit() may be used from any context.  There is
thus no need for rcu_irq_enter_disabled() and for the checks using it.
This commit therefore eliminates rcu_irq_enter_disabled().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-11-27 08:42:03 -08:00
Paul E. McKenney
825c5bd2fd srcu: Move rcu_scheduler_starting() from Tiny RCU to Tiny SRCU
Other than lockdep support, Tiny RCU has no need for the
scheduler status.  However, Tiny SRCU will need this to control
boot-time behavior independent of lockdep.  Therefore, this commit
moves rcu_scheduler_starting() from kernel/rcu/tiny_plugin.h to
kernel/rcu/srcutiny.c.  This in turn allows the complete removal of
kernel/rcu/tiny_plugin.h.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-07-24 16:03:22 -07:00
Paul E. McKenney
d2b1654f91 rcu: Remove #ifdef moving rcu_end_inkernel_boot from rcupdate.h
This commit removes a #ifdef and saves a few lines of code by moving
the rcu_end_inkernel_boot() function from include/linux/rcupdate.h to
include/linux/rcutiny.h (for TINY_RCU) and to include/linux/rcutree.h
(for TREE_RCU).

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08 18:52:40 -07:00
Paul E. McKenney
5f192ab027 rcu: Refactor #includes from include/linux/rcupdate.h
The list of #includes from include/linux/rcupdate.h has grown quite
a bit, so it is time to trim it.  This commit moves the #include
of include/linux/ktime.h to include/linux/rcutiny.h, along with the
Tiny-RCU-only function that was the only thing needing ktimem.h.  It then
reconstructs the files included into include/linux/ktime.h based on what
is actually needed, with significant help from the 0day Test Robot.

This single change reduces the .i file footprint from rcupdate.h from
9018 lines to 7101 lines.

Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08 18:52:37 -07:00
Paul E. McKenney
71c40fd0b5 rcu: Move rcutiny.h to new empty/true/false-function style
This commit saves a few lines in include/linux/rcutiny.h by moving
to single-line definitions for empty functions, instead of the old
style where the two curly braces each get their own line.

Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08 18:52:34 -07:00
Paul E. McKenney
fe21a27e8c rcu: Move rcu_request_urgent_qs_task() out of rcutiny.h and rcutree.h
The rcu_request_urgent_qs_task() function is used only within RCU,
so there is no point in exporting it to the rest of the kernel from
nclude/linux/rcutiny.h and include/linux/rcutree.h.  This commit therefore
moves this function to kernel/rcu/rcu.h.

Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08 18:52:33 -07:00
Paul E. McKenney
e3c8d51e1a rcu: Move torture-related functions out of rcutiny.h and rcutree.h
The various functions similar to rcu_batches_started(), the
function show_rcu_gp_kthreads(), the various functions similar to
rcu_force_quiescent_state(), and the variables rcutorture_testseq and
rcutorture_vernum are used only within RCU.  There is therefore no point
in exporting them to the kernel at large from include/linux/rcutiny.h
and include/linux/rcutree.h.  This commit therefore moves all of these
to kernel/rcu/rcu.h.

Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08 18:52:33 -07:00
Paul E. McKenney
791875d16e rcu: Eliminate the unused __rcu_is_watching() function
The __rcu_is_watching() function is currently not used, aside from
to implement the rcu_is_watching() function.  This commit therefore
eliminates __rcu_is_watching(), which has the beneficial side-effect
of shrinking include/linux/rcupdate.h a bit.

Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08 18:52:30 -07:00