85 Commits

Author SHA1 Message Date
Paul E. McKenney
394473d876 torture: Print out torture module parameters
The kernel/torture.c module now has several module parameters, so this
commit causes them to be printed out.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2023-09-24 17:24:01 +02:00
Joel Fernandes (Google)
cca42bd8eb rcutorture: Fix stuttering races and other issues
The stuttering code isn't functioning as expected. Ideally, it should
pause the torture threads for a designated period before resuming. Yet,
it fails to halt the test for the correct duration. Additionally, a race
condition exists, potentially causing the stuttering code to pause for
an extended period if the 'spt' variable is non-zero due to the stutter
orchestration thread's inadequate CPU time.

Moreover, over-stuttering can hinder RCU's progress on TREE07 kernels.
This happens as the stuttering code may run within a softirq due to RCU
callbacks. Consequently, ksoftirqd keeps a CPU busy for several seconds,
thus obstructing RCU's progress. This situation triggers a warning
message in the logs:

[ 2169.481783] rcu_torture_writer: rtort_pipe_count: 9

This warning suggests that an RCU torture object, although invisible to
RCU readers, couldn't make it past the pipe array and be freed -- a
strong indication that there weren't enough grace periods during the
stutter interval.

To address these issues, this patch sets the "stutter end" time to an
absolute point in the future set by the main stutter thread. This is
then used for waiting in stutter_wait(). While the stutter thread still
defines this absolute time, the waiters' waiting logic doesn't rely on
the stutter thread receiving sufficient CPU time to halt the stuttering
as the halting is now self-controlled.
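
The waiting logic described above can be sketched in userspace C as follows.
This is an illustration under stated assumptions, not the kernel code: the
names stutter_end_ns, stutter_begin(), and stutter_remaining_ns() are
hypothetical, and time is passed in explicitly rather than read from a clock.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: absolute end of the current stutter pause, in ns.
 * Zero means that no pause is in effect. */
static uint64_t stutter_end_ns;

/* Called by the orchestrating thread to start a pause of pause_ns. */
static void stutter_begin(uint64_t now_ns, uint64_t pause_ns)
{
	assert(pause_ns > 0);
	stutter_end_ns = now_ns + pause_ns;
}

/* Called by each waiter: how much longer should it sleep?
 * The answer depends only on the clock and the published end time,
 * so a late or starved orchestrator cannot extend the pause. */
static uint64_t stutter_remaining_ns(uint64_t now_ns)
{
	uint64_t end = stutter_end_ns;

	if (end == 0 || now_ns >= end)
		return 0;
	return end - now_ns;
}
```

Because each waiter recomputes the remaining time itself, the pause ends on
schedule even if the thread that called stutter_begin() never runs again.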

Cc: stable@vger.kernel.org
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2023-09-24 17:24:01 +02:00
Paul E. McKenney
a741deac78 torture: Make torture_hrtimeout_ns() take an hrtimer mode parameter
The current torture-test sleeps are waiting for a duration, but there
are situations where it is better to wait for an absolute time, for
example, when ending a stutter interval.  This commit therefore adds
an hrtimer mode parameter to torture_hrtimeout_ns().  Why not also the
other torture_hrtimeout_*() functions?  The theory is that most absolute
times will be expressed in nanoseconds rather than in (say) jiffies.
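
The difference between the two modes can be illustrated with a small
userspace sketch.  The enum and function names below are hypothetical
stand-ins for the kernel's relative and absolute hrtimer modes, and the
sketch only computes the expiry time rather than arming a timer.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for relative and absolute timer modes. */
enum sketch_timer_mode { SKETCH_MODE_REL, SKETCH_MODE_ABS };

/* Return the absolute expiry time for a timeout armed at now_ns. */
static uint64_t sketch_expiry_ns(uint64_t now_ns, uint64_t t_ns,
				 enum sketch_timer_mode mode)
{
	assert(mode == SKETCH_MODE_REL || mode == SKETCH_MODE_ABS);
	if (mode == SKETCH_MODE_ABS)
		return t_ns;		/* t_ns already names a point in time */
	return now_ns + t_ns;		/* t_ns is a duration from now */
}
```

With an absolute time, a caller that is delayed before arming the timer
still wakes at the intended instant, whereas a relative timeout slips by
the length of the delay.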

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2023-09-24 17:24:01 +02:00
Paul E. McKenney
d0b654e19a torture: Share torture_random_state with torture_shuffle_tasks()
Both torture_shuffle_tasks() and its caller torture_shuffle()
define a torture_random_state structure.  This is suboptimal given
that torture_shuffle_tasks() runs for a very short period of time.
This commit therefore causes torture_shuffle() to pass a pointer to its
torture_random_state structure down to torture_shuffle_tasks().

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2023-09-24 17:24:01 +02:00
Paul E. McKenney
6cab60ceb1 torture: Stop right-shifting torture_random() return values
Now that torture_random() uses swahw32(), its callers no longer see
not-so-random low-order bits, as these are now swapped up into the upper
16 bits of the torture_random() function's return value.  This commit
therefore removes the right-shifting of torture_random() return values.
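
For readers unfamiliar with the halfword swap, here is a userspace sketch
of the operation and of a caller that no longer needs to shift.  The names
sketch_swap_halfwords() and sketch_pick() are hypothetical and are not the
kernel's helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Swap the two 16-bit halves of a 32-bit value. */
static uint32_t sketch_swap_halfwords(uint32_t x)
{
	return (x << 16) | (x >> 16);
}

/* Pick an index in [0, n) directly from the swapped value, no shift. */
static uint32_t sketch_pick(uint32_t raw, uint32_t n)
{
	assert(n > 0);
	return sketch_swap_halfwords(raw) % n;
}
```

After the swap, the former upper 16 bits of the raw value occupy the low
positions that a modulo operation depends on most heavily.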

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2023-08-14 15:01:08 -07:00
Paul E. McKenney
10af43671e torture: Move stutter_wait() timeouts to hrtimers
In order to gain better race coverage, move the test start/stop
waits in stutter_wait() to torture_hrtimeout_jiffies().

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2023-08-14 15:01:08 -07:00
Paul E. McKenney
dea81dcfd3 torture: Move torture_shuffle() timeouts to hrtimers
In order to gain better race coverage, move the CPU-migration timed
waits in torture_shuffle() to torture_hrtimeout_jiffies().

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2023-08-14 15:01:08 -07:00
Paul E. McKenney
3f0c06e1cb torture: Move torture_onoff() timeouts to hrtimers
In order to gain better race coverage, move the CPU-hotplug-related
timed waits in torture_onoff() to torture_hrtimeout_jiffies().

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2023-08-14 15:01:08 -07:00
Paul E. McKenney
872948c665 torture: Make torture_hrtimeout_*() use TASK_IDLE
Given that it is expected that more code will use torture_hrtimeout_*(),
including for longer timeouts, make it use TASK_IDLE instead of
TASK_UNINTERRUPTIBLE.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2023-08-14 15:01:07 -07:00
Dietmar Eggemann
5d248bb39f torture: Add lock_torture writer_fifo module parameter
This commit adds a module parameter that causes the locktorture writer
to run at real-time priority.

To use it:
insmod /lib/modules/torture.ko random_shuffle=1
insmod /lib/modules/locktorture.ko torture_type=mutex_lock rt_boost=1 rt_boost_factor=50 nested_locks=3 writer_fifo=1
													^^^^^^^^^^^^^

A predecessor to this patch has been helpful to uncover issues with the
proxy-execution series.

[ paulmck: Remove locktorture-specific code from kernel/torture.c. ]

Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: kernel-team@android.com
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
[jstultz: Include header change to build, reword commit message]
Signed-off-by: John Stultz <jstultz@google.com>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2023-08-14 15:01:07 -07:00
Paul E. McKenney
67d5404d27 torture: Add a kthread-creation callback to _torture_create_kthread()
This commit adds a kthread-creation callback to the
_torture_create_kthread() function, which allows callers of a new
torture_create_kthread_cb() macro to specify a function to be invoked
after the kthread is created but before it is awakened for the first time.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: kernel-team@android.com
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Acked-by: John Stultz <jstultz@google.com>
2023-08-14 15:00:37 -07:00
Connor O'Brien
e2a0b786c5 torture: Support randomized shuffling for proxy exec testing
Currently, shuffling sets the same CPU affinities for all tasks,
which makes us less likely to hit paths involving migrating
blocked tasks onto a CPU where they can't run.

This patch adds an element of randomness to allow affinities of
different writer tasks to diverge.

This has helped uncover issues in testing with Proxy Execution.

Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: kernel-team@android.com
Signed-off-by: Connor O'Brien <connoro@google.com>
Signed-off-by: John Stultz <jstultz@google.com>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2023-07-14 15:04:09 -07:00
Joel Fernandes (Google)
d52d3a2bf4 torture: Fix hang during kthread shutdown phase
During rcutorture shutdown, the rcu_torture_cleanup() function calls
torture_cleanup_begin(), which sets the fullstop global variable to
FULLSTOP_RMMOD. This causes the rcutorture threads for readers and
fakewriters to exit all of their "while" loops and start shutting down.

They then call torture_kthread_stopping(), which in turn waits for
kthread_stop() to be called.  However, rcu_torture_cleanup() has
not yet called kthread_stop() on those threads, and before it gets a
chance to do so, multiple instances of torture_kthread_stopping() invoke
schedule_timeout_interruptible(1) in a tight loop.  Tracing confirms that
TIMER_SOFTIRQ can then continuously execute timer callbacks.  If that
TIMER_SOFTIRQ preempts the task executing rcu_torture_cleanup(), that
task might never invoke kthread_stop().

This commit improves this situation by increasing the timeout passed to
schedule_timeout_interruptible() from one jiffy to 1/20th of a second.
This change prevents TIMER_SOFTIRQ from monopolizing its CPU, thus
allowing rcu_torture_cleanup() to carry out the needed kthread_stop()
invocations.  Testing has shown 100 runs of TREE07 passing reliably,
as opposed to the tens-of-percent failure rates seen beforehand.
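
To put numbers on the new timeout, the following sketch converts 1/20th of
a second to jiffies for a given tick rate.  The helper name is hypothetical,
and the floor of one jiffy for very low tick rates is an assumption of this
sketch rather than something stated above.

```c
#include <assert.h>

/* Convert "1/20th of a second" to jiffies for a given tick rate.
 * The floor of one jiffy is an assumption of this sketch. */
static unsigned long sketch_shutdown_poll_jiffies(unsigned long hz)
{
	unsigned long j;

	assert(hz > 0);
	j = hz / 20;
	return j ? j : 1;
}
```

At a tick rate of 1000, each stopping kthread therefore arms a timer once
every 50 jiffies instead of once per jiffy.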

Cc: Paul McKenney <paulmck@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Zhouyi Zhou <zhouzhouyi@gmail.com>
Cc: <stable@vger.kernel.org> # 6.0.x
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Tested-by: Zhouyi Zhou <zhouzhouyi@gmail.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2023-01-05 12:10:35 -08:00
Paul E. McKenney
95f93e97ed torture: Seed torture_random_state on CPU
The DEFINE_TORTURE_RANDOM_PERCPU() macro defines per-CPU random-number
generators for torture testing, but the seeds for each CPU's instance
will be identical if they are first used at the same time.  This commit
therefore adds the CPU number to the mix when reseeding.
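
A userspace sketch of why mixing in the CPU number helps is shown below.
The reseed formula and the generator are assumptions chosen for
illustration (a 64-bit linear congruential step using Knuth's MMIX
constants), not the torture-test generator itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reseed: combine a clock reading with the CPU number. */
static uint64_t sketch_reseed(uint64_t clock_ns, unsigned int cpu)
{
	return clock_ns + cpu;
}

/* One step of a 64-bit linear congruential generator. */
static uint64_t sketch_next(uint64_t *state)
{
	assert(state);
	*state = *state * 6364136223846793005ULL + 1442695040888963407ULL;
	return *state >> 33;
}
```

Two CPUs that reseed at the same clock value now start from different
states, so their generators produce different sequences.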

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2023-01-03 17:52:54 -08:00
Paul E. McKenney
6b8646a9d3 torture: Wake up kthreads after storing task_struct pointer
Currently, _torture_create_kthread() uses kthread_run() to create
torture-test kthreads, which means that the resulting task_struct
pointer is stored after the newly created kthread has been marked
runnable.  This in turn can cause spurious failure of checks for
code being run by a particular kthread.  This commit therefore changes
_torture_create_kthread() to use kthread_create(), then to do an explicit
wake_up_process() after the task_struct pointer has been stored.

Reported-by: Frederic Weisbecker <frederic@kernel.org>
Reviewed-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-02-01 17:24:39 -08:00
Paul E. McKenney
2b4a7f20f1 torture: Distinguish kthread stopping and being asked to stop
Right now, if a given kthread (call it "kthread") realizes that it needs
to stop, "Stopping kthread" is written to the console.  When the cleanup
code decides that it is time to stop that kthread, "Stopping kthread
tasks" is written to the console.  These two events might happen in
either order, especially in the case of time-based torture-test shutdown.

But it is hard to distinguish these, especially for those unfamiliar with
the torture tests.  This commit therefore changes the first case from
"Stopping kthread" to "kthread is stopping" to make things clearer.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-02-01 17:24:38 -08:00
Li Zhijian
81faa4f6fb locktorture,rcutorture,torture: Always log error message
Unconditionally log messages corresponding to errors.

Acked-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Li Zhijian <zhijianx.li@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-12-07 16:36:17 -08:00
Sebastian Andrzej Siewior
ed4fa2442e torture: Replace deprecated CPU-hotplug functions.
The functions get_online_cpus() and put_online_cpus() have been
deprecated during the CPU hotplug rework. They map directly to
cpus_read_lock() and cpus_read_unlock().

Replace deprecated CPU-hotplug functions with the official version.
The behavior remains unchanged.

Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-08-10 10:48:07 -07:00
Stephen Zhang
4ac9de07b2 torture: Replace torture_init_begin string with %s
This commit replaces a hard-coded "torture_init_begin" string in
a pr_alert() format with "%s" and __func__.
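
The pattern looks like this in a userspace sketch, with snprintf() standing
in for pr_alert().  The function name and the message text are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Format a message prefixed with the enclosing function's name.
 * __func__ is standard C99, so renaming the function keeps the
 * message accurate without touching the format string. */
static int sketch_init_begin(char *buf, size_t len)
{
	assert(buf && len > 0);
	return snprintf(buf, len, "%s: hypothetical message", __func__);
}
```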

Signed-off-by: Stephen Zhang <stephenzhangzsd@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-03-08 14:22:28 -08:00
Paul E. McKenney
1afb95fee0 torture: Maintain torture-specific set of CPUs-online books
The TREE01 rcutorture scenario intentionally creates confusion as to the
number of available CPUs by specifying the "maxcpus=8 nr_cpus=43" kernel
boot parameters.  This can disable rcutorture's load shedding, which
currently uses num_online_cpus(), which would count the extra 35 CPUs.
However, the rcutorture guest OS will be provisioned with only 8 CPUs,
which means that rcutorture will present full load even when all but one
of the original 8 CPUs are offline.  This can result in spurious errors
due to extreme overloading of that single remaining CPU.

This commit therefore keeps a separate set of books on the number of
usable online CPUs, so that torture_num_online_cpus() is used for load
shedding instead of num_online_cpus().  Note that initial sizing must
use num_online_cpus() because torture_num_online_cpus() will return
NR_CPUS until shortly after torture_onoff_init() is invoked.
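
The bookkeeping can be sketched in userspace as follows.  All names are
hypothetical, the maximum of 64 is arbitrary, and the real code must also
deal with concurrent access, which this single-threaded sketch ignores.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NR_CPUS 64	/* hypothetical compile-time maximum */

/* Starts at the maximum, mirroring the caveat that the count is
 * not meaningful until initialization has run. */
static int sketch_online_cpus = SKETCH_NR_CPUS;

/* Initialization records how many CPUs the test can really use. */
static void sketch_onoff_init(int usable_cpus)
{
	assert(usable_cpus > 0 && usable_cpus <= SKETCH_NR_CPUS);
	sketch_online_cpus = usable_cpus;
}

/* Update the books only when a hotplug operation actually succeeded. */
static void sketch_cpu_offlined(bool ok) { if (ok) sketch_online_cpus--; }
static void sketch_cpu_onlined(bool ok)  { if (ok) sketch_online_cpus++; }

/* What load shedding would consult instead of the system-wide count. */
static int sketch_num_online_cpus(void)
{
	return sketch_online_cpus;
}
```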

Reported-by: Frederic Weisbecker <frederic@kernel.org>
[ paulmck: Apply feedback from kernel test robot. ]
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-01-06 17:17:22 -08:00
Paul E. McKenney
0b962c8fe0 torture: Clean up after torture-test CPU hotplugging
This commit puts all CPUs back online at the end of a torture test,
and also unconditionally puts them online at the beginning of the test,
rather than just in the case of built-in tests.  This allows torture tests
to behave in a predictable manner, whether built-in or based on modules.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-01-06 17:17:22 -08:00
Paul E. McKenney
8a67a20bf2 torture: Throttle VERBOSE_TOROUT_*() output
This commit adds kernel boot parameters torture.verbose_sleep_frequency
and torture.verbose_sleep_duration, which allow VERBOSE_TOROUT_*() output
to be throttled with periodic sleeps on large systems.
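
One way such throttling could work is sketched below: a shared counter is
bumped for each message, and the caller sleeps whenever the counter hits a
multiple of the frequency.  The function name and the exact rule are
assumptions of this sketch; the two integer arguments stand in for the
boot parameters named above.

```c
#include <assert.h>
#include <stdbool.h>

/* Return true if the caller should sleep before printing this message.
 * A non-positive frequency or duration disables throttling entirely. */
static bool sketch_should_sleep(unsigned long *counter, int frequency,
				int duration)
{
	assert(counter);
	if (frequency <= 0 || duration <= 0)
		return false;
	return (*counter)++ % (unsigned long)frequency == 0;
}
```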

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-01-06 17:17:21 -08:00
Paul E. McKenney
ed24affa71 torture: Make stutter use torture_hrtimeout_*() functions
This commit saves a few lines of code by making the stutter_wait()
and torture_stutter() functions use torture_hrtimeout_jiffies() and
torture_hrtimeout_us().

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-01-06 17:17:20 -08:00
Paul E. McKenney
ae19aaafae torture: Add fuzzed hrtimer-based sleep functions
This commit adds torture_hrtimeout_ns(), torture_hrtimeout_us(),
torture_hrtimeout_ms(), torture_hrtimeout_jiffies(), and
torture_hrtimeout_s(), each of which uses hrtimers to block for a fuzzed
time interval.  These functions are intended to be used by the various
torture tests to decouple wakeups from the timer wheel, thus providing
more opportunity for Murphy to insert destructive race conditions.
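
The fuzzing itself amounts to adding a bounded random extra to the base
timeout.  The sketch below shows that arithmetic in userspace, with a small
xorshift generator standing in for the torture-test random-number
generator.  All names are hypothetical, and no timer is actually armed.

```c
#include <assert.h>
#include <stdint.h>

/* Tiny xorshift generator standing in for the torture-test RNG. */
static uint32_t sketch_random(uint32_t *state)
{
	uint32_t x = *state;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	*state = x;
	return x;
}

/* Base timeout plus a random extra in [0, fuzz_ns); no fuzz if zero. */
static uint64_t sketch_fuzzed_timeout_ns(uint64_t base_ns, uint32_t fuzz_ns,
					 uint32_t *rng_state)
{
	assert(rng_state && *rng_state != 0);	/* xorshift needs nonzero state */
	if (fuzz_ns == 0)
		return base_ns;
	return base_ns + sketch_random(rng_state) % fuzz_ns;
}
```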

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-01-06 17:17:19 -08:00
Paul E. McKenney
ab1b7880de rcutorture: Make stutter_wait() caller restore priority
Currently, stutter_wait() will happily spin waiting for the stutter
interval to end even if the caller is running at a real-time priority
level.  This could starve normal-priority tasks for no good reason.  This
commit therefore drops the calling task's priority to SCHED_OTHER MAX_NICE
if stutter_wait() needs to wait.  But when it waits, stutter_wait()
returns true, which allows the caller to restore the priority if needed.
Callers that were already running at SCHED_OTHER MAX_NICE obviously
do not need any changes, but this commit also restores priority for
higher-priority callers.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-11-06 17:13:54 -08:00