Merge tag 'trace-tools-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull rv and tools/rtla updates from Steven Rostedt:

 - Add a test suite to test the tool

   Add a small test suite that can be used to test rtla's basic features
   to at least have something to test when applying changes.

 - Automate manual steps in monitor creation

   While creating a new monitor in RV, besides generating code from
   dot2k, there are a few manual steps which can be tedious and error
   prone, like adding the tracepoints, makefile lines and kconfig, or
   selecting events that start the monitor in the initial state.

   Updates were made to try and automate as much as possible among those
   steps to make creating a new RV monitor much quicker. It is still
   required to select the proper tracepoints; this step is harder to
   automate in a general way and, in several cases, would still need
   user intervention.

 - Have rtla timerlat hist and top set OSNOISE_WORKLOAD flag

   Have both rtla-timerlat-hist and rtla-timerlat-top set
   OSNOISE_WORKLOAD to the proper value ("on" when running with -k,
   "off" when running with -u) every time the option is available
   instead of setting it only when running with -u.

   This prevents rtla timerlat -k from giving no results when
   NO_OSNOISE_WORKLOAD is set, either manually or by an abnormally
   exited earlier run of rtla timerlat -u.

 - Stop rtla timerlat on signal properly when overloaded

   There is an issue where if rtla is run on machines with a high number
   of CPUs (100+), timerlat can generate more samples than rtla is able
   to process via tracefs_iterate_raw_events. This is especially common
   when the interval is set to 100us (rteval and cyclictest default) as
   opposed to the rtla default of 1000us, but also happens with the rtla
   default.

   Currently, this leads to rtla hanging and having to be terminated
   with SIGTERM. SIGINT setting stop_tracing is not enough, since more
   and more events are coming and tracefs_iterate_raw_events never
   exits.

   To fix this: Stop the timerlat tracer on SIGINT/SIGALRM to ensure no
   more events are generated when rtla is supposed to exit.

   Also on receiving SIGINT/SIGALRM twice, abort iteration immediately
   with tracefs_iterate_stop, making rtla exit right away instead of
   waiting for all events to be processed.

 - Account for missed events

   Due to tracefs buffer overflow, it can happen that rtla misses
   events, making the tracing results inaccurate.

   Count both the number of missed events and the total number of
   processed events, and display missed events as well as their
   percentage. The numbers are displayed for both osnoise and timerlat,
   even though for the former, missed events are generally not
   expected.

   For hist, the number is displayed at the end of the run; for top, it
   is displayed on each printing of the top table.

 - Changes to make osnoise more robust

   There was a dependency in the code that the first field of the
   osnoise_tool structure was the trace field. If that ever changed,
   then the code would break. Change the code to encapsulate this
   dependency so that the code that uses the structure no longer
   relies on it.
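
The two-stage stop described in the "Stop rtla timerlat on signal properly when overloaded" item can be sketched as follows. This is only an illustration of the pattern, not the rtla code: the three flags and the names on_stop_signal() and install_stop_handler() are hypothetical stand-ins for the tracer state and for the tracefs calls that the real handlers make.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Hypothetical stand-ins for the tracer and the event iterator state. */
static volatile sig_atomic_t stop_requested;     /* set by the first signal */
static volatile sig_atomic_t tracer_running = 1; /* cleared by the first signal */
static volatile sig_atomic_t abort_iteration;    /* set by the second signal */

static void on_stop_signal(int sig)
{
	(void)sig;

	if (stop_requested) {
		/* Second signal: give up on draining the remaining events. */
		abort_iteration = 1;
		return;
	}

	/* First signal: stop producing events, then let the loop drain. */
	stop_requested = 1;
	tracer_running = 0;
}

/* Install the handler for SIGINT and SIGALRM; returns 0 on success. */
static int install_stop_handler(void)
{
	struct sigaction sa = { 0 };

	sa.sa_handler = on_stop_signal;
	sigemptyset(&sa.sa_mask);

	if (sigaction(SIGINT, &sa, NULL) < 0)
		return -1;
	return sigaction(SIGALRM, &sa, NULL);
}
```

Stopping the producer on the first signal is what lets the processing loop terminate at all; the second signal only shortens the wait when a large backlog is still queued.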

* tag 'trace-tools-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (22 commits)
  rtla: Report missed event count
  rtla: Add function to report missed events
  rtla: Count all processed events
  rtla: Count missed trace events
  tools/rtla: Add osnoise_trace_is_off()
  rtla/timerlat_top: Set OSNOISE_WORKLOAD for kernel threads
  rtla/timerlat_hist: Set OSNOISE_WORKLOAD for kernel threads
  rtla/osnoise: Distinguish missing workload option
  rtla/timerlat_top: Abort event processing on second signal
  rtla/timerlat_hist: Abort event processing on second signal
  rtla/timerlat_top: Stop timerlat tracer on signal
  rtla/timerlat_hist: Stop timerlat tracer on signal
  rtla: Add trace_instance_stop
  tools/rtla: Add basic test suite
  verification/dot2k: Implement event type detection
  verification/dot2k: Auto patch current kernel source
  verification/dot2k: Simplify manual steps in monitor creation
  rv: Simplify manual steps in monitor creation
  verification/dot2k: Add support for name and description options
  verification/dot2k: More robust template variables
  ...
Linus Torvalds
2025-01-26 14:25:58 -08:00
33 changed files with 690 additions and 405 deletions


@@ -25,30 +25,9 @@ menuconfig RV
For further information, see:
Documentation/trace/rv/runtime-verification.rst
config RV_MON_WIP
depends on RV
depends on PREEMPT_TRACER
select DA_MON_EVENTS_IMPLICIT
bool "wip monitor"
help
Enable wip (wakeup in preemptive) sample monitor that illustrates
the usage of per-cpu monitors, and one limitation of the
preempt_disable/enable events.
For further information, see:
Documentation/trace/rv/monitor_wip.rst
config RV_MON_WWNR
depends on RV
select DA_MON_EVENTS_ID
bool "wwnr monitor"
help
Enable wwnr (wakeup while not running) sample monitor, this is a
sample monitor that illustrates the usage of per-task monitor.
The model is borken on purpose: it serves to test reactors.
For further information, see:
Documentation/trace/rv/monitor_wwnr.rst
source "kernel/trace/rv/monitors/wip/Kconfig"
source "kernel/trace/rv/monitors/wwnr/Kconfig"
# Add new monitors here
config RV_REACTORS
bool "Runtime verification reactors"


@@ -1,8 +1,11 @@
# SPDX-License-Identifier: GPL-2.0
ccflags-y += -I $(src) # needed for trace events
obj-$(CONFIG_RV) += rv.o
obj-$(CONFIG_RV_MON_WIP) += monitors/wip/wip.o
obj-$(CONFIG_RV_MON_WWNR) += monitors/wwnr/wwnr.o
# Add new monitors here
obj-$(CONFIG_RV_REACTORS) += rv_reactors.o
obj-$(CONFIG_RV_REACT_PRINTK) += reactor_printk.o
obj-$(CONFIG_RV_REACT_PANIC) += reactor_panic.o


@@ -0,0 +1,12 @@
config RV_MON_WIP
depends on RV
depends on PREEMPT_TRACER
select DA_MON_EVENTS_IMPLICIT
bool "wip monitor"
help
Enable wip (wakeup in preemptive) sample monitor that illustrates
the usage of per-cpu monitors, and one limitation of the
preempt_disable/enable events.
For further information, see:
Documentation/trace/rv/monitor_wip.rst


@@ -10,7 +10,7 @@
#define MODULE_NAME "wip"
#include <trace/events/rv.h>
#include <rv_trace.h>
#include <trace/events/sched.h>
#include <trace/events/preemptirq.h>


@@ -0,0 +1,15 @@
/* SPDX-License-Identifier: GPL-2.0 */
/*
* Snippet to be included in rv_trace.h
*/
#ifdef CONFIG_RV_MON_WIP
DEFINE_EVENT(event_da_monitor, event_wip,
TP_PROTO(char *state, char *event, char *next_state, bool final_state),
TP_ARGS(state, event, next_state, final_state));
DEFINE_EVENT(error_da_monitor, error_wip,
TP_PROTO(char *state, char *event),
TP_ARGS(state, event));
#endif /* CONFIG_RV_MON_WIP */


@@ -0,0 +1,11 @@
config RV_MON_WWNR
depends on RV
select DA_MON_EVENTS_ID
bool "wwnr monitor"
help
Enable wwnr (wakeup while not running) sample monitor, this is a
sample monitor that illustrates the usage of per-task monitor.
The model is borken on purpose: it serves to test reactors.
For further information, see:
Documentation/trace/rv/monitor_wwnr.rst


@@ -10,7 +10,7 @@
#define MODULE_NAME "wwnr"
#include <trace/events/rv.h>
#include <rv_trace.h>
#include <trace/events/sched.h>
#include "wwnr.h"


@@ -0,0 +1,16 @@
/* SPDX-License-Identifier: GPL-2.0 */
/*
* Snippet to be included in rv_trace.h
*/
#ifdef CONFIG_RV_MON_WWNR
/* id is the pid of the task */
DEFINE_EVENT(event_da_monitor_id, event_wwnr,
TP_PROTO(int id, char *state, char *event, char *next_state, bool final_state),
TP_ARGS(id, state, event, next_state, final_state));
DEFINE_EVENT(error_da_monitor_id, error_wwnr,
TP_PROTO(int id, char *state, char *event),
TP_ARGS(id, state, event));
#endif /* CONFIG_RV_MON_WWNR */


@@ -145,7 +145,7 @@
#ifdef CONFIG_DA_MON_EVENTS
#define CREATE_TRACE_POINTS
#include <trace/events/rv.h>
#include <rv_trace.h>
#endif
#include "rv.h"


@@ -57,15 +57,9 @@ DECLARE_EVENT_CLASS(error_da_monitor,
__entry->state)
);
#ifdef CONFIG_RV_MON_WIP
DEFINE_EVENT(event_da_monitor, event_wip,
TP_PROTO(char *state, char *event, char *next_state, bool final_state),
TP_ARGS(state, event, next_state, final_state));
#include <monitors/wip/wip_trace.h>
// Add new monitors based on CONFIG_DA_MON_EVENTS_IMPLICIT here
DEFINE_EVENT(error_da_monitor, error_wip,
TP_PROTO(char *state, char *event),
TP_ARGS(state, event));
#endif /* CONFIG_RV_MON_WIP */
#endif /* CONFIG_DA_MON_EVENTS_IMPLICIT */
#ifdef CONFIG_DA_MON_EVENTS_ID
@@ -123,20 +117,14 @@ DECLARE_EVENT_CLASS(error_da_monitor_id,
__entry->state)
);
#ifdef CONFIG_RV_MON_WWNR
/* id is the pid of the task */
DEFINE_EVENT(event_da_monitor_id, event_wwnr,
TP_PROTO(int id, char *state, char *event, char *next_state, bool final_state),
TP_ARGS(id, state, event, next_state, final_state));
DEFINE_EVENT(error_da_monitor_id, error_wwnr,
TP_PROTO(int id, char *state, char *event),
TP_ARGS(id, state, event));
#endif /* CONFIG_RV_MON_WWNR */
#include <monitors/wwnr/wwnr_trace.h>
// Add new monitors based on CONFIG_DA_MON_EVENTS_ID here
#endif /* CONFIG_DA_MON_EVENTS_ID */
#endif /* _TRACE_RV_H */
/* This part ust be outside protection */
#undef TRACE_INCLUDE_PATH
#define TRACE_INCLUDE_PATH .
#define TRACE_INCLUDE_FILE rv_trace
#include <trace/define_trace.h>


@@ -85,4 +85,6 @@ clean: doc_clean fixdep-clean
$(Q)find . -name '*.o' -delete -o -name '\.*.cmd' -delete -o -name '\.*.d' -delete
$(Q)rm -f rtla rtla-static fixdep FEATURE-DUMP rtla-*
$(Q)rm -rf feature
.PHONY: FORCE clean
check: $(RTLA)
RTLA=$(RTLA) prove -o -f tests/
.PHONY: FORCE clean check


@@ -867,7 +867,7 @@ int osnoise_set_workload(struct osnoise_context *context, bool onoff)
retval = osnoise_options_set_option("OSNOISE_WORKLOAD", onoff);
if (retval < 0)
return -1;
return -2;
context->opt_workload = onoff;
@@ -1079,6 +1079,42 @@ out_err:
return NULL;
}
bool osnoise_trace_is_off(struct osnoise_tool *tool, struct osnoise_tool *record)
{
/*
* The tool instance is always present, it is the one used to collect
* data.
*/
if (!tracefs_trace_is_on(tool->trace.inst))
return true;
/*
* The trace record instance is only enabled when -t is set. IOW, when the system
* is tracing.
*/
return record && !tracefs_trace_is_on(record->trace.inst);
}
/*
* osnoise_report_missed_events - report number of events dropped by trace
* buffer
*/
void
osnoise_report_missed_events(struct osnoise_tool *tool)
{
unsigned long long total_events;
if (tool->trace.missed_events == UINT64_MAX)
printf("unknown number of events missed, results might not be accurate\n");
else if (tool->trace.missed_events > 0) {
total_events = tool->trace.processed_events + tool->trace.missed_events;
printf("%lld (%.2f%%) events missed, results might not be accurate\n",
tool->trace.missed_events,
(double) tool->trace.missed_events / total_events * 100.0);
}
}
static void osnoise_usage(int err)
{
int i;
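
The percentage printed by the osnoise_report_missed_events() hunk above is the share of missed events among all events the tracer produced (missed plus processed). A minimal sketch of that arithmetic, assuming the same UINT64_MAX sentinel for "count unknown"; the names MISSED_UNKNOWN, missed_percent() and report_missed() are hypothetical and not part of rtla:

```c
#include <stdint.h>
#include <stdio.h>

/* Sentinel meaning "events were dropped, but the count is unknown". */
#define MISSED_UNKNOWN UINT64_MAX

/*
 * Return the share of missed events in percent, or a negative value when
 * the number of missed events is unknown.
 */
static double missed_percent(uint64_t missed, uint64_t processed)
{
	if (missed == MISSED_UNKNOWN)
		return -1.0;
	if (missed == 0)
		return 0.0;
	return (double)missed / (double)(missed + processed) * 100.0;
}

/* Print a one-line report; stay silent when nothing was missed. */
static void report_missed(uint64_t missed, uint64_t processed)
{
	double pct = missed_percent(missed, processed);

	if (pct < 0.0)
		printf("unknown number of events missed\n");
	else if (missed > 0)
		printf("%llu (%.2f%%) events missed\n",
		       (unsigned long long)missed, pct);
}
```

For example, 25 missed events against 75 processed ones gives 25 / (25 + 75) = 25%.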


@@ -104,6 +104,8 @@ struct osnoise_tool {
void osnoise_destroy_tool(struct osnoise_tool *top);
struct osnoise_tool *osnoise_init_tool(char *tool_name);
struct osnoise_tool *osnoise_init_trace_tool(char *tracer);
void osnoise_report_missed_events(struct osnoise_tool *tool);
bool osnoise_trace_is_off(struct osnoise_tool *tool, struct osnoise_tool *record);
int osnoise_hist_main(int argc, char *argv[]);
int osnoise_top_main(int argc, char **argv);


@@ -440,6 +440,7 @@ osnoise_print_stats(struct osnoise_hist_params *params, struct osnoise_tool *too
trace_seq_reset(trace->seq);
osnoise_print_summary(params, trace, data);
osnoise_report_missed_events(tool);
}
/*
@@ -970,7 +971,7 @@ int osnoise_hist_main(int argc, char *argv[])
goto out_hist;
}
if (trace_is_off(&tool->trace, &record->trace))
if (osnoise_trace_is_off(tool, record))
break;
}
@@ -980,7 +981,7 @@ int osnoise_hist_main(int argc, char *argv[])
return_value = 0;
if (trace_is_off(&tool->trace, &record->trace)) {
if (osnoise_trace_is_off(tool, record)) {
printf("rtla osnoise hit stop tracing\n");
if (params->trace_output) {
printf(" Saving trace to %s\n", params->trace_output);


@@ -280,6 +280,7 @@ osnoise_print_stats(struct osnoise_top_params *params, struct osnoise_tool *top)
trace_seq_do_printf(trace->seq);
trace_seq_reset(trace->seq);
osnoise_report_missed_events(top);
}
/*
@@ -801,7 +802,7 @@ int osnoise_top_main(int argc, char **argv)
if (!params->quiet)
osnoise_print_stats(params, tool);
if (trace_is_off(&tool->trace, &record->trace))
if (osnoise_trace_is_off(tool, record))
break;
}
@@ -810,7 +811,7 @@ int osnoise_top_main(int argc, char **argv)
return_value = 0;
if (trace_is_off(&tool->trace, &record->trace)) {
if (osnoise_trace_is_off(tool, record)) {
printf("osnoise hit stop tracing\n");
if (params->trace_output) {
printf(" Saving trace to %s\n", params->trace_output);


@@ -656,6 +656,7 @@ timerlat_print_stats(struct timerlat_hist_params *params, struct osnoise_tool *t
timerlat_print_summary(params, trace, data);
timerlat_print_stats_all(params, trace, data);
osnoise_report_missed_events(tool);
}
/*
@@ -1100,12 +1101,15 @@ timerlat_hist_apply_config(struct osnoise_tool *tool, struct timerlat_hist_param
}
}
if (params->user_hist) {
retval = osnoise_set_workload(tool->context, 0);
if (retval) {
err_msg("Failed to set OSNOISE_WORKLOAD option\n");
goto out_err;
}
/*
* Set workload according to type of thread if the kernel supports it.
* On kernels without support, user threads will have already failed
* on missing timerlat_fd, and kernel threads do not need it.
*/
retval = osnoise_set_workload(tool->context, params->kernel_workload);
if (retval < -1) {
err_msg("Failed to set OSNOISE_WORKLOAD option\n");
goto out_err;
}
return 0;
@@ -1146,9 +1150,20 @@ out_err:
}
static int stop_tracing;
static struct trace_instance *hist_inst = NULL;
static void stop_hist(int sig)
{
if (stop_tracing) {
/*
* Stop requested twice in a row; abort event processing and
* exit immediately
*/
tracefs_iterate_stop(hist_inst->inst);
return;
}
stop_tracing = 1;
if (hist_inst)
trace_instance_stop(hist_inst);
}
/*
@@ -1195,6 +1210,12 @@ int timerlat_hist_main(int argc, char *argv[])
}
trace = &tool->trace;
/*
* Save trace instance into global variable so that SIGINT can stop
* the timerlat tracer.
* Otherwise, rtla could loop indefinitely when overloaded.
*/
hist_inst = trace;
retval = enable_timerlat(trace);
if (retval) {
@@ -1342,7 +1363,7 @@ int timerlat_hist_main(int argc, char *argv[])
goto out_hist;
}
if (trace_is_off(&tool->trace, &record->trace))
if (osnoise_trace_is_off(tool, record))
break;
/* is there still any user-threads ? */
@@ -1363,7 +1384,7 @@ int timerlat_hist_main(int argc, char *argv[])
return_value = 0;
if (trace_is_off(&tool->trace, &record->trace)) {
if (osnoise_trace_is_off(tool, record) && !stop_tracing) {
printf("rtla timerlat hit stop tracing\n");
if (!params->no_aa)
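
The apply_config hunk above treats a return value of -1 from osnoise_set_workload() as tolerable and anything below -1 as fatal. One reading of the diff, stated here as an assumption, is that -1 means the kernel does not offer the OSNOISE_WORKLOAD option while -2 means writing the option failed. A small sketch of that convention, with hypothetical names throughout:

```c
#include <stdbool.h>

/* Hypothetical names for the return codes assumed from the diff. */
enum workload_result {
	WORKLOAD_OK          =  0, /* option set successfully */
	WORKLOAD_UNSUPPORTED = -1, /* assumed: kernel lacks the option */
	WORKLOAD_FAILED      = -2, /* assumed: option exists, write failed */
};

/* Kernel-thread runs (-k) want the workload on, user-thread runs (-u) off. */
static bool workload_wanted(bool kernel_threads)
{
	return kernel_threads;
}

/* Only a real failure aborts; a missing option is not an error here. */
static bool workload_result_is_fatal(int retval)
{
	return retval < WORKLOAD_UNSUPPORTED;
}
```

Keeping "unsupported" distinct from "failed" lets the caller set the option unconditionally without breaking on older kernels.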


@@ -435,6 +435,7 @@ timerlat_print_stats(struct timerlat_top_params *params, struct osnoise_tool *to
trace_seq_do_printf(trace->seq);
trace_seq_reset(trace->seq);
osnoise_report_missed_events(top);
}
/*
@@ -851,12 +852,15 @@ timerlat_top_apply_config(struct osnoise_tool *top, struct timerlat_top_params *
}
}
if (params->user_top) {
retval = osnoise_set_workload(top->context, 0);
if (retval) {
err_msg("Failed to set OSNOISE_WORKLOAD option\n");
goto out_err;
}
/*
* Set workload according to type of thread if the kernel supports it.
* On kernels without support, user threads will have already failed
* on missing timerlat_fd, and kernel threads do not need it.
*/
retval = osnoise_set_workload(top->context, params->kernel_workload);
if (retval < -1) {
err_msg("Failed to set OSNOISE_WORKLOAD option\n");
goto out_err;
}
if (isatty(STDOUT_FILENO) && !params->quiet)
@@ -900,9 +904,20 @@ out_err:
}
static int stop_tracing;
static struct trace_instance *top_inst = NULL;
static void stop_top(int sig)
{
if (stop_tracing) {
/*
* Stop requested twice in a row; abort event processing and
* exit immediately
*/
tracefs_iterate_stop(top_inst->inst);
return;
}
stop_tracing = 1;
if (top_inst)
trace_instance_stop(top_inst);
}
/*
@@ -950,6 +965,13 @@ int timerlat_top_main(int argc, char *argv[])
}
trace = &top->trace;
/*
* Save trace instance into global variable so that SIGINT can stop
* the timerlat tracer.
* Otherwise, rtla could loop indefinitely when overloaded.
*/
top_inst = trace;
retval = enable_timerlat(trace);
if (retval) {
@@ -1093,7 +1115,7 @@ int timerlat_top_main(int argc, char *argv[])
while (!stop_tracing) {
sleep(params->sleep_time);
if (params->aa_only && !trace_is_off(&top->trace, &record->trace))
if (params->aa_only && !osnoise_trace_is_off(top, record))
continue;
retval = tracefs_iterate_raw_events(trace->tep,
@@ -1110,7 +1132,7 @@ int timerlat_top_main(int argc, char *argv[])
if (!params->quiet)
timerlat_print_stats(params, top);
if (trace_is_off(&top->trace, &record->trace))
if (osnoise_trace_is_off(top, record))
break;
/* is there still any user-threads ? */
@@ -1131,7 +1153,7 @@ int timerlat_top_main(int argc, char *argv[])
return_value = 0;
if (trace_is_off(&top->trace, &record->trace)) {
if (osnoise_trace_is_off(top, record) && !stop_tracing) {
printf("rtla timerlat hit stop tracing\n");
if (!params->no_aa)


@@ -118,6 +118,8 @@ collect_registered_events(struct tep_event *event, struct tep_record *record,
struct trace_instance *trace = context;
struct trace_seq *s = trace->seq;
trace->processed_events++;
if (!event->handler)
return 0;
@@ -126,6 +128,31 @@ collect_registered_events(struct tep_event *event, struct tep_record *record,
return 0;
}
/*
* collect_missed_events - record number of missed events
*
* If rtla cannot keep up with events generated by tracer, events are going
* to fall out of the ring buffer.
* Collect how many events were missed so it can be reported to the user.
*/
static int
collect_missed_events(struct tep_event *event, struct tep_record *record,
int cpu, void *context)
{
struct trace_instance *trace = context;
if (trace->missed_events == UINT64_MAX)
return 0;
if (record->missed_events > 0)
trace->missed_events += record->missed_events;
else
/* Events missed but no data on how many */
trace->missed_events = UINT64_MAX;
return 0;
}
/*
* trace_instance_destroy - destroy and free a rtla trace instance
*/
@@ -181,6 +208,17 @@ int trace_instance_init(struct trace_instance *trace, char *tool_name)
*/
tracefs_trace_off(trace->inst);
/*
* Collect the number of events missed due to tracefs buffer
* overflow.
*/
trace->missed_events = 0;
tracefs_follow_missed_events(trace->inst,
collect_missed_events,
trace);
trace->processed_events = 0;
return 0;
out_err:
@@ -196,6 +234,14 @@ int trace_instance_start(struct trace_instance *trace)
return tracefs_trace_on(trace->inst);
}
/*
* trace_instance_stop - stop tracing a given rtla instance
*/
int trace_instance_stop(struct trace_instance *trace)
{
return tracefs_trace_off(trace->inst);
}
/*
* trace_events_free - free a list of trace events
*/
@@ -522,25 +568,6 @@ void trace_events_destroy(struct trace_instance *instance,
trace_events_free(events);
}
int trace_is_off(struct trace_instance *tool, struct trace_instance *trace)
{
/*
* The tool instance is always present, it is the one used to collect
* data.
*/
if (!tracefs_trace_is_on(tool->inst))
return 1;
/*
* The trace instance is only enabled when -t is set. IOW, when the system
* is tracing.
*/
if (trace && !tracefs_trace_is_on(trace->inst))
return 1;
return 0;
}
/*
* trace_set_buffer_size - set the per-cpu tracing buffer size.
*/
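
The collect_missed_events() callback above keeps a running total that becomes "unknown" as soon as one notification arrives without a usable count, and stays unknown from then on. A minimal sketch of that accumulation rule as a pure function; MISSED_UNKNOWN and account_missed() are hypothetical names, and the signed type of the per-notification count is an assumption:

```c
#include <stdint.h>

/* Sentinel meaning "events were dropped, but the count is unknown". */
#define MISSED_UNKNOWN UINT64_MAX

/*
 * Fold one "missed events" notification into a running total.
 * A reported count <= 0 is taken to mean that events were lost but the
 * number of them is not known.
 */
static uint64_t account_missed(uint64_t total, long long reported)
{
	if (total == MISSED_UNKNOWN)
		return total;	/* already unknown: stay unknown */
	if (reported > 0)
		return total + (uint64_t)reported;
	return MISSED_UNKNOWN;
}
```

Making the sentinel sticky avoids reporting a precise-looking number that is known to be an undercount.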


@@ -17,10 +17,13 @@ struct trace_instance {
struct tracefs_instance *inst;
struct tep_handle *tep;
struct trace_seq *seq;
unsigned long long missed_events;
unsigned long long processed_events;
};
int trace_instance_init(struct trace_instance *trace, char *tool_name);
int trace_instance_start(struct trace_instance *trace);
int trace_instance_stop(struct trace_instance *trace);
void trace_instance_destroy(struct trace_instance *trace);
struct trace_seq *get_trace_seq(void);
@@ -47,5 +50,4 @@ int trace_events_enable(struct trace_instance *instance,
int trace_event_add_filter(struct trace_events *event, char *filter);
int trace_event_add_trigger(struct trace_events *event, char *trigger);
int trace_is_off(struct trace_instance *tool, struct trace_instance *trace);
int trace_set_buffer_size(struct trace_instance *trace, int size);


@@ -0,0 +1,48 @@
#!/bin/bash
# SPDX-License-Identifier: GPL-2.0
test_begin() {
# Count tests to allow the test harness to double-check if all were
# included correctly.
ctr=0
[ -z "$RTLA" ] && RTLA="./rtla"
[ -n "$TEST_COUNT" ] && echo "1..$TEST_COUNT"
}
check() {
# Simple check: run rtla with given arguments and test exit code.
# If TEST_COUNT is set, run the test. Otherwise, just count.
ctr=$(($ctr + 1))
if [ -n "$TEST_COUNT" ]
then
# Run rtla; in case of failure, include its output as comment
# in the test results.
result=$(stdbuf -oL $TIMEOUT "$RTLA" $2 2>&1); exitcode=$?
if [ $exitcode -eq 0 ]
then
echo "ok $ctr - $1"
else
echo "not ok $ctr - $1"
# Add rtla output and exit code as comments in case of failure
echo "$result" | col -b | while read line; do echo "# $line"; done
printf "#\n# exit code %s\n" $exitcode
fi
fi
}
set_timeout() {
TIMEOUT="timeout -v -k 15s $1"
}
unset_timeout() {
unset TIMEOUT
}
test_end() {
# If running without TEST_COUNT, tests are not actually run, just
# counted. In that case, re-run the test with the correct count.
[ -z "$TEST_COUNT" ] && TEST_COUNT=$ctr exec bash $0 || true
}
# Avoid any environmental discrepancies
export LC_ALL=C
unset_timeout
