Commit Graph

292 Commits

Author SHA1 Message Date
Steven Rostedt
e0a413f619 tracing: Warn on output if the function tracer was found corrupted
As the function tracer is very intrusive, lots of self checks are
performed on the tracer and if something is found to be strange
it will shut itself down keeping it from corrupting the rest of the
kernel. This shutdown may still allow functions to be traced, as the
tracing only stops new modifications from happening. Trying to stop
the function tracer itself can cause more harm as it requires code
modification.

Although a WARN_ON() is executed, a user may not notice it. To help
the user see that something isn't right with the tracing of the system
a big warning is added to the output of the tracer that lets the user
know that their data may be incomplete.

Reported-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-10-11 09:13:25 -04:00
Jiri Olsa
1d0e78e380 tracing/filter: Add startup tests for events filter
Adding automated tests running as late_initcall. Tests are
compiled in with CONFIG_FTRACE_STARTUP_TEST option.

Adding test event "ftrace_test_filter" used to simulate
filter processing during event occurance.

String filters are compiled and tested against several
test events with different values.

Also testing that evaluation of explicit predicates is ommited
due to the lazy filter evaluation.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1313072754-4620-11-git-send-email-jolsa@redhat.com
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-08-19 14:35:59 -04:00
Jiri Olsa
61aaef5530 tracing/filter: Remove field_name from filter_pred struct
The field_name was used just for finding event's fields. This way we
don't need to care about field_name allocation/free.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1313072754-4620-4-git-send-email-jolsa@redhat.com
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-08-19 14:35:52 -04:00
Arun Sharma
60063497a9 atomic: use <linux/atomic.h>
This allows us to move duplicated code in <asm/atomic.h>
(atomic_inc_not_zero() for now) to <linux/atomic.h>

Signed-off-by: Arun Sharma <asharma@fb.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-26 16:49:47 -07:00
Ingo Molnar
40bcea7bbe Merge branch 'tip/perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/core 2011-07-21 09:32:40 +02:00
Ingo Molnar
492f73a303 Merge branch 'perf/urgent' into perf/core
Merge reason: pick up the latest fixes - they won't make v3.0.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-21 09:29:21 +02:00
Steven Rostedt
e4a3f541f0 tracing: Still trace filtered irq functions when irq trace is disabled
If a function is set to be traced by the set_graph_function, but the
option funcgraph-irqs is zero, and the traced function happens to be
called from a interrupt, it will not be traced.

The point of funcgraph-irqs is to not trace interrupts when we are
preempted by an irq, not to not trace functions we want to trace that
happen to be *in* a irq.

Luckily the current->trace_recursion element is perfect to add a flag
to help us be able to trace functions within an interrupt even when
we are not tracing interrupts that preempt the trace.

Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Tested-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-07-07 22:26:27 -04:00
Steven Rostedt
e9dbfae53e tracing: Fix bug when reading system filters on module removal
The event system is freed when its nr_events is set to zero. This happens
when a module created an event system and then later the module is
removed. Modules may share systems, so the system is allocated when
it is created and freed when the modules are unloaded and all the
events under the system are removed (nr_events set to zero).

The problem arises when a task opened the "filter" file for the
system. If the module is unloaded and it removed the last event for
that system, the system structure is freed. If the task that opened
the filter file accesses the "filter" file after the system has
been freed, the system will access an invalid pointer.

By adding a ref_count, and using it to keep track of what
is using the event system, we can free it after all users
are finished with the event system.

Cc: <stable@kernel.org>
Reported-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-07-07 11:19:18 -04:00
Masami Hiramatsu
1fd8df2c39 tracing/kprobes: Fix kprobe-tracer to support stack trace
Fix to support kernel stack trace correctly on kprobe-tracer.
Since the execution path of kprobe-based dynamic events is different
from other tracepoint-based events, normal ftrace_trace_stack() doesn't
work correctly. To fix that, this introduces ftrace_trace_stack_regs()
which traces stack via pt_regs instead of current stack register.

e.g.

 # echo p schedule+4 > /sys/kernel/debug/tracing/kprobe_events
 # echo 1 > /sys/kernel/debug/tracing/options/stacktrace
 # echo 1 > /sys/kernel/debug/tracing/events/kprobes/enable
 # head -n 20 /sys/kernel/debug/tracing/trace
            bash-2968  [000] 10297.050245: p_schedule_4: (schedule+0x4/0x4ca)
            bash-2968  [000] 10297.050247: <stack trace>
 => schedule_timeout
 => n_tty_read
 => tty_read
 => vfs_read
 => sys_read
 => system_call_fastpath
     kworker/0:1-2940  [000] 10297.050265: p_schedule_4: (schedule+0x4/0x4ca)
     kworker/0:1-2940  [000] 10297.050266: <stack trace>
 => worker_thread
 => kthread
 => kernel_thread_helper
            sshd-1132  [000] 10297.050365: p_schedule_4: (schedule+0x4/0x4ca)
            sshd-1132  [000] 10297.050365: <stack trace>
 => sysret_careful

Note: Even with this fix, the first entry will be skipped
if the probe is put on the function entry area before
the frame pointer is set up (usually, that is 4 bytes
 (push %bp; mov %sp %bp) on x86), because stack unwinder
depends on the frame pointer.

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: yrl.pp-manager.tt@hitachi.com
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Namhyung Kim <namhyung@gmail.com>
Link: http://lkml.kernel.org/r/20110608070934.17777.17116.stgit@fedora15
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-06-14 22:48:53 -04:00
Steven Rostedt
cf30cf67d6 tracing: Add disable_on_free option
Add a trace option to disable tracing on free. When this option is
set, a write into the free_buffer file will not only shrink the
ring buffer down to zero, but it will also disable tracing.

Cc: Vaibhav Nagarnaik <vnagarnaik@google.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-06-14 22:48:45 -04:00
Steven Rostedt
b1cff0ad10 ftrace: Add internal recursive checks
Witold reported a reboot caused by the selftests of the dynamic function
tracer. He sent me a config and I used ktest to do a config_bisect on it
(as my config did not cause the crash). It pointed out that the problem
config was CONFIG_PROVE_RCU.

What happened was that if multiple callbacks are attached to the
function tracer, we iterate a list of callbacks. Because the list is
managed by synchronize_sched() and preempt_disable, the access to the
pointers uses rcu_dereference_raw().

When PROVE_RCU is enabled, the rcu_dereference_raw() calls some
debugging functions, which happen to be traced. The tracing of the debug
function would then call rcu_dereference_raw() which would then call the
debug function and then... well you get the idea.

I first wrote two different patches to solve this bug.

1) add a __rcu_dereference_raw() that would not do any checks.
2) add notrace to the offending debug functions.

Both of these patches worked.

Talking with Paul McKenney on IRC, he suggested to add recursion
detection instead. This seemed to be a better solution, so I decided to
implement it. As the task_struct already has a trace_recursion to detect
recursion in the ring buffer, and that has a very small number it
allows, I decided to use that same variable to add flags that can detect
the recursion inside the infrastructure of the function tracer.

I plan to change it so that the task struct bit can be checked in
mcount, but as that requires changes to all archs, I will hold that off
to the next merge window.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1306348063.1465.116.camel@gandalf.stny.rr.com
Reported-by: Witold Baryluk <baryluk@smp.if.uj.edu.pl>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-05-25 22:13:49 -04:00
Steven Rostedt
95950c2ecb ftrace: Add self-tests for multiple function trace users
Add some basic sanity tests for multiple users of the function
tracer at startup.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-05-18 19:24:51 -04:00
Steven Rostedt
9a24470b28 tracing: Align 4 byte ints together in struct tracer
Move elements in struct tracer for better alignment.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-03-10 10:34:54 -05:00
David Sharp
750912fa36 tracing: Add an 'overwrite' trace_option.
Add an "overwrite" trace_option for ftrace to control whether the buffer should
be overwritten on overflow or not. The default remains to overwrite old events
when the buffer is full. This patch adds the option to instead discard newest
events when the buffer is full. This is useful to get a snapshot of traces just
after enabling traces. Dropping the current event is also a simpler code path.

Signed-off-by: David Sharp <dhsharp@google.com>
LKML-Reference: <1291844807-15481-1-git-send-email-dhsharp@google.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-03-09 13:52:27 -05:00
Steven Rostedt
bf93f9ed3a tracing/filter: Increase the max preds to 2^14
Now that the filter logic does not require to save the pred results
on the stack, we can increase the max number of preds we allow.
As the preds are index by a short value, and we use the MSBs as flags
we can increase the max preds to 2^14 (16384) which should be way
more than enough.

Cc: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-02-07 20:56:20 -05:00
Steven Rostedt
4a3d27e98a tracing/filter: Move MAX_FILTER_PRED to local tracing directory
The MAX_FILTER_PRED is only needed by the kernel/trace/*.c files.
Move it to kernel/trace/trace.h.

Cc: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-02-07 20:56:20 -05:00
Steven Rostedt
43cd414552 tracing/filter: Optimize filter by folding the tree
There are many cases that a filter will contain multiple ORs or
ANDs together near the leafs. Walking up and down the tree to get
to the next compare can be a waste.

If there are several ORs or ANDs together, fold them into a single
pred and allocate an array of the conditions that they check.
This will speed up the filter by linearly walking an array
and can still break out if a short circuit condition is met.

Cc: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-02-07 20:56:19 -05:00
Steven Rostedt
61e9dea20e tracing/filter: Use a tree instead of stack for filter_match_preds()
Currently the filter_match_preds() requires a stack to push
and pop the preds to determine if the filter matches the record or not.
This has two drawbacks:

1) It requires a stack to store state information. As this is done
   in fast paths we can't allocate the storage for this stack, and
   we can't use a global as it must be re-entrant. The stack is stored
   on the kernel stack and this greatly limits how many preds we
   may allow.

2) All conditions are calculated even when a short circuit exists.
   a || b  will always calculate a and b even though a was determined
   to be true.

Using a tree we can walk a constant structure that will save
the state as we go. The algorithm is simply:

  pred = root;
  do {
	switch (move) {
	case MOVE_DOWN:
		if (OR or AND) {
			pred = left;
			continue;
		}
		if (pred == root)
			break;
		match = pred->fn();
		pred = pred->parent;
		move = left child ? MOVE_UP_FROM_LEFT : MOVE_UP_FROM_RIGHT;
		continue;

	case MOVE_UP_FROM_LEFT:
		/* Only OR or AND can be a parent */
		if (match && OR || !match && AND) {
			/* short circuit */
			if (pred == root)
				break;
			pred = pred->parent;
			move = left child ?
				MOVE_UP_FROM_LEFT :
				MOVE_UP_FROM_RIGHT;
			continue;
		}
		pred = pred->right;
		move = MOVE_DOWN;
		continue;

	case MOVE_UP_FROM_RIGHT:
		if (pred == root)
			break;
		pred = pred->parent;
		move = left child ? MOVE_UP_FROM_LEFT : MOVE_UP_FROM_RIGHT;
		continue;
	}
	done = 1;
  } while (!done);

This way there's no strict limit to how many preds we allow
and it also will short circuit the logical operations when possible.

Cc: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-02-07 20:56:19 -05:00
Steven Rostedt
74e9e58c35 tracing/filter: Allocate the preds in an array
Currently we allocate an array of pointers to filter_preds, and then
allocate a separate filter_pred for each item in the array.
This adds slight overhead in the filters as it needs to derefernce
twice to get to the op condition.

Allocating the preds themselves in a single array removes a dereference
as well as helps on the cache footprint.

Cc: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-02-07 20:56:18 -05:00
Steven Rostedt
c9c53ca03d tracing/filter: Dynamically allocate preds
For every filter that is made, we create predicates to hold every
operation within the filter. We have a max of 32 predicates that we
can hold. Currently, we allocate all 32 even if we only need to
use one.

Part of the reason we do this is that the filter can be used at
any moment by any event. Fortunately, the filter is only used
with preemption disabled. By reseting the count of preds used "n_preds"
to zero, then performing a synchronize_sched(), we can safely
free and reallocate a new array of preds.

Cc: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-02-07 20:56:18 -05:00
Steven Rostedt
58d9a597c4 tracing/filter: Move OR and AND logic out of fn() method
The ops OR and AND act different from the other ops, as they
are the only ones to take other ops as their arguements.
These ops als change the logic of the filter_match_preds.

By removing the OR and AND fn's we can also remove the val1 and val2
that is passed to all other fn's and are unused.

Cc: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-02-07 20:56:18 -05:00
Jiri Olsa
0a772620a2 tracing: Make graph related irqs/preemptsoff functions global
Move trace_graph_function() and print_graph_headers_flags() functions
to the trace_function_graph.c to be globaly available.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
LKML-Reference: <1285243253-7372-3-git-send-email-jolsa@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-10-18 10:53:28 -04:00
Linus Torvalds
4aed2fd8e3 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (162 commits)
  tracing/kprobes: unregister_trace_probe needs to be called under mutex
  perf: expose event__process function
  perf events: Fix mmap offset determination
  perf, powerpc: fsl_emb: Restore setting perf_sample_data.period
  perf, powerpc: Convert the FSL driver to use local64_t
  perf tools: Don't keep unreferenced maps when unmaps are detected
  perf session: Invalidate last_match when removing threads from rb_tree
  perf session: Free the ref_reloc_sym memory at the right place
  x86,mmiotrace: Add support for tracing STOS instruction
  perf, sched migration: Librarize task states and event headers helpers
  perf, sched migration: Librarize the GUI class
  perf, sched migration: Make the GUI class client agnostic
  perf, sched migration: Make it vertically scrollable
  perf, sched migration: Parameterize cpu height and spacing
  perf, sched migration: Fix key bindings
  perf, sched migration: Ignore unhandled task states
  perf, sched migration: Handle ignored migrate out events
  perf: New migration tool overview
  tracing: Drop cpparg() macro
  perf: Use tracepoint_synchronize_unregister() to flush any pending tracepoint call
  ...

Fix up trivial conflicts in Makefile and drivers/cpufreq/cpufreq.c
2010-08-06 09:30:52 -07:00
Jason Wessel
955b61e597 ftrace,kdb: Extend kdb to be able to dump the ftrace buffer
Add in a helper function to allow the kdb shell to dump the ftrace
buffer.

Modify trace.c to expose the capability to iterate over the ftrace
buffer in a read only capacity.

Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
CC: Frederic Weisbecker <fweisbec@gmail.com>
2010-08-05 09:22:23 -05:00
Ingo Molnar
3a01736e70 Merge branch 'tip/perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/core 2010-07-23 09:10:29 +02:00