Commit Graph

350382 Commits

Author SHA1 Message Date
Steven Rostedt (Red Hat) 7818b38865 ftrace: Use manual free after synchronize_sched() not call_rcu_sched()
The entries to the probe hash must be freed after a synchronize_sched()
after the entry has been removed from the hash.

As the entries are registered with ops that may have their own callbacks,
and these callbacks may sleep, we can not use call_rcu_sched() because
the rcu callbacks registered with that are called from a softirq context.

Instead of using call_rcu_sched(), manually save the entries on a free_list
and at the end of the loop that removes the entries, do a synchronize_sched()
and then go through the free_list, freeing the entries.

Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15 00:36:03 -04:00
Steven Rostedt (Red Hat) e67efb93f0 ftrace: Clean up function probe methods
When a function probe is created, each function that the probe is
attached to, a "callback" method is called. On release of the probe,
each function entry calls the "free" method.

First, "callback" is a confusing name and does not really match what
it does. Callback sounds like it will be called when the probe
triggers. But that's not the case. This is really an "init" function,
so lets rename it as such.

Secondly, both "init" and "free" do not pass enough information back
to the handlers. Pass back the ops, ip and data for each time the
method is called. We have the information, might as well use it.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15 00:36:02 -04:00
Steven Rostedt (Red Hat) 57d01ad097 tracing: Fix comments for ftrace_event_file/call flags
Most of the flags for the struct ftrace_event_file were moved over
to the flags of the struct ftrace_event_call, but the comments were
never updated.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15 00:36:02 -04:00
Steven Rostedt (Red Hat) 77fd5c15e3 tracing: Add snapshot trigger to function probes
echo 'schedule:snapshot:1' > /debug/tracing/set_ftrace_filter

This will cause the scheduler to trigger a snapshot the next time
it's called (you can use any function that's not called by NMI).

Even though it triggers only once, you still need to remove it with:

 echo '!schedule:snapshot:0' > /debug/tracing/set_ftrace_filter

The :1 can be left off for the first command:

 echo 'schedule:snapshot' > /debug/tracing/set_ftrace_filter

But this will cause all calls to schedule to trigger a snapshot.
This must be removed without the ':0'

 echo '!schedule:snapshot' > /debug/tracing/set_ftrace_filter

As adding a "count" is a different operation (internally).

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15 00:36:01 -04:00
Steven Rostedt (Red Hat) 3209cff449 tracing: Add alloc/free_snapshot() to replace duplicate code
Add alloc_snapshot() and free_snapshot() to allocate and free the
snapshot buffer respectively, and use these to remove duplicate
code.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15 00:36:00 -04:00
Steven Rostedt (Red Hat) e1df4cb682 ftrace: Fix function probe to only enable needed functions
Currently the function probe enables all functions and runs a "hash"
against every function call to see if it should call a probe. This
is extremely wasteful.

Note, a probe is something like:

  echo schedule:traceoff > /debug/tracing/set_ftrace_filter

When schedule is called, the probe will disable tracing. But currently,
it has a call back for *all* functions, and checks to see if the
called function is the probe that is needed.

The probe function has been created before ftrace was rewritten to
allow for more than one "op" to be registered by the function tracer.
When probes were created, it couldn't limit the functions without also
limiting normal function calls. But now we can, it's about time
to update the probe code.

Todo, have separate ops for different entries. That is, assign
a ftrace_ops per probe, instead of one op for all probes. But
as there's not many probes assigned, this may not be that urgent.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15 00:36:00 -04:00
Steven Rostedt (Red Hat) 8380d24860 ftrace: Separate unlimited probes from count limited probes
The function tracing probes that trigger traceon or traceoff can be
set to unlimited, or given a count of # of times to execute.

By separating these two types of probes, we can then use the dynamic
ftrace function filtering directly, and remove the brute force
"check if this function called is my probe" routines in ftrace.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15 00:35:59 -04:00
Steven Rostedt (Red Hat) 8b8fa62c60 tracing: Consolidate ftrace_trace_onoff_unreg() into callback
The only thing ftrace_trace_onoff_unreg() does is to do a strcmp()
against the cmd parameter to determine what op to unregister. But
this compare is also done after the location that this function is
called (and returns). By moving the check for '!' to unregister after
the strcmp(), the callback function itself can just do the unregister
and we can get rid of the helper function.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15 00:35:58 -04:00
Steven Rostedt (Red Hat) 1c31714328 tracing: Consolidate updating of count for traceon/off
Remove some duplicate code and replace it with a helper function.
This makes the code a it cleaner.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15 00:35:58 -04:00
Steven Rostedt (Red Hat) 1b22e382ab tracing: Let tracing_snapshot() be used by modules but not NMI
Add EXPORT_SYMBOL_GPL() to let the tracing_snapshot() functions be
called from modules.

Also add a test to see if the snapshot was called from NMI context
and just warn in the tracing buffer if so, and return.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15 00:35:57 -04:00
Steven Rostedt (Red Hat) ca268da6e4 tracing: Add internal ftrace trace_puts() for ftrace to use
There's a few places that ftrace uses trace_printk() for internal
use, but this requires context (normal, softirq, irq, NMI) buffers
to keep things lockless. But the trace_puts() does not, as it can
write the string directly into the ring buffer. Make a internal helper
for trace_puts() and have the internal functions use that.

This way the extra context buffers are not used.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15 00:35:56 -04:00
Steven Rostedt (Red Hat) 9d3c752c06 tracing: Optimize trace_printk() with one arg to use trace_puts()
Although trace_printk() is extremely fast, especially when it uses
trace_bprintk() (writes args straight to buffer instead of inserting
into string), it still has the overhead of calling one of the printf
sprintf() functions, that need to scan the fmt string to determine
what, if any args it has.

This is a waste of precious CPU cycles if the printk format has no
args but a single constant string. It is better to use trace_puts()
which does not have the overhead of the fmt scanning.

But wouldn't it be nice if the developer didn't have to think about
such things, and the compile would just do it for them?

  trace_printk("this string has no args\n");
  [...]
  trace_printk("this sting does %p %d\n", foo, bar);

As tracing is critical to have the least amount of overhead,
especially when dealing with race conditions, and you want to
eliminate any "Heisenbugs", you want the trace_printk() to use the
fastest possible means of tracing.

Currently the macro magic determines if it will use trace_bprintk()
or if the fmt is a dynamic string (a variable), it will fall
back to the slow trace_printk() method that does a full snprintf()
before copying it into the buffer, where as trace_bprintk() only
copys the pointer to the fmt and the args into the buffer.

Well, now there's a way to spend some more Hogwarts cash and come
up with new fancy macro magic.

  #define trace_printk(fmt, ...)			\
  do {							\
	char _______STR[] = __stringify((__VA_ARGS__));	\
	if (sizeof(_______STR) > 3)			\
		do_trace_printk(fmt, ##__VA_ARGS__);	\
	else						\
		trace_puts(fmt);			\
  } while (0)

The above needs a bit of explaining (both here and in the comments).

By stringifying the __VA_ARGS__, we can, at compile time, determine
the number of args that are being passed to trace_printk(). The extra
parenthesis are required, otherwise the compiler complains about
too many parameters for __stringify if there is more than one arg.

When there are no args, the __stringify((__VA_ARGS__)) converts into
"()\0", a string of 3 characters. Anything else, will be a string
containing more than 3 characters. Now we assign that string to a
dynamic char array, and then take the sizeof() of that array.
If it is greater than 3 characters, we know trace_printk() has args
and we need to do the full "do_trace_printk()" on them, otherwise
it was only passed a single arg and we can optimize to use trace_puts().

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Steven "The King of Nasty Macros!" Rostedt <rostedt@goodmis.org>
2013-03-15 00:35:56 -04:00
Steven Rostedt (Red Hat) 09ae72348e tracing: Add trace_puts() for even faster trace_printk() tracing
The trace_printk() is extremely fast and is very handy as it can be
used in any context (including NMIs!). But it still requires scanning
the fmt string for parsing the args. Even the trace_bprintk() requires
a scan to know what args will be saved, although it doesn't copy the
format string itself.

Several times trace_printk() has no args, and wastes cpu cycles scanning
the fmt string.

Adding trace_puts() allows the developer to use an even faster
tracing method that only saves the pointer to the string in the
ring buffer without doing any format parsing at all. This will
help remove even more of the "Heisenbug" effect, when debugging.

Also fixed up the F_printk()s for the ftrace internal bprint and print events.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15 00:35:55 -04:00
Steven Rostedt (Red Hat) 153e8ed913 tracing: Fix the branch tracer that broke with buffer change
The changce to add the trace_buffer struct to have the trace array
have both the main buffer and max buffer broke the branch tracer
because the change did not update that code. As the branch tracer
adds a significant amount of overhead, and must be selected via
a selection (not a allyesconfig) it was missed in testing.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15 00:35:54 -04:00
Steven Rostedt (Red Hat) 55034cd6e6 tracing: Add alloc_snapshot kernel command line parameter
If debugging the kernel, and the developer wants to use
tracing_snapshot() in places where tracing_snapshot_alloc() may
be difficult (or more likely, the developer is lazy and doesn't
want to bother with tracing_snapshot_alloc() at all), then adding

  alloc_snapshot

to the kernel command line parameter will tell ftrace to allocate
the snapshot buffer (if configured) when it allocates the main
tracing buffer.

I also noticed that ring_buffer_expanded and tracing_selftest_disabled
had inconsistent use of boolean "true" and "false" with "0" and "1".
I cleaned that up too.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15 00:35:53 -04:00
Steven Rostedt (Red Hat) f4e781c0a8 tracing: Move the tracing selftest code into its own function
Move the tracing startup selftest code into its own function and
when not enabled, always have that function succeed.

This makes the register_tracer() function much more readable.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15 00:35:53 -04:00
Steven Rostedt (Red Hat) f5eb558826 ring-buffer: Do not use schedule_work_on() for current CPU
The ring buffer updates when done while the ring buffer is active,
needs to be completed on the CPU that is used for the ring buffer
per_cpu buffer. To accomplish this, schedule_work_on() is used to
schedule work on the given CPU.

Now there's no reason to use schedule_work_on() if the process
doing the update happens to be on the CPU that it is processing.
It has already filled the requirement. Instead, just do the work
and continue.

This is needed for tracing_snapshot_alloc() where it may be called
really early in boot, where the work queues have not been set up yet.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15 00:35:52 -04:00
Steven Rostedt (Red Hat) ad909e21bb tracing: Add internal tracing_snapshot() functions
The new snapshot feature is quite handy. It's a way for the user
to take advantage of the spare buffer that, until then, only
the latency tracers used to "snapshot" the buffer when it hit
a max latency. Now users can trigger a "snapshot" manually when
some condition is hit in a program. But a snapshot currently can
not be triggered by a condition inside the kernel.

With the addition of tracing_snapshot() and tracing_snapshot_alloc(),
snapshots can now be taking when a condition is hit, and the
developer wants to snapshot the case without stopping the trace.

Note, any snapshot will overwrite the old one, so take care
in how this is done.

These new functions are to be used like tracing_on(), tracing_off()
and trace_printk() are. That is, they should never be called
in the mainline Linux kernel. They are solely for the purpose
of debugging.

The tracing_snapshot() will not allocate a buffer, but it is
safe to be called from any context (except NMIs). But if a
snapshot buffer isn't allocated when it is called, it will write
to the live buffer, complaining about the lack of a snapshot
buffer, and then stop tracing (giving you the "permanent snapshot").

tracing_snapshot_alloc() will allocate the snapshot buffer if
it was not already allocated and then take the snapshot. This routine
*may sleep*, and must be called from context that can sleep.
The allocation is done with GFP_KERNEL and not atomic.

If you need a snapshot in an atomic context, say in early boot,
then it is best to call the tracing_snapshot_alloc() before then,
where it will allocate the buffer, and then you can use the
tracing_snapshot() anywhere you want and still get snapshots.

Cc: Hiraku Toyooka <hiraku.toyooka.gu@hitachi.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15 00:35:51 -04:00
Steven Rostedt (Red Hat) a695cb5816 tracing: Prevent deleting instances when they are being read
Add a ref count to the trace_array structure and prevent removal
of instances that have open descriptors.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15 00:35:51 -04:00
Steven Rostedt (Red Hat) 121aaee7b0 tracing: Add per_cpu directory into tracing instances
Add the per_cpu directory to the created tracing instances:

  cd /sys/kernel/debug/tracing/instances
  mkdir foo
  ls foo/per_cpu/cpu0
buffer_size_kb	snapshot_raw  trace	  trace_pipe_raw
snapshot	stats	      trace_pipe

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15 00:35:50 -04:00
Steven Rostedt (Red Hat) ce9bae5597 tracing: Add snapshot feature to instances
Add the "snapshot" file to the the multi-buffer instances.

  cd /sys/kernel/debug/tracing/instances
  mkdir foo
  ls foo
buffer_size_kb  buffer_total_size_kb  events  free_buffer  set_event
snapshot  trace  trace_clock  trace_marker  trace_options  trace_pipe
tracing_on
  cat foo/snapshot
 # tracer: nop
 #
 #
 # * Snapshot is freed *
 #
 # Snapshot commands:
 # echo 0 > snapshot : Clears and frees snapshot buffer
 # echo 1 > snapshot : Allocates snapshot buffer, if not already allocated.
 #                      Takes a snapshot of the main buffer.
 # echo 2 > snapshot : Clears snapshot buffer (but does not allocate)
 #                      (Doesn't have to be '2' works with any number that
 #                       is not a '0' or '1')

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15 00:35:49 -04:00
Steven Rostedt (Red Hat) 737223fbca tracing: Consolidate buffer allocation code
There's a bit of duplicate code in creating the trace buffers for
the normal trace buffer and the max trace buffer among the instances
and the main global_trace. This code can be consolidated and cleaned
up a bit making the code cleaner and more readable as well as less
duplication.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15 00:35:49 -04:00
Steven Rostedt (Red Hat) 45ad21ca55 tracing: Have trace_array keep track if snapshot buffer is allocated
The snapshot buffer belongs to the trace array not the tracer that is
running. The trace array should be the data structure that keeps track
of whether or not the snapshot buffer is allocated, not the tracer
desciptor. Having the trace array keep track of it makes modifications
so much easier.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15 00:35:48 -04:00
Steven Rostedt (Red Hat) 6de58e6269 tracing: Add snapshot_raw to extract the raw data from snapshot
Add a 'snapshot_raw' per_cpu file that allows tools to read the raw
binary data of the snapshot buffer.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15 00:35:47 -04:00
Steven Rostedt (Red Hat) 0b85ffc293 tracing: Add config option to allow snapshot to swap per cpu
When the preempt or irq latency tracers are enabled, they require
the ring buffer to be able to swap the per cpu sub buffers between
two main buffers. This adds a slight overhead to tracing as the
trace recording needs to perform some checks to synchronize
between recording and swaps that might be happening on other CPUs.

The config RING_BUFFER_ALLOW_SWAP is set when a user of the ring
buffer needs the "swap cpu" feature, otherwise the extra checks
are not implemented and removed from the tracing overhead.

The snapshot feature will swap per CPU if the RING_BUFFER_ALLOW_SWAP
config is set. But that only gets set by things like OPROFILE
and the irqs and preempt latency tracers.

This config is added to let the user decide to include this feature
with the snapshot agnostic from whether or not another user of
the ring buffer sets this config.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15 00:35:47 -04:00