There are some places in the kernel that define several tracepoints and
they are all identical besides the name. The code to enable, disable and
record is created for every trace point even if most of the code is
identical.
This patch adds TRACE_EVENT_TEMPLATE that lets the developer create
a template TRACE_EVENT and create trace points with DEFINE_EVENT, which
is based off of a given template. Each trace point used by this
will share most of the code, and bring down the size of the kernel
when there are several duplicate events.
Usage is:
TRACE_EVENT_TEMPLATE(name, proto, args, tstruct, assign, print);
Which would be the same as defining a normal TRACE_EVENT.
To create the trace events that the trace points will use:
DEFINE_EVENT(template, name, proto, args) is done. The template
is the name of the TRACE_EVENT_TEMPLATE to use. The name is the
name of the trace point. The parameters proto and args must be the same
as the proto and args of the template. If they are not the same,
then a compile error will result. I tried hard removing this duplication
but the C preprocessor is not powerful enough (or my CPP magic
experience points is not at a high enough level) to not need them.
A lot of trace events are coming in with new XFS development. Most of
the trace points are identical except for the name. The following shows
the advantage of having TRACE_EVENT_TEMPLATE:
$ size fs/xfs/xfs.o.*
text data bss dec hex filename
452114 2788 3520 458422 6feb6 fs/xfs/xfs.o.old
638482 38116 3744 680342 a6196 fs/xfs/xfs.o.template
996954 38116 4480 1039550 fdcbe fs/xfs/xfs.o.trace
xfs.o.old is without any tracepoints.
xfs.o.template uses the new TRACE_EVENT_TEMPLATE.
xfs.o.trace uses the current TRACE_EVENT macros.
Requested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Fix a misplaced ifdef. We need the perf event headers also in
off-case to avoid the following build error:
include/linux/hw_breakpoint.h:94: error: expected declaration specifiers or '...' before 'perf_callback_t'
include/linux/hw_breakpoint.h:102: error: expected declaration specifiers or '...' before 'perf_callback_t'
include/linux/hw_breakpoint.h:109: error: expected declaration specifiers or '...' before 'perf_callback_t'
include/linux/hw_breakpoint.h:116: error: expected declaration specifiers or '...' before 'perf_callback_t'
Reported-by: Kisskb-bot by Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Prasad <prasad@linux.vnet.ibm.com>
LKML-Reference: <1259011812-8093-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Now that we can check the buildid to see if it really matches,
this can be done safely:
vmlinux
/boot/vmlinux
/boot/vmlinux-<uts.release>
/lib/modules/<uts.release>/build/vmlinux
/usr/lib/debug/lib/modules/%s/vmlinux
More can be added - if you know about distros that put the
vmlinux somewhere else please let us know.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1259001550-8194-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add the breakpoint events support with this new sysnopsis:
mem:addr[:access]
Where addr is a raw addr value in the kernel and access can be
either [r][w][x]
Example to profile tasklist_lock:
$ grep tasklist_lock /proc/kallsyms
ffffffff8189c000 D tasklist_lock
$ perf record -e mem:0xffffffff8189c000:rw -a -f -c 1
$ perf report
# Samples: 62
#
# Overhead Command Shared Object Symbol
# ........ ............... ............. ......
#
29.03% swapper [kernel] [k] _raw_read_trylock
29.03% swapper [kernel] [k] _raw_read_unlock
19.35% init [kernel] [k] _raw_read_trylock
19.35% init [kernel] [k] _raw_read_unlock
1.61% events/0 [kernel] [k] _raw_read_trylock
1.61% events/0 [kernel] [k] _raw_read_unlock
Coming soon:
- Support for symbols in the event definition.
- Default period to 1 for breakpoint events because these are
not high frequency events. The same thing is needed for trace
events.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Prasad <prasad@linux.vnet.ibm.com>
LKML-Reference: <1258987355-8751-4-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Prasad <prasad@linux.vnet.ibm.com>
Add the remaining necessary bits to support breakpoints created
through perf syscall.
We don't use the software counter interface as:
- We don't need to check against recursion, this is already done
in hardware breakpoints arch level.
- We already know the perf event we are dealing with when the
event is to be committed.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Prasad <prasad@linux.vnet.ibm.com>
LKML-Reference: <1258987355-8751-3-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Perf tools create perf events as disabled in the beginning.
Breakpoints are then considered like ptrace temporary
breakpoints, only meant to reserve a breakpoint slot until we
get all the necessary informations from the user.
In this case, we don't check the address that is breakpointed as
it is NULL in the ptrace case.
But perf tools don't have the same purpose, events are created
disabled to wait for all events to be created before enabling
all of them. We want to check the breakpoint parameters in this
case.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Prasad <prasad@linux.vnet.ibm.com>
LKML-Reference: <1258987355-8751-2-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
It is quite possible to call update_event_times() on a context
that isn't actually running and thereby confuse the thing.
perf stat was reporting !100% scale values for software counters
(2e2af50b perf_events: Disable events when we detach them,
solved the worst of that, but there was still some left).
The thing that happens is that because we are not self-reaping
(we have a caring parent) there is a time between the last
schedule (out) and having do_exit() called which will detach the
events.
This period would be accounted as enabled,!running because the
event->state==INACTIVE, even though !event->ctx->is_active.
Similar issues could have been observed by calling read() on a
event while the attached task was not scheduled in.
Solve this by teaching update_event_times() about
ctx->is_active.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1258984836.4531.480.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Make perf_swevent_get_recursion_context return a context number
and disable preemption.
This could be used to remove the IRQ disable from the trace bit
and index the per-cpu buffer with.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <20091123103819.993226816@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Move the update_event_times() call in __perf_event_exit_task()
into list_del_event() because that holds the proper lock
(ctx->lock) and seems a more natural place to do the last time
update.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20091123103819.842455480@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
It appeared we did call update_event_times() on exit, but we
failed to update the context time, which renders the former
moot.
Locking is a bit iffy, we call update_event_times under
ctx->mutex instead of ctx->lock - the next patch fixes this.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20091123103819.764207355@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
gcc with no flags typically is a sane default for systems to
use, and looking at the running kernel is probably broken for
cross-builds anyway, so let's not do this. Add EXTRA_CFLAGS so
that users can override default gcc mode if they want to.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Arjan van de Ven <arjan@infradead.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <20091122121335.GA24254@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>