Add a new flag field while opening a tracepoint perf counter:
-e tracepoint_subsystem:tracepoint_name:flags
This is intended to be generic although for now it only supports the
r[e[c[o[r[d]]]]] flag:
./perf record -e workqueue:workqueue_insertion:record
./perf record -e workqueue:workqueue_insertion:r
will have the same effect: enabling the raw samples record for
the given tracepoint counter.
In the future, we may want to support further flags, separated
by commas.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1250152039-7284-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
It is better than showing the map addr, this way at least we
know that we can't get the symtabs because the DSO was deleted
(system update) while an app still used such DSO.
Yeah, don't do that, but if you do, you'll figure it out
quicker this way.
[acme@doppio linux-2.6-tip]$ perf report | head -15
# Samples: 3796
#
# Overhead Command Shared Object Symbol
# ........ ....... ................................................................... ......
#
23.55% pidgin /lib64/libglib-2.0.so.0.2000.4.#prelink#.Pd98lu (deleted) [.] 0x00000000038844
21.55% pidgin /lib64/libpthread-2.10.1.so.#prelink#.AFwK8Q (deleted) [.] 0x0000000000a42d
10.85% pidgin [kernel] [.] vread_hpet
7.85% pidgin /lib64/libgobject-2.0.so.0.2000.4.#prelink#.o1vpU7 (deleted) [.] 0x00000000014de8
3.35% pidgin /lib64/libc-2.10.1.so (deleted) [.] 0x0000000007a875
3.19% pidgin /lib64/libdbus-1.so.3.4.0.#prelink#.6mwgZP (deleted) [.] 0x0000000001d254
3.06% pidgin /usr/lib64/libgtk-x11-2.0.so.0.1600.5.#prelink#.511hAl (deleted) [.] 0x000000002334e7
2.90% pidgin /usr/lib64/libgdk-x11-2.0.so.0.1600.5.#prelink#.5qlMo1 (deleted) [.] 0x00000000037b2d
1.84% pidgin [kernel] [k] do_sys_poll
1.45% pidgin /usr/lib64/libX11.so.6.2.0.#prelink#.iR59Rx (deleted) [.] 0x0000000004c751
[acme@doppio linux-2.6-tip]$
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Luis Claudio R. Gonçalves <lclaudio@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20090811200436.GA3478@ghostprotocols.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Sometimes we get callchain branches that have a rate under the
limit given by the user.
Say you launched:
perf record -f -g -a ./hackbench 10
perf report -g fractal,10.0
And you got:
2.33% hackbench [kernel] [k] _spin_lock_irqsave
|
|--78.57%-- remove_wait_queue
| poll_freewait
| do_sys_poll
| sys_poll
| sysenter_dispatch
| 0xf7ffa430
| 0x1ffadea3c
|
|--7.14%-- __up_read
| up_read
| do_page_fault
| page_fault
| 0xf7ffa430
| 0xa0df710000000a
...
It is abnormal to get a 7.14% branch whereas we passed a 10%
filter.
The problem is that we round down the minimum threshold. This
happens mostly when we have very low number of events. If the
total amount of your branch is 4 and you have a subranch of 3
events, filtering to 90% will be computed like follows:
limit = 4 * 0.9;
The result is about 3.6, but the cast to integer will round
down to 3. It means that our filter is actually of 75%
We must then explicitly round up the minimum threshold.
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: acme@redhat.com
Cc: peterz@infradead.org
Cc: efault@gmx.de
LKML-Reference: <20090809024235.GA10146@nowhere>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
If we recorded with -g option to record the callchain, right now
we require a -g option to perf report as well - and people reported
this as unnecessary complication: the user already specified -g
once, no need to require it a second time.
So if the recording includes call-chains, display the callchain by
default from perf report.
( The user can override this default using "-g none" option from
perf report. )
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1249690585-9145-3-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
When the callchain tree comes to insert an empty backtrace, it
raises a spurious warning about the fact we are inserting an
empty. This is spurious because the radix tree assumes it did
something wrong to reach this state. But it didn't, we just met
an empty callchain that has to be ignored.
This happens occasionally with certain types of call-chain
recordings. If it happens it's a big nuisance as perf report
output starts with thousands of warning lines.
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1249690585-9145-2-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
While toying with perf, I've noticed that perf record can
easily enter a busy loop when doing something as silly as:
$ perf record -A ls
Yeah, do_read here really wants to read a known size, not being
able to should die(), not busy-loop ;)
That was the cause for the bug.
Signed-off-by: Pierre Habouzit <pierre.habouzit@intersec.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Stop perf list from displaying tracepoints without an id file,
those are special tracepoints that are not interfaced to
perfcounters so listing them is erroneous and passing them as
events will produce no output.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Jason Baron <jbaron@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Chris Mason <chris.mason@oracle.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Used with perf report --verbose:
[acme@doppio linux-2.6-tip]$ perf report -v | head -16
5.17% firefox /usr/lib64/xulrunner-1.9.1/libxul.so 0x00000000005d8eee f [.] imgContainer::DrawFrameTo(gfxIImageFrame*, gfxIImageFrame*, nsRect&)
2.56% firefox /lib64/libpthread-2.10.1.so 0x0000000000008e02 d [.] __pthread_mutex_lock_internal
1.94% firefox /usr/lib64/xulrunner-1.9.1/libxul.so 0x0000000000d0af8f f [.] SearchTable
1.75% firefox [kernel] 0xffffffffff60013b k [.] vread_hpet
1.63% firefox /lib64/libpthread-2.10.1.so 0x000000000000a404 d [.] __pthread_mutex_unlock
1.47% firefox /usr/lib64/xulrunner-1.9.1/libmozjs.so 0x00000000000482ea f [.] js_Interpret
1.42% firefox /usr/lib64/xulrunner-1.9.1/libmozjs.so 0x000000000003eda3 f [.] JS_CallTracer
1.24% firefox [kernel] 0xffffffff8102ca4a k [k] read_hpet
1.16% firefox [kernel] 0xffffffff810f3dd4 k [k] fget_light
1.11% firefox /usr/lib64/xulrunner-1.9.1/libmozjs.so 0x00000000000567ff f [.] js_TraceObject
0.98% firefox /usr/lib64/firefox-3.5.2/firefox 0x000000000000dd23 b [.] arena_ralloc
[acme@doppio linux-2.6-tip]$
The new field is just after the symbol address. To help in
figuring out symbol resolution bugs.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Brice Goglin reported:
> I can easily sort them by thread id, but I don't know how to match
> my 4 events with each group of 4 lines.
Also report the counter id and the time running/enabled
stats (in case the counter got time-shared).
Reported-by: Brice Goglin <Brice.Goglin@inria.fr>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Tested-by: Brice Goglin <Brice.Goglin@inria.fr>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Brice Goglin reported that only the first result from a
multi-counter perf record --stat run is accurate, the
rest looks bogus.
A silly mistake made us re-read the first attribute for
every recorded attribute.
Reported-by: Brice Goglin <Brice.Goglin@inria.fr>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Tested-by: Brice Goglin <Brice.Goglin@inria.fr>
Cc: paulus@samba.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The callchain fractal mode builds each new total hits in a new
branch of profiling by using the parent's hits of the current
branch plus the hits of the children.
This is wrong, the total hits of a branch should be made of the
sum of every children hits, we must ignore the parent hits in
this scope.
This patch also fixes another mistake with the hit counting.
Now the rates are correct.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
In some cases distros have binaries and debuginfo in weird places:
[root@doppio tuna]# ls -la /usr/lib64/{xulrunner-1.9.1/xulrunner-stub,firefox-3.5.2/firefox}
-rwxr-xr-x 1 root root 90024 2009-08-03 19:45 /usr/lib64/firefox-3.5.2/firefox
-rwxr-xr-x 1 root root 90024 2009-08-03 18:23 /usr/lib64/xulrunner-1.9.1/xulrunner-stub
[root@doppio tuna]# sha1sum /usr/lib64/{xulrunner-1.9.1/xulrunner-stub,firefox-3.5.2/firefox}
19a858077d263d5de22c9c5da250d3e4396ae739 /usr/lib64/xulrunner-1.9.1/xulrunner-stub
19a858077d263d5de22c9c5da250d3e4396ae739 /usr/lib64/firefox-3.5.2/firefox
[root@doppio tuna]# rpm -qf /usr/lib64/{xulrunner-1.9.1/xulrunner-stub,firefox-3.5.2/firefox}
xulrunner-1.9.1.2-1.fc11.x86_64
firefox-3.5.2-2.fc11.x86_64
[root@doppio tuna]# ls -la /usr/lib/debug/{usr/lib64/xulrunner-1.9.1/xulrunner-stub,usr/lib64/firefox-3.5.2/firefox}.debug
ls: cannot access /usr/lib/debug/usr/lib64/firefox-3.5.2/firefox.debug: No such file or directory
-rwxr-xr-x 1 root root 403608 2009-08-03 18:22 /usr/lib/debug/usr/lib64/xulrunner-1.9.1/xulrunner-stub.debug
Seemingly we don't have a .symtab when we actually can find it
if we use the .note.gnu.build-id ELF section put in place by
some distros. Use it and find the symbols we need.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Since the C++ demangling isn't needed for everybody and
bfd/iberty aren't widely/easily available on all machines, make
it optional.
It also allows you to forcefully disable demangling by using
NO_DEMANGLE=1 and otherwise tries to detect libbfd/libiberty
combinations that result in a compiling demangler.
Reported-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Kyle McMartin <kyle@mcmartin.ca>
LKML-Reference: <20090801082048.GX12579@kernel.dk>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch fixes a spelling error that has resulted from copy
and pasting. The location of the error was found using a
semantic patch but the semantic patch was not trying to find
these errors. After looking things over it seemed logical that
this change was needed. Please review it and then include the
patch if it is in fact the correct change.
Signed-off-by: Stoyan Gaydarov <sgayda2@uiuc.edu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1248949529-20891-1-git-send-email-sgayda2@uiuc.edu>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
vmlinux meets the criteria for symbol adjustment, which breaks vmlinux generated symbols.
Fix this by exempting vmlinux. This is a bit fragile in that someone could change the
kernel dso's name, but currently that name is also hardwired.
Signed-off-by: Mike Galbraith <efault@gmx.de>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1248091298.18702.18.camel@marge.simson.net>
If "/sys/kernel/debug" is not a debugfs mount point, search for the debugfs
filesystem in /proc/mounts, but also allows the user to specify
'--debugfs-dir=blah' or set the environment variable: 'PERF_DEBUGFS_DIR'
Signed-off-by: Jason Baron <jbaron@redhat.com>
[ also made it probe "/debug" by default ]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20090721181629.GA3094@redhat.com>
for some reason, this structure gets compiled as 36 bytes in some files
(the ones that alloacte it) but 40 bytes in others (the ones that use it).
The cause is an off_t type that gets a different size in different
compilation units for some yet-to-be-explained reason.
But the effect is disasterous; the size/offset members of the struct
are at different offsets, and result in mostly complete garbage.
The parser in perf is so robust that this all gets hidden, and after
skipping an certain amount of samples, it recovers.... so this bug
is not normally noticed.
.... except when you want every sample to be exact.
Fix this by just using an explicitly sized type.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <4A655917.9080504@linux.intel.com>
The strlist__entry method allows accessing strlists like an
array, will be used in the 'perf report' to access the first
entry.
We now keep the nr_entries so that we can check if we have just
one entry, will be used in 'perf report' to improve the output
by showing just at the top when we have just, say, one DSO.
While at it use nr_entries to optimize strlist__is_empty by not
using the far more costly rb_first based implementation.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1247325517-12272-2-git-send-email-acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>