Certain versions of GCC dont see the initialization that is done here:
builtin-report.c: In function ‘__cmd_report’:
builtin-report.c:1038: warning: ‘syms’ may be used uninitialized in this function
So annotate it with a NULL initialization.
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Among perf annotate, perf report and perf top, we can find the
common colored printing of percents according to the following
rules:
High overhead = > 5%, colored in red
Mid overhead = > 0.5%, colored in green
Low overhead = < 0.5%, default color
Factorize these multiple checks in a single function named
percent_color_fprintf() and also provide a get_percent_color()
for sites which print percentages and other things at the same
time.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1246558475-10624-2-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Callchains output may become a burden on a trace because even
rarely hit site are exposed. This can be too much information.
Let the user set a threshold as a minimum percent of hits using
the new pattern for the -c option:
-c mode,min_percent
Example:
$ perf report -s sym -c flat,4
8.25% [k] copy_user_generic_string
4.19%
copy_user_generic_string
generic_file_aio_read
do_sync_read
vfs_read
sys_pread64
system_call_fastpath
pread64
5.39% [k] search_by_key
4.63% 0x00000000009e0a
2.36% [k] memcpy_c
[...]
$ perf report -s sym -c graph,2
8.25% [k] copy_user_generic_string
|
|--4.31%-- generic_file_aio_read
| do_sync_read
| vfs_read
| |
| --4.19%-- sys_pread64
| system_call_fastpath
| pread64
|
--3.24%-- generic_file_buffered_write
__generic_file_aio_write_nolock
generic_file_aio_write
do_sync_write
reiserfs_file_write
vfs_write
|
--3.14%-- sys_pwrite64
system_call_fastpath
__pwrite64
5.39% [k] search_by_key
|
--2.23%-- reiserfs_update_sd_size
4.63% 0x00000000009e0a
2.36% [k] memcpy_c
[...]
You can also omit it and it will default to 0.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1246558475-10624-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Currently, the printing of callchains is done in a single
vertical level, this is the "flat" mode:
8.25% [k] copy_user_generic_string
4.19%
copy_user_generic_string
generic_file_aio_read
do_sync_read
vfs_read
sys_pread64
system_call_fastpath
pread64
This patch introduces a new "graph" mode which provides a
hierarchical output of factorized paths recursively sorted:
8.25% [k] copy_user_generic_string
|
|--4.31%-- generic_file_aio_read
| do_sync_read
| vfs_read
| |
| |--4.19%-- sys_pread64
| | system_call_fastpath
| | pread64
| |
| --0.12%-- sys_read
| system_call_fastpath
| __read
|
|--3.24%-- generic_file_buffered_write
| __generic_file_aio_write_nolock
| generic_file_aio_write
| do_sync_write
| reiserfs_file_write
| vfs_write
| |
| |--3.14%-- sys_pwrite64
| | system_call_fastpath
| | __pwrite64
| |
| --0.10%-- sys_write
[...]
The command line has then changed.
By providing the -c option, the callchain will output in the
flat mode by default.
But you can override it:
perf report -c graph
or
perf report -c flat
You can also pass the abreviated mode:
perf report -c g
or
perf report -c gra
will both make use of the graph mode.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1246550301-8954-3-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The copy we were using came from another copy I did for the dwarves
(pahole) package, that came from the kernel years ago.
The only function that is used by the perf tools and that isn't in the
kernel is list_del_range, that I'm leaving in the perf tools only for
now.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <20090701174608.GA5823@ghostprotocols.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Enable -Wextra. This found a few real bugs plus a number
of signed/unsigned type mismatches/uncleanlinesses. It
also required a few annotations
All things considered it was still worth it so lets try with
this enabled for now.
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Fix:
builtin-report.c: In function ‘hist_entry__add’:
builtin-report.c:1015: error: case label not within a switch statement
builtin-report.c:1017: error: break statement not within loop or switch
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Use the newly created callchains radix tree to gather the chains stats
from the recorded events and then print the callchains for all of them,
sorted by hits, using the "-c" parameter with perf report.
Example:
66.15% [k] atm_clip_exit
63.08%
0xffffffffffffff80
0xffffffff810196a8
0xffffffff810c14c8
0xffffffff8101a79c
0xffffffff810194f3
0xffffffff8106ab7f
0xffffffff8106abe5
0xffffffff8106acde
0xffffffff8100d94b
0xffffffff8153e7ea
[...]
1.54%
0xffffffffffffff80
0xffffffff810196a8
0xffffffff810c14c8
0xffffffff8101a79c
[...]
Symbols are not yet resolved.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1246026481-8314-3-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
We plan to display the callchains depending on some user-configurable
parameters.
To gather the callchains stats from the recorded stream in a fast way,
this patch introduces an ad hoc radix tree adapted for callchains and also
a rbtree to sort these callchains once we have gathered every events
from the stream.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1246026481-8314-2-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Provide the basic infrastructure to provide per task stats.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The PERF_EVENT_READ implementation made me realize we don't
actually need the sample_type int the output sample, since
we already have that in the perf_counter_attr information.
Therefore, remove the PERF_EVENT_MISC_OVERFLOW bit and the
event->type overloading, and imply put counter overflow
samples in a PERF_EVENT_SAMPLE type.
This also fixes the issue that event->type was only 32-bit
and sample_type had 64 usable bits.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Create a structured file format that includes the full
perf_counter_attr and all its relevant counter IDs so that
the reporting program has full information.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Martin Schwidefsky reported "perf report" symbol resolution
problems on S390.
Since we only report MMAP, not MUNMAP, we have to deal with
overlapping maps.
We used to simply throw out the old map on the assumption whole
maps got unmapped. This obviously doesn't deal with partial
unmaps. However it appears some dynamic linkers do fancy
partial unmaps (s390), so do something more elaborate and
truncate the old maps, only removing them when they've been
fully covered.
This resolves (part of) the S390 symbol resolution problems.
Reported-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Tested-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>