The 'annotate' tool does some filtering in the entries in a DSO but
forgot to reset the cache done in dso__find_symbol(), cauxing a SEGV:
[root@zoo ~]# perf annotate netlink_poll
perf: Segmentation fault
-------- backtrace --------
perf[0x526ceb]
/lib64/libc.so.6(+0x34960)[0x7faedfbe0960]
perf(rb_erase+0x223)[0x499d63]
perf[0x4213e9]
perf[0x4bc123]
perf[0x4bc621]
perf[0x4bf26b]
perf[0x4bc855]
perf(perf_session__process_events+0x340)[0x4bddc0]
perf(cmd_annotate+0x6bb)[0x421b5b]
perf[0x479063]
perf(main+0x60a)[0x42098a]
/lib64/libc.so.6(__libc_start_main+0xf0)[0x7faedfbcbfe0]
perf[0x420aa9]
[0x0]
[root@zoo ~]#
Fix it by reseting the find cache when removing symbols.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Fixes: b685ac22b4 ("perf symbols: Add front end cache for DSO symbol lookup")
Link: http://lkml.kernel.org/n/tip-b2y9x46y0t8yem1ive41zqyp@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This adds the basic infrastructure to keep track of cycle counts per
basic block for annotate. We allocate an array similar to the normal
accounting, and then account branch cycles there.
We handle two cases:
cycles per basic block with start and cycles per branch (these are later
used for either IPC or just cycles per BB)
In the start case we cannot handle overlaps, so always the longest basic
block wins.
For the cycles per branch case everything is accurately accounted.
v2: Remove unnecessary checks. Slight restructure. Move
symbol__get_annotation to another patch. Move histogram allocation.
v3: Merged with current tree
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1437233094-12844-4-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In addition to using refcounts for the struct thread lifetime
management, we need to protect access to machine->threads from
concurrent access.
That happens in 'perf top', where a thread processes events, inserting
and deleting entries from that rb_tree while another thread decays
hist_entries, that end up dropping references and ultimately deleting
threads from the rb_tree and releasing its resources when no further
hist_entry (or other data structures, like in 'perf sched') references
it.
So the rule is the same for refcounts + protected trees in the kernel,
get the tree lock, find object, bump the refcount, drop the tree lock,
return, use object, drop the refcount if no more use of it is needed,
keep it if storing it in some other data structure, drop when releasing
that data structure.
I.e. pair "t = machine__find(new)_thread()" with a "thread__put(t)", and
"perf_event__preprocess_sample(&al)" with "addr_location__put(&al)".
The addr_location__put() one is because as we return references to
several data structures, we may end up adding more reference counting
for the other data structures and then we'll drop it at
addr_location__put() time.
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-bs9rt4n0jw3hi9f3zxyy3xln@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently vmlinux_path__init() only tries to find vmlinux file from
current directory, /boot and some canonical directories with version
number of the running kernel. This can be a problem when reporting old
data recorded on a kernel version not running currently.
We can use --symfs option for this but it's annoying for user to do it
always. As we already have the info in the perf.data file, it can be
changed to use it for the search automatically.
Before:
$ perf report
...
# Samples: 4K of event 'cpu-clock'
# Event count (approx.): 1067250000
#
# Overhead Command Shared Object Symbol
# ........ .......... ................. ..............................
71.87% swapper [kernel.kallsyms] [k] recover_probed_instruction
After:
# Overhead Command Shared Object Symbol
# ........ .......... ................. ....................
71.87% swapper [kernel.kallsyms] [k] native_safe_halt
This requires to change signature of symbol__init() to receive struct
perf_session_env *.
Reported-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1407825645-24586-14-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently, accounting each sample is done in multiple places - once
when adding them to the input tree, other when adding them to the
output tree. It's not only confusing but also can cause a subtle
problem since concurrent processing like in perf top might see the
updated stats before adding entries into the output tree - like seeing
more (blank) lines at the end and/or slight inaccurate percentage.
To fix this, only account the entries when it's moved into the output
tree so that they cannot be seen prematurely. There're some
exceptional cases here and there - they should be addressed separately
with comments.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1398327843-31845-7-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>