This patch adds the sorting and histogram support
functions to enable profiling of memory accesses.
The following sorting orders are added:
- symbol_daddr: data address symbol (or raw address)
- dso_daddr: data address shared object
- locked: access uses locked transaction
- tlb : TLB access
- mem : memory level of the access (L1, L2, L3, RAM, ...)
- snoop: access snoop mode
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1359040242-8269-12-git-send-email-eranian@google.com
[ committer note: changed to cope with fc5871ed, the move of methods to
machine.[ch], and the rename of dsrc to data_src, to match the change
made in the PERF_SAMPLE_DSRC in a previous patch. ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf record has a new option -W that enables weightened sampling.
Add sorting support in top/report for the average weight per sample and the
total weight sum. This allows to both compare relative cost per event
and the total cost over the measurement period.
Add the necessary glue to perf report, record and the library.
v2: Merge with new hist refactoring.
v3: Fix manpage. Remove value check.
Rename global_weight to weight and weight to local_weight.
v4: Readd sort keys to manpage
v5: Move weight to end
v6: Move weight to template
v7: Rename weight key.
Original patch from Andi modified by Stephane Eranian <eranian@google.com>
to include ONLY the weight supporting code and apply to pristine 3.8.0-rc4.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1359040242-8269-6-git-send-email-eranian@google.com
[ committer note: changed to cope with fc5871ed and the hists_link perf test entry ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Current perf report gets segmentation fault when a branch stack specific
sort key is provided by --sort option to a perf.data file which contains
no branch infomation. It's because those sort keys reference branch
info of a hist entry unconditionally. Maybe we can change it checks
whether such branch info is valid or not. But if the branch stacks are
not recorded, it'd be nop. Thus it'd be better to make those keys are
unselectable.
This patch separates those keys to a different dimension array, so that
if user passes such a key to a file which has no branch stack will get
following message rather than a segfault.
Error: Invalid --sort key: `symbol_from'
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Reported-by: Stefan Beller <stefanbeller@googlemail.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1356599507-14226-10-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
By using a mutex just for inserting and rotating two hist_entry rb
trees, so that when sorting we can get the last batch of entries created
from the ring buffer, merge it with whatever we have processed so far
and show the output while new entries are being added.
The 'report' tool continues, for now, to do it without threading, but
will use this in the future to allow visualization of results in long
perf.data sessions while the entries are being processed.
The new 'top' tool will be the first user.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-9b05atsn0q6m7fqgrug8fk2i@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In order to implement callchains collapsing, we need to keep
track of the maximum depth in a histogram tree of callchains.
This way we'll avoid allocating an arbitrary temporary buffer
size on callchain merge time.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Christoph Hellwig <hch@infradead.org>
The stock newt checkbox tree widget we were using was not really
suitable for hist entry + callchain browsing.
The problems with it were manifold:
- We needed to traverse the whole hist_entry rb_tree to add each entry +
callchains beforehand.
- No control over the colors used for each row
So a new tree widget, based mostly on slang, was written.
It extends the ui_browser class already used for annotate to allow the
user to fold/unfold branches in the callchains tree, using extra fields
in the symbol_map class that is embedded in hist_entry and
callchain_node instances to store the folding state and when changing
this state calculates the number of rows that are produced when showing
a particular hist_entry instance.
This greatly speeds up browsing as we don't have to upfront touch all
the entries and only calculate callchain related operations when some
callchain branch is actually unfolded.
The memory footprint is also reduced as the data structure is not
duplicated, just some extra fields for controling callchain state and to
simplify the process of seeking thru entries (nr_rows, row_offset) were
added.
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>