An overflow (OVF packet) is treated as an error because it represents a
loss of trace data, but there is no loss of synchronization, so the packet
state should be INTEL_PT_STATE_IN_SYNC not INTEL_PT_STATE_ERR_RESYNC.
To support that, some additional variables must be reset, and the FUP
packet that may follow OVF is treated as an FUP event.
Fixes: f4aa081949 ("perf tools: Add Intel PT decoder")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org # v5.15+
Link: https://lore.kernel.org/r/20211210162303.2288710-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo reported that building all his containers with BUILD_BPF_SKEL=1
to then make this the default he found problems in some distros where
the system linux/bpf.h file was being used and lacked this:
util/bpf_skel/bperf_leader.bpf.c:13:20: error: use of undeclared identifier 'BPF_F_PRESERVE_ELEMS'
__uint(map_flags, BPF_F_PRESERVE_ELEMS);
So use instead the vmlinux.h file generated by bpftool from BTF info.
This fixed these as well, getting the build back working on debian:11,
debian:experimental and ubuntu:21.10:
In file included from In file included from util/bpf_skel/bperf_leader.bpf.cutil/bpf_skel/bpf_prog_profiler.bpf.c::33:
:
In file included from In file included from /usr/include/linux/bpf.h/usr/include/linux/bpf.h::1111:
:
/usr/include/linux/types.h/usr/include/linux/types.h::55::1010:: In file included from util/bpf_skel/bperf_follower.bpf.c:3fatal errorfatal error:
: : In file included from /usr/include/linux/bpf.h:'asm/types.h' file not found11'asm/types.h' file not found:
/usr/include/linux/types.h:5:10: fatal error: 'asm/types.h' file not found
#include <asm/types.h>#include <asm/types.h>
^~~~~~~~~~~~~ ^~~~~~~~~~~~~
#include <asm/types.h>
^~~~~~~~~~~~~
1 error generated.
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Song Liu <song@kernel.org>
Tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/CF175681-8101-43D1-ABDB-449E644BE986@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently, the 'weight' field in the perf sample has latency information
for some instructions like in memory accesses. And perf tool has 'weight'
and 'local_weight' sort keys to display the info.
But it's somewhat confusing what it shows exactly. In my understanding,
'local_weight' shows a weight in a single sample, and (global) 'weight'
shows a sum of the weights in the hist_entry.
For example:
$ perf mem record -t load dd if=/dev/zero of=/dev/null bs=4k count=1M
$ perf report --stdio -n -s +local_weight
...
#
# Overhead Samples Command Shared Object Symbol Local Weight
# ........ ....... ....... ................ ......................... ............
#
21.23% 313 dd [kernel.vmlinux] [k] lockref_get_not_zero 32
12.43% 183 dd [kernel.vmlinux] [k] lockref_get_not_zero 35
11.97% 159 dd [kernel.vmlinux] [k] lockref_get_not_zero 36
10.40% 141 dd [kernel.vmlinux] [k] lockref_put_return 32
7.63% 113 dd [kernel.vmlinux] [k] lockref_get_not_zero 33
6.37% 92 dd [kernel.vmlinux] [k] lockref_get_not_zero 34
6.15% 90 dd [kernel.vmlinux] [k] lockref_put_return 33
...
So let's look at the 'lockref_get_not_zero' symbols. The top entry
shows that 313 samples were captured with 'local_weight' 32, so the
total weight should be 313 x 32 = 10016. But it's not the case:
$ perf report --stdio -n -s +local_weight,weight -S lockref_get_not_zero
...
#
# Overhead Samples Command Shared Object Local Weight Weight
# ........ ....... ....... ................ ............ ......
#
1.36% 4 dd [kernel.vmlinux] 36 144
0.47% 4 dd [kernel.vmlinux] 37 148
0.42% 4 dd [kernel.vmlinux] 32 128
0.40% 4 dd [kernel.vmlinux] 34 136
0.35% 4 dd [kernel.vmlinux] 36 144
0.34% 4 dd [kernel.vmlinux] 35 140
0.30% 4 dd [kernel.vmlinux] 36 144
0.30% 4 dd [kernel.vmlinux] 34 136
0.30% 4 dd [kernel.vmlinux] 32 128
0.30% 4 dd [kernel.vmlinux] 32 128
...
With the 'weight' sort key, it's divided to 4 samples even with the same
info ('comm', 'dso', 'sym' and 'local_weight'). I don't think this is
what we want.
I found this because of the way it aggregates the 'weight' value. Since
it's not a period, we should not add them in the he->stat. Otherwise,
two 32 'weight' entries will create a 64 'weight' entry.
After that, new 32 'weight' samples don't have a matching entry so it'd
create a new entry and make it a 64 'weight' entry again and again.
Later, they will be merged into 128 'weight' entries during the
hists__collapse_resort() with 4 samples, multiple times like above.
Let's keep the weight and display it differently. For 'local_weight',
it can show the weight as is, and for (global) 'weight' it can display
the number multiplied by the number of samples.
With this change, I can see the expected numbers.
$ perf report --stdio -n -s +local_weight,weight -S lockref_get_not_zero
...
#
# Overhead Samples Command Shared Object Local Weight Weight
# ........ ....... ....... ................ ............ .....
#
21.23% 313 dd [kernel.vmlinux] 32 10016
12.43% 183 dd [kernel.vmlinux] 35 6405
11.97% 159 dd [kernel.vmlinux] 36 5724
7.63% 113 dd [kernel.vmlinux] 33 3729
6.37% 92 dd [kernel.vmlinux] 34 3128
4.17% 59 dd [kernel.vmlinux] 37 2183
0.08% 1 dd [kernel.vmlinux] 269 269
0.08% 1 dd [kernel.vmlinux] 38 38
Reviewed-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20211105225617.151364-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The exit function fixes a memory leak with the src field as detected by
leak sanitizer. An example of which is:
Indirect leak of 25133184 byte(s) in 207 object(s) allocated from:
#0 0x7f199ecfe987 in __interceptor_calloc libsanitizer/asan/asan_malloc_linux.cpp:154
#1 0x55defe638224 in annotated_source__alloc_histograms util/annotate.c:803
#2 0x55defe6397e4 in symbol__hists util/annotate.c:952
#3 0x55defe639908 in symbol__inc_addr_samples util/annotate.c:968
#4 0x55defe63aa29 in hist_entry__inc_addr_samples util/annotate.c:1119
#5 0x55defe499a79 in hist_iter__report_callback tools/perf/builtin-report.c:182
#6 0x55defe7a859d in hist_entry_iter__add util/hist.c:1236
#7 0x55defe49aa63 in process_sample_event tools/perf/builtin-report.c:315
#8 0x55defe731bc8 in evlist__deliver_sample util/session.c:1473
#9 0x55defe731e38 in machines__deliver_event util/session.c:1510
#10 0x55defe732a23 in perf_session__deliver_event util/session.c:1590
#11 0x55defe72951e in ordered_events__deliver_event util/session.c:183
#12 0x55defe740082 in do_flush util/ordered-events.c:244
#13 0x55defe7407cb in __ordered_events__flush util/ordered-events.c:323
#14 0x55defe740a61 in ordered_events__flush util/ordered-events.c:341
#15 0x55defe73837f in __perf_session__process_events util/session.c:2390
#16 0x55defe7385ff in perf_session__process_events util/session.c:2420
...
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20211112035124.94327-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>