Add initial support for s390 auxiliary traces using the CPU-Measurement
Sampling Facility.
Support and ignore PERF_REPORT_AUXTRACE_INFO records in the perf data
file. Later patches will show the contents of the auxiliary traces.
Setup the auxtrace queues and data structures for s390. A raw dump of
the perf.data file now does not show an error when an auxtrace event is
encountered.
Output before:
[root@s35lp76 perf]# ./perf report -D -i perf.data.auxtrace
0x128 [0x10]: failed to process type: 70
Error:
failed to process sample
0x128 [0x10]: event: 70
.
. ... raw event: size 16 bytes
. 0000: 00 00 00 46 00 00 00 10 00 00 00 00 00 00 00 00 ...F............
0x128 [0x10]: PERF_RECORD_AUXTRACE_INFO type: 0
[root@s35lp76 perf]#
Output after:
# ./perf report -D -i perf.data.auxtrace |fgrep PERF_RECORD_AUXTRACE
0 0 0x128 [0x10]: PERF_RECORD_AUXTRACE_INFO type: 5
0 0 0x25a66 [0x30]: PERF_RECORD_AUXTRACE size: 0x40000
offset: 0 ref: 0 idx: 4 tid: -1 cpu: 4
....
Additional notes about the underlying hardware and software
implementation, provided by Hendrik Brueckner (see Link: below).
=============================================================================
The CPU-Measurement Facility (CPU-MF) provides a set of functions to obtain
performance information on the mainframe. Basically, it was introduced
with System z10 years ago for the z/Architecture, that means, 64-bit.
For Linux, there are two facilities of interest, counter facility and sampling
facility. The counter facility provides hardware counters for instructions,
cycles, crypto-activities, and many more.
The sampling facility is a hardware sampler that when started will write
samples at a particular interval into a sampling buffer. At some point,
for example, if a sample block is full, it generates an interrupt to collect
samples (while the sampler continues to run).
Few years ago, I started to provide the a perf PMU to use the counter
and sampling facilities. Recently, the device driver was updated to also
"export" the sampling buffer into the AUX area. Thomas now completed the
related perf work to interpret and process these AUX data.
If people are more interested in the sampling facility, they can have a
look into:
- The Load-Program-Parameter and the CPU-Measurement Facilities, SA23-2260-05
http://www-01.ibm.com/support/docview.wss?uid=isg26fcd1cc32246f4c8852574ce0044734a
and to learn how-to use it for Linux on Z, have look at chapter 54,
"Using the CPU-measurement facilities" in the:
- Device Drivers, Features, and Commands, SC33-8411-34
http://public.dhe.ibm.com/software/dw/linux390/docu/l416dd34.pdf
=============================================================================
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Link: http://lkml.kernel.org/r/20180803100758.GA28475@linux.ibm.com
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180802074622.13641-2-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding mem2node object to allow the easy lookup of the node for the
physical address.
It has following interface:
int mem2node__init(struct mem2node *map, struct perf_env *env);
void mem2node__exit(struct mem2node *map);
int mem2node__node(struct mem2node *map, u64 addr);
The mem2node__toolsinit initialize object from the perf data file
MEM_TOPOLOGY feature data. Following calls to mem2node__node will return
node number for given physical address. The mem2node__exit function
frees the object.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180309101442.9224-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
'perf record' and 'perf report --dump-raw-trace' supported in this
release.
Example usage:
# perf record -e arm_spe/ts_enable=1,pa_enable=1/ dd if=/dev/zero of=/dev/null count=10000
# perf report --dump-raw-trace
Note that the perf.data file is portable, so the report can be run on
another architecture host if necessary.
Output will contain raw SPE data and its textual representation, such
as:
0x5c8 [0x30]: PERF_RECORD_AUXTRACE size: 0x200000 offset: 0 ref: 0x1891ad0e idx: 1 tid: 2227 cpu: 1
.
. ... ARM SPE data: size 2097152 bytes
. 00000000: 49 00 LD
. 00000002: b2 c0 3b 29 0f 00 00 ff ff VA 0xffff00000f293bc0
. 0000000b: b3 c0 eb 24 fb 00 00 00 80 PA 0xfb24ebc0 ns=1
. 00000014: 9a 00 00 LAT 0 XLAT
. 00000017: 42 16 EV RETIRED L1D-ACCESS TLB-ACCESS
. 00000019: b0 00 c4 15 08 00 00 ff ff PC 0xff00000815c400 el3 ns=1
. 00000022: 98 00 00 LAT 0 TOT
. 00000025: 71 36 6c 21 2c 09 00 00 00 TS 39395093558
. 0000002e: 49 00 LD
. 00000030: b2 80 3c 29 0f 00 00 ff ff VA 0xffff00000f293c80
. 00000039: b3 80 ec 24 fb 00 00 00 80 PA 0xfb24ec80 ns=1
. 00000042: 9a 00 00 LAT 0 XLAT
. 00000045: 42 16 EV RETIRED L1D-ACCESS TLB-ACCESS
. 00000047: b0 f4 11 16 08 00 00 ff ff PC 0xff0000081611f4 el3 ns=1
. 00000050: 98 00 00 LAT 0 TOT
. 00000053: 71 36 6c 21 2c 09 00 00 00 TS 39395093558
. 0000005c: 48 00 INSN-OTHER
. 0000005e: 42 02 EV RETIRED
. 00000060: b0 2c ef 7f 08 00 00 ff ff PC 0xff0000087fef2c el3 ns=1
. 00000069: 98 00 00 LAT 0 TOT
. 0000006c: 71 d1 6f 21 2c 09 00 00 00 TS 39395094481
...
Other release notes:
- applies to acme's perf/{core,urgent} branches, likely elsewhere
- Report is self-contained within the tool.
Record requires enabling the kernel SPE driver by
setting CONFIG_ARM_SPE_PMU.
- The intel-bts implementation was used as a starting point; its
min/default/max buffer sizes and power of 2 pages granularity need to be
revisited for ARM SPE
- Recording across multiple SPE clusters/domains not supported
- Snapshot support (record -S), and conversion to native perf events
(e.g., via 'perf inject --itrace'), are also not supported
- Technically both cs-etm and spe can be used simultaneously, however
disabled for simplicity in this release
Signed-off-by: Kim Phillips <kim.phillips@arm.com>
Reviewed-by: Dongjiu Geng <gengdongjiu@huawei.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/20180114132850.0b127434b704a26bad13268f@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Andi reported a performance drop in single threaded perf tools such as
'perf script' due to the growing number of locks being put in place to
allow for multithreaded tools, so wrap the POSIX threads rwlock routines
with the names used for such kinds of locks in the Linux kernel and then
allow for tools to ask for those locks to be used or not.
I.e. a tool may have a multithreaded phase and then switch to single
threaded, like the upcoming patches for the synthesizing of
PERF_RECORD_{FORK,MMAP,etc} for pre-existing processes to then switch to
single threaded mode in 'perf top'.
The init routines will not be conditional, this way starting as single
threaded to then move to multi threaded mode should be possible.
Reported-by: Andi Kleen <ak@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20170404161739.GH12903@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add generic support for standalone metrics specified in JSON files to
perf stat. A metric is a formula that uses multiple events to compute a
higher level result (e.g. IPC).
Previously metrics were always tied to an event and automatically
enabled with that event. But now change it that we can have standalone
metrics. They are in the same JSON data structure as events, but don't
have an event name.
We also allow to organize the metrics in metric groups, which allows a
short cut to select several related metrics at once.
Add a new -M / --metrics option to perf stat that adds the metrics or
metric groups specified.
Add the core code to manage and parse the metric groups. They are
collected from the JSON data structures into a separate rblist. When
computing shadow values look for metrics in that list. Then they are
computed using the existing saved values infrastructure in stat-shadow.c
The actual JSON metrics are in a separate pull request.
% perf stat -M Summary --metric-only -a sleep 1
Performance counter stats for 'system wide':
Instructions CLKS CPU_Utilization GFLOPs SMT_2T_Utilization Kernel_Utilization
317614222.0 1392930775.0 0.0 0.0 0.2 0.1
1.001497549 seconds time elapsed
% perf stat -M GFLOPs flops
Performance counter stats for 'flops':
3,999,541,471 fp_comp_ops_exe.sse_scalar_single # 1.2 GFLOPs (66.65%)
14 fp_comp_ops_exe.sse_scalar_double (66.65%)
0 fp_comp_ops_exe.sse_packed_double (66.67%)
0 fp_comp_ops_exe.sse_packed_single (66.70%)
0 simd_fp_256.packed_double (66.70%)
0 simd_fp_256.packed_single (66.67%)
0 duration_time
3.238372845 seconds time elapsed
v2: Add missing header file
v3: Move find_map to pmu.c
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170831194036.30146-7-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Create new util/branch.c and util/branch.h to contain the common branch
functions. Such as:
branch_type_count(): Count the numbers of branch types
branch_type_name() : Return the name of branch type
branch_type_stat_display(): Display branch type statistics info
branch_type_str(): Construct the branch type string.
The branch type is saved in branch_flags.
Change log:
v8: Change PERF_BR_NONE to PERF_BR_UNKNOWN.
v7: Since the common branch type name is changed (e.g. JCC->COND),
this patch is performed the modification accordingly.
v6: Move that multiline conditional code inside {} brackets.
Move branch_type_stat_display() from builtin-report.c to
branch.c.
Move branch_type_str() from callchain.c to branch.c.
v5: It's a new patch in v5 patch series.
Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1500379995-6449-6-git-send-email-yao.jin@linux.intel.com
[ Don't use 'index' and 'stat' as names for variables, it shadows global decls in older distros ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a simple expression parser good enough to parse JSON relation
expressions. The parser is implemented using bison.
This is just intended as an simple parser for internal usage in the
event lists, not the beginning of a "perf scripting language"
v2: Use expr__ prefix instead of expr_
Support multiple free variables for parser
Committer note:
The v2 patch had:
%define api.pure full
In expr.y, that is a feature introduced in bison 2.7, to have reentrant
parsers, not using global variables, which would make tools/perf stop
building with the bison version shipped in older distros, so Andi
realised that the other parsers (e.g. parse-events.y) were using:
%pure-parser
Which is present in older versions of bison and fits the bill.
I added:
CFLAGS_expr-bison.o += -DYYENABLE_NLS=0 -DYYLTYPE_IS_TRIVIAL=0 -w
To finally make it build, copying what was there for pmu-bison.o,
another parser.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170320201711.14142-8-andi@firstfloor.org
[ stdlib.h is needed in tests/expr.c for free() fixing build in systems such as ubuntu:16.04-x-s390 ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Implement printing instruction sequences as hex dump for branch stacks.
This relies on the x86 instruction decoder used by the PT decoder to
find the lengths of instructions to dump them individually.
This is good enough for pattern matching.
This allows to study hot paths for individual samples, together with
branch misprediction and cycle count / IPC information if available (on
Skylake systems).
% perf record -b ...
% perf script -F brstackinsn
...
read_hpet+67:
ffffffff9905b843 insn: 74 ea # PRED
ffffffff9905b82f insn: 85 c9
ffffffff9905b831 insn: 74 12
ffffffff9905b833 insn: f3 90
ffffffff9905b835 insn: 48 8b 0f
ffffffff9905b838 insn: 48 89 ca
ffffffff9905b83b insn: 48 c1 ea 20
ffffffff9905b83f insn: 39 f2
ffffffff9905b841 insn: 89 d0
ffffffff9905b843 insn: 74 ea # PRED
Only works when no special branch filters are specified.
Occasionally the path does not reach up to the sample IP, as the LBRs
may be frozen before executing a final jump. In this case we print a
special message.
The instruction dumper piggy backs on the existing infrastructure from
the IP PT decoder.
An earlier iteration of this patch relied on a disassembler, but this
version only uses the existing instruction decoder.
Committer note:
Added hint about how to get suitable perf.data files for use with
'-F brstackinsm':
$ perf record usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.018 MB perf.data (8 samples) ]
$
$ perf script -F brstackinsn
Display of branch stack assembler requested, but non all-branch filter set
Hint: run 'perf record -b ...'
$
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Link: http://lkml.kernel.org/r/20170223234634.583-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Introduce a new option to record PERF_RECORD_NAMESPACES events emitted
by the kernel when fork, clone, setns or unshare are invoked. And update
perf-record documentation with the new option to record namespace
events.
Committer notes:
Combined it with a later patch to allow printing it via 'perf report -D'
and be able to test the feature introduced in this patch. Had to move
here also perf_ns__name(), that was introduced in another later patch.
Also used PRIu64 and PRIx64 to fix the build in some enfironments wrt:
util/event.c:1129:39: error: format '%lx' expects argument of type 'long unsigned int', but argument 6 has type 'long long unsigned int' [-Werror=format=]
ret += fprintf(fp, "%u/%s: %lu/0x%lx%s", idx
^
Testing it:
# perf record --namespaces -a
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.083 MB perf.data (423 samples) ]
#
# perf report -D
<SNIP>
3 2028902078892 0x115140 [0xa0]: PERF_RECORD_NAMESPACES 14783/14783 - nr_namespaces: 7
[0/net: 3/0xf0000081, 1/uts: 3/0xeffffffe, 2/ipc: 3/0xefffffff, 3/pid: 3/0xeffffffc,
4/user: 3/0xeffffffd, 5/mnt: 3/0xf0000000, 6/cgroup: 3/0xeffffffb]
0x1151e0 [0x30]: event: 9
.
. ... raw event: size 48 bytes
. 0000: 09 00 00 00 02 00 30 00 c4 71 82 68 0c 7f 00 00 ......0..q.h....
. 0010: a9 39 00 00 a9 39 00 00 94 28 fe 63 d8 01 00 00 .9...9...(.c....
. 0020: 03 00 00 00 00 00 00 00 ce c4 02 00 00 00 00 00 ................
<SNIP>
NAMESPACES events: 1
<SNIP>
#
Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sargun Dhillon <sargun@sargun.me>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/148891930386.25309.18412039920746995488.stgit@hbathini.in.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The use_browser and perf_version_string variables are both declared in
perf.c but they are also referenced by other functions of libperf.a.
Therefore a user linking an own main() with libperf.a must declare those
two variables in their files even if the files never use the browser or
the version information.
This patch fixes this issue by moving use_browser and
perf_version_string out of perf.c to some other files.
Signed-off-by: Soramichi Akiyama <akiyama@m.soramichi.jp>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170117002237.c1aec0ce3b4d675dca018deb@m.soramichi.jp
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add basic clang support in clang.cpp and test__clang() testcase. The
first testcase checks if builtin clang is able to generate LLVM IR.
tests/clang.c is a proxy. Real testcase resides in
utils/c++/clang-test.cpp in c++ and exports C interface to perf test
subsystem.
Test result:
$ perf test -v clang
51: builtin clang support :
51.1: Test builtin clang compile C source to IR :
--- start ---
test child forked, pid 13215
test child finished with 0
---- end ----
Test builtin clang support subtest 0: Ok
Committer note:
Make sure you've enabled CLANG and LLVM builtin support by setting
the LIBCLANGLLVM variable on the make command line, e.g.:
make LIBCLANGLLVM=1 O=/tmp/build/perf -C tools/perf install-bin
Otherwise you'll get this when trying to do the 'perf test' call above:
# perf test clang
51: builtin clang support : Skip (not compiled in)
#
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Stringer <joe@ovn.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161126070354.141764-11-wangnan0@huawei.com
[ Removed "Test" from descriptions, redundant and already removed from all the other entries ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>