Make struct cpu_topo global and rename it to 'struct cpu_topology', so
that it can be used from the 'perf record' command in the following
patches.
Add the following interface functions to load/free cpu topology details:
struct cpu_topology *cpu_topology__new(void);
void cpu_topology__delete(struct cpu_topology *tp);
It will be moved to a separate source file, cputopo.c, together with the
NUMA related object in the following patches.
No functional change; the new interface will be used in upcoming changes.
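A minimal sketch of how such a new/delete pair is typically shaped and
used. The struct layout below is a hypothetical, simplified stand-in (it
is not the layout used in perf), and this sketch does not read the
topology from sysfs as a real loader would:

```c
#include <stdlib.h>

/* Hypothetical, simplified layout, for illustration only. */
struct cpu_topology {
	unsigned int core_sib;	/* number of entries in core_siblings */
	char **core_siblings;	/* sibling lists as strings, e.g. "0-3" */
};

/* Allocate an empty, zeroed topology; returns NULL on allocation failure. */
struct cpu_topology *cpu_topology__new(void)
{
	return calloc(1, sizeof(struct cpu_topology));
}

/* Free the topology and everything it owns; NULL is accepted. */
void cpu_topology__delete(struct cpu_topology *tp)
{
	unsigned int i;

	if (!tp)
		return;

	for (i = 0; i < tp->core_sib; i++)
		free(tp->core_siblings[i]);

	free(tp->core_siblings);
	free(tp);
}
```

A caller pairs the two: tp = cpu_topology__new(), use tp if it is not
NULL, then cpu_topology__delete(tp).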
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190219095815.15931-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch adds basic handling of PERF_RECORD_BPF_EVENT. Tracking of
PERF_RECORD_BPF_EVENT is OFF by default. Option --bpf-event is added to
turn it on.
Committer notes:
Add dummy machine__process_bpf_event() variant that returns zero for
systems without HAVE_LIBBPF_SUPPORT, such as Alpine Linux, unbreaking
the build in such systems.
Remove the needless include <machine.h> from bpf-event.h, provide just
forward declarations for the structs and unions in the parameters, to
reduce compilation time and needless rebuilds when machine.h gets
changed.
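The two committer notes above follow a common header pattern, sketched
here. The parameter list is an assumption made for illustration; only
the function name, the HAVE_LIBBPF_SUPPORT guard and the "returns zero"
behaviour come from the notes:

```c
/* Forward declarations are enough when parameters are only pointers. */
struct machine;
union perf_event;
struct perf_sample;

#ifdef HAVE_LIBBPF_SUPPORT
/* Real implementation lives in a .c file built only with libbpf. */
int machine__process_bpf_event(struct machine *machine,
			       union perf_event *event,
			       struct perf_sample *sample);
#else
/* Without libbpf there is nothing to process: report success. */
static inline int machine__process_bpf_event(struct machine *machine,
					     union perf_event *event,
					     struct perf_sample *sample)
{
	(void)machine;
	(void)event;
	(void)sample;
	return 0;
}
#endif
```

Callers stay free of #ifdefs, and a change to machine.h no longer
forces a rebuild of everything that includes this header.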
Committer testing:
When running with:
# perf record --bpf-event
On an older kernel where PERF_RECORD_BPF_EVENT and PERF_RECORD_KSYMBOL
are not present, we fall back to removing those two bits from
perf_event_attr, making the tool continue to work on older kernels:
perf_event_attr:
size 112
{ sample_period, sample_freq } 4000
sample_type IP|TID|TIME|PERIOD
read_format ID
disabled 1
inherit 1
mmap 1
comm 1
freq 1
enable_on_exec 1
task 1
precise_ip 3
sample_id_all 1
exclude_guest 1
mmap2 1
comm_exec 1
ksymbol 1
bpf_event 1
------------------------------------------------------------
sys_perf_event_open: pid 5779 cpu 0 group_fd -1 flags 0x8
sys_perf_event_open failed, error -22
switching off bpf_event
------------------------------------------------------------
perf_event_attr:
size 112
{ sample_period, sample_freq } 4000
sample_type IP|TID|TIME|PERIOD
read_format ID
disabled 1
inherit 1
mmap 1
comm 1
freq 1
enable_on_exec 1
task 1
precise_ip 3
sample_id_all 1
exclude_guest 1
mmap2 1
comm_exec 1
ksymbol 1
------------------------------------------------------------
sys_perf_event_open: pid 5779 cpu 0 group_fd -1 flags 0x8
sys_perf_event_open failed, error -22
switching off ksymbol
------------------------------------------------------------
perf_event_attr:
size 112
{ sample_period, sample_freq } 4000
sample_type IP|TID|TIME|PERIOD
read_format ID
disabled 1
inherit 1
mmap 1
comm 1
freq 1
enable_on_exec 1
task 1
precise_ip 3
sample_id_all 1
exclude_guest 1
mmap2 1
comm_exec 1
------------------------------------------------------------
And then proceeds to work without those two features.
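The retry sequence in the log above can be sketched as a loop that
drops one feature bit per -EINVAL (-22). The struct and the mock open
function are hypothetical stand-ins so that the sketch runs without a
kernel; they are not the perf code:

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two perf_event_attr bits involved. */
struct attr_bits {
	bool ksymbol;
	bool bpf_event;
};

/* Mock of an older kernel: rejects any request carrying an unknown bit. */
static int mock_open_old_kernel(const struct attr_bits *attr)
{
	return (attr->ksymbol || attr->bpf_event) ? -EINVAL : 3; /* 3: fake fd */
}

/*
 * Try to open the event, switching off one feature bit after each
 * -EINVAL, in the order shown in the log: bpf_event, then ksymbol.
 */
static int open_with_fallback(struct attr_bits *attr,
			      int (*open_fn)(const struct attr_bits *))
{
	for (;;) {
		int fd = open_fn(attr);

		if (fd != -EINVAL)
			return fd;	/* success, or an unrelated error */

		if (attr->bpf_event)
			attr->bpf_event = false;
		else if (attr->ksymbol)
			attr->ksymbol = false;
		else
			return fd;	/* nothing left to switch off */
	}
}
```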
As passing --bpf-event is an explicit action performed by the user, perhaps we
should emit a warning telling the user that the kernel has no such feature, but
this can be done on top of this patch.
Now with a kernel that supports these events, start 'perf record --bpf-event -a'
and then run 'perf trace sleep 10000', which will use the prebuilt BPF
augmented_raw_syscalls.o (built for another kernel version, even) and thus
should generate PERF_RECORD_BPF_EVENT events:
[root@quaco ~]# perf record -e dummy -a --bpf-event
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.713 MB perf.data ]
[root@quaco ~]# bpftool prog
13: cgroup_skb tag 7be49e3934a125ba gpl
loaded_at 2019-01-19T09:09:43-0300 uid 0
xlated 296B jited 229B memlock 4096B map_ids 13,14
14: cgroup_skb tag 2a142ef67aaad174 gpl
loaded_at 2019-01-19T09:09:43-0300 uid 0
xlated 296B jited 229B memlock 4096B map_ids 13,14
15: cgroup_skb tag 7be49e3934a125ba gpl
loaded_at 2019-01-19T09:09:43-0300 uid 0
xlated 296B jited 229B memlock 4096B map_ids 15,16
16: cgroup_skb tag 2a142ef67aaad174 gpl
loaded_at 2019-01-19T09:09:43-0300 uid 0
xlated 296B jited 229B memlock 4096B map_ids 15,16
17: cgroup_skb tag 7be49e3934a125ba gpl
loaded_at 2019-01-19T09:09:44-0300 uid 0
xlated 296B jited 229B memlock 4096B map_ids 17,18
18: cgroup_skb tag 2a142ef67aaad174 gpl
loaded_at 2019-01-19T09:09:44-0300 uid 0
xlated 296B jited 229B memlock 4096B map_ids 17,18
21: cgroup_skb tag 7be49e3934a125ba gpl
loaded_at 2019-01-19T09:09:45-0300 uid 0
xlated 296B jited 229B memlock 4096B map_ids 21,22
22: cgroup_skb tag 2a142ef67aaad174 gpl
loaded_at 2019-01-19T09:09:45-0300 uid 0
xlated 296B jited 229B memlock 4096B map_ids 21,22
31: tracepoint name sys_enter tag 12504ba9402f952f gpl
loaded_at 2019-01-19T09:19:56-0300 uid 0
xlated 512B jited 374B memlock 4096B map_ids 30,29,28
32: tracepoint name sys_exit tag c1bd85c092d6e4aa gpl
loaded_at 2019-01-19T09:19:56-0300 uid 0
xlated 256B jited 191B memlock 4096B map_ids 30,29
# perf report -D | grep PERF_RECORD_BPF_EVENT | nl
1 0 55834574849 0x4fc8 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 13
2 0 60129542145 0x5118 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 14
3 0 64424509441 0x5268 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 15
4 0 68719476737 0x53b8 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 16
5 0 73014444033 0x5508 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 17
6 0 77309411329 0x5658 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 18
7 0 90194313217 0x57a8 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 21
8 0 94489280513 0x58f8 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 22
9 7 620922484360 0xb6390 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 29
10 7 620922486018 0xb6410 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 2, flags 0, id 29
11 7 620922579199 0xb6490 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 30
12 7 620922580240 0xb6510 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 2, flags 0, id 30
13 7 620922765207 0xb6598 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 31
14 7 620922874543 0xb6620 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 32
#
There they are: ids 31 and 32 are the tracepoint BPF programs put in place by
'perf trace'.
Signed-off-by: Song Liu <songliubraving@fb.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-team@fb.com
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/20190117161521.1341602-7-songliubraving@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
On s390 the event bc000 (also named CF_DIAG) extracts the CPU
Measurement Facility diagnostic counter sets and displays them as
counter number and counter value pairs sorted by counter set number.
Output:
[root@s35lp76 perf]# ./perf report -D --stdio
[00000000] Counterset:0 Counters:6
Counter:000 Value:0x000000000085ec36 Counter:001 Value:0x0000000000796c94
Counter:002 Value:0x0000000000005ada Counter:003 Value:0x0000000000092460
Counter:004 Value:0x0000000000006073 Counter:005 Value:0x00000000001a9a73
[0x000038] Counterset:1 Counters:2
Counter:000 Value:0x000000000007c59f Counter:001 Value:0x000000000002fad6
[0x000050] Counterset:2 Counters:16
Counter:000 Value:000000000000000000 Counter:001 Value:000000000000000000
Counter:002 Value:000000000000000000 Counter:003 Value:000000000000000000
Counter:004 Value:000000000000000000 Counter:005 Value:000000000000000000
Counter:006 Value:000000000000000000 Counter:007 Value:000000000000000000
Counter:008 Value:000000000000000000 Counter:009 Value:000000000000000000
Counter:010 Value:000000000000000000 Counter:011 Value:000000000000000000
Counter:012 Value:000000000000000000 Counter:013 Value:000000000000000000
Counter:014 Value:000000000000000000 Counter:015 Value:000000000000000000
[0x0000d8] Counterset:3 Counters:128
Counter:000 Value:0x000000000000020f Counter:001 Value:0x00000000000001d8
Counter:002 Value:0x000000000000d7fa Counter:003 Value:0x000000000000008b
...
The number in brackets is the offset into the raw data field of the
sample.
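The offsets in the output above are consistent with each counter set
taking an 8 byte header followed by one 8 byte value per counter. That
layout is inferred from the numbers shown, not taken from this patch,
and the helper name is made up for illustration:

```c
#include <stddef.h>

/* Assumed sizes, inferred from the offsets in the dump above. */
#define CTRSET_HEADER_BYTES	8
#define COUNTER_BYTES		8

/* Offset of the counter set following the one that starts at 'offset'. */
static size_t next_ctrset_offset(size_t offset, unsigned int nr_counters)
{
	return offset + CTRSET_HEADER_BYTES +
	       (size_t)nr_counters * COUNTER_BYTES;
}
```

Counterset 0 has 6 counters, so the next set starts at 8 + 6 * 8 = 0x38;
2 more counters lead to 0x50 and 16 more to 0xd8, matching the
bracketed offsets.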
New functions trace_event_sample_raw__init() and s390_sample_raw() are
introduced in the code path so that the data can also be interpreted on
non s390 platforms. The raw data attached to event bc000 is generated
only on s390, so displaying it correctly on other platforms requires
correct endianness handling.
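s390 is big endian, so a little endian host cannot simply load the
64-bit counter values from the raw data. One portable way to read such
a value, shown only as a sketch (the helper name is made up and this is
not necessarily how the patch does it):

```c
#include <stdint.h>

/* Read a 64-bit big endian value, independent of host byte order. */
static uint64_t read_be64(const unsigned char *p)
{
	uint64_t v = 0;
	int i;

	for (i = 0; i < 8; i++)
		v = (v << 8) | p[i];	/* most significant byte first */

	return v;
}
```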
Committer notes:
Added an init function that sets up an evlist function pointer to avoid
repeated tests on evlist->env and calls to perf_env__name(), which
involves normalizing, etc., for each PERF_RECORD_SAMPLE.
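The idea of deciding once at init time, rather than per sample, can be
sketched as below. All type and function names are hypothetical, and
the call counter exists only to make the sketch observable:

```c
#include <stddef.h>
#include <string.h>

struct sample;	/* opaque in this sketch */

typedef void (*raw_dump_fn)(const struct sample *sample);

struct evlist_sketch {
	raw_dump_fn dump_raw;	/* NULL: nothing arch specific to dump */
};

static int s390_dumps;	/* number of calls, for demonstration only */

static void s390_dump_raw(const struct sample *sample)
{
	(void)sample;
	s390_dumps++;
}

/* Decide once, at setup time, which handler applies to this data file. */
static void evlist_sketch__init(struct evlist_sketch *evlist, const char *arch)
{
	evlist->dump_raw = (arch && !strcmp(arch, "s390")) ? s390_dump_raw : NULL;
}

/* Per sample path: a single pointer test, no string comparison. */
static void evlist_sketch__dump(struct evlist_sketch *evlist,
				const struct sample *sample)
{
	if (evlist->dump_raw)
		evlist->dump_raw(sample);
}
```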
Removed the needless __maybe_unused from the trace_event_raw()
prototype in session.h and made it a static function in evlist.
The 'offset' variable is a size_t, not a u64; fix it to avoid this on
some arches:
CC /tmp/build/perf/util/s390-sample-raw.o
util/s390-sample-raw.c: In function 's390_cpumcfdg_testctr':
util/s390-sample-raw.c:77:4: error: format '%llx' expects argument of type 'long long unsigned int', but argument 4 has type 'size_t' [-Werror=format=]
pr_err("Invalid counter set entry at %#" PRIx64 "\n",
^
cc1: all warnings being treated as errors
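For reference, a size_t is portably printed with the 'z' length
modifier, which avoids this class of -Werror=format failure on arches
where size_t is not 'unsigned long long'. The helper below is a
made-up illustration using snprintf() instead of pr_err(), and the
note above does not say which exact format the fix uses:

```c
#include <stddef.h>
#include <stdio.h>

/* Format a size_t offset in hex; %zx matches size_t on every arch. */
static int format_offset(char *buf, size_t len, size_t offset)
{
	return snprintf(buf, len, "Invalid counter set entry at %#zx\n", offset);
}
```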
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Link: https://lkml.kernel.org/r/9c856ac0-ef23-72b5-901d-a1f815508976@linux.ibm.com
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Link: https://lkml.kernel.org/n/tip-s3jhif06et9ug78qhclw41z1@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When looking at PT or brstackinsn traces with 'perf script' it can be
very useful to see the source code. This adds a simple facility to print
it with 'perf script', if the information is available through DWARF:
% perf record ...
% perf script -F insn,ip,sym,srccode
...
4004c6 main
5 for (i = 0; i < 10000000; i++)
4004cd main
5 for (i = 0; i < 10000000; i++)
4004c6 main
5 for (i = 0; i < 10000000; i++)
4004cd main
5 for (i = 0; i < 10000000; i++)
4004cd main
5 for (i = 0; i < 10000000; i++)
4004cd main
5 for (i = 0; i < 10000000; i++)
4004cd main
5 for (i = 0; i < 10000000; i++)
4004cd main
5 for (i = 0; i < 10000000; i++)
4004b3 main
6 v++;
% perf record -b ...
% perf script -F insn,ip,sym,srccode,brstackinsn
...
main+22:
0000000000400543 insn: e8 ca ff ff ff # PRED
|18 f1();
f1:
0000000000400512 insn: 55
|10 {
0000000000400513 insn: 48 89 e5
0000000000400516 insn: b8 00 00 00 00
|11 f2();
000000000040051b insn: e8 d6 ff ff ff # PRED
f2:
00000000004004f6 insn: 55
|5 {
00000000004004f7 insn: 48 89 e5
00000000004004fa insn: 8b 05 2c 0b 20 00
|6 c = a / b;
0000000000400500 insn: 8b 0d 2a 0b 20 00
0000000000400506 insn: 99
0000000000400507 insn: f7 f9
0000000000400509 insn: 89 05 29 0b 20 00
000000000040050f insn: 90
|7 }
0000000000400510 insn: 5d
0000000000400511 insn: c3 # PRED
f1+14:
0000000000400520 insn: b8 00 00 00 00
|12 f2();
0000000000400525 insn: e8 cc ff ff ff # PRED
f2:
00000000004004f6 insn: 55
|5 {
00000000004004f7 insn: 48 89 e5
00000000004004fa insn: 8b 05 2c 0b 20 00
|6 c = a / b;
Callchains are not supported currently; that would need some layout
changes there.
Committer notes:
Fixed the build on Alpine Linux (3.4 .. 3.8) by addressing this
warning:
In file included from util/srccode.c:19:0:
/usr/include/sys/fcntl.h:1:2: error: #warning redirecting incorrect #include <sys/fcntl.h> to <fcntl.h> [-Werror=cpp]
#warning redirecting incorrect #include <sys/fcntl.h> to <fcntl.h>
^~~~~~~
cc1: all warnings being treated as errors
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20181204001848.24769-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add initial support for s390 auxiliary traces using the CPU-Measurement
Sampling Facility.
Support and ignore PERF_RECORD_AUXTRACE_INFO records in the perf data
file. Later patches will show the contents of the auxiliary traces.
Set up the auxtrace queues and data structures for s390. A raw dump of
the perf.data file no longer shows an error when an auxtrace event is
encountered.
Output before:
[root@s35lp76 perf]# ./perf report -D -i perf.data.auxtrace
0x128 [0x10]: failed to process type: 70
Error:
failed to process sample
0x128 [0x10]: event: 70
.
. ... raw event: size 16 bytes
. 0000: 00 00 00 46 00 00 00 10 00 00 00 00 00 00 00 00 ...F............
0x128 [0x10]: PERF_RECORD_AUXTRACE_INFO type: 0
[root@s35lp76 perf]#
Output after:
# ./perf report -D -i perf.data.auxtrace |fgrep PERF_RECORD_AUXTRACE
0 0 0x128 [0x10]: PERF_RECORD_AUXTRACE_INFO type: 5
0 0 0x25a66 [0x30]: PERF_RECORD_AUXTRACE size: 0x40000
offset: 0 ref: 0 idx: 4 tid: -1 cpu: 4
....
Additional notes about the underlying hardware and software
implementation, provided by Hendrik Brueckner (see Link: below).
=============================================================================
The CPU-Measurement Facility (CPU-MF) provides a set of functions to obtain
performance information on the mainframe. Basically, it was introduced
with System z10 years ago for the z/Architecture, that means, 64-bit.
For Linux, there are two facilities of interest, counter facility and sampling
facility. The counter facility provides hardware counters for instructions,
cycles, crypto-activities, and many more.
The sampling facility is a hardware sampler that when started will write
samples at a particular interval into a sampling buffer. At some point,
for example, if a sample block is full, it generates an interrupt to collect
samples (while the sampler continues to run).
A few years ago, I started to provide a perf PMU to use the counter
and sampling facilities. Recently, the device driver was updated to also
"export" the sampling buffer into the AUX area. Thomas has now completed
the related perf work to interpret and process this AUX data.
If people are more interested in the sampling facility, they can have a
look into:
- The Load-Program-Parameter and the CPU-Measurement Facilities, SA23-2260-05
http://www-01.ibm.com/support/docview.wss?uid=isg26fcd1cc32246f4c8852574ce0044734a
and to learn how to use it for Linux on Z, have a look at chapter 54,
"Using the CPU-measurement facilities" in the:
- Device Drivers, Features, and Commands, SC33-8411-34
http://public.dhe.ibm.com/software/dw/linux390/docu/l416dd34.pdf
=============================================================================
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Link: http://lkml.kernel.org/r/20180803100758.GA28475@linux.ibm.com
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180802074622.13641-2-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add the mem2node object to allow easy lookup of the node for a
physical address.
It has the following interface:
int mem2node__init(struct mem2node *map, struct perf_env *env);
void mem2node__exit(struct mem2node *map);
int mem2node__node(struct mem2node *map, u64 addr);
The mem2node__init function initializes the object from the perf data
file MEM_TOPOLOGY feature data. Subsequent calls to mem2node__node will
return the node number for a given physical address. The mem2node__exit
function frees the object.
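A rough sketch of the lookup such an object provides. To stay self
contained it uses a sorted array with binary search and made-up names;
the data structure actually used by mem2node is not described in this
message, so treat everything below as an assumption:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical physical address range belonging to one node. */
struct phys_range {
	uint64_t start;	/* inclusive */
	uint64_t end;	/* exclusive */
	int node;
};

/*
 * Return the node owning 'addr', or -1 if no range covers it.
 * 'ranges' must be sorted by start address and must not overlap.
 */
static int ranges__node(const struct phys_range *ranges, size_t nr,
			uint64_t addr)
{
	size_t lo = 0, hi = nr;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (addr < ranges[mid].start)
			hi = mid;
		else if (addr >= ranges[mid].end)
			lo = mid + 1;
		else
			return ranges[mid].node;
	}

	return -1;
}
```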
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180309101442.9224-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
'perf record' and 'perf report --dump-raw-trace' are supported in this
release.
Example usage:
# perf record -e arm_spe/ts_enable=1,pa_enable=1/ dd if=/dev/zero of=/dev/null count=10000
# perf report --dump-raw-trace
Note that the perf.data file is portable, so the report can be run on
another architecture host if necessary.
Output will contain raw SPE data and its textual representation, such
as:
0x5c8 [0x30]: PERF_RECORD_AUXTRACE size: 0x200000 offset: 0 ref: 0x1891ad0e idx: 1 tid: 2227 cpu: 1
.
. ... ARM SPE data: size 2097152 bytes
. 00000000: 49 00 LD
. 00000002: b2 c0 3b 29 0f 00 00 ff ff VA 0xffff00000f293bc0
. 0000000b: b3 c0 eb 24 fb 00 00 00 80 PA 0xfb24ebc0 ns=1
. 00000014: 9a 00 00 LAT 0 XLAT
. 00000017: 42 16 EV RETIRED L1D-ACCESS TLB-ACCESS
. 00000019: b0 00 c4 15 08 00 00 ff ff PC 0xff00000815c400 el3 ns=1
. 00000022: 98 00 00 LAT 0 TOT
. 00000025: 71 36 6c 21 2c 09 00 00 00 TS 39395093558
. 0000002e: 49 00 LD
. 00000030: b2 80 3c 29 0f 00 00 ff ff VA 0xffff00000f293c80
. 00000039: b3 80 ec 24 fb 00 00 00 80 PA 0xfb24ec80 ns=1
. 00000042: 9a 00 00 LAT 0 XLAT
. 00000045: 42 16 EV RETIRED L1D-ACCESS TLB-ACCESS
. 00000047: b0 f4 11 16 08 00 00 ff ff PC 0xff0000081611f4 el3 ns=1
. 00000050: 98 00 00 LAT 0 TOT
. 00000053: 71 36 6c 21 2c 09 00 00 00 TS 39395093558
. 0000005c: 48 00 INSN-OTHER
. 0000005e: 42 02 EV RETIRED
. 00000060: b0 2c ef 7f 08 00 00 ff ff PC 0xff0000087fef2c el3 ns=1
. 00000069: 98 00 00 LAT 0 TOT
. 0000006c: 71 d1 6f 21 2c 09 00 00 00 TS 39395094481
...
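In the dump above, the 8 bytes following the VA and TS header bytes
read as a 64-bit value stored least significant byte first. A small
sketch that reproduces two of the printed values; the helper name is
made up, and the packet layout is inferred from the dump rather than
from the SPE specification:

```c
#include <stdint.h>

/* Read a 64-bit little endian payload, independent of host byte order. */
static uint64_t read_le64(const unsigned char *p)
{
	uint64_t v = 0;
	int i;

	for (i = 7; i >= 0; i--)
		v = (v << 8) | p[i];	/* last byte is the most significant */

	return v;
}
```

The PC lines differ in that their top byte is shown separately as
"el3 ns=1" instead of being part of the printed address.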
Other release notes:
- applies to acme's perf/{core,urgent} branches, likely elsewhere
- Report is self-contained within the tool.
Record requires enabling the kernel SPE driver by
setting CONFIG_ARM_SPE_PMU.
- The intel-bts implementation was used as a starting point; its
min/default/max buffer sizes and power of 2 pages granularity need to be
revisited for ARM SPE
- Recording across multiple SPE clusters/domains not supported
- Snapshot support (record -S), and conversion to native perf events
(e.g., via 'perf inject --itrace'), are also not supported
- Technically both cs-etm and spe can be used simultaneously, however
this is disabled for simplicity in this release
Signed-off-by: Kim Phillips <kim.phillips@arm.com>
Reviewed-by: Dongjiu Geng <gengdongjiu@huawei.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/20180114132850.0b127434b704a26bad13268f@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Andi reported a performance drop in single threaded perf tools such as
'perf script' due to the growing number of locks being put in place to
allow for multithreaded tools, so wrap the POSIX threads rwlock routines
with the names used for such kinds of locks in the Linux kernel and then
allow for tools to ask for those locks to be used or not.
I.e. a tool may have a multithreaded phase and then switch to single
threaded, as the upcoming patches do in 'perf top': the synthesizing of
PERF_RECORD_{FORK,MMAP,etc} for pre-existing processes is the
multithreaded phase, after which the tool switches to single threaded
mode.
The init routines will not be conditional, this way starting as single
threaded to then move to multi threaded mode should be possible.
Reported-by: Andi Kleen <ak@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20170404161739.GH12903@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add generic support for standalone metrics specified in JSON files to
perf stat. A metric is a formula that uses multiple events to compute a
higher level result (e.g. IPC).
Previously metrics were always tied to an event and automatically
enabled with that event. Now change that so we can have standalone
metrics. They are in the same JSON data structure as events, but don't
have an event name.
We also allow organizing the metrics into metric groups, which provide
a shortcut for selecting several related metrics at once.
Add a new -M / --metrics option to perf stat that adds the metrics or
metric groups specified.
Add the core code to manage and parse the metric groups. They are
collected from the JSON data structures into a separate rblist. When
computing shadow values, look for metrics in that list. They are then
computed using the existing saved values infrastructure in stat-shadow.c.
The actual JSON metrics are in a separate pull request.
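To illustrate the notion of a standalone metric that can be selected by
its own name or by a group name, here is a small sketch. It hard codes
the formula as a C function and uses a plain array, whereas the patch
takes metrics from JSON and keeps them in an rblist; all names below,
including the group name, are hypothetical:

```c
#include <stddef.h>
#include <string.h>

enum { EV_INSTRUCTIONS, EV_CYCLES, EV_MAX };

/* Hypothetical metric: a name, a group and a formula over event counts. */
struct metric_sketch {
	const char *name;
	const char *group;
	double (*compute)(const double *counts);
};

/* IPC: instructions per cycle, 0 when no cycles were counted. */
static double ipc(const double *counts)
{
	return counts[EV_CYCLES] ?
	       counts[EV_INSTRUCTIONS] / counts[EV_CYCLES] : 0.0;
}

static const struct metric_sketch metrics[] = {
	{ "IPC", "ExampleGroup", ipc },
};

/* Find the first metric matching either a metric name or a group name. */
static const struct metric_sketch *metric__find(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(metrics) / sizeof(metrics[0]); i++) {
		if (!strcmp(metrics[i].name, name) ||
		    !strcmp(metrics[i].group, name))
			return &metrics[i];
	}

	return NULL;
}
```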
% perf stat -M Summary --metric-only -a sleep 1
Performance counter stats for 'system wide':
Instructions CLKS CPU_Utilization GFLOPs SMT_2T_Utilization Kernel_Utilization
317614222.0 1392930775.0 0.0 0.0 0.2 0.1
1.001497549 seconds time elapsed
% perf stat -M GFLOPs flops
Performance counter stats for 'flops':
3,999,541,471 fp_comp_ops_exe.sse_scalar_single # 1.2 GFLOPs (66.65%)
14 fp_comp_ops_exe.sse_scalar_double (66.65%)
0 fp_comp_ops_exe.sse_packed_double (66.67%)
0 fp_comp_ops_exe.sse_packed_single (66.70%)
0 simd_fp_256.packed_double (66.70%)
0 simd_fp_256.packed_single (66.67%)
0 duration_time
3.238372845 seconds time elapsed
v2: Add missing header file
v3: Move find_map to pmu.c
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170831194036.30146-7-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Create new util/branch.c and util/branch.h to contain the common branch
functions, such as:
branch_type_count(): Count the numbers of branch types
branch_type_name() : Return the name of branch type
branch_type_stat_display(): Display branch type statistics info
branch_type_str(): Construct the branch type string.
The branch type is saved in branch_flags.
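The counting and naming helpers listed above can be pictured with the
following sketch. The enum, struct and function names are simplified,
hypothetical stand-ins and do not reproduce the signatures or the full
set of branch types in util/branch.h:

```c
#include <stddef.h>

/* Hypothetical, reduced set of branch types. */
enum br_type { BR_UNKNOWN, BR_COND, BR_UNCOND, BR_CALL, BR_RET, BR_MAX };

struct br_type_stat {
	unsigned long counts[BR_MAX];
};

/* Count one branch; out of range types are accounted as unknown. */
static void br_type_count(struct br_type_stat *st, enum br_type type)
{
	if ((unsigned int)type >= BR_MAX)
		type = BR_UNKNOWN;

	st->counts[type]++;
}

/* Return a printable name for a branch type. */
static const char *br_type_name(enum br_type type)
{
	static const char * const names[BR_MAX] = {
		"UNKNOWN", "COND", "UNCOND", "CALL", "RET",
	};

	if ((unsigned int)type >= BR_MAX)
		type = BR_UNKNOWN;

	return names[type];
}
```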
Change log:
v8: Change PERF_BR_NONE to PERF_BR_UNKNOWN.
v7: Since the common branch type names have changed (e.g. JCC->COND),
this patch is modified accordingly.
v6: Move that multiline conditional code inside {} brackets.
Move branch_type_stat_display() from builtin-report.c to
branch.c.
Move branch_type_str() from callchain.c to branch.c.
v5: It's a new patch in v5 patch series.
Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1500379995-6449-6-git-send-email-yao.jin@linux.intel.com
[ Don't use 'index' and 'stat' as names for variables, it shadows global decls in older distros ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>