Commit Graph

8966 Commits

Author SHA1 Message Date
Wang Nan 09fa4f4012 perf record: Rename variable to make code clear
record__mmap_read() writes data from ring buffer into perf.data.  'head'
is maintained by the kernel, points to the last written record.
'old' is maintained by perf, points to the record read in previous
round. record__mmap_read() saves data from 'old' to 'head' to
perf.data.

The names of these variables are not very intutive. In addition,
when dealing with backward writing ring buffer, the md->prev pointer
should point to 'head' instead of the last byte it got.

Add 'start' and 'end' pointer to make code clear and set md->prev to
'head' instead of the moved 'old' pointer. This patch doesn't change
behavior since:

    buf = &data[old & md->mask];
    size = head - old;
    old += size;     <--- Here, old == head

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1463987628-163563-4-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-23 18:22:47 -03:00
Wang Nan 2d11c65071 perf record: Prevent reading invalid data in record__mmap_read
When record__mmap_read() requires data more than the size of ring
buffer, drop those data to avoid accessing invalid memory.

This can happen when reading from overwritable ring buffer, which
should be avoided. However, check this for robustness.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1463987628-163563-3-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-23 18:22:46 -03:00
Wang Nan 65aea23387 perf evlist: Add API to pause/resume
perf_evlist__toggle_{pause,resume}() are introduced to pause/resume
events in an evlist. Utilize PERF_EVENT_IOC_PAUSE_OUTPUT ioctl.

Following commits use them to ensure overwrite ring buffer is paused
before reading.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1463987628-163563-2-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
[ Return -1, like all other ioctl() usage in evlist.c, rename 'pause'
  arg to avoid breaking the build on ubuntu 12.04 and other old systems ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-23 18:22:00 -03:00
Arnaldo Carvalho de Melo 12f3ca4fc8 perf trace: Use the ptr->name beautifier as default for "filename" args
Auto-attach the ptr->name beautifier to syscall args "filename", "path"
and "pathname" if they are of type "const char *".

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-jxii4qmcgoppftv0zdvml9d7@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-23 16:41:00 -03:00
Arnaldo Carvalho de Melo b6565c908a perf trace: Use the fd->name beautifier as default for "fd" args
Noticed when the 'setsockopt' 'fd' arg wasn't being formatted via
the SCA_FD beautifier, so just remove the setting of "fd" args to
SCA_FD and do it when reading the syscall info, like we do for
args of type "pid_t", i.e. "fd" as the name should be enough as
the decision to use the SFA_FD beautifier. For odd cases we can
just do it explicitely.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-0qissgetiuqmqyj4b6ancmpn@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-23 16:41:00 -03:00
Andi Kleen 508be0dfe6 perf report: Add srcline_from/to branch sort keys
Add "srcline_from" and "srcline_to" branch sort keys that allow to show
the source lines of a branch.

That makes it much easier to track down where particular branches happen
in the program, for example to examine branch mispredictions, or to
associate it with cycle counts:

  % perf record -b -e cycles:p ./tcall
  % perf report --sort srcline_from,srcline_to,mispredict
  ...
    15.10%  tcall.c:18       tcall.c:10       N
    14.83%  tcall.c:11       tcall.c:5        N
    14.12%  tcall.c:7        tcall.c:12       N
    14.04%  tcall.c:12       tcall.c:5        N
    12.42%  tcall.c:17       tcall.c:18       N
    12.39%  tcall.c:7        tcall.c:13       N
    12.27%  tcall.c:13       tcall.c:17       N
  ...

  % perf report --sort srcline_from,srcline_to,cycles
  ...
    17.12%  tcall.c:18       tcall.c:11       1
    17.01%  tcall.c:12       tcall.c:6        1
    16.98%  tcall.c:11       tcall.c:6        1
    15.91%  tcall.c:17       tcall.c:18       1
     6.38%  tcall.c:7        tcall.c:17       7
     4.80%  tcall.c:7        tcall.c:12       8
     4.21%  tcall.c:7        tcall.c:17       8
     2.67%  tcall.c:7        tcall.c:12       7
     2.62%  tcall.c:7        tcall.c:12       10
     2.10%  tcall.c:7        tcall.c:17       9
     1.58%  tcall.c:7        tcall.c:12       6
     1.44%  tcall.c:7        tcall.c:12       5
     1.38%  tcall.c:7        tcall.c:12       9
     1.06%  tcall.c:7        tcall.c:17       13
     1.05%  tcall.c:7        tcall.c:12       4
     1.01%  tcall.c:7        tcall.c:17       6

Open issues:

- Some kernel symbols get misresolved.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/r/1463775308-32748-1-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-23 11:25:16 -03:00
Wang Nan d4c6fb36ac perf evsel: Record fd into perf_mmap
Add a fd field into struct perf_mmap so that perf can track the mmap fd.

This feature will be used for toggling overwrite ring buffers.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1463762315-155689-3-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-20 14:56:58 -03:00
Wang Nan b90dc17a5d perf evsel: Add overwrite attribute and check write_backward
Add 'overwrite' attribute to evsel to mark whether this event is
overwritable. The following commits will support syntax like:

  # perf record -e cycles/overwrite/ ...

An overwritable evsel requires kernel support for the
perf_event_attr.write_backward ring buffer feature.

Add it to perf_missing_feature.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1463762315-155689-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-20 14:54:23 -03:00
Ingo Molnar 408cf67707 Merge tag 'perf-core-for-mingo-20160520' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

User visible changes:

- We should not use the current value of the kernel.perf_event_max_stack as the
  default value for --max-stack in tools that can process perf.data files, they
  will only match if that sysctl wasn't changed from its default value at the
  time the perf.data file was recorded, fix it.

  This fixes a bug where a 'perf record -a --call-graph dwarf ; perf report'
  produces a glibc invalid free backtrace (Arnaldo Carvalho de Melo)

- Provide a better warning when running 'perf trace' on a system where the
  kernel.kptr_restrict is set to 1, similar to the one produced by 'perf record',
  noticed on ubuntu 16.04 where this is the default kptr_restrict setting.
  (Arnaldo Carvalho de Melo)

- Fix ordering of instructions in the annotation code, noticed when annotating
  ARM binaries, now that table is auto-ordered at first use, to avoid more such
  problems (Chris Ryder)

- Set buildid dir under symfs when --symfs is provided (He Kuang)

- Fix the 'exit_group()' syscall output in 'perf trace' (Arnaldo Carvalho de Melo)

- Only auto set call-graph to "dwarf" in 'perf trace' when syscalls are being
  traced (Arnaldo Carvalho de Melo)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-20 19:37:43 +02:00
He Kuang a706670900 perf tools: Set buildid dir under symfs when --symfs is provided
This patch moves the reference of buildid dir to 'symfs/.debug' and
skips the local buildid dir when '--symfs' is given, so that every
single file opened by perf is relative to symfs directory now.

Signed-off-by: He Kuang <hekuang@huawei.com>
Acked-by: David Ahern <dsahern@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1463658462-85131-2-git-send-email-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-20 11:43:58 -03:00
Arnaldo Carvalho de Melo caa36ed7ba perf trace: Only auto set call-graph to "dwarf" when syscalls are being traced
When --min-stack or --max-stack is passwd but --no-syscalls is also in
effect, there is no point in automatically setting '--call-graph dwarf'.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-pq922i7h9wef0pho1dqpttvn@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-20 11:43:57 -03:00
Chris Ryder 7e4c149813 perf annotate: Sort list of recognised instructions
Currently the list of instructions recognised by perf annotate has to be
explicitly written in sorted order. This makes it easy to make mistakes
when adding new instructions. Sort the list of instructions on first
access.

Signed-off-by: Chris Ryder <chris.ryder@arm.com>
Acked-by: Pawel Moll <pawel.moll@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/4268febaf32f47f322c166fb2fe98cfec7041e11.1463676839.git.chris.ryder@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-20 11:43:57 -03:00
Chris Ryder 58c0400176 perf annotate: Fix identification of ARM blt and bls instructions
The ARM blt and bls instructions are not correctly identified when
parsing assembly because the list of recognised instructions must be
sorted by name. Swap the ordering of blt and bls.

Signed-off-by: Chris Ryder <chris.ryder@arm.com>
Acked-by: Pawel Moll <pawel.moll@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/560e196b7c79b7ff853caae13d8719a31479cb1a.1463676839.git.chris.ryder@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-20 11:43:57 -03:00
Arnaldo Carvalho de Melo fe176085a4 perf tools: Fix usage of max_stack sysctl
We cannot limit processing stacks from the current value of the sysctl,
as we may be processing perf.data files, possibly from other machines.

Instead use the old PERF_MAX_STACK_DEPTH, the sysctl default, that can
be overriden using --max-stack or equivalent.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Fixes: 4cb93446c5 ("perf tools: Set the maximum allowed stack from /proc/sys/kernel/perf_event_max_stack")
Link: http://lkml.kernel.org/n/tip-eqeutsr7n7wy0c36z24ytvii@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-20 11:43:56 -03:00
Arnaldo Carvalho de Melo bf8bddbf19 perf callchain: Stop validating callchains by the max_stack sysctl
As thread__resolve_callchain_sample can be used for handling perf.data
files, that could've been recorded with a large max_stack sysctl setting
than what the system used for analysis has set.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Link: http://lkml.kernel.org/n/tip-2995bt2g5yq2m05vga4kip6m@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-20 11:43:56 -03:00
Arnaldo Carvalho de Melo c008f78f93 perf trace: Fix exit_group() formatting
This doesn't return, so there is no raw_syscalls:sys_exit for it, add
the ending ')', without any return value, since it is void.

Reported-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-vh2mii0g4qlveuc4joufbipu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-20 11:43:55 -03:00
Arnaldo Carvalho de Melo e77a07425f perf top: Use machine->kptr_restrict_warned
Its now there, no need to have it too.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-y18oeou494uy11im7u9to0dx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-20 11:43:55 -03:00
Arnaldo Carvalho de Melo caf8a0d049 perf trace: Warn when trying to resolve kernel addresses with kptr_restrict=1
Hook into the libtraceevent plugin kernel symbol resolver to warn the
user that that can't happen with kptr_restrict=1.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-9gc412xx1gl0lvqj1d1xwlyb@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-20 11:43:54 -03:00
Arnaldo Carvalho de Melo 45e9005690 perf machine: Do not bail out if not managing to read ref reloc symbol
This means the user can't access /proc/kallsyms, for instance, because
/proc/sys/kernel/kptr_restrict is set to 1.

Instead leave the ref_reloc_sym as NULL and code using it will cope.

This allows 'perf trace' to work on such systems for !root, the only
issue would be when trying to resolve kernel symbols, which happens,
for instance, in some libtracevent plugins.  A warning for that case
will be provided in the next patch in this series.

Noticed in Ubuntu 16.04, that comes with kptr_restrict=1.

Reported-by: Milian Wolff <milian.wolff@kdab.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-knpu3z4iyp2dxpdfm798fac4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-20 11:43:54 -03:00
Ingo Molnar 21f77d231f Merge tag 'perf-core-for-mingo-20160516' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

User visible changes:

- Honour the kernel.perf_event_max_stack knob more precisely by not counting
  PERF_CONTEXT_{KERNEL,USER} when deciding when to stop adding entries to
  the perf_sample->ip_callchain[] array (Arnaldo Carvalho de Melo)

- Fix identation of 'stalled-backend-cycles' in 'perf stat' (Namhyung Kim)

- Update runtime using 'cpu-clock' event in 'perf stat' (Namhyung Kim)

- Use 'cpu-clock' for cpu targets in 'perf stat' (Namhyung Kim)

- Avoid fractional digits for integer scales in 'perf stat' (Andi Kleen)

- Store vdso buildid unconditionally, as it appears in callchains and
  we're not checking those when creating the build-id table, so we
  end up not being able to resolve VDSO symbols when doing analysis
  on a different machine than the one where recording was done, possibly
  of a different arch even (arm -> x86_64) (He Kuang)

Infrastructure changes:

- Generalize max_stack sysctl handler, will be used for configuring
  multiple kernel knobs related to callchains (Arnaldo Carvalho de Melo)

Cleanups:

- Introduce DSO__NAME_KALLSYMS and DSO__NAME_KCORE, to stop using
  open coded strings (Masami Hiramatsu)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-20 08:20:14 +02:00
Arnaldo Carvalho de Melo a29d5c9b81 perf tools: Separate accounting of contexts and real addresses in a stack trace
The perf_sample->ip_callchain->nr value includes all the entries in the
ip_callchain->ip[] array, real addresses and PERF_CONTEXT_{KERNEL,USER,etc},
while what the user expects is that what is in the kernel.perf_event_max_stack
sysctl or in the upcoming per event perf_event_attr.sample_max_stack knob be
honoured in terms of IP addresses in the stack trace.

So match the kernel support and validate chain->nr taking into account
both kernel.perf_event_max_stack and kernel.perf_event_max_contexts_per_stack.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Link: http://lkml.kernel.org/n/tip-mgx0jpzfdq4uq4abfa40byu0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-16 23:11:54 -03:00
Masami Hiramatsu 0a77582f04 perf symbols: Introduce DSO__NAME_KALLSYMS and DSO__NAME_KCORE
Instead of using a raw string, use DSO__NAME_KALLSYMS and
DSO__NAME_KCORE macros for kallsyms and kcore.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160515031935.4017.50971.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-16 23:11:48 -03:00
Namhyung Kim a1f3d56761 perf stat: Use cpu-clock event for cpu targets
Currently 'perf stat' always counts task-clock event by default.  But
it's somewhat confusing for system-wide targets (especially with 'sleep
N' as the 'sleep' task just sleeps and doesn't use cputime).  Changing
to cpu-clock event instead for that case makes more sense IMHO.

Before:
  # perf stat -a sleep 0.1

   Performance counter stats for 'system wide':

        403.038603      task-clock (msec)     #    4.001 CPUs utilized
               150      context-switches      #    0.372 K/sec
                 7      cpu-migrations        #    0.017 K/sec
                71      page-faults           #    0.176 K/sec
        23,705,169      cycles                #    0.059 GHz
        15,888,166      instructions          #    0.67  insn per cycle
         3,326,078      branches              #    8.253 M/sec
            87,643      branch-misses         #    2.64% of all branches

       0.100737009 seconds time elapsed

  #

After:

  # perf stat -a sleep 0.1

   Performance counter stats for 'system wide':

        404.271182      cpu-clock (msec)      #    4.000 CPUs utilized
               143      context-switches      #    0.354 K/sec
                13      cpu-migrations        #    0.032 K/sec
                73      page-faults           #    0.181 K/sec
        22,119,220      cycles                #    0.055 GHz
        13,622,065      instructions          #    0.62  insn per cycle
         2,918,769      branches              #    7.220 M/sec
            85,033      branch-misses         #    2.91% of all branches

       0.101073089 seconds time elapsed

  #

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1463119263-5569-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-16 23:11:47 -03:00
Namhyung Kim daf4f4786e perf stat: Update runtime using cpu-clock event
Currently only the task-clock event updates the runtime_nsec so it
cannot show the metric when using cpu-clock events.  However cpu clock
works basically same as task-clock, so no need to not update the runtime
IMHO.

Before:

  # perf stat -a -e cpu-clock,context-switches,page-faults,cycles sleep 0.1

    Performance counter stats for 'system wide':

         1217.759506      cpu-clock (msec)
                  93      context-switches
                  61      page-faults
          18,958,022      cycles

         0.101393794 seconds time elapsed

After:

   Performance counter stats for 'system wide':

         1220.471884      cpu-clock (msec)          #   12.013 CPUs utilized
                 118      context-switches          #    0.097 K/sec
                  59      page-faults               #    0.048 K/sec
          17,941,247      cycles                    #    0.015 GHz

         0.101594777 seconds time elapsed

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1463119263-5569-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-16 23:11:46 -03:00
Namhyung Kim b0404be8d6 perf stat: Fix indentation of stalled backend cycle
The commit 140aeadc1f ("perf stat: Abstract stat metrics printing")
changed how shadow metrics are printed, but it missed to update the
width of the stalled backend cycles event to 7.2% like others.  This
resulted in misaligned output like below:

  Performance counter stats for 'pwd':

          0.638313      task-clock (msec)         #    0.567 CPUs utilized
                 0      context-switches          #    0.000 K/sec
                 0      cpu-migrations            #    0.000 K/sec
                54      page-faults               #    0.085 M/sec
           885,600      cycles                    #    1.387 GHz
           558,438      stalled-cycles-frontend   #   63.06% frontend cycles idle
           431,355      stalled-cycles-backend    #  48.71% backend cycles idle
           674,956      instructions              #    0.76  insn per cycle
                                                  #    0.83  stalled cycles per insn
           130,380      branches                  #  204.257 M/sec
     <not counted>      branch-misses

       0.001125426 seconds time elapsed

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Fixes: 140aeadc1f ("perf stat: Abstract stat metrics printing")
Link: http://lkml.kernel.org/r/1463119263-5569-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-16 23:11:45 -03:00