Commit Graph

11445 Commits

Author SHA1 Message Date
Tiezhu Yang 3e9b26dc22 perf tools: Remove some duplicated includes
There exists some duplicated includes in tools/perf, remove them.

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: xuefeng li <lixuefeng@loongson.cn>
Link: http://lore.kernel.org/lkml/1591071304-19338-2-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-06-02 11:09:41 -03:00
Adrian Hunter 0affd0e526 perf symbols: Fix kernel maps for kcore and eBPF
Adjust 'map->pgoff' also when moving a map's start address.

Example with v5.4.34 based kernel:

  Before:

    $ sudo tools/perf/perf record -a --kcore -e intel_pt//k sleep 1
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 1.958 MB perf.data ]
    $ sudo tools/perf/perf script --itrace=e >/dev/null
    Warning:
    961 instruction trace errors

  After:

    $ sudo tools/perf/perf script --itrace=e >/dev/null
    $

Committer testing:

  # uname -a
  Linux seventh 5.6.10-100.fc30.x86_64 #1 SMP Mon May 4 15:36:44 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
  #

Before:

  # perf record -a --kcore -e intel_pt//k sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.923 MB perf.data ]
  # perf script --itrace=e >/dev/null
  Warning:
  295 instruction trace errors
  #

After:

  # perf record -a --kcore -e intel_pt//k sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.919 MB perf.data ]
  # perf script --itrace=e >/dev/null
  #

Fixes: fb5a88d413 ("perf tools: Preserve eBPF maps when loading kcore")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200602112505.1406-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-06-02 11:05:37 -03:00
Jiri Olsa a9a1790247 perf stat: Ensure group is defined on top of the same cpu mask
Jin Yao reported the issue (and posted first versions of this change)
with groups being defined over events with different cpu mask.

This causes assert aborts in get_group_fd, like:

  # perf stat -M "C2_Pkg_Residency" -a -- sleep 1
  perf: util/evsel.c:1464: get_group_fd: Assertion `!(fd == -1)' failed.
  Aborted

All the events in the group have to be defined over the same cpus so the
group_fd can be found for every leader/member pair.

Adding check to ensure this condition is met and removing the group
(with warning) if we detect mixed cpus, like:

  $ sudo perf stat -e '{power/energy-cores/,cycles},{instructions,power/energy-cores/}'
  WARNING: event cpu maps do not match, disabling group:
    anon group { power/energy-cores/, cycles }
    anon group { instructions, power/energy-cores/ }

Ian asked also for cpu maps details, it's displayed in verbose mode:

  $ sudo perf stat -e '{cycles,power/energy-cores/}' -v
  WARNING: group events cpu maps do not match, disabling group:
    anon group { power/energy-cores/, cycles }
       power/energy-cores/: 0
       cycles: 0-7
    anon group { instructions, power/energy-cores/ }
       instructions: 0-7
       power/energy-cores/: 0

Committer testing:

  [root@seventh ~]# perf stat -e '{power/energy-cores/,cycles},{instructions,power/energy-cores/}'
  WARNING: grouped events cpus do not match, disabling group:
    anon group { power/energy-cores/, cycles }
    anon group { instructions, power/energy-cores/ }
  ^C
   Performance counter stats for 'system wide':

               12.62 Joules power/energy-cores/
         106,920,637        cycles
          80,228,899        instructions              #    0.75  insn per cycle
               12.62 Joules power/energy-cores/

        14.514476987 seconds time elapsed

  [root@seventh ~]#

But if we put compatible events in each group it works:

  [root@seventh ~]# perf stat -e '{power/energy-cores/,power/energy-ram/},{instructions,cycles}' -a sleep 2

   Performance counter stats for 'system wide':

                1.95 Joules power/energy-cores/
                0.92 Joules power/energy-ram/
          29,305,715        instructions              #    1.03  insn per cycle
          28,423,338        cycles

         2.001438142 seconds time elapsed

  [root@seventh ~]#

This needs improvement tho:

  [root@seventh ~]# perf stat -e '{power/energy-cores/,power/energy-ram/},{instructions,cycles}' sleep 2
  Error:
  The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (power/energy-cores/).
  /bin/dmesg | grep -i perf may provide additional information.

  [root@seventh ~]#

We need to emit a better message, one stating that the power/ events
can't be used for a specific workload, instead it is per-cpu or system
wide.

Fixes: 6a4bb04caa ("perf tools: Enable grouping logic for parsed events")
Co-developed-by: Jin Yao <yao.jin@linux.intel.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200602101736.GE1112120@krava
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-06-02 10:43:06 -03:00
Ian Rogers 5cf0e8ebc2 perf libdw: Fix off-by 1 relative directory includes
This is currently working due to extra include paths in the build.

Before:

  $ cd tools/perf/arch/arm64/util
  $ ls -la ../../util/unwind-libdw.h
  ls: cannot access '../../util/unwind-libdw.h': No such file or directory

After:

  $ ls -la ../../../util/unwind-libdw.h
  -rw-r----- 1 irogers irogers 553 Apr 17 14:31 ../../../util/unwind-libdw.h

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200529225232.207532-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-06-01 12:24:23 -03:00
Tan Xiaojun a54ca19498 perf arm-spe: Support synthetic events
After the commit ffd3d18c20 ("perf tools: Add ARM Statistical
Profiling Extensions (SPE) support") has been merged, it supports to
output raw data with option "--dump-raw-trace".  However, it misses for
support synthetic events so cannot output any statistical info.

This patch is to improve the "perf report" support for ARM SPE for four
types synthetic events:

  First level cache synthetic events, including L1 data cache accessing
  and missing events;
  Last level cache synthetic events, including last level cache
  accessing and missing events;
  TLB synthetic events, including TLB accessing and missing events;
  Remote access events, which is used to account load/store operations
  caused to another socket.

Example usage:

  $ perf record -c 1024 -e arm_spe_0/branch_filter=1,ts_enable=1,pct_enable=1,pa_enable=1,load_filter=1,jitter=1,store_filter=1,min_latency=0/ dd if=/dev/zero of=/dev/null count=10000
  $ perf report --stdio

  # Samples: 59  of event 'l1d-miss'
  # Event count (approx.): 59
  #
  # Children      Self  Command  Shared Object      Symbol
  # ........  ........  .......  .................  ..................................
  #
      23.73%    23.73%  dd       [kernel.kallsyms]  [k] perf_iterate_ctx.constprop.135
      20.34%    20.34%  dd       [kernel.kallsyms]  [k] filemap_map_pages
       5.08%     5.08%  dd       [kernel.kallsyms]  [k] perf_event_mmap
       5.08%     5.08%  dd       [kernel.kallsyms]  [k] unlock_page_memcg
       5.08%     5.08%  dd       [kernel.kallsyms]  [k] unmap_page_range
       3.39%     3.39%  dd       [kernel.kallsyms]  [k] PageHuge
       3.39%     3.39%  dd       [kernel.kallsyms]  [k] release_pages
       3.39%     3.39%  dd       ld-2.28.so         [.] 0x0000000000008b5c
       1.69%     1.69%  dd       [kernel.kallsyms]  [k] __alloc_fd
       [...]

  # Samples: 3K of event 'l1d-access'
  # Event count (approx.): 3980
  #
  # Children      Self  Command  Shared Object      Symbol
  # ........  ........  .......  .................  ......................................
  #
      26.98%    26.98%  dd       [kernel.kallsyms]  [k] ret_to_user
      10.53%    10.53%  dd       [kernel.kallsyms]  [k] fsnotify
       7.51%     7.51%  dd       [kernel.kallsyms]  [k] new_sync_read
       4.57%     4.57%  dd       [kernel.kallsyms]  [k] vfs_read
       4.35%     4.35%  dd       [kernel.kallsyms]  [k] vfs_write
       3.69%     3.69%  dd       [kernel.kallsyms]  [k] __fget_light
       3.69%     3.69%  dd       [kernel.kallsyms]  [k] rw_verify_area
       3.44%     3.44%  dd       [kernel.kallsyms]  [k] security_file_permission
       2.76%     2.76%  dd       [kernel.kallsyms]  [k] __fsnotify_parent
       2.44%     2.44%  dd       [kernel.kallsyms]  [k] ksys_write
       2.24%     2.24%  dd       [kernel.kallsyms]  [k] iov_iter_zero
       2.19%     2.19%  dd       [kernel.kallsyms]  [k] read_iter_zero
       1.81%     1.81%  dd       dd                 [.] 0x0000000000002960
       1.78%     1.78%  dd       dd                 [.] 0x0000000000002980
       [...]

  # Samples: 35  of event 'llc-miss'
  # Event count (approx.): 35
  #
  # Children      Self  Command  Shared Object      Symbol
  # ........  ........  .......  .................  ...........................
  #
      34.29%    34.29%  dd       [kernel.kallsyms]  [k] filemap_map_pages
       8.57%     8.57%  dd       [kernel.kallsyms]  [k] unlock_page_memcg
       8.57%     8.57%  dd       [kernel.kallsyms]  [k] unmap_page_range
       5.71%     5.71%  dd       [kernel.kallsyms]  [k] PageHuge
       5.71%     5.71%  dd       [kernel.kallsyms]  [k] release_pages
       5.71%     5.71%  dd       ld-2.28.so         [.] 0x0000000000008b5c
       2.86%     2.86%  dd       [kernel.kallsyms]  [k] __queue_work
       2.86%     2.86%  dd       [kernel.kallsyms]  [k] __radix_tree_lookup
       2.86%     2.86%  dd       [kernel.kallsyms]  [k] copy_page
       [...]

  # Samples: 2  of event 'llc-access'
  # Event count (approx.): 2
  #
  # Children      Self  Command  Shared Object      Symbol
  # ........  ........  .......  .................  .............
  #
      50.00%    50.00%  dd       [kernel.kallsyms]  [k] copy_page
      50.00%    50.00%  dd       libc-2.28.so       [.] _dl_addr

  # Samples: 48  of event 'tlb-miss'
  # Event count (approx.): 48
  #
  # Children      Self  Command  Shared Object      Symbol
  # ........  ........  .......  .................  ..................................
  #
      20.83%    20.83%  dd       [kernel.kallsyms]  [k] perf_iterate_ctx.constprop.135
      12.50%    12.50%  dd       [kernel.kallsyms]  [k] __arch_clear_user
      10.42%    10.42%  dd       [kernel.kallsyms]  [k] clear_page
       4.17%     4.17%  dd       [kernel.kallsyms]  [k] copy_page
       4.17%     4.17%  dd       [kernel.kallsyms]  [k] filemap_map_pages
       2.08%     2.08%  dd       [kernel.kallsyms]  [k] __alloc_fd
       2.08%     2.08%  dd       [kernel.kallsyms]  [k] __mod_memcg_state.part.70
       2.08%     2.08%  dd       [kernel.kallsyms]  [k] __queue_work
       2.08%     2.08%  dd       [kernel.kallsyms]  [k] __rcu_read_unlock
       2.08%     2.08%  dd       [kernel.kallsyms]  [k] d_path
       2.08%     2.08%  dd       [kernel.kallsyms]  [k] destroy_inode
       2.08%     2.08%  dd       [kernel.kallsyms]  [k] do_dentry_open
       [...]

  # Samples: 9K of event 'tlb-access'
  # Event count (approx.): 9573
  #
  # Children      Self  Command  Shared Object      Symbol
  # ........  ........  .......  .................  ......................................
  #
      25.79%    25.79%  dd       [kernel.kallsyms]  [k] __arch_clear_user
      11.22%    11.22%  dd       [kernel.kallsyms]  [k] ret_to_user
       8.56%     8.56%  dd       [kernel.kallsyms]  [k] fsnotify
       4.06%     4.06%  dd       [kernel.kallsyms]  [k] new_sync_read
       3.67%     3.67%  dd       [kernel.kallsyms]  [k] el0_svc_common.constprop.2
       3.04%     3.04%  dd       [kernel.kallsyms]  [k] __fsnotify_parent
       2.90%     2.90%  dd       [kernel.kallsyms]  [k] vfs_write
       2.82%     2.82%  dd       [kernel.kallsyms]  [k] vfs_read
       2.52%     2.52%  dd       libc-2.28.so       [.] write
       2.26%     2.26%  dd       [kernel.kallsyms]  [k] security_file_permission
       2.08%     2.08%  dd       [kernel.kallsyms]  [k] ksys_write
       1.96%     1.96%  dd       [kernel.kallsyms]  [k] rw_verify_area
       1.95%     1.95%  dd       [kernel.kallsyms]  [k] read_iter_zero
       [...]

  # Samples: 9  of event 'branch-miss'
  # Event count (approx.): 9
  #
  # Children      Self  Command  Shared Object      Symbol
  # ........  ........  .......  .................  .........................
  #
      22.22%    22.22%  dd       libc-2.28.so       [.] _dl_addr
      11.11%    11.11%  dd       [kernel.kallsyms]  [k] __arch_clear_user
      11.11%    11.11%  dd       [kernel.kallsyms]  [k] __arch_copy_from_user
      11.11%    11.11%  dd       [kernel.kallsyms]  [k] __dentry_kill
      11.11%    11.11%  dd       [kernel.kallsyms]  [k] __efistub_memcpy
      11.11%    11.11%  dd       ld-2.28.so         [.] 0x0000000000012b7c
      11.11%    11.11%  dd       libc-2.28.so       [.] 0x000000000002a980
      11.11%    11.11%  dd       libc-2.28.so       [.] 0x0000000000083340

  # Samples: 29  of event 'remote-access'
  # Event count (approx.): 29
  #
  # Children      Self  Command  Shared Object      Symbol
  # ........  ........  .......  .................  ...........................
  #
      41.38%    41.38%  dd       [kernel.kallsyms]  [k] filemap_map_pages
      10.34%    10.34%  dd       [kernel.kallsyms]  [k] unlock_page_memcg
      10.34%    10.34%  dd       [kernel.kallsyms]  [k] unmap_page_range
       6.90%     6.90%  dd       [kernel.kallsyms]  [k] release_pages
       3.45%     3.45%  dd       [kernel.kallsyms]  [k] PageHuge
       3.45%     3.45%  dd       [kernel.kallsyms]  [k] __queue_work
       3.45%     3.45%  dd       [kernel.kallsyms]  [k] page_add_file_rmap
       3.45%     3.45%  dd       [kernel.kallsyms]  [k] page_counter_try_charge
       3.45%     3.45%  dd       [kernel.kallsyms]  [k] page_remove_rmap
       3.45%     3.45%  dd       [kernel.kallsyms]  [k] xas_start
       3.45%     3.45%  dd       ld-2.28.so         [.] 0x0000000000002a1c
       3.45%     3.45%  dd       ld-2.28.so         [.] 0x0000000000008b5c
       3.45%     3.45%  dd       ld-2.28.so         [.] 0x00000000000093cc

Signed-off-by: Tan Xiaojun <tanxiaojun@huawei.com>
Tested-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Al Grant <al.grant@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20200530122442.490-4-leo.yan@linaro.org
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-06-01 12:24:23 -03:00
Tan Xiaojun 9f74d77018 perf auxtrace: Add four itrace options
This patch is to add four options to synthesize events which are
described as below:

 'f': synthesize first level cache events
 'm': synthesize last level cache events
 't': synthesize TLB events
 'a': synthesize remote access events

This four options will be used by ARM SPE as their first consumer.

Signed-off-by: Tan Xiaojun <tanxiaojun@huawei.com>
Tested-by: James Clark <james.clark@arm.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Al Grant <al.grant@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20200530122442.490-3-leo.yan@linaro.org
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-06-01 12:24:23 -03:00
Tan Xiaojun 4db25f6693 perf tools: Move arm-spe-pkt-decoder.h/c to the new dir
Create a new arm-spe-decoder directory for subsequent extensions and
move arm-spe-pkt-decoder.h/c to this directory. No code changes.

Signed-off-by: Tan Xiaojun <tanxiaojun@huawei.com>
Tested-by: James Clark <james.clark@arm.com>
Tested-by: Qi Liu <liuqi115@hisilicon.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Al Grant <al.grant@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20200530122442.490-2-leo.yan@linaro.org
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-06-01 12:24:23 -03:00
Ian Rogers 0fb0d615f3 perf test: Initialize memory in dwarf-unwind
Avoid a false positive caused by assembly code in arch/x86.

In tests, zero the perf_event to avoid uninitialized memory uses.

Warnings were caught using clang with -fsanitize=memory.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: clang-built-linux@googlegroups.com
Link: http://lore.kernel.org/lkml/20200530082015.39162-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-06-01 12:24:23 -03:00
Ian Rogers 8617e2e34f perf tests: Don't tail call optimize in unwind test
The tail call optimization can unexpectedly make the stack smaller and
cause the test to fail.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: clang-built-linux@googlegroups.com
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200530082015.39162-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-06-01 12:24:23 -03:00
Arnaldo Carvalho de Melo 9300acc6fe perf build: Add a LIBPFM4=1 build test entry
So that when one runs:

  $ make -C tools/perf build-test

We make sure that recent changes don't break that opt-in build.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Igor Lubashev <ilubashe@akamai.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Jiwei Sun <jiwei.sun@windriver.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yonghong Song <yhs@fb.com>
Cc: yuzhoujian <yuzhoujian@didichuxing.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-05-29 16:51:38 -03:00
Stephane Eranian 7094349078 perf tools: Add optional support for libpfm4
This patch links perf with the libpfm4 library if it is available and
LIBPFM4 is passed to the build. The libpfm4 library contains hardware
event tables for all processors supported by perf_events. It is a helper
library that helps convert from a symbolic event name to the event
encoding required by the underlying kernel interface. This library is
open-source and available from: http://perfmon2.sf.net.

With this patch, it is possible to specify full hardware events by name.
Hardware filters are also supported. Events must be specified via the
--pfm-events and not -e option. Both options are active at the same time
and it is possible to mix and match:

  $ perf stat --pfm-events inst_retired:any_p:c=1:i -e cycles ....

One needs to explicitely ask for its inclusion by using the LIBPFM4 make
command line option, ie its opt-in rather than opt-out of feature
detection and build support.

Signed-off-by: Stephane Eranian <eranian@google.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Igor Lubashev <ilubashe@akamai.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Jiwei Sun <jiwei.sun@windriver.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: yuzhoujian <yuzhoujian@didichuxing.com>
Link: http://lore.kernel.org/lkml/20200505182943.218248-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-05-29 16:51:38 -03:00
Ed Maste 82352ae28f perf tools: Correct license on jsmn JSON parser
This header is part of the jsmn JSON parser, introduced in 867a979a83.
Correct the SPDX tag to indicate that it is under the MIT license.

Signed-off-by: Ed Maste <emaste@freebsd.org>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: http://lore.kernel.org/lkml/20200528170858.48457-1-emaste@freefall.freebsd.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-05-29 16:51:38 -03:00
Nick Gasson 1e4bd2ae45 perf jit: Fix inaccurate DWARF line table
Fix an issue where addresses in the DWARF line table are offset by -0x40
(GEN_ELF_TEXT_OFFSET). This can be seen with `objdump -S` on the ELF
files after perf inject.

Committer notes:

Ian added this in his Acked-by reply:

 ---
Without too much knowledge this looks good to me. The original code came
from oprofile's jit support:

  https://sourceforge.net/p/oprofile/oprofile/ci/master/tree/opjitconv/debug_line.c#l325
 ---

Signed-off-by: Nick Gasson <nick.gasson@arm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200528051916.6722-1-nick.gasson@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-05-29 16:51:38 -03:00
Nick Gasson 7d7e503cac perf jvmti: Remove redundant jitdump line table entries
For each PC/BCI pair in the JVMTI compiler inlining record table, the
jitdump plugin emits debug line table entries for every source line in
the method preceding that BCI. Instead only emit one source line per
PC/BCI pair. Reported by Ian Rogers. This reduces the .dump size for
SPECjbb from ~230MB to ~40MB.

Signed-off-by: Nick Gasson <nick.gasson@arm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200528054049.13662-1-nick.gasson@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-05-29 16:51:38 -03:00
Arnaldo Carvalho de Melo 60da3a12c5 perf build: Add NO_SDT=1 to the default set of build tests
We forgot to add it, so one would have to explicitely ask for it to be
run, fix that by adding it to the set of tests that are performed by
default when one does:

  $ make -C tools/perf build-test

It was being exercised only in the make_minimal test, this patch makes
it be tested in isolation, i.e. disabling only this feature.

Fixes: e26e63be64 ("perf build: Add sdt feature detection")
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-05-29 16:51:38 -03:00
Arnaldo Carvalho de Melo 69fbadbe98 perf build: Add NO_LIBCRYPTO=1 to the default set of build tests
We forgot to add it, so one would have to explicitely ask for it to be
run, fix that by adding it to the set of tests that are performed by
default when one does:

  $ make -C tools/perf build-test

It was being exercised only in the make_minimal test, this patch makes
it be tested in isolation, i.e. disabling only this feature.

Fixes: 8ee4646038 ("perf build: Add libcrypto feature detection")
Cc: Stephane Eranian <eranian@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-05-29 16:51:38 -03:00
Arnaldo Carvalho de Melo 5bc7aac3e7 perf build: Add NO_SYSCALL_TABLE=1 to the build tests
So that we make sure that even on x86-64 and other architectures where
that is the default method we test build the fallback to libaudit that
other architectures use.

I.e. now this line got added to:

  $ make -C tools/perf build-test
  <SNIP>
       make_no_syscall_tbl_O: cd . && make NO_SYSCALL_TABLE=1 FEATURES_DUMP=/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j12 O=/tmp/tmp.W0HtKR1mfr DESTDIR=/tmp/tmp.lNezgCVPzW
  <SNIP>
  $

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-05-29 16:51:16 -03:00
Arnaldo Carvalho de Melo a88f70de1b perf build: Remove libaudit from the default feature checks
Ingo reported that the libaudit was always appearing as OFF:

  Auto-detecting system features:
  ...                         dwarf: [ on  ]
  ...            dwarf_getlocations: [ on  ]
  ...                         glibc: [ on  ]
  ...                          gtk2: [ on  ]
  ...                      libaudit: [ OFF ]

And everything seemed to work, i.e. we were checking for a feature that
we don't use, causing confusion for people building perf, so work to
remove that nuisance while making sure that it works when an arch
doesn't provide the alternative method to generate the syscall id/name
conversion tables.

Longer explanation of the new modus operandi:

  $ make -C tools/perf O=/tmp/build/perf NO_SYSCALL_TABLE=1
  <SNIP>
  Auto-detecting system features:
  ...                         dwarf: [ on  ]
  ...            dwarf_getlocations: [ on  ]
  ...                         glibc: [ on  ]
  ...                          gtk2: [ on  ]
  ...                        libbfd: [ on  ]
  ...                        libcap: [ on  ]
  ...                        libelf: [ on  ]
  ...                       libnuma: [ on  ]
  ...        numa_num_possible_cpus: [ on  ]
  ...                       libperl: [ on  ]
  ...                     libpython: [ on  ]
  ...                     libcrypto: [ on  ]
  ...                     libunwind: [ on  ]
  ...            libdw-dwarf-unwind: [ on  ]
  ...                          zlib: [ on  ]
  ...                          lzma: [ on  ]
  ...                     get_cpuid: [ on  ]
  ...                           bpf: [ on  ]
  ...                        libaio: [ on  ]
  ...                       libzstd: [ on  ]
  ...        disassembler-four-args: [ on  ]

  Makefile.config:665: No libaudit.h found, disables 'trace' tool, please install audit-libs-devel or libaudit-dev
    GEN      /tmp/build/perf/common-cmds.h
    MKDIR    /tmp/build/perf/fd/
    MKDIR    /tmp/build/perf/fs/
  <SNIP>
  $

The libaudit test is forced and it fails when audit-libs-devel isn't available:

  $ cat /tmp/build/perf/feature/test-libaudit.make.output
  test-libaudit.c:2:10: fatal error: libaudit.h: No such file or directory
      2 | #include <libaudit.h>
        |          ^~~~~~~~~~~~
  compilation terminated.
  $

If we install audit-libs-devel and rebuild it continues not to be shown as OFF
in the main auto-detection summary, but again gets tested and this time:

  $ rpm -q audit-libs-devel
  audit-libs-devel-3.0-0.15.20191104git1c2f876.fc31.x86_64
  $

The make output for the feature detection comes clean:

  $ cat /tmp/build/perf/feature/test-libaudit.make.output

And the feature detection binary is successfully built and is dynamicly linked
with libaudit:

  $ ldd /tmp/build/perf/feature/test-libaudit.bin | grep audit
  	libaudit.so.1 => /lib64/libaudit.so.1 (0x00007f5bf5177000)
  $

As well as the resulting perf binary:

  $ ldd /tmp/build/perf/perf | grep audit
  	libaudit.so.1 => /lib64/libaudit.so.1 (0x00007fad511c7000)
  $

And 'perf trace' works using the libaudit method:

  $ sudo /tmp/build/perf/perf trace -e nanosleep sleep 1
       0.000 (1000.067 ms): sleep/281872 nanosleep(rqtp: 0x7ffedbbe69d0) = 0
  $

If we leave audit-libs-devel installed but don't disable the use of the best
method, the one using SYSCALL_TABLE, the default for architectures that provide
the script to build the syscall id/name mapping using the .tbl files copied
from the kernel sources, we get:

  $ rm -rf /tmp/build/perf ; mkdir -p /tmp/build/perf
  $ make -C tools/perf O=/tmp/build/perf
  Auto-detecting system features:
  ...                         dwarf: [ on  ]
  ...            dwarf_getlocations: [ on  ]
  ...                         glibc: [ on  ]
  ...                          gtk2: [ on  ]
  ...                        libbfd: [ on  ]
  ...                        libcap: [ on  ]
  ...                        libelf: [ on  ]
  ...                       libnuma: [ on  ]
  ...        numa_num_possible_cpus: [ on  ]
  ...                       libperl: [ on  ]
  ...                     libpython: [ on  ]
  ...                     libcrypto: [ on  ]
  ...                     libunwind: [ on  ]
  ...            libdw-dwarf-unwind: [ on  ]
  ...                          zlib: [ on  ]
  ...                          lzma: [ on  ]
  ...                     get_cpuid: [ on  ]
  ...                           bpf: [ on  ]
  ...                        libaio: [ on  ]
  ...                       libzstd: [ on  ]
  ...        disassembler-four-args: [ on  ]

    GEN      /tmp/build/perf/common-cmds.h
  <SNIP>
  $

Again, no mention of libaudit being on or OFF and:

  $ cat /tmp/build/perf/feature/test-libaudit.make.output
  cat: /tmp/build/perf/feature/test-libaudit.make.output: No such file or directory
  $

We didn't even bother checking for its availability, slightly speeding up the
build process and:

  $ ldd /tmp/build/perf/perf | grep libaudit
  $

We don't link with it, also:

  $ sudo /tmp/build/perf/perf trace -e nanosleep sleep 1
       0.000 (1000.053 ms): sleep/299125 nanosleep(rqtp: 0x7ffc24611b50) = 0
  $

And globs become available:

  $ sudo /tmp/build/perf/perf trace -e *sleep sleep 1
       0.000 (1000.072 ms): sleep/299136 nanosleep(rqtp: 0x7ffe7a3c4ff0) = 0
  $

Reported-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-05-29 16:51:13 -03:00
Arnaldo Carvalho de Melo d21cb73a90 perf trace: Grow the syscall table as needed when using libaudit
The audit-libs API doesn't provide a way to figure out what is the
syscall with the greatest number/id, take that into account when using
that method to go on growing the syscall table as we the syscalls go on
appearing on the radar.

With this the libaudit based method is back working, i.e. when building
with:

  $ make NO_SYSCALL_TABLE=1 O=/tmp/build/perf -C tools/perf install-bin
  <SNIP>
  Auto-detecting system features:
  <SNIP>
  ...                      libaudit: [ on  ]
  ...                        libbfd: [ on  ]
  ...                        libcap: [ on  ]
  <SNIP>
  $ ldd ~/bin/perf | grep audit
	libaudit.so.1 => /lib64/libaudit.so.1 (0x00007faef22df000)
  $

perf trace is back working, which makes it functional in arches other
than x86_64, powerpc, arm64 and s390, that provides these generators:

  $ find tools/perf/arch/ -name "*syscalltbl*"
  tools/perf/arch/x86/entry/syscalls/syscalltbl.sh
  tools/perf/arch/arm64/entry/syscalls/mksyscalltbl
  tools/perf/arch/s390/entry/syscalls/mksyscalltbl
  tools/perf/arch/powerpc/entry/syscalls/mksyscalltbl
  $

Example output forcing the libaudit method on x86_64:

  # perf trace -e file,nanosleep sleep 0.001
           ? (         ): sleep/859090  ... [continued]: execve())                                   = 0
       0.045 ( 0.005 ms): sleep/859090 access(filename: 0x8733e850, mode: R)                         = -1 ENOENT (No such file or directory)
       0.055 ( 0.005 ms): sleep/859090 openat(dfd: CWD, filename: 0x8733ba29, flags: RDONLY|CLOEXEC) = 3
       0.079 ( 0.005 ms): sleep/859090 openat(dfd: CWD, filename: 0x87345d20, flags: RDONLY|CLOEXEC) = 3
       0.085 ( 0.002 ms): sleep/859090 read(fd: 3, buf: 0x7ffd9d483f58, count: 832)                  = 832
       0.090 ( 0.002 ms): sleep/859090 read(fd: 3, buf: 0x7ffd9d483b50, count: 784)                  = 784
       0.094 ( 0.002 ms): sleep/859090 read(fd: 3, buf: 0x7ffd9d483b20, count: 32)                   = 32
       0.098 ( 0.002 ms): sleep/859090 read(fd: 3, buf: 0x7ffd9d483ad0, count: 68)                   = 68
       0.109 ( 0.002 ms): sleep/859090 read(fd: 3, buf: 0x7ffd9d483a50, count: 784)                  = 784
       0.113 ( 0.002 ms): sleep/859090 read(fd: 3, buf: 0x7ffd9d483730, count: 32)                   = 32
       0.117 ( 0.002 ms): sleep/859090 read(fd: 3, buf: 0x7ffd9d483710, count: 68)                   = 68
       0.320 ( 0.008 ms): sleep/859090 openat(dfd: CWD, filename: 0x872c3660, flags: RDONLY|CLOEXEC) = 3
       0.372 ( 1.057 ms): sleep/859090 nanosleep(rqtp: 0x7ffd9d484ac0)                               = 0
  #

There are still some limitations when using the libaudit method, that
will be fixed at some point, i.e., this works with the mksyscalltbl
method but not with libaudit's:

  # perf trace -e file,*sleep sleep 0.001
  event syntax error: '*sleep'
                       \___ parser error
  Run 'perf list' for a list of valid events

   Usage: perf trace [<options>] [<command>]
      or: perf trace [<options>] -- <command> [<options>]
      or: perf trace record [<options>] [<command>]
      or: perf trace record [<options>] -- <command> [<options>]

      -e, --event <event>   event/syscall selector. use 'perf list' to list available events
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-05-29 16:51:11 -03:00
Arnaldo Carvalho de Melo a9e8c1f856 perf trace: Use zalloc() to make sure all fields are zeroed in the syscalltbl constructor
In the past this wasn't needed as the libaudit based code would use just
one field, and the alternative constructor would fill in all the fields,
but now that even when using the libaudit based method we need the other
fields, switch to zalloc() to make sure the other fields are zeroed at
instantiation time.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-05-29 16:50:29 -03:00
Arnaldo Carvalho de Melo db6b8cc891 perf trace: Remove union from syscalltbl, all the fields are needed
When we moved to a syscalltbl generated from the kernel syscall tables
(arch/..../syscall*.tbl) the idea was to either use it, when having the
generator (e.g. tools/perf/arch/x86/entry/syscalls/syscalltbl.sh), or
falling back to the previous audit-libs based way of mapping syscall ids
to strings and the other way around.

At first we just needed the audit_detect_machine() return to then use it
to the str->id/id->str, or the other fields for the now used by default
in the most well developed arches method of using the syscall table
generator.

The problem is that then the libaudit code fell into disrepair, and
architectures where it is the method used are not working.

Now, with NO_SYSCALL_TABLE=1 being possible to pass on the make command
line we can automate the testing of that method even on x86-64, arm64,
etc.

And doing it I noted that we actually use fields in both entries in the
union, oops, so ditch the union, as we need all those fields at the same
time.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-05-29 16:50:26 -03:00
Arnaldo Carvalho de Melo 43de3869b5 perf build: Allow explicitely disabling the NO_SYSCALL_TABLE variable
This is useful to see if, on x86, the legacy libaudit still works, as it
is used in architectures that don't have the SYSCALL_TABLE logic and we
want to have it tested in 'make -C tools/perf/ build-test'.

E.g.:

Without having audit-libs-devel installed:

  $ make NO_SYSCALL_TABLE=1 O=/tmp/build/perf -C tools/perf install-bin
  make: Entering directory '/home/acme/git/perf/tools/perf'
    BUILD:   Doing 'make -j12' parallel build
  <SNIP>
  Auto-detecting system features:
  <SNIP>
  ...                      libaudit: [ OFF ]
  ...                        libbfd: [ on  ]
  ...                        libcap: [ on  ]
  <SNIP>
  Makefile.config:664: No libaudit.h found, disables 'trace' tool, please install audit-libs-devel or libaudit-dev
  <SNIP>

After installing it:

  $ rm -rf /tmp/build/perf ; mkdir -p /tmp/build/perf
  $ time make NO_SYSCALL_TABLE=1 O=/tmp/build/perf  -C tools/perf install-bin ; perf test python
  make: Entering directory '/home/acme/git/perf/tools/perf'
    BUILD:   Doing 'make -j12' parallel build
    HOSTCC   /tmp/build/perf/fixdep.o
    HOSTLD   /tmp/build/perf/fixdep-in.o
    LINK     /tmp/build/perf/fixdep
  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h'
  diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h
  Warning: Kernel ABI header at 'tools/perf/util/hashmap.h' differs from latest version at 'tools/lib/bpf/hashmap.h'
  diff -u tools/perf/util/hashmap.h tools/lib/bpf/hashmap.h
  Warning: Kernel ABI header at 'tools/perf/util/hashmap.c' differs from latest version at 'tools/lib/bpf/hashmap.c'
  diff -u tools/perf/util/hashmap.c tools/lib/bpf/hashmap.c

  Auto-detecting system features:
  <SNIP>
  ...                      libaudit: [ on  ]
  ...                        libbfd: [ on  ]
  ...                        libcap: [ on  ]
  <SNIP>
  $ ldd ~/bin/perf | grep audit
  	libaudit.so.1 => /lib64/libaudit.so.1 (0x00007fc18978e000)
  $

Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/20200529155552.463-3-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-05-29 16:50:24 -03:00
Arnaldo Carvalho de Melo 9b90d9734a perf build: Group the NO_SYSCALL_TABLE logic
To help in allowing to disable it from the make command line.

Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/20200529155552.463-2-acme@kernel.org
[ Fixed the logic for the filter part, it should be ifeq, not ifneq ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-05-29 16:48:03 -03:00
Adrian Hunter 9b2d2066dd perf intel-pt: Refine kernel decoding only warning message
Stop the message displaying when user space is not being traced.

Example:

  Prerequisites:

    sudo setcap "cap_sys_rawio,cap_sys_admin,cap_sys_ptrace,cap_syslog,cap_ipc_lock=ep" ~/bin/perf
    sudo chmod +r /proc/kcore

  Before:

    $ perf record --no-switch-events --kcore -a -e intel_pt//k -- sleep 0.001
    Warning:
    Intel Processor Trace decoding will not be possible except for kernel tracing!
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.838 MB perf.data ]

  After:

    $ perf record --no-switch-events --kcore -a -e intel_pt//k -- sleep 0.001
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 1.068 MB perf.data ]

    $ sudo chmod go-r /proc/kcore
    $ sudo setcap -r ~/bin/perf

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: http://lore.kernel.org/lkml/20200528120859.21604-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-05-28 11:37:45 -03:00
Adrian Hunter 16b4b4e1a0 perf record: Respect --no-switch-events
Context switch events are added automatically by Intel PT and Coresight.

Make it possible to suppress them. That is useful for tracing the
scheduler without the disturbance that the switch event processing
creates.

Example:

  Prerequisites:

    $ which perf
    ~/bin/perf
    $ sudo setcap "cap_sys_rawio,cap_sys_admin,cap_sys_ptrace,cap_syslog,cap_ipc_lock=ep" ~/bin/perf
    $ sudo chmod +r /proc/kcore

  Before:

    $ perf record --no-switch-events --kcore -a -e intel_pt//k -- sleep 0.001
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.938 MB perf.data ]
    $ perf script -D | grep PERF_RECORD_SWITCH | wc -l
    572

  After:

    $ perf record --no-switch-events --kcore -a -e intel_pt//k -- sleep 0.001
    Warning:
    Intel Processor Trace decoding will not be possible except for kernel tracing!
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.838 MB perf.data ]
    $ perf script -D | grep PERF_RECORD_SWITCH | wc -l
    0

    $ sudo chmod go-r /proc/kcore
    $ sudo setcap -r ~/bin/perf

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: http://lore.kernel.org/lkml/20200528120859.21604-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-05-28 11:33:36 -03:00