Merge tag 'perf-tools-for-v5.18-2022-03-26' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull perf tools updates from Arnaldo Carvalho de Melo:
 "New features:

  perf ftrace:

   - Add -n/--use-nsec option to the 'latency' subcommand.

     Default: usecs:

     $ sudo perf ftrace latency -T dput -a sleep 1
     #   DURATION     |      COUNT | GRAPH                          |
          0 - 1    us |    2098375 | #############################  |
          1 - 2    us |         61 |                                |
          2 - 4    us |         33 |                                |
          4 - 8    us |         13 |                                |
          8 - 16   us |        124 |                                |
         16 - 32   us |        123 |                                |
         32 - 64   us |          1 |                                |
         64 - 128  us |          0 |                                |
        128 - 256  us |          1 |                                |
        256 - 512  us |          0 |                                |

     Better granularity with nsec:

     $ sudo perf ftrace latency -T dput -a -n sleep 1
     #   DURATION     |      COUNT | GRAPH                          |
          0 - 1    us |          0 |                                |
          1 - 2    ns |          0 |                                |
          2 - 4    ns |          0 |                                |
          4 - 8    ns |          0 |                                |
          8 - 16   ns |          0 |                                |
         16 - 32   ns |          0 |                                |
         32 - 64   ns |          0 |                                |
         64 - 128  ns |    1163434 | ##############                 |
        128 - 256  ns |     914102 | #############                  |
        256 - 512  ns |        884 |                                |
        512 - 1024 ns |        613 |                                |
          1 - 2    us |         31 |                                |
          2 - 4    us |         17 |                                |
          4 - 8    us |          7 |                                |
          8 - 16   us |        123 |                                |
         16 - 32   us |         83 |                                |

  perf lock:

   - Add -c/--combine-locks option to merge lock instances in the same
     class into a single entry.

     # perf lock report -c
                    Name acquired contended avg wait(ns) total wait(ns) max wait(ns) min wait(ns)

           rcu_read_lock   251225         0            0              0            0            0
      hrtimer_bases.lock    39450         0            0              0            0            0
     &sb->s_type->i_l...    10301         1          662            662          662          662
        ptlock_ptr(page)    10173         2          701           1402          760          642
     &(ei->i_block_re...     8732         0            0              0            0            0
            &xa->xa_lock     8088         0            0              0            0            0
             &base->lock     6705         0            0              0            0            0
             &p->pi_lock     5549         0            0              0            0            0
     &dentry->d_lockr...     5010         4         1274           5097         1844          789
               &ep->lock     3958         0            0              0            0            0

      - Add -F/--field option to customize the list of fields to output:

     $ perf lock report -F contended,wait_max -k avg_wait
                     Name contended max wait(ns) avg wait(ns)

           slock-AF_INET6         1        23543        23543
        &lruvec->lru_lock         5        18317        11254
           slock-AF_INET6         1        10379        10379
               rcu_node_1         1         2104         2104
      &dentry->d_lockr...         1         1844         1844
      &dentry->d_lockr...         1         1672         1672
         &newf->file_lock        15         2279         1025
      &dentry->d_lockr...         1          792          792

   - Add --synth=no option for record, as there is no need to symbolize,
     lock names comes from the tracepoints.

  perf record:

   - Threaded recording, opt-in, via the new --threads command line
     option.

   - Improve AMD IBS (Instruction-Based Sampling) error handling
     messages.

  perf script:

   - Add 'brstackinsnlen' field (use it with -F) for branch stacks.

   - Output branch sample type in 'perf script'.

  perf report:

   - Add "addr_from" and "addr_to" sort dimensions.

   - Print branch stack entry type in 'perf report --dump-raw-trace'

   - Fix symbolization for chrooted workloads.

  Hardware tracing:

  Intel PT:

   - Add CFE (Control Flow Event) and EVD (Event Data) packets support.

   - Add MODE.Exec IFLAG bit support.

     Explanation about these features from the "Intel® 64 and IA-32
     architectures software developer’s manual combined volumes: 1, 2A,
     2B, 2C, 2D, 3A, 3B, 3C, 3D, and 4" PDF at:

        https://cdrdv2.intel.com/v1/dl/getContent/671200

     At page 3951:
      "32.2.4

       Event Trace is a capability that exposes details about the
       asynchronous events, when they are generated, and when their
       corresponding software event handler completes execution. These
       include:

        o Interrupts, including NMI and SMI, including the interrupt
          vector when defined.

        o Faults, exceptions including the fault vector.

           - Page faults additionally include the page fault address,
             when in context.

        o Event handler returns, including IRET and RSM.

        o VM exits and VM entries.¹

           - VM exits include the values written to the “exit reason”
             and “exit qualification” VMCS fields. INIT and SIPI events.

        o TSX aborts, including the abort status returned for the RTM
          instructions.

        o Shutdown.

       Additionally, it provides indication of the status of the
       Interrupt Flag (IF), to indicate when interrupts are masked"

  ARM CoreSight:

   - Use advertised caps/min_interval as default sample_period on ARM
     spe.

   - Update deduction of TRCCONFIGR register for branch broadcast on
     ARM's CoreSight ETM.

  Vendor Events (JSON):

  Intel:

   - Update events and metrics for: Alderlake, Broadwell, Broadwell DE,
     BroadwellX, CascadelakeX, Elkhartlake, Bonnell, Goldmont,
     GoldmontPlus, Westmere EP-DP, Haswell, HaswellX, Icelake, IcelakeX,
     Ivybridge, Ivytown, Jaketown, Knights Landing, Nehalem EP,
     Sandybridge, Silvermont, Skylake, Skylake Server, SkylakeX,
     Tigerlake, TremontX, Westmere EP-SP, and Westmere EX.

  ARM:

   - Add support for HiSilicon CPA PMU aliasing.

  perf stat:

   - Fix forked applications enablement of counters.

   - The 'slots' should only be printed on a different order than the
     one specified on the command line when 'topdown' events are
     present, fix it.

  Miscellaneous:

   - Sync msr-index, cpufeatures header files with the kernel sources.

   - Stop using some deprecated libbpf APIs in 'perf trace'.

   - Fix some spelling mistakes.

   - Refactor the maps pointers usage to pave the way for using refcount
     debugging.

   - Only offer the --tui option on perf top, report and annotate when
     perf was built with libslang.

   - Don't mention --to-ctf in 'perf data --help' when not linking with
     the required library, libbabeltrace.

   - Use ARRAY_SIZE() instead of ad hoc equivalent, spotted by
     array_size.cocci.

   - Enhance the matching of sub-commands abbreviations:
	'perf c2c rec' -> 'perf c2c record'
	'perf c2c recport -> error

   - Set build-id using build-id header on new mmap records.

   - Fix generation of 'perf --version' string.

  perf test:

   - Add test for the arm_spe event.

   - Add test to check unwinding using fame-pointer (fp) mode on arm64.

   - Make metric testing more robust in 'perf test'.

   - Add error message for unsupported branch stack cases.

  libperf:

   - Add API for allocating new thread map array.

   - Fix typo in perf_evlist__open() failure error messages in libperf
     tests.

  perf c2c:

   - Replace bitmap_weight() with bitmap_empty() where appropriate"

* tag 'perf-tools-for-v5.18-2022-03-26' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (143 commits)
  perf evsel: Improve AMD IBS (Instruction-Based Sampling) error handling messages
  perf python: Add perf_env stubs that will be needed in evsel__open_strerror()
  perf tools: Enhance the matching of sub-commands abbreviations
  libperf tests: Fix typo in perf_evlist__open() failure error messages
  tools arm64: Import cputype.h
  perf lock: Add -F/--field option to control output
  perf lock: Extend struct lock_key to have print function
  perf lock: Add --synth=no option for record
  tools headers cpufeatures: Sync with the kernel sources
  tools headers cpufeatures: Sync with the kernel sources
  perf stat: Fix forked applications enablement of counters
  tools arch x86: Sync the msr-index.h copy with the kernel sources
  perf evsel: Make evsel__env() always return a valid env
  perf build-id: Fix spelling mistake "Cant" -> "Can't"
  perf header: Fix spelling mistake "could't" -> "couldn't"
  perf script: Add 'brstackinsnlen' for branch stacks
  perf parse-events: Move slots only with topdown
  perf ftrace latency: Update documentation
  perf ftrace latency: Add -n/--use-nsec option
  perf tools: Fix version kernel tag
  ...
This commit is contained in:
Linus Torvalds
2022-03-27 13:42:32 -07:00
329 changed files with 94225 additions and 77611 deletions

View File

@@ -0,0 +1,258 @@
/* SPDX-License-Identifier: GPL-2.0-only */
/*
* Copyright (C) 2012 ARM Ltd.
*/
#ifndef __ASM_CPUTYPE_H
#define __ASM_CPUTYPE_H
#define INVALID_HWID ULONG_MAX
#define MPIDR_UP_BITMASK (0x1 << 30)
#define MPIDR_MT_BITMASK (0x1 << 24)
#define MPIDR_HWID_BITMASK UL(0xff00ffffff)
#define MPIDR_LEVEL_BITS_SHIFT 3
#define MPIDR_LEVEL_BITS (1 << MPIDR_LEVEL_BITS_SHIFT)
#define MPIDR_LEVEL_MASK ((1 << MPIDR_LEVEL_BITS) - 1)
#define MPIDR_LEVEL_SHIFT(level) \
(((1 << level) >> 1) << MPIDR_LEVEL_BITS_SHIFT)
#define MPIDR_AFFINITY_LEVEL(mpidr, level) \
((mpidr >> MPIDR_LEVEL_SHIFT(level)) & MPIDR_LEVEL_MASK)
#define MIDR_REVISION_MASK 0xf
#define MIDR_REVISION(midr) ((midr) & MIDR_REVISION_MASK)
#define MIDR_PARTNUM_SHIFT 4
#define MIDR_PARTNUM_MASK (0xfff << MIDR_PARTNUM_SHIFT)
#define MIDR_PARTNUM(midr) \
(((midr) & MIDR_PARTNUM_MASK) >> MIDR_PARTNUM_SHIFT)
#define MIDR_ARCHITECTURE_SHIFT 16
#define MIDR_ARCHITECTURE_MASK (0xf << MIDR_ARCHITECTURE_SHIFT)
#define MIDR_ARCHITECTURE(midr) \
(((midr) & MIDR_ARCHITECTURE_MASK) >> MIDR_ARCHITECTURE_SHIFT)
#define MIDR_VARIANT_SHIFT 20
#define MIDR_VARIANT_MASK (0xf << MIDR_VARIANT_SHIFT)
#define MIDR_VARIANT(midr) \
(((midr) & MIDR_VARIANT_MASK) >> MIDR_VARIANT_SHIFT)
#define MIDR_IMPLEMENTOR_SHIFT 24
#define MIDR_IMPLEMENTOR_MASK (0xff << MIDR_IMPLEMENTOR_SHIFT)
#define MIDR_IMPLEMENTOR(midr) \
(((midr) & MIDR_IMPLEMENTOR_MASK) >> MIDR_IMPLEMENTOR_SHIFT)
#define MIDR_CPU_MODEL(imp, partnum) \
(((imp) << MIDR_IMPLEMENTOR_SHIFT) | \
(0xf << MIDR_ARCHITECTURE_SHIFT) | \
((partnum) << MIDR_PARTNUM_SHIFT))
#define MIDR_CPU_VAR_REV(var, rev) \
(((var) << MIDR_VARIANT_SHIFT) | (rev))
#define MIDR_CPU_MODEL_MASK (MIDR_IMPLEMENTOR_MASK | MIDR_PARTNUM_MASK | \
MIDR_ARCHITECTURE_MASK)
#define ARM_CPU_IMP_ARM 0x41
#define ARM_CPU_IMP_APM 0x50
#define ARM_CPU_IMP_CAVIUM 0x43
#define ARM_CPU_IMP_BRCM 0x42
#define ARM_CPU_IMP_QCOM 0x51
#define ARM_CPU_IMP_NVIDIA 0x4E
#define ARM_CPU_IMP_FUJITSU 0x46
#define ARM_CPU_IMP_HISI 0x48
#define ARM_CPU_IMP_APPLE 0x61
#define ARM_CPU_PART_AEM_V8 0xD0F
#define ARM_CPU_PART_FOUNDATION 0xD00
#define ARM_CPU_PART_CORTEX_A57 0xD07
#define ARM_CPU_PART_CORTEX_A72 0xD08
#define ARM_CPU_PART_CORTEX_A53 0xD03
#define ARM_CPU_PART_CORTEX_A73 0xD09
#define ARM_CPU_PART_CORTEX_A75 0xD0A
#define ARM_CPU_PART_CORTEX_A35 0xD04
#define ARM_CPU_PART_CORTEX_A55 0xD05
#define ARM_CPU_PART_CORTEX_A76 0xD0B
#define ARM_CPU_PART_NEOVERSE_N1 0xD0C
#define ARM_CPU_PART_CORTEX_A77 0xD0D
#define ARM_CPU_PART_NEOVERSE_V1 0xD40
#define ARM_CPU_PART_CORTEX_A78 0xD41
#define ARM_CPU_PART_CORTEX_X1 0xD44
#define ARM_CPU_PART_CORTEX_A510 0xD46
#define ARM_CPU_PART_CORTEX_A710 0xD47
#define ARM_CPU_PART_CORTEX_X2 0xD48
#define ARM_CPU_PART_NEOVERSE_N2 0xD49
#define ARM_CPU_PART_CORTEX_A78C 0xD4B
#define APM_CPU_PART_POTENZA 0x000
#define CAVIUM_CPU_PART_THUNDERX 0x0A1
#define CAVIUM_CPU_PART_THUNDERX_81XX 0x0A2
#define CAVIUM_CPU_PART_THUNDERX_83XX 0x0A3
#define CAVIUM_CPU_PART_THUNDERX2 0x0AF
/* OcteonTx2 series */
#define CAVIUM_CPU_PART_OCTX2_98XX 0x0B1
#define CAVIUM_CPU_PART_OCTX2_96XX 0x0B2
#define CAVIUM_CPU_PART_OCTX2_95XX 0x0B3
#define CAVIUM_CPU_PART_OCTX2_95XXN 0x0B4
#define CAVIUM_CPU_PART_OCTX2_95XXMM 0x0B5
#define CAVIUM_CPU_PART_OCTX2_95XXO 0x0B6
#define BRCM_CPU_PART_BRAHMA_B53 0x100
#define BRCM_CPU_PART_VULCAN 0x516
#define QCOM_CPU_PART_FALKOR_V1 0x800
#define QCOM_CPU_PART_FALKOR 0xC00
#define QCOM_CPU_PART_KRYO 0x200
#define QCOM_CPU_PART_KRYO_2XX_GOLD 0x800
#define QCOM_CPU_PART_KRYO_2XX_SILVER 0x801
#define QCOM_CPU_PART_KRYO_3XX_SILVER 0x803
#define QCOM_CPU_PART_KRYO_4XX_GOLD 0x804
#define QCOM_CPU_PART_KRYO_4XX_SILVER 0x805
#define NVIDIA_CPU_PART_DENVER 0x003
#define NVIDIA_CPU_PART_CARMEL 0x004
#define FUJITSU_CPU_PART_A64FX 0x001
#define HISI_CPU_PART_TSV110 0xD01
#define APPLE_CPU_PART_M1_ICESTORM 0x022
#define APPLE_CPU_PART_M1_FIRESTORM 0x023
#define MIDR_CORTEX_A53 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A53)
#define MIDR_CORTEX_A57 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A57)
#define MIDR_CORTEX_A72 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A72)
#define MIDR_CORTEX_A73 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A73)
#define MIDR_CORTEX_A75 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A75)
#define MIDR_CORTEX_A35 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A35)
#define MIDR_CORTEX_A55 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A55)
#define MIDR_CORTEX_A76 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A76)
#define MIDR_NEOVERSE_N1 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_N1)
#define MIDR_CORTEX_A77 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A77)
#define MIDR_NEOVERSE_V1 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V1)
#define MIDR_CORTEX_A78 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A78)
#define MIDR_CORTEX_X1 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X1)
#define MIDR_CORTEX_A510 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A510)
#define MIDR_CORTEX_A710 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A710)
#define MIDR_CORTEX_X2 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X2)
#define MIDR_NEOVERSE_N2 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_N2)
#define MIDR_CORTEX_A78C MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A78C)
#define MIDR_THUNDERX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX)
#define MIDR_THUNDERX_81XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_81XX)
#define MIDR_THUNDERX_83XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_83XX)
#define MIDR_OCTX2_98XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_OCTX2_98XX)
#define MIDR_OCTX2_96XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_OCTX2_96XX)
#define MIDR_OCTX2_95XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_OCTX2_95XX)
#define MIDR_OCTX2_95XXN MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_OCTX2_95XXN)
#define MIDR_OCTX2_95XXMM MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_OCTX2_95XXMM)
#define MIDR_OCTX2_95XXO MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_OCTX2_95XXO)
#define MIDR_CAVIUM_THUNDERX2 MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX2)
#define MIDR_BRAHMA_B53 MIDR_CPU_MODEL(ARM_CPU_IMP_BRCM, BRCM_CPU_PART_BRAHMA_B53)
#define MIDR_BRCM_VULCAN MIDR_CPU_MODEL(ARM_CPU_IMP_BRCM, BRCM_CPU_PART_VULCAN)
#define MIDR_QCOM_FALKOR_V1 MIDR_CPU_MODEL(ARM_CPU_IMP_QCOM, QCOM_CPU_PART_FALKOR_V1)
#define MIDR_QCOM_FALKOR MIDR_CPU_MODEL(ARM_CPU_IMP_QCOM, QCOM_CPU_PART_FALKOR)
#define MIDR_QCOM_KRYO MIDR_CPU_MODEL(ARM_CPU_IMP_QCOM, QCOM_CPU_PART_KRYO)
#define MIDR_QCOM_KRYO_2XX_GOLD MIDR_CPU_MODEL(ARM_CPU_IMP_QCOM, QCOM_CPU_PART_KRYO_2XX_GOLD)
#define MIDR_QCOM_KRYO_2XX_SILVER MIDR_CPU_MODEL(ARM_CPU_IMP_QCOM, QCOM_CPU_PART_KRYO_2XX_SILVER)
#define MIDR_QCOM_KRYO_3XX_SILVER MIDR_CPU_MODEL(ARM_CPU_IMP_QCOM, QCOM_CPU_PART_KRYO_3XX_SILVER)
#define MIDR_QCOM_KRYO_4XX_GOLD MIDR_CPU_MODEL(ARM_CPU_IMP_QCOM, QCOM_CPU_PART_KRYO_4XX_GOLD)
#define MIDR_QCOM_KRYO_4XX_SILVER MIDR_CPU_MODEL(ARM_CPU_IMP_QCOM, QCOM_CPU_PART_KRYO_4XX_SILVER)
#define MIDR_NVIDIA_DENVER MIDR_CPU_MODEL(ARM_CPU_IMP_NVIDIA, NVIDIA_CPU_PART_DENVER)
#define MIDR_NVIDIA_CARMEL MIDR_CPU_MODEL(ARM_CPU_IMP_NVIDIA, NVIDIA_CPU_PART_CARMEL)
#define MIDR_FUJITSU_A64FX MIDR_CPU_MODEL(ARM_CPU_IMP_FUJITSU, FUJITSU_CPU_PART_A64FX)
#define MIDR_HISI_TSV110 MIDR_CPU_MODEL(ARM_CPU_IMP_HISI, HISI_CPU_PART_TSV110)
#define MIDR_APPLE_M1_ICESTORM MIDR_CPU_MODEL(ARM_CPU_IMP_APPLE, APPLE_CPU_PART_M1_ICESTORM)
#define MIDR_APPLE_M1_FIRESTORM MIDR_CPU_MODEL(ARM_CPU_IMP_APPLE, APPLE_CPU_PART_M1_FIRESTORM)
/* Fujitsu Erratum 010001 affects A64FX 1.0 and 1.1, (v0r0 and v1r0) */
#define MIDR_FUJITSU_ERRATUM_010001 MIDR_FUJITSU_A64FX
#define MIDR_FUJITSU_ERRATUM_010001_MASK (~MIDR_CPU_VAR_REV(1, 0))
#define TCR_CLEAR_FUJITSU_ERRATUM_010001 (TCR_NFD1 | TCR_NFD0)
#ifndef __ASSEMBLY__
#include "sysreg.h"
#define read_cpuid(reg) read_sysreg_s(SYS_ ## reg)
/*
* Represent a range of MIDR values for a given CPU model and a
* range of variant/revision values.
*
* @model - CPU model as defined by MIDR_CPU_MODEL
* @rv_min - Minimum value for the revision/variant as defined by
* MIDR_CPU_VAR_REV
* @rv_max - Maximum value for the variant/revision for the range.
*/
struct midr_range {
u32 model;
u32 rv_min;
u32 rv_max;
};
#define MIDR_RANGE(m, v_min, r_min, v_max, r_max) \
{ \
.model = m, \
.rv_min = MIDR_CPU_VAR_REV(v_min, r_min), \
.rv_max = MIDR_CPU_VAR_REV(v_max, r_max), \
}
#define MIDR_REV_RANGE(m, v, r_min, r_max) MIDR_RANGE(m, v, r_min, v, r_max)
#define MIDR_REV(m, v, r) MIDR_RANGE(m, v, r, v, r)
#define MIDR_ALL_VERSIONS(m) MIDR_RANGE(m, 0, 0, 0xf, 0xf)
static inline bool midr_is_cpu_model_range(u32 midr, u32 model, u32 rv_min,
u32 rv_max)
{
u32 _model = midr & MIDR_CPU_MODEL_MASK;
u32 rv = midr & (MIDR_REVISION_MASK | MIDR_VARIANT_MASK);
return _model == model && rv >= rv_min && rv <= rv_max;
}
static inline bool is_midr_in_range(u32 midr, struct midr_range const *range)
{
return midr_is_cpu_model_range(midr, range->model,
range->rv_min, range->rv_max);
}
static inline bool
is_midr_in_range_list(u32 midr, struct midr_range const *ranges)
{
while (ranges->model)
if (is_midr_in_range(midr, ranges++))
return true;
return false;
}
/*
* The CPU ID never changes at run time, so we might as well tell the
* compiler that it's constant. Use this function to read the CPU ID
* rather than directly reading processor_id or read_cpuid() directly.
*/
static inline u32 __attribute_const__ read_cpuid_id(void)
{
return read_cpuid(MIDR_EL1);
}
static inline u64 __attribute_const__ read_cpuid_mpidr(void)
{
return read_cpuid(MPIDR_EL1);
}
static inline unsigned int __attribute_const__ read_cpuid_implementor(void)
{
return MIDR_IMPLEMENTOR(read_cpuid_id());
}
static inline unsigned int __attribute_const__ read_cpuid_part_number(void)
{
return MIDR_PARTNUM(read_cpuid_id());
}
static inline u32 __attribute_const__ read_cpuid_cachetype(void)
{
return read_cpuid(CTR_EL0);
}
#endif /* __ASSEMBLY__ */
#endif

View File

@@ -299,9 +299,6 @@
/* Intel-defined CPU features, CPUID level 0x00000007:1 (EAX), word 12 */
#define X86_FEATURE_AVX_VNNI (12*32+ 4) /* AVX VNNI instructions */
#define X86_FEATURE_AVX512_BF16 (12*32+ 5) /* AVX512 BFLOAT16 instructions */
#define X86_FEATURE_AMX_BF16 (18*32+22) /* AMX bf16 Support */
#define X86_FEATURE_AMX_TILE (18*32+24) /* AMX tile Support */
#define X86_FEATURE_AMX_INT8 (18*32+25) /* AMX int8 Support */
/* AMD-defined CPU features, CPUID level 0x80000008 (EBX), word 13 */
#define X86_FEATURE_CLZERO (13*32+ 0) /* CLZERO instruction */
@@ -330,6 +327,7 @@
#define X86_FEATURE_HWP_ACT_WINDOW (14*32+ 9) /* HWP Activity Window */
#define X86_FEATURE_HWP_EPP (14*32+10) /* HWP Energy Perf. Preference */
#define X86_FEATURE_HWP_PKG_REQ (14*32+11) /* HWP Package Level Request */
#define X86_FEATURE_HFI (14*32+19) /* Hardware Feedback Interface */
/* AMD SVM Feature Identification, CPUID level 0x8000000a (EDX), word 15 */
#define X86_FEATURE_NPT (15*32+ 0) /* Nested Page Table support */
@@ -390,7 +388,10 @@
#define X86_FEATURE_TSXLDTRK (18*32+16) /* TSX Suspend Load Address Tracking */
#define X86_FEATURE_PCONFIG (18*32+18) /* Intel PCONFIG */
#define X86_FEATURE_ARCH_LBR (18*32+19) /* Intel ARCH LBR */
#define X86_FEATURE_AMX_BF16 (18*32+22) /* AMX bf16 Support */
#define X86_FEATURE_AVX512_FP16 (18*32+23) /* AVX512 FP16 */
#define X86_FEATURE_AMX_TILE (18*32+24) /* AMX tile Support */
#define X86_FEATURE_AMX_INT8 (18*32+25) /* AMX int8 Support */
#define X86_FEATURE_SPEC_CTRL (18*32+26) /* "" Speculation Control (IBRS + IBPB) */
#define X86_FEATURE_INTEL_STIBP (18*32+27) /* "" Single Thread Indirect Branch Predictors */
#define X86_FEATURE_FLUSH_L1D (18*32+28) /* Flush L1D cache */

View File

@@ -56,8 +56,11 @@
# define DISABLE_PTI (1 << (X86_FEATURE_PTI & 31))
#endif
/* Force disable because it's broken beyond repair */
#define DISABLE_ENQCMD (1 << (X86_FEATURE_ENQCMD & 31))
#ifdef CONFIG_INTEL_IOMMU_SVM
# define DISABLE_ENQCMD 0
#else
# define DISABLE_ENQCMD (1 << (X86_FEATURE_ENQCMD & 31))
#endif
#ifdef CONFIG_X86_SGX
# define DISABLE_SGX 0

View File

@@ -705,12 +705,14 @@
#define PACKAGE_THERM_STATUS_PROCHOT (1 << 0)
#define PACKAGE_THERM_STATUS_POWER_LIMIT (1 << 10)
#define PACKAGE_THERM_STATUS_HFI_UPDATED (1 << 26)
#define MSR_IA32_PACKAGE_THERM_INTERRUPT 0x000001b2
#define PACKAGE_THERM_INT_HIGH_ENABLE (1 << 0)
#define PACKAGE_THERM_INT_LOW_ENABLE (1 << 1)
#define PACKAGE_THERM_INT_PLN_ENABLE (1 << 24)
#define PACKAGE_THERM_INT_HFI_ENABLE (1 << 25)
/* Thermal Thresholds Support */
#define THERM_INT_THRESHOLD0_ENABLE (1 << 15)
@@ -959,4 +961,8 @@
#define MSR_VM_IGNNE 0xc0010115
#define MSR_VM_HSAVE_PA 0xc0010117
/* Hardware Feedback Interface */
#define MSR_IA32_HW_FEEDBACK_PTR 0x17d0
#define MSR_IA32_HW_FEEDBACK_CONFIG 0x17d1
#endif /* _ASM_X86_MSR_INDEX_H */

View File

@@ -102,10 +102,6 @@
# define __init
#endif
#ifndef noinline
# define noinline
#endif
#include <linux/types.h>
/*

View File

@@ -18,6 +18,7 @@
* ETMv3.5/PTM doesn't define ETMCR config bits with prefix "ETM3_" and
* directly use below macros as config bits.
*/
#define ETM_OPT_BRANCH_BROADCAST 8
#define ETM_OPT_CYCACC 12
#define ETM_OPT_CTXTID 14
#define ETM_OPT_CTXTID2 15
@@ -25,6 +26,7 @@
#define ETM_OPT_RETSTK 29
/* ETMv4 CONFIGR programming bits for the ETM OPTs */
#define ETM4_CFG_BIT_BB 3
#define ETM4_CFG_BIT_CYCACC 4
#define ETM4_CFG_BIT_CTXTID 6
#define ETM4_CFG_BIT_VMID 7

View File

@@ -88,6 +88,23 @@ int fdarray__add(struct fdarray *fda, int fd, short revents, enum fdarray_flags
return pos;
}
int fdarray__dup_entry_from(struct fdarray *fda, int pos, struct fdarray *from)
{
struct pollfd *entry;
int npos;
if (pos >= from->nr)
return -EINVAL;
entry = &from->entries[pos];
npos = fdarray__add(fda, entry->fd, entry->events, from->priv[pos].flags);
if (npos >= 0)
fda->priv[npos] = from->priv[pos];
return npos;
}
int fdarray__filter(struct fdarray *fda, short revents,
void (*entry_destructor)(struct fdarray *fda, int fd, void *arg),
void *arg)

View File

@@ -42,6 +42,7 @@ struct fdarray *fdarray__new(int nr_alloc, int nr_autogrow);
void fdarray__delete(struct fdarray *fda);
int fdarray__add(struct fdarray *fda, int fd, short revents, enum fdarray_flags flags);
int fdarray__dup_entry_from(struct fdarray *fda, int pos, struct fdarray *from);
int fdarray__poll(struct fdarray *fda, int timeout);
int fdarray__filter(struct fdarray *fda, short revents,
void (*entry_destructor)(struct fdarray *fda, int fd, void *arg),

View File

@@ -62,11 +62,12 @@ SYNOPSIS
struct perf_thread_map;
struct perf_thread_map *perf_thread_map__new_dummy(void);
struct perf_thread_map *perf_thread_map__new_array(int nr_threads, pid_t *array);
void perf_thread_map__set_pid(struct perf_thread_map *map, int thread, pid_t pid);
char *perf_thread_map__comm(struct perf_thread_map *map, int thread);
void perf_thread_map__set_pid(struct perf_thread_map *map, int idx, pid_t pid);
char *perf_thread_map__comm(struct perf_thread_map *map, int idx);
int perf_thread_map__nr(struct perf_thread_map *threads);
pid_t perf_thread_map__pid(struct perf_thread_map *map, int thread);
pid_t perf_thread_map__pid(struct perf_thread_map *map, int idx);
struct perf_thread_map *perf_thread_map__get(struct perf_thread_map *map);
void perf_thread_map__put(struct perf_thread_map *map);

View File

@@ -8,11 +8,12 @@
struct perf_thread_map;
LIBPERF_API struct perf_thread_map *perf_thread_map__new_dummy(void);
LIBPERF_API struct perf_thread_map *perf_thread_map__new_array(int nr_threads, pid_t *array);
LIBPERF_API void perf_thread_map__set_pid(struct perf_thread_map *map, int thread, pid_t pid);
LIBPERF_API char *perf_thread_map__comm(struct perf_thread_map *map, int thread);
LIBPERF_API void perf_thread_map__set_pid(struct perf_thread_map *map, int idx, pid_t pid);
LIBPERF_API char *perf_thread_map__comm(struct perf_thread_map *map, int idx);
LIBPERF_API int perf_thread_map__nr(struct perf_thread_map *threads);
LIBPERF_API pid_t perf_thread_map__pid(struct perf_thread_map *map, int thread);
LIBPERF_API pid_t perf_thread_map__pid(struct perf_thread_map *map, int idx);
LIBPERF_API struct perf_thread_map *perf_thread_map__get(struct perf_thread_map *map);
LIBPERF_API void perf_thread_map__put(struct perf_thread_map *map);

View File

@@ -12,6 +12,7 @@ LIBPERF_0.0.1 {
perf_cpu_map__empty;
perf_cpu_map__max;
perf_cpu_map__has;
perf_thread_map__new_array;
perf_thread_map__new_dummy;
perf_thread_map__set_pid;
perf_thread_map__comm;

View File

@@ -69,7 +69,7 @@ static int test_stat_cpu(void)
perf_evlist__set_maps(evlist, cpus, NULL);
err = perf_evlist__open(evlist);
__T("failed to open evsel", err == 0);
__T("failed to open evlist", err == 0);
perf_evlist__for_each_evsel(evlist, evsel) {
cpus = perf_evsel__cpus(evsel);
@@ -130,7 +130,7 @@ static int test_stat_thread(void)
perf_evlist__set_maps(evlist, NULL, threads);
err = perf_evlist__open(evlist);
__T("failed to open evsel", err == 0);
__T("failed to open evlist", err == 0);
perf_evlist__for_each_evsel(evlist, evsel) {
perf_evsel__read(evsel, 0, 0, &counts);
@@ -187,7 +187,7 @@ static int test_stat_thread_enable(void)
perf_evlist__set_maps(evlist, NULL, threads);
err = perf_evlist__open(evlist);
__T("failed to open evsel", err == 0);
__T("failed to open evlist", err == 0);
perf_evlist__for_each_evsel(evlist, evsel) {
perf_evsel__read(evsel, 0, 0, &counts);
@@ -507,7 +507,7 @@ static int test_stat_multiplexing(void)
perf_evlist__set_maps(evlist, NULL, threads);
err = perf_evlist__open(evlist);
__T("failed to open evsel", err == 0);
__T("failed to open evlist", err == 0);
perf_evlist__enable(evlist);

View File

@@ -11,9 +11,43 @@ static int libperf_print(enum libperf_print_level level,
return vfprintf(stderr, fmt, ap);
}
static int test_threadmap_array(int nr, pid_t *array)
{
struct perf_thread_map *threads;
int i;
threads = perf_thread_map__new_array(nr, array);
__T("Failed to allocate new thread map", threads);
__T("Unexpected number of threads", perf_thread_map__nr(threads) == nr);
for (i = 0; i < nr; i++) {
__T("Unexpected initial value of thread",
perf_thread_map__pid(threads, i) == (array ? array[i] : -1));
}
for (i = 1; i < nr; i++)
perf_thread_map__set_pid(threads, i, i * 100);
__T("Unexpected value of thread 0",
perf_thread_map__pid(threads, 0) == (array ? array[0] : -1));
for (i = 1; i < nr; i++) {
__T("Unexpected thread value",
perf_thread_map__pid(threads, i) == i * 100);
}
perf_thread_map__put(threads);
return 0;
}
#define THREADS_NR 10
int test_threadmap(int argc, char **argv)
{
struct perf_thread_map *threads;
pid_t thr_array[THREADS_NR];
int i;
__T_START;
@@ -27,6 +61,13 @@ int test_threadmap(int argc, char **argv)
perf_thread_map__put(threads);
perf_thread_map__put(threads);
test_threadmap_array(THREADS_NR, NULL);
for (i = 0; i < THREADS_NR; i++)
thr_array[i] = i + 100;
test_threadmap_array(THREADS_NR, thr_array);
__T_END;
return tests_failed == 0 ? 0 : -1;
}

View File

@@ -32,26 +32,36 @@ struct perf_thread_map *perf_thread_map__realloc(struct perf_thread_map *map, in
#define thread_map__alloc(__nr) perf_thread_map__realloc(NULL, __nr)
void perf_thread_map__set_pid(struct perf_thread_map *map, int thread, pid_t pid)
void perf_thread_map__set_pid(struct perf_thread_map *map, int idx, pid_t pid)
{
map->map[thread].pid = pid;
map->map[idx].pid = pid;
}
char *perf_thread_map__comm(struct perf_thread_map *map, int thread)
char *perf_thread_map__comm(struct perf_thread_map *map, int idx)
{
return map->map[thread].comm;
return map->map[idx].comm;
}
struct perf_thread_map *perf_thread_map__new_array(int nr_threads, pid_t *array)
{
struct perf_thread_map *threads = thread_map__alloc(nr_threads);
int i;
if (!threads)
return NULL;
for (i = 0; i < nr_threads; i++)
perf_thread_map__set_pid(threads, i, array ? array[i] : -1);
threads->nr = nr_threads;
refcount_set(&threads->refcnt, 1);
return threads;
}
struct perf_thread_map *perf_thread_map__new_dummy(void)
{
struct perf_thread_map *threads = thread_map__alloc(1);
if (threads != NULL) {
perf_thread_map__set_pid(threads, 0, -1);
threads->nr = 1;
refcount_set(&threads->refcnt, 1);
}
return threads;
return perf_thread_map__new_array(1, NULL);
}
static void perf_thread_map__delete(struct perf_thread_map *threads)
@@ -85,7 +95,7 @@ int perf_thread_map__nr(struct perf_thread_map *threads)
return threads ? threads->nr : 1;
}
pid_t perf_thread_map__pid(struct perf_thread_map *map, int thread)
pid_t perf_thread_map__pid(struct perf_thread_map *map, int idx)
{
return map->map[thread].pid;
return map->map[idx].pid;
}

View File

@@ -7,6 +7,8 @@
p synthesize power events (incl. PSB events for Intel PT)
o synthesize other events recorded due to the use
of aux-output (refer to perf record)
I synthesize interrupt or similar (asynchronous) events
(e.g. Intel PT Event Trace)
e synthesize error events
d create a debug log
f synthesize first level cache events

View File

@@ -9,32 +9,24 @@ perf-ftrace - simple wrapper for kernel's ftrace functionality
SYNOPSIS
--------
[verse]
'perf ftrace' <command>
'perf ftrace' {trace|latency} <command>
DESCRIPTION
-----------
The 'perf ftrace' command is a simple wrapper of kernel's ftrace
functionality. It only supports single thread tracing currently and
just reads trace_pipe in text and then write it to stdout.
The 'perf ftrace' command provides a collection of subcommands which use
kernel's ftrace infrastructure.
'perf ftrace trace' is a simple wrapper of the ftrace. It only supports
single thread tracing currently and just reads trace_pipe in text and then
write it to stdout.
'perf ftrace latency' calculates execution latency of a given function
(optionally with BPF) and display it as a histogram.
The following options apply to perf ftrace.
OPTIONS
-------
-t::
--tracer=::
Tracer to use when neither -G nor -F option is not
specified: function_graph or function.
-v::
--verbose::
Increase the verbosity level.
-F::
--funcs::
List available functions to trace. It accepts a pattern to
only list interested functions.
COMMON OPTIONS
--------------
-p::
--pid=::
@@ -43,10 +35,6 @@ OPTIONS
--tid=::
Trace on existing thread id (comma separated list).
-D::
--delay::
Time (ms) to wait before starting tracing after program start.
-a::
--all-cpus::
Force system-wide collection. Scripts run without a <command>
@@ -61,6 +49,28 @@ OPTIONS
Ranges of CPUs are specified with -: 0-2.
Default is to trace on all online CPUs.
-v::
--verbose::
Increase the verbosity level.
OPTIONS for 'perf ftrace trace'
-------------------------------
-t::
--tracer=::
Tracer to use when neither -G nor -F option is not
specified: function_graph or function.
-F::
--funcs::
List available functions to trace. It accepts a pattern to
only list interested functions.
-D::
--delay::
Time (ms) to wait before starting tracing after program start.
-m::
--buffer-size::
Set the size of per-cpu tracing buffer, <size> is expected to
@@ -114,6 +124,25 @@ OPTIONS
thresh=<n> - Setup trace duration threshold in microseconds.
depth=<n> - Set max depth for function graph tracer to follow.
OPTIONS for 'perf ftrace latency'
---------------------------------
-T::
--trace-funcs=::
Set the function name to get the histogram. Unlike perf ftrace trace,
it only allows single function to calculate the histogram.
-b::
--use-bpf::
Use BPF to measure function latency instead of using the ftrace (it
uses function_graph tracer internally).
-n::
--use-nsec::
Use nano-second instead of micro-second as a base unit of the histogram.
SEE ALSO
--------
linkperf:perf-record[1], linkperf:perf-trace[1]

View File

@@ -108,9 +108,10 @@ displayed as follows:
perf script --itrace=ibxwpe -F+flags
The flags are "bcrosyiABExgh" which stand for branch, call, return, conditional,
The flags are "bcrosyiABExghDt" which stand for branch, call, return, conditional,
system, asynchronous, interrupt, transaction abort, trace begin, trace end,
in transaction, VM-entry, and VM-exit respectively.
in transaction, VM-entry, VM-exit, interrupt disabled, and interrupt disable
toggle respectively.
perf script also supports higher level ways to dump instruction traces:
@@ -483,6 +484,30 @@ pwr_evt Enable power events. The power events provide information about
which contains "1" if the feature is supported and
"0" otherwise.
event Enable Event Trace. The events provide information about asynchronous
events.
Support for this feature is indicated by:
/sys/bus/event_source/devices/intel_pt/caps/event_trace
which contains "1" if the feature is supported and
"0" otherwise.
notnt Disable TNT packets. Without TNT packets, it is not possible to walk
executable code to reconstruct control flow, however FUP, TIP, TIP.PGE
and TIP.PGD packets still indicate asynchronous control flow, and (if
return compression is disabled - see noretcomp) return statements.
The advantage of eliminating TNT packets is reducing the size of the
trace and corresponding tracing overhead.
Support for this feature is indicated by:
/sys/bus/event_source/devices/intel_pt/caps/tnt_disable
which contains "1" if the feature is supported and
"0" otherwise.
AUX area sampling option
~~~~~~~~~~~~~~~~~~~~~~~~
@@ -876,6 +901,8 @@ The letters are:
p synthesize "power" events (incl. PSB events)
c synthesize branches events (calls only)
r synthesize branches events (returns only)
o synthesize PEBS-via-PT events
I synthesize Event Trace events
e synthesize tracing error events
d create a debug log
g synthesize a call chain (use with i or x)
@@ -1371,6 +1398,79 @@ There were none.
:17006 17006 [001] 11500.262869216: ffffffff8220116e error_entry+0xe ([guest.kernel.kallsyms]) pushq %rax
Event Trace
-----------
Event Trace records information about asynchronous events, for example interrupts,
faults, VM exits and entries. The information is recorded in CFE and EVD packets,
and also the Interrupt Flag is recorded on the MODE.Exec packet. The CFE packet
contains a type field to identify one of the following:
1 INTR interrupt, fault, exception, NMI
2 IRET interrupt return
3 SMI system management interrupt
4 RSM resume from system management mode
5 SIPI startup interprocessor interrupt
6 INIT INIT signal
7 VMENTRY VM-Entry
8 VMEXIT VM-Entry
9 VMEXIT_INTR VM-Exit due to interrupt
10 SHUTDOWN Shutdown
For more details, refer to the Intel 64 and IA-32 Architectures Software
Developer Manuals (version 076 or later).
The capability to do Event Trace is indicated by the
/sys/bus/event_source/devices/intel_pt/caps/event_trace file.
Event trace is selected for recording using the "event" config term. e.g.
perf record -e intel_pt/event/u uname
Event trace events are output using the --itrace I option. e.g.
perf script --itrace=Ie
perf script displays events containing CFE type, vector and event data,
in the form:
evt: hw int (t) cfe: INTR IP: 1 vector: 3 PFA: 0x8877665544332211
The IP flag indicates if the event binds to an IP, which includes any case where
flow control packet generation is enabled, as well as when CFE packet IP bit is
set.
perf script displays events containing changes to the Interrupt Flag in the form:
iflag: t IFLAG: 1->0 via branch
where "via branch" indicates a branch (interrupt or return from interrupt) and
"non branch" indicates an instruction such as CFI, STI or POPF).
In addition, the current state of the interrupt flag is indicated by the presence
or absence of the "D" (interrupt disabled) perf script flag. If the interrupt
flag is changed, then the "t" flag is also included i.e.
no flag, interrupts enabled IF=1
t interrupts become disabled IF=1 -> IF=0
D interrupts are disabled IF=0
Dt interrupts become enabled IF=0 -> IF=1
The intel-pt-events.py script illustrates how to access Event Trace information
using a Python script.
TNT Disable
-----------
TNT packets are disabled using the "notnt" config term. e.g.
perf record -e intel_pt/notnt/u uname
In that case the --itrace q option is forced because walking executable code
to reconstruct the control flow is not possible.
SEE ALSO
--------

View File

@@ -54,6 +54,16 @@ REPORT OPTIONS
Sorting key. Possible values: acquired (default), contended,
avg_wait, wait_total, wait_max, wait_min.
-F::
--field=<value>::
Output fields. By default it shows all the fields but users can
customize that using this. Possible values: acquired, contended,
avg_wait, wait_total, wait_max, wait_min.
-c::
--combine-locks::
Merge lock instances in the same class (based on name).
INFO OPTIONS
------------

View File

@@ -713,6 +713,40 @@ measurements:
wait -n ${perf_pid}
exit $?
--threads=<spec>::
Write collected trace data into several data files using parallel threads.
<spec> value can be user defined list of masks. Masks separated by colon
define CPUs to be monitored by a thread and affinity mask of that thread
is separated by slash:
<cpus mask 1>/<affinity mask 1>:<cpus mask 2>/<affinity mask 2>:...
CPUs or affinity masks must not overlap with other corresponding masks.
Invalid CPUs are ignored, but masks containing only invalid CPUs are not
allowed.
For example user specification like the following:
0,2-4/2-4:1,5-7/5-7
specifies parallel threads layout that consists of two threads,
the first thread monitors CPUs 0 and 2-4 with the affinity mask 2-4,
the second monitors CPUs 1 and 5-7 with the affinity mask 5-7.
<spec> value can also be a string meaning predefined parallel threads
layout:
cpu - create new data streaming thread for every monitored cpu
core - create new thread to monitor CPUs grouped by a core
package - create new thread to monitor CPUs grouped by a package
numa - create new threed to monitor CPUs grouped by a NUMA domain
Predefined layouts can be used on systems with large number of CPUs in
order not to spawn multiple per-cpu streaming threads but still avoid LOST
events in data directory files. Option specified with no or empty value
defaults to CPU layout. Masks defined or provided by the option value are
filtered through the mask provided by -C option.
include::intel-hybrid.txt[]
--debuginfod[=URLs]::

View File

@@ -129,8 +129,8 @@ OPTIONS
Comma separated list of fields to print. Options are:
comm, tid, pid, time, cpu, event, trace, ip, sym, dso, addr, symoff,
srcline, period, iregs, uregs, brstack, brstacksym, flags, bpf-output,
brstackinsn, brstackoff, callindent, insn, insnlen, synth, phys_addr,
metric, misc, srccode, ipc, data_page_size, code_page_size, ins_lat.
brstackinsn, brstackinsnlen, brstackoff, callindent, insn, insnlen, synth,
phys_addr, metric, misc, srccode, ipc, data_page_size, code_page_size, ins_lat.
Field list can be prepended with the type, trace, sw or hw,
to indicate to which event type the field list applies.
e.g., -F sw:comm,tid,time,ip,sym and -F trace:time,cpu,trace
@@ -195,16 +195,19 @@ OPTIONS
At this point usage is displayed, and perf-script exits.
The flags field is synthesized and may have a value when Instruction
Trace decoding. The flags are "bcrosyiABExgh" which stand for branch,
Trace decoding. The flags are "bcrosyiABExghDt" which stand for branch,
call, return, conditional, system, asynchronous, interrupt,
transaction abort, trace begin, trace end, in transaction, VM-Entry, and VM-Exit
respectively. Known combinations of flags are printed more nicely e.g.
transaction abort, trace begin, trace end, in transaction, VM-Entry,
VM-Exit, interrupt disabled and interrupt disable toggle respectively.
Known combinations of flags are printed more nicely e.g.
"call" for "bc", "return" for "br", "jcc" for "bo", "jmp" for "b",
"int" for "bci", "iret" for "bri", "syscall" for "bcs", "sysret" for "brs",
"async" for "by", "hw int" for "bcyi", "tx abrt" for "bA", "tr strt" for "bB",
"tr end" for "bE", "vmentry" for "bcg", "vmexit" for "bch".
However the "x" flag will be displayed separately in those
cases e.g. "jcc (x)" for a condition branch within a transaction.
However the "x", "D" and "t" flags will be displayed separately in those
cases e.g. "jcc (xD)" for a condition branch within a transaction
with interrupts disabled. Note, interrupts becoming disabled is "t",
whereas interrupts becoming enabled is "Dt".
The callindent field is synthesized and may have a value when
Instruction Trace decoding. For calls and returns, it will display the
@@ -238,6 +241,10 @@ OPTIONS
is printed. This is the full execution path leading to the sample. This is only supported when the
sample was recorded with perf record -b or -j any.
Use brstackinsnlen to print the brstackinsn lenght. For example, you
cant know the next sequential instruction after an unconditional branch unless
you calculate that based on its length.
The brstackoff field will print an offset into a specific dso/binary.
With the metric option perf script can compute metrics for

Some files were not shown because too many files have changed in this diff Show More