Levi Yun
1d18ebcfd3
perf expr: Initialize is_test value in expr__ctx_new()
...
When expr_parse_ctx is allocated by expr_ctx_new(),
expr_scanner_ctx->is_test isn't initialize, so it has garbage value.
this can affects the result of expr__parse() return when it parses
non-exist event literal according to garbage value.
Use calloc instead of malloc in expr_ctx_new() to fix this.
Fixes: 3340a08354 ("perf pmu-events: Fix testing with JEVENTS_ARCH=all")
Reviewed-by: Ian Rogers <irogers@google.com >
Reviewed-by: James Clark <james.clark@linaro.org >
Signed-off-by: Levi Yun <yeoreum.yun@arm.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Namhyung Kim <namhyung@kernel.org >
Link: https://lore.kernel.org/r/20241108143424.819126-1-yeoreum.yun@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2024-12-12 16:12:37 -03:00
Ian Rogers
494c403ff1
perf header: Pass a perf_cpu rather than a PMU to get_cpuid_str
...
On ARM the cpuid is dependent on the core type of the CPU in
question. The PMU was passed for the sake of the CPU map but this
means in places a temporary PMU is created just to pass a CPU
value. Just pass the CPU and fix up the callers.
As there are no longer PMU users in header.h, shuffle forward
declarations earlier to work around build failures.
Reviewed-by: James Clark <james.clark@linaro.org >
Signed-off-by: Ian Rogers <irogers@google.com >
Tested-by: Xu Yang <xu.yang_2@nxp.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Albert Ou <aou@eecs.berkeley.edu >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Alexandre Ghiti <alexghiti@rivosinc.com >
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com >
Cc: Ben Zong-You Xie <ben717@andestech.com >
Cc: Benjamin Gray <bgray@linux.ibm.com >
Cc: Bibo Mao <maobibo@loongson.cn >
Cc: Clément Le Goffic <clement.legoffic@foss.st.com >
Cc: Dima Kogan <dima@secretsauce.net >
Cc: Dr. David Alan Gilbert <linux@treblig.org >
Cc: Huacai Chen <chenhuacai@kernel.org >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.g.garry@oracle.com >
Cc: Kan Liang <kan.liang@linux.intel.com >
Cc: Leo Yan <leo.yan@linux.dev >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Masami Hiramatsu <mhiramat@kernel.org >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Palmer Dabbelt <palmer@dabbelt.com >
Cc: Paul Walmsley <paul.walmsley@sifive.com >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Ravi Bangoria <ravi.bangoria@amd.com >
Cc: Sandipan Das <sandipan.das@amd.com >
Cc: Will Deacon <will@kernel.org >
Cc: Yicong Yang <yangyicong@hisilicon.com >
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-riscv@lists.infradead.org
Link: https://lore.kernel.org/r/20241107162035.52206-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2024-11-16 16:40:30 -03:00
Ian Rogers
069057239a
perf tool_pmu: Move expr literals to tool_pmu
...
Add the expr literals like "#smt_on" as tool events, this allows stat
events to give the values. On my laptop with hyperthreading enabled:
```
$ perf stat -e "has_pmem,num_cores,num_cpus,num_cpus_online,num_dies,num_packages,smt_on,system_tsc_freq" true
Performance counter stats for 'true':
0 has_pmem
8 num_cores
16 num_cpus
16 num_cpus_online
1 num_dies
1 num_packages
1 smt_on
2,496,000,000 system_tsc_freq
0.001113637 seconds time elapsed
0.001218000 seconds user
0.000000000 seconds sys
```
And with hyperthreading disabled:
```
$ perf stat -e "has_pmem,num_cores,num_cpus,num_cpus_online,num_dies,num_packages,smt_on,system_tsc_freq" true
Performance counter stats for 'true':
0 has_pmem
8 num_cores
16 num_cpus
8 num_cpus_online
1 num_dies
1 num_packages
0 smt_on
2,496,000,000 system_tsc_freq
0.000802115 seconds time elapsed
0.000000000 seconds user
0.000806000 seconds sys
```
As zero matters for these values, in stat-display
should_skip_zero_counter only skip the zero value if it is not the
first aggregation index.
The tool event implementations are used in expr but not evaluated as
events for simplicity. Also core_wide isn't made a tool event as it
requires command line parameters.
Signed-off-by: Ian Rogers <irogers@google.com >
Acked-by: Namhyung Kim <namhyung@kernel.org >
Link: https://lore.kernel.org/r/20241002032016.333748-8-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org >
2024-10-10 23:40:32 -07:00
Clément Le Goffic
9aa61d8ecb
perf: parse-events: Fix compilation error while defining DEBUG_PARSER
...
Compiling perf tool with 'DEBUG_PARSER=1' leads to errors:
$> make -C tools/perf PARSER_DEBUG=1 NO_LIBTRACEEVENT=1
...
CC util/expr-flex.o
CC util/expr.o
util/parse-events.c:33:12: error: redundant redeclaration of ‘parse_events_debug’ [-Werror=redundant-decls]
33 | extern int parse_events_debug;
| ^~~~~~~~~~~~~~~~~~
In file included from util/parse-events.c:18:
util/parse-events-bison.h:43:12: note: previous declaration of ‘parse_events_debug’ with type ‘int’
43 | extern int parse_events_debug;
| ^~~~~~~~~~~~~~~~~~
util/expr.c:27:12: error: redundant redeclaration of ‘expr_debug’ [-Werror=redundant-decls]
27 | extern int expr_debug;
| ^~~~~~~~~~
In file included from util/expr.c:11:
util/expr-bison.h:43:12: note: previous declaration of ‘expr_debug’ with type ‘int’
43 | extern int expr_debug;
| ^~~~~~~~~~
cc-1: all warnings being treated as errors
Remove extern declaration from the parse-envents.c file as there is a
conflict with the ones generated using bison and yacc tools from the file
parse-events.[ly].
Signed-off-by: Clément Le Goffic <clement.legoffic@foss.st.com >
Reviewed-by: Ian Rogers <irogers@google.com >
Cc: James Clark <james.clark@arm.com >
Cc: John Garry <john.g.garry@oracle.com >
Signed-off-by: Namhyung Kim <namhyung@kernel.org >
Link: https://lore.kernel.org/r/20240605140453.614862-1-clement.legoffic@foss.st.com
2024-06-06 00:19:36 -07:00
Ian Rogers
6dd76680b9
perf expr: Fix "has_event" function for metric style events
...
Events in metrics cannot use '/' as a separator, it would be
recognized as a divide, so they use '@'. The '@' is recognized in the
metricgroups code and changed to '/', do the same in the has_event
function so that the parsing is only tried without the @s.
Fixes: 4a4a9bf907 ("perf expr: Add has_event function")
Signed-off-by: Ian Rogers <irogers@google.com >
Reviewed-by: Kan Liang <kan.liang@linux.intel.com >
Cc: K Prateek Nayak <kprateek.nayak@amd.com >
Cc: James Clark <james.clark@arm.com >
Cc: Kaige Ye <ye@kaige.org >
Cc: John Garry <john.g.garry@oracle.com >
Signed-off-by: Namhyung Kim <namhyung@kernel.org >
Link: https://lore.kernel.org/r/20240209204947.3873294-3-irogers@google.com
2024-02-13 13:48:06 -08:00
James Clark
3d0f5f456a
perf pmu: Move pmu__find_core_pmu() to pmus.c
...
pmu__find_core_pmu() more logically belongs in pmus.c because it
iterates over all PMUs, so move it to pmus.c
At the same time rename it to perf_pmus__find_core_pmu() to match the
naming convention in this file.
list_prepare_entry() can't be used in perf_pmus__scan_core() anymore now
that it's called from the same compilation unit. This is with -O2
(specifically -O1 -ftree-vrp -finline-functions
-finline-small-functions) which allow the bounds of the array
access to be determined at compile time. list_prepare_entry() subtracts
the offset of the 'list' member in struct perf_pmu from &core_pmus,
which isn't a struct perf_pmu. The compiler sees that pmu results in
&core_pmus - 8 and refuses to compile. At runtime this works because
list_for_each_entry_continue() always adds the offset back again before
dereferencing ->next, but it's technically undefined behavior. With
-fsanitize=undefined an additional warning is generated.
Using list_first_entry_or_null() to get the first entry here avoids
doing &core_pmus - 8 but has the same result and fixes both the compile
warning and the undefined behavior warning. There are other uses of
list_prepare_entry() in pmus.c, but the compiler doesn't seem to be
able to see that they can also be called with &core_pmus, so I won't
change any at this time.
Signed-off-by: James Clark <james.clark@arm.com >
Reviewed-by: Ian Rogers <irogers@google.com >
Reviewed-by: John Garry <john.g.garry@oracle.com >
Cc: Ravi Bangoria <ravi.bangoria@amd.com >
Cc: Eduard Zingerman <eddyz87@gmail.com >
Cc: Will Deacon <will@kernel.org >
Cc: Leo Yan <leo.yan@linaro.org >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Jing Zhang <renyu.zj@linux.alibaba.com >
Cc: Haixin Yu <yuhaixin.yhx@linux.alibaba.com >
Cc: Kan Liang <kan.liang@linux.intel.com >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230913153355.138331-2-james.clark@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org >
2023-09-15 16:46:40 -07:00
Ian Rogers
f0005f1732
perf metric: Add #num_cpus_online literal
...
Returns the number of CPUs online, unlike #num_cpus that returns the
number present.
Add a test of the property.
This will be used in future Intel metrics.
Reviewed-by: Kan Liang <kan.liang@linux.intel.com >
Signed-off-by: Ian Rogers <irogers@google.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Eduard Zingerman <eddyz87@gmail.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jing Zhang <renyu.zj@linux.alibaba.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.g.garry@oracle.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Perry Taylor <perry.taylor@intel.com >
Cc: Peter Zijlstra <peterz@infradead.org >
Link: http://lore.kernel.org/lkml/20230830073026.1829912-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2023-08-30 23:03:03 -03:00
James Clark
9d5da30e4a
perf jevents: Add a new expression builtin strcmp_cpuid_str()
...
This will allow writing formulas that are conditional on a specific
CPU type or CPU version. It calls through to the existing
strcmp_cpuid_str() function in Perf which has a default weak version,
and an arch specific version for x86 and arm64.
The function takes an 'ID' type value, which is a string. But in this
case Arm CPU IDs are hex numbers prefixed with '0x'. metric.py
assumes strings are only used by event names, and that they can't start
with a number ('0'), so an additional change has to be made to the
regex to convert hex numbers back to 'ID' types.
Signed-off-by: James Clark <james.clark@arm.com >
Reviewed-by: John Garry <john.g.garry@oracle.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Eduard Zingerman <eddyz87@gmail.com >
Cc: Haixin Yu <yuhaixin.yhx@linux.alibaba.com >
Cc: Ian Rogers <irogers@google.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: Jing Zhang <renyu.zj@linux.alibaba.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: Kajol Jain <kjain@linux.ibm.com >
Cc: Kan Liang <kan.liang@linux.intel.com >
Cc: Leo Yan <leo.yan@linaro.org >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Nick Forrington <nick.forrington@arm.com >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Rob Herring <robh@kernel.org >
Cc: Sohom Datta <sohomdatta1@gmail.com >
Cc: Will Deacon <will@kernel.org >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230816114841.1679234-5-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2023-08-17 14:12:14 -03:00
Namhyung Kim
c7e97f215a
perf build: Include generated header files properly
...
The flex and bison generate header files from the source. When user
specified a build directory with O= option, it'd generate files under
the directory. The build command has -I option to specify the header
include directory.
But the -I option only affects the files included like <...>. Let's
change the flex and bison headers to use it instead of "...".
Fixes: 80eeb67fe5 ("perf jevents: Program to convert JSON file")
Signed-off-by: Namhyung Kim <namhyung@kernel.org >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: Anup Sharma <anupnewsmail@gmail.com >
Cc: Ian Rogers <irogers@google.com >
Cc: Ingo Molnar <mingo@kernel.org >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230728022447.1323563-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2023-08-03 17:01:24 -03:00
Ian Rogers
4a4a9bf907
perf expr: Add has_event function
...
Some events are dependent on firmware/kernel enablement. Allow such
events to be detected when the metric is parsed so that the metric's
event parsing doesn't fail.
Signed-off-by: Ian Rogers <irogers@google.com >
Tested-by: Namhyung Kim <namhyung@kernel.org >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Eduard Zingerman <eddyz87@gmail.com >
Cc: Sohom Datta <sohomdatta1@gmail.com >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Caleb Biggers <caleb.biggers@intel.com >
Cc: Edward Baker <edward.baker@intel.com >
Cc: Perry Taylor <perry.taylor@intel.com >
Cc: Samantha Alt <samantha.alt@intel.com >
Cc: Weilin Wang <weilin.wang@intel.com >
Cc: Arnaldo Carvalho de Melo <acme@kernel.org >
Cc: Andrii Nakryiko <andrii@kernel.org >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: Jing Zhang <renyu.zj@linux.alibaba.com >
Cc: Kajol Jain <kjain@linux.ibm.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Kan Liang <kan.liang@linux.intel.com >
Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com >
Cc: John Garry <john.g.garry@oracle.com >
Cc: Ingo Molnar <mingo@redhat.com >
Link: https://lore.kernel.org/r/20230623151016.4193660-2-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org >
2023-06-29 22:13:15 -07:00
Arnaldo Carvalho de Melo
a77f8184a0
perf expr: Use zfree() to reduce chances of use after free
...
Do defensive programming by using zfree() to initialize freed pointers
to NULL, so that eventual use after free result in a NULL pointer deref
instead of more subtle behaviour.
Also remove one NULL test before free(), as it accepts a NULL arg and we
get one line shaved not doing it explicitely.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2023-04-12 10:06:11 -03:00
Ian Rogers
c3bf86f11d
perf metrics: Add has_pmem literal
...
Add literal so that if nvdimms aren't installed we can record fewer
events. The file detection mechanism was suggested by Dan Williams
<dan.j.williams@intel.com > in:
https://lore.kernel.org/linux-perf-users/641bbe1eced26_1b98bb29440@dwillia2-xfh.jf.intel.com.notmuch/
Reviewed-by: Kan Liang <kan.liang@linux.intel.com >
Signed-off-by: Ian Rogers <irogers@google.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Caleb Biggers <caleb.biggers@intel.com >
Cc: Dan Williams <dan.j.williams@intel.com >
Cc: Edward Baker <edward.baker@intel.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Perry Taylor <perry.taylor@intel.com >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Samantha Alt <samantha.alt@intel.com >
Cc: Weilin Wang <weilin.wang@intel.com >
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com >
Link: https://lore.kernel.org/r/20230324072218.181880-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2023-04-04 09:39:55 -03:00
Ian Rogers
207f7df727
perf expr: Make the online topology accessible globally
...
Knowing the topology of online CPUs is useful for more than just expr
literals. Move to a global function that caches the value. An
additional upside is that this may also avoid computing the CPU
topology in some situations.
Signed-off-by: Ian Rogers <irogers@google.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com >
Cc: Andrii Nakryiko <andrii@kernel.org >
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com >
Cc: Caleb Biggers <caleb.biggers@intel.com >
Cc: Eduard Zingerman <eddyz87@gmail.com >
Cc: Florian Fischer <florian.fischer@muhq.space >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jing Zhang <renyu.zj@linux.alibaba.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.g.garry@oracle.com >
Cc: Kajol Jain <kjain@linux.ibm.com >
Cc: Kan Liang <kan.liang@linux.intel.com >
Cc: Leo Yan <leo.yan@linaro.org >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Perry Taylor <perry.taylor@intel.com >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Ravi Bangoria <ravi.bangoria@amd.com >
Cc: Sandipan Das <sandipan.das@amd.com >
Cc: Sean Christopherson <seanjc@google.com >
Cc: Stephane Eranian <eranian@google.com >
Cc: Suzuki Poulouse <suzuki.poulose@arm.com >
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com >
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-stm32@st-md-mailman.stormreply.com
Link: https://lore.kernel.org/r/20230219092848.639226-8-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2023-02-19 08:03:12 -03:00
Jing Zhang
acef233b7c
perf pmu: Add #slots literal support for arm64
...
The slots in each architecture may be different, so add #slots literal
to obtain the slots of different architectures, and the #slots can be
applied in the metric. Currently, The #slots just support for arm64,
and other architectures will return NAN.
On arm64, the value of slots is from the register PMMIR_EL1.SLOT, which
I can read in /sys/bus/event_source/device/armv8_pmuv3_*/caps/slots.
PMMIR_EL1.SLOT might read as zero if the PMU version is lower than
ID_AA64DFR0_EL1_PMUVer_V3P4 or the STALL_SLOT event is not implemented.
Reviewed-by: John Garry <john.g.garry@oracle.com >
Signed-off-by: Jing Zhang <renyu.zj@linux.alibaba.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Andrew Kilroy <andrew.kilroy@arm.com >
Cc: Ian Rogers <irogers@google.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: Leo Yan <leo.yan@linaro.org >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Shuai Xue <xueshuai@linux.alibaba.com >
Cc: Will Deacon <will@kernel.org >
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com >
Cc: Zhuo Song <zhuo.song@linux.alibaba.com >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/1673940573-90503-2-git-send-email-renyu.zj@linux.alibaba.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2023-01-19 09:38:55 -03:00
Arnaldo Carvalho de Melo
1a931707ad
Merge remote-tracking branch 'torvalds/master' into perf/core
...
To resolve a trivial merge conflict with c302378bc1 ("libbpf:
Hashmap interface update to allow both long and void* keys/values"),
where a function present upstream was removed in the perf tools
development tree.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-12-16 09:53:53 -03:00
Ian Rogers
bd560973c5
perf expr: Tidy hashmap dependency
...
hashmap.h comes from libbpf but isn't installed with its
headers. Always use the header file of the code in util. Change the
hashmap.h dependency in expr.h to a forward declaration, add the
necessary header file includes in the C files.
Signed-off-by: Ian Rogers <irogers@google.com >
Acked-by: Namhyung Kim <namhyung@kernel.org >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Masahiro Yamada <masahiroy@kernel.org >
Cc: Nick Desaulniers <ndesaulniers@google.com >
Cc: Nicolas Schier <nicolas@fjasle.eu >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Stephane Eranian <eranian@google.com >
Cc: bpf@vger.kernel.org
Link: http://lore.kernel.org/lkml/20221109184914.1357295-12-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-11-16 12:17:15 -03:00
Eduard Zingerman
c302378bc1
libbpf: Hashmap interface update to allow both long and void* keys/values
...
An update for libbpf's hashmap interface from void* -> void* to a
polymorphic one, allowing both long and void* keys and values.
This simplifies many use cases in libbpf as hashmaps there are mostly
integer to integer.
Perf copies hashmap implementation from libbpf and has to be
updated as well.
Changes to libbpf, selftests/bpf and perf are packed as a single
commit to avoid compilation issues with any future bisect.
Polymorphic interface is acheived by hiding hashmap interface
functions behind auxiliary macros that take care of necessary
type casts, for example:
#define hashmap_cast_ptr(p) \
({ \
_Static_assert((p) == NULL || sizeof(*(p)) == sizeof(long),\
#p " pointee should be a long-sized integer or a pointer"); \
(long *)(p); \
})
bool hashmap_find(const struct hashmap *map, long key, long *value);
#define hashmap__find(map, key, value) \
hashmap_find((map), (long)(key), hashmap_cast_ptr(value))
- hashmap__find macro casts key and value parameters to long
and long* respectively
- hashmap_cast_ptr ensures that value pointer points to a memory
of appropriate size.
This hack was suggested by Andrii Nakryiko in [1].
This is a follow up for [2].
[1] https://lore.kernel.org/bpf/CAEf4BzZ8KFneEJxFAaNCCFPGqp20hSpS2aCj76uRk3-qZUH5xg@mail.gmail.com/
[2] https://lore.kernel.org/bpf/af1facf9-7bc8-8a3d-0db4-7b3f333589a2@meta.com/T/#m65b28f1d6d969fcd318b556db6a3ad499a42607d
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com >
Signed-off-by: Andrii Nakryiko <andrii@kernel.org >
Link: https://lore.kernel.org/bpf/20221109142611.879983-2-eddyz87@gmail.com
2022-11-09 20:45:14 -08:00
Ian Rogers
715b824f4a
perf expr: Remove jevents case workaround
...
jevents.py no longer lowercases metrics and altering the case can cause
hashmap lookups to fail, so remove.
Signed-off-by: Ian Rogers <irogers@google.com >
Cc: Ahmad Yasin <ahmad.yasin@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: Caleb Biggers <caleb.biggers@intel.com >
Cc: Florian Fischer <florian.fischer@muhq.space >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kajol Jain <kjain@linux.ibm.com >
Cc: Kan Liang <kan.liang@linux.intel.com >
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Miaoqian Lin <linmq006@gmail.com >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Perry Taylor <perry.taylor@intel.com >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Samantha Alt <samantha.alt@intel.com >
Cc: Stephane Eranian <eranian@google.com >
Cc: Thomas Richter <tmricht@linux.ibm.com >
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com >
Link: https://lore.kernel.org/r/20221004021612.325521-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-10-06 08:03:52 -03:00
Ian Rogers
1725e9cd32
perf metrics: Wire up core_wide
...
Pass state necessary for core_wide into the expression parser. Add
system_wide and user_requested_cpu_list to perf_stat_config to make it
available at display time. evlist isn't used as the
evlist__create_maps, that computes user_requested_cpus, needs the list
of events which is generated by the metric.
Signed-off-by: Ian Rogers <irogers@google.com >
Cc: Ahmad Yasin <ahmad.yasin@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: Caleb Biggers <caleb.biggers@intel.com >
Cc: Florian Fischer <florian.fischer@muhq.space >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kan Liang <kan.liang@linux.intel.com >
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Miaoqian Lin <linmq006@gmail.com >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Perry Taylor <perry.taylor@intel.com >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Stephane Eranian <eranian@google.com >
Cc: Thomas Richter <tmricht@linux.ibm.com >
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com >
Link: https://lore.kernel.org/r/20220831174926.579643-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-10-04 08:55:22 -03:00
Ian Rogers
09b73fe9e3
perf smt: Compute SMT from topology
...
The topology records sibling threads. Rather than computing SMT using
siblings in sysfs, reuse the values in topology. This only applies
when the file smt/active isn't available.
Signed-off-by: Ian Rogers <irogers@google.com >
Cc: Ahmad Yasin <ahmad.yasin@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: Caleb Biggers <caleb.biggers@intel.com >
Cc: Florian Fischer <florian.fischer@muhq.space >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kan Liang <kan.liang@linux.intel.com >
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Miaoqian Lin <linmq006@gmail.com >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Perry Taylor <perry.taylor@intel.com >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Stephane Eranian <eranian@google.com >
Cc: Thomas Richter <tmricht@linux.ibm.com >
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com >
Link: https://lore.kernel.org/r/20220831174926.579643-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-10-04 08:55:22 -03:00
Ian Rogers
1a6abdde13
perf expr: Move the scanner_ctx into the parse_ctx
...
We currently maintain the two independently and copy from one to the
other. This is a burden when additional scanner context values are
necessary, so combine them.
Signed-off-by: Ian Rogers <irogers@google.com >
Cc: Ahmad Yasin <ahmad.yasin@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: Caleb Biggers <caleb.biggers@intel.com >
Cc: Florian Fischer <florian.fischer@muhq.space >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kan Liang <kan.liang@linux.intel.com >
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Miaoqian Lin <linmq006@gmail.com >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Perry Taylor <perry.taylor@intel.com >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Stephane Eranian <eranian@google.com >
Cc: Thomas Richter <tmricht@linux.ibm.com >
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com >
Link: https://lore.kernel.org/r/20220831174926.579643-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-10-04 08:55:22 -03:00
Kan Liang
bc2373a58a
perf tsc: Add arch TSC frequency information
...
The TSC frequency information is required for the event metrics with the
literal, system_tsc_freq. For the newer Intel platform, the TSC
frequency information can be retrieved from the CPUID leaf 0x15. If the
TSC frequency information isn't present the /proc/cpuinfo approach is
used.
Refactor cpuid() for this use. Note, the previous stack pushing/popping
approach was broken on x86-64 that has stack red zones that would be
clobbered.
Committer testing:
Before:
$ perf record sleep 0.0001
[ perf record: Woken up 1 times to write data ]
$ perf report --header-only |& grep cpuid
# cpuid : AuthenticAMD,25,33,0
$
After the patch:
$ perf record sleep 0.0001
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.002 MB perf.data (8 samples) ]
$ perf report --header-only |& grep cpuid
# cpuid : AuthenticAMD,25,33,0
$
Signed-off-by: Kan Liang <kan.liang@linux.intel.com >
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: Caleb Biggers <caleb.biggers@intel.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Perry Taylor <perry.taylor@intel.com >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Stephane Eranian <eranian@google.com >
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com >
Link: https://lore.kernel.org/r/20220718164312.3994191-2-irogers@google.com
Signed-off-by: Ian Rogers <irogers@google.com >
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-07-25 12:28:00 -03:00
Ian Rogers
c0dd94558d
perf pmu-events: Don't lower case MetricExpr
...
This patch changes MetricExpr to be written out in the same case. This
enables events in metrics to use modifiers like 'G' which currently
yield parse errors when made lower case. To keep tests passing the
literal #smt_on is compared in a non-case sensitive way - #SMT_on is
present in at least SkylakeX metrics.
Signed-off-by: Ian Rogers <irogers@google.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: John Garry <john.garry@huawei.com >
Cc: Kajol Jain <kjain@linux.ibm.com >
Cc: Kan Liang <kan.liang@linux.intel.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Paul Clarke <pc@us.ibm.com >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Stephane Eranian <eranian@google.com >
Link: http://lore.kernel.org/lkml/20211126071305.3733878-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-01-12 15:02:48 -03:00
Ian Rogers
f56ef30a31
perf expr: Add debug logging for literals
...
Useful for diagnosing problems with metrics.
Signed-off-by: Ian Rogers <irogers@google.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: John Garry <john.garry@huawei.com >
Cc: Kajol Jain <kjain@linux.ibm.com >
Cc: Kan Liang <kan.liang@linux.intel.com >
Cc: Konstantin Khlebnikov <koct9i@gmail.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Paul Clarke <pc@us.ibm.com >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Stephane Eranian <eranian@google.com >
Link: http://lore.kernel.org/lkml/20211124001231.3277836-1-irogers@google.com
[ Fixed up perf_cpu conflict, i.e. we need to append ".cpu" to cpu__max_present_cpu() result ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-01-12 14:50:28 -03:00
Ian Rogers
6d18804b96
perf cpumap: Give CPUs their own type
...
A common problem is confusing CPU map indices with the CPU, by wrapping
the CPU with a struct then this is avoided. This approach is similar to
atomic_t.
Committer notes:
To make it build with BUILD_BPF_SKEL=1 these files needed the
conversions to 'struct perf_cpu' usage:
tools/perf/util/bpf_counter.c
tools/perf/util/bpf_counter_cgroup.c
tools/perf/util/bpf_ftrace.c
Also perf_env__get_cpu() was removed back in "perf cpumap: Switch
cpu_map__build_map to cpu function".
Additionally these needed to be fixed for the ARM builds to complete:
tools/perf/arch/arm/util/cs-etm.c
tools/perf/arch/arm64/util/pmu.c
Suggested-by: John Garry <john.garry@huawei.com >
Signed-off-by: Ian Rogers <irogers@google.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: Kajol Jain <kjain@linux.ibm.com >
Cc: Kan Liang <kan.liang@linux.intel.com >
Cc: Leo Yan <leo.yan@linaro.org >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mathieu Poirier <mathieu.poirier@linaro.org >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Paul Clarke <pc@us.ibm.com >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Riccardo Mancini <rickyman7@gmail.com >
Cc: Stephane Eranian <eranian@google.com >
Cc: Suzuki Poulouse <suzuki.poulose@arm.com >
Cc: Vineet Singh <vineet.singh@intel.com >
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: zhengjun.xing@intel.com
Link: https://lore.kernel.org/r/20220105061351.120843-49-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-01-12 14:28:23 -03:00