mirror of
https://github.com/armbian/linux-cix.git
synced 2026-01-06 12:30:45 -08:00
7aca0ac4792e6cb0f35ef97bfcb39b1663a92fb7
899 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
2706053173 |
bpf: Rework process_dynptr_func
Recently, user ringbuf support introduced a PTR_TO_DYNPTR register type
for use in callback state, because in case of user ringbuf helpers,
there is no dynptr on the stack that is passed into the callback. To
reflect such a state, a special register type was created.
However, some checks have been bypassed incorrectly during the addition
of this feature. First, for arg_type with MEM_UNINIT flag which
initialize a dynptr, they must be rejected for such register type.
Secondly, in the future, there are plans to add dynptr helpers that
operate on the dynptr itself and may change its offset and other
properties.
In all of these cases, PTR_TO_DYNPTR shouldn't be allowed to be passed
to such helpers, however the current code simply returns 0.
The rejection for helpers that release the dynptr is already handled.
For fixing this, we take a step back and rework existing code in a way
that will allow fitting in all classes of helpers and have a coherent
model for dealing with the variety of use cases in which dynptr is used.
First, for ARG_PTR_TO_DYNPTR, it can either be set alone or together
with a DYNPTR_TYPE_* constant that denotes the only type it accepts.
Next, helpers which initialize a dynptr use MEM_UNINIT to indicate this
fact. To make the distinction clear, use MEM_RDONLY flag to indicate
that the helper only operates on the memory pointed to by the dynptr,
not the dynptr itself. In C parlance, it would be equivalent to taking
the dynptr as a point to const argument.
When either of these flags are not present, the helper is allowed to
mutate both the dynptr itself and also the memory it points to.
Currently, the read only status of the memory is not tracked in the
dynptr, but it would be trivial to add this support inside dynptr state
of the register.
With these changes and renaming PTR_TO_DYNPTR to CONST_PTR_TO_DYNPTR to
better reflect its usage, it can no longer be passed to helpers that
initialize a dynptr, i.e. bpf_dynptr_from_mem, bpf_ringbuf_reserve_dynptr.
A note to reviewers is that in code that does mark_stack_slots_dynptr,
and unmark_stack_slots_dynptr, we implicitly rely on the fact that
PTR_TO_STACK reg is the only case that can reach that code path, as one
cannot pass CONST_PTR_TO_DYNPTR to helpers that don't set MEM_RDONLY. In
both cases such helpers won't be setting that flag.
The next patch will add a couple of selftest cases to make sure this
doesn't break.
Fixes:
|
||
|
|
4f4ac4d910 |
tools: add IFLA_XFRM_COLLECT_METADATA to uapi/linux/if_link.h
Needed for XFRM metadata tests. Signed-off-by: Eyal Birger <eyal.birger@gmail.com> Link: https://lore.kernel.org/r/20221203084659.1837829-4-eyal.birger@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> |
||
|
|
72b43bde38 |
bpf: Update bpf_{g,s}etsockopt() documentation
* append missing optnames to the end * simplify bpf_getsockopt()'s doc Signed-off-by: Ji Rongfeng <SikoJobs@outlook.com> Link: https://lore.kernel.org/r/DU0P192MB15479B86200B1216EC90E162D6099@DU0P192MB1547.EURP192.PROD.OUTLOOK.COM Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> |
||
|
|
f0c5941ff5 |
bpf: Support bpf_list_head in map values
Add the support on the map side to parse, recognize, verify, and build
metadata table for a new special field of the type struct bpf_list_head.
To parameterize the bpf_list_head for a certain value type and the
list_node member it will accept in that value type, we use BTF
declaration tags.
The definition of bpf_list_head in a map value will be done as follows:
struct foo {
struct bpf_list_node node;
int data;
};
struct map_value {
struct bpf_list_head head __contains(foo, node);
};
Then, the bpf_list_head only allows adding to the list 'head' using the
bpf_list_node 'node' for the type struct foo.
The 'contains' annotation is a BTF declaration tag composed of four
parts, "contains:name:node" where the name is then used to look up the
type in the map BTF, with its kind hardcoded to BTF_KIND_STRUCT during
the lookup. The node defines name of the member in this type that has
the type struct bpf_list_node, which is actually used for linking into
the linked list. For now, 'kind' part is hardcoded as struct.
This allows building intrusive linked lists in BPF, using container_of
to obtain pointer to entry, while being completely type safe from the
perspective of the verifier. The verifier knows exactly the type of the
nodes, and knows that list helpers return that type at some fixed offset
where the bpf_list_node member used for this list exists. The verifier
also uses this information to disallow adding types that are not
accepted by a certain list.
For now, no elements can be added to such lists. Support for that is
coming in future patches, hence draining and freeing items is done with
a TODO that will be resolved in a future patch.
Note that the bpf_list_head_free function moves the list out to a local
variable under the lock and releases it, doing the actual draining of
the list items outside the lock. While this helps with not holding the
lock for too long pessimizing other concurrent list operations, it is
also necessary for deadlock prevention: unless every function called in
the critical section would be notrace, a fentry/fexit program could
attach and call bpf_map_update_elem again on the map, leading to the
same lock being acquired if the key matches and lead to a deadlock.
While this requires some special effort on part of the BPF programmer to
trigger and is highly unlikely to occur in practice, it is always better
if we can avoid such a condition.
While notrace would prevent this, doing the draining outside the lock
has advantages of its own, hence it is used to also fix the deadlock
related problem.
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20221114191547.1694267-5-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
||
|
|
f4c4ca70de |
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Andrii Nakryiko says: ==================== bpf-next 2022-11-11 We've added 49 non-merge commits during the last 9 day(s) which contain a total of 68 files changed, 3592 insertions(+), 1371 deletions(-). The main changes are: 1) Veristat tool improvements to support custom filtering, sorting, and replay of results, from Andrii Nakryiko. 2) BPF verifier precision tracking fixes and improvements, from Andrii Nakryiko. 3) Lots of new BPF documentation for various BPF maps, from Dave Tucker, Donald Hunter, Maryam Tahhan, Bagas Sanjaya. 4) BTF dedup improvements and libbpf's hashmap interface clean ups, from Eduard Zingerman. 5) Fix veth driver panic if XDP program is attached before veth_open, from John Fastabend. 6) BPF verifier clean ups and fixes in preparation for follow up features, from Kumar Kartikeya Dwivedi. 7) Add access to hwtstamp field from BPF sockops programs, from Martin KaFai Lau. 8) Various fixes for BPF selftests and samples, from Artem Savkov, Domenico Cerasuolo, Kang Minchul, Rong Tao, Yang Jihong. 9) Fix redirection to tunneling device logic, preventing skb->len == 0, from Stanislav Fomichev. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (49 commits) selftests/bpf: fix veristat's singular file-or-prog filter selftests/bpf: Test skops->skb_hwtstamp selftests/bpf: Fix incorrect ASSERT in the tcp_hdr_options test bpf: Add hwtstamp field for the sockops prog selftests/bpf: Fix xdp_synproxy compilation failure in 32-bit arch bpf, docs: Document BPF_MAP_TYPE_ARRAY docs/bpf: Document BPF map types QUEUE and STACK docs/bpf: Document BPF ARRAY_OF_MAPS and HASH_OF_MAPS docs/bpf: Document BPF_MAP_TYPE_CPUMAP map docs/bpf: Document BPF_MAP_TYPE_LPM_TRIE map libbpf: Hashmap.h update to fix build issues using LLVM14 bpf: veth driver panics when xdp prog attached before veth_open selftests: Fix test group SKIPPED result selftests/bpf: Tests for btf_dedup_resolve_fwds libbpf: Resolve unambigous forward declarations libbpf: Hashmap interface update to allow both long and void* keys/values samples/bpf: Fix sockex3 error: Missing BPF prog type selftests/bpf: Fix u32 variable compared with less than zero Documentation: bpf: Escape underscore in BPF type name prefix selftests/bpf: Use consistent build-id type for liburandom_read.so ... ==================== Link: https://lore.kernel.org/r/20221111233733.1088228-1-andrii@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
|
|
9bb053490f |
bpf: Add hwtstamp field for the sockops prog
The bpf-tc prog has already been able to access the skb_hwtstamps(skb)->hwtstamp. This patch extends the same hwtstamp access to the sockops prog. In sockops, the skb is also available to the bpf prog during the BPF_SOCK_OPS_PARSE_HDR_OPT_CB event. There is a use case that the hwtstamp will be useful to the sockops prog to better measure the one-way-delay when the sender has put the tx timestamp in the tcp header option. Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20221107230420.4192307-2-martin.lau@linux.dev |
||
|
|
966a9b4903 |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
drivers/net/can/pch_can.c |
||
|
|
a778f5d46b |
tools/headers: Pull in stddef.h to uapi to fix BPF selftests build in CI
With recent sync of linux/in.h tools/include headers are now relying on
__DECLARE_FLEX_ARRAY macro, which isn't itself defined inside
tools/include headers anywhere and is instead assumed to be present in
system-wide UAPI header. This breaks isolated environments that don't
have kernel UAPI headers installed system-wide, like BPF CI ([0]).
To fix this, bring in include/uapi/linux/stddef.h into tools/include.
We can't just copy/paste it, though, it has to be processed with
scripts/headers_install.sh, which has a dependency on scripts/unifdef.
So the full command to (re-)generate stddef.h for inclusion into
tools/include directory is:
$ make scripts_unifdef && \
cp $KBUILD_OUTPUT/scripts/unifdef scripts/ && \
scripts/headers_install.sh include/uapi/linux/stddef.h tools/include/uapi/linux/stddef.h
This assumes KBUILD_OUTPUT envvar is set and used for out-of-tree builds.
[0] https://github.com/kernel-patches/bpf/actions/runs/3379432493/jobs/5610982609
Fixes:
|
||
|
|
b54a0d4094 |
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:
====================
bpf-next 2022-11-02
We've added 70 non-merge commits during the last 14 day(s) which contain
a total of 96 files changed, 3203 insertions(+), 640 deletions(-).
The main changes are:
1) Make cgroup local storage available to non-cgroup attached BPF programs
such as tc BPF ones, from Yonghong Song.
2) Avoid unnecessary deadlock detection and failures wrt BPF task storage
helpers, from Martin KaFai Lau.
3) Add LLVM disassembler as default library for dumping JITed code
in bpftool, from Quentin Monnet.
4) Various kprobe_multi_link fixes related to kernel modules,
from Jiri Olsa.
5) Optimize x86-64 JIT with emitting BMI2-based shift instructions,
from Jie Meng.
6) Improve BPF verifier's memory type compatibility for map key/value
arguments, from Dave Marchevsky.
7) Only create mmap-able data section maps in libbpf when data is exposed
via skeletons, from Andrii Nakryiko.
8) Add an autoattach option for bpftool to load all object assets,
from Wang Yufen.
9) Various memory handling fixes for libbpf and BPF selftests,
from Xu Kuohai.
10) Initial support for BPF selftest's vmtest.sh on arm64,
from Manu Bretelle.
11) Improve libbpf's BTF handling to dedup identical structs,
from Alan Maguire.
12) Add BPF CI and denylist documentation for BPF selftests,
from Daniel Müller.
13) Check BPF cpumap max_entries before doing allocation work,
from Florian Lehner.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (70 commits)
samples/bpf: Fix typo in README
bpf: Remove the obsolte u64_stats_fetch_*_irq() users.
bpf: check max_entries before allocating memory
bpf: Fix a typo in comment for DFS algorithm
bpftool: Fix spelling mistake "disasembler" -> "disassembler"
selftests/bpf: Fix bpftool synctypes checking failure
selftests/bpf: Panic on hard/soft lockup
docs/bpf: Add documentation for new cgroup local storage
selftests/bpf: Add test cgrp_local_storage to DENYLIST.s390x
selftests/bpf: Add selftests for new cgroup local storage
selftests/bpf: Fix test test_libbpf_str/bpf_map_type_str
bpftool: Support new cgroup local storage
libbpf: Support new cgroup local storage
bpf: Implement cgroup storage available to non-cgroup-attached bpf progs
bpf: Refactor some inode/task/sk storage functions for reuse
bpf: Make struct cgroup btf id global
selftests/bpf: Tracing prog can still do lookup under busy lock
selftests/bpf: Ensure no task storage failure for bpf_lsm.s prog due to deadlock detection
bpf: Add new bpf_task_storage_delete proto with no deadlock detection
bpf: bpf_task_storage_delete_recur does lookup first before the deadlock check
...
====================
Link: https://lore.kernel.org/r/20221102062120.5724-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
||
|
|
31f1aa4f74 |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
drivers/net/can/usb/kvaser_usb/kvaser_usb_leaf.c |
||
|
|
831c05a762 |
tools headers UAPI: Sync linux/perf_event.h with the kernel sources
To pick the changes in: |
||
|
|
c4bcfb38a9 |
bpf: Implement cgroup storage available to non-cgroup-attached bpf progs
Similar to sk/inode/task storage, implement similar cgroup local storage.
There already exists a local storage implementation for cgroup-attached
bpf programs. See map type BPF_MAP_TYPE_CGROUP_STORAGE and helper
bpf_get_local_storage(). But there are use cases such that non-cgroup
attached bpf progs wants to access cgroup local storage data. For example,
tc egress prog has access to sk and cgroup. It is possible to use
sk local storage to emulate cgroup local storage by storing data in socket.
But this is a waste as it could be lots of sockets belonging to a particular
cgroup. Alternatively, a separate map can be created with cgroup id as the key.
But this will introduce additional overhead to manipulate the new map.
A cgroup local storage, similar to existing sk/inode/task storage,
should help for this use case.
The life-cycle of storage is managed with the life-cycle of the
cgroup struct. i.e. the storage is destroyed along with the owning cgroup
with a call to bpf_cgrp_storage_free() when cgroup itself
is deleted.
The userspace map operations can be done by using a cgroup fd as a key
passed to the lookup, update and delete operations.
Typically, the following code is used to get the current cgroup:
struct task_struct *task = bpf_get_current_task_btf();
... task->cgroups->dfl_cgrp ...
and in structure task_struct definition:
struct task_struct {
....
struct css_set __rcu *cgroups;
....
}
With sleepable program, accessing task->cgroups is not protected by rcu_read_lock.
So the current implementation only supports non-sleepable program and supporting
sleepable program will be the next step together with adding rcu_read_lock
protection for rcu tagged structures.
Since map name BPF_MAP_TYPE_CGROUP_STORAGE has been used for old cgroup local
storage support, the new map name BPF_MAP_TYPE_CGRP_STORAGE is used
for cgroup storage available to non-cgroup-attached bpf programs. The old
cgroup storage supports bpf_get_local_storage() helper to get the cgroup data.
The new cgroup storage helper bpf_cgrp_storage_get() can provide similar
functionality. While old cgroup storage pre-allocates storage memory, the new
mechanism can also pre-allocate with a user space bpf_map_update_elem() call
to avoid potential run-time memory allocation failure.
Therefore, the new cgroup storage can provide all functionality w.r.t.
the old one. So in uapi bpf.h, the old BPF_MAP_TYPE_CGROUP_STORAGE is alias to
BPF_MAP_TYPE_CGROUP_STORAGE_DEPRECATED to indicate the old cgroup storage can
be deprecated since the new one can provide the same functionality.
Acked-by: David Vernet <void@manifault.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20221026042850.673791-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
||
|
|
49c75d30b0 |
tools headers uapi: Sync linux/stat.h with the kernel sources
To pick the changes from:
|
||
|
|
036b8f5b89 |
tools headers uapi: Update linux/in.h copy
To get the changes in: |
||
|
|
96917bb3a3 |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
include/linux/net.h |
||
|
|
9aec606c16 |
tools: include: sync include/api/linux/kvm.h
Provide a definition of KVM_CAP_DIRTY_LOG_RING_ACQ_REL.
Fixes:
|
||
|
|
3566a79c9e |
Merge tag 'for-netdev' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says: ==================== pull-request: bpf-next 2022-10-18 We've added 33 non-merge commits during the last 14 day(s) which contain a total of 31 files changed, 874 insertions(+), 538 deletions(-). The main changes are: 1) Add RCU grace period chaining to BPF to wait for the completion of access from both sleepable and non-sleepable BPF programs, from Hou Tao & Paul E. McKenney. 2) Improve helper UAPI by explicitly defining BPF_FUNC_xxx integer values. In the wild we have seen OS vendors doing buggy backports where helper call numbers mismatched. This is an attempt to make backports more foolproof, from Andrii Nakryiko. 3) Add libbpf *_opts API-variants for bpf_*_get_fd_by_id() functions, from Roberto Sassu. 4) Fix libbpf's BTF dumper for structs with padding-only fields, from Eduard Zingerman. 5) Fix various libbpf bugs which have been found from fuzzing with malformed BPF object files, from Shung-Hsi Yu. 6) Clean up an unneeded check on existence of SSE2 in BPF x86-64 JIT, from Jie Meng. 7) Fix various ASAN bugs in both libbpf and selftests when running the BPF selftest suite on arm64, from Xu Kuohai. 8) Fix missing bpf_iter_vma_offset__destroy() call in BPF iter selftest and use in-skeleton link pointer to remove an explicit bpf_link__destroy(), from Jiri Olsa. 9) Fix BPF CI breakage by pointing to iptables-legacy instead of relying on symlinked iptables which got upgraded to iptables-nft, from Martin KaFai Lau. 10) Minor BPF selftest improvements all over the place, from various others. * tag 'for-netdev' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (33 commits) bpf/docs: Update README for most recent vmtest.sh bpf: Use rcu_trace_implies_rcu_gp() for program array freeing bpf: Use rcu_trace_implies_rcu_gp() in local storage map bpf: Use rcu_trace_implies_rcu_gp() in bpf memory allocator rcu-tasks: Provide rcu_trace_implies_rcu_gp() selftests/bpf: Use sys_pidfd_open() helper when possible libbpf: Fix null-pointer dereference in find_prog_by_sec_insn() libbpf: Deal with section with no data gracefully libbpf: Use elf_getshdrnum() instead of e_shnum selftest/bpf: Fix error usage of ASSERT_OK in xdp_adjust_tail.c selftests/bpf: Fix error failure of case test_xdp_adjust_tail_grow selftest/bpf: Fix memory leak in kprobe_multi_test selftests/bpf: Fix memory leak caused by not destroying skeleton libbpf: Fix memory leak in parse_usdt_arg() libbpf: Fix use-after-free in btf_dump_name_dups selftests/bpf: S/iptables/iptables-legacy/ in the bpf_nf and xdp_synproxy test selftests/bpf: Alphabetize DENYLISTs selftests/bpf: Add tests for _opts variants of bpf_*_get_fd_by_id() libbpf: Introduce bpf_link_get_fd_by_id_opts() libbpf: Introduce bpf_btf_get_fd_by_id_opts() ... ==================== Link: https://lore.kernel.org/r/20221018210631.11211-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
|
|
d465bff130 |
Merge tag 'perf-tools-for-v6.1-1-2022-10-07' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools updates from Arnaldo Carvalho de Melo:
- Add support for AMD on 'perf mem' and 'perf c2c', the kernel
enablement patches went via tip.
Example:
$ sudo perf mem record -- -c 10000
^C[ perf record: Woken up 227 times to write data ]
[ perf record: Captured and wrote 58.760 MB perf.data (836978 samples) ]
$ sudo perf mem report -F mem,sample,snoop
Samples: 836K of event 'ibs_op//', Event count (approx.): 8418762
Memory access Samples Snoop
N/A 700620 N/A
L1 hit 126675 N/A
L2 hit 424 N/A
L3 hit 664 HitM
L3 hit 10 N/A
Local RAM hit 2 N/A
Remote RAM (1 hop) hit 8558 N/A
Remote Cache (1 hop) hit 3 N/A
Remote Cache (1 hop) hit 2 HitM
Remote Cache (2 hops) hit 10 HitM
Remote Cache (2 hops) hit 6 N/A
Uncached hit 4 N/A
$
- "perf lock" improvements:
- Add -E/--entries option to limit the number of entries to
display, say to ask for just the top 5 contended locks.
- Add -q/--quiet option to suppress header and debug messages.
- Add a 'perf test' kernel lock contention entry to test 'perf
lock'.
- "perf lock contention" improvements:
- Ask BPF's bpf_get_stackid() to skip some callchain entries.
The ones closer to the tooling are bpf related and not that
interesting, the ones calling the locking function are the ones
we're interested in, example of a full, unskipped callstack:
- Allow changing the callstack depth and number of entries to skip.
1 10.74 us 10.74 us 10.74 us spinlock __bpf_trace_contention_begin+0xb
0xffffffffc03b5c47 bpf_prog_bf07ae9e2cbd02c5_contention_begin+0x117
0xffffffffc03b5c47 bpf_prog_bf07ae9e2cbd02c5_contention_begin+0x117
0xffffffffbb8b8e75 bpf_trace_run2+0x35
0xffffffffbb7eab9b __bpf_trace_contention_begin+0xb
0xffffffffbb7ebe75 queued_spin_lock_slowpath+0x1f5
0xffffffffbc1c26ff _raw_spin_lock+0x1f
0xffffffffbb841015 tick_do_update_jiffies64+0x25
0xffffffffbb8409ee tick_irq_enter+0x9e
- Show full callstack in verbose mode (-v option), sometimes this
is desirable instead of showing just one callstack entry.
- Allow multiple time ranges in 'perf record --delay' to help in
reducing the amount of data collected from hardware tracing (Intel
PT, etc) when there is a rough idea of periods of time where events
of interest take time.
- Add Intel PT to record only decoder debug messages when error
happens.
- Improve layout of Intel PT man page.
- Add new branch types: alignment, data and inst faults and arch
specific ones, such as fiq, debug_halt, debug_exit, debug_inst and
debug_data on arm64.
Kernel enablement went thru the tip tree.
- Fix 'perf probe' error log check in 'perf test' when no debuginfo is
available.
- Fix 'perf stat' aggregation mode logic, it should be looking at the
CPU not at the core number.
- Fix flags parsing in 'perf trace' filters.
- Introduce compact encoding of CPU range encoding on perf.data, to
avoid having a bitmap with all the CPUs.
- Improvements to the 'perf stat' metrics, including adding
"core_wide", and computing "smt" from the CPU topology.
- Add support to the new PERF_FORMAT_LOST perf_event_attr.read_format,
that allows tooling to ask for the precise number of lost samples for
a given event.
- Add 'addr' sort key to see just the address of sampled instructions:
$ perf record -o- true | perf report -i- -s addr
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.000 MB - ]
# Samples: 12 of event 'cycles:u'
# Event count (approx.): 252512
#
# Overhead Address
# ........ ..................
42.96% 0x7f96f08443d7
29.55% 0x7f96f0859b50
14.76% 0x7f96f0852e02
8.30% 0x7f96f0855028
4.43% 0xffffffff8de01087
perf annotate: Toggle full address <-> offset display
- Add 'f' hotkey to the 'perf annotate' TUI interface when in
'disassembler output' mode ('o' hotkey) to toggle showing full
virtual address or just the offset.
- Cache DSO build-ids when synthesizing PERF_RECORD_MMAP records for
pre-existing threads, at the start of a 'perf record' session,
speeding up that record startup phase.
- Add a command line option to specify build ids in 'perf inject'.
- Update JSON event files for the Intel alderlake, broadwell,
broadwellde, broadwellx, cascadelakex, haswell, haswellx, icelake,
icelakex, ivybridge, ivytown, jaketown, sandybridge, sapphirerapids,
skylake, skylakex, and tigerlake processors.
- Update vendor JSON event files for the ARM Neoverse V1 and E1
platforms.
- Add a 'perf test' entry for 'perf mem' where a struct has false
sharing and this gets detected in the 'perf mem' output, tested with
Intel, AMD and ARM64 systems.
- Add a 'perf test' entry to test the resolution of java symbols, where
an output like this is expected:
8.18% jshell jitted-50116-29.so [.] Interpreter
0.75% Thread-1 jitted-83602-1670.so [.] jdk.internal.jimage.BasicImageReader.getString(int)
- Add tests for the ARM64 CoreSight hardware tracing feature, with
specially crafted pureloop, memcpy, thread loop and unroll tread that
then gets traced and the output compared with expected output.
Documentation explaining it is also included.
- Add per thread Intel PT 'perf test' entry to check that
PERF_RECORD_TEXT_POKE events are recorded per CPU, resulting in a
mixture of per thread and per CPU events and mmaps, verify that this
gets all recorded correctly.
- Introduce pthread mutex wrappers to allow for building with clang's
-Wthread-safety, i.e. using the "guarded_by" "pt_guarded_by"
"lockable", "exclusive_lock_function", "exclusive_trylock_function",
"exclusive_locks_required", and "no_thread_safety_analysis" compiler
function attributes.
- Fix empty version number when building outside of a git repo.
- Improve feature detection display when multiple versions of a feature
are present, such as for binutils libbfd, that has a mix of possible
ways to detect according to the Linux distribution.
Previously in some cases we had:
Auto-detecting system features
<SNIP>
... libbfd: [ on ]
... libbfd-liberty: [ on ]
... libbfd-liberty-z: [ on ]
<SNIP>
Now for this case we show just the main feature:
Auto-detecting system features
<SNIP>
... libbfd: [ on ]
<SNIP>
- Remove some unused structs, variables, macros, function prototypes
and includes from various places.
* tag 'perf-tools-for-v6.1-1-2022-10-07' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (169 commits)
perf script: Add missing fields in usage hint
perf mem: Print "LFB/MAB" for PERF_MEM_LVLNUM_LFB
perf mem/c2c: Avoid printing empty lines for unsupported events
perf mem/c2c: Add load store event mappings for AMD
perf mem/c2c: Set PERF_SAMPLE_WEIGHT for LOAD_STORE events
perf mem: Add support for printing PERF_MEM_LVLNUM_{CXL|IO}
perf amd ibs: Sync arch/x86/include/asm/amd-ibs.h header with the kernel
tools headers UAPI: Sync include/uapi/linux/perf_event.h header with the kernel
perf stat: Fix cpu check to use id.cpu.cpu in aggr_printout()
perf test coresight: Add relevant documentation about ARM64 CoreSight testing
perf test: Add git ignore for tmp and output files of ARM CoreSight tests
perf test coresight: Add unroll thread test shell script
perf test coresight: Add unroll thread test tool
perf test coresight: Add thread loop test shell scripts
perf test coresight: Add thread loop test tool
perf test coresight: Add memcpy thread test shell script
perf test coresight: Add memcpy thread test tool
perf test: Add git ignore for perf data generated by the ARM CoreSight tests
perf test: Add arm64 asm pureloop test shell script
perf test: Add asm pureloop test tool
...
|
||
|
|
b7ddd38ccc |
tools headers UAPI: Sync include/uapi/linux/perf_event.h header with the kernel
Two new fields for mem_lvl_num has been introduced: PERF_MEM_LVLNUM_IO and PERF_MEM_LVLNUM_CXL which are required to support perf mem/c2c on AMD platform. Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ali Saidi <alisaidi@amazon.com> Cc: Ananth Narayan <ananth.narayan@amd.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Joe Mario <jmario@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Santosh Shukla <santosh.shukla@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: https://lore.kernel.org/r/20221006153946.7816-2-ravi.bangoria@amd.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
8a76145a2e |
bpf: explicitly define BPF_FUNC_xxx integer values
Historically enum bpf_func_id's BPF_FUNC_xxx enumerators relied on
implicit sequential values being assigned by compiler. This is
convenient, as new BPF helpers are always added at the very end, but it
also has its downsides, some of them being:
- with over 200 helpers now it's very hard to know what's each helper's ID,
which is often important to know when working with BPF assembly (e.g.,
by dumping raw bpf assembly instructions with llvm-objdump -d
command). it's possible to work around this by looking into vmlinux.h,
dumping /sys/btf/kernel/vmlinux, looking at libbpf-provided
bpf_helper_defs.h, etc. But it always feels like an unnecessary step
and one should be able to quickly figure this out from UAPI header.
- when backporting and cherry-picking only some BPF helpers onto older
kernels it's important to be able to skip some enum values for helpers
that weren't backported, but preserve absolute integer IDs to keep BPF
helper IDs stable so that BPF programs stay portable across upstream
and backported kernels.
While neither problem is insurmountable, they come up frequently enough
and are annoying enough to warrant improving the situation. And for the
backporting the problem can easily go unnoticed for a while, especially
if backport is done with people not very familiar with BPF subsystem overall.
Anyways, it's easy to fix this by making sure that __BPF_FUNC_MAPPER
macro provides explicit helper IDs. Unfortunately that would potentially
break existing users that use UAPI-exposed __BPF_FUNC_MAPPER and are
expected to pass macro that accepts only symbolic helper identifier
(e.g., map_lookup_elem for bpf_map_lookup_elem() helper).
As such, we need to introduce a new macro (___BPF_FUNC_MAPPER) which
would specify both identifier and integer ID, but in such a way as to
allow existing __BPF_FUNC_MAPPER be expressed in terms of new
___BPF_FUNC_MAPPER macro. And that's what this patch is doing. To avoid
duplication and allow __BPF_FUNC_MAPPER stay *exactly* the same,
___BPF_FUNC_MAPPER accepts arbitrary "context" arguments, which can be
used to pass any extra macros, arguments, and whatnot. In our case we
use this to pass original user-provided macro that expects single
argument and __BPF_FUNC_MAPPER is using it's own three-argument
__BPF_FUNC_MAPPER_APPLY intermediate macro to impedance-match new and
old "callback" macros.
Once we resolve this, we use new ___BPF_FUNC_MAPPER to define enum
bpf_func_id with explicit values. The other users of __BPF_FUNC_MAPPER
in kernel (namely in kernel/bpf/disasm.c) are kept exactly the same both
as demonstration that backwards compat works, but also to avoid
unnecessary code churn.
Note that new ___BPF_FUNC_MAPPER() doesn't forcefully insert comma
between values, as that might not be appropriate in all possible cases
where ___BPF_FUNC_MAPPER might be used by users. This doesn't reduce
usability, as it's trivial to insert that comma inside "callback" macro.
To validate all the manually specified IDs are exactly right, we used
BTF to compare before and after values:
$ bpftool btf dump file ~/linux-build/default/vmlinux | rg bpf_func_id -A 211 > after.txt
$ git stash # stach UAPI changes
$ make -j90
... re-building kernel without UAPI changes ...
$ bpftool btf dump file ~/linux-build/default/vmlinux | rg bpf_func_id -A 211 > before.txt
$ diff -u before.txt after.txt
--- before.txt 2022-10-05 10:48:18.119195916 -0700
+++ after.txt 2022-10-05 10:46:49.446615025 -0700
@@ -1,4 +1,4 @@
-[14576] ENUM 'bpf_func_id' encoding=UNSIGNED size=4 vlen=211
+[9560] ENUM 'bpf_func_id' encoding=UNSIGNED size=4 vlen=211
'BPF_FUNC_unspec' val=0
'BPF_FUNC_map_lookup_elem' val=1
'BPF_FUNC_map_update_elem' val=2
As can be seen from diff above, the only thing that changed was resulting BTF
type ID of ENUM bpf_func_id, not any of the enumerators, their names or integer
values.
The only other place that needed fixing was scripts/bpf_doc.py used to generate
man pages and bpf_helper_defs.h header for libbpf and selftests. That script is
tightly-coupled to exact shape of ___BPF_FUNC_MAPPER macro definition, so had
to be trivially adapted.
Cc: Quentin Monnet <quentin@isovalent.com>
Reported-by: Andrea Terzolo <andrea.terzolo@polito.it>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20221006042452.2089843-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
||
|
|
fb42f8b729 |
perf branch: Add PERF_BR_NEW_ARCH_[N] map for BRBE on arm64 platform
This updates the perf tool with arch specific branch type classification used for BRBE on arm64 platform as added in the kernel earlier. Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20220824044822.70230-9-anshuman.khandual@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
bcb96ce6d2 |
perf branch: Add branch privilege information request flag
This updates the perf tools with branch privilege information request flag i.e PERF_SAMPLE_BRANCH_PRIV_SAVE that has been added earlier in the kernel. This also updates 'perf record' documentation, branch_modes[], and generic branch privilege level enumeration as added earlier in the kernel. Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20220824044822.70230-8-anshuman.khandual@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
0ddea8e2a0 |
perf branch: Extend branch type classification
This updates the perf tool with generic branch type classification with new
ABI extender place holder i.e PERF_BR_EXTEND_ABI, the new 4 bit branch type
field i.e perf_branch_entry.new_type, new generic page fault related branch
types and some arch specific branch types as added earlier in the kernel.
Committer note:
Add an extra entry to the branch_type_name array to cope with
PERF_BR_EXTEND_ABI, to address build warnings on some compiler/systems,
like:
75 8.89 ubuntu:20.04-x-powerpc64el : FAIL gcc version 10.3.0 (Ubuntu 10.3.0-1ubuntu1~20.04)
inlined from 'branch_type_stat_display' at util/branch.c:152:4:
/usr/powerpc64le-linux-gnu/include/bits/stdio2.h:100:10: error: '%8s' directive argument is null [-Werror=format-overflow=]
100 | return __fprintf_chk (__stream, __USE_FORTIFY_LEVEL - 1, __fmt,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
101 | __va_arg_pack ());
| ~~~~~~~~~~~~~~~~~
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220824044822.70230-7-anshuman.khandual@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
||
|
|
1c96b6e45f |
perf branch: Add system error and not in transaction branch types
This updates the perf tool with generic branch type classification with two new branch types i.e system error (PERF_BR_SERROR) and not in transaction (PERF_BR_NO_TX) which got updated earlier in the kernel. This also updates corresponding branch type strings in branch_type_name(). Committer notes: At perf tools merge time this is only on PeterZ's tree, at: git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git perf/core So for testing one has to build a kernel with that branch, then test the tooling side from: git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git perf/core Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20220824044822.70230-6-anshuman.khandual@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
e52f7c1ddf |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Merge in the left-over fixes before the net-next pull-request. Conflicts: drivers/net/ethernet/mediatek/mtk_ppe.c |