Pull non-mm updates from Andrew Morton:
"Mainly singleton patches, documented in their respective changelogs.
Notable series include:
- Some maintenance and performance work for ocfs2 in Heming Zhao's
series "improve write IO performance when fragmentation is high".
- Some ocfs2 bugfixes from Su Yue in the series "ocfs2 bugs fixes
exposed by fstests".
- kfifo header rework from Andy Shevchenko in the series "kfifo:
Clean up kfifo.h".
- GDB script fixes from Florian Rommel in the series "scripts/gdb:
Fixes for $lx_current and $lx_per_cpu".
- After much discussion, a coding-style update from Barry Song
explaining one reason why inline functions are preferred over
macros. The series is "codingstyle: avoid unused parameters for a
function-like macro""
* tag 'mm-nonmm-stable-2024-05-19-11-56' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (62 commits)
fs/proc: fix softlockup in __read_vmcore
nilfs2: convert BUG_ON() in nilfs_finish_roll_forward() to WARN_ON()
scripts: checkpatch: check unused parameters for function-like macro
Documentation: coding-style: ask function-like macros to evaluate parameters
nilfs2: use __field_struct() for a bitwise field
selftests/kcmp: remove unused open mode
nilfs2: remove calls to folio_set_error() and folio_clear_error()
kernel/watchdog_perf.c: tidy up kerneldoc
watchdog: allow nmi watchdog to use raw perf event
watchdog: handle comma separated nmi_watchdog command line
nilfs2: make superblock data array index computation sparse friendly
squashfs: remove calls to set the folio error flag
squashfs: convert squashfs_symlink_read_folio to use folio APIs
scripts/gdb: fix detection of current CPU in KGDB
scripts/gdb: make get_thread_info accept pointers
scripts/gdb: fix parameter handling in $lx_per_cpu
scripts/gdb: fix failing KGDB detection during probe
kfifo: don't use "proxy" headers
media: stih-cec: add missing io.h
media: rc: add missing io.h
...
Adjust `union bpf_attr` size passed to kernel in two feature-detecting
functions to take into account prog_token_fd field.
Libbpf is avoiding memset()'ing entire `union bpf_attr` by only using
minimal set of bpf_attr's fields. Two places have been missed when
wiring BPF token support in libbpf's feature detection logic.
Fix them trivially.
Fixes: f3dcee938f ("libbpf: Wire up token_fd into feature probing logic")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240513180804.403775-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
[Changes from V1:
- Use a default branch in the switch statement to initialize `val'.]
GCC warns that `val' may be used uninitialized in the
BPF_CRE_READ_BITFIELD macro, defined in bpf_core_read.h as:
[...]
unsigned long long val; \
[...] \
switch (__CORE_RELO(s, field, BYTE_SIZE)) { \
case 1: val = *(const unsigned char *)p; break; \
case 2: val = *(const unsigned short *)p; break; \
case 4: val = *(const unsigned int *)p; break; \
case 8: val = *(const unsigned long long *)p; break; \
} \
[...]
val; \
} \
This patch adds a default entry in the switch statement that sets
`val' to zero in order to avoid the warning, and random values to be
used in case __builtin_preserve_field_info returns unexpected values
for BPF_FIELD_BYTE_SIZE.
Tested in bpf-next master.
No regressions.
Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240508101313.16662-1-jose.marchesi@oracle.com
The tools/lib/rbtree.c code came from the kernel. Remove the
EXPORT_SYMBOL() that make sense only there. Unfortunately it is not being
checked with tools/perf/check_headers.sh. Will try to remedy this. Until
then pick the improvements from:
b0687c1119 ("lib/rbtree: use '+' instead of '|' for setting color.")
That I noticed by doing:
diff -u tools/lib/rbtree.c lib/rbtree.c
diff -u tools/include/linux/rbtree_augmented.h include/linux/rbtree_augmented.h
There is one other cases, but lets pick it in separate patches.
Link: https://lkml.kernel.org/r/ZigZzeFoukzRKG1Q@x1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Noah Goldstein <goldstein.w.n@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Extend libbpf's pre-load checks for BPF programs, detecting more typical
conditions that are destinated to cause BPF program failure. This is an
opportunity to provide more helpful and actionable error message to
users, instead of potentially very confusing BPF verifier log and/or
error.
In this case, we detect struct_ops BPF program that was not referenced
anywhere, but still attempted to be loaded (according to libbpf logic).
Suggest that the program might need to be used in some struct_ops
variable. User will get a message of the following kind:
libbpf: prog 'test_1_forgotten': SEC("struct_ops") program isn't referenced anywhere, did you forget to use it?
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240507001335.1445325-6-andrii@kernel.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
strerror_r(), used from libbpf-specific libbpf_strerror_r() wrapper is
documented to return error in two different ways, depending on glibc
version. Take that into account when handling strerror_r()'s own errors,
which happens when we pass some non-standard (internal) kernel error to
it. Before this patch we'd have "ERROR: strerror_r(524)=22", which is
quite confusing. Now for the same situation we'll see a bit less
visually scary "unknown error (-524)".
At least we won't confuse user with irrelevant EINVAL (22).
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240507001335.1445325-5-andrii@kernel.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
There is yet another corner case where user can set STRUCT_OPS program
reference in STRUCT_OPS map to NULL, but libbpf will fail to disable
autoload for such BPF program. This time it's the case of "new" kernel
which has type information about callback field, but user explicitly
nulled-out program reference from user-space after opening BPF object.
Fix, hopefully, the last remaining unhandled case.
Fixes: 0737df6de9 ("libbpf: better fix for handling nulled-out struct_ops program")
Fixes: f973fccd43 ("libbpf: handle nulled-out program in struct_ops correctly")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240507001335.1445325-3-andrii@kernel.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
libbpf ensures that BPF program references set in map->st_ops->progs[i]
during open phase are always valid STRUCT_OPS programs. This is done in
bpf_object__collect_st_ops_relos(). So there is no need to double-check
that in bpf_map__init_kern_struct_ops().
Simplify the code by removing unnecessary check. Also, we avoid using
local prog variable to keep code similar to the upcoming fix, which adds
similar logic in another part of bpf_map__init_kern_struct_ops().
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240507001335.1445325-2-andrii@kernel.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
[Differences from V1:
- Do not introduce a global typedef, as this is a public header.
- Keep the void* casts in BPF_KPROBE_READ_RET_IP and
BPF_KRETPROBE_READ_RET_IP, as these are necessary
for converting to a const void* argument of
bpf_probe_read_kernel.]
The BPF_PROG, BPF_KPROBE and BPF_KSYSCALL macros defined in
tools/lib/bpf/bpf_tracing.h use a clever hack in order to provide a
convenient way to define entry points for BPF programs as if they were
normal C functions that get typed actual arguments, instead of as
elements in a single "context" array argument.
For example, PPF_PROGS allows writing:
SEC("struct_ops/cwnd_event")
void BPF_PROG(cwnd_event, struct sock *sk, enum tcp_ca_event event)
{
bbr_cwnd_event(sk, event);
dctcp_cwnd_event(sk, event);
cubictcp_cwnd_event(sk, event);
}
That expands into a pair of functions:
void ____cwnd_event (unsigned long long *ctx, struct sock *sk, enum tcp_ca_event event)
{
bbr_cwnd_event(sk, event);
dctcp_cwnd_event(sk, event);
cubictcp_cwnd_event(sk, event);
}
void cwnd_event (unsigned long long *ctx)
{
_Pragma("GCC diagnostic push")
_Pragma("GCC diagnostic ignored \"-Wint-conversion\"")
return ____cwnd_event(ctx, (void*)ctx[0], (void*)ctx[1]);
_Pragma("GCC diagnostic pop")
}
Note how the 64-bit unsigned integers in the incoming CTX get casted
to a void pointer, and then implicitly converted to whatever type of
the actual argument in the wrapped function. In this case:
Arg1: unsigned long long -> void * -> struct sock *
Arg2: unsigned long long -> void * -> enum tcp_ca_event
The behavior of GCC and clang when facing such conversions differ:
pointer -> pointer
Allowed by the C standard.
GCC: no warning nor error.
clang: no warning nor error.
pointer -> integer type
[C standard says the result of this conversion is implementation
defined, and it may lead to unaligned pointer etc.]
GCC: error: integer from pointer without a cast [-Wint-conversion]
clang: error: incompatible pointer to integer conversion [-Wint-conversion]
pointer -> enumerated type
GCC: error: incompatible types in assigment (*)
clang: error: incompatible pointer to integer conversion [-Wint-conversion]
These macros work because converting pointers to pointers is allowed,
and converting pointers to integers also works provided a suitable
integer type even if it is implementation defined, much like casting a
pointer to uintptr_t is guaranteed to work by the C standard. The
conversion errors emitted by both compilers by default are silenced by
the pragmas.
However, the GCC error marked with (*) above when assigning a pointer
to an enumerated value is not associated with the -Wint-conversion
warning, and it is not possible to turn it off.
This is preventing building the BPF kernel selftests with GCC.
This patch fixes this by avoiding intermediate casts to void*,
replaced with casts to `unsigned long long', which is an integer type
capable of safely store a BPF pointer, much like the standard
uintptr_t.
Testing performed in bpf-next master:
- vmtest.sh -- ./test_verifier
- vmtest.sh -- ./test_progs
- make M=samples/bpf
No regressions.
Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240502170925.3194-1-jose.marchesi@oracle.com
The macro bpf_ksym_exists is defined in bpf_helpers.h as:
#define bpf_ksym_exists(sym) ({ \
_Static_assert(!__builtin_constant_p(!!sym), #sym " should be marked as __weak"); \
!!sym; \
})
The purpose of the macro is to determine whether a given symbol has
been defined, given the address of the object associated with the
symbol. It also has a compile-time check to make sure the object
whose address is passed to the macro has been declared as weak, which
makes the check on `sym' meaningful.
As it happens, the check for weak doesn't work in GCC in all cases,
because __builtin_constant_p not always folds at parse time when
optimizing. This is because optimizations that happen later in the
compilation process, like inlining, may make a previously non-constant
expression a constant. This results in errors like the following when
building the selftests with GCC:
bpf_helpers.h:190:24: error: expression in static assertion is not constant
190 | _Static_assert(!__builtin_constant_p(!!sym), #sym " should be marked as __weak"); \
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
Fortunately recent versions of GCC support a __builtin_has_attribute
that can be used to directly check for the __weak__ attribute. This
patch changes bpf_helpers.h to use that builtin when building with a
recent enough GCC, and to omit the check if GCC is too old to support
the builtin.
The macro used for GCC becomes:
#define bpf_ksym_exists(sym) ({ \
_Static_assert(__builtin_has_attribute (*sym, __weak__), #sym " should be marked as __weak"); \
!!sym; \
})
Note that since bpf_ksym_exists is designed to get the address of the
object associated with symbol SYM, we pass *sym to
__builtin_has_attribute instead of sym. When an expression is passed
to __builtin_has_attribute then it is the type of the passed
expression that is checked for the specified attribute. The
expression itself is not evaluated. This accommodates well with the
existing usages of the macro:
- For function objects:
struct task_struct *bpf_task_acquire(struct task_struct *p) __ksym __weak;
[...]
bpf_ksym_exists(bpf_task_acquire)
- For variable objects:
extern const struct rq runqueues __ksym __weak; /* typed */
[...]
bpf_ksym_exists(&runqueues)
Note also that BPF support was added in GCC 10 and support for
__builtin_has_attribute in GCC 9.
Locally tested in bpf-next master branch.
No regressions.
Signed-of-by: Jose E. Marchesi <jose.marchesi@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/bpf/20240428112559.10518-1-jose.marchesi@oracle.com
ringbuf_process_ring() return int64_t, while ring__consume_n() assigns
it to int. It's highly unlikely, but possible for ringbuf_process_ring()
to return value larger than INT_MAX, so use int64_t. ring__consume_n()
does check INT_MAX before returning int result to the user.
Fixes: 4d22ea94ea ("libbpf: Add ring__consume_n / ring_buffer__consume_n")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20240430201952.888293-1-andrii@kernel.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Previous attempt to fix the handling of nulled-out (from skeleton)
struct_ops program is working well only if struct_ops program is defined
as non-autoloaded by default (i.e., has SEC("?struct_ops") annotation,
with question mark).
Unfortunately, that fix is incomplete due to how
bpf_object_adjust_struct_ops_autoload() is marking referenced or
non-referenced struct_ops program as autoloaded (or not). Because
bpf_object_adjust_struct_ops_autoload() is run after
bpf_map__init_kern_struct_ops() step, which sets program slot to NULL,
such programs won't be considered "referenced", and so its autoload
property won't be changed.
This all sounds convoluted and it is, but the desire is to have as
natural behavior (as far as struct_ops usage is concerned) as possible.
This fix is redoing the original fix but makes it work for
autoloaded-by-default struct_ops programs as well. We achieve this by
forcing prog->autoload to false if prog was declaratively set for some
struct_ops map, but then nulled-out from skeleton (programmatically).
This achieves desired effect of not autoloading it. If such program is
still referenced somewhere else (different struct_ops map or different
callback field), it will get its autoload property adjusted by
bpf_object_adjust_struct_ops_autoload() later.
We also fix selftest, which accidentally used SEC("?struct_ops")
annotation. It was meant to use autoload-by-default program from the
very beginning.
Fixes: f973fccd43 ("libbpf: handle nulled-out program in struct_ops correctly")
Cc: Kui-Feng Lee <thinker.li@gmail.com>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240501041706.3712608-1-andrii@kernel.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
In some situations, it is useful to explicitly specify a kernel module
to search for a tracing program target (e.g. when a function of the same
name exists in multiple modules or in vmlinux).
This patch enables that by allowing the "module:function" syntax for the
find_kernel_btf_id function. Thanks to this, the syntax can be used both
from a SEC macro (i.e. `SEC(fentry/module:function)`) and via the
bpf_program__set_attach_target API call.
Signed-off-by: Viktor Malik <vmalik@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/9085a8cb9a552de98e554deb22ff7e977d025440.1714469650.git.vmalik@redhat.com
Adding support to attach program in kprobe session mode
with bpf_program__attach_kprobe_multi_opts function.
Adding session bool to bpf_kprobe_multi_opts struct that allows
to load and attach the bpf program via kprobe session.
the attachment to create kprobe multi session.
Also adding new program loader section that allows:
SEC("kprobe.session/bpf_fentry_test*")
and loads/attaches kprobe program as kprobe session.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240430112830.1184228-5-jolsa@kernel.org
If struct_ops has one of program callbacks set declaratively and host
kernel is old and doesn't support this callback, libbpf will allow to
load such struct_ops as long as that callback was explicitly nulled-out
(presumably through skeleton). This is all working correctly, except we
won't reset corresponding program slot to NULL before bailing out, which
will lead to libbpf not detecting that BPF program has to be not
auto-loaded. Fix this by unconditionally resetting corresponding program
slot to NULL.
Fixes: c911fc61a7 ("libbpf: Skip zeroed or null fields if not found in the kernel type.")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240428030954.3918764-1-andrii@kernel.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
The definition of bpf_tail_call_static in tools/lib/bpf/bpf_helpers.h
is guarded by a preprocessor check to assure that clang is recent
enough to support it. This patch updates the guard so the function is
compiled when using GCC 13 or later as well.
Tested in bpf-next master. No regressions.
Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240426145158.14409-1-jose.marchesi@oracle.com
When dumping a character array, libbpf will watch for a '\0' and set
is_array_terminated=true if found. This prevents libbpf from printing
the remaining characters of the array, treating it as a nul-terminated
string.
However, once this flag is set, it's never reset, leading to subsequent
characters array not being printed properly:
.str_multi = (__u8[2][16])[
[
'H',
'e',
'l',
],
],
This patch saves the is_array_terminated flag and restores its
default (false) value before looping over the elements of an array,
then restores it afterward. This way, libbpf's behavior is unchanged
when dumping the characters of an array, but subsequent arrays are
printed properly:
.str_multi = (__u8[2][16])[
[
'H',
'e',
'l',
],
[
'l',
'o',
],
],
Signed-off-by: Quentin Deslandes <qde@naccy.de>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240413211258.134421-3-qde@naccy.de
In btf_dump_array_data(), libbpf will call btf_dump_dump_type_data() for
each element. For an array of characters, each element will be
processed the following way:
- btf_dump_dump_type_data() is called to print the character
- btf_dump_data_pfx() prefixes the current line with the proper number
of indentations
- btf_dump_int_data() is called to print the character
- After the last character is printed, btf_dump_dump_type_data() calls
btf_dump_data_pfx() before writing the closing bracket
However, for an array containing characters, btf_dump_int_data() won't
print any '\0' and subsequent characters. This leads to situations where
the line prefix is written, no character is added, then the prefix is
written again before adding the closing bracket:
(struct sk_metadata){
.str_array = (__u8[14])[
'H',
'e',
'l',
'l',
'o',
],
This change solves this issue by printing the '\0' character, which
has two benefits:
- The bracket closing the array is properly aligned
- It's clear from a user point of view that libbpf uses '\0' as a
terminator for arrays of characters.
Signed-off-by: Quentin Deslandes <qde@naccy.de>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240413211258.134421-2-qde@naccy.de
In some cases, instead of always consuming all items from ring buffers
in a greedy way, we may want to consume up to a certain amount of items,
for example when we need to copy items from the BPF ring buffer to a
limited user buffer.
This change allows to set an upper limit to the amount of items consumed
from one or more ring buffers.
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240406092005.92399-3-andrea.righi@canonical.com