Commit Graph

973 Commits

Author SHA1 Message Date
Matteo Croce
39391377f8 libbpf: add binary to gitignore
Some binaries are generated when building libbpf from tools/lib/bpf/,
namely libbpf.so.0.0.2 and libbpf.so.0.
Add them to the local .gitignore.

Signed-off-by: Matteo Croce <mcroce@redhat.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-25 17:20:06 -07:00
Daniel T. Lee
32e621e554 libbpf: fix samples/bpf build failure due to undefined UINT32_MAX
Currently, building bpf samples will cause the following error.

    ./tools/lib/bpf/bpf.h:132:27: error: 'UINT32_MAX' undeclared here (not in a function) ..
     #define BPF_LOG_BUF_SIZE (UINT32_MAX >> 8) /* verifier maximum in kernels <= 5.1 */
                               ^
    ./samples/bpf/bpf_load.h:31:25: note: in expansion of macro 'BPF_LOG_BUF_SIZE'
     extern char bpf_log_buf[BPF_LOG_BUF_SIZE];
                             ^~~~~~~~~~~~~~~~

Due to commit 4519efa6f8 ("libbpf: fix BPF_LOG_BUF_SIZE off-by-one error")
hard-coded size of BPF_LOG_BUF_SIZE has been replaced with UINT32_MAX which is
defined in <stdint.h> header.

Even with this change, bpf selftests are running fine since these are built
with clang and it includes header(-idirafter) from clang/6.0.0/include.
(it has <stdint.h>)

    clang -I. -I./include/uapi -I../../../include/uapi -idirafter /usr/local/include -idirafter /usr/include \
    -idirafter /usr/lib/llvm-6.0/lib/clang/6.0.0/include -idirafter /usr/include/x86_64-linux-gnu \
    -Wno-compare-distinct-pointer-types -O2 -target bpf -emit-llvm -c progs/test_sysctl_prog.c -o - | \
    llc -march=bpf -mcpu=generic  -filetype=obj -o /linux/tools/testing/selftests/bpf/test_sysctl_prog.o

But bpf samples are compiled with GCC, and it only searches and includes
headers declared at the target file. As '#include <stdint.h>' hasn't been
declared in tools/lib/bpf/bpf.h, it causes build failure of bpf samples.

    gcc -Wp,-MD,./samples/bpf/.sockex3_user.o.d -Wall -Wmissing-prototypes -Wstrict-prototypes \
    -O2 -fomit-frame-pointer -std=gnu89 -I./usr/include -I./tools/lib/ -I./tools/testing/selftests/bpf/ \
    -I./tools/  lib/ -I./tools/include -I./tools/perf -c -o ./samples/bpf/sockex3_user.o ./samples/bpf/sockex3_user.c;

This commit add declaration of '#include <stdint.h>' to tools/lib/bpf/bpf.h
to fix this problem.

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-04-25 23:34:10 +02:00
Daniel Borkmann
4f8827d2b6 bpf, libbpf: fix segfault in bpf_object__init_maps' pr_debug statement
Ran into it while testing; in bpf_object__init_maps() data can be NULL
in the case where no map section is present. Therefore we simply cannot
access data->d_size before NULL test. Move the pr_debug() where it's
safe to access.

Fixes: d859900c4c ("bpf, libbpf: support global data/bss/rodata sections")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-25 13:47:29 -07:00
Daniel Borkmann
8837fe5dd0 bpf, libbpf: handle old kernels more graceful wrt global data sections
Andrii reported a corner case where e.g. global static data is present
in the BPF ELF file in form of .data/.bss/.rodata section, but without
any relocations to it. Such programs could be loaded before commit
d859900c4c ("bpf, libbpf: support global data/bss/rodata sections"),
whereas afterwards if kernel lacks support then loading would fail.

Add a probing mechanism which skips setting up libbpf internal maps
in case of missing kernel support. In presence of relocation entries,
we abort the load attempt.

Fixes: d859900c4c ("bpf, libbpf: support global data/bss/rodata sections")
Reported-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-25 13:47:29 -07:00
McCabe, Robert J
4519efa6f8 libbpf: fix BPF_LOG_BUF_SIZE off-by-one error
The BPF_PROG_LOAD condition for kernel version <= 5.1 is

   log->len_total > UINT_MAX >> 8 /* (16 * 1024 * 1024) - 1 */

Signed-off-by: McCabe, Robert J <robert.mccabe@rockwellcollins.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-19 17:05:48 -07:00
Magnus Karlsson
79b1b30e4c libbpf: remove compile time warning from libbpf_util.h
Having a helpful compile time warning in libbpf_util.h is not a good
idea since all warnings are treated as errors. Change this into a
comment in the code instead.

Fixes: b7e3a28019 ("libbpf: remove dependency on barrier.h in xsk.h")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-18 16:07:12 -07:00
Magnus Karlsson
2c5935f1b2 libbpf: optimize barrier for XDP socket rings
The full memory barrier in the XDP socket rings on the consumer side
between the load of the data and the store of the consumer ring is
there to protect the store from being executed before the load of the
data. If this was allowed to happen, the producer might overwrite the
data field with a new entry before the consumer got the chance to read
it.

On x86, stores are guaranteed not to be reordered with older loads, so
it does not need a full memory barrier here. A compile time barrier
would be enough. This patch introdcues a new primitive in
libbpf_util.h that implements a new barrier type (libbpf_smp_rwmb)
hindering stores to be reordered with older loads. It is then used in
the XDP socket ring access code in libbpf to improve performance.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-16 20:13:10 -07:00
Magnus Karlsson
b7e3a28019 libbpf: remove dependency on barrier.h in xsk.h
The use of smp_rmb() and smp_wmb() creates a Linux header dependency
on barrier.h that is unnecessary in most parts. This patch implements
the two small defines that are needed from barrier.h. As a bonus, the
new implementations are faster than the default ones as they default
to sfence and lfence for x86, while we only need a compiler barrier in
our case. Just as it is when the same ring access code is compiled in
the kernel.

Fixes: 1cad078842 ("libbpf: add support for using AF_XDP sockets")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-16 20:13:10 -07:00
Magnus Karlsson
a06d729646 libbpf: remove likely/unlikely in xsk.h
This patch removes the use of likely and unlikely in xsk.h since they
create a dependency on Linux headers as reported by several
users. There have also been reports that the use of these decreases
performance as the compiler puts the code on two different cache lines
instead of on a single one. All in all, I think we are better off
without them.

Fixes: 1cad078842 ("libbpf: add support for using AF_XDP sockets")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-16 20:13:10 -07:00
Magnus Karlsson
d5e63fdd44 libbpf: fix XDP socket ring buffer memory ordering
The ring buffer code of	XDP sockets is missing a memory	barrier	on the
consumer side between the load of the data and the write that signals
that it is ok for the producer to put new data into the buffer. On
architectures that does not guarantee that stores are not reordered
with older loads, the producer might put data into the ring before the
consumer had the chance to read it. As IA does guarantee this
ordering, it would only need a compiler barrier here, but there are no
primitives in barrier.h for this specific case (hinder writes to be ordered
before older reads) so I had to add a smp_mb() here which will
translate into a run-time synch operation on IA.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-16 20:13:10 -07:00
Andrii Nakryiko
e1d1dc4653 libbpf: fix printf formatter for ptrdiff_t argument
Using %ld for printing out value of ptrdiff_t type is not portable
between 32-bit and 64-bit archs. This is causing compilation errors for
libbpf on 32-bit platform (discovered as part of an effort to integrate
libbpf into systemd ([0])). Proper formatter is %td, which is used in
this patch.

v2->v1:
  - add Reported-by
  - provide more context on how this issue was discovered

[0] https://github.com/systemd/systemd/pull/12151

Reported-by: Evgeny Vereshchagin <evvers@ya.ru>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-16 19:44:19 -07:00
Rikard Falkeborn
f32c2877bc tools lib traceevent: Fix missing equality check for strcmp
There was a missing comparison with 0 when checking if type is "s64" or
"u64". Therefore, the body of the if-statement was entered if "type" was
"u64" or not "s64", which made the first strcmp() redundant since if
type is "u64", it's not "s64".

If type is "s64", the body of the if-statement is not entered but since
the remainder of the function consists of if-statements which will not
be entered if type is "s64", we will just return "val", which is
correct, albeit at the cost of a few more calls to strcmp(), i.e., it
will behave just as if the if-statement was entered.

If type is neither "s64" or "u64", the body of the if-statement will be
entered incorrectly and "val" returned. This means that any type that is
checked after "s64" and "u64" is handled the same way as "s64" and
"u64", i.e., the limiting of "val" to fit in for example "s8" is never
reached.

This was introduced in the kernel tree when the sources were copied from
trace-cmd in commit f7d82350e5 ("tools/events: Add files to create
libtraceevent.a"), and in the trace-cmd repo in 1cdbae6035cei
("Implement typecasting in parser") when the function was introduced,
i.e., it has always behaved the wrong way.

Detected by cppcheck.

Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Tzvetomir Stoyanov <tstoyanov@vmware.com>
Fixes: f7d82350e5 ("tools/events: Add files to create libtraceevent.a")
Link: http://lkml.kernel.org/r/20190409091529.2686-1-rikard.falkeborn@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-04-16 11:27:36 -03:00
Ingo Molnar
496156e364 Merge branch 'linus' into perf/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-16 12:02:43 +02:00
Andrii Nakryiko
189cf5a4a7 btf: add support for VAR and DATASEC in btf_dedup()
This patch adds support for VAR and DATASEC in btf_dedup(). VAR/DATASEC
are never deduplicated, but they need to be processed anyway as types
they refer to might need to be remapped due to deduplication and
compaction.

Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Yonghong Song <yhs@fb.com>
Cc: Alexei Starovoitov <ast@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-04-16 09:50:20 +02:00
Andrey Ignatov
063cc9f06e libbpf: Support sysctl hook
Support BPF_PROG_TYPE_CGROUP_SYSCTL program in libbpf: identifying
program and attach types by section name, probe.

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-12 13:54:58 -07:00
David S. Miller
bb23581b9b Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2019-04-12

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Improve BPF verifier scalability for large programs through two
   optimizations: i) remove verifier states that are not useful in pruning,
   ii) stop walking parentage chain once first LIVE_READ is seen. Combined
   gives approx 20x speedup. Increase limits for accepting large programs
   under root, and add various stress tests, from Alexei.

2) Implement global data support in BPF. This enables static global variables
   for .data, .rodata and .bss sections to be properly handled which allows
   for more natural program development. This also opens up the possibility
   to optimize program workflow by compiling ELFs only once and later only
   rewriting section data before reload, from Daniel and with test cases and
   libbpf refactoring from Joe.

3) Add config option to generate BTF type info for vmlinux as part of the
   kernel build process. DWARF debug info is converted via pahole to BTF.
   Latter relies on libbpf and makes use of BTF deduplication algorithm which
   results in 100x savings compared to DWARF data. Resulting .BTF section is
   typically about 2MB in size, from Andrii.

4) Add BPF verifier support for stack access with variable offset from
   helpers and add various test cases along with it, from Andrey.

5) Extend bpf_skb_adjust_room() growth BPF helper to mark inner MAC header
   so that L2 encapsulation can be used for tc tunnels, from Alan.

6) Add support for input __sk_buff context in BPF_PROG_TEST_RUN so that
   users can define a subset of allowed __sk_buff fields that get fed into
   the test program, from Stanislav.

7) Add bpf fs multi-dimensional array tests for BTF test suite and fix up
   various UBSAN warnings in bpftool, from Yonghong.

8) Generate a pkg-config file for libbpf, from Luca.

9) Dump program's BTF id in bpftool, from Prashant.

10) libbpf fix to use smaller BPF log buffer size for AF_XDP's XDP
    program, from Magnus.

11) kallsyms related fixes for the case when symbols are not present in
    BPF selftests and samples, from Daniel
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-11 17:00:05 -07:00
Stanislav Fomichev
5e903c656b libbpf: add support for ctx_{size, }_{in, out} in BPF_PROG_TEST_RUN
Support recently introduced input/output context for test runs.
We extend only bpf_prog_test_run_xattr. bpf_prog_test_run is
unextendable and left as is.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-04-11 10:21:41 +02:00
Andrey Ignatov
d5adbdd77e libbpf: Fix build with gcc-8
Reported in [1].

With gcc 8.3.0 the following error is issued:

  cc -Ibpf@sta -I. -I.. -I.././include -I.././include/uapi
  -fdiagnostics-color=always -fsanitize=address,undefined -fno-omit-frame-pointer
  -pipe -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Werror -g -fPIC -g -O2
  -Werror -Wall -Wno-pointer-arith -Wno-sign-compare  -MD -MQ
  'bpf@sta/src_libbpf.c.o' -MF 'bpf@sta/src_libbpf.c.o.d' -o
  'bpf@sta/src_libbpf.c.o' -c ../src/libbpf.c
  ../src/libbpf.c: In function 'bpf_object__elf_collect':
  ../src/libbpf.c:947:18: error: 'map_def_sz' may be used uninitialized in this
  function [-Werror=maybe-uninitialized]
     if (map_def_sz <= sizeof(struct bpf_map_def)) {
         ~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  ../src/libbpf.c:827:18: note: 'map_def_sz' was declared here
    int i, map_idx, map_def_sz, nr_syms, nr_maps = 0, nr_maps_glob = 0;
                    ^~~~~~~~~~

According to [2] -Wmaybe-uninitialized is enabled by -Wall.
Same error is generated by clang's -Wconditional-uninitialized.

[1] https://github.com/libbpf/libbpf/pull/29#issuecomment-481902601
[2] https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html

Fixes: d859900c4c ("bpf, libbpf: support global data/bss/rodata sections")
Reported-by: Evgeny Vereshchagin <evvers@ya.ru>
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-04-11 10:21:38 +02:00
Magnus Karlsson
50bd645b3a libbpf: fix crash in XDP socket part with new larger BPF_LOG_BUF_SIZE
In commit da11b41758 ("libbpf: teach libbpf about log_level bit 2"),
the BPF_LOG_BUF_SIZE was increased to 16M. The XDP socket part of
libbpf allocated the log_buf on the stack, but for the new 16M buffer
size this is not going to work. Change the code so it uses a 16K buffer
instead.

Fixes: da11b41758 ("libbpf: teach libbpf about log_level bit 2")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-04-10 09:51:50 +02:00
Yonghong Song
69a0f9ecef bpf, bpftool: fix a few ubsan warnings
The issue is reported at https://github.com/libbpf/libbpf/issues/28.

Basically, per C standard, for
  void *memcpy(void *dest, const void *src, size_t n)
if "dest" or "src" is NULL, regardless of whether "n" is 0 or not,
the result of memcpy is undefined. clang ubsan reported three such
instances in bpf.c with the following pattern:
  memcpy(dest, 0, 0).

Although in practice, no known compiler will cause issues when
copy size is 0. Let us still fix the issue to silence ubsan
warnings.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-04-10 09:46:51 +02:00
Daniel Borkmann
1713d68b3b bpf, libbpf: add support for BTF Var and DataSec
This adds libbpf support for BTF Var and DataSec kinds. Main point
here is that libbpf needs to do some preparatory work before the
whole BTF object can be loaded into the kernel, that is, fixing up
of DataSec size taken from the ELF section size and non-static
variable offset which needs to be taken from the ELF's string section.

Upstream LLVM doesn't fix these up since at time of BTF emission
it is too early in the compilation process thus this information
isn't available yet, hence loader needs to take care of it.

Note, deduplication handling has not been in the scope of this work
and needs to be addressed in a future commit.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://reviews.llvm.org/D59441
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-09 17:05:47 -07:00
Daniel Borkmann
d859900c4c bpf, libbpf: support global data/bss/rodata sections
This work adds BPF loader support for global data sections
to libbpf. This allows to write BPF programs in more natural
C-like way by being able to define global variables and const
data.

Back at LPC 2018 [0] we presented a first prototype which
implemented support for global data sections by extending BPF
syscall where union bpf_attr would get additional memory/size
pair for each section passed during prog load in order to later
add this base address into the ldimm64 instruction along with
the user provided offset when accessing a variable. Consensus
from LPC was that for proper upstream support, it would be
more desirable to use maps instead of bpf_attr extension as
this would allow for introspection of these sections as well
as potential live updates of their content. This work follows
this path by taking the following steps from loader side:

 1) In bpf_object__elf_collect() step we pick up ".data",
    ".rodata", and ".bss" section information.

 2) If present, in bpf_object__init_internal_map() we add
    maps to the obj's map array that corresponds to each
    of the present sections. Given section size and access
    properties can differ, a single entry array map is
    created with value size that is corresponding to the
    ELF section size of .data, .bss or .rodata. These
    internal maps are integrated into the normal map
    handling of libbpf such that when user traverses all
    obj maps, they can be differentiated from user-created
    ones via bpf_map__is_internal(). In later steps when
    we actually create these maps in the kernel via
    bpf_object__create_maps(), then for .data and .rodata
    sections their content is copied into the map through
    bpf_map_update_elem(). For .bss this is not necessary
    since array map is already zero-initialized by default.
    Additionally, for .rodata the map is frozen as read-only
    after setup, such that neither from program nor syscall
    side writes would be possible.

 3) In bpf_program__collect_reloc() step, we record the
    corresponding map, insn index, and relocation type for
    the global data.

 4) And last but not least in the actual relocation step in
    bpf_program__relocate(), we mark the ldimm64 instruction
    with src_reg = BPF_PSEUDO_MAP_VALUE where in the first
    imm field the map's file descriptor is stored as similarly
    done as in BPF_PSEUDO_MAP_FD, and in the second imm field
    (as ldimm64 is 2-insn wide) we store the access offset
    into the section. Given these maps have only single element
    ldimm64's off remains zero in both parts.

 5) On kernel side, this special marked BPF_PSEUDO_MAP_VALUE
    load will then store the actual target address in order
    to have a 'map-lookup'-free access. That is, the actual
    map value base address + offset. The destination register
    in the verifier will then be marked as PTR_TO_MAP_VALUE,
    containing the fixed offset as reg->off and backing BPF
    map as reg->map_ptr. Meaning, it's treated as any other
    normal map value from verification side, only with
    efficient, direct value access instead of actual call to
    map lookup helper as in the typical case.

Currently, only support for static global variables has been
added, and libbpf rejects non-static global variables from
loading. This can be lifted until we have proper semantics
for how BPF will treat multi-object BPF loads. From BTF side,
libbpf will set the value type id of the types corresponding
to the ".bss", ".data" and ".rodata" names which LLVM will
emit without the object name prefix. The key type will be
left as zero, thus making use of the key-less BTF option in
array maps.

Simple example dump of program using globals vars in each
section:

  # bpftool prog
  [...]
  6784: sched_cls  name load_static_dat  tag a7e1291567277844  gpl
        loaded_at 2019-03-11T15:39:34+0000  uid 0
        xlated 1776B  jited 993B  memlock 4096B  map_ids 2238,2237,2235,2236,2239,2240

  # bpftool map show id 2237
  2237: array  name test_glo.bss  flags 0x0
        key 4B  value 64B  max_entries 1  memlock 4096B
  # bpftool map show id 2235
  2235: array  name test_glo.data  flags 0x0
        key 4B  value 64B  max_entries 1  memlock 4096B
  # bpftool map show id 2236
  2236: array  name test_glo.rodata  flags 0x80
        key 4B  value 96B  max_entries 1  memlock 4096B

  # bpftool prog dump xlated id 6784
  int load_static_data(struct __sk_buff * skb):
  ; int load_static_data(struct __sk_buff *skb)
     0: (b7) r6 = 0
  ; test_reloc(number, 0, &num0);
     1: (63) *(u32 *)(r10 -4) = r6
     2: (bf) r2 = r10
  ; int load_static_data(struct __sk_buff *skb)
     3: (07) r2 += -4
  ; test_reloc(number, 0, &num0);
     4: (18) r1 = map[id:2238]
     6: (18) r3 = map[id:2237][0]+0    <-- direct addr in .bss area
     8: (b7) r4 = 0
     9: (85) call array_map_update_elem#100464
    10: (b7) r1 = 1
  ; test_reloc(number, 1, &num1);
  [...]
  ; test_reloc(string, 2, str2);
   120: (18) r8 = map[id:2237][0]+16   <-- same here at offset +16
   122: (18) r1 = map[id:2239]
   124: (18) r3 = map[id:2237][0]+16
   126: (b7) r4 = 0
   127: (85) call array_map_update_elem#100464
   128: (b7) r1 = 120
  ; str1[5] = 'x';
   129: (73) *(u8 *)(r9 +5) = r1
  ; test_reloc(string, 3, str1);
   130: (b7) r1 = 3
   131: (63) *(u32 *)(r10 -4) = r1
   132: (b7) r9 = 3
   133: (bf) r2 = r10
  ; int load_static_data(struct __sk_buff *skb)
   134: (07) r2 += -4
  ; test_reloc(string, 3, str1);
   135: (18) r1 = map[id:2239]
   137: (18) r3 = map[id:2235][0]+16   <-- direct addr in .data area
   139: (b7) r4 = 0
   140: (85) call array_map_update_elem#100464
   141: (b7) r1 = 111
  ; __builtin_memcpy(&str2[2], "hello", sizeof("hello"));
   142: (73) *(u8 *)(r8 +6) = r1       <-- further access based on .bss data
   143: (b7) r1 = 108
   144: (73) *(u8 *)(r8 +5) = r1
  [...]

For Cilium use-case in particular, this enables migrating configuration
constants from Cilium daemon's generated header defines into global
data sections such that expensive runtime recompilations with LLVM can
be avoided altogether. Instead, the ELF file becomes effectively a
"template", meaning, it is compiled only once (!) and the Cilium daemon
will then rewrite relevant configuration data from the ELF's .data or
.rodata sections directly instead of recompiling the program. The
updated ELF is then loaded into the kernel and atomically replaces
the existing program in the networking datapath. More info in [0].

Based upon recent fix in LLVM, commit c0db6b6bd444 ("[BPF] Don't fail
for static variables").

  [0] LPC 2018, BPF track, "ELF relocation for static data in BPF",
      http://vger.kernel.org/lpc-bpf2018.html#session-3

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-09 17:05:47 -07:00
Joe Stringer
f8c7a4d4dc bpf, libbpf: refactor relocation handling
Adjust the code for relocations slightly with no functional changes,
so that upcoming patches that will introduce support for relocations
into the .data, .rodata and .bss sections can be added independent
of these changes.

Signed-off-by: Joe Stringer <joe@wand.net.nz>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-09 17:05:47 -07:00
Andrey Ignatov
ff466b5805 libbpf: Ignore -Wformat-nonliteral warning
vsprintf() in __base_pr() uses nonliteral format string and it breaks
compilation for those who provide corresponding extra CFLAGS, e.g.:
https://github.com/libbpf/libbpf/issues/27

If libbpf is built with the flags from PR:

  libbpf.c:68:26: error: format string is not a string literal
  [-Werror,-Wformat-nonliteral]
          return vfprintf(stderr, format, args);
                                  ^~~~~~
  1 error generated.

Ignore this warning since the use case in libbpf.c is legit.

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-06 23:13:54 -07:00
Alexei Starovoitov
da11b41758 libbpf: teach libbpf about log_level bit 2
Allow bpf_prog_load_xattr() to specify log_level for program loading.

Teach libbpf to accept log_level with bit 2 set.

Increase default BPF_LOG_BUF_SIZE from 256k to 16M.
There is no downside to increase it to a maximum allowed by old kernels.
Existing 256k limit caused ENOSPC errors and users were not able to see
verifier error which is printed at the end of the verifier log.

If ENOSPC is hit, double the verifier log and try again to capture
the verifier error.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-04-04 01:27:38 +02:00