Commit Graph

109 Commits

Author SHA1 Message Date
David S. Miller
90fed9c946 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2018-05-24

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Björn Töpel cleans up AF_XDP (removes rebind, explicit cache alignment from uapi, etc).

2) David Ahern adds mtu checks to bpf_ipv{4,6}_fib_lookup() helpers.

3) Jesper Dangaard Brouer adds bulking support to ndo_xdp_xmit.

4) Jiong Wang adds support for indirect and arithmetic shifts to NFP

5) Martin KaFai Lau cleans up BTF uapi and makes the btf_header extensible.

6) Mathieu Xhonneux adds an End.BPF action to seg6local with BPF helpers allowing
   to edit/grow/shrink a SRH and apply on a packet generic SRv6 actions.

7) Sandipan Das adds support for bpf2bpf function calls in ppc64 JIT.

8) Yonghong Song adds BPF_TASK_FD_QUERY command for introspection of tracing events.

9) other misc fixes from Gustavo A. R. Silva, Sirio Balmelli, John Fastabend, and Magnus Karlsson
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24 22:20:51 -04:00
Yonghong Song
ecb96f7fe1 samples/bpf: add a samples/bpf test for BPF_TASK_FD_QUERY
This is mostly to test kprobe/uprobe which needs kernel headers.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-05-24 18:18:20 -07:00
David S. Miller
6f6e434aa2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
S390 bpf_jit.S is removed in net-next and had changes in 'net',
since that code isn't used any more take the removal.

TLS data structures split the TX and RX components in 'net-next',
put the new struct members from the bug fix in 'net' into the RX
part.

The 'net-next' tree had some reworking of how the ERSPAN code works in
the GRE tunneling code, overlapping with a one-line headroom
calculation fix in 'net'.

Overlapping changes in __sock_map_ctx_update_elem(), keep the bits
that read the prog members via READ_ONCE() into local variables
before using them.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-21 16:01:54 -04:00
Jakub Kicinski
768759edb9 samples: bpf: make the build less noisy
Building samples with clang ignores the $(Q) setting, always
printing full command to the output.  Make it less verbose.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-05-14 22:52:10 -07:00
Jakub Kicinski
0cc54db181 samples: bpf: move libbpf from object dependencies to libs
Make complains that it doesn't know how to make libbpf.a:

scripts/Makefile.host:106: target 'samples/bpf/../../tools/lib/bpf/libbpf.a' doesn't match the target pattern

Now that we have it as a dependency of the sources simply add libbpf.a
to libraries not objects.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-05-14 22:52:10 -07:00
Jakub Kicinski
787360f8c2 samples: bpf: fix build after move to compiling full libbpf.a
There are many ways users may compile samples, some of them got
broken by commit 5f9380572b ("samples: bpf: compile and link
against full libbpf").  Improve path resolution and make libbpf
building a dependency of source files to force its build.

Samples should now again build with any of:
 cd samples/bpf; make
 make samples/bpf/
 make -C samples/bpf
 cd samples/bpf; make O=builddir
 make samples/bpf/ O=builddir
 make -C samples/bpf O=builddir
 export KBUILD_OUTPUT=builddir
 make samples/bpf/
 make -C samples/bpf

Fixes: 5f9380572b ("samples: bpf: compile and link against full libbpf")
Reported-by: Björn Töpel <bjorn.topel@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-05-14 22:52:10 -07:00
Alexei Starovoitov
b1ae32dbab x86/cpufeature: Guard asm_volatile_goto usage for BPF compilation
Workaround for the sake of BPF compilation which utilizes kernel
headers, but clang does not support ASM GOTO and fails the build.

Fixes: d0266046ad ("x86: Remove FAST_FEATURE_TESTS")
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: daniel@iogearbox.net
Cc: peterz@infradead.org
Cc: netdev@vger.kernel.org
Cc: bp@alien8.de
Cc: yhs@fb.com
Cc: kernel-team@fb.com
Cc: torvalds@linux-foundation.org
Cc: davem@davemloft.net
Link: https://lkml.kernel.org/r/20180513193222.1997938-1-ast@kernel.org
2018-05-13 21:49:14 +02:00
Jakub Kicinski
be5bca44aa samples: bpf: convert some XDP samples from bpf_load to libbpf
Now that we can use full powers of libbpf in BPF samples, we
should perhaps make the simplest XDP programs not depend on
bpf_load helpers.  This way newcomers will be exposed to the
recommended library from the start.

Use of bpf_prog_load_xattr() will also make it trivial to later
on request offload of the programs by simply adding ifindex to
the xattr.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-11 01:44:17 +02:00
Jakub Kicinski
d0cabbb021 tools: bpf: move the event reading loop to libbpf
There are two copies of event reading loop - in bpftool and
trace_helpers "library".  Consolidate them and move the code
to libbpf.  Return codes from trace_helpers are kept, but
renamed to include LIBBPF prefix.

Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-11 01:40:52 +02:00
Jakub Kicinski
5f9380572b samples: bpf: compile and link against full libbpf
samples/bpf currently cherry-picks object files from tools/lib/bpf
to link against.  Just compile the full library and link statically
against it.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-11 01:40:52 +02:00
David Ahern
fe616055f7 samples/bpf: Add example of ipv4 and ipv6 forwarding in XDP
Simple example of fast-path forwarding. It has a serious flaw
in not verifying the egress device index supports XDP forwarding.
If the egress device does not packets are dropped.

Take this only as a simple example of fast-path forwarding.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-11 00:10:57 +02:00
Magnus Karlsson
b4b8faa1de samples/bpf: sample application and documentation for AF_XDP sockets
This is a sample application for AF_XDP sockets. The application
supports three different modes of operation: rxdrop, txonly and l2fwd.

To show-case a simple round-robin load-balancing between a set of
sockets in an xskmap, set the RR_LB compile time define option to 1 in
"xdpsock.h".

v2: The entries variable was calculated twice in {umem,xq}_nb_avail.

Co-authored-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-05-03 15:55:25 -07:00
Yonghong Song
28dbf861de samples/bpf: move common-purpose trace functions to selftests
There is no functionality change in this patch. The common-purpose
trace functions, including perf_event polling and ksym lookup,
are moved from trace_output_user.c and bpf_load.c to
selftests/bpf/trace_helpers.c so that these function can
be reused later in selftests.

Acked-by: Alexei Starovoitov <ast@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-04-29 08:45:54 -07:00
William Tu
b05cd74043 samples/bpf: remove the bpf tunnel testsuite.
Move the testsuite to
selftests/bpf/{test_tunnel_kern.c, test_tunnel.sh}

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-27 00:11:15 +02:00
Nikita V. Shirokov
c6ffd1ff78 bpf: add bpf_xdp_adjust_tail sample prog
adding bpf's sample program which is using bpf_xdp_adjust_tail helper
by generating ICMPv4 "packet to big" message if ingress packet's size is
bigger then 600 bytes

Signed-off-by: Nikita V. Shirokov <tehnerd@tehnerd.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-18 23:34:17 +02:00
Alexei Starovoitov
4662a4e538 samples/bpf: raw tracepoint test
add empty raw_tracepoint bpf program to test overhead similar
to kprobe and traditional tracepoint tests

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-03-28 22:55:19 +02:00
Leo Yan
c535077789 samples/bpf: Add program for CPU state statistics
CPU is active when have running tasks on it and CPUFreq governor can
select different operating points (OPP) according to different workload;
we use 'pstate' to present CPU state which have running tasks with one
specific OPP.  On the other hand, CPU is idle which only idle task on
it, CPUIdle governor can select one specific idle state to power off
hardware logics; we use 'cstate' to present CPU idle state.

Based on trace events 'cpu_idle' and 'cpu_frequency' we can accomplish
the duration statistics for every state.  Every time when CPU enters
into or exits from idle states, the trace event 'cpu_idle' is recorded;
trace event 'cpu_frequency' records the event for CPU OPP changing, so
it's easily to know how long time the CPU stays in the specified OPP,
and the CPU must be not in any idle state.

This patch is to utilize the mentioned trace events for pstate and
cstate statistics.  To achieve more accurate profiling data, the program
uses below sequence to insure CPU running/idle time aren't missed:

- Before profiling the user space program wakes up all CPUs for once, so
  can avoid to missing account time for CPU staying in idle state for
  long time; the program forces to set 'scaling_max_freq' to lowest
  frequency and then restore 'scaling_max_freq' to highest frequency,
  this can ensure the frequency to be set to lowest frequency and later
  after start to run workload the frequency can be easily to be changed
  to higher frequency;

- User space program reads map data and update statistics for every 5s,
  so this is same with other sample bpf programs for avoiding big
  overload introduced by bpf program self;

- When send signal to terminate program, the signal handler wakes up
  all CPUs, set lowest frequency and restore highest frequency to
  'scaling_max_freq'; this is exactly same with the first step so
  avoid to missing account CPU pstate and cstate time during last
  stage.  Finally it reports the latest statistics.

The program has been tested on Hikey board with octa CA53 CPUs, below
is one example for statistics result, the format mainly follows up
Jesper Dangaard Brouer suggestion.

Jesper reminds to 'get printf to pretty print with thousands separators
use %' and setlocale(LC_NUMERIC, "en_US")', tried three different arm64
GCC toolchains (5.4.0 20160609, 6.2.1 20161016, 6.3.0 20170516) but all
of them cannot support printf flag character %' on arm64 platform, so go
back print number without grouping mode.

CPU states statistics:
state(ms)  cstate-0    cstate-1    cstate-2    pstate-0    pstate-1    pstate-2    pstate-3    pstate-4
CPU-0      767         6111        111863      561         31          756         853         190
CPU-1      241         10606       107956      484         125         646         990         85
CPU-2      413         19721       98735       636         84          696         757         89
CPU-3      84          11711       79989       17516       909         4811        5773        341
CPU-4      152         19610       98229       444         53          649         708         1283
CPU-5      185         8781        108697      666         91          671         677         1365
CPU-6      157         21964       95825       581         67          566         684         1284
CPU-7      125         15238       102704      398         20          665         786         1197

Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-02-26 10:54:02 +01:00
Eric Leblond
bbf48c18ee libbpf: add error reporting in XDP
Parse netlink ext attribute to get the error message returned by
the card. Code is partially take from libnl.

We add netlink.h to the uapi include of tools. And we need to
avoid include of userspace netlink header to have a successful
build of sample so nlattr.h has a define to avoid
the inclusion. Using a direct define could have been an issue
as NLMSGERR_ATTR_MAX can change in the future.

We also define SOL_NETLINK if not defined to avoid to have to
copy socket.h for a fixed value.

Signed-off-by: Eric Leblond <eric@regit.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-02-02 17:53:48 -08:00
Mickaël Salaün
c25ef6a5e6 samples/bpf: Partially fixes the bpf.o build
Do not build lib/bpf/bpf.o with this Makefile but use the one from the
library directory.  This avoid making a buggy bpf.o file (e.g. missing
symbols).

This patch is useful if some code (e.g. Landlock tests) needs both the
bpf.o (from tools/lib/bpf) and the bpf_load.o (from samples/bpf).

Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-26 23:57:10 +01:00
Jesper Dangaard Brouer
36e04a2d78 samples/bpf: xdp2skb_meta shows transferring info from XDP to SKB
Creating a bpf sample that shows howto use the XDP 'data_meta'
infrastructure, created by Daniel Borkmann.  Very few drivers support
this feature, but I wanted a functional sample to begin with, when
working on adding driver support.

XDP data_meta is about creating a communication channel between BPF
programs.  This can be XDP tail-progs, but also other SKB based BPF
hooks, like in this case the TC clsact hook. In this sample I show
that XDP can store info named "mark", and TC/clsact chooses to use
this info and store it into the skb->mark.

It is a bit annoying that XDP and TC samples uses different tools/libs
when attaching their BPF hooks.  As the XDP and TC programs need to
cooperate and agree on a struct-layout, it is best/easiest if the two
programs can be contained within the same BPF restricted-C file.

As the bpf-loader, I choose to not use bpf_load.c (or libbpf), but
instead wrote a bash shell scripted named xdp2skb_meta.sh, which
demonstrate howto use the iproute cmdline tools 'tc' and 'ip' for
loading BPF programs.  To make it easy for first time users, the shell
script have command line parsing, and support --verbose and --dry-run
mode, if you just want to see/learn the tc+ip command syntax:

 # ./xdp2skb_meta.sh --dev ixgbe2 --dry-run
 # Dry-run mode: enable VERBOSE and don't call TC+IP
 tc qdisc del dev ixgbe2 clsact
 tc qdisc add dev ixgbe2 clsact
 tc filter add dev ixgbe2 ingress prio 1 handle 1 bpf da obj ./xdp2skb_meta_kern.o sec tc_mark
 # Flush XDP on device: ixgbe2
 ip link set dev ixgbe2 xdp off
 ip link set dev ixgbe2 xdp obj ./xdp2skb_meta_kern.o sec xdp_mark

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-11 01:02:25 +01:00
Jesper Dangaard Brouer
0fca931a6f samples/bpf: program demonstrating access to xdp_rxq_info
This sample program can be used for monitoring and reporting how many
packets per sec (pps) are received per NIC RX queue index and which
CPU processed the packet. In itself it is a useful tool for quickly
identifying RSS imbalance issues, see below.

The default XDP action is XDP_PASS in-order to provide a monitor
mode. For benchmarking purposes it is possible to specify other XDP
actions on the cmdline --action.

Output below shows an imbalance RSS case where most RXQ's deliver to
CPU-0 while CPU-2 only get packets from a single RXQ.  Looking at
things from a CPU level the two CPUs are processing approx the same
amount, BUT looking at the rx_queue_index levels it is clear that
RXQ-2 receive much better service, than other RXQs which all share CPU-0.

Running XDP on dev:i40e1 (ifindex:3) action:XDP_PASS
XDP stats       CPU     pps         issue-pps
XDP-RX CPU      0       900,473     0
XDP-RX CPU      2       906,921     0
XDP-RX CPU      total   1,807,395

RXQ stats       RXQ:CPU pps         issue-pps
rx_queue_index    0:0   180,098     0
rx_queue_index    0:sum 180,098
rx_queue_index    1:0   180,098     0
rx_queue_index    1:sum 180,098
rx_queue_index    2:2   906,921     0
rx_queue_index    2:sum 906,921
rx_queue_index    3:0   180,098     0
rx_queue_index    3:sum 180,098
rx_queue_index    4:0   180,082     0
rx_queue_index    4:sum 180,082
rx_queue_index    5:0   180,093     0
rx_queue_index    5:sum 180,093

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-05 15:21:22 -08:00
Josef Bacik
965de87e54 samples/bpf: add a test for bpf_override_return
This adds a basic test for bpf_override_return to verify it works.  We
override the main function for mounting a btrfs fs so it'll return
-ENOMEM and then make sure that trying to mount a btrfs fs will fail.

Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2017-12-12 09:02:40 -08:00
Masahiro Yamada
bf070bb0e6 kbuild: remove all dummy assignments to obj-
Now kbuild core scripts create empty built-in.o where necessary.
Remove "obj- := dummy.o" tricks.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2017-11-18 11:46:06 +09:00
David S. Miller
f3edacbd69 bpf: Revert bpf_overrid_function() helper changes.
NACK'd by x86 maintainer.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-11 18:24:55 +09:00
Josef Bacik
eafb3401fa samples/bpf: add a test for bpf_override_return
This adds a basic test for bpf_override_return to verify it works.  We
override the main function for mounting a btrfs fs so it'll return
-ENOMEM and then make sure that trying to mount a btrfs fs will fail.

Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-11 12:18:06 +09:00