kernel

mirror of https://github.com/ukui/kernel.git synced 2026-03-09 10:07:04 -07:00

Author	SHA1	Message	Date
Ilya Leoshkevich	fec47bbc10	selftests/bpf: Fix endianness issue in test_sockopt_sk getsetsockopt() calls getsockopt() with optlen == 1, but then checks the resulting int. It is ok on little endian, but not on big endian. Fix by checking char instead. Fixes: `8a027dc0d8` ("selftests/bpf: add sockopt test that exercises sk helpers") Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200915113928.3768496-1-iii@linux.ibm.com	2020-09-19 01:01:18 +02:00
Ilya Leoshkevich	b6ed6cf4a3	selftests/bpf: Fix endianness issue in sk_assign server_map's value size is 8, but the test tries to put an int there. This sort of works on x86 (unless followed by non-0), but hard fails on s390. Fix by using __s64 instead of int. Fixes: `2d7824ffd2` ("selftests: bpf: Add test for sk_assign") Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200915113815.3768217-1-iii@linux.ibm.com	2020-09-18 22:54:52 +02:00
Maciej Fijalkowski	3b03791111	selftests/bpf: Add tailcall_bpf2bpf tests Add four tests to tailcalls selftest explicitly named "tailcall_bpf2bpf_X" as their purpose is to validate that combination of tailcalls with bpf2bpf calls are working properly. These tests also validate LD_ABS from subprograms. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2020-09-17 19:56:07 -07:00
Alexei Starovoitov	09b28d76ea	bpf: Add abnormal return checks. LD_[ABS\|IND] instructions may return from the function early. bpf_tail_call pseudo instruction is either fallthrough or return. Allow them in the subprograms only when subprograms are BTF annotated and have scalar return types. Allow ld_abs and tail_call in the main program even if it calls into subprograms. In the past that was not ok to do for ld_abs, since it was JITed with special exit sequence. Since bpf_gen_ld_abs() was introduced the ld_abs looks like normal exit insn from JIT point of view, so it's safe to allow them in the main program. Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2020-09-17 19:56:07 -07:00
Maciej Fijalkowski	e411901c0b	bpf: allow for tailcalls in BPF subprograms for x64 JIT Relax verifier's restriction that was meant to forbid tailcall usage when subprog count was higher than 1. Also, do not max out the stack depth of program that utilizes tailcalls. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2020-09-17 19:56:06 -07:00
Maciej Fijalkowski	ebf7d1f508	bpf, x64: rework pro/epilogue and tailcall handling in JIT This commit serves two things: 1) it optimizes BPF prologue/epilogue generation 2) it makes it possible to have tailcalls within BPF subprogram Both points are related to each other since without 1), 2) could not be achieved. In [1], Alexei says: "The prologue will look like: nop5 xor eax,eax // two new bytes if bpf_tail_call() is used in this // function push rbp mov rbp, rsp sub rsp, rounded_stack_depth push rax // zero init tail_call counter variable number of push rbx,r13,r14,r15 Then bpf_tail_call will pop variable number rbx,.. and final 'pop rax' Then 'add rsp, size_of_current_stack_frame' jmp to next function and skip over 'nop5; xor eax,eax; push rbp; mov rbp, rsp' This way new function will set its own stack size and will init tail call counter with whatever value the parent had. If next function doesn't use bpf_tail_call it won't have 'xor eax,eax'. Instead it would need to have 'nop2' in there." Implement that suggestion. Since the layout of stack is changed, tail call counter handling can not rely anymore on popping it to rbx just like it has been handled for constant prologue case and later overwrite of rbx with actual value of rbx pushed to stack. Therefore, let's use one of the registers (%rcx) that is considered to be volatile/caller-saved and pop the value of tail call counter in there in the epilogue. Drop the BUILD_BUG_ON in emit_prologue and in emit_bpf_tail_call_indirect where instruction layout is not constant anymore. Introduce new poke target, 'tailcall_bypass' to poke descriptor that is dedicated for skipping the register pops and stack unwind that are generated right before the actual jump to target program. For case when the target program is not present, BPF program will skip the pop instructions and nop5 dedicated for jmpq $target. 
An example of such state when only R6 of callee saved registers is used by program: ffffffffc0513aa1: e9 0e 00 00 00 jmpq 0xffffffffc0513ab4 ffffffffc0513aa6: 5b pop %rbx ffffffffc0513aa7: 58 pop %rax ffffffffc0513aa8: 48 81 c4 00 00 00 00 add $0x0,%rsp ffffffffc0513aaf: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1) ffffffffc0513ab4: 48 89 df mov %rbx,%rdi When target program is inserted, the jump that was there to skip pops/nop5 will become the nop5, so CPU will go over pops and do the actual tailcall. One might ask why there simply can not be pushes after the nop5? In the following example snippet: ffffffffc037030c: 48 89 fb mov %rdi,%rbx (...) ffffffffc0370332: 5b pop %rbx ffffffffc0370333: 58 pop %rax ffffffffc0370334: 48 81 c4 00 00 00 00 add $0x0,%rsp ffffffffc037033b: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1) ffffffffc0370340: 48 81 ec 00 00 00 00 sub $0x0,%rsp ffffffffc0370347: 50 push %rax ffffffffc0370348: 53 push %rbx ffffffffc0370349: 48 89 df mov %rbx,%rdi ffffffffc037034c: e8 f7 21 00 00 callq 0xffffffffc0372548 There is the bpf2bpf call (at ffffffffc037034c) right after the tailcall and jump target is not present. ctx is in %rbx register and BPF subprogram that we will call into on ffffffffc037034c is relying on it, e.g. it will pick ctx from there. Such code layout is therefore broken as we would overwrite the content of %rbx with the value that was pushed on the prologue. That is the reason for the 'bypass' approach. Special care needs to be taken during the install/update/remove of tailcall target. In case when target program is not present, the CPU must not execute the pop instructions that precede the tailcall. To address that, the following states can be defined: A nop, unwind, nop B nop, unwind, tail C skip, unwind, nop D skip, unwind, tail A is forbidden (lead to incorrectness). 
The state transitions between tailcall install/update/remove will work as follows: First install tail call f: C->D->B(f) * poke the tailcall, after that get rid of the skip Update tail call f to f': B(f)->B(f') * poke the tailcall (poke->tailcall_target) and do NOT touch the poke->tailcall_bypass Remove tail call: B(f')->C(f') * poke->tailcall_bypass is poked back to jump, then we wait for the RCU grace period so that other programs finish their execution and after that we are safe to remove the poke->tailcall_target Install new tail call (f''): C(f')->D(f'')->B(f''). * same as first step This way CPU can never be exposed to "unwind, tail" state. Last but not least, when tailcalls get mixed with bpf2bpf calls, it would be possible to encounter an endless loop due to clearing the tailcall counter if for example we would use the tailcall3-like from BPF selftests program that would be subprogram-based, meaning the tailcall would be present within the BPF subprogram. This test, broken down to particular steps, would do: entry -> set tailcall counter to 0, bump it by 1, tailcall to func0 func0 -> call subprog_tail (we are NOT skipping the first 11 bytes of prologue and this subprogram has a tailcall, therefore we clear the counter...) subprog -> do the same thing as entry and then loop forever. To address this, the idea is to go through the call chain of bpf2bpf progs and look for a tailcall presence throughout whole chain. If we saw a single tail call then each node in this call chain needs to be marked as a subprog that can reach the tailcall. 
We would later feed the JIT with this info and: - set eax to 0 only when tailcall is reachable and this is the entry prog - if tailcall is reachable but there's no tailcall in insns of currently JITed prog then push rax anyway, so that it will be possible to propagate further down the call chain - finally if tailcall is reachable, then we need to precede the 'call' insn with mov rax, [rbp - (stack_depth + 8)] Tail call related cases from test_verifier kselftest are also working fine. Sample BPF programs that utilize tail calls (sockex3, tracex5) work properly as well. [1]: https://lore.kernel.org/bpf/20200517043227.2gpq22ifoq37ogst@ast-mbp.dhcp.thefacebook.com/ Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2020-09-17 19:55:30 -07:00
Maciej Fijalkowski	7f6e4312e1	bpf: Limit caller's stack depth 256 for subprogs with tailcalls Protect against potential stack overflow that might happen when bpf2bpf calls get combined with tailcalls. Limit the caller's stack depth for such case down to 256 so that the worst case scenario would result in 8k stack size (32 which is tailcall limit * 256 = 8k). Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2020-09-17 19:19:20 -07:00
Maciej Fijalkowski	cf71b174d3	bpf: rename poke descriptor's 'ip' member to 'tailcall_target' Reflect the actual purpose of poke->ip and rename it to poke->tailcall_target so that it will not be confused with another poke target that will be introduced in next commit. While at it, do the same thing with poke->ip_stable - rename it to poke->tailcall_target_stable. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2020-09-17 12:59:31 -07:00
Maciej Fijalkowski	a748c6975d	bpf: propagate poke descriptors to subprograms Previously, there was no need for poke descriptors being present in subprogram's bpf_prog_aux struct since tailcalls were simply not allowed in them. Each subprog is JITed independently so in order to enable JITing subprograms that use tailcalls, do the following: - in fixup_bpf_calls() store the index of tailcall insn onto the generated poke descriptor, - in case when insn patching occurs, adjust the tailcall insn idx from bpf_patch_insn_data, - then in jit_subprogs() check whether the given poke descriptor belongs to the current subprog by checking if that previously stored absolute index of tail call insn is in the scope of the insns of given subprog, - update the insn->imm with new poke descriptor slot so that while JITing the proper poke descriptor will be grabbed This way each of the main program's poke descriptors are distributed across the subprograms poke descriptor array, so main program's descriptors can be untracked out of the prog array map. Add also subprog's aux struct to the BPF map poke_progs list by calling on it map_poke_track(). In case of any error, call the map_poke_untrack() on subprog's aux structs that have already been registered to prog array map. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2020-09-17 12:59:31 -07:00
Maciej Fijalkowski	0d4ddce300	bpf, x64: use %rcx instead of %rax for tail call retpolines Currently, %rax is used to store the jump target when BPF program is emitting the retpoline instructions that are handling the indirect tailcall. There is a plan to use %rax for different purpose, which is storing the tail call counter. In order to preserve this value across the tailcalls, adjust the BPF indirect tailcalls so that the target program will reside in %rcx and teach the retpoline instructions about new location of jump target. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2020-09-17 12:59:31 -07:00
Andrii Nakryiko	c64779e24e	selftests/bpf: Merge most of test_btf into test_progs Merge 183 tests from test_btf into test_progs framework to be exercised regularly. All the test_btf tests that were moved are modeled as proper sub-tests in test_progs framework for ease of debugging and reporting. No functional or behavioral changes were intended, I tried to preserve original behavior as much as possible. E.g., `test_progs -v` will activate "always_log" flag to emit BTF validation log. The only difference is in reducing the max_entries limit for pretty-printing tests from (128 * 1024) to just 128 to reduce tests running time without reducing the coverage. Example test run: $ sudo ./test_progs -n 8 ... #8 btf:OK Summary: 1/183 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200916004819.3767489-1-andriin@fb.com	2020-09-15 18:37:33 -07:00
Alexei Starovoitov	ffa915f461	Merge branch 'bpf_metadata' Stanislav Fomichev says: ==================== Currently, if a user wants to store arbitrary metadata for an eBPF program, for example, the program build commit hash or version, they could store it in a map, and conveniently libbpf uses .data section to populate an internal map. However, if the program does not actually reference the map, then the map would be de-refcounted and freed. This patch set introduces a new syscall BPF_PROG_BIND_MAP to add a map to a program's used_maps, even if the program instructions do not reference the map. libbpf is extended to always BPF_PROG_BIND_MAP .rodata section so the metadata is kept in place. bpftool is also extended to print metadata in the 'bpftool prog' list. The variable is considered metadata if it starts with the magic 'bpf_metadata_' prefix; everything after the prefix is the metadata name. An example use of this would be BPF C file declaring: volatile const char bpf_metadata_commit_hash[] SEC(".rodata") = "abcdef123456"; and bpftool would emit: $ bpftool prog [...] 
metadata: commit_hash = "abcdef123456" v6 changes: * libbpf: drop FEAT_GLOBAL_DATA from probe_prog_bind_map (Andrii Nakryiko) * bpftool: combine find_metadata_map_id & find_metadata; drops extra bpf_map_get_fd_by_id and bpf_map_get_fd_by_id (Andrii Nakryiko) * bpftool: use strncmp instead of strstr (Andrii Nakryiko) * bpftool: memset(map_info) and extra empty line (Andrii Nakryiko) v5 changes: * selftest: verify that prog holds rodata (Andrii Nakryiko) * selftest: use volatile for metadata (Andrii Nakryiko) * bpftool: use sizeof in BPF_METADATA_PREFIX_LEN (Andrii Nakryiko) * bpftool: new find_metadata that does map lookup (Andrii Nakryiko) * libbpf: don't generalize probe_create_global_data (Andrii Nakryiko) * libbpf: use OPTS_VALID in bpf_prog_bind_map (Andrii Nakryiko) * libbpf: keep LIBBPF_0.2.0 sorted (Andrii Nakryiko) v4 changes: * Don't return EEXIST from syscall if already bound (Andrii Nakryiko) * Removed --metadata argument (Andrii Nakryiko) * Removed custom .metadata section (Alexei Starovoitov) * Addressed Andrii's suggestions about btf helpers and vsi (Andrii Nakryiko) * Moved bpf_prog_find_metadata into bpftool (Alexei Starovoitov) v3 changes: * API changes for bpf_prog_find_metadata (Toke Høiland-Jørgensen) v2 changes: * Made struct bpf_prog_bind_opts in libbpf so flags is optional. * Deduped probe_kern_global_data and probe_prog_bind_map into a common helper. * Added comment regarding why EEXIST is ignored in libbpf bind map. * Froze all LIBBPF_MAP_METADATA internal maps. * Moved bpf_prog_bind_map into new LIBBPF_0.1.1 in libbpf.map. * Added p_err() calls on error cases in bpftool show_prog_metadata. * Reverse christmas tree coding style in bpftool show_prog_metadata. * Made bpftool gen skeleton recognize .metadata as an internal map and generate datasec definition in skeleton. * Added C test using skeleton to assert that the metadata is what we expect and rebinding causes EEXIST. 
v1 changes: * Fixed a few missing unlocks, and missing close while iterating map fds. * Move mutex initialization to right after prog aux allocation, and mutex destroy to right after prog aux free. * s/ADD_MAP/BIND_MAP/ * Use mutex only instead of RCU to protect the used_map array & count. Cc: YiFei Zhu <zhuyifei1999@gmail.com> ==================== Acked-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2020-09-15 18:28:41 -07:00
YiFei Zhu	d42d1cc44d	selftests/bpf: Test load and dump metadata with bpftool and skel This is a simple test to check that loading and dumping metadata in bpftool works, whether or not metadata contents are used by the program. A C test is also added to make sure the skeleton code can read the metadata values. Signed-off-by: YiFei Zhu <zhuyifei@google.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Cc: YiFei Zhu <zhuyifei1999@gmail.com> Link: https://lore.kernel.org/bpf/20200915234543.3220146-6-sdf@google.com	2020-09-15 18:28:27 -07:00
YiFei Zhu	aff52e685e	bpftool: Support dumping metadata Dump metadata in the 'bpftool prog' list if it's present. For some formatting some BTF code is put directly in the metadata dumping. Sanity checks on the map and the kind of the btf_type to make sure we are actually dumping what we are expecting. A helper jsonw_reset is added to json writer so we can reuse the same json writer without having extraneous commas. Sample output: $ bpftool prog 6: cgroup_skb name prog tag bcf7977d3b93787c gpl [...] btf_id 4 metadata: a = "foo" b = 1 $ bpftool prog --json --pretty [{ "id": 6, [...] "btf_id": 4, "metadata": { "a": "foo", "b": 1 } } ] Signed-off-by: YiFei Zhu <zhuyifei@google.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Cc: YiFei Zhu <zhuyifei1999@gmail.com> Link: https://lore.kernel.org/bpf/20200915234543.3220146-5-sdf@google.com	2020-09-15 18:28:27 -07:00
YiFei Zhu	5d23328dcc	libbpf: Add BPF_PROG_BIND_MAP syscall and use it on .rodata section The patch adds a simple wrapper bpf_prog_bind_map around the syscall. When the libbpf tries to load a program, it will probe the kernel for the support of this syscall and unconditionally bind .rodata section to the program. Signed-off-by: YiFei Zhu <zhuyifei@google.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Cc: YiFei Zhu <zhuyifei1999@gmail.com> Link: https://lore.kernel.org/bpf/20200915234543.3220146-4-sdf@google.com	2020-09-15 18:28:27 -07:00
YiFei Zhu	ef15314aa5	bpf: Add BPF_PROG_BIND_MAP syscall This syscall binds a map to a program. Returns success if the map is already bound to the program. Signed-off-by: YiFei Zhu <zhuyifei@google.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Cc: YiFei Zhu <zhuyifei1999@gmail.com> Link: https://lore.kernel.org/bpf/20200915234543.3220146-3-sdf@google.com	2020-09-15 18:28:27 -07:00
YiFei Zhu	984fe94f94	bpf: Mutex protect used_maps array and count To support modifying the used_maps array, we use a mutex to protect the use of the counter and the array. The mutex is initialized right after the prog aux is allocated, and destroyed right before prog aux is freed. This way we guarantee it's initialized for both cBPF and eBPF. Signed-off-by: YiFei Zhu <zhuyifei@google.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Cc: YiFei Zhu <zhuyifei1999@gmail.com> Link: https://lore.kernel.org/bpf/20200915234543.3220146-2-sdf@google.com	2020-09-15 18:28:27 -07:00
Yonghong Song	d317b0a8ac	libbpf: Fix a compilation error with xsk.c for ubuntu 16.04 When syncing latest libbpf repo to bcc, ubuntu 16.04 (4.4.0 LTS kernel) failed compilation for xsk.c: In file included from /tmp/debuild.0jkauG/bcc/src/cc/libbpf/src/xsk.c:23:0: /tmp/debuild.0jkauG/bcc/src/cc/libbpf/src/xsk.c: In function ‘xsk_get_ctx’: /tmp/debuild.0jkauG/bcc/src/cc/libbpf/include/linux/list.h:81:9: warning: implicit declaration of function ‘container_of’ [-Wimplicit-function-declaration] container_of(ptr, type, member) ^ /tmp/debuild.0jkauG/bcc/src/cc/libbpf/include/linux/list.h:83:9: note: in expansion of macro ‘list_entry’ list_entry((ptr)->next, type, member) ... src/cc/CMakeFiles/bpf-static.dir/build.make:209: recipe for target 'src/cc/CMakeFiles/bpf-static.dir/libbpf/src/xsk.c.o' failed Commit `2f6324a393` ("libbpf: Support shared umems between queues and devices") added include file <linux/list.h>, which uses macro "container_of". xsk.c file also includes <linux/ethtool.h> before <linux/list.h>. In a more recent distro kernel, <linux/ethtool.h> includes <linux/kernel.h> which contains the macro definition for "container_of". So compilation is all fine. But in ubuntu 16.04 kernel, <linux/ethtool.h> does not include <linux/kernel.h>, which caused the above compilation error. Let's explicitly add <linux/kernel.h> in xsk.c to avoid compilation error in old distros. Fixes: `2f6324a393` ("libbpf: Support shared umems between queues and devices") Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20200914223210.1831262-1-yhs@fb.com	2020-09-14 18:52:46 -07:00
Yonghong Song	63bea244fe	bpftool: Fix build failure When building bpf selftests like make -C tools/testing/selftests/bpf -j20 I hit the following errors: ... GEN /net-next/tools/testing/selftests/bpf/tools/build/bpftool/Documentation/bpftool-gen.8 <stdin>:75: (WARNING/2) Block quote ends without a blank line; unexpected unindent. <stdin>:71: (WARNING/2) Literal block ends without a blank line; unexpected unindent. <stdin>:85: (WARNING/2) Literal block ends without a blank line; unexpected unindent. <stdin>:57: (WARNING/2) Block quote ends without a blank line; unexpected unindent. <stdin>:66: (WARNING/2) Literal block ends without a blank line; unexpected unindent. <stdin>:109: (WARNING/2) Literal block ends without a blank line; unexpected unindent. <stdin>:175: (WARNING/2) Literal block ends without a blank line; unexpected unindent. <stdin>:273: (WARNING/2) Literal block ends without a blank line; unexpected unindent. make[1]: * [/net-next/tools/testing/selftests/bpf/tools/build/bpftool/Documentation/bpftool-perf.8] Error 12 make[1]: * Waiting for unfinished jobs.... make[1]: * [/net-next/tools/testing/selftests/bpf/tools/build/bpftool/Documentation/bpftool-iter.8] Error 12 make[1]: * [/net-next/tools/testing/selftests/bpf/tools/build/bpftool/Documentation/bpftool-struct_ops.8] Error 12 ... I am using: -bash-4.4$ rst2man --version rst2man (Docutils 0.11 [repository], Python 2.7.5, on linux2) -bash-4.4$ The Makefile generated final .rst file (e.g., bpftool-cgroup.rst) looks like ... ID AttachType AttachFlags Name \n SEE ALSO\n========\n\tbpf\ (2),\n\tbpf-helpers\ (7),\n\tbpftool\ (8),\n\tbpftool-btf\ (8),\n\tbpftool-feature\ (8),\n\tbpftool-gen\ (8),\n\tbpftool-iter\ (8),\n\tbpftool-link\ (8),\n\tbpftool-map\ (8),\n\tbpftool-net\ (8),\n\tbpftool-perf\ (8),\n\tbpftool-prog\ (8),\n\tbpftool-struct_ops\ (8)\n The rst2man generated .8 file looks like Literal block ends without a blank line; unexpected unindent. 
.sp n SEEALSOn========ntbpf(2),ntbpf\-helpers(7),ntbpftool(8),ntbpftool\-btf(8),nt bpftool\-feature(8),ntbpftool\-gen(8),ntbpftool\-iter(8),ntbpftool\-link(8),nt bpftool\-map(8),ntbpftool\-net(8),ntbpftool\-perf(8),ntbpftool\-prog(8),nt bpftool\-struct_ops(8)n Looks like that particular version of rst2man prefers to have actual new line instead of \n. Since `echo -e` may not be available in some environment, let us use `printf`. Format string "%b" is used for `printf` to ensure all escape characters are interpreted properly. Fixes: `18841da981` ("tools: bpftool: Automate generation for "SEE ALSO" sections in man pages") Suggested-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Cc: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20200914183110.999906-1-yhs@fb.com	2020-09-14 18:47:57 -07:00
Magnus Karlsson	bf74a370eb	xsk: Fix refcount warning in xp_dma_map Fix a potential refcount warning that a zero value is increased to one in xp_dma_map, by initializing the refcount to one to start with, instead of zero plus a refcount_inc(). Fixes: `921b68692a` ("xsk: Enable sharing of dma mappings") Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/1600095036-23868-1-git-send-email-magnus.karlsson@gmail.com	2020-09-14 18:43:25 -07:00
Magnus Karlsson	74e00676d7	samples/bpf: Add quiet option to xdpsock Add a quiet option (-Q) that disables the statistics print outs of xdpsock. This is good to have when measuring 0% loss rate performance as it will be quite terrible if the application uses printfs. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/1599726666-8431-4-git-send-email-magnus.karlsson@gmail.com	2020-09-14 18:38:11 -07:00
Magnus Karlsson	5a2a0dd88f	samples/bpf: Fix possible deadlock in xdpsock Fix a possible deadlock in the l2fwd application in xdpsock that can occur when there is no space in the Tx ring. There are two ways to get the kernel to consume entries in the Tx ring: calling sendto() to make it send packets and freeing entries from the completion ring, as the kernel will not send a packet if there is no space for it to add a completion entry in the completion ring. The Tx loop in l2fwd only used to call sendto(). This patch adds cleaning the completion ring in that loop. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/1599726666-8431-3-git-send-email-magnus.karlsson@gmail.com	2020-09-14 18:38:11 -07:00
Magnus Karlsson	3131cf66d3	samples/bpf: Fix one packet sending in xdpsock Fix the sending of a single packet (or small burst) in xdpsock when executing in copy mode. Currently, the l2fwd application in xdpsock only transmits the packets after a batch of them has been received, which might be confusing if you only send one packet and expect that it is returned pronto. Fix this by calling sendto() more often and add a comment in the code that states that this can be optimized if needed. Reported-by: Tirthendu Sarkar <tirthendu.sarkar@intel.com> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/1599726666-8431-2-git-send-email-magnus.karlsson@gmail.com	2020-09-14 18:38:11 -07:00
Ilya Leoshkevich	d72714c1da	s390/bpf: Fix multiple tail calls In order to branch around tail calls (due to out-of-bounds index, exceeding tail call count or missing tail call target), JIT uses label[0] field, which contains the address of the instruction following the tail call. When there are multiple tail calls, label[0] value comes from handling of a previous tail call, which is incorrect. Fix by getting rid of label array and resolving the label address locally: for all 3 branches that jump to it, emit 0 offsets at the beginning, and then backpatch them with the correct value. Also, do not use the long jump infrastructure: the tail call sequence is known to be short, so make all 3 jumps short. Fixes: `6651ee070b` ("s390/bpf: implement bpf_tail_call() helper") Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200909232141.3099367-1-iii@linux.ibm.com	2020-09-14 18:21:31 -07:00
Alexei Starovoitov	2bab48c5be	Merge branch 'improve-bpf-tcp-cc-init' Neal Cardwell says: ==================== This patch series reorganizes TCP congestion control initialization so that if EBPF code called by tcp_init_transfer() sets the congestion control algorithm by calling setsockopt(TCP_CONGESTION) then the TCP stack initializes the congestion control module immediately, instead of having tcp_init_transfer() later initialize the congestion control module. This increases flexibility for the EBPF code that runs at connection establishment time, and simplifies the code. This has the following benefits: (1) This allows CC module customizations made by the EBPF called in tcp_init_transfer() to persist, and not be wiped out by a later call to tcp_init_congestion_control() in tcp_init_transfer(). (2) Does not flip the order of EBPF and CC init, to avoid causing bugs for existing code upstream that depends on the current order. (3) Does not cause 2 initializations for CC in the case where the EBPF called in tcp_init_transfer() wants to set the CC to a new CC algorithm. (4) Allows follow-on simplifications to the code in net/core/filter.c and net/ipv4/tcp_cong.c, which currently both have some complexity to special-case CC initialization to avoid double CC initialization if EBPF sets the CC. changes in v2: o rebase onto bpf-next o add another follow-on simplification suggested by Martin KaFai Lau: "tcp: simplify tcp_set_congestion_control() load=false case" changes in v3: o no change in commits o resent patch series from @gmail.com, since mail from ncardwell@google.com stopped being accepted at netdev@vger.kernel.org mid-way through processing the v2 patch series (between patches 2 and 3), confusing patchwork about which patches belonged to the v2 patch series ==================== Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2020-09-10 20:53:15 -07:00


949785 Commits