Commit Graph

7288 Commits

Author SHA1 Message Date
Christophe JAILLET
7c63f26cb5 lib: objagg: Use the bitmap API when applicable
Use 'bitmap_zalloc()' to simplify code, improve the semantic and reduce
some open-coded arithmetic in allocator arguments.

Also change the corresponding 'kfree()' into 'bitmap_free()' to keep
consistency.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/f9541b085ec68e573004e1be200c11c9c901181a.1640295165.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-24 14:54:29 -08:00
Jakub Kicinski
be3158290d Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Andrii Nakryiko says:

====================
bpf-next 2021-12-10 v2

We've added 115 non-merge commits during the last 26 day(s) which contain
a total of 182 files changed, 5747 insertions(+), 2564 deletions(-).

The main changes are:

1) Various samples fixes, from Alexander Lobakin.

2) BPF CO-RE support in kernel and light skeleton, from Alexei Starovoitov.

3) A batch of new unified APIs for libbpf, logging improvements, version
   querying, etc. Also a batch of old deprecations for old APIs and various
   bug fixes, in preparation for libbpf 1.0, from Andrii Nakryiko.

4) BPF documentation reorganization and improvements, from Christoph Hellwig
   and Dave Tucker.

5) Support for declarative initialization of BPF_MAP_TYPE_PROG_ARRAY in
   libbpf, from Hengqi Chen.

6) Verifier log fixes, from Hou Tao.

7) Runtime-bounded loops support with bpf_loop() helper, from Joanne Koong.

8) Extend branch record capturing to all platforms that support it,
   from Kajol Jain.

9) Light skeleton codegen improvements, from Kumar Kartikeya Dwivedi.

10) bpftool doc-generating script improvements, from Quentin Monnet.

11) Two libbpf v0.6 bug fixes, from Shuyi Cheng and Vincent Minet.

12) Deprecation warning fix for perf/bpf_counter, from Song Liu.

13) MAX_TAIL_CALL_CNT unification and MIPS build fix for libbpf,
    from Tiezhu Yang.

14) BTF_KING_TYPE_TAG follow-up fixes, from Yonghong Song.

15) Selftests fixes and improvements, from Ilya Leoshkevich, Jean-Philippe
    Brucker, Jiri Olsa, Maxim Mikityanskiy, Tirthendu Sarkar, Yucong Sun,
    and others.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (115 commits)
  libbpf: Add "bool skipped" to struct bpf_map
  libbpf: Fix typo in btf__dedup@LIBBPF_0.0.2 definition
  bpftool: Switch bpf_object__load_xattr() to bpf_object__load()
  selftests/bpf: Remove the only use of deprecated bpf_object__load_xattr()
  selftests/bpf: Add test for libbpf's custom log_buf behavior
  selftests/bpf: Replace all uses of bpf_load_btf() with bpf_btf_load()
  libbpf: Deprecate bpf_object__load_xattr()
  libbpf: Add per-program log buffer setter and getter
  libbpf: Preserve kernel error code and remove kprobe prog type guessing
  libbpf: Improve logging around BPF program loading
  libbpf: Allow passing user log setting through bpf_object_open_opts
  libbpf: Allow passing preallocated log_buf when loading BTF into kernel
  libbpf: Add OPTS-based bpf_btf_load() API
  libbpf: Fix bpf_prog_load() log_buf logic for log_level 0
  samples/bpf: Remove unneeded variable
  bpf: Remove redundant assignment to pointer t
  selftests/bpf: Fix a compilation warning
  perf/bpf_counter: Use bpf_map_create instead of bpf_create_map
  samples: bpf: Fix 'unknown warning group' build warning on Clang
  samples: bpf: Fix xdp_sample_user.o linking with Clang
  ...
====================

Link: https://lore.kernel.org/r/20211210234746.2100561-1-andrii@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-10 15:56:13 -08:00
Jakub Kicinski
3150a73366 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-09 13:23:02 -08:00
Jakub Kicinski
6efcdadc15 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
bpf 2021-12-08

We've added 12 non-merge commits during the last 22 day(s) which contain
a total of 29 files changed, 659 insertions(+), 80 deletions(-).

The main changes are:

1) Fix an off-by-two error in packet range markings and also add a batch of
   new tests for coverage of these corner cases, from Maxim Mikityanskiy.

2) Fix a compilation issue on MIPS JIT for R10000 CPUs, from Johan Almbladh.

3) Fix two functional regressions and a build warning related to BTF kfunc
   for modules, from Kumar Kartikeya Dwivedi.

4) Fix outdated code and docs regarding BPF's migrate_disable() use on non-
   PREEMPT_RT kernels, from Sebastian Andrzej Siewior.

5) Add missing includes in order to be able to detangle cgroup vs bpf header
   dependencies, from Jakub Kicinski.

6) Fix regression in BPF sockmap tests caused by missing detachment of progs
   from sockets when they are removed from the map, from John Fastabend.

7) Fix a missing "no previous prototype" warning in x86 JIT caused by BPF
   dispatcher, from Björn Töpel.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf: Add selftests to cover packet access corner cases
  bpf: Fix the off-by-two error in range markings
  treewide: Add missing includes masked by cgroup -> bpf dependency
  tools/resolve_btfids: Skip unresolved symbol warning for empty BTF sets
  bpf: Fix bpf_check_mod_kfunc_call for built-in modules
  bpf: Make CONFIG_DEBUG_INFO_BTF depend upon CONFIG_BPF_SYSCALL
  mips, bpf: Fix reference to non-existing Kconfig symbol
  bpf: Make sure bpf_disable_instrumentation() is safe vs preemption.
  Documentation/locking/locktypes: Update migrate_disable() bits.
  bpf, sockmap: Re-evaluate proto ops when psock is removed from sockmap
  bpf, sockmap: Attach map progs to psock early for feature probes
  bpf, x86: Fix "no previous prototype" warning
====================

Link: https://lore.kernel.org/r/20211208155125.11826-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08 16:06:44 -08:00
Eric Dumazet
4d92b95ff2 net: add net device refcount tracker infrastructure
net device are refcounted. Over the years we had numerous bugs
caused by imbalanced dev_hold() and dev_put() calls.

The general idea is to be able to precisely pair each decrement with
a corresponding prior increment. Both share a cookie, basically
a pointer to private data storing stack traces.

This patch adds dev_hold_track() and dev_put_track().

To use these helpers, each data structure owning a refcount
should also use a "netdevice_tracker" to pair the hold and put.

netdevice_tracker dev_tracker;
...
dev_hold_track(dev, &dev_tracker, GFP_ATOMIC);
...
dev_put_track(dev, &dev_tracker);

Whenever a leak happens, we will get precise stack traces
of the point dev_hold_track() happened, at device dismantle phase.

We will also get a stack trace if too many dev_put_track() for the same
netdevice_tracker are attempted.

This is guarded by CONFIG_NET_DEV_REFCNT_TRACKER option.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06 16:05:07 -08:00
Eric Dumazet
914a7b5000 lib: add tests for reference tracker
This module uses reference tracker, forcing two issues.

1) Double free of a tracker

2) leak of two trackers, one being allocated from softirq context.

"modprobe test_ref_tracker" would emit the following traces.
(Use scripts/decode_stacktrace.sh if necessary)

[  171.648681] reference already released.
[  171.653213] allocated in:
[  171.656523]  alloctest_ref_tracker_alloc2+0x1c/0x20 [test_ref_tracker]
[  171.656526]  init_module+0x86/0x1000 [test_ref_tracker]
[  171.656528]  do_one_initcall+0x9c/0x220
[  171.656532]  do_init_module+0x60/0x240
[  171.656536]  load_module+0x32b5/0x3610
[  171.656538]  __do_sys_init_module+0x148/0x1a0
[  171.656540]  __x64_sys_init_module+0x1d/0x20
[  171.656542]  do_syscall_64+0x4a/0xb0
[  171.656546]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  171.656549] freed in:
[  171.659520]  alloctest_ref_tracker_free+0x13/0x20 [test_ref_tracker]
[  171.659522]  init_module+0xec/0x1000 [test_ref_tracker]
[  171.659523]  do_one_initcall+0x9c/0x220
[  171.659525]  do_init_module+0x60/0x240
[  171.659527]  load_module+0x32b5/0x3610
[  171.659529]  __do_sys_init_module+0x148/0x1a0
[  171.659532]  __x64_sys_init_module+0x1d/0x20
[  171.659534]  do_syscall_64+0x4a/0xb0
[  171.659536]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  171.659575] ------------[ cut here ]------------
[  171.659576] WARNING: CPU: 5 PID: 13016 at lib/ref_tracker.c:112 ref_tracker_free+0x224/0x270
[  171.659581] Modules linked in: test_ref_tracker(+)
[  171.659591] CPU: 5 PID: 13016 Comm: modprobe Tainted: G S                5.16.0-smp-DEV #290
[  171.659595] RIP: 0010:ref_tracker_free+0x224/0x270
[  171.659599] Code: 5e 41 5f 5d c3 48 c7 c7 04 9c 74 a6 31 c0 e8 62 ee 67 00 83 7b 14 00 75 1a 83 7b 18 00 75 30 4c 89 ff 4c 89 f6 e8 9c 00 69 00 <0f> 0b bb ea ff ff ff eb ae 48 c7 c7 3a 0a 77 a6 31 c0 e8 34 ee 67
[  171.659601] RSP: 0018:ffff89058ba0bbd0 EFLAGS: 00010286
[  171.659603] RAX: 0000000000000029 RBX: ffff890586b19780 RCX: 08895bff57c7d100
[  171.659604] RDX: c0000000ffff7fff RSI: 0000000000000282 RDI: ffffffffc0407000
[  171.659606] RBP: ffff89058ba0bc88 R08: 0000000000000000 R09: ffffffffa6f342e0
[  171.659607] R10: 00000000ffff7fff R11: 0000000000000000 R12: 000000008f000000
[  171.659608] R13: 0000000000000014 R14: 0000000000000282 R15: ffffffffc0407000
[  171.659609] FS:  00007f97ea29d740(0000) GS:ffff8923ff940000(0000) knlGS:0000000000000000
[  171.659611] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  171.659613] CR2: 00007f97ea299000 CR3: 0000000186b4a004 CR4: 00000000001706e0
[  171.659614] Call Trace:
[  171.659615]  <TASK>
[  171.659631]  ? alloctest_ref_tracker_free+0x13/0x20 [test_ref_tracker]
[  171.659633]  ? init_module+0x105/0x1000 [test_ref_tracker]
[  171.659636]  ? do_one_initcall+0x9c/0x220
[  171.659638]  ? do_init_module+0x60/0x240
[  171.659641]  ? load_module+0x32b5/0x3610
[  171.659644]  ? __do_sys_init_module+0x148/0x1a0
[  171.659646]  ? __x64_sys_init_module+0x1d/0x20
[  171.659649]  ? do_syscall_64+0x4a/0xb0
[  171.659652]  ? entry_SYSCALL_64_after_hwframe+0x44/0xae
[  171.659656]  ? 0xffffffffc040a000
[  171.659658]  alloctest_ref_tracker_free+0x13/0x20 [test_ref_tracker]
[  171.659660]  init_module+0x105/0x1000 [test_ref_tracker]
[  171.659663]  do_one_initcall+0x9c/0x220
[  171.659666]  do_init_module+0x60/0x240
[  171.659669]  load_module+0x32b5/0x3610
[  171.659672]  __do_sys_init_module+0x148/0x1a0
[  171.659676]  __x64_sys_init_module+0x1d/0x20
[  171.659678]  do_syscall_64+0x4a/0xb0
[  171.659694]  ? exc_page_fault+0x6e/0x140
[  171.659696]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  171.659698] RIP: 0033:0x7f97ea3dbe7a
[  171.659700] Code: 48 8b 0d 61 8d 06 00 f7 d8 64 89 01 48 83 c8 ff c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 2e 8d 06 00 f7 d8 64 89 01 48
[  171.659701] RSP: 002b:00007ffea67ce608 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
[  171.659703] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f97ea3dbe7a
[  171.659704] RDX: 00000000013a0ba0 RSI: 0000000000002808 RDI: 00007f97ea299000
[  171.659705] RBP: 00007ffea67ce670 R08: 0000000000000003 R09: 0000000000000000
[  171.659706] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000013a1048
[  171.659707] R13: 00000000013a0ba0 R14: 0000000001399930 R15: 00000000013a1030
[  171.659709]  </TASK>
[  171.659710] ---[ end trace f5dbd6afa41e60a9 ]---
[  171.659712] leaked reference.
[  171.663393]  alloctest_ref_tracker_alloc0+0x1c/0x20 [test_ref_tracker]
[  171.663395]  test_ref_tracker_timer_func+0x9/0x20 [test_ref_tracker]
[  171.663397]  call_timer_fn+0x31/0x140
[  171.663401]  expire_timers+0x46/0x110
[  171.663403]  __run_timers+0x16f/0x1b0
[  171.663404]  run_timer_softirq+0x1d/0x40
[  171.663406]  __do_softirq+0x148/0x2d3
[  171.663408] leaked reference.
[  171.667101]  alloctest_ref_tracker_alloc1+0x1c/0x20 [test_ref_tracker]
[  171.667103]  init_module+0x81/0x1000 [test_ref_tracker]
[  171.667104]  do_one_initcall+0x9c/0x220
[  171.667106]  do_init_module+0x60/0x240
[  171.667108]  load_module+0x32b5/0x3610
[  171.667111]  __do_sys_init_module+0x148/0x1a0
[  171.667113]  __x64_sys_init_module+0x1d/0x20
[  171.667115]  do_syscall_64+0x4a/0xb0
[  171.667117]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  171.667131] ------------[ cut here ]------------
[  171.667132] WARNING: CPU: 5 PID: 13016 at lib/ref_tracker.c:30 ref_tracker_dir_exit+0x104/0x130
[  171.667136] Modules linked in: test_ref_tracker(+)
[  171.667144] CPU: 5 PID: 13016 Comm: modprobe Tainted: G S      W         5.16.0-smp-DEV #290
[  171.667147] RIP: 0010:ref_tracker_dir_exit+0x104/0x130
[  171.667150] Code: 01 00 00 00 00 ad de 48 89 03 4c 89 63 08 48 89 df e8 20 a0 d5 ff 4c 89 f3 4d 39 ee 75 a8 4c 89 ff 48 8b 75 d0 e8 7c 05 69 00 <0f> 0b eb 0c 4c 89 ff 48 8b 75 d0 e8 6c 05 69 00 41 8b 47 08 83 f8
[  171.667151] RSP: 0018:ffff89058ba0bc68 EFLAGS: 00010286
[  171.667154] RAX: 08895bff57c7d100 RBX: ffffffffc0407010 RCX: 000000000000003b
[  171.667156] RDX: 000000000000003c RSI: 0000000000000282 RDI: ffffffffc0407000
[  171.667157] RBP: ffff89058ba0bc98 R08: 0000000000000000 R09: ffffffffa6f342e0
[  171.667159] R10: 00000000ffff7fff R11: 0000000000000000 R12: dead000000000122
[  171.667160] R13: ffffffffc0407010 R14: ffffffffc0407010 R15: ffffffffc0407000
[  171.667162] FS:  00007f97ea29d740(0000) GS:ffff8923ff940000(0000) knlGS:0000000000000000
[  171.667164] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  171.667166] CR2: 00007f97ea299000 CR3: 0000000186b4a004 CR4: 00000000001706e0
[  171.667169] Call Trace:
[  171.667170]  <TASK>
[  171.667171]  ? 0xffffffffc040a000
[  171.667173]  init_module+0x126/0x1000 [test_ref_tracker]
[  171.667175]  do_one_initcall+0x9c/0x220
[  171.667179]  do_init_module+0x60/0x240
[  171.667182]  load_module+0x32b5/0x3610
[  171.667186]  __do_sys_init_module+0x148/0x1a0
[  171.667189]  __x64_sys_init_module+0x1d/0x20
[  171.667192]  do_syscall_64+0x4a/0xb0
[  171.667194]  ? exc_page_fault+0x6e/0x140
[  171.667196]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  171.667199] RIP: 0033:0x7f97ea3dbe7a
[  171.667200] Code: 48 8b 0d 61 8d 06 00 f7 d8 64 89 01 48 83 c8 ff c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 2e 8d 06 00 f7 d8 64 89 01 48
[  171.667201] RSP: 002b:00007ffea67ce608 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
[  171.667203] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f97ea3dbe7a
[  171.667204] RDX: 00000000013a0ba0 RSI: 0000000000002808 RDI: 00007f97ea299000
[  171.667205] RBP: 00007ffea67ce670 R08: 0000000000000003 R09: 0000000000000000
[  171.667206] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000013a1048
[  171.667207] R13: 00000000013a0ba0 R14: 0000000001399930 R15: 00000000013a1030
[  171.667209]  </TASK>
[  171.667210] ---[ end trace f5dbd6afa41e60aa ]---

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06 16:04:44 -08:00
Eric Dumazet
4e66934eaa lib: add reference counting tracking infrastructure
It can be hard to track where references are taken and released.

In networking, we have annoying issues at device or netns dismantles,
and we had various proposals to ease root causing them.

This patch adds new infrastructure pairing refcount increases
and decreases. This will self document code, because programmers
will have to associate increments/decrements.

This is controled by CONFIG_REF_TRACKER which can be selected
by users of this feature.

This adds both cpu and memory costs, and thus should probably be
used with care.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06 16:04:44 -08:00
Kumar Kartikeya Dwivedi
d9847eb8be bpf: Make CONFIG_DEBUG_INFO_BTF depend upon CONFIG_BPF_SYSCALL
Vinicius Costa Gomes reported [0] that build fails when
CONFIG_DEBUG_INFO_BTF is enabled and CONFIG_BPF_SYSCALL is disabled.
This leads to btf.c not being compiled, and then no symbol being present
in vmlinux for the declarations in btf.h. Since BTF is not useful
without enabling BPF subsystem, disallow this combination.

However, theoretically disabling both now could still fail, as the
symbol for kfunc_btf_id_list variables is not available. This isn't a
problem as the compiler usually optimizes the whole register/unregister
call, but at lower optimization levels it can fail the build in linking
stage.

Fix that by adding dummy variables so that modules taking address of
them still work, but the whole thing is a noop.

  [0]: https://lore.kernel.org/bpf/20211110205418.332403-1-vinicius.gomes@intel.com

Fixes: 14f267d95f ("bpf: btf: Introduce helpers for dynamic BTF set registration")
Reported-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211122144742.477787-2-memxor@gmail.com
2021-12-02 13:39:46 -08:00
Arnd Bergmann
f7e5b9bfa6 siphash: use _unaligned version by default
On ARM v6 and later, we define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
because the ordinary load/store instructions (ldr, ldrh, ldrb) can
tolerate any misalignment of the memory address. However, load/store
double and load/store multiple instructions (ldrd, ldm) may still only
be used on memory addresses that are 32-bit aligned, and so we have to
use the CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS macro with care, or we
may end up with a severe performance hit due to alignment traps that
require fixups by the kernel. Testing shows that this currently happens
with clang-13 but not gcc-11. In theory, any compiler version can
produce this bug or other problems, as we are dealing with undefined
behavior in C99 even on architectures that support this in hardware,
see also https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100363.

Fortunately, the get_unaligned() accessors do the right thing: when
building for ARMv6 or later, the compiler will emit unaligned accesses
using the ordinary load/store instructions (but avoid the ones that
require 32-bit alignment). When building for older ARM, those accessors
will emit the appropriate sequence of ldrb/mov/orr instructions. And on
architectures that can truly tolerate any kind of misalignment, the
get_unaligned() accessors resolve to the leXX_to_cpup accessors that
operate on aligned addresses.

Since the compiler will in fact emit ldrd or ldm instructions when
building this code for ARM v6 or later, the solution is to use the
unaligned accessors unconditionally on architectures where this is
known to be fast. The _aligned version of the hash function is
however still needed to get the best performance on architectures
that cannot do any unaligned access in hardware.

This new version avoids the undefined behavior and should produce
the fastest hash on all architectures we support.

Link: https://lore.kernel.org/linux-arm-kernel/20181008211554.5355-4-ard.biesheuvel@linaro.org/
Link: https://lore.kernel.org/linux-crypto/CAK8P3a2KfmmGDbVHULWevB0hv71P2oi2ZCHEAqT=8dQfa0=cqQ@mail.gmail.com/
Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Fixes: 2c956a6077 ("siphash: add cryptographically secure PRF")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-29 19:50:50 -08:00
Helge Deller
8d192bec53 parisc: Increase FRAME_WARN to 2048 bytes on parisc
PA-RISC uses a much bigger frame size for functions than other
architectures. So increase it to 2048 for 32- and 64-bit kernels.
This fixes e.g. a warning in lib/xxhash.c.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Helge Deller <deller@gmx.de>
2021-11-22 07:37:31 +01:00
Kees Cook
cab71f7495 kasan: test: silence intentional read overflow warnings
As done in commit d73dad4eb5 ("kasan: test: bypass __alloc_size
checks") for __write_overflow warnings, also silence some more cases
that trip the __read_overflow warnings seen in 5.16-rc1[1]:

  In file included from include/linux/string.h:253,
                   from include/linux/bitmap.h:10,
                   from include/linux/cpumask.h:12,
                   from include/linux/mm_types_task.h:14,
                   from include/linux/mm_types.h:5,
                   from include/linux/page-flags.h:13,
                   from arch/arm64/include/asm/mte.h:14,
                   from arch/arm64/include/asm/pgtable.h:12,
                   from include/linux/pgtable.h:6,
                   from include/linux/kasan.h:29,
                   from lib/test_kasan.c:10:
  In function 'memcmp',
      inlined from 'kasan_memcmp' at lib/test_kasan.c:897:2:
  include/linux/fortify-string.h:263:25: error: call to '__read_overflow' declared with attribute error: detected read beyond size of object (1st parameter)
    263 |                         __read_overflow();
        |                         ^~~~~~~~~~~~~~~~~
  In function 'memchr',
      inlined from 'kasan_memchr' at lib/test_kasan.c:872:2:
  include/linux/fortify-string.h:277:17: error: call to '__read_overflow' declared with attribute error: detected read beyond size of object (1st parameter)
    277 |                 __read_overflow();
        |                 ^~~~~~~~~~~~~~~~~

[1] http://kisskb.ellerman.id.au/kisskb/buildresult/14660585/log/

Link: https://lkml.kernel.org/r/20211116004111.3171781-1-keescook@chromium.org
Fixes: d73dad4eb5 ("kasan: test: bypass __alloc_size checks")
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Acked-by: Marco Elver <elver@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-20 10:35:54 -08:00
Linus Torvalds
4c388a8e74 Merge tag 'zstd-for-linus-5.16-rc1' of git://github.com/terrelln/linux
Pull zstd fixes from Nick Terrell:
 "Fix stack usage on parisc & improve code size bloat

  This contains three commits:

   1. Fixes a minor unused variable warning reported by Kernel test
      robot [0].

   2. Improves the reported code bloat (-88KB / 374KB) [1] by outlining
      some functions that are unlikely to be used in performance
      sensitive workloads.

   3. Fixes the reported excess stack usage on parisc [2] by removing
      -O3 from zstd's compilation flags. -O3 triggered bugs in the
      hppa-linux-gnu gcc-8 compiler. -O2 performance is acceptable:
      neutral compression, about -1% decompression speed. We also reduce
      code bloat (-105KB / 374KB).

  After this our code bloat is cut from 374KB to 105KB with gcc-11. If
  we wanted to cut the remaining 105KB we'd likely have to trade
  signicant performance, so I want to say that this is enough for now.

  We should be able to get further gains without sacrificing speed, but
  that will take some significant optimization effort, and isn't
  suitable for a quick fix. I've opened an upstream issue [3] to track
  the code size, and try to avoid future regressions, and improve it in
  the long term"

Link: https://lore.kernel.org/linux-mm/202111120312.833wII4i-lkp@intel.com/T/ [0]
Link: https://lkml.org/lkml/2021/11/15/710 [1]
Link: https://lkml.org/lkml/2021/11/14/189 [2]
Link: https://github.com/facebook/zstd/issues/2867 [3]
Link: https://lore.kernel.org/r/20211117014949.1169186-1-nickrterrell@gmail.com/
Link: https://lore.kernel.org/r/20211117201459.1194876-1-nickrterrell@gmail.com/

* tag 'zstd-for-linus-5.16-rc1' of git://github.com/terrelln/linux:
  lib: zstd: Don't add -O3 to cflags
  lib: zstd: Don't inline functions in zstd_opt.c
  lib: zstd: Fix unused variable warning
2021-11-18 17:09:05 -08:00
Nick Terrell
7416cdc9b9 lib: zstd: Don't add -O3 to cflags
After the update to zstd-1.4.10 passing -O3 is no longer necessary to
get good performance from zstd. Using the default optimization level -O2
is sufficient to get good performance.

I've measured no significant change to compression speed, and a ~1%
decompression speed loss, which is acceptable.

This fixes the reported parisc -Wframe-larger-than=1536 errors [0]. The
gcc-8-hppa-linux-gnu compiler performed very poorly with -O3, generating
stacks that are ~3KB. With -O2 these same functions generate stacks in
the < 100B, completely fixing the problem. Function size deltas are
listed below:

ZSTD_compressBlock_fast_extDict_generic: 3800 -> 68
ZSTD_compressBlock_fast: 2216 -> 40
ZSTD_compressBlock_fast_dictMatchState: 1848 ->  64
ZSTD_compressBlock_doubleFast_extDict_generic: 3744 -> 76
ZSTD_fillDoubleHashTable: 3252 -> 0
ZSTD_compressBlock_doubleFast: 5856 -> 36
ZSTD_compressBlock_doubleFast_dictMatchState: 5380 -> 84
ZSTD_copmressBlock_lazy2: 2420 -> 72

Additionally, this improves the reported code bloat [1]. With gcc-11
bloat-o-meter shows an 80KB code size improvement:

```
> ../scripts/bloat-o-meter vmlinux.old vmlinux
add/remove: 31/8 grow/shrink: 24/155 up/down: 25734/-107924 (-82190)
Total: Before=6418562, After=6336372, chg -1.28%
```

Compared to before the zstd-1.4.10 update we see a total code size
regression of 105KB, down from 374KB at v5.16-rc1:

```
> ../scripts/bloat-o-meter vmlinux.old vmlinux
add/remove: 292/62 grow/shrink: 56/88 up/down: 235009/-127487 (107522)
Total: Before=6228850, After=6336372, chg +1.73%
```

[0] https://lkml.org/lkml/2021/11/15/710
[1] https://lkml.org/lkml/2021/11/14/189

Link: https://lore.kernel.org/r/20211117014949.1169186-4-nickrterrell@gmail.com/
Link: https://lore.kernel.org/r/20211117201459.1194876-4-nickrterrell@gmail.com/

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Nick Terrell <terrelln@fb.com>
2021-11-18 13:16:22 -08:00
Nick Terrell
1974990cca lib: zstd: Don't inline functions in zstd_opt.c
`zstd_opt.c` contains the match finder for the highest compression
levels. These levels are already very slow, and are unlikely to be used
in the kernel. If they are used, they shouldn't be used in latency
sensitive workloads, so slowing them down shouldn't be a big deal.

This saves 188 KB of the 288 KB regression reported by Geert Uytterhoeven [0].
I've also opened an issue upstream [1] so that we can properly tackle
the code size issue in `zstd_opt.c` for all users, and can hopefully
remove this hack in the next zstd version we import.

Bloat-o-meter output on x86-64:

```
> ../scripts/bloat-o-meter vmlinux.old vmlinux
add/remove: 6/5 grow/shrink: 1/9 up/down: 16673/-209939 (-193266)
Function                                     old     new   delta
ZSTD_compressBlock_opt_generic.constprop       -    7559   +7559
ZSTD_insertBtAndGetAllMatches                  -    6304   +6304
ZSTD_insertBt1                                 -    1731   +1731
ZSTD_storeSeq                                  -     693    +693
ZSTD_BtGetAllMatches                           -     255    +255
ZSTD_updateRep                                 -     128    +128
ZSTD_updateTree                               96      99      +3
ZSTD_insertAndFindFirstIndexHash3             81       -     -81
ZSTD_setBasePrices.constprop                  98       -     -98
ZSTD_litLengthPrice.constprop                138       -    -138
ZSTD_count                                   362     181    -181
ZSTD_count_2segments                        1407     938    -469
ZSTD_insertBt1.constprop                    2689       -   -2689
ZSTD_compressBlock_btultra2                19990     423  -19567
ZSTD_compressBlock_btultra                 19633      15  -19618
ZSTD_initStats_ultra                       19825       -  -19825
ZSTD_compressBlock_btopt                   20374      12  -20362
ZSTD_compressBlock_btopt_extDict           29984      12  -29972
ZSTD_compressBlock_btultra_extDict         30718      15  -30703
ZSTD_compressBlock_btopt_dictMatchState    32689      12  -32677
ZSTD_compressBlock_btultra_dictMatchState   33574      15  -33559
Total: Before=6611828, After=6418562, chg -2.92%
```

[0] https://lkml.org/lkml/2021/11/14/189
[1] https://github.com/facebook/zstd/issues/2862

Link: https://lore.kernel.org/r/20211117014949.1169186-3-nickrterrell@gmail.com/
Link: https://lore.kernel.org/r/20211117201459.1194876-3-nickrterrell@gmail.com/

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Nick Terrell <terrelln@fb.com>
2021-11-18 13:15:33 -08:00
Nick Terrell
ae8d67b211 lib: zstd: Fix unused variable warning
The variable `litLengthSum` is only used by an `assert()`, so when
asserts are disabled the compiler doesn't see any usage and warns.

This issue is already fixed upstream by PR #2838 [0]. It was reported
by the Kernel test robot in [1].

Another approach would be to change zstd's disabled `assert()`
definition to use the argument in a disabled branch, instead of
ignoring the argument. I've avoided this approach because there are
some small changes necessary to get zstd to build, and I would
want to thoroughly re-test for performance, since that is slightly
changing the code in every function in zstd. It seems like a
trivial change, but some functions are pretty sensitive to small
changes. However, I think it is a valid approach that I would
like to see upstream take, so I've opened Issue #2868 to attempt
this upstream.

Lastly, I've chosen not to use __maybe_unused because all code
in lib/zstd/ must eventually be upstreamed. Upstream zstd can't
use __maybe_unused because it isn't portable across all compilers.

[0] https://github.com/facebook/zstd/pull/2838
[1] https://lore.kernel.org/linux-mm/202111120312.833wII4i-lkp@intel.com/T/
[2] https://github.com/facebook/zstd/issues/2868

Link: https://lore.kernel.org/r/20211117014949.1169186-2-nickrterrell@gmail.com/
Link: https://lore.kernel.org/r/20211117201459.1194876-2-nickrterrell@gmail.com/

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Nick Terrell <terrelln@fb.com>
2021-11-18 13:12:26 -08:00
Linus Torvalds
7d5775d49e Merge tag 'printk-for-5.16-fixup' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux
Pull printk fixes from Petr Mladek:

 - Try to flush backtraces from other CPUs also on the local one. This
   was a regression caused by printk_safe buffers removal.

 - Remove header dependency warning.

* tag 'printk-for-5.16-fixup' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
  printk: Remove printk.h inclusion in percpu.h
  printk: restore flushing of NMI buffers on remote CPUs after NMI backtraces
2021-11-18 10:50:45 -08:00
Petr Mladek
bf6d0d1e1a Merge branch 'rework/printk_safe-removal' into for-linus 2021-11-18 10:03:47 +01:00
Tiezhu Yang
ebf7f6f0a6 bpf: Change value of MAX_TAIL_CALL_CNT from 32 to 33
In the current code, the actual max tail call count is 33 which is greater
than MAX_TAIL_CALL_CNT (defined as 32). The actual limit is not consistent
with the meaning of MAX_TAIL_CALL_CNT and thus confusing at first glance.
We can see the historical evolution from commit 04fd61ab36 ("bpf: allow
bpf programs to tail-call other bpf programs") and commit f9dabe016b
("bpf: Undo off-by-one in interpreter tail call count limit"). In order
to avoid changing existing behavior, the actual limit is 33 now, this is
reasonable.

After commit 874be05f52 ("bpf, tests: Add tail call test suite"), we can
see there exists failed testcase.

On all archs when CONFIG_BPF_JIT_ALWAYS_ON is not set:
 # echo 0 > /proc/sys/net/core/bpf_jit_enable
 # modprobe test_bpf
 # dmesg | grep -w FAIL
 Tail call error path, max count reached jited:0 ret 34 != 33 FAIL

On some archs:
 # echo 1 > /proc/sys/net/core/bpf_jit_enable
 # modprobe test_bpf
 # dmesg | grep -w FAIL
 Tail call error path, max count reached jited:1 ret 34 != 33 FAIL

Although the above failed testcase has been fixed in commit 18935a72eb
("bpf/tests: Fix error in tail call limit tests"), it would still be good
to change the value of MAX_TAIL_CALL_CNT from 32 to 33 to make the code
more readable.

The 32-bit x86 JIT was using a limit of 32, just fix the wrong comments and
limit to 33 tail calls as the constant MAX_TAIL_CALL_CNT updated. For the
mips64 JIT, use "ori" instead of "addiu" as suggested by Johan Almbladh.
For the riscv JIT, use RV_REG_TCC directly to save one register move as
suggested by Björn Töpel. For the other implementations, no function changes,
it does not change the current limit 33, the new value of MAX_TAIL_CALL_CNT
can reflect the actual max tail call count, the related tail call testcases
in test_bpf module and selftests can work well for the interpreter and the
JIT.

Here are the test results on x86_64:

 # uname -m
 x86_64
 # echo 0 > /proc/sys/net/core/bpf_jit_enable
 # modprobe test_bpf test_suite=test_tail_calls
 # dmesg | tail -1
 test_bpf: test_tail_calls: Summary: 8 PASSED, 0 FAILED, [0/8 JIT'ed]
 # rmmod test_bpf
 # echo 1 > /proc/sys/net/core/bpf_jit_enable
 # modprobe test_bpf test_suite=test_tail_calls
 # dmesg | tail -1
 test_bpf: test_tail_calls: Summary: 8 PASSED, 0 FAILED, [8/8 JIT'ed]
 # rmmod test_bpf
 # ./test_progs -t tailcalls
 #142 tailcalls:OK
 Summary: 1/11 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Tested-by: Ilya Leoshkevich <iii@linux.ibm.com>
Acked-by: Björn Töpel <bjorn@kernel.org>
Acked-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/bpf/1636075800-3264-1-git-send-email-yangtiezhu@loongson.cn
2021-11-16 14:03:15 +01:00
Linus Torvalds
c8c109546a Merge tag 'zstd-for-linus-v5.16' of git://github.com/terrelln/linux
Pull zstd update from Nick Terrell:
 "Update to zstd-1.4.10.

  Add myself as the maintainer of zstd and update the zstd version in
  the kernel, which is now 4 years out of date, to a much more recent
  zstd release. This includes bug fixes, much more extensive fuzzing,
  and performance improvements. And generates the kernel zstd
  automatically from upstream zstd, so it is easier to keep the zstd
  verison up to date, and we don't fall so far out of date again.

  This includes 5 commits that update the zstd library version:

   - Adds a new kernel-style wrapper around zstd.

     This wrapper API is functionally equivalent to the subset of the
     current zstd API that is currently used. The wrapper API changes to
     be kernel style so that the symbols don't collide with zstd's
     symbols. The update to zstd-1.4.10 maintains the same API and
     preserves the semantics, so that none of the callers need to be
     updated. All callers are updated in the commit, because there are
     zero functional changes.

   - Adds an indirection for `lib/decompress_unzstd.c` so it doesn't
     depend on the layout of `lib/zstd/` to include every source file.
     This allows the next patch to be automatically generated.

   - Imports the zstd-1.4.10 source code. This commit is automatically
     generated from upstream zstd (https://github.com/facebook/zstd).

   - Adds me (terrelln@fb.com) as the maintainer of `lib/zstd`.

   - Fixes a newly added build warning for clang.

  The discussion around this patchset has been pretty long, so I've
  included a FAQ-style summary of the history of the patchset, and why
  we are taking this approach.

  Why do we need to update?
  -------------------------

  The zstd version in the kernel is based off of zstd-1.3.1, which is
  was released August 20, 2017. Since then zstd has seen many bug fixes
  and performance improvements. And, importantly, upstream zstd is
  continuously fuzzed by OSS-Fuzz, and bug fixes aren't backported to
  older versions. So the only way to sanely get these fixes is to keep
  up to date with upstream zstd.

  There are no known security issues that affect the kernel, but we need
  to be able to update in case there are. And while there are no known
  security issues, there are relevant bug fixes. For example the problem
  with large kernel decompression has been fixed upstream for over 2
  years [1]

  Additionally the performance improvements for kernel use cases are
  significant. Measured for x86_64 on my Intel i9-9900k @ 3.6 GHz:

   - BtrFS zstd compression at levels 1 and 3 is 5% faster

   - BtrFS zstd decompression+read is 15% faster

   - SquashFS zstd decompression+read is 15% faster

   - F2FS zstd compression+write at level 3 is 8% faster

   - F2FS zstd decompression+read is 20% faster

   - ZRAM decompression+read is 30% faster

   - Kernel zstd decompression is 35% faster

   - Initramfs zstd decompression+build is 5% faster

  On top of this, there are significant performance improvements coming
  down the line in the next zstd release, and the new automated update
  patch generation will allow us to pull them easily.

  How is the update patch generated?
  ----------------------------------

  The first two patches are preparation for updating the zstd version.
  Then the 3rd patch in the series imports upstream zstd into the
  kernel. This patch is automatically generated from upstream. A script
  makes the necessary changes and imports it into the kernel. The
  changes are:

   - Replace all libc dependencies with kernel replacements and rewrite
     includes.

   - Remove unncessary portability macros like: #if defined(_MSC_VER).

   - Use the kernel xxhash instead of bundling it.

  This automation gets tested every commit by upstream's continuous
  integration. When we cut a new zstd release, we will submit a patch to
  the kernel to update the zstd version in the kernel.

  The automated process makes it easy to keep the kernel version of zstd
  up to date. The current zstd in the kernel shares the guts of the
  code, but has a lot of API and minor changes to work in the kernel.
  This is because at the time upstream zstd was not ready to be used in
  the kernel envrionment as-is. But, since then upstream zstd has
  evolved to support being used in the kernel as-is.

  Why are we updating in one big patch?
  -------------------------------------

  The 3rd patch in the series is very large. This is because it is
  restructuring the code, so it both deletes the existing zstd, and
  re-adds the new structure. Future updates will be directly
  proportional to the changes in upstream zstd since the last import.
  They will admittidly be large, as zstd is an actively developed
  project, and has hundreds of commits between every release. However,
  there is no other great alternative.

  One option ruled out is to replay every upstream zstd commit. This is
  not feasible for several reasons:

   - There are over 3500 upstream commits since the zstd version in the
     kernel.

   - The automation to automatically generate the kernel update was only
     added recently, so older commits cannot easily be imported.

   - Not every upstream zstd commit builds.

   - Only zstd releases are "supported", and individual commits may have
     bugs that were fixed before a release.

  Another option to reduce the patch size would be to first reorganize
  to the new file structure, and then apply the patch. However, the
  current kernel zstd is formatted with clang-format to be more
  "kernel-like". But, the new method imports zstd as-is, without
  additional formatting, to allow for closer correlation with upstream,
  and easier debugging. So the patch wouldn't be any smaller.

  It also doesn't make sense to import upstream zstd commit by commit
  going forward. Upstream zstd doesn't support production use cases
  running of the development branch. We have a lot of post-commit
  fuzzing that catches many bugs, so indiviudal commits may be buggy,
  but fixed before a release. So going forward, I intend to import every
  (important) zstd release into the Kernel.

  So, while it isn't ideal, updating in one big patch is the only patch
  I see forward.

  Who is responsible for this code?
  ---------------------------------

  I am. This patchset adds me as the maintainer for zstd. Previously,
  there was no tree for zstd patches. Because of that, there were
  several patches that either got ignored, or took a long time to merge,
  since it wasn't clear which tree should pick them up. I'm officially
  stepping up as maintainer, and setting up my tree as the path through
  which zstd patches get merged. I'll make sure that patches to the
  kernel zstd get ported upstream, so they aren't erased when the next
  version update happens.

  How is this code tested?
  ------------------------

  I tested every caller of zstd on x86_64 (BtrFS, ZRAM, SquashFS, F2FS,
  Kernel, InitRAMFS). I also tested Kernel & InitRAMFS on i386 and
  aarch64. I checked both performance and correctness.

  Also, thanks to many people in the community who have tested these
  patches locally.

  Lastly, this code will bake in linux-next before being merged into
  v5.16.

  Why update to zstd-1.4.10 when zstd-1.5.0 has been released?
  ------------------------------------------------------------

  This patchset has been outstanding since 2020, and zstd-1.4.10 was the
  latest release when it was created. Since the update patch is
  automatically generated from upstream, I could generate it from
  zstd-1.5.0.

  However, there were some large stack usage regressions in zstd-1.5.0,
  and are only fixed in the latest development branch. And the latest
  development branch contains some new code that needs to bake in the
  fuzzer before I would feel comfortable releasing to the kernel.

  Once this patchset has been merged, and we've released zstd-1.5.1, we
  can update the kernel to zstd-1.5.1, and exercise the update process.

  You may notice that zstd-1.4.10 doesn't exist upstream. This release
  is an artifical release based off of zstd-1.4.9, with some fixes for
  the kernel backported from the development branch. I will tag the
  zstd-1.4.10 release after this patchset is merged, so the Linux Kernel
  is running a known version of zstd that can be debugged upstream.

  Why was a wrapper API added?
  ----------------------------

  The first versions of this patchset migrated the kernel to the
  upstream zstd API. It first added a shim API that supported the new
  upstream API with the old code, then updated callers to use the new
  shim API, then transitioned to the new code and deleted the shim API.
  However, Cristoph Hellwig suggested that we transition to a kernel
  style API, and hide zstd's upstream API behind that. This is because
  zstd's upstream API is supports many other use cases, and does not
  follow the kernel style guide, while the kernel API is focused on the
  kernel's use cases, and follows the kernel style guide.

  Where is the previous discussion?
  ---------------------------------

  Links for the discussions of the previous versions of the patch set
  below. The largest changes in the design of the patchset are driven by
  the discussions in v11, v5, and v1. Sorry for the mix of links, I
  couldn't find most of the the threads on lkml.org"

Link: https://lkml.org/lkml/2020/9/29/27 [1]
Link: https://www.spinics.net/lists/linux-crypto/msg58189.html [v12]
Link: https://lore.kernel.org/linux-btrfs/20210430013157.747152-1-nickrterrell@gmail.com/ [v11]
Link: https://lore.kernel.org/lkml/20210426234621.870684-2-nickrterrell@gmail.com/ [v10]
Link: https://lore.kernel.org/linux-btrfs/20210330225112.496213-1-nickrterrell@gmail.com/ [v9]
Link: https://lore.kernel.org/linux-f2fs-devel/20210326191859.1542272-1-nickrterrell@gmail.com/ [v8]
Link: https://lkml.org/lkml/2020/12/3/1195 [v7]
Link: https://lkml.org/lkml/2020/12/2/1245 [v6]
Link: https://lore.kernel.org/linux-btrfs/20200916034307.2092020-1-nickrterrell@gmail.com/ [v5]
Link: https://www.spinics.net/lists/linux-btrfs/msg105783.html [v4]
Link: https://lkml.org/lkml/2020/9/23/1074 [v3]
Link: https://www.spinics.net/lists/linux-btrfs/msg105505.html [v2]
Link: https://lore.kernel.org/linux-btrfs/20200916034307.2092020-1-nickrterrell@gmail.com/ [v1]
Signed-off-by: Nick Terrell <terrelln@fb.com>
Tested By: Paul Jones <paul@pauljones.id.au>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM/Clang v13.0.0 on x86-64
Tested-by: Jean-Denis Girard <jd.girard@sysnux.pf>

* tag 'zstd-for-linus-v5.16' of git://github.com/terrelln/linux:
  lib: zstd: Add cast to silence clang's -Wbitwise-instead-of-logical
  MAINTAINERS: Add maintainer entry for zstd
  lib: zstd: Upgrade to latest upstream zstd version 1.4.10
  lib: zstd: Add decompress_sources.h for decompress_unzstd
  lib: zstd: Add kernel-specific API
2021-11-13 15:32:30 -08:00
Alistair Popple
ab09243aa9 mm/migrate.c: remove MIGRATE_PFN_LOCKED
MIGRATE_PFN_LOCKED is used to indicate to migrate_vma_prepare() that a
source page was already locked during migrate_vma_collect().  If it
wasn't then the a second attempt is made to lock the page.  However if
the first attempt failed it's unlikely a second attempt will succeed,
and the retry adds complexity.  So clean this up by removing the retry
and MIGRATE_PFN_LOCKED flag.

Destination pages are also meant to have the MIGRATE_PFN_LOCKED flag
set, but nothing actually checks that.

Link: https://lkml.kernel.org/r/20211025041608.289017-1-apopple@nvidia.com
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-11 09:34:35 -08:00
Nicholas Piggin
5d5e4522a7 printk: restore flushing of NMI buffers on remote CPUs after NMI backtraces
printk from NMI context relies on irq work being raised on the local CPU
to print to console. This can be a problem if the NMI was raised by a
lockup detector to print lockup stack and regs, because the CPU may not
enable irqs (because it is locked up).

Introduce printk_trigger_flush() that can be called another CPU to try
to get those messages to the console, call that where printk_safe_flush
was previously called.

Fixes: 93d102f094 ("printk: remove safe buffers")
Cc: stable@vger.kernel.org # 5.15
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20211107045116.1754411-1-npiggin@gmail.com
2021-11-10 16:12:00 +01:00
Linus Torvalds
59a2ceeef6 Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:
 "87 patches.

  Subsystems affected by this patch series: mm (pagecache and hugetlb),
  procfs, misc, MAINTAINERS, lib, checkpatch, binfmt, kallsyms, ramfs,
  init, codafs, nilfs2, hfs, crash_dump, signals, seq_file, fork,
  sysvfs, kcov, gdb, resource, selftests, and ipc"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (87 commits)
  ipc/ipc_sysctl.c: remove fallback for !CONFIG_PROC_SYSCTL
  ipc: check checkpoint_restore_ns_capable() to modify C/R proc files
  selftests/kselftest/runner/run_one(): allow running non-executable files
  virtio-mem: disallow mapping virtio-mem memory via /dev/mem
  kernel/resource: disallow access to exclusive system RAM regions
  kernel/resource: clean up and optimize iomem_is_exclusive()
  scripts/gdb: handle split debug for vmlinux
  kcov: replace local_irq_save() with a local_lock_t
  kcov: avoid enable+disable interrupts if !in_task()
  kcov: allocate per-CPU memory on the relevant node
  Documentation/kcov: define `ip' in the example
  Documentation/kcov: include types.h in the example
  sysv: use BUILD_BUG_ON instead of runtime check
  kernel/fork.c: unshare(): use swap() to make code cleaner
  seq_file: fix passing wrong private data
  seq_file: move seq_escape() to a header
  signal: remove duplicate include in signal.h
  crash_dump: remove duplicate include in crash_dump.h
  crash_dump: fix boolreturn.cocci warning
  hfs/hfsplus: use WARN_ON for sanity check
  ...
2021-11-09 10:11:53 -08:00
Thomas Gleixner
723aca2085 mm/scatterlist: replace the !preemptible warning in sg_miter_stop()
sg_miter_stop() checks for disabled preemption before unmapping a page
via kunmap_atomic().  The kernel doc mentions under context that
preemption must be disabled if SG_MITER_ATOMIC is set.

There is no active requirement for the caller to have preemption
disabled before invoking sg_mitter_stop().  The sg_mitter_*()
implementation itself has no such requirement.

In fact, preemption is disabled by kmap_atomic() as part of
sg_miter_next() and remains disabled as long as there is an active
SG_MITER_ATOMIC mapping.  This is a consequence of kmap_atomic() and not
a requirement for sg_mitter_*() itself.

The user chooses SG_MITER_ATOMIC because it uses the API in a context
where blocking is not possible or blocking is possible but he chooses a
lower weight mapping which is not available on all CPUs and so it might
need less overhead to setup at a price that now preemption will be
disabled.

The kmap_atomic() implementation on PREEMPT_RT does not disable
preemption.  It simply disables CPU migration to ensure that the task
remains on the same CPU while the caller remains preemptible.  This in
turn triggers the warning in sg_miter_stop() because preemption is
allowed.

The PREEMPT_RT and !PREEMPT_RT implementation of kmap_atomic() disable
pagefaults as a requirement.  It is sufficient to check for this instead
of disabled preemption.

Check for disabled pagefault handler in the SG_MITER_ATOMIC case.
Remove the "preemption disabled" part from the kernel doc as the
sg_milter*() implementation does not care.

[bigeasy@linutronix.de: commit description]

Link: https://lkml.kernel.org/r/20211015211409.cqopacv3pxdwn2ty@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-09 10:02:50 -08:00
Alexey Dobriyan
839b395eb9 lib: uninline simple_strntoull() as well
Codegen become bloated again after simple_strntoull() introduction

	add/remove: 0/0 grow/shrink: 0/4 up/down: 0/-224 (-224)
	Function                                     old     new   delta
	simple_strtoul                                 5       2      -3
	simple_strtol                                 23      20      -3
	simple_strtoull                              119      15    -104
	simple_strtoll                               155      41    -114

Link: https://lkml.kernel.org/r/YVmlB9yY4lvbNKYt@localhost.localdomain
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Richard Fitzgerald <rf@opensource.cirrus.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-09 10:02:50 -08:00
Imran Khan
0f68d45ef4 lib, stackdepot: add helper to print stack entries into buffer
To print stack entries into a buffer, users of stackdepot, first get a
list of stack entries using stack_depot_fetch and then print this list
into a buffer using stack_trace_snprint.  Provide a helper in stackdepot
for this purpose.  Also change above mentioned users to use this helper.

[imran.f.khan@oracle.com: fix build error]
  Link: https://lkml.kernel.org/r/20210915175321.3472770-4-imran.f.khan@oracle.com
[imran.f.khan@oracle.com: export stack_depot_snprint() to modules]
  Link: https://lkml.kernel.org/r/20210916133535.3592491-4-imran.f.khan@oracle.com

Link: https://lkml.kernel.org/r/20210915014806.3206938-4-imran.f.khan@oracle.com
Signed-off-by: Imran Khan <imran.f.khan@oracle.com>
Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Jani Nikula <jani.nikula@intel.com>	[i915]
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: David Airlie <airlied@linux.ie>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-09 10:02:50 -08:00